Tuesday, May 28, 2013
Secondary replica of AlwaysOn Availability Group is in "Resolving" state after failover
While testing AlwaysOn Availability Groups, I failed over from the primary to the secondary a couple of times. However, on the third failover the secondary did not come up immediately and stayed in the “Resolving” state.
The problem was the maximum failures threshold on the clustered role that hosts the availability group. By default it is set to 2 failures in 6 hours.
Because I failed over the AG more than 2 times within an hour, well inside that 6-hour period, the third failover tripped the maximum failures threshold for this clustered resource and the replica was left in the “Resolving” state.
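The threshold behaviour described above can be sketched as a sliding-window count. This is only a simplified model of what I observed, not the cluster service's actual implementation; the function name `role_comes_online` and the timestamps are made up for illustration, and the defaults mirror the 2 failures in 6 hours mentioned above.

```python
from datetime import datetime, timedelta


def role_comes_online(previous_failovers, now, max_failures=2, period_hours=6):
    """Simplified model: the role comes online only while the number of
    failovers already recorded inside the period is below the threshold."""
    window_start = now - timedelta(hours=period_hours)
    # Count only the failovers that fall inside the current period.
    recent = [t for t in previous_failovers if t > window_start]
    return len(recent) < max_failures


# Three failovers, 20 minutes apart (hypothetical times).
start = datetime(2013, 5, 28, 10, 0)
attempts = [start + timedelta(minutes=20 * i) for i in range(3)]

history = []
results = []
for attempt in attempts:
    results.append(role_comes_online(history, attempt))
    history.append(attempt)

# Solution 1 in this model: try again once the 6-hour period has passed.
after_wait = role_comes_online(history, start + timedelta(hours=7))

print(results)     # [True, True, False]
print(after_wait)  # True
```

In this model the first two failovers succeed and the third is refused, which matches what I saw; once the earlier failovers age out of the period, the role can come online again.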
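Solution 1: Wait for the default period of 6 hours to elapse.
Solution 2: Change the threshold (in Failover Cluster Manager, on the Failover tab of the availability group role's properties) and fail back to the original primary.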
On my cluster the failback setting was Immediate. I would also recommend checking it and setting it to Prevent Failback.
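The same settings can also be changed from PowerShell instead of the GUI. As a rough sketch, the small Python helper below only builds the PowerShell lines for review and does not touch the cluster. It assumes the FailoverClusters module's `Get-ClusterGroup` cmdlet and the cluster group properties `FailoverThreshold`, `FailoverPeriod` (in hours) and `AutoFailbackType` (0 = prevent failback, 1 = allow failback). The group name `MyAG` and the threshold of 10 are placeholders, not the values from my setup.

```python
def build_failover_policy_commands(group_name, max_failures, period_hours,
                                   prevent_failback=True):
    """Return PowerShell lines that set the failover policy of a cluster group.

    Nothing is executed here; the strings are meant to be reviewed and then
    run by hand in an elevated PowerShell session on a cluster node.
    """
    group = f'(Get-ClusterGroup -Name "{group_name}")'
    # AutoFailbackType: 0 = prevent failback, 1 = allow failback.
    failback_type = 0 if prevent_failback else 1
    return [
        f"{group}.FailoverThreshold = {max_failures}",
        f"{group}.FailoverPeriod = {period_hours}",
        f"{group}.AutoFailbackType = {failback_type}",
    ]


# Hypothetical test values: allow up to 10 failures in 6 hours, no failback.
commands = build_failover_policy_commands("MyAG", 10, 6)
for line in commands:
    print(line)
```

A generous threshold like this is convenient while repeatedly testing failovers, but it is a test setting rather than a recommendation.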
Here is how I have configured my availability group clustered resource
The following KB article also covers other scenarios
Please note that I used these configurations only for testing. You may need to reconsider them for your production setup, depending on your failover needs.