SQL Server AlwaysOn–Adding node fails

December 11th, 2013

Vinod Kumar

Last week I got a call from our internal deployment team: adding a node to the Windows Cluster for SQL Server AlwaysOn was timing out, and they suspected some sort of latency issue. That was interesting, because the latency between the two datacenters, which are far apart (Delhi and Hyderabad), was only around 28 ms. The best way for me to understand what was going on was to look at what happens behind the scenes.

First, I asked what their current setup was. They explained that they were planning to achieve the architecture shown below. They had been able to set up the two servers, SRV1 and SRV2, in the primary datacenter (DC1). They were now trying to add the third node (SRV3) to the Windows Cluster, and it was throwing an error. The same happened when they tried to add the SRV4 node.

The usual ping, tracert and similar checks yielded nothing useful. In fact, there was no connectivity problem between the sites while this operation was performed. At this point it became clear that this was not a SQL Server AlwaysOn problem but a Windows Cluster problem.

The next step was to get the cluster error logs; most importantly, we wanted to see the actual error message.

Failed to access remote registry on ‘OLTPDB.POC.MYDOMAIN.IN’. Ensure that the remote registry service is running, and have remote administration enabled.

The error message was interesting because it didn’t mention anything about latencies and the deployment partner was actually troubleshooting in the wrong direction.
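To reproduce the failing step outside the cluster wizard, one option is to attempt a remote registry connection directly from an existing node. The following is only a minimal sketch, assuming Python is available on a Windows node such as SRV1; the function name `can_reach_remote_registry` is hypothetical, and the standard library `winreg.ConnectRegistry` call is used here as a stand-in for whatever the cluster validation does internally.

```python
import sys
from typing import Tuple


def can_reach_remote_registry(host: str) -> Tuple[bool, str]:
    """Try to open HKEY_LOCAL_MACHINE on a remote Windows host.

    Returns (True, "connected") on success, or (False, <error text>) if the
    connection fails, for example because a firewall blocks the traffic or
    the Remote Registry service is not running on the target.
    """
    if sys.platform != "win32":
        # winreg exists only in the Windows builds of Python.
        raise RuntimeError("winreg is only available on Windows")

    import winreg  # imported lazily so the module loads on any platform

    try:
        # The computer name must be given in the form \\hostname.
        handle = winreg.ConnectRegistry(rf"\\{host}", winreg.HKEY_LOCAL_MACHINE)
    except OSError as exc:
        return False, str(exc)

    winreg.CloseKey(handle)
    return True, "connected"
```

Calling `can_reach_remote_registry("OLTPDB.POC.MYDOMAIN.IN")` from SRV1 would be expected to return `False` with an error description while the firewall is blocking the connection.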

Actual Solution

It is interesting to find what could have gone wrong.

  1. The firewall was blocking the Remote Registry connection between SRV1 and SRV3/SRV4.
  2. Although cluster port 3343 was open, Remote Registry access is also needed.
  3. Remote Registry works over RPC, which uses dynamic ports plus port 135, or port 445 when RPC runs over named pipes.

Once we had figured out this problem, we had to involve the network team to open the firewall for the dynamic ports, or to open it for all communication between the specific cluster nodes on both sides.

If you are interested in knowing about Remote Registry and how to query the ports, use the Port Query Tool.
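Where the Port Query Tool is not at hand, a rough TCP-only check can be scripted. The sketch below is an illustration, not a replacement for the tool: the helper names `tcp_port_open` and `probe` are hypothetical, the port list is taken from the ports mentioned in this post, and UDP ports (such as 137) cannot be verified with a simple connect test like this.

```python
import socket
from typing import Dict, Iterable

# TCP ports mentioned in this post, with labels for readability only.
CLUSTER_TCP_PORTS = {
    135: "RPC Endpoint Mapper",
    445: "Named pipes (SMB)",
    1433: "SQL Server",
    3343: "Cluster port",
    5022: "AlwaysOn endpoint",
}


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def probe(host: str,
          ports: Iterable[int] = CLUSTER_TCP_PORTS,
          timeout: float = 3.0) -> Dict[int, bool]:
    """Check each port and return a {port: reachable} mapping."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

Running `probe("SRV3")` from SRV1 (and the reverse from SRV3) gives a quick picture of which of these ports the firewall lets through in each direction. A `False` result only says the connection did not succeed; it does not distinguish a firewall drop from a service that is not listening.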

It is worth mentioning the port assignments for RPC:

Service Name          UDP                      TCP
HTTP                  80, 443, 593             80, 443, 593
Named Pipes           445                      445
RPC Endpoint Mapper   135                      135
RPC Server Programs   <Dynamically assigned>   <Dynamically assigned>

On the customer side, the port analysis revealed something interesting: in this specific case, we had to open port 445.

Port   Protocol   Mode      Status
1433   TCP        Two way   Listening
5022   TCP        Two way   Listening
3343   TCP/UDP    Two way   Listening
135    TCP        Two way   Listening
137    UDP        Two way   Listening
445    UDP/TCP    Two way   Not Listening
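With more nodes or more ports, spotting the odd row by eye gets harder. As a small sketch, the results above can be captured as data and filtered; the `PortStatus` type and `blocked_ports` helper are hypothetical names, and the rows simply restate the port analysis from this post.

```python
from typing import List, NamedTuple


class PortStatus(NamedTuple):
    port: int
    protocol: str
    mode: str
    status: str


# Rows copied from the port analysis in this post.
OBSERVED = [
    PortStatus(1433, "TCP", "Two way", "Listening"),
    PortStatus(5022, "TCP", "Two way", "Listening"),
    PortStatus(3343, "TCP/UDP", "Two way", "Listening"),
    PortStatus(135, "TCP", "Two way", "Listening"),
    PortStatus(137, "UDP", "Two way", "Listening"),
    PortStatus(445, "UDP/TCP", "Two way", "Not Listening"),
]


def blocked_ports(rows: List[PortStatus]) -> List[int]:
    """Return the ports whose status is anything other than 'Listening'."""
    return [row.port for row in rows if row.status.strip().lower() != "listening"]


print(blocked_ports(OBSERVED))  # prints [445]
```

The only port that stands out is 445, which matches what had to be opened in this case.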

The issue was resolved as soon as the team worked with the networking team. This was an important lesson for us in several ways:

  1. SQL Server AlwaysOn is not just about SQL Server; we need to have our Windows team involved as well.
  2. Sometimes a Windows error message can lead us to network and infrastructure requirements that we normally overlook.
  3. It takes teamwork from ALL parties (SQL administrator, Windows administrator and network administrator).

Finally, I wanted to mention: learning is a journey, and I seem to be learning something every single day.

This post is inspired by book content: “SQL Server 2012 AlwaysOn Joes 2 Pros®: A Tutorial for Implementing High Availability and Disaster Recovery using AlwaysOn Availability Groups” (Paperback, Kindle).


This entry was posted on Wednesday, December 11th, 2013 at 09:59 and is filed under Technology.

