Lync Edge Hairpin Requirement

Lync Edge DNS Round Robin w/ NAT: Hairpin/50k Port Range Issue

Issue:

Bob (remote user) tries to call Carol (internal user) but receives the error message “Call failed due to network issues”. Snooper reveals the following error: “Call failed to establish due to a media connectivity failure when one endpoint is internal and the other is remote”, with the ICE warning “ICEWarn=0x40003e0”.

Assuming the following scenario:

· Lync is deployed in a scaled consolidated topology using NAT

· The 50k port range inbound is blocked

· DNS load balancing

· The External Corporate Firewall is blocking Hairpin traffic

Bob initiates a call to Carol

Before Bob can send a SIP Invite message to Carol, Lync uses STUN, TURN, and ICE to discover a candidate list for completing the media path. For background on that process, see the following article: http://blogs.technet.com/b/nexthop/archive/2009/04/22/how-communicator-uses-sdp-and-ice-to-establish-a-media-channel.aspx

SIP Invite

Here is the SDP candidate list that Bob sends as part of the SIP Invite to Carol:

a=candidate:1 1 UDP 2130705919 192.168.1.100 33728 typ host

a=candidate:1 2 UDP 2130705406 192.168.1.100 33729 typ host

a=candidate:2 1 TCP-PASS 6556159 178.64.39.80 50468 typ relay raddr 65.10.10.189 rport 26654

a=candidate:2 2 TCP-PASS 6556158 178.64.39.80 50468 typ relay raddr 65.10.10.189 rport 26654

a=candidate:3 1 UDP 16648703 178.64.39.80 57548 typ relay raddr 65.10.10.189 rport 14932

a=candidate:3 2 UDP 16648702 178.64.39.80 57555 typ relay raddr 65.10.10.189 rport 14933

a=candidate:4 1 UDP 1694235135 65.10.10.189 14932 typ srflx raddr 192.168.1.100 rport 14932

a=candidate:4 2 UDP 1694233598 65.10.10.189 14933 typ srflx raddr 192.168.1.100 rport 14933

a=candidate:5 1 TCP-ACT 7075839 178.64.39.80 50468 typ relay raddr 65.10.10.189 rport 26654

a=candidate:5 2 TCP-ACT 7075326 178.64.39.80 50468 typ relay raddr 65.10.10.189 rport 26654

a=candidate:6 1 TCP-ACT 1684796927 65.10.10.189 26654 typ srflx raddr 192.168.1.100 rport 26654

a=candidate:6 2 TCP-ACT 1684796414 65.10.10.189 26654 typ srflx raddr 192.168.1.100 rport 26654
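To make the structure of these lines easier to follow, here is a minimal Python sketch that splits a candidate line into its fields. It assumes the field order seen in the list above (foundation, component, transport, priority, address, port, then `typ` and optional `raddr`/`rport` pairs). The `Candidate` class and `parse_candidate` function are hypothetical names used for illustration only; they are not part of Lync or any Microsoft tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    foundation: str
    component: int               # 1 = RTP, 2 = RTCP
    transport: str               # e.g. UDP, TCP-PASS, TCP-ACT
    priority: int
    address: str
    port: int
    cand_type: str               # host, srflx, or relay
    raddr: Optional[str] = None  # related address, if present
    rport: Optional[int] = None  # related port, if present


def parse_candidate(line: str) -> Candidate:
    """Parse one 'a=candidate:' line laid out like the ones above."""
    prefix = "a=candidate:"
    if not line.startswith(prefix):
        raise ValueError(f"not a candidate line: {line!r}")
    fields = line[len(prefix):].split()
    foundation, component, transport, priority, address, port = fields[:6]
    # Remaining fields are key/value pairs: typ <t> [raddr <ip> rport <p>]
    extras = dict(zip(fields[6::2], fields[7::2]))
    return Candidate(
        foundation=foundation,
        component=int(component),
        transport=transport.upper(),
        priority=int(priority),
        address=address,
        port=int(port),
        cand_type=extras["typ"],
        raddr=extras.get("raddr"),
        rport=int(extras["rport"]) if "rport" in extras else None,
    )


# Three of Bob's candidates, copied from the list above.
bob_lines = [
    "a=candidate:1 1 UDP 2130705919 192.168.1.100 33728 typ host",
    "a=candidate:3 1 UDP 16648703 178.64.39.80 57548 typ relay raddr 65.10.10.189 rport 14932",
    "a=candidate:4 1 UDP 1694235135 65.10.10.189 14932 typ srflx raddr 192.168.1.100 rport 14932",
]
bob = [parse_candidate(line) for line in bob_lines]
for cand in bob:
    print(cand.cand_type, cand.transport, f"{cand.address}:{cand.port}")
```

Grouping the parsed candidates by `cand_type` gives the three kinds of address used in the rest of this walkthrough: Bob's real IP (host), his public IP (srflx), and the AV Edge public IP (relay).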

SIP/2.0 200 OK

Carol uses the same discovery process with STUN, TURN, and ICE to create an SDP candidate list to send to Bob.

Here is the SDP candidate list that Carol sends to Bob in the SIP/2.0 200 OK response:

a=candidate:1 1 UDP 2130706431 10.10.10.211 55476 typ host

a=candidate:1 2 UDP 2130705918 10.10.10.211 55477 typ host

a=candidate:2 1 tcp-pass 6555135 178.64.39.81 54978 typ relay raddr 10.10.10.211 rport 49583

a=candidate:2 2 tcp-pass 6555134 178.64.39.81 54978 typ relay raddr 10.10.10.211 rport 49583

a=candidate:3 1 UDP 16647679 178.64.39.81 52755 typ relay raddr 10.10.10.211 rport 53324

a=candidate:3 2 UDP 16647678 178.64.39.81 56065 typ relay raddr 10.10.10.211 rport 53325

a=candidate:4 1 tcp-act 7076863 178.64.39.81 54978 typ relay raddr 10.10.10.211 rport 49583

a=candidate:4 2 tcp-act 7076350 178.64.39.81 54978 typ relay raddr 10.10.10.211 rport 49583

a=candidate:5 1 tcp-act 1684797951 10.10.10.211 49583 typ srflx raddr 10.10.10.211 rport 49583

a=candidate:5 2 tcp-act 1684797438 10.10.10.211 49583 typ srflx raddr 10.10.10.211 rport 49583
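The large numbers in the fourth field are candidate priorities. As a rough sketch, and assuming they follow the 32-bit layout described in RFC 5245 (type preference in the top 8 bits, local preference in the middle 16 bits, and 256 minus the component ID in the low 8 bits), they can be unpacked as shown below. Whether Lync assigns its priorities exactly this way is an assumption; the helper name `decode_priority` is made up for this example.

```python
from typing import Tuple


def decode_priority(priority: int) -> Tuple[int, int, int]:
    """Split a priority into (type_pref, local_pref, component_id).

    Assumes the RFC 5245 layout:
    priority = 2^24 * type_pref + 2^8 * local_pref + (256 - component_id)
    """
    type_pref = priority >> 24
    local_pref = (priority >> 8) & 0xFFFF
    component_id = 256 - (priority & 0xFF)
    return type_pref, local_pref, component_id


# Priorities copied from Carol's candidate list above.
for label, value in [
    ("host  comp 1", 2130706431),
    ("host  comp 2", 2130705918),
    ("relay comp 1", 16647679),
]:
    print(label, decode_priority(value))
```

Under that assumption, the host candidates decode to a type preference of 126 and the relay candidates to 0, which matches the values RFC 5245 recommends for host and relayed candidates. That ordering is why the clients try direct addresses before falling back to the Media Relay.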

Carol tries Bob’s candidate list

When Carol receives Bob’s candidate list, she tries to connect directly using this information:

192.168.1.100 (Bob’s real IP)

Carol is unable to establish a connection with Bob’s real IP because it is a private, non-routable address.

65.10.10.189 (Bob’s public IP)

Carol is unable to establish a connection with Bob’s public IP because Bob’s Home Firewall blocks this inbound traffic.

178.64.39.80 (LyncEdge1 AV Edge public IP)

Carol is unable to connect directly to LyncEdge1’s AV edge interface because hairpin traffic is blocked on the corporate network, and because the 50k port range is blocked inbound on LyncEdge1’s AV public IP.

Bob tries Carol’s candidate list

When Bob receives Carol’s candidate list, he tries to connect directly using this information:

10.10.10.211 (Carol’s Real IP)

Bob is unable to connect directly to Carol’s real IP because he has no route to this private address.

178.64.39.81 (LyncEdge2 AV Edge Public IP)

Bob is unable to connect directly to Carol’s Media Relay because the inbound 50k port range is blocked on the External Corporate Firewall.

Lync Edge AV Media Relay tries candidate list

Bob and Carol have exhausted all attempts to establish a media path directly. There is no direct line of sight over which the connection can be made.

The Media Relay service on the Edge AV server will attempt to relay the connection using the candidate lists provided by each client. The Media Relay service will initiate a TURN Forward request with a source port of UDP 3478 and a destination port of UDP 3478.

LyncEdge1 Media Relay using TURN Forward

LyncEdge1 tries to relay the connection to Carol, for Bob, via the Media Relay service using TURN Forward on UDP 3478.

10.10.10.211 (Carol’s Real IP)

Inbound traffic on UDP 3478 to Carol is blocked on the Internal Corporate Firewall.

178.64.39.81 (LyncEdge2 AV Edge Public IP)

LyncEdge2’s AV edge interface is unreachable because hairpin traffic is blocked on the External Corporate Firewall.

LyncEdge2 Media Relay using TURN Forward

Similarly, LyncEdge2 tries to relay the connection to Bob, for Carol, via the Media Relay service using TURN Forward on UDP 3478.

192.168.1.100 (Bob’s real IP)

Bob’s real IP is non-routable

65.10.10.189 (Bob’s public IP)

Bob’s Home Firewall blocks inbound traffic on this port

178.64.39.80 (LyncEdge1 AV Edge public IP)

LyncEdge1’s AV edge interface is unreachable because hairpin traffic is blocked on the External Corporate Firewall

Findings

 

Given the assumptions stated at the top of this article:

· Lync is deployed in a scaled consolidated topology using NAT

· The 50k port range inbound is blocked

· DNS load balancing

· The External Corporate Firewall is blocking Hairpin traffic

The media path cannot be established, and we can expect the call to fail.

“Call failed due to network issues”

ms-client-diagnostics: 23; reason="Call failed to establish due to a media connectivity failure when one endpoint is internal and the other is remote";CallerMediaDebug="audio:ICEWarn=0x40003e0,LocalSite=65.10.10.189:26654,LocalMR=178.64.39.80:50468,RemoteSite=10.10.10.211:49583,RemoteMR=178.64.39.81:54978,PortRange=1025:65000,LocalMRTCPPort=50468,RemoteMRTCPPort=54978,LocalLocation=1,RemoteLocation=2,FederationType=0"
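When comparing several failed calls, it can help to pull the individual fields out of the CallerMediaDebug string. The following Python sketch does that for the header above. The function name `parse_media_debug` is hypothetical, the parsing rules are inferred only from this one sample, and the meaning of the individual ICEWarn bits is not covered here; the code simply lists which bit positions are set.

```python
import re

# The diagnostic header from the failed call above, with plain ASCII quotes.
header = (
    'ms-client-diagnostics: 23; reason="Call failed to establish due to a media '
    'connectivity failure when one endpoint is internal and the other is remote";'
    'CallerMediaDebug="audio:ICEWarn=0x40003e0,LocalSite=65.10.10.189:26654,'
    'LocalMR=178.64.39.80:50468,RemoteSite=10.10.10.211:49583,'
    'RemoteMR=178.64.39.81:54978,PortRange=1025:65000,LocalMRTCPPort=50468,'
    'RemoteMRTCPPort=54978,LocalLocation=1,RemoteLocation=2,FederationType=0"'
)


def parse_media_debug(value: str) -> dict:
    """Return the key=value pairs found inside CallerMediaDebug="..."."""
    # Normalise typographic quotes that copy/paste from a web page may introduce.
    value = re.sub("[\u201c\u201d\u2033]", '"', value)
    match = re.search(r'CallerMediaDebug="(?:\w+:)?([^"]*)"', value)
    if match is None:
        raise ValueError("no CallerMediaDebug field found")
    fields = {}
    for item in match.group(1).split(","):
        key, sep, val = item.partition("=")
        if sep:
            fields[key.strip()] = val.strip()
    return fields


fields = parse_media_debug(header)
ice_warn = int(fields["ICEWarn"], 16)
set_bits = [bit for bit in range(32) if (ice_warn >> bit) & 1]

print("Local relay :", fields["LocalMR"])
print("Remote relay:", fields["RemoteMR"])
print("ICEWarn bits set:", set_bits)
```

The LocalMR and RemoteMR values are the two AV Edge public IPs from the candidate lists, which confirms that Bob and Carol were allocated relays on different Edge servers.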

There are 2 ways to resolve this issue:

 

1. Open the 50k port range inbound. When Bob tries to complete the media path using Carol’s candidate list, he attempts to connect to Carol’s Media Relay server (LyncEdge2) on a port in the 50k range. If this connection succeeds, the call completes. While this is the easiest method, it is not always preferred, given the number of ports that must be opened.

UPDATE: Opening the 50k port range only helps if the remote user (Bob) can make an outbound TCP connection to LyncEdge2 on a port in the 50k range. This is usually not a problem when Bob works remotely from a home office behind a personal wireless router/firewall. However, when Bob travels to a customer site and connects to its corporate guest WiFi network, outbound ports in the 50k range may be blocked. It is not unheard of for 80 and 443 to be the only ports open outbound from corporate networks, especially guest WiFi networks. Thanks to Thomas Binder for providing this additional information. I highly recommend his presentation from TechEd Europe: “Lync Deep Dive: Edge Media Connectivity with ICE” http://channel9.msdn.com/Events/TechEd/Europe/2012/EXL412. This scenario is discussed about one hour in.

UPDATE 2: Check out the new session from Thomas Binder at the Lync Conference 2014: Edge Media Connectivity in Lync 2013 http://aka.ms/AVEdge.

 

 

2. Allow hairpin traffic on the Corporate Edge Firewall between the Lync Edge servers. When the Media Relay service on the Edge AV server tries the candidate lists, it attempts to connect to the public IP of the opposing Edge server on UDP 3478. UDP 3478 should already be open on the External Corporate Firewall, per “Determining External A/V Firewall and Port Requirements”.
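The two resolutions can be summarized as a simple decision rule. The sketch below is a simplified model of the scenario in this article only (two NATed Edge servers behind DNS load balancing, with Bob and Carol allocated relays on different Edge servers); the `Environment` class and `media_path_available` function are hypothetical names, and real ICE negotiation involves more paths than this model captures.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    inbound_50k_open: bool          # 50k range allowed inbound to the AV Edge public IPs
    remote_outbound_50k_open: bool  # remote user's network allows outbound 50k connections
    hairpin_allowed: bool           # Edge-to-Edge traffic via public IPs (UDP 3478) is permitted


def media_path_available(env: Environment) -> bool:
    """Return True if either resolution described above yields a media path."""
    # Resolution 1: Bob reaches Carol's relay directly on the 50k range.
    via_50k = env.inbound_50k_open and env.remote_outbound_50k_open
    # Resolution 2: the two Edge servers relay to each other via hairpin.
    via_hairpin = env.hairpin_allowed
    return via_50k or via_hairpin


# The failing configuration described in this article.
print(media_path_available(Environment(False, True, False)))   # False
# Resolution 2 applied: hairpin allowed, 50k range still blocked.
print(media_path_available(Environment(False, True, True)))    # True
```

Note that resolution 1 depends on a condition outside the corporate network's control (the remote user's outbound ports), while resolution 2 depends only on the corporate firewall.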

Resolution 2 Implementation:

Based on resolution number 2, the following solution was implemented for a Cisco ASA:

access-list tcp_state_bypass permit tcp host 10.1.0.77 host 10.1.0.78

access-list tcp_state_bypass permit tcp host 10.1.0.78 host 10.1.0.77

access-list tcp_state_bypass permit udp host 10.1.0.77 host 10.1.0.78

access-list tcp_state_bypass permit udp host 10.1.0.78 host 10.1.0.77

class-map tcp_bypass

match access-list tcp_state_bypass

policy-map bypass_policy

class tcp_bypass

set connection advanced-options tcp-state-bypass

static (dmz,dmz) 178.64.39.80 10.1.0.77 netmask 255.255.255.255

static (dmz,dmz) 178.64.39.81 10.1.0.78 netmask 255.255.255.255

Now that traffic is allowed to traverse between the two AV Edge servers’ public IP addresses, the media path is complete and the call is established. Bob maintains a connection with LyncEdge1, while Carol maintains a connection with LyncEdge2. LyncEdge1 uses TURN Forward to relay the media path through LyncEdge2’s public IP address, and LyncEdge2 uses TURN Forward to relay the media path through LyncEdge1’s public IP address.
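Conceptually, the two “static (dmz,dmz)” lines act as a destination lookup for traffic that stays inside the DMZ. The Python sketch below models only that idea, using the addresses from the configuration above; it is an illustration of the mapping, not a description of how the ASA processes packets, and the names `STATIC_DMZ_NAT` and `translate_destination` are invented for this example.

```python
# Public (mapped) address -> real DMZ address, taken from the two static lines above.
STATIC_DMZ_NAT = {
    "178.64.39.80": "10.1.0.77",  # LyncEdge1 AV Edge
    "178.64.39.81": "10.1.0.78",  # LyncEdge2 AV Edge
}


def translate_destination(dst_ip: str) -> str:
    """Return the real DMZ address for a hairpinned destination, if one is mapped."""
    return STATIC_DMZ_NAT.get(dst_ip, dst_ip)


# LyncEdge2 sends a TURN Forward request to LyncEdge1's public IP.
print(translate_destination("178.64.39.80"))  # 10.1.0.77
# Destinations without a static mapping are left unchanged.
print(translate_destination("65.10.10.189"))  # 65.10.10.189
```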

 

2 Replies to “Lync Edge Hairpin Requirement”

  1. Can you explain the commands in the second solution?

    Also, a small question: if the two Edge servers are NATed behind the same firewall and their two real IPs are assigned to the same interface, will this solution still apply?

    1. In solution 2, the top section is related to TCP State Bypass. TAC recommended this setting if different Cisco ASAs are in use for inbound and outbound flows. http://www.cisco.com/en/US/products/ps6120/products_configuration_example09186a0080b2d922.shtml

      More important for Edge server routing are the “static (dmz,dmz)” NAT commands. These essentially NAT traffic that originates in the DMZ and is destined for the other Edge server’s public IP.

      “static (dmz,dmz) 178.64.39.80 10.1.0.77 netmask 255.255.255.255”. Traffic from the dmz to the dmz with a destination IP of 178.64.39.80 will be NAT’d to 10.1.0.77. So when LyncEdge2 sends traffic to 178.64.39.80, the firewall will NAT this to 10.1.0.77 and send the traffic directly to LyncEdge1’s external NIC.

      If the two Edge servers have public IPs assigned to them, these settings will not be necessary, as the traffic will be routed directly to the other Edge server.
