Adding a "Cloud Only" SIP Domain in a Skype for Business Hybrid World

March 21st, 2017

You know the old saying, when it rains it pours.  Well, I had 4 different inquiries last week alone on this topic.  I thought I knew how it would work, but figured I needed proof.  So naturally, I took to my lab.

Setup:

Contoso has Skype for Business deployed in a hybrid (split-domain) configuration. 80% of the users are homed On-Prem as they use Enterprise Voice. 20% do not use Enterprise Voice, and these users are homed online. Contoso uses the SIP namespace contoso.com for all users, both On-Prem and Online.

A very common scenario in today’s hybrid world.

Scenario:

A need arises to introduce a new SIP namespace (Fabrikam.com). The desire is to home these users 100% Online.

The natural thought process would be to simply add the validated domain to the O365 tenant, and enable the new users for Skype for Business Online using the new SIP namespace. Meaning, create the new users in AD with a UPN matching user@fabrikam.com, let AADC sync the user to O365, then license the user for Skype for Business.

This would also include creating all the necessary DNS records and pointing them to O365. The rational thought here being that this is a SfB Online ONLY SIP namespace. No On-Prem users will use this namespace. So, let’s set it up as such.
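To make "all the necessary DNS records" concrete, here is a minimal Python sketch that lists the records an Online-only SIP domain would typically need. It is not taken from the lab: the helper name `online_dns_records` is hypothetical, and the targets are the standard Skype for Business Online endpoints that the Office 365 domain setup lists, so verify them against your own tenant's domain settings before relying on them.

```python
from typing import List, NamedTuple, Optional


class DnsRecord(NamedTuple):
    record_type: str            # "CNAME" or "SRV"
    name: str                   # fully qualified record name
    target: str                 # host the record points to
    port: Optional[int] = None  # only meaningful for SRV records


def online_dns_records(sip_domain: str) -> List[DnsRecord]:
    """Return the DNS records usually pointed at SfB Online for one SIP domain.

    Hypothetical helper for illustration; the targets are assumed to be the
    standard SfB Online endpoints and should be checked against the tenant.
    """
    domain = sip_domain.lower().rstrip(".")
    return [
        DnsRecord("CNAME", f"sip.{domain}", "sipdir.online.lync.com"),
        DnsRecord("CNAME", f"lyncdiscover.{domain}", "webdir.online.lync.com"),
        DnsRecord("SRV", f"_sip._tls.{domain}", "sipdir.online.lync.com", 443),
        DnsRecord("SRV", f"_sipfederationtls._tcp.{domain}", "sipfed.online.lync.com", 5061),
    ]


for record in online_dns_records("Fabrikam.com"):
    print(record)
```

Keeping the records as data makes it easy to compare them with what a DNS lookup actually returns for the new namespace.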

Issue:

Once this has been set up, we see our first issue.  Users created in SfB Online using the new SIP domain (Fabrikam.com) cannot see presence for, or IM with, users homed On-Prem (Contoso.com).

Test Users:

User              Location        SIP Domain                      SIP Address
Alice Wonderland  On-Prem         Contoso.com                     awonderland@contoso.com
Peter Parker      Online          Contoso.com                     pparker@contoso.com
Jack Bauer        Online          Fabrikam.com                    jbauer@fabrikam.com
Jeremy Silber     Federated User  Hidden to protect the innocent  Hidden to protect the innocent

Symptoms: (Screenshots Below per Scenario)

  1. On-Prem to Online:
    • Alice Wonderland (Contoso.com – Homed On-Prem) can see presence and initiate IMs with Jack Bauer (Fabrikam.com – Homed Online)
    • Jack Bauer (Fabrikam.com – Homed Online) can receive IMs from Alice Wonderland (Contoso.com – Homed On-Prem) and reply to them, but presence for Alice is not available.
      [Screenshot: Image 1]
  2. Online to On-Prem:
    • Jack Bauer (Fabrikam.com – Homed Online) can sign in to SfB using the new SIP namespace.
    • Jack Bauer (Fabrikam.com – Homed Online) can see presence and initiate IMs with users homed online within the same tenant (both SIP domains) and vice-versa.
    • Jack Bauer (Fabrikam.com – Homed Online) can see presence and initiate IMs with federated domains and vice-versa.
    • Jack Bauer (Fabrikam.com – Homed Online) CANNOT see presence or initiate IMs with Alice Wonderland (Contoso.com – Homed On-Prem).
      [Screenshot]
  3. Online to Online:
    • Peter Parker (Contoso.com – Homed Online) can see presence and IM with everyone.
      [Screenshot]

Troubleshooting:

From within Snooper, we can see a “504 Server Time-out” error when Jack Bauer tries to initiate an IM with Alice Wonderland.

[Screenshots: Snooper trace]

Naturally, my first troubleshooting step was to Google the error: “Cannot route From and To domains in this combination”;cause=“Possible server configuration issue”;summary=“The domain of the message that corresponds to local deployment (internal) is not shared with remote peer.” That search didn’t return anything of value, hence this article.
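When comparing traces, it can help to split a diagnostic string like the one above into named fields. The following is a minimal Python sketch under stated assumptions: `parse_diagnostics` is a hypothetical helper, not part of Snooper or any SfB tooling; it only handles quoted `key="value"` pairs; and it treats a leading unlabeled quoted string as the reason, because that is how the text appears in this post.

```python
import re
from typing import Dict

# Matches an optional key= prefix followed by a double-quoted value.
_FIELD = re.compile(r'(?:([A-Za-z][\w-]*)=)?"([^"]*)"')


def parse_diagnostics(text: str) -> Dict[str, str]:
    """Split a diagnostic string into {field: value}.

    Hypothetical helper for illustration. Curly quotes (as pasted from a web
    page) are normalized to straight quotes first. A quoted value with no
    key is stored under "reason" (an assumption made for this sketch).
    """
    normalized = text.replace("\u201c", '"').replace("\u201d", '"')
    fields: Dict[str, str] = {}
    for key, value in _FIELD.findall(normalized):
        fields[key or "reason"] = value
    return fields


sample = (
    '"Cannot route From and To domains in this combination";'
    'cause="Possible server configuration issue";'
    'summary="The domain of the message that corresponds to local '
    'deployment (internal) is not shared with remote peer."'
)

for name, value in parse_diagnostics(sample).items():
    print(f"{name}: {value}")
```

The summary field is the useful part here: it says the internal domain is "not shared with remote peer", which points at configuration rather than at the client.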

In previous conversations, Microsoft has stated something to the effect that all Online SIP namespaces must also be valid On-Prem SIP domains. Meaning both Contoso.com and Fabrikam.com should be added as valid SIP domains in the Topology Builder of my On-Prem SfB deployment. While the error message is vague and cryptic, it sounds plausible that this could be the issue.

Testing:

To test this theory, I figured that I would need to add Fabrikam.com as an “Additional supported SIP Domain” within the topology builder.

I wanted to test each stage individually to see when exactly this would start working.

  1. Add Fabrikam.com as supported SIP Domain. Publish Topology: No Change
    Before: [screenshot]
    After: [screenshot]
  2. Update internal Front End Certificates: No Change
  3. Restart Front End Service and Access Edge Service after successful replication: No Change
  4. Update Access Edge Certificate: Success

Updating the Access Edge Server certificate to include “sip.fabrikam.com” is the step that made this start working.

This made me think, do I really need Fabrikam.com added as a valid SIP domain in the On-Prem topology, or does it only want the certificate? So, I removed fabrikam.com from the topology builder, but left the certificate in place on the Edge server. What do you know, it still worked! Presence was still available and IM continued to work.

To double check, I put the old certificate, without the sip.fabrikam.com SAN entry, back in place.  Again IM/P broke.  I then re-applied the new cert with the sip.fabrikam.com SAN entry and voila, IM/P started working again.

Updating the Access Edge certificate to include a SAN entry for the new SIP namespace (sip.fabrikam.com) works, without having to update the entire Lync/SfB Topology.   That said, what works isn’t always supported.
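The certificate check that mattered in this test can be sketched in a few lines of Python. This is an illustration only: `missing_sip_sans` is a hypothetical helper, the certificate dictionaries below are made-up stand-ins shaped like the output of Python's `ssl.SSLSocket.getpeercert()`, and the sketch assumes the `sip.<domain>` naming used in this post (it does not handle wildcard entries).

```python
from typing import Any, Dict, Iterable, List


def missing_sip_sans(cert: Dict[str, Any], sip_domains: Iterable[str]) -> List[str]:
    """Return the sip.<domain> names that are absent from the certificate SANs.

    `cert` is expected to look like the dict returned by
    ssl.SSLSocket.getpeercert(), i.e. it carries a "subjectAltName" entry
    made of (type, value) pairs. Hypothetical helper for illustration.
    """
    dns_sans = {
        value.lower()
        for kind, value in cert.get("subjectAltName", ())
        if kind == "DNS"
    }
    expected = [f"sip.{domain.lower()}" for domain in sip_domains]
    return [name for name in expected if name not in dns_sans]


# Made-up certificates standing in for the old and new Access Edge certs.
old_cert = {"subjectAltName": (("DNS", "sip.contoso.com"),)}
new_cert = {
    "subjectAltName": (("DNS", "sip.contoso.com"), ("DNS", "sip.fabrikam.com")),
}
domains = ["contoso.com", "fabrikam.com"]

print(missing_sip_sans(old_cert, domains))  # ['sip.fabrikam.com']
print(missing_sip_sans(new_cert, domains))  # []
```

An empty list corresponds to the working state in the lab; a non-empty list corresponds to the state where IM/P from the Online-only namespace to On-Prem users failed.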

The DNS CNAME and SRV records for fabrikam.com still point to SfB Online, which is exactly what I wanted.

Resolution:

From a support perspective, Microsoft states (although I’m still looking for an official statement) that any SIP domain used in SfB Online must also be a valid SIP domain in the Lync/SfB On-Prem topology, which makes sense.  If you update the topology builder to include the new SIP namespace and re-run the certificate wizard on the Edge servers as you’re supposed to, the new SIP namespace will automatically be included as a new SAN entry.  As this is an expected outcome of adding the new SIP namespace to the topology, this is what Microsoft tested, and therefore supports.

While it’s always my recommendation to stay in a supported scenario, it does seem plausible to just update the Edge server certificates with the new SIP SAN entry, without updating the entire topology.  I’d also bet that updating the certificate really is the only step necessary.   Of course, I will admit that I did not test all functionality, only IM/P in this scenario with a single Edge server.  Further testing may show that other workloads don’t work as expected.

UPDATE 4/5/2017 – While reading through the new Skype Operations Framework (SOF) material, specifically “3 – Design-Cloud PBX, PSTN Conferencing and Client – Design and Migration Document”, I came across Table 38 in the Hybrid Deployment Prerequisite section.

Table 38 – Hybrid Deployment prerequisites (columns: Question, Answer, Comments)

  1. SIP domain(s) in the on-premises Lync Server or Skype for Business Server deployment
    • Verify the list of SIP domain(s) matches the list of Office 365 tenant’s validated domain(s) for Skype for Business Online.
    • If not, plan and document the effort to ensure SIP domain(s) between on-premises and the cloud are in sync as there are impacts to certificates and DNS records.
  2. Office 365 tenant’s validated domain(s) enabled/planned for Skype for Business Online
    • Verify the list of Office 365 tenant’s validated domain(s) for Skype for Business Online matches the list of on-premises SIP domain(s).
    • If not, plan and document the effort to ensure Office 365 tenant’s validated domain(s) for Skype for Business Online are in sync with on-premises SIP domain(s) as there are impacts to external DNS configuration to verify domain ownership.
  3. Details of on-premises Edge server’s Access Edge certificate (issuer, subject name, subject alternative name(s), etc.)
    • Verify the certificate is issued by a public Certificate Authority (CA), or a CA listed in the list of Unified Communications certificate partners.
    • Verify the list of subject alternative name(s) contains access edge FQDN(s) for all SIP domain(s) intended for the hybrid relationship.
    • If not, plan and document the effort to reissue the certificate to include subject alternative name(s) to support all SIP domain(s) in the hybrid relationship.
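The first two prerequisites in Table 38 boil down to a set comparison between the on-premises SIP domains and the tenant's domains used for Skype for Business Online. Here is a minimal Python sketch of that comparison; `compare_sip_domains` is a hypothetical helper, and the two input lists are typed in by hand for this lab's scenario rather than queried from the topology or the tenant.

```python
from typing import Dict, Iterable, List


def compare_sip_domains(on_prem: Iterable[str], online: Iterable[str]) -> Dict[str, List[str]]:
    """Report which SIP domains exist on only one side of the hybrid.

    Hypothetical helper for illustration. Domain names are compared
    case-insensitively and results are sorted for stable output.
    """
    on_prem_set = {d.strip().lower() for d in on_prem}
    online_set = {d.strip().lower() for d in online}
    return {
        "on_prem_only": sorted(on_prem_set - online_set),
        "online_only": sorted(online_set - on_prem_set),
        "shared": sorted(on_prem_set & online_set),
    }


# Hand-entered lists matching the scenario in this post.
report = compare_sip_domains(
    on_prem=["contoso.com"],
    online=["Contoso.com", "Fabrikam.com"],
)
print(report)
# {'on_prem_only': [], 'online_only': ['fabrikam.com'], 'shared': ['contoso.com']}
```

Anything reported under `online_only` is a namespace like Fabrikam.com in this lab: present in the tenant but not in the On-Prem topology, which is the condition Table 38 asks you to plan around.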

While not entirely the same use case, there is also a reference to this in the “Plan for Skype for Business Cloud Connector Edition” documentation https://technet.microsoft.com/en-us/library/mt605227.aspx#BKMK_Certs.  “You will need to add sip.sipdomain.com for every SIP domain and the name of the access Edge pools per domain”.

Further, in discussing this with a colleague at Microsoft, it would seem that there is logic in CCE that requires a SAN entry for “sip.sipdomain.com” on the Access Edge Pool certificate for every SIP domain in the environment.  This is used as part of the authentication/validation check that permits communication from SfB Online to CCE.  It would seem this same logic is used for authentication/validation in a Hybrid deployment as we can see here.

 

Next: in certain scenarios (unsupported ones, of course) I can see where adding the SIP namespace On-Prem would not be possible.  Let’s take the scenario where we have an acquisition: two separate Active Directory forests, with the assumption that they will stay separate for some time.  We can’t share the SIP namespace between two On-Prem Lync/SfB deployments.  Plus, we want to start moving workloads to the cloud.  Well, Azure AD Connect can be set up to sync both forests to the same O365 tenant (supported).  Exchange Hybrid can be configured in both forests to sync to the same O365 tenant (supported).  Why wouldn’t we want Lync/SfB set up in hybrid with the same O365 tenant as well (unsupported)?  I know dual-hybrid in Lync/SfB is not supported, but now I’m curious whether we can make it work.  Can we use the same workaround used here to trick SfB Online into working in this unsupported configuration?  We’re out of time for today, though, so that will have to wait for another post.  Stay tuned!
