I had pre-provisioned five Exchange objects: two for the servers meant to hold the HUB/CAS roles, two for the CCR cluster nodes and one for the CMS. The delegated local admin had installed the HUB/CAS servers with no issues, but hit a problem after installing the mailbox role on the first node: the CMS creation failed with the following message.
Clustered Mailbox Server ......................... FAILED
The computer account 'Server' was created on the domain controller
'\\fsmoholder.domain.suffix', but has not replicated to the desired domain
controller (localdc.domain.suffix) after waiting approximately 60 seconds
. Please wait for the account to replicate and re-run setup /newcms.
The Exchange Server Setup operation did not complete. For more information, visit http://support.microsoft.com and enter the Error ID.
Exchange Server setup encountered an error.
The local admin tried a few things to discover what the problem was and resolve it himself, with no success, so he passed the problem back to me.
Waiting for replication, as the error message suggests, doesn’t solve the issue. So I worked through the obvious possible causes first:
- Confirming that the service account for the cluster had full control of the CMS computer object in AD.
- Manually re-creating the Network Name resource in Cluster Manager, waiting for a replication cycle, and then running setup /newcms again with all the required switches.
- Checking that TCP Chimney was disabled. It can cause timeouts when talking to domain controllers, especially on Exchange boxes.
- Removing all traces of Exchange from the nodes (ensuring the v8.0 registry hive is removed is often overlooked), removing the provisioned objects from AD and going through the motions of installing again.
After a few hours of troubleshooting, and hitting the same error numerous times, I logged a call with MS. They went through a very similar troubleshooting process to the one I’d been through and reached the same point.
I should explain that this Exchange installation was sitting in an AD site with a 15-minute replication delay from the DC holding all the FSMO roles.
Microsoft eventually helped us get this resolved after a day of troubleshooting.
The problem here lies with the Server 2003 Cluster Service, not Exchange. When a Network Name resource is created, the Cluster Service insists on going to the PDC Emulator to create or take control of the computer object. This means that when something then tries to use the computer object on a local DC, the account gets locked out, because the password set for the object on the PDC Emulator has not yet replicated to the local domain controller. The CMS creation then fails.
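To make the timing problem concrete, here is a toy Python model of the explanation above. It is only an illustration: the DomainController class, the replicate function and the password values are all made up for this sketch, and it does not reflect how AD or the Cluster Service are actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class DomainController:
    """Hypothetical stand-in for a DC: just a name and a password table."""
    name: str
    passwords: dict = field(default_factory=dict)  # account name -> password

    def authenticate(self, account, password):
        return self.passwords.get(account) == password


def replicate(source, target):
    """Copy every account password from source to target (one replication cycle)."""
    target.passwords.update(source.passwords)


pdc = DomainController("fsmoholder")
local = DomainController("localdc")

# The pre-provisioned computer object has already replicated everywhere.
pdc.passwords["ExchangeCMSName$"] = "old"
replicate(pdc, local)

# The Cluster Service resets the password on the PDC Emulator only.
pdc.passwords["ExchangeCMSName$"] = "new"

# Before replication the local DC still holds the old password, so this fails.
before = local.authenticate("ExchangeCMSName$", "new")

# After a replication cycle the same check succeeds.
replicate(pdc, local)
after = local.authenticate("ExchangeCMSName$", "new")
```

In this model, setup fails because it checks the local DC during the window between the password change and the next replication cycle.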
When our AD was a lot smaller, with hardly any users, we had moved the PDC emulator to the local site to resolve the issue. This wasn’t an option any more, with upwards of 40,000 user accounts and active migrations taking place.
The solution was quite simple. Essentially, we removed the CMS Resource Group from the Cluster, pre-created a new one, along with the Network Name Resource and the IP Resource, and waited for an AD Replication Cycle.
We then added the /DomainController switch to the setup command we were using to create the CMS.
Setup.com /newcms /DomainController localdc.dom.suff /cmsname:ExchangeCMSName /cmsipaddress:10.0.0.10 /CMSSharedStorage /CMSDataPath:"M:\Storage Groups"
The reason we did this was to ensure we knew which local DC Exchange was going to use. It didn’t change the fact that the Cluster Service was still attempting to use the computer account on the FSMO holder.
We then kicked off the command, but that’s not the end of it. As soon as it was running, we had to monitor the CMS computer account on the DC we specified in the setup.com command above, continually refreshing its status. At some point during setup the computer account gets disabled, sometimes more than once, and it needs to be enabled again as soon as possible. For us it was disabled only once. Once we had re-enabled the account, the command to provision the CMS worked fine and we could watch the other resources being created in Cluster Manager. All fixed!
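We did the refresh-and-enable step by hand, but it could probably be scripted. The following is a minimal, untested sketch of that idea in Python. It assumes the dsget and dsmod command-line tools are available on the machine running it and that the account running it is allowed to modify the computer object. The function names, the polling interval and the example DN are all hypothetical.

```python
import subprocess
import time


def run(cmd):
    """Run a command and return its stdout as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def is_disabled(dsget_output):
    """Parse `dsget computer <DN> -disabled` output, whose value row is yes or no."""
    lines = [line.strip().lower() for line in dsget_output.splitlines() if line.strip()]
    return "yes" in lines


def watch_and_enable(computer_dn, dc, polls, interval=2.0, runner=run, sleep=time.sleep):
    """Poll the account on one DC and re-enable it whenever it shows as disabled.

    Returns how many times the account had to be re-enabled.
    """
    enabled_count = 0
    for _ in range(polls):
        out = runner(["dsget", "computer", computer_dn, "-disabled", "-s", dc])
        if is_disabled(out):
            runner(["dsmod", "computer", computer_dn, "-disabled", "no", "-s", dc])
            enabled_count += 1
        sleep(interval)
    return enabled_count


# Example call (hypothetical DN; start it just before launching Setup.com):
# watch_and_enable("CN=ExchangeCMSName,OU=Exchange,DC=dom,DC=suff",
#                  "localdc.dom.suff", polls=300)
```

The -s argument points both tools at the same local DC that was passed to setup, since that is the copy of the account that needs watching.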
I should point out that this problem doesn’t exist in Server 2008, as the Cluster Service there is a little more intelligent.