Weird DC issue
I've got our primary DC in our office, and I've got a backup DC in a colo site a few miles away (all 2003 servers). Everything's worked great for nearly a year now, but for some reason it seems that our machines are going to the backup DC for stuff instead of the primary. Some of the users will be working away and then when they go to attach an email to our insurance software, they'll get a password prompt that says "Welcome back to <mail server name>" and in the username box it will have "<mail server name>\emmajohnson108016" instead of "djohnson".
I'm suspecting that the machines are timing out trying to get to the backup DC over the slower connection. I can load up ADUC on our Exchange server and it may take a minute or so for it to load, and when it does I see it's using the backup DC. I can try it again and it might load ADUC from the primary DC and it loads nearly instantly. The inter-site transport doesn't have a cost assigned to it. Would changing the cost resolve this issue? Thanks in advance!
[size=-2]Started WGU - BS IT:NDM on 1/1/13, finished 12/31/14
Working on: Waiting on the mailman to bring me a diploma
What's left: Graduation![/size]
Comments
-
it_consultant Member Posts: 1,903
You should run DCDIAG on both DCs and post the results. It sounds like one of your DCs has tombstoned.
-
arwes Member Posts: 633
Sure thing. Here's the output from the local DC:
Domain Controller Diagnosis

Performing initial setup:
   Done gathering initial info.

Doing initial required tests

   Testing server: Monroe\CFICWSDC1
      Starting test: Connectivity
         ......................... CFICWSDC1 passed test Connectivity

Doing primary tests

   Testing server: Monroe\CFICWSDC1
      Starting test: Replications
         ......................... CFICWSDC1 passed test Replications
      Starting test: NCSecDesc
         ......................... CFICWSDC1 passed test NCSecDesc
      Starting test: NetLogons
         ......................... CFICWSDC1 passed test NetLogons
      Starting test: Advertising
         ......................... CFICWSDC1 passed test Advertising
      Starting test: KnowsOfRoleHolders
         ......................... CFICWSDC1 passed test KnowsOfRoleHolders
      Starting test: RidManager
         ......................... CFICWSDC1 passed test RidManager
      Starting test: MachineAccount
         ......................... CFICWSDC1 passed test MachineAccount
      Starting test: Services
         ......................... CFICWSDC1 passed test Services
      Starting test: ObjectsReplicated
         ......................... CFICWSDC1 passed test ObjectsReplicated
      Starting test: frssysvol
         ......................... CFICWSDC1 passed test frssysvol
      Starting test: frsevent
         ......................... CFICWSDC1 passed test frsevent
      Starting test: kccevent
         ......................... CFICWSDC1 passed test kccevent
      Starting test: systemlog
         ......................... CFICWSDC1 passed test systemlog
      Starting test: VerifyReferences
         ......................... CFICWSDC1 passed test VerifyReferences

   Running partition tests on : ForestDnsZones
      Starting test: CrossRefValidation
         ......................... ForestDnsZones passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... ForestDnsZones passed test CheckSDRefDom

   Running partition tests on : DomainDnsZones
      Starting test: CrossRefValidation
         ......................... DomainDnsZones passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... DomainDnsZones passed test CheckSDRefDom

   Running partition tests on : Schema
      Starting test: CrossRefValidation
         ......................... Schema passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... Schema passed test CheckSDRefDom

   Running partition tests on : Configuration
      Starting test: CrossRefValidation
         ......................... Configuration passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... Configuration passed test CheckSDRefDom

   Running partition tests on : cficdata
      Starting test: CrossRefValidation
         ......................... cficdata passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... cficdata passed test CheckSDRefDom

   Running enterprise tests on : cficdata.net
      Starting test: Intersite
         ......................... cficdata.net passed test Intersite
      Starting test: FsmoCheck
         ......................... cficdata.net passed test FsmoCheck
And here's the output from the remote DC:

Domain Controller Diagnosis

Performing initial setup:
   Done gathering initial info.

Doing initial required tests

   Testing server: Circle-Drive\CFICWSDC2
      Starting test: Connectivity
         ......................... CFICWSDC2 passed test Connectivity

Doing primary tests

   Testing server: Circle-Drive\CFICWSDC2
      Starting test: Replications
         ......................... CFICWSDC2 passed test Replications
      Starting test: NCSecDesc
         ......................... CFICWSDC2 passed test NCSecDesc
      Starting test: NetLogons
         ......................... CFICWSDC2 passed test NetLogons
      Starting test: Advertising
         ......................... CFICWSDC2 passed test Advertising
      Starting test: KnowsOfRoleHolders
         ......................... CFICWSDC2 passed test KnowsOfRoleHolders
      Starting test: RidManager
         ......................... CFICWSDC2 passed test RidManager
      Starting test: MachineAccount
         ......................... CFICWSDC2 passed test MachineAccount
      Starting test: Services
         ......................... CFICWSDC2 passed test Services
      Starting test: ObjectsReplicated
         ......................... CFICWSDC2 passed test ObjectsReplicated
      Starting test: frssysvol
         ......................... CFICWSDC2 passed test frssysvol
      Starting test: frsevent
         ......................... CFICWSDC2 passed test frsevent
      Starting test: kccevent
         ......................... CFICWSDC2 passed test kccevent
      Starting test: systemlog
         ......................... CFICWSDC2 passed test systemlog
      Starting test: VerifyReferences
         ......................... CFICWSDC2 passed test VerifyReferences

   Running partition tests on : ForestDnsZones
      Starting test: CrossRefValidation
         ......................... ForestDnsZones passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... ForestDnsZones passed test CheckSDRefDom

   Running partition tests on : DomainDnsZones
      Starting test: CrossRefValidation
         ......................... DomainDnsZones passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... DomainDnsZones passed test CheckSDRefDom

   Running partition tests on : Schema
      Starting test: CrossRefValidation
         ......................... Schema passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... Schema passed test CheckSDRefDom

   Running partition tests on : Configuration
      Starting test: CrossRefValidation
         ......................... Configuration passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... Configuration passed test CheckSDRefDom

   Running partition tests on : cficdata
      Starting test: CrossRefValidation
         ......................... cficdata passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... cficdata passed test CheckSDRefDom

   Running enterprise tests on : cficdata.net
      Starting test: Intersite
         ......................... cficdata.net passed test Intersite
      Starting test: FsmoCheck
         ......................... cficdata.net passed test FsmoCheck
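Output like the two runs above is easy to misread by eye, so a small script can tally the results per server or partition. This is only a rough sketch: it assumes the dcdiag text has been captured as a string and that every result appears as "<name> passed test <test>" or "<name> failed test <test>". The function name `summarize_dcdiag` and the sample string (including its failed Replications line) are made up for illustration and are not taken from the runs above.

```python
import re
from collections import defaultdict

# Matches result lines such as "CFICWSDC1 passed test Connectivity".
RESULT = re.compile(r"(\S+) (passed|failed) test (\S+)")


def summarize_dcdiag(text):
    """Group dcdiag results by target (server, partition or domain)."""
    summary = defaultdict(lambda: {"passed": [], "failed": []})
    for target, outcome, test in RESULT.findall(text):
        summary[target][outcome].append(test)
    return dict(summary)


# Hypothetical sample; the "failed" entry is invented to show the failure path.
sample = (
    "Starting test: Connectivity ......................... "
    "CFICWSDC1 passed test Connectivity "
    "Starting test: Replications ......................... "
    "CFICWSDC1 failed test Replications"
)

for target, results in summarize_dcdiag(sample).items():
    print(target, "passed:", len(results["passed"]), "failed:", results["failed"])
```

The regex works on both the flattened and the line-broken form of the output, since it only looks at the "passed test" / "failed test" phrases.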
Everything looks good to me. DC2 did tombstone last year due to a prolonged site link issue, and when it was resolved I had to demote it, clear out any AD info on it and then join it back to the domain and promote it. However I've got no signs of tombstoning in the event logs on either server. Apparently our site link is less than stable again though. We were using Neverfail to replicate our SQL and file servers and it performed really poorly. I spoke with my boss earlier and he's working with the colo site to see about getting a more reliable connection to them.
Since the site link is performing so poorly anyway, I may go ahead and just demote the server since it wouldn't do us much good as a backup for the time being. It would probably be the quickest fix anyway.
-
undomiel Member Posts: 2,818
Do you have sites set up properly? Create your subnets in Active Directory Sites and Services and assign them to separate sites. It sounds like both DCs are in the same site, because otherwise clients would grab a DC from their local site first. Also check whether replication is working properly. What version of Exchange is this? 2007/2010? This does sound like an autodiscover or certificate issue, so I would recommend checking that all of your URLs are configured correctly and that the proper authentication is set up in IIS. Also make sure that your certificate is configured with the correct URLs.
Jumping on the IT blogging band wagon -- http://www.jefferyland.com/
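To make the subnet-to-site point concrete, here is a rough sketch of the lookup idea: a client's IP address is matched against the defined subnets, the most specific match names its site, and a client that matches no subnet has no site and can end up on any DC. The site names come from the dcdiag output earlier in the thread; the subnet ranges and the function name `site_for_client` are assumptions made up for illustration, not the poster's real network.

```python
import ipaddress

# Hypothetical subnet-to-site mapping; the address ranges are assumptions.
SUBNET_TO_SITE = {
    ipaddress.ip_network("192.168.1.0/24"): "Monroe",
    ipaddress.ip_network("10.20.0.0/24"): "Circle-Drive",
}


def site_for_client(client_ip, subnet_to_site=SUBNET_TO_SITE):
    """Return the site of the most specific matching subnet, or None."""
    address = ipaddress.ip_address(client_ip)
    matches = [net for net in subnet_to_site if address in net]
    if not matches:
        return None  # no subnet defined: the client has no site preference
    best = max(matches, key=lambda net: net.prefixlen)
    return subnet_to_site[best]


print(site_for_client("192.168.1.50"))
print(site_for_client("172.16.0.1"))
```

If the office subnet is missing from the mapping, the second case applies to every workstation, which would fit the symptom of machines landing on the remote DC.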
-
arwes Member Posts: 633
The fun part is I didn't set them up, but I'll be reworking all this pretty soon. However, I think I found a quick fix that should get the users off my back long enough to get the site stuff figured out.
On DC1, I created a DWORD entry called LdapSrvWeight with a value of 50 (decimal) in HKLM\SYSTEM\CurrentControlSet\Services\Netlogon\Parameters. Before doing this, ADUC on the Exchange server would almost always go to DC2. After doing this, it goes to DC1 every time. I'll find out tomorrow morning if this works well enough for the users that are having issues.
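For context on what LdapSrvWeight feeds into: it sets the weight field of the DC's SRV records in DNS, and clients that follow RFC 2782 first take the records with the lowest priority value, then pick among them at random in proportion to weight. Below is a rough sketch of that selection rule only, not of the actual DC locator code. The record values are assumptions: 50 is the value from the post, and 100 for DC2 assumes it still has the documented default, which the thread does not confirm.

```python
import random
from collections import Counter


def pick_srv_target(records, rng=random):
    """Pick a host from (host, priority, weight) tuples, RFC 2782 style."""
    lowest = min(priority for _, priority, _ in records)
    candidates = [(h, w) for h, p, w in records if p == lowest]
    total = sum(w for _, w in candidates)
    if total == 0:
        return rng.choice(candidates)[0]
    threshold = rng.uniform(0, total)
    running = 0
    for host, weight in candidates:
        running += weight
        if threshold <= running:
            return host
    return candidates[-1][0]


# Assumed records: same priority, DC1 weight 50, DC2 weight 100 (default).
records = [("CFICWSDC1", 0, 50), ("CFICWSDC2", 0, 100)]
rng = random.Random(0)
counts = Counter(pick_srv_target(records, rng) for _ in range(10000))
print(counts)
```

Under this rule a higher weight is chosen more often within the same priority, so a weight of 50 against 100 would be expected to send roughly a third of lookups to DC1 rather than all of them. It may be worth checking the registered SRV records (for example with nslookup against _ldap._tcp.dc._msdcs.cficdata.net) to see what priority and weight each DC is actually advertising.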