Issue at Doculand Jan, March 2019
The error in the Logs (Event 4012, Error 9061)
Error: 9061 (The replicated folder has been offline for too long.)
Replicated Folder Name: SYSVOL Share
----
The DFS Replication service stopped replication on the folder with the following local path: C:\Windows\SYSVOL\domain. This server has been disconnected from other partners for 242 days, which is longer than the time allowed by the MaxOfflineTimeInDays parameter (60). DFS Replication considers the data in this folder to be stale, and this server will not replicate the folder until this error is corrected.
To resume replication of this folder, use the DFS Management snap-in to remove this server from the replication group, and then add it back to the group. This causes the server to perform an initial synchronization task, which replaces the stale data with fresh data from other members of the replication group.
Additional Information: Error: 9061 (The replicated folder has been offline for too long.)
Replicated Folder Name: SYSVOL Share
Replicated Folder ID: 0EEB20A9-F80B-4141-9E39-04D91576BFBA
Replication Group Name: Domain System Volume
Replication Group ID: 4194E2B2-028F-4BB3-835C-1DCEA8F0A1D4
Member ID: 50D333C0-40B9-445B-BB3E-8D58F9302FB3
What I did
Executed this on both DCs to raise the threshold from 60 days to 300 and thus allow replication to continue:
wmic.exe /namespace:\\root\microsoftdfs path DfsrMachineConfig set MaxOfflineTimeInDays=300
After 2 days, reset it to the default with:
wmic.exe /namespace:\\root\microsoftdfs path DfsrMachineConfig set MaxOfflineTimeInDays=60
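A minimal Python sketch of how I could pull the two numbers out of the Event 4012 text and build the matching wmic line. The regular expression only assumes the wording of the message quoted above; the function names are my own and not part of any Microsoft tool.

```python
import re

# Pattern based on the wording of the Event 4012 message quoted in these notes.
_OFFLINE_RE = re.compile(
    r"disconnected from other partners\s+for (\d+) days.*?"
    r"MaxOfflineTimeInDays parameter \((\d+)\)",
    re.DOTALL,
)


def parse_offline_days(event_text):
    """Return (days_offline, max_offline_days) from the event text, or None."""
    match = _OFFLINE_RE.search(event_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def wmic_set_threshold(days):
    """Build the wmic command line used in these notes for a given threshold."""
    return (
        r"wmic.exe /namespace:\\root\microsoftdfs "
        f"path DfsrMachineConfig set MaxOfflineTimeInDays={days}"
    )
```

For the message above, parse_offline_days gives the pair (242, 60), so any temporary threshold has to be larger than 242; I picked 300.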
ISSUE Jan 2018
What I did
Followed "Seize the FSMO roles"
Followed "Removing A Domain Controller Server Manually"
Followed "Metadata cleanup (the easy way)"
Executed the "netdom tests"
Executed the "repadmin tests"
Executed "dcdiag test"
NOTE THAT I HAVE A FAILURE ON DFSR
todo
- on all DCs: either disable NICs that are not in use, or just un-check the "Register this connection's address in DNS" checkbox on them (e.g. if the NIC is used for iSCSI)
- on all DCs: if something goes wrong, and for troubleshooting only, point them all to the same DNS server, then run ipconfig /registerdns and restart the Netlogon service
- on all DCs: make sure that these services are running: Server, Workstation, Netlogon
- configure an external NTP server only on the DC that holds the PDC emulator role; other DCs, member servers and clients should sync their time from the domain hierarchy, not from an external NTP source
- run dcdiag and look for any line that displays "failed test"
- I found out that both my DCs are Global Catalog servers. Should they be? I think so, because a few good sysadmins at Spiceworks say it's good for some use cases with huge domains, and none of them says it's bad in any case.
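For the dcdiag item in the list above, a small Python sketch that keeps only the lines mentioning a failed test. It assumes the dcdiag output has already been saved as text; the function name is my own.

```python
def failed_tests(dcdiag_output):
    """Return the lines of saved dcdiag output that mention a failed test."""
    return [
        line.strip()
        for line in dcdiag_output.splitlines()
        if "failed test" in line.lower()
    ]
```

An empty list means no line contained the words "failed test"; it does not prove that dcdiag itself ran to completion.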
Seize the FSMO roles
If you're curious see "What are the FSMO Roles". TLDR: there are 5 FSMO roles, and each one is held by exactly one DC at a time: 1. Schema master, 2. Domain naming master, 3. RID master, 4. PDC emulator, 5. Infrastructure master. The first two exist once per forest, the other three once per domain. For completeness I must note that any DC can also host the Global Catalog (GC). It's not an FSMO role because more than one DC can host the GC of the same domain.
If you demote a DC that holds any of the FSMO roles, it passes them to another DC (this is called a transfer of the role). If you cannot demote it, you must seize (take by force) the role.
In a single-domain forest, all FSMO roles are initially held by the first domain controller.
When you are carrying out a planned move, it is called transferring the role.
It is extremely important to remember two things:
- It's OK for a DC hosting a role to go down for a short time.
- If any of the Schema Master, Domain Naming Master, or RID Master roles is seized from a domain controller, that domain controller must never be allowed to come back online.
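To keep the seizure rule above in one place, here is a small Python sketch. The forest/domain scope column is my own addition (it is standard Active Directory behaviour, but not stated in these notes), and the names FsmoRole and must_stay_offline are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FsmoRole:
    name: str
    scope: str                  # "forest" or "domain" (my addition)
    notes_forbid_return: bool   # True if the rule above bars the old holder


FSMO_ROLES = (
    FsmoRole("Schema master", "forest", True),
    FsmoRole("Domain naming master", "forest", True),
    FsmoRole("RID master", "domain", True),
    FsmoRole("PDC emulator", "domain", False),
    FsmoRole("Infrastructure master", "domain", False),
)


def must_stay_offline(seized_role_names):
    """True if any seized role is one whose old holder must never come back."""
    blocked = {r.name.lower() for r in FSMO_ROLES if r.notes_forbid_return}
    return any(name.lower() in blocked for name in seized_role_names)
```

A False result only means the rule above does not apply; it is not a statement that bringing the old DC back is safe.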
Log on to the domain controller that you are assigning FSMO roles to.
Type "ntdsutil" at the command line, and then press ENTER. The prompt changes to reflect the current level of the menu.
At the ntdsutil prompt, type "roles", and then press ENTER. The prompt changes to fsmo maintenance.
Type "connections", and then press ENTER.
Type "connect to server <servername>", and then press ENTER, where servername is the name of the domain controller that you want to assign the FSMO role to.
At the server connections prompt, type "q", and then press ENTER.
For a list of roles that you can seize, type "?" at the fsmo maintenance prompt, and then press ENTER.
Type "seize <role>", and then press ENTER, where role is the role that you want to seize.
At the fsmo maintenance prompt, type "q", and then press ENTER.
Type "q", and then press ENTER to quit the Ntdsutil utility.
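The sequence above can be written down as a checklist generator. This Python sketch only builds the list of inputs to type; it does not run ntdsutil. The role text is passed through unchanged, so it must match whatever "?" prints at the fsmo maintenance prompt (I use "PDC" in the example as an assumption); the function name is my own.

```python
def seize_commands(server_name, role):
    """Return the inputs, in order, for seizing one FSMO role on server_name."""
    if not server_name or not role:
        raise ValueError("server_name and role are required")
    return [
        "ntdsutil",
        "roles",
        "connections",
        f"connect to server {server_name}",
        "q",              # leave the server connections prompt
        f"seize {role}",
        "q",              # leave the fsmo maintenance prompt
        "q",              # quit ntdsutil
    ]


if __name__ == "__main__":
    for step in seize_commands("pdcsrv", "PDC"):
        print(step)
```

Printing the steps before touching the DC makes it easier to double-check the server name, since a seizure cannot be undone casually.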
Do not put the Infrastructure master role on the same domain controller as a Global Catalog (GC) server, unless every DC in the domain is a GC. If the Infrastructure master runs on a global catalog server while other DCs in the domain are not GCs, it stops updating object information. (The GC contains a searchable, partial representation of every object in every domain in a multidomain forest.)
https://blogs.technet.microsoft.com/canitpro/2016/02/17/step-by-step-removing-a-domain-controller-server-manually/
What about the Global Catalog? Is any server a global catalog?
Removing A Domain Controller Server Manually
(from https://blogs.technet.microsoft.com/canitpro/2016/02/17/step-by-step-removing-a-domain-controller-server-manually/)
Log in to DC server as Domain/Enterprise administrator and navigate to Server Manager > Tools > Active Directory Users and Computers
Expand the Domain > Domain Controllers
Right-click the DC server that needs to be removed manually and click Delete
In the next dialog box, click Yes to confirm
In the next dialog box, select "This Domain Controller is permanently offline and can no longer be demoted using the Active Directory Domain Services Installation Wizard (DCPROMO)" and click Delete
If the domain controller is a global catalog server, click Yes in the next window to continue with the deletion
If the domain controller holds any FSMO roles, click OK in the next window to move them to a domain controller that is available
Step 2: Cleaning up the DC server instance from the Active Directory Sites and Services
Go to Server manager > Tools > Active Directory Sites and Services
Expand Sites and go to the server that needs to be removed
Right-click it and click Delete
In the next window, click Yes to confirm
Metadata cleanup (the easy way)
(based on https://chinnychukwudozie.com/2014/01/27/using-ntdsutil-metada-cleanup-to-remove-a-failedoffline-domain-controller-object/)
- ntdsutil
- metadata cleanup
- connections
- connect to server pdcsrv
- quit
- select operation target
(Entering this mode lets me select the sites, domains and servers I intend to work with.)
Overview of things to do:
- seize the FSMO roles on the healthy DC
- transfer all other services, like DNS/GC/DHCP etc., to it
- remove the crashed DC's objects and references from AD, so that the domain knows this DC no longer exists
Metadata cleanup is the process required to remove from the domain a failed DC that can't be demoted gracefully. Does metadata cleanup remove all of its entries from AD? The answer is NO: it doesn't remove all the records, and in a few places the leftover objects must be cleaned up manually. Why? DNS has aging/scavenging settings that remove entries once they become stale. When a host record is removed, it is not actually deleted; its dNSTombstoned attribute is set to true and the record is deleted later, at the configured aging/scavenging interval, so until then the record still exists in AD.
The places to check after either a normal or a forced demotion of a DC are:
- Each and every subfolder inside the _msdcs folder in DNS
- Name Servers tab in DNS
- Host records in DNS
- Server object and its NTDS Settings in AD Sites & Services
Note: once you perform the metadata cleanup of a DC, don't immediately reuse the hostname/IP of the failed DC for a new DC. The changes must first replicate to all other domain controllers in the forest, so wait for at least one replication cycle to complete. If you have only a few DCs and good bandwidth, you can force the replication with repadmin /syncall /APed (the flags are case-sensitive: uppercase P pushes changes, lowercase p pauses after every message).
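The repadmin /syncall switches are single, case-sensitive letters, which is easy to get wrong when copying them around. A small Python sketch that spells a switch string out; the meanings are as I remember them from the repadmin /syncall help text and cover only the letters used in these notes, and the function name is my own.

```python
# Case-sensitive meanings, from the repadmin /syncall help text (subset only).
SYNCALL_FLAGS = {
    "a": "abort if any server is unavailable",
    "A": "sync all naming contexts held by the home server",
    "d": "identify servers by distinguished name in messages",
    "e": "enterprise: cross site boundaries",
    "p": "pause for possible user abort after every message",
    "P": "push changes outward from the home server",
    "q": "quiet mode: suppress callback messages",
}


def explain_syncall_flags(flag_string):
    """Return [(letter, meaning), ...] for a switch string such as '/AdePq'."""
    letters = flag_string.lstrip("/")
    unknown = [c for c in letters if c not in SYNCALL_FLAGS]
    if unknown:
        raise ValueError(f"flags not in this table: {unknown}")
    return [(c, SYNCALL_FLAGS[c]) for c in letters]
```

Running it on both "/AdePq" and a variant with a lowercase p shows the difference immediately: one pushes changes, the other pauses after every message.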
Related link for Metadata cleanup in windows 2008
TESTS
netdom tests
PS > netdom query /domain:doculand dc
List of domain controllers with accounts in the domain:
PDCSRV
DC2
PS > netdom query /domain:doculand pdc
Primary domain controller for the domain:
PDCSRV
PS > netdom query /domain:doculand fsmo
Schema master pdcsrv.doculand.local
Domain naming master pdcsrv.doculand.local
PDC pdcsrv.doculand.local
RID pool manager pdcsrv.doculand.local
Infrastructure master pdcsrv.doculand.local
The command completed successfully.
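A short Python sketch for checking the "netdom query fsmo" output automatically. It relies only on the five row labels visible in the output above; the function names are my own.

```python
# Row labels exactly as they appear in the netdom output in these notes.
ROLE_LABELS = (
    "Schema master",
    "Domain naming master",
    "PDC",
    "RID pool manager",
    "Infrastructure master",
)


def parse_fsmo(output):
    """Map each role label to the host name printed after it."""
    holders = {}
    for raw in output.splitlines():
        line = raw.strip()
        for label in ROLE_LABELS:
            rest = line[len(label):]
            # Require whitespace after the label so "PDCSRV" is not read as "PDC".
            if line.startswith(label) and rest[:1].isspace():
                holders[label] = rest.strip()
    return holders


def missing_roles(holders):
    """Return the labels for which no holder was found."""
    return [label for label in ROLE_LABELS if label not in holders]
```

If missing_roles returns anything, either the output format differs from what I pasted here or a role holder really is unknown, and both are worth a closer look.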
repadmin tests
PS > repadmin /replsum
Replication Summary Start Time: 2018-01-30 01:01:11
Source DSA          largest delta    fails/total %%   error
 DC2                      04m:04s    0 /   5    0
 PDCSRV                   04m:20s    0 /   5    0
Destination DSA     largest delta    fails/total %%   error
 DC2                      04m:20s    0 /   5    0
 PDCSRV                   04m:04s    0 /   5    0
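To spot replication failures without reading the table by eye, a Python sketch that parses the /replsum rows. It assumes each DSA row sits on one line, as repadmin prints it (name, largest delta, fails / total, percentage), and that the delta contains no spaces; the function names are my own.

```python
import re

# One row: <dsa> <largest delta> <fails> / <total> <percent>
_ROW = re.compile(r"^\s*(\S+)\s+(\S+)\s+(\d+)\s*/\s*(\d+)\s+(\d+)\s*$")


def parse_replsum(output):
    """Return one dict per DSA row, tagged with the section it came from."""
    rows = []
    section = None
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Source DSA"):
            section = "source"
        elif stripped.startswith("Destination DSA"):
            section = "destination"
        else:
            match = _ROW.match(line)
            if match and section:
                rows.append({
                    "section": section,
                    "dsa": match.group(1),
                    "largest_delta": match.group(2),
                    "fails": int(match.group(3)),
                    "total": int(match.group(4)),
                })
    return rows


def failing(rows):
    """Return only the rows that report at least one failure."""
    return [row for row in rows if row["fails"] > 0]
```

For the run above, every row has 0 failures out of 5, so failing returns an empty list.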
PS > repadmin /showreps
Default-First-Site-Name\PDCSRV
DSA Options: IS_GC
Site Options: (none)
DSA object GUID: ac3968d7-7e1b-4483-92e8-899b6e2449aa
DSA invocationID: 4cc1cb82-2f89-478c-abe8-fd5cc3890683
INBOUND NEIGHBORS
DC=doculand,DC=local
    Default-First-Site-Name\DC2 via RPC
        DSA object GUID: 37f00268-a782-4441-9857-6bb7a82ca881
        Last attempt @ 2018-01-30 00:57:09 was successful.
CN=Configuration,DC=doculand,DC=local
    Default-First-Site-Name\DC2 via RPC
        DSA object GUID: 37f00268-a782-4441-9857-6bb7a82ca881
        Last attempt @ 2018-01-30 01:03:30 was successful.
CN=Schema,CN=Configuration,DC=doculand,DC=local
    Default-First-Site-Name\DC2 via RPC
        DSA object GUID: 37f00268-a782-4441-9857-6bb7a82ca881
        Last attempt @ 2018-01-30 00:57:07 was successful.
DC=DomainDnsZones,DC=doculand,DC=local
    Default-First-Site-Name\DC2 via RPC
        DSA object GUID: 37f00268-a782-4441-9857-6bb7a82ca881
        Last attempt @ 2018-01-30 00:57:15 was successful.
DC=ForestDnsZones,DC=doculand,DC=local
    Default-First-Site-Name\DC2 via RPC
        DSA object GUID: 37f00268-a782-4441-9857-6bb7a82ca881
        Last attempt @ 2018-01-30 00:57:12 was successful.
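A Python sketch for summarizing the "Last attempt" lines of the /showreps output. It only recognizes the "was successful" wording seen above and treats anything else after the timestamp as not successful, so it makes no assumption about how failures are worded; the function name is my own.

```python
import re
from datetime import datetime

# Timestamp format as printed in the output in these notes.
_ATTEMPT = re.compile(
    r"Last attempt @ (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
    r"(?:\s+(was successful))?"
)


def last_attempts(output):
    """Return [(timestamp, succeeded), ...] for every 'Last attempt' entry."""
    results = []
    for match in _ATTEMPT.finditer(output):
        when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        results.append((when, match.group(2) is not None))
    return results
```

Because the pattern allows any whitespace between the timestamp and "was successful", it also works when the status wraps onto the next line.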
Errors I saw
event log
Error 1/26/2018 7:31:04 AM NETLOGON 5719 None
****These Errors started at 12/19/2017 4:13:06 PM
This computer was not able to set up a secure session with a domain controller in domain DOCULAND due to the following:
There are currently no logon servers available to service the logon request.
This may lead to authentication problems. Make sure that this computer is connected to the network. If the problem persists, please contact your domain administrator.
ADDITIONAL INFO
If this computer is a domain controller for the specified domain, it sets up the secure session to the primary domain controller emulator in the specified domain. Otherwise, this computer sets up the secure session to any domain controller in the specified domain.
Error 1/26/2018 7:27:16 AM GroupPolicy (Microsoft-Windows-GroupPolicy) 1055 None
The processing of Group Policy failed. Windows could not resolve the computer name. This could be caused by one of more of the following:
a) Name Resolution failure on the current domain controller.
b) Active Directory Replication Latency (an account created on another domain controller has not replicated to the current domain controller).
Error 1/25/2018 11:46:32 PM NETLOGON 5513 None
The computer KMC8000-RIP tried to connect to the server \\PDCSRV using the trust relationship established by the DOCULAND domain. However, the computer lost the correct security identifier (SID) when the domain was reconfigured. Reestablish the trust relationship.
The computer XGII-7567 tried to connect to the server \\PDCSRV using the trust relationship established by the DOCULAND domain. However, the computer lost the correct security identifier (SID) when the domain was reconfigured. Reestablish the trust relationship.
What are the FSMO Roles
Earlier versions of Windows (Windows NT) used a single-master model: one DC performed all the important roles. Active Directory splits this into multiple roles and adds the ability to transfer each role to any domain controller (DC) in the enterprise. Because the roles of the Active Directory controller are not bound to a single host, we use the term Flexible Single Master Operations (FSMO). Currently in Windows there are five FSMO roles:
- Schema master
- Domain naming master
- RID master
- PDC emulator
- Infrastructure master
The schema master is the DC responsible for performing updates to the directory schema (that is, the schema naming context or cn=schema,cn=configuration,dc=<domain>).
The domain naming master FSMO role holder is the DC responsible for making changes to the forest-wide domain name space of the directory (that is, the Partitions\Configuration naming context or CN=Partitions,CN=Configuration,DC=<domain>).
When a DC creates a security principal object such as a user or group, it attaches a unique Security ID (SID), which consists of a domain-wide part (the same for all objects of the domain) and a relative ID (RID) that is unique for each object. Each Windows DC in a domain gets the pool of RIDs that it is allowed to assign from the RID master.
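The SID layout described above (domain-wide part plus RID) can be shown with a few lines of Python. The SID in the usage note is an illustrative value, not one from this domain, and the function name is my own.

```python
def split_sid(sid):
    """Split a SID string into (domain_part, rid)."""
    head, sep, tail = sid.rpartition("-")
    if not sep or not head.startswith("S-") or not tail.isdigit():
        raise ValueError(f"not a SID with a RID: {sid!r}")
    return head, int(tail)
```

For example, split_sid("S-1-5-21-1004336348-1177238915-682003330-512") separates the shared domain part "S-1-5-21-1004336348-1177238915-682003330" from the RID 512. Two objects of the same domain differ only in that last number, which is why each DC needs its own pool of RIDs.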
The PDC emulator has multiple responsibilities: a) it is the time source that all hosts in the domain synchronize with; b) password changes performed by other DCs in the domain are replicated preferentially to the PDC emulator; c) authentication failures that occur at a given DC in a domain are forwarded to the PDC emulator before a failure message is reported to the user; d) it processes account lockouts; e) it supports old clients that only work with Windows NT DCs.
The infrastructure master is the DC responsible for updating references to objects in other domains (their SID and distinguished name).
Other Notes
What's special about the SYSVOL & NETLOGON replication
Unlike custom DFSR replicated folders, SYSVOL is intentionally protected from any editing through its management interfaces to prevent accidents. As a result, if you want to force the non-authoritative synchronization of SYSVOL on a domain controller, you cannot use the DFS Management snap-in (Dfsmgmt.msc) or the Dfsradmin.exe command-line tool.
What is Authoritative and Non-Authoritative Restore
Non-authoritative restore: replaces all the data on one DC with the data from the other domain controllers. You run it on the server that has bad data.
Authoritative restore: you select one DC as the authoritative one (perhaps after restoring a known good backup of the AD on that DC), and then make all other DCs match it. Authoritative restore must be your LAST option.
This is good: http://www.rebeladmin.com/2017/08/non-authoritative-authoritative-sysvol-restore-dfs-replication/
This is official: https://support.microsoft.com/en-us/help/2218556/how-to-force-an-authoritative-and-non-authoritative-synchronization-fo
When it says "Force Active Directory replication throughout the domain", use this command on the "healthy" DC: repadmin /syncall /AdePq
DFS Replication: How to troubleshoot missing SYSVOL and Netlogon shares
https://support.microsoft.com/en-ca/help/2958414/dfs-replication-how-to-troubleshoot-missing-sysvol-and-netlogon-shares
Old processes (pre-2008) you should NEVER use after Windows 2008
FRS versus DFSR
Windows 2000 Server and Windows Server 2003 use the File Replication Service (FRS) to replicate the SYSVOL folder content to other domain controllers. Starting with Windows Server 2008, FRS is deprecated in favor of DFS Replication (DFSR) for SYSVOL replication.
D2/D4 trick for OLD (FRS) servers without SYSVOL NETLOGON shares
HKLM\System\CurrentControlSet\Services\NtFrs\Parameters\Backup/Restore\Process at Startup\BurFlags
(D2 = non-authoritative restore, D4 = authoritative restore)