In a Microsoft Active Directory environment, authentication is critical. Windows domain members use Kerberos authentication, and if it isn't working properly then the domain member likely isn't either. Most of the time Kerberos just works, but when it doesn't, you need to be prepared to put it in its place.

First, troubleshooting is easier if you understand what Kerberos is and how it works. Kerberos takes its name from Greek mythology; you may also know it by its Latin spelling, Cerberus. Kerberos is the three-headed dog that guards the entrance to the underworld. The protocol, developed at MIT in the 1980s, similarly employs three 'heads'.


Figure 1 Cerberus by William Blake

In the Windows world, the three 'heads' are the client, a server and a trusted third party. The latter is a resource that both the client and server trust, which in Windows will be a domain controller. Here's a simplified version of how this works.

Alice is at a party and wants to talk to Bob, who is rather protective of his privacy. Fortunately, Alice and Bob both trust Charlie. Alice enters the room, gives Charlie the secret handshake and asks for a pass she can use later to get introductions. Charlie recognizes Alice and gives her a time-sensitive pass that he has signed.

After a drink or two of 'courage' Alice decides to talk to Bob. She goes back to Charlie, shows him the pass he provided earlier (Charlie has short-term memory problems, apparently) and asks for an introduction ticket she can use to talk to Bob. Charlie obliges with another time-sensitive document he has signed, along with an encrypted secret that only he and Bob know.

Alice goes to Bob and offers this ticket. Bob looks at the ticket and sees that it is still valid, that it has been signed by Charlie and that it carries their secret code, so he knows it came from Charlie. Any friend of Charlie's is a friend of Bob's, so Bob accepts Alice's offer to talk. If their conversation runs long, Bob might start getting suspicious, so he'll politely ask Alice to check in with Charlie again for another ticket. It is also possible for Alice to ask Bob to prove his identity, and Bob can oblige by returning a secret handshake.

In the Windows world Charlie is the domain controller. Without going into all the nitty-gritty details of shared secret keys, encryption and protocols, when a user authenticates to the domain controller, she receives a special Ticket Granting Ticket (TGT). When the client wants to communicate with another domain member, it presents the TGT to the domain controller and asks for a service ticket to that server. The service ticket is then passed to the member server, which verifies the ticket data and, if all is well, accepts a client-server session.
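
The three-step exchange described above can be modelled in a few lines of Python. This is only a toy sketch: the key values, function names and the numeric clock are made up for illustration, and it uses an HMAC signature where real Kerberos encrypts tickets and uses session keys. It does show why the member server can trust a ticket without ever contacting the domain controller itself.

```python
import hashlib
import hmac
import json

# Toy long-term secrets (hypothetical values). In a real domain these are
# derived from account passwords.
KRBTGT_KEY = b"kdc-only-secret"                 # known only to the KDC
SERVICE_KEYS = {"bob": b"bob-and-kdc-secret"}   # shared by the KDC and each service


def seal(key, payload):
    """Serialize a payload and attach an HMAC so tampering is detectable."""
    body = json.dumps(payload, sort_keys=True)
    mac = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "mac": mac}


def unseal(key, ticket, now):
    """Return the payload if the MAC matches and the ticket has not expired."""
    expected = hmac.new(key, ticket["body"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, ticket["mac"]):
        raise ValueError("ticket was not issued by the KDC")
    payload = json.loads(ticket["body"])
    if now >= payload["expires"]:
        raise ValueError("ticket has expired")
    return payload


def issue_tgt(user, now, lifetime=600):
    """Step 1: after the user authenticates, the KDC hands out a TGT."""
    return seal(KRBTGT_KEY, {"client": user, "expires": now + lifetime})


def issue_service_ticket(tgt, service, now, lifetime=600):
    """Step 2: the KDC checks its own TGT, then issues a ticket for one service."""
    client = unseal(KRBTGT_KEY, tgt, now)["client"]
    payload = {"client": client, "service": service, "expires": now + lifetime}
    return seal(SERVICE_KEYS[service], payload)


def accept(service, ticket, now):
    """Step 3: the service validates the ticket with the key it shares with the KDC."""
    payload = unseal(SERVICE_KEYS[service], ticket, now)
    if payload["service"] != service:
        raise ValueError("ticket is for a different service")
    return payload["client"]


tgt = issue_tgt("alice", now=0)
ticket = issue_service_ticket(tgt, "bob", now=60)
print(accept("bob", ticket, now=120))  # prints: alice
```

Notice that both failure modes in the sketch, a bad signature and an expired ticket, depend on shared keys and on everyone agreeing what time it is. Those are the same things the troubleshooting steps below keep coming back to.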

The primary service on the domain controller that manages all of this is the Key Distribution Center, or KDC. Figure 2 provides a high-level summary of the authentication process.


Figure 2 Kerberos in a Nutshell

As you can see, there are a lot of moving parts, which means plenty of places where something could go wrong. Fortunately, if you have a well-designed and well-maintained Active Directory infrastructure, Kerberos-related problems should be rare. But if you suspect an authentication problem, here are some steps you can take.

First off, make sure the KDC service is running on your domain controllers. You can manually check using the Services management console, or use PowerShell. Here's how I can check using the Microsoft Active Directory module.

PS C:\> import-module ActiveDirectory
PS C:\> Get-ADComputer -filter * -SearchBase "OU=Domain Controllers,DC=globomantics,DC=local" | foreach { get-service KDC -ComputerName $_.Name} | Select Status,Name,Machinename
Status  Name  MachineName
------  ----  -----------
Running KDC   CHI-DC01
Running KDC   CHI-DC02

You should also verify from the client that the SRV records for the Kerberos service are correct. Open a command prompt and start NSLookup in interactive mode. Set the record type to SRV and then query for the record _kerberos._tcp.dc._msdcs. followed by your domain's DNS name. Here's what I run in my Globomantics domain.

C:\>nslookup
Default Server:  chi-dc01.globomantics.local
Address:  172.16.20.1
> set type=SRV
> _kerberos._tcp.dc._msdcs.globomantics.local
Server:  chi-dc01.globomantics.local
Address:  172.16.20.1
_kerberos._tcp.dc._msdcs.globomantics.local     SRV service location:
          priority       = 0
          weight         = 100
          port           = 88
          svr hostname   = chi-dc01.globomantics.local
_kerberos._tcp.dc._msdcs.globomantics.local     SRV service location:
          priority       = 0
          weight         = 100
          port           = 88
          svr hostname   = chi-dc02.globomantics.local
chi-dc01.globomantics.local     internet address = 172.16.20.1
chi-dc02.globomantics.local     internet address = 172.16.20.2
> exit
C:\>

Those are my domain controllers at those IP addresses so there are no problems here.
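
If you need to repeat this check across many clients, eyeballing NSLookup output gets old fast. Here is a rough Python sketch that parses saved NSLookup text like the output above and flags any record not pointing at port 88. The function names are my own invention, and the parser assumes the output layout shown in this article, so treat it as a starting point rather than a finished tool.

```python
import re

# Sample text copied from the NSLookup session shown in this article.
SAMPLE = """\
_kerberos._tcp.dc._msdcs.globomantics.local     SRV service location:
          priority       = 0
          weight         = 100
          port           = 88
          svr hostname   = chi-dc01.globomantics.local
_kerberos._tcp.dc._msdcs.globomantics.local     SRV service location:
          priority       = 0
          weight         = 100
          port           = 88
          svr hostname   = chi-dc02.globomantics.local
chi-dc01.globomantics.local     internet address = 172.16.20.1
"""

FIELD = re.compile(r"\s*(priority|weight|port|svr hostname)\s*=\s*(\S+)")


def kerberos_srv_name(domain):
    """Build the SRV record name to query for a given DNS domain name."""
    return "_kerberos._tcp.dc._msdcs." + domain.strip(".").lower()


def parse_srv_records(text):
    """Turn NSLookup SRV output into a list of dictionaries, one per record."""
    records = []
    for line in text.splitlines():
        if "SRV service location" in line:
            records.append({"name": line.split()[0]})
            continue
        match = FIELD.match(line)
        if match and records:
            key, value = match.groups()
            records[-1][key] = int(value) if value.isdigit() else value
    return records


def unexpected_ports(records, expected=88):
    """Return the records that do not advertise the standard Kerberos port."""
    return [r for r in records if r.get("port") != expected]


records = parse_srv_records(SAMPLE)
print(kerberos_srv_name("globomantics.local"))
print([r["svr hostname"] for r in records])
print(unexpected_ports(records))
```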

Because Kerberos relies heavily on time stamping, it is imperative that domain members be configured with an authoritative time source and that everyone is in sync. By default Kerberos allows five minutes of leeway, but if clocks are skewed more than that, then the tickets issued from the KDC won't be worth the bits they are printed on.
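
To make the time rule concrete, here is a small Python sketch of the comparison involved. It assumes the Windows default tolerance of five minutes, which can be changed through Kerberos policy, and the two timestamps are hypothetical values rather than anything read from a real machine.

```python
from datetime import datetime, timedelta

# Windows default tolerance; your domain's Kerberos policy may differ.
MAX_SKEW = timedelta(minutes=5)


def clock_skew(client_time, server_time):
    """Absolute difference between two clocks, regardless of which is ahead."""
    return abs(client_time - server_time)


def within_tolerance(client_time, server_time, tolerance=MAX_SKEW):
    """True if the two clocks are close enough for tickets to be honoured."""
    return clock_skew(client_time, server_time) <= tolerance


# Hypothetical readings: the server's clock is seven minutes ahead of the client.
client = datetime(2011, 11, 16, 22, 55, 39)
server = client + timedelta(minutes=7)

print(clock_skew(client, server))        # prints: 0:07:00
print(within_tolerance(client, server))  # prints: False
```

The check is symmetric on purpose. A client that is ahead of the server is just as much of a problem as one that is behind.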

The next tool to become familiar with is NLTEST.EXE. This is a multi-purpose command line tool for querying domain and domain controller configurations. You can use it to find a domain controller that offers the KDC service. NLTEST.EXE should be part of Windows 7. Open a command prompt and type something like this:

C:\>nltest /dsgetdc:globomantics /kdc
DC: \\CHI-DC02
Address: \\172.16.20.2
Dom Guid: 44e3c936-5c8f-40cd-af67-f846c184cc8c
Dom Name: GLOBOMANTICS
Forest Name: GLOBOMANTICS.local
Dc Site Name: Default-First-Site-Name
Our Site Name: Default-First-Site-Name
Flags: GC DS LDAP KDC TIMESERV WRITABLE DNS_FOREST CLOSE_SITE FULL_SECRET WS
The command completed successfully
C:\>

Or you can have it query DNS for the KDC server records.

C:\>nltest /dnsgetdc:globomantics.local /kdc
List of DCs in pseudo-random order taking into account SRV priorities and weights:
Non-Site specific:
   chi-dc01.globomantics.local  172.16.20.1
   chi-dc02.globomantics.local  172.16.20.2
The command completed successfully
C:\>

This should give you the same results as the NSLookup example I showed earlier. Just remember to use the fully qualified domain name.

The big dog of domain testing is the command line tool DCDIAG.EXE. For our purposes we can use it to verify the proper services are running on a domain controller, which includes the KDC. This utility should also be found on your Windows 7 desktop.

C:\>dcdiag /test:services /s:chi-dc01
Directory Server Diagnosis
Performing initial setup:
* Identified AD Forest.
Done gathering initial info.
Doing initial required tests
Testing server: Default-First-Site-Name\CHI-DC01
Starting test: Connectivity
......................... CHI-DC01 passed test Connectivity
Doing primary tests
Testing server: Default-First-Site-Name\CHI-DC01
Starting test: Services
kdc Service is stopped on [CHI-DC01]
......................... CHI-DC01 failed test Services
Running partition tests on : ForestDnsZones
Running partition tests on : DomainDnsZones
Running partition tests on : Schema
Running partition tests on : Configuration
Running partition tests on : GLOBOMANTICS
Running enterprise tests on : GLOBOMANTICS.local
C:\>

I used it to query a specific domain controller and I can see that I have a problem with the KDC service, which I will have to look into. DCDIAG can also test every domain controller in the site, but you still need to specify at least one server to start from.

C:\>dcdiag /test:services /test:dns /s:chi-dc01 /a /v /f:dcdiag-results.txt

With this command I decided to also run some DNS tests while I'm at it, get verbose details and save the results to a text file.
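
Once the results are in a text file, the verbose output can be long enough that a failure is easy to miss. The Python sketch below pulls out just the pass/fail summary lines. It assumes the summary format shown in the DCDIAG output above, and the sample text is pasted inline so the example is self-contained. In practice you would read the contents of dcdiag-results.txt instead.

```python
import re

# Summary lines copied from the DCDIAG run shown in this article.
SAMPLE = """\
Starting test: Connectivity
......................... CHI-DC01 passed test Connectivity
Doing primary tests
Testing server: Default-First-Site-Name\\CHI-DC01
Starting test: Services
kdc Service is stopped on [CHI-DC01]
......................... CHI-DC01 failed test Services
"""

# A run of dots, then "<target> passed|failed test <TestName>".
RESULT_LINE = re.compile(r"^\.*\s*(\S+) (passed|failed) test (\S+)\s*$")


def summarize_dcdiag(text):
    """Return (target, test name, passed?) for every summary line found."""
    results = []
    for line in text.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:
            target, outcome, test = match.groups()
            results.append((target, test, outcome == "passed"))
    return results


def failures(text):
    """Return only the (target, test name) pairs that failed."""
    return [(target, test) for target, test, ok in summarize_dcdiag(text) if not ok]


print(failures(SAMPLE))  # prints: [('CHI-DC01', 'Services')]
```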

The last domain-specific task is to double-check that the service account for the KDC is still there and is disabled. You can either use the PowerShell module:

PS C:\> get-aduser krbtgt
DistinguishedName : CN=krbtgt,CN=Users,DC=GLOBOMANTICS,DC=local
Enabled           : False
GivenName         :
Name              : krbtgt
ObjectClass       : user
ObjectGUID        : 87a43158-929d-4cbb-b4a9-e4a2bf83a9fb
SamAccountName    : krbtgt
SID               : S-1-5-21-2552845031-2197025230-307725880-502
Surname           :
UserPrincipalName :

Or the command line tool DSGet.

C:\>dsquery user -name krbtgt | dsget user -L
dn: CN=krbtgt,CN=Users,DC=GLOBOMANTICS,DC=local
desc: Key Distribution Center Service Account
samid: krbtgt
dsget succeeded
C:\>

It would be odd if anything happened to this account and if it did I would expect you to have problems with the KDC service. Still, it doesn't hurt to be thorough.

On the client side, the best tool you have in your troubleshooting tool kit is another command line tool called KLIST.EXE. The first way to use it is to retrieve information about the user's TGT. The command doesn't require any special privileges, nor do you want it to. You want to see what things look like from the user's perspective.

On a Windows 7 box, Jack Frost is logged on, opens a command prompt and runs KLIST.EXE with the TGT parameter.

C:\Users\jfrost>klist tgt
Current LogonId is 0:0x49f68d
Cached TGT:
ServiceName        : krbtgt
TargetName (SPN)   : krbtgt
ClientName         : jfrost
DomainName         : GLOBOMANTICS.LOCAL
TargetDomainName   : GLOBOMANTICS.LOCAL
AltTargetDomainName: GLOBOMANTICS.LOCAL
Ticket Flags       : 0x40e00000 -> forwardable renewable initial pre_authent
Session Key        : KeyType 0x12 - AES-256-CTS-HMAC-SHA1-96
                   : KeyLength 32 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
StartTime          : 12/20/2011 15:29:06 (local)
EndTime            : 12/21/2011 1:29:06 (local)
RenewUntil         : 12/27/2011 15:29:06 (local)
TimeSkew           :    0:00 minute(s)
EncodedTicket      : (size: 1099)
0000  61 82 04 47 30 82 04 43:a0 03 02 01 05 a1 14 1b  a..G0..C........
0010  12 47 4c 4f 42 4f 4d 41:4e 54 49 43 53 2e 4c 4f  .GLOBOMANTICS.LO
0020  43 41 4c a2 27 30 25 a0:03 02 01 02 a1 1e 30 1c  CAL.'0%.......0.
0030  1b 06 6b 72 62 74 67 74:1b 12 47 4c 4f 42 4f 4d  ..krbtgt..GLOBOM
0040  41 4e 54 49 43 53 2e 4c:4f 43 41 4c a3 82 03 fb  ANTICS.LOCAL....
0050  30 82 03 f7 a0 03 02 01:12 a1 03 02 01 03 a2 82  0...............
0060  03 e9 04 82 03 e5 00 4d:6d e8 f8 e8 06 80 3b f0  .......Mm.....;.
0070  eb 9d 3a 5e 9b 8c b6 c1:46 b4 64 62 ad 72 28 74  ..:^....F.db.r(t

I've truncated the output because there's not much else to see after the first few lines of the encoded ticket itself. But I do get some useful information about when the TGT was issued, how long it is good for and if there are any time discrepancies. This ticket looks pretty good. On the other hand, if you get output like this:

C:\>klist tgt
Current LogonId is 0:0xef2d17
Error calling API LsaCallAuthenticationPackage (Ticket Granting Ticket substatus): 1312
klist failed with 0x8009030e/-2146893042: No credentials are available in the security package

Then you have a problem. Or, as is the case in this example, the computer I ran this on does not belong to a domain. I would expect a similar result with a domain member experiencing trust relationship issues with the domain. In any event, anything I try to do that utilizes Kerberos from this desktop is going to fail.
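
Going back to the successful TGT output from Jack Frost's session, two of its fields are worth a closer look: the Ticket Flags value and the three timestamps. The Python sketch below decodes the flag bits using the definitions in RFC 4120 (only a subset is listed) and works out the ticket lifetime from the times KLIST printed. The helper names are mine, and the date parsing assumes the month/day/year format shown in that output, which may differ on systems with other regional settings.

```python
from datetime import datetime

# Ticket flag bits from RFC 4120, written as masks on the 32-bit value
# that KLIST prints. Only a subset of the defined flags is listed.
TICKET_FLAGS = {
    0x40000000: "forwardable",
    0x20000000: "forwarded",
    0x10000000: "proxiable",
    0x08000000: "proxy",
    0x04000000: "may_postdate",
    0x02000000: "postdated",
    0x01000000: "invalid",
    0x00800000: "renewable",
    0x00400000: "initial",
    0x00200000: "pre_authent",
    0x00100000: "hw_authent",
    0x00040000: "ok_as_delegate",
}


def decode_flags(value):
    """Return the names of the flags set in a Ticket Flags value."""
    return [name for bit, name in TICKET_FLAGS.items() if value & bit]


def parse_klist_time(text):
    """Parse a timestamp such as '12/20/2011 15:29:06 (local)'."""
    cleaned = text.replace("(local)", "").strip()
    return datetime.strptime(cleaned, "%m/%d/%Y %H:%M:%S")


# Values copied from the KLIST output in this article.
start = parse_klist_time("12/20/2011 15:29:06 (local)")
end = parse_klist_time("12/21/2011 1:29:06 (local)")
renew_until = parse_klist_time("12/27/2011 15:29:06 (local)")

print(decode_flags(0x40E00000))
print(end - start)          # prints: 10:00:00
print(renew_until - start)  # prints: 7 days, 0:00:00
```

The decoded names match what KLIST printed after the arrow, and the ticket turns out to be good for ten hours and renewable for seven days. A ticket whose StartTime is in the future from the server's point of view, or whose EndTime has already passed, is a strong hint that you are back to a clock problem.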

Assuming the TGT is OK, I can run KLIST.EXE without any parameters and get a list of all the current Kerberos tickets.

C:\Users\jfrost>klist

Figure 3 shows the output.


Figure 3 KLIST Kerberos Tickets

It is also possible to wipe out all the tickets and start from scratch. This should happen if you log off and back on again, or you can purge the Kerberos ticket cache using KLIST.EXE:

C:\>klist purge
Current LogonId is 0:0x36786
Deleting all tickets:
Ticket(s) purged!
C:\>klist
Current LogonId is 0:0x36786
Cached Tickets: (0)

As you begin accessing network resources, you'll automatically acquire new tickets.

By now, I hope you are realizing that if you are experiencing a Kerberos-related problem, the most likely culprits are system time and name resolution. In fact, if you query your members and domain controllers for Kerberos errors in the event log, you might find an entry like this:

PS C:\> get-eventlog system -source Kerberos -ComputerName chi-dc01 | select TimeGenerated,EntryType,Message | format-list
TimeGenerated : 11/16/2011 10:55:39 PM
EntryType     : Error
Message       : The kerberos client received a KRB_AP_ERR_TKT_NYV error
                from the server chi-dc02$. This indicates that the ticket
                used against that server is not yet valid (in relationship
                to that server time). Contact your system administrator to
                make sure the client and server times are in sync, and that
                the KDC in realm GLOBOMANTICS.LOCAL is in sync with the KDC
                in the client realm.

The bottom line is not to neglect searching event logs for Kerberos related errors.
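
If you export those event messages to text, a few lines of Python can do a first-pass triage by spotting the Kerberos error name in each message. The error names below come from the Kerberos specification (RFC 4120), but the mapping to a likely cause is my own rough rule of thumb and the function name is hypothetical, so use it as a hint about where to look first rather than as a diagnosis.

```python
import re

# Rough rule-of-thumb mapping from error name to where to start looking.
LIKELY_CAUSE = {
    "KRB_AP_ERR_TKT_NYV": "time: ticket is not yet valid on the server's clock",
    "KRB_AP_ERR_SKEW": "time: clock skew between client and server is too great",
    "KRB_AP_ERR_TKT_EXPIRED": "time: ticket has expired",
    "KDC_ERR_S_PRINCIPAL_UNKNOWN": "naming: the KDC could not find the service principal",
    "KRB_AP_ERR_MODIFIED": "naming: ticket could not be decrypted by the target, check DNS and SPNs",
}

ERROR_NAME = re.compile(r"\b(KRB_[A-Z_]+|KDC_ERR_[A-Z_]+)\b")

# Message text copied from the event log entry shown in this article.
SAMPLE = (
    "The kerberos client received a KRB_AP_ERR_TKT_NYV error from the server "
    "chi-dc02$. This indicates that the ticket used against that server is not "
    "yet valid (in relationship to that server time)."
)


def triage(message):
    """Return (error name, likely cause) for each Kerberos error in a message."""
    return [
        (name, LIKELY_CAUSE.get(name, "unclassified"))
        for name in ERROR_NAME.findall(message)
    ]


for name, cause in triage(SAMPLE):
    print(name, "->", cause)
```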

Kerberos authentication is widely used in Windows networks. Most of the time it works just fine with very little effort on your part. But when it stops working, you'll know because Kerberos' bite can be nasty! When that happens, use these tools and techniques to isolate and resolve the underlying cause and tame the beast.