[Authentication] Delay in update of AD msDS-KeyVersionNumber after computer password change via "adcli join"

Stef Walter stefw at gnome.org
Fri Dec 18 08:01:33 PST 2015


On 18.12.2015 16:15, Chris Rutledge wrote:
> Hello,
>
> About 2 months ago we started having issues when using adcli to join our
> Windows AD domain. The symptom we first noticed was not being able to
> log into our stateless HPC compute nodes and messages in the logs
> stating the kvno was mismatched.
>
> As our compute cluster nodes are stateless, every time they are
> rebooted, they rejoin the domain upon boot via "adcli join".
> Historically this has worked great. We did discover we could work
> around the issue by repeating the adcli join command until we finally
> received the latest kvno. The number of attempts would vary from node to
> node – timing and luck I suspect.
>
> Yesterday, I decided to download the latest version of adcli from GitHub
> to debug.
>
> Here is what I have found:
>
> 1) The adcli command is confirmed to connect to any one of the 3
> domain controllers and stay connected to that server throughout the session.
>
> 2) With unmodified versions of adcli, there was a very large chance
> we would get the old kvno value after the password change.
>
>    a. We could see the server we are talking to does not yet have
>    this changed value by observing the msDS-KeyVersionNumber via ldapsearch
>    against all 3 DCs.
>
> 3) Not until I entered a sleep(30) statement in adcli after the
> password update and before we retrieve the new kvno did things start to
> work reliably.
>
> I would understand the need to sleep if adcli would attempt to retrieve
> the updated kvno value from any one of the 3 DCs. However, it is my
> understanding that there is code in there to make sure we talk to only
> the one and the expectation is once the password change has been made
> the kvno value should reflect this – immediately.
>
> Also, if we delete the computer object from the domain, we get an error
> while setting the password the first time we attempt to join. I suspect the
> same timing issue here…the computer object does not exist yet on this
> server.
>
> Based on my testing and observations, this smells like a performance or
> configuration issue on the Windows AD side. Others are not so convinced
> of this and think that perhaps adcli should delay between operations for
> replication to complete.
>
> So I figured I would ask the experts: should adcli delay this retrieval
> of the new kvno, or are we looking at an AD issue? If you too suspect an
> issue with AD, any idea where to begin?

Does adcli 0.8.0 fix the issue?

http://lists.freedesktop.org/archives/authentication/2015-December/000321.html

It's pretty new, but please try it out if you have a chance. In particular:

https://bugs.freedesktop.org/show_bug.cgi?id=91185

A similar case seems to be described here:

https://bugs.freedesktop.org/show_bug.cgi?id=91185#c4
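
While testing, the fixed sleep(30) described above could be swapped for a
bounded poll that returns as soon as the kvno advances. The following is a
minimal Python sketch of that idea only; it is not adcli code and not how
adcli behaves. The read_kvno callable is hypothetical: it stands in for
however you read msDS-KeyVersionNumber from the one DC you are connected to
(for example, a wrapper around ldapsearch). The timeout and interval
defaults are assumptions as well.

```python
import time
from typing import Callable, Optional


def wait_for_kvno(
    read_kvno: Callable[[], Optional[int]],  # hypothetical reader, e.g. an ldapsearch wrapper
    old_kvno: int,
    timeout: float = 30.0,   # assumed upper bound, matching the sleep(30) workaround
    interval: float = 1.0,   # assumed delay between reads
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Poll until the reported kvno exceeds old_kvno, or raise TimeoutError."""
    deadline = clock() + timeout
    while True:
        kvno = read_kvno()
        # Success: the DC now reports a newer key version than before the change.
        if kvno is not None and kvno > old_kvno:
            return kvno
        # Give up once the deadline has passed, reporting the last value seen.
        if clock() >= deadline:
            raise TimeoutError(f"kvno still {kvno!r} after {timeout}s")
        sleep(interval)
```

If the poll times out consistently against the same DC that accepted the
password change, that would be one more data point toward a problem on the
AD side rather than in the client.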

Stef

