[Authentication] Delay in update of AD msDS-KeyVersionNumber after computer password change via "adcli join"
Chris Rutledge
crutledge at renci.org
Fri Dec 18 08:27:40 PST 2015
Hello Stef,
That bug was one of the first documented issues we came across when this started, and it looked like a match for ours. I went back to it yesterday, saw that a change had been checked in for it, and did a git clone.
It did not help in our case.
-Chris
-----Original Message-----
From: Authentication [mailto:authentication-bounces at lists.freedesktop.org] On Behalf Of Stef Walter
Sent: Friday, December 18, 2015 11:02 AM
To: Cross-desktop authentication and single sign-on
Subject: Re: [Authentication] Delay in update of AD msDS-KeyVersionNumber after computer password change via "adcli join"
On 18.12.2015 16:15, Chris Rutledge wrote:
> Hello,
>
>
>
> About 2 months ago we started having issues when using adcli to join
> our Windows AD domain. The symptom we first noticed was not being able
> to log into our stateless HPC compute nodes and messages in the logs
> stating the kvno was mismatched.
>
>
>
> As our compute cluster nodes are stateless, every time they are
> rebooted, they rejoin the domain upon boot via "adcli join".
> Historically this has worked great. We did discover we could work
> around the issue by repeating the adcli join command until we finally
> received the latest kvno. The number of attempts would vary from node
> to node – timing and luck I suspect.
>
>
>
> Yesterday, I decided to download the latest version of adcli from
> GitHub to debug.
>
>
>
> Here is what I have found:
>
>
>
> 1) The adcli command is confirmed to connect to any one of the 3
> domain controllers and stay connected to that server throughout the session.
>
> 2) With unmodified versions of adcli, there was a very large chance
> we would get the old kvno value after the password change.
>
> a. We could see the server we are talking to does not yet have
> this changed value by observing the msDS-KeyVersionNumber via
> ldapsearch against all 3 DCs.
>
> 3) Not until I entered a sleep(30) statement in adcli after the
> password update and before we retrieve the new kvno did things start
> to work reliably.
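
The per-DC check described in the findings above can be sketched as follows. This is only an illustration, assuming the text output of an ldapsearch run against each DC has already been captured as a string; the helper names (parse_kvno, lagging_dcs) and the DC labels are hypothetical and are not part of adcli.

```python
import re

def parse_kvno(ldif_text):
    """Return msDS-KeyVersionNumber from ldapsearch-style output, or None if absent."""
    match = re.search(
        r"^msDS-KeyVersionNumber:\s*(\d+)\s*$",
        ldif_text,
        re.MULTILINE | re.IGNORECASE,
    )
    return int(match.group(1)) if match else None

def lagging_dcs(kvno_by_dc):
    """Return the DCs whose kvno is missing or lower than the highest one seen."""
    known = [k for k in kvno_by_dc.values() if k is not None]
    if not known:
        return []
    newest = max(known)
    # A DC with no value (e.g. the computer object is not there yet) counts as lagging.
    return sorted(dc for dc, k in kvno_by_dc.items() if k is None or k < newest)

# Hypothetical captured values, one per DC.
print(lagging_dcs({"dc1": 7, "dc2": 6, "dc3": 7}))
```

If the DC that adcli stayed connected to shows up in this list right after the password change, that matches the symptom reported here.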
>
>
>
> I would understand the need to sleep if adcli would attempt to
> retrieve the updated kvno value from any one of the 3 DCs. However, it
> is my understanding that there is code in there to make sure we talk
> to only the one and the expectation is once the password change has
> been made the kvno value should reflect this – immediately.
>
>
>
> Also, if we delete the computer object from the domain, we get an error
> setting the password the first time we attempt to join. I suspect the
> same timing issue here…the computer object does not exist yet on this
> server.
>
>
>
> Based on my testing and observations, this smells like a performance
> or configuration issue on the Windows AD side. Others are not so
> convinced of this and think that perhaps adcli should delay between
> operations for replication to complete.
>
>
>
> So I figured I would ask the experts: should adcli delay this
> retrieval of the new kvno, or are we looking at an AD issue? If you
> too suspect an issue with AD, any idea where to begin?
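
For comparison with the fixed sleep(30) workaround mentioned above, a bounded poll could look like the sketch below. It is not how adcli behaves; wait_for_kvno and the fetch_kvno callable are hypothetical, and the sketch assumes the kvno increases after a successful password change.

```python
import time

def wait_for_kvno(fetch_kvno, old_kvno, timeout=30.0, interval=1.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll fetch_kvno() until it returns a value greater than old_kvno.

    Returns the new kvno, or None if the timeout expires first.
    fetch_kvno is any callable that reads msDS-KeyVersionNumber from the
    DC in use and returns an int (or None if the attribute is unavailable).
    """
    deadline = clock() + timeout
    while True:
        kvno = fetch_kvno()
        if kvno is not None and kvno > old_kvno:
            return kvno
        if clock() >= deadline:
            return None
        sleep(interval)

# Simulated reads: the DC reports the old value twice, then the new one.
reads = iter([3, 3, 4])
print(wait_for_kvno(lambda: next(reads), old_kvno=3, interval=0))
```

A poll like this returns as soon as the DC reports the new value and gives up after the timeout, so it only masks the delay; it does not explain why the DC that accepted the password change reports the old kvno.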
Does adcli 0.8.0 fix the issue?
http://lists.freedesktop.org/archives/authentication/2015-December/000321.html
It's pretty new, but please try it out if you have a chance. In particular, see this fix:
https://bugs.freedesktop.org/show_bug.cgi?id=91185
Similar case seems to be described here:
https://bugs.freedesktop.org/show_bug.cgi?id=91185#c4
Stef