[systemd-devel] Antw: [EXT] Re: [dm-devel] RFC: one more time: SCSI device identification
Ewan D. Milne
emilne at redhat.com
Tue Apr 27 20:04:19 UTC 2021
On Tue, 2021-04-27 at 12:52 +0200, Ulrich Windl wrote:
> > > > Hannes Reinecke <hare at suse.de> schrieb am 27.04.2021 um 10:21
> > > > in Nachricht
>
> <2a6903e4-ff2b-67d5-e772-6971db8448fb at suse.de>:
> > On 4/27/21 10:10 AM, Martin Wilck wrote:
> > > On Tue, 2021‑04‑27 at 13:48 +1000, Erwin van Londen wrote:
> > > > >
> > > > > Wrt 1), we can only hope that it's the case. But 2) and 3)
> > > > > need work,
> > > > > afaics.
> > > > >
> > > >
> > > > In my view the WWID should never change.
> > >
> > > In an ideal world, perhaps not. But in the dm‑multipath realm, we
> > > know
> > > that WWID changes can happen with certain storage arrays. See
> > >
https://listman.redhat.com/archives/dm‑devel/2021‑February/msg00116.html
> > >
> > > and follow‑ups, for example.
> > >
> >
> > And it's actually something which might happen quite easily.
> > The storage array can unmap a LUN, delete it, create a new one, and
> > map
> > that one into the same LUN number than the old one.
> > If we didn't do I/O during that interval upon the next I/O we will
> > be
> > getting the dreaded 'Power‑On/Reset' sense code.
> > _And nothing else_, due to the arcane rules for sense code
> > generation in
> > SAM.
> > But we end up with a completely different device.
> >
> > The only way out of it is to do a rescan for every POR sense code,
> > and
> > disable the device eg via DID_NO_CONNECT whenever we find that the
> > identification has changed. We already have a copy of the original
> > VPD
> > page 0x83 at hand, so that should be reasonably easy.
>
> I don't know the depth of the SCSI or FC protocol, but storage
> systems
> typically signal such events, maybe either via some unit attention or
> some FC
> event. Older kernels logged that there was a change, but a manual
> SCSI bus scan
> is needed, while newer kernels find new devices "automagically" for
> some
> products. The HP EVA 6000 series wored that way, a 3PAR SotorServ
> 8000 series
> also seems to work that way, but not Pure Storage X70 R3. FOr the
> latter you
> need something like a FC LIP to make the kernel detect the new
> devices (LUNs).
> I'm unsure where the problem is, but in principle the kernel can be
> notified...
There has to be some command on which the Unit Attention status
can be returned. (In a multipath configuration, the path checker
commands may do this). In absence of a command, there is no
asynchronous mechanism in SCSI to report the status.
On FC things related to finding a remote port will trigger a rescan.
-Ewan
>
> >
> > I had a rather lengthy discussion with Fred Knight @ NetApp about
> > Power‑On/Reset handling, what with him complaining that we don't
> > handle
> > is correctly. So this really is something we should be looking
> > into,
> > even independently of multipathing.
> >
> > But actually I like the idea from Martin Petersen to expose the
> > parsed
> > VPD identifiers to sysfs; that would allow us to drop sg_inq
> > completely
> > from the udev rules.
>
> Talking of VPDs: Somewhere in the last 12 years (within SLES 11)there
> was a
> kernel change regarding trailing blanks in VPD data. That change blew
> up
> several configurations being unable to re-recognize the devices. In
> one case
> the software even had bound a license to a specific device with
> serial number,
> and that software found "new" devices while missing the "old" ones...
>
> Regards,
> Ulrich
>
> >
> > Cheers,
> >
> > Hannes
> > ‑‑
> > Dr. Hannes Reinecke Kernel Storage Architect
> > hare at suse.de +49 911 74053
> > 688
> > SUSE Software Solutions Germany GmbH, 90409 Nürnberg
> > GF: F. Imendörffer, HRB 36809 (AG Nürnberg)
>
>
>
More information about the systemd-devel
mailing list