[PATCH v5 07/19] media: vsp1: dl: Support one-shot entries in the display list

Brian Starkey Brian.Starkey at arm.com
Mon Mar 18 16:59:21 UTC 2019


Hi Laurent,

Sorry for the delay, I was travelling last week and didn't find a
chance to digest your diagrams (thanks for all the detail!)

On Fri, Mar 08, 2019 at 02:24:40PM +0200, Laurent Pinchart wrote:
> Hi Brian,
> 
> On Thu, Mar 07, 2019 at 12:28:08PM +0000, Brian Starkey wrote:
> > On Wed, Mar 06, 2019 at 08:22:44PM +0200, Laurent Pinchart wrote:
> > 
> > [snip]
> > 
> > > I can always queue a new one, but I have no way of telling if the newly
> > > queued list raced with the frame end interrupt.
> > > 
> > > There's another register I'm not using that contains a shadow copy of
> > > the registers list DMA address. It mirrors the address programmed by the
> > > driver when there is no DMA of the registers list in progress, and
> > > contains the address of the registers list being DMA'ed when a DMA is in
> > > progress. I don't think I can do much with it, as reading it either
> > > before or after reprogramming (to check if I can reprogram or if the
> > > reprogram has been taken into account) is racy.
> > 
> > When you say it mirrors the address programmed, is that latched when
> > the update is accepted, or it will update the moment you program the
> > address register?
> 
> It is latched when the update is processed, at the next vblank following
> the address programming. The timing diagram below shows vblank, the UPD
> bit I use for synchronization, the registers list address register
> exposed to the CPU, and the internal address used to DMA the registers
> list. The address register is written four times to show how the
> hardware reacts, but in practice there will be a single write per frame,
> right after the beginning of vblank.
> 
>                                       DMA starts
>                                       |     DMA ends
>                                       |     |
>                                       V     V
>                                       ___________
> VBLANK         ______________________|           |________
>                      ________________     ________________
> UPD            _____|                |___|
>                _____ ______ _____________ ________ _______
> ADDR register  __A__X__B___X______C______X___D____X___E___
>                ______________________ ____________________
> ADDR internal  ___________A__________X__________C_________
> 
> I can reprogram the address any number of times I want before the
> vblank, but the update bit mechanism only lets me protect the race
> related to the first write. For any subsequent write I won't be able to
> tell whether it was complete before the hardware started the frame, or
> arrived too late.
> 
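
Just to check my understanding of that handshake, in code I picture it
roughly as the sketch below. It's only a sketch: vsp1_read() and
vsp1_write() are the driver's existing accessors, but DL_ADDR_REG,
DL_STATUS_REG and DL_UPD_BIT are placeholder names standing in for
whichever address register and update bit are actually involved:

	/* Hypothetical placeholders for the registers list address
	 * register, the status register holding UPD, and the UPD bit. */
	#define DL_ADDR_REG	0x0000		/* placeholder */
	#define DL_STATUS_REG	0x0004		/* placeholder */
	#define DL_UPD_BIT	BIT(0)		/* placeholder */

	static void dl_hw_enqueue(struct vsp1_device *vsp1, dma_addr_t dl_dma)
	{
		/*
		 * Program the address of the next registers list. Per the
		 * diagram above, UPD goes high here and stays set until the
		 * hardware latches the address at the next vblank.
		 */
		vsp1_write(vsp1, DL_ADDR_REG, dl_dma);
	}

	static bool dl_hw_update_pending(struct vsp1_device *vsp1)
	{
		/*
		 * UPD cleared means the last programmed address has been
		 * latched. This tracks the first write per frame reliably,
		 * but for a second write before the next vblank the bit is
		 * already set, so there is no way to tell whether the new
		 * address made the frame or arrived too late.
		 */
		return vsp1_read(vsp1, DL_STATUS_REG) & DL_UPD_BIT;
	}
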
> > I assume the latter or you would have thought of this yourself (that
> > seems like a really strange/not-useful behaviour!). But if it is the
> > former you could:
> > 
> >  - In writeback start-of-frame, create a copy of the register list,
> >    disabling writeback.
> >  - Write the address of this copy to the register
> > 
> > If/when an atomic commit comes in before you service the next
> > end-of-frame:
> > 
> >  - Write the address of the new register list
> >  - Check the status register. If the "pending" bit is zero, you know
> >    you had a potential race.
> >     - Check the DMA address register. If it corresponds to the new
> >       scene, the new commit won the race, otherwise it's been delayed
> >       by a frame.
> 
> The issue here is that there's a potential race if the pending (UPD) bit
> is one too. If the commit arrives just before vblank, but the address is
> written just after vblank, after the UPD bit has been reset to 0 by the
> hardware, but before the vblank interrupt is processed, then the commit
> won't be applied yet, and I will have no way to know. Compare the two
> following scenarios:
> 
> 		   [1]       [2] [3]           [4]       [5]
>                     |         |   |             |         |
>                     |         V   V             |         V
>                     V          __________       V          __________
> VBLANK         _______________|          |________________|          |__
>                      _________     _______________________
> UPD            _____|         |___|                       |_____________
>                _____ _____________ _____________ _______________________
> ADDR register  __A__X______B______X______C______X___________D___________
>                ___________________________________________ _____________
> ADDR internal  _______A_______X______B____________________X______D______
> 
> [1] Atomic commit, registers list address write, with writeback enabled
> [2] DMA starts for first atomic commit
> [3] Writeback disable registers list address write
> [4] Next atomic commit, with writeback disabled
> [5] DMA starts for second atomic commit
> 
> 		   [1]       [2] [3]                     [4][5]
>                     |         |   |                       |  |
>                     |         V   V                       V  V
>                     V          __________                  __________
> VBLANK         _______________|          |________________|          |__
>                      _________     _______________________    __________
> UPD            _____|         |___|                       |__|
>                _____ _____________ __________________________ __________
> ADDR register  __A__X______B______X_____________C____________X_____D____
>                ___________________________________________ _____________
> ADDR internal  _______A_______X______B____________________X______C______
> 
> [1] Atomic commit, registers list address write, with writeback enabled
> [2] DMA starts for first atomic commit
> [3] Writeback disable registers list address write
> [4] DMA starts for writeback disable registers list (3)
> [5] Next atomic commit, with writeback disabled, performed right after
>     vblank but before the vblank interrupt is serviced
> 
> The UPD bit is 1 after writing the ADDR register the second time in both
> cases. Furthermore, if [4] and [5] are very close in the second case,
> the UPD bit may read 1 just before [5] if the read comes before [4]:
> 
> 	read UPD bit;
> 	/* VBLANK [4] */
> 	write ADDR register;
> 
> I thus can't rely on UPD = 1 before the write to mean that the write was
> performed before vblank, nor can I rely on the UPD bit after the write,
> as it's 1 in both cases.

My mistake, I got the UPD bit the wrong way around. I'm still not
entirely sure why you can't use "ADDR internal" to determine which
side won the race. It shows 'B' in the first case, and 'C' in the
second.

When a new commit comes, unconditionally (see the code sketch after this
list):
 - Write new address
 - Read status

 if status.UPD == 0 --> You know for sure your new commit was just
                        latched.
 if status.UPD == 1 --> You need to check ADDR internal to see which
			of these three happened:

    1) Your first case happened. We're somewhere in the middle of the
       frame. ADDR internal will show 'B', and you know commit 'D' is
       going on-screen at the next vblank.

    2) Your second case happened. The new commit raced with the
       latching of writeback-disable and "lost". ADDR internal will
       show 'C', and the new commit is delayed by a frame.

    3) (Teeny tiny small window) In-between reading status and ADDR
       internal, the new commit was latched. ADDR internal will show
       'D'. You know the new commit "won" so treat it the same as if
       UPD == 0 (which it will be, now).
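
In (very rough) code, reusing the placeholder names from my sketch
further up and adding DL_SHADOW_ADDR_REG as a stand-in for whatever
register exposes "ADDR internal", the check would look something like:

	/*
	 * Returns true if the newly written list (new_dma) goes out at the
	 * next vblank, false if it lost the race against the
	 * writeback-disable list (wb_disable_dma) and is delayed by a frame.
	 */
	static bool new_commit_won_race(struct vsp1_device *vsp1,
					dma_addr_t new_dma,
					dma_addr_t wb_disable_dma)
	{
		u32 internal;

		/* Unconditionally program the new registers list. */
		vsp1_write(vsp1, DL_ADDR_REG, new_dma);

		/* UPD == 0: the new address was latched straight away. */
		if (!dl_hw_update_pending(vsp1))
			return true;

		internal = vsp1_read(vsp1, DL_SHADOW_ADDR_REG);

		/* Case 2: the writeback-disable list won, commit delayed. */
		if (internal == (u32)wb_disable_dma)
			return false;

		/*
		 * Case 1 (mid-frame, the previous commit is still on screen)
		 * or case 3 (latched between the two reads): either way the
		 * new commit goes out at the next vblank.
		 */
		return true;
	}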

Anyway, it's all moot now that you've found the chained lists thing -
that sounds ideal. I'll take a look at the new series shortly.

Thanks,
-Brian

> 
> I initially thought I could protect against the race using the following
> procedure.
> 
> - Single atomic commit, no IRQ delay
> 
> 		   [1]       [2]        [3]              [4]
>                     |         |          |                |
>                     |         V          V                V
>                     V          __________                  __________
> VBLANK         _______________|          |________________|          |__
>                      _________    
> UPD            _____|         |_________________________________________
>                _____ ___________________________________________________
> ADDR register  __A__X___________________B_______________________________
>                _______________ _________________________________________
> ADDR internal  _______A_______X______B__________________________________
> 
> [1] Atomic commit, registers list address write, with writeback enabled
> [2] DMA starts for first atomic commit
> [3] Frame start, disable writeback in-place in registers list B
> [4] DMA starts for "patched" registers list, disables writeback
> 
> - Two atomic commits, no IRQ delay
> 
> 		   [1]       [2]  [3]   [4]              [5]
>                     |         |    |     |                |
>                     |         V    V     V                V
>                     V          __________                  __________
> VBLANK         _______________|          |________________|          |__
>                      _________      ______________________
> UPD            _____|         |____|                      |_____________
>                _____ ______________ ____________________________________
> ADDR register  __A__X______B_______X_________________C__________________
>                _______________ ___________________________ _____________
> ADDR internal  _______A_______X_____________B_____________X______C______
> 
> [1] Atomic commit, registers list address write, with writeback enabled
> [2] DMA starts for first atomic commit
> [3] Next atomic commit, registers list address write, with writeback
>     disabled
> [4] Frame start, disable writeback in-place in registers list B
> [5] DMA starts for second atomic commit, disables writeback
> 
> [3] and [4] can happen in any order, as long as they both come before
> [5]. If [3] comes after [5], we're back to the previous case (Single
> atomic commit, no IRQ delay).
> 
> - Single atomic commit, IRQ delay
> 
> 		   [1]       [2]        [3]              [4] [5]
>                     |         |          |                |   |
>                     |         V          V                V   V
>                     V          __________                  __________
> VBLANK         _______________|          |________________|          |__
>                      _________    
> UPD            _____|         |_________________________________________
>                _____ ___________________________________________________
> ADDR register  __A__X___________________B_______________________________
>                _______________ _________________________________________
> ADDR internal  _______A_______X______B__________________________________
> 
> [1] Atomic commit, registers list address write, with writeback enabled
> [2] DMA starts for first atomic commit
> [3] Frame start, IRQ is delayed to [5]
> [4] DMA starts for unmodified registers list, writeback still enabled
> [5] disable writeback in-place in registers list B, too late
> 
> Here I need to detect that [5] was delayed after [4], and thus delay the
> completion of the writeback job by one frame. This could be done by
> checking the vblank interrupt status bit: if it is set, vblank occurred
> and raced with [5]. However, the same issue can also happen when no race
> occurred, if processing of the vblank interrupt for [2] is delayed until
> [3]. Both the vblank interrupt and the frame start interrupt status bits
> will then be set, indicating a potential race.
> 
> The time between [2] and [3] is very short compared to the time between
> [3] and [4] and to interrupt latency in general, so we would have lots
> of false positives.
> 
> > >> You don't happen to have a DMA engine trigger or something you could
> > >> use to do the register list modification at a guaranteed time do you?
> > > 
> > > Not that I know of, unfortunately.
> > > 
> > >> Are you always going to be protected by an IOMMU, preventing the
> > >> writeback from trashing physical memory? If that's not the case, then
> > >> the race can have pretty dire consequences.
> > > 
> > > If the IOMMU is enabled in DT, yes. It's a system-level decision.
> > 
> > Well, it's your driver at the end of the day. But for me, a known
> > race-condition which would cause random memory corruption sounds like
> > a really Bad Thing. Halving frame-rate on systems with no IOMMU seems
> > preferable to me.
> 
> It is a really bad thing. I think the decision should be taken by
> Renesas though.
> 
> -- 
> Regards,
> 
> Laurent Pinchart

