[BUG] drm: vc4: refcount_t: increment on 0; use-after-free.

Eric Anholt eric at anholt.net
Mon Nov 27 01:40:11 UTC 2017


Boris Brezillon <boris.brezillon at free-electrons.com> writes:

> Hi Kees,
>
> On Sat, 25 Nov 2017 12:57:04 -0800
> Kees Cook <keescook at chromium.org> wrote:
>
>> On Wed, Nov 22, 2017 at 11:57 PM, Daniel Vetter <daniel at ffwll.ch> wrote:
>> > On Wed, Nov 22, 2017 at 07:21:00PM +0100, Boris Brezillon wrote:  
>> >> On Wed, 22 Nov 2017 19:13:09 +0100
>> >> Daniel Vetter <daniel at ffwll.ch> wrote:
>> >>  
>> >> > On Wed, Nov 22, 2017 at 6:51 PM, Boris Brezillon
>> >> > <boris.brezillon at free-electrons.com> wrote:  
>> >> > > Hi Stefan,
>> >> > >
>> >> > > On Wed, 22 Nov 2017 17:43:35 +0100 (CET)
>> >> > > Stefan Wahren <stefan.wahren at i2se.com> wrote:
>> >> > >  
>> >> > >> Hi Boris,
>> >> > >> if i boot Raspberry Pi 3 (ARM64, defconfig, linux-next-20171122) with sufficient CMA memory (32 MB), i'll get this warning during boot:
>> >> > >>
>> >> > >> [    7.623510] vc4-drm soc:gpu: bound 3f902000.hdmi (ops vc4_hdmi_ops [vc4])
>> >> > >> [    7.632453] vc4-drm soc:gpu: bound 3f806000.vec (ops vc4_vec_ops [vc4])
>> >> > >> [    7.639707] vc4-drm soc:gpu: bound 3f400000.hvs (ops vc4_hvs_ops [vc4])
>> >> > >> [    7.647364] vc4-drm soc:gpu: bound 3f206000.pixelvalve (ops vc4_crtc_ops [vc4])
>> >> > >> [    7.655451] vc4-drm soc:gpu: bound 3f207000.pixelvalve (ops vc4_crtc_ops [vc4])
>> >> > >> [    7.663415] vc4-drm soc:gpu: bound 3f807000.pixelvalve (ops vc4_crtc_ops [vc4])
>> >> > >> [    7.730580] vc4-drm soc:gpu: bound 3fc00000.v3d (ops vc4_v3d_ops [vc4])
>> >> > >> [    7.743965] [drm] Initialized vc4 0.0.0 20140616 for soc:gpu on minor 0
>> >> > >> [    7.750841] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
>> >> > >> [    7.757620] [drm] Driver supports precise vblank timestamp query.
>> >> > >> [    7.811775] ------------[ cut here ]------------
>> >> > >> [    7.811780] refcount_t: increment on 0; use-after-free.
>> >> > >> [    7.811881] WARNING: CPU: 2 PID: 2188 at lib/refcount.c:153 refcount_inc+0x44/0x50
>> >> > >> [    7.811884] Modules linked in: vc4(+) cfg80211 cec drm_kms_helper smsc95xx usbnet drm rfkill brcmutil bcm2835_rng rng_core bcm2835_dma crc32_ce i2c_bcm2835 pwm_bcm2835 ip_tables x_tables ipv6
>> >> > >> [    7.811950] CPU: 2 PID: 2188 Comm: systemd-udevd Not tainted 4.14.0-next-20171122 #1
>> >> > >> [    7.811953] Hardware name: Raspberry Pi 3 Model B Rev 1.2 (DT)
>> >> > >> [    7.811958] task: ffff800036b91c00 task.stack: ffff00000d6f0000
>> >> > >> [    7.811967] pstate: 20000005 (nzCv daif -PAN -UAO)
>> >> > >> [    7.811974] pc : refcount_inc+0x44/0x50
>> >> > >> [    7.811981] lr : refcount_inc+0x44/0x50
>> >> > >> [    7.811984] sp : ffff00000d6f3300
>> >> > >> [    7.811988] x29: ffff00000d6f3300 x28: 0000000000000000
>> >> > >> [    7.811996] x27: 0000000000000003 x26: ffff800037107800
>> >> > >> [    7.812004] x25: 0000000000000001 x24: ffff800035afdc00
>> >> > >> [    7.812012] x23: 0000000000000000 x22: ffff800035dfa600
>> >> > >> [    7.812020] x21: ffff800035afd9b0 x20: ffff800035afd9a4
>> >> > >> [    7.812027] x19: 0000000000000000 x18: 0000000000000000
>> >> > >> [    7.812034] x17: 0000000000000001 x16: 0000000000000019
>> >> > >> [    7.812042] x15: 0000000000000001 x14: 00000000fffffff0
>> >> > >> [    7.812049] x13: ffff0000090ae840 x12: ffff000008fa2d50
>> >> > >> [    7.812057] x11: ffff000008fa2000 x10: ffff0000090ad000
>> >> > >> [    7.812064] x9 : 0000000000000000 x8 : ffff0000090b5c2f
>> >> > >> [    7.812072] x7 : 0000000000000000 x6 : 000000000015ee00
>> >> > >> [    7.812079] x5 : 0000000000000000 x4 : 0000000000000000
>> >> > >> [    7.812087] x3 : ffffffffffffffff x2 : 0000800030047000
>> >> > >> [    7.812094] x1 : ffff800036b91c00 x0 : 000000000000002b
>> >> > >> [    7.812102] Call trace:
>> >> > >> [    7.812109]  refcount_inc+0x44/0x50
>> >> > >> [    7.812226]  vc4_bo_inc_usecnt+0x84/0x88 [vc4]
>> >> > >> [    7.812310]  vc4_prepare_fb+0x40/0xf0 [vc4]
>> >> > >> [    7.812460]  drm_atomic_helper_prepare_planes+0x54/0xf0 [drm_kms_helper]
>> >> > >> [    7.812543]  vc4_atomic_commit+0x88/0x130 [vc4]
>> >> > >> [    7.812868]  drm_atomic_commit+0x48/0x68 [drm]
>> >> > >> [    7.813002]  restore_fbdev_mode_atomic+0x1d8/0x1e8 [drm_kms_helper]
>> >> > >> [    7.813113]  restore_fbdev_mode+0x28/0x160 [drm_kms_helper]
>> >> > >> [    7.813223]  drm_fb_helper_restore_fbdev_mode_unlocked.part.24+0x28/0x90 [drm_kms_helper]
>> >> > >> [    7.813331]  drm_fb_helper_set_par+0x54/0xa8 [drm_kms_helper]
>> >> > >> [    7.813346]  fbcon_init+0x4e8/0x538
>> >> > >> [    7.813357]  visual_init+0xb4/0x108
>> >> > >> [    7.813366]  do_bind_con_driver+0x1b8/0x3a0
>> >> > >> [    7.813373]  do_take_over_console+0x150/0x1d0
>> >> > >> [    7.813380]  do_fbcon_takeover+0x6c/0xf0
>> >> > >> [    7.813387]  fbcon_event_notify+0x8fc/0x928
>> >> > >> [    7.813399]  notifier_call_chain+0x50/0x90
>> >> > >> [    7.813406]  __blocking_notifier_call_chain+0x4c/0x90
>> >> > >> [    7.813413]  blocking_notifier_call_chain+0x14/0x20
>> >> > >> [    7.813420]  fb_notifier_call_chain+0x1c/0x28
>> >> > >> [    7.813426]  register_framebuffer+0x1d0/0x2d8
>> >> > >> [    7.813533]  __drm_fb_helper_initial_config_and_unlock+0x1e8/0x350 [drm_kms_helper]
>> >> > >> [    7.813639]  drm_fb_helper_initial_config+0x40/0x58 [drm_kms_helper]
>> >> > >> [    7.813747]  drm_fbdev_cma_init_with_funcs+0x88/0x158 [drm_kms_helper]
>> >> > >> [    7.813855]  drm_fbdev_cma_init+0x14/0x28 [drm_kms_helper]
>> >> > >> [    7.813943]  vc4_kms_load+0xa4/0xf0 [vc4]
>> >> > >> [    7.814026]  vc4_drm_bind+0x100/0x168 [vc4]
>> >> > >> [    7.814038]  try_to_bring_up_master+0x144/0x1a8
>> >> > >> [    7.814044]  component_master_add_with_match+0x9c/0xe0
>> >> > >> [    7.814128]  vc4_platform_drm_probe+0xb4/0xf0 [vc4]
>> >> > >> [    7.814137]  platform_drv_probe+0x58/0xc0
>> >> > >> [    7.814146]  driver_probe_device+0x224/0x308
>> >> > >> [    7.814153]  __driver_attach+0xac/0xb0
>> >> > >> [    7.814161]  bus_for_each_dev+0x64/0xa0
>> >> > >> [    7.814169]  driver_attach+0x20/0x28
>> >> > >> [    7.814177]  bus_add_driver+0x108/0x228
>> >> > >> [    7.814184]  driver_register+0x60/0xf8
>> >> > >> [    7.814190]  __platform_driver_register+0x40/0x48
>> >> > >> [    7.814274]  vc4_drm_register+0x38/0x1000 [vc4]
>> >> > >> [    7.814283]  do_one_initcall+0x38/0x120
>> >> > >> [    7.814295]  do_init_module+0x58/0x1b8
>> >> > >> [    7.814304]  load_module+0x1fa8/0x2280
>> >> > >> [    7.814311]  SyS_finit_module+0xc0/0xd0
>> >> > >> [    7.814318]  __sys_trace_return+0x0/0x4
>> >> > >> [    7.814325] ---[ end trace 3348554eb91e19a1 ]---  
>> >> > >
>> >> > > Looks like I didn't test this code with CONFIG_REFCOUNT_FULL enabled :-/.
>> >> > >
>> >> > > Anyway, can you try to apply the following diff and let me know if it
>> >> > > fixes the problem?
>> >> > >
>> >> > > Thanks,
>> >> > >
>> >> > > Boris
>> >> > >  
>> >> > > --->8---  
>> >> > > diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
>> >> > > index 4ae45d7dac42..2decc8e2c79f 100644
>> >> > > --- a/drivers/gpu/drm/vc4/vc4_bo.c
>> >> > > +++ b/drivers/gpu/drm/vc4/vc4_bo.c
>> >> > > @@ -637,7 +637,8 @@ int vc4_bo_inc_usecnt(struct vc4_bo *bo)
>> >> > >         mutex_lock(&bo->madv_lock);
>> >> > >         switch (bo->madv) {
>> >> > >         case VC4_MADV_WILLNEED:
>> >> > > -               refcount_inc(&bo->usecnt);
>> >> > > +               if (!refcount_inc_not_zero(&bo->usecnt))
>> >> > > +                       refcount_set(&bo->usecnt, 1);  
>> >> >
>> >> > This is racy.  
>> >>
>> >> Well, in this case it's not, because this section is protected by a
>> >> mutex.
>> >>  
>> >> > You need a refcount_inc_allow_zero (if that exists).  
>> >>
>> >> I searched for something like that, but it seems that there's no such
>> >> helper.  
>> >
>> > In that case I think the right thing is to switch to atomic_t. That means
>> > no overflow protection, but if we don't have a "this can be 0, I know what
>> > we're doing" style refcount, then I don't think it's suitable.
>> >
>> > Adding Kees to confirm or clarify or ...  
>> 
>> Adding Elena for more details. IIRC there have been two cases:
>> 
>> 1) refcount == 0 doesn't mean the object has been released. The
>> solution tends to be a +1 to the count everywhere.
>> 
>> 2) refcount == 0 means the object was released, but the refcount gets
>> reused for the next allocation. The solution tends to be to use
>> refcount_set() in the new initialization instead of refcount_inc().
>> 
>> It's not clear to me if either apply here, though.
>
> I'd say #1, since ->usecnt is not controlling the VC4-BO object
> lifetime. This being said, I'm not sure the +1 trick is really
> appropriate. Indeed, we're using the
> refcount_inc_not_zero()/refcount_dec_not_one() helpers to remove/put
> the object from/in a
> "mem-pointed-by-obj-can-be-freed-under-mem-pressure" pool (note the
> object itself is not freed, just the memory portion that is pointed by
> this object). If I apply +1 everywhere then I have no way to handle the
> 0 -> 1 and 1 -> 0 transitions differently (these become 1 -> 2 and 2 ->
> 1 transitions, and there's no functions called refcount_dec_not_two()
> and refcount_inc_not_one()).
>
> Anyway, if refcount_inc_allow_zero() is not option, maybe I should just
> use an atomic_t object (as suggested by Daniel) or a plain unsigned
> integer protected by the ->madv.lock (as suggested by Eric). Of course,
> that means losing built-in counter-overflow checking and reworking a bit
> the VC4-MADV logic, but that's not such a big problem. 

I was profiling minetest last week, and the madv lock was at least 10%
of the kernel overhead, since we need to grab it for each BO that goes
idle->busy and also each BO that goes busy->idle in a job.  Given that
we don't reallocate out of the userspace BO cache until it goes idle,
that's potentially a lot of BOs per job, and we'd probably be better off
with the global bo_lock.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 832 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20171126/83d2d160/attachment.sig>


More information about the dri-devel mailing list