[PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

Steven Price steven.price at arm.com
Thu Feb 18 16:15:56 UTC 2021


On 18/02/2021 15:45, Alyssa Rosenzweig wrote:
>> Yeah plus Cc: stable for backporting and I think an igt or similar for
>> panfrost to check this works correctly would be pretty good too. Since
>> if it took us over 1 year to notice this bug it's pretty clear that
>> normal testing doesn't catch this. So very likely we'll break this
>> again.
> 
> Unfortunately there are a lot of kernel bugs which are noticed during actual
> use (but not CI runs), some of which have never been fixed. I do know
> the shrinker impl is buggy for us, if this is the fix I'm very happy.

I doubt this will actually "fix" anything - if I understand correctly 
then the sequence which is broken is:

  * allocate BO, mmap to CPU
  * madvise(DONTNEED)
  * trigger purge
  * try to access the BO memory

which is an invalid sequence for user space - the attempt to access 
memory should cause a SIGSEGV. However because drm_vma_node_unmap() is 
unable to find the mappings there may still be page table entries 
present which would provide access to memory the kernel has freed. Which 
is of course a big security hole and so this fix is needed.

In what way do you find the shrinker impl buggy? I'm aware there's some 
dodgy locking (although I haven't worked out how to fix it) - but AFAICT 
it's more deadlock territory rather than lacking in locks. Are there 
correctness issues?

>> btw for testing shrinkers recommended way is to have a debugfs file
>> that just force-shrinks everything. That way you avoid all the trouble
>> that tend to happen when you drive a system close to OOM on linux, and
>> it's also much faster.
> 
> 2nding this as a good idea.
> 

Sounds like a good idea to me too. But equally I'm wondering whether the 
best (short term) solution is to actually disable the shrinker. I'm 
somewhat surprised that nobody has got fed up with the "Purging xxx 
bytes" message spam - which makes me think that most people are not 
hitting memory pressure to trigger the shrinker.

The shrinker on kbase caused a lot of grief - and the only way I managed 
to get that under control was by writing a static analysis tool for the 
locking, and by upsetting people by enforcing the (rather dumb) rules of 
the tool on the code base. I've been meaning to look at whether sparse 
can do a similar check of locks.

Sadly at the moment I'm struggling to find time to look at such things.

Steve


More information about the dri-devel mailing list