Adreno crash on i.MX53 running 5.3-rc6

Robin Murphy robin.murphy at arm.com
Wed Sep 4 18:06:33 UTC 2019


On 04/09/2019 01:12, Rob Clark wrote:
> On Tue, Sep 3, 2019 at 12:31 PM Fabio Estevam <festevam at gmail.com> wrote:
>>
>> Hi Jonathan,
>>
>> On Tue, Sep 3, 2019 at 4:25 PM Jonathan Marek <jonathan at marek.ca> wrote:
>>>
>>> Hi,
>>>
>>> I tried this and it works with patches 4+5 from Rob's series and
>>> changing gpummu to use sg_phys(sg) instead of sg->dma_address
>>> (dma_address isn't set now that dma_map_sg isn't used).
>>
>> Thanks for testing it. I haven't had a chance to test it yet.
>>
>> Rob,
>>
>> I assume your series is targeted to 5.4, correct?
> 
> maybe, although Christoph Hellwig didn't seem like a big fan of
> exposing cache ops, and would rather add a new allocation API for
> uncached pages.. so I'm not entirely sure what the way forward will
> be.

TBH, the use of map/unmap looked reasonable in the context of 
"start/stop using these pages for stuff which may include DMA", so even 
if it was cheekily ignoring sg->dma_address I'm not sure I'd really 
consider it "abuse" - in comparison, using sync without a prior map 
unquestionably violates the API, and means that CONFIG_DMA_API_DEBUG 
will be rendered useless with false positives if this driver is active 
while trying to debug something else.

The warning referenced in 0036bc73ccbe represents something being 
unmapped which didn't match a corresponding map - from what I can make 
of get_pages()/put_pages() it looks like that would need msm_obj->flags 
or msm_obj->sgt to change during the lifetime of the object, neither of 
which sounds like a thing that should legitimately happen. Are you sure 
this isn't all just hiding a subtle bug elsewhere? After all, if what 
was being unmapped wasn't right, who says that what's now being synced is?

Robin.

> In the mean time, it is a bit ugly, but I guess something like this should work:
> 
> --------------------
> diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> index 7263f4373f07..5a6a79fbc9d6 100644
> --- a/drivers/gpu/drm/msm/msm_gem.c
> +++ b/drivers/gpu/drm/msm/msm_gem.c
> @@ -52,7 +52,7 @@ static void sync_for_device(struct msm_gem_object *msm_obj)
>   {
>       struct device *dev = msm_obj->base.dev->dev;
> 
> -    if (get_dma_ops(dev)) {
> +    if (get_dma_ops(dev) && IS_ENABLED(CONFIG_ARM64)) {
>           dma_sync_sg_for_device(dev, msm_obj->sgt->sgl,
>               msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
>       } else {
> @@ -65,7 +65,7 @@ static void sync_for_cpu(struct msm_gem_object *msm_obj)
>   {
>       struct device *dev = msm_obj->base.dev->dev;
> 
> -    if (get_dma_ops(dev)) {
> +    if (get_dma_ops(dev) && IS_ENABLED(CONFIG_ARM64)) {
>           dma_sync_sg_for_cpu(dev, msm_obj->sgt->sgl,
>               msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
>       } else {
> --------------------
> 
> BR,
> -R
> 
>> If this is the case, what we should do about the i.MX5 regression on 5.3?
>>
>> Would a revert of the two commits be acceptable in 5.3 in order to
>> avoid the regression?
>>
>> Please advise.
>>
>> Thanks


More information about the dri-devel mailing list