[PATCH 2/2] drm/xe/display: Unify display page table mapping

Tvrtko Ursulin tursulin at ursulin.net
Tue Feb 4 18:05:17 UTC 2025


On 04/02/2025 13:11, Juha-Pekka Heikkilä wrote:
> On Tue, Feb 4, 2025 at 2:31 PM Tvrtko Ursulin <tursulin at ursulin.net> wrote:
>>
>>
>> On 04/02/2025 12:10, Juha-Pekka Heikkilä wrote:
>>> On Tue, Feb 4, 2025 at 1:38 PM Tvrtko Ursulin <tursulin at ursulin.net> wrote:
>>>>
>>>>
>>>> On 04/02/2025 11:25, Tvrtko Ursulin wrote:
>>>>>
>>>>> On 14/01/2025 18:04, Juha-Pekka Heikkila wrote:
>>>>>> Unify writing of remapped and straight DPTs. Take out writing of
>>>>>> rotated DPT since Xe doesn't support any platform that does
>>>>>> 90 degree rotated framebuffers.
>>>>>
>>>>> Consider giving my series which fixes AuxCSS a look instead? Given there
>>>>> is external interest to have all that work on ADL-P. I could respin with
>>>>> rotated cleaned up in a similar fashion I did for remapped. Maintenance
>>>>> cost is minimal since it is all very contained.
>>>>
>>>> Oh ADL-P does not support hardware rotation. I missed the fact it is
>>>> display version 13..
>>>>
>>>> But the point regarding changes to remapping code are still valid I
>>>> think, since to suport aux plane the refactoring from this patch looks
>>>> in the wrong direction.
>>>>
>>>
>>> Yea, I kinda agree here with Tvrtko. I did earlier mention this to
>>> JSaa. That aux thing will need to be checked.
>>
>> Ah cool, fingers crossed.
>>
>>> Though, how far are we with your set Tvrtko? It did seem lot kms_ccs
>>> tests failed.
>>> https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-144186v1/shards-all.html?testfilter=kms_ccs.*crc.*gen12&hosts=adlp
>>>
>>> Did you check you had compression when you had clean image on your
>>> machine? (aux surface was not zeroes) Those failures on my link ci are
>>> with failing crc which mean the picture on screen was not correct.
>>
>> I was looking into this just this morning and posted some
>> half-conclusions here:
>>
>> https://lore.kernel.org/intel-xe/eb3a3729-e8ef-4868-b8ff-b6efd66ef4fa@igalia.com/T/#u
>>
>> When you say test with a clear image that I did not. I smoke tested it
>> in desktop usage and looked at kms_flip_tiling which visually looks
>> correct but as the above msg-id talks about, something is potentially
>> fishy with crc collections and/or vblank handling. Since display isn't
>> my area of expertise that is as far as I could get.
> 
> Those test explicitly testing for flipping and such may show there
> something interesting but imo unrelated to this situation (for now).
> 
> If you have adlp machine where you see the screen you could try those
> kms_ccs crc tests, you can run the tests in interactive mode with
> '--i' parameter where the test always stop at commit so you'll see
> what's there on screen. On kms_ccs I would not expect any vblank/other
> issues to interfere crc because kms_ccs uses
> igt_pipe_crc_collect_crc() to get the crc, what this mean is once the
> test image is on screen crc counting is started, collected one crc and
> crc counting is stopped .. to get the first crc this way few vblanks
> have passed already since getting image on screen. Then on kms_ccs
> test there is -c flag which should check there is something written on
> aux surface. Getting these to work reliably I think is the first
> thing...and on that subject I'd at first concentrate just one gen12rc
> ccs, not ccs-cc and not mc ccs. I'd get these results by myself but
> I'm bit of recovering at home and don't have needed hw in my hands.

kms_crc with and without -c are both a complete pass.

> On that lore.kernel.org thread where you mention "Bad CRC also varies
> run to run, while visually things look fine to me on screen. Also a
> delay before collecting the crc seems to improve things." could be
> quite much anything, my first quess with those always is some
> alignment missed somewhere.

Going back to kms_flip_tiling it is hard to figure out what's up. Good 
news is that I have spotted some visual glitching but it mostly only 
happens withing the first few frames after commit of the reference frame.

For instance when I add a loop such as:

diff --git a/tests/intel/kms_flip_tiling.c b/tests/intel/kms_flip_tiling.c
index e937c21716ad..5538fe88c5a0 100644
--- a/tests/intel/kms_flip_tiling.c
+++ b/tests/intel/kms_flip_tiling.c
@@ -131,7 +131,10 @@ test_flip_tiling(data_t *data, enum pipe pipe, 
igt_output_t *output, uint64_t mo
                       "commit failed with " IGT_MODIFIER_FMT "\n",
                       IGT_MODIFIER_ARGS(modifier[1]));
         pipe_crc_new(data, pipe);
-       igt_pipe_crc_get_current(data->drm_fd, data->pipe_crc, 
&reference_crc);
+       ret = 20;
+       while (ret--)
+               igt_pipe_crc_get_current(data->drm_fd, data->pipe_crc,
+                                        &reference_crc);

         /* Commit the first fb. */
         igt_plane_set_fb(primary, &data->fb[0]);

All tests pass.

Could this be interacting with some other display feature like fbc or psr?

There is also something interesting with CRC if I put a sleep into that 
loop. Apparently something gets powered off and CRC garbage such as 
0xffffffff starts appearing. But only if there is a sleep for like 
250ms. A busy loop can go for 3000 frames and not hit that.

Regards,

Tvrtko

>>>>>> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila at gmail.com>
>>>>>> ---
>>>>>>     drivers/gpu/drm/xe/display/xe_fb_pin.c | 141 +++++++++----------------
>>>>>>     1 file changed, 48 insertions(+), 93 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/xe/display/xe_fb_pin.c
>>>>>> b/drivers/gpu/drm/xe/display/xe_fb_pin.c
>>>>>> index c28885316986..70322f28eee5 100644
>>>>>> --- a/drivers/gpu/drm/xe/display/xe_fb_pin.c
>>>>>> +++ b/drivers/gpu/drm/xe/display/xe_fb_pin.c
>>>>>> @@ -14,66 +14,44 @@
>>>>>>     #include "xe_ggtt.h"
>>>>>>     #include "xe_pm.h"
>>>>>> -static void
>>>>>> -write_dpt_rotated(struct xe_bo *bo, struct iosys_map *map, u32
>>>>>> *dpt_ofs, u32 bo_ofs,
>>>>>> -          u32 width, u32 height, u32 src_stride, u32 dst_stride)
>>>>>> +static void encode_and_write_pte(struct xe_bo *bo, struct iosys_map
>>>>>> *map,
>>>>>> +                 u32 *ofs, u32 src_idx, struct xe_device *xe)
>>>>>>     {
>>>>>> -    struct xe_device *xe = xe_bo_device(bo);
>>>>>>         struct xe_ggtt *ggtt = xe_device_get_root_tile(xe)->mem.ggtt;
>>>>>> -    u32 column, row;
>>>>>> -
>>>>>> -    /* TODO: Maybe rewrite so we can traverse the bo addresses
>>>>>> sequentially,
>>>>>> -     * by writing dpt/ggtt in a different order?
>>>>>> -     */
>>>>>> -
>>>>>> -    for (column = 0; column < width; column++) {
>>>>>> -        u32 src_idx = src_stride * (height - 1) + column + bo_ofs;
>>>>>> -
>>>>>> -        for (row = 0; row < height; row++) {
>>>>>> -            u64 pte = ggtt->pt_ops->pte_encode_bo(bo, src_idx *
>>>>>> XE_PAGE_SIZE,
>>>>>> -                                  xe->pat.idx[XE_CACHE_NONE]);
>>>>>> -
>>>>>> -            iosys_map_wr(map, *dpt_ofs, u64, pte);
>>>>>> -            *dpt_ofs += 8;
>>>>>> -            src_idx -= src_stride;
>>>>>> -        }
>>>>>> -
>>>>>> -        /* The DE ignores the PTEs for the padding tiles */
>>>>>> -        *dpt_ofs += (dst_stride - height) * 8;
>>>>>> -    }
>>>>>> -
>>>>>> -    /* Align to next page */
>>>>>> -    *dpt_ofs = ALIGN(*dpt_ofs, 4096);
>>>>>> +    u64 pte = ggtt->pt_ops->pte_encode_bo(bo, src_idx * XE_PAGE_SIZE,
>>>>>> +                          xe->pat.idx[XE_CACHE_NONE]);
>>>>>> +    iosys_map_wr(map, *ofs, u64, pte);
>>>>>> +    *ofs += 8;
>>>>>>     }
>>>>>> -static void
>>>>>> -write_dpt_remapped(struct xe_bo *bo, struct iosys_map *map, u32
>>>>>> *dpt_ofs,
>>>>>> -           u32 bo_ofs, u32 width, u32 height, u32 src_stride,
>>>>>> -           u32 dst_stride)
>>>>>> +static void write_dpt(struct xe_bo *bo, struct iosys_map *map, u32
>>>>>> *dpt_ofs,
>>>>>> +              const struct intel_remapped_plane_info *plane,
>>>>>> +              enum i915_gtt_view_type type)
>>>>>>     {
>>>>>>         struct xe_device *xe = xe_bo_device(bo);
>>>>>> -    struct xe_ggtt *ggtt = xe_device_get_root_tile(xe)->mem.ggtt;
>>>>>> -    u64 (*pte_encode_bo)(struct xe_bo *bo, u64 bo_offset, u16 pat_index)
>>>>>> -        = ggtt->pt_ops->pte_encode_bo;
>>>>>> -    u32 column, row;
>>>>>> -
>>>>>> -    for (row = 0; row < height; row++) {
>>>>>> -        u32 src_idx = src_stride * row + bo_ofs;
>>>>>> -
>>>>>> -        for (column = 0; column < width; column++) {
>>>>>> -            iosys_map_wr(map, *dpt_ofs, u64,
>>>>>> -                     pte_encode_bo(bo, src_idx * XE_PAGE_SIZE,
>>>>>> -                     xe->pat.idx[XE_CACHE_NONE]));
>>>>>> -
>>>>>> -            *dpt_ofs += 8;
>>>>>> -            src_idx++;
>>>>>> +    const u32 dpt_even = (plane->dst_stride - plane->width) * 8;
>>>>>> +    const u32 dest_width = plane->width;
>>>>>> +    const u32 dest_height = plane->height;
>>>>>> +    u32 src_idx;
>>>>>> +
>>>>>> +    for (u32 row = 0; row < dest_height; ++row) {
>>>>>> +        for (u32 column = 0; column < dest_width; ++column) {
>>>>>> +            switch (type) {
>>>>>> +            case I915_GTT_VIEW_NORMAL:
>>>>>> +                src_idx = plane->offset + column;
>>>>>> +                break;
>>>>>> +            case I915_GTT_VIEW_REMAPPED:
>>>>>> +                src_idx = plane->offset +
>>>>>> +                    row * plane->src_stride + column;
>>>>>> +                break;
>>>>>> +            default:
>>>>>> +                WARN(1, "Unsupported GTT view type: %d", type);
>>>>>> +                return;
>>>>>> +            }
>>>>>> +            encode_and_write_pte(bo, map, dpt_ofs, src_idx, xe);
>>>>>>             }
>>>>>> -
>>>>>> -        /* The DE ignores the PTEs for the padding tiles */
>>>>>> -        *dpt_ofs += (dst_stride - width) * 8;
>>>>>> +        *dpt_ofs += dpt_even;
>>>>>>         }
>>>>>> -
>>>>>> -    /* Align to next page */
>>>>>>         *dpt_ofs = ALIGN(*dpt_ofs, 4096);
>>>>>>     }
>>>>>> @@ -125,59 +103,36 @@ static int __xe_pin_fb_vma_dpt(const struct
>>>>>> intel_framebuffer *fb,
>>>>>>     {
>>>>>>         struct xe_device *xe = to_xe_device(fb->base.dev);
>>>>>>         struct xe_tile *tile0 = xe_device_get_root_tile(xe);
>>>>>> -    struct xe_ggtt *ggtt = tile0->mem.ggtt;
>>>>>>         struct drm_gem_object *obj = intel_fb_bo(&fb->base);
>>>>>>         struct xe_bo *bo = gem_to_xe_bo(obj), *dpt;
>>>>>>         u32 dpt_size, size = bo->ttm.base.size;
>>>>>> +    const struct intel_remapped_plane_info *plane;
>>>>>> +    u32 i, plane_count, dpt_ofs = 0;
>>>>>> +    struct intel_remapped_plane_info normal_plane;
>>>>>> -    if (view->type == I915_GTT_VIEW_NORMAL)
>>>>>> +    if (view->type == I915_GTT_VIEW_NORMAL) {
>>>>>>             dpt_size = ALIGN(size / XE_PAGE_SIZE * 8, XE_PAGE_SIZE);
>>>>>> -    else if (view->type == I915_GTT_VIEW_REMAPPED)
>>>>>> -        dpt_size =
>>>>>> ALIGN(intel_remapped_info_size(&fb->remapped_view.gtt.remapped) * 8,
>>>>>> -                 XE_PAGE_SIZE);
>>>>>> -    else
>>>>>> -        /* display uses 4K tiles instead of bytes here, convert to
>>>>>> entries.. */
>>>>>> -        dpt_size = ALIGN(intel_rotation_info_size(&view->rotated) * 8,
>>>>>> +        normal_plane.offset = 0;
>>>>>> +        normal_plane.width = size / XE_PAGE_SIZE;
>>>>>> +        normal_plane.height = 1;
>>>>>> +        normal_plane.src_stride = size / XE_PAGE_SIZE;
>>>>>> +        normal_plane.dst_stride = size / XE_PAGE_SIZE;
>>>>>> +        plane = &normal_plane;
>>>>>> +        plane_count = 1;
>>>>>> +    } else {
>>>>>> +        dpt_size = ALIGN(intel_remapped_info_size(&view->remapped) * 8,
>>>>>>                      XE_PAGE_SIZE);
>>>>>> +        plane = view->remapped.plane;
>>>>>> +        plane_count = ARRAY_SIZE(view->remapped.plane);
>>>>>> +    }
>>>>>>         dpt = xe_alloc_dpt_bo(xe, tile0, dpt_size, physical_alignment);
>>>>>>         if (IS_ERR(dpt))
>>>>>>             return PTR_ERR(dpt);
>>>>>> -    if (view->type == I915_GTT_VIEW_NORMAL) {
>>>>>> -        u32 x;
>>>>>> -
>>>>>> -        for (x = 0; x < size / XE_PAGE_SIZE; x++) {
>>>>>> -            u64 pte = ggtt->pt_ops->pte_encode_bo(bo, x * XE_PAGE_SIZE,
>>>>>> -                                  xe->pat.idx[XE_CACHE_NONE]);
>>>>>> -
>>>>>> -            iosys_map_wr(&dpt->vmap, x * 8, u64, pte);
>>>>>> -        }
>>>>>> -    } else if (view->type == I915_GTT_VIEW_REMAPPED) {
>>>>>> -        const struct intel_remapped_info *remap_info = &view->remapped;
>>>>>> -        u32 i, dpt_ofs = 0;
>>>>>> -
>>>>>> -        for (i = 0; i < ARRAY_SIZE(remap_info->plane); i++)
>>>>>> -            write_dpt_remapped(bo, &dpt->vmap, &dpt_ofs,
>>>>>> -                       remap_info->plane[i].offset,
>>>>>> -                       remap_info->plane[i].width,
>>>>>> -                       remap_info->plane[i].height,
>>>>>> -                       remap_info->plane[i].src_stride,
>>>>>> -                       remap_info->plane[i].dst_stride);
>>>>>> -
>>>>>> -    } else {
>>>>>> -        const struct intel_rotation_info *rot_info = &view->rotated;
>>>>>> -        u32 i, dpt_ofs = 0;
>>>>>> -
>>>>>> -        for (i = 0; i < ARRAY_SIZE(rot_info->plane); i++)
>>>>>> -            write_dpt_rotated(bo, &dpt->vmap, &dpt_ofs,
>>>>>> -                      rot_info->plane[i].offset,
>>>>>> -                      rot_info->plane[i].width,
>>>>>> -                      rot_info->plane[i].height,
>>>>>> -                      rot_info->plane[i].src_stride,
>>>>>> -                      rot_info->plane[i].dst_stride);
>>>>>> -    }
>>>>>> +    for (i = 0; i < plane_count; i++)
>>>>>> +        write_dpt(bo, &dpt->vmap, &dpt_ofs, &plane[i], view->type);
>>>>>>         vma->dpt = dpt;
>>>>>>         vma->node = dpt->ggtt_node[tile0->id];


More information about the Intel-xe mailing list