[Mesa-dev] [PATCH 1/5] clover/memory: Copy data when creating buffers with CL_MEM_USE_HOST_PTR

Aaron Watry awatry at gmail.com
Thu Aug 3 19:33:21 UTC 2017


On Thu, Aug 3, 2017 at 11:53 AM, Jan Vesely <jan.vesely at rutgers.edu> wrote:
> On Thu, 2017-08-03 at 00:28 -0500, Aaron Watry wrote:
>> /
>>
>> On Sat, Jul 22, 2017 at 2:27 PM, Jan Vesely <jan.vesely at rutgers.edu> wrote:
>> > On Fri, 2017-07-21 at 23:19 -0500, Aaron Watry wrote:
>> > > Fixes: OpenCL CTS test/conformance/buffers/buffer_copy
>> >
>> > Similar patch was pushed in 2013:
>> > 56647c5d8f8e60269f0a3277e3caa7ee57d1fe6a
>> > "clover: Append buffers that use CL_MEM_USE_HOST_PTR."
>> >
>> > Grigory(added to cc) reverted the change and implemented user_ptr
>> > mechanism in:
>> > f972b223c4cb4ec58a9451cbac5d120ac9deb336
>> > "clover: try userptr for CL_MEM_USE_HOST_PTR"
>> >
>> > The buffer-flags piglit still passes, but it maps even CL_USE_HOST_PTR
>> > buffers. I'm not sure what the CTS does, we might need a
>> > synchronization point after kernels finish execution. I couldn't find
>> > the relevant part of specs that would define accessing
>> > CL_MEM_USE_HOST_PTR buffers without
>> > clEnqueueMapBuffer or clEnqueueReadBuffer
>> >
>>
>> I got rid of this patch in my tree and then started digging into the
>> user_ptr stuff.
>>
>> It looks like the issue stems from alignment restrictions.
>>
>> In clover/core/resource.cpp, there's a call to:
>>   dev.pipe->resource_from_user_memory(dev.pipe, &info, obj.host_ptr())
>>
>> That is failing for me.
>>
>> The CTS allocates memory that is aligned to the value returned by
>> clGetDeviceInfo (device,CL_DEVICE_MEM_BASE_ADDR_ALIGN, ...) converted
>> to bytes.
>>
>> Clover's api/device.cpp  has that hard-coded to 1024 bits/128 bytes at
>> the moment.
>>
>> If I change the device info query for the mem_base_addr_align to
>> forward a call to:
>>   pipe->get_param(pipe, PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT)
>
> I'd say we should be doing this anyway instead of hardcoding the value.

Completely agree.  Hard-coding a constant minimum alignment at the API
level is wrong. Once we figure out exactly how do derive that value, I
plan on writing up a patch to pull that value out of the pipe driver.

Cc'ing Christian since he added both the user_ptr support and the
large PTE support.

As far as I've looked (which hasn't necessarily been exhaustive), I
haven't found any other
pipe caps that look like they'd apply for minimum user_ptr buffer
alignment restrictions, but
at the same time the description of that cap doesn't seem perfectly applicable.

Do we need to introduce a new cap for minimum user_ptr buffer
alignment, or is the value of PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT
sub-optimal/wrong? I'm thinking it's correct as-is, and we might want
to consider a new cap which reports the proper user pointer alignment
when PIPE_CAP_RESOURCE_FROM_USER_MEMORY is supported.

Right now it's hard-coded to R600_MAP_BUFFER_ALIGNMENT in si_pipe.c
and r600_pipe.c which has a value of 64 (bytes, I believe).

>
>>
>> And also change si_pipe.c:si_get_param's switch statement value to return:
>>   case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
>>     return sscreen->b.info.gart_page_size;
>
> I'm not sure what the correct value is here. AFAIK, EG uses 256B cache
> lines so I'd expect the value of to be at least that

Depending on how the weather works out tonight, I might be able to at
least find out what NI reports for gart page sizes and compare that to
my SI.  I haven't tried to test user pointer support on r600g yet, so
either it's working alright with the existing 64-byte alignment, or
it's broken when we allocate pointers using the actual alignments
reported by clGetDeviceInfo. If it's broken, I'll try 256B, then keep
bumping it up until it either starts working or I hit GART page size.

--Aaron

>
> Both NI and GCN should be able to use 4K pages (which is what
> gart_page_size is set to), but we might want higher alignment for
> better performance[0]
>
> [0]https://lists.freedesktop.org/archives/dri-devel/2014-May/058858.htm
> l
>>
>> Then I can successfully create buffers from user pointers on my SI card.
>>
>> I'm a bit fuzzy on what alignment restrictions exist for SI/GCN cards,
>> but the winsys seems to indicate we should align things to gart page
>> size, which makes sense on the surface at least.
>>
>> If the alignment restrictions have changed between R600 and GCN, that
>> might explain why what's broken for me is working for you/Grigori (on
>> r600).
>
> I remember there was a buffer alignment patch form AMD recently for
> SI/CI vs. VI+, but I can't find it.
> It looks like a separate issue however. if incorrect alignment makes
> user_ptr fail, and the test still fails, it looks like the no-user_ptr
> fallback is broken.
>
> Jan
>
>>
>> --Aaron
>>
>> >
>> > Jan
>> >
>> >
>> > >
>> > > Signed-off-by: Aaron Watry <awatry at gmail.com>
>> > > CC: Francisco Jerez <currojerez at riseup.net>
>> > > ---
>> > >  src/gallium/state_trackers/clover/core/memory.cpp | 2 +-
>> > >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > >
>> > > diff --git a/src/gallium/state_trackers/clover/core/memory.cpp b/src/gallium/state_trackers/clover/core/memory.cpp
>> > > index b852e6896f..912d74830a 100644
>> > > --- a/src/gallium/state_trackers/clover/core/memory.cpp
>> > > +++ b/src/gallium/state_trackers/clover/core/memory.cpp
>> > > @@ -30,7 +30,7 @@ memory_obj::memory_obj(clover::context &ctx, cl_mem_flags flags,
>> > >                         size_t size, void *host_ptr) :
>> > >     context(ctx), _flags(flags),
>> > >     _size(size), _host_ptr(host_ptr) {
>> > > -   if (flags & CL_MEM_COPY_HOST_PTR)
>> > > +   if (flags & (CL_MEM_COPY_HOST_PTR | CL_MEM_USE_HOST_PTR))
>> > >        data.append((char *)host_ptr, size);
>> > >  }
>> > >
>> >
>> > --
>> > Jan Vesely <jan.vesely at rutgers.edu>
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list