map user space memory as gart memory for intel integrated graphics chip

Austin Yuan yuanshengquan at gmail.com
Thu Dec 8 06:04:34 PST 2005


On 12/8/05, Keith Whitwell <keith at tungstengraphics.com> wrote:
> Alan Cox wrote:
> > On Iau, 2005-12-08 at 19:52 +0800, Austin Yuan wrote:
> >
> >>buffer. Because the interface of "alloc_by_type" only receives a
> >>simple parameter "type", here I hide the user space address into
> >>"type" and re-get it in alloc_userspace_memory.
> >
> >
> > That should probably be fixed by extending the API to pass both
> >
> >
> >>I use this interface to easily implement XAA readpixmap/imagewrite
> >>driver interfaces, and get a better performance. Here, I didn't attach
> >>the patch for i810 driver. I just want to get some comments about it,
> >>and if you think it makes sense, I'd like to make it more generic.
> >>
> >>Any comments are appreciated, thanks.
> >
> >
> >
> > The one thing I don't understand looking at this is that I understood
> > AGP pages should be marked uncached. However user space pages may exist
> > in many mappings and the CPU also requires all mappings of a page are
> > consistent.
> >
> > Does i8xx need the page uncached or is it enough to wbinvd the pages in
> > question on writing and invalidate them before reading, or does the i8xx
> > in fact take full part in the cache coherency ?
>
> I believe the hardware can deal with both cases, and has rules for what uses
> each type of memory can be applied to.  I think that it's ok to use snooped
> memory (in their parlance) for upload type tasks, but not for things like render
> destinations.  I'll try and be more specific when I get time to review the docs.
>
> I'm pleased to see someone has taken on this much-talked about task.  Austin -
> have you gotten any feel for how quick this sort of "upload" might be?
>
I tried this method to implement XAA readpixmap, and used x11perf to
test the performance (x11perf -getimage10 -getimage100 -getimage500).
Below are the results (resolution is 1024x768 at 16):
                   SW readpixmap   HW readpixmap
10x10 square              5150.0          2880.0
100x100 square             427.0          1850.0
500x500 square              18.5           112.0

Because every readpixmap has to alloc/bind/dealloc AGP memory, HW
readpixmap is slower than SW readpixmap for small square blts, but for
large square blts HW readpixmap is more than 6 times as fast as SW
readpixmap.
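
As a quick check on those ratios, here is a minimal sketch that just
divides the HW number by the SW number for each size. The figures are
copied from the table above; the names (results, speedup, ratios) are
only illustrative, and it assumes that a higher x11perf number means
faster.

```python
# x11perf numbers copied from the table above: (SW readpixmap, HW readpixmap).
# Assumption: a higher number means faster, as the surrounding text implies.
results = {
    "10x10": (5150.0, 2880.0),
    "100x100": (427.0, 1850.0),
    "500x500": (18.5, 112.0),
}


def speedup(sw, hw):
    """Return the HW/SW ratio; a value above 1.0 means HW readpixmap wins."""
    return hw / sw


# Illustrative helper mapping each square size to its HW/SW ratio.
ratios = {size: speedup(sw, hw) for size, (sw, hw) in results.items()}

for size, ratio in ratios.items():
    print(f"{size:>8}: HW/SW = {ratio:.2f}")
```

The ratio is below 1 for the 10x10 case and grows with the square
size, which matches the per-call alloc/bind/dealloc cost mattering
most for small blts.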

> Keith


