[Mesa-dev] RFH and status of XvMC on r600g
Christian König
deathsimple at vodafone.de
Sat Jan 8 07:39:21 PST 2011
Hi,
over the past couple of weeks I have been optimizing the shaders used for
the iDCT and MC code. Besides optimizing the TGSI code for the shaders, I
also optimized the TGSI->R600 code generation in r600g quite a bit:
* Removed the temporary register use from most instructions
* Optimize away CF_INST_POP
* Use special constants for 0, 1, -1, 1.0f, 0.5f etc
* Implement output modifiers and use them to further optimize
  LRP
* Fixed TEX and VTX joining
* Optimize away CF ALU instructions even if type doesn't match
* Fix alu slot assignment
* Reworked and fixed bank swizzle code
* Implement replacing gpr with pv and ps
* Merging of alu slots into larger groups
* Reworked literal handling
* Implement register remapping
* Optimized away unneeded alu moves
* Rearranging and merging of export instructions
* Fully implemented barrier handling
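To illustrate the LRP item in the list above, here is a minimal sketch in plain C of the arithmetic involved. TGSI defines LRP as dst = src0 * src1 + (1 - src0) * src2. The function names are made up for illustration, and the assumption that the output modifier is used for the "weight is 0.5" case is mine; the list only says that output modifiers are used to optimize LRP. This models the math only, not the actual r600g code.

```c
/* Reference form, straight from the TGSI definition:
 * dst = w * a + (1 - w) * b */
static float lrp_reference(float w, float a, float b)
{
    return w * a + (1.0f - w) * b;
}

/* Algebraically equivalent two-operation form:
 * dst = w * (a - b) + b
 * i.e. one ADD with a negated source followed by one MULADD. */
static float lrp_two_ops(float w, float a, float b)
{
    float diff = a + (-b);  /* ADD, second source negated */
    return w * diff + b;    /* MULADD */
}

/* Assumed special case: when the weight is the constant 0.5 the
 * whole LRP collapses to (a + b) / 2, which maps to a single ADD
 * whose result is halved by a "divide by 2" output modifier. */
static float lrp_half(float a, float b)
{
    return (a + b) * 0.5f;  /* ADD plus output modifier /2 */
}
```

For example, lrp_reference(0.5f, 2.0f, 4.0f), lrp_two_ops(0.5f, 2.0f, 4.0f) and lrp_half(2.0f, 4.0f) all give 3.0f, but the last form needs only one ALU operation.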
The end result still looks valid and gives a nice 25% speed increase for
720x480p videos (probably a bit more, because the bottleneck is
definitely the CPU now), but for 1280x1080i and 1920x1080i the increase
is only around 7% and 5% respectively, with the CPU still quite idle.
I assume that the bottleneck for the higher resolutions is memory
bandwidth, due to the access patterns the iDCT and MC code uses. I
tried to enable tiling, but haven't been successful so far; all I get
when setting R600_FORCE_TILING is:
Failed to allocate :
size : 0 bytes
alignment : 0 bytes
I updated the kernel and merged my branch with master on a regular
basis, but I'm still getting the same error.
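To make the access-pattern argument concrete, here is a small sketch of why tiling should help block-based code like iDCT and MC. It counts how many memory lines an 8x8 block read touches in a linear surface versus a tiled one. The layout is a deliberately simplified, hypothetical one (8x8 tiles, 1 byte per texel, 64-byte lines), not the real R600 tiling format, and all names are invented for the example.

```c
#include <stdint.h>

#define LINE_BYTES 64u  /* assumed memory line / burst size */
#define TILE_DIM    8u  /* assumed 8x8 tile, 1 byte per texel */

/* Byte offset of texel (x, y) in a linear surface. */
static uint32_t linear_offset(uint32_t x, uint32_t y, uint32_t pitch)
{
    return y * pitch + x;
}

/* Byte offset in a simple tiled surface: tiles are stored one after
 * another in row-major order, texels row-major inside each tile.
 * pitch must be a multiple of TILE_DIM. */
static uint32_t tiled_offset(uint32_t x, uint32_t y, uint32_t pitch)
{
    uint32_t tiles_per_row = pitch / TILE_DIM;
    uint32_t tile = (y / TILE_DIM) * tiles_per_row + (x / TILE_DIM);
    uint32_t in_tile = (y % TILE_DIM) * TILE_DIM + (x % TILE_DIM);
    return tile * (TILE_DIM * TILE_DIM) + in_tile;
}

typedef uint32_t (*offset_fn)(uint32_t, uint32_t, uint32_t);

/* Count the distinct LINE_BYTES-sized lines touched when reading
 * the 8x8 block whose top-left texel is (x0, y0). */
static unsigned lines_touched(offset_fn fn, uint32_t x0, uint32_t y0,
                              uint32_t pitch)
{
    uint32_t seen[TILE_DIM * TILE_DIM];
    unsigned count = 0;

    for (uint32_t y = 0; y < TILE_DIM; y++) {
        for (uint32_t x = 0; x < TILE_DIM; x++) {
            uint32_t line = fn(x0 + x, y0 + y, pitch) / LINE_BYTES;
            unsigned i;

            for (i = 0; i < count; i++)
                if (seen[i] == line)
                    break;
            if (i == count)
                seen[count++] = line;
        }
    }
    return count;
}
```

With a pitch of 1920 bytes, an aligned 8x8 block touches 8 lines in the linear layout but only 1 in this tiled layout; an unaligned block (for example at x = 4, y = 4) touches 4 lines when tiled and still 8 when linear.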
So what am I missing? Do I need to update some other component, like
libdrm for example? Is there any way to debug the memory bandwidth usage
of the GPU?
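This is not a way to measure the actual bandwidth, but a rough lower bound on the output traffic can at least be estimated by hand. The sketch below assumes 8-bit YCbCr 4:2:0 frames (one luma byte per pixel plus two quarter-size chroma planes) and that every frame is written exactly once; the function names and the frame rate passed in are illustrative only. Real traffic will be higher, since it also includes the iDCT input and the reference frame reads for MC.

```c
#include <stdint.h>

/* Bytes in one decoded frame, assuming 8-bit YCbCr 4:2:0:
 * width * height luma bytes plus half of that again for chroma. */
static uint64_t frame_bytes_420(uint32_t width, uint32_t height)
{
    uint64_t luma = (uint64_t)width * height;
    return luma + luma / 2;
}

/* Lower bound on write traffic in bytes per second, assuming each
 * frame is written exactly once at the given frame rate. */
static uint64_t min_write_bytes_per_sec(uint32_t width, uint32_t height,
                                        uint32_t fps)
{
    return frame_bytes_420(width, height) * fps;
}
```

Under these assumptions a 720x480 frame is 518400 bytes, a 1280x1080 frame is 2073600 bytes (4 times as much) and a 1920x1080 frame is 3110400 bytes (6 times as much), so any inefficiency per pixel in the access pattern costs 4 to 6 times more at the higher resolutions.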
I'm currently a bit frustrated, because it looks like I'm stuck and
can't improve the speed further. Any help would be very welcome.
Regards,
Christian.