[Mesa-dev] [PATCH] radeonsi: add cs tracing v2

Wed Mar 27 06:33:51 PDT 2013

On Wed, Mar 27, 2013 at 4:45 AM, Christian König
<deathsimple at vodafone.de> wrote:
> On 27.03.2013 01:43, Jerome Glisse wrote:
>
>> On Tue, Mar 26, 2013 at 6:45 PM, Dave Airlie <airlied at gmail.com> wrote:
>>>>>>
>>>>>> correctly). But Marek is quite right that this only counts for state
>>>>>> objects and makes no sense for set_* and draw_* calls (and I'm
>>>>>> currently thinking about how to avoid that and can't come up with a
>>>>>> proper solution). Anyway, it's definitely not an urgent problem for
>>>>>> radeonsi.
>>>>>
>>>>> It will be a problem once we actually start caring about performance
>>>>> and, most importantly, the CPU overhead of the driver.
>>>>>
>>>>>> I still think that writing into the command buffers directly (e.g.
>>>>>> without wrapper functions) is a bad idea, because that leads to
>>>>>> mixing driver logic and
>>>>>
>>>>> I'm convinced the exact opposite is a bad idea, because it adds
>>>>> another layer all commands must go through. A layer which brings no
>>>>> advantage. Think about apps which issue 1k-10k draw calls per frame.
>>>>> It's obvious that every byte moved around counts and the key to high
>>>>> framerate is to do (almost) nothing in the driver. It looks like the
>>>>> idea here is to make the driver as slow as possible.
>>>>>
>>>>>> packet building in r600g. For example, just try to figure out how the
>>>>>> relocations in NOPs work by reading the source (please keep in mind
>>>>>> that one of the primary reasons AMD is supporting this driver is to
>>>>>> provide good example code for customers who want to implement that
>>>>>> stuff on their own systems).
>>>>>
>>>>> I'm shocked. Sacrificing performance in the name of making the code
>>>>> nicer for some customers? Seriously? I thought the plan was to make
>>>>> the best graphics driver ever.
>>>>
>>>>
>>>> Well, maybe I'm repeating myself: performance is not a priority, it's
>>>> only nice to have!
>>>>
>>>> Sorry to say so, but if we sacrifice a bit of performance for more code
>>>> readability, then that is perfectly OK with me (don't get me wrong, I
>>>> would rather replace the closed source driver today than tomorrow;
>>>> it's unfortunately just not what I'm paid for).
>>>>
>>>> On the other hand, we are talking about perfectly optimizable inline
>>>> functions and/or macros. All I'm saying is that we should structure
>>>> the code a bit more.
>>>
>>> It's okay to take steps in the right direction, but if you start taking
>>> steps away from performance in favor of code readability, then please
>>> be prepared to deal with objections.
>>>
>>> The thing is, in a lot of cases code readability is in the eye of the
>>> beholder. I'm sure Jerome thought r600g was perfectly readable when he
>>> wrote it, but a lot of us didn't, and we spent a lot of time trying to
>>> remove the CPU overhead, not least the amount of time Marek spent. The
>>> thing is, performance is measurable; code readability isn't.
>>>
>>> Dave.
>>
>> Maybe once again you forgot why I did things the way I did them. I
>> explained myself to you back then: I designed r600g for a new kernel
>> API which was violently different from the cs one. My hope was that
>> the other kernel API would be better; it was not, and I never pushed
>> further on that front. So the r600g design was definitely not adapted
>> to the cs ioctl and was not designed with it in mind. History often
>> explains a lot of things, and people seem to forget about it.
>>
>> That being said, I too find the code readability argument ironic. If
>> one understands the cs ioctl, then the r600g code as it is nowadays
>> makes sense, but the radeonsi code is closer to what r600g used to be.
>> So, assuming the same ioctl, I would say that radeonsi should move
>> towards what r600g is nowadays.
>>
>> Anyway just wanted to set history straight.
>
>
> Well, I think you hit the point quite well here. May I ask what your
> kernel interface would have looked like?
>
> Christian.

I used to have a branch on fdo with it. Basically, what used to be
r600_hw_context was a no-op in gallium, and the state lived in the kernel
(cb, db, sampler view, sampler, ...). You created the state objects and
then bound them, so nearly all of the security checking happened at
creation time and binding was pretty quick. It was also transaction
based, and relocation was easier too. Anyway, it was a bad API. I know
that in a closed world or a more obscure stack you can have a kernel API
that doesn't do much security checking and call it a day, which gives
you a lot more freedom in the API.
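
To make the create-then-bind idea concrete, here is a rough userspace
sketch of the shape such an interface might take. It is not the code from
that branch: every name in it (state_create, state_bind, struct ctx,
BO_SIZE, and so on) is made up for illustration, the "security check" is
reduced to a bounds check against a pretend 4 KiB buffer object, and the
transaction and relocation parts are not modeled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; nothing here is the real radeon kernel API. */
enum state_kind { STATE_CB, STATE_DB, STATE_SAMPLER_VIEW, STATE_SAMPLER,
                  STATE_KIND_COUNT };

#define MAX_STATES 64
#define BO_SIZE    4096u  /* assumption: every buffer object is 4 KiB */

struct state_obj {
    bool valid;
    enum state_kind kind;
    uint32_t bo_offset;
    uint32_t size;
};

struct ctx {
    struct state_obj states[MAX_STATES];
    int bound[STATE_KIND_COUNT];  /* bound handle per kind, -1 if none */
    unsigned checks_run;          /* how many times validation ran */
};

void ctx_init(struct ctx *c)
{
    for (int i = 0; i < MAX_STATES; i++)
        c->states[i].valid = false;
    for (int k = 0; k < STATE_KIND_COUNT; k++)
        c->bound[k] = -1;
    c->checks_run = 0;
}

/* Slow path: all validation happens once, when the object is created.
 * Returns a handle >= 0 on success, -1 on failure. */
int state_create(struct ctx *c, enum state_kind kind,
                 uint32_t bo_offset, uint32_t size)
{
    c->checks_run++;
    if ((int)kind < 0 || (int)kind >= STATE_KIND_COUNT)
        return -1;
    /* Reject ranges that fall outside the buffer (overflow-safe form). */
    if (size == 0 || bo_offset > BO_SIZE || size > BO_SIZE - bo_offset)
        return -1;
    for (int i = 0; i < MAX_STATES; i++) {
        if (!c->states[i].valid) {
            c->states[i] = (struct state_obj){ true, kind, bo_offset, size };
            return i;
        }
    }
    return -1;  /* table full */
}

/* Fast path: binding is only a handle lookup, with no revalidation.
 * Returns 0 on success, -1 for an unknown handle. */
int state_bind(struct ctx *c, int handle)
{
    if (handle < 0 || handle >= MAX_STATES || !c->states[handle].valid)
        return -1;
    /* Invariant established by state_create(). */
    assert((int)c->states[handle].kind < STATE_KIND_COUNT);
    c->bound[c->states[handle].kind] = handle;
    return 0;
}
```

In this sketch, rebinding an already created object never touches the
validation code again, which is the property described above: the cost
is paid at creation time, not per bind.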

Cheers,
Jerome