[Mesa-dev] [PATCH 5/6] i965: Implement nir_intrinsic_shader_clock

Emil Velikov emil.l.velikov at gmail.com
Thu Oct 8 03:39:53 PDT 2015


On 8 October 2015 at 00:36, Connor Abbott <cwabbott0 at gmail.com> wrote:
> On Wed, Oct 7, 2015 at 7:51 AM, Emil Velikov <emil.l.velikov at gmail.com> wrote:
>> Signed-off-by: Emil Velikov <emil.velikov at collabora.com>
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   |  9 +++++++++
>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 10 ++++++++++
>>  2 files changed, 19 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index 03fe680..bcb8f38 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -1317,6 +1317,15 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
>>        break;
>>     }
>>
>> +   case nir_intrinsic_shader_clock: {
>> +      /* We cannot do anything if there is an event, so ignore it for now */
>> +      fs_reg shader_clock = get_timestamp(bld);
>
> get_timestamp() isn't doing quite what you want. If you look at the
> definition of fs_visitor::get_timestamp(), you see this comment:
>
>    /* The caller wants the low 32 bits of the timestamp.  Since it's running
>     * at the GPU clock rate of ~1.2ghz, it will roll over every ~3 seconds,
>     * which is plenty of time for our purposes.  It is identical across the
>     * EUs, but since it's tracking GPU core speed it will increment at a
>     * varying rate as render P-states change.
>     *
>     * The caller could also check if render P-states have changed (or anything
>     * else that might disrupt timing) by setting smear to 2 and checking if
>     * that field is != 0.
>     */
> When you read the ARF register, you get three components set in your
> SIMD8 register, plus some junk because we have to read in four
> components at a time. That is, this code in get_timestamp:
>
>    fs_reg ts = fs_reg(retype(brw_vec4_reg(BRW_ARCHITECTURE_REGISTER_FILE,
>                                           BRW_ARF_TIMESTAMP,
>                                           0),
>                              BRW_REGISTER_TYPE_UD));
>
>    bld.group(4, 0).exec_all().MOV(dst, ts);
>
> will create a register called dst that looks like this:
>
>  ------------------------------
> | tm0.0 | tm0.1 | tm0.2 | junk |
>  ------------------------------
>
> Then doing dst.set_smear(0) will create a source that looks like this:
>
Thanks for this. Not sure how I missed this hunk.

I'm leaning towards pushing the comment + smear to
emit_shader_time_{begin.end} as it's applicable only there. This will
make the function analogous to the vec4 one, not to mention that it'll
save us the back and forth gymnastic, as you kindly described.

Cheers,
Emil


More information about the mesa-dev mailing list