[PATCH] drm/xe: Get XE_FORCEWAKE_ALL to read RING_TIMESTAMP

Umesh Nerlige Ramappa umesh.nerlige.ramappa at intel.com
Tue Jun 25 15:30:23 UTC 2024


On Mon, Jun 24, 2024 at 05:54:19PM -0700, Umesh Nerlige Ramappa wrote:
>On Mon, Jun 24, 2024 at 02:56:12PM -0700, Matt Roper wrote:
>>On Mon, Jun 24, 2024 at 09:44:03AM -0500, Lucas De Marchi wrote:
>>>On Sat, Jun 22, 2024 at 03:23:25AM GMT, Umesh Nerlige Ramappa wrote:
>>>> Per client engine utilization uses RING_TIMESTAMP to return
>>>> drm-total-cycles to the user. We read only the rcs RING_TIMESTAMP in
>>>> platforms that have render. While this works for rcs and
>>>> ccs, it is observed that this value is 0 when running work on bcs, vcs
>>>> and vecs. Ideally we should read the engine specific RING_TIMESTAMP, but
>>>
>>>what do you mean by engine specific RING_TIMESTAMP? We are already
>>>reading the engine specific copy of the same timestamp. AFAIR just
>>>getting the gt fw was working for reading this register from vcs, vecs,
>>>bcs. Can you demonstrate the failure with a bug open or a series of
>>>intel_reg invocations showing otherwise?
>>>
>>>I think the simplest way to show it doesn't work is to call
>>>`intel_reg write` to get a GT forcewake, then `intel_reg read` to read
>>>this register.
>>
>>I'm surprised to hear that this has been working for the RCS/CCS but not
>>the BCS; that seems backward.  RING_TIMESTAMP is $engine_base + 0x358.
>>According to bspec pages 71185 & 71186:
>>
>>- Render  (0x2358) needs RENDER forcewake
>>- BCS0   (0x22358) needs GT forcewake
>>- BCS8  (0x3ee358) needs GT forcewake
>>- CCS0   (0x1a358) needs RENDER forcewake
>>- CCS1   (0x1c358) needs RENDER forcewake
>>- CCS2   (0x1e358) needs RENDER forcewake
>>- CCS3   (0x26358) needs RENDER forcewake
>>- VCS0  (0x1c0358) needs VDBOX0 forcewake
>>- VECS0 (0x1c8358) needs VEBOX0 forcewake
>>- GSC   (0x11a358) needs GSC forcewake
>>
>>So I'd expect everything _except_ the BCS engines to be broken today...
>>
>>FORCEWAKE_ALL is a big hammer that should indeed solve the immediate
>>problem, but I wonder if it would make more sense to store the specific
>>domain that needs to be grabbed for accesses to any registers in an
>>engine's ($engine_base .. $engine_base+0x2000) MMIO region so that we
>>can avoid unnecessarily waking up all the unrelated power domains
>>whenever we want to read a timestamp?
>
>Well.. I don't know if we really gain anything by doing so, so let me 
>know if (2) is what you are suggesting below.
>
>As an example, show_run_ticks() will get called every second and would 
>rely on reading the GT timestamp each second. We could potentially 
>have domain specific logic here, but logic would still iterate over 
>all domains since we are capturing activity for all classes.
>
>1)
>get forcewake_all
>read ring_timestamp
>put forcewake_all
>
>vs.
>
>2) for_each_class() {
>	get class specific FW
>	read class specific ring_timestamp
>	put class specific FW
>}

I think I understand now. Since we are reading only one hwe specific 
timestamp, we are better off just using that FW domain. I agree that 
it's much better than FORCEWAKE_ALL. Will post that in rev2.

Thanks,
Umesh

>
>Thanks,
>Umesh
>
>>
>>
>>Matt
>>
>>>
>>>Lucas De Marchi
>>>
>>>> to keep the logic simple, just get XE_FORCEWAKE_ALL instead of XE_GT_FW.
>>>>
>>>> This should work fine on multi-gt platforms as well since the
>>>> gt_timestamp is in sync on all GTs.
>>>>
>>>> Fixes: 188ced1e0ff8 ("drm/xe/client: Print runtime to fdinfo")
>>>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
>>>> ---
>>>> drivers/gpu/drm/xe/xe_drm_client.c | 4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_drm_client.c b/drivers/gpu/drm/xe/xe_drm_client.c
>>>> index 4a19b771e3a0..74f2244679f3 100644
>>>> --- a/drivers/gpu/drm/xe/xe_drm_client.c
>>>> +++ b/drivers/gpu/drm/xe/xe_drm_client.c
>>>> @@ -264,9 +264,9 @@ static void show_run_ticks(struct drm_printer *p, struct drm_file *file)
>>>> 		if (!hwe)
>>>> 			continue;
>>>>
>>>> -		xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
>>>> +		xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
>>>> 		gpu_timestamp = xe_hw_engine_read_timestamp(hwe);
>>>> -		xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
>>>> +		xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
>>>> 		break;
>>>> 	}
>>>>
>>>> --
>>>> 2.34.1
>>>>
>>
>>-- 
>>Matt Roper
>>Graphics Software Engineer
>>Linux GPU Platform Enablement
>>Intel Corporation


More information about the Intel-xe mailing list