[Intel-gfx] [PATCH v2 1/1] drm/i915: Save PM interrupt register offsets in device info
Tvrtko Ursulin
tursulin at ursulin.net
Tue Oct 24 20:26:02 UTC 2017
On 24/10/17 18:48, Jani Nikula wrote:
> On Tue, 24 Oct 2017, Chris Wilson <chris at chris-wilson.co.uk> wrote:
>> Quoting Sagar Arun Kamble (2017-10-24 11:41:13)
>>> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
>>> index 875d428..d1a4911 100644
>>> --- a/drivers/gpu/drm/i915/intel_device_info.c
>>> +++ b/drivers/gpu/drm/i915/intel_device_info.c
>>> @@ -462,4 +462,15 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
>>> info->sseu.has_subslice_pg ? "y" : "n");
>>> DRM_DEBUG_DRIVER("has EU power gating: %s\n",
>>> info->sseu.has_eu_pg ? "y" : "n");
>>> +
>>> + /* Initialize PM interrupt register offsets */
>>> + if (INTEL_GEN(dev_priv) >= 8) {
>>> + info->pm_iir_offset = GEN8_GT_IIR(2);
>>> + info->pm_imr_offset = GEN8_GT_IMR(2);
>>> + info->pm_ier_offset = GEN8_GT_IER(2);
>>> + } else {
>>> + info->pm_iir_offset = GEN6_PMIIR;
>>> + info->pm_imr_offset = GEN6_PMIMR;
>>> + info->pm_ier_offset = GEN6_PMIER;
>>> + }
>>
>> If you are going to take another pass at this, move these into the
>> static tables in i915_pci.c
>>
>> Updating GEN6_FEATURES and GEN8_FEATURES will then percolate into
>> individual platform defines.
>
> Like I wrote in reply to v1, I'm not convinced we should do this at all.
>
> What makes *these* registers so important they must be in device info?
> What makes most of i915_reg.h so unimportant they don't deserve the same
> treatment? Where do you draw the line?
>
> I'd draw the line at, no registers at device info.
I suggested to Sagar this change during review so feel responsible to
chime in.
So in general I just find the amount of times our driver asks itself
what it's running on a bit tasteless. :(
I did quick and dirty check by bumping a counter in all the
IS_this|or|that checks, all which can be known at driver probe time, and
wired it up to the PMU so I can check their frequency. The annotated
perf stat output:
root at e31:~# perf stat -a -e i915/whoami/ -I 1000
# time counts unit events
# idle system no X running
1.000298100 10 i915/whoami/
2.000750955 8 i915/whoami/
3.001104193 10 i915/whoami/
4.001333433 10 i915/whoami/
5.001703162 10 i915/whoami/
6.002122721 10 i915/whoami/
# starting X now..
7.002266228 2,203 i915/whoami/
8.002392598 4,682 i915/whoami/
9.002764398 0 i915/whoami/
10.003027119 0 i915/whoami/
11.003486048 42 i915/whoami/
# X idling..
12.003854660 0 i915/whoami/
13.004221680 0 i915/whoami/
14.004622571 0 i915/whoami/
15.004968110 0 i915/whoami/
16.005372363 0 i915/whoami/
17.005778034 0 i915/whoami/
18.005941970 0 i915/whoami/
19.006313427 0 i915/whoami/
20.006676048 0 i915/whoami/
21.007059927 0 i915/whoami/
22.007507818 0 i915/whoami/
23.007887628 0 i915/whoami/
24.008207035 0 i915/whoami/
25.008580496 0 i915/whoami/
# time counts unit events
26.008949236 0 i915/whoami/
27.009433473 0 i915/whoami/
# gfxbench trex starting up
28.009677600 2,605 i915/whoami/
29.009941972 716 i915/whoami/
30.010127588 2,190 i915/whoami/
31.010249535 46 i915/whoami/
32.010383565 36 i915/whoami/
33.010527674 0 i915/whoami/
# trex running
34.010760584 4,709 i915/whoami/
35.011079891 5,381 i915/whoami/
36.011280234 5,306 i915/whoami/
37.011719986 5,505 i915/whoami/
38.012017531 5,386 i915/whoami/
39.012529241 5,687 i915/whoami/
40.012922986 6,009 i915/whoami/
41.013120143 5,791 i915/whoami/
42.013399982 5,296 i915/whoami/
43.013712979 5,349 i915/whoami/
44.014107375 5,127 i915/whoami/
45.014553950 5,387 i915/whoami/
46.014953020 5,364 i915/whoami/
47.015243748 4,738 i915/whoami/
48.015560460 4,788 i915/whoami/
49.015867395 4,927 i915/whoami/
50.016152690 4,886 i915/whoami/
So.. I am not saying these particular registers are mega important, and
not even saying that these 5k/s conditionals are measurable (either as
branches or increased code size effect), but overall the situation is a
bit of.. bleurgh from the elegance point of view. :(
If we have register sets which are 100% mutually exclusive, then I see
them as candidates to put them in some object at probe time. It doesn't
have to be device_info but I don't see why we wouldn't do it. It is just
a different flavour of the vfunc approach after all.
Regards,
Tvrtko
More information about the Intel-gfx
mailing list