<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Oct 25, 2017 at 10:31 AM, Kenneth Graunke <span dir="ltr"><<a href="mailto:kenneth@whitecape.org" target="_blank">kenneth@whitecape.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Wednesday, October 25, 2017 7:33:41 AM PDT Jason Ekstrand wrote:<br>
> On October 25, 2017 06:05:16 Joonas Lahtinen wrote:<br>
[snip]<br>
<span class="">> > There indeed seems to be quite a lot of missing registers from the i915<br>
> > driver where the context is initialized. (Psst. You can read that as:<br>
> > "all the 33 non-privileged registers we could quickly list, are<br>
> > missing").<br>
><br>
> We probably don't need *all* of them initialized.  For instance, the<br>
> initial values of the ALU registers or the indirect draw parameter<br>
> registers will probably never matter.  However, if you want to just<br>
> initialize them all, that's fine.<br>
<br>
</span>I agree - I think we can cut down the list substantially, if you like.<br>
Here's my breakdown of Skylake's non-privileged register list:<br>
<br>
Cache_Mode_0               0x7000<br>
Cache_Mode_1               0x7004<br>
GT_MODE                    0x7008<br>
L3_Config                  0x7034<br>
TD_CTL                     0xE400<br>
TD_CTL2                    0xE404<br>
L3SQCREG4                  0xB118<br>
NOPID                      0x2094<br>
INSTPM                     0x20C0<br>
<br>
   Should be initialized by the kernel.  Several of these can severely<br>
   break unsuspecting userspace, and we'd like to be able to rely on a<br>
   default value.<br>
<br>
IA_VERTICES_COUNT          0x2310<br>
IA_PRIMITIVES_COUNT        0x2318<br>
VS_INVOCATION_COUNT        0x2320<br>
HS_INVOCATION_COUNT        0x2300<br>
DS_INVOCATION_COUNT        0x2308<br>
GS_INVOCATION_COUNT        0x2328<br>
GS_PRIMITIVES_COUNT        0x2330<br>
SO_NUM_PRIMS_WRITTEN0      0x5200<br>
SO_NUM_PRIMS_WRITTEN1      0x5208<br>
SO_NUM_PRIMS_WRITTEN2      0x5210<br>
SO_NUM_PRIMS_WRITTEN3      0x5218<br>
SO_PRIM_STORAGE_NEEDED0    0x5240<br>
SO_PRIM_STORAGE_NEEDED1    0x5248<br>
SO_PRIM_STORAGE_NEEDED2    0x5250<br>
SO_PRIM_STORAGE_NEEDED3    0x5258<br>
CL_INVOCATION_COUNT        0x2338<br>
CL_PRIMITIVES_COUNT        0x2340<br>
PS_INVOCATION_COUNT_0      0x22C8<br>
PS_DEPTH_COUNT_0           0x22D8<br>
PS_INVOCATION_COUNT_1      0x22F0<br>
PS_DEPTH_COUNT_1           0x22F8<br>
PS_INVOCATION_COUNT_2      0x2448<br>
PS_DEPTH_COUNT_2           0x2450<br>
GPGPU_THREADS_DISPATCHED   0x2290<br>
<br>
   The kernel can skip these if you like.  Statistics registers just count<br>
   things, and userspace always calculates (end counter - start counter)<br>
   deltas, so the initial value doesn't really matter.<br>
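<br>
   A minimal sketch of why the initial value drops out of the delta.  The<br>
   helper name and the 64-bit counter width are assumptions made for<br>
   illustration, not taken from the driver or the hardware docs:<br>

```python
# Hypothetical helper, not part of any driver API.
COUNTER_BITS = 64  # assumption: statistics counters are 64 bits wide


def counter_delta(start, end, bits=COUNTER_BITS):
    """Return how far a free-running counter advanced from start to end."""
    # Modular subtraction cancels the unknown initial value and stays
    # correct if the counter wrapped once between the two reads.
    return (end - start) % (2 ** bits)


# The same amount of work, observed from two different initial values.
print(counter_delta(0, 1500))
print(counter_delta(123456, 123456 + 1500))
# A wrap between the two reads still yields the same delta.
print(counter_delta(2 ** 64 - 100, 1400))
```

   All three calls report the same delta, whatever value the counter<br>
   started from.<br>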
<br>
SO_WRITE_OFFSET0           0x5280<br>
SO_WRITE_OFFSET1           0x5284<br>
SO_WRITE_OFFSET2           0x5288<br>
SO_WRITE_OFFSET3           0x528C<br>
GPGPU_DISPATCHDIMX         0x2500<br>
GPGPU_DISPATCHDIMY         0x2504<br>
GPGPU_DISPATCHDIMZ         0x2508<br>
MI_PREDICATE_SRC0          0x2400<br>
MI_PREDICATE_SRC0          0x2404<br>
MI_PREDICATE_SRC1          0x2408<br>
MI_PREDICATE_SRC1          0x240C<br>
MI_PREDICATE_DATA          0x2410<br>
MI_PREDICATE_DATA          0x2414<br>
MI_PREDICATE_RESULT        0x2418<br>
MI_PREDICATE_RESULT_1      0x241C<br>
MI_PREDICATE_RESULT_2      0x23BC<br>
3DPRIM_END_OFFSET          0x2420<br>
3DPRIM_START_VERTEX        0x2430<br>
3DPRIM_VERTEX_COUNT        0x2434<br>
3DPRIM_INSTANCE_COUNT      0x2438<br>
3DPRIM_START_INSTANCE      0x243C<br>
3DPRIM_BASE_VERTEX         0x2440<br>
<br>
   The kernel can skip these if you like, IMO.  These registers are only<br>
   used when enabling an optional feature - stream out (SO_WRITE_*),<br>
   indirect compute dispatch (GPGPU_*), predicated draws (MI_PREDICATE_*),<br>
   indirect draws (3DPRIM_*).  Userspace has to explicitly opt in to each<br>
   of these features by enabling a flag, so there isn't a cross-context<br>
   contamination problem.  If userspace opts in to these features, it can<br>
   be responsible for programming the registers correctly.<br>
<br>
CS_GPR (1-16)              0x2600<br>
<br>
   The kernel can skip these if you like.  They're temporary storage when<br>
   using the MI_MATH instruction.  Example usage: load values into CS_GPR1<br>
   and CS_GPR2, add them, store the result in CS_GPR3.  Store to memory.<br>
<br>
   Nobody should be doing math on register values without setting them.<br>
   That's clearly a userspace bug.<br>
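<br>
   The example usage above can be sketched as a toy model.  This is not<br>
   real MI_MATH encoding; the class, its method names, and the 64-bit<br>
   register width are all assumptions for illustration.  Real hardware<br>
   would hand back a stale value from an unset register, whereas this<br>
   model raises an error to make the bug visible:<br>

```python
# Toy model only: real MI_MATH is a command-streamer instruction, and none
# of these names come from the hardware documentation or any driver.
class ToyCommandStreamer:
    def __init__(self):
        self.gpr = {}     # CS_GPR index -> value; absent means never written
        self.memory = {}  # address -> value

    def load_imm(self, reg, value):
        """Set a general-purpose register to a known value."""
        self.gpr[reg] = value % (2 ** 64)  # assumption: 64-bit registers

    def add(self, dst, src_a, src_b):
        """Add two registers, refusing to read one that was never set."""
        for reg in (src_a, src_b):
            if reg not in self.gpr:
                raise ValueError("CS_GPR%d read before being set" % reg)
        self.gpr[dst] = (self.gpr[src_a] + self.gpr[src_b]) % (2 ** 64)

    def store(self, reg, address):
        """Write a register's value out to (modelled) memory."""
        if reg not in self.gpr:
            raise ValueError("CS_GPR%d stored before being set" % reg)
        self.memory[address] = self.gpr[reg]


cs = ToyCommandStreamer()
cs.load_imm(1, 40)   # load a value into CS_GPR1
cs.load_imm(2, 2)    # load a value into CS_GPR2
cs.add(3, 1, 2)      # CS_GPR3 = CS_GPR1 + CS_GPR2
cs.store(3, 0x1000)  # store the result to memory
print(cs.memory[0x1000])
```

   Every register the math reads is written first, so whatever value the<br>
   kernel left in CS_GPR1-3 never enters the result.<br>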
<br>
BB_OFFSET                  0x2158<br></blockquote><div><br></div><div>This is used for indirect BATCH_BUFFER_START which is a thing on SKL+ I believe (I didn't look at the docs).<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
OA_CTX_CONTROL             0x2360<br>
OACTXID                    0x2364<br>
OA CONTROL                 0x2B00<br>
PERF_CNT_1_DW0             0x91B8<br>
PERF_CNT_1_DW1             0x91BC<br>
PERF_CNT_2_DW0             0x91C0<br>
PERF_CNT_2_DW1             0x91C4<br>
<br>
   I don't know about these.<br>
</blockquote></div><br></div></div>