time for amber2 branch?
Triang3l
triang3l at yandex.ru
Thu Jun 20 18:46:21 UTC 2024
On 19/06/2024 20:34, Mike Blumenkrantz wrote:
> Terakan is not a Mesa driver, and Mesa has no obligation to cater to
> out-of-tree projects which use its internal API. For everything else,
> see above.
I don't think, however, that it can simply be dismissed as if it didn't
exist when it's:
• striving to become a part of Mesa among the "cool" drivers with
broad extension support like RADV, Anvil, Turnip, and now NVK;
• actively developed nearly every day (albeit for around 2 hours per
day on average because it's a free time project);
• trying to explore horizons Mesa hasn't been to yet (submitting
hardware commands directly on Windows).
As for R600g, it's one thing to drop the constraints imposed by some
Direct3D 9 level GPUs that, for instance, don't even support integers in
shaders (if such constraints are even actually causing issues that
significantly slow down development of everything else; the broad
hardware support is something that I absolutely LOVE Mesa and the open
source infrastructure overall for, and I think that's the case for many
others too), but here we're talking about Direct3D 11 (or 10, but
programmed largely the same way) class hardware with OpenGL 4.5 already
supported, and 4.6 straightforward to implement.
This means that, with the exception of OpenCL-specific global addressing
issues (though R9xx can possibly have a 4 GB "global memory" binding),
the interface contract between Gallium's internals and R600g shouldn't
differ that much from that of the more modern drivers — the _hardware_
architecture itself doesn't really warrant dropping active support in
common code.
Incidents like one change suddenly breaking vertex strides are thus
mainly a problem in how _the driver itself_ is written, and that's of
course another story… While I can't say much about Gallium interactions
specifically, I keep encountering more and more things that are
unhandled or broken in how the driver actually works with the GPU, and
there are many Piglit tests that fail. I can imagine the way R600g is
integrated into Gallium isn't in a much better state.
So I think it may make sense (even though I definitely don't see any
serious necessity) to **temporarily** place R600g in a more stable
environment where regressions in it are less likely to happen, but then
once it's brought up to modern Mesa quality standards, and when it
becomes more friendly to the rest of Mesa, to **move it back** to the
main branch (though that may run into a whole lot of interface version
conflicts, who knows). Some of the things we can do to clean it up are:
• Make patterns of interaction with other subsystems of Gallium more
similar to those used by other drivers. Maybe use RadeonSI as the
primary example because of their shared roots.
• Fix some GPU configuration bugs — the ones I described in my previous
message, as well as some other small ones, such as:
• Emit all viewports and scissors at once without using the dirty
mask because the hardware requires that (already handled years ago in
RadeonSI).
• Fix gl_VertexID in indirect draws — the DRAW_INDIRECT packets
write the base to SQ_VTX_BASE_VTX_LOC, which has an effect on vertex
fetch instructions, but not on the vertex ID input; instead switch from
SQ_VTX_FETCH_VERTEX_DATA to SQ_VTX_FETCH_NO_INDEX_OFFSET, and COPY_DW
the base to VGT_INDX_OFFSET.
• Properly configure the export format of the pixel shader DB export
vector (gl_FragDepth, gl_FragStencilRefARB, gl_SampleMask).
• Investigate how queries currently work if the command buffer was
split in the middle of a query, add the necessary stitching where needed.
• Make Piglit squeal less. I remember trying to experiment with
glDispatchComputeIndirect, only to find out that the test I wanted to
run to verify my solution was broken for another reason. Oink oink.
• If needed, remove the remaining references to TGSI enums, and also
switch to the NIR transform feedback interface that, as far as I
understand, is compatible with the Nine and D3D10 frontends (or maybe
it's the other way around; either way, make that consistent).
• Do some cleanup in common areas:
• Register, packet and shader structures can be moved to JSON
definitions similar to those used for GCN/RDNA, but with clearer
indication of the architecture revisions they can be used on (without
splitting into r600d.h and evergreend.h). I've already stumbled upon a
typo in that probably hand-written S_/G_/C_ #define soup that has caused
weird Vulkan CTS failures once, specifically in
C_028780_BLEND_CONTROL_ENABLE in evergreend.h, and who knows what other
surprises may be there. Some fields there are apparently just for the
wrong architecture revisions (though maybe actually present, but
undocumented, I don't know, given the [RESERVED] situation with the
documentation for anisotropic filtering and maybe non-1D/2D_THIN tiling
modes, for example, and that we have the reference for the 3D registers,
but not for compute).
• A lot of format information can be shared between vertex fetch,
texture fetch, and color/storage attachments. I'm currently finishing
some common format code for Terakan that may be adopted by R600g.
• Carefully make sure virtual memory is properly supported in all
places on R9xx (using virtual addresses, and not emitting relocation
NOPs that are harmless but wasteful — moreover, this part deserves some
common function that will make it easier to port R600g to other
platforms, such as by making it write D3DKMTRender patch locations on
Windows).
• Unify R6xx/R7xx and R8xx/R9xx code wherever possible. There's
r600_state.c, which is over 100 KB in size, and evergreen_state.c, which
is even bigger, but in many places they contain the same code, merely
including r600d.h in one file and evergreend.h in the other — and how
much technical debt we already have in the R6xx/R7xx code is an
interesting question.
To me, there doesn't seem to be any necessity to abandon R6xx/R7xx
support completely currently considering that the programming
differences from R8xx/R9xx are pretty minor. At least as long as someone
occasionally runs tests on the older generations.
Maybe that will involve some small-scale changes, or maybe it will end
up being more like a rewrite, but it's totally possible that R600g may
be facing a new beginning at this point rather than an ending,
especially with Gert Wollny's compiler, and with me visiting every
aspect of the interface of those GPUs. At some point we may even start
exposing
R600-specific functionality such as D3DFMT_D24FS8 in Gallium Nine on
R6xx/R7xx.
However, I don't like the whole idea of moving drivers away from the
main branch because that affects not only development, but also users of
Mesa. It'd be necessary to ensure that Linux distribution maintainers
are well-notified of the new branch, but even then that may still cause
issues. Like, what if the amber2 drivers end up in a separate package in
a distribution? That would possibly mean that after some `apt-get
dist-upgrade`, users suddenly lose GPU acceleration on their systems for
an unobvious reason. And we definitely shouldn't be
underestimating the number of users of that old hardware outside Linux
developer circles — especially TeraScale (I think Firefox regularly gets
issue reports from Nvidia Rankine/Curie users?). I occasionally see
people on Reddit and other platforms discussing the status of Terakan,
and I'd expect that the people who talk about some software are just a
small fraction of those who use it at all. And sometimes weird things
just happen like Bringus Studios bringing up a Xi3 Piston out of
semi-vaporware nowhere…
Regarding CI, I can't promise anything right now, but I think that's not
an unsolvable issue. Overall just one machine with a Trinity APU, an
R6xx/R7xx card, and an R8xx card (one of them preferably being RV670,
RV770, or Cypress/Hemlock, to be able to test co-issuing of float64
instructions with a transcendental one when that's implemented) likely
should cover most of our regression testing needs — most definitely at
least where Gallium interaction is concerned.
Terakan development will surely continue being based on the main branch,
partly because the original reason behind the split suggestion mostly
doesn't apply to it. I do need recent Vulkan headers and all the WSI
improvements at the very least — and there are areas where Terakan
itself may contribute something new to the common Vulkan runtime code. I
already have some WSI-demanded binary-over-timeline sync type
enhancements on my branch, and if my Windows experiments go forward,
there will likely be a lot that can be added to the common code, such
as WDDM 1 synchronization primitives (even though WDDM 2's timeline
semaphores aka monitored fences are more important to modern drivers,
there's no WDDM 2 on Windows older than 10), as well as paths for
zero-copy presentation (primarily for WDDM 1 level configurations — like
via sharing images with Direct3D 10/11, or with OpenGL to take advantage
of the "exclusive borderless" driver hack, or maybe even via
D3DKMTPresent where possible).
On 20/06/2024 20:30, Adam Jackson wrote:
> We're using compute shaders internally in more and more ways, for
> example, maybe being able to assume them would be a win.
I'd imagine that compute shader usage scenarios in common Gallium code
are optional, and depending on the hardware, compute shaders can even be
the less optimal approach to things like image copying/resolving (where
specialized copy hardware is available) from the perspective of
performance or format support. For instance, early (or maybe actually
all, I don't know for sure yet) AMD R8xx hardware hangs with linear
storage images according to one comment in R800AddrLib, which is why a
quad with a color target may be preferable for copying; that hardware
also has fast resolves inside its color buffer hardware, as well as a
DMA engine.
— Triang3l
More information about the mesa-dev
mailing list