[Mesa-dev] [ANNOUNCE] mesa 17.3.4
Mark Janes
mark.a.janes at intel.com
Sat Feb 17 20:03:12 UTC 2018
Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl> writes:
> (-mesa-announce + Mark, Dave and James)
>
> Hi Emil,
>
> radv is broken for nearly all commercial games in 17.3.4. The cause is
>
> commit ad764e365beb8a119369b97f22225cb95fc7ea8c
> Author: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>
> Date: Mon Jan 22 09:01:29 2018 +0100
>
> ac/nir: Use instance_rate_inputs per attribute, not per variable.
>
> This did the wrong thing if we had e.g. an array for which only some
> of the attributes use the instance index. Tripped up some new CTS
> tests.
>
> CC: <mesa-stable at lists.freedesktop.org>
> Reviewed-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
> Reviewed-by: Dave Airlie <airlied at redhat.com>
> (cherry picked from commit 5a4dc285002e1924dbc8c72d17481a3dbc4c0142)
>
> Conflicts:
> src/amd/common/ac_nir_to_llvm.c
>
> A typo was introduced during the conflict resolution while
> cherrypicking to 17.3.
>
> First things first, can we mitigate this? Would it be possible to get
> a 17.3.5 with this fixed ASAP? This can be fixed by either rolling
> back ad764e365beb8a119369b97f22225cb95fc7ea8c or by applying
> https://patchwork.freedesktop.org/patch/204260/ (obviously not applied
> to master as the issue did not occur there).
>
> Secondly, I'd like to talk about process and how to prevent this in
> the future. Bugfix releases are supposed to be stable so downstream
> maintainers don't have to deal with this kind of stuff happening,
> so I think that breaking radv pretty much completely is particularly
> egregious and we should look at how to prevent this happening another
> time.
>
> First a short summary of what happened and when (all times in UTC):
>
> 2018-02-09 4:47: Emil sends out the pre-release announcement, git
> branch contains the faulty commit
> 2018-02-13 16:05: James Legg replies to the pre-release announcement:
> https://lists.freedesktop.org/archives/mesa-dev/2018-February/185355.html
> 2018-02-13 16:06: James Legg sends a patch to fix it:
> https://lists.freedesktop.org/archives/mesa-dev/2018-February/185356.html
> 2018-02-13 16:33: Bas reviews the patch.
> 2018-02-15 12:56: Emil releases 17.3.4 without the fix:
> https://lists.freedesktop.org/archives/mesa-dev/2018-February/185691.html
> 2018-02-16 21:09: Someone else replies with a fix to the pre-release
> announcement: https://lists.freedesktop.org/archives/mesa-dev/2018-February/185908.html
> 2018-02-17 ~13:00: Bas notices the attached fix has "mismerge" in the
> name and decides to investigate the issue
>
>
> So I think there have been a couple of issues in communication here:
>
> 1) Essentially none of the communication talked about this being an
> issue on the branch only. This made me underestimate the severity of
> the issue, since the games kept working on master. This also meant no
> manual pings from my side to the release manager for not applying this
> to master first.
> 2) The reply to the pre-release announcement only said there was an
> issue, not that there was a fix sent out, nor any details like
> severity.
> 3) As far as I can tell no action had been taken by the release
> manager. James' reply did not get a response nor was the fix included
> in the release.
>
> My question would be how to improve the communication here. Could you
> elaborate the reason for (3)? Was it because you thought it would have
> to be picked up by the radv developers first? Would it help if I
> replied to James' reply in the announcement to link to the patch?
>
> The remaining issue is mainly about testing. The initial detection of
> this issue by James was already more than the 24-48 hours after the
> pre-release announcement that is recommended between the announcement
> and release, so a faster release would have prevented timely detection
> in the first place. I suspect radv does not get tested at all on
> releases except maybe a build test?
>
> Since manually testing all games on every release is not scalable for
> either the radv developers or the release manager I'm thinking of
> getting the automated tests from the vulkan CTS and crucible running
> before a release. That said I don't think keeping track of what tests
> are supposed to pass is something we should be pushing on the release
> manager, so I suppose we should be setting up our own CI and surfacing
> results of the stable branches to the release manager?
Tracking results is at best a significant investment which attenuates as
the driver becomes more stable and automation improves. It takes us
perhaps an hour per day on average to track i965 status in CI. This
investment pays bigger dividends for larger developer teams. There is
constant demand to expand the coverage of CI.

Tracking results requires developer time when the driver has been
broken, but this cost is lower than that of any alternative.
Unfortunately, this cost is often borne by whoever runs the CI, and can
become burdensome.
> I'm assuming there is already a similar arrangement with intel? Is
> this something where the release manager triggers their CI specifically
> when doing a pre-release or are they just continuously watching stable
> branches? Also how do the results get surfaced back to the release
> manager?
We trigger CI verification on release manager branches as a stable
release is prepared. Issues like this are very rare but bound to
happen, and we found that it was necessary to have CI verification as a
precondition of release.

Stable branch verification is cheaper and more accurate when predicated
on a CI process which is already tracking master.

The minimal investment would be to A/B test vulkancts for each release.
You would have to reconcile:

- unstable tests
- tests fixed by the driver
- driver regressions
- tests which were wrong, and were fixed by a subsequent driver and
  cts/crucible commit

It's unlikely that you would find much for minor releases, because the
release process is mostly safe. For each new branchpoint, it takes
quite a while to reconcile all the differences, but this usually means
there are unfixed bugs in the driver.
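
As a rough illustration of the A/B reconciliation described above, here
is a minimal sketch in Python. It is not how any existing CI does it.
The input format (a dict mapping each test name to the list of statuses
seen over repeated runs), the "pass"/"fail" status strings and all
function and category names are assumptions made for the example. A
test that goes from pass to fail is only flagged for triage, since a
script cannot tell a driver regression from a test that was wrong and
later corrected.

```python
from enum import Enum


class Change(Enum):
    UNCHANGED = "unchanged"
    UNSTABLE = "unstable test"
    FIXED = "fixed by the driver"
    # pass -> fail: either a driver regression or a corrected test;
    # telling them apart needs a human looking at the commits.
    REGRESSION = "driver regression or corrected test (needs triage)"
    ADDED_OR_REMOVED = "test present on only one side"


def classify(baseline_runs, candidate_runs):
    """Classify one test from its repeated-run statuses on each side.

    Each argument is a list of "pass"/"fail" strings (hypothetical
    format); an empty list means the test did not run on that side.
    """
    if not baseline_runs or not candidate_runs:
        return Change.ADDED_OR_REMOVED
    # Any disagreement between repeated runs marks the test unstable.
    if len(set(baseline_runs)) > 1 or len(set(candidate_runs)) > 1:
        return Change.UNSTABLE
    before, after = baseline_runs[0], candidate_runs[0]
    if before == after:
        return Change.UNCHANGED
    if before == "fail" and after == "pass":
        return Change.FIXED
    return Change.REGRESSION


def reconcile(baseline, candidate):
    """Group test names by category for a baseline/candidate pair."""
    report = {change: [] for change in Change}
    for name in sorted(set(baseline) | set(candidate)):
        change = classify(baseline.get(name, []), candidate.get(name, []))
        report[change].append(name)
    return report


if __name__ == "__main__":
    # Hypothetical test names and results, for illustration only.
    previous_release = {
        "example.draw": ["pass", "pass"],
        "example.resolve": ["fail", "fail"],
        "example.flaky": ["pass", "fail"],
    }
    release_candidate = {
        "example.draw": ["fail", "fail"],
        "example.resolve": ["pass", "pass"],
        "example.flaky": ["pass", "pass"],
    }
    for change, names in reconcile(previous_release, release_candidate).items():
        if names and change is not Change.UNCHANGED:
            print(f"{change.value}: {', '.join(names)}")
```

Running each side more than once is what lets the sketch separate
unstable tests from real changes; with a single run per side, a flaky
test is indistinguishable from a fix or a regression.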
> Apologies for the barrage of questions!
>
> Thanks,
> Bas Nieuwenhuizen
>
> On Thu, Feb 15, 2018 at 1:56 PM, Emil Velikov <emil.l.velikov at gmail.com> wrote:
>> Mesa 17.3.4 is now available.
>>
>> In this release we have:
>>
>> Dozens of fixes in the i965, ANV and RADV drivers. Additionally
>> the r600, virgl, etnaviv and renderonly drivers have also seen some love.
>>
>> The experimental Vulkan extension VK_KHX_multiview was disabled.
>>
>> On the video decoding drivers side:
>> r600/radeonsi correctly handle new UVD/VCN firmware. The VA and OMX
>> state-trackers have some MPEG2 glitches resolved, while locking is correctly
>> handled in the error paths.
>>
>> To top it up, the libGL module should build fine on non-dri and Darwin systems.
>>
>>
>> Andres Gomez (1):
>> i965: perform 2 uploads with dual slot *64*PASSTHRU formats on gen<8
>>
>> Bas Nieuwenhuizen (10):
>> radv: Fix ordering issue in meta memory allocation failure path.
>> radv: Fix memory allocation failure path in compute resolve init.
>> radv: Fix freeing meta state if the device pipeline cache fails
>> to allocate.
>> radv: Fix fragment resolve init memory allocation failure paths.
>> radv: Fix bufimage failure deallocation.
>> radv: Init variant entry with memset.
>> radv: Don't allow 3d or 1d depth/stencil textures.
>> ac/nir: Use instance_rate_inputs per attribute, not per variable.
>> ac/nir: Use correct 32-bit component writemask for 64-bit SSBO stores.
>> ac/nir: Fix vector extraction if source vector has >4 elements.
>>
>> Boyuan Zhang (2):
>> radeon/vcn: add and manage render picture list
>> radeon/uvd: add and manage render picture list
>>
>> Chuck Atkins (1):
>> configure.ac: add missing llvm dependencies to .pc files
>>
>> Dave Airlie (10):
>> r600/sb: fix a bug emitting ar load from a constant.
>> ac/nir: account for view index in the user sgpr allocation.
>> radv: add fs_key meta format support to resolve passes.
>> radv: don't use hw resolve for integer image formats
>> radv: don't use hw resolves for r16g16 norm formats.
>> radv: move spi_baryc_cntl to pipeline
>> r600/sb: insert the else clause when we might depart from a loop
>> radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)
>> radv/gfx9: fix block compression texture views. (v2)
>> virgl: also remove dimension on indirect.
>>
>> Eleni Maria Stea (1):
>> mesa: Fix function pointers initialization in status tracker
>>
>> Emil Velikov (19):
>> cherry-ignore: i965: Accept CONTEXT_ATTRIB_PRIORITY for brwCreateContext
>> cherry-ignore: swr: refactor swr_create_screen to allow for
>> proper cleanup on error
>> cherry-ignore: anv: add explicit 18.0 only nominations
>> cherry-ignore: radv: fix sample_mask_in loading. (v3.1)
>> cherry-ignore: meson: multiple fixes
>> cherry-ignore: swr/rast: support llvm 3.9 type declarations
>> Revert "cherry-ignore: intel/fs: Use the original destination
>> region for int MUL lowering"
>> cherry-ignore: ac/nir: set amdgpu.uniform and invariant.load for UBOs
>> cherry-ignore: add gen10 fixes
>> cherry-ignore: add r600/amdgpu 18.0 nominations
>> cherry-ignore: add i965 shader cache fixes
>> cherry-ignore: nir: mark unused space in packed_tex_data
>> radv: Stop advertising VK_KHX_multiview
>> cherry-ignore: radv: Don't expose VK_KHX_multiview on android.
>> configure.ac: correct driglx-direct help text
>> cherry-ignore: add meson fix
>> cherry-ignore: add a few more meson fixes
>> Update version to 17.3.4
>> docs: add release notes for 17.3.4
>>
>> Eric Engestrom (1):
>> radeon: remove left over dead code
>>
>> Gert Wollny (1):
>> r600/shader: Initialize max_driver_temp_used correctly for the first time
>>
>> Grazvydas Ignotas (2):
>> st/va: release held locks in error paths
>> st/vdpau: release held lock in error path
>>
>> Igor Gnatenko (1):
>> link mesautil with pthreads
>>
>> Indrajit Das (4):
>> st/omx_bellagio: Update default intra matrix per MPEG2 spec
>> radeon/uvd: update quantiser matrices only when requested
>> radeon/vcn: update quantiser matrices only when requested
>> st/va: clear pointers for mpeg2 quantiser matrices
>>
>> Jason Ekstrand (19):
>> i965: Call brw_cache_flush_for_render in predraw_resolve_framebuffer
>> i965: Add more precise cache tracking helpers
>> i965/blorp: Add more destination flushing
>> i965: Track the depth and render caches separately
>> i965: Track format and aux usage in the render cache
>> Re-enable regular fast-clears (CCS_D) on gen9+
>> i965/miptree: Refactor CCS_E and CCS_D cases in render_aux_usage
>> i965/miptree: Add an explicit tiling parameter to create_for_bo
>> i965/miptree: Use the tiling from the modifier instead of the BO
>> i965/bufmgr: Add a create_from_prime_tiled function
>> i965: Set tiling on BOs imported with modifiers
>> i965/miptree: Take an aux_usage in prepare/finish_render
>> i965/miptree: Add an aux_disabled parameter to render_aux_usage
>> i965/surface_state: Drop brw_aux_surface_disabled
>> intel/fs: Use the original destination region for int MUL lowering
>> anv/pipeline: Don't look at blend state unless we have an attachment
>> anv/cmd_buffer: Re-emit the pipeline at every subpass
>> anv: Stop advertising VK_KHX_multiview
>> i965: Call prepare_external after implicit window-system MSAA resolves
>>
>> Jon Turney (3):
>> configure: Default to gbm=no on osx
>> glx/apple: include util/debug.h for env_var_as_boolean prototype
>> glx/apple: locate dispatch table functions to wrap by name
>>
>> José Fonseca (1):
>> svga: Prevent use after free.
>>
>> Juan A. Suarez Romero (1):
>> docs: add sha256 checksums for 17.3.3
>>
>> Kenneth Graunke (2):
>> i965: Bind null render targets for shadow sampling + color.
>> i965: Bump official kernel requirement to Linux v3.9.
>>
>> Lucas Stach (2):
>> etnaviv: dirty TS state when framebuffer has changed
>> renderonly: fix dumb BO allocation for non 32bpp formats
>>
>> Marek Olšák (1):
>> radeonsi: don't ignore pitch for imported textures
>>
>> Matthew Nicholls (2):
>> radv: restore previous stencil reference after depth-stencil clear
>> radv: remove predication on cache flushes
>>
>> Maxin B. John (1):
>> anv_icd.py: improve reproducible builds
>>
>> Michel Dänzer (1):
>> winsys/radeon: Compute is_displayable in surf_drm_to_winsys
>>
>> Roland Scheidegger (1):
>> r600: don't do stack workarounds for hemlock
>>
>> Samuel Pitoiset (1):
>> radv: create pipeline layout objects for all meta operations
>>
>> Samuel Thibault (1):
>> glx: fix non-dri build
>>
>> Timothy Arceri (2):
>> ac: fix buffer overflow bug in 64bit SSBO loads
>> ac: fix visit_ssa_undef() for doubles
>>
>> git tag: mesa-17.3.4
>>
>> https://mesa.freedesktop.org/archive/mesa-17.3.4.tar.gz
>> MD5: 8e38be8ff6310271ecc14da16287c775 mesa-17.3.4.tar.gz
>> SHA1: 96bbfec43aa4ba0c1d1161601b4743f43c07f91d mesa-17.3.4.tar.gz
>> SHA256: 2d3a4c3cbc995b3e192361dce710d8c749e046e7575aa1b7d8fc9e6b4df28f84
>> mesa-17.3.4.tar.gz
>> SHA512: e629fee58fe8976a09fbeca129c954900c8b955e1ccfd16fda0b47f58442530fd8749e15b6202ed205abe2931a054107a74bdab398494cae94b82e1b295c2175
>> mesa-17.3.4.tar.gz
>> PGP: https://mesa.freedesktop.org/archive/mesa-17.3.4.tar.gz.sig
>>
>> https://mesa.freedesktop.org/archive/mesa-17.3.4.tar.xz
>> MD5: f08eccad27f34366db1bb3997d288c2f mesa-17.3.4.tar.xz
>> SHA1: bb9be653a26d89f3b6c4c00c64f4a5896d9d7f38 mesa-17.3.4.tar.xz
>> SHA256: 71f995e233bc5df1a0dd46c980d1720106e7f82f02d61c1ca50854b5e02590d0
>> mesa-17.3.4.tar.xz
>> SHA512: 8a077aa89b9d314188e62a215abe8e0db890afbbdd9c1ba9d214735d5304956b55723132f19e8a4ac3e3f404eca1dd9b5fbc936de9ac63d91562c0bc62708fe3
>> mesa-17.3.4.tar.xz
>> PGP: https://mesa.freedesktop.org/archive/mesa-17.3.4.tar.xz.sig