[Mesa-dev] [RFC] Merging feature work to Mesa master

Paul Berry stereotype441 at gmail.com
Thu Feb 16 08:56:43 PST 2012


On 15 February 2012 15:28, Ian Romanick <idr at freedesktop.org> wrote:

> Over the last six months a lot of feature work has happened in Mesa, and
> the load has been carried by a lot of different people / organizations. In
> the process, we discovered a number of development process issues that made
> things more difficult than they needed to be.
>
> The biggest problem that we encountered was the number of features marked
> "50% done" in GL3.txt.  To complete these features, a significant amount of
> time had to be spent figuring out what was done, and what was left to do.
>  Since the changes had landed on master a very long time ago, this was a
> frustrating and error prone process.  Transform feedback was the worst case
> of this.  One developer spent several weeks trying to assess the state of
> development.  In the process, a couple items were missed and not detected
> until early January.
>
> PROPOSAL: No feature will be merged to master until a vertical slice[1] of
> functionality is complete.
>
> To be specific, this means that some useful workload should be able to
> correctly execute.  This doesn't mean that everything is done or even that
> any particular real application needs to work.  At the very least there
> should be a demo or piglit test that uses the functionality in its intended
> manner that works.
>
> This means that some incomplete features may sit on branches for a long
> time.  That's okay!  It's really easy to see what has been done on a branch
> because you can 'git log origin/master..EXT_some_feature'.  This
> trivializes the assessment of the state of development.
>
> We encountered similar problems with pieces of functionality that were
> ostensibly done.  Many elements of OpenGL functionality are like Koch
> snowflakes:  everything is a corner case.  Integer textures and
> floating-point textures are prime cases of this.  Even though the
> implementation was done and enabled in several drivers, we had no way to
> assess the quality.  The same problem holds in cases of features that are
> known to be incomplete, even if a vertical slice is functional.
>
> PROPOSAL: No feature will be merged to master until a robust set of tests
> is implemented or at least documented.
>
> We don't necessarily need 10,000 tests for some feature, but there should
> be some.  There should also be a list of "test this corner case, test that
> corner case, check this error condition, etc."  As an example, we've come
> up with a list of missing test cases for EXT_framebuffer_multisample:
>
> - Test turning multisample on and off on an MSAA buffer.
> - Test multisample points smooth
> - Test multisample points non-smooth
> - Test multisample lines smooth
> - Test multisample lines non-smooth
> - Test multisample line stipple
> - Test multisample polygon smooth
> - Test multisample polygon stipple
> - Test multisample glBitmap
> - Test sample alpha to one
> - Test sample coverage
> - Test sample coverage invert
> - Test sample alpha-to-coverage, with alpha-to-one
> - Test sample alpha-to-coverage, without alpha-to-one
> - Test blit multisample-to-multisample (matching sample count)
> - Test blit multisample-to-multisample (non-matching sample count)
> - Test blit multisample-to-multisample (non-matching format)
> - Test blit singlesample-to-multisample
> - Test blit multisample-to-multisample (scaling blit)
> - Iterate all the above for all the sample counts
> - Iterate all internal formats for some basic MSAA test
> - "All of the above" for depth and stencil
>    - Try really hard to hit HiZ resolves too
>
> This list represents about 30 or 40 test cases.  By giving some indication
> of areas that need testing, we communicate the parts of the feature that
> are known to be incomplete or just plain broken.  (Keith's 1st Law: Any
> piece of code that isn't tested is broken.)
>
> The test list also need not come from the developer.  It is
> completely reasonable and desirable for reviewers to suggest or implement
> tests.  All of that information just needs to be captured, and consensus
> must be achieved.
>
> The onus of enforcing these proposals will fall on the reviewers.  This
> still leaves room for sensible things to happen while preventing the
> problems that we encountered.  For example, it should be fine to merge
> Marek and nobled's ARB_debug work with few tests implemented or specified.
>  The same could not be said for the proposed implementation of
> ARB_uniform_buffer_object.
>
> [1] http://en.wikipedia.org/wiki/Vertical_slice
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

To add some concreteness to the discussion, I went back through all of my
git commits from the last several months to see where, specifically, I had
to make unexpected fixes to code that others had only partially finished.
The majority of the cases boiled down to the following problem areas:

1. Missing checks for error conditions that the spec requires an
implementation to check for.

2. Newly-added state is not being saved/restored properly by meta-ops.

3. Overzealous error checks: some error condition is not being checked for
in the correct way, causing an error to be generated even for correct input.

4. API functions neglecting to set ctx->NewState when necessary, so the
driver fails to notice that state has changed.
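Problem areas 1 and 4 can be illustrated with a simplified sketch.  This is not actual Mesa code: the names here (_NEW_LINE, the struct gl_context fields, line_width, record_error) are illustrative stand-ins, though the shape mirrors how a GL API entry point must both report spec-mandated errors and flag changed state so the driver re-validates on the next draw.

```c
#include <assert.h>

/* Hypothetical dirty-state bit; real Mesa uses a family of _NEW_* flags. */
#define _NEW_LINE 0x1

enum gl_error { NO_ERROR, INVALID_VALUE };

struct gl_context {
   unsigned NewState;          /* bitmask of state that has changed */
   enum gl_error ErrorValue;   /* first error recorded, if any */
   float LineWidth;
};

static void record_error(struct gl_context *ctx, enum gl_error err)
{
   if (ctx->ErrorValue == NO_ERROR)
      ctx->ErrorValue = err;
}

static void line_width(struct gl_context *ctx, float width)
{
   /* Problem area 1: the GL spec requires INVALID_VALUE for width <= 0;
    * a missing check like this is exactly what error-condition piglit
    * tests catch. */
   if (width <= 0.0f) {
      record_error(ctx, INVALID_VALUE);
      return;
   }
   ctx->LineWidth = width;
   /* Problem area 4: without this flag, the driver never notices the
    * change and keeps using stale state. */
   ctx->NewState |= _NEW_LINE;
}
```

Calling line_width() with a negative width should record the error and leave NewState untouched; a valid width should update the state and set the dirty bit.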

IMHO, Ian's proposals, or some variant of them, would be useful because
they directly address pain points like the ones above.  For instance,
requiring complete vertical slices of functionality would help with 3 and
4, because it would ensure that at the time state was added to the
front-end, we could easily test that things propagated properly to the
back-end.  And requiring a robust set of tests would help with 1 and 2,
because any adequately robust set of tests for a feature would include
testing error conditions and meta-op behaviour.
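The meta-op failure mode (problem area 2) can also be sketched.  Again this is not actual Mesa code; the names are made up.  A meta operation implements one GL feature in terms of others, so every piece of context state it touches, including newly added state, has to be saved on entry and restored on exit:

```c
#include <assert.h>

/* Hypothetical context: one long-standing piece of state and one that a
 * new feature just added. */
struct gl_context {
   int scissor_enabled;
   int new_feature_enabled;
};

struct saved_state {
   int scissor_enabled;
   int new_feature_enabled;
};

static void meta_begin(struct gl_context *ctx, struct saved_state *save)
{
   save->scissor_enabled = ctx->scissor_enabled;
   /* The common bug: forgetting to add this save/restore line when the
    * new state was introduced, so the meta-op leaks its settings back
    * to the application. */
   save->new_feature_enabled = ctx->new_feature_enabled;

   /* Put the context into the state the meta-op needs. */
   ctx->scissor_enabled = 0;
   ctx->new_feature_enabled = 0;
}

static void meta_end(struct gl_context *ctx, const struct saved_state *save)
{
   ctx->scissor_enabled = save->scissor_enabled;
   ctx->new_feature_enabled = save->new_feature_enabled;
}

/* A meta-op such as a clear or blit wraps its drawing in begin/end. */
static void meta_clear(struct gl_context *ctx)
{
   struct saved_state save;
   meta_begin(ctx, &save);
   /* ... draw a screen-aligned quad to implement the clear ... */
   meta_end(ctx, &save);
}
```

A test for this simply enables the state, runs the meta-op, and checks that the application-visible state is unchanged afterwards.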

Having said that, I am concerned about these proposals encouraging people
to batch their work into even larger patch series than they do already
(and, in general, I think we already batch up our patches more than we
should).  To that end, I'd like to propose two clarifications and an
exception to Ian's proposed rules:

CLARIFICATION: any patch series that affects what is output to the screen
(or to a buffer) counts as a vertical slice of functionality, even if it is
not an entire feature or extension.  So, for example, in the work that I
did on transform feedback last quarter, "make transform feedback work
properly for gl_ClipVertex" would still count as a vertical slice of
functionality, because it affects what is output to the transform feedback
buffer when gl_ClipVertex is in use.  It also counts as a vertical slice of
functionality if a patch series allows drawing to take place in a situation
where an error or crash would have previously occurred.

CLARIFICATION: for any patch series that implements a vertical slice of
functionality, we should have at least one piglit test (but hopefully many
tests) which checks that the correct output is generated, and the
appropriate patch in the series should be annotated with "fixes Piglit test
XYZ".

EXCEPTION: we allow changes that don't affect what is output to the screen
(or a buffer) provided that they aren't adding functionality.  For example,
refactoring to prepare for later commits is fine.  Performance improvements
are fine.  Eliminating duplicate code is fine.  These kinds of changes
don't require tests.  However, we don't add a bunch of front-end support
for something that has no back-end support yet.

Note: I have worked on teams in the past that would have required tests
even for performance optimization and some refactoring, but those teams had
strong unit testing* in place.  (*By "unit testing", I mean tests that
exercise individual components of the software in isolation, rather than
just testing from the client side as we do with Piglit).  Adding unit tests
to a project that hasn't had them before often requires a fair amount of
refactoring, to insert the hooks necessary to test a component in
isolation.  I don't want to stall progress by asking for a level of unit
testing that would be hard to achieve right now.  But I do hope we can
improve our level of unit testing in the future--in my experience it is one
of the most effective ways to catch bugs early in the development process.
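To make the distinction concrete: a unit test in the sense above exercises one helper with no GL context, window system, or driver involved.  A hypothetical sketch (the helper name and its packing rule are invented for illustration, though Mesa has many pure helpers of this shape):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical unit-testable helper: converting a float color channel
 * to an 8-bit value is pure computation, so it can be verified in
 * isolation -- unlike a piglit test, which exercises the whole stack
 * from the client side. */
static uint8_t float_to_ubyte(float f)
{
   if (f <= 0.0f)
      return 0;
   if (f >= 1.0f)
      return 255;
   return (uint8_t)(f * 255.0f + 0.5f);
}
```

Testing it needs nothing but a handful of assertions on boundary and midpoint values, which is why this style of test is so cheap to run early and often.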