[RFC 0/7] Proposal for isolating FPU operation

Christian König christian.koenig at amd.com
Mon Jan 25 15:29:34 UTC 2021


Hi Rodrigo,

good to see this finally be tackled. The whole approach looks solid to 
me, just one things I've noted.

> +void dcn3x_populate_dml_writeback_from_context(struct dc *dc,
> +	struct resource_context *res_ctx, display_e2e_pipe_params_st *pipes)
> +{
> +	DC_FP_START();
> +	_dcn3x_populate_dml_writeback_from_context(dc, res_ctx, pipes);
> +	DC_FP_END();
> +}

The calls to DC_FP_START()/DC_FP_END() must be outside of the 
fpu_operation directory.

The problem is that even before the call to DC_FP_START() the compiler 
might think it is a good idea to use FP registers for spilling on some 
architectures.

So my understanding is that all calls of functions declared inside the 
fpu_operation directory must be made with DC_FP_START()/DC_FP_END() in 
the calleing and not the called function.

Regards,
Christian.

Am 25.01.21 um 14:43 schrieb Rodrigo Siqueira:
> Hi,
>
> In the display core, we utilize floats and doubles units for calculating
> modesetting parameters. One side effect of our approach to use double-precision
> is the fact that we spread multiple FPU access across our driver, which means
> that we can accidentally clobber user space FPU state. As an attempt to fix
> this problem, we have the following proposal:
>
> 1. We first need to move functions that deal with FPU to a single place in
>     order to make things more manageable;
> 2. After we isolate these function in a single place, we want to remove any
>     compilation flag that deals with FPU from other files and centralize it only
>     in the files that need it;
> 3. We need to implement an interface for safely calling those FPU functions.
>     The idea is to add a thin function layer where FPU functions are invoked
>     under the protection of kernel_fpu_begin/end.
>
> One of the challenges from the above steps is identifying which function uses
> FPU registers; fortunately, Peter Zijlstra wrote a patch a couple of months ago
> where he introduced an FPU check for objtool. I used the following command for
> identifying the potential FPU usage:
>
>   ./tools/objtool/objtool check -Ffa "drivers/gpu/drm/amd/display/dc/ANY_FILE.o"
>
> Based on the above command output and the step-by-step approach that we want to
> adopt, I decided to start this work focusing on DCN3 and DCN302. I believe that
> the best way to see this RFC is:
>
> 1. The first patch introduces an FPU folder inside display/dc, intending to
>     centralize functions that deal with FPU. Note that I introduced two new C
>     files named dcn3x_commons inside a new folder called fpu_operation; I used
>     the name dcn3x because some of the functions inside this folder are shared
>     with DCN301 and DCN302. In other words, all FPU function which is shared
>     across DCN3x will be placed in that file.
> 2. The next set of patches, start to move some of the function that requires
>     FPU access to the file dcn3x_commons. I did it in a small chunk to make it
>     easy to bisect in case of regressions.
> 3. Note that one of the patch touch DCN2, the reason for that is the fact that
>     the function dcn20_calculate_dlg_params is shared from DCN2 to DCN3. Because
>     of that, I create a new file named fpu_commons for keeping functions that
>     are shared across multiple ASICs.
> 4. When we move some of the functions, notice that I also add an API for
>     accessing it via fpu_kernel_begin/end.
> 5. At the end of the series, I dropped the FPU flags from the files that I
>     initialize refactored.
>
> We are also working on test stress for validating this change from the user
> space and kernel perspective.
>
> Keep in mind that this series is not done yet. I'm looking for feedback about
> this approach because we have plans to use it for trying to fix our FPU
> problems for the next couple of weeks. Finally, we want to do this work
> step-by-step because it is easy to introduce regression when dealing with these
> FPU problems.
>
> Best Regards
>
> Rodrigo Siqueira (7):
>    drm/amd/display: Introduce FPU directory inside DC
>    drm/amd/display: Moves dcn30_set_mcif_arb_params to FPU folder
>    drm/amd/display: Add FPU file for functions shared across ASICs
>    drm/amd/display: Move calculate_wm_and_dlg to FPU folder
>    drm/amd/display: Move patch bounding box to FPU folder
>    drm/amd/display: Move bounding box functions to FPU folder
>    drm/amd/display: Drop float flages from DCN30 files
>
>   drivers/gpu/drm/amd/display/dc/Makefile       |   1 +
>   .../drm/amd/display/dc/dcn20/dcn20_resource.c | 106 +--
>   .../drm/amd/display/dc/dcn20/dcn20_resource.h |   8 -
>   .../drm/amd/display/dc/dcn21/dcn21_resource.c |   2 +
>   drivers/gpu/drm/amd/display/dc/dcn30/Makefile |  30 -
>   .../drm/amd/display/dc/dcn30/dcn30_resource.c | 683 +---------------
>   .../drm/amd/display/dc/dcn30/dcn30_resource.h |  20 -
>   .../amd/display/dc/dcn301/dcn301_resource.c   |  10 +-
>   .../gpu/drm/amd/display/dc/dcn302/Makefile    |  25 -
>   .../amd/display/dc/dcn302/dcn302_resource.c   |  10 +-
>   .../drm/amd/display/dc/fpu_operation/Makefile |  58 ++
>   .../display/dc/fpu_operation/dcn3x_commons.c  | 743 ++++++++++++++++++
>   .../display/dc/fpu_operation/dcn3x_commons.h  |  44 ++
>   .../display/dc/fpu_operation/fpu_commons.c    | 145 ++++
>   .../display/dc/fpu_operation/fpu_commons.h    |  37 +
>   15 files changed, 1051 insertions(+), 871 deletions(-)
>   create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operation/Makefile
>   create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operation/dcn3x_commons.c
>   create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operation/dcn3x_commons.h
>   create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operation/fpu_commons.c
>   create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operation/fpu_commons.h
>



More information about the amd-gfx mailing list