[RFC 0/7] Proposal for isolating FPU operation
Christian König
christian.koenig at amd.com
Mon Jan 25 15:29:34 UTC 2021
Hi Rodrigo,
good to see this finally be tackled. The whole approach looks solid to
me, just one things I've noted.
> +void dcn3x_populate_dml_writeback_from_context(struct dc *dc,
> + struct resource_context *res_ctx, display_e2e_pipe_params_st *pipes)
> +{
> + DC_FP_START();
> + _dcn3x_populate_dml_writeback_from_context(dc, res_ctx, pipes);
> + DC_FP_END();
> +}
The calls to DC_FP_START()/DC_FP_END() must be outside of the
fpu_operation directory.
The problem is that even before the call to DC_FP_START() the compiler
might think it is a good idea to use FP registers for spilling on some
architectures.
So my understanding is that all calls of functions declared inside the
fpu_operation directory must be made with DC_FP_START()/DC_FP_END() in
the calleing and not the called function.
Regards,
Christian.
Am 25.01.21 um 14:43 schrieb Rodrigo Siqueira:
> Hi,
>
> In the display core, we utilize floats and doubles units for calculating
> modesetting parameters. One side effect of our approach to use double-precision
> is the fact that we spread multiple FPU access across our driver, which means
> that we can accidentally clobber user space FPU state. As an attempt to fix
> this problem, we have the following proposal:
>
> 1. We first need to move functions that deal with FPU to a single place in
> order to make things more manageable;
> 2. After we isolate these function in a single place, we want to remove any
> compilation flag that deals with FPU from other files and centralize it only
> in the files that need it;
> 3. We need to implement an interface for safely calling those FPU functions.
> The idea is to add a thin function layer where FPU functions are invoked
> under the protection of kernel_fpu_begin/end.
>
> One of the challenges from the above steps is identifying which function uses
> FPU registers; fortunately, Peter Zijlstra wrote a patch a couple of months ago
> where he introduced an FPU check for objtool. I used the following command for
> identifying the potential FPU usage:
>
> ./tools/objtool/objtool check -Ffa "drivers/gpu/drm/amd/display/dc/ANY_FILE.o"
>
> Based on the above command output and the step-by-step approach that we want to
> adopt, I decided to start this work focusing on DCN3 and DCN302. I believe that
> the best way to see this RFC is:
>
> 1. The first patch introduces an FPU folder inside display/dc, intending to
> centralize functions that deal with FPU. Note that I introduced two new C
> files named dcn3x_commons inside a new folder called fpu_operation; I used
> the name dcn3x because some of the functions inside this folder are shared
> with DCN301 and DCN302. In other words, all FPU function which is shared
> across DCN3x will be placed in that file.
> 2. The next set of patches, start to move some of the function that requires
> FPU access to the file dcn3x_commons. I did it in a small chunk to make it
> easy to bisect in case of regressions.
> 3. Note that one of the patch touch DCN2, the reason for that is the fact that
> the function dcn20_calculate_dlg_params is shared from DCN2 to DCN3. Because
> of that, I create a new file named fpu_commons for keeping functions that
> are shared across multiple ASICs.
> 4. When we move some of the functions, notice that I also add an API for
> accessing it via fpu_kernel_begin/end.
> 5. At the end of the series, I dropped the FPU flags from the files that I
> initialize refactored.
>
> We are also working on test stress for validating this change from the user
> space and kernel perspective.
>
> Keep in mind that this series is not done yet. I'm looking for feedback about
> this approach because we have plans to use it for trying to fix our FPU
> problems for the next couple of weeks. Finally, we want to do this work
> step-by-step because it is easy to introduce regression when dealing with these
> FPU problems.
>
> Best Regards
>
> Rodrigo Siqueira (7):
> drm/amd/display: Introduce FPU directory inside DC
> drm/amd/display: Moves dcn30_set_mcif_arb_params to FPU folder
> drm/amd/display: Add FPU file for functions shared across ASICs
> drm/amd/display: Move calculate_wm_and_dlg to FPU folder
> drm/amd/display: Move patch bounding box to FPU folder
> drm/amd/display: Move bounding box functions to FPU folder
> drm/amd/display: Drop float flages from DCN30 files
>
> drivers/gpu/drm/amd/display/dc/Makefile | 1 +
> .../drm/amd/display/dc/dcn20/dcn20_resource.c | 106 +--
> .../drm/amd/display/dc/dcn20/dcn20_resource.h | 8 -
> .../drm/amd/display/dc/dcn21/dcn21_resource.c | 2 +
> drivers/gpu/drm/amd/display/dc/dcn30/Makefile | 30 -
> .../drm/amd/display/dc/dcn30/dcn30_resource.c | 683 +---------------
> .../drm/amd/display/dc/dcn30/dcn30_resource.h | 20 -
> .../amd/display/dc/dcn301/dcn301_resource.c | 10 +-
> .../gpu/drm/amd/display/dc/dcn302/Makefile | 25 -
> .../amd/display/dc/dcn302/dcn302_resource.c | 10 +-
> .../drm/amd/display/dc/fpu_operation/Makefile | 58 ++
> .../display/dc/fpu_operation/dcn3x_commons.c | 743 ++++++++++++++++++
> .../display/dc/fpu_operation/dcn3x_commons.h | 44 ++
> .../display/dc/fpu_operation/fpu_commons.c | 145 ++++
> .../display/dc/fpu_operation/fpu_commons.h | 37 +
> 15 files changed, 1051 insertions(+), 871 deletions(-)
> create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operation/Makefile
> create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operation/dcn3x_commons.c
> create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operation/dcn3x_commons.h
> create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operation/fpu_commons.c
> create mode 100644 drivers/gpu/drm/amd/display/dc/fpu_operation/fpu_commons.h
>
More information about the amd-gfx
mailing list