[Mesa-dev] [PATCH] R600/SI: fix MIMG writemask adjustement

Marek Olšák maraeo at gmail.com
Tue Sep 24 03:25:20 PDT 2013


Sorry, I didn't investigate why. Maybe some pass is executed twice?

Marek

On Tue, Sep 24, 2013 at 11:48 AM, Christian König
<deathsimple at vodafone.de> wrote:
> Sorry, my fault let me refine the question: Why the heck is the function
> applied twice?
>
> Christian.
>
> Am 24.09.2013 11:44, schrieb Marek Olšák:
>
>> If the TGSI writemask is .xzw and the initial dmask is 0xf, the first
>> application of the function sees 4 VGPRs and correctly sets the dmask
>> to 13 (xzw) because the second VGPR is unused, but the second
>> application of the function now sees 3 VGPRs, ignores the fact that
>> dmask is 13 meaning xzw, and sets the writemask to 7 meaning xyz,
>> which breaks the texture instruction. I thought it was obvious from
>> the comments I added.
>>
>> Marek
>>
>> On Tue, Sep 24, 2013 at 9:00 AM, Christian König
>> <deathsimple at vodafone.de> wrote:
>>>
>>> Why should that be necessary?
>>>
>>> Christian.
>>>
>>> Am 23.09.2013 21:09, schrieb maraeo at gmail.com:
>>>
>>>> From: Marek Olšák <marek.olsak at amd.com>
>>>>
>>>> This fixes piglit:
>>>> - shaders/glsl-fs-texture2d-masked
>>>> - shaders/glsl-fs-texture2d-masked-4
>>>>
>>>> Signed-off-by: Marek Olšák <marek.olsak at amd.com>
>>>> ---
>>>>    lib/Target/R600/SIISelLowering.cpp | 27 +++++++++++++++++++++------
>>>>    1 file changed, 21 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/lib/Target/R600/SIISelLowering.cpp
>>>> b/lib/Target/R600/SIISelLowering.cpp
>>>> index 2174753..891a51b 100644
>>>> --- a/lib/Target/R600/SIISelLowering.cpp
>>>> +++ b/lib/Target/R600/SIISelLowering.cpp
>>>> @@ -1065,7 +1065,9 @@ static unsigned SubIdx2Lane(unsigned Idx) {
>>>>    void SITargetLowering::adjustWritemask(MachineSDNode *&Node,
>>>>                                           SelectionDAG &DAG) const {
>>>>      SDNode *Users[4] = { };
>>>> -  unsigned Writemask = 0, Lane = 0;
>>>> +  unsigned Lane = 0;
>>>> +  unsigned OldDmask = Node->getConstantOperandVal(0);
>>>> +  unsigned NewDmask = 0;
>>>>        // Try to figure out the used register components
>>>>      for (SDNode::use_iterator I = Node->use_begin(), E =
>>>> Node->use_end();
>>>> @@ -1076,29 +1078,42 @@ void
>>>> SITargetLowering::adjustWritemask(MachineSDNode *&Node,
>>>>            I->getMachineOpcode() != TargetOpcode::EXTRACT_SUBREG)
>>>>          return;
>>>>    +    /* Lane means which subreg of %VGPRa_VGPRb_VGPRc_VGPRd is used.
>>>> +     * Note that subregs are packed, i.e. Lane==0 is the first bit set
>>>> +     * in OldDmask, so it can be any of X,Y,Z,W; Lane==1 is the second
>>>> bit
>>>> +     * set, etc. */
>>>>        Lane = SubIdx2Lane(I->getConstantOperandVal(1));
>>>>    +    // Set which texture component corresponds to the lane.
>>>> +    unsigned Comp;
>>>> +    for (unsigned i = 0, Dmask = OldDmask; i <= Lane; i++) {
>>>> +      assert(Dmask);
>>>> +      Comp = ffs(Dmask)-1;
>>>> +      Dmask &= ~(1 << Comp);
>>>> +    }
>>>> +
>>>>        // Abort if we have more than one user per component
>>>>        if (Users[Lane])
>>>>          return;
>>>>          Users[Lane] = *I;
>>>> -    Writemask |= 1 << Lane;
>>>> +    NewDmask |= 1 << Comp;
>>>>      }
>>>>    -  // Abort if all components are used
>>>> -  if (Writemask == 0xf)
>>>> +  // Abort if there's no change
>>>> +  if (NewDmask == OldDmask)
>>>>        return;
>>>>        // Adjust the writemask in the node
>>>>      std::vector<SDValue> Ops;
>>>> -  Ops.push_back(DAG.getTargetConstant(Writemask, MVT::i32));
>>>> +  Ops.push_back(DAG.getTargetConstant(NewDmask, MVT::i32));
>>>>      for (unsigned i = 1, e = Node->getNumOperands(); i != e; ++i)
>>>>        Ops.push_back(Node->getOperand(i));
>>>>      Node = (MachineSDNode*)DAG.UpdateNodeOperands(Node, Ops.data(),
>>>> Ops.size());
>>>>        // If we only got one lane, replace it with a copy
>>>> -  if (Writemask == (1U << Lane)) {
>>>> +  // (if NewDmask has only one bit set...)
>>>> +  if (NewDmask && (NewDmask & (NewDmask-1)) == 0) {
>>>>        SDValue RC = DAG.getTargetConstant(AMDGPU::VReg_32RegClassID,
>>>> MVT::i32);
>>>>        SDNode *Copy = DAG.getMachineNode(TargetOpcode::COPY_TO_REGCLASS,
>>>>                                          SDLoc(),
>>>> Users[Lane]->getValueType(0),
>>>
>>>
>


More information about the mesa-dev mailing list