[Nouveau] Second copy engine on GF116
Andy Ritger
aritger at nvidia.com
Tue Nov 25 13:05:20 PST 2014
On Tue, Nov 25, 2014 at 10:57:44AM -0500, Ilia Mirkin wrote:
> On Mon, Nov 24, 2014 at 8:33 PM, Andy Ritger <aritger at nvidia.com> wrote:
> > On Fri, Nov 21, 2014 at 01:39:55AM -0500, Ilia Mirkin wrote:
> >> On Fri, Nov 21, 2014 at 1:16 AM, Andy Ritger <aritger at nvidia.com> wrote:
> >> > Hi Ilia,
> >> >
> >> > Actually, 0x90b8 is different from the copy engine. I'm not very familiar
> >> > with it, but 0x90b8 is an engine for performing LZO decompression as
> >> > part of performing the copy. It has a variety of limitations (e.g.,
> >> > cannot handle blocklinear format), and was only in a few Fermi chips,
> >> > as I understand it.
> >>
> >> According to our driver source, GF100, GF104, GF110, GF114, and GF116
> >> all have it. [So GF106, GF108, GF117, GF119 don't have it.] We've only
> >> had problems reported against GF116... and only for some people.
> >
> > Hmm, some of our internal documentation is inconsistent about whether it
> > applies to GF100, but otherwise what I see matches your list. I guess
> > "few" was not entirely accurate.
> >
> >> > It is probably easiest to just ignore it. You can distinguish this
> >> > decompress engine from a normal copy engine by looking at the CE
> >> > capability register on the falcon (0x00000650). If bit 2 is '1', then
> >> > the falcon is a decompress engine.
> >>
> >> I presume you mean a +0x650 register on the pcopy engines (0x104000
> >> and 0x105000). I only have access to the GF108 right now, which
> >> returns 3 for 0x104650 and 4 for 0x105650. We're using the engine at
> >> 0x104000 for copy on the GF108...
> >
> > Yes, 0x104650 and 0x105650 are the right addresses, from what I can tell.
> >
> > FWIW, the other capability bits are:
> > bit 0: "DMACOPY_SUPPORTED"
> > bit 1: "PIXREMAP_SUPPORTED"
> >
> > (I think PIXREMAP_SUPPORTED is in reference to the component remapping
> > controlled by methods 0x00000700, 0x00000704, and 0x00000708 in the
> > copy engine class).
>
> Neat. We went around and grabbed that 0x650 register on a bunch of
> GPUs, see the CE* columns at:
>
> http://envytools.readthedocs.org/en/latest/hw/gpu.html#fermi-kepler-maxwell-family

I don't see the 0x650 register values on that page. Maybe I'm not
looking at the right place?

> It looks like it's actually returning 0 on both "copy" engines for a
> bunch of those cards -- GF100, GF104, GF114, probably GF110. But other
> cards have them as either 3 or 4. I'm guessing that '0' should be
> treated as if it were a '3' (or a '7')?

That's curious. If I can get the table of where that reads zero, I can
try to investigate how to interpret that.

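
A minimal sketch of the capability decoding discussed in this thread, written in Python for brevity. The names for bits 0 and 1 and the 0x104650/0x105650 addresses come from the quoted messages; the constant name for bit 2, the function names, and the "unknown" result for a zero read are assumptions made for illustration, since the meaning of a zero read is still open.

```python
# Capability bits of the CE capability register (falcon offset 0x650).
# Bit 0 and bit 1 names are quoted from the thread.
CE_CAP_DMACOPY_SUPPORTED = 1 << 0
CE_CAP_PIXREMAP_SUPPORTED = 1 << 1
# Bit 2 marks a decompress engine; this constant name is hypothetical.
CE_CAP_DECOMPRESS = 1 << 2

# MMIO bases of the two pcopy falcons mentioned in the thread.
PCOPY_BASES = (0x104000, 0x105000)
CE_CAP_OFFSET = 0x650


def ce_cap_address(base):
    """Return the MMIO address of the capability register for one engine."""
    return base + CE_CAP_OFFSET


def classify_ce(cap):
    """Classify an engine from its capability register value.

    A zero read is reported as "unknown" rather than guessed at,
    because the thread does not settle how to interpret it.
    """
    if cap == 0:
        return "unknown"
    if cap & CE_CAP_DECOMPRESS:
        return "decompress"
    if cap & CE_CAP_DMACOPY_SUPPORTED:
        return "copy"
    return "unknown"


if __name__ == "__main__":
    # Values reported for GF108 earlier in the thread: 3 and 4.
    for base, cap in zip(PCOPY_BASES, (3, 4)):
        print(hex(ce_cap_address(base)), classify_ce(cap))
```

With the GF108 values quoted earlier (3 at 0x104650, 4 at 0x105650), this would mark the first engine as a copy engine and the second as a decompress engine to be skipped.
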
> Curiously, a GF116 card that I thought was working fine on nouveau
> actually has 3 for the first engine and 4 for the second. Perhaps it
> just had enough VRAM that I never triggered the conditions required
> for nouveau to use that second copy engine (we use it, when available,
> for drm-initiated buffer moves).

Interesting. Would that explain why this hasn't manifested on configs
other than the GF116 user reports?

Thanks,
- Andy

> >> From my admittedly limited understanding, both 0x104000 and 0x105000
> >> appear to be falcon engines, where the fuc is presumably able to drive
> >> some underlying hardware. The actual fifo methods are implemented in
> >> the fuc, which in turn does iowr/etc commands.
> >>
> >> Are you saying that the "decompress" engine (at 0x105000 right?) has a
> >> different piece of hardware behind it than the copy engine at
> >> 0x104000, or does NVIDIA simply provide different fuc for it that
> >> exposes somewhat different functionality via FIFO methods?
> >
> > There is definitely a falcon at the frontend, and there is different
> > falcon ucode for "normal" copy engine versus the "decompress" engine.
> > But I don't know offhand what dedicated hardware, if any, is behind it.
>
> Seems likely that the HW is different, since it'd be madness to try to
> do decompression in the falcon code itself. (Not to say that the ISA
> isn't suited to it, just that falcons have relatively slow clocks.) mwk
> is in the process of working it all out.
>
> -ilia