slow rx 5600 xt fps

Alex Deucher alexdeucher at gmail.com
Thu May 21 19:15:33 UTC 2020


Please provide your dmesg output and xorg log.
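Something like this should grab both (sketch; the paths are common defaults, and rootless Xorg logs under ~/.local/share/xorg instead of /var/log):

```shell
# Sketch: bundle the logs typically requested for a driver bug report.
# Paths are common defaults (rootless Xorg logs under ~/.local/share/xorg).
out=gpu-report
mkdir -p "$out"
dmesg > "$out/dmesg.txt" 2>/dev/null || true
for f in /var/log/Xorg.0.log "$HOME/.local/share/xorg/Xorg.0.log"; do
    [ -f "$f" ] && cp "$f" "$out/" || true
done
ls "$out"
```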

Alex

On Thu, May 21, 2020 at 3:03 PM Javad Karabi <karabijavad at gmail.com> wrote:
>
> Alex,
> yea, you're totally right, i was overcomplicating it lol
> so i was able to get the radeon to run super fast, by doing as you
> suggested and blacklisting i915.
> (had to use module_blacklist= though, because modprobe.blacklist still
> allows i915 to be loaded as a dependency of another module)
> but with one caveat:
> using the amdgpu ddx, there was an error telling me that i need to add
> a BusID to my Device section or something. maybe amdgpu wasn't able to
> find the card, i don't remember. so i used modesetting instead and it
> seemed to work.
> i will try going back to amdgpu and see what that error message was.
> i recall you saying that modesetting doesn't have some features that
> amdgpu provides.
> what are some examples of that?
> is the direction graphics drivers are going to simply be used through
> the generic "modesetting" ddx in xorg?
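in case it helps anyone searching later: the BusID error usually just wants an explicit Device section. a sketch (the PCI address here is a made-up placeholder; the real one comes from lspci, with each part converted from hex to decimal):

```shell
# Hypothetical xorg.conf.d snippet pinning the amdgpu ddx to one card.
# "PCI:10:0:0" is a placeholder: take bus:dev.func from `lspci | grep -i vga`
# and convert each part from hex to decimal for the BusID string.
cat > 20-amdgpu.conf <<'EOF'
Section "Device"
    Identifier "AMD"
    Driver     "amdgpu"
    BusID      "PCI:10:0:0"
EndSection
EOF
# on a real system this file would live in /etc/X11/xorg.conf.d/
grep BusID 20-amdgpu.conf
```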
>
> On Wed, May 20, 2020 at 10:12 PM Alex Deucher <alexdeucher at gmail.com> wrote:
> >
> > I think you are overcomplicating things.  Just try and get X running
> > on just the AMD GPU on bare metal.  Introducing virtualization is just
> > adding more uncertainty.  If you can't configure X to not use the
> > integrated GPU, just blacklist the i915 driver (append
> > modprobe.blacklist=i915 to the kernel command line in grub) and X
> > should come up on the dGPU.
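To make that survive reboots, the edit looks roughly like this (sketch against a local copy of the file; /etc/default/grub is the Debian/Ubuntu default, and module_blacklist=i915 is the stricter variant if a dependency keeps pulling i915 in):

```shell
# Sketch: persistently append the blacklist parameter to the kernel cmdline.
# Works on a local copy here; on a real Debian/Ubuntu system the file is
# /etc/default/grub and you'd run update-grub (or grub-mkconfig) afterwards.
cat > grub.sample <<'EOF'
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
EOF
sed -i 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"/\1 modprobe.blacklist=i915"/' grub.sample
cat grub.sample
# prints: GRUB_CMDLINE_LINUX_DEFAULT="quiet splash modprobe.blacklist=i915"
```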
> >
> > Alex
> >
> > On Wed, May 20, 2020 at 6:05 PM Javad Karabi <karabijavad at gmail.com> wrote:
> > >
> > > Thanks Alex,
> > > Here's my plan:
> > >
> > > since my laptop's os is pretty customized, e.g. compiling my own kernel, building latest xorg, latest xorg-driver-amdgpu, etc etc,
> > > i'm going to use the intel iommu and pass my rx 5600 through to a virtual machine, which will be a 100% stock ubuntu installation.
> > > then, inside that vm, i will continue to debug
> > >
> > > does that sound like a sensible test? granted, that scenario adds the iommu into the mix, so who knows if that itself causes performance issues. but i think it's worth a shot, to see if a stock kernel handles it better
> > >
> > > also, quick question:
> > > from what i understand, a thunderbolt 3 pci express connection should handle 8 GT/s x4. however, along the chain of bridges to my device, i notice that the bridge closest to the graphics card is at 2.5 GT/s x4, and lspci also marks it "downgraded"
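for reference, the field i'm reading is LnkSta; extracting the speed looks like this (parsing a canned sample line here, since real lspci output needs the actual hardware):

```shell
# Sketch: pull the negotiated link speed out of an lspci -vv LnkSta line.
# Uses a canned sample line; on the real machine you'd run something like
#   sudo lspci -vv -s <bridge-bdf> | grep LnkSta
sample='LnkSta: Speed 2.5GT/s (downgraded), Width x4 (downgraded)'
speed=$(printf '%s\n' "$sample" | sed -n 's/.*Speed \([^ ,]*\).*/\1/p')
echo "$speed"
# prints: 2.5GT/s
```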
> > >
> > > now, when i boot into windows, it _also_ says 2.5 GT/s x4, and it runs extremely well. no issues at all.
> > >
> > > so my question is: given that the bridge is at 2.5 GT/s x4, and not at its theoretical "full link speed" of 8 GT/s x4, do you suppose that _could_ be an issue?
> > > i do not think so, because, like i said, windows also reports that link speed.
> > > i would assume you want the fastest link speed possible, since of _all_ tb3 pci express devices, a GPU would be the #1 most demanding on the link
> > >
> > > just curious if you think 2.5 GT/s could be the bottleneck
> > >
> > > i will pass through the device into a ubuntu vm and let you know how it goes. thanks
> > >
> > >
> > >
> > > On Tue, May 19, 2020 at 9:29 PM Alex Deucher <alexdeucher at gmail.com> wrote:
> > >>
> > >> On Tue, May 19, 2020 at 9:16 PM Javad Karabi <karabijavad at gmail.com> wrote:
> > >> >
> > >> > thanks for the answers alex.
> > >> >
> > >> > so, i went ahead and got a displayport cable to see if that changes
> > >> > anything. and now, when i run monitor-only, with the monitor connected
> > >> > to the card, it has no issues like before! so i am thinking that
> > >> > something's up with either the hdmi cable, or some hdmi related setting
> > >> > in my system? who knows, but i'm just gonna roll with only using
> > >> > displayport cables now.
> > >> > the previous hdmi cable was actually pretty long, because i was
> > >> > extending it with an hdmi extension cable, so maybe the signal was
> > >> > really bad or something :/
> > >> >
> > >> > but yea, i guess the only real issue now is maybe something simple
> > >> > related to some sysfs entry about enabling some powermode, voltage,
> > >> > clock frequency, or something, so that glxgears will give me more than
> > >> > 300 fps. but at least now i can use a single monitor configuration with
> > >> > the monitor displayported up to the card.
> > >> >
> > >>
> > >> The GPU dynamically adjusts the clocks and voltages based on load.  No
> > >> manual configuration is required.
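If you want to see the dynamic clock management anyway, amdgpu exposes the current DPM levels through sysfs. A sketch, assuming the dGPU shows up as card0 (the index varies with enumeration order), and read-only:

```shell
# Sketch: peek at amdgpu's dynamically managed shader clock via sysfs.
# Assumes the dGPU enumerated as card0; prints a note instead on machines
# where that node doesn't exist. Read-only -- nothing is forced here.
f=/sys/class/drm/card0/device/pp_dpm_sclk
if [ -r "$f" ]; then
    msg=$(cat "$f")   # one line per clock level; the active one is marked '*'
else
    msg="no amdgpu sysfs node at $f"
fi
echo "$msg"
```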
> > >>
> > >> At this point, we probably need to see your xorg log and dmesg output
> > >> to try and figure out exactly what is going on.  I still suspect there
> > >> is some interaction going on with both GPUs and the integrated GPU
> > >> being the primary, so as I mentioned before, you should try and run X
> > >> on just the amdgpu rather than trying to use both of them.
> > >>
> > >> Alex
> > >>
> > >>
> > >> > also, one other thing i think you might be interested in, that was
> > >> > happening before.
> > >> >
> > >> > so, previously, with laptop -tb3-> egpu -hdmi-> monitor, there was a
> > >> > funny thing happening which i never could figure out.
> > >> > when i would look at the X logs, i would see that "modesetting" (for
> > >> > the intel integrated graphics) was reporting that MonitorA was used
> > >> > with "eDP-1",  which is correct and what i expected.
> > >> > when i scrolled further down, i then saw that "HDMI-A-1-2" was being
> > >> > used for another MonitorB, which also is what i expected (albeit i
> > >> > have no idea why it's saying A-1-2)
> > >> > but amdgpu was _also_ saying that DisplayPort-1-2 (a port on the
> > >> > radeon card) was being used for MonitorA, which is the same Monitor
> > >> > that the modesetting driver had claimed to be using with eDP-1!
> > >> >
> > >> > so the point is that amdgpu was "using" MonitorA with DisplayPort-1-2,
> > >> > although that is what modesetting was using for eDP-1.
> > >> >
> > >> > anyway, that's a little aside, i doubt it was related to the terrible
> > >> > hdmi experience i was getting, since it's about display port and stuff,
> > >> > but i thought i'd let you know about that.
> > >> >
> > >> > if you think that is a possible issue, i'm more than happy to plug the
> > >> > hdmi setup back in and create an issue on gitlab with the logs and
> > >> > everything
> > >> >
> > >> > On Tue, May 19, 2020 at 4:42 PM Alex Deucher <alexdeucher at gmail.com> wrote:
> > >> > >
> > >> > > On Tue, May 19, 2020 at 5:22 PM Javad Karabi <karabijavad at gmail.com> wrote:
> > >> > > >
> > >> > > > lol you're quick!
> > >> > > >
> > >> > > > "Windows has supported peer to peer DMA for years so it already has a
> > >> > > > number of optimizations that are only now becoming possible on Linux"
> > >> > > >
> > >> > > > whoa, i figured linux would be ahead of windows when it comes to
> > >> > > > things like that. so peer-to-peer dma is only recently possible on
> > >> > > > linux, but has been possible on windows for a while? what changed
> > >> > > > recently that allows for peer-to-peer dma in linux?
> > >> > > >
> > >> > >
> > >> > > A few things that made this more complicated on Linux:
> > >> > > 1. Linux uses IOMMUs more extensively than windows so you can't just
> > >> > > pass around physical bus addresses.
> > >> > > 2. Linux supports lots of strange architectures that have a lot of
> > >> > > limitations with respect to peer to peer transactions.
> > >> > >
> > >> > > It just took years to get all the necessary bits in place in Linux and
> > >> > > make everyone happy.
> > >> > >
> > >> > > > also, in the context of a game running opengl on some gpu, is the
> > >> > > > "peer-to-peer" dma transfer something like: the game draws to some
> > >> > > > memory it has allocated, then a DMA transfer gets that and moves it
> > >> > > > into the graphics card output?
> > >> > >
> > >> > > Peer to peer DMA just lets devices access another devices local memory
> > >> > > directly.  So if you have a buffer in vram on one device, you can
> > >> > > share that directly with another device rather than having to copy it
> > >> > > to system memory first.  For example, if you have two GPUs, you can
> > >> > > have one of them copy its contents directly to a buffer in the other
> > >> > > GPU's vram rather than having to go through system memory first.
> > >> > >
> > >> > > >
> > >> > > > also, i know it can be super annoying trying to debug an issue like
> > >> > > > this, with someone like me who has all types of differences from a
> > >> > > > normal setup (e.g. using it via egpu, using a kernel with custom
> > >> > > > configs and stuff) so as a token of my appreciation i donated $50 to
> > >> > > > the red cross' corona virus outbreak charity thing, on behalf of
> > >> > > > amd-gfx.
> > >> > >
> > >> > > Thanks,
> > >> > >
> > >> > > Alex
> > >> > >
> > >> > > >
> > >> > > > On Tue, May 19, 2020 at 4:13 PM Alex Deucher <alexdeucher at gmail.com> wrote:
> > >> > > > >
> > >> > > > > On Tue, May 19, 2020 at 3:44 PM Javad Karabi <karabijavad at gmail.com> wrote:
> > >> > > > > >
> > >> > > > > > just a couple more questions:
> > >> > > > > >
> > >> > > > > > - based on what you are aware of, given technical details such as
> > >> > > > > > "shared buffers go through system memory" and all that, do you see
> > >> > > > > > any issues in my setup that i might be missing? i can't imagine
> > >> > > > > > this being the case, because the card works great in windows,
> > >> > > > > > unless the windows driver does something different?
> > >> > > > > >
> > >> > > > >
> > >> > > > > Windows has supported peer to peer DMA for years, so it already has a
> > >> > > > > number of optimizations that are only now becoming possible on Linux.
> > >> > > > >
> > >> > > > > > - as far as kernel config, is there anything in particular which
> > >> > > > > > _should_ or _should not_ be enabled/disabled?
> > >> > > > >
> > >> > > > > You'll need the GPU drivers for your devices and dma-buf support.
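Roughly, the fragment of .config you care about looks like this (option names are from the upstream Kconfig; the exact set depends on your kernel version and hardware, so treat this as a sketch):

```shell
# Sketch: the kernel .config options relevant to this setup. Option names
# are from the upstream Kconfig; the exact set depends on kernel version.
cat > gpu-config.fragment <<'EOF'
CONFIG_DRM=y
CONFIG_DRM_AMDGPU=m
CONFIG_DRM_AMD_DC=y
CONFIG_DRM_I915=m
CONFIG_DMA_SHARED_BUFFER=y
EOF
grep -c '^CONFIG_' gpu-config.fragment
# prints: 5
```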
> > >> > > > >
> > >> > > > > >
> > >> > > > > > - does the vendor matter? for instance, this is an xfx card. when it
> > >> > > > > > comes to different vendors, are there interface changes that might
> > >> > > > > > make one vendor work better for linux than another? i don't really
> > >> > > > > > understand the differences in vendors, but i imagine that the vbios
> > >> > > > > > differs between vendors, and as such, the linux compatibility would
> > >> > > > > > maybe change?
> > >> > > > >
> > >> > > > > board vendor shouldn't matter.
> > >> > > > >
> > >> > > > > >
> > >> > > > > > - is the pcie bandwidth possible an issue? the pcie_bw file changes
> > >> > > > > > between values like this:
> > >> > > > > > 18446683600662707640 18446744071581623085 128
> > >> > > > > > and sometimes i see this:
> > >> > > > > > 4096 0 128
> > >> > > > > > as you can see, the second value seems significantly lower. is that
> > >> > > > > > possibly an issue? possibly due to aspm?
> > >> > > > >
> > >> > > > > pcie_bw is not implemented for navi yet so you are just seeing
> > >> > > > > uninitialized data.  This patch set should clear that up.
> > >> > > > > https://patchwork.freedesktop.org/patch/366262/
> > >> > > > >
> > >> > > > > Alex
> > >> > > > >
> > >> > > > > >
> > >> > > > > > On Tue, May 19, 2020 at 2:20 PM Javad Karabi <karabijavad at gmail.com> wrote:
> > >> > > > > > >
> > >> > > > > > > i'm using Driver "amdgpu" in my xorg conf
> > >> > > > > > >
> > >> > > > > > > how does one verify which gpu is the primary? i'm assuming my intel
> > >> > > > > > > card is the primary, since i have not done anything to change that.
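one way to check is `xrandr --listproviders`: the first provider listed is the primary. a sketch parsing canned output (the sample text below is illustrative, not from this machine, since the real command needs a live X session):

```shell
# Sketch: provider 0 in `xrandr --listproviders` output is the primary GPU.
# Parsing a canned sample here (illustrative text, not from a real machine).
sample='Providers: number : 2
Provider 0: id: 0x47 cap: 0xf crtcs: 3 outputs: 5 associated providers: 1 name:Intel
Provider 1: id: 0x1d8 cap: 0xd crtcs: 6 outputs: 4 associated providers: 1 name:AMD Radeon RX 5600 XT @ pci:0000:0a:00.0'
primary=$(printf '%s\n' "$sample" | sed -n 's/^Provider 0:.*name:\(.*\)$/\1/p')
echo "$primary"
# prints: Intel
```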
> > >> > > > > > >
> > >> > > > > > > also, if all shared buffers have to go through system memory, then
> > >> > > > > > > that means an amdgpu eGPU won't work very well in general, right?
> > >> > > > > > > because going through system memory for the egpu means going over the
> > >> > > > > > > thunderbolt connection
> > >> > > > > > >
> > >> > > > > > > and what are the shared buffers youre referring to? for example, if an
> > >> > > > > > > application is drawing to a buffer, is that an example of a shared
> > >> > > > > > > buffer that has to go through system memory? if so, that's fine, right?
> > >> > > > > > > because the application's memory is in system memory, so that copy
> > >> > > > > > > wouldn't be an issue.
> > >> > > > > > >
> > >> > > > > > > in general, do you think the "copy buffers across system memory" step
> > >> > > > > > > might be a hindrance over thunderbolt? i'm trying to figure out which
> > >> > > > > > > direction to go to debug, and i'm totally lost, so maybe i can do some
> > >> > > > > > > testing in that direction?
> > >> > > > > > >
> > >> > > > > > > and for what it's worth, when i turn the display "off" via the gnome
> > >> > > > > > > display settings, it's the same issue as when the laptop lid is closed.
> > >> > > > > > > so unless the motherboard treats a closed lid the same as "display
> > >> > > > > > > off", i'm not sure it's thermal issues.
> > >> > > > > > >
> > >> > > > > > > On Tue, May 19, 2020 at 2:14 PM Alex Deucher <alexdeucher at gmail.com> wrote:
> > >> > > > > > > >
> > >> > > > > > > > On Tue, May 19, 2020 at 2:59 PM Javad Karabi <karabijavad at gmail.com> wrote:
> > >> > > > > > > > >
> > >> > > > > > > > > given this setup:
> > >> > > > > > > > > laptop -thunderbolt-> razer core x -> xfx rx 5600 xt raw 2 -hdmi-> monitor
> > >> > > > > > > > > DRI_PRIME=1 glxgears gives me ~300fps
> > >> > > > > > > > >
> > >> > > > > > > > > given this setup:
> > >> > > > > > > > > laptop -thunderbolt-> razer core x -> xfx rx 5600 xt raw 2
> > >> > > > > > > > > laptop -hdmi-> monitor
> > >> > > > > > > > >
> > >> > > > > > > > > glxgears gives me ~1800fps
> > >> > > > > > > > >
> > >> > > > > > > > > this doesn't make sense to me, because i thought that having the
> > >> > > > > > > > > monitor plugged directly into the card should give the best performance.
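a quick sanity check for which GPU actually renders is the OpenGL renderer string from glxinfo; sketched here against a canned line (the string below is illustrative, not real output from this machine):

```shell
# Sketch: verify which GPU actually renders by checking the renderer string.
# On a live system: DRI_PRIME=1 glxinfo | grep "OpenGL renderer"
# The line below is a canned, illustrative sample.
line='OpenGL renderer string: AMD Radeon RX 5600 XT (NAVI10, DRM 3.36.0)'
renderer=${line#OpenGL renderer string: }
echo "$renderer"
# prints: AMD Radeon RX 5600 XT (NAVI10, DRM 3.36.0)
```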
> > >> > > > > > > > >
> > >> > > > > > > >
> > >> > > > > > > > Do you have displays connected to both GPUs?  If you are using X which
> > >> > > > > > > > ddx are you using?  xf86-video-modesetting or xf86-video-amdgpu?
> > >> > > > > > > > IIRC, xf86-video-amdgpu has some optimizations for prime which are not
> > >> > > > > > > > yet in xf86-video-modesetting.  Which GPU is set up as the primary?
> > >> > > > > > > > Note that the GPU which does the rendering is not necessarily the one
> > >> > > > > > > > that the displays are attached to.  The render GPU renders to its
> > >> > > > > > > > own buffer, and then that data may end up being copied to other GPUs
> > >> > > > > > > > for display.  Also, at this point, all shared buffers have to go
> > >> > > > > > > > through system memory (this will be changing eventually now that we
> > >> > > > > > > > support device memory via dma-buf), so there is often an extra copy
> > >> > > > > > > > involved.
> > >> > > > > > > >
> > >> > > > > > > > > theres another really weird issue...
> > >> > > > > > > > >
> > >> > > > > > > > > given setup 1, where the monitor is plugged in to the card:
> > >> > > > > > > > > when i close the laptop lid, my monitor is "active" and whatnot, and i
> > >> > > > > > > > > can "use it" in a sense
> > >> > > > > > > > >
> > >> > > > > > > > > however, heres the weirdness:
> > >> > > > > > > > > the mouse cursor will move along the monitor perfectly smooth and
> > >> > > > > > > > > fine, but all the other updates to the screen are delayed by about 2
> > >> > > > > > > > > or 3 seconds.
> > >> > > > > > > > > that is to say, it's as if the laptop is doing everything (e.g. if i
> > >> > > > > > > > > open a terminal, the terminal will open, but it will take 2 seconds
> > >> > > > > > > > > for me to see it)
> > >> > > > > > > > >
> > >> > > > > > > > > it's almost as if all the frames are being drawn, and the laptop is
> > >> > > > > > > > > running fine and everything, but i simply don't get to see it on the
> > >> > > > > > > > > monitor, except once every 2 seconds.
> > >> > > > > > > > >
> > >> > > > > > > > > it's hard to articulate, because it's so bizarre. it's not a "low
> > >> > > > > > > > > fps" per se, because the cursor is totally smooth. but it's that
> > >> > > > > > > > > _everything else_ is only updated once every couple seconds.
> > >> > > > > > > >
> > >> > > > > > > > This might also be related to which GPU is the primary.  It still may
> > >> > > > > > > > be the integrated GPU since that is what is attached to the laptop
> > >> > > > > > > > panel.  Also the platform and some drivers may do certain things when
> > >> > > > > > > > the lid is closed.  E.g., for thermal reasons, the integrated GPU or
> > >> > > > > > > > CPU may have a more limited TDP because the laptop cannot cool as
> > >> > > > > > > > efficiently.
> > >> > > > > > > >
> > >> > > > > > > > Alex

