| Summary: | RX 480 does not work as eGPU (amdgpu crashes at amdgpu_bo_init) | | |
|---|---|---|---|
| Product: | xorg | Reporter: | Gašper Sedej <gsedej> |
| Component: | Driver/AMDgpu | Assignee: | xf86-video-ati maintainers <xorg-driver-ati> |
| Status: | RESOLVED NOTOURBUG | QA Contact: | Xorg Project Team <xorg-team> |
| Severity: | normal | | |
| Priority: | medium | CC: | airlied |
| Version: | git | | |
| Hardware: | Other | | |
| OS: | All | | |

Can you try with an older kernel (e.g., 4.8)? This looks to be related to some recent changes in 4.9.

Created attachment 127696
linux 4.8
Thanks for the reply. I tried it with kernel 4.8, but now I get this error:
[ 64.395083] [drm:amdgpu_device_init [amdgpu]] *ERROR* sw_init of IP block <gmc_v8_0> failed -12
(the rest of the log is in the attachment)
Also note that the laptop I am testing on is an HP EliteBook 8730w (Core 2 Duo), which already uses a Mobility Radeon HD 3670.

AppArmor crashed, so I disabled it. Now I only get
...
[ 387.295971] ATOM BIOS: E347
[ 387.295994] [drm] GPU not posted. posting now...

Your system does not seem to be allocating any address space for the PCI BAR:
[ 64.395009] [drm] Detected VRAM RAM=8192M, BAR=0M
so the driver has no way to access VRAM with the CPU.

What can I do? Any way to debug?
This system DOES run with an Nvidia GTX 770 (using the proprietary driver).

(In reply to Gašper Sedej from comment #6)
> What can I do? Any way to debug?

You'll probably want to ask on the Linux PCI mailing list or file a bug against the Linux PCI subsystem.

> This system DOES run with Nvidia GTX 770 (using prop driver)

Is the nvidia device physically part of the laptop or connected via some external mechanism? Generally the sbios handles resource allocation with respect to PCI BARs. You may have just run out of address space.

I am using the "EXP GDC Beast",
http://www.banggood.com/EXP-GDC-Laptop-External-PCI-E-Graphics-Card-p-934367.html
The limitation is on PCIe lanes: only 1x or 2x. The PCIe version is probably v2 because of the laptop's age.
The EXP GDC has some CTD and PTD settings, as seen in the link, but I am not sure what they do. (nvidia works with the current settings)

The laptop can boot via BIOS and via EFI. I did an EFI installation, since I read it was better for such setups.

I am using the kernel parameters pci=nocrs and pci=realloc. I don't know why, but nvidia does not work without those parameters.
I think "pci=realloc" is for kernel memory allocation, but this is different from "PCI BAR", right?

Where can I ask about the PCI BAR issue?
(In reply to Gašper Sedej from comment #8)
> I am using "EXP GDC Beast",
> http://www.banggood.com/EXP-GDC-Laptop-External-PCI-E-Graphics-Card-p-934367.html
> The limitation is on PCIe lanes - only 1x or 2x. The pcie version is
> probably v2 because of laptop age.
> The EXP GDC has some CTD and PTD settings as seen in link, but I am not
> sure what it does. (nvidia works in current settings)
>
> The laptop can boot via BIOS and via EFI. I did EFI installation, since I
> read it was better for such setups.
>
> I am using kernel parameters pci=nocrs and pci=realloc - I don't know why,
> but nvidia does not work without those parameters.
> I think "pci=realloc" is for kernel memory allocation, but this is different
> than "PCI BAR" right?
>
> Where can I ask about PCI BAR issue?

pci=realloc will attempt to handle additional PCI resources, like BARs, that were not handled by the bios on boot. Whether or not it succeeds will depend on the number and size of the BARs. Do you also need pci=nocrs? You may just be getting lucky on the nvidia card if it uses fewer or smaller BARs. Please attach the full dmesg output. You should see messages about unassigned resources.

By "BIOS" you also mean "EFI"?

I will try to post the full dmesg, but it's quite hard. To get a "live" dmesg, I am using another computer connected via ssh with the command
watch -n 0.05 "dmesg | tail -n 100"
and the console set to "micro" font size, so I can select and copy the text as it comes through...

How can I access logs when the kernel freezes due to a graphics issue?

Also, if I boot the computer with the eGPU (RX 480) enabled, I simply won't get anything, because it crashes too quickly to capture anything through ssh... (the monitor is blank even in text boot mode)

Couldn't you run
ssh foo@box "dmesg -w" | tee dmesg.log

(In reply to Gašper Sedej from comment #10)
> By "BIOS" you also mean "EFI"?

I mean the system bios; whether you are using legacy or EFI mode is largely irrelevant.

> I will try to post full dmesg, but it's quite hard - to get "live" dmesg, I
> am using another computer connected via ssh and command
> watch -n 0.05 "dmesg | tail -n 100"
> and console to "micro" font size, so I can select and copy text it comes
> through...
>
> How can I access logs when kernel freezes due to graphics issue?
>
> Also, if I boot computer with enabled eGPU (RX 480), I simply won't get
> anything because it crashes too quick to be able to capture through ssh...
> (the monitor is blank even with text boot mode)

You can blacklist the amdgpu driver so it doesn't load and then dump the dmesg output. E.g., append the following to your kernel command line in grub:
modprobe.blacklist=amdgpu
If you want to load it manually after the system has booted, just run (as root):
modprobe amdgpu

Created attachment 127713
modprobe amdgpu
I was able to capture more using command
ssh foo@box "dmesg -w" | tee dmesg.log
see the attachment
Blacklisting didn't help. If I boot with the card connected, I just get a blank screen after the BIOS and nothing changes. With the RX 480 I can't boot (I can with nvidia).
It might be a hardware (system) issue...
Still, where can I look into this BAR=0 issue?
I'm confused when you say "nvidia card". I see in your dmesg it's posting an RV635, which is a crazy old radeon part, not nvidia. Is there an nvidia dgpu in the system? Also, looking at your memory map, you seem to have 4GB of memory with 256M reserved for the "PCI address space." On a normal PC you'd typically have ~768M reserved for that. Maybe try the opposite: blacklist radeon and not amdgpu, and try booting with the card inserted.

[ 478.856218] pci 0000:04:00.0: BAR 0: no space for [mem size 0x10000000 64bit pref]
[ 478.856221] pci 0000:04:00.0: BAR 0: failed to assign [mem size 0x10000000 64bit pref]
It doesn't matter when/if you load any specific drivers. You need to fix the PCI resource allocation on your platform. I'm not sure it's even possible with your current sbios.

Sorry for the confusion. I was not explicit enough.
The laptop I wish to "upgrade" is an HP Envy 15, with an intel i7-4700qm (igpu intel hd4700) and a "dgpu" nvidia optimus 840m (not so fast!!!). The Envy lacks ExpressCard.
The "eGPU" is the GDC EXP BEAST, which includes an ExpressCard "cable" to connect to your laptop. (I ordered a mini PCIe cable, but I am waiting for delivery.)
In the meantime I am testing on the company's old HP EliteBook, which has an Intel Core 2 Duo and an AMD RV635 (no intel gpu), and which HAS ExpressCard. (This is the only computer that has ExpressCard.)
I also have a few real, full-sized GPUs to test:
- nvidia GTX 770 - it's working as eGPU, but it after few minutes in game (also, the card is my friend's)
- AMD RX 480 - does not work as eGPU
- (I also have an AMD HD5570 and a GeForce 610; they also don't work)

I tried blacklisting radeon, but it didn't help.

So the "PCI resource allocation" is something that the bios is doing? So another computer to test...?

(In reply to Gašper Sedej from comment #16)
> So the "PCI resource allocation" is something that bios is doing? So another
> computer to test...?

It's a combination of the bios and the kernel.
Your best bet is to email the linux-pci mailing list or file a kernel bug against the PCI subsystem about the failure to assign PCI resources. There's nothing the gpu driver can do until that is resolved.
linux-pci ML: http://vger.kernel.org/vger-lists.html#linux-pci
Kernel bugzilla: https://bugzilla.kernel.org/enter_bug.cgi?product=Drivers
Select PCI from the component list.
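
The `BAR 0: no space for ...` messages quoted in this thread can be picked out of a saved log mechanically, which may help when preparing a report for linux-pci. The following is only a minimal sketch, assuming the log was captured to a text file such as `dmesg.log`. The helper name `find_unassigned_bars` and the regular expression are illustrative assumptions that match just the message shape quoted here, not every variant the kernel may print.

```python
import re

# Matches kernel messages of the shape quoted in this thread, e.g.
#   pci 0000:04:00.0: BAR 0: no space for [mem size 0x10000000 64bit pref]
# The pattern is an assumption derived from those two lines only.
BAR_FAIL = re.compile(
    r"pci (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"BAR (?P<bar>\d+): (?:no space for|failed to assign) "
    r"\[mem size (?P<size>0x[0-9a-f]+)"
)


def find_unassigned_bars(log_text):
    """Return {(device, bar_index): size_in_bytes} for BARs the kernel could not place."""
    failures = {}
    for match in BAR_FAIL.finditer(log_text):
        key = (match["dev"], int(match["bar"]))
        failures[key] = int(match["size"], 16)
    return failures


if __name__ == "__main__":
    sample = (
        "[ 478.856218] pci 0000:04:00.0: BAR 0: no space for "
        "[mem size 0x10000000 64bit pref]\n"
        "[ 478.856221] pci 0000:04:00.0: BAR 0: failed to assign "
        "[mem size 0x10000000 64bit pref]\n"
    )
    for (dev, bar), size in find_unassigned_bars(sample).items():
        print(f"{dev} BAR {bar}: {size // (1024 * 1024)} MiB unassigned")
```

For the two lines quoted in this thread, the requested size is 0x10000000 bytes, i.e. 256 MiB. That is as large as the entire 256M region that, according to the comment above, this machine reserves for PCI address space.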
Created attachment 127690
dmesg from crash

I recently got a new laptop with an i7 4700qm CPU. The nvidia 840m is very slow even with the nvidia driver (I can play new Steam Linux games at 720p at low to medium settings).
My idea is to use an "eGPU" solution with an RX 480 (external graphics connected to the laptop via ExpressCard or mini PCIe).
I prepared my system: Ubuntu 16.04, kernel 4.9-rc3, mesa 13.1 dev.
If I start the laptop with the RX 480 connected, I get a blank screen after GRUB and can't access it.
When I connect the GPU while the system is running, I can get dmesg using ssh from another computer.
I am using the kernel parameters pci=nocrs and pci=realloc.
Here is probably the problematic part (the rest is in the attachment):
[ 980.754662] [<ffffffffa1e705a9>] io_reserve_memtype+0x59/0x130
[ 980.754688] [<ffffffffa1e706af>] arch_io_reserve_memtype_wc+0x2f/0x50
[ 980.754780] [<ffffffffc08f3180>] amdgpu_bo_init+0x20/0x90 [amdgpu]
[ 980.754835] [<ffffffffc092cbca>] gmc_v8_0_sw_init+0x37a/0x5a0 [amdgpu]
[ 980.754885] [<ffffffffc08e12a4>] amdgpu_device_init+0xc64/0x11e0 [amdgpu]
[ 980.754920] [<ffffffffa1fd0584>] ? kmalloc_order_trace+0x24/0xa0
[ 980.754967] [<ffffffffc08e39db>] amdgpu_driver_load_kms+0x5b/0x1f0 [amdgpu]
[ 980.755028] [<ffffffffc00a5657>] drm_dev_register+0xa7/0xd0 [drm]
[ 980.755069] [<ffffffffc00a773c>] drm_get_pci_dev+0x9c/0x1c0 [drm]
[ 980.755121] [<ffffffffc08de49c>] amdgpu_pci_probe+0xbc/0xe0 [amdgpu]
[ 980.755150] [<ffffffffa226e415>] local_pci_probe+0x45/0xa0
[ 980.755173] [<ffffffffa226f8c9>] pci_device_probe+0x109/0x160
[ 980.755202] [<ffffffffa2393fd3>] driver_probe_device+0x223/0x430
[ 980.755229] [<ffffffffa23942bf>] __driver_attach+0xdf/0xf0
[ 980.755253] [<ffffffffa23941e0>] ? driver_probe_device+0x430/0x430
[ 980.755279] [<ffffffffa2391b0c>] bus_for_each_dev+0x6c/0xc0
[ 980.755306] [<ffffffffa239371e>] driver_attach+0x1e/0x20
[ 980.755330] [<ffffffffa2393140>] bus_add_driver+0x170/0x270
[ 980.755357] [<ffffffffc0a25000>] ? 0xffffffffc0a25000
[ 980.755383] [<ffffffffa2394c30>] driver_register+0x60/0xe0
[ 980.755407] [<ffffffffc0a25000>] ? 0xffffffffc0a25000
[ 980.755432] [<ffffffffa226dd0c>] __pci_register_driver+0x4c/0x50
[ 980.755468] [<ffffffffc00a794b>] drm_pci_init+0xeb/0x100 [drm]
[ 980.755493] [<ffffffffc0a25000>] ? 0xffffffffc0a25000
[ 980.755519] [<ffffffffc0a25000>] ? 0xffffffffc0a25000
[ 980.755566] [<ffffffffc0a25079>] amdgpu_init+0x79/0x7b [amdgpu]
[ 980.755596] [<ffffffffa1e02190>] do_one_initcall+0x50/0x180
[ 980.755624] [<ffffffffa1ff0201>] ? __vunmap+0x81/0xd0
[ 980.755651] [<ffffffffa200e742>] ? kmem_cache_alloc_trace+0x142/0x190
[ 980.755683] [<ffffffffa1fa3df1>] do_init_module+0x5f/0x1f7
[ 980.755707] [<ffffffffa1f1392b>] load_module+0x199b/0x1d00
[ 980.755732] [<ffffffffa1f10020>] ? __symbol_put+0x60/0x60
[ 980.755759] [<ffffffffa21bcf8e>] ? ima_post_read_file+0x7e/0xa0
[ 980.755790] [<ffffffffa2175edb>] ? security_kernel_post_read_file+0x6b/0x80
[ 980.755824] [<ffffffffa1f13eff>] SYSC_finit_module+0xdf/0x110
[ 980.755853] [<ffffffffa1f13f4e>] SyS_finit_module+0xe/0x10
[ 980.755878] [<ffffffffa268bbbb>] entry_SYSCALL_64_fastpath+0x1e/0xad
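
To see which driver functions are involved without reading every frame, the trace above can be filtered by module. This is a minimal sketch, assuming frames have the `[<address>] symbol+0xoff/0xsize [module]` shape shown in this report. `frames_in_module` is a hypothetical helper written for illustration, not part of any kernel tooling.

```python
import re

# One frame in this report looks like:
#   [<ffffffffc08f3180>] amdgpu_bo_init+0x20/0x90 [amdgpu]
# A leading "?" marks a frame the unwinder is unsure about; core-kernel
# frames carry no trailing [module] tag.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\] (?P<unreliable>\? )?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?"
)


def frames_in_module(trace_text, module):
    """List reliable symbols belonging to `module`, in trace order (innermost first)."""
    return [
        m["symbol"]
        for m in FRAME.finditer(trace_text)
        if m["module"] == module and not m["unreliable"]
    ]


if __name__ == "__main__":
    sample = (
        "[<ffffffffa1e706af>] arch_io_reserve_memtype_wc+0x2f/0x50\n"
        "[<ffffffffc08f3180>] amdgpu_bo_init+0x20/0x90 [amdgpu]\n"
        "[<ffffffffc092cbca>] gmc_v8_0_sw_init+0x37a/0x5a0 [amdgpu]\n"
        "[<ffffffffa1fd0584>] ? kmalloc_order_trace+0x24/0xa0\n"
    )
    print(frames_in_module(sample, "amdgpu"))
```

In the trace in this report, the innermost amdgpu frame is `amdgpu_bo_init`, called from `gmc_v8_0_sw_init`, which is the path named in the bug summary.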