Created attachment 74857 [details]
Screenshot of Xorg

I recently bought a Radeon HD 7000 series graphics card, a Sapphire Radeon HD 7870 XT, and enabled glamor in xorg.conf. The X server fails to start with glamor enabled: for 10 seconds it shows nothing, and then it shows what is in the attached screenshot (taken with my mobile phone). After killing X the monitor turns off as if the GPU is off (probably true). X works if I disable glamor, but then I'm forced to use swrast.

xorg-xserver - 1.13.2
xf86-video-ati - git or 7.1.0 (both tried)
glamor - git or 0.5 (both tried)
libdrm - git or 2.4.42 (both tried)
kernel - git or 3.7.7 (both tried)
mesa - git

weston also fails to start. It just shows a black screen with a white stripe on top. Killing it returns me to a tty.

I'll attach parts of the kernel log when starting Xorg and weston.
Created attachment 74859 [details] Kernel log when starting Xorg (lines containing radeon or drm)
Created attachment 74860 [details] Kernel log when starting weston (lines containing radeon or drm)
glamor doesn't work with xserver 1.13 or newer yet. See also bug 58910.
glamor builds fine with Xorg 1.13.2. The exact same thing happens with Xorg 1.12.4 - builds fine but doesn't start.
I don't think it's an Xorg/glamor bug because weston doesn't start either. It may be in the kernel, libdrm, mesa or llvm. When starting weston, nothing renders. When starting X, a shader probably enters an infinite loop. Then the kernel tries to reset the GPU and fails. Reconfirmed with today's git for mesa, llvm and libdrm.
(In reply to comment #4) > glamor builds fine with Xorg 1.13.2. Yes, but it really can't work with it per bug 58910. You'll have to stick to pre-1.13 X servers for testing glamor. (In reply to comment #5) > I don't think it's a Xorg/glamor bug because weston doesn't start too. How about a simple EGL app, e.g. mesa/demos/src/egl/opengl/egltri_screen ? This could be a Tahiti specific problem.
egltri_screen shows the same crap as weston. It segfaults 5 seconds after start.
(In reply to comment #7) > egltri_screen shows the same crap as weston. It segfaults 5 seconds after > start. Please attach a backtrace of the segfault and the stderr output from running it with the environment variables EGL_LOG_LEVEL=debug RADEON_DUMP_SHADERS=1 set.
Created attachment 75409 [details] Debug output Here is the debug output and the stack trace (llvm and mesa from yesterday git/svn)
The segfault looks like some kind of LLVM issue on shutdown, we can probably ignore that for now. The real problem is that your GPU hangs even while running the trivial shaders egltri_screen uses. Alex, could it be we're trying to use a missing/disabled shader engine or something like that? Hristo, can you also attach the kernel output corresponding to egltri_screen? I expect it'll be basically the same as for starting X, but just in case.
It's the same as starting weston. egltri_screen doesn't hang the GPU. Only Xorg does. egltri_screen and weston fail to clear the buffer or render anything. With current llvm svn, mesa git, libdrm git and linux 3.8.2 the problem is still there.
(In reply to comment #11) > egltri_screen doesn't hang the GPU. Only Xorg does. egltri_screen and weston > fail to clear the buffer or render anything. Because the GPU hangs trying to render anything. :) The kernel output is from resetting the hung GPU.
Created attachment 76481 [details] [review] Mesa test patch Does it work better with this Mesa patch?
Sadly this patch doesn't fix this bug. egltri_screen does not render anything and does not cause GPU reset. However eglgears_screen and Xorg cause the GPU to reset. Without the patch it's the same.
Now egl{tri,gears}_screen work. However they don't render properly. Square pixel blocks seem to be "misplaced". I will test with Xorg and attach a screenshot.
Created attachment 76921 [details] Screenshots with and without glamor X11 doesn't render properly. The pixels seem to be shuffled. Maybe a shader doesn't write where it's supposed to? There is no kernel output caused by this.
(In reply to comment #16) > Created attachment 76921 [details] > Screenshots with and without glamor > > X11 doesn't render properly. The pixels seem to be shuffled. Maybe a shader > doesn't write where it's supposed to? Looks like the tiling configuration is wrong on your system.
Actually the GPU only works if it has been initialized by fglrx and then the driver is switched to radeon without rebooting. When booting with radeon it doesn't work. I bruteforced all 24 tiling configs by setting them in evergreen_interpret_tiling in radeonsi_pipe.c and none of them worked for eglgears_screen. They all lead to the same result. The tiling config given to evergreen_interpret_tiling is 0x1023. I have no idea what the (1<<12) bit in it means.
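A minimal sketch of how the 0x1023 value above could be split into fields, assuming the layout described in the layout comment of the kernel radeon driver (bits 3:0 num_pipes, 7:4 num_banks, 11:8 group_size, 15:12 row_size). Under that assumption the (1<<12) bit would be the low bit of the row_size field. The enum and function names are hypothetical and exist only for illustration; the returned values are raw field encodings, not decoded counts.

```c
#include <assert.h>
#include <stdint.h>

/* Field order is an ASSUMPTION taken from the radeon kernel driver's
 * layout comment: bits 3:0 num_pipes, 7:4 num_banks,
 * 11:8 group_size, 15:12 row_size. */
enum tile_field {
    TILE_NUM_PIPES,
    TILE_NUM_BANKS,
    TILE_GROUP_SIZE,
    TILE_ROW_SIZE
};

/* Return the raw 4-bit encoding of one field (not the decoded count). */
unsigned tile_field_raw(uint32_t tiling_config, enum tile_field f)
{
    assert(f <= TILE_ROW_SIZE);
    return (tiling_config >> (4u * (unsigned)f)) & 0xfu;
}
```

With this split, tile_field_raw(0x1023, TILE_ROW_SIZE) yields 1, and the other three fields yield 3, 2 and 0.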
(In reply to comment #18)
> Actually the GPU only works if it has been initialized by fglrx and then the
> driver is switched to radeon without rebooting. When booting with radeon it
> doesn't work.

Can you dump the registers with avivotool (http://cgit.freedesktop.org/~airlied/radeontool/) from both radeon and fglrx? E.g., cold boot with radeon and dump the registers:
sudo avivotool regmatch '*' > radeon.regs
then boot with fglrx, switch to radeon, and dump them again:
sudo avivotool regmatch '*' > fglrx.regs
and post the outputs here?
(In reply to comment #19) > (In reply to comment #18) > > Actually the GPU only works if it has been initialized by fglrx and then the > > driver is switched to radeon without rebooting. When booting with radeon it > > doesn't work. > > Can you dump the registers with avivotool > (http://cgit.freedesktop.org/~airlied/radeontool/) from both radeon and > fglrx? E.g., cold boot with radeon and dump the registers: > sudo avivotool regmatch '*' > radeon.regs > then boot with fglrx and switch radeon: > sudo avivotool regmatch '*' > fglrx.regs > and post the outputs here? Actually, three outputs would be better: 1. cold boot with radeon 2. cold boot with fglrx 3. cold boot with fglrx, warm boot radeon
Created attachment 77012 [details] Outputs of avivotool AVIVO_D2GRPH_{ENABLE,CONTROL} seem to make the difference between lockup and wrong rendering.
Ugh. Sorry. I told you the wrong option for avivotool so it dumped the wrong registers. It should be : avivotool regs all
Created attachment 77211 [details] Outputs of avivotool avivotool while running fglrx nicely halts the GPU. Xorg wouldn't die so I couldn't switch to radeon. cold-fglrx-warm-radeon was from the next boot.
Any idea regarding the reason for this behavior? P.S. I noted that glxgears renders OK on fullscreen with fglrx->radeon. In window it's wrong.
(In reply to comment #24) > Any idea regarding the reason for this behavior? > > P.S. I noted that glxgears renders OK on fullscreen with fglrx->radeon. In > window it's wrong. Unfortunately, I didn't see anything obvious in the dump. Can you try my drm-next branch? http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-3.10 Does that help?
Sadly the problem still remains.
(In reply to comment #26) > Sadly the problem still remains. Which problem? Does that kernel act like radeon with fglrx loaded first, or still like radeon without fglrx loaded at all?
The kernel acts the same way as it did before.
I just noticed a regression: the GPU halts when cold boot was done by fglrx and driver is switched to radeon.
Are the golden registers the same in radeonsi and fglrx?
(In reply to comment #30) > Are the golden registers the same in radeonsi and fglrx? Yes.
Created attachment 79504 [details]
Results of OpenCL test

BREAKTHROUGH!

OpenCL works. Kinda. Tried the following kernel:
__kernel void add(__global const uint *a, __global const uint *b, __global uint *c){
    c[0]=1;
}
Complicated operations such as addition, memory loads, getting global ID, etc. fail with "Cannot select" errors.
I have no idea if this has worked with earlier LLVM/mesa.

After the kernel is run, the 0-th element of c is equal to 1. I've attached full source code and outputs for various kernels.
(In reply to comment #32) > Created attachment 79504 [details] > Results of OpenCL test > > BREAKTHROUGH! > > OpenCL works. Kinda. Tried the following kernel: > __kernel void add(__global const uint *a, __global const uint *b, __global > uint *c){ > c[0]=1; > } > Complicated operations such as addition, memory loads, getting global ID, > etc. fail with Cannot select errors. > I have no idea if this has worked with earlier LLVM/mesa. > All that is supported in the git tree is stores to global memory. I have global loads, work item functions, and a fair amount of arithmetic operations working in a local branch, and I hope to get that pushed to mainline in the next week or two. > After the kernel is run, the 0-th element of c is equal to 1. I've attached > full source code and outputs for various kernels.
What's the difference between integer addition in OpenGL shaders and OpenCL kernels? Aren't the intrinsics the same?
OpenCL update: On floating point, addition, subtraction, multiplication, division and pow work. On integer, addition, subtraction and multiplication work. Division and modulo halt the GPU. If they are implemented the same way as in OpenGL, this might be the bug I'm facing.
For OpenCL with radeonsi, make sure your LLVM and Mesa SVN/Git snapshots are up to date as of today. However, I'm afraid your success with OpenCL doesn't necessarily mean anything for the graphics problem, as the latter involves much more complex hardware state setup.
I updated llvm, clang and mesa. Division and modulo still don't work.

Another thing I noticed is that ifs which depend on memory loads cause llvm crash:
__kernel void add(__global const uint *a, __global const uint *b, __global uint *c){
    ulong id=get_global_id(0); // OK
    if(id>10) return; // OK
    if(b[id]==0) return; // crash
    c[id]=a[id]/b[id]; // GPU hang
}
a[id] is id+1
b[id] is 2*id+2

Stack dump:
0. Running pass 'Function Pass Manager' on module 'radeon'.
1. Running pass 'AMDGPU DAG->DAG Pattern Instruction Selection' on function '@add'
Segmentation fault
#0 0x00007ffff461c8a7 in ?? () from /usr/lib64/llvm/libLLVM-3.4svn.so
#1 0x00007ffff3e36208 in llvm::SelectionDAGISel::DoInstructionSelection() () from /usr/lib64/llvm/libLLVM-3.4svn.so
#2 0x00007ffff3e3c620 in llvm::SelectionDAGISel::CodeGenAndEmitDAG() () from /usr/lib64/llvm/libLLVM-3.4svn.so
#3 0x00007ffff3e3e0f2 in llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) () from /usr/lib64/llvm/libLLVM-3.4svn.so
#4 0x00007ffff3e3f421 in llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) () from /usr/lib64/llvm/libLLVM-3.4svn.so
#5 0x00007ffff3acaeb2 in llvm::FPPassManager::runOnFunction(llvm::Function&) () from /usr/lib64/llvm/libLLVM-3.4svn.so
#6 0x00007ffff3acaf4b in llvm::FPPassManager::runOnModule(llvm::Module&) () from /usr/lib64/llvm/libLLVM-3.4svn.so
#7 0x00007ffff3acb195 in llvm::MPPassManager::runOnModule(llvm::Module&) () from /usr/lib64/llvm/libLLVM-3.4svn.so
#8 0x00007ffff3acd1dc in llvm::PassManagerImpl::run(llvm::Module&) () from /usr/lib64/llvm/libLLVM-3.4svn.so
#9 0x00007ffff417c009 in ?? () from /usr/lib64/llvm/libLLVM-3.4svn.so
#10 0x00007ffff417c382 in LLVMTargetMachineEmitToMemoryBuffer () from /usr/lib64/llvm/libLLVM-3.4svn.so
#11 0x00007ffff2ae6ab1 in radeon_llvm_compile () from /usr/lib64/gallium-pipe/pipe_radeonsi.so
#12 0x00007ffff2adc65d in si_compile_llvm () from /usr/lib64/gallium-pipe/pipe_radeonsi.so
#13 0x00007ffff2adef79 in ?? () from /usr/lib64/gallium-pipe/pipe_radeonsi.so
#14 0x00007ffff6d882a7 in _cl_kernel::exec_context::bind(_cl_command_queue*) () from /usr/lib64/libOpenCL.so.1
#15 0x00007ffff6d88e46 in _cl_kernel::launch(_cl_command_queue&, std::vector<unsigned long, std::allocator<unsigned long> > const&, std::vector<unsigned long, std::allocator<unsigned long> > const&, std::vector<unsigned long, std::allocator<unsigned long> > const&) () from /usr/lib64/libOpenCL.so.1
#16 0x00007ffff6d847dc in _cl_event::trigger() () from /usr/lib64/libOpenCL.so.1
#17 0x00007ffff6d84e54 in clover::hard_event::hard_event(_cl_command_queue&, unsigned int, std::vector<_cl_event*, std::allocator<_cl_event*> >, std::function<void (_cl_event&)>) () from /usr/lib64/libOpenCL.so.1
#18 0x00007ffff6d9fad5 in clEnqueueNDRangeKernel () from /usr/lib64/libOpenCL.so.1
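For comparing against whatever the GPU writes back, here is a rough host-side reference of the kernel above in plain C. It is only a sketch: the function names are hypothetical, and it assumes the inputs are filled exactly as described (a[id] = id+1, b[id] = 2*id+2). With those inputs the unsigned division (id+1)/(2*id+2) is 0 for every id, so a correct run should leave c[0..10] equal to 0 and leave the remaining elements untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill the inputs as described in the report: a[id] = id+1, b[id] = 2*id+2. */
void fill_inputs(uint32_t *a, uint32_t *b, size_t n)
{
    assert(a != NULL && b != NULL);
    for (size_t id = 0; id < n; id++) {
        a[id] = (uint32_t)(id + 1);
        b[id] = (uint32_t)(2 * id + 2);
    }
}

/* CPU reference of the kernel body for a single work item.
 * Returns 1 if the work item stores to c[id], 0 if it returns early. */
int add_reference(const uint32_t *a, const uint32_t *b, uint32_t *c, size_t id)
{
    if (id > 10)
        return 0;            /* same early-out as the kernel */
    if (b[id] == 0)
        return 0;            /* guard against division by zero */
    c[id] = a[id] / b[id];   /* unsigned integer division */
    return 1;
}
```

Running add_reference for each id over the same buffers gives the values to compare with the GPU result.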
The OpenCL failures are unrelated to the original bug, so can you please file a separate bug for them. This bug has been outstanding for a while, and it seems like there are actually several "bugs". Could you please summarize the problems you are currently having and list the versions or git HEAD commits that you are using for glamor, xf86-video-ati, Xorg server, Linux kernel, Mesa, and LLVM. Thanks.
What is the difference between OpenCL integer division and OpenGL shader integer division?
After some testing I found out that my GPU crashes on big shaders. 67 32-bit words is enough to crash it, 50 isn't. How is udiv implemented?
I'm really sorry I have to do that. Bump.
Does X11 still crash for you on radeonsi?
Hi!
I'm not 100% sure if this is the same issue, but when I try to start X with the default settings (without an X conf file) the screen remains black and the GPU fan speeds up. After 5 minutes I abort it. According to the logs the X server starts and (as far as I can tell) there are no suspicious messages.

As a workaround I put 'Option "AccelMethod" "EXA"' into an X conf file.

My system:
GPU: XFX Radeon HD 7870 GHz Edition (AMD Tahiti)
kernel: 3.11.6-1-ARCH
xf86-video-ati: 1:7.2.0
mesa: 9.2.2
glamor-egl: 0.5.1
X-server: 1.14.3
llvm: 3.3

Should I attach the Xorg.log? What else can I do to track this down?
Thanks in advance
(In reply to comment #40) > After some testing I found out that my GPU crashes on big shaders. 67 32-bit > words is enough to crash it, 50 isn't. How is udiv implemented?
(In reply to comment #44) > (In reply to comment #40) > > After some testing I found out that my GPU crashes on big shaders. 67 32-bit > > words is enough to crash it, 50 isn't. How is udiv implemented? Does this crash happen when X starts or when you are running an OpenCL program?
(In reply to comment #43) > Hi! > I'm not 100% sure if this is the same issue but when I try to start with the > default settings (without an X conf file) the screen remains black and the > gpu-fan speeds up. After 5 minutes I abort it. > According to the logs the X-Server starts and (for me) there are no > suspicious messages. > > As a workaround I put 'Option "AccelMethod" "EXA"' into a X conf file. > > My system: > GPU: XFX Radeon HD 7870 GHz Edition (AMD Tahiti) > kernel: 3.11.6-1-ARCH > xf86-video-ati: 1:7.2.0 > mesa: 9.2.2 > glamor-egl: 0.5.1 > X-server: 1.14.3 > llvm: 3.3 > > Should I attach the Xorg.log? What else can I do to track this down? > Thanks in advance Yes, please post your Xorg.log.
What crashes the GPU:
- OpenGL
- OpenCL: big kernels (> 66 words)

What does not crash the GPU:
- KMS
- Copying buffers to/from GPU
- HDMI (sound not tested, shows up with recent kernels)
- OpenCL: small kernels (< 51 words)
(In reply to comment #47) > What crashes the GPU: > - OpenGL > - OpenCL: big kernels (> 66 words) A new bug should be opened for these failures. > What does not crash the GPU: > - KMS > - Copying buffers to/from GPU > - HDMI (sound not tested, shows up with recent kernels) > - OpenCL: small kernels (< 51 words)
Created attachment 88014 [details] Xorg.log
*** Bug 70778 has been marked as a duplicate of this bug. ***
Bump
(In reply to comment #52) > Bump What is the PCI ID of your GPU? If you run: lspci -nn | grep VGA The PCI ID will be the number at the end of the line inside the brackets.
(In reply to comment #53) > What is the PCI ID of your GPU? From attachment 74859 [details]: > [drm] initializing kernel modesetting (TAHITI 0x1002:0x679E 0x174B:0xE246). So the PCI ID is 0x679E, a harvested Tahiti I think.
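The vendor:device pair shows up in two spellings in this report: 0x1002:0x679E in the kernel log and 1002:6837 style in lspci -nn output. A small sketch of parsing either form with sscanf follows; the helper name is hypothetical, and it relies on %x accepting an optional 0x prefix, as the C standard specifies.

```c
#include <assert.h>
#include <stdio.h>

/* Parse "VVVV:DDDD" or "0xVVVV:0xDDDD" into vendor and device IDs.
 * Returns 1 on success, 0 if the string does not match. */
int parse_pci_id(const char *s, unsigned *vendor, unsigned *device)
{
    assert(s != NULL && vendor != NULL && device != NULL);
    /* %x matches hexadecimal digits with an optional 0x/0X prefix */
    return sscanf(s, "%x:%x", vendor, device) == 2;
}
```

Both spellings parse to the same numeric pair, which makes it easier to match a log line against a PCI ID table.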
I have a similar or the same problem. I posted the details in this bug: https://bugs.freedesktop.org/show_bug.cgi?id=71488#c10 If you need more info, let me know and I will try to provide it. I would like to see the radeonsi driver working with my card.
It happens with an AMD Radeon HD 7730 graphics card. In lspci -nn it is identified as: 05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde LE [Radeon HD 7730/8730] [1002:6837] So the PCI ID should be 0x1002:0x6837. @Tom Stellard, @Michel Dänzer: Can you look at this problem?
Created attachment 92941 [details] [review] Possible Fix Does this patch help?
OpenCL with big kernels and glxinfo still hang unconditionally. Default raster_config = 0x2a00126a rb mask = 255 Final raster_config = 0x2a00126a
(In reply to comment #58) > Default raster_config = 0x2a00126a > rb mask = 255 > Final raster_config = 0x2a00126a The patch didn't modify the raster_config value, so either it's not correct yet, or the kernel is providing incorrect information about which backends are enabled.
(In reply to comment #59) > (In reply to comment #58) > > Default raster_config = 0x2a00126a > > rb mask = 255 > > Final raster_config = 0x2a00126a > > The patch didn't modify the raster_config value, so either it's not correct > yet, or the kernel is providing incorrect information about which backends > are enabled. The RB mask is 255, which means all 8 rbs are enabled, so either the kernel is providing the wrong information or there is something else besides the raster_config that we need to fix.
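The reasoning above (mask 255 means all 8 RBs enabled) amounts to counting the set bits in the mask. A minimal sketch, with hypothetical helper names and with the maximum RB count passed in as a parameter rather than queried from any driver:

```c
#include <assert.h>
#include <stdint.h>

/* Count how many of the first max_rbs render backends are marked
 * enabled in rb_mask (bit i set means RB i is enabled). */
unsigned count_enabled_rbs(uint32_t rb_mask, unsigned max_rbs)
{
    assert(max_rbs <= 32);
    unsigned n = 0;
    for (unsigned i = 0; i < max_rbs; i++)
        if (rb_mask & (UINT32_C(1) << i))
            n++;
    return n;
}

/* A part looks "harvested" from the mask only if fewer than
 * max_rbs backends are reported as enabled. */
int rbs_look_harvested(uint32_t rb_mask, unsigned max_rbs)
{
    return count_enabled_rbs(rb_mask, max_rbs) < max_rbs;
}
```

For a mask of 255 and 8 possible RBs nothing looks harvested, which is why the patch leaves raster_config unchanged in that case.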
Hi Tom Stellard!

I have now updated the kernel to 3.13, with drm, the radeon X driver and mesa from git, plus the patch above. The patch really fixed the problem with glamor rendering :-)

Here is the output from glxgears:

$ glxgears
Default raster_config = 0x124a
rb mask = 10
Final raster_config = 0x124f
Default raster_config = 0x124a
rb mask = 10
Final raster_config = 0x124f
Running synchronized to the vertical refresh. The framerate should be approximately the same as the monitor refresh rate.
270 frames in 5.0 seconds = 53.990 FPS
^C

I also tested with the 3.14-rc1 kernel and it still works. But with older kernels (3.11) the same problem remains and it does not work.
(In reply to comment #61) > Hi Tom Stellard! > > Now I updated kernel to 3.13, drm from git, radeon x driver from git and > mesa from git with above patch. And patch really fixed problem with glamor > rendering :-) > > Here is output from glxgears > > $ glxgears > Default raster_config = 0x124a > rb mask = 10 > Final raster_config = 0x124f > Default raster_config = 0x124a > rb mask = 10 > Final raster_config = 0x124f > Running synchronized to the vertical refresh. The framerate should be > approximately the same as the monitor refresh rate. > 270 frames in 5.0 seconds = 53.990 FPS > ^C > > Tested also with 3.14-rc1 kernel and still working. But with old kernels > (3.11) there is same problem and not working. Older kernels don't provide an interface to query the enabled backends. It should work with 3.12.x and newer.
OK, so do you need me to test anything else? Or will you include that patch in mesa?
(sh_per_se | 0x1) /* WTF? Shift widths aren't often used that way in a bitmask. */
(1u<<sh_per_se)-1 /* probably what was meant */

Patch + kernel 3.13 + llvm,mesa,libdrm,glamor,xf86-video-ati,wayland,weston git:

rb_config=255:
- OpenCL <64 dwords works
- OpenCL >64 dwords hangs
- X11 starts
- glxinfo hangs
- glxgears hangs
- weston works

rb_config&=0b00001100:
- OpenCL <64 dwords works
- OpenCL >64 dwords hangs
- X11 starts
- glxinfo works
- glxgears works! (~2600 Frames/s)
- My X session hangs (probably chromium)
- weston corrupted
- Any OpenCL (even *a=0;) after OpenGL fails with error about kernel rejecting CS, nothing on dmesg

Also, GPU reset fails. I used S3 sleep in order to reset the GPU. Sleeps for <15 seconds may sometimes fail to reset it.
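To make the difference between the two expressions quoted at the top of this comment concrete, here is a small sketch that evaluates both. The function names are hypothetical, and it says nothing about which value the hardware actually wants; it only shows where the two forms diverge.

```c
#include <assert.h>

/* The expression as quoted from the patch. */
unsigned sh_mask_or(unsigned sh_per_se)
{
    return sh_per_se | 0x1u;
}

/* The suggested alternative: a mask with sh_per_se low bits set. */
unsigned sh_mask_shift(unsigned sh_per_se)
{
    assert(sh_per_se < 32);   /* keep the shift well defined */
    return (1u << sh_per_se) - 1u;
}
```

For sh_per_se of 1 or 2 both forms give the same value (1 and 3), so the difference would only show up on parts reporting 3 or more: 3 gives 3 versus 7, and 4 gives 5 versus 15.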
*** Bug 74154 has been marked as a duplicate of this bug. ***
@Tom Stellard: Will you prepare a new patch for testing? And when will you include this fix in mesa?
(In reply to comment #66) > @Tom Stellard: Will you prepare new patch for testing? And when you include > this fix into mesa? I'm still trying to track down a Tahiti GPU, so I can see what the issue is there.
@Tom Stellard: Is there any news on this problem?
BUMP!
Anything new?
rb_config&=0b00001100; does not work anymore (glxgears no longer runs with it)
(In reply to comment #57) > Created attachment 92941 [details] [review] [review] > Possible Fix > > Does this patch help? @Tom Stellard: That patch does not apply anymore on top of mesa git.
@Tom Stellard, @Michel Dänzer: ping
Created attachment 98257 [details] [review] Fix v2 Pali, I have sent this patch to the mailing list for review, can you confirm that it fixes the issue for you.
Created attachment 98258 [details] [review] Tahiti Fix Hristo, can you try this kernel patch?
(In reply to comment #75) > Created attachment 98258 [details] [review] [review] > Tahiti Fix + for (i = 0; i < rdev->config.si.max_texture_channel_caches; i++) + cgts_tcc_disable &= ~(1 << (16 + i)); this should be: + for (i = 0; i < rdev->config.cik.max_texture_channel_caches; i++) + cgts_tcc_disable &= ~(1 << (16 + i));
Created attachment 98320 [details] /var/log/Xorg.0.log Hi, I have the same issue with Radeon HD 7870 XT (https://bugs.freedesktop.org/show_bug.cgi?id=74154). I tried to apply the 0001-radeonsi-Program-RASTER_CONFIG-for-harvested-GPUs-v2 patch to mesa 10.1.1, build and install but it doesn't help. Maybe I did something wrong? ./autogen.sh --prefix=/usr --libdir=/usr/lib64/ --sysconfdir=/etc --enable-selinux --enable-osmesa --enable-egl --disable-gles1 --enable-gles2 --disable-gallium-egl --disable-xvmc --enable-vdpau --with-egl-platforms=x11,drm,wayland --enable-shared-glapi --enable-gbm --enable-opencl --enable-opencl-icd --enable-glx-tls --enable-texture-float=yes --enable-gallium-llvm --with-llvm-shared-libs --enable-dri --enable-xa --with-gallium-drivers=svga,radeonsi,swrast,r600,r300,nouveau --disable-dri3 --with-clang-libdir=/usr/lib/ make sudo make install
(In reply to comment #76) > this should be: > > + for (i = 0; i < rdev->config.cik.max_texture_channel_caches; i++) > + cgts_tcc_disable &= ~(1 << (16 + i)); Why? This is si_gpu_init(). (In reply to comment #75) > Created attachment 98258 [details] [review] [review] > Tahiti Fix [...] >+ WREG32(CGTS_TCC_DISABLE, cgts_tcc_disable); My understanding is that this register indicates which TCCs are not functional. So this line should be replaced by cgts_tcc_disable |= RREG32(CGTS_TCC_DISABLE);
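The suggestion above can be sketched as a pure function over the register value. This is an illustration only: the initial mask 0xffff0000 is an assumption (the full patch is not quoted here), the function name is hypothetical, and hw_value stands in for what RREG32(CGTS_TCC_DISABLE) would return on real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Build a TCC disable mask: clear the bit for each TCC the ASIC
 * nominally has (the loop from the patch), then OR in whatever the
 * hardware register already reports as disabled.
 * ASSUMPTION: the starting mask covers bits 31:16. */
uint32_t tcc_disable_mask(uint32_t hw_value, unsigned max_tcc)
{
    assert(max_tcc <= 16);
    uint32_t mask = UINT32_C(0xffff0000);
    for (unsigned i = 0; i < max_tcc; i++)
        mask &= ~(UINT32_C(1) << (16 + i));
    return mask | hw_value;   /* keep bits the hardware marks non-functional */
}
```

With 12 nominal TCCs and a register value that marks TCC 2 as disabled, bit 18 stays set in the result instead of being cleared by the loop.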
(In reply to comment #78) > (In reply to comment #76) > > this should be: > > > > + for (i = 0; i < rdev->config.cik.max_texture_channel_caches; i++) > > + cgts_tcc_disable &= ~(1 << (16 + i)); > > Why? This is si_gpu_init(). > whoops, I was thinking about CIK at the time. disregard my comment.
Created attachment 98377 [details] /var/log/Xorg.0.log I also tried rebuilding the kernel with the Tahiti Fix, but it still doesn't work.
(In reply to comment #74)
> Created attachment 98257 [details] [review] [review]
> Fix v2
>
> Pali, I have sent this patch to the mailing list for review, can you confirm
> that it fixes the issue for you.

Hello, I applied this patch on top of mesa, but it does not work :-( The X server shows only a black screen, and in dmesg I see this:

[ 31.269778] radeon 0000:05:00.0: GPU fault detected: 147 0x09e25201
[ 31.269785] radeon 0000:05:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0E6B9BCF
[ 31.269788] radeon 0000:05:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02052001
[ 31.269792] VM fault (0x01, vmid 1) at page 241933263, read from CB_CMASK (82)

There are no errors in Xorg.0.log. After killing X, these lines appear in dmesg:

[ 307.388090] radeon 0000:05:00.0: GPU lockup CP stall for more than 276120msec
[ 307.388104] radeon 0000:05:00.0: GPU lockup (waiting for 0x0000000000000002 last fence id 0x0000000000000001 on ring 0)
[ 312.832194] pci_pm_runtime_suspend(): radeon_pmops_runtime_suspend+0x0/0xc0 [radeon] returns -22
[ 320.270503] detected fb_set_par error, error code: -22

When I start X again, it crashes immediately and Xorg.0.log contains these errors:

[ 320.199] drmOpenDevice: node name is /dev/dri/card0
[ 320.199] drmOpenDevice: open result is -1, (Invalid argument)
[ 320.199] drmOpenByBusid: Searching for BusID pci:0000:05:00.0
[ 320.199] drmOpenDevice: node name is /dev/dri/card0
[ 320.199] drmOpenDevice: open result is -1, (Invalid argument)
...
[ 320.270] (EE) RADEON(0): [drm] Failed to open DRM device for pci:0000:05:00.0: No such file or directory
[ 320.270] (EE) RADEON(0): Kernel modesetting setup failed
[ 320.270] (II) UnloadModule: "radeon"
[ 320.270] (II) Unloading radeon
[ 320.270] (EE) Screen(s) found, but none have a usable configuration.
[ 320.270] Fatal server error:
[ 320.270] no screens found

The old patch (which can be applied to an older mesa version) worked fine without any problem. Note that I did not change the kernel; I am still using the same version, 3.14-rc1.
I tried Ubuntu 14.04 with a 7870 XT and the bug did not occur. It seems this issue is fixed in the latest Ubuntu release.
With the latest mesa from git plus patch v2 I'm still getting a black screen with these errors in dmesg: [ 36.661540] radeon 0000:05:00.0: GPU fault detected: 147 0x06625201 [ 36.661548] radeon 0000:05:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0A4FB8B3 [ 36.661551] radeon 0000:05:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02052001 [ 36.661555] VM fault (0x01, vmid 1) at page 172996787, read from CB_CMASK (82) @Tom Stellard: Can you look at it?
Created attachment 104448 [details] [review] diff between my patch and patch from comment #74 I commented out the lines si_pm4_set_reg(pm4, GRBM_GFX_INDEX, SE_INDEX(i) | SH_BROADCAST_WRITES); and si_pm4_set_reg(pm4, GRBM_GFX_INDEX, SE_BROADCAST_WRITES); in the patch from comment #74 and my Radeon HD 7730 started working :-) glamor and OpenGL 3 are working fine. The attachment is the diff between my patch and the patch from comment #74.
*** Bug 79231 has been marked as a duplicate of this bug. ***
You're the man, Pali! I just tried the modified patch and it works for me too. Glamor, OpenGL and vdpau seem to be working perfectly now!
Created attachment 106006 [details] [review] Fix v3 Thanks for tracking down the bug with v2. Can you try this patch?
Is this patch supposed to apply cleanly against mesa 10.1.5? I'm getting the following build error: In file included from ../../../../src/gallium/auxiliary/util/u_inlines.h:41:0, from ../../../../src/gallium/auxiliary/pipebuffer/pb_buffer.h:49, from ../../winsys/radeon/drm/radeon_winsys.h:43, from si_pm4.h:30, from si_state.h:30, from si_pipe.h:29, from si_state.c:27: si_state.c: In function 'si_init_config': si_state.c:3291:49: error: 'struct radeon_info' has no member named 'max_sh_per_se' unsigned sh_per_se = MAX2(sctx->screen->b.info.max_sh_per_se, 1); ^ ../../../../src/gallium/auxiliary/util/u_math.h:767:27: note: in definition of macro 'MAX2' #define MAX2( A, B ) ( (A)>(B) ? (A) : (B) ) ^ si_state.c:3291:49: error: 'struct radeon_info' has no member named 'max_sh_per_se' unsigned sh_per_se = MAX2(sctx->screen->b.info.max_sh_per_se, 1); ^ ../../../../src/gallium/auxiliary/util/u_math.h:767:37: note: in definition of macro 'MAX2' #define MAX2( A, B ) ( (A)>(B) ? (A) : (B) ) ^ si_state.c:3292:46: error: 'struct radeon_info' has no member named 'max_sh_per_se' unsigned num_se = MAX2(sctx->screen->b.info.max_sh_per_se, 1); ^ ../../../../src/gallium/auxiliary/util/u_math.h:767:27: note: in definition of macro 'MAX2' #define MAX2( A, B ) ( (A)>(B) ? (A) : (B) ) ^ si_state.c:3292:46: error: 'struct radeon_info' has no member named 'max_sh_per_se' unsigned num_se = MAX2(sctx->screen->b.info.max_sh_per_se, 1); ^ ../../../../src/gallium/auxiliary/util/u_math.h:767:37: note: in definition of macro 'MAX2' #define MAX2( A, B ) ( (A)>(B) ? (A) : (B) )
(In reply to comment #88) > Is this patch supposed to apply cleanly against mesa 10.1.5? No, looks like it's for Git master, should probably apply against the 10.3 branch at least though. (In reply to comment #87) > Fix v3 [...] > + for (i = 0; i < num_se; i++) { > + si_pm4_set_reg(pm4, GRBM_GFX_INDEX, > + SE_INDEX(i) | > + SH_BROADCAST_WRITES | > + INSTANCE_BROADCAST_WRITES); > + si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, raster_config); > + } Since this uses the same raster_config value for all SEs, couldn't it just use a single write with SE_BROADCAST_WRITES enabled in GRBM_GFX_INDEX? If not: > + unsigned sh_per_se = MAX2(sctx->screen->b.info.max_sh_per_se, 1); > + unsigned num_se = MAX2(sctx->screen->b.info.max_sh_per_se, 1); sh_per_se and num_se have the same value. Should one of them be calculated differently, or does a single variable suffice?
Created attachment 106097 [details] Fix v4 Here is an updated patch that addresses Michel's comments.
Created attachment 106113 [details] [review] Another approach If Tom's v4 patch doesn't work, you can try this patch on top of it. If that still doesn't work, please provide the stderr debugging output about raster_config.
Created attachment 106530 [details] [review] Fix v5 Can you try this patch? I've merged Michel's patch with mine and it works on my Verde. Even if this patch works for you could you still post the output when running glxgears?
Hello,
I've got a Radeon HD 7870 XT running under Arch Linux. It works fine with the Catalyst driver, but fails with the open-source driver. Without the patches from this bug (patch 3&4 or 5) my computer hangs: the signal to the monitor is cut and the monitor suspends, logging in through ssh is impossible, and the system logs are cut off before X launches. After applying a patch (3&4 or 5) the screen goes black, but the monitor doesn't suspend. I can log in with ssh, and after killing Xorg.bin the console is visible and operational. After killing X the kernel logs show:

[ 1868.387061] radeon 0000:01:00.0: ring 0 stalled for more than 395470msec
[ 1868.387066] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000005 last fence id 0x0000000000000001 on ring 0)
[ 1868.695731] [drm:si_dpm_set_power_state] *ERROR* si_disable_ulv failed

A stack trace of the running X server shows it's waiting for some fence inside the radeon DRI driver. I'm attaching the dmesg and Xorg log from starting and killing the X server with patch 5.
Created attachment 106562 [details] kernel logs with patch v5
Created attachment 106563 [details] xorg.log with mesa-git and patch 5
What happens if you try only this patch: https://bugs.freedesktop.org/attachment.cgi?id=106097 Also do you see any output when you start X? The best way to check is to ssh into the system and then run startx.
Mesa-git with patch v4

startx output:

X.Org X Server 1.16.0
Release Date: 2014-07-16
X Protocol Version 11, Revision 0
Build Operating System: Linux 3.15.5-2-ARCH x86_64
Current Operating System: Linux pecet 3.16.2-1-ARCH #1 SMP PREEMPT Sat Sep 6 13:12:51 CEST 2014 x86_64
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-linux root=UUID=b02f6229-160f-4999-9979-82f4e15dacff rw quiet
Build Date: 31 July 2014 11:53:19AM
Current version of pixman: 0.32.6
Before reporting problems, check http://wiki.x.org to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting, (++) from command line, (!!) notice, (II) informational, (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Sat Sep 20 02:13:24 2014
(==) Using system config directory "/usr/share/X11/xorg.conf.d"
(II) [KMS] Kernel modesetting enabled.

The screen stays blank and on.

After kill -9 on Xorg.bin:

[root@pecet ~]# kill -9 425
[root@pecet ~]#

XIO: fatal IO error 2 (No such file or directory) on X server ":0" after 18 requests (18 known processed) with 0 events remaining.
xset: unable to open display ":0"
xsetroot: unable to open display ':0'
startkde: Starting up...
xprop: unable to open display ':0'
xprop: unable to open display ':0'
Connecting to deprecated signal QDBusConnectionInterface::serviceOwnerChanged(QString,QString,QString)
kdeinit4: Can not connect to the X Server.
kdeinit4: Might not terminate at end of session.
QDBusConnection: session D-Bus connection created before QCoreApplication. Application may misbehave.
QDBusConnection: session D-Bus connection created before QCoreApplication. Application may misbehave.
kded4: cannot connect to X server :0
kded(476): Communication problem with "kded" , it probably crashed.
Error message was: "org.freedesktop.DBus.Error.NoReply" : " "Message did not receive a reply (timeout by message bus)" "
kcminit_startup: cannot connect to X server :0
unnamed app(481): Cannot connect to the X server
ksmserver: cannot connect to X server :0
startkde: Shutting down...
klauncher: Exiting on signal 1
startkde: Running shutdown scripts...
xprop: unable to open display ':0'
xprop: unable to open display ':0'
startkde: Done.

and the monitor turns off.

Dmesg output:

[  243.740994] radeon 0000:01:00.0: ring 0 stalled for more than 211963msec
[  243.741009] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000003 last fence id 0x0000000000000001 on ring 0)

A second run of startx causes a hard hang: the monitor turns on, the startx output ends with "[KMS] Kernel modesetting enabled.", after a few seconds the monitor turns off and the system stops responding to ssh.

Mesa-git built from master with options:

./autogen.sh --prefix=/usr \
  --sysconfdir=/etc \
  --with-dri-driverdir=/usr/lib/xorg/modules/dri \
  --with-gallium-drivers=radeonsi \
  --with-dri-drivers=radeon \
  --with-egl-platforms=x11,drm,wayland \
  --enable-llvm-shared-libs \
  --disable-gallium-egl \
  --disable-gallium-gbm \
  --enable-egl \
  --enable-gbm \
  --enable-gallium-llvm \
  --enable-shared-glapi \
  --enable-glx-tls \
  --enable-dri \
  --enable-glx \
  --enable-osmesa \
  --enable-gles1 \
  --enable-gles2 \
  --enable-texture-float \
  --enable-xa \
  --enable-vdpau \
  --enable-xvmc \
  --enable-dri3 \
  --enable-omx \
  --enable-opencl \
  --enable-opencl-icd \
  --with-clang-libdir=/usr/lib
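For anyone comparing several kernel logs in this thread, the "ring N stalled" lines can be pulled out mechanically. This is only a sketch: it assumes the message wording quoted in the comment above, and the name `find_ring_stalls` is made up for illustration.

```python
import re

# Matches kernel log lines such as:
#   radeon 0000:01:00.0: ring 0 stalled for more than 211963msec
STALL_RE = re.compile(
    r"radeon (?P<pci>[0-9a-f:.]+): ring (?P<ring>\d+) stalled "
    r"for more than (?P<msec>\d+)msec"
)


def find_ring_stalls(log_text):
    """Return (pci_address, ring, milliseconds) for each stall line found."""
    return [
        (m.group("pci"), int(m.group("ring")), int(m.group("msec")))
        for m in STALL_RE.finditer(log_text)
    ]


sample = (
    "[  243.740994] radeon 0000:01:00.0: ring 0 stalled for more than 211963msec\n"
    "[  243.741009] radeon 0000:01:00.0: GPU lockup (waiting for "
    "0x0000000000000003 last fence id 0x0000000000000001 on ring 0)\n"
)
print(find_ring_stalls(sample))
```

Feeding it the text of a saved dmesg file gives a quick list of which rings stalled and for how long.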
Łukasz, can you attach (as opposed to paste) the startx output with patch v5? Tom, BTW, what happened to your kernel patches?
(In reply to comment #97) > Mesa-git with patch v4 > startx output: Are you sure that your X server is using mesa-git and not your system Mesa? I don't see any of the printfs from the patch in your output.
(In reply to comment #98) > Łukasz, can you attach (as opposed to paste) the startx output with patch v5? > > Tom, BTW, what happened to your kernel patches? I don't think the user with the bad Tahiti ever tested them.
I'll verify the library paths and attach the X logs/output with patch v5. Should I apply the Tahiti fix (https://bugs.freedesktop.org/attachment.cgi?id=98258) to my kernel?
(In reply to comment #101) > I'll verify library paths and attach x logs/outs with patch v5 > > Should I apply Tahiti Fix > (https://bugs.freedesktop.org/attachment.cgi?id=98258) to my kernel ? Sure.
(In reply to comment #102) > > Should I apply Tahiti Fix > > (https://bugs.freedesktop.org/attachment.cgi?id=98258) to my kernel ? > > Sure. Tom, did you see comment 78?
I've tested tahiti-fix.patch with kernel version 3.16-3; it does not help. I added debug output to it:

radeonsi cgts_tcc_disable: -268435456

That is the value of the register on my card.

Patch v5 doesn't resolve the problem either. After adding more printfs, it seems that si_init_config goes into:

if (rb_mask && util_bitcount(rb_mask) >= num_rb) {

so si_write_harvested_raster_configs is not called. After commenting out that if, the Xorg log output from si_write_harvested_raster_configs shows:

Original raster_config = 0x2a00126a, rb_mask = 0xff

Attachments:
- startx-0410-2.out - output of startx with -verbose 9, tahiti-fix kernel, mesa-git (c74be01e80fcdd7feabc0f27df4aebe66abb626e) with patch v5 + additional fprintfs
- kernel-0410-2.out - dmesg, additional debug in tahiti-fix: radeonsi cgts_tcc_disable
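The debug value above is easier to read as an unsigned 32-bit number, and the reported branch can be re-checked by hand. The sketch below is only illustrative: `as_u32` and `bitcount` are hypothetical helpers (the latter standing in for Mesa's util_bitcount), and `num_rb = 8` is an assumption, since the thread does not state the value.

```python
def as_u32(value):
    """Reinterpret a signed 32-bit integer as unsigned."""
    return value & 0xFFFFFFFF


def bitcount(mask):
    """Count the set bits in mask (stand-in for util_bitcount)."""
    return bin(mask).count("1")


# Register value as printed with a signed format in the debug output.
cgts_tcc_disable = as_u32(-268435456)
print(hex(cgts_tcc_disable))  # 0xf0000000

rb_mask = 0xFF
num_rb = 8  # ASSUMPTION: not stated in the thread

# Same shape as the condition quoted from si_init_config.
takes_early_branch = bool(rb_mask) and bitcount(rb_mask) >= num_rb
print(takes_early_branch)
```

With all eight bits of rb_mask set, the condition holds for any num_rb up to 8, which is consistent with the report that si_write_harvested_raster_configs was skipped.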
Created attachment 107322 [details] startx out with -verbose 9, fix v5 + printf's
Created attachment 107323 [details] kernel log with tahiti-fix after starting and killing Xorg.bin
I finally updated to pre-release Fedora 21 which packages mesa 10.3. The "v5 fix" seems to work OK with my 7730 LE (Verde chip). glxgears output attached. (Is there any reason why would the desktop animations feel much smoother and snappier than with the old hackofix?)
Created attachment 108412 [details] glxgears output with "Fix v5" in place on 7730 LE
(In reply to madcatx from comment #107) > (Is there any reason why would the desktop animations feel much smoother and > snappier than with the old hackofix?) Sounds like the 'hackofix' disabled more SEs than necessary, so the card wasn't running as fast as it can.
Michel Dänzer: is this patch going to be included in mesa git?
Looks like the patch was committed to mesa git: http://cgit.freedesktop.org/mesa/mesa/commit/?id=67dcbcd92cb9877a04747d6cf7fef14c2b8af8b3
Is there any chance of this getting backported to stable 10.3 series? The v5 fix works fine for me with 10.3.3.
(In reply to madcatx from comment #112) > Is there any chance of this getting backported to stable 10.3 series? The v5 > fix works fine for me with 10.3.3. Yes, it should show up in the stable releases. From the commit message: CC: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
*** Bug 93023 has been marked as a duplicate of this bug. ***
*** Bug 92518 has been marked as a duplicate of this bug. ***
*** Bug 87728 has been marked as a duplicate of this bug. ***
Based on the latest duplicates, this still doesn't seem fixed for Tahiti XT. :( I'll attach a v2 of the Tahiti kernel fix.
Created attachment 119959 [details] [review] Tahiti fix v2 Does this help on 7870 XT?
Created attachment 119985 [details] v2 patch dmesg No dice with patch, attached another dmesg. Maybe I applied the patch wrong?
Hi, I was the person who reported Bug 92518 and I switched to Fedora 23 to try and compile drm, xf86-video-ati, and mesa from the git master. I am assuming the "Tahiti fix v2" patch is for mesa, but I have no idea how to apply the patch. After some googling, I guess I am supposed to use: git-apply --directory=<path to file of interest> The problem is that in the patch, I am supposed to look for a path to the file called a/drivers/gpu/drm/radeon/si.c, but that file si.c does not exist in the mesa tree when I ran the command: find <mesa source folder> si.c I need some help with this so that I can test the patch.
AFAIK the patch has to be applied to the Linux kernel, not to mesa. That's how I did it, at least, but it didn't seem to fix the issue. I'm using Arch Linux btw, so I had to add a command to the PKGBUILD in order to apply the patch during the build process.
Thanks summerrainbowz, I had no idea that was supposed to be for the kernel. I can also confirm summerrainbowz's results. My dmesg output looks different though. My current setup is Fedora 23. I compiled and installed mesa, xf86-video-ati, and drm from git master. I also applied the "Tahiti fix v2" patch to vanilla kernel 4.2 to test. The kernel config settings were identical to Fedora's config. I also took nomodeset out of the grub boot parameters when testing. I am uploading the Xorg and dmesg logs as attachments.
Created attachment 120010 [details] dmesg.log with kernel 4.2 with Tahiti fix v2 patch
Created attachment 120011 [details] Xorg.0.log with kernel 4.2 with Tahiti fix v2 patch
I forgot to link the pictures I took of the graphical corruptions. First picture: http://i65.tinypic.com/5xlxj9.jpg Second Picture: http://i68.tinypic.com/2yv9k3l.jpg
I can confirm same problem with Tahiti LE. Mesa 11.1.1-1 from Manjaro Linux repository.
Created attachment 121633 [details] Dmesg Kernel 4.5RC3 Dmesg Kernel 4.5RC3
Created attachment 121634 [details] Xorg.0.log
Just wondering guys, is there really nothing that can be done in order to fix this issue?
madmalkav, your dmesg log had messages from the fglrx kernel driver... not saying that *is* your problem but it definitely can't help... any chance you can test on a vanilla system that hasn't had fglrx installed ? [ 9.614226] <6>[fglrx] Maximum main memory to use for locked dma buffers: 7714 MBytes. [ 9.614510] <6>[fglrx] vendor: 1002 device: 679e revision: 0 count: 1 [ 9.615005] <6>[fglrx] ioport: bar 4, base 0xe000, size: 0x100 [ 9.615248] <6>[fglrx] Kernel PAT support is enabled [ 9.615263] <6>[fglrx] module loaded - fglrx 15.20.3 [Sep 8 2015] with 1 minors
(In reply to John Bridgman from comment #130) > madmalkav, your dmesg log had messages from the fglrx kernel driver... not > saying that *is* your problem but it definitely can't help... any chance you > can test on a vanilla system that hasn't had fglrx installed ? > > [ 9.614226] <6>[fglrx] Maximum main memory to use for locked dma buffers: > 7714 MBytes. > [ 9.614510] <6>[fglrx] vendor: 1002 device: 679e revision: 0 count: 1 > [ 9.615005] <6>[fglrx] ioport: bar 4, base 0xe000, size: 0x100 > [ 9.615248] <6>[fglrx] Kernel PAT support is enabled > [ 9.615263] <6>[fglrx] module loaded - fglrx 15.20.3 [Sep 8 2015] with 1 > minors I can't at the moment as I need this computer for work. I can tell you the system has had problems with the OSS driver since minute one, i.e. I had to choose the option to use a kernel with proprietary drivers just to be able to install Linux on this machine. If any other option would be valid (booting the installer media with the OSS driver, or installing a kernel without the fglrx module) I'll gladly try that. If not, I will try to format the system and do vanilla tests as soon as possible.
I get updates to this report via the dri-devel mailing list, please don't add me to the CC list.
I've set up a bounty for fixing this bug. Quantity is quite low, sorry. If anyone else affected by this bug can throw some bucks into this, I think it can help to get a solution sooner. https://www.bountysource.com/issues/5643054-radeonsi-x11-can-t-start-with-acceleration-enabled
(In reply to madmalkav from comment #131) > (In reply to John Bridgman from comment #130) > > madmalkav, your dmesg log had messages from the fglrx kernel driver... not > > saying that *is* your problem but it definitely can't help... any chance you > > can test on a vanilla system that hasn't had fglrx installed ? > > > > [ 9.614226] <6>[fglrx] Maximum main memory to use for locked dma buffers: > > 7714 MBytes. > > [ 9.614510] <6>[fglrx] vendor: 1002 device: 679e revision: 0 count: 1 > > [ 9.615005] <6>[fglrx] ioport: bar 4, base 0xe000, size: 0x100 > > [ 9.615248] <6>[fglrx] Kernel PAT support is enabled > > [ 9.615263] <6>[fglrx] module loaded - fglrx 15.20.3 [Sep 8 2015] with 1 > > minors > > I can't at the moment as I need this computer for working. I can tell you > the system had problems with OSS driver since minute 1, i.e. I had to use > the option to use a kernel with propietary drivers in order to manage to > install Linux in this machine. You should at least set modprobe.blacklist=fglrx on the kernel command line.
(In reply to madmalkav from comment #133) > I've set up a bounty for fixing this bug. Quantity is quite low, sorry. If > anyone else affected by this bug can throw some bucks into this, I think it > can help to get a solution sooner. > > https://www.bountysource.com/issues/5643054-radeonsi-x11-can-t-start-with- > acceleration-enabled I have upped it to $100. I know this is very low amount of money for the amount of skill required to fix something like this, but I still hope that it will motivate someone.
It is great that there is a bounty, but the developers are actually asking for more information to actually be able to solve the problem. madmalkav, I think Marek is asking you to blacklist the fglrx driver temporarily at boot up so that the system reverts to the open source drivers if those drivers haven't been blacklisted already.
I have been testing and reporting my results on the IRC channel; I didn't mention anything here because nothing interesting happened. Blacklisting the module makes no difference. I can upload logs if needed, but they are the same, just without the fglrx lines. Also, I'm trying to repeat the tests of the user in comment #104, but I'm totally unable to get any debug message I insert in si_state.c to show up in any log. Any tips or a link to some form of "Mesa debugging for dummies" would be greatly appreciated.
I don't think we should use X if we know that the GPU driver is totally broken. Piglit should be used for such testing.

How to build it:
1) Mesa should be built with: --with-egl-platforms=x11,drm
   This is also required for X acceleration, so it should be set already.
2) Build and install waffle: git://github.com/waffle-gl/waffle
3) Build piglit (no install): https://cgit.freedesktop.org/piglit/
   Configure it with ccmake and enable waffle.

How to get ready:
1) Boot with the "text" kernel parameter (disables X) and also add "radeon.lockup_timeout=0" to prevent the kernel driver from trying to recover from GPU hangs.
2) Go to the piglit/bin directory.
3) Type: export PIGLIT_PLATFORM=gbm

Tests to run:
1) If this works, most things will work: ./fbo-generatemipmap-formats -auto
2) Something simpler: ./ext_transform_feedback-position -auto
3) You can invoke very simple internal driver tests by setting GALLIUM_TESTS=1. This will exit before the program can do something, so the executable doesn't matter. For example: GALLIUM_TESTS=1 ./ext_transform_feedback-position

Diagnosing GPU hangs:
If the GPU hangs during these tests, you can see errors in dmesg. I recommend using radeontop for an overview of which GPU hw blocks are busy. If some blocks report 100% activity for no reason, they are stuck. Which blocks are stuck is the first piece of information we need to know. Then, we need to know if any internal driver tests pass if you run something with GALLIUM_TESTS=1 (see above).
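The test invocations listed above could be scripted roughly as follows. This is a sketch, not part of piglit: `piglit_invocation` is a hypothetical helper that only assembles the argument list and environment named in the comment, without running anything.

```python
import os


def piglit_invocation(test_name, gallium_tests=False, base_env=None):
    """Build (argv, env) for one piglit test binary run from piglit/bin.

    PIGLIT_PLATFORM=gbm is always set. With gallium_tests=True,
    GALLIUM_TESTS=1 is added and -auto is dropped, mirroring the
    example command in the comment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PIGLIT_PLATFORM"] = "gbm"
    argv = ["./" + test_name]
    if gallium_tests:
        env["GALLIUM_TESTS"] = "1"
    else:
        argv.append("-auto")
    return argv, env


argv, env = piglit_invocation("fbo-generatemipmap-formats", base_env={})
print(argv, env)
```

The result can be handed to `subprocess.run(argv, env=env, cwd=...)` with `cwd` pointing at the piglit/bin directory of your own build.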
Created attachment 122201 [details] attachment-32429-0.html Thanks for the great explanation, Marek. I will try to start tests today but probably I won't have enough time until weekend.
*** Bug 71689 has been marked as a duplicate of this bug. ***
Created attachment 122212 [details] Mesa debug file after tests with Marek on the IRC
CP works. Shaders don't work. The hardware hangs in the vertex shader. The draw call doesn't even enable the rasterizer.

radeon/si.c:si_setup_spi looks very wrong to me:
- The function sets SPI_STATIC_THREAD_MGMT_3, which only configures CUs for LS and HS stages.
- I don't understand why SPI_STATIC_THREAD_MGMT_3 is set 16 times?
- SPI_STATIC_THREAD_MGMT_1 (PS,VS) and SPI_STATIC_THREAD_MGMT_2 (GS,ES) are not set at all.

It looks like that's the root cause of this bug.
Created attachment 122225 [details] [review] possible fix

Can you test the attached patch?

How to build the kernel:
- use "git clone" to get the kernel source
- go to the kernel directory
- git am $patch_filename # apply the patch
- cp /boot/config-`uname -r` .config # copy your current kernel config
- make -j4
- sudo make modules_install
- sudo make install
Created attachment 122240 [details] Dump with Marek patch
Can you please attach dmesg with the patch?
You don't have to run the tests. Dmesg after boot is sufficient.
Created attachment 122244 [details] Dmesg from the boot I took the debug dump
What I had thought was incorrect kernel code is actually correct and hw folks confirmed it. To be completely honest with you, I have absolutely no idea why Tahiti LE driver support is broken.
Is there any new information about this bug? I would like to use my GPU again.
Created attachment 124323 [details] New Mesa dump with Kernel 4.7rc1, mesa-git, llvm-svn
Created attachment 124324 [details] dmesg after the mesa dump No more "Ring stalled..." messages with kernel 4.7rc1
Tahiti LE is still broken, after 3 years. Not trying to be a dick, but FFS, at this point ALL the other cards based on very similar chips(7950, 7970, 280X) work very well, and yet this one is still broken. I mean really, once you have all the work well done for all similar cards, how can it be so hard to bring this one to life? I just upgraded from a HD4890 to a Tahiti LE card. I did not bother to check how well it is supported under Linux before buying it, because I assumed that all GCN1 cards are very well supported via radeonsi. Now I found out, that my card is probably the only one that is not supported at all, so to say I am mad would be an understatement.
(In reply to madmalkav from comment #151) > Created attachment 124324 [details] > dmesg after the mesa dump > > No more "Ring stalled..." messages with kernel 4.7rc1 madmalkav, any more recent findings?
Nothing. More people doing tests would surely help, and getting one of these cards to a developer would be great, but I can't afford that at the moment. Come on, AMD, you surely have one or two in a basement...
At this point people might wanna try amdgpu driver too from agd5f tree... i think i saw some commits for harvested chips there maybe week ago.
Or it was 3 weeks ago :D anyway who knows some magic touches like this might change something: https://cgit.freedesktop.org/~agd5f/linux/commit/?h=amd-staging-4.7&id=d207295db45b576eddf60749c0c24fc8528f3c80
Created attachment 127226 [details] Xorg log with latest 4.9-wip kernel and mesa master branch I've tried agd5f's 4.9 wip branch with the latest Mesa. Wflinfo fails with an "amdgpu: unkown family" error, there is no Gallium dump log, and dmesg shows this error: [ 353.579289] wflinfo[1529]: segfault at 8 ip 00007f8260c8dd6a sp 00007ffe2f8d74f0 error 4 in radeonsi_dri.so[7f8260912000+8fc000] If I try to start X, similar errors appear.
(In reply to madmalkav from comment #157) > I've tried agd5f's 4.9 wip beanch with latest Mesa. Wflinfo fails with an > "amdgpu: unkown family" error, no Gallium dump log, error on dmesg: Which Mesa Git commit is that exactly? It looks like it's before SI support was added to the amdgpu winsys code. Double-check that you're really building current Git master, and that your self-built radeonsi_dri.so is getting picked up.
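One way to check which radeonsi_dri.so a running process has actually loaded is to look at its memory map. The sketch below assumes the usual Linux /proc/<pid>/maps layout (address range, permissions, offset, device, inode, path); `mapped_libraries` is a hypothetical helper, and the sample line is invented, using the DRI driver directory from the configure options quoted earlier in the thread.

```python
def mapped_libraries(maps_text, needle):
    """Return unique file paths from /proc/<pid>/maps text that contain needle."""
    paths = []
    for line in maps_text.splitlines():
        # Six whitespace-separated fields; the path is the last one.
        fields = line.split(None, 5)
        if len(fields) == 6 and needle in fields[5] and fields[5] not in paths:
            paths.append(fields[5])
    return paths


# Invented sample line for illustration only.
sample_maps = (
    "7f8260912000-7f826120e000 r-xp 00000000 08:01 1234 "
    "/usr/lib/xorg/modules/dri/radeonsi_dri.so\n"
    "7f826120e000-7f8261230000 rw-p 008fc000 08:01 1234 "
    "/usr/lib/xorg/modules/dri/radeonsi_dri.so\n"
    "7ffe2f8b8000-7ffe2f8d9000 rw-p 00000000 00:00 0 [stack]\n"
)
print(mapped_libraries(sample_maps, "radeonsi_dri.so"))
```

For a live process, read the text from `/proc/<pid>/maps` (which may need root) and compare the reported path with the install prefix of the self-built Mesa.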
(In reply to Michel Dänzer from comment #158) > Double-check that you're really building current Git master, and that your > self-built radeonsi_dri.so is getting picked up. You are probably right, but for personal reasons I won't continue testing things for this bug. I hope some of other affected users step up and continue the tests. I will remain subscribed to the bug so I can grant the bounty if someone gets to fix it.
Created attachment 128182 [details] openSuse Tumbleweed - Linux 4.8.10 So I tested my card again by installing openSuse Tumbleweed on my computer again and I don't know what happened, but I am now able to reach the KDE login screen. While the computer is booting up, there are graphical corruptions so I checked dmesg to see if the card is really ok or not. The driver still has some trouble with the card, but I am able to get into a graphical environment. How can I help to completely fix the issue with this card?
Ben, I'm afraid the initialization of the conflicting parts of the card is just delayed in your current install, that's why it fails later. Hope I'm wrong. If you want support for your tests, #radeon @ irc.freenode.net is always full of people that will give you a hand with that.
I just bit the bullet and swapped out my Radeon HD 7870 XT for an RX 480 due to this issue. I'll happily donate the 7870 to any developer who would like to put it to use fixing this bug. Get in touch in the next week or so if interested. Otherwise I'll try and sell it before Christmas, it's still a good card for Windows gaming.
Created attachment 128452 [details] dmesg for linux 4.9 amdgpu driver The attachment is the dmesg of Linux 4.9 with the AMDGPU driver in use. After startx, the colorful scrambled screen appeared again, but it stayed still instead of resetting the screen every 5-6 seconds as with the radeon driver. I issued a reboot command via an SSH shell; it returned to the old framebuffer correctly and rebooted successfully, unlike in the radeon driver case. The dmesg file is the original one, not a journalctl capture, since journalctl sometimes omits some messages. This line was written after startx: [drm] xxxx: dce_v6_0_afmt_setmode ----no impl !!!!!!!! That line was also omitted by journalctl...
*** Bug 70779 has been marked as a duplicate of this bug. ***
4.10.0-28-generic [AMD/ATI] Tahiti LE [Radeon HD 7870 XT] I have the same problem with this card. On amdgpu I got black screen. But monitor remains on. I can log in via ssh and all. On radeon first I get screen full of colourful pixels, then black screen, then monitor says no-signal and goes stand-by. And then pc hangs and I can no longer log in via ssh. On radeon there are a lot of radeon 0000:01:00.0: ring 3 stalled for more than 10036msec kinda logs in dmesg. But on amdgpu there is nothing interesting really. It seems like it almost works, except for the black screen ;) I think I'm gonna try fresh kernel from padoka ppa next. When I have some free time.
Created attachment 133231 [details] dmesg amdgpu kernel 4.10
Created attachment 133232 [details] Xorg.log amdgpu kernel 4.10
Created attachment 133233 [details] dmesg radeon kernel 4.10
Created attachment 134388 [details] dmesg radeon kernel 4.12.13 Not sure if it helps, but here's another dmesg output from radeon + kernel 4.12.13. What does work though is the fallback to llvmpipe, and I end up with a working system + graphics.
Created attachment 134576 [details] journalctl -k amdgpu with linux-amd-staging-git Also no luck with AMDGPU. Here's the kernel output for booting with linux-amd-staging-git 4.12.0-2a69a4b35621, (on Arch Linux, see also https://aur.archlinux.org/packages.php?ID=442065).
Created attachment 135238 [details] journalctl_radeon_4.14.0-041400rc7 [AMD/ATI] Tahiti LE [Radeon HD 7870 XT] 4.14.0-041400rc7-generic from: http://kernel.ubuntu.com/~kernel-ppa/mainline/ OpenGL version string: 3.0 Mesa 17.3.0-rc2 - padoka PPA Basically still the same, black screen then: radeon 0000:01:00.0: ring 0 stalled for more than 10244msec And then: radeon 0000:01:00.0: GPU reset succeeded, trying to resume That's the last line in log file, after this I can no longer connect via ssh.
Created attachment 135239 [details] journalctl_amdgpu_4.14.0-041400rc7 I've had more luck with amdgpu. Well kinda. It finally boots without nomodeset (starting from 4.13 kernel). But falls back to software rendering (Device: llvmpipe). It also complains that: amdgpu 0000:01:00.0: SI support provided by radeon. amdgpu 0000:01:00.0: Use radeon.si_support=0 amdgpu.si_support=1 to override. But when I start it with: modprobe.blacklist=radeon radeon.si_support=0 amdgpu.si_support=1 screen freezes, gnome doesn't start and it produces output from attachment. Something about "dead whales": lis 04 19:35:39 pc gnome-session-binary[1369]: CRITICAL: We failed, but the fail whale is dead. Sorry.... There is also something about: lis 04 19:34:09 pc org.gnome.Shell.desktop[1440]: amdgpu_device_initialize: Cannot parse ASIC IDs, 0xffffffea./usr/share/libdrm/amdgpu.ids: No such file or directory
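To confirm that the override parameters really reached the kernel, the booted command line can be inspected. A minimal sketch, assuming a simple space-separated command line without quoted values; both function names are hypothetical.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def amdgpu_si_override_active(cmdline):
    """True if both parameters from the amdgpu log message are set as suggested."""
    params = parse_cmdline(cmdline)
    return (
        params.get("radeon.si_support") == "0"
        and params.get("amdgpu.si_support") == "1"
    )


sample = "quiet modprobe.blacklist=radeon radeon.si_support=0 amdgpu.si_support=1"
print(amdgpu_si_override_active(sample))
```

On a running system the input would be the contents of /proc/cmdline.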
Created attachment 138230 [details] attachment-10505-0.html As it has failed to attract any developer attention for two years, I have cancelled the bountysource reward.
Created attachment 143276 [details] journalctl-b0-radeonsi-4.20.6.log
Created attachment 143277 [details] Xorg-radeonsi-4.20.6.log
Created attachment 143278 [details] journalctl-b0-amdgpu-4.20.6.log
Created attachment 143279 [details] Xorg-amdgpu-4.20.6.log
Does booting with pci=noats on the kernel command line in grub help?
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1208.