Created attachment 125226 [details]
Photograph of screen when the problem occurs

This is an odd one. As of posting, I have only managed to reproduce this with a Virtual Programming OpenGL 3.2 based game, such as The Witcher 2. I have not been able to reproduce it with Valve's Source games.

For example, if you run The Witcher 2 in windowed mode and let it reach the menu screen, doing anything with another window (such as resizing it) causes the display to turn to junk. Eventually the whole X11 display becomes corrupt, and only rebooting the system clears it. Text-mode TTYs are unaffected.

Occasionally, rather than corruption, the system locks up, requiring a hard reset or a magic SysRq reset. I could not find a way to reproduce the lockup consistently, whereas I can consistently reproduce the corruption.

Specs:
Xubuntu 16.04.1 LTS
Kernel 4.0.0-31-generic, x86_64
Mesa 12.1.0-devel (Padoka PPA)
DRM 2.43.0
GPU Radeon HD 7750 (CAPE VERDE PCI 1002:683F), 1GB VRAM
Xorg 1.18.3
Desktop Xfce4 4.12, using builtin compositor

Disabling the compositor makes no difference. Restarting Xorg when the fault occurs makes no difference.

The bug is not present in the 11.2.0 release of Mesa shipped with Ubuntu 16.04 by default.
Created attachment 125228 [details] Part of kern.log that was written at lockup
So far I have only reproduced this bug with two of our games: The Witcher 2 and Overlord (due to be released today). Witcher 2 causes the display corruption seen in the screenshot. Overlord produces corruption quickly followed by a system lockup. Alt-SysRq-B rebooted the system, and I have attached the last messages written to kern.log.
Can you bisect Mesa? Note that Xorg needs to be restarted before testing each commit, as glamor might contribute to the problem.
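For anyone attempting the bisect requested above, a session could be sketched roughly as below. This is only an outline, assuming a local Mesa git checkout that you build and install yourself. The function names (`start_mesa_bisect`, `verdict_for`) and the result labels (`corrupt`, `lockup`, `clean`) are made up for illustration, and the good/bad refs are placeholders you would need to substitute with refs matching the known-good 11.2.0 build and the known-bad 12.1.0-devel build.

```shell
# Sketch only: run from inside a Mesa git checkout.
# start_mesa_bisect and verdict_for are hypothetical helper names.

# Begin a bisect between a known-good and a known-bad ref.
# Usage: start_mesa_bisect <good-ref> <bad-ref>
start_mesa_bisect() {
    good_ref="$1"
    bad_ref="$2"
    # "git bisect start <bad> <good>" takes the bad ref first.
    git bisect start "$bad_ref" "$good_ref"
}

# Map the outcome of one manual test run to a bisect verdict.
# The labels "corrupt", "lockup" and "clean" are illustrative only.
verdict_for() {
    case "$1" in
        corrupt|lockup) echo bad ;;
        clean)          echo good ;;
        *)              echo skip ;;  # e.g. the commit failed to build
    esac
}

# For each commit git checks out:
#   1. build and install Mesa,
#   2. restart Xorg (glamor might contribute to the problem),
#   3. run the game and resize another window,
#   4. record the result, for example:
#        git bisect "$(verdict_for corrupt)"
```

Restarting Xorg at every step matters because the X server itself uses the driver under test, so a stale server could mask or fake a result.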
Not easily, as I am not currently compiling Mesa myself but using the version built by the Padoka PPA for Ubuntu.
Today I gave the latest Padoka build (git1600721124400.4f89cf4) a try while debugging Overlord. I got screen corruption followed by a hang. Just as I was about to reach for the SysRq keys, Xorg restarted and seemed to operate as normal, except that I was now on llvmpipe instead. A large amount of kernel log output was generated, and there is some relevant-looking information in Xorg.log too. Both are attached here.
Created attachment 125257 [details] Latest kern.log from crash described
Created attachment 125258 [details] Xorg log from same crash
Created attachment 125261 [details]
Part of kern.log written during forcing a GPU reset

I've found that force resetting the GPU clears the corruption, using:

    cat /sys/kernel/debug/dri/0/radeon_gpu_reset

This does not take effect straight away, however; I often have to kill Xorg before it happens. Some errors are written to the kernel log, and I have attached those.

Here is the lspci output for the card in case it's relevant:

    01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde PRO [Radeon HD 7750/8740 / R7 250E] [1002:683f] (prog-if 00 [VGA controller])
        Subsystem: ASUSTeK Computer Inc. Cape Verde PRO [Radeon HD 7750/8740 / R7 250E] [1043:0427]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 26
        Region 0: Memory at d0000000 (64-bit, prefetchable) [size=256M]
        Region 2: Memory at fde80000 (64-bit, non-prefetchable) [size=256K]
        Region 4: I/O ports at de00 [size=256]
        [virtual] Expansion ROM at fde00000 [disabled] [size=128K]
        Capabilities: <access denied>
        Kernel driver in use: radeon
        Kernel modules: radeon
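The reset step described in this comment can be wrapped in a small helper that checks the debugfs node is readable before touching it. This is a minimal sketch, not a recommended procedure: `force_gpu_reset` is a hypothetical name, the default path is the one quoted in the comment (card index 0), and it assumes debugfs is mounted at /sys/kernel/debug and that you are running with enough privileges to read it.

```shell
# Hypothetical helper; the default node path is the one from the report.
# Usage: force_gpu_reset [path-to-reset-node]
force_gpu_reset() {
    node="${1:-/sys/kernel/debug/dri/0/radeon_gpu_reset}"
    if [ ! -r "$node" ]; then
        echo "cannot read $node (is debugfs mounted, and are you root?)" >&2
        return 1
    fi
    # Per the report, reading the node is what requests the reset.
    cat "$node"
}
```

As noted in the comment, the reset may not take effect until Xorg has been killed, so the helper returning successfully does not by itself mean the corruption has cleared.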
Confirmed.

ArchLinux: 4.6.4 x86_64
Mesa: 12.0.1
GPU: AMD Curacao PRO [Radeon R7 370 / R9 270/370 OEM]
Xorg: 1.18.4
Game: Overlord

What makes you think it's a regression in Mesa?
I am not sure exactly what this comment means, so I will answer as thoroughly as I can. I am not 100% sure that "mesa" is the component responsible; it could be drm or radeonsi. I am not familiar enough with the Mesa components to be sure. The problem described does not happen with the Mesa 11.2.0 distribution that Ubuntu Xenial packages by default. It only occurs when Mesa 12.1.0 is installed from a PPA such as Oibaf or Padoka. It also does not occur on Mesa 12.0.1, which is part of a Manjaro (Arch Linux) installation I have on the same machine. The problem does not occur with the nvidia or fglrx binary drivers. The problem is not just an app crash, but display corruption or a kernel panic. This should not be possible from the behaviour of a userspace app. Even if our apps were doing something "wrong", being able to crash the whole machine is a bug.
No, I was asking Michael actually, but thanks for the explanation, it gives a clue. A regression is a situation where something worked before an update and stopped working afterwards. So in your case it worked in Mesa 11 and broke in Mesa 12. Most likely something was broken between releases, which is why Michael asked for a bisect.
I bisected and reverted the problematic commit: https://cgit.freedesktop.org/mesa/mesa/commit/?id=1ebf3c4b6741a3a3a9d46048abe3996cb9a86334
I can confirm the problem is now fixed. Great job :D