Bug 65192

Summary: [r600g] Screensavers lock up machine (screen goes blank, keyboard unresponsive, sound loops; sysrq/ssh possible)
Product: Mesa
Reporter: Luzipher <luziphermcleod>
Component: Drivers/Gallium/r600
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
QA Contact:
Severity: major    
Priority: medium    
Version: git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:

Description Luzipher 2013-05-30 22:35:39 UTC
For a while now, some screensavers have occasionally locked up my machine: all screens go blank, the keyboard is unresponsive (the Num Lock key doesn't toggle the indicator LED), and the sound loops. However, I can still log in via ssh from a remote machine, and the magic SysRq keys also work.

There is nothing relevant in either /var/log/messages (captured via netconsole) or /var/log/Xorg.0.log.

I can trigger the bug (or regression; I think it used to work about 2 months ago) reliably with the Xfce4 screensaver settings application, which shows a preview of the selected screensaver. When switching between screensavers, the lockup occurs quickly. A good candidate is the screensaver named "AntMaze": it almost always locks up my machine (it has only worked once so far).
Other things work quite reliably; Half-Life 2 ran for multiple hours today.

Setting R600_HYPERZ=0 in /etc/environment didn't help. The lockup also occurs on kernel 3.8.0-rc7.

I'm happy to provide more info if needed.



System Specs:
Intel Core i7-965
2x Radeon HD4870 (rv770, currently only one active without xorg.conf), 2 Monitors

Gentoo Linux
Kernel 3.10.0-rc3
Mesa 9.2.0 (git-60f9b72) git commit 60f9b722ef80c499a94b4e5ab7304dcd739ea569 Revert "i965: fix problem with constant out of bounds access (v2)"
xorg-server-1.14.1
libdrm git commit 8a88e349975a64676f143183e835e6d296f29627 modetest: Make RGB565 pwetty too
xf86-video-ati git commit bd2557ea5ef84b975060e929d5ece53ec464336f DRI2: add interpolated blanks to frame number in event handlers
Comment 1 Luzipher 2013-05-30 22:39:55 UTC
I forgot to mention: I use the classic shader compiler at the moment and I have egl, gles1, gles2, vdpau and wayland enabled.

Full flags on portage:
[ebuild   R   #] media-libs/mesa-9999::x11  USE="egl gallium gles1 gles2 llvm nptl shared-glapi vdpau wayland -abiwrapper -bindist -classic -debug -gbm -opencl -openvg -osmesa -pax_kernel -pic -r600-llvm-compiler (-selinux) -xa -xorg -xvmc" MULTILIB_ABI="amd64 x86" PYTHON_SINGLE_TARGET="python2_7 -python2_6" PYTHON_TARGETS="python2_7 -python2_6" VIDEO_CARDS="r600 (-freedreno) -i915 -i965 -ilo -intel -nouveau -r100 -r200 -r300 -radeon -radeonsi -vmware"
Comment 2 Michel Dänzer 2013-05-31 09:19:25 UTC
(In reply to comment #0)
> Setting R600_HYPERZ=0 in /etc/environment didn't help.

Are you sure this ended up in the environment of the screensaver hacks? Can you try e.g.

R600_HYPERZ=0 /usr/lib/xscreensaver/antmaze

manually to see if that works better? Same thing might be interesting for R600_DEBUG=sb (or R600_LLVM=0, but AFAICT you're building without the r600g LLVM compiler backend anyway?).
Comment 3 Luzipher 2013-05-31 16:31:51 UTC
I did as suggested and ran:

R600_HYPERZ=0 /usr/lib/misc/xscreensaver/antmaze

This worked more often, but I could make it crash when using different screensavers. Using

R600_HYPERZ=0 /usr/bin/xscreensaver-demo

and switching between screensavers crashed as often as without R600_HYPERZ=0.
Also, the screensaver juggler3d crashed twice when I closed the window it ran in.

R600_DEBUG=sb didn't help either.

I also confirmed that environment variables are working as they should via setting GALLIUM_HUD.
Comment 4 Michel Dänzer 2013-06-03 14:42:17 UTC
(In reply to comment #3)
> R600_HYPERZ=0 /usr/bin/xscreensaver-demo
> 
> and switching between screensavers crashed as often as without R600_HYPERZ=0.

Beware that AFAIK the screensaver hacks aren't spawned from the xscreensaver-demo process but from the xscreensaver daemon process. Did you confirm that the latter saw the environment variable?
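
A minimal sketch of one way to check this, assuming a Linux /proc filesystem: /proc/<pid>/environ holds the environment a process was started with, as NUL-separated NAME=value entries. The helper name environ_value and the pgrep lookup of a process named xscreensaver are illustrative assumptions, not something xscreensaver provides.

```shell
# Print the value of variable NAME from a NUL-separated environ file
# (the format of /proc/<pid>/environ on Linux). Prints nothing if the
# variable is not set. environ_value is a hypothetical helper name.
# Values that contain newlines are not handled.
environ_value() {
    file="$1"
    name="$2"
    # Turn NUL separators into newlines, then strip the "NAME=" prefix
    # from the matching entry and print only that entry.
    tr '\000' '\n' < "$file" | sed -n "s/^${name}=//p"
}

# Usage (assumes a running daemon process named "xscreensaver" that you own):
#   pid=$(pgrep -o -x xscreensaver)
#   environ_value "/proc/$pid/environ" R600_HYPERZ
```

This reports the environment as it was when the process started, so a variable added to /etc/environment would typically only show up after a fresh login and a restart of the daemon.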
Comment 5 Luzipher 2013-06-03 17:02:16 UTC
(In reply to comment #4)
> Beware that AFAIK the screensaver hacks aren't spawned from the
> xscreensaver-demo process but from the xscreensaver daemon process. Did you
> confirm that the latter saw the environment variable?

I'm very sure that the environment variable is seen. Today, I added it to my /etc/environment again, restarted, and confirmed it is there by echo $R600_HYPERZ, which printed 0, as expected.
I'm starting X manually (with startx), so I'm quite certain that the xscreensaver daemon process also sees the variable.

All the following tests have been done with R600_HYPERZ=0 in /etc/environment.

As I mentioned in my previous comment, starting antmaze with the command
R600_HYPERZ=0 /usr/lib/misc/xscreensaver/antmaze
works more often. In fact I couldn't get it to crash in about 25 tries today.

What did crash was again juggler3d on the third try when _closing_ its window. The command used was:
R600_HYPERZ=0 /usr/lib/misc/xscreensaver/juggler3d

Also, by using the xscreensaver-demo application (with R600_HYPERZ=0 in /etc/environment) and switching between random screensavers, I could trigger a crash quickly.
Comment 6 Luzipher 2013-06-03 17:17:57 UTC
More info:
I tried again with juggler3d and it crashed first time upon exit. flurry also triggered the crash on exit.
So I now suspect that antmaze isn't problematic - it is not starting a screensaver that causes the crash, but closing one. As I switched screensavers randomly in xscreensaver-demo, I just by chance clicked on antmaze when I had a problematic screensaver selected previously.
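
Since the lockup seems tied to a hack exiting rather than starting, the start/close cycle can be repeated in a loop. This is only a sketch: repeat_start_stop is a made-up helper name, and ending the hack with SIGTERM is an assumption that may not hit the same teardown path as closing its window by hand.

```shell
# Run a command for SECS seconds, terminate it, and repeat COUNT times.
# repeat_start_stop is a hypothetical helper name.
# Assumption: SIGTERM (the default signal of kill) is close enough to
# closing the window; the real teardown path may differ.
repeat_start_stop() {
    count="$1"
    secs="$2"
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        echo "run $i of $count: $*"
        sync                             # flush pending writes in case the machine locks up
        "$@" &                           # start the command in the background
        pid=$!
        sleep "$secs"
        kill "$pid" 2>/dev/null || true  # ask it to exit
        wait "$pid" 2>/dev/null || true  # reap it; ignore its exit status
        i=$((i + 1))
    done
}

# Usage, with the path from the comments above:
#   repeat_start_stop 20 5 env R600_HYPERZ=0 /usr/lib/misc/xscreensaver/juggler3d
```

Redirecting the output to a file makes it possible to see after a reboot which iteration was the last one to start.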
Comment 7 Luzipher 2013-06-03 18:02:13 UTC
Update: The crash was just triggered by a fullscreen YouTube video, after a few seconds of playback.
Comment 8 Michel Dänzer 2013-06-04 17:06:19 UTC
(In reply to comment #0)
> [...] the bug (or regression, I think it used to work about 2 months ago) 

Can you try confirming that, e.g. by trying Mesa from the 9.1 branch or an older snapshot from master?
Comment 9 Luzipher 2013-06-04 23:26:52 UTC
Unfortunately I couldn't confirm my suspicion. But I am still quite sure I did not have these problems earlier: I used screensavers and YouTube last year and didn't notice regular crashes. I use this Linux installation frequently (it has been my main OS for several years now), so I really would have noticed.
Maybe it's another package? X, libdrm, radeon-ucode and xf86-video-ati come to mind. radeon-ucode seems the most likely to me, as I have the feeling that the problems started at about the same time as the buzz about the UVD code drop.

Here are the tests I did (I could reproduce the crash on each of these by starting juggler3d directly and closing its window, mostly on the first or second try):

mesa-9.0.3 (forgot to get glxinfo)

mesa-9.0.1.ebuild, glxinfo:
OpenGL renderer string: Gallium 0.4 on AMD RV770
OpenGL version string: 3.0 Mesa 9.0.1
OpenGL shading language version string: 1.30

mesa-8.0.4-r1.ebuild, glxinfo:
OpenGL renderer string: Gallium 0.4 on AMD RV770
OpenGL version string: 2.1 Mesa 8.0.4
OpenGL shading language version string: 1.20
With 8.0.4, I got only garbage output and a lot of these messages:
radeon: The kernel rejected CS, see dmesg for more information.
dmesg:
[ 1580.805418] radeon 0000:02:00.0: r600_cs_track_validate_cb invalid tiling 6 for 0 (0x08110668)
[ 1580.805463] radeon 0000:02:00.0: r600_packet3_check:1720 invalid cmd stream 573
[ 1580.805465] [drm:radeon_cs_ib_chunk] *ERROR* Invalid command stream !
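
To pull just these rejections out of a longer kernel log, a small filter along these lines may help. It is a sketch: radeon_cs_errors is a hypothetical helper name, and the patterns are simply the strings from the messages above, which other kernel versions may word differently.

```shell
# Read kernel log text on stdin and print only the lines that report a
# rejected radeon command stream. radeon_cs_errors is a hypothetical
# helper name; the patterns are copied from the dmesg output quoted above.
radeon_cs_errors() {
    # grep exits non-zero when nothing matches; treat that as "no errors".
    grep -E 'r600_cs_track_|invalid cmd stream|Invalid command stream' || true
}

# Usage:
#   dmesg | radeon_cs_errors
```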



I also tried 8.0.4 with the oldest kernel I have, a vanilla 3.4.0-rc6. Even there I could get the same crash after closing the window with the garbage output.
Comment 10 Luzipher 2013-07-30 23:38:37 UTC
I think this bug is fixed.

I retested for this issue on current mesa git (commit 7568a89500c35f14cbd397f87c77acc915afc672) on kernel 3.10.0-rc7. I could start and exit juggler3d at least 20 times without a crash, and I could also use the xscreensaver-demo application to switch between screensaver previews.

Sorry for not being able to nail down the issue better, but I have limited time.
