Bug 35697 - System locks up when watching fullscreen flash video
Status: RESOLVED FIXED
Product: DRI
Classification: Unclassified
Component: DRM/Radeon
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium critical
Assignee: Default DRI bug account
Reported: 2011-03-26 09:21 UTC by Nikos Chantziaras
Modified: 2011-07-26 05:28 UTC (History)


Attachments
possible fix (4.68 KB, patch)
2011-03-27 13:23 UTC, Alex Deucher
config-2.6.39.1 (51.79 KB, text/plain)
2011-06-17 10:40 UTC, Nikos Chantziaras

Description Nikos Chantziaras 2011-03-26 09:21:10 UTC
(I have no idea whether I filled in the bug details correctly; I'm simply assuming this is DRI related. Could be a mesa bug or xf86-video-ati problem, or whatever; I've no idea.)

There is a regression in kernel 2.6.38 that was not there in 2.6.37: When switching a flash video to fullscreen, three things can happen:

 * The system locks up completely (not even SysRq works)

 * The screen becomes black but the system is still
   responding (switching to a console and restarting X works)

 * The screen stops updating (old windows are still there even
   after closing them.)

There's no output in dmesg or in Xorg.0.log when any of the above happens.

I'm running Gentoo AMD64, kernel 2.6.38.1, x11-drivers/xf86-video-ati Git master, mesa Git master, xorg-server 1.9.5.  My graphics card is a Radeon HD4870.  I'm using KMS and Gallium and I'm on KDE 4.6.1 with desktop effects enabled.
Comment 1 Alex Deucher 2011-03-26 09:24:51 UTC
Was it only the kernel that you updated?  E.g., does just booting an earlier kernel with the same userspace drivers fix the problem?  If so, can you bisect?
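
For anyone who does have time to bisect, here is a minimal sketch of how the search might be started. It assumes a git checkout of the kernel that carries the v2.6.37 and v2.6.38 tags, and it assumes (nothing in this report confirms it) that the regression lies under drivers/gpu/drm. The helper name start_drm_bisect is made up for illustration.

```shell
# start_drm_bisect GOOD BAD
# Begin a git bisect between a known-good and a known-bad revision,
# considering only commits that touch drivers/gpu/drm.
# The path restriction is an assumption; drop it if the culprit may be elsewhere.
start_drm_bisect() {
    good="$1"
    bad="$2"
    git bisect start "$bad" "$good" -- drivers/gpu/drm
}
```

Inside the kernel tree this would be called as "start_drm_bisect v2.6.37 v2.6.38". After building and booting each candidate, "git bisect good" or "git bisect bad" narrows the range, and "git bisect reset" ends the session.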
Comment 2 Nikos Chantziaras 2011-03-26 09:27:46 UTC
(In reply to comment #1)
> Was it only the kernel that you updated?

Yes.


> E.g., does just booting an earlier kernel with the same userspace
> drivers fix the problem?

Yes. I also have 2.6.37.5 in my Grub menu. Booting that fixes those problems.


> If so, can you bisect?

Nope, sorry.  I simply don't have any free time currently for bisecting the kernel.
Comment 3 Nikos Chantziaras 2011-03-26 09:31:30 UTC
Forgot to mention the Flash version I'm using:

10.2.153.1_p201011173, 64-bit.
Comment 4 Dave Airlie 2011-03-27 01:29:16 UTC
Let's blame page flipping until proven otherwise for any 2.6.38 problems ;-)

Can you try disabling page flip?

Option "EnablePageFlip" "FALSE" in xorg.conf device section.
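
As a rough sketch of where that option goes, the snippet below writes a complete Device section to a file. Only the Option line comes from this comment. The output path, the Identifier "Radeon" and the choice to generate a separate file are assumptions for illustration; on a real system the Option line would normally be added to the existing Device section.

```shell
# Write an xorg.conf Device section that disables page flipping.
# OUT is a hypothetical output path; merge the section into the real
# xorg.conf (or an xorg.conf.d snippet) by hand.
OUT="${OUT:-./radeon-noflip.conf}"

cat > "$OUT" <<'EOF'
Section "Device"
    Identifier "Radeon"                  # hypothetical name; keep the existing Identifier
    Driver     "radeon"                  # xf86-video-ati driver
    Option     "EnablePageFlip" "FALSE"  # the workaround suggested in this comment
EndSection
EOF
```

X has to be restarted for the changed option to take effect.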
Comment 5 Nikos Chantziaras 2011-03-27 12:58:35 UTC
(In reply to comment #4)
> Let's blame page flipping until proven otherwise for any 2.6.38 problems ;-)
> 
> Can you try disabling page flip?
> 
> Option "EnablePageFlip" "FALSE" in xorg.conf device section.

Indeed that fixed the problem completely.
Comment 6 Alex Deucher 2011-03-27 13:23:58 UTC
Created attachment 44923
possible fix

Does this drm patch fix the issue?
Comment 7 Nikos Chantziaras 2011-03-27 13:41:49 UTC
(In reply to comment #6)
> Does this drm patch fix the issue?

Nope.
Comment 8 Mario Kleiner 2011-03-27 13:55:39 UTC
Pure guesswork, but maybe worth trying:

Can you check in your ~/.kde/share/config/kwinrc if setting the option UnredirectFullscreen to false...

UnredirectFullscreen=false

...and restarting kwin makes any difference?

-mario
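
A minimal sketch of making that change from a shell, for those who prefer not to edit the file by hand. It assumes GNU sed and assumes the key lives in the [Compositing] group of kwinrc, which this comment does not state. The function name set_unredirect_false is made up for illustration.

```shell
# set_unredirect_false FILE
# Force UnredirectFullscreen=false in the given kwinrc-style file.
# Assumes GNU sed and that the key belongs to the [Compositing] group.
set_unredirect_false() {
    file="$1"
    if grep -q '^UnredirectFullscreen=' "$file" 2>/dev/null; then
        # Key already present: rewrite its value in place.
        sed -i 's/^UnredirectFullscreen=.*/UnredirectFullscreen=false/' "$file"
    elif grep -q '^\[Compositing\]$' "$file" 2>/dev/null; then
        # Group present but key missing: add the key right after the group header.
        sed -i '/^\[Compositing\]$/a UnredirectFullscreen=false' "$file"
    else
        # Neither present: append a new group containing the key.
        printf '\n[Compositing]\nUnredirectFullscreen=false\n' >> "$file"
    fi
}
```

It would be called as "set_unredirect_false ~/.kde/share/config/kwinrc", followed by a KWin restart (for example "kwin --replace").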
Comment 9 Alex Deucher 2011-03-27 13:59:16 UTC
Might be related to bug 35452.  You might also try the xserver patch on that bug.
Comment 10 Nikos Chantziaras 2011-03-27 18:48:02 UTC
(In reply to comment #8)
> Pure guesswork, but maybe worth trying:
> 
> Can you check in your ~/.kde/share/config/kwinrc if setting the option
> UnredirectFullscreen to false...
> 
> UnredirectFullscreen=false
> 
> ...and restarting kwin makes any difference?

Yes. This also fixes the issue. Another "issue" appears though when doing this: fullscreen animations are not smooth anymore; there's frameskipping at every exact interval (about one second.) This is not video related though. With compositing active, all animations, including glxgears running in a very small window, exhibit this frameskipping.  The only way to get really even and smooth animations is to run in fullscreen (which suspends compositing by default on KDE.)

So I guess the issue is triggered when KDE unredirects the rendering. But it's strange that it happens only with Flash and not other applications (KDE unredirects them too).


(In reply to comment #9)
> Might be related to bug 35452.  You might also try the xserver patch on that
> bug.

Just tried the patch. Doesn't help.
Comment 11 Mario Kleiner 2011-03-27 19:18:46 UTC
(In reply to comment #10)
> (In reply to comment #8)
> > Pure guesswork, but maybe worth trying:
> > 
> > Can you check in your ~/.kde/share/config/kwinrc if setting the option
> > UnredirectFullscreen to false...
> > 
> > UnredirectFullscreen=false
> > 
> > ...and restarting kwin makes any difference?
> 
> Yes. This also fixes the issue. Another "issue" appears though when doing this:
> fullscreen animations are not smooth anymore; there's frameskipping at every
> exact interval (about one second.) This is not video related though. With
> compositing active, all animations, including glxgears running in a very small
> window, exhibit this frameskipping.  The only way to get really even and smooth
> animations is to run in fullscreen (which suspends compositing by default on
> KDE.)
> 
> So I guess the issue is triggered when KDE unredirects the rendering. But it's
> strange that it happens only with Flash and not other applications (KDE
> unredirects them too).
> 
> 
> (In reply to comment #9)
> > Might be related to bug 35452.  You might also try the xserver patch on that
> > bug.
> 
> Just tried the patch. Doesn't help.

If UnredirectFullscreen=false fixes the problem then it really sounds like it is bug 35452 and that xserver patch should help. Without that patch and unredirected rendering you should observe screen corruption in most other fullscreen apps as well, esp. when they switch out of fullscreen mode. At least Compiz is almost unusable without that patch with unredirected fullscreen windows.

The frameskipping with redirected windows is probably because the compositor doesn't run at the same composition rate as the redraw rate of the app or the refresh rate of the monitor.
Comment 12 Nikos Chantziaras 2011-03-27 19:28:16 UTC
(In reply to comment #11)
> [...]
> If UnredirectFullscreen=false fixes the problem then it really sounds like it
> is bug 35452 and that xserver patch should help. Without that patch and
> unredirected rendering you should observe screen corruption in most other
> fullscreen apps as well, esp. when they switch out of fullscreen mode. At least
> Compiz is almost unusable without that patch with unredirected fullscreen
> windows.

This isn't the case here. There's no screen corruption whatsoever without that patch. The problem I'm having with Flash never results in actual screen corruption either. It's mostly a black screen, a system freeze or the screen just "hangs" and only the mouse is able to move. I never observe any corruption (which I understand to mean "random garbage" on the screen.)


> The frameskipping with redirected windows is probably because the compositor
> doesn't run at the same composition rate as the redraw rate of the app or the
> refresh rate of the monitor.

I've set MaxFPS=60 in kwinrc and my monitor runs at 60Hz. I guess KWin isn't good at keeping things synced, and that's why it skips frames.
Comment 13 Jana Saout 2011-03-28 02:01:51 UTC
This sounds very similar to the issue I am having. Note that I have been seeing this for quite a while now, ever since I switched to the gallium r300 driver. At first I blamed it on most of my components being bleeding-edge (kernel drm, userspace libraries and so on). The stuff slowly got merged into the kernel and distributions, but the issue persists. I had high hopes for the page-flipping bug fix in 2.6.38, but it didn't fix the issue I am seeing.

Most of the time, at some point X would just stop updating the screen (sound continues working and everything else seems to be alive). Actually, I traced it down to compiz being unable to update the screen. Ctrl-Alt-F1 to the console, "killall -9 compiz" and starting another non-GL window manager, the desktop is usable just fine (in 80% of the cases; I also had occasional total lockups, where I had to hard-reset the machine).
Most of the time, this issue is triggered by heavy screen updates.  I tend to not run Flash full-screen, but this is the kind of screen updates that is likely to trigger the issue. Sometimes I can go for 1 day without that type of crash, sometimes I can get it within 30 minutes.

After killing compiz (when the issue got triggered), GL is totally unusable. Also, while the whole desktop is then fully functional, a simple glxgears would just lock up, both via AIGLX or directly. Killing the X server resolves the issue, but sometimes even that won't help; everything GL-related then seems to be totally locked up and only a reboot can resolve the issue.

I am also running 64-bit Gentoo and only out-of-the-box software versions (2.6.38, mesa-7.10.1) at the moment, and sane compiler flags (-O2 -march=core2) on a Lenovo T60 (Mobility X1400, r500 family).

I haven't tried disabling page flipping or any patches yet, since I just stumbled upon this bug report.

So, I'm not sure if this is the same or a different issue. It seems to be something in the kernel locking up, and in most of the cases userspace is still running, so there might be a chance of being able to debug it without resorting to guesswork. Since the thing started when I switched to the gallium version, I wouldn't know how to start bisecting.
Comment 14 Michel Dänzer 2011-03-28 03:00:50 UTC
(In reply to comment #13)
> Actually, I traced it down to compiz being unable to update the screen.
> Ctrl-Alt-F1 to the console, "killall -9 compiz" and starting another non-GL
> window manager, the desktop is usable just fine.

That really sounds like bug 35452.
Comment 15 Jana Saout 2011-03-29 07:19:04 UTC
Hmm, I wouldn't have made the connection with "graphical glitches" and also I'm not disabling redirection in fullscreen mode, but I'm still trying the attached patch. I applied it yesterday evening and X has been running ever since. If it stays that way for the next few days I will call you my personal hero. ;) I'll come back otherwise.
Comment 16 Alex Deucher 2011-03-29 07:27:42 UTC
(In reply to comment #15)
> Hmm, I wouldn't have made the connection with "graphical glitches" and also I'm
> not disabling redirection in fullscreen mode, but I'm still trying the attached
> patch. I applied it yesterday evening and X has been running ever since. If it
> stays that way for the next few days I will call you my personal hero. ;) I'll come
> back otherwise.

Just to be clear, which patch are you referring to?  The one from bug 35452 or the one from comment 6 of this bug?
Comment 17 Jana Saout 2011-03-29 08:07:17 UTC
I was referring to Bug 35452.  But bad news: That fix didn't make a difference, I just experienced the same GL lockup again.  Just for completeness I'll also apply the patch from comment 6 in addition to see if that makes any difference.
Comment 18 Jana Saout 2011-03-29 08:26:51 UTC
Oh, well, that patch only concerns rs600, right?  So no point in that...
Comment 19 Alex Deucher 2011-03-29 08:30:16 UTC
(In reply to comment #18)
> Oh, well, that patch only concerns rs600, right?  So no point in that...

That rs600 code is shared by r5xx-r7xx (same regs on all asics in that range), so it applies to your card.
Comment 20 Jana Saout 2011-03-29 23:43:09 UTC
Applied that patch to my kernel yesterday and just experienced the issue again. Killed compiz, X went on, but GL is unusable. If I run e.g. glxgears, the application doesn't draw anything, but is still running and seems to be waiting for something from the X server. Closing the window results in some error "34" IIRC (I can write down the exact error next time if it helps).
Comment 21 Nikos Chantziaras 2011-05-13 04:58:33 UTC
I just tried kernel 2.6.39-rc7 and with page flipping enabled, the driver hangs with that version too.
Comment 23 Nikos Chantziaras 2011-06-17 10:00:48 UTC
(In reply to comment #22)
> Does this patch fix the issue:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=498c555f56a02ec1059bc150cde84411ba0ac010

Nope.  (I applied it to 2.6.39.1.)
Comment 24 Nikos Chantziaras 2011-06-17 10:40:40 UTC
Created attachment 48109
config-2.6.39.1

Just in case it matters (though probably unlikely), here's my .config. It's very minimal and only includes stuff I need, so it should build very quickly (5 minutes on a 2.7 GHz Core 2 Duo.) Of course chipset and USB drivers would probably need to change if you don't happen to run an X38-based mainboard.
Comment 25 Nikos Chantziaras 2011-07-14 13:43:10 UTC
I just found out that this seems to happen only with the 64-bit Flash plugin, version 10.2.159.1.  To reproduce the crash, here's the download for that Flash version:

http://download.macromedia.com/pub/labs/flashplayer10/flashplayer10_2_p3_64bit_linux_111710.tar.gz

The new version of 64-bit Flash (11.0.something) seems to use a different method to display video (a pretty useless method, which tears in fullscreen and is extremely slow, making fullscreen videos unwatchable.)
Comment 26 Sérgio M. Basto 2011-07-14 14:03:56 UTC
(In reply to comment #25)
> To reproduce the crash, here's the download for that Flash
> version:
> 
> http://download.macromedia.com/pub/labs/flashplayer10/flashplayer10_2_p3_64bit_linux_111710.tar.gz

> The new version of 64-bit Flash (11.0.something) seems to use a different
> method to display video (a pretty useless method, which tears in fullscreen and
> is extremely slow, making fullscreen videos unwatchable.)

Yeah, I have had this problem since November/December with my Intel i915, along with various problems after upgrades of glibc... I still have it (system locks up when watching fullscreen flash video). Well, I don't like lockups, so I don't test it often.
This is with Fedora 14 and 15.
Comment 27 Nikos Chantziaras 2011-07-26 05:28:44 UTC
Something changed somewhere (no idea where), and I can't reproduce it anymore.