Bug 47519 - Xorg hangs in ioCtl in path of RADEONDownloadFromScreenCS
Summary: Xorg hangs in ioCtl in path of RADEONDownloadFromScreenCS
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Radeon
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Default DRI bug account
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 47512
Depends on:
Blocks:
 
Reported: 2012-03-19 08:54 UTC by Torsten Krah
Modified: 2012-06-12 09:43 UTC

Attachments
first trace (18.19 KB, text/plain)
2012-03-19 08:54 UTC, Torsten Krah
2nd trace (13.48 KB, text/plain)
2012-03-19 08:54 UTC, Torsten Krah
Xorg.log (56.41 KB, text/plain)
2012-03-19 08:55 UTC, Torsten Krah
dmesg (53.02 KB, text/plain)
2012-03-19 09:02 UTC, Torsten Krah
screenshot with artifacts (28.68 KB, image/png)
2012-03-22 04:28 UTC, Torsten Krah

Description Torsten Krah 2012-03-19 08:54:26 UTC
Created attachment 58691 [details]
first trace

Hi,

Using gnome-shell, I am suffering many hangs a day.
Attached is a full backtrace taken while it hangs.

Even more serious: I am unable to get back to work afterwards and must reboot. When I restart X, the new X server hangs in the main loop (looks like 47512, but I don't know if they are related at all).
The first stack trace was taken at hang time (I was only scrolling a website in Firefox when it happened). The second one was taken after I called "service lightdm restart", i.e. after a new X server start.
I am using the xorg-edgers PPA and the 3.3.0 precise kernel.
Comment 1 Torsten Krah 2012-03-19 08:54:51 UTC
Created attachment 58692 [details]
2nd trace
Comment 2 Torsten Krah 2012-03-19 08:55:25 UTC
Created attachment 58693 [details]
Xorg.log
Comment 3 Alex Deucher 2012-03-19 09:00:16 UTC
Please attach your dmesg output.
Comment 4 Torsten Krah 2012-03-19 09:01:48 UTC
Some notes (it may be chance, but it feels like it does have some influence):
With this config it still hangs from time to time (e.g. once or twice a day):

Section "Device"
        Identifier  "Card0"
	Option	"EnablePageFlip"        "off"  #supported on all R/RV/RS4xx and older hardware and set on by default
	Option	"ColorTiling"           "off"  #enabled by default on RV300 and later radeon cards.
	Option	"EXAPixmaps"            "off"  #when on increases 2D performance, but may also cause artifacts on some old cards
	Option	"AccelDFS"              "off"  #default is on, read the radeon manpage for more information
EndSection

If I remove this config or set everything to "on", it happens really frequently (often enough to make me report it here to seek help :-) ).
Comment 5 Torsten Krah 2012-03-19 09:02:18 UTC
Created attachment 58694 [details]
dmesg
Comment 6 Alex Deucher 2012-03-19 09:10:47 UTC
Is this a regression?  If so what components did you change and what was the last working version?
Comment 7 Torsten Krah 2012-03-19 09:21:30 UTC
I have suffered from this since I changed from lucid to oneiric, where gnome2 was removed and I switched to gnome3.
Because it froze very often, I switched to lxde, where it does not seem to happen. I am back on gnome3 now to see if the error is gone (seems not ;-) ).

So the combination of gnome-shell and Xorg has had this problem for me since the beginning.
Maybe it's a regression, maybe not; I don't know. Is there any other help or information I can provide?
Comment 8 Michel Dänzer 2012-03-20 04:07:12 UTC
> When I restart X, the new X server hangs in the main loop (looks like 47512,
> but I don't know if they are related at all).

Might well be one and the same problem.


(In reply to comment #4)
> If I remove this config or set everything to "on", it happens really frequently
> (often enough to make me report it here to seek help :-) ).

Option "AccelDFS" has no effect with KMS (the log file incorrectly claims otherwise). It might be interesting to narrow down which of the other three options really make a difference for you. My first guess would be page flipping.


(In reply to comment #5)
> dmesg

Did you capture this after a hang occurred? If so, it looks like we can rule out the hangs being due to GPU lockups.

Does booting with radeon.msi=0 on the kernel command line work around the problem?
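
For anyone trying this, a minimal sketch for confirming that the parameter actually reached the running kernel, by parsing the contents of /proc/cmdline. The helper name `radeon_msi_setting` is made up for illustration, and it only reports what was passed at boot, not whether the driver honored it.

```python
from typing import Optional


def radeon_msi_setting(cmdline: str) -> Optional[str]:
    """Return the value given for radeon.msi on a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "radeon.msi" and sep:
            # If the parameter is repeated, keep the last one seen.
            value = val
    return value
```

Typical use would be `radeon_msi_setting(open("/proc/cmdline").read())`, which should give "0" after booting with radeon.msi=0 and None when the parameter was not passed at all.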


(In reply to comment #7)
> I have suffered from this since I changed from lucid to oneiric, where gnome2 was
> removed and I switched to gnome3.

Was lucid even using KMS yet?


> Because it froze very often, I switched to lxde, where it does not seem to
> happen

Not even for OpenGL apps? Is the GNOME3 fallback mode affected now?
Comment 9 Torsten Krah 2012-03-20 04:59:37 UTC
> (In reply to comment #4)
> > If I remove this config or set everything to "on", it happens really frequently
> > (often enough to make me report it here to seek help :-) ).
> 
> Option "AccelDFS" has no effect with KMS (the log file incorrectly claims
> otherwise). It might be interesting to narrow down which of the other three
> options really make a difference for you. My first guess would be page
> flipping.

OK, it also happens with all of them off. But yes, it feels like page flipping.

> 
> 
> (In reply to comment #5)
> > dmesg
> 
> Did you capture this after a hang occurred? If so, it looks like we can rule
> out the hangs being due to GPU lockups.

Yes, I took it after the hang. It hung again today with all options off; the dmesg is similar. And the trace I got after the hang is the same as in 47512, so yes, it looks like the same problem.
Should the other one be marked as a duplicate?

> 
> Does booting with radeon.msi=0 on the kernel command line work around the
> problem?

I'll try it (with all options on again) and will report.

> 
> 
> (In reply to comment #7)
> > I have suffered from this since I changed from lucid to oneiric, where gnome2 was
> > removed and I switched to gnome3.
> 
> Was lucid even using KMS yet?

With a backported kernel it was possible, afaik. But that comment was only meant to say that I don't know whether this is a regression, so it should not matter at all.

> 
> 
> > Because it froze very often, I switched to lxde, where it does not seem to
> > happen
> 
> Not even for OpenGL apps? Is the GNOME3 fallback mode affected now?

Hm, it might happen for OpenGL apps. Any hints as to which app I could use to provoke it?
I have not tried GNOME3 fallback mode yet, but I will test it too (with msi=1 and msi=0).
All test cycles will take some time; I guess I will have a result in the next few days.
Comment 10 Torsten Krah 2012-03-22 04:27:32 UTC
First feedback:

After running for 3 days with radeon.msi=0, the hang seems to be gone. I have not hit this problem since.
I am using this config now:

/etc/X11/xorg.conf.d/20-radeon:

Section "Device"
        Identifier  "Card0"
        Option  "EnablePageFlip"        "on"
        Option  "ColorTiling"           "on"
        Option  "EXAPixmaps"            "off"
        Option  "AccelDFS"              "on"
        Option  "MigrationHeuristic" "greedy"
EndSection


"EXAPixmaps" is "off" because otherwise I sometimes see many artifacts on the screen. Maybe it's the greedy option; I did not test the two separately. See the attached screenshot for the artifacts I am seeing; maybe someone can tell me whether there are other options I can try here besides EXAPixmaps "off".

I'll test gnome3 fallback + lxde with OpenGL in the next few days and provide additional feedback.
Comment 11 Torsten Krah 2012-03-22 04:28:44 UTC
Created attachment 58857 [details]
screenshot with artifacts
Comment 12 Michel Dänzer 2012-03-23 01:18:55 UTC
(In reply to comment #10)
> After running for 3 days with radeon.msi=0, the hang seems to be gone. I have
> not hit this problem since.

Okay, so I'm reassigning this to the kernel driver for now, though this might just be a duplicate of another MSI related report here or in the kernel bugzilla.


> "EXAPixmaps" is "off" because otherwise I sometimes see many artifacts on the
> screen. Maybe it's the greedy option; I did not test the two separately.

Option "MigrationHeuristic" has no effect with KMS. Disabling EXAPixmaps probably just avoids the problem because it prevents hardware acceleration in most cases. The corruption looks like bug 47266; it would be nice if somebody could find a working driver snapshot and bisect.
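
To make it easier to see which lines of a Device section matter under KMS, here is a rough sketch that parses the Option lines and flags the two options this report says have no effect with KMS (AccelDFS and MigrationHeuristic). The function names and the `IGNORED_WITH_KMS` set are illustrative assumptions, not part of the radeon driver, and the list is only as complete as the comments in this report.

```python
import re

# Options that, per the comments in this report, have no effect with KMS.
# Names are compared case-insensitively here.
IGNORED_WITH_KMS = {"acceldfs", "migrationheuristic"}

# Matches: Option "Name" "value"   (the value is optional)
OPTION_RE = re.compile(r'^\s*Option\s+"([^"]+)"(?:\s+"([^"]*)")?', re.IGNORECASE)


def parse_options(section_text):
    """Return {option name: value} for every Option line in the text."""
    options = {}
    for line in section_text.splitlines():
        # Drop trailing comments; assumes no '#' inside a quoted value.
        line = line.split("#", 1)[0]
        match = OPTION_RE.match(line)
        if match:
            options[match.group(1)] = match.group(2)
    return options


def ignored_with_kms(options):
    """Return the sorted option names that this report says KMS ignores."""
    return sorted(name for name in options if name.lower() in IGNORED_WITH_KMS)
```

Run against the config from comment 10, this would leave EnablePageFlip, ColorTiling and EXAPixmaps as the options still worth toggling one at a time.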
Comment 13 Jeremy Huddleston Sequoia 2012-06-12 09:15:31 UTC
*** Bug 47512 has been marked as a duplicate of this bug. ***
Comment 14 Michel Dänzer 2012-06-12 09:43:52 UTC
Current Linux kernels disable MSI by default on these cards.
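
To check whether a given system still uses MSI for the card, one option is to look at the radeon row in /proc/interrupts. The sketch below assumes the usual layout, where an MSI interrupt carries a controller name containing "MSI" (for example PCI-MSI-edge) and a legacy interrupt shows an IO-APIC name instead. The helper name `radeon_uses_msi` is hypothetical.

```python
def radeon_uses_msi(interrupts_text):
    """Return True/False for MSI on the radeon IRQ, or None if no radeon row exists."""
    rows = [line for line in interrupts_text.splitlines() if "radeon" in line]
    if not rows:
        return None  # driver not loaded, or no interrupt registered
    # A row such as " 45:  12345  0  PCI-MSI-edge  radeon" indicates MSI.
    return any("MSI" in row for row in rows)
```

Typical use would be `radeon_uses_msi(open("/proc/interrupts").read())`.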

