Bug 88536

Summary: AMD graphics hardware hangs with a homogeneous coloured screen or a blank screen, and with a chirp coming from the graphics card
Product: Mesa
Component: EGL
Status: RESOLVED INVALID
Reporter: Alberto Salvia Novella <es20490446e>
Assignee: mesa-dev
QA Contact: mesa-dev
Severity: critical    
Priority: highest    
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
URL: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-ati/+bug/881526
Whiteboard:
Attachments: xorg.0.log, dmesg.0.log

Description Alberto Salvia Novella 2015-01-17 15:13:31 UTC
HOW TO REPRODUCE:
- On Ubuntu this happens at random, but more often when playing videos on YouTube with the HTML5 player (https://www.youtube.com/html5) or with the XBMC home theatre software.

RESULT:
- Graphics hang with a homogeneous colour on screen or a blank screen, and with a chirp coming from the graphics card.

/var/log/kern.log (provided by another user):
Oct 25 14:53:46 lt9630 kernel: [ 737.900416] INFO: rcu_sched_state detected stall on CPU 2 (t=15000 jiffies)
Oct 25 14:55:46 lt9630 kernel: [ 857.813946] [fglrx] ASIC hang happened
Oct 25 14:55:46 lt9630 kernel: [ 857.813950] Pid: 1149, comm: Xorg Tainted: P C 3.0.0-12-generic #20-Ubuntu
Oct 25 14:55:46 lt9630 kernel: [ 857.813951] Call Trace:
Oct 25 14:55:46 lt9630 kernel: [ 857.813985] [<ffffffffa00a0d7e>] KCL_DEBUG_OsDump+0xe/0x10 [fglrx]
Oct 25 14:55:46 lt9630 kernel: [ 857.814003] [<ffffffffa00ae20c>] firegl_hardwareHangRecovery+0x1c/0x50 [fglrx]
Oct 25 14:55:46 lt9630 kernel: [ 857.814036] [<ffffffffa013ad59>] ? _ZN4Asic9WaitUntil15ResetASICIfHungEv+0x9/0x10 [fglrx]
Oct 25 14:55:46 lt9630 kernel: [ 857.814068] [<ffffffffa013ad0c>] ? _ZN4Asic9WaitUntil15WaitForCompleteEv+0x6c/0xb0 [fglrx]
Oct 25 14:55:46 lt9630 kernel: [ 857.814099] [<ffffffffa0135fb4>] ? _ZN15ExecutableUnits10CPRingIdleE15idle_WaitMethod12_QS_CP_RING_+0xe4$
Oct 25 14:55:46 lt9630 kernel: [ 857.814127] [<ffffffffa0115acb>] ? _ZN10CMMSurfaceD1Ev+0xcb/0xe0 [fglrx]
Oct 25 14:55:46 lt9630 kernel: [ 857.814158] [<ffffffffa0135e7b>] ? _ZN15ExecutableUnits7PM4idleE15idle_WaitMethod+0x4b/0x90 [fglrx]
Oct 25 14:55:46 lt9630 kernel: [ 857.814187] [<ffffffffa012e8f1>] ? _ZN15QS_PRIVATE_CORE9QsPM4idleE15idle_WaitMethod+0x31/0x60 [fglrx]
Oct 25 14:55:46 lt9630 kernel: [ 857.814215] [<ffffffffa0119d8a>] ? _ZN10QS_PRIVATE11synchronizeEv+0x2a/0x30 [fglrx]
Oct 25 14:55:46 lt9630 kernel: [ 857.814243] [<ffffffffa01233f5>] ? _Z8uCWDDEQCmjjPvjS_+0x3b5/0x10c0 [fglrx]
Oct 25 14:55:46 lt9630 kernel: [ 857.814246] [<ffffffff810871ee>] ? down+0x2e/0x50
Oct 25 14:55:46 lt9630 kernel: [ 857.814266] [<ffffffffa00cc932>] ? firegl_cmmqs_CWDDE_32+0x332/0x440 [fglrx]
Oct 25 14:55:46 lt9630 kernel: [ 857.814285] [<ffffffffa00cb260>] ? firegl_cmmqs_CWDDE32+0x70/0x100 [fglrx]
Oct 25 14:55:46 lt9630 kernel: [ 857.814288] [<ffffffff81282e5a>] ? security_capable+0x2a/0x30
Oct 25 14:55:46 lt9630 kernel: [ 857.814306] [<ffffffffa00cb1f0>] ? firegl_cmmqs_createdriver+0x170/0x170 [fglrx]
Oct 25 14:55:46 lt9630 kernel: [ 857.814322] [<ffffffffa00a9e18>] ? firegl_ioctl+0x1e8/0x250 [fglrx]
Oct 25 14:55:46 lt9630 kernel: [ 857.814333] [<ffffffffa009a9be>] ? ip_firegl_unlocked_ioctl+0xe/0x20 [fglrx]
Oct 25 14:55:46 lt9630 kernel: [ 857.814335] [<ffffffff8117939a>] ? do_vfs_ioctl+0x8a/0x340
Oct 25 14:55:46 lt9630 kernel: [ 857.814338] [<ffffffff811679ed>] ? vfs_read+0x10d/0x180
Oct 25 14:55:46 lt9630 kernel: [ 857.814339] [<ffffffff811796e1>] ? sys_ioctl+0x91/0xa0
Oct 25 14:55:46 lt9630 kernel: [ 857.814342] [<ffffffff815f22c2>] ? system_call_fastpath+0x16/0x1b
Oct 25 14:55:46 lt9630 kernel: [ 857.814345] pubdev:0xffffffffa02fd600, num of device:1 , name:fglrx, major 8, minor 88.
Oct 25 14:55:46 lt9630 kernel: [ 857.814346] device 0 : 0xffff88019b850000 .

Oct 25 14:55:46 lt9630 kernel: [ 857.814348] Asic ID:0x6779, revision:0x3c, MMIOReg:0xffffc90012b80000.
Oct 25 14:55:46 lt9630 kernel: [ 857.814349] FB phys addr: 0xc0000000, MC :0xf00000000, Total FB size :0x40000000.
Oct 25 14:55:46 lt9630 kernel: [ 857.814351] gart table MC:0xf0f8fd000, Physical:0xcf8fd000, size:0x402000.
Oct 25 14:55:46 lt9630 kernel: [ 857.814353] mc_node :FB, total 1 zones
Oct 25 14:55:46 lt9630 kernel: [ 857.814354] MC start:0xf00000000, Physical:0xc0000000, size:0xfd00000.
Oct 25 14:55:46 lt9630 kernel: [ 857.814356] Mapped heap -- Offset:0x0, size:0xf8fd000, reference count:33, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814357] Mapped heap -- Offset:0x0, size:0x1000000, reference count:1, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814359] Mapped heap -- Offset:0xf8fd000, size:0x403000, reference count:1, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814360] mc_node :INV_FB, total 1 zones
Oct 25 14:55:46 lt9630 kernel: [ 857.814362] MC start:0xf0fd00000, Physical:0xcfd00000, size:0x30300000.
Oct 25 14:55:46 lt9630 kernel: [ 857.814363] Mapped heap -- Offset:0x302f4000, size:0xc000, reference count:1, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814365] mc_node :GART_USWC, total 2 zones
Oct 25 14:55:46 lt9630 kernel: [ 857.814366] MC start:0x40100000, Physical:0x0, size:0x50000000.
Oct 25 14:55:46 lt9630 kernel: [ 857.814367] Mapped heap -- Offset:0x0, size:0x2000000, reference count:6, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814369] mc_node :GART_CACHEABLE, total 3 zones
Oct 25 14:55:46 lt9630 kernel: [ 857.814370] MC start:0x10400000, Physical:0x0, size:0x2fd00000.
Oct 25 14:55:46 lt9630 kernel: [ 857.814371] Mapped heap -- Offset:0xa800000, size:0x700000, reference count:1, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814373] Mapped heap -- Offset:0x9a00000, size:0xe00000, reference count:2, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814374] Mapped heap -- Offset:0x8c00000, size:0xe00000, reference count:1, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814376] Mapped heap -- Offset:0x7e00000, size:0xe00000, reference count:2, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814378] Mapped heap -- Offset:0x7000000, size:0xe00000, reference count:3, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814379] Mapped heap -- Offset:0x6200000, size:0xe00000, reference count:2, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814381] Mapped heap -- Offset:0x5400000, size:0xe00000, reference count:2, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814382] Mapped heap -- Offset:0x4600000, size:0xe00000, reference count:2, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814384] Mapped heap -- Offset:0x3800000, size:0xe00000, reference count:5, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814385] Mapped heap -- Offset:0x2f00000, size:0x900000, reference count:6, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814387] Mapped heap -- Offset:0x2100000, size:0xe00000, reference count:3, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814389] Mapped heap -- Offset:0x1700000, size:0xa00000, reference count:4, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814390] Mapped heap -- Offset:0x1000000, size:0x700000, reference count:12, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814392] Mapped heap -- Offset:0x200000, size:0xe00000, reference count:5, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814393] Mapped heap -- Offset:0x0, size:0x200000, reference count:14, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814395] Mapped heap -- Offset:0xef000, size:0x11000, reference count:1, mapping count:0,
Oct 25 14:55:46 lt9630 kernel: [ 857.814398] GRBM : 0xa0003828, SRBM : 0x200006c0 .
Oct 25 14:55:46 lt9630 kernel: [ 857.814400] CP_RB_BASE : 0x401000, CP_RB_RPTR : 0x10aa0 , CP_RB_WPTR :0x10aa0.
Oct 25 14:55:46 lt9630 kernel: [ 857.814402] CP_IB1_BUFSZ:0x2d8, CP_IB1_BASE_HI:0x0, CP_IB1_BASE_LO:0x4040d000.
Oct 25 14:55:46 lt9630 kernel: [ 857.814404] last submit IB buffer -- MC :0x4040d000,phys:0x196e97000.
Oct 25 14:55:46 lt9630 kernel: [ 857.814406] Dump the trace queue.
Oct 25 14:55:46 lt9630 kernel: [ 857.814407] End of dump
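
A log like the kern.log excerpt above can be scanned mechanically for the hang marker and the command-processor ring pointers. The following is only a minimal sketch in Python: the function name `summarize_hang` and the `ring_drained` field are hypothetical, and the regular expressions assume the exact line layout seen in this excerpt. In the excerpt the read and write pointers are equal (0x10aa0), which would usually suggest the ring itself had been consumed; that reading is an interpretation, not something stated in this report.

```python
import re

HANG_MARKER = "ASIC hang happened"
# Kernel timestamp, e.g. "[ 857.813946]"
TS_RE = re.compile(r"\[\s*(\d+\.\d+)\]")
# Ring pointers, e.g. "CP_RB_RPTR : 0x10aa0 , CP_RB_WPTR :0x10aa0."
RING_RE = re.compile(
    r"CP_RB_RPTR\s*:\s*(0x[0-9a-fA-F]+)\s*,\s*CP_RB_WPTR\s*:\s*(0x[0-9a-fA-F]+)"
)


def summarize_hang(lines):
    """Return the first hang timestamp and the last ring pointers seen."""
    hang_time = None
    rptr = wptr = None
    for line in lines:
        if hang_time is None and HANG_MARKER in line:
            ts = TS_RE.search(line)
            if ts:
                hang_time = float(ts.group(1))
        ring = RING_RE.search(line)
        if ring:
            rptr = int(ring.group(1), 16)
            wptr = int(ring.group(2), 16)
    return {
        "hang_time": hang_time,
        "rptr": rptr,
        "wptr": wptr,
        # Assumption: equal pointers are read as "nothing left queued".
        "ring_drained": rptr is not None and rptr == wptr,
    }
```

Calling `summarize_hang(open("/var/log/kern.log"))` would apply it to a whole log file; the path is the one named in the description.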
Comment 1 Alex Deucher 2015-01-17 16:47:39 UTC
You are using fglrx.  Is this an issue with the open source drivers?
Comment 2 Alberto Salvia Novella 2015-01-17 17:10:26 UTC
Yes, I'm not using fglrx at all.
Comment 3 Alberto Salvia Novella 2015-01-17 17:11:13 UTC
It happens using both the libre and the proprietary drivers.
Comment 4 Alberto Salvia Novella 2015-01-19 21:42:15 UTC
Because this normally happens after the first boot, but not on subsequent ones, I strongly suspect this is a hardware bug.

This is probably happening because the GPU passes from a cold state to a warm state too quickly under graphically demanding operations such as watching videos.

And Windows users won't be experiencing this, as the GPU doesn't stay in a maximum power state all the time.

So this should be a bug in the Dynamic Power Management feature of the Linux kernel's Direct Rendering Manager.
Comment 5 Michel Dänzer 2015-01-20 01:40:43 UTC
You seem to be using a very old version of Ubuntu and in particular the Linux kernel. Can you try newer versions? Current versions of the kernel radeon driver support DPM.
Comment 6 Alberto Salvia Novella 2015-01-20 01:45:23 UTC
It also happens to me using Ubuntu 14.10 with xorg 7.7.
Comment 7 Michel Dänzer 2015-01-20 01:46:53 UTC
(In reply to Alberto Salvia Novella from comment #6)
> It also happens to me using Ubuntu 14.10 with xorg 7.7.

Please attach the full /var/log/Xorg.0.log and output of dmesg corresponding to that here.
Comment 8 Alberto Salvia Novella 2015-01-20 02:00:26 UTC
Created attachment 112508 [details]
xorg.0.log
Comment 9 Alberto Salvia Novella 2015-01-20 02:07:06 UTC
Created attachment 112509 [details]
dmesg.0.log
Comment 10 Alex Deucher 2015-01-20 16:35:57 UTC
Both fglrx and radeon support dynamic power management, so this is unlikely to have anything to do with power management.  It looks like a plain old GPU hang.  I'd suggest updating your Mesa stack in the case of the open source driver.
Comment 11 Alberto Salvia Novella 2015-01-20 18:09:14 UTC
Does fglrx depend on Mesa?
How can I figure out if this is a bug in Mesa?
Comment 12 Timothy Arceri 2015-01-20 19:46:33 UTC
(In reply to Alberto Salvia Novella from comment #11)
> Does fglrx depend on Mesa?

No

> How can I figure out if this is a bug in Mesa?

Make sure you have completely removed fglrx (there are many guides on how to do this in Ubuntu; just search for one) and then try this PPA with updated drivers: https://launchpad.net/~oibaf/+archive/ubuntu/graphics-drivers
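
One way to check whether fglrx is really gone is to look at the loaded kernel modules and the package database. The Python sketch below is hedged: the helper names `loaded_modules` and `fglrx_leftovers` are hypothetical, and it assumes the usual column layout of `lsmod` (module name first, one header row) and `dpkg -l` (status, then package name). A driver installed by hand from AMD's installer may leave files that neither command reports.

```python
def loaded_modules(lsmod_text):
    """Module names from `lsmod` output (first column, header row skipped)."""
    rows = lsmod_text.splitlines()[1:]
    return {row.split()[0] for row in rows if row.strip()}


def fglrx_leftovers(lsmod_text, dpkg_text):
    """List traces of fglrx: a loaded kernel module or dpkg entries."""
    found = []
    if "fglrx" in loaded_modules(lsmod_text):
        found.append("kernel module fglrx is loaded")
    for row in dpkg_text.splitlines():
        cols = row.split()
        # Package rows look like: "<status> <name> <version> ..."
        if len(cols) >= 2 and "fglrx" in cols[1]:
            found.append(f"package {cols[1]} has dpkg status {cols[0]}")
    return found
```

On a live system the two strings could come from `subprocess.run(["lsmod"], capture_output=True, text=True).stdout` and the same call with `["dpkg", "-l"]`. A status of `rc` means the package was removed but its configuration files remain.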
Comment 13 Alberto Salvia Novella 2015-01-21 00:39:19 UTC
So why does this also happen with the proprietary driver?

As for upgrading from the PPA, I have just done it. If the problem doesn't reproduce within three weeks, I will report it as gone in the git version.
Comment 14 Alberto Salvia Novella 2015-01-21 00:43:36 UTC
I can confirm upgrading from the PPA didn't solve the problem :)

Graphics hung again just after I finished the previous comment.
Comment 15 Timothy Arceri 2015-01-21 01:17:45 UTC
The open and closed drivers currently don't play nice together. Hopefully one day that will change. But to me it looks like you haven't fully removed fglrx and this is causing you problems.
Comment 16 Alberto Salvia Novella 2015-01-21 01:31:50 UTC
The Synaptic package manager says there isn't any fglrx package installed, and following the instructions at <http://askubuntu.com/questions/78675/how-do-i-remove-the-fglrx-drivers-after-ive-installed-them-by-hand> gives the same result.
Comment 17 Alberto Salvia Novella 2015-01-21 01:34:37 UTC
Wait, comment 2 says further steps are needed... I'm testing now.
Comment 18 Alberto Salvia Novella 2015-01-21 01:52:14 UTC
I will test this for a while.
Comment 19 Alberto Salvia Novella 2015-01-31 12:55:34 UTC
I can still hang graphics by doing this:

1. Let the computer cool down for several hours.
2. Preferably in the morning, when the ambient temperature is low, switch on the computer.
3. Using the HTML5 YouTube player, play a video.
Comment 20 Alberto Salvia Novella 2015-01-31 12:56:50 UTC
It looks like it is harder to make it hang if the computer has been on for a while.
Comment 21 Alberto Salvia Novella 2015-02-06 15:01:56 UTC
Another symptom is that sound hangs too.

And sometimes, when graphics hang, what appears on screen is a pattern of stripes of nearly the same colour.
Comment 22 Alberto Salvia Novella 2015-02-06 15:03:18 UTC
Looks like <http://cdn.overclock.net/7/7b/7b3c3458_P110110_20.2701.jpeg>
Comment 23 Alberto Salvia Novella 2015-02-11 04:13:45 UTC
I confirm it also happens with the Adobe Flash player when playing videos on YouTube.
Comment 24 mirh 2017-10-21 12:54:09 UTC
https://bugs.launchpad.net/ubuntu/+source/mesa/+bug/881526/comments/56
A user reported this to be an overheating issue.
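
If overheating is suspected, the GPU temperature can be watched while reproducing the hang. The sketch below is written in Python under stated assumptions: that the open source driver exposes a hwmon sensor under `/sys/class/drm/card0/device` (the card index may differ), that `temp*_input` files hold millidegrees Celsius as in the kernel's hwmon convention, and that 90 °C is only an illustrative threshold. The names `read_gpu_temps` and `too_hot` are hypothetical.

```python
import glob
import os


def read_gpu_temps(card_dir="/sys/class/drm/card0/device"):
    """Return {sensor_path: degrees_celsius} for hwmon inputs under card_dir."""
    temps = {}
    pattern = os.path.join(card_dir, "hwmon", "hwmon*", "temp*_input")
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as sensor:
                # hwmon reports temperatures in millidegrees Celsius.
                temps[path] = int(sensor.read().strip()) / 1000.0
        except (OSError, ValueError):
            # Skip sensors that cannot be read or parsed.
            continue
    return temps


def too_hot(temps, limit_c=90.0):
    """True if any reading reaches limit_c (an arbitrary example limit)."""
    return any(value >= limit_c for value in temps.values())


if __name__ == "__main__":
    for path, value in read_gpu_temps().items():
        print(f"{path}: {value:.1f} C")
```

Running it in a loop during video playback would show whether the temperature climbs before a hang; an empty result only means no sensor was found at the assumed path.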
