Bug 98355 - [SNB 4.8] gpu at 0x7955b000, but request was 0x77b0b000
Summary: [SNB 4.8] gpu at 0x7955b000, but request was 0x77b0b000
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
 
Reported: 2016-10-20 11:31 UTC by Mariusz Libera
Modified: 2016-12-06 14:09 UTC

See Also:
i915 platform: SNB
i915 features: GPU hang


Attachments
cat /sys/class/drm/card0/error | bzip2 > error.bz2 (112.59 KB, application/x-bzip)
2016-10-20 11:31 UTC, Mariusz Libera
dmesg (68.50 KB, text/plain)
2016-10-20 11:33 UTC, Mariusz Libera
Xorg.0.log (77.07 KB, text/x-log)
2016-10-20 12:39 UTC, Mariusz Libera
relevant part of journalctl -b (60.79 KB, text/plain)
2016-10-20 12:40 UTC, Mariusz Libera
cat /sys/class/drm/card0/error | bzip2 > error.bz2 (113.91 KB, application/x-bzip2)
2016-10-22 20:29 UTC, Mariusz Libera
Xorg.0.log (30.68 KB, text/plain)
2016-10-22 20:31 UTC, Mariusz Libera
dmesg (50.84 KB, text/plain)
2016-10-22 20:32 UTC, Mariusz Libera
dmesg (51.91 KB, text/plain)
2016-10-23 11:12 UTC, Mariusz Libera
Xorg.0.log (34.57 KB, text/x-log)
2016-10-23 11:13 UTC, Mariusz Libera
cat /sys/class/drm/card0/error | bzip2 > error.bz2 (112.83 KB, application/x-bzip)
2016-10-23 11:14 UTC, Mariusz Libera
cat /sys/class/drm/card0/error | bzip2 > error.bz2 (102.74 KB, application/x-bzip2)
2016-10-23 15:27 UTC, Mariusz Libera
dmesg (49.77 KB, text/plain)
2016-10-23 15:28 UTC, Mariusz Libera
Xorg.0.log (24.08 KB, text/plain)
2016-10-23 15:29 UTC, Mariusz Libera
cat /sys/class/drm/card0/error | bzip2 > error.bz2 (112.49 KB, application/x-bzip)
2016-10-24 18:34 UTC, Mariusz Libera
dmesg (68.34 KB, text/plain)
2016-10-24 18:35 UTC, Mariusz Libera
Xorg.0.log (46.29 KB, text/x-log)
2016-10-24 18:35 UTC, Mariusz Libera
dmesg (48.05 KB, text/plain)
2016-10-28 01:02 UTC, Mariusz Libera
cat /sys/class/drm/card0/error | bzip2 > error.bz2 (107.47 KB, application/x-bzip)
2016-10-28 10:40 UTC, Mariusz Libera
Xorg.0.log (35.04 KB, text/plain)
2016-10-28 10:40 UTC, Mariusz Libera
dmesg (60.81 KB, text/plain)
2016-10-28 10:41 UTC, Mariusz Libera
cat /sys/class/drm/card0/error | bzip2 > error.bz2 (111.20 KB, application/x-bzip)
2016-11-02 17:13 UTC, Mariusz Libera
dmesg (94.70 KB, text/plain)
2016-11-02 17:14 UTC, Mariusz Libera
dmesg (139.75 KB, text/plain)
2016-11-04 07:29 UTC, Mariusz Libera
journalctl -kb -1 (199.40 KB, text/plain)
2016-11-04 21:12 UTC, Mariusz Libera
cat /sys/class/drm/card0/error | bzip2 > error.bz2 (124.81 KB, application/x-bzip)
2016-11-04 21:34 UTC, Mariusz Libera
dmesg (49.39 KB, text/plain)
2016-11-04 21:38 UTC, Mariusz Libera

Description Mariusz Libera 2016-10-20 11:31:42 UTC
Created attachment 127422 [details]
cat /sys/class/drm/card0/error | bzip2 > error.bz2

After resuming from suspend, instead of KDE's lockscreen I was greeted only by my wallpaper and a (responsive) mouse cursor. I switched to another tty (that worked, and I could log in) and back, and this time there was only a black screen; after another tty switch the X server crashed. After that I was able to start another X session.

This is what happened this time, but honestly I can't use my laptop for 2 days without experiencing, at best, freezing of my desktop for a few seconds or, at worst, forcing me to reboot using sysrq. I can't reliably reproduce it, seems to happen at random, and may not always be a "GPU Hang", I also saw messages like:
WARNING: CPU: 0 PID: 500 at drivers/gpu/drm/i915/intel_display.c:3857 intel_atomic_commit+0x1459/0x1460 [i915]
Removing stuck page flip
followed by a backtrace, but I was too lazy to report them.


Archlinux x86_64
Kernel 4.8.2
Mesa 12.0.3
Xserver 1.18.4
Comment 1 Mariusz Libera 2016-10-20 11:33:49 UTC
Created attachment 127423 [details]
dmesg
Comment 2 Chris Wilson 2016-10-20 11:48:10 UTC
GPU is executing batch from 0x7955b000. Active request points to 0x77b0b000.
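
As a quick way to see how far apart those two addresses are, a minimal shell sketch follows. The helper name `addr_delta` is hypothetical and not part of any i915 tooling, and the thread does not say whether the distance itself carries any diagnostic meaning.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the byte distance between two addresses in hex.
# Assumes the first address is the larger one.
addr_delta() {
    printf '0x%x\n' $(( $1 - $2 ))
}

# Address the GPU is executing vs. address of the active request.
addr_delta 0x7955b000 0x77b0b000   # prints 0x1a50000
```

Both addresses end in 000, so they are 4 KiB aligned; 0x1a50000 bytes corresponds to 6736 such 4 KiB units.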
Comment 3 Chris Wilson 2016-10-20 11:59:37 UTC
X server crashed? Please see if there is a record of that in the logs. In the meantime, if this is reproducible could you try https://cgit.freedesktop.org/drm-intel/log/ and attach a fresh error state (if it still fails!).
Comment 4 Mariusz Libera 2016-10-20 12:39:14 UTC
Created attachment 127428 [details]
Xorg.0.log

I assumed it crashed. I have this line in my .profile 
if [[ -z $DISPLAY && $XDG_VTNR -eq 1 ]]; then startx; exit; fi
so when I saw getty on tty1 I assumed it crashed.
I'll attach Xorg.0.log and logs from journalctl.
Comment 5 Mariusz Libera 2016-10-20 12:40:41 UTC
Created attachment 127429 [details]
relevant part of journalctl -b
Comment 6 Chris Wilson 2016-10-20 12:54:41 UTC
Ah, modesetting. Yes, it crashed (randomly called exit).
Comment 7 Mariusz Libera 2016-10-20 13:50:40 UTC
Should I switch to xf86-video-intel? Would it be able to recover from GPU hangs like this one? I remember I've been using it previously, but some tooltips in Google Chrome had artifacts so I uninstalled it to test modesetting. It didn't help (it's probably Chrome's bug), but it seems to work equally well so I didn't bother reinstalling xf86-video-intel.
Comment 8 Chris Wilson 2016-10-20 13:53:26 UTC
(In reply to Mariusz Libera from comment #7)
> Should I switch to xf86-video-intel? Would it be able to recover from GPU
> hangs like this one? 

Like this, it won't crash but the kernel / hw seem to be in quite a disarray.

> I remember I've been using it previously, but some
> tooltips in Google Chrome had artifacts so I uninstalled it to test
> modesetting. It didn't help (it's probably Chrome's bug), but it seems to
> work equally well so I didn't bother reinstalling xf86-video-intel.

Actually, that's a bug in mesa's glXWaitX() (or lack thereof).
Comment 9 Mariusz Libera 2016-10-22 20:29:40 UTC
Created attachment 127480 [details]
cat /sys/class/drm/card0/error | bzip2 > error.bz2

It happened again in a similar scenario: resuming from suspend, and instead of the lockscreen there was only the wallpaper and mouse cursor. But this time I had xf86-video-intel installed and it did somewhat recover. After switching to another tty and back, a textbox for the password appeared and I could type, but characters would appear with a delay of a few seconds. After logging in, the desktop is partially broken: for example, Chrome doesn't redraw window content unless it's resized, some KDE menus are broken, etc.

I'll attach the logs, but this happened with kernel 4.8.3, so I don't know how useful they will be. I did build a drm-intel kernel using this - https://aur.archlinux.org/packages/linux-drm-intel-nightly/ - but it wouldn't boot (couldn't find the root device for some reason) and I didn't have time to make it work.
Comment 10 Mariusz Libera 2016-10-22 20:31:00 UTC
Created attachment 127481 [details]
Xorg.0.log
Comment 11 Mariusz Libera 2016-10-22 20:32:28 UTC
Created attachment 127482 [details]
dmesg
Comment 12 Mariusz Libera 2016-10-23 11:12:15 UTC
Created attachment 127492 [details]
dmesg

Another hang, again after resuming from suspend. Kernel 4.8.4. This time graphics seem to have recovered completely after a few seconds, and everything seems to work fine.
Comment 13 Mariusz Libera 2016-10-23 11:13:07 UTC
Created attachment 127493 [details]
Xorg.0.log
Comment 14 Mariusz Libera 2016-10-23 11:14:29 UTC
Created attachment 127494 [details]
cat /sys/class/drm/card0/error | bzip2 > error.bz2
Comment 15 Mariusz Libera 2016-10-23 15:27:31 UTC
Created attachment 127497 [details]
cat /sys/class/drm/card0/error | bzip2 > error.bz2

And another one...
This time Chrome and parts of KDE remained broken.
Comment 16 Mariusz Libera 2016-10-23 15:28:14 UTC
Created attachment 127498 [details]
dmesg
Comment 17 Mariusz Libera 2016-10-23 15:29:36 UTC
Created attachment 127499 [details]
Xorg.0.log
Comment 18 Mariusz Libera 2016-10-24 18:34:34 UTC
Created attachment 127517 [details]
cat /sys/class/drm/card0/error | bzip2 > error.bz2

Should I keep posting those? Are they useful?
Comment 19 Mariusz Libera 2016-10-24 18:35:07 UTC
Created attachment 127518 [details]
dmesg
Comment 20 Mariusz Libera 2016-10-24 18:35:41 UTC
Created attachment 127519 [details]
Xorg.0.log
Comment 21 Mariusz Libera 2016-10-28 01:02:11 UTC
Created attachment 127577 [details]
dmesg

I don't know if this is related, because /sys/class/drm/card0/error is empty and messages in dmesg are different. My screen froze while scrolling through some website in Chrome, only mouse cursor was moving. It unfroze after switching to another tty and back.
Comment 22 Mariusz Libera 2016-10-28 10:40:08 UTC
Created attachment 127583 [details]
cat /sys/class/drm/card0/error | bzip2 > error.bz2

Another one. At least I could eventually, after a few minutes, log in and close my apps.
Comment 23 Mariusz Libera 2016-10-28 10:40:40 UTC
Created attachment 127585 [details]
Xorg.0.log
Comment 24 Mariusz Libera 2016-10-28 10:41:05 UTC
Created attachment 127586 [details]
dmesg
Comment 25 Mariusz Libera 2016-11-02 17:13:50 UTC
Created attachment 127699 [details]
cat /sys/class/drm/card0/error | bzip2 > error.bz2

This time a hang happened while I was watching a youtube video. Screen froze for a few seconds and then all went back to normal. Kernel 4.8.6.
Comment 26 Mariusz Libera 2016-11-02 17:14:26 UTC
Created attachment 127700 [details]
dmesg
Comment 27 Mariusz Libera 2016-11-04 07:29:37 UTC
Created attachment 127749 [details]
dmesg

Screen froze while I was using Google maps in Chrome, after switching to another tty and back it unfroze.
Comment 28 Mariusz Libera 2016-11-04 21:12:46 UTC
Created attachment 127765 [details]
journalctl -kb -1

Screen kept freezing from time to time, as described in the previous comment, but without any messages in dmesg. Then it no longer unfroze after switching ttys, so I rebooted using ctrl+alt+del. It turns out it gave some error messages just before the reboot. Does this require a separate bug report?
Comment 29 Mariusz Libera 2016-11-04 21:34:47 UTC
Created attachment 127767 [details]
cat /sys/class/drm/card0/error | bzip2 > error.bz2

Another hang:
[ 1842.449017] [drm] GPU HANG: ecode 6:0:0x87e8effd, in chrome [1452], reason: Hang on render ring, action: reset
[ 1842.449019] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 1842.449020] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 1842.449020] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 1842.449021] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 1842.449022] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 1842.449079] drm/i915: Resetting chip after gpu hang
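
When collecting several of these reports, it can help to pull the error code, process name, and PID out of each "GPU HANG" line. The following is a minimal sketch, assuming the line format matches the one quoted above; the function name `gpu_hang_summary` is hypothetical, and the pattern has only been modelled on that single log line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read kernel log text on stdin and print
# "ecode process pid" for every "GPU HANG" line found.
# Lines that do not match the pattern are silently skipped.
gpu_hang_summary() {
    sed -nE 's/.*GPU HANG: ecode ([^,]+), in ([^ ]+) \[([0-9]+)\].*/\1 \2 \3/p'
}
```

It could be used as `dmesg | gpu_hang_summary`; for the hang above this would print `6:0:0x87e8effd chrome 1452`.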
Comment 30 Mariusz Libera 2016-11-04 21:38:14 UTC
Created attachment 127768 [details]
dmesg

...followed by this.
Comment 31 Mariusz Libera 2016-12-06 14:01:52 UTC
I think it has been fixed somewhere between kernels 4.8.6 - 4.8.10 or maybe by mesa update from 13.0.0 to 13.0.1. I have an uptime of 12 days with kernel 4.8.10 and mesa 13.0.1 and I've not experienced any hangs.
Comment 32 yann 2016-12-06 14:09:51 UTC
(In reply to Mariusz Libera from comment #31)
> I think it has been fixed somewhere between kernels 4.8.6 - 4.8.10 or maybe
> by mesa update from 13.0.0 to 13.0.1. I have an uptime of 12 days with
> kernel 4.8.10 and mesa 13.0.1 and I've not experienced any hangs.

Thanks Mariusz, I am closing this for now, but if it occurs again, please reopen with fresh logs.

