Created attachment 129423: /sys/class/drm/card1/error

[16267.939845] [drm] GPU HANG: ecode 9:0:0x85dffffb, in plasmashell [1732], reason: Hang on render ring, action: reset
[16267.939847] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[16267.939849] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[16267.939850] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[16267.939851] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[16267.939852] [drm] GPU crash dump saved to /sys/class/drm/card1/error
[16267.939947] drm/i915: Resetting chip after gpu hang
[16267.940489] [drm] GuC firmware load skipped
[16269.953879] [drm] RC6 on

This happened on an Optimus system while using the Intel GPU for the Plasma desktop, on Kubuntu 16.10. All Kubuntu updates as of Feb 08 2017 are installed.
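When comparing reports like the one above, it can help to pull the fields out of the "GPU HANG" summary line mechanically. The following is a minimal Python sketch, assuming the line layout seen in this report; `HANG_RE` and `parse_hang` are hypothetical names, and the ecode is kept as an opaque string rather than interpreted.

```python
import re
from typing import Optional

# Matches the i915 "GPU HANG" summary line in the layout shown in this report.
# Assumption: other kernel versions may format the line differently.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[0-9a-fA-Fx:]+), "
    r"in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_hang(line: str) -> Optional[dict]:
    """Return the fields of a GPU HANG line, or None if the line is not one."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])  # pid is numeric, the rest stay strings
    return fields
```

Calling `parse_hang` on each line of a dmesg capture and collecting the non-None results gives a quick view of which process and which ecode were involved in each hang.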
Created attachment 131342: another error log, this time from /sys/class/drm/card0/error

This bug triggers an X restart every time it occurs, causing me to lose all unsaved work. It appears to happen most frequently when using LibreOffice Impress to edit a complex slide.

I have since upgraded to Kubuntu 17.04 and the bug remains unchanged, as far as I can tell.

Note: I am using a custom kernel with an APST fix applied (see https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1678184), but I have encountered the bug with the stock kernel as well. Anything NVMe related (like the APST fix) should not affect the graphics stack anyway.
Mesa devs at Intel have not had a good way to reproduce this hang. If you can attach a complex slide that reliably causes a GPU hang, we might be able to fix this issue.

Some GPU hangs have been fixed upstream in Linux, Mesa, and SNA. You may find the issue resolved with upstream sources. Switching to the modesetting driver may also improve things. Please indicate your hardware and attach your Xorg.log with the repro steps.
(In reply to Mark Janes from comment #2)
> Mesa devs at Intel have not had a good way to reproduce this hang. If you
> can attach a complex slide that reliably causes a GPU hang, we might be able
> to fix this issue.
>
> Some GPU hangs have been fixed upstream in Linux, Mesa, and SNA. You may
> find the issue resolved with upstream sources. Switching to the modesetting
> driver may also improve things. Please indicate your hardware and attach
> your Xorg.log with the repro steps.

This is a Mesa bug, please stop deflecting. The hardware is reported in the error state.
I'm not deflecting, I'm asking for help reproducing the issue.
I have been editing many LibreOffice documents, and all of them (in Impress, in Writer) appear to be able to trigger the bug. Once the bug has happened, it appears to be much more likely to happen again unless I reboot the machine (restarting the X server does not help).

I am now running the latest Ubuntu kernel (Linux 4.10.0-23-generic #25-Ubuntu SMP Fri Jun 9 09:39:09 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux) and the situation is unchanged.

Could the fact that this machine has a 4K/UHD screen contribute somehow, as my other Intel machine (HP CoreM tablet with FHD screen) has never crashed like this yet? Or might it be a hidden Optimus problem? (Both the nvidia and nouveau kernel modules are blacklisted/not loaded, bumblebee is not configured, and the NVIDIA GPU is switched off at all times.)

As any text/graphics document in LibreOffice appears to trigger the bug, I am not sure it would make sense to attach one; they all contain personal/confidential information to some degree.

What else can I do to help chase this down or at least find a workaround?

Cheers
Jan
Created attachment 131981: Xorg.0.log

Added Xorg log file.
Jan, our team set up a HiDPI SKL GT2 machine with the default Kubuntu 16.10 install and spent hours manipulating large LibreOffice Impress documents. We couldn't generate a GPU hang.
I really do appreciate your efforts. However, the problem remains, although the error has changed a little: it appears that a crash report is no longer saved.

Jun 23 09:43:42 Trouble kernel: [26432.693147] [drm] GPU HANG: ecode 9:0:0x86dffffd, in Xorg [1223], reason: Hang on render ring, action: reset
Jun 23 09:43:42 Trouble kernel: [26432.693178] drm/i915: Resetting chip after gpu hang
Jun 23 09:43:42 Trouble kernel: [26432.693294] [drm] RC6 on
Jun 23 09:43:42 Trouble kernel: [26432.707575] [drm] GuC firmware load skipped
Jun 23 09:43:45 Trouble systemd[1]: Started Session 17 of user jan.
Jun 23 09:43:52 Trouble kernel: [26442.636333] drm/i915: Resetting chip after gpu hang
Jun 23 09:43:52 Trouble kernel: [26442.636402] [drm] RC6 on
Jun 23 09:43:52 Trouble kernel: [26442.650659] [drm] GuC firmware load skipped

On Kubuntu, all updates as of June 23 2017.
Linux Trouble 4.10.0-25-generic #29-Ubuntu SMP Tue Jun 20 15:00:02 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

...and until now it has ONLY happened with LibreOffice, never with anything else (OpenGL games, VirtualBox/Windows10/MicrosoftOffice2016, wine/MicrosoftOffice2010, Matlab, intensive use of Firefox, etc.).

I am not sure what to do here... just wait for the next (K)ubuntu version to hopefully fix things? Would there be any promise in applying the 2017Q1 Intel Graphics Stack Recipe, or should I try to get an updated Mesa from somewhere? Any recommendations?
Use the oibaf PPA if it is compatible with Kubuntu.
Created attachment 132225: /sys/class/drm/card0/error

I am also seeing the same bug on very similar hardware. This is on Arch Linux running kernel 4.11.6 and Gnome 3.24.2 on a Dell Precision 5510 with a 4K screen.

The issue is impossible to reproduce reliably; it always appears at random while using LibreOffice. I mostly use Calc, but the hang also happens when using Writer or Impress, and repeating the same task on the same file doesn't reproduce the error. Also, as is the case with the OP, the NVIDIA GPU is switched off.

[drm] GPU HANG: ecode 9:0:0x85dffffb, in Xwayland [7741], reason: Hang on render ring, action: reset
[drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[drm] GPU crash dump saved to /sys/class/drm/card0/error
kernel: drm/i915: Resetting chip after gpu hang
[drm] RC6 on
org.gnome.Shell.desktop[7718]: libinput error: libinput bug: timer: offset negative (-2080361)
[drm] GuC firmware load skipped
drm/i915: Resetting chip after gpu hang
[drm] RC6 on
[drm] GuC firmware load skipped
org.gnome.Shell.desktop[7718]: intel_do_flush_locked failed: Input/output error
libreoffice-calc.desktop[27753]: X IO Error
I tried the oibaf PPA in the past (maybe 2 months ago), but the error remained, and I purged the PPA again.

I guess Hesham Ahmend's post also tells me that I don't need to try the mainline kernels from 4.11.x. Would it make sense for me to try 4.12-rc6?

I guess there is no immediate way to identify what LibreOffice is doing to trigger the bug that no other software seems to do?
...and now it is also hitting me on another Intel-based system. As I could not reliably work on my main system (Precision 5110, see reports above), I switched to my other laptop, an HP Elite X2 1012 G1 (tablet-type, Skylake CoreM Intel(R) Core(TM) m5-6Y54 CPU), and, again in LibreOffice, the screen suddenly froze (but not the mouse) and I found this in syslog:

Jun 25 08:20:11 Chaos kernel: [12746.287681] [drm] GPU HANG: ecode 9:0:0x85dffffb, in Xorg [1150], reason: Hang on render ring, action: reset
Jun 25 08:20:11 Chaos kernel: [12746.287752] drm/i915: Resetting chip after gpu hang
Jun 25 08:20:11 Chaos kernel: [12746.289722] [drm] RC6 on
Jun 25 08:20:11 Chaos kernel: [12746.307587] [drm] GuC firmware load skipped
Jun 25 08:20:31 Chaos kernel: [12766.193242] drm/i915: Resetting chip after gpu hang
Jun 25 08:20:31 Chaos kernel: [12766.193350] [drm] RC6 on
Jun 25 08:20:31 Chaos kernel: [12766.210600] [drm] GuC firmware load skipped
Jun 25 08:20:50 Chaos kernel: [12785.200890] drm/i915: Resetting chip after gpu hang
Jun 25 08:20:50 Chaos kernel: [12785.202985] [drm] RC6 on
Jun 25 08:20:50 Chaos kernel: [12785.214964] [drm] GuC firmware load skipped
Jun 25 08:21:02 Chaos kernel: [12797.168631] drm/i915: Resetting chip after gpu hang
Jun 25 08:21:02 Chaos kernel: [12797.168756] [drm] RC6 on
Jun 25 08:21:02 Chaos kernel: [12797.180730] [drm] GuC firmware load skipped

After a little while, X restarted (...work lost...). I assume this is the same bug?
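The log above shows one hang followed by several chip resets before X restarted. A small Python sketch for extracting the kernel timestamps of those resets and the gaps between them follows; `RESET_RE`, `reset_times`, and `reset_gaps` are hypothetical names, and the pattern assumes the exact message text and bracketed timestamp format seen in this syslog excerpt.

```python
import re

# Kernel timestamp in brackets, followed by the i915 reset message
# exactly as it appears in the syslog excerpt in this report.
RESET_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] drm/i915: Resetting chip after gpu hang"
)


def reset_times(log_text: str) -> list:
    """Return the kernel timestamps (seconds since boot) of every reset."""
    return [float(m.group("ts")) for m in RESET_RE.finditer(log_text)]


def reset_gaps(times: list) -> list:
    """Return the time in seconds between each pair of consecutive resets."""
    return [later - earlier for earlier, later in zip(times, times[1:])]
```

Feeding the whole syslog text to `reset_times` and then to `reset_gaps` makes it easy to see whether resets repeat at a steady interval or cluster shortly after the initial hang.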
Created attachment 132228: lspci -vvv

Added output of lspci -vvv.
Bugs that appear similar to me (all involve GPU hangs and LibreOffice):

Bug 100905
Bug 100794
Bug 95062
In Bug 95062 Danna Gifford suggested the following:

>> A work-around seems to be starting Libre Office without hardware acceleration
>> with the variable LIBGL_ALWAYS_SOFTWARE
>>
>> e.g. to start Impress from the terminal
>> $ LIBGL_ALWAYS_SOFTWARE=1 loimpress

I am testing this now.
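For anyone who wants to apply the quoted workaround from a launcher script rather than a terminal, here is a minimal Python sketch. `LIBGL_ALWAYS_SOFTWARE` and `loimpress` come from the quoted comment; `software_gl_env` and `launch` are hypothetical helper names, and whether the variable actually avoids the hang is exactly what is being tested in this thread.

```python
import os
import subprocess
from typing import Mapping, Optional, Sequence


def software_gl_env(base_env: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with Mesa software rendering forced."""
    env = dict(os.environ if base_env is None else base_env)
    env["LIBGL_ALWAYS_SOFTWARE"] = "1"
    return env


def launch(command: Sequence[str]) -> subprocess.Popen:
    """Start a program with software rendering, leaving our own env untouched."""
    return subprocess.Popen(list(command), env=software_gl_env())
```

For example, `launch(["loimpress"])` starts Impress the same way as the shell one-liner in the quote, while other programs started from the session keep hardware acceleration.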
Another similar report (hang with libreoffice) found on LKML: http://lkml.iu.edu/hypermail/linux/kernel/1704.3/01901.html
*** Bug 100905 has been marked as a duplicate of this bug. ***
Re testing the workaround: yesterday, I worked with LibreOffice (multiple documents) for several hours without a crash. Before, being able to work for that long would have been highly unlikely. Therefore, it seems to me as if disabling hardware acceleration actually worked. Now at least, my computer is fully operational again - but of course, this doesn't fix the bug...
I think I am hitting the same bug with Linux 4.12.10 on a Dell Precision 5510. I will attach my card error; the output of dmesg is:

[36249.475910] [drm] GPU HANG: ecode 9:0:0x86dffffd, in Xorg [1345], reason: Hang on rcs, action: reset
[36249.475912] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[36249.475912] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[36249.475912] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[36249.475913] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[36249.475913] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[36249.475940] drm/i915: Resetting chip after gpu hang
[36249.476049] [drm] RC6 off
[36257.467677] drm/i915: Resetting chip after gpu hang
[36257.467912] [drm] RC6 off
[36265.535608] drm/i915: Resetting chip after gpu hang
[36265.535819] [drm] RC6 off
[36273.467462] drm/i915: Resetting chip after gpu hang
[36273.467723] [drm] RC6 off
[36286.523448] drm/i915: Resetting chip after gpu hang
[36286.523714] [drm] RC6 off
Created attachment 133971: gpu crash dump
Definitely related to LibreOffice, as it is the only program that triggers this behaviour.
Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz

I originally proposed the workaround in comment 15. However, in testing I've found that it does not significantly reduce crashes for me after all.

Booting with nomodeset as a GRUB option stops the GPU hangs associated with LibreOffice, but with a severe performance penalty and the loss of HiDPI graphics scaling and suspend.
It's been 7 months, and several other bugs report very similar behaviour with LibreOffice; it seems something is just not right. Can we do anything to get this finally fixed, Vladimir?
Created attachment 133986: lspci -vvv
(In reply to Danna Gifford from comment #22)
> I originally proposed the workaround in comment 15. However, in testing I've
> found that it does not significantly reduce crashes for me after all.

It doesn't work for me either. The workaround I found is to run LO in Xephyr.
I'm currently testing the Intel drivers installed using intel-graphics-update-tool (see link below). I think this is providing i965-va-driver version 1.8.3-1 (whereas 1.7.3-1 is available from the Zesty repo). I don't have xserver-xorg-video-intel installed. So far, things seem to be more stable running LibreOffice.

https://01.org/linuxgraphics/downloads/intel-graphics-update-tool-linux-os-v2.0.5

Sorry, I'd provide more detail but I am quite busy at the moment.
I have taken a look at Intel's page about the intel-graphics-update-tool and I don't think that should be related. I am running Debian testing, which contains version 1.8.3 of the libraries affected by that tool, which is newer than the 1.8.0 installed with the tool, and still I am affected by this bug.

In case it is related to kernel code, I have tested:

- 4.9.2
- 4.9.8
- 4.9.9
- 4.10
- 4.11
- 4.12-rc3
- 4.12-rc6
- 4.12-rc7
- 4.12.4
- 4.12.10

Of these, by far the best results I got were with 4.12.10. I had been running for maybe two or three weeks without a hang, so I thought it could be resolved, but the other day I got the very same old hang. So it seems the changes in the kernel made it rarer, but the cause is still there.

I am currently testing 4.13 on my machine. I will tell you about the results.
(In reply to Alejandro Lorenzo from comment #27)
> I have taken a look at Intel's page about the intel-graphics-update-tool
> and I don't think that should be related. I am running Debian testing, which
> contains version 1.8.3 of the libraries affected by that tool, which is
> newer than the 1.8.0 installed with the tool, and still I am affected by
> this bug.

As Alejandro predicted, it doesn't fix it. However, running with 1.8.3 on 4.10.0-34-generic gave me several hours without a hang tonight, compared with several minutes on 1.7.3, so perhaps it is an improvement (but it could just be very stochastic). It seems something has changed, because it also wouldn't reset the chip or restart X after the hang, and I had to reboot.
The bug is marked as NEEDINFO. What info is necessary? It seems nobody is actually taking care of this bug.
I can reproduce this also. It has a heavy impact on my work when the system crashes.

Sep 12 15:54:42 localhost kernel: [drm] GPU HANG: ecode 9:0:0x85dffffb, in Xwayland [2067], reason: Hang on rcs, action: reset
Sep 12 15:54:42 localhost kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Sep 12 15:54:42 localhost kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Sep 12 15:54:42 localhost kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Sep 12 15:54:42 localhost kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Sep 12 15:54:42 localhost kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
Sep 12 15:54:42 localhost kernel: drm/i915: Resetting chip after gpu hang
Sep 12 15:54:42 localhost kernel: [drm] RC6 on
Sep 12 15:54:42 localhost org.gnome.Shell.desktop[2047]: Window manager warning: last_user_time (114029411) is greater than comparison timestamp (114029001). This most likely represents a buggy client sending inaccurate timestamps in messages such as _NET_ACTIVE_WINDOW. Trying to work around...
Sep 12 15:54:42 localhost org.gnome.Shell.desktop[2047]: Window manager warning: 0x6e00032 (itsa_brief) appears to be one of the offending windows with a timestamp of 114029411. Working around...
Sep 12 15:54:42 localhost systemd-udevd[665]: Network interface NamePolicy= disabled on kernel command line, ignoring.
Sep 12 15:54:50 localhost kernel: drm/i915: Resetting chip after gpu hang
Sep 12 15:54:50 localhost kernel: [drm] RC6 on
Sep 12 15:54:53 localhost kernel: asynchronous wait on fence i915:gnome-shell[2047]/1:53b49 timed out
Sep 12 15:54:58 localhost kernel: drm/i915: Resetting chip after gpu hang
Sep 12 15:54:58 localhost kernel: [drm] RC6 on
Sep 12 15:55:06 localhost kernel: drm/i915: Resetting chip after gpu hang
Sep 12 15:55:06 localhost kernel: [drm] RC6 on
Sep 12 15:55:14 localhost kernel: drm/i915: Resetting chip after gpu hang
Sep 12 15:55:14 localhost kernel: [drm] RC6 on
Sep 12 15:55:14 localhost org.gnome.Shell.desktop[2047]: intel_do_flush_locked failed: Input/output error
Sep 12 15:55:14 localhost libreoffice-splash.desktop[21306]: X IO Error
Hello everybody,

I'm trying to reproduce the issue on a SKL eDP 4K GT2, with no luck so far. I'm using Kubuntu 17.04 with the latest drm-tip. I managed to make LibreOffice crash using Impress with a presentation full of images and GIFs, and one time X closed, but I didn't get the hang, just Atomic update failure messages and an oom-killer warning.

Could you please help me with this information?

1. What LibreOffice version are you using?
2. What Mesa version are you using?
3. Are you using firmware (GuC and HuC)?
4. How much time (hours) have you been working in LibreOffice when the hang happens?
5. Have you verified whether HW acceleration is enabled?
6. Is there a step list to reproduce the issue?
7. Have you tried to reproduce with the intel_iommu=igfx_off parameter in GRUB?

Also, since this issue is reproducible on your devices, could you please add a full dmesg with the "drm.debug=0xe log_buf_len=4M" parameters in GRUB and/or a clean kern.log.

Sharing my own information:

1. LibreOffice: version 5.3.1.2 build id 1:5.3.1-0ubuntu2
2. Mesa version: 17.0.7
3. No firmware (GuC/HuC) used.
4. One hour approximately; once LibreOffice has crashed the first time, about half an hour or less, even after a reboot.
5. Disabled in my case.
6. I tried using glxgears + YouTube + a heavy LibreOffice Impress document, then copy, paste, write, presentation mode (F5), Esc, and repeat until LibreOffice crashes.
7. No, since I haven't been able to reproduce.

This is my configuration. Before the latest drm-tip I also tried the latest 4.10, since the Kubuntu distro provides this version; I could not reproduce with either.
======================================
Software
======================================
kernel version              : 4.14.0-rc2-drm-tip-ww39-commit-0b65077+
architecture                : x86_64
hardware acceleration       : disabled
swap partition              : disabled

======================================
Hardware
======================================
platform                    : Skylake
cpu information             : Intel(R) Core(TM) m5-6Y57 CPU @ 1.10GHz
gpu card                    : Intel Corporation HD Graphics 515 (rev 07) (prog-if 00 [VGA controller])
memory ram                  : 3.83 GB
max memory ram              : 16 GB
display resolution          : 3840x2160
hard drive                  : 74GiB (80GB)
current cd clock frequency  : 540000 kHz
displays connected          : eDP-1

======================================
Firmware
======================================
dmc fw loaded               : yes
dmc version                 : 1.26
guc fw loaded               : NONE
guc version wanted          : 0.0
guc version found           : 0.0

======================================
kernel parameters
======================================
initcall_debug drm.debug=0xe log_bug_len=2M

As a side note, the last error_state reported by Alejandro Lorenzo is a different issue: its ecode is different, and his hang is inside the batch of the render ring, while the others are outside the batch. So that hang should be filed as a separate bug.

Thanks in advance for your time.
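Before capturing the requested dmesg, it may be worth confirming that the debug parameters actually reached the running kernel. The sketch below parses a kernel command line and reports which wanted parameters are missing or have a different value. The helper names are hypothetical, the parsing is a simple whitespace split that does not handle quoted values, and it assumes the kernel's log buffer size parameter is spelled `log_buf_len`.

```python
def parse_cmdline(text: str) -> dict:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def missing_params(cmdline_text: str, wanted: dict) -> dict:
    """Return the wanted parameters that are absent or set to another value."""
    have = parse_cmdline(cmdline_text)
    return {k: v for k, v in wanted.items() if have.get(k) != v}


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

On an affected machine, `missing_params(read_cmdline(), {"drm.debug": "0xe", "log_buf_len": "4M"})` returning an empty dict would indicate that both parameters are active for the current boot.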
Hello, I have also been fighting with this bug for a while. I tried to update the GPU driver using Intel Graphic Update Tool 2.02, and while LO did not hang anymore, my whole desktop had lots of issues.

I need to do more testing myself, but I wanted to point to this potential solution, which is linked to issues with the HWE stack on 16.04. So far it seems to work, both for the LO hang and for my desktop:

https://askubuntu.com/questions/964576/libreoffice-5-1-6-2-crashes-ubuntu-16-04-64-bit

My config is:

Skylake (GT2) HD Graphic 520
Mesa 17.0.7
Kernel 4.10.0-37
KDE neon user (Ubuntu 16.04)
Plasma 5.11.2
A similar GPU hang was recently fixed for 2D workloads. It would help us if someone affected by this LibreOffice crash would attempt to reproduce it with Mesa 17.3.0rc6.

See also: https://bugs.freedesktop.org/show_bug.cgi?id=103555
Just so you know, it's been a long time since I've seen this happen with an up-to-date kernel + Mesa, so I would say this has been fixed.
(In reply to Alejandro Lorenzo from comment #34)
> Just so you know, it's been a long time since I've seen this happen with
> an up-to-date kernel + Mesa, so I would say this has been fixed.

Same for me.