Bug 80568

Summary: [gen4] GPU Crash During Google Chrome Operation
Product: Mesa Reporter: haineb
Component: Drivers/DRI/i965 Assignee: Ian Romanick <idr>
Status: VERIFIED FIXED QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal    
Priority: medium CC: agonzalez, alex, alexsecret, annejan, astronouth7303, bjoern, bugzilla, christophe.curis, cooks.go.hungry, crypticscripture, danilo.pianini, dariuskellermann, ddstreet, dkg, egalanos, fdeskbug, fedeb1995, felix.schwarz, fxlehma, gianguidorama, hanno, hudsantos, isma.casti, jamiejaxon, jiri.bati.novak, jlbiord, johny_quest, lohmaier, maarten256, mhw-freedesktop, mikbini, nemesis, nickbz, ou.ghorbel, pac12referee, raiderx+freedesktop, raquacontact, shacharr, slex.bi, thexerothermicsclerodermoid, thomas.langewouters, wolf.duttlinger, ysangkok, zamiere
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments: /sys/class/drm/card0/error as requested
dmesg tail after GPU crash
The content of "chrome://gpu " in Chrome 38.
The content of "chrome://gpu " in Chromium 34.
apt-get logs to bisect changes
more logs from current GPU Crash
When crash happens (screenshot from glmark2 demo)
Error log dump
Compilation helper for old code versions, to assist future bisects (though it seems the issue is not bisectable)
foobilliardplus showing display corruption before crash
/sys/class/drm/card0/error when running firefox http://www.google.com/chrome/

Description haineb 2014-06-26 18:49:02 UTC
Created attachment 101821 [details]
/sys/class/drm/card0/error as requested

Normal browsing, no other significant applications running. Crash has occurred 3+ times. Kernel remains up after display crash. Able to use other vtys normally, but restarting mdm after display crash results in full kernel lock.

Google Chrome 35.0.1916.153

Linux host3 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Comment 1 haineb 2014-06-26 18:49:48 UTC
Unable to consistently replicate with any specific use case.
Comment 2 haineb 2014-06-26 18:50:19 UTC
Created attachment 101823 [details]
dmesg tail after GPU crash
Comment 3 haineb 2014-06-26 19:48:34 UTC
Issue replicated. Browsing in chrome again. No GPU crash reported this time. I'm lost on what component is truly causing the failure at this point.

[   18.736862] init: plymouth-upstart-bridge main process ended, respawning
[ 2738.952528] perf samples too long (2509 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[ 4610.808077] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... render ring idle
[ 4616.535139] Watchdog[2389]: segfault at 0 ip 00007fae73e88b4e sp 00007fae638ae670 error 6 in chrome[7fae6fe97000+4f39000]
[ 4638.603458] Watchdog[4588]: segfault at 0 ip 00007f2827619b4e sp 00007f281703f670 error 6 in chrome[7f2823628000+4f39000]
[ 4648.644352] Watchdog[4627]: segfault at 0 ip 00007f2c06ceeb4e sp 00007f2bf6714670 error 6 in chrome[7f2c02cfd000+4f39000]
Comment 4 Matt Turner 2014-11-14 19:24:13 UTC
*** Bug 83627 has been marked as a duplicate of this bug. ***
Comment 5 Matt Turner 2014-11-14 19:24:49 UTC
*** Bug 84781 has been marked as a duplicate of this bug. ***
Comment 6 Matt Turner 2014-11-14 19:25:25 UTC
*** Bug 84971 has been marked as a duplicate of this bug. ***
Comment 7 Matt Turner 2014-11-14 19:25:45 UTC
*** Bug 85344 has been marked as a duplicate of this bug. ***
Comment 8 Matt Turner 2014-11-14 19:26:12 UTC
*** Bug 85406 has been marked as a duplicate of this bug. ***
Comment 9 Matt Turner 2014-11-14 19:26:26 UTC
*** Bug 85656 has been marked as a duplicate of this bug. ***
Comment 10 Matt Turner 2014-11-14 19:26:38 UTC
*** Bug 85726 has been marked as a duplicate of this bug. ***
Comment 11 Matt Turner 2014-11-14 19:27:02 UTC
*** Bug 85749 has been marked as a duplicate of this bug. ***
Comment 12 Matt Turner 2014-11-14 19:27:16 UTC
*** Bug 85824 has been marked as a duplicate of this bug. ***
Comment 13 Matt Turner 2014-11-14 19:27:28 UTC
*** Bug 86108 has been marked as a duplicate of this bug. ***
Comment 14 Matt Turner 2014-11-14 19:27:39 UTC
*** Bug 85874 has been marked as a duplicate of this bug. ***
Comment 15 Matt Turner 2014-11-14 19:28:00 UTC
*** Bug 86167 has been marked as a duplicate of this bug. ***
Comment 16 Matt Turner 2014-11-14 19:28:27 UTC
*** Bug 81394 has been marked as a duplicate of this bug. ***
Comment 17 Matt Turner 2014-11-14 19:32:26 UTC
I've just marked a lot of bugs as duplicates of this one, so that we can hopefully get to the bottom of the Great Gen4 Chrome Hang of 2014.

Firstly, apologies if your bug is not actually the same bug as all of the rest. It's hard to determine that in advance.

It looks like this report was the first, and it's from the end of June. What changed when you started experiencing the hang? Did you update Mesa, the kernel, or Chrome itself?

Does reverting to a version of one of these from, say, April avoid the problem? If so, that's great and we should be able to bisect it.
Comment 18 Matt Turner 2014-11-14 19:34:18 UTC
*** Bug 83807 has been marked as a duplicate of this bug. ***
Comment 19 Matt Turner 2014-11-14 19:35:14 UTC
*** Bug 84561 has been marked as a duplicate of this bug. ***
Comment 20 Matt Turner 2014-11-14 19:36:28 UTC
*** Bug 85249 has been marked as a duplicate of this bug. ***
Comment 21 Matt Turner 2014-11-14 19:46:02 UTC
*** Bug 78483 has been marked as a duplicate of this bug. ***
Comment 22 Matt Turner 2014-11-14 19:46:39 UTC
*** Bug 74094 has been marked as a duplicate of this bug. ***
Comment 23 Michele Bini 2014-11-14 19:48:27 UTC
Well, it never worked for me (I'm the reporter of bug 85824) but I tried chrome only very recently (v 38.0.2125.111).

On the other hand, I have a case that reproduces bug 85824 100% of the time: boot, log in, launch Chrome, go to the Chrome store, don't touch anything, wait a few moments ... video hangs.
Comment 24 haineb 2014-11-14 20:35:09 UTC
The mainboard that this graphics chipset was embedded in has failed and I no longer have it available for testing. (Failure seemed to be DIMM related; probably not relevant to this bug.)

I rolled out Mint 17 with Chrome perhaps two weeks after its release/two weeks before opening this report. These intermittent failures started happening immediately. The failures persisted throughout all subsequent system/Chrome updates until the board died a few weeks ago.

Unfortunately, I was not able to identify any cases under which this set of packages worked stably on this graphics chipset.

(The new system with different mainboard is running the same packages; just moved the hard disc over. No failures have occurred since swapping to a different mainboard/graphics chipset.)
Comment 25 Jarno Suni 2014-11-14 22:49:40 UTC
I have upgraded chromium-browser from 37.0.2062.120 to 38.0.2125.111 recently, but my OS (Ubuntu Trusty) provides only version 34 as the alternative version to downgrade to.
Comment 26 lagreca 2014-11-15 03:25:00 UTC
I downgraded chromium from the current 38 version to the 34.0.1847.116 Built on Ubuntu 14.04, running on LinuxMint 17 aura (260972). It was available from the official repositories selecting the option to force version in synaptic.

Since I made it, I never experienced the gpu hang again.

PS: I'm running Linux Mint 17. On previous versions this bug didn't exist.
Comment 27 lagreca 2014-11-15 03:41:36 UTC
That leads me to the conclusion that the problem doesn't reside in Intel's driver, but in Chromium itself.
Comment 28 txtsd 2014-11-15 03:52:39 UTC
lagreca, can you please visit the Google Chrome Store and play multiple random videos on YouTube and see if Chrome still doesn't hang?
Comment 29 lagreca 2014-11-15 04:02:39 UTC
txtsd, I've already done that. I have browsed multiple tabs on YouTube, other video websites, the Outlook website (which made the hang happen when clicking the instant messenger button), and so on.

The bug just doesn't happen on Chromium Version 34.0.1847.116 Built on Ubuntu 14.04, running on LinuxMint 17 aura (260972).

I have tested it with and without the chromium ffmpeg extra codecs package.

I have tested flash videos and html5 videos.

There is no bug with this version.

Of course I'd like to use the newer versions of Chrome and Chromium.
Comment 30 lagreca 2014-11-15 04:03:34 UTC
I've entered chrome store while testing other tabs. 

It's ok.

There's no gpu hang anymore.
Comment 31 Alex 2014-11-15 04:07:19 UTC
lagreca, I think Chrome was still using software rendering for pages in version 34.  It hadn't switched to hardware acceleration yet.  I am not 100% sure though.
Another factor that is not mentioned here is that Chrome is not the only app that causes the crash.  Glmark2, an OpenGL benchmark app, causes exactly the same issue when it reaches the "Idea" test.
Comment 32 txtsd 2014-11-15 04:08:24 UTC
I've encountered the bug with github's Atom too.
Comment 33 lagreca 2014-11-15 04:16:22 UTC
Alex, 

But I checked the box to enable hardware acceleration when available.

If there are other programs affected, then I must agree with you that it's really a driver's bug.

But the Chromium version I installed works for me.
Comment 34 Alex 2014-11-15 04:32:01 UTC
It used to work for all of us, I remember it too.  The bug first appeared in version 36 and it was only a couple of times in 4 months.  It got worse in 37 and finally it's permanent in 38.

They keep changing things.  The fact that hardware acceleration is enabled, doesn't mean that all functions are enabled too.  If you type chrome://gpu on your search bar, you will see a more thorough list of what is actually enabled and what's not yet.  They keep certain functions disabled till they become stable.

On my system, the Web store crashes the GPU each and every time I go there without disabling hardware acceleration first.  It's not random.  The web store has no videos.  Hotmail does it too, but randomly.  The Google Chrome download page caused it once too.  It has to be something else, or there may be 2 or 3 factors causing this.
Comment 35 Alex 2014-11-15 04:35:14 UTC
(In reply to Alex from comment #34)
> It used to work for all of us, I remember it too.  The bug first appeared in
> version 36 and it was only a couple of times in 4 months.  It got worse in
> 37 and finally it's permanent in 38.
> 
> They keep changing things.  The fact that hardware acceleration is enabled,
> doesn't mean that all functions are enabled too.  If you type chrome://gpu
> on your search bar, you will see a more thorough list of what is actually
> enabled and what's not yet.  They keep certain functions disabled till they
> become stable.
> 
> On my system, the Web store is crashing the GPU each and every time I go
> there without disabling hardware acceleration first.  It's not random.  The
> web store has no videos.  hotmail is doing it too but randomly.  Google
> Chrome download page caused it once too.  It has to be something else or
> there maybe 2 or 3 factors causing this.

4 months is probably wrong.  It was a long time though.  :)
Comment 36 lagreca 2014-11-15 04:40:52 UTC
Created attachment 109501 [details]
The content of "chrome://gpu " in Chrome 38.
Comment 37 lagreca 2014-11-15 04:41:27 UTC
Created attachment 109502 [details]
The content of "chrome://gpu " in Chromium 34.
Comment 38 lagreca 2014-11-15 04:42:54 UTC
I attached two pdf files showing the content of chrome://gpu both in 38 and 34 versions.
Comment 39 Tom 2014-11-15 09:02:19 UTC
Hello,

I have tried to find, from the apt-get logs, the first version of the kernel / Intel drivers / Chrome that was causing problems:


root@Dell-LD830:~# grep -i Start-Date apt/* | perl -n -e 'BEGIN{my %hn;} $hn{$1}->{$2} = 1 if m/(\w+\.?\w*\.?\w*\.?\w*):.*?:\s+(\d+-\d+-\d+)/imgs; END{ print $_.": ".join(", ", sort keys %{$hn{$_}})."\n" for (sort(keys %hn)); }'

history.log: 2014-11-02, 2014-11-11
history.log.1: 2014-10-02, 2014-10-13, 2014-10-16, 2014-10-20
history.log.10: 2014-01-06, 2014-01-16
history.log.11: 2013-12-01, 2013-12-04, 2013-12-08, 2013-12-18, 2013-12-27, 2013-12-30
history.log.12: 2013-11-17, 2013-11-24
history.log.2: 2014-09-09, 2014-09-21, 2014-09-26
history.log.3: 2014-08-24
history.log.4: 2014-07-03, 2014-07-19
history.log.5: 2014-06-08
history.log.6: 2014-05-28
history.log.7: 2014-04-15
history.log.8: 2014-03-01, 2014-03-05, 2014-03-09, 2014-03-11, 2014-03-16
history.log.9: 2014-02-06, 2014-02-10, 2014-02-22


The GPU hang first occurred about 2 weeks before I reported it here, so I should check versions from before 2014-06-12. The bug was probably introduced in one of these upgrades:

history.log.5: 2014-06-08
history.log.6: 2014-05-28


Here are the versions that I was using at that time (extracted from the logs):

From history.log.9:
Start-Date: 2014-02-06  19:28:45
Commandline: apt-get upgrade ...
linux-image-3.11.0-15-generic:i386 (3.11.0-15.23, 3.11.0-15.25)
linux-firmware:i386 (1.116, 1.116.1)
xserver-xorg-video-intel:i386 (2.99.904-0ubuntu2, 2.99.904-0ubuntu2.1)
flashplugin-installer:i386 (11.2.202.335ubuntu0.13.10.1, 11.2.202.336ubuntu0.13.10.1)
...

From history.log.7:
Start-Date: 2014-04-15  16:41:31
Commandline: apt-get upgrade ...
google-chrome-stable:i386 (33.0.1750.152-1, 34.0.1847.116-1)
...

From history.log.6:
Start-Date: 2014-05-28  21:05:02
Commandline: apt-get upgrade ...
linux-firmware:i386 (1.127, 1.127.2)
linux-image-extra-3.13.0-24-generic:i386 (3.13.0-24.46, 3.13.0-24.47)
linux-libc-dev:i386 (3.13.0-24.46, 3.13.0-27.50)
linux-headers-3.13.0-24:i386 (3.13.0-24.46, 3.13.0-24.47)
linux-headers-3.13.0-24-generic:i386 (3.13.0-24.46, 3.13.0-24.47)
linux-image-3.13.0-24-generic:i386 (3.13.0-24.46, 3.13.0-24.47)
...

From history.log.5:
Start-Date: 2014-06-08  20:15:24
Commandline: apt-get upgrade ...
libegl1-mesa:i386 (10.1.0-4ubuntu5, 10.1.3-0ubuntu0.1)
libosmesa6:i386 (10.1.0-4ubuntu5, 10.1.3-0ubuntu0.1)
libegl1-mesa-drivers:i386 (10.1.0-4ubuntu5, 10.1.3-0ubuntu0.1)
libwayland-egl1-mesa:i386 (10.1.0-4ubuntu5, 10.1.3-0ubuntu0.1)
...


For further reference I will attach the whole apt-get history as a zip file.
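
A rough sketch of how the same logs could be narrowed to graphics-related upgrades only. It assumes the standard apt history.log layout ("Start-Date:" followed by an "Upgrade:" line of comma-separated "package:arch (old, new)" entries); the function name and the package prefixes in the pattern are illustrative assumptions, not something taken from the attached logs.

```shell
#!/bin/sh
# Sketch only: filter_graphics_upgrades is a hypothetical helper name.
# Reads apt history.log text on stdin and prints one line per upgraded
# package whose name starts with one of the (assumed) prefixes below:
#   <start date> <package:arch> (<old version>, <new version>)
filter_graphics_upgrades() {
    awk '
        /^Start-Date:/ { date = $2 }          # remember the transaction date
        /^Upgrade:/ {
            sub(/^Upgrade: /, "")
            n = split($0, pkgs, /\), /)        # entries are separated by "), "
            for (i = 1; i <= n; i++) {
                p = pkgs[i]
                if (p !~ /\)$/) p = p ")"      # restore the ")" eaten by split
                if (p ~ /^(linux-image|libgl1-mesa|libegl1-mesa|xserver-xorg-video-intel|google-chrome|chromium-browser)/)
                    print date, p
            }
        }
    '
}
```

With GNU gzip installed, something like "zcat -f /var/log/apt/history.log* | filter_graphics_upgrades | sort" should cover both the current and the rotated logs.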
Comment 40 Tom 2014-11-15 09:04:23 UTC
Created attachment 109507 [details]
apt-get logs to bisect changes

Added apt.zip log files;
Comment 41 Tom 2014-11-15 10:04:32 UTC
Created attachment 109509 [details]
more logs from current GPU Crash

Today I was able to reproduce this bug with:

Linux Dell-LD830 3.13.0-39-generic #66-Ubuntu SMP Tue Oct 28 13:31:23 UTC 2014 i686 i686 i686 GNU/Linux

Google Chrome 38.0.2125.111 

and glmark2 installed via apt-get install:

glmark2-data:i386 (2012.08-0ubuntu2, automatic), glmark2:i386 (2012.08-0ubuntu2)

I'm attaching logs for more information.
Comment 42 Tom 2014-11-15 10:12:12 UTC
Created attachment 109510 [details]
When crash happens (screenshot from glmark2 demo)

I was able to reproduce the error using the glmark2 demo on a Dell D830.

I also have another PC with an Intel GPU (ThinkPad T430s). I tried to reproduce the error on the ThinkPad's Ivy Bridge mobile GPU, but the bug does not occur there. I'm attaching a screenshot to show the exact moment in the demo when the crash happens (the screenshot is from the ThinkPad Ivy Bridge, not the Dell D830).
Comment 43 Matt Turner 2014-11-16 08:09:45 UTC
*** Bug 83423 has been marked as a duplicate of this bug. ***
Comment 44 lagreca 2014-11-18 02:09:10 UTC
Any news?
Comment 45 lagreca 2014-11-18 02:26:47 UTC
Why is Intel ignoring us?
Comment 46 lagreca 2014-11-18 02:27:10 UTC
We should file a class action!
Comment 47 Matt Turner 2014-11-18 03:15:58 UTC
Good grief. Take it easy. I work for Intel.
Comment 48 shachar 2014-11-18 06:32:20 UTC
The glmark2 issue might be related to https://www.libreoffice.org/bugzilla/show_bug.cgi?id=85367 .
It is reproducing on my machine as well (GLmark2 ideas crashes the GPU, "render ring stuck"). Can provide the error file/log files if needed. Can't bisect, as for me the issue showed up when jumping from ubuntu 10.04 to ubuntu 14.04.
However, I'm not sure that the glmark issue and the chrome issue are related.
Matt Turner, any way I can help diagnose/resolve the issue?
Comment 49 Daniel Vetter 2014-11-18 10:26:24 UTC
*** Bug 85958 has been marked as a duplicate of this bug. ***
Comment 50 coladict 2014-11-18 11:06:17 UTC
Created attachment 109667 [details]
Error log dump

Sending my latest dump from the crash as well.
I'll give it a few tries on my home computer as well tonight.
Comment 51 lagreca 2014-11-18 13:00:16 UTC
I've found this topic which seems to be related to our bug:
https://bugs.archlinux.org/task/38518?project=1

It says:
"Description:
The new xf86-video-intel (2.99.907-1) causes the display to crash. Reverting to previous version (2.21.15-2) resolves the issue."
Comment 52 Alex 2014-11-18 15:25:32 UTC
This bug exists in driver version 2.99.911 too.  This is the latest one.  I have installed this driver using the graphics installer offered by Intel.

I don't think there's any way for us to go back to version 2.21.15-2.  It's nowhere in the Ubuntu repos.

Intel will have to correct it eventually since reverting to older drivers is not the right solution for any driver, is it?  ;)
Comment 53 lagreca 2014-11-18 16:05:14 UTC
(In reply to Alex from comment #52)
> This bug exists in driver version 2.99.911 too.  This is the latest one.  I
> have installed this driver using the graphics installer offered by Intel.
> 
> I don't think there's any way for us to go back to version 2.21.15-2. 
> There's nowhere in the Ubuntu repos.
> 
> Intel will have to correct it eventually since reverting to older drivers is
> not the right solution for any driver, is it?  ;)

You're right, Alex.

I only hope that it helps to identify when the bug first happened and why. Maybe it's enough to eliminate it.
Comment 54 lagreca 2014-11-18 18:30:55 UTC
(In reply to Matt Turner from comment #47)
> Good grief. Take it easy. I work for Intel.

Matt, what then can you tell us about Intel's efforts to solve this issue?
Comment 55 Alex 2014-11-20 00:43:09 UTC
Ok.  There's an update for our case here.

Yesterday I installed and tested the new Chrome 39.  If we re-enable hardware acceleration on it and visit the Web store, the screen does not switch off any more BUT, there is a huge BUT...  The driver switches off hardware acceleration for the whole system after that and unless the system is rebooted, it stays off.  This of course means that we have to keep hardware acceleration off in Chrome if we don't want to have the driver switch it off for the entire system.

It would be nice if Intel could solve this issue soon.
Comment 56 lagreca 2014-11-20 13:09:53 UTC
(In reply to Alex from comment #55)
> Ok.  There's an update for our case here.
> 
> Yesterday I installed and tested the new Chrome 39.  If we re-enable
> hardware acceleration on it and visit the Web store, the screen does not
> switch off any more BUT, there is a huge BUT...  The driver switches off
> hardware acceleration for the whole system after that and unless the system
> is rebooted, it stays off.  This of course means that we have to keep
> hardware acceleration off in Chrome if we don't want to have the driver
> switch it off for the entire system.
> 
> It would be nice if Intel could solve this issue soon.

It's up to intel then.
Comment 57 shachar 2014-11-21 05:05:42 UTC
I did a few tests to see if I can narrow down the cause of the issue. Here are my findings so far:

- It seems that the bug is triggered only if AccelMethod is SNA. Setting AccelMethod to UXA seems to hide the issue with chrome/chromium. The issue still shows up when running glmark2 -b ideas (though this might be a different issue).

- I tried changing the Xorg/XFree driver versions. I used the freedesktop git and went all the way back to 2.20.0, where SNA was officially introduced. The bug reproduces there as well. I am attaching a small patch to make the code from 2.20.0 compile on a modern Xorg version.

- Last April was when Ubuntu released their new "long term support" version. This (and its derivatives) might explain part of the spike in bug reports at this point, as a large number of people moved to a release that has SNA enabled by default.

- The ArchLinux page on Intel graphics ( https://wiki.archlinux.org/index.php/Intel_graphics ) contains a few pointers to additional tweaking knobs to try out. I am going to try them next when I have some free time.
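
For anyone who wants to repeat the SNA/UXA comparison from the first point, here is a minimal sketch of writing the X.org configuration snippet. "AccelMethod" is an option of the intel DDX driver; the function name, the file name 20-intel.conf and the Identifier string are assumptions chosen for illustration. Note that UXA only appeared to hide the problem on one machine, so this is a diagnostic step rather than a fix.

```shell
#!/bin/sh
# Sketch only: write_intel_accel_conf is a hypothetical helper name.
# Usage: write_intel_accel_conf DIR METHOD   (METHOD is e.g. "uxa" or "sna")
# Writes DIR/20-intel.conf (file name is an assumption) selecting the
# acceleration method for the "intel" X.org driver.
write_intel_accel_conf() {
    dir=$1
    method=$2
    mkdir -p "$dir" || return 1
    cat > "$dir/20-intel.conf" <<EOF
Section "Device"
    Identifier "Intel Graphics"
    Driver     "intel"
    Option     "AccelMethod" "$method"
EndSection
EOF
}
```

To use it for real, the target directory would typically be /etc/X11/xorg.conf.d (root required), followed by a restart of the X server.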
Comment 58 shachar 2014-11-21 05:06:55 UTC
Created attachment 109788 [details]
Compilation helper for old code versions, to assist future bisects (though it seems the issue is not bisectable)
Comment 59 Alex 2014-11-21 15:51:45 UTC
(In reply to shachar from comment #57)
> I did few tests to see if I can narrow down the cause of the issue, here are
> my insights so far:
> 
> - It seems that the bug is triggered only if AccelMethod is SNA. Setting the
> AccelMethod to UXA seems to be hiding the issue with chrome/chromium. Issue
> still shows up when running glmark2 -b ideas (though this might be a
> different issue)
> 
> - I tried changing the Xorg/XFree driver versions. I used the freedesktop
> git, and went all the way back to 2.20.0, where SNA was officially
> introduced. Bug is reproducing there as well. I attach a small patch to make
> the code from 2.20.0 compile on modern Xorg version.
> 
> - Last April was when Ubuntu released their new "long term support" version,
> this (and derivatives) might explain part of the spike in the bug reports at
> this point, as large number of people jumped ship to have SNA enabled by
> default in their distro
> 
> - The ArchLinux page on Intel graphics (
> https://wiki.archlinux.org/index.php/Intel_graphics ) contains few pointers
> to additional tweaking knobs to try out. Going to try them next when I have
> some free time.


I'm really surprised to see that switching to UXA really helped you because that exact test was the very first I did when the bug first appeared and it didn't work at all.  Actually, I didn't even have to visit the Web Store in order to produce it.  I just attempted to watch a random video on Youtube and the screen went black at once.  I didn't even need a second test.  :)
Comment 60 lagreca 2014-11-21 16:25:39 UTC
I tried changing accel mode to uxa again. Surprise, it doesn't work for me. The bug persists. Chrome still crashes X11.
Comment 61 lagreca 2014-11-21 16:26:49 UTC
Intel must intervene as soon as possible.
Comment 62 Ian Romanick 2014-11-21 17:26:46 UTC
There is some evidence (see bug #85267) that always_flush_cache=true or always_flush_batch=true may help.

always_flush_cache=true always_flush_batch=true chromium-browser

This suggests that we're missing a flush somewhere... but finding where is like finding a needle in all the haystacks. :(
Comment 63 Ian Romanick 2014-11-21 17:28:03 UTC
(In reply to Ian Romanick from comment #62)
> There is some evidence (see bug #85267) that always_flush_cache=true or
                      Oops... bug #85367

> always_flush_batch=true may help.
> 
> always_flush_cache=true always_flush_batch=true chromium-browser
> 
> This suggests that we're missing a flush somewhere... but finding where is
> like finding a needle in all the haystacks. :(
Comment 64 Alex 2014-11-21 18:03:10 UTC
So, are you suggesting that we create a launcher using this as command line?

always_flush_cache=true always_flush_batch=true /usr/bin/google-chrome

or

always_flush_cache=true always_flush_batch=true glmark2

I am not sure I understand correctly.  :)
Comment 65 Matt Turner 2014-11-21 18:18:51 UTC
(In reply to Alex from comment #64)
> So, are you suggesting that we create a launcher using this as command line?
> 
> always_flush_cache=true always_flush_batch=true /usr/bin/google-chrome

Yes.
Comment 66 Alex 2014-11-21 20:45:29 UTC
Ok.  I tested all three, Chrome, glmark2 and glmark2-es2 running them from the terminal using your suggestion.

All three are working fine, the web store is loading fine, videos play ok.

I wish I knew how to include those two settings, "always_flush_cache=true always_flush_batch=true", in a launcher command line though, since the only way I can launch Chrome this way is through a terminal, and that terminal becomes so crowded with output after a while.

I think this trick can be used till the final fix is released.
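
A small sketch of one way to start a program with both options set while keeping the terminal free of its output. The helper name launch_with_flush and the FLUSH_LAUNCH_LOG variable are made up for this example; the only things taken from this thread are the two option names.

```shell
#!/bin/sh
# Sketch only: launch_with_flush and FLUSH_LAUNCH_LOG are hypothetical names.
# Starts the given command in the background with both flush options in
# its environment. Output goes to $FLUSH_LAUNCH_LOG, or is discarded if
# that variable is unset. nohup lets the program outlive the terminal.
launch_with_flush() {
    log=${FLUSH_LAUNCH_LOG:-/dev/null}
    env always_flush_cache=true always_flush_batch=true \
        nohup "$@" </dev/null >"$log" 2>&1 &
}
```

For example, "launch_with_flush /usr/bin/google-chrome" would start Chrome and return to the prompt immediately, after which the terminal can be closed.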
Comment 67 lagreca 2014-11-21 21:42:41 UTC
(In reply to Alex from comment #66)
> Ok.  I tested all three, Chrome, glmark2 and glmark2-es2 running them from
> the terminal using your suggestion.
> 
> All three are working fine, the web store is loading fine, videos play ok.
> 
> I wish I knew how to incluse those two "always_flush_cache=true
> always_flush_batch=true" in a launcher command line though since the only
> way I can launch chrome this way, is through a terminal and that terminal
> becomes so crowded with data after a while.
> 
> I think this trick can be used till the final fix is released.

Yes, it really helps!!! Everything's fine with this workaround.
Comment 68 Ian Romanick 2014-11-22 01:52:13 UTC
Another method is to set those options in either the user or system drirc.  I'm posting this from my phone, so I'll leave finding the details as an exercise for the reader. :)
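
As a starting point for that exercise, a hedged sketch of generating a user drirc. The XML layout follows the usual driconf file structure; driver="i965", screen="0" and applying the options to the "Default" application (that is, to every program) are assumptions, and this has not been verified on affected gen4 hardware. The helper name is hypothetical, and it overwrites the target file, so back up any existing ~/.drirc first.

```shell
#!/bin/sh
# Sketch only: write_drirc is a hypothetical helper name.
# Usage: write_drirc FILE
# Overwrites FILE with a drirc that turns on both flush options for all
# applications using the i965 driver (driver and screen are assumptions).
write_drirc() {
    cat > "$1" <<'EOF'
<driconf>
    <device screen="0" driver="i965">
        <application name="Default">
            <option name="always_flush_cache" value="true" />
            <option name="always_flush_batch" value="true" />
        </application>
    </device>
</driconf>
EOF
}
```

Calling write_drirc "$HOME/.drirc" would install it for the current user; deleting the file reverts the change once a real fix is released.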
Comment 69 Alex 2014-11-22 03:09:46 UTC
(In reply to Ian Romanick from comment #68)
> Another method is to set those options in either the user or system drirc. 
> I'm posting this from my phone, so I'll leave finding the details as an
> exercise for the reader. :)

First of all thank you for showing us this workaround.  I read about drirc and found about driconf too and how we can set those options.  Using this syntax though: "always_flush_cache=true always_flush_batch=true /usr/bin/google-chrome-stable & exit" runs chrome and closes the terminal right after and I guess this seems to be a better way to handle a temporary situation than changing dri settings and then changing them back to what they were when a fix is released.

I'd like to ask what the drawbacks for these two settings are though.  Are they slowing things down for the specific app that's using them for example?
Comment 70 Alex 2014-11-25 15:21:32 UTC
Today, kernel 3.16 was officially released for 14.04 LTS in the Canonical repos.  Do you think installing it would solve the problem we're facing or would we still have to use the same workaround anyway please?
Comment 71 Ryan Underwood 2014-11-25 19:35:43 UTC
Don't bother, unless it contains a special patch it doesn't help (also tested with 3.17.x and 3.18-rc5)
Comment 72 Matt Turner 2014-11-27 19:02:48 UTC
*** Bug 86757 has been marked as a duplicate of this bug. ***
Comment 73 Daniel Vetter 2014-11-27 19:15:50 UTC
Just an aside to everyone suffering from gen4 GPU hangs: if your kernel doesn't manage to reset the GPU after a hang, please grab the latest drm-intel-nightly branch from http://cgit.freedesktop.org/drm-intel. It has the fixed-up gen4 GPU reset from Ville. I'll try to get this into 3.19 (but the drm subsystem merge window is kinda gone already).
Comment 74 Nikola Kovacevic 2014-11-28 09:53:17 UTC
I can confirm that starting chrome as "always_flush_cache=true always_flush_batch=true google-chrome" mitigates the issue. I can not seem to set that option in ~/.drirc file though, neither by changing default settings or adding chrome as an application using driconf tool, so if someone could post more info on how to do that until the issue gets fixed I'd really appreciate it.
Comment 75 Nikola Kovacevic 2014-11-28 17:25:42 UTC
(In reply to Daniel Vetter from comment #73)
> Just aside to everyone suffering from gen4 gpu hangs: If your kernel doesn't
> manage to reset the gpu after a hang please grab the latest
> drm-intel-nightly branch from http://cgit.freedesktop.org/drm-intel It has
> fixed up gen4 gpu reset from Ville. I'll try to get this into 3.19 (but the
> drm subsystem merge window is kinda gone already).

It works - as in it doesn't crash the GPU, but screen flickers (goes black quickly and then shows content again) and system becomes unresponsive until the page is finished rendering.
Comment 76 Alex 2014-11-28 17:46:32 UTC
(In reply to nikolak from comment #75)
> (In reply to Daniel Vetter from comment #73)
> > Just aside to everyone suffering from gen4 gpu hangs: If your kernel doesn't
> > manage to reset the gpu after a hang please grab the latest
> > drm-intel-nightly branch from http://cgit.freedesktop.org/drm-intel It has
> > fixed up gen4 gpu reset from Ville. I'll try to get this into 3.19 (but the
> > drm subsystem merge window is kinda gone already).
> 
> It works - as in it doesn't crash the GPU, but screen flickers (goes black
> quickly and then shows content again) and system becomes unresponsive until
> the page is finished rendering.

What you're describing, is the way they've programmed the new Chrome 39 to handle the problem.  This happens if you don't issue the "always_flush..." settings or if they don't work for some reason.

The point is that if the symptoms you're describing happen, it means that the driver is switching off hardware acceleration completely for the whole system and it uses software rendering for all apps afterwards.  That's why you can see the contents afterwards.

You can verify this by checking the Xorg.0.log file in /var/log.  You will see a line near the end of the file saying "Disabling hardware acceleration...".

Personally, I'm using this method to start Chrome on my system which is Ubuntu based.  I am opening a terminal and I'm pasting the line "always_flush_cache=true always_flush_batch=true /usr/bin/google-chrome-stable & exit".  After pressing ENTER, the terminal is closed automatically and Chrome starts as usual.
Comment 77 Nikola Kovacevic 2014-11-28 19:26:28 UTC
(In reply to Alex from comment #76)
[...]
> The point is that if the symptoms you're describing happen, it means that
> the driver is switching off hardware acceleration completely for the whole
> system and it uses software rendering for all apps afterwards.  That's why
> you can see the contents afterwards.
> 
> You can verify this by checking the Xorg.0.log file in /var/log.  You will
> see a line near the end of the file saying "Disabling hardware
> acceleration...".


Doesn't seem to be it. Xorg.0.log doesn't show anything like that.

The only error I see is in my syslog:

Nov 28 20:13:20 mint kernel: [ 7497.992041] [drm] stuck on render ring
Nov 28 20:13:20 mint kernel: [ 7497.993305] [drm] GPU HANG: ecode 4:0:0x9f47f9fd, in chrome [4630], reason: Ring hung, action: reset
Nov 28 20:13:20 mint kernel: [ 7498.016208] drm/i915: Resetting chip after gpu hang
Nov 28 20:13:21 mint kernel: [ 7498.445210] ------------[ cut here ]------------
Nov 28 20:13:21 mint kernel: [ 7498.445258] WARNING: CPU: 1 PID: 1830 at /home/apw/COD/linux/drivers/gpu/drm/drm_irq.c:1081 drm_wait_one_vblank+0x125/0x130 [drm]()
Nov 28 20:13:21 mint kernel: [ 7498.445262] vblank not available on crtc 1, ret=-22
...etc

And if it did turn off hardware rendering it probably wouldn't happen on each page refresh like it does currently.
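
A quick sketch for pulling just the hang-related lines out of a kernel log, which may make reports like the one above easier to compare. The patterns are the message strings that appear in the logs posted in this bug; other kernel versions may word them differently, and the function name is made up for the example.

```shell
#!/bin/sh
# Sketch only: find_gpu_hangs is a hypothetical helper name.
# Reads kernel log text on stdin and prints only the lines that report a
# GPU hang, a stuck ring, a hangcheck timeout or a chip reset.
find_gpu_hangs() {
    grep -E 'GPU HANG|stuck on render ring|Hangcheck timer elapsed|Resetting chip after gpu hang'
}
```

Typical use would be "dmesg | find_gpu_hangs" or "find_gpu_hangs < /var/log/syslog".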

> Personally, I'm using this method to start Chrome on my system which is
> Ubuntu based.  I am opening a terminal and I'm pasting the line
> "always_flush_cache=true always_flush_batch=true
> /usr/bin/google-chrome-stable & exit".  After pressing ENTER, the terminal
> is closed automatically and Chrome starts as usual.

Starting Chrome with those parameters does the trick, but it doesn't help if Chrome gets started by clicking a link in an external application, for example.

Hopefully a better fix, either from Intel or Google, will make it into the official repositories.
Comment 78 Matt Turner 2014-12-04 22:08:11 UTC
*** Bug 87000 has been marked as a duplicate of this bug. ***
Comment 79 Warren 2014-12-21 15:54:12 UTC
Has anyone had success in either creating a desktop launcher or configuring .drirc to set the always_flush_cache=true and always_flush_batch=true parameters?
Comment 80 txtsd 2014-12-21 20:15:00 UTC
(In reply to Warren from comment #79)
> Has anyone had success in either creating a desktop launcher or configuring
> .drirc to set the always_flush_cache=true and always_flush_batch=true
> parameters?

I use this in my chrome.desktop
Exec=env always_flush_cache=true always_flush_batch=true /usr/bin/google-chrome-unstable %U

But my chrome has been failing to paint tabs, and refreshing them does nothing. I have to open a new tab and enter the same links, or duplicate the tabs.
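
Since the same two variables have to be set however Chrome is launched, one option is a small wrapper that sets them and then runs whatever command it is given. A minimal sketch, assuming the environment-variable form of the workaround quoted above; run_with_flush_workaround is a hypothetical name, not part of Mesa or Chrome.

```shell
# Hypothetical wrapper: run any command with the two i965 workaround
# options from this thread set in its environment.
run_with_flush_workaround() {
    env always_flush_cache=true always_flush_batch=true "$@"
}

# Possible usage, mirroring the Exec= line above:
#   run_with_flush_workaround /usr/bin/google-chrome-stable "$@"
```

Put in a script that sits ahead of the real binary in PATH, something like this might also cover Chrome being started from other applications, though that is an untested assumption.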
Comment 81 Alex 2014-12-21 20:21:33 UTC
(In reply to txtsd from comment #80)
> (In reply to Warren from comment #79)
> > Has anyone had success in either creating a desktop launcher or configuring
> > .drirc to set the always_flush_cache=true and always_flush_batch=true
> > parameters?
> 
> I use this in my chrome.desktop
> Exec=env always_flush_cache=true always_flush_batch=true
> /usr/bin/google-chrome-unstable %U
> 
> But my chrome has been failing to paint tabs, and refreshing them does
> nothing. I have to open a new tab and enter the same links, or duplicate the
> tabs.

This "unstable" I see in the path, wouldn't have something to do with that, would it?
Comment 82 Warren 2014-12-22 01:13:48 UTC
(In reply to txtsd from comment #80)
> (In reply to Warren from comment #79)
> > Has anyone had success in either creating a desktop launcher or configuring
> > .drirc to set the always_flush_cache=true and always_flush_batch=true
> > parameters?
> 
> I use this in my chrome.desktop
> Exec=env always_flush_cache=true always_flush_batch=true
> /usr/bin/google-chrome-unstable %U
> 
> But my chrome has been failing to paint tabs, and refreshing them does
> nothing. I have to open a new tab and enter the same links, or duplicate the
> tabs.

Thank you. I am unfamiliar with the env parameter for the Exec= entry. I had tried basically the same Exec= statement as your suggestion, minus the env, and it failed. After adding the env it works fine. Thanks again.
Comment 83 txtsd 2014-12-22 08:27:53 UTC
(In reply to Alex from comment #81)
> (In reply to txtsd from comment #80)
> > (In reply to Warren from comment #79)
> > > Has anyone had success in either creating a desktop launcher or configuring
> > > .drirc to set the always_flush_cache=true and always_flush_batch=true
> > > parameters?
> > 
> > I use this in my chrome.desktop
> > Exec=env always_flush_cache=true always_flush_batch=true
> > /usr/bin/google-chrome-unstable %U
> > 
> > But my chrome has been failing to paint tabs, and refreshing them does
> > nothing. I have to open a new tab and enter the same links, or duplicate the
> > tabs.
> 
> This "unstable" I see in the path, wouldn't have something to do with that,
> would it?

Yea, I run the dev version, and that was actually a bug in the previous version.
Comment 84 Alex 2015-01-09 00:05:04 UTC
Guys, this is a bit irrelevant but since all the driver updates and stuff are done using these servers, I thought I'd post these errors they report for the past two days during apt-get update:
--------------------------------------------------
Err https://download.01.org trusty/main amd64 Packages                         
  server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none
Err https://download.01.org trusty/main i386 Packages
  server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none
Ign https://download.01.org trusty/main Translation-en_US
Ign https://download.01.org trusty/main Translation-en
Fetched 738 kB in 34s (21.4 kB/s)
W: Failed to fetch https://download.01.org/gfx/ubuntu/14.04/main/dists/trusty/main/binary-amd64/Packages  server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none

W: Failed to fetch https://download.01.org/gfx/ubuntu/14.04/main/dists/trusty/main/binary-i386/Packages  server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none

E: Some index files failed to download. They have been ignored, or old ones used instead.
--------------------------------------------------
Is anyone else facing this issue please?  Do we know anything about it?
Comment 85 Ryan Underwood 2015-01-09 02:54:40 UTC
*** Bug 86847 has been marked as a duplicate of this bug. ***
Comment 86 Kenneth Graunke 2015-01-19 21:30:05 UTC
Hi all.  I believe this should be fixed with Mesa master - specifically, this commit:

commit c4fd0c9052dd391d6f2e9bb8e6da209dfc7ef35b
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Sat Jan 17 23:21:15 2015 -0800

    i965: Work around mysterious Gen4 GPU hangs with minimal state changes.

If you're able to test with Mesa master, I'd appreciate any reports of whether this solved the problem for you.  It seems to have helped for me.
Comment 87 Darius Kellermann 2015-01-26 11:50:16 UTC
Updated the mesa package to version 10.4.3 today, which includes the specified commit. Problem is now fixed for me. Thank you very much!
Comment 88 Andrea Bini 2015-01-26 16:47:39 UTC
(In reply to Kenneth Graunke from comment #86)
> Hi all.  I believe this should be fixed with Mesa master - specifically,
> this commit:
> 
> commit c4fd0c9052dd391d6f2e9bb8e6da209dfc7ef35b
> Author: Kenneth Graunke <kenneth@whitecape.org>
> Date:   Sat Jan 17 23:21:15 2015 -0800
> 
>     i965: Work around mysterious Gen4 GPU hangs with minimal state changes.
> 
> If you're able to test with Mesa master, I'd appreciate any reports of
> whether this solved the problem for you.  It seems to have helped for me.

Hi all, I'm the reporter of this bug https://bugs.launchpad.net/ubuntu/+source/xorg/+bug/1388612 that was redirected here.
 
I've tested the fix on my machine by updating the Xorg stack from https://launchpad.net/~xorg-edgers/+archive/ubuntu/ppa. A packaged version of Mesa with the fix was released there only yesterday. Unfortunately, much as I'd like to, I'm not able to update the system myself as soon as a change is committed. Cloning the Mesa repository, building and installing it is not enough, right? Is there a "faster" method than the one I've used?

Anyway, the glmark2 ideas benchmark works fine now but, in my case, the issue is not solved yet. An application I wrote using SDL (https://www.libsdl.org/), which before the fix caused the hang most of the time, now terminates due to a failing assertion in Mesa.

The most frequent is:
SDLApp: ../../../../src/mesa/vbo/vbo_exec_draw.c:222: vbo_exec_bind_arrays: Assertion `exec->vtx.bufferobj->Mappings[MAP_INTERNAL].Pointer' failed.

Sometimes this happens:
SDLApp: ../../../../src/mesa/vbo/vbo_exec_draw.c:278: vbo_exec_vtx_unmap: Assertion `exec->vtx.buffer_ptr != ((void *)0)' failed.

The application works fine on another machine with a different chip. 

The billiard game FooBillard++ (http://foobillardplus.sourceforge.net/) still causes the hang immediately.
The workaround of using always_flush_cache=true and always_flush_batch=true works neither with my application nor with FooBillard++. Prior to the fix, it only let the ideas benchmark run fine. I've not tested Chrome yet, but I can if that would be useful.

Thank you Kenneth for the fix and for any further help. I'd really like to use Linux on this machine for my developments. If I can help somehow let me know.
Comment 89 Janus Troelsen 2015-01-27 19:14:21 UTC
Created attachment 112888 [details]
foobilliardplus showing display corruption before crash

foobilliardplus official x86_64 binaries for ubuntu 11.10 running on ubuntu 14.10 with xorg-edgers (mesa build from January 25th, 2015)
Comment 90 Janus Troelsen 2015-01-27 19:18:12 UTC
The issue is resolved in Chrome for me; thank you Kenneth, your patch makes it so that I do not have this issue every day on YouTube. However, foobilliardplus does make the GPU hang. See my attachment for a screenshot showing display corruption.

Can we reopen this?
Comment 91 Matt Turner 2015-01-27 19:27:40 UTC
(In reply to Janus Troelsen from comment #90)
> The issue is resolved in Chrome for me; thank you Kenneth, your patch makes
> it so that I do not have this issue every day on YouTube. However,
> foobilliardplus does make the GPU hang. See my attachment for a screenshot
> showing display corruption.
> 
> Can we reopen this?

We're up to 90 comments, and the foobilliardplus crash must be different from the one you confirmed is fixed. Let's file a new bug.
Comment 92 Jamie Jackson 2015-01-28 13:05:33 UTC
What's the edgers/package-based workaround?

I tried the following (on Linux Mint 17):

sudo apt-add-repository ppa:xorg-edgers/ppa && sudo apt-get update && sudo apt-get upgrade 

...and I got some new packages, but also saw that some were held back:

The following packages have been kept back:
  libegl1-mesa libgbm1 libgl1-mesa-dri:i386 libgl1-mesa-dri libqt5gui5
  libwayland-egl1-mesa libxatracker2 lxc-docker mintupdate python-cupshelpers
  system-config-printer-gnome

After restarting the system, I still get a Firefox crash or black screen when hitting http://www.google.com/chrome/ (which I think was part of the same problem--I don't think I ever had a 100% reproducible Chrome/YouTube test case).
Comment 93 Ryan Underwood 2015-01-28 15:17:11 UTC
You didn't actually install the packages that include the fix. :-)  Try dist-upgrade.
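
For context: plain apt-get upgrade does not install new packages or remove existing ones, so any upgrade that needs a new dependency is reported as "kept back", while dist-upgrade is allowed to make those changes. A rough sketch for listing what was kept back from apt-get's output; kept_back_packages is a made-up helper name, and it assumes the English message wording shown in comment #92.

```shell
# Hypothetical helper: reads apt-get output on stdin and prints one
# kept-back package name per line. Assumes the English "kept back" header
# followed by indented package lines, as in the output quoted above.
kept_back_packages() {
    awk '
        /^The following packages have been kept back:/ { grab = 1; next }
        grab && /^  / { for (i = 1; i <= NF; i++) print $i; next }
        { grab = 0 }
    '
}

# Possible usage (-s only simulates, nothing is installed):
#   LC_ALL=C apt-get -s upgrade | kept_back_packages
```

If libgl1-mesa-dri shows up in that list, the Mesa packages from the PPA have not been installed yet.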
Comment 94 1544c 2015-01-28 18:21:40 UTC
(In reply to Ryan Underwood from comment #93)
> You didn't actually install the packages that include the fix. :-)  Try
> dist-upgrade.

Newbie here. I tried the same thing as Jamie Jackson but it keeps crashing.
I have a freshly installed Ubuntu 14.04
This happens every time I try to visit "google.com/chrome" with Firefox and when I try to watch a YouTube video with Google Chrome.
Comment 95 Alex 2015-01-28 18:36:12 UTC
I installed the latest available drivers and Mesa 10.5.0 from xorg-edgers today (Jan 28, 2015) on my system and the results SO FAR and after quite extensive and repetitive tests are:

glmark2               : fixed
glmark2-es2           : fixed

Chrome
webstore              : fixed
www.google.com/chrome : fixed
youtube videos        : fixed

I will continue running tests though, since the new Chrome 40 seems to have screen drawing issues when I change to a different workspace and back to the one Chrome is in and every time Chrome is minimized and restored.  Clicking anywhere on the desktop and back on the Chrome window refreshes it though and it goes back to normal.

Thanks anyway!  ;-)
Comment 96 Ryan Underwood 2015-01-28 19:21:38 UTC
> Newbie here. I tried the same thing as Jamie Jackson but it keeps crashing.
> I have a freshly installed Ubuntu 14.04

I'm not sure if this means that you have also not actually upgraded to the fixed packages as Jamie Jackson indicated.  Please dpkg -l libgl1-mesa-dri and check that the installed version is equivalent to the mesa version listed on xorg-edgers: https://launchpad.net/~xorg-edgers/+archive/ubuntu/ppa
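
The version check can also be scripted. The sketch below is an assumption-laden illustration: version_ge is a hypothetical helper, it relies on GNU sort's -V option, and the 10.4.3 threshold is only taken from comment #87, which reports that release as including the commit.

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal
# to version $2 under GNU sort's version ordering (sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Possible usage on a Debian/Ubuntu system (package name as in this thread):
#   installed=$(dpkg-query -W -f='${Version}\n' libgl1-mesa-dri:amd64)
#   version_ge "$installed" 10.4.3 && echo "Mesa is at least 10.4.3"
```

This only compares version strings; it cannot tell whether a distribution has backported the commit into an older package.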
Comment 97 1544c 2015-01-28 20:27:16 UTC
(In reply to Ryan Underwood from comment #96)
> I'm not sure if this means that you have also not actually upgraded to the
> fixed packages as Jamie Jackson indicated.  Please dpkg -l libgl1-mesa-dri
> and check that the installed version is equivalent to the mesa version
> listed on xorg-edgers: https://launchpad.net/~xorg-edgers/+archive/ubuntu/ppa

This is what I have

Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                   Version                                            Architecture Description
ii  libgl1-mesa-dri:amd64  10.5.0~git20150127.5c83a0d2-0ubuntu0ricotz~trusty  amd64        free implementation of the OpenGL API -- DRI modules
Comment 98 Janus Troelsen 2015-01-28 21:11:17 UTC
Regarding the GPU reset, it has been in Linux since 3.19-rc1. So if your GPU is not resetting after crashing, upgrade your kernel. Ubuntu has mainline Linux kernel dpkg packages that work well.

This is the patching commit: https://github.com/torvalds/linux/commit/656bfa3afc14e45e2d9e1624bf60d79b3beb12f2

It sounds like everyone's GPUs are resetting, so I'm wondering if maybe the Ubuntu guys backported this.
Comment 99 Jamie Jackson 2015-01-29 00:59:21 UTC
(In reply to 1544c from comment #94)
> (In reply to Ryan Underwood from comment #93)
> > You didn't actually install the packages that include the fix. :-)  Try
> > dist-upgrade.
> 
> Newbie here. I tried the same thing as Jamie Jackson but it keeps crashing.
> I have a freshly installed Ubuntu 14.04
> This happens everytime when I try to visit "google.com/chrome" with Firefox
> and when I try to watch a YouTube video with Google Chrome.

Thanks, Ryan. I had high hopes for dist-upgrade; alas, it didn't seem to work.

Let me know if I did something wrong, or if I'm barking up the wrong tree, but here's what I got:

# show currently installed version (mine showed 10.1.3-0ubuntu0.3 for amd64 and i386)
dpkg -l libgl1-mesa-dri
# add edgers repo
sudo apt-add-repository ppa:xorg-edgers/ppa
# get the new package lists
sudo apt-get update
# install the edgers packages
sudo apt-get dist-upgrade 
# show currently installed version (my packages now show 10.5.0~git20150127)
dpkg -l libgl1-mesa-dri
# reboot
sudo reboot

# try firefox test case
firefox http://www.google.com/chrome/

============= Yields BSOD, with... ================
(process:3075): GLib-CRITICAL **: g_slice_set_config: assertion 'sys_page_size == 0' failed

(firefox:3075): GLib-GObject-WARNING **: Attempt to add property GnomeProgram::sm-connect after class was initialised

(firefox:3075): GLib-GObject-WARNING **: Attempt to add property GnomeProgram::show-crash-dialog after class was initialised

(firefox:3075): GLib-GObject-WARNING **: Attempt to add property GnomeProgram::display after class was initialised

(firefox:3075): GLib-GObject-WARNING **: Attempt to add property GnomeProgram::default-icon after class was initialised
ATTENTION: default value of option force_s3tc_enable overridden by environment.
intel_do_flush_locked failed: Invalid argument
===================================================

BTW:
jamie@minty ~ $ uname -a
Linux minty 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
jamie@minty ~ $ lsb_release -d
Description:	Linux Mint 17 Qiana
Comment 100 Alex 2015-01-29 02:09:29 UTC
Jamie Jackson, having read about your issue with www.google.com/chrome, I've visited the site at least 10 times using Chrome after yesterday's updates and all went just fine.  Are you facing the same issue when you're using Chrome, or is it a Firefox issue only?
Comment 101 Ryan Underwood 2015-01-29 02:34:20 UTC
Jamie, your uname -a output indicates that you haven't installed a kernel that contains the other part of the fix.
Go here: http://kernel.ubuntu.com/~kernel-ppa/mainline/
Pick a kernel >= v3.19-rc1 and install the proper debs.  Then make sure you choose that kernel at your bootloader.
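
A quick way to check whether the running kernel meets that threshold is to compare the major and minor numbers from uname -r. This is a sketch only: kernel_at_least is a hypothetical helper, and it looks purely at the version string, so it cannot detect a distribution kernel that has the reset fix backported.

```shell
# Hypothetical helper: succeeds when the kernel release string $1 is at
# least major version $2, minor version $3 (e.g. 3 and 19).
kernel_at_least() {
    release=$1
    major=${release%%.*}          # text before the first dot
    rest=${release#*.}            # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of the remainder
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

# Possible usage:
#   kernel_at_least "$(uname -r)" 3 19 || echo "kernel older than 3.19"
```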
Comment 102 Jamie Jackson 2015-01-29 04:47:01 UTC
(In reply to Alex from comment #100)
> Jamie Jackson, having read you issue with www.google.com/chrome, I've
> visited the site at least 10 times using Chrome after yesterday's updates
> and all went just fine.  Are you facing the same issue when you're using
> chrome or is it a firefox issue only?

Not sure if the Chrome/YouTube issue still persisted after that, because it was intermittent--I never had a 100% reliable Chrome test case, which is why I was using the reliable Firefox case.

(In reply to Ryan Underwood from comment #101)
> Jamie, your uname -a output indicates that you haven't installed a kernel
> that contains the other part of the fix.
> Go here: http://kernel.ubuntu.com/~kernel-ppa/mainline/
> Pick a kernel >= v3.19-rc1 and install the proper debs.  Then make sure you
> choose that kernel at your bootloader.

Thanks for putting the pieces together for me, Ryan. I seem to have success now. (My package management is probably pretty weird as a result, but I'm going to ignore that.)

For the other n00bs, here was the final piece:

mkdir -p /tmp/kernel && cd /tmp/kernel
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.19-rc5-vivid/linux-image-3.19.0-031900rc5-generic_3.19.0-031900rc5.201501180935_amd64.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.19-rc5-vivid/linux-headers-3.19.0-031900rc5_3.19.0-031900rc5.201501180935_all.deb
sudo dpkg -i linux-image*
# In truth, I did this next one in GDebi, after I ran 
# into a problem that "sudo apt-get install -f" fixed
sudo dpkg -i linux-headers*
sudo reboot

uname -a # yields:Linux minty 3.19.0-031900rc5-generic #201501180935 SMP Sun Jan 18 09:36:49 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
# try firefox test case
firefox http://www.google.com/chrome/

=== Yields a black blip, but it recovers! and the following output ===
(process:3038): GLib-CRITICAL **: g_slice_set_config: assertion 'sys_page_size == 0' failed

(firefox:3038): GLib-GObject-WARNING **: Attempt to add property GnomeProgram::sm-connect after class was initialised

(firefox:3038): GLib-GObject-WARNING **: Attempt to add property GnomeProgram::show-crash-dialog after class was initialised

(firefox:3038): GLib-GObject-WARNING **: Attempt to add property GnomeProgram::display after class was initialised

(firefox:3038): GLib-GObject-WARNING **: Attempt to add property GnomeProgram::default-icon after class was initialised
ATTENTION: default value of option force_s3tc_enable overridden by environment.
======================================================================
Comment 103 shachar 2015-01-30 03:04:37 UTC
I can verify that with an updated Linux kernel (3.19-rc6) and libmesa (10.5.0~git20150127.5c83a0d2-0ubuntu0ricotz~trusty), Chrome and Firefox do not crash the system. I tried Google Inbox in Chrome, where scrolling for a few minutes used to crash the system, and saw no crash. I went to YouTube in Google Chrome, fiddled around with the videos there, and saw no crash.

Going in firefox to google.com/chrome/ causes a GPU hang (will attach GPU state dump soon), however the new GPU reset code works well and the system is still functional afterwards. Should I file a new bug on this crash?

--Shachar
Comment 104 shachar 2015-01-30 03:08:56 UTC
Created attachment 112956 [details]
/sys/class/drm/card0/error when running firefox http://www.google.com/chrome/

Relevant dmesg print:
[  151.816215] [drm] stuck on render ring
[  151.817277] [drm] GPU HANG: ecode 4:0:0xf41b8c79, in firefox [2491], reason: Ring hung, action: reset
[  151.817279] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[  151.817281] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[  151.817282] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[  151.817284] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[  151.817286] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[  152.168294] drm/i915: Resetting chip after gpu hang
[  152.201301] ------------[ cut here ]------------
[  152.201383] WARNING: CPU: 0 PID: 86 at /home/kernel/COD/linux/drivers/gpu/drm/i915/intel_sdvo.c:1424 intel_sdvo_get_config+0x201/0x220 [i915]()
[  152.201388] SDVO pixel multiplier mismatch, port: 0, encoder: 1
[  152.201391] Modules linked in: bnep rfcomm dm_crypt snd_hda_codec_idt snd_hda_codec_hdmi snd_hda_codec_generic wl(POE) snd_hda_intel gpio_ich snd_hda_controller uvcvideo snd_hda_codec videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common dell_wmi snd_hwdep snd_pcm videodev sparse_keymap dell_laptop dcdbas i8k snd_seq_midi snd_seq_midi_event snd_rawmidi joydev media serio_raw coretemp r852 snd_seq sm_common nand snd_seq_device nand_ecc btusb nand_bch bch snd_timer nand_ids cfg80211 bluetooth snd mtd soundcore r592 memstick lpc_ich mac_hid binfmt_misc parport_pc ppdev lp parport btrfs xor raid6_pq hid_generic usbhid hid psmouse sdhci_pci sdhci firewire_ohci firewire_core ahci crc_itu_t libahci i915 sky2 i2c_algo_bit drm_kms_helper video wmi drm
[  152.201502] CPU: 0 PID: 86 Comm: kworker/0:2 Tainted: P     U  W  OE  3.19.0-031900rc6-generic #201501261152
[  152.201507] Hardware name: Dell Inc. Inspiron 1525                   /0U990C, BIOS A13 06/27/2008
[  152.201552] Workqueue: events i915_error_work_func [i915]
[  152.201557]  0000000000000590 ffff8800b5fabb98 ffffffff817c4584 0000000000000007
[  152.201565]  ffff8800b5fabbe8 ffff8800b5fabbd8 ffffffff81076df7 ffff8800b5fabc08
[  152.201571]  ffff880036368710 ffff88003596a000 0000000000000000 0000000000000001
[  152.201578] Call Trace:
[  152.201591]  [<ffffffff817c4584>] dump_stack+0x45/0x57
[  152.201600]  [<ffffffff81076df7>] warn_slowpath_common+0x97/0xe0
[  152.201607]  [<ffffffff81076ef6>] warn_slowpath_fmt+0x46/0x50
[  152.201665]  [<ffffffffc02195cf>] ? intel_sdvo_get_value+0x3f/0x60 [i915]
[  152.201723]  [<ffffffffc021ab21>] intel_sdvo_get_config+0x201/0x220 [i915]
[  152.201776]  [<ffffffffc01d4d9e>] intel_modeset_readout_hw_state+0x2ae/0x450 [i915]
[  152.201830]  [<ffffffffc01eeabe>] intel_modeset_setup_hw_state+0x2e/0x3c0 [i915]
[  152.201883]  [<ffffffffc01ef320>] intel_finish_reset+0x160/0x1b0 [i915]
[  152.201931]  [<ffffffffc01b482f>] i915_error_work_func+0xdf/0x150 [i915]
[  152.201945]  [<ffffffff8108f6dd>] process_one_work+0x14d/0x460
[  152.201952]  [<ffffffff810900bb>] worker_thread+0x11b/0x3f0
[  152.201960]  [<ffffffff8108ffa0>] ? create_worker+0x1e0/0x1e0
[  152.201967]  [<ffffffff81095cc9>] kthread+0xc9/0xe0
[  152.201974]  [<ffffffff81095c00>] ? flush_kthread_worker+0x90/0x90
[  152.201982]  [<ffffffff817d17fc>] ret_from_fork+0x7c/0xb0
[  152.201989]  [<ffffffff81095c00>] ? flush_kthread_worker+0x90/0x90
[  152.201994] ---[ end trace 4841b4400897d54d ]---
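
Since the kernel message asks for the crash dump to be attached, it can help to copy it to a regular file right after the hang. A minimal sketch: save_gpu_error is a hypothetical helper, the default path is the one printed in the log above, and reading it may require root.

```shell
# Hypothetical helper: copy the i915 error state to a regular file so it
# can be attached to a bug report. $1 is the source (defaults to the path
# printed by the kernel above), $2 the destination file name.
save_gpu_error() {
    src=${1:-/sys/class/drm/card0/error}
    dest=${2:-gpu-error-$(date +%Y%m%d-%H%M%S).txt}
    [ -r "$src" ] || { echo "cannot read $src" >&2; return 1; }
    cat "$src" > "$dest" && echo "$dest"
}

# Possible usage:
#   save_gpu_error    # prints the name of the file it wrote
```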
Comment 105 1544c 2015-01-31 17:30:49 UTC
Thanks Jamie Jackson, I haven't experienced any crashes since I applied the update.
Comment 106 txtsd 2015-02-01 08:12:12 UTC
(In reply to Kenneth Graunke from comment #86)
> Hi all.  I believe this should be fixed with Mesa master - specifically,
> this commit:
> 
> commit c4fd0c9052dd391d6f2e9bb8e6da209dfc7ef35b
> Author: Kenneth Graunke <kenneth@whitecape.org>
> Date:   Sat Jan 17 23:21:15 2015 -0800
> 
>     i965: Work around mysterious Gen4 GPU hangs with minimal state changes.
> 
> If you're able to test with Mesa master, I'd appreciate any reports of
> whether this solved the problem for you.  It seems to have helped for me.

Hi.
On Archlinux 3.18.4-1-ARCH, with mesa 10.4.3-1, the crashes caused by chrome have gone away. However, visiting google.com/chrome on firefox now causes the same crash that chrome used to cause.


[179870.322075] [drm] stuck on render ring
[179870.323089] [drm] GPU HANG: ecode 0:0x7f64fafd, in firefox [14619], reason: Ring hung, action: reset
[179870.323092] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[179870.323093] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[179870.323095] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[179870.323097] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[179870.323098] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[179870.323548] [drm:i915_reset] *ERROR* Failed to reset chip: -19
Comment 107 Matt Turner 2015-03-06 23:24:52 UTC
*** Bug 88881 has been marked as a duplicate of this bug. ***
Comment 108 Matt Turner 2015-03-10 17:06:48 UTC
*** Bug 87088 has been marked as a duplicate of this bug. ***
Comment 109 Gurinder Johar 2015-03-12 01:00:25 UTC
Hi,

After searching half a day for a solution to the black screen of death issue, I finally ended up at this thread. I am happy to report that the problem on a Lenovo T61 with an Intel 965GM graphics card running Ubuntu 14.04 LTS is fixed by following Jamie's steps listed in comment #99.  It does need an updated kernel, mine is 3.13.0-46-generic #79-ubuntu.

I would appreciate it if someone with more knowledge than I have would propagate this solution to the 50+ sites that discuss the black screen problem without offering a viable solution!

Thank you very much for your dedication to help resolve this issue.
Comment 110 Alex 2015-03-19 21:58:40 UTC
After all these days I've been using the latest drivers and stuff offered by xorg-edgers and everything seemed to be running fine, tonight it happened again.  It seems that if I visit www.google.com/chrome using the first tab on Chrome, everything is fine.  Tonight I needed to download Chrome for another computer and I visited that page at a moment when several other apps and seven other tabs were already open on Chrome, thus creating an eighth tab.  That caused an immediate GPU crash and the screen went off again.

This has become very annoying.  I need to work without these issues guys.  :(
Comment 111 Janus Troelsen 2015-03-19 22:00:25 UTC
Did you upgrade your kernel Alex? If you did, the GPU should be able to properly reset.
Comment 112 Alex 2015-03-19 22:11:12 UTC
(In reply to Janus Troelsen from comment #111)
> Did you upgrade your kernel Alex? If you did, the GPU should be able to
> properly reset.

The last official kernel released for *ubuntus 14.04 is 3.16 which is no good for this issue.  If I recall correctly the one needed is 3.19 and in this case it has to be downloaded from the mainline site and they do not recommend that.

I don't know whether I should trust it or not.
Comment 113 Matt Turner 2015-04-30 18:14:10 UTC
*** Bug 89249 has been marked as a duplicate of this bug. ***
Comment 114 Matt Turner 2015-04-30 18:14:50 UTC
*** Bug 88281 has been marked as a duplicate of this bug. ***
Comment 115 Matt Turner 2015-04-30 18:15:07 UTC
*** Bug 88195 has been marked as a duplicate of this bug. ***
Comment 116 Matt Turner 2015-04-30 18:15:24 UTC
*** Bug 87770 has been marked as a duplicate of this bug. ***
Comment 117 Matt Turner 2015-04-30 18:15:39 UTC
*** Bug 87723 has been marked as a duplicate of this bug. ***
Comment 118 Matt Turner 2015-04-30 18:16:37 UTC
*** Bug 89489 has been marked as a duplicate of this bug. ***
Comment 119 Matt Turner 2015-04-30 18:16:57 UTC
*** Bug 87550 has been marked as a duplicate of this bug. ***
Comment 120 Matt Turner 2015-04-30 18:17:12 UTC
*** Bug 86972 has been marked as a duplicate of this bug. ***
Comment 121 Matt Turner 2015-04-30 18:17:31 UTC
*** Bug 87089 has been marked as a duplicate of this bug. ***
Comment 122 Matt Turner 2015-04-30 18:17:46 UTC
*** Bug 86937 has been marked as a duplicate of this bug. ***
Comment 123 Matt Turner 2015-04-30 23:21:45 UTC
*** Bug 84803 has been marked as a duplicate of this bug. ***
Comment 124 Matt Turner 2015-05-14 04:58:30 UTC
*** Bug 86721 has been marked as a duplicate of this bug. ***
Comment 125 Matt Turner 2015-06-01 17:54:37 UTC
*** Bug 89341 has been marked as a duplicate of this bug. ***
