Bug 73373 - [NVE4] GPU lockup after opening many tabs in Chromium web browser
Status: RESOLVED MOVED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/nouveau
Version: 10.0
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium blocker
Assignee: Nouveau Project
Depends on: 92077
Reported: 2014-01-07 21:44 UTC by Mario Barrera
Modified: 2019-09-18 20:39 UTC

Attachments
First reproduction of the bug. (83.21 KB, text/plain)
2014-01-07 21:44 UTC, Mario Barrera
Second reproduction of the bug. (130.18 KB, text/plain)
2014-01-07 21:45 UTC, Mario Barrera
Third reproduction of the bug. (130.40 KB, text/plain)
2014-01-07 21:46 UTC, Mario Barrera
Fourth reproduction of the bug. (164.49 KB, text/plain)
2014-01-07 21:47 UTC, Mario Barrera
System logs after setting nouveau.config=NvGrUseFW=1 kernel option (82.08 KB, text/plain)
2014-01-25 19:02 UTC, Mario Barrera
Reproducing the issue with chromium browser now with the correct firmware loaded. (225.60 KB, text/plain)
2014-01-25 20:13 UTC, Mario Barrera
opening several HTML files with Firefox (7.93 MB, text/plain)
2015-11-03 16:12 UTC, Mario Barrera
opening several HTML files with Firefox (82.34 KB, text/plain)
2015-11-03 16:12 UTC, Mario Barrera

Description Mario Barrera 2014-01-07 21:44:05 UTC
When using the Chromium web browser, if I open too many tabs with graphics, the GPU locks up and the desktop becomes unresponsive, although the rest of the system keeps running.

Attached are the kernel logs I got through SSH after reproducing the bug four times.
Comment 1 Mario Barrera 2014-01-07 21:44:49 UTC
Created attachment 91618
First reproduction of the bug.
Comment 2 Mario Barrera 2014-01-07 21:45:10 UTC
Created attachment 91619
Second reproduction of the bug.
Comment 3 Mario Barrera 2014-01-07 21:46:49 UTC
Created attachment 91620
Third reproduction of the bug.
Comment 4 Mario Barrera 2014-01-07 21:47:06 UTC
Created attachment 91621
Fourth reproduction of the bug.
Comment 5 Mario Barrera 2014-01-07 21:50:24 UTC
Chromium is not the only software that triggers this bug, but it is the easiest way to reproduce it. The GPU will also lock up after a long browsing session with Firefox or while running a game. 3D games also show many visual artifacts. Tested with OpenArena and Freeminer.
Comment 6 Ilia Mirkin 2014-01-07 21:52:30 UTC
First, try upgrading your mesa installation to at least 10.0.1.

Second, you can see if the situation improves with blob graph fw. Take a look at http://nouveau.freedesktop.org/wiki/NVC0_Firmware/ for instructions on how to get the fw. (Note you only need to go up to video fw. There are other ways of getting the video fw should you want it.) You'll need to boot with nouveau.NvGrUseFW=1 in order for the fw to be loaded.
Comment 7 Mario Barrera 2014-01-08 18:03:49 UTC
(In reply to comment #6)
> First, try upgrading your mesa installation to at least 10.0.1.

Hi Ilia. My version is 10.0.1, but 10.0 was the most precise version I could select when filing the bug.
 
> Second, you can see if the situation improves with blob graph fw. Take a
> look at http://nouveau.freedesktop.org/wiki/NVC0_Firmware/ for instructions
> on how to get the fw. (Note you only need to go up to video fw. There are
> other ways of getting the video fw should you want it.) You'll need to boot
> with nouveau.NvGrUseFW=1 in order for the fw to be loaded.

I followed your instructions: I installed the fw with the aur/nouveau-fw package on Arch Linux and added the kernel option too. Performance does not seem to have changed and I could still reproduce the bug, but I am not sure whether the firmware is actually being used. How can I check that?
Comment 8 Ilia Mirkin 2014-01-08 18:22:22 UTC
(In reply to comment #7)
> (In reply to comment #6)
> > Second, you can see if the situation improves with blob graph fw. Take a
> > look at http://nouveau.freedesktop.org/wiki/NVC0_Firmware/ for instructions
> > on how to get the fw. (Note you only need to go up to video fw. There are
> > other ways of getting the video fw should you want it.) You'll need to boot
> > with nouveau.NvGrUseFW=1 in order for the fw to be loaded.
> 
> I have followed your instructions and installed the fw with the
> aur/nouveau-fw package in Archlinux and added the kernel option too, but the

Pretty sure my instructions were to mmiotrace the blob by following the instructions on that wiki page, not to install a firmware package that only provides the video firmware. (And while I'm working on making my script also extract the graph firmware, that's not ready yet.) On the bright side, you should be able to use VDPAU for hw-accelerated decoding now.

Oh, I also got the cmdline option wrong -- nouveau.config=NvGrUseFW=1 -- sorry.
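
For anyone following along, one way to confirm that the corrected option actually reached the kernel is to inspect the boot command line. The following is only a minimal Python sketch; the helper name `nouveau_config_options` is hypothetical (it is not part of nouveau or any related tooling), and it assumes that several settings may be comma-separated inside a single `nouveau.config=` token.

```python
def nouveau_config_options(cmdline: str) -> dict:
    """Collect key=value settings passed via nouveau.config= on a kernel command line.

    Hypothetical helper for illustration only.
    """
    prefix = "nouveau.config="
    options = {}
    for token in cmdline.split():
        if not token.startswith(prefix):
            continue
        # Assumption: multiple settings may be comma-separated in one token.
        for setting in token[len(prefix):].split(","):
            key, _, value = setting.partition("=")
            if key:
                options[key] = value
    return options
```

On a running Linux system the input would typically be the contents of `/proc/cmdline`. With the misspelled form `nouveau.NvGrUseFW=1` the helper finds no settings at all, which is one way to notice the mistake.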
Comment 9 Mario Barrera 2014-01-08 23:44:53 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > (In reply to comment #6)
> > > Second, you can see if the situation improves with blob graph fw. Take a
> > > look at http://nouveau.freedesktop.org/wiki/NVC0_Firmware/ for instructions
> > > on how to get the fw. (Note you only need to go up to video fw. There are
> > > other ways of getting the video fw should you want it.) You'll need to boot
> > > with nouveau.NvGrUseFW=1 in order for the fw to be loaded.
> > 
> > I have followed your instructions and installed the fw with the
> > aur/nouveau-fw package in Archlinux and added the kernel option too, but the
> 
> Pretty sure my instructions were to mmiotrace the blob by following the
> instructions on that wiki page, not to install a firmware package that only
> provides the video firmware. (And while I'm working on making my script also
> extract the graph firmware, that's not ready yet.) On the bright side, you
> should be able to use VDPAU for hw-accelerated decoding now.
> 
> Oh, I also got the cmdline option wrong -- nouveau.config=NvGrUseFW=1 --
> sorry.

I did not make the mmiotrace because I did not see any change and did not know whether the firmware was actually loaded.
Now that I have changed the option to nouveau.config=NvGrUseFW=1, the graphics freeze, apparently when the nouveau module loads.
Comment 10 Ilia Mirkin 2014-01-09 01:14:27 UTC
(In reply to comment #9)
> Now I changed the option to nouveau.config=NvGrUseFW=1 and the graphics
> freeze when the nouveau module loads apparently.

Any chance you can get logs when that happens? (e.g. by ssh'ing in, or perhaps they make it to some system log)

I assume that this is with the relevant nve4_* firmware files available in /lib/firmware/nouveau at nouveau module load time...
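
A quick way to check that assumption is to list what is actually present in the firmware directory. This is a rough Python sketch, not an official check: the function name `list_firmware` is hypothetical, and since the thread does not give the exact file names, it only matches on the `nve4_` prefix.

```python
from pathlib import Path


def list_firmware(directory: str, prefix: str = "nve4_") -> list:
    """Return the sorted names of regular files in `directory` starting with `prefix`.

    Hypothetical helper; returns an empty list if the directory does not exist.
    """
    base = Path(directory)
    if not base.is_dir():
        return []
    return sorted(
        p.name for p in base.iterdir() if p.is_file() and p.name.startswith(prefix)
    )
```

Calling `list_firmware("/lib/firmware/nouveau")` and getting an empty list would suggest that the files are missing from the location the module reads at load time.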
Comment 11 Mario Barrera 2014-01-25 19:01:24 UTC
(In reply to comment #10)
> (In reply to comment #9)
> > Now I changed the option to nouveau.config=NvGrUseFW=1 and the graphics
> > freeze when the nouveau module loads apparently.
> 
> Any chance you can get logs when that happens? (e.g. by ssh'ing in, or
> perhaps they make it to some system log)
> 
> I assume that this is with the relevant nve4_* firmware files available in
> /lib/firmware/nouveau at nouveau module load time...

Indeed it seems there is a missing file. I attached the system logs.
Comment 12 Mario Barrera 2014-01-25 19:02:20 UTC
Created attachment 92779
System logs after setting nouveau.config=NvGrUseFW=1 kernel option
Comment 13 Mario Barrera 2014-01-25 20:12:12 UTC
I managed to get the right fw files with help from #nouveau@irc.freenode.net.
I am now attaching the kernel logs after reproducing the issue with Chromium; it still happens.
Comment 14 Mario Barrera 2014-01-25 20:13:41 UTC
Created attachment 92783
Reproducing the issue with chromium browser now with the correct firmware loaded.
Comment 15 Mario Barrera 2014-02-01 01:45:44 UTC
This may be useful: opening about 10 tabs with embedded Flash elements in Firefox seems to cause the running Chromium browser to trigger the GPU lockup.

[47622.380883] nouveau E[   PFIFO][0000:01:00.0] PFIFO: read fault at 0x0000038000 [PAGE_NOT_PRESENT] from (unknown enum 0x00000007)/(unknown enum 0x00000006) on channel 0x00becd6000 [unknown]
[47652.758653] nouveau E[     DRM] GPU lockup - switching to software fbcon
[47667.778774] nouveau E[  X[586]] failed to idle channel 0xcccc0001 [X[586]]
[47682.791547] nouveau E[  X[586]] failed to idle channel 0xcccc0001 [X[586]]
[47697.804319] nouveau E[  X[586]] failed to idle channel 0xcccc0000 [X[586]]
[47712.817091] nouveau E[  X[586]] failed to idle channel 0xcccc0000 [X[586]]
[47728.123445] nouveau E[chromium[17985]] failed to idle channel 0xcccc0000 [chromium[17985]]
[47743.136216] nouveau E[chromium[17985]] failed to idle channel 0xcccc0000 [chromium[17985]]
[47758.152323] nouveau E[chromium[1104]] failed to idle channel 0xcccc0000 [chromium[1104]]
[47773.165095] nouveau E[chromium[1104]] failed to idle channel 0xcccc0000 [chromium[1104]]
[47788.184539] nouveau E[plugin-containe[23749]] failed to idle channel 0xcccc0000 [plugin-containe[23749]]
[47803.197309] nouveau E[plugin-containe[23749]] failed to idle channel 0xcccc0000 [plugin-containe[23749]]
[47818.213419] nouveau E[plugin-containe[23749]] failed to idle channel 0xcccc0000 [plugin-containe[23749]]
[47833.226191] nouveau E[plugin-containe[23749]] failed to idle channel 0xcccc0000 [plugin-containe[23749]]
[47848.242298] nouveau E[plugin-containe[23749]] failed to idle channel 0xcccc0000 [plugin-containe[23749]]
[47863.255070] nouveau E[plugin-containe[23749]] failed to idle channel 0xcccc0000 [plugin-containe[23749]]
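
To see at a glance which processes were stuck, the "failed to idle channel" lines can be tallied per process. The following is a small Python sketch, assuming the log format shown in the excerpt above; the names `IDLE_FAILURE` and `count_idle_failures` are hypothetical and only serve this illustration.

```python
import re
from collections import Counter

# Matches lines such as:
# [47667.778774] nouveau E[  X[586]] failed to idle channel 0xcccc0001 [X[586]]
IDLE_FAILURE = re.compile(
    r"nouveau E\[\s*(?P<proc>[^\[\]]+)\[(?P<pid>\d+)\]\] failed to idle channel"
)


def count_idle_failures(log_text: str) -> Counter:
    """Count 'failed to idle channel' messages per (process name, pid)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = IDLE_FAILURE.search(line)
        if match:
            counts[(match.group("proc"), int(match.group("pid")))] += 1
    return counts
```

Applied to the excerpt above, this tallies four failures for X[586], two for each of the two chromium processes, and six for plugin-containe[23749]. The PFIFO read fault and the "GPU lockup" line are not counted because they do not match the pattern.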
Comment 16 andros 2014-07-28 08:31:39 UTC
I witnessed the same with
Linux 3.15.6-gentoo #1 SMP Mon Jul 21 07:42:30 CEST 2014 x86_64 AMD A8-3500M APU with Radeon(tm) HD Graphics AuthenticAMD GNU/Linux

After a while the hard drives become inaccessible; the system stays usable for a while but then freezes. Sometimes Magic SysRq still works.
Comment 17 andros 2014-07-28 08:36:40 UTC
mesa 10.0.4
xf86-video-ati 7.3.0
Comment 18 Ilia Mirkin 2015-10-22 19:57:23 UTC
(In reply to Mario Barrera from comment #15)
> Maybe useful, opening about 10 tabs with embedded flash elements in Firefox
> seems to cause the running Chromium browser to trigger the GPU lockup.

Does this still happen with a recent kernel / mesa? Among other things, mesa 11.0.3 fixes a number of annoying resource management issues.
Comment 19 Mario Barrera 2015-11-03 16:07:27 UTC
(In reply to Ilia Mirkin from comment #18)
> (In reply to Mario Barrera from comment #15)
> > Maybe useful, opening about 10 tabs with embedded flash elements in Firefox
> > seems to cause the running Chromium browser to trigger the GPU lockup.
> 
> Does this still happen with a recent kernel / mesa? Among other things, mesa
> 11.0.3 fixes a number of annoying resource management issues.

I can still reproduce the issue by opening many HTML files with Firefox. It happens every time.

I am running Mesa 11.0.4 on Linux 4.2.5-1-ARCH.

I'm attaching the logs.
Comment 20 Mario Barrera 2015-11-03 16:12:05 UTC
Created attachment 119386
opening several HTML files with Firefox
Comment 21 Mario Barrera 2015-11-03 16:12:41 UTC
Created attachment 119387
opening several HTML files with Firefox
Comment 22 Tomasz Paweł Gajc 2016-12-10 14:37:34 UTC
Looks like this is related to broken multi-threading in nouveau, see linked bugs.
Comment 23 Ali Akcaagac 2016-12-13 15:56:04 UTC
I would like to add that I am experiencing the same issues as the reporter.

But!

I am using the *radeon* drivers!

google-chrome locks up after opening a few tabs or after just a short while of random browsing.

This started after I switched from Fedora 24 to Fedora 25 a couple of weeks ago. It's really frustrating to use google-chrome knowing that it will crash at any moment.

After some digging, I found that a few other similar bugs have been posted on freedesktop (and even on the Red Hat Bugzilla).

https://bugzilla.redhat.com/show_bug.cgi?id=1376107

After uninstalling the entire mesa-dri package:

mesa-dri-drivers-13.0.2-1.fc25.x86_64

all crashes and random lockups are gone. I've been running this one google-chrome instance the entire day without a single lockup (in software mode, I think).

So it may be that the bug is also related to other parts of mesa.
Comment 24 GitLab Migration User 2019-09-18 20:39:08 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1059.

