Bug 105884 - [GP106] Firefox causes a crash / X lockup in the nouveau driver on GTX 1060
Summary: [GP106] Firefox causes a crash / X lockup in the nouveau driver on GTX 1060
Status: NEW
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: high critical
Assignee: Nouveau Project
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2018-04-04 12:02 UTC by Artem S. Tashkinov
Modified: 2018-05-30 12:48 UTC
0 users

See Also:


Attachments
dmesg (102.14 KB, text/plain)
2018-04-04 12:02 UTC, Artem S. Tashkinov
Another crash (88.54 KB, text/plain)
2018-04-12 13:19 UTC, Artem S. Tashkinov
X.org.log (37.90 KB, text/x-log)
2018-04-12 13:30 UTC, Artem S. Tashkinov
dmesg again (77.06 KB, text/plain)
2018-05-30 12:45 UTC, Artem S. Tashkinov

Description Artem S. Tashkinov 2018-04-04 12:02:34 UTC
Created attachment 138576 [details]
dmesg

I don't remember exactly what I was doing (apparently something in Firefox), but my entire X session hung: I could still move the mouse pointer, but everything else was frozen and did not react to key presses or mouse clicks.

4.15.13-300.fc27.x86_64 #1 SMP Mon Mar 26 19:06:57 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Comment 1 Artem S. Tashkinov 2018-04-12 13:19:48 UTC
Created attachment 138790 [details]
Another crash

This is definitely a regression in kernel 4.15 because I've never had this issue before.
Comment 2 Artem S. Tashkinov 2018-04-12 13:25:22 UTC
It is possible that kernel 4.15.9 didn't have this issue, so the regression window is quite small.
Comment 3 Artem S. Tashkinov 2018-04-12 13:30:19 UTC
Created attachment 138797 [details]
X.org.log

Seemingly there's nothing interesting in the X.org log.
Comment 4 Ilia Mirkin 2018-04-12 13:48:14 UTC
4.15 had a number of things added to it.

One thing is using hugepages in ttm, which a lot of people have reported issues with. Try disabling CONFIG_SWIOTLB (not really needed on modern hw, afaik).

Also nouveau's vm got a rewrite. Some fixes needed to be applied, I think they're all in 4.16, not as sure about 4.15.x.

The start of your errors is

[ 7661.417755] nouveau 0000:01:00.0: fifo: PBDMA0: 00040000 [PBENTRY] ch 7 [017f3cc000 firefox[19007]] subc 0 mthd 0000 data 00000000
[ 7661.417774] nouveau 0000:01:00.0: gr: TRAP ch 7 [017f3cc000 firefox[19007]]
[ 7661.417780] nouveau 0000:01:00.0: gr: DISPATCH 80000002 [CLASS_SUBCH_MISMATCH]
[ 7661.417793] nouveau 0000:01:00.0: fifo: PBDMA0: 00200000 [METHOD] ch 7 [017f3cc000 firefox[19007]] subc 0 mthd 000c data 88888888

which doesn't look extremely familiar. "FIFO is confused", after which everything goes downhill pretty fast.
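
For anyone triaging similar logs, the PBDMA lines quoted above can be split into fields mechanically. The following is a minimal Python sketch, not part of nouveau or any existing tool: the regular expression and the function name `parse_pbdma` are assumptions inferred only from the line format shown in this report, so other kernel versions may print something different.

```python
import re

# Pattern inferred from the log lines quoted in this bug report.
# This is an assumption, not a documented or stable format.
PBDMA_RE = re.compile(
    r"nouveau (?P<pci>[0-9a-f:.]+): fifo: PBDMA(?P<unit>\d+): "
    r"(?P<status>[0-9a-f]{8}) \[(?P<reason>[A-Z_ ]+)\] "
    r"ch (?P<chan>\d+) "
    r"\[(?P<inst>[0-9a-f]+) (?P<proc>[^\[\]]+)\[(?P<pid>\d+)\]\] "
    r"subc (?P<subc>\d+) mthd (?P<mthd>[0-9a-f]{4}) data (?P<data>[0-9a-f]{8})"
)


def parse_pbdma(line):
    """Return a dict of fields for a nouveau PBDMA error line, or None."""
    match = PBDMA_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the plainly decimal fields; hex-looking fields stay as strings.
    for key in ("unit", "chan", "pid", "subc"):
        fields[key] = int(fields[key])
    return fields


if __name__ == "__main__":
    sample = (
        "[ 7661.417755] nouveau 0000:01:00.0: fifo: PBDMA0: 00040000 [PBENTRY] "
        "ch 7 [017f3cc000 firefox[19007]] subc 0 mthd 0000 data 00000000"
    )
    print(parse_pbdma(sample))
```

Because the function uses `search`, it should also accept the journal-style prefix ("May 11 14:17:06 localhost kernel: ...") seen elsewhere in this report; lines from other units such as `gr:` simply return `None`.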
Comment 5 Artem S. Tashkinov 2018-04-12 14:40:25 UTC
(In reply to Ilia Mirkin from comment #4)
> 4.15 had a number of things added to it.
> 
> One thing is using hugepages in ttm, which a lot of people have reported
> issues with. Try disabling CONFIG_SWIOTLB (not really needed on modern hw,
> afaik).
> 
> Also nouveau's vm got a rewrite. Some fixes needed to be applied, I think
> they're all in 4.16, not as sure about 4.15.x.
> 

I'd be glad to disable this option, but I'm on Fedora 27 and use the distro kernel, so I'm not sure how to do that. Aren't you employed by Red Hat/Fedora, Ilia? I guess it's in your power to raise this issue internally.
Comment 6 Ilia Mirkin 2018-04-12 14:54:31 UTC
(In reply to Artem S. Tashkinov from comment #5)
> (In reply to Ilia Mirkin from comment #4)
> > 4.15 had a number of things added to it.
> > 
> > One thing is using hugepages in ttm, which a lot of people have reported
> > issues with. Try disabling CONFIG_SWIOTLB (not really needed on modern hw,
> > afaik).
> > 
> > Also nouveau's vm got a rewrite. Some fixes needed to be applied, I think
> > they're all in 4.16, not as sure about 4.15.x.
> > 
> 
> I'd be glad to disable this option but I'm on Fedora 27 and I use the distro
> kernel, so I'm not sure how I can do that. Aren't you employed by
> RedHat/Fedora, Ilia? I guess it's in your power to raise this issue
> internally.

I don't think so. Perhaps you know something I don't?

Either way, this is the upstream bug system. If you have an issue with $distro, bring up the problem with $distro.
Comment 7 Artem S. Tashkinov 2018-04-12 14:58:00 UTC
(In reply to Ilia Mirkin from comment #6)
> 
> Either way, this is the upstream bug system. If you have an issue with
> $distro, bring up the problem with $distro.

Does "iommu=off" disable SWIOTLB?
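
This thread does not settle whether any boot parameter affects SWIOTLB. What can be checked without rebuilding anything is whether the running kernel was built with CONFIG_SWIOTLB in the first place. Below is a minimal Python sketch, assuming the distro installs its kernel config as /boot/config-<release> (Fedora does; other distros may not). The helper name `kconfig_state` is hypothetical.

```python
import platform
import re


def kconfig_state(config_text, option):
    """Return the value of a kernel config option ('y', 'm', ...) or None.

    None covers both "# CONFIG_X is not set" and an option that is
    absent from the file altogether.
    """
    match = re.search(rf"^{re.escape(option)}=(.*)$", config_text, re.MULTILINE)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Assumed location of the config for the running kernel; distro-dependent.
    path = f"/boot/config-{platform.release()}"
    try:
        with open(path) as fh:
            print("CONFIG_SWIOTLB =", kconfig_state(fh.read(), "CONFIG_SWIOTLB"))
    except OSError as exc:
        print(f"could not read {path}: {exc}")
```

This only reports how the kernel was configured at build time; it says nothing about whether SWIOTLB is actually in use at runtime.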
Comment 8 Artem S. Tashkinov 2018-05-11 14:24:16 UTC
This happens under 4.16 as well, but a lot less frequently (maybe because I switched to Google Chrome to watch YouTube):

May 11 14:17:06 localhost kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 00040000 [PBENTRY] ch 9 [017ed17000 firefox[14167]] subc 0 mthd 0000 data 00000000
May 11 14:17:06 localhost kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 00040000 [PBENTRY] ch 9 [017ed17000 firefox[14167]] subc 0 mthd 0000 data 00000000
May 11 14:17:06 localhost kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 00040000 [PBENTRY] ch 9 [017ed17000 firefox[14167]] subc 0 mthd 0000 data 00000000
Comment 9 Artem S. Tashkinov 2018-05-30 12:45:49 UTC
Created attachment 139854 [details]
dmesg again

[14443.603553] kauditd_printk_skb: 51 callbacks suppressed
Comment 10 Artem S. Tashkinov 2018-05-30 12:48:11 UTC
Probably related to bug 99202.

