Bug 105884 - [GP106] Firefox causes a crash / X lockup in the nouveau driver on GTX 1060
Status: RESOLVED MOVED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: high critical
Assignee: Nouveau Project
QA Contact: Xorg Project Team
Keywords: regression
Reported: 2018-04-04 12:02 UTC by Artem S. Tashkinov
Modified: 2019-12-04 09:37 UTC (History)
CC List: 0 users

Attachments
dmesg (102.14 KB, text/plain)
2018-04-04 12:02 UTC, Artem S. Tashkinov
Another crash (88.54 KB, text/plain)
2018-04-12 13:19 UTC, Artem S. Tashkinov
X.org.log (37.90 KB, text/x-log)
2018-04-12 13:30 UTC, Artem S. Tashkinov
dmesg again (77.06 KB, text/plain)
2018-05-30 12:45 UTC, Artem S. Tashkinov

Description Artem S. Tashkinov 2018-04-04 12:02:34 UTC
Created attachment 138576 [details]
dmesg

I don't remember exactly what I was doing (something in Firefox, apparently), but my entire X session hung: I could only move the mouse pointer, and everything else was frozen and didn't react to key or mouse presses.

4.15.13-300.fc27.x86_64 #1 SMP Mon Mar 26 19:06:57 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Comment 1 Artem S. Tashkinov 2018-04-12 13:19:48 UTC
Created attachment 138790 [details]
Another crash

This is definitely a regression in kernel 4.15 because I've never had this issue before.
Comment 2 Artem S. Tashkinov 2018-04-12 13:25:22 UTC
It is possible that kernel 4.15.9 didn't have this issue, so the regression window is quite small.
Comment 3 Artem S. Tashkinov 2018-04-12 13:30:19 UTC
Created attachment 138797 [details]
X.org.log

Seemingly there's nothing interesting in the X.org log.
Comment 4 Ilia Mirkin 2018-04-12 13:48:14 UTC
4.15 had a number of things added to it.

One thing is using hugepages in ttm, which a lot of people have reported issues with. Try disabling CONFIG_SWIOTLB (not really needed on modern hw, afaik).

Also nouveau's vm got a rewrite. Some fixes needed to be applied, I think they're all in 4.16, not as sure about 4.15.x.

The start of your errors is

[ 7661.417755] nouveau 0000:01:00.0: fifo: PBDMA0: 00040000 [PBENTRY] ch 7 [017f3cc000 firefox[19007]] subc 0 mthd 0000 data 00000000
[ 7661.417774] nouveau 0000:01:00.0: gr: TRAP ch 7 [017f3cc000 firefox[19007]]
[ 7661.417780] nouveau 0000:01:00.0: gr: DISPATCH 80000002 [CLASS_SUBCH_MISMATCH]
[ 7661.417793] nouveau 0000:01:00.0: fifo: PBDMA0: 00200000 [METHOD] ch 7 [017f3cc000 firefox[19007]] subc 0 mthd 000c data 88888888

which doesn't look extremely familiar. "FIFO is confused", after which everything goes downhill pretty fast.
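
For anyone triaging similar logs, here is a minimal Python sketch that pulls the fields out of the `fifo: PBDMA` lines quoted above. The regular expression and the `parse_pbdma` helper are hypothetical: the field layout is inferred only from the lines in this report, not from nouveau's source, so other message variants may not match.

```python
import re

# Pattern for the "fifo: PBDMAn" lines quoted in this report. It is inferred
# only from those lines, so other nouveau messages may not match.
PBDMA_RE = re.compile(
    r"nouveau (?P<pci>[0-9a-f:.]+): fifo: PBDMA(?P<unit>\d+): "
    r"(?P<status>[0-9a-f]{8}) \[(?P<reason>[^\]]*)\] "
    r"ch (?P<chan>\d+) \[(?P<inst>[0-9a-f]+) (?P<proc>[^\[\]]+)\[(?P<pid>\d+)\]\] "
    r"subc (?P<subc>\d+) mthd (?P<mthd>[0-9a-f]{4}) data (?P<data>[0-9a-f]{8})"
)


def parse_pbdma(line):
    """Return a dict of fields for a PBDMA error line, or None if no match."""
    # search() rather than match() so dmesg timestamps or journal prefixes
    # in front of the message are skipped.
    m = PBDMA_RE.search(line)
    if m is None:
        return None
    return {
        "pci": m["pci"],
        "unit": int(m["unit"]),
        "status": int(m["status"], 16),
        "reason": m["reason"],
        "chan": int(m["chan"]),
        "proc": m["proc"],
        "pid": int(m["pid"]),
        "subc": int(m["subc"]),
        "mthd": int(m["mthd"], 16),
        "data": int(m["data"], 16),
    }
```

Feeding each line of a saved dmesg through `parse_pbdma` and grouping the results by `reason` and `pid` would show which process and channel hit the first error.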
Comment 5 Artem S. Tashkinov 2018-04-12 14:40:25 UTC
(In reply to Ilia Mirkin from comment #4)
> 4.15 had a number of things added to it.
> 
> One thing is using hugepages in ttm, which a lot of people have reported
> issues with. Try disabling CONFIG_SWIOTLB (not really needed on modern hw,
> afaik).
> 
> Also nouveau's vm got a rewrite. Some fixes needed to be applied, I think
> they're all in 4.16, not as sure about 4.15.x.
> 

I'd be glad to disable this option but I'm on Fedora 27 and I use the distro kernel, so I'm not sure how I can do that. Aren't you employed by RedHat/Fedora, Ilia? I guess it's in your power to raise this issue internally.
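
Before rebuilding anything, it may help to check whether the running distro kernel was built with the option at all. The following is a minimal Python sketch; the helper names are hypothetical, and the `/boot/config-<release>` path is an assumption about where the distro installs its kernel config. It only inspects the build configuration and does not disable anything.

```python
import os
import re


def config_option_state(config_text, option):
    """Return the value of a set option (e.g. 'y' or 'm'), 'n' if the config
    marks it "is not set", and None if the option is not mentioned at all."""
    set_re = re.compile(r"^%s=(.*)$" % re.escape(option))
    unset_line = "# %s is not set" % option
    for raw in config_text.splitlines():
        line = raw.strip()
        m = set_re.match(line)
        if m:
            return m.group(1)
        if line == unset_line:
            return "n"
    return None


def running_kernel_option(option, path=None):
    """Look up an option in the config file of the running kernel."""
    if path is None:
        # Assumed location of the distro kernel config; adjust if it differs.
        path = "/boot/config-" + os.uname().release
    with open(path) as f:
        return config_option_state(f.read(), option)
```

Calling `running_kernel_option("CONFIG_SWIOTLB")` would return `'y'` if the option is built in, under the path assumption above.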
Comment 6 Ilia Mirkin 2018-04-12 14:54:31 UTC
(In reply to Artem S. Tashkinov from comment #5)
> (In reply to Ilia Mirkin from comment #4)
> > 4.15 had a number of things added to it.
> > 
> > One thing is using hugepages in ttm, which a lot of people have reported
> > issues with. Try disabling CONFIG_SWIOTLB (not really needed on modern hw,
> > afaik).
> > 
> > Also nouveau's vm got a rewrite. Some fixes needed to be applied, I think
> > they're all in 4.16, not as sure about 4.15.x.
> > 
> 
> I'd be glad to disable this option but I'm on Fedora 27 and I use the distro
> kernel, so I'm not sure how I can do that. Aren't you employed by
> RedHat/Fedora, Ilia? I guess it's in your power to raise this issue
> internally.

I don't think so. Perhaps you know something I don't?

Either way, this is the upstream bug system. If you have an issue with $distro, bring up the problem with $distro.
Comment 7 Artem S. Tashkinov 2018-04-12 14:58:00 UTC
(In reply to Ilia Mirkin from comment #6)
> 
> Either way, this is the upstream bug system. If you have an issue with
> $distro, bring up the problem with $distro.

Does "iommu=off" disable SWIOTLB?
Comment 8 Artem S. Tashkinov 2018-05-11 14:24:16 UTC
This happens under 4.16 as well, but a lot less frequently (maybe because I switched to Google Chrome to watch YouTube):

May 11 14:17:06 localhost kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 00040000 [PBENTRY] ch 9 [017ed17000 firefox[14167]] subc 0 mthd 0000 data 00000000
May 11 14:17:06 localhost kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 00040000 [PBENTRY] ch 9 [017ed17000 firefox[14167]] subc 0 mthd 0000 data 00000000
May 11 14:17:06 localhost kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 00040000 [PBENTRY] ch 9 [017ed17000 firefox[14167]] subc 0 mthd 0000 data 00000000
Comment 9 Artem S. Tashkinov 2018-05-30 12:45:49 UTC
Created attachment 139854 [details]
dmesg again

[14443.603553] kauditd_printk_skb: 51 callbacks suppressed
Comment 10 Artem S. Tashkinov 2018-05-30 12:48:11 UTC
Probably related: bug 99202.
Comment 11 Martin Peres 2019-12-04 09:37:39 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/419.

