Bug 105045 - System freeze with nouveau
Summary: System freeze with nouveau
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/nouveau
Version: 17.3
Hardware: x86-64 (AMD64) Linux (All)
Importance: high critical
Assignee: Nouveau Project
QA Contact: Nouveau Project
Reported: 2018-02-11 17:09 UTC by Sergey Tereschenko
Modified: 2018-02-28 09:27 UTC

Attachments
dmesg after crash (86.46 KB, text/x-log)
2018-02-11 17:09 UTC, Sergey Tereschenko
gnome-shell core dump (10.18 KB, text/plain)
2018-02-11 17:24 UTC, Sergey Tereschenko
xwayland coredump (5.67 KB, text/plain)
2018-02-11 17:25 UTC, Sergey Tereschenko

Description Sergey Tereschenko 2018-02-11 17:09:07 UTC
Created attachment 137273
dmesg after crash

I use Arch Linux with the latest updates:

- linux 4.15.2-2
- mesa 17.3.3-2
- wayland 1.14.0-1
- wayland-protocols 1.12-1
- gnome-shell 3.26.2+14+g64c857e3f-1

The GPU is an Nvidia GTX 1080.


My system sometimes freezes when I open a new tab in gvim. It does not respond to Ctrl+Alt+F1-F7, so I cannot switch to a virtual terminal. It does react to REISUB, which is what I used until today.

Today I connected to the desktop via ssh and attached `strace` to Xwayland. It was calling DRM_IOCTL_NOUVEAU_GEM_PUSHBUF over and over.
I forgot to save that log; I will include it next time.
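
For the next occurrence, a trace like the one described above could be saved to a file over ssh. This is only a sketch: the helper names and the log path are hypothetical, and it assumes `strace` and `pgrep` are installed and that the user is allowed to attach to the process.

```shell
#!/bin/sh
# Sketch only: helper names and the log path are hypothetical.

# Attach strace to a running process and log only its ioctl calls.
#   -p PID  attach to an existing process
#   -f      follow its threads and children
#   -tt     prefix each line with a timestamp
#   -o FILE write the trace to FILE instead of stderr
trace_ioctls() {
    pid=$1
    log=$2
    strace -f -tt -e trace=ioctl -p "$pid" -o "$log"
}

# Count the lines of a saved trace that mention the pushbuf ioctl.
count_pushbuf() {
    grep -c 'DRM_IOCTL_NOUVEAU_GEM_PUSHBUF' "$1"
}
```

Possible usage: `trace_ioctls "$(pgrep -o -x Xwayland)" /tmp/xwayland-ioctl.log`, stop it with Ctrl+C after a few seconds, then run `count_pushbuf /tmp/xwayland-ioctl.log` and attach the log file to the bug.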

Then I killed the processes with `kill -QUIT`, which produced core dumps from Xwayland and gnome-shell. After that gdm restarted and everything was fine.

Attaching dmesg and the core dumps. What else can I do to help fix this bug?
Comment 1 Sergey Tereschenko 2018-02-11 17:24:44 UTC
Created attachment 137276
gnome-shell core dump
Comment 2 Sergey Tereschenko 2018-02-11 17:25:11 UTC
Created attachment 137277
xwayland coredump
Comment 3 Ilia Mirkin 2018-02-11 17:55:34 UTC
This happens because GPU fault recovery currently leaves the process in a totally broken state. (And all other processes that use the GPU, seemingly.)

The general approach has been to try to avoid the GPU faults in the first place -- that will be printed before any of your other errors (i.e. before the nv50_cal stuff).
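
One way to look for those earlier fault lines in a saved kernel log is sketched below. The helper name is hypothetical, and the search pattern is an assumption about how the messages are worded (a line containing "nouveau" followed by "fault"); adjust it to whatever the log actually contains.

```shell
#!/bin/sh
# Sketch only: the helper name is hypothetical and the pattern is a guess
# at the wording of the kernel messages, not an exact format.

# Print the first few lines of a saved kernel log that mention both
# "nouveau" and "fault", with their line numbers, so they can be compared
# with the position of the later errors.
first_gpu_faults() {
    grep -n -i 'nouveau.*fault' "$1" | head -n 5
}
```

Possible usage: save the kernel log with `dmesg > /tmp/dmesg.log` (or, after a reboot, `journalctl -k -b -1 > /tmp/dmesg.log` if the journal is persistent), then run `first_gpu_faults /tmp/dmesg.log`.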

Note that mesa 18.0 may have some relevant fixes, e.g. 

commit adcd241b563f44b2e3e92f5d840e2f617bc25836
Author: Ilia Mirkin <imirkin@alum.mit.edu>
Date:   Mon Jan 1 14:54:17 2018 -0500

    nvc0: ensure that pushbuf keeps ref to old text/tls bos

Perhaps worth a shot. (This could hit if gnome-shell were generating tons and tons of silly shaders. Which it might be.)
Comment 4 Sergey Tereschenko 2018-02-11 18:19:12 UTC
Can I just apply that patch to 17.3, or is it better to build everything from git?
Comment 5 Ilia Mirkin 2018-02-11 18:23:02 UTC
(In reply to Sergey Tereschenko from comment #4)
> Can I just apply that patch to 17.3, or is it better to build everything from git?

You can just apply it. There's other stuff that's gone in though... I was just pointing out one thing I did. A bunch of people had issues with 17.3 and resizing stuff as I recall... other things too. I'd basically recommend running mesa master -- mesa is not very good at backporting fixes. And master's very rarely in a broken state.
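
Applying the single commit on top of a 17.3 release could look roughly like the sketch below. The helper name and the branch name are hypothetical, the tag name `mesa-17.3.3` is an assumption about how the release is tagged, and nothing here guarantees that the commit applies to 17.3 without conflicts.

```shell
#!/bin/sh
# Sketch only: helper and branch names are hypothetical.

# Create a new branch at a base tag inside an existing clone, then
# cherry-pick one commit onto it. -x records the original commit id
# in the new commit message.
backport_commit() {
    repo=$1    # path to an existing git clone
    base=$2    # tag or commit to start from
    commit=$3  # commit to apply on top of the base
    branch=$4  # name of the new local branch
    git -C "$repo" checkout -q -b "$branch" "$base" &&
    git -C "$repo" cherry-pick -x "$commit"
}
```

Possible usage, inside a clone of the mesa repository: `backport_commit . mesa-17.3.3 adcd241b563f44b2e3e92f5d840e2f617bc25836 nouveau-bo-ref`, followed by a normal build of that branch. If the cherry-pick stops on a conflict, `git cherry-pick --abort` returns the branch to its previous state.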
Comment 6 Sergey Tereschenko 2018-02-11 18:26:09 UTC
(In reply to Ilia Mirkin from comment #5)

Thanks, I'll try to build from git.
Comment 7 Sergey Tereschenko 2018-02-11 19:20:41 UTC
(In reply to Ilia Mirkin from comment #5)

I found there is also 18.0.0-rc4, which looks like it was released quite recently.
I just built it. It works, but now I can't click anything with the mouse, neither in gdm nor after logging in to gnome-shell.

Under weston it works.
Comment 8 Sergey Tereschenko 2018-02-11 19:56:02 UTC
(In reply to Sergey Tereschenko from comment #7)

Switching from wayland to xorg fixed the mouse issue in gnome. But problems with xorg are what got me to use nouveau+wayland in the first place (under xorg, keyboard layout switching with grp:shift_caps_switch is broken).
Comment 9 Sergey Tereschenko 2018-02-12 20:43:17 UTC
I built mesa-git from source, and gdm now starts only in xorg mode.

When I try to run

> XDG_SESSION_TYPE=wayland exec dbus-run-session gnome-session

it fails with the message "Connection to the bus can't be made".

weston still runs fine.

I reverted back to 17.3.3.
Comment 10 Sergey Tereschenko 2018-02-28 09:27:07 UTC
After updating to mesa 17.3.5-1 I haven't seen this bug for a few days.

Looks like it was fixed.