Bug 94637

Summary: system crash, no messages, GT215, ubuntu 16.04 when running glmark2 for a few minutes.
Product: Mesa Reporter: Peter Silva <peter>
Component: Drivers/DRI/nouveau Assignee: Nouveau Project <nouveau>
Status: RESOLVED MOVED QA Contact: Nouveau Project <nouveau>
Severity: critical    
Priority: medium    
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
Attachments: general info about the system... going through the FAQ
Sample Xorg.0.log, end is crash (no message)
today´s dmesg
attachment-21701-0.html
when I run dmesg, I get the right stuff, but the file in /var/log is wrong. weird!

Description Peter Silva 2016-03-20 13:49:14 UTC
Created attachment 122444 [details]
general info about the system... going through the FAQ

Running fully patched Ubuntu Xenial with only the nouveau drivers. If I start Minecraft and run it for a few minutes, the system locks up for about 3 minutes, then spontaneously reboots. There are no messages that I can see in any log, only boot-up messages after the event.

Using distribution packages, no xorg.conf, no trace of proprietary drivers I can see.

The same hardware was stable using the proprietary drivers on Ubuntu 14.04.
I just upgraded to 16.04 and switched to nouveau.

I can browse the web all day; it works fine as long as GL isn't exercised. The problem is not specific to Minecraft: it takes a little longer, say 10 to 15 minutes, but I can reproduce it exactly by just running the glmark2 package from the repository.
Comment 1 Peter Silva 2016-03-20 13:52:00 UTC
Created attachment 122445 [details]
Sample Xorg.0.log, end is crash (no message)
Comment 2 Karol Herbst 2016-03-20 13:59:46 UTC
Please also attach your dmesg from when it crashes.
Comment 3 Peter Silva 2016-03-20 14:05:54 UTC
Created attachment 122446 [details]
today´s dmesg
Comment 4 Pierre Moreau 2016-03-20 14:14:47 UTC
The dmesg you linked is *interesting*. There is no mention of Nouveau being loaded, and it reports a kernel version of 3.13, whereas your Xorg.log and your information paste mention kernel 4.4. Could you link a dmesg from 4.4 when it crashes? (If you have systemd on your laptop, you can get the logs from the previous boot by running `journalctl -b -1`.)
Comment 5 Peter Silva 2016-03-20 14:39:35 UTC
Created attachment 122447 [details]
attachment-21701-0.html

About 3.13: I noticed that stuff lying around, and purged the images and
headers related to the old kernel. I reproduced the problem afterward.

By the way, it crashed again just now. I wasn't doing anything graphics
intensive; it had stayed up for about three hours, but I was just browsing
and building the bug report (running Unity 7, the normal Ubuntu desktop).


root@alu:/alu/sdc1/home/peter# journalctl -b -1
Specifying boot ID has no effect, no persistent journal was found
root@alu:/alu/sdc1/home/peter# uname -a
Linux alu 4.4.0-14-generic #30-Ubuntu SMP Tue Mar 15 13:04:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
root@alu:/alu/sdc1/home/peter# dpkg -l | grep linux-image
ii  linux-image-4.4.0-14-generic        4.4.0-14.30  amd64  Linux kernel image for version 4.4.0 on 64 bit x86 SMP
ii  linux-image-extra-4.4.0-14-generic  4.4.0-14.30  amd64  Linux kernel extra modules for version 4.4.0 on 64 bit x86 SMP
ii  linux-image-generic                 4.4.0.14.15  amd64  Generic Linux kernel image
root@alu:/alu/sdc1/home/peter#
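
The "no persistent journal was found" message above means journald is only keeping logs in memory, so `journalctl -b -1` has nothing to show from the previous boot. A minimal sketch of how one might check for this, assuming journald is left at its default `Storage=auto` setting (where logs persist only if /var/log/journal exists); the helper name `journal_dir_status` is made up for illustration:

```shell
# Report whether journald would keep logs across reboots.
# Assumption: Storage=auto (the default), so persistence depends only on
# whether the journal directory exists.
journal_dir_status() {
    dir="${1:-/var/log/journal}"
    if [ -d "$dir" ]; then
        echo "persistent"
    else
        echo "volatile"
    fi
}

journal_dir_status

# To enable persistence (as root), following the systemd-journald man page:
#   mkdir -p /var/log/journal
#   systemd-tmpfiles --create --prefix /var/log/journal
# then reboot so that later boots are recorded on disk.
```

Once the directory exists and the machine has been rebooted, a crash should leave the previous boot's log readable with `journalctl -b -1`, provided the last messages reached the disk before the lockup.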

Comment 6 Peter Silva 2016-03-20 14:41:57 UTC
Created attachment 122448 [details]
when I run dmesg, I get the right stuff, but the file in /var/log is wrong. weird!
Comment 7 Peter Silva 2016-03-20 14:42:21 UTC
oh, and I have now purged docker...
Comment 8 Pierre Moreau 2016-03-20 14:48:54 UTC
Sadly there is nothing wrong in the dmesg… Maybe run `cat /dev/kmsg > somefile.txt&` and see if after crashing the computer, somefile.txt contains some more information.
Comment 9 Karol Herbst 2016-03-20 15:19:10 UTC
(In reply to Peter Silva from comment #6)
> Created attachment 122448 [details]
> when I run dmesg, I get the right stuff, but the file in /var/log is wrong.
> weird!

With systemd, the kernel log is saved in the journal by journald.
Comment 10 Karol Herbst 2016-03-20 15:21:12 UTC
(In reply to Pierre Moreau from comment #8)
> Sadly there is nothing wrong in the dmesg… Maybe run `cat /dev/kmsg >
> somefile.txt&` and see if after crashing the computer, somefile.txt contains
> some more information.

I fear that when the kernel crashes, all the filesystem stuff is gone already. Maybe in the end he needs to set up netconsole.
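
For reference, netconsole sends kernel messages over UDP to another machine, so they survive even when nothing can be written to the local disk. A rough sketch of the setup, assuming the parameter format from the kernel's netconsole documentation; the interface name `eth0`, the address `192.168.1.20`, and the helper `netconsole_param` are placeholders for illustration, not values from this report:

```shell
# Build the netconsole= module parameter.
# Format: [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-mac]
# Empty fields fall back to the kernel defaults.
netconsole_param() {
    # $1 = local interface, $2 = receiver IP, $3 = receiver UDP port (default 6666)
    printf '@/%s,%s@%s/\n' "$1" "${3:-6666}" "$2"
}

netconsole_param eth0 192.168.1.20

# On the crashing machine (as root):
#   dmesg -n 8    # raise the console log level so all messages are sent
#   modprobe netconsole netconsole="$(netconsole_param eth0 192.168.1.20)"
# On the receiving machine (netcat flags differ between variants):
#   nc -u -l 6666 > nc_log          # OpenBSD netcat
#   nc -u -l -p 6666 > nc_log       # traditional netcat
```

The receiver has to be listening before the crash is triggered, and the two machines need to be on the same network segment unless a target MAC address is supplied.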
Comment 11 Peter Silva 2016-03-20 17:36:41 UTC
I set up netconsole. It emitted this single line when it crashed:

peter@tigaroo:~$ more nc_log
[ 5612.909828] nouveau 0000:01:00.0: disp: ERROR 0 [] 00 [] chid 1 mthd 0080 data 00000000
peter@tigaroo:~$
Comment 12 Peter Silva 2016-03-23 01:14:06 UTC
I installed the proprietary driver, and all symptoms disappeared.

It looks like it really was nouveau.
Comment 13 Pierre Moreau 2016-05-19 21:03:51 UTC
Just that single line? That’s sad…
Which version of Mesa are you using?

What happens if you boot with `nouveau.config=NvForcePost=1` on the kernel command line, and start Minecraft or glmark2 without doing any suspend/resume in between booting and starting the application? Probably more far-fetched, but what happens if you reclock to a higher clock before starting the application?

* On 4.4, boot with `nouveau.pstate=1` on the kernel command line, and run `echo XX > /sys/class/drm/card0/device/pstate` as root, where XX is a performance level; you can find the different performance levels available on your card by cat’ing that file.
* On 4.5+, run `echo XX > /sys/kernel/debug/dri/0/pstate` as root, where XX is a performance level; you can find the different performance levels available on your card by cat’ing that file.
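
The two cases above can be folded into one small script. This is only a sketch: the two paths are the ones given in this comment, while the helper name `find_pstate_file` is made up for illustration. It has to run as root, since debugfs is normally not readable by ordinary users.

```shell
# Locate the pstate control file: debugfs on 4.5+, sysfs on 4.4
# (the sysfs file only appears when booted with nouveau.pstate=1).
find_pstate_file() {
    # Arguments override the default candidate paths (handy for testing).
    if [ "$#" -eq 0 ]; then
        set -- /sys/kernel/debug/dri/0/pstate /sys/class/drm/card0/device/pstate
    fi
    for f in "$@"; do
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

if pstate_file=$(find_pstate_file); then
    cat "$pstate_file"            # list the available performance levels
    # echo XX > "$pstate_file"    # as root; XX is a level taken from the listing
else
    echo "no pstate file found (is nouveau loaded, and are you root?)" >&2
fi
```

The `echo` line is left commented out on purpose: the level to write depends on what the listing shows for the card in question.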
Comment 14 GitLab Migration User 2019-09-18 20:42:12 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1097.
