Bug 87361 - [NV4C] GPU lock-up after booting to desktop in Fedora 20 & 21 (Nvidia Geforce 6100 IGP)
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/nouveau
Version: 10.3
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium blocker
Assignee: Nouveau Project
Reported: 2014-12-16 14:12 UTC by Jan Jasper de Kroon
Modified: 2014-12-17 18:14 UTC

Attachments
Dump of the Kmsg while the lock-up occurred (59.23 KB, text/plain)
2014-12-16 14:12 UTC, Jan Jasper de Kroon

Description Jan Jasper de Kroon 2014-12-16 14:12:40 UTC
Created attachment 110906
Dump of the Kmsg while the lock-up occurred

While using Fedora 20 and 21 I discovered a bug that causes my complete desktop system to freeze (kernel freeze).
The video card, which is integrated on the motherboard, is an Nvidia Geforce 6100 IGP.
This bug may affect other distros/kernels as well, but I haven't tested any besides Fedora 20 and 21.
I have attached a kmsg dump to this report, taken while the freeze occurred.

To solve the problem, the following things were tried in cooperation with nouveau IRC user RSpliet:
- Because the kmsg dump complained about PVPE, while this graphics chip doesn't have PVPE support, nouveau.config=PVP=0 was first tried as a kernel parameter. Unfortunately, the lock-up still occurred.
- After that attempt, the kernel parameter nouveau.config=NvMSI=0 was applied, without the nouveau.config=PVP=0 parameter. This time the desktop rendered without problems.

IRC user RSpliet suggested that NvMSI may need blacklisting for the NV4C chips.

For more information, please don't hesitate to contact me.

Greetings Jan Jasper de Kroon
Comment 1 Ilia Mirkin 2014-12-16 16:30:47 UTC
We did make some wild assumptions that nv4c msi worked like nv4e msi. Perhaps they were unwarranted. Could you change drivers/gpu/drm/nouveau/core/engine/device/nv40.c to have

device->oclass[NVDEV_SUBDEV_MC     ] =  nv44_mc_oclass

instead of nv4c_mc_oclass for the 0x4c case? (and remove NvMSI=0 from your boot)
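
For readers following along, the effect of the proposed swap can be modeled in a few lines of standalone C. This is only a sketch and not the kernel source: `struct mc_class_model`, `select_mc_model`, `check_swap_model` and the `uses_msi` flag are hypothetical names invented for illustration, and the assumption that the nv44 class does not use MSI while the nv4c class does is inferred from this thread rather than taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an MC class; NOT nouveau's real type. */
struct mc_class_model {
    const char *name; /* identifier as quoted in this thread */
    bool uses_msi;    /* assumption of this sketch: whether the class uses MSI */
};

const struct mc_class_model nv44_mc_model = { "nv44_mc_oclass", false };
const struct mc_class_model nv4c_mc_model = { "nv4c_mc_oclass", true };

/*
 * Model of the per-chipset selection. use_nv44_for_4c toggles the
 * swap proposed above for the 0x4c case.
 */
const struct mc_class_model *select_mc_model(unsigned chipset, bool use_nv44_for_4c)
{
    switch (chipset) {
    case 0x4c:
        return use_nv44_for_4c ? &nv44_mc_model : &nv4c_mc_model;
    default:
        return NULL; /* other chipsets are outside this sketch */
    }
}

/* Self-check: with the swap applied, the 0x4c case no longer uses MSI. */
void check_swap_model(void)
{
    assert(select_mc_model(0x4c, false)->uses_msi);
    assert(!select_mc_model(0x4c, true)->uses_msi);
}
```

Under these assumptions, booting with the swapped class and without NvMSI=0 would show whether the MSI path is what triggers the lock-up.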

BTW, to answer some questions from the IRC log:

<4>[    1.628331] nouveau W[     DRM] DCB type 4 not known
<4>[    1.628335] nouveau W[     DRM] Unknown-1 has no encoders, removing

This is just the DVI connector on your motherboard, which is implemented as an external encoder. nouveau doesn't support that; there's a bug open about it.

<3>[   25.103513] nouveau E[    PBUS][0000:00:0d.0] MMIO write of 0x01000001 FAULT at 0x00b000

And the others -- these come from VPE, I have no idea how they happen since I'm sure you're not even using VPE. But I've seen them on a bunch of people's boot logs and they appear harmless. If you wanted to disable VPE, you'd boot with nouveau.config=PMPEG=0. However it's unlikely to have any effect. PVP=0 should have had literally 0 effect since you don't have a (functioning) VP engine on those boards (you have VP1, but it's wholly unsupported by nouveau).
Comment 2 Jan Jasper de Kroon 2014-12-16 16:51:51 UTC
Hello Ilia Mirkin,

Currently I'm not in the vicinity of this computer, so the earliest I can test this is tomorrow.
I will get back in touch with you when I have tried your proposed change.

Greetings Jan Jasper de Kroon
Comment 3 Ilia Mirkin 2014-12-16 23:26:32 UTC
Actually disabling MSI may just be the easiest thing to do. An NV4E user (also an IGP) also had issues. So few people have these devices, and MSI doesn't really provide any serious advantages for them, and NVIDIA never made use of it in their driver -- the hw could have bugs, or who knows what.

I sent a patch to disable it, which should hopefully be backported to various stable kernels as well. [MSI support in nouveau was first enabled in kernel 3.13 IIRC]
Comment 4 Jan Jasper de Kroon 2014-12-17 06:28:28 UTC
Hello Ilia,

I will also test your previous suggestion of changing the 0x4c case in nv40.c from device->oclass[NVDEV_SUBDEV_MC     ] =  nv4c_mc_oclass to:
device->oclass[NVDEV_SUBDEV_MC     ] =  nv44_mc_oclass
Then at least you will know what difference it would have made.
If you'd like, I can also try your suggested patch to disable MSI altogether.
Let me know if I can be of any help.

Greetings Jan Jasper de Kroon
Comment 5 Jan Jasper de Kroon 2014-12-17 07:20:03 UTC
I now have the recompiled RPMs with the change you suggested.
I'll be able to test these recompiled packages this evening, around 16:00 UTC.
After that I'll report back on what I experienced with this modification.

Greetings Jan Jasper de Kroon
Comment 6 Jan Jasper de Kroon 2014-12-17 08:01:52 UTC
A quick search of the freedesktop.org Bugzilla for NV4C revealed some similar bugs, such as: https://bugs.freedesktop.org/show_bug.cgi?id=61321
A quick search for nv4c_mc_oclass on http://lxr.free-electrons.com/ident?v=3.14;i=nv4c_mc_oclass showed that nv4c_mc_oclass is first mentioned in kernel 3.14.
However, the reporter of the bug linked above experienced the same problem in the transition from kernel 3.6 to 3.7.
While browsing the kernel 3.7 source I found that the value for device->oclass[NVDEV_SUBDEV_MC] was still nv44_mc_oclass.
Searching for 'nv44_mc_oclass' on the free-electrons.com website and comparing kernel versions 3.6 and 3.7 shows that nv44_mc_oclass was first introduced in kernel 3.7 (the version the above bug report applies to).
So I think your idea of disabling MSI in the kernel source is indeed the correct decision. But of course that's just my humble opinion.
As said earlier, I will report this evening on the change from nv4c_mc_oclass to nv44_mc_oclass, but given the above bug report I suppose it will eventually lead to the same result.
I hope this gives you some useful information about the problem.
Comment 7 Jan Jasper de Kroon 2014-12-17 18:14:37 UTC
To resolve the bug these changes from the mailing list need to be applied:
http://lists.freedesktop.org/archives/nouveau/2014-December/019417.html

