Bug 92991

Summary: Nouveau module loading regression in 4.4-rc1 on optimus system
Product: xorg
Reporter: Oliver Neukum <oliver>
Component: Driver/nouveau
Assignee: Nouveau Project <nouveau>
Status: NEW
QA Contact: Xorg Project Team <xorg-team>
Severity: normal
Priority: medium
Version: unspecified
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Attachments (description: flags):
lspci -vvv: none
dmesg of 4.4-rc2: none
X log: none

Description Oliver Neukum 2015-11-18 12:01:51 UTC
Created attachment 119903 [details]
lspci -vvv

I am seeing a short freeze of the system loading the nouveau module and then a kernel error in syslog:

[  295.486823] nouveau 0000:01:00.0: enabling device (0000 -> 0003)
[  295.486851] nouveau 0000:01:00.0: NVIDIA GK104 (0e4360a2)
[  295.499666] nouveau 0000:01:00.0: bios: version 80.04.b1.00.03
[  295.499966] nouveau 0000:01:00.0: mxm: BIOS version 3.0
[  295.503015] ACPI Warning: \_SB_.PCI0.PEGP.DGFX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150930/nsarguments-95)
[  295.746220] nouveau 0000:01:00.0: fb: 4096 MiB GDDR5
[  295.812103] [TTM] Zone  kernel: Available graphics memory: 12164158 kiB
[  295.812105] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
[  295.812105] [TTM] Initializing pool allocator
[  295.812109] [TTM] Initializing DMA pool allocator
[  295.812116] nouveau 0000:01:00.0: DRM: VRAM: 4096 MiB
[  295.812117] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
[  295.812120] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
[  295.812121] nouveau 0000:01:00.0: DRM: DCB version 4.0
[  295.812122] nouveau 0000:01:00.0: DRM: DCB outp 03: 01075fd6 0f420020
[  295.812123] nouveau 0000:01:00.0: DRM: DCB outp 04: 01075f92 00020020
[  295.812124] nouveau 0000:01:00.0: DRM: DCB outp 05: 08014fc6 0f420010
[  295.812125] nouveau 0000:01:00.0: DRM: DCB outp 06: 08014f82 00020010
[  295.812126] nouveau 0000:01:00.0: DRM: DCB outp 08: 04038fb6 0f420010
[  295.812127] nouveau 0000:01:00.0: DRM: DCB outp 09: 04038f72 00020010
[  295.812128] nouveau 0000:01:00.0: DRM: DCB conn 04: 01000446
[  295.812129] nouveau 0000:01:00.0: DRM: DCB conn 05: 02000546
[  295.812130] nouveau 0000:01:00.0: DRM: DCB conn 07: 00010747
[  295.812131] nouveau 0000:01:00.0: DRM: DCB conn 08: 00020846
[  295.812132] nouveau 0000:01:00.0: DRM: DCB conn 09: 00000900
[  295.830519] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[  295.830521] [drm] Driver supports precise vblank timestamp query.
[  296.026589] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
[  297.565049] SFW2-INext-DROP-DEFLT IN=enp0s25 OUT= MAC=6c:3b:e5:8c:99:ec:00:1c:4a:86:25:6a:08:00 SRC=192.168.234.1 DST=192.168.234.102 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=6751 DF PROTO=TCP SPT=4739 DPT=14013 WINDOW=5840 RES=0x00 SYN URGP=0 OPT (020405B40402080A020ABC300000000001030301) 
[  298.474076] nouveau 0000:01:00.0: gr: wait for idle timeout (en: 1, ctxsw: 0, busy: 1)
[  300.473977] nouveau 0000:01:00.0: gr: wait for idle timeout (en: 1, ctxsw: 0, busy: 1)
[  302.473878] nouveau 0000:01:00.0: gr: wait for idle timeout (en: 1, ctxsw: 0, busy: 1)
[  304.473781] nouveau 0000:01:00.0: gr: wait for idle timeout (en: 1, ctxsw: 0, busy: 1)
[  306.473761] nouveau 0000:01:00.0: timeout at drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.c:1365/gf100_grctx_generate()!
[  306.473775] nouveau 0000:01:00.0: gr: failed to construct context
[  306.473778] nouveau 0000:01:00.0: gr: init failed, -16
[  306.476140] nouveau 0000:01:00.0: DRM: allocated 1680x1050 fb: 0x60000, bo ffff88068c051400
[  306.476221] nouveau 0000:01:00.0: fb1: nouveaufb frame buffer device
[  306.476226] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 1
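For anyone comparing their own log against this one, here is a rough sketch of how the relevant lines could be picked out of a dmesg dump. The patterns are copied from the messages above; which messages matter is an assumption on my part, and find_gr_failures is just an illustrative name. For reference, the -16 in "gr: init failed, -16" is -EBUSY on Linux.

```python
import re

# Patterns taken from the log in this report. Treating exactly these
# three messages as "the failure" is an assumption, not driver policy.
FAILURE_PATTERNS = [
    re.compile(r"nouveau .*gr: wait for idle timeout"),
    re.compile(r"nouveau .*gr: failed to construct context"),
    re.compile(r"nouveau .*gr: init failed, -?\d+"),
]


def find_gr_failures(dmesg_text):
    """Return the dmesg lines that match one of the gr failure patterns."""
    hits = []
    for line in dmesg_text.splitlines():
        if any(p.search(line) for p in FAILURE_PATTERNS):
            hits.append(line.strip())
    return hits


# Four lines copied from the log above; only the first three should match.
SAMPLE = """\
[  298.474076] nouveau 0000:01:00.0: gr: wait for idle timeout (en: 1, ctxsw: 0, busy: 1)
[  306.473775] nouveau 0000:01:00.0: gr: failed to construct context
[  306.473778] nouveau 0000:01:00.0: gr: init failed, -16
[  306.476226] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 1
"""

for hit in find_gr_failures(SAMPLE):
    print(hit)
```

Feeding it the output of dmesg saved to a file should work the same way as the inline sample.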
Comment 1 Karol Herbst 2015-11-18 21:47:51 UTC
try to add nouveau.config=War00C800_0=1 to the kernel command line
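To confirm the option actually reached the kernel after a reboot, the running command line can be read from /proc/cmdline. The sketch below parses such a string; the function names are made up for illustration, it assumes nouveau.config takes a comma-separated list of options, and it does not handle quoted values containing spaces.

```python
def cmdline_options(cmdline):
    """Split a kernel command line into {key: value}; bare flags map to None."""
    opts = {}
    for token in cmdline.split():
        # Split on the first "=" only, so "nouveau.config=War00C800_0=1"
        # yields key "nouveau.config" and value "War00C800_0=1".
        key, sep, value = token.partition("=")
        opts[key] = value if sep else None
    return opts


def has_nouveau_workaround(cmdline):
    """True if nouveau.config contains the War00C800_0=1 setting."""
    value = cmdline_options(cmdline).get("nouveau.config")
    if not value:
        return False
    # Assumption: several nouveau.config settings are separated by commas.
    return "War00C800_0=1" in value.split(",")


# On a running system one could do (not executed here):
#   with open("/proc/cmdline") as f:
#       print(has_nouveau_workaround(f.read()))
print(has_nouveau_workaround("root=/dev/sda1 nouveau.config=War00C800_0=1 quiet"))
```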
Comment 2 Oliver Neukum 2015-11-18 22:09:36 UTC
(In reply to Karol Herbst from comment #1)
> try to add nouveau.config=War00C800_0=1 to the kernel command line

[ 4169.807962] nouveau 0000:01:00.0: enabling device (0000 -> 0003)
[ 4169.808016] nouveau 0000:01:00.0: NVIDIA GK104 (0e4360a2)
[ 4169.821387] nouveau 0000:01:00.0: bios: version 80.04.b1.00.03
[ 4169.821709] nouveau 0000:01:00.0: mxm: BIOS version 3.0
[ 4169.824803] ACPI Warning: \_SB_.PCI0.PEGP.DGFX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150930/nsarguments-95)
[ 4170.068054] nouveau 0000:01:00.0: fb: 4096 MiB GDDR5
[ 4170.134338] [TTM] Zone  kernel: Available graphics memory: 12164158 kiB
[ 4170.134340] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
[ 4170.134341] [TTM] Initializing pool allocator
[ 4170.134344] [TTM] Initializing DMA pool allocator
[ 4170.134355] nouveau 0000:01:00.0: DRM: VRAM: 4096 MiB
[ 4170.134356] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
[ 4170.134358] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
[ 4170.134359] nouveau 0000:01:00.0: DRM: DCB version 4.0
[ 4170.134361] nouveau 0000:01:00.0: DRM: DCB outp 03: 01075fd6 0f420020
[ 4170.134362] nouveau 0000:01:00.0: DRM: DCB outp 04: 01075f92 00020020
[ 4170.134363] nouveau 0000:01:00.0: DRM: DCB outp 05: 08014fc6 0f420010
[ 4170.134364] nouveau 0000:01:00.0: DRM: DCB outp 06: 08014f82 00020010
[ 4170.134364] nouveau 0000:01:00.0: DRM: DCB outp 08: 04038fb6 0f420010
[ 4170.134365] nouveau 0000:01:00.0: DRM: DCB outp 09: 04038f72 00020010
[ 4170.134367] nouveau 0000:01:00.0: DRM: DCB conn 04: 01000446
[ 4170.134367] nouveau 0000:01:00.0: DRM: DCB conn 05: 02000546
[ 4170.134368] nouveau 0000:01:00.0: DRM: DCB conn 07: 00010747
[ 4170.134369] nouveau 0000:01:00.0: DRM: DCB conn 08: 00020846
[ 4170.134370] nouveau 0000:01:00.0: DRM: DCB conn 09: 00000900
[ 4170.149535] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 4170.149536] [drm] Driver supports precise vblank timestamp query.
[ 4170.321600] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
[ 4170.616221] nouveau 0000:01:00.0: pmu: hw bug workaround enabled
[ 4170.728193] nouveau 0000:01:00.0: pmu: hw bug workaround enabled
[ 4170.799274] nouveau 0000:01:00.0: DRM: allocated 1680x1050 fb: 0x60000, bo ffff8805dc47ec00
[ 4170.799409] nouveau 0000:01:00.0: fb1: nouveaufb frame buffer device
[ 4170.799415] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 1
Comment 3 Ilia Mirkin 2015-11-18 22:12:34 UTC
(In reply to Oliver Neukum from comment #2)
> (In reply to Karol Herbst from comment #1)
> > try to add nouveau.config=War00C800_0=1 to the kernel command line
> 
> [ 4169.807962] nouveau 0000:01:00.0: enabling device (0000 -> 0003)
> [ 4169.808016] nouveau 0000:01:00.0: NVIDIA GK104 (0e4360a2)
> [ 4169.821387] nouveau 0000:01:00.0: bios: version 80.04.b1.00.03
> [ 4169.821709] nouveau 0000:01:00.0: mxm: BIOS version 3.0
> [ 4169.824803] ACPI Warning: \_SB_.PCI0.PEGP.DGFX._DSM: Argument #4 type
> mismatch - Found [Buffer], ACPI requires [Package] (20150930/nsarguments-95)
> [ 4170.068054] nouveau 0000:01:00.0: fb: 4096 MiB GDDR5
> [ 4170.134338] [TTM] Zone  kernel: Available graphics memory: 12164158 kiB
> [ 4170.134340] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
> [ 4170.134341] [TTM] Initializing pool allocator
> [ 4170.134344] [TTM] Initializing DMA pool allocator
> [ 4170.134355] nouveau 0000:01:00.0: DRM: VRAM: 4096 MiB
> [ 4170.134356] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
> [ 4170.134358] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
> [ 4170.134359] nouveau 0000:01:00.0: DRM: DCB version 4.0
> [ 4170.134361] nouveau 0000:01:00.0: DRM: DCB outp 03: 01075fd6 0f420020
> [ 4170.134362] nouveau 0000:01:00.0: DRM: DCB outp 04: 01075f92 00020020
> [ 4170.134363] nouveau 0000:01:00.0: DRM: DCB outp 05: 08014fc6 0f420010
> [ 4170.134364] nouveau 0000:01:00.0: DRM: DCB outp 06: 08014f82 00020010
> [ 4170.134364] nouveau 0000:01:00.0: DRM: DCB outp 08: 04038fb6 0f420010
> [ 4170.134365] nouveau 0000:01:00.0: DRM: DCB outp 09: 04038f72 00020010
> [ 4170.134367] nouveau 0000:01:00.0: DRM: DCB conn 04: 01000446
> [ 4170.134367] nouveau 0000:01:00.0: DRM: DCB conn 05: 02000546
> [ 4170.134368] nouveau 0000:01:00.0: DRM: DCB conn 07: 00010747
> [ 4170.134369] nouveau 0000:01:00.0: DRM: DCB conn 08: 00020846
> [ 4170.134370] nouveau 0000:01:00.0: DRM: DCB conn 09: 00000900
> [ 4170.149535] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> [ 4170.149536] [drm] Driver supports precise vblank timestamp query.
> [ 4170.321600] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
> [ 4170.616221] nouveau 0000:01:00.0: pmu: hw bug workaround enabled
> [ 4170.728193] nouveau 0000:01:00.0: pmu: hw bug workaround enabled
> [ 4170.799274] nouveau 0000:01:00.0: DRM: allocated 1680x1050 fb: 0x60000,
> bo ffff8805dc47ec00
> [ 4170.799409] nouveau 0000:01:00.0: fb1: nouveaufb frame buffer device
> [ 4170.799415] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on
> minor 1

That seems much happier. Mind providing the output of lspci -nnv -d 10de::300 ? The lspci output you attached is missing the IDs, and we're whitelisting devices for now. Eventually we'll flip this on by default.
Comment 4 Oliver Neukum 2015-11-18 22:26:54 UTC
lspci -nnv -d 10de:11b6
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK104GLM [Quadro K3100M] [10de:11b6] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: Hewlett-Packard Company Device [103c:190a]
        Flags: fast devsel, IRQ 10
        Memory at cb000000 (32-bit, non-prefetchable) [disabled] [size=16M]
        Memory at 50000000 (64-bit, prefetchable) [disabled] [size=256M]
        Memory at 60000000 (64-bit, prefetchable) [disabled] [size=32M]
        I/O ports at 5000 [disabled] [size=128]
        Expansion ROM at cc080000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [b4] Vendor Specific Information: Len=14 <?>
        Capabilities: [100] Virtual Channel
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] #19
        Kernel modules: nouveau
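The part of this output that matters for whitelisting is the numeric vendor:device pair in square brackets, which only shows up with -nn. A small sketch for extracting it from a line of lspci -nn output follows; pci_ids is a hypothetical helper name, not part of any tool.

```python
import re

# Matches the "[vvvv:dddd]" pair that lspci -nn prints. The class code
# "[0300]" has no colon and a name like "[Quadro K3100M]" is not hex,
# so neither of those matches.
ID_RE = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]")


def pci_ids(lspci_line):
    """Return (vendor, device) from the first [vvvv:dddd] on the line, or None."""
    m = ID_RE.search(lspci_line)
    return (m.group(1), m.group(2)) if m else None


# First line of the output above.
LINE = ("01:00.0 VGA compatible controller [0300]: NVIDIA Corporation "
        "GK104GLM [Quadro K3100M] [10de:11b6] (rev a1) (prog-if 00 [VGA controller])")
print(pci_ids(LINE))
```

The same helper applied to the Subsystem line returns the subsystem vendor and device pair.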
Comment 5 Oliver Neukum 2015-11-18 22:28:02 UTC
(In reply to Ilia Mirkin from comment #3)

> That seems much happier.

Unfortunately the result does not work. xrandr leads to an instant hard lockup with and without the additional parameter.
Comment 6 Ilia Mirkin 2015-11-19 16:03:53 UTC
(In reply to Oliver Neukum from comment #5)
> (In reply to Ilia Mirkin from comment #3)
> 
> > That seems much happier.
> 
> Unfortunately the result does not work. xrandr leads to an instant hard
> lockup with and without the additional parameter.

Can you recover any information when this happens? Perhaps with netconsole? Do you have anything like bumblebee installed? [If so, try without it.]

Separately, can you try booting with nouveau.runpm=0 (in addition to the other workaround thing)? That will prevent nouveau from runtime-suspending your GPU and will increase battery usage, but it's a good debugging step.
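To see whether the GPU is being runtime-suspended at a given moment, the generic PCI runtime PM attributes in sysfs can be read. This is only a sketch: runtime_pm_state is an illustrative name, the address 0000:01:00.0 is taken from the log in this report, and how nouveau.runpm maps onto these attributes is not something confirmed here.

```python
from pathlib import Path


def runtime_pm_state(pci_addr, sysfs_root="/sys"):
    """Read power/control and power/runtime_status for a PCI device.

    Returns a dict with both keys; a value is None if the file is missing
    (for example when the device does not exist on this machine).
    """
    power_dir = Path(sysfs_root) / "bus" / "pci" / "devices" / pci_addr / "power"
    state = {}
    for name in ("control", "runtime_status"):
        attr = power_dir / name
        state[name] = attr.read_text().strip() if attr.exists() else None
    return state


# Device address from the dmesg output in this report.
print(runtime_pm_state("0000:01:00.0"))
```

A runtime_status of "suspended" while the lockup is being reproduced would point at the runtime PM path, while "active" would suggest looking elsewhere.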
Comment 7 Oliver Neukum 2015-11-23 14:18:07 UTC
Created attachment 120048 [details]
dmesg of 4.4-rc2

This is with config=War00C800_0=1 runpm=0
Comment 8 Oliver Neukum 2015-11-23 14:19:23 UTC
With runpm=0 I get full output of xrandr. Without it, it is an instant crash:

Screen 0: minimum 8 x 8, current 1600 x 900, maximum 32767 x 32767
eDP1 connected primary 1600x900+0+0 (normal left inverted right x axis y axis) 382mm x 215mm
   1600x900      60.01*+  40.00  
   1024x768      60.00  
   800x600       60.32    56.25  
   640x480       59.94  
VGA1 disconnected (normal left inverted right x axis y axis)
VIRTUAL1 disconnected (normal left inverted right x axis y axis)
DisplayPort-1-0 disconnected (normal left inverted right x axis y axis)
DisplayPort-1-1 disconnected (normal left inverted right x axis y axis)
DisplayPort-1-2 connected (normal left inverted right x axis y axis)
   1680x1050     59.88 +  59.95  
   1400x1050     74.76    59.98  
   1280x1024     75.02    60.02  
   1440x900      74.98    59.90  
   1280x960      60.00  
   1152x864      75.00  
   1024x768      60.04    75.08    75.03    70.07    60.00  
   960x720       75.00    60.00  
   928x696       75.00    60.05  
   896x672       75.05    60.01  
   832x624       74.55  
   800x600       75.00    70.00    65.00    60.00    72.19    75.00    60.32    56.25  
   700x525       74.76    59.98  
   640x512       75.02    60.02  
   640x480       60.00    75.00    72.81    75.00    60.00    59.94  
   720x400       70.08  
   576x432       75.00  
   512x384       75.03    70.07    60.00  
   416x312       74.66  
   400x300       72.19    75.12    60.32    56.34  
   320x240       72.81    75.00    60.05  
  1024x768 (0xb1) 65.000MHz
        h: width  1024 start 1048 end 1184 total 1344 skew    0 clock  48.36KHz
        v: height  768 start  771 end  777 total  806           clock  60.00Hz
  800x600 (0xb2) 40.000MHz
        h: width   800 start  840 end  968 total 1056 skew    0 clock  37.88KHz
        v: height  600 start  601 end  605 total  628           clock  60.32Hz
  800x600 (0xb3) 36.000MHz
        h: width   800 start  824 end  896 total 1024 skew    0 clock  35.16KHz
        v: height  600 start  601 end  603 total  625           clock  56.25Hz
  640x480 (0xb4) 25.175MHz
        h: width   640 start  656 end  752 total  800 skew    0 clock  31.47KHz
        v: height  480 start  490 end  492 total  525           clock  59.94Hz
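Since the output names are what identify which outputs belong to the second GPU, here is a sketch that reduces xrandr output to a map of output name to connection state. The regular expression and the outputs name are my own; it only looks at the "connected"/"disconnected" header lines and ignores the mode lists.

```python
import re

# An output header line starts at column 0 with the output name followed
# by "connected" or "disconnected". Mode lines are indented, and the
# "Screen 0: ..." line has no such keyword after its first word.
OUTPUT_RE = re.compile(r"^(\S+) (connected|disconnected)\b")


def outputs(xrandr_text):
    """Map each output name to True (connected) or False (disconnected)."""
    result = {}
    for line in xrandr_text.splitlines():
        m = OUTPUT_RE.match(line)
        if m:
            result[m.group(1)] = m.group(2) == "connected"
    return result


# Lines copied from the xrandr output above.
SAMPLE = """\
Screen 0: minimum 8 x 8, current 1600 x 900, maximum 32767 x 32767
eDP1 connected primary 1600x900+0+0 (normal left inverted right x axis y axis) 382mm x 215mm
   1600x900      60.01*+  40.00
VGA1 disconnected (normal left inverted right x axis y axis)
DisplayPort-1-2 connected (normal left inverted right x axis y axis)
   1680x1050     59.88 +  59.95
"""

print(outputs(SAMPLE))
```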
Comment 9 Ilia Mirkin 2015-11-23 17:58:03 UTC
(In reply to Oliver Neukum from comment #8)
> With runpm=0 I get full output of xrandr. Without it, it is an instant crash:
> 
> Screen 0: minimum 8 x 8, current 1600 x 900, maximum 32767 x 32767
> eDP1 connected primary 1600x900+0+0 (normal left inverted right x axis y
> axis) 382mm x 215mm
>    1600x900      60.01*+  40.00  
>    1024x768      60.00  
>    800x600       60.32    56.25  
>    640x480       59.94  
> VGA1 disconnected (normal left inverted right x axis y axis)
> VIRTUAL1 disconnected (normal left inverted right x axis y axis)
> DisplayPort-1-0 disconnected (normal left inverted right x axis y axis)
> DisplayPort-1-1 disconnected (normal left inverted right x axis y axis)
> DisplayPort-1-2 connected (normal left inverted right x axis y axis)
>    1680x1050     59.88 +  59.95  

Those connector names look like they're from xf86-video-modesetting, not xf86-video-nouveau. Could you try with xf86-video-nouveau installed? What version of xf86-video-intel are you using? SNA or UXA?

I definitely know that earlier modesetting versions had various issues with reverse prime... when you say "lockup", do you really mean "lockup", or "system working perfectly fine with the minor exception that the display stops updating"?

Lastly, can you ensure that nouveau is loaded *before* X starts?
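One way to check the load order is to look at /proc/modules from a script that runs before the display manager starts. The sketch below only parses the text; the sample line uses made-up size and address values, and loaded_modules is an illustrative name. Note that a driver compiled into the kernel does not appear in /proc/modules at all, so an empty result is not proof that nouveau is absent.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.add(fields[0])
    return names


# Illustrative sample; sizes, refcounts and addresses are made up.
SAMPLE = """\
nouveau 1527808 1 - Live 0x0000000000000000
ttm 98304 1 nouveau, Live 0x0000000000000000
i915 1212416 5 - Live 0x0000000000000000
"""

# On a running system one could do (not executed here):
#   with open("/proc/modules") as f:
#       print("nouveau" in loaded_modules(f.read()))
print("nouveau" in loaded_modules(SAMPLE))
```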
Comment 10 Oliver Neukum 2015-11-24 10:10:29 UTC
Created attachment 120079 [details]
X log

booting (nouveau and i915 statically compiled) and connecting an external monitor
Comment 11 Oliver Neukum 2015-11-24 10:18:05 UTC
(In reply to Ilia Mirkin from comment #9)

It turns out that with nouveau statically compiled I can use an external monitor and xrandr without "runpm=0"

> Those connector names look like they're from xf86-video-modesetting, not
> xf86-video-nouveau. Could you try with xf86-video-nouveau installed? What
> version of xf86-video-intel are you using? SNA or UXA?

How do I do that? As far as I can tell, it is not used.
 
> I definitely know that earlier modesetting various had various issues with
> reverse prime... when you say "lockup", do you really mean "lockup", or
> "system working perfectly fine with the minor exception that the display
> stops updating"?

Total lockup. Even CAPS LOCK not working.
 
> Lastly, can you ensure that nouveau is loaded *before* X starts?

Done. That helps.
Comment 12 Oliver Neukum 2015-11-24 10:25:37 UTC
(In reply to Ilia Mirkin from comment #9)

> Those connector names look like they're from xf86-video-modesetting, not
> xf86-video-nouveau. Could you try with xf86-video-nouveau installed? What
> version of xf86-video-intel are you using? SNA or UXA?

xf86-video-modesetting is definitely not used. I've deinstalled it and rebooted. The result is the same.
Comment 13 Karol Herbst 2015-11-26 17:06:31 UTC
(In reply to Oliver Neukum from comment #12)
> (In reply to Ilia Mirkin from comment #9)
> 
> > Those connector names look like they're from xf86-video-modesetting, not
> > xf86-video-nouveau. Could you try with xf86-video-nouveau installed? What
> > version of xf86-video-intel are you using? SNA or UXA?
> 
> xf86-video-modesetting is definitely not used. I've deinstalled it and
> rebooted. The result is the same.

It is part of the Xserver now, so you can't just deinstall it ;)
