Bug 105319 - DRM: EVO timeout with kernel 4.15.x
Status: NEW
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Nouveau Project
QA Contact: Xorg Project Team
Reported: 2018-03-01 23:05 UTC by Sérgio M. Basto
Modified: 2018-12-04 13:18 UTC

Attachments
./dmesg.txt (53.79 KB, text/plain)
2018-03-24 01:09 UTC, Sérgio M. Basto
Xorg.0.log (13.16 KB, text/plain)
2018-03-24 01:10 UTC, Sérgio M. Basto
the commit which start evo timeout (5.56 KB, patch)
2018-08-29 00:06 UTC, Sérgio M. Basto

Description Sérgio M. Basto 2018-03-01 23:05:16 UTC
Booting with kernel 4.15.6 on Fedora 27:

mar 01 21:05:06 darkstart kernel: nouveau 0000:01:00.0: NVIDIA G98 (298480a2)
mar 01 21:05:06 darkstart kernel: nouveau 0000:01:00.0: bios: version 62.98.2e.00.08
mar 01 21:05:06 darkstart kernel: nouveau 0000:01:00.0: bios: M0203T not found
mar 01 21:05:06 darkstart kernel: nouveau 0000:01:00.0: bios: M0203E not matched!
mar 01 21:05:06 darkstart kernel: nouveau 0000:01:00.0: fb: 256 MiB DDR2
mar 01 21:05:06 darkstart kernel: nouveau 0000:01:00.0: DRM: VRAM: 256 MiB
mar 01 21:05:06 darkstart kernel: nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
mar 01 21:05:06 darkstart kernel: nouveau 0000:01:00.0: DRM: TMDS table version 2.0
mar 01 21:05:06 darkstart kernel: nouveau 0000:01:00.0: DRM: DCB version 4.0
mar 01 21:05:06 darkstart kernel: nouveau 0000:01:00.0: DRM: DCB outp 00: 01000323 00010034
mar 01 21:05:06 darkstart kernel: nouveau 0000:01:00.0: DRM: DCB outp 01: 02011300 00000028
mar 01 21:05:06 darkstart kernel: nouveau 0000:01:00.0: DRM: DCB outp 02: 04032312 00020010
mar 01 21:05:06 darkstart kernel: nouveau 0000:01:00.0: DRM: DCB conn 00: 00000040
mar 01 21:05:06 darkstart kernel: nouveau 0000:01:00.0: DRM: DCB conn 01: 00000100
mar 01 21:05:06 darkstart kernel: nouveau 0000:01:00.0: DRM: DCB conn 02: 00001261
mar 01 21:05:06 darkstart kernel: nouveau 0000:01:00.0: DRM: MM: using M2MF for buffer copies
mar 01 21:05:06 darkstart kernel: nouveau 0000:01:00.0: DRM: allocated 1280x800 fb: 0x50000, bo 000000008139a319
mar 01 21:05:06 darkstart kernel: fbcon: nouveaufb (fb0) is primary device
mar 01 21:05:19 darkstart kernel: nouveau 0000:01:00.0: DRM: EVO timeout
mar 01 21:05:19 darkstart kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
mar 01 21:05:19 darkstart kernel: nouveau 0000:01:00.0: DRM: EVO timeout
mar 01 21:05:19 darkstart kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
mar 01 21:05:19 darkstart kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
mar 01 21:05:19 darkstart kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
mar 01 21:05:19 darkstart kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
mar 01 21:05:19 darkstart kernel: nouveau 0000:01:00.0: DRM: GPU lockup - switching to software fbcon
mar 01 21:05:19 darkstart kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
mar 01 21:05:19 darkstart kernel: nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
mar 01 21:05:19 darkstart kernel: [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0
mar 01 21:05:21 darkstart kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
mar 01 21:05:30 darkstart kernel: nouveau 0000:01:00.0: DRM: EVO timeout
Comment 1 Sérgio M. Basto 2018-03-01 23:05:46 UTC
boots fine with kernel 4.14.x
Comment 2 Ilia Mirkin 2018-03-01 23:23:28 UTC
Can you say a few words about what's connected?

Also, any chance you could bisect to the specific commit?
Comment 3 Sérgio M. Basto 2018-03-01 23:37:15 UTC
01:00.0 VGA compatible controller: NVIDIA Corporation G98M [GeForce 9300M GS] (rev a1)
Comment 4 Ilia Mirkin 2018-03-01 23:45:03 UTC
(In reply to Sérgio M. Basto from comment #3)
> 01:00.0 VGA compatible controller: NVIDIA Corporation G98M [GeForce 9300M
> GS] (rev a1)

I meant more like what's connected to the card... screens and how they're hooked up.
Comment 5 Sérgio M. Basto 2018-03-02 00:01:14 UTC
(In reply to Ilia Mirkin from comment #4)
> (In reply to Sérgio M. Basto from comment #3)
> > 01:00.0 VGA compatible controller: NVIDIA Corporation G98M [GeForce 9300M
> > GS] (rev a1)
> 
> I meant more like what's connected to the card... screens and how they're
> hooked up.

Sorry, my previous comment was not meant as a reply.
I have a laptop from 2007 to 2009 or so (dual core, with an NVIDIA GPU that was good at the time). It boots and writes the boot log in VESA mode as usual, but after the switch root, when it should load DRM, it hangs (while loading the DRM kernel module).

About bisecting the kernel: before that, I tried jumping ahead and booting kernel 4.1-rc3 [1], which also doesn't boot. I may try booting kernel-4.15.0-0.rc0.git3 [2].

It is not impossible for me to bisect the kernel, but I haven't built a kernel in a long time. The thing is, I'm not the only one with NVIDIA/nouveau problems on kernel 4.15. And yes, it also hangs with the NVIDIA drivers.

[1] 
https://fedoraproject.org/wiki/RawhideKernelNodebug

[2]
https://koji.fedoraproject.org/koji/buildinfo?buildID=999713
Comment 6 Sérgio M. Basto 2018-03-02 00:02:35 UTC
I tried jumping ahead and booting kernel 4.16-rc3 [1], which also doesn't boot.
Comment 7 Sérgio M. Basto 2018-03-02 01:14:10 UTC
Sorry for my English; you may ask anything if you don't understand what I wrote.

I just tried jumping ahead and booting kernel 4.16-rc3, which also doesn't boot.


One new finding: kernels 4.16-rc3 and 4.15.6 boot fine if I add nouveau.modeset=0.

kernel-4.15-git3 and kernel-4.15-git6 don't boot, or they hang, even with nouveau.modeset=0, but maybe that is unrelated.
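
For anyone reproducing the nouveau.modeset=0 workaround, a minimal sketch of a check that the parameter is actually on the kernel command line. The function name nouveau_kms_disabled is made up for illustration, and the grubby invocation in the comment assumes a Fedora system where grubby manages the boot entries.

```shell
# Report whether a kernel command line disables nouveau modesetting.
# Reads the command line from the file given as $1 (default: /proc/cmdline).
# Hypothetical helper name; not part of any existing tool.
nouveau_kms_disabled() {
    grep -q -E '(^|[[:space:]])nouveau\.modeset=0([[:space:]]|$)' "${1:-/proc/cmdline}"
}

# To add the parameter persistently on Fedora (assumption: grubby is in use;
# run as root, not executed by this sketch):
#   grubby --update-kernel=ALL --args="nouveau.modeset=0"
```

After a reboot, `nouveau_kms_disabled && echo "KMS disabled"` should print the message if the parameter took effect.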
Comment 8 Sérgio M. Basto 2018-03-02 17:15:11 UTC
There is only one screen: eDP-1, connected, primary.


I received this comment [1], which says that this patch [2] fixed the commenter's problem.

I will test it. Does this patch make sense to you?

Thanks

[1]
https://bugzilla.redhat.com/show_bug.cgi?id=1546439#c7

[2]
https://github.com/skeggsb/nouveau/pull/1/files
Comment 9 Pierre Moreau 2018-03-22 21:16:59 UTC
Could you please try 4.16-rc6, which includes the patch you mentioned, and see if that helps?
Comment 10 Sérgio M. Basto 2018-03-24 01:09:32 UTC
Created attachment 138329 [details]
./dmesg.txt

(In reply to Pierre Moreau from comment #9)
> Could you please try 4.16-rc6, which includes the patch you mentioned, and
> see if that helps?

No luck, but I found that the laptop does not freeze: I could connect to it via ssh.
I am attaching the full dmesg.txt and Xorg.0.log.
Comment 11 Sérgio M. Basto 2018-03-24 01:10:28 UTC
Created attachment 138330 [details]
Xorg.0.log
Comment 12 Pierre Moreau 2018-03-26 11:41:40 UTC
(In reply to Sérgio M. Basto from comment #10)
> I send in attach full dmesg.txt and Xorg.0.log

Thank you for the logs.

I tried 4.15.12 on a G98 (9300 GS) but could not reproduce that issue (I had two screens: one connected over VGA and the other over HDMI). 4.15.12 should not have any new patches over 4.16-rc6, so you don’t need to try 4.15.12.
I’ll see if I can find some errors in the patches that went in 4.15. If you are able to bisect the faulty commit that went in 4.15, that would be really helpful.
Comment 13 Sérgio M. Basto 2018-03-27 18:58:43 UTC
(In reply to Pierre Moreau from comment #12)
> (In reply to Sérgio M. Basto from comment #10)
> > I send in attach full dmesg.txt and Xorg.0.log
> 
> Thank you for the logs.
> 
> I tried 4.15.12 on a G98 (9300 GS) but could not reproduce that issue (I had
> two screens: one connected over VGA and the other over HDMI). 4.15.12 should
> not have any new patches over 4.16-rc6, so you don’t need to try 4.15.12.
> I’ll see if I can find some errors in the patches that went in 4.15. If you
> are able to bisect the faulty commit that went in 4.15, that would be really
> helpful.

I tried kernel-4.15.0-rc0.git3 and kernel-4.15.0-rc0.git6; they don't boot, or they hang, even with nouveau.modeset=0.

So it is difficult for me to test kernel-4.15-rc0. I'm convinced the issue started before kernel-4.15-rc1, but if you have the list of nouveau patches that went into kernel 4.15, I can build a stable kernel, revert all of them, and, if that works, bisect from there...

Since it seems the laptop does not hang and I can shut it down without a cold reboot, even better.
Comment 14 Sérgio M. Basto 2018-03-27 19:54:05 UTC
(*) I'm convinced my issue started before kernel-4.15-rc1
Comment 15 Sérgio M. Basto 2018-04-03 13:38:07 UTC
(In reply to Sérgio M. Basto from comment #14)
> (*) I'm convinced my issue started before kernel-4.15-rc1

I tried bisecting the kernel: patch-4.14-git2.xz is the first bad one. I tried the same commit [1] but with git1, and it boots fine, so I assume this is exclusively a kernel code issue.

I don't know how the numbering works, so here is the summary: kernel 4.15.0-git1 boots and kernel 4.15.0-git2 doesn't boot.

xzdiff -up patch-4.14-git1.xz patch-4.14-git2.xz gives about 100 thousand lines. Where can I find a git tree with these commits?


[1]
https://src.fedoraproject.org/rpms/kernel/c/2ef4e8028f509354fb5a339bd2f8d0d1df8f2e8d?branch=master
Comment 16 Sérgio M. Basto 2018-07-01 18:43:45 UTC
While kernel-4.14.18 still boots without any problem, and this machine has been booting since 2007, with kernel-4.17.3-100.fc27.x86_64 I still have:

dmesg | grep -i nouv
[    6.897082] nouveau 0000:01:00.0: NVIDIA G98 (298480a2)
[    6.924724] nouveau 0000:01:00.0: bios: version 62.98.2e.00.08
[    6.946862] nouveau 0000:01:00.0: bios: M0203T not found
[    6.947024] nouveau 0000:01:00.0: bios: M0203E not matched!
[    6.947173] nouveau 0000:01:00.0: fb: 256 MiB DDR2
[    7.031599] nouveau 0000:01:00.0: DRM: VRAM: 256 MiB
[    7.031744] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
[    7.031905] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
[    7.032073] nouveau 0000:01:00.0: DRM: DCB version 4.0
[    7.032240] nouveau 0000:01:00.0: DRM: DCB outp 00: 01000323 00010034
[    7.032394] nouveau 0000:01:00.0: DRM: DCB outp 01: 02011300 00000028
[    7.032541] nouveau 0000:01:00.0: DRM: DCB outp 02: 04032312 00020010
[    7.032688] nouveau 0000:01:00.0: DRM: DCB conn 00: 00000040
[    7.032835] nouveau 0000:01:00.0: DRM: DCB conn 01: 00000100
[    7.032980] nouveau 0000:01:00.0: DRM: DCB conn 02: 00001261
[    7.039372] nouveau 0000:01:00.0: DRM: MM: using M2MF for buffer copies
[    7.117580] nouveau 0000:01:00.0: DRM: allocated 1280x800 fb: 0x60000, bo 0000000065c597bd
[    7.123536] fbcon: nouveaufb (fb0) is primary device
[    9.191132] nouveau 0000:01:00.0: DRM: EVO timeout
[   11.191063] nouveau 0000:01:00.0: DRM: base-0: timeout
[   13.192324] nouveau 0000:01:00.0: DRM: base-0: timeout
[   15.266028] nouveau 0000:01:00.0: DRM: base-0: timeout
[   17.266094] nouveau 0000:01:00.0: DRM: base-0: timeout
[   17.499507] nouveau 0000:01:00.0: DRM: GPU lockup - switching to software fbcon
[   19.503322] nouveau 0000:01:00.0: DRM: base-0: timeout
[   19.511588] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
[   19.519163] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0
[   21.564246] nouveau 0000:01:00.0: DRM: base-0: timeout
[   30.324866] nouveau 0000:01:00.0: DRM: EVO timeout
[  639.261601] nouveau 0000:01:00.0: DRM: EVO timeout
Comment 17 Sérgio M. Basto 2018-07-01 18:58:45 UTC
Compare with one good boot:

dmesg | grep -i nouv
[    6.731505] nouveau 0000:01:00.0: NVIDIA G98 (298480a2)
[    6.758367] nouveau 0000:01:00.0: bios: version 62.98.2e.00.08
[    6.781022] nouveau 0000:01:00.0: bios: M0203T not found
[    6.781025] nouveau 0000:01:00.0: bios: M0203E not matched!
[    6.781028] nouveau 0000:01:00.0: fb: 256 MiB DDR2
[    6.831893] nouveau 0000:01:00.0: DRM: VRAM: 256 MiB
[    6.832058] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
[    6.832206] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
[    6.832349] nouveau 0000:01:00.0: DRM: DCB version 4.0
[    6.832493] nouveau 0000:01:00.0: DRM: DCB outp 00: 01000323 00010034
[    6.832638] nouveau 0000:01:00.0: DRM: DCB outp 01: 02011300 00000028
[    6.832783] nouveau 0000:01:00.0: DRM: DCB outp 02: 04032312 00020010
[    6.832928] nouveau 0000:01:00.0: DRM: DCB conn 00: 00000040
[    6.833085] nouveau 0000:01:00.0: DRM: DCB conn 01: 00000100
[    6.833226] nouveau 0000:01:00.0: DRM: DCB conn 02: 00001261
[    6.869977] nouveau 0000:01:00.0: DRM: MM: using M2MF for buffer copies
[    6.945008] nouveau 0000:01:00.0: DRM: allocated 1280x800 fb: 0x50000, bo ffff8a5638d90000
[    6.977103] fbcon: nouveaufb (fb0) is primary device
[    8.542893] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
[    8.546151] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0
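
To compare a bad boot with a good one like the two logs above, a small sketch that counts the display-timeout messages in a saved kernel log. The function name count_nouveau_timeouts and the saved-log workflow (for example `dmesg > bad.txt`) are assumptions for illustration; the patterns are taken from the messages quoted in this report.

```shell
# Count nouveau display-timeout messages in a saved kernel log.
# Usage: count_nouveau_timeouts FILE   (e.g. a file written by: dmesg > FILE)
# Hypothetical helper; matches only the message texts seen in this bug.
count_nouveau_timeouts() {
    # grep -c prints the number of matching lines; it exits with status 1
    # when that number is 0, which is not an error for our purposes.
    grep -c -E 'nouveau .*DRM: (EVO timeout|base-[0-9]+: timeout|core notifier timeout)' "$1" \
        || [ $? -eq 1 ]
}
```

A good boot should report 0; any other number means the display engine timed out at least once.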
Comment 18 Karol Herbst 2018-07-17 10:48:10 UTC
I hit an EVO timeout on one of my machines. If this is a regression I can bisect the kernel and see what it points me to.
Comment 19 Sérgio M. Basto 2018-07-17 12:09:44 UTC
Kernel 4.15.0-git1 boots and kernel 4.15.0-git2 doesn't boot.
The regression started in kernel-4.15.0-git2 and is still not fixed in kernel 4.17.
Comment 20 Karol Herbst 2018-07-20 12:03:59 UTC
(In reply to Sérgio M. Basto from comment #15)
> (In reply to Sérgio M. Basto from comment #14)
> > (*) I'm convinced my issue started before kernel-4.15-rc1
> 
> I try bitsect kernel the patch-4.14-git2.xz is the first bad commit, I tried
> the same commit [1] but with git1 and boots fine , so I assume is
> exclusively a kernel code issue. 
> 
> I don't how numeration works so here is the resume,  kernel 4.15.0-git1
> boots and kernel 4.15.0-git2 don't boot 
> 
> xzdiff -up patch-4.14-git1.xz patch-4.14-git2.xz , have 100 thousand lines ,
> where I find a git tree with these commits ? 
> 
> 
> [1]
> https://src.fedoraproject.org/rpms/kernel/c/
> 2ef4e8028f509354fb5a339bd2f8d0d1df8f2e8d?branch=master

That diff doesn't really help, and because this is an upstream bug tracker, you should rather git bisect the kernel itself, not some packages you installed on your system.

If you can pinpoint a specific git commit inside the kernel, that might be very helpful.
Comment 21 Sérgio M. Basto 2018-07-20 17:50:37 UTC
(In reply to Karol Herbst from comment #20)
> (In reply to Sérgio M. Basto from comment #15)
> > (In reply to Sérgio M. Basto from comment #14)
> > > (*) I'm convinced my issue started before kernel-4.15-rc1
> > 
> > I try bitsect kernel the patch-4.14-git2.xz is the first bad commit, I tried
> > the same commit [1] but with git1 and boots fine , so I assume is
> > exclusively a kernel code issue. 
> > 
> > I don't how numeration works so here is the resume,  kernel 4.15.0-git1
> > boots and kernel 4.15.0-git2 don't boot 
> > 
> > xzdiff -up patch-4.14-git1.xz patch-4.14-git2.xz , have 100 thousand lines ,
> > where I find a git tree with these commits ? 
> > 
> > 
> > [1]
> > https://src.fedoraproject.org/rpms/kernel/c/
> > 2ef4e8028f509354fb5a339bd2f8d0d1df8f2e8d?branch=master
> 
> that diff doesn't really help and because this is an upstream bug tracker
> you should rather git bisect the kernel itself, not some packages you
> installed on your system.
> 
> If you can pinpoint to a specific git commit inside the kernel, that might
> be very helpful.


Where can I find a git tree with these commits (patch-4.14-git1.xz to patch-4.14-git2.xz)?
Comment 22 Dominik 'Rathann' Mierzejewski 2018-08-18 00:34:33 UTC
Looks like I have an affected machine as well. I encountered this when bringing an old installation (F26) up to date.

dmesg from 4.17.14-102.fc27.x86_64:
[...]
Aug 18 01:58:15 kernel: nouveau 0000:01:00.0: NVIDIA G98 (298480a2)
Aug 18 01:58:16 kernel: nouveau 0000:01:00.0: bios: version 62.98.3c.00.44
Aug 18 01:58:16 kernel: nouveau 0000:01:00.0: bios: M0203T not found
Aug 18 01:58:16 kernel: nouveau 0000:01:00.0: bios: M0203E not matched!
Aug 18 01:58:16 kernel: nouveau 0000:01:00.0: fb: 512 MiB DDR2
Aug 18 01:58:16 kernel: nouveau 0000:01:00.0: DRM: VRAM: 512 MiB
Aug 18 01:58:16 kernel: nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
Aug 18 01:58:16 kernel: nouveau 0000:01:00.0: DRM: TMDS table version 2.0
Aug 18 01:58:16 kernel: nouveau 0000:01:00.0: DRM: DCB version 4.0
Aug 18 01:58:16 kernel: nouveau 0000:01:00.0: DRM: DCB outp 00: 01011323 00010034
Aug 18 01:58:16 kernel: nouveau 0000:01:00.0: DRM: DCB outp 01: 02000300 00000028
Aug 18 01:58:16 kernel: nouveau 0000:01:00.0: DRM: DCB outp 02: 02022312 00020030
Aug 18 01:58:16 kernel: nouveau 0000:01:00.0: DRM: DCB conn 00: 00000000
Aug 18 01:58:16 kernel: nouveau 0000:01:00.0: DRM: DCB conn 01: 00000140
Aug 18 01:58:16 kernel: nouveau 0000:01:00.0: DRM: DCB conn 02: 00002261
Aug 18 01:58:16 kernel: nouveau 0000:01:00.0: DRM: DCB conn 07: 00000513
Aug 18 01:58:16 kernel: nouveau 0000:01:00.0: DRM: MM: using M2MF for buffer copies
Aug 18 01:58:16 kernel: nouveau 0000:01:00.0: DRM: allocated 1440x900 fb: 0x50000, bo 000000006f9828c3
Aug 18 01:58:28 kernel: fbcon: nouveaufb (fb0) is primary device
Aug 18 01:58:28 kernel: nouveau 0000:01:00.0: DRM: EVO timeout
Aug 18 01:58:28 kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
Aug 18 01:58:28 kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
Aug 18 01:58:28 kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
Aug 18 01:58:28 kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
Aug 18 01:58:28 kernel: nouveau 0000:01:00.0: DRM: GPU lockup - switching to software fbcon
Aug 18 01:58:28 kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
Aug 18 01:58:28 kernel: nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
Aug 18 01:58:30 kernel: [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0
Aug 18 01:58:30 kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
Aug 18 01:58:32 kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
Aug 18 01:58:37 kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
Aug 18 01:58:53 kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
Aug 18 01:59:10 kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
Aug 18 01:59:11 kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
Aug 18 01:59:13 kernel: nouveau 0000:01:00.0: DRM: base-0: timeout
Aug 18 01:59:52 kernel: nouveau 0000:01:00.0: DRM: EVO timeout
Aug 18 02:04:07 kernel: nouveau 0000:01:00.0: DRM: EVO timeout

4.14.x works fine like in Sérgio's case. Mine is also a single-screen setup (LVDS-1) with no external outputs connected. There are no errors in Xorg log.

I believe this is a different bug than https://bugzilla.redhat.com/show_bug.cgi?id=1547037 and patch https://github.com/skeggsb/nouveau/pull/1/files did not fix this.

Sérgio, are you sure the above patch fixes this for you?
Comment 23 Dominik 'Rathann' Mierzejewski 2018-08-19 22:56:20 UTC
Ok, now I got something more interesting. I booted Fedora kernel 4.19.0-0.rc0.git5.1.fc30.x86_64 (commit 1f7a4c73a739a63b3f108d8eda6f947fdc70dd65). I still got a frozen console, but when Xorg started, the following WARNING appeared in kernel log. Does this give any clues?

[    7.193842] nouveau 0000:01:00.0: NVIDIA G98 (298480a2)
[    7.253541] nouveau 0000:01:00.0: bios: version 62.98.3c.00.44
[    7.301129] nouveau 0000:01:00.0: bios: M0203T not found
[    7.301492] nouveau 0000:01:00.0: bios: M0203E not matched!
[    7.301669] nouveau 0000:01:00.0: fb: 512 MiB DDR2
[    7.719129] nouveau 0000:01:00.0: DRM: VRAM: 512 MiB
[    7.719498] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
[    7.719681] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
[    7.719851] nouveau 0000:01:00.0: DRM: DCB version 4.0
[    7.720014] nouveau 0000:01:00.0: DRM: DCB outp 00: 01011323 00010034
[    7.720182] nouveau 0000:01:00.0: DRM: DCB outp 01: 02000300 00000028
[    7.720387] nouveau 0000:01:00.0: DRM: DCB outp 02: 02022312 00020030
[    7.720567] nouveau 0000:01:00.0: DRM: DCB conn 00: 00000000
[    7.720736] nouveau 0000:01:00.0: DRM: DCB conn 01: 00000140
[    7.720903] nouveau 0000:01:00.0: DRM: DCB conn 02: 00002261
[    7.721066] nouveau 0000:01:00.0: DRM: DCB conn 07: 00000513
[    7.738669] nouveau 0000:01:00.0: DRM: MM: using M2MF for buffer copies
[    7.784149] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[    7.813006] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for HDMI-A-1
[    7.828162] nouveau 0000:01:00.0: DRM: allocated 1440x900 fb: 0x50000, bo (____ptrval____)
[    7.870502] fbcon: nouveaufb (fb0) is primary device
[    7.885068] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[    7.898694] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for HDMI-A-1
[    9.902963] nouveau 0000:01:00.0: DRM: core notifier timeout
[   11.903058] nouveau 0000:01:00.0: DRM: base-0: timeout
[   11.908408] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[   11.947124] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[   11.958855] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for HDMI-A-1
[   11.961609] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
[   11.972641] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0
[   11.978613]  #0: (____ptrval____) (drm_connector_list_iter){.+.+}, at: nouveau_backlight_init+0x63/0x450 [nouveau]
[   22.205362] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[   32.445359] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[   42.685355] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[   52.925595] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[   63.165373] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[   73.405363] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[   83.645378] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[   93.890397] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[  104.125363] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[  107.020185] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[  107.032965] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for HDMI-A-1
[  107.074838] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[  107.086752] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for HDMI-A-1
[  110.113354] nouveau 0000:01:00.0: DRM: core notifier timeout
[  110.634595] ------------[ cut here ]------------
[  110.634608] nouveau 0000:01:00.0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x000000010c412000] [size=4096 bytes]
[  110.634630] WARNING: CPU: 1 PID: 1163 at kernel/dma/debug.c:1230 check_sync+0x136/0x670
[  110.634634] Modules linked in: ip_set nfnetlink ebtable_nat ebtable_broute ccm bridge stp llc ip6table_nat nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables bnep sunrpc arc4 snd_hda_codec_realtek snd_hda_codec_generic ath9k snd_hda_intel ath9k_common snd_hda_codec ath9k_hw snd_hda_core uvcvideo btusb snd_hwdep btrtl snd_seq snd_seq_device btbcm btintel snd_pcm mac80211 videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 ath videobuf2_common cfg80211 videodev media bluetooth snd_timer snd coretemp ecdh_generic joydev r592 soundcore asus_laptop memstick sparse_keymap rfkill input_polldev pcc_cpufreq acpi_cpufreq dm_crypt
[  110.634775]  nouveau ata_generic pata_acpi firewire_ohci firewire_core mxm_wmi wmi i2c_algo_bit drm_kms_helper sdhci_pci cqhci sdhci ttm sis190 serio_raw mmc_core mii crc_itu_t drm sata_sis pata_sis video
[  110.634823] CPU: 1 PID: 1163 Comm: Xorg Not tainted 4.19.0-0.rc0.git5.1.fc30.x86_64 #1
[  110.634827] Hardware name: ASUSTeK Computer Inc.  X71SL               /X71SL     , BIOS 206     11/05/2008
[  110.634832] RIP: 0010:check_sync+0x136/0x670
[  110.634837] Code: 48 85 ed 75 04 48 8b 68 10 48 8b 3c 24 e8 e2 38 56 00 48 89 c6 4d 89 e8 4c 89 f9 48 89 ea 48 c7 c7 a8 18 30 b1 e8 ee 77 f6 ff <0f> 0b 8b 05 9a 75 85 01 85 c0 0f 84 81 04 00 00 48 83 c4 28 4c 89
[  110.634841] RSP: 0018:ffffb980412c7a10 EFLAGS: 00010082
[  110.634847] RAX: 0000000000000000 RBX: ffffffffb2f33410 RCX: 0000000000000006
[  110.634851] RDX: 0000000000000007 RSI: 0000000000000001 RDI: ffff9e12fbbd6ba0
[  110.634855] RBP: ffff9e12f9f82ed0 R08: 0000000000000000 R09: 0000000000000001
[  110.634859] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000286
[  110.634863] R13: 0000000000001000 R14: 0000000000010000 R15: 000000010c412000
[  110.634868] FS:  00007fe0441aeac0(0000) GS:ffff9e12fba00000(0000) knlGS:0000000000000000
[  110.634873] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  110.634877] CR2: 00007fe03c0c8d90 CR3: 0000000114a6e000 CR4: 00000000000006e0
[  110.634881] Call Trace:
[  110.634897]  debug_dma_sync_single_for_device+0x7b/0x90
[  110.634915]  ? ttm_bo_mem_compat+0x23/0x60 [ttm]
[  110.634925]  ? kfree+0x188/0x320
[  110.634932]  ? krealloc+0x25/0xa0
[  110.635040]  nouveau_bo_sync_for_device+0x6a/0xb0 [nouveau]
[  110.635098]  nouveau_bo_validate+0x71/0x90 [nouveau]
[  110.635154]  nouveau_gem_ioctl_pushbuf+0x8a5/0x1ad0 [nouveau]
[  110.635222]  ? nouveau_gem_ioctl_new+0xe0/0xe0 [nouveau]
[  110.635240]  ? drm_ioctl_kernel+0xa5/0xf0 [drm]
[  110.635240]  ? nouveau_gem_ioctl_new+0xe0/0xe0 [nouveau]
[  110.635240]  drm_ioctl_kernel+0xa5/0xf0 [drm]
[  110.635240]  drm_ioctl+0x1fc/0x390 [drm]
[  110.635240]  ? nouveau_gem_ioctl_new+0xe0/0xe0 [nouveau]
[  110.635240]  nouveau_drm_ioctl+0x65/0xc0 [nouveau]
[  110.635240]  do_vfs_ioctl+0xa5/0x6e0
[  110.635240]  ksys_ioctl+0x60/0x90
[  110.635240]  __x64_sys_ioctl+0x16/0x20
[  110.635240]  do_syscall_64+0x60/0x1f0
[  110.635240]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  110.635240] RIP: 0033:0x7fe041422ec7
[  110.635240] Code: 00 00 90 48 8b 05 d9 7f 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 7f 2c 00 f7 d8 64 89 01 48
[  110.635240] RSP: 002b:00007ffcd424fc68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  110.635240] RAX: ffffffffffffffda RBX: 0000000000d5ae98 RCX: 00007fe041422ec7
[  110.635240] RDX: 00007ffcd424fcd0 RSI: 00000000c0406481 RDI: 000000000000000e
[  110.635240] RBP: 00007ffcd424fcd0 R08: 0000000000000000 R09: 0000000000d59f20
[  110.635240] R10: 0000000000d6be98 R11: 0000000000000246 R12: 00000000c0406481
[  110.635240] R13: 000000000000000e R14: 0000000000d5a070 R15: 0000000000d59f20
[  110.635240] irq event stamp: 0
[  110.635240] hardirqs last  enabled at (0): [<0000000000000000>]           (null)
[  110.635240] hardirqs last disabled at (0): [<ffffffffb00bb817>] copy_process.part.28+0x747/0x1e70
[  110.635240] softirqs last  enabled at (0): [<ffffffffb00bb817>] copy_process.part.28+0x747/0x1e70
[  110.635240] softirqs last disabled at (0): [<0000000000000000>]           (null)
[  110.635240] ---[ end trace a1450e59d31d3810 ]---
[  114.365372] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[  117.080247] nouveau 0000:01:00.0: DRM: core notifier timeout
[  119.080664] nouveau 0000:01:00.0: DRM: base-0: timeout
[  122.562843] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[  122.574617] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for HDMI-A-1
[  124.605473] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[  134.845626] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[  145.085449] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[  155.325447] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[  165.565443] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[  175.805469] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
[  186.045466] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for VGA-1
Comment 24 Sérgio M. Basto 2018-08-20 13:00:39 UTC
BTW, I started bisecting the kernel, but there are at least two phases. In the first phase the computer hangs but graphics still work [1]. This week I hope to start the second phase and find the commit where the EVO timeout really started. I tested the Fedora binaries first, and it is between kernel-4.15.0-0.rc4.git4.1.fc28.x86_64 and kernel-4.15.0-0.rc6.git0.1.fc28.x86_64 (build dates 22 Dec 2017 and 01 Jan 2018). That is the state of my investigation.

Thanks.



[1]
# bad: [b18d62891aaff49d0ee8367d4b6bb9452469f807] Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
# good: [7d58e1c9059eefe0066c5acf2ffa582f6f0180e3] Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect start 'b18d62891aaff49d0ee8367d4b6bb9452469f807' '7d58e1c9059eefe0066c5acf2ffa582f6f0180e3'
# bad: [331b57d14829c49d75076779cdc54d7e4537bbf0] Merge branch 'irq/urgent' into x86/apic
git bisect bad 331b57d14829c49d75076779cdc54d7e4537bbf0
# good: [e43b3b58548051f8809391eb7bec7a27ed3003ea] genirq/cpuhotplug: Enforce affinity setting on startup of managed irqs
git bisect good e43b3b58548051f8809391eb7bec7a27ed3003ea
# good: [ec0f7cd273dc41ab28bba703cac82690ea5f2863] genirq/matrix: Add tracepoints
git bisect good ec0f7cd273dc41ab28bba703cac82690ea5f2863
# skip: [f0cc6ccaf7ba42a1247fe5a9244b6009a3beddd5] x86/vector: Simplify the CPU hotplug vector update
git bisect skip f0cc6ccaf7ba42a1247fe5a9244b6009a3beddd5
# bad: [baab1e84b1124bfd3e40ef6c8e05b2a15136e3d5] x86/apic: Remove unused callbacks
git bisect bad baab1e84b1124bfd3e40ef6c8e05b2a15136e3d5
# good: [83a105229c59e433409e4d86e9bb915ca281235c] x86/apic: Move common APIC callbacks
git bisect good 83a105229c59e433409e4d86e9bb915ca281235c
# bad: [3534be05e4adc303d41fae65901598695adea685] x86/ioapic: Mark legacy vectors at reallocation time
git bisect bad 3534be05e4adc303d41fae65901598695adea685
# bad: [ef9e56d894eab99a33a06b96ba8057afa67d3702] x86/ioapic: Remove obsolete post hotplug update
git bisect bad ef9e56d894eab99a33a06b96ba8057afa67d3702
# good: [c1d1ee9ac1793d939ba1a1322767cc5f77a5b8fe] x86/apic: Get rid of apic->target_cpus
git bisect good c1d1ee9ac1793d939ba1a1322767cc5f77a5b8fe
# good: [7854f82293e99f6bb3df793a2f579db4670ba71b] x86/vector: Rename used_vectors to system_vectors
git bisect good 7854f82293e99f6bb3df793a2f579db4670ba71b
# bad: [fdba46ffb4c203b6e6794163493fd310f98bb4be] x86/apic: Get rid of multi CPU affinity
git bisect bad fdba46ffb4c203b6e6794163493fd310f98bb4be
# first bad commit: [fdba46ffb4c203b6e6794163493fd310f98bb4be] x86/apic: Get rid of multi CPU affinity
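
A bisect log like the one above can be saved with `git bisect log > FILE`. As a minimal sketch, the culprit hash can then be pulled out of the saved file for use in later commands. The function name first_bad_commit is made up for illustration, and it assumes the "# first bad commit: [sha]" line format shown in this log.

```shell
# Print the full hash from the "# first bad commit: [<sha>] ..." line
# of a saved bisect log. Prints nothing if the log has no such line.
# Hypothetical helper; assumes the log format shown in this comment.
first_bad_commit() {
    sed -n 's/^# first bad commit: \[\([0-9a-f]\{40\}\)\].*/\1/p' "$1"
}
```

The output can be passed to `git show` inside the kernel tree to read the commit message and diff.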
Comment 25 Dominik 'Rathann' Mierzejewski 2018-08-21 16:35:06 UTC
I can confirm Sergio's findings so far, though kernel-4.15.0-0.rc4.git4.1.fc28.x86_64 (based on git commit ead68f216110) hangs completely upon switching fbcon to nouveaufb if I have "rhgb" in the kernel command line. I won't be able to provide more details for some time as I have to give back the machine where this can be reproduced. I'll keep my fingers crossed for Sergio.
Comment 26 Sérgio M. Basto 2018-08-21 17:25:26 UTC
(In reply to Dominik 'Rathann' Mierzejewski from comment #25)
> I can confirm Sergio's findings so far, though
> kernel-4.15.0-0.rc4.git4.1.fc28.x86_64 (based on git commit ead68f216110)
> hangs completely upon switching fbcon to nouveaufb if I have "rhgb" in the
> kernel command line. I won't be able to provide more details for some time
> as I have to give back the machine where this can be reproduced. I'll keep
> my fingers crossed for Sergio.

Correct; here [1] are the kernels I tested.

I just started:

git bisect start v4.15-rc6 ead68f216110
Bisecting: 173 revisions left to test after this (roughly 8 steps)



[1]
kernel-4.15.0-0.rc3.git4.1.fc28
kernel-4.15.0-0.rc4.git0.1.fc28 good graphics, bad interrupts 
kernel-4.15.0-0.rc4.git1.1.fc28
kernel-4.15.0-0.rc4.git2.1.fc28
kernel-4.15.0-0.rc4.git3.1.fc28 good graphics, bad interrupts 
kernel-4.15.0-0.rc4.git4.1.fc28 good graphics, bad interrupts 
kernel-4.15.0-0.rc6.git0.1.fc28 bad graphics
kernel-4.15.0-0.rc6.git0.2.fc28 bad graphics
kernel-4.15.0-0.rc6.git0.3.fc28
kernel-4.15.0-0.rc6.git1.1.fc28 good interrupts (but bad graphics) 
kernel-4.15.0-0.rc6.git2.1.fc28
Comment 27 Sérgio M. Basto 2018-08-29 00:06:14 UTC
Created attachment 141327 [details] [review]
the commit which start evo timeout

And the result of git bisect start v4.15-rc6 ead68f216110 is [1].

This is the commit where switching graphics at boot starts to fail. I'm thinking of reverting it in kernel-4.15.0-0.rc6.git1.1.fc28, which already has the good interrupts, to see if I can boot correctly again.
Or should I look for the commit that fixes interrupt delivery?
What do you think?


[1]
# bad: [30a7acd573899fd8b8ac39236eff6468b195ac7d] Linux 4.15-rc6
# good: [ead68f216110170ec729e2c4dec0aad6d38259d7] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

(I skip some steps here because they are all good; only almost the last one is the bad commit...)

# good: [64e05d118e357bb52a084b609436acf292ce7944] x86/apic: Update the 'apic=' description of setting APIC driver

# bad: [f39d7d78b70e0f39facb1e4fab77ad3df5c52a35] Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

# bad: [a31e58e129f73ab5b04016330b13ed51fde7a961] x86/apic: Switch all APICs to Fixed delivery mode

# first bad commit: [a31e58e129f73ab5b04016330b13ed51fde7a961] x86/apic: Switch all APICs to Fixed delivery mode
Comment 28 Sérgio M. Basto 2018-08-29 00:16:26 UTC
I just noticed that the log of the commit I attached has this sentence [1], and my first bisect ended with [2]. They match!

[1]
Fixes: fdba46ffb4c2 ("x86/apic: Get rid of multi CPU affinity")
    Reported-by: vcaputo@pengaru.com

[2]
# first bad commit: [fdba46ffb4c203b6e6794163493fd310f98bb4be] x86/apic: Get rid of multi CPU affinity
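
The match between [1] and [2] can also be checked mechanically. A rough sketch, assuming the patch was saved to a file with its commit message intact; the function name fixes_tag_matches is hypothetical.

```shell
# Succeed if the first "Fixes: <abbrev>" tag in a patch file is a prefix
# of the given full commit hash.
# Usage: fixes_tag_matches PATCH_FILE FULL_SHA
# Hypothetical helper for illustration only.
fixes_tag_matches() {
    _abbrev=$(sed -n 's/^[[:space:]]*Fixes: \([0-9a-f]\{7,40\}\).*/\1/p' "$1" | head -n 1)
    [ -n "$_abbrev" ] || return 1
    case "$2" in
        "$_abbrev"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Called with the attached patch and fdba46ffb4c203b6e6794163493fd310f98bb4be, it should succeed if the Fixes tag is the one quoted in [1].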
Comment 29 Sérgio M. Basto 2018-12-04 13:18:24 UTC
In summary:

My first bad commit: [fdba46ffb4c203b6e6794163493fd310f98bb4be] x86/apic: Get rid of multi CPU affinity (in kernel 4.15.0-git2)

My second bad commit: [a31e58e129f73ab5b04016330b13ed51fde7a961] x86/apic: Switch all APICs to Fixed delivery mode (in kernel-4.15.0-0.rc6.git1.1) [1]. Its commit message says that it fixes fdba46ffb4c2 ("x86/apic: Get rid of multi CPU affinity").


[1]
https://bugs.freedesktop.org/attachment.cgi?id=141327

