Bug 92274

Summary: nouveau black screen and errors with two monitors attached
Product: xorg Reporter: Joseph Thommes <joseph-thommes>
Component: Driver/nouveau Assignee: Nouveau Project <nouveau>
Status: RESOLVED FIXED QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium CC: joseph-thommes
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Attachments:
Description | Flags
dmesg from /var/log/dmesg | none
VBIOS | none
/var/log/dmesg with kernel command line added drm.debug=14 log_buf_len=16M | none
/var/log/dmesg after changes in ./drm/.../hdmigk104.c | none
kernel log after upgrade to gentoo-4.1.15-r1 | none
dmesg of the 4.4 kernel from the gentoo sys-kernel/git-sources | none

Description Joseph Thommes 2015-10-04 13:27:40 UTC
Hello,
I switched from the proprietary nvidia-drivers to nouveau under Gentoo.
I use a Dual-Monitor Setup: one is connected to the DVI-I and the other one to
the HDMI output of my Nvidia GTX 770.
Under my Manjaro installation I use the proprietary driver and everything works;
under Gentoo everything was fine with the nvidia-drivers, too.
Now I can boot without errors into Gentoo when only one of the monitors is connected, no matter which one.
However, when both monitors are attached to the graphics card,
I cannot access a tty or my graphical login manager:
the screens turn off showing "no input signal".
Since I cannot boot into my Gentoo installation when the error happens,
I chrooted into it and used cat /var/log/dmesg.
I know that this is not what the nouveau reporting guide suggests for obtaining a kernel log,
but if I run dmesg from my chrooted environment I obviously get the kernel log of my Manjaro boot.
If there is another better way for obtaining the kernel log, please let me know.

VBIOS is following in a few minutes.

If there is anything else missing, let me know.

lspci | grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 770] (rev a1)

kernel version: x86-64 4.0.5-linux-gentoo
nouveau version: x11-drivers/xf86-video-nouveau-1.0.11
libdrm version: x11-libs/libdrm-2.4.59
mesa version: media-libs/mesa-10.3.7-r1
X server: x11-base/xorg-server-1.16.4
Comment 1 Joseph Thommes 2015-10-04 13:30:35 UTC
Created attachment 118654 [details]
dmesg from /var/log/dmesg
Comment 2 Joseph Thommes 2015-10-04 13:41:08 UTC
Created attachment 118655 [details]
VBIOS
Comment 3 Joseph Thommes 2015-10-04 13:45:24 UTC
Not sure if important, but the vbios is from Gentoo when only one monitor was connected.
Comment 4 Joseph Thommes 2015-10-05 17:01:03 UTC
Created attachment 118677 [details]
/var/log/dmesg with kernel command line added drm.debug=14 log_buf_len=16M
Comment 5 Joseph Thommes 2015-10-05 17:05:00 UTC
Keeping in mind that this could also be a Gentoo-specific bug or a badly configured kernel, do you have any suggestions for my kernel config?
I think I followed all the instructions to get nouveau working, since it
works with only one monitor connected, but it's also possible
that I forgot something, so feel free to ask or suggest anything that helps solve this.
Comment 6 Ilia Mirkin 2015-10-05 17:27:01 UTC
In your first dmesg, you have:

video=HDMI-0:e video=DVI-I-1:e

Are these required for something? Force-enabling this stuff just leads to trouble down the line.

Interesting....

[    8.872612] nouveau E[    PBUS][0000:01:00.0] MMIO read of 0x00000000 FAULT at 0x690400 [ IBUS ]
[    8.872781] nouveau E[    PBUS][0000:01:00.0] MMIO write of 0xbadf1001 FAULT at 0x690400 [ IBUS ]
[    8.872869] nouveau E[    PBUS][0000:01:00.0] MMIO read of 0x00000000 FAULT at 0x6904c0 [ IBUS ]
[    8.873043] nouveau E[    PBUS][0000:01:00.0] MMIO write of 0xbadf1001 FAULT at 0x6904c0 [ IBUS ]

in drm/nouveau/nvkm/engine/disp/hdmigk104.c:

        /* AVI InfoFrame */
        nvkm_mask(device, 0x690000 + hdmi, 0x00000001, 0x00000000);
        nvkm_wr32(device, 0x690008 + hdmi, 0x000d0282);
        nvkm_wr32(device, 0x69000c + hdmi, 0x0000006f);
        nvkm_wr32(device, 0x690010 + hdmi, 0x00000000);
        nvkm_wr32(device, 0x690014 + hdmi, 0x00000000);
        nvkm_wr32(device, 0x690018 + hdmi, 0x00000000);
        nvkm_mask(device, 0x690000 + hdmi, 0x00000001, 0x00000001);

        /* ??? InfoFrame? */
        nvkm_mask(device, 0x6900c0 + hdmi, 0x00000001, 0x00000000);
        nvkm_wr32(device, 0x6900cc + hdmi, 0x00000010);
        nvkm_mask(device, 0x6900c0 + hdmi, 0x00000001, 0x00000001);

690000 + hdmi / 6900c0 + hdmi each get 2 reads/writes, but you only see one error, both times on the enable. I wonder if this code doesn't need to be more like

u32 foo = nvkm_rd32(device, 0x690000 + hdmi);
nvkm_wr32(device, 0x690000 + hdmi, foo & ~1);
...
nvkm_wr32(device, 0x690000 + hdmi, foo | 1);

and same sort of thing for the second section. Are you comfortable enough with code to try this on your own, or do you need patches?
Comment 7 Joseph Thommes 2015-10-05 18:40:20 UTC
Oh, these two video options were for testing and did not show any effect.
I accidentally took this kernel log, but the other ones without these options looked the same.
I'm not that into compiling everything on my own (except for some of the easy or
self-written parts; or when Gentoo does it for me ;)), but I'm willing to learn
and have always been looking for a good reason to get myself to do something like this
instead of thinking that others will do it or that it does not have enough advantages.

So, just to check that I got it right:
I get the source code from the nouveau git repository,
change the affected code parts, follow the instructions in the readme etc.
and compile nouveau as a module to load it.

Is it enough that I just download a snapshot and do an out-of-tree compilation?
Is it compatible with my kernel?
(because it says it contains the 3.x versions and my version is 4)
Which settings do I need to enable in my kernel that are not trivial?

Sorry for so many questions, but I'd really like to get this done.
Comment 8 Ilia Mirkin 2015-10-05 18:59:02 UTC
(In reply to joseph-thommes from comment #7)

Doesn't matter what code you grab, it should basically be the same for the past many kernel versions. Nouveau got a bit of a rewrite in 4.3, so the code might look a bit different if you use an older kernel, but it's the same idea.

If this is your first time compiling a kernel, then it might get a bit tricky. First grab a kernel, compile it, make sure you can boot it, and *then* make the relevant changes.
Comment 9 Joseph Thommes 2015-10-05 19:18:55 UTC
Oh I misunderstood something about the kernel sources :D.
Thought this included more afford ;).
I configured my current kernel by myself and it is bootable except for
the multi-monitor problem.
I'm going to try it now.
Comment 10 Joseph Thommes 2015-10-05 19:52:12 UTC
Ok, I'm sorry to ask again, but I have no idea what the functions nvkm_mask, nvkm_rd32, and nvkm_wr32 expect, return or do, so I cannot
work out by myself what to do...
(By the way in my sources they are just named nv_*.)
So where can I find documentation about it or where are they defined?
I'm familiar with (userspace) C, but not in these dimensions
and I don't know how to locate the definition of a function except for
looking in all included headers recursively...
Did I understand you correctly, that I should make the changes like this:
(again I've got no idea what these functions do, so it might be nonsense)

/* AVI InfoFrame */

        u32 foo = nvkm_rd32(device, 0x690000 + hdmi);
        nvkm_wr32(device, 0x690000 + hdmi, foo & ~1);
        nvkm_wr32(device, 0x69000c + hdmi, 0x0000006f);
        nvkm_wr32(device, 0x690010 + hdmi, 0x00000000);
        nvkm_wr32(device, 0x690014 + hdmi, 0x00000000);
        nvkm_wr32(device, 0x690018 + hdmi, 0x00000000);
        nvkm_wr32(device, 0x690000 + hdmi, foo | 1);

        /* ??? InfoFrame? */
        u32 foo = nvkm_rd32(device, 0x6900c0 + hdmi);        
        nvkm_wr32(device, 0x6900c0 + hdmi, foo & ~1);
        nvkm_wr32(device, 0x6900cc + hdmi, 0x00000010);
        nvkm_wr32(device, 0x6900c0 + hdmi, foo | 1);
Comment 11 Joseph Thommes 2015-10-05 19:54:26 UTC
And of course I meant effort not afford
Comment 12 Ilia Mirkin 2015-10-05 20:06:04 UTC
(In reply to joseph-thommes from comment #10)

Seems right to me, except of course it won't compile since you have two variables named "foo" being declared in the same scope. Remove the second 'u32' and you should be good to go.
Comment 13 Joseph Thommes 2015-10-05 20:27:27 UTC
Ok I changed the file, it compiled fine (with a little C90 standard warning).
On bootup I still get the message at the HDMI Monitor: No input. The DVI Monitor just stays black, not sure if this is specific to this monitor, but I think it usually shows such a message, too. So the symptoms are the same.
In the last post I accidentally removed one line too many, which I did not do in the source code I used to compile it.
The kernel log has changed a bit.
Comment 14 Joseph Thommes 2015-10-05 20:29:28 UTC
Created attachment 118683 [details]
/var/log/dmesg after changes in ./drm/.../hdmigk104.c
Comment 15 Joseph Thommes 2015-10-12 15:58:45 UTC
I changed to the nouveau driver in Manjaro and the same issue came up.
Unfortunately I'm not able to obtain a kernel log in a chrooted environment in Manjaro, so there is none attached. If someone knows how to do this, please just answer. (There's no file like dmesg in other distributions in /var/log/).
Anyway, the symptoms are the same, so it is not a gentooish bug, but really a bug in nouveau...
Comment 16 Joseph Thommes 2015-10-26 16:56:38 UTC
Just saw this (https://bugs.freedesktop.org/show_bug.cgi?id=91705) thread which is about the same problem on another card, so it's not just this card. I don't know if this helps, but I hope so. (You probably know this already, but just in case.)
Comment 17 Joseph Thommes 2015-10-26 18:01:56 UTC
Just compiled the new gentoo-4.0.9 sources and the problem persists.
Comment 18 Joseph Thommes 2015-10-26 18:03:13 UTC
When I changed it in Manjaro it was using 4.1.11 I think.
Comment 19 Joseph Thommes 2016-01-23 07:54:32 UTC
Created attachment 121229 [details]
kernel log after upgrade to gentoo-4.1.15-r1

The monitors still turn off and I think it's the exact same thing as before.
Just wanted to keep this up-to-date.
Comment 20 Pierre Moreau 2016-01-23 10:21:48 UTC
(In reply to Joseph Thommes from comment #19)

The latest release would be 4.4, which includes the big rewrite from 4.3, whereas 4.1.15 will only include fixes that were backported (and Nouveau doesn't send many fixes to stable versions).

Sorry I can't help more than that: I don't have the correct setup and have no knowledge about that part of Nouveau.
Comment 21 Joseph Thommes 2016-01-25 18:19:10 UTC
I meant the latest kernel release in sys-kernel/gentoo-sources, but I think I'll try some more recent sources to find out whether the problem is fixed or not, and I'll keep this thread up-to-date.
Comment 22 Joseph Thommes 2016-01-25 21:51:44 UTC
Created attachment 121284 [details]
dmesg of the 4.4 kernel from the gentoo sys-kernel/git-sources

I upgraded to the 4.4 release of the kernel today, but still the same error happens. I think the syntax of the output was slightly changed, but all in all it's still the same thing.
If you need any other information, tell me.
Comment 23 Joseph Thommes 2016-03-08 10:55:48 UTC
Hi,
I now have both monitors attached via DVI and now the system boots up. However, the monitors now both display the same picture with the same frequency and resolution - they are mirrored. I haven't really tried to solve this yet and I am not sure whether this is an X or a nouveau problem, but my X configuration is apparently not read at all. On the other hand, this mirroring is also the case while booting. So I need to figure out what the problem is and let you know what it was.
Given all this, and since I don't know how the driver is structured, I think you can narrow the problem down to somewhere, but I don't really know where.
I hope this somehow helps.
Comment 24 Joseph Thommes 2016-03-08 11:05:03 UTC
Ok, the same problem occurs in the consoles, too.
So it's not an X problem.
Comment 25 Pierre Moreau 2016-03-08 15:25:22 UTC
AFAIK, the mirrored behaviour in console mode is expected. What happens if you run `xrandr --output ID_OF_YOUR_RIGHT_SCREEN --right-of ID_OF_YOUR_LEFT_SCREEN`, with the id being of the form DVI-I-1?

Can you link your Xorg config file here, and tell us where it is placed on your system?
Comment 26 Joseph Thommes 2016-03-09 19:32:26 UTC
Ok the fact that the mirroring was also present in console mode made me not even check for the X configuration. Thanks for this tip. I can calibrate and configure the monitors now. Sometimes the simplest things can confuse one :D
But the bug concerning the HDMI and DVI combination was still not fixed before I bought the DVI-cable, so I'm leaving this thread open. :)
Comment 27 Joseph Thommes 2017-02-27 13:49:16 UTC
I switched back to nouveau (kernel 4.10.1) after a time using the proprietary nvidia driver and now X and tty work even with three monitors attached. But now the switching between X and tty does not work properly, so I'll open a new thread for that.
