Bug 60680

Summary: [NV96] HDMI is connected and has mode, TV says "no signal"
Product: xorg Reporter: Pekka Paalanen <ppaalanen>
Component: Driver/nouveau Assignee: Nouveau Project <nouveau>
Status: RESOLVED FIXED QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium CC: fd_mitch
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
Attachments (description, flags):
- kernel log with nouveau.debug=trace, v3.8-rc5-51-g4f84a11 (flags: none)
- X log, to go with kernel log v3.8-rc5-51-g4f84a11 (flags: none)
- video bios image, extracted from /sys/bus/pci/devices/*/rom (flags: none)
- xrandr --prop output, when both LVDS and HDMI configured --auto (flags: none)
- VBIOS from /sys/kernel/debug/dri/0/vbios.rom (flags: none)
- Kernel log 3.5.2, HDMI clone ok, drm.debug=0xe (flags: none)
- vbios-pq1.patch (flags: none)
- kernel 3.12.6 with vbios-pq1.patch log (flags: none)

Description Pekka Paalanen 2013-02-11 18:20:18 UTC
Created attachment 74633 [details]
kernel log with nouveau.debug=trace, v3.8-rc5-51-g4f84a11

TV does not receive signal when connected via HDMI to my Asus G50V laptop. Laptop is single-GPU, NV96. xrandr shows HDMI as connected, having a mode, and it should just work but doesn't. TV does not detect a signal.

LVDS works just fine.

I have tried:
- switching HDMI on and off via xrandr
- setting different video modes with xrandr for the HDMI
- booting with and without HDMI cable connected
- hotplugging HDMI

DDX 1.0.6.
Kernel is the nouveau kernel based on 3.8.0-rc5, v3.8-rc5-51-g4f84a11
commit 4f84a118d8872f21e1d5b78241e0e6f31b79e46b
Author: Marcin Slusarz <marcin.slusarz@gmail.com>
Date:   Tue Feb 5 20:44:19 2013 +0100

    drm/nouveau/therm: reduce stack usage of nouveau_therm_ic_ctor


The nvidia proprietary drivers can run the TV via HDMI fine.

This is the first time I am trying to run nouveau with HDMI on this machine.

In the kernel log at timestamp 93, I was in X and ran:
$ xrandr --output LVDS-1 --auto --output HDMI-1 --auto --right-of LVDS-1
Before that, X had seemed to set up a clone mode at the laptop panel's resolution on the TV as well.


Similar bugs:
#37085 - I do not know if my case is a regression, different chips.
#43939 - I am using a real TV, not a DVI monitor.
#56601 - I have not even attempted audio yet, video only.
Comment 1 Pekka Paalanen 2013-02-11 18:23:03 UTC
Created attachment 74634 [details]
X log, to go with kernel log v3.8-rc5-51-g4f84a11
Comment 2 Pekka Paalanen 2013-02-11 18:29:11 UTC
Created attachment 74637 [details]
video bios image, extracted from /sys/bus/pci/devices/*/rom
Comment 3 Pekka Paalanen 2013-02-11 18:49:45 UTC
Created attachment 74641 [details]
xrandr --prop output, when both LVDS and HDMI configured --auto

And the TV says "no signal". Yes, I'm fairly sure I've selected the right input on TV.
Comment 4 Ben Skeggs 2013-02-12 08:50:05 UTC
I'm not sure if it'll help, but there's an additional commit (fix missing sor modectrl sync flags) which might be related sitting in git.
Comment 5 Pekka Paalanen 2013-02-13 20:04:47 UTC
(In reply to comment #4)
> I'm not sure if it'll help, but there's an additional commit (fix missing
> sor modectrl sync flags) which might be related sitting in git.

I tried a new complete kernel with that patch, but unfortunately it doesn't help.

Btw. the attached kernel log has this line:
[   10.194866] farn kernel: nouveau E[    PBUS][0000:01:00.0] MMIO write of 0x00000000 FAULT at 0x400724
and a few times I have been left with no outputs at all, when changing modes on the 3.8-rc kernel, but that is a different story.


Now I know this is a regression: in 3.5.2-gentoo kernel Nouveau can drive the HDMI output fine.
Comment 6 russianneuromancer 2013-03-27 13:06:25 UTC
I have the same issue on an Acer Aspire 5920G laptop with an NV84 GPU and an attached HDMI display. If Pekka's information about the affected hardware is not enough and information from me is also needed, just ask and I'll upload it.
Comment 7 Ilia Mirkin 2013-08-20 23:12:16 UTC
Can you confirm that this is still an issue with 3.11-rc6? If so, a bisect between 3.5 and 3.8 would be great. Probably some infoframe validation thing, the TV doesn't like it. (My personal bet would be the "move hdmi to new system" commit...) Note that the bisect would only have to go over drivers/gpu/drm/nouveau, not the whole kernel, so it shouldn't take *too* many tries.
Comment 8 Pekka Paalanen 2013-08-21 05:58:55 UTC
(In reply to comment #7)
> Can you confirm that this is still an issue with 3.11-rc6? If so, a bisect
> between 3.5 and 3.8 would be great. Probably some infoframe validation
> thing, the TV doesn't like it. (My personal bet would be the "move hdmi to
> new system" commit...) Note that the bisect would only have to go over
> drivers/gpu/drm/nouveau, not the whole kernel, so it shouldn't take *too*
> many tries.

Sure, I'll see what I have time to do. Damn, it's been 6 months since I reported it, I could have figured I had time to bisect, but meh :-D

I was slightly discouraged because the bisection would cross the great rewrite, but maybe that was unwarranted.
Comment 9 Pekka Paalanen 2013-08-24 08:30:02 UTC
In preparation for bisection, here are new confirmed data points:

- 3.5.2 vanilla kernel (i.e. not Gentoo-patches) works.
- 3.11-rc6 vanilla kernel does not work.

I am specifically looking at whether boot-time fbcon after nouveau loads shows up on HDMI, whether X by default shows up on HDMI, and whether I can get the HDMI output exclusively.

On 3.5.2, fbcon is cloned, X by default is cloned, and HDMI exclusive works.

On 3.11-rc6, fbcon shows only on LVDS, X by default shows only on LVDS, and HDMI exclusive does not show up anywhere. (off-topic: default resolution was not native for LVDS anymore.)

The HDMI exclusive is set up by:
$ xrandr --output LVDS-1 --off --output HDMI-1 --mode 1920x1080 --set 'scaling mode' 'None'

I did also try dual setup:
$ xrandr --output LVDS-1 --auto --output HDMI-1 --mode 1920x1080 --set 'scaling mode' 'None' --right-of LVDS-1
but that does not work even on 3.5.2: LVDS works but the TV says no signal. Apparently something in manual dual-screen setup gets messed up, while boot-up dual screen works. But that is off-topic here.

Therefore, my bisection will concentrate on whether HDMI exclusive at full-HD resolution works or not.
Comment 10 Pekka Paalanen 2013-09-06 12:55:07 UTC
Bisection complete:

cb75d97e9c77743ecfcc43375be135a55a4d9b25 is the first bad commit
commit cb75d97e9c77743ecfcc43375be135a55a4d9b25
Author: Ben Skeggs <bskeggs@redhat.com>
Date:   Wed Jul 11 10:44:20 2012 +1000

    drm/nouveau: implement devinit subdev, and new init table parser
    
    v2:
    - make sure not to execute display scripts unless resuming
    
    Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


The results during bisection were consistent:
- if good, fbcon appears cloned to HDMI at boot, X appears cloned to HDMI by default, and the HDMI-only full-HD setup works
- if bad, TV always reports "no signal"


$ git bisect log
# bad: [19f949f52599ba7c3f67a5897ac6be14bfcb1200] Linux 3.8
# good: [684012d815c70359162d8b9cc9879b83855e59bf] Linux 3.5.2
git bisect start 'v3.8' 'v3.5.2' '--' 'drivers/gpu/drm/nouveau'
# good: [28a33cbc24e4256c143dce96c7d93bf423229f92] Linux 3.5
git bisect good 28a33cbc24e4256c143dce96c7d93bf423229f92
# bad: [ae168d973f5fa3f3467dc5600f74a0f03e3cafe7] Merge branch 'drm-nouveau-fixes' of git://git.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes
git bisect bad ae168d973f5fa3f3467dc5600f74a0f03e3cafe7
# bad: [c4837d27945b9b607b5a7274a33059e73d1a7831] drm/nouveau/core: remove some left-over pieces from the porting process
git bisect bad c4837d27945b9b607b5a7274a33059e73d1a7831
# good: [70c0f263cc2eb12e02506eb75f0a71490e7dea4d] drm/nouveau/bios: pull in basic vbios subdev, more to come later
git bisect good 70c0f263cc2eb12e02506eb75f0a71490e7dea4d
# bad: [3863c9bc887e9638a9d905d55f6038641ece78d6] drm/nouveau/instmem: completely new implementation, as a subdev module
git bisect bad 3863c9bc887e9638a9d905d55f6038641ece78d6
# bad: [0134a97979a0abc1c756b0fe491e074693c2bdf5] drm/nv50-/instmem: allocate vram for kernel objects from end of vram
git bisect bad 0134a97979a0abc1c756b0fe491e074693c2bdf5
# good: [70790f4f819875e8f390871fd15bbbf823f28e1b] drm/nouveau/clock: pull in the implementation from all over the place
git bisect good 70790f4f819875e8f390871fd15bbbf823f28e1b
# bad: [7d9115dee978e8540734c456c925d71a37752b8d] drm/nouveau/mc: port to subdev interfaces
git bisect bad 7d9115dee978e8540734c456c925d71a37752b8d
# bad: [cb75d97e9c77743ecfcc43375be135a55a4d9b25] drm/nouveau: implement devinit subdev, and new init table parser
git bisect bad cb75d97e9c77743ecfcc43375be135a55a4d9b25
Comment 11 Ilia Mirkin 2013-09-06 14:47:18 UTC
Well, this commit is obviously enormous. And I'm guessing the difference is something stupid. Could you run mmiotraces (of nouveau) before and after the offending commit? That should help narrow down the difference.
Comment 12 Pekka Paalanen 2013-09-22 09:45:09 UTC
http://people.freedesktop.org/~pq/mmiotrace-fdo-bug-60680.tar.xz (8.8 MB)

You can find a package containing mmiotraces from the last good and the first bad kernel revision in the above URL. The traces contain the following marks.

Good:
MARK 11.701343 before begin
MARK 38.640046 nouveau loaded, TV has fbcon
MARK 39.232127 going to start X
MARK 64.224171 X is up, TV has picture
MARK 87.013208 going to change to TV-only
MARK 93.579942 TV is on with full-HD mode
MARK 97.452990 switched to LVDS-only

Bad:
MARK 16.029575 before begin
MARK 46.718026 nouveau loaded, no signal on TV
MARK 47.340368 going to start X
MARK 71.961538 X is up, no signal on TV
MARK 74.445054 going to switch to TV-only full-HD
MARK 84.561667 done, I'm blind
MARK 87.738044 switched to LVDS-only

The command to set the TV-only full-HD mode is:
$ xrandr --output LVDS-1 --off --output HDMI-1 --mode 1920x1080 --set 'scaling mode' 'None'

The command to set the LVDS-only mode is:
$ xrandr --output LVDS-1 --auto --output HDMI-1 --off

Note that by default Nouveau (fbcon and X) lights up both outputs as clones when it works, just like I've described before.
Comment 13 Ilia Mirkin 2013-09-22 19:08:58 UTC
A few observations from running

diff -u <(grep -P '(PDISPLAY|MARK)' good/demmio | sed 's/.*MMIO//')  <(grep -P '(PDISPLAY|MARK)' bad/demmio | sed 's/.*MMIO//')

(the demmio files are envytools/rnn/demmio -f good/mydump.txt > good/demmio)
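For anyone without bash process substitution at hand, the same filter-and-diff step can be sketched in Python. This is only a rough equivalent of the command above, not part of the original analysis: the function names `filter_trace` and `diff_traces` are made up for illustration, and the input is assumed to be the text lines of the two demmio files.

```python
import difflib
import re


def filter_trace(lines):
    """Keep PDISPLAY and MARK lines and drop everything up to 'MMIO'.

    Mirrors: grep -P '(PDISPLAY|MARK)' | sed 's/.*MMIO//'
    Stripping the prefix removes the per-line timestamps, which would
    otherwise make every line differ between the two traces.
    """
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if "PDISPLAY" in line or "MARK" in line:
            kept.append(re.sub(r".*MMIO", "", line, count=1))
    return kept


def diff_traces(good_lines, bad_lines):
    """Return a unified diff (list of strings) of the filtered traces."""
    return list(
        difflib.unified_diff(
            filter_trace(good_lines),
            filter_trace(bad_lines),
            fromfile="good/demmio",
            tofile="bad/demmio",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Assumed file layout, matching the paths used in the command above.
    with open("good/demmio") as good, open("bad/demmio") as bad:
        print("\n".join(diff_traces(good.readlines(), bad.readlines())))
```

Lines prefixed with "-" appear only in the good trace and lines prefixed with "+" only in the bad one, which is how the excerpts below should be read.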

These largely look the same. However the "good" trace has three sequences that don't appear in the "bad" trace:

1. A bunch of reads from a bunch of registers, towards the very beginning.

2. Some SOR clock setting:

 32 W 0x61c814 0x00000000 PDISPLAY.SOR[0x1].PLL2 <= 0
 32 R 0x61c90c 0x00401100 PDISPLAY.SOR[0x1].LINK[0].DP_CTRL => { LANE_MASK = 0 | TRAINING_PATTERN = DISABLED | 0x401100 }
 32 W 0x61c90c 0x00401101 PDISPLAY.SOR[0x1].LINK[0].DP_CTRL <= { LANE_MASK = 0 | TRAINING_PATTERN = DISABLED | 0x401101 }
-32 R 0x619494 0x000900e0 PDISPLAY.VGA.CR+0x94 => 0x900e0
 32 R 0x614b00 0x00870484 PDISPLAY.CLOCK.SOR[0x1] => 0x870484
-32 W 0x614b00 0x03870484 PDISPLAY.CLOCK.SOR[0x1] <= 0x3870484
-32 R 0x614b00 0x03870484 PDISPLAY.CLOCK.SOR[0x1] => 0x3870484
-32 R 0x61c80c 0x01000000 PDISPLAY.SOR[0x1].PLL0 => 0x1000000
-32 W 0x61c80c 0x00000000 PDISPLAY.SOR[0x1].PLL0 <= 0
-32 R 0x61c808 0x00800000 PDISPLAY.SOR[0x1]+0x8 => 0x800000
-32 W 0x61c808 0x14800000 PDISPLAY.SOR[0x1]+0x8 <= 0x14800000
-32 R 0x61c808 0x14800000 PDISPLAY.SOR[0x1]+0x8 => 0x14800000
-32 W 0x61c808 0x00800000 PDISPLAY.SOR[0x1]+0x8 <= 0x800000
-32 R 0x61c80c 0x00000000 PDISPLAY.SOR[0x1].PLL0 => 0
-32 W 0x61c80c 0x01000000 PDISPLAY.SOR[0x1].PLL0 <= 0x1000000
-32 R 0x61c840 0x1f000000 PDISPLAY.SOR[0x1].SEQ_INST[0] => 0x1f000000
-32 W 0x61c840 0x1f008000 PDISPLAY.SOR[0x1].SEQ_INST[0] <= 0x1f008000
-32 R 0x614b00 0x03870484 PDISPLAY.CLOCK.SOR[0x1] => 0x3870484
-32 W 0x614b00 0x03870080 PDISPLAY.CLOCK.SOR[0x1] <= 0x3870080
+32 W 0x614b00 0x00870080 PDISPLAY.CLOCK.SOR[0x1] <= 0x870080
 32 W 0x610024 0x00000020 PDISPLAY.INTR_1 <= { CLK_UNK1 }
 32 W 0x610030 0x80000000 PDISPLAY.UNK30_CTRL <= { PENDING }
 32 R 0x610020 0x00000000 PDISPLAY.INTR_0 => { 0 }

3. A bunch of times, this same sequence is repeated:

 32 R 0x610020 0x00000000 PDISPLAY.INTR_0 => { 0 }
 32 R 0x610024 0x00000040 PDISPLAY.INTR_1 => { CLK_UNK2 }
 32 R 0x610030 0x00000550 PDISPLAY.UNK30_CTRL => { UPDATE_VCLK1 | 0x150 }
-32 R 0x619494 0x000900e0 PDISPLAY.VGA.CR+0x94 => 0x900e0
-32 R 0x61c860 0x00002000 PDISPLAY.SOR[0x1].SEQ_INST[0x8] => 0x2000
-32 W 0x61c860 0x1f008000 PDISPLAY.SOR[0x1].SEQ_INST[0x8] <= 0x1f008000
-32 W 0x61c804 0x80000000 PDISPLAY.SOR[0x1].PWR <= { NORMAL_STATE = PD | NORMAL_START = NORMAL | SAFE_STATE = PD | SAFE_START = NORMAL | TRIGGER }
-32 R 0x61c804 0x00000000 PDISPLAY.SOR[0x1].PWR => { NORMAL_STATE = PD | NORMAL_START = NORMAL | SAFE_STATE = PD | SAFE_START = NORMAL }
-32 R 0x61c804 0x00000000 PDISPLAY.SOR[0x1].PWR => { NORMAL_STATE = PD | NORMAL_START = NORMAL | SAFE_STATE = PD | SAFE_START = NORMAL }
-32 R 0x61c830 0x00088800 PDISPLAY.SOR[0x1].BLANK => 0x88800
-32 R 0x61c830 0x00088800 PDISPLAY.SOR[0x1].BLANK => 0x88800
-32 R 0x61c840 0x1f008000 PDISPLAY.SOR[0x1].SEQ_INST[0] => 0x1f008000
-32 W 0x61c840 0x1f000000 PDISPLAY.SOR[0x1].SEQ_INST[0] <= 0x1f000000
-32 R 0x61c860 0x1f008000 PDISPLAY.SOR[0x1].SEQ_INST[0x8] => 0x1f008000
-32 W 0x61c860 0x00002000 PDISPLAY.SOR[0x1].SEQ_INST[0x8] <= 0x2000
-32 R 0x61c90c 0x00401101 PDISPLAY.SOR[0x1].LINK[0].DP_CTRL => { LANE_MASK = 0 | TRAINING_PATTERN = DISABLED | 0x401101 }
-32 W 0x61c90c 0x00401100 PDISPLAY.SOR[0x1].LINK[0].DP_CTRL <= { LANE_MASK = 0 | TRAINING_PATTERN = DISABLED | 0x401100 }
-32 R 0x614b00 0x03878040 PDISPLAY.CLOCK.SOR[0x1] => 0x3878040
-32 W 0x614b00 0x00878040 PDISPLAY.CLOCK.SOR[0x1] <= 0x878040
-32 W 0x61c804 0x80000001 PDISPLAY.SOR[0x1].PWR <= { NORMAL_STATE = PU | NORMAL_START = NORMAL | SAFE_STATE = PD | SAFE_START = NORMAL | TRIGGER }
-32 R 0x61c804 0x00000001 PDISPLAY.SOR[0x1].PWR => { NORMAL_STATE = PU | NORMAL_START = NORMAL | SAFE_STATE = PD | SAFE_START = NORMAL }
-32 R 0x61c804 0x00000001 PDISPLAY.SOR[0x1].PWR => { NORMAL_STATE = PU | NORMAL_START = NORMAL | SAFE_STATE = PD | SAFE_START = NORMAL }
-32 R 0x61c830 0x00038800 PDISPLAY.SOR[0x1].BLANK => 0x38800
-32 R 0x61c830 0x00038800 PDISPLAY.SOR[0x1].BLANK => 0x38800
 32 W 0x610024 0x00000040 PDISPLAY.INTR_1 <= { CLK_UNK2 }
 32 W 0x610030 0x80000000 PDISPLAY.UNK30_CTRL <= { PENDING }
 32 R 0x619494 0x000900e0 PDISPLAY.VGA.CR+0x94 => 0x900e0

There are a few other minor discrepancies, but I don't think they're very interesting. Not sure what to make of this -- these registers don't appear to be written to by the code directly, so it must be some sort of script getting executed? I wonder if the extra SOR.PLL/etc stuff (from #2 above) needs to go into the NV50_DISP_SOR_PWR sequence. Or perhaps there's some sort of difference in the evo stuff, which I don't think is reflected in the mmiotrace.
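When comparing register values like the ones in sequence 2, it can help to see exactly which bits differ. The sketch below is a generic helper, not something from nouveau or envytools; the name `changed_bits` is made up. It is applied to the two final writes to PDISPLAY.CLOCK.SOR[0x1] shown in the diff (0x03870080 in the good trace, 0x00870080 in the bad one). It says nothing about what those bits mean.

```python
def changed_bits(good_value, bad_value):
    """Return the XOR mask and the bit positions where two 32-bit values differ."""
    mask = good_value ^ bad_value
    bits = [bit for bit in range(32) if mask & (1 << bit)]
    return mask, bits


# Final write to PDISPLAY.CLOCK.SOR[0x1]: good trace vs. bad trace.
mask, bits = changed_bits(0x03870080, 0x00870080)
print(f"mask=0x{mask:08x} bits={bits}")  # mask=0x03000000 bits=[24, 25]
```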
Comment 14 Pekka Paalanen 2013-09-28 06:05:20 UTC
Ilia suggested that I try booting with nouveau.config=NvForcePost=1.

I had TV plugged in HDMI and on, I booted the first bad kernel revision with nouveau.config=NvForcePost=1, and got no picture on either HDMI or LVDS once Nouveau initialized.

(Note to self: kernel logs are at timestamp 2013 Sep 28 08:56:17.)
Comment 15 Ilia Mirkin 2014-01-10 10:17:41 UTC
Armed with a bit more knowledge about the whole VBIOS thing... can you boot one of the good kernels with drm.debug=0xe -- that should produce BIOSLOGs for the majority of instructions... I think.

Also, the VBIOS that you uploaded is all weird (x86 opcodes instead of script opcodes, which apparently can happen if you grab from pcirom). Would you mind grabbing it from /sys/kernel/debug/dri/0/vbios.rom?
Comment 16 Pekka Paalanen 2014-01-15 17:11:09 UTC
Created attachment 92167 [details]
VBIOS from /sys/kernel/debug/dri/0/vbios.rom
Comment 17 Pekka Paalanen 2014-01-15 17:19:26 UTC
Created attachment 92168 [details]
Kernel log 3.5.2, HDMI clone ok, drm.debug=0xe

As requested, here is the log from a kernel that lights up the HDMI output properly.

Laptop booted with HDMI connected and TV on, nouveau module then loaded manually after boot. Fbcon is cloned to both LVDS and HDMI correctly.
Comment 18 Ilia Mirkin 2014-01-15 22:10:31 UTC
Created attachment 92185 [details] [review]
vbios-pq1.patch

Pick any failing kernel that you like, and try applying this patch. Boot with nouveau.debug=VBIOS=trace,PDISP=debug which should just get the logs we care about.

BTW, my observation is that IO condition 5 is met in the old kernels, but not met in the new ones. This changes the execution flow, of course.
Comment 19 Ilia Mirkin 2014-01-17 06:54:25 UTC
... and if that doesn't work, try making it a rd32 & 0xff instead of a rd08. [would also need more logic to handle non-aligned-to-4 reads, but you don't have those]

Looking at the full mmiotrace (rather than just PDISPLAY as my earlier diff was doing), there are sections like

[0] 35.111642 MMIO32 R 0x619494 0x000900e8 PDISPLAY.VGA.CR+0x94 => 0x900e8
[0] 35.111688 MMIO32 W 0x619494 0x000900e0 PDISPLAY.VGA.CR+0x94 <= 0x900e0
... ~30 more identical lines ...

and then in the good trace:
[0] 35.113651 MMIO32 R 0x619494 0x000900e0 PDISPLAY.VGA.CR+0x94 => 0x900e0

but in the bad trace:
[0] 39.098615 MMIO8 W 0x6013d4 0x00000094 PRMIO.CRX <= 0x94
[0] 39.098655 CRTC0 R     0x94       0x00 0x94 => 0

And that is the value used by the condition 0x05. So it seems really likely that this will help. I hope.
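The "rd32 & 0xff instead of a rd08" idea, including the extra logic for addresses that are not aligned to 4, can be sketched like this. It is a Python model for illustration only, not the kernel code or the contents of vbios-pq1.patch: `rd08_via_rd32` and the dictionary-backed `rd32` are hypothetical names, and the sketch assumes the usual little-endian byte order within a 32-bit register.

```python
def rd08_via_rd32(rd32, addr):
    """Emulate an 8-bit register read using an aligned 32-bit read.

    rd32 is any callable that takes a 4-byte-aligned address and
    returns the 32-bit value stored there.
    """
    aligned = addr & ~0x3       # round down to a 4-byte boundary
    shift = (addr & 0x3) * 8    # byte lane inside the 32-bit word
    return (rd32(aligned) >> shift) & 0xFF


# Toy register file holding the value the good trace reads at 0x619494.
regs = {0x619494: 0x000900E0}


def rd32(addr):
    return regs.get(addr, 0)


# For an aligned address this reduces to rd32(addr) & 0xff.
print(hex(rd08_via_rd32(rd32, 0x619494)))  # 0xe0
```

With the good trace's value 0x900e0 the low byte is 0xe0, whereas the bad trace's CRTC read of index 0x94 returned 0, which is consistent with the condition evaluating differently.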
Comment 20 Pekka Paalanen 2014-01-18 08:38:11 UTC
Created attachment 92319 [details]
kernel 3.12.6 with vbios-pq1.patch log

Here is the kernel log with the nouveau debug options Ilia asked for, with the vbios-pq1.patch applied. I booted with HDMI connected to the live TV, and I manually loaded nouveau.ko after boot.

I'm happy to say that this change makes fbcon appear on the TV!

This also makes X show up on TV, and I can use xrandr to set the TV to full-HD resolution, extending my desktop from LVDS. So they both came up fine this time (I've had vague problems with using both outputs non-cloned before).

Wheee! \o/
Comment 21 Andreas Reis 2014-01-19 15:23:21 UTC
I applied the vbios-pq1.patch to 3.13-rc8+ to check if it fixes the same NV96 HDMI issue (mentioned incidentally in bug #73791), and it indeed works for me as well.
Comment 22 fd_mitch 2014-01-21 18:51:20 UTC
I applied the patch on the Ubuntu 13.10 kernel (3.11.0-15-generic) and it solved the issue (second monitor blank).
Comment 23 Ilia Mirkin 2014-01-23 06:57:28 UTC
A different patch was checked in to handle this and is now in drm-next and should appear in 3.14-rc1. Marking this as fixed... feel free to reopen if the thing in drm-next doesn't work for you.
Comment 24 Pekka Paalanen 2014-01-26 11:27:04 UTC
Tried the complete kernel from nouveau/drm-nouveau-next (v3.13-rc8-665-g1139ffb) which contains the commit

commit f87cd8b695d372087685976460fac1ec6ba2fca9
Author: Ilia Mirkin <imirkin@alum.mit.edu>
Date:   Sun Jan 19 04:18:15 2014 -0500

    drm/nouveau/devinit: lock/unlock crtc regs for all devices, not just pre-nv50
    
    Also make nv_lockvgac work for nv50+ devices. This should fix
    IO_CONDITION and related VBIOS opcodes that read/write the crtc regs.
    
    See https://bugs.freedesktop.org/show_bug.cgi?id=60680
    
    Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
    Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

which apparently is supposed to fix the issue.

This kernel works, and the issue is fixed.

Thank you!
