After looking at #66129 and testing Linux 3.10.3 (which includes the patches from #66129) and still having problems, I decided to open a bug report, as the problem appears to be something else. I'm not sure the summary describes the problem well.

First I should mention that I already wrote a bug report on the kernel tracker:
https://bugzilla.kernel.org/show_bug.cgi?id=60071
That was probably the wrong place, and it has a completely different description.

Machine:
One of the nv50 (8800 GTS) graphics cards. One monitor.

Notes:
* early KMS

Issue:
With 3.9.6 everything works fine. The system boots and I get my DM.

3.9.7 introduced the following changes:

commit aed4802d2a0d2f6c7179a2c03e184860741788c0
Author: Ben Skeggs <bskeggs@redhat.com>
Date: Mon Jun 3 16:40:14 2013 +1000

    drm/nv50/kms: use dac loadval from vbios, where it's available

    commit d40ee48acde16894fb3b241d7e896d5fa84e0f10 upstream.
-------
commit b242745947ed7562ad91dad9e9fde1a92e40d666
Author: Ben Skeggs <bskeggs@redhat.com>
Date: Mon Jun 3 16:07:06 2013 +1000

    drm/nv50/disp: force dac power state during load detect

    commit ea9197cc323839ef3d5280c0453b2c622caa6bc7 upstream.
-------

Since 3.9.7 (also tested with 3.9.8 and 3.10.3) the X server is no longer usable, if that's the right description. I just see a black screen and can only switch back to a tty (which takes its time).
I tried it with a DM and without (plain xinit/startx from a tty).
Output:

With Linux 3.9.6:

xrandr message:
xRandr: Found crtc's: 2
xRandr: Linking output DVI-I-1 with crtc 0

Output of xrandr --current:
Screen 0: minimum 320 x 200, current 1680 x 1050, maximum 8192 x 8192
DVI-I-1 connected 1680x1050+0+0 (normal left inverted right x axis y axis) 473mm x 296mm
   1680x1050      59.9*+
   1400x1050      74.9
   1280x1024      75.0
   1440x900       75.0     59.9
   1152x864       75.0
   1024x768       75.1     70.1     60.0
   832x624        74.6
   800x600        72.2     75.0     60.3     56.2
   640x480        72.8     75.0     66.7     60.0
   720x400        70.1
DVI-I-2 disconnected (normal left inverted right x axis y axis)

Logs with Linux 3.9.7:

xrandr message:
xRandr: Found crtc's: 2
xRandr: Linking output DVI-I-1 with crtc 0
xRandr: Linking output DVI-I-2 with crtc 1

Output of xrandr --current:
Screen 0: minimum 320 x 200, current 1024 x 768, maximum 8192 x 8192
DVI-I-1 connected 1024x768+0+0 (normal left inverted right x axis y axis) 473mm x 296mm
   1680x1050      59.9 +
   1400x1050      74.9
   1280x1024      75.0
   1440x900       75.0     59.9
   1152x864       75.0
   1024x768       75.1     70.1     60.0*
   832x624        74.6
   800x600        72.2     75.0     60.3     56.2
   640x480        72.8     75.0     66.7     60.0
   720x400        70.1
DVI-I-2 connected 1024x768+0+0 (normal left inverted right x axis y axis) 0mm x 0mm
   1024x768       60.0*
   800x600        60.3     56.2
   848x480        60.0
   640x480        59.9

I'm adding a dmesg snippet later.
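For anyone comparing such outputs by script rather than by eye, here is a minimal Python sketch that lists the outputs xrandr reports as connected and flags those with a physical size of 0mm x 0mm, as DVI-I-2 shows under 3.9.7 above. The function names and the shortened sample are made up for illustration, and a zero size is only a hint (some real monitors report no size), not proof of a ghost monitor.

```python
import re

# Matches xrandr output header lines such as:
#   DVI-I-2 connected 1024x768+0+0 (normal ...) 0mm x 0mm
# The physical size part is optional (disconnected outputs do not print it).
OUTPUT_RE = re.compile(
    r"^(?P<name>\S+) (?P<state>connected|disconnected)\b"
    r"(?:.*?(?P<w>\d+)mm x (?P<h>\d+)mm)?"
)


def connected_outputs(xrandr_text):
    """Return {output name: (width_mm, height_mm) or None} for connected outputs."""
    found = {}
    for line in xrandr_text.splitlines():
        m = OUTPUT_RE.match(line.strip())
        if m and m.group("state") == "connected":
            size = None
            if m.group("w") is not None:
                size = (int(m.group("w")), int(m.group("h")))
            found[m.group("name")] = size
    return found


def suspicious_outputs(xrandr_text):
    """Connected outputs that report a 0mm x 0mm physical size (heuristic only)."""
    return [name for name, size in connected_outputs(xrandr_text).items()
            if size == (0, 0)]


# Shortened copy of the 3.9.7 output from this report.
SAMPLE_3_9_7 = """\
Screen 0: minimum 320 x 200, current 1024 x 768, maximum 8192 x 8192
DVI-I-1 connected 1024x768+0+0 (normal left inverted right x axis y axis) 473mm x 296mm
   1680x1050      59.9 +
   1024x768       75.1     70.1     60.0*
DVI-I-2 connected 1024x768+0+0 (normal left inverted right x axis y axis) 0mm x 0mm
   1024x768       60.0*
"""

print(suspicious_outputs(SAMPLE_3_9_7))
```

Feeding it the 3.9.6 output instead gives an empty list, since DVI-I-2 is reported as disconnected there.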
Created attachment 83073 [details] nouveau specific entries from journalctl using 3.9.6 Journalctl log from the working 3.9.6 kernel
Created attachment 83074 [details] nouveau specific entries from journalctl using 3.10.3 Log from a non working kernel (3.10.3, the latest kernel at the moment)
(In reply to comment #0)
> After looking at #66129 and testing linux 3.10.3 (includes the patches from
> 66129) and still having problems, I decided to open a bugreport, as the
> problem appears to be something else.
> I'm not sure if the topic describes the problem really well.
>
>
> First I should mention, that I wrote a bugreport on the kernel tracker:
> https://bugzilla.kernel.org/show_bug.cgi?id=60071
> Which was probably the wrong place, and a complete different description.
>
>
> Machine:
> One from the nv50 (8800 GTS) graphic cards. One monitor.
>
So in the case of 3.9.7 your system seems to be detecting a ghost monitor hooked up to your DVI-I-2. Did I get that right?

> Notes:
> *early KMS
>
>
> Issue:
> With 3.9.6 everything is working fine. Systems boots and I'm getting my DM
>
> 3.9.7 introduced following changes:
> commit aed4802d2a0d2f6c7179a2c03e184860741788c0
> Author: Ben Skeggs <bskeggs@redhat.com>
> Date: Mon Jun 3 16:40:14 2013 +1000
>
> drm/nv50/kms: use dac loadval from vbios, where it's available
>
> commit d40ee48acde16894fb3b241d7e896d5fa84e0f10 upstream.
> -------
> commit b242745947ed7562ad91dad9e9fde1a92e40d666
> Author: Ben Skeggs <bskeggs@redhat.com>
> Date: Mon Jun 3 16:07:06 2013 +1000
>
> drm/nv50/disp: force dac power state during load detect
>
> commit ea9197cc323839ef3d5280c0453b2c622caa6bc7 upstream.
I would suspect this ^^ commit to be the one causing the issue. Can you please revert it on top of 3.9.7 and confirm? How many monitors does that kernel detect?

> -------
>
> Since 3.9.7 (also tested with 3.9.8 and 3.10.3) the X-server won't be
> accessible. If that's the right description. I'm just seeing a black screen
> and I just can switch back to a tty (which takes it time).
> Tried it with a DM and without (plain xinit/startx from tty).
>
A brief explanation of what happens here:
Nouveau detects a phantom/ghost monitor hooked up to DVI-I-2, which on its own causes the DISPLAY engine to hang. That leaves the kernel unable to change the resolution (modesetting), so X fails to start and VT switching is dead slow.

Cheers
Emil
You probably also want this patch to make NV50 work properly: http://lists.freedesktop.org/archives/nouveau/2013-July/013051.html
(In reply to comment #3)
> A brief explanation what happens here:
> Nouveau detects a phantom/ghost monitor hooked up to DVI-I-2, which on its
> own causes the DISPLAY engine to hang. Leading to the kernel being unable to
> change the resolution(modesetting), and X failing to start and VT switch
> being dead slow
>
> Cheers
> Emil

Ah, I see. Thanks for the explanation.
And yes, I can try to revert the second commit on top of 3.9.7 and will report the results later.
Result for linux-3.9.7 without the drm-nv50-disp-force-dac-power-state-during-load-detect.patch [0]:
Everything works as it should. No ghost monitor anymore:

Current Operating System: Linux Hekate 3.9.7-1-ARCH #1 SMP PREEMPT Sat Jul 27 17:56:17 CEST 2013 i686
(...)
(II) NOUVEAU(0): Output DVI-I-1 connected
(II) NOUVEAU(0): Output DVI-I-2 disconnected

(
Whereas the failing 3.9.7 Xorg log looks like this:

Current Operating System: Linux Hekate 3.9.7-1-ARCH #1 SMP PREEMPT Thu Jun 20 23:22:07 CEST 2013 i686
(...)
(II) NOUVEAU(0): Output DVI-I-1 connected
(II) NOUVEAU(0): Output DVI-I-2 connected
)

_______
[0] https://git.kernel.org/cgit/linux/kernel/git/stable/stable-queue.git/tree/releases/3.9.7/drm-nv50-disp-force-dac-power-state-during-load-detect.patch?id=4c71e2e7d2a2175d6682447fd44fcbf5d5a98df8
Great stuff :) I'll attach a few patches in the next few days for you to try
Created attachment 83155 [details] [review] some debug printfs Give this patch a try and attach the output of dmesg. Thanks
Created attachment 83228 [details] dmesg from 3.9.7 with the debug patch There you go :) I hope the information you need is there. Also I wasn't sure if I should do this on top of 3.9.7 or the newest one. So I decided to do it with 3.9.7 and 3.10.3 (dmesg log coming soon)
Created attachment 83231 [details] dmesg from 3.10.3 with the debug patch

As stated in the previous comment, here is the dmesg log using 3.10.3 and the debug patch. I hope the info is there. Otherwise I need to search the journal
Created attachment 83240 [details] [review] another debug patch

Another patch. It will print out a bit more information (that the first one missed) and it will unconditionally set ret & data[0]. Apply on top of either the 3.9 or 3.10 branch _after_ the offending commit.

I have a sneaking suspicion that you may need to try a 3.7 kernel, but that will follow after the results from this patch
Link the bug that the regressing commit addresses
Created attachment 83307 [details] dmesg from 3.9 with the faulty commit and debug2 (aka 3.9.7+debug2)

There you go. dmesg output with debug2 kernel.

>I have a sneaky suspicion that you may need to try a 3.7 kernel, but that will
>follow after the results from this patch

Huh? Something special about 3.7, or what is the reason?
(In reply to comment #13)
> Created attachment 83307 [details]
> dmesg from 3.9 with the faulty commit and debug2 (aka 3.9.7+debug2)
>
> There you go. dmesg output with debug2 kernel.
>
Thanks, there is something really funky happening in here.

> >I have a sneaky suspicion that you may need to try a 3.7 kernel, but that will
> >follow after the results from this patch
>
> Huh? Something special about 3.7, or what is the reason?
The commit that causes the issue on your setup fixes a regression introduced around 3.7
Created attachment 83320 [details] [review] debug-3 Grab the 3.9 branch _without_ the offending patch and apply debug-3 one on top. Attach dmesg as usual. Thanks :)
Created attachment 83335 [details] dmesg from 3.9.7 without the bad patch + debug3

And the next log.

>Thanks, there is something really funky happening in here.
Sounds like a lot of problems ;)

>The commit that causes the issue on your setup fixes a regression introduced
>around the 3.7
I see. Overlooked that fact
A fair bit of fairy stuff in here. Without the bad patch, nouveau does not call the appropriate sense() function, thus your system never looks for any other outputs and does not see the phantom monitor connected to the dvi-i.

Can you provide a bit more information about the card, such as the number and type of outputs/connectors and whether they work or not?
i.e.
dvi-i -> working using a dvi-to-vga adaptor and a vga monitor
vga -> never tested
...

Additionally please attach your video bios [1].

Thanks
Cheers
Emil

[1] http://nouveau.freedesktop.org/wiki/DumpingVideoBios/
The debugfs or vbtracetool method is preferred
Created attachment 83378 [details] BIOS dump from my graphic card

The card is an Nvidia Geforce 8800 GTS manufactured by Leadtek (IIRC Leadtek WinFast PX8800GTS TDH 320MB PCI-E).
The card has two dvi connectors (dual-linked, whatever that means) and a tv-out. Both dvi connectors are working, but only one is used.

dvi-1 --> dvi-to-vga adaptor going to a vga monitor
dvi-2 --> normally not used. But working
tv-out --> never tested

After the problems appeared and this second screen "appeared", I switched the connector, but with the same result. If I changed the connector while the system was running, I got a picture from this "ghost".

What it looked like: on dvi-1, once I get to a tty, its resolution is 1680x1050, but only 1024x768 is used. Switched to dvi-2, it's the same picture but the resolution is changed to 1024x768.

Something else you need to know?
According to your vbios

> dvi-1 --> dvi-to-vga adaptor going to a vga monitor
The above uses OR 2/1 for analogue/digital output

> dvi-2 --> normally not used. But working
OR 1/1 respectively here

> tv-out --> never tested
OR 0

Can you redo my request from comment 15, but with the monitor connected to the other dvi port?

Just for the jokes, can you try booting a failing kernel with nouveau.tv_disable=1 appended to your command line?

If the above fails, it would be great if you could grab and compile a couple of kernels
* git checkout 7ebb38b556485449bfaa506a196439f6a6fd6ebd~1

and
* git checkout 7ebb38b556485449bfaa506a196439f6a6fd6ebd~2

Obviously let me know how they fare and attach the output of dmesg for both

Thanks
Emil
(In reply to comment #19)

Interim report

> According to your vbios
> > dvi-1 --> dvi-to-vga adaptor going to a vga monitor
> The above uses OR 2/1 for analogue/digital output
>
> > dvi-2 --> normally not used. But working
> OR 1/1 respectively here
>
> > tv-out --> never tested
> OR 0
Whatever that means :D

> Can you redo my request from comment 15, but with the monitor connected into
> the other dvi port
It was working. Xorg.log reported that DVI-I-1 is disconnected and DVI-I-2 connected.
The resulting dmesg differs from the first one in the following lines:
[ 11.804787] nouveau [ PDISP][0000:01:00.0] nv50_dac_mthd power(or 2, data[0] 0x000001a4) 0x00000000
[ 11.806777] nouveau [ PDISP][0000:01:00.0] nv50_dac_mthd power(or 2, data[0] 0x000001a4) 0x00000000
[ 11.837181] nouveau [ PDISP][0000:01:00.0] nv50_dac_mthd power(or 2, data[0] 0x000001a4) 0x00000000
[ 11.839164] nouveau [ PDISP][0000:01:00.0] nv50_dac_mthd power(or 2, data[0] 0x000001a4) 0x00000000

> Just for the jokes can you try booting a failing kernel with
> nouveau.tv_disable=1 appended to your command line.
No positive result from that either. It was still failing

> If the above fails would be great if you can grab and compile a couple of
> kernels
> * git checkout 7ebb38b556485449bfaa506a196439f6a6fd6ebd~1
>
> and
> * git checkout 7ebb38b556485449bfaa506a196439f6a6fd6ebd~2
>
> Obviously let me know how they fare and attach the output of dmesg for both
Will do, but it will take its time. The kernel repo is not a small one :)
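To compare such debug lines between two dmesg logs without eyeballing them, a small Python sketch like the following can pull out the fields. It assumes the line shape shown above, which comes from the debug patches attached to this report and is not a stock nouveau message; the field name "tail" is a placeholder, since the meaning of the trailing value is not spelled out here.

```python
import re
from collections import Counter

# Shape assumed from the debug output quoted in this report:
#   [ 11.804787] nouveau [ PDISP][0000:01:00.0] nv50_dac_mthd power(or 2, data[0] 0x000001a4) 0x00000000
LINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+nouveau\s+\[\s*PDISP\]\[(?P<pci>[0-9a-f:.]+)\]\s+"
    r"nv50_dac_mthd\s+(?P<method>\w+)\(or (?P<or>\d+), data\[0\] (?P<data>0x[0-9a-f]+)\)\s+"
    r"(?P<tail>0x[0-9a-f]+)"
)


def parse_dac_lines(dmesg_text):
    """Extract one dict per nv50_dac_mthd debug line; other lines are skipped."""
    events = []
    for line in dmesg_text.splitlines():
        m = LINE_RE.search(line)
        if m:
            events.append({
                "ts": float(m.group("ts")),
                "method": m.group("method"),
                "or": int(m.group("or")),
                "data": int(m.group("data"), 16),
                "tail": int(m.group("tail"), 16),  # trailing value, meaning not assumed
            })
    return events


def count_by_or(events):
    """Count how often each (method, OR index) pair shows up."""
    return Counter((e["method"], e["or"]) for e in events)


SAMPLE = """\
[ 11.804787] nouveau [ PDISP][0000:01:00.0] nv50_dac_mthd power(or 2, data[0] 0x000001a4) 0x00000000
[ 11.806777] nouveau [ PDISP][0000:01:00.0] nv50_dac_mthd power(or 2, data[0] 0x000001a4) 0x00000000
[ 11.837181] nouveau [ PDISP][0000:01:00.0] nv50_dac_mthd power(or 2, data[0] 0x000001a4) 0x00000000
[ 11.839164] nouveau [ PDISP][0000:01:00.0] nv50_dac_mthd power(or 2, data[0] 0x000001a4) 0x00000000
"""

print(count_by_or(parse_dac_lines(SAMPLE)))
```

Running the same counter over both logs and comparing the results would show at a glance whether the OR index changes when the monitor is moved to the other port.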
(In reply to comment #20)
> It was working. Xorg.log wrote, that DVI-I-1 is disconnected and DVI-I-2
> connected.
Let me see if I'm not day dreaming in here. If you start up with your monitor connected to the second DVI port, everything works fine?

> The resulting dmesg varies at following lines from the first:
> [ 11.804787] nouveau [ PDISP][0000:01:00.0] nv50_dac_mthd power(or 2,
> data[0] 0x000001a4) 0x00000000
> [ 11.806777] nouveau [ PDISP][0000:01:00.0] nv50_dac_mthd power(or 2,
> data[0] 0x000001a4) 0x00000000
> [ 11.837181] nouveau [ PDISP][0000:01:00.0] nv50_dac_mthd power(or 2,
> data[0] 0x000001a4) 0x00000000
> [ 11.839164] nouveau [ PDISP][0000:01:00.0] nv50_dac_mthd power(or 2,
> data[0] 0x000001a4) 0x00000000
>
I suspect there was a bit more that differed than this, such as the "or" on the other two printf's, which should differ as well. Would you mind attaching the dmesg?

Thanks
Created attachment 83466 [details] dmesg from 3.9.7 without offending patch+debug3. Monitor connected to the second dvi

>> It was working. Xorg.log wrote, that DVI-I-1 is disconnected and DVI-I-2
>> connected.
>Let me see if I'm not day dreaming in here. If you startup with your monitor
>connected to the second DVI port, everything works fine ?

Yes. Same setup as in comment 15, with the dvi port changed. After pressing power and booting to the end I got my DM and everything else ;)
Did you hope it would check both connectors now, and detect a ghost monitor even without the "offending" patch?

On a side note: the first 3.7 kernel is finished, the second is being compiled. So the dmesg from those will follow in about an hour
Created attachment 83477 [details] dmesg from kernel available through * git checkout 7ebb38b556485449bfaa506a196439f6a6fd6ebd~1 Nothing particular to report. It worked
Created attachment 83478 [details] dmesg from kernel available through * git checkout 7ebb38b556485449bfaa506a196439f6a6fd6ebd~2

Worked too, so I dunno what to write about how they fare :)
Created attachment 83549 [details] [review] possible fix Still not too sure how come your system senses a monitor connected to your second dvi port. This patch reverts some of the new stuff introduced during the transition to core. Please apply this patch on top of 3.9 branch (i.e. after the offending commit)
Created attachment 83558 [details] dmesg from 3.9 with offending commit + fix

(In reply to comment #25)
> Created attachment 83549 [details] [review]
> possible fix
>
> Still not too sure how come your system senses a monitor connected to your
> second dvi port.
>
> This patch reverts some of the new stuff introduced during the transition to
> core. Please apply this patch on top of 3.9 branch (i.e. after the offending
> commit)

Yes, this is working. It seems no ghost monitor was detected and X works like it should :)
(In reply to comment #26)
> Yes, this is working. It seems no ghost monitor was detected and X works
> like it should :)

Great, can you track down which hunk (one or a combination) is needed?

* set DAC_CLK_CTRL1
+ nv_wr32(priv, 0x61a010 + doff, 0x00000001);

* bump up the delay (5 x 9.5 ms)
+ udelay(9500);
+ udelay(9500);
+ udelay(9500);
+ udelay(9500);

* remove the stabilisation
+/*
 nv_wr32(priv, 0x61a00c + doff, 0x80000000);
+*/

Thanks
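One way to organise the test builds for "one or a combination" of the three hunks is to list every non-empty subset, smallest first, so that single-hunk kernels get built before any combination. The Python sketch below only produces that list; the short hunk labels are taken from the bullet points above, and the function name is made up for illustration.

```python
from itertools import combinations

# Labels for the three hunks of the "possible fix" patch, as listed above.
HUNKS = (
    "set DAC_CLK_CTRL1",
    "bump up the delay",
    "remove the stabilisation",
)


def build_plan(hunks):
    """Return every non-empty subset of hunks, ordered from smallest to largest."""
    plan = []
    for size in range(1, len(hunks) + 1):
        plan.extend(combinations(hunks, size))
    return plan


for number, subset in enumerate(build_plan(HUNKS), start=1):
    print(f"build {number}: " + " + ".join(subset))
```

With three hunks that gives seven candidate builds; if a single hunk already fixes the problem, the remaining combinations can usually be skipped.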
Yes, I can do that and will post the results, when I'm finished.
I can confirm that applying

>* remove the stabilisation
>+/*
> nv_wr32(priv, 0x61a00c + doff, 0x80000000);
>+*/

solves my issue. The other two hunks weren't successful. I suppose testing the combinations of those isn't needed after this result?
As the days go by, I'm wondering what I should do, or what the status of this report is.
Is this the kind of problem that I need to take care of myself? E.g. it cannot be patched/removed/altered because this stabilisation is needed and removing it would harm other cards?

I have no problem with waiting, but the uncertainty is killing me :)

Best regards
(In reply to comment #30)
> As the days went by. I'm wondering what I should do. Or what's the case with
> this report.
> Is it kind of a problem, that I need to take care myself? E.g. it cannot be
> patched/removed/altered, because this stabilisation is needed and would harm
> other cards?
>
The point is that I've asked a few people and no one else seems to have an issue like yours. So I suspect it's related to the original NV50. That said, the only person I can think of who has such a card to test is the nouveau/kernel maintainer, who is quite busy but hopefully will have a moment to test.

The side effect of using the new "fix" is a regression on almost all other cards, and/or the need to increase the delay to a silly value (as it was before).
(In reply to comment #31)
> (In reply to comment #30)
> > As the days went by. I'm wondering what I should do. Or what's the case with
> > this report.
> > Is it kind of a problem, that I need to take care myself? E.g. it cannot be
> > patched/removed/altered, because this stabilisation is needed and would harm
> > other cards?
> >
> Point is I've asked a few people and no-one seems to have such issue like
> you. So I'm suspecting that it's related to the original NV50. With that
> said the only person I can think of having such a card to test is the
> nouveau/kernel maintainer. Which is quite busy, but hopefully will have a
> moment to test.
>
> Side effect of using the new "fix" is causing a regression on almost all
> other cards, and/or the need to increase the delay to a silly value (as it
> was before).

Hey dude, I have two of them :) One quadro and one geforce.
Created attachment 84479 [details] [review] nv50 quirk

Confirmed that only nv50 cards are affected. The blob's dac handling does differ between nv50 and later cards, so this patch should do the job.

Feel free to test and provide a full name + email if you'd like your contributions to be acknowledged ("Reported-by" and "Tested-by")
Here is a bit more information on how the blob handles this case

nv50
Simple register bashing
write(loadval | 0x0100000)
udelay(140s)
read()
write(0)

nv94/96
HWSQ
00000000: 5f 01 00        ewait #CRTC0_VBLANK 0x0
00000003: 5f 01 01        ewait #CRTC0_VBLANK 0x1
00000006: 0e              wait 0x2 shl 0x6
00000007: 0d              wait 0x1 shl 0x6
00000008: 02              wait 0x2 shl 0x0
00000009: 01              wait 0x1 shl 0x0
// the above are only valid if the output is already enabled
0000000a: e2 08 02 10 00  data 0x100208
0000000f: e0 0c a8 61 00  addr 0x61a80c
00000014: 00              nop
00000015: 40 0c a8        addrlo 0xa80c
00000018: 05              wait 0x1 shl 0x2
00000019: e2 00 00 00 00  data 0x0
0000001e: 40 0c a8        addrlo 0xa80c
00000021: 7f              exit
00000022: 7f              exit
00000023: 7f              exit

PBUS.HWSQ.TRIGGER <= { TYPE = START | ENTRY_POINT = 0 }
PBUS.HWSQ.STATUS => { A = { IP = 0x5 | ACTIVE } }

if output is connected ~5,4 ms later
PDISPLAY.DAC[0x1].LOAD_CTRL => { LOAD_PATTERN = 0 | PRESENT = 0x7 }
PDISPLAY.DAC[0x1].LOAD_CTRL <= { LOAD_PATTERN = 0 | PRESENT = 0 }

otherwise in ~5us
PDISPLAY.DAC[0x1].LOAD_CTRL => { LOAD_PATTERN = 0 | PRESENT = 0x0 }
PDISPLAY.DAC[0x1].LOAD_CTRL <= { LOAD_PATTERN = 0 | PRESENT = 0 }
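As a reading aid for the HWSQ listing, the raw bytes and the decoded operands can be cross-checked with a few lines of Python. This is only a sketch of a pattern that fits the lines quoted above (an opcode byte followed by a little-endian immediate, and single-byte waits split into a two-bit value and a shift); it is not taken from hardware documentation, and the function names are made up.

```python
def decode_immediate(hex_bytes):
    """Split 'e2 08 02 10 00' into (opcode, little-endian operand)."""
    raw = bytes.fromhex(hex_bytes)
    opcode, operand = raw[0], raw[1:]
    return opcode, int.from_bytes(operand, "little")


def decode_wait(byte):
    """Return (value, shift) for a single-byte wait.

    Assumption fitted to the five wait lines in the listing: the low two bits
    are the value and the remaining bits, doubled, are the shift.
    """
    return byte & 0x3, (byte >> 2) * 2


# Lines copied from the listing above.
for text in ("e2 08 02 10 00", "e0 0c a8 61 00", "40 0c a8"):
    opcode, operand = decode_immediate(text)
    print(f"{text:<16} opcode 0x{opcode:02x} operand 0x{operand:x}")

for byte in (0x0e, 0x0d, 0x02, 0x01, 0x05):
    value, shift = decode_wait(byte)
    print(f"{byte:02x}               wait 0x{value:x} shl 0x{shift:x}")
```

Read this way, the sequence after the optional vblank waits writes 0x100208 to 0x61a80c, waits, and then writes 0 to the same register, which resembles the write / delay / write(0) shape of the nv50 register bashing.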
(In reply to comment #33)
> Created attachment 84479 [details] [review]
> nv50 quirk
>
> Confirmed that only nv50 are affected. The blob's dac handling does differ
> between nv50 and later cards, so this patch should do the job.
Quite a difference in handling, if I look at #34

> Feel free to test and provide a full name + email if you'd like your
> contributions to me acknowledged ("Reported-by" and "Tested-by")
Tested it and it's working :)
Thanks for all the work. And naming me isn't necessary. I'm happy that the problem is fixed.
But I'm wondering if more such cases will appear.
(In reply to comment #35)
> Quite a difference in handling, if I look at #34
>
Current code works nicely (with this bugfix), and it's actually shorter than what the blob does on post nv50 cards :P

> But I'm wondering if more such cases will appear.
>
Issues are bound to happen every now and then, especially if one card differs from the whole generation and the manufacturer does not provide any support/documentation :)

The patch has been sent to the appropriate places (linux-stable #3.9+) and the nouveau ML, and should be picked up shortly.

Thanks for the help.
P.S. feel free to close bug ticket at kernel.org, as the fix lands.
(In reply to comment #37)
> P.S. feel free to close bug ticket at kernel.org, as the fix lands.

Will do, but how long does it usually take until a patch is accepted?
I checked the source of 3.10.10 and 3.11 and it's still not applied.
This is now in Linus's tree as 5087f51da805f53cba7366f70d596e7bde2a5486.
Ah, that's nice. So landing with 3.12 Thanks