Summary: | [NVD9] Unable to start X on ThinkPad T420s laptop | ||
---|---|---|---|
Product: | xorg | Reporter: | Mark Cave-Ayland <mark.cave-ayland> |
Component: | Driver/nouveau | Assignee: | Nouveau Project <nouveau> |
Status: | RESOLVED MOVED | QA Contact: | Xorg Project Team <xorg-team> |
Severity: | normal | ||
Priority: | high | CC: | ste.lendl |
Version: | git | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Attachments: |
Created attachment 69352 [details]
xorg.log
The output from lspci -vv is included below:

01:00.0 VGA compatible controller: NVIDIA Corporation GF119 [Quadro NVS 4200M] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: Lenovo Device 21d2
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 16
    Region 0: Memory at f2000000 (32-bit, non-prefetchable) [size=16M]
    Region 1: Memory at e0000000 (64-bit, prefetchable) [size=256M]
    Region 3: Memory at f0000000 (64-bit, prefetchable) [size=32M]
    Region 5: I/O ports at 4000 [size=128]
    Expansion ROM at f3080000 [disabled] [size=512K]
    Capabilities: [60] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000 Data: 0000
    Capabilities: [78] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Latency L0 <256ns, L1 <4us
            ClockPM+ Surprise- LLActRep- BwNot-
        LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis+
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
        LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
            EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [b4] Vendor Specific Information: Len=14 <?>
    Kernel driver in use: nouveau

Hmmm so this could be Optimus related? In the BIOS, the Display has 3 separate modes:

- Integrated (works, using Intel driver)
- Discrete (uses nouveau driver)
- Optimus

All my attempts so far to get this to work with nouveau have been using the "Discrete" setting - I don't really care about switching between cards on the fly, I'd just like to be able to use the external DisplayPort connector.

Okay it seems that the bug is in the control/initialisation of the DisplayPort. If I leave the DisplayPort connector disconnected, then I can start X using the nouveau driver on the in-built LVDS display. If I then connect the DisplayPort connector, it shows up in xrandr but trying to enable it results in the same "hang" as trying to boot with the DisplayPort cable connected.

I can try enabling various bits of debugging within the nouveau driver and report back the results if anyone can point me in the right direction?

Another experiment: if I reboot with the DisplayPort connected, but disconnect it during the boot process before X starts then X still hangs on startup. So perhaps this is a kernel module initialisation problem for DisplayPort?

Ah, sorry. I'm removing Optimus from bug title. Did it work correctly with previous kernels? 3.7 contains massive rewrite/rework of nouveau module, so it's possible it's a regression.

No - only the 3.2 stock wheezy kernel, and that would often freeze early when starting X and while it would find the DisplayPort it kept insisting that it wasn't connected. I could try building the latest 3.6 kernel if you think that would help?

Please do. Even without rewrites, Nouveau changes a lot between kernel revisions.
I've just tried with a 3.6.5 kernel built from source, and I see the same behaviour as 3.7-rc3 when the DisplayPort cable is connected - i.e. X hangs on startup if the cable is connected. Does that help at all?

Based upon the suggestions that this is a problem with the kernel module, I've now set up and built a full dev environment. With this in place, I've just done a build of git master (commit cd55d81fdb49c9113792c3b361c125e82091d6c4) from git://anongit.freedesktop.org/nouveau/linux-2.6 and unfortunately I see exactly the same problem as before :/

Incidentally can anyone explain what the following items in dmesg represent?

[ 26.204990] nouveau W[ PGRAPH][0000:01:00.0] disabled, PGRAPH=1 to enable
[ 29.873280] nouveau [ DRM] EvoCh: chid 1 mthd 0x0080 data 0x00000000 0x00005080 0x0000000d
[ 29.876287] [sched_delayed] sched: RT throttling activated
[ 33.870186] nouveau [ DRM] EvoCh: chid 2 mthd 0x0080 data 0x00000000 0x10005080 0x0000000d
[ 57.731616] nouveau [ DRM] EvoCh: chid 1 mthd 0x0080 data 0x00000000 0x10005080 0x0000000d
[ 73.691785] nouveau [ DRM] EvoCh: chid 1 mthd 0x0080 data 0x00000000 0x10005080 0x0000000d
[ 118.486621] nouveau W[ PGRAPH][0000:01:00.0] disabled, PGRAPH=1 to enable
[ 120.491687] nouveau [ DRM] EvoCh: chid 1 mthd 0x0080 data 0x00000000 0x10005080 0x0000000d
[ 127.921298] nouveau E[ DRM] evo 0 dma stalled

Are there any debugging kernel module parameters that can be enabled in order to provide more information?

I've seen some interesting patches go through the mailing list which I thought may be related to this bug, so I thought I'd update to 3.6.11-rc2 and see if there was any improvement. Unfortunately I am still unable to use the DisplayPort connector in order to drive an external monitor :(

Just to summarise for those who are interested:

1) My laptop is set so that it is forced to use the nVidia graphics adapter, i.e. the Optimus mode is disabled in the BIOS.
2) If I run Nouveau without anything plugged into the DisplayPort connector on my laptop, it runs fine.
3) If I connect an external monitor to the DisplayPort connector and reboot then X freezes upon startup, even before enabling the external display. Under 3.6.11-rc2 I see the following in dmesg:
   nouveau E[ PDISP][0000:01:00.0] chid 1 mthd 0x0080 data 0x00000000 0x00005080 0x0000000
4) If I now unplug the DisplayPort connector, I see the following in dmesg:
   nouveau E[ DRM] DDC responded, but no EDID for DP-1
5) If I now try and start X again, it still freezes but for less time than before.
6) Restarting the laptop without the DisplayPort connector plugged in is required in order for Nouveau to work on the in-built display once again.

Any further debugging information available on request. I have an up-to-date git tree to which I can temporarily apply diagnostic patches if required.

Many thanks, Mark.

NVD9 is a chip that has seen a bunch of fixes lately. I'd suggest trying 3.11-rc2 or the nouveau/master tree. [The two should be fairly close right now.] (Or was the "3.6.11-rc2" really meant to say "3.11-rc2"?)

Sigh. Yes what I really meant was 3.11-rc2 (nouveau/linux-2.6) master branch as of about 8 hours ago.

Many thanks, Mark.

I spent a little bit more time playing with the 3.11-rc2 kernel and I noticed a couple of things:

1) If I boot in standalone mode with no DisplayPort connector attached, xrandr shows that I have 3 DisplayPort outputs?! This is definitely not true!
2) Once booted in standalone mode, if I then proceed to plug in the DisplayPort connector then the kernel throws a backtrace - I'll upload this as a separate attachment.

My best guess is that the freeze is related to trying to enumerate these non-existent DisplayPort outputs when starting X. Whereabouts in the code are these outputs parsed/enumerated?

Created attachment 83450 [details]
3.11-rc2 backtrace when plugging in DP connector
3.11-rc2 backtrace when plugging in DP connector from standalone mode.
I wouldn't worry about that backtrace, it's related to something else I think.

Would you mind switching to nouveau/master? You can just add it as a remote to your current git setup, no need to check out a whole new tree.

Could you post a full dmesg from a nouveau/master kernel booted with "nouveau.debug=PDISP=debug"? The dmesg that you've already posted seems to suggest everything is broken (e.g. PDISP is disabled which is a very bad sign).

Also, to be clear, if you boot with DP plugged in, and start X, then you get a hang, right? Does the whole system hang, or just the display? Can you ssh in? If you can ssh in, what happens when you try to turn stuff off/on with xrandr? (Obviously with DISPLAY=:0 so as to connect to the laptop's X.)

(In reply to comment #16)
> I wouldn't worry about that backtrace, it's related to something else I
> think.

Okay, no worries :)

> Would you mind switching to nouveau/master? You can just add it as a remote
> to your current git setup, no need to check out a whole new tree.

Sure. My kernel git repository is exclusively set up for trying to get DP working so it's a clone of git://anongit.freedesktop.org/nouveau/linux-2.6. I'll do a pull on master branch and build that for my next set of tests.

> Could you post a full dmesg from a nouveau/master kernel booted with
> "nouveau.debug=PDISP=debug"? The dmesg that you've already posted seems to
> suggest everything is broken (e.g. PDISP is disabled which is a very bad
> sign).

Alright. I'm not back in the office until next week now, but as soon as I get a chance I'll try it and upload the dmesg output here.

> Also, to be clear, if you boot with DP plugged in, and start X, then you get
> a hang, right? Does the whole system hang, or just the display? Can you ssh
> in? If you can ssh in, what happens when you try to turn stuff off/on with
> xrandr? (Obviously with DISPLAY=:0 so as to connect to the laptop's X.)

AFAICT it's a full system hang - I haven't tried remote network access, but the keyboard is totally unresponsive, even to CTRL-ALT-F1 and friends until the driver times out after about 2-3mins. I'll give the remote access a try next week though.

Many thanks, Mark.

Hi Ilia,

I've just done a build and test against 3.11-rc3+ from nouveau/linux-2.6 git master which shows as the following commit:

commit 68ea5eed7ca275434468edff36b3b49e5f129213
Author: Ben Skeggs <bskeggs@redhat.com>
Date: Tue Jul 30 11:47:47 2013 +1000

    drm/nouveau/vm: make vm refcount into a kref

The result in terms of the hang is the same as before. Unfortunately I haven't had a chance to do the network test yet whilst in the hung state since my desktop switch is currently broken. If after reviewing the logs you still want me to try this, let me know and I'll try to work something out.

The nouveau module was given the parameter debug=PGRAPH=debug and I did two attempts: the first file 3.11-rc3-dmesg.plugged is a boot with the DP connector plugged in (bad, hangs) whilst the second file 3.11-rc3-dmesg.unplug is a boot without the DP connector plugged in (good, but standalone mode only).

I see there are some mentions of switcheroo in the logs which is interesting given that I'm currently running it in Discrete Graphics (nVidia mode) without Optimus enabled in the BIOS.

Many thanks, Mark.

Created attachment 83717 [details]
3.11-rc3-dmesg.plugged
3.11-rc3-dmesg.plugged
Created attachment 83718 [details]
3.11-rc3-dmesg.unplug
3.11-rc3-dmesg.unplug
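For anyone comparing the two logs attached above, here is a minimal sketch (not part of the original report) of how the nouveau warning and error lines could be pulled out of a saved dmesg dump. It assumes the log format quoted in this thread, where a severity letter such as E or W sits directly before the subsystem bracket (for example "nouveau E[ DRM]"); the function name nouveau_problems is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print only nouveau lines carrying a W (warning)
# or E (error) severity marker, e.g. "nouveau E[ DRM] evo 0 dma stalled".
# Reads the named files, or standard input when no file is given.
nouveau_problems() {
    grep -E 'nouveau +[EW]\[' "$@"
}

# Possible usage with the attachments from this report saved locally:
#   nouveau_problems 3.11-rc3-dmesg.plugged
#   nouveau_problems 3.11-rc3-dmesg.unplug
# or against the running kernel's log:
#   dmesg | nouveau_problems
```

Informational lines such as "nouveau [ DRM] EvoCh: ..." carry no severity letter in this format and are therefore skipped, so messages logged at that level would need a broader pattern.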
Hi Ilia,

I finally managed to get a working network connection on boot, and it looks as if remote SSH is working. So I asked a friend who is fluent in X to ssh in remotely and perform some tests :) Things we found were:

1) On boot dmesg shows "nouveau E[ DRM] DDC responded, but no EDID for DP-1" in the kernel logfile
2) xrandr -q --verbose works but very slowly; each line is noticeably lethargic when appearing on screen. This output shows 3 DP outputs, but there is only one DP connector on the laptop?! The physical ports on the laptop are:
   LVDS-1 - on-board LCD screen
   VGA-1 - external VGA output (possibly only wired to on-board Intel chip?)
   DP-1 - external DP output (wired to NVD9 chip only)
3) When manually trying to switch between the LVDS/DP-1 outputs with xrandr there was a backtrace in the logs from drm_mode_set_config_internal().

I'll attach the related files onto this bug report.

Many thanks, Mark.

Created attachment 83818 [details]
dmesg output from remote SSH session
Created attachment 83819 [details]
Xorg.log output from remote SSH session (hung)
Created attachment 83820 [details]
xrandr output from remote SSH session
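As a rough illustration of the remote test described above (a sketch only, not the commands actually run in that session), the connector list can be reduced to one line per output by filtering the output of xrandr -q. The function name list_outputs and the host name in the usage comment are hypothetical; the filter assumes the usual xrandr layout in which each output header line starts with the output name followed by "connected" or "disconnected".

```shell
#!/bin/sh
# Hypothetical helper: reduce `xrandr -q` output to "<output> <state>" pairs.
# Only header lines whose second field is "connected" or "disconnected"
# are kept; mode lines and the "Screen 0:" line are dropped.
list_outputs() {
    awk '$2 == "connected" || $2 == "disconnected" { print $1, $2 }' "$@"
}

# Possible usage from another machine (user@laptop is a placeholder):
#   ssh user@laptop 'DISPLAY=:0 xrandr -q' | list_outputs
```

A summary like this makes it easier to spot entries that do not correspond to a physical port, such as the extra DP outputs mentioned in the comment above.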
Hi all,

Sorry to prod an old bug once again, however as I'd seen some possibly relevant fixes go through the mailing list I thought I'd try the latest drm-nouveau-next checkout and see if my problem was fixed. Unfortunately it still doesn't work for me :(

On the plus side, I noticed that if I use a DisplayPort to HDMI cable to connect to a borrowed HD monitor (1920x1080) then nouveau now works as expected, which is a great milestone! However if I try and use my standard Dell HD monitor (2560x1440) with a direct DisplayPort cable then nouveau still doesn't work for me, with the same symptoms as described earlier in this bug report.

I noticed that there is quite a lot of verbose debugging in dmesg in the latest checkout, so I'll attach that below in the hope someone can work out why I can't use my Dell monitor with a direct DisplayPort connection.

Many thanks, Mark.

Created attachment 104876 [details]
nouveau nouveau-drm-next (3.15.0-rc8) dmesg output
Updated dmesg output from nouveau-drm-next (3.15.0-rc8) kernel
Following the discussions at bug #91333, I was interested in trying Ben's patch to see if this fixed my DisplayPort issue. I rebuilt a new kernel based upon origin/linux-4.1 and tested the behaviour both with and without the patch applied. Sadly the patch didn't work for me, but I have updated dmesg logs which I will add to this bug report shortly. Created attachment 117217 [details]
linux-4.1 unpatched dmesg grep nouveau
Created attachment 117218 [details]
linux-4.1 patched dmesg grep nouveau
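To compare the patched and unpatched logs attached above, one option is to strip the kernel timestamps first so that diff only reports lines whose content differs. The following is a minimal sketch under that assumption; the function name strip_timestamps and the file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: remove a leading "[ seconds.micros] " timestamp
# from each dmesg line so two boots can be compared line by line.
# Lines without a leading timestamp pass through unchanged.
strip_timestamps() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] ?//' "$@"
}

# Possible usage in bash (process substitution is not plain POSIX sh);
# unpatched.log and patched.log stand for the two saved attachments:
#   diff <(strip_timestamps unpatched.log) <(strip_timestamps patched.log)
```

Because timing varies from boot to boot, a plain diff of the raw logs would flag almost every line; removing the timestamps leaves only the real differences in driver messages.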
Updated dmesg logs have been added - it seems that newer kernels have much more debugging information included. The parts of the dmesg log that stand out to me are:

nouveau [ PDISP][0000:01:00.0] unknown intr24 0x80000000

and:

nouveau E[ PDISP][0000:01:00.0] chid 1 mthd 0x0080 data 0x00000000 0x00005080 0x0000000d

I also noticed this:

nouveau E[ VBIOS][0000:01:00.0] 0x5c2a[0]: script needs crtc

Just for the record, here is what happens with the latest linux-4.1 git repository under a Debian wheezy setup:

1) Reboot laptop into Lenovo BIOS, switch display over from Integrated to Discrete
2) Plug in DisplayPort cable
3) Reboot laptop once again
4) Initial kernel boot seems normal; kernel mode setting seems to work on integrated LVDS supply on boot as normal, i.e. screen blanks and then redraws at a higher resolution during boot with smaller fonts
5) The problem occurs when X starts. Both the integrated LVDS and DisplayPort screens go black and here everything hangs. The laptop is unresponsive. If I do a CTRL-ALT-F1 to switch back to a text console, it takes around 30s for the laptop to react and switch to the new TTY.
6) Once in the text console, the laptop seems responsive once again.

While the related bug report mentioned a problem with the cursor under GNOME, I'm actually using KDE (kdm) provided as standard with Debian Wheezy.

I have the same problem on a T410 with an NVS 3100M. When I boot, gdm tries to start and a mouse cursor appears, but the display stays black. Logging in through a tty and running startx works fine! I have a display connected via DisplayPort...
dmesg | ag "(drm|nouveau)"
[ 1.326854] ata1.00: supports DRM functions and may not be fully accessible
[ 1.328990] ata1.00: supports DRM functions and may not be fully accessible
[ 2.707846] [drm] Initialized drm 1.1.0 20060810
[ 4.153679] nouveau [ DEVICE][0000:01:00.0] BOOT0 : 0x0a8600a2
[ 4.153684] nouveau [ DEVICE][0000:01:00.0] Chipset: GT218 (NVA8)
[ 4.153686] nouveau [ DEVICE][0000:01:00.0] Family : NV50
[ 4.193558] nouveau [ VBIOS][0000:01:00.0] using image from PRAMIN
[ 4.193704] nouveau [ VBIOS][0000:01:00.0] BIT signature found
[ 4.193708] nouveau [ VBIOS][0000:01:00.0] version 70.18.4e.00.00
[ 4.194400] nouveau [ PMC][0000:01:00.0] MSI interrupts enabled
[ 4.194453] nouveau [ PFB][0000:01:00.0] RAM type: DDR3
[ 4.194455] nouveau [ PFB][0000:01:00.0] RAM size: 512 MiB
[ 4.194457] nouveau [ PFB][0000:01:00.0] ZCOMP: 960 tags
[ 4.197564] nouveau [ VOLT][0000:01:00.0] GPU voltage: 1000000uv
[ 4.238954] nouveau [ PTHERM][0000:01:00.0] FAN control: none / external
[ 4.238974] nouveau [ PTHERM][0000:01:00.0] fan management: automatic
[ 4.238980] nouveau [ PTHERM][0000:01:00.0] internal sensor: yes
[ 4.258996] nouveau [ CLK][0000:01:00.0] 03: core 135 MHz shader 270 MHz memory 135 MHz
[ 4.259001] nouveau [ CLK][0000:01:00.0] 07: core 405 MHz shader 810 MHz memory 405 MHz
[ 4.259006] nouveau [ CLK][0000:01:00.0] 0f: core 606 MHz shader 1468 MHz memory 790 MHz
[ 4.259081] nouveau [ CLK][0000:01:00.0] --: core 405 MHz shader 810 MHz memory 405 MHz
[ 4.259516] nouveau [ DRM] VRAM: 512 MiB
[ 4.259518] nouveau [ DRM] GART: 1048576 MiB
[ 4.259524] nouveau [ DRM] TMDS table version 2.0
[ 4.259526] nouveau [ DRM] DCB version 4.0
[ 4.259529] nouveau [ DRM] DCB outp 00: 01800323 00010034
[ 4.259534] nouveau [ DRM] DCB outp 01: 02811300 00000000
[ 4.259536] nouveau [ DRM] DCB outp 02: 028223a6 0f220010
[ 4.259538] nouveau [ DRM] DCB outp 03: 02822362 00020010
[ 4.259540] nouveau [ DRM] DCB outp 04: 048333b6 0f220010
[ 4.259542] nouveau [ DRM] DCB outp 05: 04833372 00020010
[ 4.259543] nouveau [ DRM] DCB outp 06: 088443c6 0f220010
[ 4.259545] nouveau [ DRM] DCB outp 07: 08844382 00020010
[ 4.259547] nouveau [ DRM] DCB conn 00: 00000040
[ 4.259549] nouveau [ DRM] DCB conn 01: 00000100
[ 4.259551] nouveau [ DRM] DCB conn 02: 00101246
[ 4.259553] nouveau [ DRM] DCB conn 03: 00202346
[ 4.259555] nouveau [ DRM] DCB conn 04: 00410446
[ 4.290527] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 4.290531] [drm] Driver supports precise vblank timestamp query.
[ 4.324628] nouveau [ DRM] MM: using COPY for buffer copies
[ 5.617924] nouveau [ DRM] allocated 1920x1200 fb: 0x70000, bo ffff88022edbe400
[ 5.618256] fbcon: nouveaufb (fb0) is primary device
[ 5.622086] nouveau E[ PDISP][0000:01:00.0] INVALID_VALUE [UNK84] chid 0 mthd 0x0828 data 0x0000a5c5
[ 5.729431] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
[ 5.729434] nouveau 0000:01:00.0: registered panic notifier
[ 5.743721] [drm] Initialized nouveau 1.2.2 20120801 for 0000:01:00.0 on minor 0

lspci -k | grep -A 2 -E "(VGA|3D)"
01:00.0 VGA compatible controller: NVIDIA Corporation GT218M [NVS 3100M] (rev a2)
    Subsystem: Lenovo ThinkPad T410
    Kernel driver in use: nouveau

I am running Debian Testing with nouveau installed from the repo:

Package: xserver-xorg-video-nouveau
Version: 1:1.0.11-1+b1
Maintainer: Debian X Strike Force <debian-x@lists.debian.org>
Architecture: amd64

Stefan, what kernel version are you using? I was looking to test a 4.3+ kernel as I believe the nouveau code has had a significant rewrite, in the hope that this would resolve this issue. Also do you see any difference when using a DP to HDMI cable to connect to an HDMI monitor instead?

Ah yes, of course.. missed some info ^^ I am running Kernel 4.2.0 (Debian Testing).

To clarify my display setup: besides my laptop screen I have a DVI monitor and a DisplayPort->HDMI cable to an AV receiver for a projector. DP and DVI are always plugged in but only two monitors can be active at the same time -> Nvidia's HW limitations.. I have not tested any other settings with monitors [not] plugged in.

I switched back to nvidia-driver for now. I can switch to nouveau for further testing if necessary..

Thank you for your reply

Debian updated to linux 4.3 now and I decided to try nouveau again. The problem comes from having three displays attached when the graphics card can only drive two! (one being a projector I don't always use) When I boot the error shows up; when I then unplug one of the screens, the login (gdm3) immediately shows up.

Is it possible to ignore a display on startup while still allowing it to be turned on or reconfigured afterwards? Or to define a set of displays to be considered at startup?

Thank you. Stefan

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/30.
Created attachment 69351 [details]
dmesg

Hi all,

I am trying to use the discrete (nVidia) graphics card option on my T420s in order to use the DisplayPort connector, and am currently unable to start X under Debian Wheezy. Under advice from IRC, I've upgraded Wheezy with the following components:

Linux kernel 3.7-rc3 from kernel.org
libdrm-2.4.39
xf86-video-nouveau git (commit 8c3e1623b0be15f8cc590d893bfd19be87bd079a)

However I am still unable to start X with the new driver in place. When starting X, the fbconsole remains visible on screen and then "hangs" for nearly a minute before the login TTY appears again.

There are some messages emitted in Xorg.log and dmesg which are attached to this bug report, along with the output of lspci.

Many thanks, Mark.