Bug 94643

Summary: 4K Display on Displayport goes black when pointer hits column 0
Product: xorg Reporter: Robert Schöftner <rmu>
Component: Driver/nouveau Assignee: Nouveau Project <nouveau>
Status: NEW --- QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Attachments:
Description Flags
Xorg.0.log none

Description Robert Schöftner 2016-03-21 08:04:29 UTC
I have a very strange problem with a recently purchased ViewSonic 4K display connected via DisplayPort to a Thinkpad 410 with an NVIDIA GT218M (NVS 3100M).

Every time the mouse pointer touches the left edge of the screen, the display becomes a block of solid color and freezes. Suspend/resume of the laptop reanimates the display.

Versions: X server 1.17.2, Linux kernel 4.5.0, nouveau 1.0.11 and 1.0.12. Both screens are active; it also happens with only the external monitor active, though that configuration is very unstable. It does not matter whether the external monitor is configured as "left", "right" or "top" relative to the internal one; I did not try "bottom". System: Ubuntu 14.04, with the Linux kernel from the mainline PPA; the X server and related packages are the lts-wily variant.

Workaround: with xbarrier 1 0 1 2160 in place, the pointer can't reach column 0 and everything seems stable.
Comment 1 Robert Schöftner 2016-03-21 17:02:14 UTC
Created attachment 122461 [details]
Xorg.0.log
Comment 2 Robert Schöftner 2016-03-21 17:10:27 UTC
Clarifications: only the external monitor goes "solid color"; LVDS-1 keeps working. It does not happen with a 2560×1440 screen connected via DisplayPort.
Comment 3 Robert Schöftner 2016-04-04 12:19:43 UTC
I just managed to hit this bug again. This time the external display went gray and the keyboard was dead. The internal display was not "damaged" but no longer updated, except for the mouse pointer, which could still be moved.

syslog:

Apr  4 14:06:24 boltzmann kernel: [346097.973285] nouveau 0000:01:00.0: Xorg[2270]: failed to idle channel 3 [Xorg[2270]]
Apr  4 14:06:24 boltzmann kernel: [346097.973475] nouveau 0000:01:00.0: fb: trapped read at 002001e020 on channel 3 [0fb30000 Xorg[2270]] engine 0c [SEMAPHORE_BG] client 08 [PFIFO_READ] subclient 00 [] reason 00000002 [PAGE_NOT_PRESENT]
Apr  4 14:06:24 boltzmann kernel: [346097.973505] nouveau 0000:01:00.0: gr: TRAP_DISPATCH (unknown 00000004)
Apr  4 14:06:24 boltzmann kernel: [346097.973519] nouveau 0000:01:00.0: fb: trapped read at 0000000400 on channel 3 [0fb30000 Xorg[2270]] engine 00 [PGRAPH] client 03 [DISPATCH] subclient 00 [GRCTX] reason 0000000f [DMAOBJ_LIMIT]
Apr  4 14:06:24 boltzmann kernel: [346097.973548] nouveau 0000:01:00.0: gr: TRAP_DISPATCH (unknown 00000008)
Apr  4 14:06:24 boltzmann kernel: [346097.973591] nouveau 0000:01:00.0: fb: trapped read at 0000015400 on channel 3 [0fb30000 Xorg[2270]] engine 00 [PGRAPH] client 03 [DISPATCH] subclient 00 [GRCTX] reason 0000000f [DMAOBJ_LIMIT]
Apr  4 14:06:24 boltzmann kernel: [346097.973619] nouveau 0000:01:00.0: gr: TRAP_DISPATCH (unknown 00000008)
Apr  4 14:06:24 boltzmann kernel: [346097.973625] nouveau 0000:01:00.0: gr: 00200030 [ILLEGAL_MTHD ILLEGAL_CLASS] ch 3 [000fb30000 Xorg[2270]] subc 2 class 0000 mthd 0230 data 000000cf
Apr  4 14:06:24 boltzmann kernel: [346097.973646] nouveau 0000:01:00.0: fb: trapped read at 0000045600 on channel 3 [0fb30000 Xorg[2270]] engine 00 [PGRAPH] client 03 [DISPATCH] subclient 00 [GRCTX] reason 0000000f [DMAOBJ_LIMIT]
Apr  4 14:06:24 boltzmann kernel: [346097.973666] nouveau 0000:01:00.0: gr: 00000030 [ILLEGAL_MTHD ILLEGAL_CLASS] ch 3 [000fb30000 Xorg[2270]] subc 2 class 0000 mthd 0234 data 00000000
Apr  4 14:06:24 boltzmann kernel: [346097.973684] nouveau 0000:01:00.0: gr: 00000030 [ILLEGAL_MTHD ILLEGAL_CLASS] ch 3 [000fb30000 Xorg[2270]] subc 2 class 0000 mthd 0238 data 00000040
Apr  4 14:06:24 boltzmann kernel: [346097.973708] nouveau 0000:01:00.0: gr: 00000030 [ILLEGAL_MTHD ILLEGAL_CLASS] ch 3 [000fb30000 Xorg[2270]] subc 2 class 0000 mthd 023c data 00000001
Apr  4 14:06:24 boltzmann kernel: [346097.973734] nouveau 0000:01:00.0: gr: 00000030 [ILLEGAL_MTHD ILLEGAL_CLASS] ch 3 [000fb30000 Xorg[2270]] subc 2 class 0000 mthd 0240 data 00000000
Apr  4 14:06:24 boltzmann kernel: [346097.973759] nouveau 0000:01:00.0: gr: 00000030 [ILLEGAL_MTHD ILLEGAL_CLASS] ch 3 [000fb30000 Xorg[2270]] subc 2 class 0000 mthd 0248 data 00000915

[repeats like that with different mthd/data]
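
For triage, a long run of such lines can be condensed into the sequence of rejected methods plus a count of the memory-fault reasons. The following is only a minimal sketch that assumes the exact line layout shown in the excerpt above; the names GR_RE, FB_RE and summarize are made up for illustration and are not part of nouveau or any existing tool.

```python
import re
from collections import Counter

# Matches the "gr: ... [ILLEGAL_MTHD ...] ... mthd XXXX data YYYYYYYY" lines.
# Assumption: the layout is exactly as in the syslog excerpt above.
GR_RE = re.compile(
    r"nouveau \S+: gr: [0-9a-f]+ \[(?P<errors>[A-Z_ ]+)\] ch (?P<ch>\d+) .*"
    r"subc (?P<subc>\d+) class (?P<cls>[0-9a-f]+) "
    r"mthd (?P<mthd>[0-9a-f]+) data (?P<data>[0-9a-f]+)"
)

# Matches the "fb: trapped read at ... reason NNNNNNNN [NAME]" lines.
FB_RE = re.compile(
    r"nouveau \S+: fb: trapped (?P<op>read|write) at (?P<addr>[0-9a-f]+) "
    r".*reason [0-9a-f]+ \[(?P<reason>[A-Z_]+)\]"
)


def summarize(lines):
    """Return (methods, reasons) for an iterable of syslog lines.

    methods: list of (mthd, data) hex strings, in order of appearance
    reasons: Counter of fault reason names from the "fb:" lines
    Lines that match neither pattern (e.g. TRAP_DISPATCH) are ignored.
    """
    methods = []
    reasons = Counter()
    for line in lines:
        m = GR_RE.search(line)
        if m:
            methods.append((m["mthd"], m["data"]))
            continue
        m = FB_RE.search(line)
        if m:
            reasons[m["reason"]] += 1
    return methods, reasons
```

Passing the lines of the log file to summarize() gives the method/data pairs in order, which makes it easier to compare two occurrences of the hang than reading the raw log.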

I killed Xorg with -9 and could reboot cleanly (via ssh).

Any hints how to debug this?
Comment 4 Karol Herbst 2016-04-04 13:19:41 UTC
Ohhh, I think in the end it is just an invalid memory access caused by some strange off-by-one problem, and then the GPU gets a bit upset.
Comment 5 Robert Schöftner 2016-04-04 14:39:42 UTC
I already had a quick look for "obvious" off-by-one errors in linux/drivers/gpu/drm/nouveau, following nv50_crtc_cursor_move, but of course I didn't find any.

Usually the GPU does not go completely bonkers: the LVDS-connected display of the laptop continues to display a screen and usually remains completely usable.

I also tried enabling "SWcursor", but that leads to a completely black screen and an unusable X, even with only the internal display, which I should probably report as a separate bug.

If somebody has an idea where to look, I'm quite capable of compiling and testing patched nouveau kernel and userspace drivers.

Would something like mmiotrace be helpful or a waste of time?
Comment 6 Robert Schöftner 2016-04-28 13:44:38 UTC
The proprietary NVIDIA 340.96 driver on Ubuntu xenial with the stock distro kernel works as expected, with no pointer-related strangeness, so the hardware should be capable.

Nonetheless, this driver is not a viable option, as suspend to RAM, switching virtual terminals and other random stuff segfault in libGL etc.
