Bug 92515

Summary: Virtualbox with 3D acceleration and with installed vbox-additions crashes
Product: Mesa Reporter: Tomasz Paweł Gajc <tpgxyz>
Component: Drivers/DRI/nouveauAssignee: Nouveau Project <nouveau>
Status: RESOLVED MOVED QA Contact: Nouveau Project <nouveau>
Severity: normal    
Priority: medium    
Version: 10.6   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Bug Depends on: 92077    
Bug Blocks:    

Description Tomasz Paweł Gajc 2015-10-17 23:35:42 UTC
I'm running this setup:

libdrm:
 lib64drm_nouveau2-2.4.65-0.1-omv2014.0.x86_64
Mesa:
 lib64gl1-10.5.9-0.1-omv2014.0.x86_64
kernel:
 kernel-nrjQL-desktop-4.1.9-1omv-1-1-omv2014.0.x86_64
Vbox:
 virtualbox-5.0.6-0.2-omv2014.0.x86_64


My vbox setup have enabled 3D acceleration support and have installed vbox-additions. This are needed for moder toolkit like Qt5 which heavily uses 3D acceleration.

Unfortunately my vbox crashes, or applications inside vbox does not work or segfaults.

I found these in my host journal:

paź 18 01:07:01 lazur kernel: ieee80211 phy0: rt61pci_txdone: Warning - TX status report missed for entry 25
paź 18 01:08:06 lazur kernel: HPET: Using timer above configured range: 3
paź 18 01:08:08 lazur kernel: PS2K: Selected scan set 2
paź 18 01:08:14 lazur kernel: HPET: Using timer above configured range: 3
paź 18 01:08:14 lazur kernel: HPET: Using timer above configured range: 3
paź 18 01:08:15 lazur kernel: HPET: Using timer above configured range: 3
paź 18 01:12:35 lazur kernel: nouveau E[   PFIFO][0000:01:00.0] DMA_PUSHER - ch 5 [VBoxCrWinCmd[4504]] get 0x002002f2d0 put 0x002002f914 ib_get 0x000000d3 ib_put 0x000000d4 state 0x8000a204 (e... push 0x00406040
paź 18 01:12:35 lazur kernel: nouveau E[VBoxCrWinCmd[4504]] push 1 buffer not in list
paź 18 01:12:35 lazur kernel: nouveau E[VBoxCrWinCmd[4504]] push 1 buffer not in list
paź 18 01:12:35 lazur kernel: nouveau E[   PFIFO][0000:01:00.0] DMA_PUSHER - ch 5 [VBoxCrWinCmd[4504]] get 0x00200455a4 put 0x0020045690 ib_get 0x00000208 ib_put 0x0000022b state 0x8000005c (e... push 0x00406040
paź 18 01:12:35 lazur kernel: ShCrOpenGL[4503]: segfault at 8a0 ip 00007fa8e55998fc sp 00007fa8e58717b0 error 4
paź 18 01:12:35 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR 
paź 18 01:12:35 lazur kernel: INVALID_BITFIELD
paź 18 01:12:35 lazur kernel: 
paź 18 01:12:35 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 5 [0x001f764000 VBoxCrWinCmd[4504]] subc 4 class 0x502d mthd 0x0200 data 0x00047454
paź 18 01:12:35 lazur kernel:  in libdrm_nouveau.so.2.0.0[7fa8e5597000+6000]
paź 18 01:12:57 lazur kernel: vboxdrv: ffffffffc0764020 VMMR0.r0
paź 18 01:12:57 lazur kernel: vboxdrv: ffffffffc085d020 VBoxDDR0.r0
paź 18 01:12:57 lazur kernel: vboxdrv: ffffffffc007d020 VBoxDD2R0.r0
paź 18 01:12:59 lazur kernel: PS2K: Selected scan set 2
paź 18 01:13:13 lazur kernel: HPET: Using timer above configured range: 3
paź 18 01:13:13 lazur kernel: HPET: Using timer above configured range: 3
paź 18 01:13:14 lazur kernel: HPET: Using timer above configured range: 3
paź 18 01:15:27 lazur kernel: vboxdrv: ffffffffc087b020 VMMR0.r0
paź 18 01:15:27 lazur kernel: vboxdrv: ffffffffc0974020 VBoxDDR0.r0
paź 18 01:15:27 lazur kernel: vboxdrv: ffffffffc0098020 VBoxDD2R0.r0
paź 18 01:15:29 lazur kernel: PS2K: Selected scan set 2
paź 18 01:15:33 lazur kernel: HPET: Using timer above configured range: 3
paź 18 01:15:33 lazur kernel: HPET: Using timer above configured range: 3
paź 18 01:15:34 lazur kernel: HPET: Using timer above configured range: 3
paź 18 01:16:27 lazur kernel: nouveau E[   PFIFO][0000:01:00.0] DMA_PUSHER - ch 5 [VBoxCrWinCmd[6947]] get 0x002018cacc put 0x002018cb34 ib_get 0x0000028d ib_put 0x000002ac state 0x80007b10 (e... push 0x00406040
paź 18 01:16:27 lazur kernel: nouveau E[   PFIFO][0000:01:00.0] DMA_PUSHER - ch 5 [VBoxCrWinCmd[6947]] get 0x002018cb3c put 0x002018cbb0 ib_get 0x0000028f ib_put 0x000002ac state 0x80000024 (e... push 0x00406040
paź 18 01:16:27 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_BITFIELD
paź 18 01:16:27 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 5 [0x001f764000 VBoxCrWinCmd[6947]] subc 3 class 0x8297 mthd 0x1b00 data 0x07800000
paź 18 01:16:27 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_OPERATION
paź 18 01:16:27 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 5 [0x001f764000 VBoxCrWinCmd[6947]] subc 3 class 0x8297 mthd 0x1b0c data 0x07800000
paź 18 01:17:19 lazur kernel: nouveau E[   PFIFO][0000:01:00.0] DMA_PUSHER - ch 5 [VBoxCrWinCmd[6947]] get 0x0020081ff4 put 0x0020082ae0 ib_get 0x0000020e ib_put 0x0000020f state 0x80008228 (e... push 0x00406040
paź 18 01:17:19 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_VALUE
paź 18 01:17:19 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 5 [0x001f764000 VBoxCrWinCmd[6947]] subc 4 class 0x502d mthd 0x021c data 0xffffffff
paź 18 01:17:19 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_BITFIELD
paź 18 01:17:19 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 5 [0x001f764000 VBoxCrWinCmd[6947]] subc 4 class 0x502d mthd 0x0220 data 0xffffffff
paź 18 01:17:19 lazur kernel: nouveau E[   PFIFO][0000:01:00.0] DMA_PUSHER - ch 5 [VBoxCrWinCmd[6947]] get 0x002009d600 put 0x002009ded4 ib_get 0x000003cb ib_put 0x000003cc state 0x80000000 (e... push 0x00406040
paź 18 01:17:19 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_VALUE
paź 18 01:17:19 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 5 [0x001f764000 VBoxCrWinCmd[6947]] subc 4 class 0x502d mthd 0x0838 data 0x03b60000
paź 18 01:17:19 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_VALUE
paź 18 01:17:19 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 5 [0x001f764000 VBoxCrWinCmd[6947]] subc 4 class 0x502d mthd 0x083c data 0x000c6a0c
paź 18 01:17:19 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_VALUE
paź 18 01:17:19 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 5 [0x001f764000 VBoxCrWinCmd[6947]] subc 4 class 0x502d mthd 0x0844 data 0x43ed8000
paź 18 01:17:19 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_VALUE
paź 18 01:17:19 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 5 [0x001f764000 VBoxCrWinCmd[6947]] subc 4 class 0x502d mthd 0x084c data 0x000c6a00
paź 18 01:17:19 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_VALUE
paź 18 01:17:19 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 5 [0x001f764000 VBoxCrWinCmd[6947]] subc 4 class 0x502d mthd 0x0854 data 0xc3ed8000
paź 18 01:17:19 lazur kernel: nouveau E[   PFIFO][0000:01:00.0] DMA_PUSHER - ch 5 [VBoxCrWinCmd[6947]] get 0x00200a4528 put 0x00200a46c4 ib_get 0x0000003a ib_put 0x0000003b state 0x80006f05 (e... push 0x00406040
paź 18 01:17:19 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_BITFIELD
paź 18 01:17:19 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 5 [0x001f764000 VBoxCrWinCmd[6947]] subc 3 class 0x8297 mthd 0x0910 data 0x000472e8
paź 18 01:18:54 lazur kernel: nouveau E[   PFIFO][0000:01:00.0] DMA_PUSHER - ch 5 [VBoxCrWinCmd[6947]] get 0x0020188d4c put 0x00201898f4 ib_get 0x00000203 ib_put 0x00000204 state 0x80000000 (e... push 0x00406040
paź 18 01:18:54 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_ENUM
paź 18 01:18:54 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 5 [0x001f764000 VBoxCrWinCmd[6947]] subc 3 class 0x8297 mthd 0x191c data 0x41c851ec
paź 18 01:18:54 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_ENUM
paź 18 01:18:54 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 5 [0x001f764000 VBoxCrWinCmd[6947]] subc 3 class 0x8297 mthd 0x1920 data 0x42780000
paź 18 01:22:23 lazur kernel: vboxdrv: ffffffffc0992020 VMMR0.r0
paź 18 01:22:23 lazur kernel: vboxdrv: ffffffffc0a8b020 VBoxDDR0.r0
paź 18 01:22:23 lazur kernel: vboxdrv: ffffffffc0159020 VBoxDD2R0.r0
paź 18 01:22:24 lazur kernel: PS2K: Selected scan set 2
paź 18 01:22:36 lazur kernel: vboxdrv: ffffffffc0aa9020 VMMR0.r0
paź 18 01:22:36 lazur kernel: vboxdrv: ffffffffc0ba2020 VBoxDDR0.r0
paź 18 01:22:36 lazur kernel: vboxdrv: ffffffffc01fb020 VBoxDD2R0.r0
paź 18 01:22:37 lazur kernel: PS2K: Selected scan set 2
paź 18 01:22:41 lazur kernel: HPET: Using timer above configured range: 3
paź 18 01:22:41 lazur kernel: HPET: Using timer above configured range: 3
paź 18 01:22:42 lazur kernel: HPET: Using timer above configured range: 3
paź 18 01:24:49 lazur kernel: vboxdrv: ffffffffc0764020 VMMR0.r0
paź 18 01:24:49 lazur kernel: vboxdrv: ffffffffc085d020 VBoxDDR0.r0
paź 18 01:24:49 lazur kernel: vboxdrv: ffffffffc007d020 VBoxDD2R0.r0
paź 18 01:24:51 lazur kernel: PS2K: Selected scan set 2
paź 18 01:25:11 lazur kernel: HPET: Using timer above configured range: 3
paź 18 01:25:11 lazur kernel: HPET: Using timer above configured range: 3
paź 18 01:25:12 lazur kernel: HPET: Using timer above configured range: 3
paź 18 01:26:12 lazur kernel: nouveau E[VBoxCrWinCmd[7184]] multiple instances of buffer 30 on validation list
paź 18 01:26:12 lazur kernel: nouveau E[VBoxCrWinCmd[7184]] validate_init
paź 18 01:26:12 lazur kernel: nouveau E[VBoxCrWinCmd[7184]] validate: -22
paź 18 01:26:12 lazur kernel: nouveau E[   PFIFO][0000:01:00.0] DMA_PUSHER - ch 5 [VBoxCrWinCmd[7184]] get 0x002003f5dc put 0x002003fd84 ib_get 0x0000019e ib_put 0x000001c5 state 0x80007220 (e... push 0x00406040
paź 18 01:26:12 lazur kernel: nouveau E[   PFIFO][0000:01:00.0] DMA_PUSHER - ch 5 [VBoxCrWinCmd[7184]] get 0x0020e41344 put 0x0020e41370 ib_get 0x0000019f ib_put 0x000001c5 state 0x80000000 (e... push 0x00406040
paź 18 01:26:12 lazur kernel: nouveau E[   PFIFO][0000:01:00.0] DMA_PUSHER - ch 5 [VBoxCrWinCmd[7184]] get 0x002003ffd4 put 0x002003ffd8 ib_get 0x000001a0 ib_put 0x000001c5 state 0x80007b10 (e... push 0x00406040
paź 18 01:26:12 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_VALUE
paź 18 01:26:12 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 5 [0x001f764000 VBoxCrWinCmd[7184]] subc 3 class 0x8297 mthd 0x1b04 data 0x000475ec
paź 18 01:26:12 lazur kernel: nouveau E[   PFIFO][0000:01:00.0] DMA_PUSHER - ch 5 [VBoxCrWinCmd[7184]] get 0x0020379744 put 0x002037974c ib_get 0x000001a1 ib_put 0x000001c5 state 0x80000000 (e... push 0x00406040
paź 18 01:26:12 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_BITFIELD
paź 18 01:26:12 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 5 [0x001f764000 VBoxCrWinCmd[7184]] subc 3 class 0x8297 mthd 0x1b0c data 0x000375f
Comment 1 Tomasz Paweł Gajc 2015-10-17 23:38:52 UTC
I'm running this with quite old GeForce 9800 GT
Comment 2 Ilia Mirkin 2015-10-17 23:41:17 UTC
Does vbox have multiple contexts concurrently active in multiple threads? That's the only thing I can think of that would cause these types of errors.

If you search around, the 406040 thing is something that's been plaguing nv50 for quite some time, with no progress made on figuring out how an invalid command is making it into the fifo stream.
Comment 3 Tomasz Paweł Gajc 2015-10-19 07:53:28 UTC
(In reply to Ilia Mirkin from comment #2)
> Does vbox have multiple contexts concurrently active in multiple threads?

Ilia it is hard for me to give you and answer, because simply i do not understand what you are asking me for.

How can i check this ?
Comment 4 Tomasz Paweł Gajc 2015-12-13 21:31:15 UTC
I'm still having this issue with mesa-10.6.9 and kernel-4.1.12 and libdrm-2.4.65 with vbox-5.0.10

I'm trying to run Plasma 5 inside vbox. Nowadays Plasma 5 and SDDM uses heavily OGL features throug Qt5Quick. Tricks like QT_XCB_FORCE_SOFTWARE_OPENGL=1 or LIBGL_ALWAYS_SOFTWARE=1 gives black screen.

Below is the output from host system when , in start Plasma 5 inside vbox with 3D accel enabled.

gru 13 22:20:12 lazur kernel: vboxdrv: ffffffffc087c020 VMMR0.r0
gru 13 22:20:12 lazur kernel: vboxdrv: ffffffffc0975020 VBoxDDR0.r0
gru 13 22:20:12 lazur kernel: vboxdrv: ffffffffc007d020 VBoxDD2R0.r0
gru 13 22:20:12 lazur kernel: PS2K: Selected scan set 2
gru 13 22:20:21 lazur kernel: HPET: Using timer above configured range: 3
gru 13 22:20:21 lazur kernel: HPET: Using timer above configured range: 3
gru 13 22:20:21 lazur kernel: HPET: Using timer above configured range: 3
gru 13 22:24:08 lazur kernel: nouveau E[   PFIFO][0000:01:00.0] DMA_PUSHER - ch 6 [VBoxCrWinCmd[26409]] get 0x002001cf60 put 0x002001d6f4 ib_get 0x00000045 ib_put 0x00000046 state 0x40000004 (... push 0x00406040
gru 13 22:24:08 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_VALUE
gru 13 22:24:08 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 6 [0x001f5dd000 VBoxCrWinCmd[26409]] subc 4 class 0x502d mthd 0x0214 data 0x00086e04
gru 13 22:24:08 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_VALUE
gru 13 22:24:08 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 6 [0x001f5dd000 VBoxCrWinCmd[26409]] subc 4 class 0x502d mthd 0x0218 data 0x02800000
gru 13 22:24:08 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_VALUE
gru 13 22:24:08 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 6 [0x001f5dd000 VBoxCrWinCmd[26409]] subc 4 class 0x502d mthd 0x021c data 0x01e00000
gru 13 22:24:08 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] DATA_ERROR INVALID_BITFIELD
gru 13 22:24:08 lazur kernel: nouveau E[  PGRAPH][0000:01:00.0] ch 6 [0x001f5dd000 VBoxCrWinCmd[26409]] subc 4 class 0x502d mthd 0x0220 data 0x00086e14
gru 13 22:24:08 lazur kernel: nouveau E[   PFIFO][0000:01:00.0] CACHE_ERROR - ch 6 [VBoxCrWinCmd[26409]] subc 0 mthd 0x0000 data 0x00086e24
gru 13 22:26:05 lazur kernel: vboxdrv: ffffffffc0993020 VMMR0.r0
gru 13 22:26:06 lazur kernel: vboxdrv: ffffffffc0a8c020 VBoxDDR0.r0
gru 13 22:26:06 lazur kernel: vboxdrv: ffffffffc0098020 VBoxDD2R0.r0
Comment 5 Ilia Mirkin 2015-12-13 21:43:35 UTC
I think it's been pretty much established that vbox uses opengl from multiple threads at the same time. This is likely to lead to crashes with nouveau... no amount of upgrading will fix this. [However it should be noted that 10.6 series was end-of-lifed a long time ago... 11.0.x series is the supported one, with 11.1 coming out soon.]

Resolving this issue is very low on my todo list. Right now I'm aware of two applications this affects -- vbox, a proprietary closed source app which includes a kernel module that taints the kernel, and warsow 2.0, which just came out, and includes a workaround to disable the multithreading. And this wouldn't matter, but it's also a giant pain to fix in a way that doesn't break perf for 99.9999% of the users.
Comment 6 Tomasz Paweł Gajc 2016-12-10 14:36:42 UTC
Looks like this is related to broken multi-threading in nouveau, see linked bugs.
Comment 7 GitLab Migration User 2019-09-18 20:41:36 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1088.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.