Bug 104291

Summary: Virgl broken on current mesa (17.4+), with nouveau/nv50 as host
Product: Mesa
Reporter: Andrew Randrianasulu <randrik>
Component: Drivers/DRI/nouveau
Assignee: Nouveau Project <nouveau>
Status: RESOLVED FIXED
QA Contact: Nouveau Project <nouveau>
Severity: normal
Priority: medium
CC: fdsfgs
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Attachments: glxinfo -l from inside guest
gdb log
slightly better gdb log
X log (from guest, of course)
apitrace

Description Andrew Randrianasulu 2017-12-16 17:14:04 UTC
Hello.

Sorry for a possibly irrelevant bug report, but I just tested virgl and found it broken: glxinfo works, but glxgears or any other GL program segfaults.

On both host and guest (self-assembled live DVD) I have Mesa 17.4.0-devel (git-96fc5fbf23) and X server 1.19.5. Host kernel: Linux slax 4.14.3-x64 #18 SMP Sun Dec 3 09:13:07 MSK 2017 x86_64 AMD FX(tm)-4300 Quad-Core Processor AuthenticAMD GNU/Linux. Guest kernel: 4.12.0-x64.

Qemu - 2.11. Launch command and output:

qemu-system-x86_64 -M q35 -enable-kvm -cdrom /dev/shm/slax_16_12_2017_test.iso -m 512 -display sdl,gl=on -soundhw es1370 -smp 4 -vga virtio -usbdevice mouse
qemu-system-x86_64: -usbdevice mouse: '-usbdevice' is deprecated, please use '-device usb-...' instead
gl_version 33 - core profile enabled

startx inside guest works, glxinfo shows virgl as renderer, but glxgears or any gl program (chromium BSU [old arcade game], glblur from xscreensavers) segfaults.

At some point it was working better, but that was long ago (16.02.2016, to be precise).

I also ran into an assertion on the qemu side, described here:
https://lists.gnu.org/archive/html/qemu-devel/2017-04/msg05461.html

Because I had only a DVD image connected as a DVD-ROM, nothing bad happened, but be aware of this abort if you connect a real hard disk.
Comment 1 Andrew Randrianasulu 2017-12-16 17:15:34 UTC
One more comment: while both kernels are 64-bit, the whole userspace on both guest and host consists of 32-bit binaries. This may play some role.
Comment 2 Ilia Mirkin 2017-12-16 17:18:18 UTC
If you can capture an apitrace of qemu that replays incorrectly on nouveau but correctly on other hardware, then this is a nouveau bug.

If virgl / virglrenderer code crashes (or generates an incorrect GL operation sequence), then it's a virgl bug.
Comment 3 Andrew Randrianasulu 2017-12-16 17:23:52 UTC
Created attachment 136219 [details]
glxinfo -l from inside guest
Comment 4 Andrew Randrianasulu 2017-12-16 17:27:59 UTC
Created attachment 136220 [details]
gdb log

Unfortunately, I stripped debug symbols for smaller file sizes. I can recompile mesa with debug symbols, but it will take some time.
Comment 5 Ilia Mirkin 2017-12-16 17:31:09 UTC
If it crashes inside the guest, then it's nothing to do with nouveau. Most likely.
Comment 6 Andrew Randrianasulu 2017-12-16 17:35:12 UTC
(In reply to Ilia Mirkin from comment #5)
> If it crashes inside the guest, then it's nothing to do with nouveau. Most
> likely.

Yes, but I was unable to find virgl as a specific component in Bugzilla, so I just filed the bug under nouveau. Don't feel blamed. (Also, after switching virtual desktops the whole image inside qemu went completely black, so I killed the VM.)
Comment 7 Andrew Randrianasulu 2017-12-16 17:43:49 UTC
It also doesn't work with LIBGL_ALWAYS_SOFTWARE=1 (llvmpipe). So, yes, not a nouveau bug.
Comment 8 Andrew Randrianasulu 2017-12-16 17:55:43 UTC
Created attachment 136221 [details]
slightly better gdb log
Comment 9 Andrew Randrianasulu 2017-12-16 18:12:23 UTC
Created attachment 136222 [details]
X log (from guest, of course)
Comment 10 Andrew Randrianasulu 2017-12-16 18:28:17 UTC
Created attachment 136223 [details]
apitrace
Comment 11 Andrew Randrianasulu 2017-12-19 22:37:43 UTC
I tried a few older Mesa releases (17.1.10 without LLVM; 17.2.7 and 17.3.0 with LLVM 5.0.0), and they all segfault :/

This is kind of strange, because according to this bug report it was working in the 17.1-dev era:

https://bugzilla.redhat.com/show_bug.cgi?id=1426549

I tried disabling PageFlipping (not sure if it makes any sense for a VM) and switching the DRI mode to dri3. Nothing helped :/
Comment 12 Andrew Randrianasulu 2018-04-30 11:59:38 UTC
Sorry, it works now (Mesa 18.2.0-devel (git-96ed3714fc) for both guest and host). Apparently, I just forgot to mount /dev/shm in the guest's initscripts, so dri3 failed.
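
For anyone hitting the same symptom, a quick check of whether /dev/shm is a tmpfs mount inside the guest might look like the sketch below. This is only an illustration and not the script used in this report: the function name shm_is_tmpfs and its optional mounts-file argument are hypothetical, and the suggested mount options are an assumption about a typical setup.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): succeeds if the given
# mounts table (default: /proc/mounts) lists /dev/shm as a tmpfs mount.
shm_is_tmpfs() {
    mounts_file="${1:-/proc/mounts}"
    # /proc/mounts fields: device, mount point, fs type, options, dump, pass
    awk '$2 == "/dev/shm" && $3 == "tmpfs" { found = 1 } END { exit !found }' "$mounts_file"
}

if shm_is_tmpfs; then
    echo "/dev/shm is mounted as tmpfs"
else
    # Only print a suggestion; mounting requires root and belongs in the
    # guest's init scripts. The options shown are an assumed typical choice.
    echo "/dev/shm is not a tmpfs mount; as root, try:"
    echo "  mount -t tmpfs -o mode=1777 tmpfs /dev/shm"
fi
```

The function only reads the mounts table, so it is safe to run as an unprivileged user; the actual mount is left as a printed suggestion.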
