Bug 91632 - Assert in nouveau_pushbuf_data
Summary: Assert in nouveau_pushbuf_data
Status: NEW
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/nouveau
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Nouveau Project
QA Contact: Nouveau Project
Reported: 2015-08-14 08:26 UTC by Allan Sandfeld
Modified: 2017-05-01 12:21 UTC (History)

Attachments
gdb backtrace for qupzilla (40.16 KB, text/plain)
2016-07-30 19:08 UTC, Francesco Turco
backtrace with symbol tables (147.00 KB, text/plain)
2016-08-02 20:17 UTC, Andrés Becerra Sandoval

Description Allan Sandfeld 2015-08-14 08:26:09 UTC
I have been looking into why QtWebEngine crashes with the nouveau driver, but the crashes appear to be assertion failures in the driver, not in our code.

Any idea what could trigger it?


#3  0x00007ffff0f032d2 in __GI___assert_fail (assertion=assertion@entry=0x7fffe1d77aa9 "kref", 
    file=file@entry=0x7fffe1d77a8a "../../nouveau/pushbuf.c", line=line@entry=726, 
    function=function@entry=0x7fffe1d77ad0 <__PRETTY_FUNCTION__.6213> "nouveau_pushbuf_data") at assert.c:101
#4  0x00007fffe1d76284 in nouveau_pushbuf_data (push=push@entry=0x598d40, bo=0x5ace30, offset=115624, length=20)
    at ../../nouveau/pushbuf.c:726
#5  0x00007fffe1d761cb in nouveau_pushbuf_data (push=push@entry=0x598d40, bo=bo@entry=0x0, offset=offset@entry=0, 
    length=length@entry=0) at ../../nouveau/pushbuf.c:718
#6  0x00007fffe1d762ea in pushbuf_submit (push=push@entry=0x598d40, chan=<optimized out>, chan=<optimized out>)
    at ../../nouveau/pushbuf.c:329
#7  0x00007fffe1d7656e in pushbuf_flush (push=push@entry=0x598d40) at ../../nouveau/pushbuf.c:404
#8  0x00007fffe1d77180 in nouveau_pushbuf_kick (push=0x598d40, chan=<optimized out>) at ../../nouveau/pushbuf.c:778
#9  0x00007fffe246d766 in ?? () from /usr/lib/x86_64-linux-gnu/dri/nouveau_dri.so
#10 0x00007fffe2182024 in ?? () from /usr/lib/x86_64-linux-gnu/dri/nouveau_dri.so
<snip>
#15 0x00007ffff1cda345 in glXMakeCurrentReadSGI () from /usr/lib/x86_64-linux-gnu/libGL.so.1
Comment 1 Ilia Mirkin 2015-08-24 13:24:40 UTC
I've seen that a few times... My analysis was that there's a fixed number of GEM objects and we're running out of them. Which means that the GPU is either locked up or way way WAY behind. However I haven't confirmed any part of this... just based on reading the code. In my case, the GPU was, in fact, locking up.
Comment 2 Allan Sandfeld 2015-08-24 13:46:42 UTC
The original qtwebengine bug is at https://bugreports.qt.io/browse/QTBUG-41242

One interesting thing I discovered was that using the EGL/GLES mode instead of GLX/GL solved the issue. I assume at your level it shouldn't be that different. So it could be a difference in how EGL vs GLX acts.
Comment 3 Ilia Mirkin 2015-08-24 13:55:18 UTC
(In reply to Allan Sandfeld from comment #2)
> The original qtwebengine bug is at
> https://bugreports.qt.io/browse/QTBUG-41242
> 
> One interesting thing I discovered was that using the EGL/GLES mode instead
> of GLX/GL solved the issue. I assume at your level it shouldn't be that
> different. So it could be a difference in how EGL vs GLX acts.

The issue was happening for me with Unigine Heaven run with DRI3 + DRI_PRIME=1. Then I started running it with vblank_mode=0 and all my problems disappeared. I haven't checked whether removing vblank_mode=0 would cause the issues to reappear. I was happy to blame it on the DRI3 boogeyman and moved on (I tend not to have a good grasp on these types of issues in the first place... normally I wouldn't touch DRI3 but it really makes nouveau dev quite a bit easier with a primary non-nvidia adapter since I can load/unload nouveau/nvidia at will and still have things display on my regular screen).

Not sure how vsync would cause the problems, but I was getting actual shader traps reported in dmesg. Do you (or whoever is having the issue) have that as well?
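
For anyone who wants to repeat the comparison above, here is a minimal sketch of a launcher that sets the two Mesa environment variables mentioned (`vblank_mode` and `DRI_PRIME`) for a single child process. The helper names `mesa_env` and `run_with_mesa_env` are made up for this illustration, and the program to launch is whatever you pass in; nothing here is part of Mesa or nouveau.

```python
import os
import subprocess


def mesa_env(vblank_mode=None, dri_prime=None, base=None):
    """Return a copy of the environment with the requested Mesa variables set.

    `base` defaults to the current process environment; pass a dict to
    start from something else. Variables left as None are not touched.
    """
    env = dict(os.environ if base is None else base)
    if vblank_mode is not None:
        env["vblank_mode"] = str(vblank_mode)   # 0 disables sync to vblank
    if dri_prime is not None:
        env["DRI_PRIME"] = str(dri_prime)       # 1 selects the secondary GPU
    return env


def run_with_mesa_env(argv, **kwargs):
    """Run `argv` (a list of strings) with the adjusted environment.

    Returns the child's exit status. The caller's own environment is
    left unchanged.
    """
    return subprocess.run(argv, env=mesa_env(**kwargs)).returncode
```

Running the same application once with `vblank_mode=0` and once without it, and checking dmesg after each run, should show whether the setting really makes a difference.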
Comment 4 Tomasz Paweł Gajc 2016-06-28 16:03:30 UTC
Hi,

Any news on this issue?

I'm running a fairly fresh system (OpenMandriva Lx 3.0) with:
Qt 5.6.1
Mesa 12.0-rc4
libdrm 2.4.68
kernel 4.62

gfx card nv92

and Qupzilla 2.01 crashes for me with this error 


[tpg@lazur ~]$ LC_ALL=C qupzilla
QupZilla: 0 extensions loaded
Uncaught TypeError: Cannot read property 'restoreData' of null
nouveau: kernel rejected pushbuf: No such file or directory
nouveau: ch10: krec 0 pushes 0 bufs 1 relocs 0
nouveau: ch10: buf 00000000 00000002 00000004 00000004 00000000
nouveau: kernel rejected pushbuf: No such file or directory
nouveau: ch10: krec 0 pushes 0 bufs 1 relocs 0
nouveau: ch10: buf 00000000 00000002 00000004 00000004 00000000
ATTENTION: default value of option force_s3tc_enable overridden by environment.
QupZilla: Starting with profile 'default'
QupZilla: 0 extensions loaded
nouveau: kernel rejected pushbuf: No such file or directory
nouveau: ch11: krec 0 pushes 0 bufs 2 relocs 0
nouveau: ch11: buf 00000000 00000002 00000004 00000004 00000000
nouveau: ch11: buf 00000001 00000006 00000004 00000000 00000004
nouveau: kernel rejected pushbuf: No such file or directory
nouveau: ch11: krec 0 pushes 0 bufs 1 relocs 0
nouveau: ch11: buf 00000000 00000002 00000004 00000004 00000000
qupzilla: pushbuf.c:727: void nouveau_pushbuf_data(struct nouveau_pushbuf *, struct nouveau_bo *, uint64_t, uint64_t): Assertion `kref' failed.
Comment 5 Francesco Turco 2016-07-30 19:07:59 UTC
I can't start the Qupzilla web browser because of a crash, and the problem seems to be related to nouveau. I can run qupzilla just fine if I use the --disable-gpu command-line option. Please see https://github.com/QupZilla/qupzilla/issues/2046 for the original bug I submitted. I'm going to attach the qupzilla gdb backtrace. I'm using mesa-12.0.1-5 on a Parabola GNU/Linux-libre system.
Comment 6 Francesco Turco 2016-07-30 19:08:32 UTC
Created attachment 125441 [details]
gdb backtrace for qupzilla
Comment 7 Andrés Becerra Sandoval 2016-08-02 20:17:56 UTC
Created attachment 125493 [details]
backtrace with symbol tables

- Made with gdb with the command: thread apply all bt full
- Packages versions (on a Gentoo system):
www-client/qupzilla-2.0.1
x11-drivers/xf86-video-nouveau-1.0.12
media-libs/mesa-12.0.1
x11-libs/libdrm-2.4.70
dev-qt/qtwebengine-5.6.1
Comment 8 Ilia Mirkin 2016-08-02 23:09:54 UTC
Unfortunately "this" issue is a lot of potential issues. The current leading cause of such errors, assuming an up-to-date libdrm/etc, is multithreading in the application. Nouveau does not handle this well.

I have a branch which tries to address this:

https://github.com/imirkin/mesa/commits/locking

However it's not quite there yet (although it's a good part of the way there), and needs a rethink on some of the core mechanisms used. Unfortunately the libdrm_nouveau API is really not well-suited to multithreading: kicks can end up resulting implicitly from all kinds of seemingly innocuous actions.
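
To illustrate the general idea of serializing access to a shared command buffer, here is a toy sketch in Python. It is not the libdrm_nouveau API and not necessarily how the locking branch works: `PushBuffer`, `LockedContext` and `run_two_threads` are hypothetical names invented for this example. Two threads queue commands into one buffer, and a single lock covers each whole emit-and-kick sequence so that one thread's kick can never submit a half-written batch from the other.

```python
import threading


class PushBuffer:
    """Toy stand-in for a command buffer that is unsafe to share unguarded."""

    def __init__(self):
        self.pending = []     # commands queued since the last kick
        self.submitted = []   # batches handed over so far

    def emit(self, command):
        self.pending.append(command)

    def kick(self):
        # Hand over everything queued so far as one batch.
        batch, self.pending = self.pending, []
        if batch:
            self.submitted.append(batch)


class LockedContext:
    """Serializes every emit-and-kick sequence behind one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._push = PushBuffer()

    def draw(self, owner, count):
        # The lock is held across the whole sequence, not per call.
        with self._lock:
            for i in range(count):
                self._push.emit((owner, i))
            self._push.kick()

    def batches(self):
        with self._lock:
            return list(self._push.submitted)


def run_two_threads(count=100):
    """Run a 'render' and a 'composite' thread against one context."""
    ctx = LockedContext()
    threads = [
        threading.Thread(target=ctx.draw, args=(name, count))
        for name in ("render", "composite")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ctx.batches()
```

With the lock in place, every submitted batch contains commands from exactly one thread. The hard part in a real driver is that a kick can be triggered implicitly, so every path that might kick has to sit inside the same critical section.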
Comment 9 Reuben 2016-08-03 04:15:45 UTC
(In reply to Ilia Mirkin from comment #8)
> Unfortunately "this" issue is a lot of potential issues. The current leading
> cause of such errors, assuming an up-to-date libdrm/etc, is multithreading
> in the application. Nouveau does not handle this well.
> 

This may not be "all the way there", but I can confirm a significantly more stable experience with qtwebengine.
Comment 10 Allan Sandfeld 2016-08-03 08:25:36 UTC
As a matter of fact QtWebEngine does use multithreading. Chromium renders in one thread, and Qt composites in another.
Comment 11 Tomasz Paweł Gajc 2016-12-10 14:44:58 UTC
Looks like this is a well known issue, see related bug for more info.
Comment 12 Andrey 2017-05-01 12:21:55 UTC
Same here with qupzilla. But Chromium is working fine.

