Bug 88514 - X segfaults when using prime offloading to nouveau card
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Nouveau Project
QA Contact: Xorg Project Team
Reported: 2015-01-16 22:45 UTC by aidan
Modified: 2015-01-26 10:36 UTC


Attachments
demsg from crash. (72.33 KB, text/plain)
2015-01-16 22:45 UTC, aidan
x log from crash. (26.23 KB, text/plain)
2015-01-16 22:46 UTC, aidan
backtrace from crash (1.67 KB, text/plain)
2015-01-17 00:57 UTC, aidan

Description aidan 2015-01-16 22:45:44 UTC
Created attachment 112362
demsg from crash.

I am running Arch Linux. I used to be able to use my discrete graphics card for games using the method described below, but that stopped working a while ago, I believe after an update of nouveau and/or the kernel.

I have:
00:02.0 VGA compatible controller: Intel Corporation 4th Gen Core Processor Integrated Graphics Controller (rev 06)
01:00.0 VGA compatible controller: NVIDIA Corporation GK107M [GeForce GT 750M] (rev a1) with nouveau drivers.

Steps to reproduce:
I run xrandr --setprovideroffloadsink nouveau Intel at the startup of my desktop (Cinnamon). Then I run DRI_PRIME=1 glxinfo (or any other OpenGL application I've tried, such as glxgears or Civ V), and X crashes and brings me back to my display manager (LightDM).

Is there any other info you need, and/or did I report this issue completely wrong?
Comment 1 aidan 2015-01-16 22:46:24 UTC
Created attachment 112363
x log from crash.
Comment 2 Ilia Mirkin 2015-01-16 23:27:01 UTC
While the situation is less-than-ideal, I don't see how nouveau is implicated. Accel is disabled because you have the "pgraph is off and we can't figure out how to turn it on" issue (there's already a bug open about that, no known resolution). Perhaps you can get debug symbols to see where it's dying?
Comment 3 aidan 2015-01-16 23:54:50 UTC
Could you point me towards the bug report you mentioned? I couldn't find it.
Comment 4 Ilia Mirkin 2015-01-16 23:59:25 UTC
https://bugs.freedesktop.org/show_bug.cgi?id=70354

From the backtrace, the crash appears to be happening in core X though... but perhaps as a result of something nouveau or intel is doing wrong.
Comment 5 Tobias Klausmann 2015-01-17 00:40:51 UTC
(In reply to Ilia Mirkin from comment #4)
> https://bugs.freedesktop.org/show_bug.cgi?id=70354
> 
> From the backtrace, the crash appears to be happening in core X though...
> but perhaps as a result of something nouveau or intel is doing wrong.

That seems likely to be the case. Try removing xf86-video-nouveau or blacklisting nouveau_drv.so to let nouveau fall back to modesetting. Alternatively, you could try the DRI3 variant of prime offloading: http://nouveau.freedesktop.org/wiki/Optimus/

That won't fix the actual bug but may help to work around it.
Comment 6 aidan 2015-01-17 00:57:22 UTC
Created attachment 112372
backtrace from crash

I quickly compiled X with debug symbols, and this is what I got for a backtrace.
Comment 7 aidan 2015-01-17 01:01:05 UTC
(In reply to Tobias Klausmann from comment #5)
> (In reply to Ilia Mirkin from comment #4)
> > https://bugs.freedesktop.org/show_bug.cgi?id=70354
> > 
> > From the backtrace, the crash appears to be happening in core X though...
> > but perhaps as a result of something nouveau or intel is doing wrong.
> 
> That seems likely to be the case. Try removing xf86-video-nouveau or
> blacklisting nouveau_drv.so to let nouveau fall back to modesetting. Or you
> could alternatively try the DRI3 variant of prime offloading:
> http://nouveau.freedesktop.org/wiki/Optimus/
> 
> That won't fix the actual bug but may help to work around this.

I might try one of these solutions. A couple of questions about DRI3: how do I check whether X and Mesa have DRI3 support? Also, what setup do I need? Do I just use DRI_PRIME=1, without running xrandr --setprovideroffloadsink (as I would for DRI2)?
Comment 8 Tobias Klausmann 2015-01-17 01:13:23 UTC
(In reply to aidan from comment #7)
(snip)
> I might try one of these solutions.  A couple questions for DRI3: how do I
> check if X and mesa have DRI3 support?  Also, what setup do I need to do? 
> Do I just need to use DRI_PRIME=1, without using xrandr
> --setprovideroffloadsink (like I would for DRI2)?

You'll need the setup described on nouveau's prime page, linked above. Other than that, you just have to run:
DRI_PRIME=1 app
and it will offload to nouveau
Comment 9 Ilia Mirkin 2015-01-17 01:23:08 UTC
(In reply to Tobias Klausmann from comment #8)
> (In reply to aidan from comment #7)
> (snip)
> > I might try one of these solutions.  A couple questions for DRI3: how do I
> > check if X and mesa have DRI3 support?  Also, what setup do I need to do? 
> > Do I just need to use DRI_PRIME=1, without using xrandr
> > --setprovideroffloadsink (like I would for DRI2)?
> 
> You'll need the setup described on nouveau's prime page, linked above.
> Other than that, you just have to run:
> DRI_PRIME=1 app
> and it will offload to nouveau

Except it won't work because we can't get accel going on his card... this bug should just be about the crash.
Comment 10 Chris Wilson 2015-01-17 09:56:04 UTC
Right, the bug is just that we attempted to dereference a DRI2ScreenPtr on a GPU screen for which DRI2 was never initialised (due to NoAccel). So we have 2 bugs:

1. The code should be more robust and only add an offload_slave if DRI2 was enabled on that Screen (or be prepared for lies).

2. Nouveau shouldn't be claiming to be an offload slave if it doesn't support offloading (due to NoAccel).
Comment 11 Chris Wilson 2015-01-17 10:03:04 UTC
(In reply to Chris Wilson from comment #10)
> 2. Nouveau shouldn't be claiming to be an offload slave if it doesn't
> support offloading (due to NoAccel).

Except offloading may be either DRI2 or DRI3, so this doesn't make sense.
Comment 12 Chris Wilson 2015-01-17 10:05:41 UTC
diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
index 8b94b8f..e048224 100644
--- a/hw/xfree86/dri2/dri2.c
+++ b/hw/xfree86/dri2/dri2.c
@@ -158,6 +158,9 @@ GetScreenPrime(ScreenPtr master, int prime_id)
         DRI2ScreenPtr ds;
 
         ds = DRI2GetScreen(slave);
+        if (ds == NULL)
+            continue;
+
         if (ds->prime_id == prime_id)
             return slave;
     }
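
For readers who want to see the guard in isolation, the following is a minimal standalone C sketch of the pattern the patch above applies. It does not use the real X server headers: MockScreen, MockDRI2Screen, mock_dri2_get_screen and mock_get_screen_prime are hypothetical stand-ins, a plain array replaces the server's offload screen list, and returning the master when nothing matches is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-screen DRI2 private data. */
typedef struct {
    int prime_id;
} MockDRI2Screen;

/* Hypothetical stand-in for a screen. dri2 is NULL when DRI2 was never
 * initialised on that screen (for example because of NoAccel). */
typedef struct {
    const char *name;
    MockDRI2Screen *dri2;
} MockScreen;

/* Plays the role of DRI2GetScreen() in the patch: may return NULL. */
static MockDRI2Screen *mock_dri2_get_screen(const MockScreen *screen)
{
    return screen->dri2;
}

/* Find the offload screen whose DRI2 prime_id matches. Screens without
 * DRI2 data are skipped instead of being dereferenced. Falling back to
 * the master is an assumption made for this sketch. */
const MockScreen *mock_get_screen_prime(const MockScreen *master,
                                        const MockScreen *slaves,
                                        size_t n_slaves, int prime_id)
{
    assert(master != NULL);

    if (prime_id == 0)
        return master;

    for (size_t i = 0; i < n_slaves; i++) {
        MockDRI2Screen *ds = mock_dri2_get_screen(&slaves[i]);

        if (ds == NULL)      /* the check added by the patch */
            continue;

        if (ds->prime_id == prime_id)
            return &slaves[i];
    }
    return master;
}
```

Given a slave list whose first entry has dri2 == NULL (as on a NoAccel GPU screen) and whose second entry has prime_id 1, a lookup for prime_id 1 skips the first entry and returns the second rather than dereferencing a null pointer.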
Comment 13 Chris Wilson 2015-01-26 10:36:01 UTC
commit 082931014811e587a9734cbf4d88fd948979b641
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Jan 17 10:09:54 2015 +0000

    dri2: SourceOffloads may be for DRI3 only
    
    As a DDX may declare offload support without supporting DRI2
    (because it is using an alternative acceleration mechanism like DRI3),
    when iterating the list of offload_source Screens to find a matching
    DRI2 provider we need to check before assuming it is DRI2 capable.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88514
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Dave Airlie <airlied@redhat.com>
    Signed-off-by: Keith Packard <keithp@keithp.com>

