Bug 32190 - Tons of "[drm:i915_gem_do_execbuffer] *ERROR* Failed to pin buffer"
Summary: Tons of "[drm:i915_gem_do_execbuffer] *ERROR* Failed to pin buffer"
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Chris Wilson
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-12-07 10:56 UTC by sergio.callegari
Modified: 2010-12-09 04:33 UTC
CC List: 0 users

See Also:
i915 platform:
i915 features:


Attachments

Description sergio.callegari 2010-12-07 10:56:49 UTC
On Linux (Ubuntu maverick 64bit with KDE, kernel 2.6.35, xorg 1.9.0, intel drivers updated to 2.13.901+git20101124 via the glasen PPA), on a Dell E6500 laptop with Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07), I am experiencing general instability of the graphics stack, particularly in conjunction with the following conditions:

1) compositing on + suspend/resume. At resume the screen is completely black, and objects only flash into view when the mouse passes over them.

2) compositing on + start/stop of the screensaver (same as above)

3) window movement (occasional freeze of the graphics stack)

4) attachment of an external screen when xrandr is used to set it not to clone the primary one (on the external screen the image is ok, on the primary one the screen gets divided in two regions halfway. The top one shows what should be the bottom of the screen, the bottom one shows garble).

5) screen saver starting when a secondary screen is attached and set not to clone the primary screen (occasional freeze of the graphics stack when the screensaver is stopped).

None of the above happens reproducibly, only occasionally, though quite frequently. At the same time, I am seeing tons of the following in dmesg:

[drm:i915_gem_do_execbuffer] *ERROR* Failed to pin buffer 16 of 32, total 3555328 bytes, 31 fences: -28
[   36.405962] [drm:i915_gem_do_execbuffer] *ERROR* 690 objects [23 pinned], 65089536 object bytes [22482944 pinned], 22548480/234881024 gtt bytes
[   36.953175] [drm:i915_gem_do_execbuffer] *ERROR* Failed to pin buffer 16 of 31, total 3489792 bytes, 30 fences: -28
[   36.953180] [drm:i915_gem_do_execbuffer] *ERROR* 736 objects [23 pinned], 67301376 object bytes [22482944 pinned], 22548480/234881024 gtt bytes
[   37.905765] [drm:i915_gem_do_execbuffer] *ERROR* Failed to pin buffer 16 of 32, total 3555328 bytes, 31 fences: -28
[   37.905770] [drm:i915_gem_do_execbuffer] *ERROR* 768 objects [23 pinned], 69349376 object bytes [22482944 pinned], 22548480/234881024 gtt bytes

I wonder if the two things can be related.
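
For anyone triaging a similar log, the following is a minimal sketch (not part of the original report) of how the two kinds of lines above can be summarized. The function name `summarize` and the regular expressions are illustrative assumptions based only on the message format shown in this report. The error value -28 corresponds to ENOSPC, and the second line of each pair reports used/total GTT bytes, so the sketch also computes how full the GTT was reported to be.

```python
import errno
import re

# Two lines copied verbatim from the dmesg excerpt in this report.
SAMPLE = """\
[   36.953175] [drm:i915_gem_do_execbuffer] *ERROR* Failed to pin buffer 16 of 31, total 3489792 bytes, 30 fences: -28
[   36.953180] [drm:i915_gem_do_execbuffer] *ERROR* 736 objects [23 pinned], 67301376 object bytes [22482944 pinned], 22548480/234881024 gtt bytes
"""

# Patterns are assumptions derived from the message text above.
PIN_RE = re.compile(
    r"Failed to pin buffer (\d+) of (\d+), total (\d+) bytes, (\d+) fences: (-?\d+)"
)
GTT_RE = re.compile(r"(\d+)/(\d+) gtt bytes")


def summarize(log_text):
    """Return (pin_failures, gtt_usage_fractions) parsed from the log text."""
    failures = []
    usage = []
    for line in log_text.splitlines():
        m = PIN_RE.search(line)
        if m:
            buf, count, total, fences, err = map(int, m.groups())
            failures.append({
                "buffer": buf,
                "count": count,
                "total_bytes": total,
                "fences": fences,
                # The kernel logs a negative errno; map it to its symbolic name.
                "errno_name": errno.errorcode.get(-err, "unknown"),
            })
            continue
        m = GTT_RE.search(line)
        if m:
            used, size = map(int, m.groups())
            usage.append(used / size)
    return failures, usage


failures, usage = summarize(SAMPLE)
for f in failures:
    print(f["errno_name"], "with", f["fences"], "fences requested")
for u in usage:
    print("GTT reported {:.1%} full".format(u))
```

In these samples the GTT is reported as mostly empty while a large number of fences is requested, which suggests looking at fence usage rather than at plain aperture exhaustion.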
Comment 1 Chris Wilson 2010-12-07 11:33:28 UTC
Which mesa and drm?
Comment 2 sergio.callegari 2010-12-07 12:22:05 UTC
Here is the information you asked for:

Mesa is 7.9~git20100924-0ubuntu2
libdrm is 2.4.22+git20101204

both from the glasen PPA.
Comment 3 Chris Wilson 2010-12-07 12:33:10 UTC
Ok, that libdrm is doing something the kernel doesn't expect. Naughty kernel for trusting userspace. Naughty userspace for confusing the kernel.

Have fixed the kernel already, but I shouldn't release a libdrm that won't function on an older kernel...
Comment 4 Chris Wilson 2010-12-07 12:36:21 UTC
commit 537703fd4805e9cd352965fce642670986822d22
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Dec 7 20:34:22 2010 +0000

    intel: Reorder need_fence vs fenced_command to avoid fences on gen4
    
    gen4+ hardware doesn't use fences for GPU access and the older kernel
    doesn't expect userspace to make such a mistake. So don't.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32190
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
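
The ordering issue named in the commit title can be illustrated with a small model. This is only a sketch under stated assumptions, not the libdrm code: the function names `flags_old_order` and `flags_new_order` are hypothetical, and the "old" ordering is inferred from the commit title rather than taken from the diff. The idea is that `fenced_command` records whether a relocation counts towards the fences requested from the kernel, so it should be derived only after the gen4+ check has cleared `need_fence`.

```python
I915_TILING_NONE = 0  # numeric value used by the i915 tiling interface


def flags_old_order(gen, tiling_mode, need_fence):
    """Assumed pre-fix ordering: fenced_command is captured too early."""
    fenced_command = need_fence          # still True on gen4+ here
    if tiling_mode == I915_TILING_NONE:  # untiled buffers never need a fence
        need_fence = False
    if gen >= 4:                         # gen4+ does not use fences for GPU access
        need_fence = False
    return need_fence, fenced_command


def flags_new_order(gen, tiling_mode, need_fence):
    """Assumed post-fix ordering: the gen check comes first."""
    if gen >= 4:                         # gen4+ does not use fences for GPU access
        need_fence = False
    fenced_command = need_fence          # now False on gen4+
    if tiling_mode == I915_TILING_NONE:  # untiled buffers never need a fence
        need_fence = False
    return need_fence, fenced_command


# Tiled buffer (tiling mode 1) on gen4, where the caller asked for a fence.
print(flags_old_order(4, 1, True))
print(flags_new_order(4, 1, True))
```

With the old ordering in this model, a gen4 relocation is still counted as a fenced command even though no fence is needed, which would be consistent with the large fence counts in the error messages above.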
Comment 5 sergio.callegari 2010-12-08 02:49:53 UTC
Thank you for the amazingly quick action!

I see that you have already committed the fix to the master branch of the git repo, so the Ubuntu PPA by Stefan Glasenhardt can pick it up soon, making many users happy!
Comment 6 sergio.callegari 2010-12-08 03:33:31 UTC
I guess that the PPA maintainer has been extra fast too, since I have just received a new libdrm deb, and since then no more "pinning" errors appear in dmesg.

However, I have noticed the following

[drm:i915_gem_do_execbuffer] *ERROR* Object ffff8800b0b08c00 appears more than once in object list

(though it happens much less frequently than the previous error)

Should I open a new bug for it?

Also, while checking just now, I noticed:

[drm] MTRR allocation failed.  Graphics performance may suffer

From what I have found on the internet, this should not be important, but please let me know if I should report this too, and whether the appropriate destination is userspace or kernel.
Comment 7 Chris Wilson 2010-12-08 03:43:46 UTC
(In reply to comment #6)
> I guess that the PPA maintainer has been extra fast too, since I have just
> received a new libdrm deb, and since then no more "pinning" errors appear in
> dmesg.
> 
> However, I have noticed the following
> 
> [drm:i915_gem_do_execbuffer] *ERROR* Object ffff8800b0b08c00 appears more than
> once in object list

I've an open bug that mentions that. Never seen it myself and it should be impossible...

> (though it happens much less frequently than the previous error)
> 
> Should I open a new bug for it?
> 
> Also, while checking just now, I noticed:
> 
> [drm] MTRR allocation failed.  Graphics performance may suffer
> 
> From what I have found on the internet, this should not be important, but
> please let me know if I should report this too, and whether the appropriate
> destination is userspace or kernel.

On gen4 you can reasonably expect to have PAT, which supersedes MTRR. The conflicting MTRR configuration is still worrying, but irrelevant here.
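
One way to check the PAT point on a given machine is to look for the `pat` flag in the `flags` line of `/proc/cpuinfo`. The helper below is a minimal sketch with a hypothetical name (`cpu_has_flag`). It only shows that the CPU advertises PAT, not that the running kernel is actually using it.

```python
def cpu_has_flag(cpuinfo_text, flag):
    """Return True if `flag` is listed in the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Flags are whitespace-separated tokens; match whole tokens only.
            return flag in value.split()
    return False


# Shortened, made-up cpuinfo excerpt for illustration only.
example = "processor\t: 0\nflags\t\t: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36\n"
print(cpu_has_flag(example, "pat"))
```

On a real system the text would come from reading `/proc/cpuinfo`, for example `cpu_has_flag(open("/proc/cpuinfo").read(), "pat")`.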
Comment 8 sergio.callegari 2010-12-08 04:21:27 UTC
> [drm:i915_gem_do_execbuffer] *ERROR* Object ffff8800b0b08c00 appears more than
> once in object list

> I've an open bug that mentions that. Never seen it myself and it should be
> impossible...

I have confirmed it on bug 31584 (I hope it is the right one), so if you need further information, my case is documented there too.
Comment 9 sergio.callegari 2010-12-09 04:33:39 UTC
Some checks of the messages log make me think that the "impossible" (namely the object appearing more than once) happens quite reproducibly in suspend/restore cycles, right after the restore.

