Bug 92911

Summary: xorg resource leak and crash in Intel Ivybridge
Product: DRI
Reporter: Nikolay <mar.kolya>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: David.Biesack, intel-gfx-bugs, nemesis
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
  xorg log (flags: none)
  crash stack trace (flags: none)

Description Nikolay 2015-11-12 02:11:36 UTC
Created attachment 119581 [details]
xorg log

X session crashes after some use, several times a day under normal desktop usage.

I'll attach the xorg log.

I've noticed one more strange thing that is possibly related. The output of this command:

cat /proc/{xorg pid}/maps  | grep mm | wc -l

(this looks for lines like this: 7fd313cc8000-7fd313d09000 rw-s 00000000 00:05 41728                      /drm mm object (deleted))

goes into the tens of thousands. It seems to grow faster when a video application (Google Hangouts) is in use. Stopping the browser (and Hangouts) doesn't seem to release those mmaps.

The frequency of the crashes seems to be roughly correlated with the time spent in Hangouts, so it looks like the crashes and the unbounded growth in the number of mmaps are related.

This all happens with the current oibaf build on Ubuntu Trusty (Mint). Rolling back to the vivid LTS packages seems to keep the number of mmapped regions below 150, with no crashes so far.
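
The count described above can also be taken with a small script. This is a minimal sketch, not part of the original report: the function names are hypothetical, and it matches the literal string "/drm mm object" from the sample line rather than the looser "mm" used in the grep, so it will not count unrelated paths that happen to contain "mm".

```python
from pathlib import Path

# Substring taken from the sample maps line quoted in the report.
DRM_MM_MARKER = "/drm mm object"


def count_drm_mm_mappings(maps_text: str) -> int:
    """Count the lines of a /proc/<pid>/maps dump that map a DRM mm object."""
    return sum(1 for line in maps_text.splitlines() if DRM_MM_MARKER in line)


def read_maps(pid: int) -> str:
    """Read /proc/<pid>/maps for the given process (Linux only)."""
    return Path(f"/proc/{pid}/maps").read_text()
```

Calling count_drm_mm_mappings(read_maps(pid)) with the Xorg pid at intervals should show whether the number keeps climbing; reading another user's maps file may require root.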
Comment 1 Nikolay 2015-11-12 02:12:44 UTC
Created attachment 119582 [details]
crash stack trace
Comment 2 Chris Wilson 2015-11-12 09:27:04 UTC
commit 7490b9ec263b87b3669096579ec0f0066ec328cb
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Nov 12 09:17:17 2015 +0000

    sna: Wait upon the same ring when out-of-memory
    
    The current out-of-memory allocation code was waiting upon the wrong
    ring id instead of using the index - causing an issue if forced to wait
    upon the BLT ring. If we still cannot allocate, make sure that all
    caches are dropped.
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=92911
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


Can you keep an eye on /sys/kernel/debug/dri/0/i915_gem_objects?
Comment 3 Nikolay 2015-11-13 02:02:15 UTC
Thanks for the very quick response!

The number of '/drm mm' lines in the maps file went down to ~250, and it doesn't seem to grow even with Hangouts running.

$ sudo cat /sys/kernel/debug/dri/0/i915_gem_objects
2616 objects, 1292308480 bytes
438 [33] objects, 865861632 [240902144] bytes in gtt
  54 [13] active objects, 407191552 [74022912] bytes
  384 [20] inactive objects, 458670080 [166879232] bytes
1192 unbound objects, 319500288 bytes
172 purgeable objects, 12177408 bytes
3 pinned mappable objects, 121995264 bytes
14 fault mappable objects, 10797056 bytes
2147483648 [268435456] gtt total

[k]batch pool: 6 objects, 36864 bytes (24576 active, 12288 inactive, 36864 global, 0 shared, 0 unbound)
Xorg: 2143 objects, 686604288 bytes (80031744 active, 310317056 inactive, 390348800 global, 534630400 shared, 234741760 unbound)
compiz: 382 objects, 535212032 bytes (348803072 active, 103641088 inactive, 452444160 global, 211742720 shared, 21983232 unbound)
Compositor: 141 objects, 327192576 bytes (57970688 active, 170000384 inactive, 227971072 global, 45486080 shared, 98963456 unbound)

Also, subjectively, the system is somewhat more responsive.
Thanks!
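
For watching these numbers over time, the per-client lines of the dump above can be parsed into a dictionary. This is a minimal sketch that assumes the line layout shown in this comment ("name: N objects, M bytes (…)"); the layout of this debugfs file is not a stable interface, and the function name is hypothetical.

```python
import re

# Matches per-client lines such as:
#   Xorg: 2143 objects, 686604288 bytes (80031744 active, ..., 234741760 unbound)
CLIENT_LINE = re.compile(
    r"^(?P<name>.+?): (?P<objects>\d+) objects, (?P<bytes>\d+) bytes \((?P<detail>.*)\)$"
)


def parse_clients(text: str) -> dict:
    """Return {client name: {"objects": int, "bytes": int, "<label>": int, ...}}."""
    clients = {}
    for line in text.splitlines():
        match = CLIENT_LINE.match(line.strip())
        if match is None:
            continue  # summary lines without a "name:" prefix are skipped
        entry = {
            "objects": int(match.group("objects")),
            "bytes": int(match.group("bytes")),
        }
        # The parenthesised part is a comma-separated list of "<value> <label>".
        for part in match.group("detail").split(", "):
            value, label = part.split(" ", 1)
            entry[label] = int(value)
        clients[match.group("name")] = entry
    return clients
```

Comparing the "objects" and "bytes" values for Xorg between two snapshots is one way to see whether the server is still accumulating objects.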
Comment 4 Chris Wilson 2015-11-13 19:07:06 UTC
commit 2d26643cab33a32847afaf13b50d326d09d58bf7
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Nov 13 19:03:36 2015 +0000

    sna/dri2: Drop the reference on the fence when complete
    
    Fixes regression from
    
    commit 8d9e496670f48b4eec64dfe1bcedb49793cf3073
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Wed Jul 22 11:14:01 2015 +0100
    
        sna/dri2: Take over the placeholder vblank
    
    After noting the fence was complete, we would clear it. But I forgot
    that we actually held a reference on to it, and so we would leak the 64k
    batch, and starve the system of available memory in about 18 minutes of
    SwapBuffers.
    
    Reported-by: Arkadiusz Miskiewicz <arekm@maven.pl>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92911
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
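
As a rough sanity check of the "about 18 minutes" figure in the commit message above, the leak rate can be estimated. This is a back-of-the-envelope sketch under two assumptions that are not stated in the commit: one 64 KiB batch is leaked per SwapBuffers call, and the client swaps 60 times per second.

```python
BATCH_BYTES = 64 * 1024   # the "64k batch" named in the commit message
SWAPS_PER_SECOND = 60     # assumption: a client swapping at a 60 Hz refresh rate


def leaked_bytes(minutes: float,
                 swaps_per_second: int = SWAPS_PER_SECOND,
                 batch_bytes: int = BATCH_BYTES) -> int:
    """Bytes leaked after `minutes`, assuming one batch is leaked per swap."""
    return int(minutes * 60 * swaps_per_second * batch_bytes)


total = leaked_bytes(18)
print(f"{total} bytes, about {total / 2**30:.2f} GiB")
# prints "4246732800 bytes, about 3.96 GiB"
```

Under those assumptions the leak is about 3.75 MiB per second, so roughly 4 GiB would be consumed after 18 minutes, which is consistent with the commit message's description of starving the system of available memory.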
Comment 5 Chris Wilson 2016-02-13 16:31:41 UTC
*** Bug 94092 has been marked as a duplicate of this bug. ***
Comment 6 David Biesack 2016-02-16 16:54:51 UTC
In what release is this fix available? I downloaded the latest release from

https://download.01.org/gfx/ubuntu/15.10/main/pool/main/i/intel-linux-graphics-installer/intel-linux-graphics-installer_1.2.1-0intel2_amd64.deb

That file is dated 2015-11-16 (in the parent directory of the download).
Bug 94092 was marked as a duplicate of this one, but I experienced Bug 94092
while using that deb file.

See https://bugs.freedesktop.org/attachment.cgi?id=121709. Does that
show which version of the driver is running, and does that version contain this fix?

Thanks
Comment 7 Ryan Underwood 2016-04-02 04:17:59 UTC
David, this fix has not been backported to Wily yet. The relevant bug thread is here:
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/1522727

Please engage the Ubuntu maintainer to get a backport.
