Bug 79238 - [SNA G33/31] xorg/ddx hits kgem_check_many_bo_fenced: Assertion `bo->refcnt' failed.
Summary: [SNA G33/31] xorg/ddx hits kgem_check_many_bo_fenced: Assertion `bo->refcnt' ...
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel (show other bugs)
Version: unspecified
Hardware: x86 (IA32) Linux (All)
: high major
Assignee: Chris Wilson
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on: 73406
Blocks:
  Show dependency treegraph
 
Reported: 2014-05-26 04:03 UTC by Matti Hämäläinen
Modified: 2014-06-10 06:07 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
GDB session log (5.85 KB, text/plain)
2014-05-26 04:03 UTC, Matti Hämäläinen
no flags Details
Assert we do not replace live IO buffers (1.44 KB, patch)
2014-05-26 06:12 UTC, Chris Wilson
no flags Details | Splinter Review
Implicit release of upload buffers considered bad (6.29 KB, patch)
2014-05-26 06:14 UTC, Chris Wilson
no flags Details | Splinter Review
Implicit release of upload buffers considered bad (6.29 KB, patch)
2014-05-26 07:38 UTC, Chris Wilson
no flags Details | Splinter Review
Implicit release of upload buffers considered bad (6.28 KB, patch)
2014-05-26 07:38 UTC, Chris Wilson
no flags Details | Splinter Review

Description Matti Hämäläinen 2014-05-26 04:03:13 UTC
Created attachment 99820 [details]
GDB session log

The assert hit is very occasional/random. Sometimes it is reproducible several times in row, sometimes several days go on without problems.

Attaching a GDB log with (some of) the information requested in IRC by Chris Wilson. Unfortunately not all variables requested by Chris could be printed out, due to having been optimized out by compiler.

-- Window manager: WindowMaker 0.95.5
-- chipset: 00:02.0 VGA compatible controller: Intel Corporation 82G33/G31
Express Integrated Graphics Controller (rev 10)
-- system architecture: i686 / 32bit
-- xf86-video-intel: GIT 605fcd9050efc816ac8163e8d626f466d98261c2
-- xserver: X.Org X Server 1.15.1-1 from latest Debian testing
-- mesa: 10.1.2-1 from Debian testing
-- libpixman: 0.32.4-1
-- libdrm version: 2.4.54-1
-- kernel version: 3.14.4 (vanilla+grsec)
-- Linux distribution: current Debian Testing
-- Machine or mobo model: Asus P5KPL-CM
-- Display connector: VGA
Comment 1 Chris Wilson 2014-05-26 05:29:49 UTC
An io bo, at least that reduces the amount of code to think about first.

(gdb) p *src_bo
$1 = {rq = 0xb7cfc358, exec = 0x0, proxy = 0xb81315e8, list = {next = 0xb7e150e4, prev = 0xb7e150e4}, request = {next = 0xb7e150ec, prev = 0xb7e150ec}, vma = {next = 0xb7e150f4, prev = 0xb7e150f4}, map__cpu = 0xb3a2d000, map__gtt = 0x0, binding = {next = 0x0, format = 0, offset = 0}, presumed_offset = 0, unique_id = 3765653, refcnt = 0, handle = 176, target_handle = 4294967295, delta = 4096, size = {pages = {count = 224, bucket = 0}, bytes = 224}, pitch = 56, tiling = 0, reusable = 0, gpu_dirty = 0, gtt_dirty = 0, domain = 0, needs_flush = 0, snoop = 0, io = 1, flush = 0, scanout = 0, purged = 0}
Comment 2 Chris Wilson 2014-05-26 06:12:49 UTC
Created attachment 99826 [details] [review]
Assert we do not replace live IO buffers
Comment 3 Chris Wilson 2014-05-26 06:14:19 UTC
Created attachment 99827 [details] [review]
Implicit release of upload buffers considered bad

A couple of patches to test. Both address 2 little bugs, the second is more likely to explain your assertion failures.
Comment 4 Chris Wilson 2014-05-26 07:38:03 UTC
Created attachment 99835 [details] [review]
Implicit release of upload buffers considered bad
Comment 5 Chris Wilson 2014-05-26 07:38:55 UTC
Created attachment 99836 [details] [review]
Implicit release of upload buffers considered bad
Comment 6 Chris Wilson 2014-05-26 19:50:12 UTC
Went ahead and pushed:

commit 9f4f855ba37966fb91d31e9081d03cf72affb154
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon May 26 07:06:18 2014 +0100

    sna: Implicit release of upload buffers considered bad
    
    Currently upload buffers are automatically decoupled when the buffer is
    retired. As retiring can happen during command setup after we have
    selected which bo to render with, this can free the bo we plan to use.
    Which is bad.
    
    Instead of making the release of upload buffers automatic, we manually
    check whether the buffer is idle before use as a source to consider
    scrapping it and replacing it with a real GPU bo. This is likely to keep
    upload buffers alive for longer (limiting reuse between Pixmaps but
    making reuse of the buffer within a Pixmap more likely) which is both
    good and bad. (Good - may improve the content cache, bad - may increase
    the amount of memory used by upload buffers for arbitrary long periods.)
    
    Reported-by: Matti Hämäläinen <ccr@tnsp.org>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79238
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

commit b508d8f3318c42a2a87b7731789b1d03610e9b46
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon May 26 06:54:03 2014 +0100

    sna: Assert that we do not replace active IO buffers
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=79238
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Comment 7 Matti Hämäläinen 2014-05-27 00:24:05 UTC
Thanks, will be testing this.
Comment 8 Matti Hämäläinen 2014-06-10 00:10:44 UTC
The good news is that after ~two weeks of testing, also with the later changes mentioned on IRC, I've encountered this issue no more. I'd say with rather good confidence that the issue can be closed.

Thanks again for your hard work!
Comment 9 Chris Wilson 2014-06-10 06:07:08 UTC
Thanks for the testing and feedback.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.