Bug 20152 - [G45/GM965 UXA] cannot view JPG in firefox when running UXA (lots of errors in dmesg)
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: unspecified
Hardware: Other All
Importance: medium major
Assignee: Chris Wilson
QA Contact: Xorg Project Team
Duplicates: 21512 28005
Depends on:
Blocks: 20893
Reported: 2009-02-17 01:16 UTC by martin
Modified: 2010-08-08 12:11 UTC


Attachments
dmesg while hung (43.15 KB, text/plain)
2009-02-17 01:17 UTC, martin
xorg log while hung (34.69 KB, text/plain)
2009-02-17 01:17 UTC, martin
xorg log old while hung (26.70 KB, application/x-trash)
2009-02-17 01:18 UTC, martin
xsession errors while hung (3.52 KB, application/octet-stream)
2009-02-17 01:18 UTC, martin
xorg conf (ubuntu stock + UXA) while hung (1.04 KB, application/octet-stream)
2009-02-17 01:19 UTC, martin
all logs (captured after running the repro, using 29rc8+close_to_git_2009mar19 userspace) (34.94 KB, application/x-compressed-tar)
2009-03-20 07:33 UTC, martin
xorg bitmap size stress testing tool (5.34 KB, text/x-csrc)
2009-03-23 06:42 UTC, martin
gdb trace when UXA goes into CPU spin while repro'ing this bug using mesa git HEAD april 4th (8.65 KB, text/plain)
2009-04-05 06:06 UTC, martin
Protect mmapped buffers from causal eviction. (7.31 KB, patch)
2010-05-08 12:09 UTC, Chris Wilson

Description martin 2009-02-17 01:16:03 UTC
This is on a system with a Gigabyte GA-EG45M-DS2H board (Intel G45 chipset),
using 2.6.29-020629rc5-generic plus the normal current Ubuntu Jaunty bits:
libdrm-intel1 and libdrm2 are 2.4.4-0ubuntu6
xserver-xorg-video-intel is 2:2.6.1-1ubuntu2
libgl1-mesa-dev is 7.3-1ubuntu1

With UXA enabled, if I open this URL in Firefox and let it load completely, Xorg hangs:
http://upload.wikimedia.org/wikipedia/commons/b/b7/Singapore_port_panorama.jpg

After freeze, dmesg shows this error:
[  391.129252] [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[  391.129258] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12
Comment 1 martin 2009-02-17 01:17:22 UTC
Created attachment 23010
dmesg while hung
Comment 2 martin 2009-02-17 01:17:50 UTC
Created attachment 23011
xorg log while hung
Comment 3 martin 2009-02-17 01:18:10 UTC
Created attachment 23012
xorg log old while hung
Comment 4 martin 2009-02-17 01:18:37 UTC
Created attachment 23013
xsession errors while hung
Comment 5 martin 2009-02-17 01:19:08 UTC
Created attachment 23014
xorg conf (ubuntu stock + UXA) while hung
Comment 6 martin 2009-02-17 01:20:49 UTC
while hung:

cat i915_gem_interrupt 
Interrupt enable:    00000053
Interrupt identity:  00000000
Interrupt mask:      fffedfac
Pipe A stat:         00000206
Pipe B stat:         00000000
Interrupts received: 12507
Current sequence:    193290
Waiter sequence:     193293
IRQ sequence:        193289

(I also re-ran it after a few seconds and the interrupt count wasn't changing.)
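
One possible reading of the numbers above: the waiter is blocked on sequence number 193293 while the hardware has only reached 193290, and no further interrupts arrive to wake it. The sketch below shows a wrap-safe sequence comparison in the style of the i915 driver's seqno check. The meaning assigned to the debugfs fields is an assumption, and `waiter_still_blocked` is a hypothetical helper name used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if sequence number a is at or past b, tolerating 32-bit wraparound.
 * The signed difference stays small as long as the two values are within
 * 2^31 of each other. */
static bool seqno_passed(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) >= 0;
}

/* Hypothetical helper: a waiter remains blocked until the hardware's
 * current sequence number reaches the one it is waiting for. */
static bool waiter_still_blocked(uint32_t current_seq, uint32_t waiter_seq)
{
    return !seqno_passed(current_seq, waiter_seq);
}
```

With the values from this comment, `waiter_still_blocked(193290, 193293)` is true, which would be consistent with a hang if the sequence never advances.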
Comment 7 martin 2009-02-17 01:21:33 UTC
(while hung) cat gem_objects 
4736 objects
771022848 object bytes
6 pinned
17076224 pin bytes
181567488 gtt bytes
234885120 gtt total
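
As a rough aid for reading these counters, the sketch below computes how much GTT space is left and how much could be freed at best. It assumes that "gtt bytes" is the currently bound amount, that "gtt total" is the aperture size, and that pinned objects are counted inside the aperture. It ignores fragmentation and alignment, and the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the gem_objects counters quoted above. */
struct gem_stats {
    uint64_t pin_bytes;  /* "pin bytes" */
    uint64_t gtt_bytes;  /* "gtt bytes" */
    uint64_t gtt_total;  /* "gtt total" */
};

/* Space that is free right now, ignoring fragmentation. */
static uint64_t gtt_free_bytes(const struct gem_stats *s)
{
    return s->gtt_total > s->gtt_bytes ? s->gtt_total - s->gtt_bytes : 0;
}

/* Upper bound on what a new object could get if every unpinned
 * object were evicted first. */
static uint64_t gtt_max_after_eviction(const struct gem_stats *s)
{
    return s->gtt_total > s->pin_bytes ? s->gtt_total - s->pin_bytes : 0;
}

static bool could_ever_fit(const struct gem_stats *s, uint64_t size)
{
    return size <= gtt_max_after_eviction(s);
}
```

For the values in this comment, that gives 53317632 bytes free and at most 217808896 bytes available after evicting everything that is not pinned.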
Comment 8 martin 2009-02-17 01:28:53 UTC
The bug does not repro if I switch over to EXA using the exact same bits / hardware.
Comment 9 martin 2009-02-17 02:03:03 UTC
I booted the standard Jaunty live CD on a laptop with a 965 chipset, ran "/etc/init.d/gdm stop" on it, changed xorg.conf to UXA and ran startx.

Xorg came up using the live CD bits + UXA, and then I opened this JPG and boom, Xorg froze on the 965+UXA as well. I had previously done "sudo apt-get install ssh" and "sudo su" followed by "passwd ubuntu", and I had two root shells logged in to the live CD system. The nasty part was that on the 965, when the bug triggers, all my ssh shells become unresponsive as well. All I can tell you is that the fan on the laptop starts to run at 100% speed, which is odd because that usually only happens if Xorg goes into a spin loop, and on the G45 what I saw in gdb was not a CPU spin but rather a hang where Xorg was stuck in some ioctl.

For the reasons above I have no idea what the dmesg or Xorg log says on the 965 (I also can't install Jaunty on this box because I need to have a stable machine for email etc).

Anyway, it's safe to say that 965+UXA seems affected too.
Comment 10 Gordon Jin 2009-02-17 06:14:45 UTC
Martin, thanks for the bug report.

Jian, can you reproduce this?
Comment 11 martin 2009-02-17 06:22:56 UTC
Another ubuntu user has been able to repro this on the following hardware:

00:02.0 VGA compatible controller [0300]: Intel Corporation 82865G Integrated Graphics Controller [8086:2572] (rev 02)
        Subsystem: ASUSTeK Computer Inc. Device [1043:2572]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx+
        Latency: 0
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at f0000000 (32-bit, prefetchable) [size=128M]
        Region 1: Memory at fe780000 (32-bit, non-prefetchable) [size=512K]
        Region 2: I/O ports at eff0 [size=8]
        Capabilities: <access denied>
        Kernel modules: intelfb
Comment 12 zhao jian 2009-02-25 01:49:16 UTC
I tested it on G45 and GM965; neither of them crashed. G45 was with kernel 2.6.29-rc6 and GM965 with kernel 2.6.29-rc5. G45 runs very fast and seems not affected. GM965 ran a little slowly the first time, and the icons on the desktop and in Firefox all became blurred, but after I rebooted and ran it again it no longer had this issue. One configuration difference is that the G45 has 2 GB of memory and the GM965 only 1 GB.

Configuration:
------------------------------
Platform:               g45
Arch:           x86_64
OSD:            Fedora release 8 (Werewolf)
Kernel:         2.6.29-rc6
Libdrm:               (master)1c381092a310af9b1b39b3a983ad5760b71a9025
Mesa:                 (mesa_7_4_branch)e3050c1777fe4d420bea0171f3624e53513b1055
Xserver:              (server-1.6-branch)4557b3f6c4273cd83b701beaf7a150c806fed298
Xf86_video_intel:     (master)668b2352a47bcfba75fe0492a5805726222755eb
Kernel:             (drm-intel-next)126fd9c96d80260a1f2d8dfc72e444e572bcafb7



Platform:               gm965
Arch:           x86_64
OSD:            Fedora release 10 (Cambridge)
Kernel:         2.6.29-rc5
Libdrm:               (master)1c381092a310af9b1b39b3a983ad5760b71a9025
Mesa:                 (mesa_7_4_branch)e3050c1777fe4d420bea0171f3624e53513b1055
Xserver:              (server-1.6-branch)4557b3f6c4273cd83b701beaf7a150c806fed298
Xf86_video_intel:     (master)668b2352a47bcfba75fe0492a5805726222755eb

Comment 13 martin 2009-02-25 05:03:58 UTC
The default config in Ubuntu has compiz enabled, so Fedora is quite a different environment. I think it would be useful if you had at least one machine with Ubuntu for testing (maybe you do?).
Comment 14 martin 2009-03-20 04:06:02 UTC
Jaunty just got 2.6.3 uploaded and I ran the repro again. This was the result:

At first the image seems to load fine (while it's being downloaded, more and more of it slowly becomes visible). However, at some point near the end the image just turns black (see partially_black.png). If I fiddle with the browser window to force a repaint, the entire image goes black (see fully_black.png).

Also, if I look in dmesg after reproing this bug I see a slew of nasty messages like this:

[ 195.560097] [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[ 195.560102] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12<3>[drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[ 196.112425] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12<3>[drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[ 196.596163] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12<3>[drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[ 197.097209] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12<3>[drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[ 197.655869] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12<3>[drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[ 214.263969] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12


The 2.6.3 version seems to be an improvement in that the xserver no longer crashes, it's just this particular image that is being corrupted.
Comment 15 martin 2009-03-20 07:11:45 UTC
I've now also tested this bug using the ubuntu xorg-edgers PPA repo. Basically, I found that the following versions still show the bug (note I still use the 2.6.28 standard ubuntu kernel):

Linux kingfish 2.6.28-11-generic #35-Ubuntu SMP Wed Mar 18 21:55:34 UTC 2009 x86_64 GNU/Linux

xserver-xorg-core                            2:1.6.0-0ubuntu3
xserver-xorg-core-dbg                        2:1.6.0-0ubuntu3
libdrm-intel1                                2.4.5+git20090314.82eac806-0ubuntu0tormod2
libdrm-intel1-dbg                            2.4.5+git20090314.82eac806-0ubuntu0tormod2
xserver-xorg-video-intel                     2:2.6.99.1+git20090319.bedc894a-0ubuntu0tormod
xserver-xorg-video-intel-dbg                 2:2.6.99.1+git20090319.bedc894a-0ubuntu0tormod
libgl1-mesa-dev                              7.3+git20090312+mesa-7-4-branch.a6f7e909-0ubuntu0tormod
libgl1-mesa-dri                              7.3+git20090312+mesa-7-4-branch.a6f7e909-0ubuntu0tormod
libgl1-mesa-dri-dbg                          7.3+git20090312+mesa-7-4-branch.a6f7e909-0ubuntu0tormod
libgl1-mesa-glx                              7.3+git20090312+mesa-7-4-branch.a6f7e909-0ubuntu0tormod
libglu1-mesa                                 7.3+git20090312+mesa-7-4-branch.a6f7e909-0ubuntu0tormod
libglu1-mesa-dev                             7.3+git20090312+mesa-7-4-branch.a6f7e909-0ubuntu0tormod
mesa-common-dev                              7.3+git20090312+mesa-7-4-branch.a6f7e909-0ubuntu0tormod
mesa-utils                                   7.3+git20090312+mesa-7-4-branch.a6f7e909-0ubuntu0tormod
xlibmesa-gl-dev                              1:7.4~5ubuntu16
libdrm-dev                                   2.4.5+git20090314.82eac806-0ubuntu0tormod2
libdrm-intel1                                2.4.5+git20090314.82eac806-0ubuntu0tormod2
libdrm-intel1-dbg                            2.4.5+git20090314.82eac806-0ubuntu0tormod2
libdrm-nouveau1                              2.4.5+git20090314.82eac806-0ubuntu0tormod2
libdrm2                                      2.4.5+git20090314.82eac806-0ubuntu0tormod2
libdrm2-dbg                                  2.4.5+git20090314.82eac806-0ubuntu0tormod2
compiz                                       1:0.8.2-0ubuntu5
compiz-core                                  1:0.8.2-0ubuntu5
compiz-dev                                   1:0.8.2-0ubuntu5
compiz-fusion-plugins-extra                  0.8.2-0ubuntu1
compiz-fusion-plugins-main                   0.8.2-0ubuntu1
compiz-gnome                                 1:0.8.2-0ubuntu5
compiz-kde                                   1:0.8.2-0ubuntu5
compiz-plugins                               1:0.8.2-0ubuntu5
compiz-wrapper                               1:0.8.2-0ubuntu5
Comment 16 martin 2009-03-20 07:30:41 UTC
Okay, so now I booted using the same user space components (more or less git HEAD) from the previous comment, but on top of that I used an almost unmodified upstream 2.6.29-rc8 kernel (built using the Ubuntu kernel config, but with no Ubuntu sauce patches applied).

With this config I saw a black horizontal stripe through the image when it was fully loaded (the parts above and below this stripe looked fine). When I switched from the Singapore containers JPG tab to another tab in Firefox and then back to the big JPG, the JPG looked perfect. However, I still got some nasty stuff printed to dmesg:

[  253.106340] [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[  253.106345] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12


Just those two lines now, nothing else (previously I had many errors in dmesg, as seen in earlier comments).



Comment 17 martin 2009-03-20 07:33:44 UTC
Created attachment 24083
all logs (captured after running the repro, using 29rc8+close_to_git_2009mar19 userspace)
Comment 18 martin 2009-03-20 07:45:44 UTC
With the same 29rc8+recent-user-space config, after a reboot I saw a third error in dmesg (still UXA):

[  120.100180] [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[  120.100185] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12
[  120.110140] [drm:i915_gem_evict_something] *ERROR* inactive empty 1 request empty 1 flushing empty 1
Comment 19 martin 2009-03-20 13:17:27 UTC
If it doesn't crash right away, also try these things:

* single click the big JPG in firefox in order to enlarge it and show it in "100%" natural size... actually what I do normally is that I repeatedly click the image to zoom/dezoom it many many times.

* move around the Firefox window showing the JPG in enlarged mode. I've found that very often other windows in X get corrupted by doing this. It's as if the backing textures for those other windows have been overwritten by garbage; moving the Firefox window with the big JPG forces a redraw of the other windows, which reveals that their backing textures contain garbage.

* move around the Firefox window while the big image is loading. You will find that the entire machine runs extremely choppily, as if kernel space is taking too much time and interferes with "fair scheduling".
Comment 20 martin 2009-03-20 13:28:26 UTC
http://temp.minimum.se/big_jpg_screws_up_other_windows.png

Screenshot showing what it looks like when the other windows get corrupted. Basically, the window with the garbage used to be a window showing cgit for mesa, and the only thing I did was view this JPG, click it once to show it at 100% size, and then move around the window with the big JPG.

This corruption doesn't happen consistently though; the exact outcome of loading this big JPG seems to be a little bit different every time I try it.
Comment 21 Eric Anholt 2009-03-20 13:31:53 UTC
We're going to want to make a small testcase for this so we can reliably reproduce it on a variety of configurations without remembering to test it.  Some of our paths should handle this correctly.  Some of them don't, and it's a bug we're likely to regress on repeatedly.

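pseudocode (assumes dpy, root, an ARGB32 XRenderPictFormat *format and an XRenderColor color are set up elsewhere):
for (int i = 100; i < 10000; i = i * 3 / 2) {
  /* This would get 1.5GB max size.  <32 bits, but > aperture size
   * because I don't want to think about -ENOMEM or overflow handling
   */

  /* depth-32 pixmap of i x i pixels; the picture keeps it alive */
  Pixmap pix = XCreatePixmap(dpy, root, i, i, 32);
  Picture picture = XRenderCreatePicture(dpy, pix, format, 0, NULL);
  XFreePixmap(dpy, pix);

  /* hit a HW accel path */
  XRenderFillRectangle(dpy, PictOpSrc, picture, &color, 0, 0, i, i);

  /* hit a SW path */
  XRenderFillRectangle(dpy, PictOpConjointOver, picture, &color, 0, 0, i, i);

  XFreePicture(dpy, picture);
}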
Comment 22 martin 2009-03-23 06:42:37 UTC
Created attachment 24166
xorg bitmap size stress testing tool

Eric,

I've implemented a tool according to your specifications above, use the attached main.c and compile using "gcc *.c -std=c99 -lXrender -o bitmap_size_stress". The tool has some basic command line options that will allow you to configure max size and whether to hit HW/SW path or both etc. Use "bitmap_size_stress --help" to see a list of parameters. I have licensed it under MIT so I hope intel QA can fork it and build it into some regression test suite.

Good news is that I can hang xorg with it easily on G45. Bad news is that I don't get the exact same stack. I will do more extensive testing later to see if I can repro the exact same stack using some particular set of command line parameters.
Comment 23 martin 2009-03-23 06:46:40 UTC
The error I get when using this stress testing tool is what looks like a CPU spin inside drm_intel_gem_bo_alloc_internal (I put break points on all functions in stack and ran it for a pretty long time and it didn't hit them):

#0  0x00007f2867bc6887 in drm_intel_gem_bo_alloc_internal (bufmgr=0x1bcd4c0, name=0x7f2867e3e872 "pixmap", size=1140615168, 
    alignment=<value optimized out>, for_render=1) at ../../../libdrm/intel/intel_bufmgr_gem.c:205
#1  0x00007f2867e17cad in i830_uxa_create_pixmap (screen=0x1bccd70, w=16877, h=16877, depth=<value optimized out>, usage=0)
    at ../../src/i830_exa.c:966
#2  0x0000000000449b8a in ProcCreatePixmap (client=0x6155570) at ../../dix/dispatch.c:1299
#3  0x000000000044e354 in Dispatch () at ../../dix/dispatch.c:437
#4  0x0000000000433ddd in main (argc=10, argv=0x7fff74080a88, envp=<value optimized out>) at ../../dix/main.c:397

The original stack I got earlier in this bug report was a "chip hung" stack so X.org wasn't spinning. I get the CPU spin above also using the stress tool with --hw-only parameter.
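
The size in frame #0 can be reproduced from the pixmap dimensions in frame #1. This is a sketch under assumptions, not the driver's code: it assumes 32 bits per pixel (the depth is optimized out in the trace) and a row pitch rounded up to 512 bytes. The 512-byte figure is inferred purely from the numbers in this report (a 16877-pixel width giving size=1140615168 here, and a 10000-pixel width giving src_stride=40448 in the later pixmanBltsse2 trace). The function names are hypothetical.

```c
#include <stdint.h>

/* Assumed pitch alignment, inferred from the traces in this report. */
#define PITCH_ALIGN 512u

/* Round v up to the next multiple of a (a must be non-zero). */
static uint64_t align_up(uint64_t v, uint64_t a)
{
    return (v + a - 1) / a * a;
}

/* Bytes per row for a pixmap of the given width and bits per pixel. */
static uint64_t pixmap_pitch(uint32_t width, uint32_t bpp)
{
    return align_up((uint64_t)width * bpp / 8, PITCH_ALIGN);
}

/* Total buffer size: one aligned row per scanline. */
static uint64_t pixmap_bytes(uint32_t width, uint32_t height, uint32_t bpp)
{
    return pixmap_pitch(width, bpp) * height;
}
```

Under these assumptions a 16877x16877 pixmap needs 67584 bytes per row and 1140615168 bytes in total, several times the 234885120-byte "gtt total" reported in comment 7.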
Comment 24 Clemens Eisserer 2009-03-23 14:14:38 UTC
I experienced the same problem on my i945GM. After working for about 2 hours in KDE (Xorg memory usage was ~150 MB), I opened the following image with Firefox:
http://upload.wikimedia.org/wikipedia/commons/5/5a/Ikea_Dortmund_Ellinghausen.JPG

This was with intel-2.6.99.902 and kernel-2.6.29rc8, and I got endless messages like:
[drm:i915_gem_execbuffer] *ERROR* Failed to pin buffers -12                             
[drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty                  
[drm:i915_gem_object_pin] *ERROR* Failure to bind: -12                                  
[drm:i915_gem_evict_something] *ERROR* inactive empty 1 request empty 1 flushing empty 1
[drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty       
Comment 25 martin 2009-04-05 06:06:34 UTC
Created attachment 24571
gdb trace when UXA goes into CPU spin while repro'ing this bug using mesa git HEAD april 4th

I tried this repro again today with 2.4.6ish libdrm, 2:2.6.99ish intel ddx, .29 kernel and mesa git HEAD as of april 4th.

This time I did not run into the "chip hung" ioctl() block; instead the xserver went into a CPU spin entirely inside pixmanBltsse2(), the top stack frame below:

#0  pixmanBltsse2 (src_bits=<value optimized out>, dst_bits=0x7f20a678c000, src_stride=40448, dst_stride=40448, src_bpp=<value optimized out>,
    dst_bpp=<value optimized out>, src_x=<value optimized out>, src_y=0, dst_x=<value optimized out>, dst_y=0, width=10000, height=165)
    at /usr/lib/gcc/x86_64-linux-gnu/4.3.3/include/emmintrin.h:699
#1  0x00007f20cdc7aef4 in fbCopyNtoN (pSrcDrawable=<value optimized out>, pDstDrawable=<value optimized out>, pGC=<value optimized out>,
    pbox=0x7fffda750470, nbox=1, dx=0, dy=0, reverse=0, upsidedown=0, bitplane=0, closure=0x0) at ../../fb/fbcopy.c:64
#2  0x00007f20ce4ff8e0 in uxa_copy_n_to_n (pSrcDrawable=0x53143d0, pDstDrawable=0x52836d0, pGC=0x52655c0, pbox=0x7fffda750470, nbox=1, dx=0, dy=0,
    reverse=0, upsidedown=0, bitplane=0, closure=0x0) at ../../uxa/uxa-accel.c:477
#3  0x00007f20cdc79df0 in fbCopyRegion (pSrcDrawable=0x53143d0, pDstDrawable=0x52836d0, pGC=0x52655c0, pDstRegion=<value optimized out>, dx=0,
    dy=7488, copyProc=0x7f20ce4ff1d0 <uxa_copy_n_to_n>, bitPlane=0, closure=0x0) at ../../fb/fbcopy.c:396
#4  0x00007f20cdc7a353 in fbDoCopy (pSrcDrawable=0x53143d0, pDstDrawable=0x52836d0, pGC=0x52655c0, xIn=0, yIn=0, widthSrc=10000, heightSrc=224,
    xOut=0, yOut=0, copyProc=0x7f20ce4ff1d0 <uxa_copy_n_to_n>, bitPlane=0, closure=0x0) at ../../fb/fbcopy.c:596
#5  0x00007f20ce4ff147 in uxa_copy_area (pSrcDrawable=0x53143d0, pDstDrawable=0x52836d0, pGC=0x52655c0, srcx=0, srcy=0, width=10000, height=224,
    dstx=0, dsty=0) at ../../uxa/uxa-accel.c:496
Comment 26 martin 2009-04-05 07:24:46 UTC
Okay so now I did this in gdb:

(gdb) break pixmanBltsse2
(gdb) commands
>print height
>c
>end
(gdb) c

And thus gdb printed all parameters passed to this function every time it was called. Then I ran the repro and I _think_ these were the ones passed into the function when it never returned (the width clearly indicates it has something to do with the big jpg file).

Breakpoint 1, pixmanBltsse2 (src_bits=<value optimized out>, dst_bits=0x7f07bad4b000, src_stride=40448, dst_stride=40448, src_bpp=32, dst_bpp=32, src_x=<value optimized out>, src_y=0, dst_x=0, dst_y=0, width=10000, height=1841) at ../../pixman/pixman-sse2.c:4422
Comment 27 martin 2009-05-02 17:25:37 UTC
I can confirm JPG comes out black (but no freeze and no segv) also using these versions / hw:

00:02.0 VGA compatible controller [0300]: Intel Corporation 4 Series Chipset Integrated Graphics Controller [8086:2e22] (rev 03)
Linux kingfish 2.6.28-11-generic #42-Ubuntu SMP Fri Apr 17 01:58:03 UTC 2009 x86_64 GNU/Linux
libdrm2 2.4.9-1ubuntu1~xup~1
xserver-xorg-video-intel 2:2.7.0-1ubuntu1~xup~1
libgl1-mesa-dri 7.4-0ubuntu3
libgl1-mesa-glx 7.4-0ubuntu3
(**) intel(0): Using UXA for acceleration

While running repro on this config, this was added to dmesg:

[ 163.317061] [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[ 163.317064] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12<3>[drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[ 163.879976] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12<3>[drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[ 164.437987] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12<3>[drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[ 165.041770] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12<3>[drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[ 165.712079] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -12
Comment 28 martin 2009-05-02 17:31:23 UTC
If I use sw / hw configuration from comment #27 but also adding 2.6.30-rc4 kernel compiled with ubuntu's kernel config, then I again run into the xserver CPU spin from comment #25.

Let me know if there is anything else I can try to get us closer to a bugfix.
Comment 29 martin 2009-05-03 09:54:26 UTC
This picture of the Japanese subway system is also not viewable when running in UXA mode:
http://www.flickr.com/photos/formforce/3409362834/sizes/o/

And also this poster of Lego mini figs:
http://www.flickr.com/photos/dunechaser/3496361448/sizes/o/
Comment 30 Gordon Jin 2009-05-03 23:38:21 UTC
Reassign to cworth as he already owns some similar bugs.
Comment 31 roberth 2009-05-07 19:45:34 UTC
This bug is also affecting me here on 945GME and the latest git versions of libdrm/-intel and mesa master -
https://bugs.freedesktop.org/show_bug.cgi?id=21512

X freezes but there are no errors in Xorg.0.log or dmesg when it happens. The website http://www.woodtv.com also triggers it. I have a backtrace and intel_gpu_dump logs attached to that bug and can provide more info if needed. 
Comment 32 Eric Anholt 2009-05-19 10:14:44 UTC
It's nice to have reproducible testcases in the form of a URL.  Thanks!

I'm marking this fixed as I've pushed changes fixing the original site.  Please open new bugs for other sites that fail.

kernel:
commit 8e7d2b2c6ecd3c21a54b877eae3d5be48292e6b5
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Fri May 8 16:13:25 2009 -0700

    drm/i915: allocate large pointer arrays with vmalloc

libdrm:
commit 469655fab7a56eb32ff8cdefb33992813342353a
Author: Eric Anholt <eric@anholt.net>
Date:   Mon May 18 16:07:45 2009 -0700

    intel: Only do BO caching up to 64MB objects.

xf86-video-intel:
commit 09beee378cecd1079e7a9fa6eee8f084d680d37e
Author: Eric Anholt <eric@anholt.net>
Date:   Mon May 18 18:01:05 2009 -0700

    Don't do GTT maps on objects bigger than half the available aperture size.
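
The two userspace limits named in the libdrm and xf86-video-intel commit subjects can be pictured as simple size checks. This is only an illustration of the stated policies, not the committed code: the function names are hypothetical, and whether the 64MB bound is inclusive is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* "Only do BO caching up to 64MB objects." */
#define BO_CACHE_MAX_BYTES (64ull * 1024 * 1024)

/* Hypothetical check: should a freed buffer go back into the reuse cache? */
static bool bo_should_cache(uint64_t size)
{
    return size <= BO_CACHE_MAX_BYTES;
}

/* "Don't do GTT maps on objects bigger than half the available
 * aperture size." Larger objects would use some other access path. */
static bool bo_may_gtt_map(uint64_t size, uint64_t aperture_bytes)
{
    return size <= aperture_bytes / 2;
}
```

Taking the 234885120-byte aperture from comment 7 as an example, the 1140615168-byte pixmap from comment 23 would be neither cached nor GTT-mapped under these checks.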
Comment 33 Eric Anholt 2009-05-19 10:15:36 UTC
*** Bug 21512 has been marked as a duplicate of this bug. ***
Comment 34 Chris Wilson 2010-05-08 00:51:57 UTC
The freeze is back.
Comment 35 Chris Wilson 2010-05-08 11:57:26 UTC
Step 1 is to avoid the fallback for large sources:

commit a7b800513fcc94e063dfd68d2f63b6bab7fae47d
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Apr 14 21:14:34 2010 +0100

    uxa: Extract sub-region from in-memory buffers.
    
    If the buffer is too large or not suitable for a GPU operation, we
    currently fallback and perform the composite on the CPU. An alternative
    is too extract the small region out of the source (as usually the
    sample extents are much smaller than the actual surface size) and try
    the composite with the new surface.
    
    The effect is particularly noticeable on pathological websites that use
    very large background images. For example, http://www.woodtv.com/ uses a
    1299x15000 pattern that is obscured by another opaque pattern.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

commit 848ab66384508c3ad3e5fb4884e4527f3ebd3bde
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat May 8 18:36:55 2010 +0100

    uxa: Transform composites with a simple translation into a blit
    
    We can also convert a composite with an integer translation into a
    blit, so long as the sample extents remains within the source.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Step 2 is a kernel patch to avoid the pathological eviction behaviour.
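
The condition in the second commit message (an integer translation whose sample extents stay within the source) can be sketched as below. This is a simplified illustration, not the UXA code: the function and struct names are hypothetical, and the transform is reduced to a bare (tx, ty) translation.

```c
#include <stdbool.h>

struct rect { int x, y, w, h; };

/* True if v is a whole number in a conservative coordinate range;
 * stores the integer value in *out. */
static bool is_integer(double v, int *out)
{
    if (v < -32768.0 || v > 32767.0)
        return false;
    int i = (int)v;
    if ((double)i != v)
        return false;
    *out = i;
    return true;
}

/* Decide whether a composite of width x height pixels, sampling the
 * source at (src_x, src_y) through a pure translation (tx, ty), can be
 * done as a plain blit. On success *src holds the source rectangle. */
static bool composite_as_blit(double tx, double ty, int src_x, int src_y,
                              int width, int height, int src_w, int src_h,
                              struct rect *src)
{
    int dx, dy;

    if (width <= 0 || height <= 0)
        return false;
    if (!is_integer(tx, &dx) || !is_integer(ty, &dy))
        return false;

    int x = src_x + dx;
    int y = src_y + dy;

    /* The sample extents must lie entirely inside the source. */
    if (x < 0 || y < 0 || x > src_w - width || y > src_h - height)
        return false;

    *src = (struct rect){ x, y, width, height };
    return true;
}
```

For a 1299x15000 source like the one mentioned in the first commit message, a 1299x224 region translated by (0, 7488) passes the check, while a fractional offset or a region running past row 15000 does not.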
Comment 36 Chris Wilson 2010-05-08 12:09:28 UTC
Created attachment 35510
Protect mmapped buffers from causal eviction.
Comment 37 Chris Wilson 2010-05-25 00:49:22 UTC
Dropping priority, this now depends upon a kernel patch to protect against any remaining pathological cases. However, the DDX should prevent the hang from occurring again without the kernel patch.
Comment 38 Chris Wilson 2010-06-26 04:48:51 UTC
*** Bug 28005 has been marked as a duplicate of this bug. ***
Comment 39 Chris Wilson 2010-08-08 12:11:53 UTC
Woohoo!


commit e2bf07fe23fd11a2acba609bf34ccc59c5553389
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Aug 7 11:01:24 2010 +0100

    drm/i915: Implement fair lru eviction across both rings. (v2)
    
    Based in a large part upon Daniel Vetter's implementation and adapted
    for handling multiple rings in a single pass.
    
    This should lead to better gtt usage and fixes the page-fault-of-doom
    triggered. The fairness is provided by scanning through the GTT space
    amalgamating space in rendering order. As soon as we have a contiguous
    space in the GTT large enough for the new object (and its alignment),
    evict any object which lies within that space. This should keep more
    objects resident in the GTT.
    
    Doing throughput testing on a PineView machine with cairo-perf-trace
    indicates that there is very little difference with the new LRU scan,
    perhaps a small improvement... Except oddly for the poppler trace.
    
    Reference:
    
      Bug 15911 - Intermittent X crash (freeze)
      https://bugzilla.kernel.org/show_bug.cgi?id=15911
    
      Bug 20152 - cannot view JPG in firefox when running UXA
      https://bugs.freedesktop.org/show_bug.cgi?id=20152
    
      Bug 24369 - Hang when scrolling firefox page with window in front
      https://bugs.freedesktop.org/show_bug.cgi?id=24369
    
      Bug 28478 - Intermittent graphics lockups due to overflow/loop
      https://bugs.freedesktop.org/show_bug.cgi?id=28478
    
    v2: Attempt to clarify the logic and order of eviction through the use
    of comments and macros.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
    Signed-off-by: Eric Anholt <eric@anholt.net>
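
The eviction idea described in the commit message can be illustrated with a toy model. This is a much simplified sketch, not the kernel implementation: the GTT is modelled as a sorted array of non-overlapping objects, the LRU list is an array of indexes with the oldest first, and every name here is hypothetical. Objects are marked as candidates in LRU order until a contiguous, aligned hole of the requested size exists; only the marked objects inside that hole are then evicted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_OBJS 64

/* Hypothetical model of one buffer bound into the GTT. */
struct gtt_obj {
    uint64_t start;  /* offset in the GTT */
    uint64_t size;   /* length in bytes */
    bool resident;   /* still bound? */
};

static uint64_t round_up(uint64_t v, uint64_t a)
{
    return (v + a - 1) / a * a;
}

/* Look for an aligned hole of `size` bytes, treating free space and
 * marked objects as available. objs[] must be sorted by start offset. */
static bool find_hole(const struct gtt_obj *objs, size_t n, const bool *marked,
                      uint64_t gtt_size, uint64_t size, uint64_t align,
                      uint64_t *hole)
{
    uint64_t run_start = 0;

    for (size_t i = 0; i <= n; i++) {
        uint64_t block_start = gtt_size, block_end = gtt_size;

        if (i < n) {
            if (!objs[i].resident || marked[i])
                continue;  /* does not interrupt the current run */
            block_start = objs[i].start;
            block_end = objs[i].start + objs[i].size;
        }
        uint64_t candidate = round_up(run_start, align);
        if (candidate + size <= block_start) {
            *hole = candidate;
            return true;
        }
        run_start = block_end;
    }
    return false;
}

/* Returns the number of objects evicted, or -1 if no hole can be made. */
static int evict_for_space(struct gtt_obj *objs, size_t n,
                           const size_t *lru, size_t n_lru,
                           uint64_t gtt_size, uint64_t size, uint64_t align,
                           uint64_t *hole)
{
    bool marked[MAX_OBJS] = { false };

    if (n > MAX_OBJS || align == 0)
        return -1;
    if (find_hole(objs, n, marked, gtt_size, size, align, hole))
        return 0;  /* enough free space already */

    for (size_t k = 0; k < n_lru; k++) {
        if (lru[k] >= n)
            return -1;
        marked[lru[k]] = true;
        if (!find_hole(objs, n, marked, gtt_size, size, align, hole))
            continue;

        int evicted = 0;
        for (size_t i = 0; i < n; i++) {
            bool overlaps = objs[i].start < *hole + size &&
                            *hole < objs[i].start + objs[i].size;
            if (marked[i] && objs[i].resident && overlaps) {
                objs[i].resident = false;
                evicted++;
            }
        }
        return evicted;
    }
    return -1;
}
```

In this model an old object that was scanned but does not lie inside the chosen hole stays resident, which is one way to picture the "should keep more objects resident in the GTT" remark in the commit message.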

