Bug 22947 - [855GM, xf86-video-intel-2.8.0] "Freeze" when RENDER extension is being used
Status: RESOLVED FIXED
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: unspecified
Hardware: Other All
Importance: high critical
Assigned To: Carl Worth
QA Contact: Xorg Project Team
Reported: 2009-07-25 17:02 UTC by Bruno
Modified: 2009-09-22 13:37 UTC (History)

Attachments
xorg.conf (5.88 KB, text/plain)
2009-07-25 17:02 UTC, Bruno
intel_gpu_dump [frozen state] (91.87 KB, text/plain)
2009-07-25 17:11 UTC, Bruno
intel_reg_dump [frozen state] (10.62 KB, text/plain)
2009-07-25 17:12 UTC, Bruno
kernel log up to frozen state (34.41 KB, text/plain)
2009-07-25 17:12 UTC, Bruno
Xorg log up to frozen state (18.19 KB, text/plain)
2009-07-25 17:13 UTC, Bruno
kernel log (with 'sysrq-t') (111.10 KB, text/plain)
2009-09-12 01:41 UTC, Bruno
intel_reg_dump (while frozen) (10.61 KB, text/plain)
2009-09-12 01:42 UTC, Bruno
Xorg log (18.24 KB, text/plain)
2009-09-12 01:43 UTC, Bruno
intel_gpu_dump (while frozen, bzip2 compressed) (103.33 KB, application/octet-stream)
2009-09-12 01:46 UTC, Bruno
Set texture vertex data format and type for projective transforms (6.01 KB, patch)
2009-09-19 17:37 UTC, Keith Packard
GPU dump for driver with patch applied (while frozen, bzip2 compressed) (111.78 KB, application/octet-stream)
2009-09-21 13:15 UTC, Bruno
Fallback to software for non-affine transformations (1.48 KB, patch)
2009-09-21 15:45 UTC, Carl Worth
Corrected version of Keith's patch (6.67 KB, patch)
2009-09-21 17:13 UTC, Chris Wilson

Description Bruno 2009-07-25 17:02:03 UTC
Created attachment 28004
xorg.conf

System details:
- Hardware:
  Acer TravelMate 66x
  00:02.0 VGA compatible controller [0300]:
  Intel Corporation 82852/855GM Integrated Graphics Device [8086:3582] (rev 02)
  00:02.1 Display controller [0380]:
  Intel Corporation 82852/855GM Integrated Graphics Device [8086:3582] (rev 02)

- Distro: Gentoo
  - x11-base/xorg-server-1.6.2-r1
  - x11-drivers/xf86-video-intel-2.8.0
  - x11-libs/libdrm-2.4.12
  - media-libs/mesa-7.5-r2
- Kernel:
  - 2.6.31-rc4
  + drm-intel-next from git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel
    (up to 2a2430f4542467502d39660bfd66b0004fd8d6a9
     drm/i915: correct self-refresh calculation in "everything off" case)
  - KMS enabled

When enlightenment starts up, the X server freezes (access is still possible via SSH, but responsiveness is bad). Note: without the extra patches from Anholt's tree, no interactive remote shell was possible at all, because eventd/0 was waiting on an i915 DRM mutex.

When doing the same with the RENDER extension disabled in Xorg, the system is usable.

(The same kind of freeze occurs when using UMS, though I have not tested that with RENDER disabled.)

xf86-video-intel-2.7.1 worked more or less properly in UMS mode (freezes rare) but didn't work in KMS mode (freeze in that case). I haven't tested 2.7.1 behavior with RENDER disabled.


The follow-up attachments contain:
- output of intel_gpu_dump for frozen X
- output of intel_reg_dump for frozen X
- kernel log for frozen X
- Xorg.0.log for frozen X
- xorg.conf


Side note: the MCE error mentioned in the kernel log corresponds to:
  CPU 0 BANK 1 MCG status:
  MCi status:
  Error overflow
  Uncorrected error
  Processor context corrupt
  MCA: Data CACHE Level-1 Instruction-Fetch Error
  STATUS e200000000000155 MCGSTATUS 0
Probably unrelated as it happens on any system start at exactly the same uptime...
Comment 1 Bruno 2009-07-25 17:11:48 UTC
Created attachment 28005
intel_gpu_dump [frozen state]
Comment 2 Bruno 2009-07-25 17:12:25 UTC
Created attachment 28006
intel_reg_dump [frozen state]
Comment 3 Bruno 2009-07-25 17:12:57 UTC
Created attachment 28007
kernel log up to frozen state
Comment 4 Bruno 2009-07-25 17:13:21 UTC
Created attachment 28008
Xorg log up to frozen state
Comment 5 Bruno 2009-07-25 17:19:02 UTC
One applied kernel patch I forgot to mention is attachment #27810 from bug #20115. (Without that one, nothing is visible on LVDS.)
Comment 6 Carl Worth 2009-07-31 11:08:46 UTC
Hi Bruno,

Thank you for your bug report about freezes when using the Render extension with an 855.

I know that Eric Anholt recently discovered that with some 8xx hardware, the Render acceleration code in the driver was unstable and would regularly cause freezes. I think his plan is to remove/disable that code for now.

I'm including him on the CC list for this bug so that he can confirm or deny this.

-Carl
Comment 7 Gordon Jin 2009-09-06 19:38:22 UTC
wish to fix in 2.9
Comment 8 Carl Worth 2009-09-11 13:51:42 UTC
This sounds like a bug that should be fixed with the patch posted by Eric
yesterday. Please see bug #22904 for the patch, and please report there
if the patch fixes your bug. (If it doesn't then we can reopen this
bug as not being a duplicate.)

Thanks for your report!

-Carl


*** This bug has been marked as a duplicate of bug 22904 ***
Comment 9 Bruno 2009-09-12 01:39:20 UTC
(In reply to comment #8)
> This sounds like a bug that should be fixed with the patch posted by Eric
> yesterday. Please see bug #22904 for the patch, and please report there
> if the patch fixes your bug. (If it doesn't then we can reopen this
> bug as not being a duplicate.)
> 
> Thanks for your report!
> 
> -Carl
> 
> 
> *** This bug has been marked as a duplicate of bug 22904 ***
> 

I tried the patch mentioned at the end of bug 22904 [agp/intel: Fix the pre-9xx chipset flush] (applied on top of linux-2.6.31), but X still froze on me...

Only the mouse pointer was alive on the graphics side.
The follow-up attachments contain various information that might help in understanding the freeze (kernel log with dump of task kernel stacks, gpudump, regdump, xorg log).

My steps were: start X + enlightenment, then re-enable the render engine for enlightenment, soon after which the graphics output froze. (The system was still reachable via network/ssh.)
Comment 10 Bruno 2009-09-12 01:41:09 UTC
Created attachment 29430
kernel log (with 'sysrq-t')
Comment 11 Bruno 2009-09-12 01:42:35 UTC
Created attachment 29431
intel_reg_dump (while frozen)
Comment 12 Bruno 2009-09-12 01:43:06 UTC
Created attachment 29432
Xorg log
Comment 13 Bruno 2009-09-12 01:46:50 UTC
Created attachment 29433
intel_gpu_dump (while frozen, bzip2 compressed)
Comment 14 Keith Packard 2009-09-19 17:37:57 UTC
Created attachment 29704
Set texture vertex data format and type for projective transforms

This appears to hit the projective transform path within the i8xx render code due to a slightly imprecise computation of the transform within e17 (it's so close to affine, but not quite there).

Reviewing that code yielded several mistakes, from not actually setting the texture coordinate type and format to sending four values per texture coordinate instead of three.

This patch is entirely untested, but tries to follow the documentation.
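
To illustrate what "three values per texture coordinate" means here, the following is a minimal sketch in C, not the driver's code. The names `xform3`, `texcoord3`, and `xform_point` are hypothetical, and the matrix is held in doubles purely for readability. A point mapped through a 3x3 transform yields a homogeneous coordinate (s, t, w); with an affine matrix w is always 1, while a projective matrix makes w vary per vertex, so it has to be sent along with s and t.

```c
/* Hypothetical 3x3 transform, row-major; an illustration only,
 * not the type used by the driver. */
struct xform3 {
    double m[3][3];
};

/* Homogeneous texture coordinate. The position actually sampled
 * is (s / w, t / w). */
struct texcoord3 {
    double s, t, w;
};

/* Map a destination pixel (x, y) through the transform, keeping w
 * instead of dividing it out. */
static struct texcoord3 xform_point(const struct xform3 *t,
                                    double x, double y)
{
    struct texcoord3 out;

    out.s = t->m[0][0] * x + t->m[0][1] * y + t->m[0][2];
    out.t = t->m[1][0] * x + t->m[1][1] * y + t->m[1][2];
    /* For an affine matrix the bottom row is (0, 0, 1), so w == 1. */
    out.w = t->m[2][0] * x + t->m[2][1] * y + t->m[2][2];
    return out;
}
```

With a pure translation, every vertex comes out with w equal to 1; as soon as the bottom row deviates from (0, 0, 1), w differs between vertices and all three components are needed.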
Comment 15 Bruno 2009-09-21 13:15:00 UTC
Created attachment 29724
GPU dump for driver with patch applied (while frozen, bzip2 compressed)

(In reply to comment #14)
> Created an attachment (id=29704)
> Set texture vertex data format and type for projective transforms
> 
> This appears to hit the projective transform path within the i8xx render code
> due to a slightly imprecise computation of the transform within e17 (it's so
> close to affine, but not quite there).
> 
> Reviewing that code yielded several mistakes, from not actually setting the
> texture coordinate type and format to sending four values per texture
> coordinate instead of three.
> 
> This patch is entirely untested, but tries to follow the documentation.

It does not help in my case (though I have no idea whether it does anything bad either).
Attached is a gpudump (bz2 compressed) taken with the patch from attachment #29704 applied on top of git (b4d29452b929a3ef224d3625e4bc66b787c5edb7 - More dumps for Arrandale LVDS) and a 2.6.31 kernel; the GPU still freezes.
Comment 16 Carl Worth 2009-09-21 15:45:55 UTC
Created attachment 29727
Fallback to software for non-affine transformations

Bruno,

Thanks for testing Keith's patch. We worked with it a bit and fixed one
bug in it (examining the transform for is_affine before the
transform is actually initialized), but we also couldn't get it to
work.

Here's another patch instead. This just bypasses the known-buggy
driver code by simply falling back to software for any projective
transformation matrix.

In my testing on an 865, this patch fixes the hangs when running
enlightenment. It also does mean that enlightenment runs quite a bit
more slowly, but this could be addressed by fixing enlightenment to
not set a projective transformation when not necessary, (for example,
when doing integer translation for moving a window, etc.). It appears
that enlightenment has a rounding bug preventing these
very-close-to-affine transformations from being recognized as affine.
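
The fallback decision described above can be sketched roughly as follows. This is an illustrative sketch, not the pushed patch: the 16.16 fixed-point matrix layout and the names `fixed_xform`, `xform_is_affine`, `can_accelerate`, and `snap_to_affine` are all assumptions. The exact test shows why a bottom row that is off by a single fixed-point unit is treated as projective, and `snap_to_affine` shows one way a client could round such a matrix before setting it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 16.16 fixed point; assumed here as the matrix representation. */
typedef int32_t fixed_16_16;
#define FIXED_ONE ((fixed_16_16)1 << 16)

/* Hypothetical 3x3 transform, row-major. */
struct fixed_xform {
    fixed_16_16 m[3][3];
};

/* Exact test: affine only if the bottom row is precisely (0, 0, 1). */
static bool xform_is_affine(const struct fixed_xform *t)
{
    return t->m[2][0] == 0 && t->m[2][1] == 0 && t->m[2][2] == FIXED_ONE;
}

/* Hypothetical driver-side check: true means the hardware path may be
 * used, false means fall back to software. NULL stands for "no
 * transform", which is trivially affine. */
static bool can_accelerate(const struct fixed_xform *t)
{
    return t == NULL || xform_is_affine(t);
}

/* True if a and b differ by at most eps fixed-point units. */
static bool within(fixed_16_16 a, fixed_16_16 b, fixed_16_16 eps)
{
    int64_t d = (int64_t)a - (int64_t)b;

    return d >= -(int64_t)eps && d <= (int64_t)eps;
}

/* Hypothetical client-side cleanup: if the bottom row is within eps
 * of (0, 0, 1), snap it there so the exact test above succeeds. */
static void snap_to_affine(struct fixed_xform *t, fixed_16_16 eps)
{
    if (within(t->m[2][0], 0, eps) &&
        within(t->m[2][1], 0, eps) &&
        within(t->m[2][2], FIXED_ONE, eps)) {
        t->m[2][0] = 0;
        t->m[2][1] = 0;
        t->m[2][2] = FIXED_ONE;
    }
}
```

An integer translation passes `can_accelerate` as is. The same matrix with `m[2][2]` set to `FIXED_ONE - 1` is rejected until something like `snap_to_affine` has cleaned it up, which is the situation the rounding bug appears to produce.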

I'm planning to push this change now, (and close this bug
report). Please feel free to reopen the report if you find any further
problems.

Thanks again,

-Carl
Comment 17 Carl Worth 2009-09-21 15:48:11 UTC
Fixed in xf86-video-intel master now.

-Carl

commit 505025053d66d415e1c23ac858b9238fa8541d37
Author: Carl Worth <cworth@cworth.org>
Date:   Mon Sep 21 13:50:09 2009 -0700

    8xx: Fallback for any non-affine transformation.
    
    There are definitely bugs in the 8xx code dealing with non-affine
    transformations. Disable that code for now to get things working.
    
    Fixes bug #22947 ([855GM, xf86-video-intel-2.8.0] "Freeze" when RENDER extension is being used)
Comment 18 Chris Wilson 2009-09-21 17:13:16 UTC
Created attachment 29728
Corrected version of Keith's patch

Note this patch simply comments out Carl's workaround and will break i915 - so only use it for testing on i8xx boxes!
Comment 19 Chris Wilson 2009-09-21 17:42:43 UTC
Hopefully final update... Keith split the i915 code paths from the i830, making the way clear to apply this patch.

commit 2cc1f3cb6034dddd65b3781b0cde7dff4ac1e803
Author: Keith Packard <keithp@keithp.com>
Date:   Sat Sep 19 17:30:57 2009 -0700

    i8xx: Format projective texture coordinates correctly.
    
    Projective texture coordinates must be delivered as TEXCOORDFMT_3D
    using TEXCOORDTYPE_HOMOGENOUS. This meant selecting the correct type
    in i830_texture_setup, the correct format in i830_emit_composite_state
    and sending only 3 coordinates in i830_emit_composite_primitive.
    
    Signed-off-by: Keith Packard <keithp@keithp.com>
    [ickle: tweaked to fix up a couple of use-before-initialised]
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Comment 20 Carl Worth 2009-09-22 11:43:27 UTC
Note that the current state of the driver means that code such as enlightenment
should run stably, and should also be plenty fast. Many thanks to Keith and
Chris for helping to eliminate my workaround.

-Carl
Comment 21 Bruno 2009-09-22 13:37:33 UTC
Keith, Carl, Chris,

Thanks for the fix! I've not caught a freeze yet while running xf86-video-intel from git (7e7db7ac530b5282b0841585959597b54fcc633b - Add new backlight driver "samsung") for a dozen minutes now.

And scrolling in Firefox is no longer a CPU and time hog :)

In addition, some artifacts on text redrawing (usually after a tool-tip disappears/moves) seem to be gone (they were white text on black background instead of the opposite, plus a few white pixels spread around) - probably an issue with write-through, cacheline & co for software mode that does not show up anymore as render skips that code path.