Bug 27447 - [965GM KMS Overlay 2.6.33.2] GPU hung
Summary: [965GM KMS Overlay 2.6.33.2] GPU hung
Status: CLOSED DUPLICATE of bug 24977
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Daniel Vetter
Reported: 2010-04-04 00:15 UTC by Thomas Lindroth
Modified: 2017-07-24 23:08 UTC (History)

Description Thomas Lindroth 2010-04-04 00:15:35 UTC
Bug description:
With the 2.6.33.2 kernel, KMS, and the Xv overlay, the GPU hangs after about 10 minutes of video playback. The mouse pointer still moves and the video window shows the blue colorkey, but everything else is frozen. The only way to recover is a SysRq reboot.
This is a 2.6.33.1 -> 2.6.33.2 regression. I have not been able to bisect it, but 2.6.33.2 only got two i915 patches anyway.


System environment: 
-- chipset: 965GM
-- system architecture: 64-bit
-- xf86-video-intel: 2.10.0
-- xserver: 1.7.5
-- mesa: 7.7
-- libdrm: 2.4.17
-- kernel: 2.6.33.2
-- Linux distribution: gentoo
-- Machine or mobo model: FS Amilo Si 2636
-- Display connector: HDMI

Reproducing steps:
Use 2.6.33.2 and play back Xv video with XvPreferOverlay set in xorg.conf.
It's possible the hang can be avoided by moving the mouse. I have
not been able to trigger it when doing other work at the same time.

Additional info:
Last output from /var/log/Xorg.0.log
(**) FontPath set to:
        /usr/share/fonts/misc/,
        /usr/share/fonts/TTF/,
        /usr/share/fonts/OTF,
        /usr/share/fonts/Type1/,
        /usr/share/fonts/100dpi/,
        /usr/share/fonts/75dpi/,
        /usr/share/fonts/freefonts/,
        /usr/share/fonts/corefonts/,
        /usr/share/fonts/cyrillic/,
        /usr/share/fonts/ttf-bitstream-vera/,
        /usr/share/fonts/misc/,
        /usr/share/fonts/TTF/,
        /usr/share/fonts/OTF,
        /usr/share/fonts/Type1/,
        /usr/share/fonts/100dpi/,
        /usr/share/fonts/75dpi/
(==) ModulePath set to "/usr/lib64/xorg/modules"
(WW) AllowEmptyInput is on, devices using drivers 'kbd', 'mouse' or 'vmmouse' will be disabled.
(WW) Disabling Mouse0
(WW) Disabling Keyboard0
(II) Loader magic: 0x7ed200
(II) Module ABI versions:
        X.Org ANSI C Emulation: 0.4
        X.Org Video Driver: 6.0
...skipping...
(II) intel(0): Modeline "832x624"x0.0   57.28  832 864 928 1152  624 625 628 667 -hsync -vsync (49.7 kHz)
(II) intel(0): Modeline "800x600"x0.0   49.50  800 816 896 1056  600 601 604 625 +hsync +vsync (46.9 kHz)
(II) intel(0): Modeline "1152x864"x0.0  108.00  1152 1216 1344 1600  864 865 868 900 +hsync +vsync (67.5 kHz)
(II) intel(0): Modeline "1152x720"x60.0   67.32  1152 1208 1328 1504  720 721 724 746 -hsync +vsync (44.8 kHz)
(II) intel(0): Modeline "1280x720"x60.0   74.48  1280 1336 1472 1664  720 721 724 746 -hsync +vsync (44.8 kHz)
(II) intel(0): Modeline "1280x960"x0.0  108.00  1280 1376 1488 1800  960 961 964 1000 +hsync +vsync (60.0 kHz)
(II) intel(0): Modeline "1280x1024"x0.0  108.00  1280 1328 1440 1688  1024 1025 1028 1066 +hsync +vsync (64.0 kHz)
(II) intel(0): Modeline "1600x900"x60.0  119.00  1600 1696 1864 2128  900 901 904 932 -hsync +vsync (55.9 kHz)
(II) intel(0): Modeline "1680x1050"x0.0  119.00  1680 1728 1760 1840  1050 1053 1059 1080 +hsync -vsync (64.7 kHz)
(II) intel(0): Modeline "1920x1080"x60.0  172.80  1920 2040 2248 2576  1080 1081 1084 1118 -hsync +vsync (67.1 kHz)
(II) intel(0): EDID vendor "BNQ", prod id 30784
(II) intel(0): Using hsync ranges from config file
(II) intel(0): Using vrefresh ranges from config file
(II) intel(0): Printing DDC gathered Modelines:
(II) intel(0): Modeline "1920x1080"x0.0  148.50  1920 2008 2052 2200  1080 1084 1089 1125 +hsync -vsync (67.5 kHz)
(II) intel(0): Modeline "800x600"x0.0   40.00  800 840 968 1056  600 601 605 628 +hsync +vsync (37.9 kHz)
(II) intel(0): Modeline "640x480"x0.0   31.50  640 656 720 840  480 481 484 500 -hsync -vsync (37.5 kHz)
(II) intel(0): Modeline "640x480"x0.0   25.18  640 656 752 800  480 490 492 525 -hsync -vsync (31.5 kHz)
(II) intel(0): Modeline "720x400"x0.0   28.32  720 738 846 900  400 412 414 449 -hsync +vsync (31.5 kHz)
(II) intel(0): Modeline "1280x1024"x0.0  135.00  1280 1296 1440 1688  1024 1025 1028 1066 +hsync +vsync (80.0 kHz)
(II) intel(0): Modeline "1024x768"x0.0   78.75  1024 1040 1136 1312  768 769 772 800 +hsync +vsync (60.0 kHz)
(II) intel(0): Modeline "1024x768"x0.0   65.00  1024 1048 1184 1344  768 771 777 806 -hsync -vsync (48.4 kHz)
(II) intel(0): Modeline "832x624"x0.0   57.28  832 864 928 1152  624 625 628 667 -hsync -vsync (49.7 kHz)
(II) intel(0): Modeline "800x600"x0.0   49.50  800 816 896 1056  600 601 604 625 +hsync +vsync (46.9 kHz)
(II) intel(0): Modeline "1152x864"x0.0  108.00  1152 1216 1344 1600  864 865 868 900 +hsync +vsync (67.5 kHz)
(II) intel(0): Modeline "1152x720"x60.0   67.32  1152 1208 1328 1504  720 721 724 746 -hsync +vsync (44.8 kHz)
(II) intel(0): Modeline "1280x720"x60.0   74.48  1280 1336 1472 1664  720 721 724 746 -hsync +vsync (44.8 kHz)
(II) intel(0): Modeline "1280x960"x0.0  108.00  1280 1376 1488 1800  960 961 964 1000 +hsync +vsync (60.0 kHz)
(II) intel(0): Modeline "1280x1024"x0.0  108.00  1280 1328 1440 1688  1024 1025 1028 1066 +hsync +vsync (64.0 kHz)
(II) intel(0): Modeline "1600x900"x60.0  119.00  1600 1696 1864 2128  900 901 904 932 -hsync +vsync (55.9 kHz)
(II) intel(0): Modeline "1680x1050"x0.0  119.00  1680 1728 1760 1840  1050 1053 1059 1080 +hsync -vsync (64.7 kHz)
(II) intel(0): Modeline "1920x1080"x60.0  172.80  1920 2040 2248 2576  1080 1081 1084 1118 -hsync +vsync (67.1 kHz)
(EE) intel(0): Failed to submit batch buffer, expect rendering corruption or even a frozen display: Input/output error.
(EE) intel(0): Failed to submit batch buffer, expect rendering corruption or even a frozen display: Input/output error.
(EE) intel(0): Failed to submit batch buffer, expect rendering corruption or even a frozen display: Input/output error.
[mi] EQ overflowing. The server is probably stuck in an infinite loop.

Backtrace:
0: /usr/bin/X (xorg_backtrace+0x28) [0x4b4db8]
1: /usr/bin/X (mieqEnqueue+0x1bd) [0x4b44ed]
2: /usr/bin/X (xf86PostMotionEventP+0xd8) [0x484728]
3: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7f94a337f000+0x590c) [0x7f94a338490c]
4: /usr/bin/X (0x400000+0x77527) [0x477527]
5: /usr/bin/X (0x400000+0x13af98) [0x53af98]
6: /lib/libpthread.so.0 (0x7f94a7dff000+0xedf0) [0x7f94a7e0ddf0]
7: /lib/libc.so.6 (ioctl+0x7) [0x7f94a591ff17]
8: /usr/lib/libdrm.so.2 (drmIoctl+0x23) [0x7f94a4c0a533]
9: /usr/lib/libdrm.so.2 (drmCommandWrite+0x1b) [0x7f94a4c0a7bb]
10: /usr/lib64/xorg/modules/drivers/intel_drv.so (0x7f94a4541000+0x1ab91) [0x7f94a455bb91]
11: /usr/bin/X (0x400000+0x1517ad) [0x5517ad]
12: /usr/lib64/xorg/modules/extensions/libextmod.so (0x7f94a5425000+0xfbe8) [0x7f94a5434be8]
13: /usr/bin/X (0x400000+0x30114) [0x430114]
14: /usr/bin/X (0x400000+0x24ffa) [0x424ffa]
15: /lib/libc.so.6 (__libc_start_main+0xe6) [0x7f94a5875a26]
16: /usr/bin/X (0x400000+0x24bb9) [0x424bb9]
...skipping...
It keeps on repeating like that.

Messages in /var/log/messages:
[err] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[err] render error detected, EIR: 0x00000000
[err] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -5 (awaiting 5105582 at 5105581)
[err] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[err] render error detected, EIR: 0x00000000
[err] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -5 (awaiting 5105585 at 5105584)
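
The return value -5 in the kernel messages above is most likely the negated errno EIO, which would match the "Input/output error" text in the Xorg log. Below is a minimal Python sketch for pulling the numbers out of such a line. The function name `parse_wait_error` is hypothetical, and reading "awaiting X at Y" as "requested sequence number X, last completed sequence number Y" is an assumption for illustration, not something stated in this report.

```python
import errno
import re

# Matches the i915_do_wait_request error line quoted in this report.
WAIT_RE = re.compile(
    r"i915_do_wait_request returns (-?\d+) \(awaiting (\d+) at (\d+)\)"
)


def parse_wait_error(line):
    """Extract return code and sequence numbers from one log line.

    Returns None if the line is not an i915_do_wait_request error.
    """
    m = WAIT_RE.search(line)
    if m is None:
        return None
    ret, awaiting, current = (int(g) for g in m.groups())
    return {
        # Kernel functions return negated errno values, so flip the sign.
        "errno_name": errno.errorcode.get(-ret, "unknown"),
        "awaiting": awaiting,
        "current": current,
        # Assumed meaning: how many requests the GPU is behind.
        "behind": awaiting - current,
    }


line = ("[err] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request "
        "returns -5 (awaiting 5105582 at 5105581)")
print(parse_wait_error(line))
```

Under that reading, both logged errors show the GPU stuck exactly one request behind the one being waited for.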
Comment 1 Jesse Barnes 2010-04-12 14:59:28 UTC
Any ideas, Daniel?
Comment 2 Daniel Vetter 2010-04-12 15:22:27 UTC
> --- Comment #1 from Jesse Barnes <jbarnes@virtuousgeek.org> 2010-04-12 14:59:28 PDT ---
> Any ideas, Daniel?

Dunno, besides that there's another report of overlay problems on i965
(stuck at green when first using).

Thomas, can you please rehang your gpu and grab the i915_error_state? Just
to confirm that it's indeed the overlay.
Comment 3 Thomas Lindroth 2010-04-15 05:22:12 UTC
Seems I was wrong about this being a regression. I experienced the same problem with 2.6.33.1, but that was after a week of uptime and countless videos played. There was nothing unusual about the video I was playing when it hung. I had been playing it for about an hour when the hang occurred. After the reboot I tried to play the video again and it hung after just a few minutes. I haven't been able to hang it after that, even with the 2.6.33.2 kernel, and I haven't been able to dump the i915_error_state. Next time I'll get it.
Comment 4 Thomas Lindroth 2010-04-15 23:06:26 UTC
Got another hang. Here is the output of intel_error_decode:

Time: 1271396789 s 258100 us
EIR: 0x00000000
  PGTBL_ER: 0x00000000
  INSTPM: 0x00000000
  IPEIR: 0x00000000
  IPEHR: 0x01810000
  INSTDONE: 0xffe5fafc
    busy: MCSTP
    busy: CC
    busy: DCMP
    busy: TB
    busy: EF
    busy: Primary ring 1
    busy: Primary ring 0
  ACTHD: 0x5ba18074
  INSTPS: 0x0001e000
  INSTDONE1: 0x000fffff

This was with the 2.6.33.2 kernel.
Comment 5 Daniel Vetter 2010-04-16 00:22:48 UTC
> --- Comment #4 from Thomas Lindroth <thomas.lindroth@gmail.com> 2010-04-15 23:06:26 PDT ---
> Got another hang. Here is the output of intel_error_decode:
> 
> Time: 1271396789 s 258100 us
> EIR: 0x00000000
>   PGTBL_ER: 0x00000000
>   INSTPM: 0x00000000
>   IPEIR: 0x00000000
>   IPEHR: 0x01810000
>   INSTDONE: 0xffe5fafc
>     busy: MCSTP
>     busy: CC
>     busy: DCMP
>     busy: TB
>     busy: EF
>     busy: Primary ring 1
>     busy: Primary ring 0
>   ACTHD: 0x5ba18074
>   INSTPS: 0x0001e000
>   INSTDONE1: 0x000fffff

IPEHR is MI_WAIT_FOR_EVENT | MI_WAIT_FOR_OVERLAY_FLIP. So yes, the overlay
is fried.
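
The IPEHR decode above can be checked by hand. The following Python sketch assumes the command encoding from the kernel's i915_reg.h of that era, where MI_INSTR(opcode, flags) is (opcode << 23) | flags, MI_WAIT_FOR_EVENT is opcode 0x03, and MI_WAIT_FOR_OVERLAY_FLIP is bit 16. The helper name `decode_ipehr` and the constant `MI_OPCODE_SHIFT` are illustrative only, and the function handles nothing beyond this one case.

```python
# Encoding assumed from the kernel's i915_reg.h:
#   MI_INSTR(opcode, flags) = (opcode << 23) | flags
MI_OPCODE_SHIFT = 23  # illustrative name, not a kernel identifier

MI_WAIT_FOR_EVENT = 0x03 << MI_OPCODE_SHIFT  # 0x01800000
MI_WAIT_FOR_OVERLAY_FLIP = 1 << 16           # 0x00010000


def decode_ipehr(ipehr):
    """Return symbolic names for an IPEHR value.

    Only the wait-for-overlay-flip case from this report is covered.
    """
    names = []
    opcode = ipehr >> MI_OPCODE_SHIFT
    flags = ipehr & ((1 << MI_OPCODE_SHIFT) - 1)
    if opcode == 0x03:
        names.append("MI_WAIT_FOR_EVENT")
        if flags & MI_WAIT_FOR_OVERLAY_FLIP:
            names.append("MI_WAIT_FOR_OVERLAY_FLIP")
    return names


# IPEHR value taken from the error state in comment 4.
print(decode_ipehr(0x01810000))
```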

A few things to check:
- This always happens _while_ playing video, not when stopping or
  starting?
- Does it work with user mode setting? If yes, can you please supply a
  register dump for both cases (ums and broken kms) after having used the
  overlay once?

Maybe-Duplicate-of: https://bugs.freedesktop.org/show_bug.cgi?id=24977
Comment 6 Thomas Lindroth 2010-04-16 01:12:18 UTC
> A few things to check:
> - This always happens _while_ playing video, not when stopping or
>   starting?
I have never got this crash when starting or stopping. The crash is always during playback, usually after a few minutes of viewing.

> - Does it work with user mode setting? If yes, can you please supply a
>   register dump for both cases (ums and broken kms) after having used the
>   overlay once? 
AFAIK 2.10.0 and later don't have UMS support. Do you want me to downgrade to an earlier version and get a dump?
This problem started after 2.10.0. Before that I was using UMS.

> Maybe-Duplicate-of: https://bugs.freedesktop.org/show_bug.cgi?id=24977
I'm also experiencing the green overlay bug, so it's possible these problems are related.
Comment 7 Daniel Vetter 2010-04-16 01:35:24 UTC
> --- Comment #6 from Thomas Lindroth <thomas.lindroth@gmail.com> 2010-04-16 01:12:18 PDT ---
> > - Does it work with user mode setting? If yes, can you please supply a
> >   register dump for both cases (ums and broken kms) after having used the
> >   overlay once? 
> AFAIK 2.10.0 and later doesn't have UMS support. Do you want me to downgrade to
> an earlier version and get a dump?
> This problem started after 2.10.0. Before that I was using UMS.

Another i965 overlay bug was fixed by some changes to the clock gating. So
my suspicion is that something's still wrong there. But there's no errata
in the docs, so getting a dump from the ums driver is the only way to get
a clue. Please upload the register dumps to the green overlay bug report,
I think this is the same fundamental issue.

btw, next time you report a bug please mention any other related issues.
If stuff breaks, it tends to start breaking everywhere else, too.
Comment 8 Daniel Vetter 2010-04-16 01:38:34 UTC

*** This bug has been marked as a duplicate of bug 24977 ***