Summary: | 965GM firefox crashes/corruption of screen GPU hung | |
---|---|---|---
Product: | xorg | Reporter: | Martin Sillence <martin>
Component: | Driver/intel | Assignee: | Chris Wilson <chris>
Status: | RESOLVED INVALID | QA Contact: | Xorg Project Team <xorg-team>
Severity: | major | |
Priority: | medium | CC: | kenyon
Version: | 7.5 (2009.10) | |
Hardware: | x86-64 (AMD64) | |
OS: | Linux (All) | |
Whiteboard: | | |
i915 platform: | | i915 features: |
Attachments: | | |
Description
Martin Sillence
2010-05-21 06:37:16 UTC
Created attachment 35781 [details]
xorg log of failure
On Fri, May 21, 2010 at 06:37:16 -0700, bugzilla-daemon@freedesktop.org wrote:
> lib drm version: libdrm-intel1 2.4.18-5

you should probably upgrade that.

> > lib drm version: libdrm-intel1 2.4.18-5
> you should probably upgrade that.
OK upgraded to
libdrm-intel1 2.4.20-2
and kernel: 2.6.34-1-amd64 #1 SMP
Now get:
Kernel:
May 21 15:28:41 griffin kernel: [ 302.284025] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
May 21 15:28:41 griffin kernel: [ 302.284207] render error detected, EIR: 0x00000000
May 21 15:28:41 griffin kernel: [ 302.284243] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -5 (awaiting 9126 at 9125)
May 21 15:28:43 griffin kernel: [ 304.120040] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
May 21 15:28:43 griffin kernel: [ 304.120051] render error detected, EIR: 0x00000000
May 21 15:28:43 griffin kernel: [ 304.120079] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -5 (awaiting 9130 at 9129)
Xorg:
(WW) intel(0): i830_uxa_pixmap_swap_bo_with_image: bo map failed
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
(WW) intel(0): i830_uxa_pixmap_swap_bo_with_image: bo map failed
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
(WW) intel(0): i830_uxa_pixmap_swap_bo_with_image: bo map failed
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
(WW) intel(0): i830_uxa_pixmap_swap_bo_with_image: bo map failed
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
(EE) intel(0): Failed to submit batch buffer, expect rendering corruption or even a frozen display: Input/output error.
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
Fatal server error:
Failed to map batchbuffer: Input/output error
Created attachment 35784 [details]
last batch buffer before GPU hang
As per: http://intellinuxgraphics.org/i915_error_state.html

Oh, this bug. I've only seen this after removing the MI_FLUSH, but since I can neither explain how the code blows up nor how the random MI_FLUSH prevents the aforementioned explosion, I suspect you've found an instance where it blows up even with the unexplainable flushing. Fixing this crippling i965 bug is definitely on my todo list.

Created attachment 35786 [details]
Another failure without the crash
Don't know if it's always the same failure or how to check but this is a trace where everything seems to still be working but there are the errors in the kernel log.
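
One rough way to compare failures between runs is to summarize the i915 messages from the kernel log. The following is only a sketch: the regular expressions are written against the three message formats quoted in this report, and `summarize_hangs` is a hypothetical helper name, not part of any driver tooling. It counts hangcheck events and collects the EIR values and the (return code, awaited seqno, current seqno) triples, which can then be compared by eye. It does not identify the cause of a hang.

```python
import re
from collections import Counter

# Patterns written against the kernel messages quoted in this report only;
# other kernel versions may word these messages differently (assumption).
HANG_RE = re.compile(r"\[drm:i915_hangcheck_elapsed\] \*ERROR\* Hangcheck timer elapsed")
EIR_RE = re.compile(r"render error detected, EIR: (0x[0-9a-fA-F]+)")
WAIT_RE = re.compile(
    r"i915_do_wait_request returns (-?\d+) \(awaiting (\d+) at (\d+)\)"
)


def summarize_hangs(lines):
    """Hypothetical helper: summarize i915 hang messages from kernel log lines."""
    summary = {"hangs": 0, "eir": Counter(), "waits": []}
    for line in lines:
        if HANG_RE.search(line):
            summary["hangs"] += 1
        eir = EIR_RE.search(line)
        if eir:
            summary["eir"][eir.group(1)] += 1
        wait = WAIT_RE.search(line)
        if wait:
            # (return code, awaited seqno, current seqno); -5 is -EIO,
            # which matches the "Input/output error" seen in the Xorg log.
            summary["waits"].append(tuple(int(g) for g in wait.groups()))
    return summary


# Sample lines copied from the kernel log earlier in this report.
SAMPLE = [
    "May 21 15:28:41 griffin kernel: [ 302.284025] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung",
    "May 21 15:28:41 griffin kernel: [ 302.284207] render error detected, EIR: 0x00000000",
    "May 21 15:28:41 griffin kernel: [ 302.284243] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -5 (awaiting 9126 at 9125)",
]

result = summarize_hangs(SAMPLE)
print(result["hangs"], dict(result["eir"]), result["waits"])
```

To run it against a real log, pass any iterable of lines, for example an open file object for the kernel log (its location depends on the distribution).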
If it helps, it seems reasonably easy to provoke the crash/failure. Start up firefox with multiple saved tabs (10 or so) and wait for them to load. That seems enough to break X. I was also running google's chrome when X died. Takes less than a minute for X to fail.

Hi,
I've been testing the latest package in debian created by Cyril:

> Cyril Brulebois <kibi@debian.org> (12/07/2010):
>> It would be nice to know how it goes with the packages I built (for
>> i386 + amd64) and uploaded there:
>> http://people.debian.org/~kibi/packages/xserver-xorg-video-intel/
>
> I've put a new version there: 2.12.0-1+ickle2

It looks a lot better already - in my limited testing - no crashes yet and it was so easy to provoke before.

modeset=0 is vital for it to work.

I note x-video isn't working/supported:

$ xvinfo
X-Video Extension version 2.2
screen #0
no adaptors present

Apart from that a massive improvement, many thanks.

To link them up the Debian bug is: 551387

Hibernate and resume is working.
Thanks again,
M

GPU hung for Gen3 graphics (like 945GM) should be fixed in 2.6.35-rc6

With Gen4 hardware (like G45), I found that the bug appears with the upgrade to mesa 7.8.2 and libdrm 2.4.21.
I have no more issues when I roll back to mesa 7.7 and libdrm 2.4.19.
(Tested with kernel 2.6.32 and 2.6.34)

I have a 945GME (Asus Eee 1000H) and apparently run into the very same issue, which can be very annoying. It does NOT happen that fast for me though; it only occurs rarely (every two weeks or so) and not always when I have a lot of firefox tabs open. I was using intel 2.12.0.
Also it was triggered with Opera, so it's not just firefox causing this.

I screenshotted the text corruption:
http://eloxoph.com/intel2.12corruption1.png
http://eloxoph.com/intel2.12corruption11.png

Also I got similar Xorg.0.log entries:
[ 9898.021] (WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
[ 9898.021] (WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
[ 9898.022] (WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
[ 9898.022] (WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error

Very interestingly, I also got kernel oopses which fired exactly when the text corruption started:
WARNING: at mm/highmem.c:453 debug_kmap_atomic+0xad/0x12a()
Hardware name: 1000H
Modules linked in: fuse sunrpc cpufreq_ondemand acpi_cpufreq xt_physdev nf_conntrack_netbios_ns ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 vboxnetadp vboxnetflt vboxdrv uinput snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd iTCO_wdt eeepc_laptop iTCO_vendor_support soundcore rt2860sta snd_page_alloc uvcvideo sparse_keymap rfkill atl1e videodev v4l1_compat joydev microcode aes_i586 aes_generic xts gf128mul dm_crypt i915 drm_kms_helper drm i2c_algo_bit i2c_core video output [last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper Not tainted 2.6.33.6-147.fc13.i686.PAE #1
Call Trace:
[<c043d69d>] warn_slowpath_common+0x65/0x7c
[<c04b12e1>] ? debug_kmap_atomic+0xad/0x12a
[<c043d6c1>] warn_slowpath_null+0xd/0x10
[<c04b12e1>] debug_kmap_atomic+0xad/0x12a
[<c042a91b>] kmap_atomic_prot+0x5c/0x10c
[<c04c9162>] ? __kmalloc+0x103/0x10f
[<c042a9df>] kmap_atomic+0x14/0x16
[<f8008083>] i915_error_object_create+0x9f/0xfa [i915]
[<f80083f2>] i915_handle_error+0x314/0x813 [i915]
[<f8008990>] i915_hangcheck_elapsed+0x9f/0xdf [i915]
[<c0448749>] run_timer_softirq+0x163/0x1e6
[<f80088f1>] ? i915_hangcheck_elapsed+0x0/0xdf [i915]
[<c0442a79>] __do_softirq+0xac/0x152
[<c0442b50>] do_softirq+0x31/0x3c
[<c0442c64>] irq_exit+0x29/0x5c
[<c041d6d7>] smp_apic_timer_interrupt+0x6f/0x7d
[<c078358d>] apic_timer_interrupt+0x31/0x38
[<c045007b>] ? __cancel_work_timer+0x12a/0x15d
[<c05ffe10>] ? acpi_idle_enter_simple+0x10a/0x13d
[<c06d886b>] cpuidle_idle_call+0x6e/0xc3
[<c0407ab8>] cpu_idle+0x91/0xad
[<c077e7a7>] start_secondary+0x1f5/0x233

uname -r:
2.6.33.6-147.2.4.fc13.i686.PAE

glxinfo | grep Mesa:
client glx vendor string: Mesa Project and SGI
OpenGL renderer string: Mesa DRI Intel(R) 945GME GEM 20100328 2010Q1
OpenGL version string: 1.4 Mesa 7.8.1

Reopening programs does FREE them from text corruption for me!
.. .. BUT if I open blender after any corruption has been shown, X dies instantly with a black screen/freeze (I tested this both times that the text corruption oddness has struck me) - after a reboot everything's fine again and blender also runs perfectly.
Reproducing the issue willingly seems close to impossible for me - it really just happens rarely.

Hi,
I've tried the latest release 2.13 with the latest kernel 2.6.35-trunk (2.6.35-rc6 was too unstable). This combination still results in gpu hung messages and applications crashing.

Is there anything I can do to help?
Do more reports help here?
Would you like a new bug with the logs, or should they be attached to this one?
Would remote access to my laptop help?

Thanks,
M

see Bug 30637 (hardware memory fault)