Created attachment 53745 [details]
As of git commit 3b9479dc39d32fd97f80c1e5e0fac67d36ee5e40, I get window content corruption if I move a window from right to left :) (The opposite direction is not affected, nor is moving up/down.)
See attached screenshot.
Can you please describe your configuration (in particular WM and compositing mode) along with an Xorg.log?
(In reply to comment #1)
> Can you please describe your configuration (in particular WM and compositing
> mode) along with an Xorg.log?
Xorg.log attached. WM is OpenBox 3.5.0, no compositing (or at least none intentional).
The application is xfce4-terminal, but it happens with other applications as well, e.g., stardict, skype. (Not gtk2/3 related, skype is Qt.)
Created attachment 53748 [details]
(In reply to comment #3)
> Created attachment 53748 [details]
> Xorg log
Seems to be gone with the current git head.
Still none the wiser, so keep an eye out for its reoccurrence. Thanks for the report and following-up.
(In reply to comment #5)
> Still none the wiser, so keep an eye out for its reoccurrence. Thanks for the
> report and following-up.
I think this one could be the fix.
sna: Avoid the double application of drawable offsets for tiled spans
Could be... I hope not as that implies some half-evil code. Sounds like you have an interesting setup to analyze. ;-)
(In reply to comment #7)
> Could be... I hope not as that implies some half-evil code. Sounds like you
> have an interesting setup to analyze. ;-)
Well, it's not. It seems that everything is OK until GPU hang/reset. After GPU reset, I got the corruption again.
this could be related:
[ 4111.642204] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[12383.273486] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[12383.273492] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[12383.283560] [drm:i915_wait_request] *ERROR* i915_wait_request returns -11 (awaiting 618518 at 618513, next 618519)
[12383.377526] [drm:ironlake_update_pch_refclk] *ERROR* enabling SSC on PCH
[12398.607001] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[12398.607015] [drm:i915_wait_request] *ERROR* i915_wait_request returns -11 (awaiting 618523 at 618513, next 618524)
[12398.694399] [drm:ironlake_update_pch_refclk] *ERROR* enabling SSC on PCH
and tons of the following messages repeated:
[12461.190594] ------------[ cut here ]------------
[12461.190599] WARNING: at drivers/gpu/drm/i915/i915_drv.c:372 gen6_gt_force_wake_put+0x46/0x50 [i915]()
[12461.190600] Hardware name: 4178A4G
[12461.190601] Modules linked in: i915 fbcon tileblit font bitblit softcursor drm_kms_helper drm fb fbdev i2c_algo_bit cfbcopyarea cfbimgblt cfbfillrect uvcvideo videodev v4l2_compat_ioctl32 bnep aesni_intel cryptd aes_x86_64 aes_generic ecb bluetooth thinkpad_acpi hwmon snd_hda_codec_conexant arc4 iwlagn mac80211 cfg80211 e1000e intel_agp snd_hda_intel intel_gtt snd_hda_codec ehci_hcd rfkill uinput [last unloaded: sunrpc]
[12461.190619] Pid: 0, comm: kworker/0:0 Tainted: G W 3.1.0+ #154
[12461.190620] Call Trace:
[12461.190621] <IRQ> [<ffffffff8103995b>] ? warn_slowpath_common+0x7b/0xc0
[12461.190628] [<ffffffffa0313686>] ? gen6_gt_force_wake_put+0x46/0x50 [i915]
[12461.190633] [<ffffffffa031a494>] ? i915_handle_error+0x84/0xc20 [i915]
[12461.190637] [<ffffffffa0313686>] ? gen6_gt_force_wake_put+0x46/0x50 [i915]
[12461.190642] [<ffffffffa031bf53>] ? i915_hangcheck_elapsed+0x253/0x350 [i915]
[12461.190645] [<ffffffff810461cb>] ? cascade+0x7b/0xa0
[12461.190650] [<ffffffffa031bd00>] ? i915_vblank_swap+0x10/0x10 [i915]
[12461.190652] [<ffffffff81046306>] ? run_timer_softirq+0x116/0x270
[12461.190655] [<ffffffff8105f523>] ? ktime_get+0x63/0xf0
[12461.190657] [<ffffffff8103f838>] ? __do_softirq+0x98/0x120
[12461.190659] [<ffffffff814646ac>] ? call_softirq+0x1c/0x30
[12461.190662] [<ffffffff810048fd>] ? do_softirq+0x4d/0x80
[12461.190664] [<ffffffff8103fbee>] ? irq_exit+0x8e/0xd0
[12461.190667] [<ffffffff8101c1e8>] ? smp_apic_timer_interrupt+0x68/0xa0
[12461.190669] [<ffffffff81463c4b>] ? apic_timer_interrupt+0x6b/0x70
[12461.190670] <EOI> [<ffffffff81059f8a>] ? __hrtimer_start_range_ns+0x16a/0x3e0
[12461.190675] [<ffffffff812077b2>] ? intel_idle+0xc2/0x110
[12461.190678] [<ffffffff8120778e>] ? intel_idle+0x9e/0x110
[12461.190681] [<ffffffff8132ae77>] ? cpuidle_idle_call+0x97/0xe0
[12461.190683] [<ffffffff810011da>] ? cpu_idle+0xba/0x110
[12461.190686] [<ffffffff814559e6>] ? start_secondary+0x1f5/0x1fb
[12461.190687] ---[ end trace 989737665136fa51 ]---
[12461.285621] [drm:ironlake_update_pch_refclk] *ERROR* enabling SSC on PCH
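To see how often the hang recurs, the hangcheck reports can be counted from the kernel log. This is only a minimal sketch: it assumes the message text matches the lines pasted above, and the function name count_gpu_hangs is made up for illustration.

```shell
# Count "GPU hung" hangcheck reports in kernel log text read from stdin.
# count_gpu_hangs is a hypothetical helper name, not part of any tool.
count_gpu_hangs() {
    # -F treats the pattern as a fixed string, -c prints the match count.
    # grep exits non-zero when the count is 0, so mask that exit status.
    grep -cF 'Hangcheck timer elapsed... GPU hung' || true
}
```

Typical use would be something like `dmesg | count_gpu_hangs`.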
How soon after the hang do you see corruption? There will be some corruption inevitably as a result of lost data due to the hang.
I just want to establish whether we misrender in the acceleration or fallback code.
And can you attach the /sys/kernel/debug/dri/0/i915_error_state for the hang?
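Since a later hang may replace the recorded state, it can help to copy it to a regular file right away. A rough sketch, assuming debugfs is mounted at /sys/kernel/debug and the command is run as root; save_error_state and the output file naming are hypothetical.

```shell
# Copy the i915 error state to a timestamped file for attaching to the bug.
# save_error_state is a hypothetical helper; the default source path is the
# debugfs location quoted above.
save_error_state() {
    src=${1:-/sys/kernel/debug/dri/0/i915_error_state}
    dest_dir=${2:-.}
    dest="$dest_dir/i915_error_state.$(date +%Y%m%d-%H%M%S)"
    cat "$src" > "$dest" || return 1
    # Print the name of the file that was written.
    printf '%s\n' "$dest"
}
```

Calling `save_error_state` with no arguments writes into the current directory; both the source path and the destination directory can be overridden.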
Created attachment 54013 [details]
(In reply to comment #10)
> How soon after the hang do you see corruption? There will be some corruption
> inevitably as a result of lost data due to the hang.
Anytime since the hang, so it does not look like corruption during the hang.
You don't happen to have FBC enabled do you? cat /sys/kernel/debug/dri/0/i915_fbc_status
If you do can you test without, i915.i915_enable_fbc=0?
(In reply to comment #14)
> You don't happen to have FBC enabled do you? cat
> /sys/kernel/debug/dri/0/i915_fbc_status
> If you do can you test without, i915.i915_enable_fbc=0?
I have FBC enabled. I will try to test without FBC. Is it possible to set the parameter on the fly?
You can try echo 0 > /sys/module/i915/parameters/i915_enable_fbc and restarting X and then cat /sys/kernel/debug/dri/0/i915_fbc_status to confirm
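The same steps as small shell helpers, as a sketch only. The function names are hypothetical, both paths are the ones quoted above, and fbc_is_disabled assumes the status text begins with "FBC disabled" when FBC is off. Writing the parameter needs root.

```shell
# Hypothetical helpers wrapping the two paths quoted above. Each accepts an
# optional alternative path so it can be tried against an ordinary file.
fbc_status() {
    cat "${1:-/sys/kernel/debug/dri/0/i915_fbc_status}"
}

disable_fbc() {
    # Needs root; the suggestion above is to restart X afterwards.
    echo 0 > "${1:-/sys/module/i915/parameters/i915_enable_fbc}"
}

fbc_is_disabled() {
    # Assumption: the status file starts with "FBC disabled" when FBC is off.
    fbc_status "$1" | grep -q '^FBC disabled'
}
```

After `disable_fbc` and an X restart, `fbc_is_disabled && echo off` would confirm the setting took effect.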
(In reply to comment #16)
> You can try echo 0 > /sys/module/i915/parameters/i915_enable_fbc and restarting
> X and then cat /sys/kernel/debug/dri/0/i915_fbc_status to confirm
Well, it still happens even with FBC disabled.
FBC disabled: disabled per module param (default off)
btw, it seems that the system is more hang-prone if I run forcewaked (to prevent render issues).
Can you attach the error state so that I can be sure it is the same problem? rc6 issues have been related to VTd/iommu in the past, can you either disable VTd in the BIOS or pass intel_iommu=off
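To confirm the option actually reached the kernel after a reboot, the boot command line can be checked. A small sketch; has_iommu_off is a hypothetical helper name, and it only looks for the exact token intel_iommu=off.

```shell
# Report whether intel_iommu=off is present on a kernel command line.
# has_iommu_off is a hypothetical helper; with no argument it inspects
# the running kernel via /proc/cmdline.
has_iommu_off() {
    cmdline=${1:-$(cat /proc/cmdline)}
    # Pad with spaces so the option matches only as a whole word.
    case " $cmdline " in
        *" intel_iommu=off "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `has_iommu_off && echo "iommu off"` prints the message only when the option is set.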
(In reply to comment #18)
> Can you attach the error state so that I can be sure it is the same problem?
> rc6 issues have been related to VTd/iommu in the past, can you either disable
> VTd in the BIOS or pass intel_iommu=off
VTd is disabled in the BIOS all the time, I do not use it.
error state attached.
Created attachment 54065 [details]
Ok, that does look to be consistent with the first. A nuisance, as I had seen a very similar error (along with performance issues) go away after disabling FBC. A further check is that I was suffering x11perf -dot performance of around 300Kdot/s with FBC enabled and 70Mdot/s without.
(In reply to comment #21)
> Ok, that does look to be consistent with the first. A nuisance, as I had seen a
> very similar error (along with performance issues) go away after disabling FBC.
> A further check is that I was suffering x11perf -dot performance of around
> 300Kdot/s with FBC enabled and 70Mdot/s without.
I noticed performance drop with my favorite glxgears. I see drop from 6000fps to 3000fps (with FBC/without FBC, resp.).
I got 180Mdot/s without FBC. Don't know how much with FBC.
*** Bug 43587 has been marked as a duplicate of this bug. ***
(In reply to comment #23)
> *** Bug 43587 has been marked as a duplicate of this bug. ***
not sure if that's the same bug. the hang is unrelated to window move itself, the corruption of moved window just happens after any hang..
(In reply to comment #24)
> (In reply to comment #23)
> > *** Bug 43587 has been marked as a duplicate of this bug. ***
> not sure if that's the same bug. the hang is unrelated to window move itself,
> the corruption of moved window just happens after any hang..
Just go with me when I say the error states are the same, how you trigger it is up to you...
Thanks for your response in my bugreport Chris.
fbc is still enabled... maybe there is no option like this in my module (xf86-video-intel-2.17.0-r2)?
black ~ # grep . /sys/module/i915/parameters/*
and just for test:
drm:i915_hangcheck_elapsed after opening urxvt and running "less /var/log/messages"
i915_error_state is attached
> And we may as well try with rc6 and semaphores disabled for completeness.
Do you mean rc6 AND semaphores disabled or rc6 enabled and semaphores disabled too?
Created attachment 54249 [details]
I believe these are all related to the underlying bug:
Author: Chris Wilson <firstname.lastname@example.org>
Date: Wed Dec 14 13:57:23 2011 +0100
drm/i915: Only clear the GPU domains upon a successful finish
By clearing the GPU read domains before waiting upon the buffer, we run
the risk of the wait being interrupted and the domains prematurely
cleared. The next time we attempt to wait upon the buffer (after
userspace handles the signal), we believe that the buffer is idle and so
skip the wait.
There are a number of bugs across all generations which show signs of an
overly haste reuse of active buffers.
A couple of those pre-date i915_gem_object_finish_gpu(), so may be
unrelated (such as a wild write from a userspace command buffer), but
this does look like a convincing cause for most of those bugs.
Signed-off-by: Chris Wilson <email@example.com>
Reviewed-by: Daniel Vetter <firstname.lastname@example.org>
Reviewed-by: Eugeni Dodonov <email@example.com>
Signed-off-by: Daniel Vetter <firstname.lastname@example.org>