89467 – [ivb gt2] GPU hangs in 4.0: where did my seqno writes go?

Summary: [ivb gt2] GPU hangs in 4.0: where did my seqno writes go?

Status:	CLOSED INVALID

Alias:	None

Product:	DRI
Classification:	Unclassified
Component:	DRM/Intel
Version:	XOrg git
Hardware:	Other All

Importance:	medium normal
Assignee:	Intel GFX Bugs mailing list
QA Contact:	Intel GFX Bugs mailing list

Reported:	2015-03-06 12:41 UTC by Alin M Elena
Modified:	2017-07-06 17:31 UTC
CC List:	1 user

Attachments
first error (2.04 MB, text/plain) 2015-03-06 12:41 UTC, Alin M Elena	no flags	Details
second error (2.04 MB, text/plain) 2015-03-06 12:42 UTC, Alin M Elena	no flags	Details
kwin error 1 (2.35 MB, text/plain) 2015-03-06 13:08 UTC, Alin M Elena	no flags	Details
kwin_error 2 (2.35 MB, text/plain) 2015-03-06 13:09 UTC, Alin M Elena	no flags	Details
kwin error 3 (2.35 MB, text/plain) 2015-03-06 13:09 UTC, Alin M Elena	no flags	Details
plasma crash (2.35 MB, text/plain) 2015-03-06 13:09 UTC, Alin M Elena	no flags	Details
kernel 3.10 error (2.14 MB, text/plain) 2015-03-07 07:50 UTC, Alin M Elena	no flags	Details
kernel 3.11 (2.14 MB, text/plain) 2015-03-07 07:52 UTC, Alin M Elena	no flags	Details
kernel 3.12 e1 (2.15 MB, text/plain) 2015-03-07 07:59 UTC, Alin M Elena	no flags	Details
kernel 3.12 e2 (2.15 MB, text/plain) 2015-03-07 08:01 UTC, Alin M Elena	no flags	Details
kernel 3.13 e1 (2.14 MB, text/plain) 2015-03-07 21:39 UTC, Alin M Elena	no flags	Details
kernel 3.13 e2 (2.14 MB, text/plain) 2015-03-07 21:41 UTC, Alin M Elena	no flags	Details
kernel 3.13 e3 (2.17 MB, text/plain) 2015-03-07 21:42 UTC, Alin M Elena	no flags	Details
kernel 3.13 e4 (2.17 MB, text/plain) 2015-03-07 21:44 UTC, Alin M Elena	no flags	Details
kernel 3.13 e5 (2.15 MB, text/plain) 2015-03-07 21:46 UTC, Alin M Elena	no flags	Details
kernel 3.14 e1 (2.14 MB, text/plain) 2015-03-07 21:51 UTC, Alin M Elena	no flags	Details
kernel 3.14 e2 (2.14 MB, text/plain) 2015-03-07 21:52 UTC, Alin M Elena	no flags	Details
kernel 3.14 e3 (2.15 MB, text/plain) 2015-03-07 21:54 UTC, Alin M Elena	no flags	Details
kernel 3.14 e4 (2.14 MB, text/plain) 2015-03-07 21:55 UTC, Alin M Elena	no flags	Details
kernel 3.15 e1 (2.15 MB, text/plain) 2015-03-08 09:19 UTC, Alin M Elena	no flags	Details
kernel 3.15 e2 (2.15 MB, text/plain) 2015-03-08 09:20 UTC, Alin M Elena	no flags	Details
kernel 3.19 e1 (2.19 MB, text/plain) 2015-03-09 12:03 UTC, Alin M Elena	no flags	Details
kernel 3.19 e2 (2.19 MB, text/plain) 2015-03-09 12:04 UTC, Alin M Elena	no flags	Details
kernel 3.19 e3 (2.19 MB, text/plain) 2015-03-09 12:04 UTC, Alin M Elena	no flags	Details
kernel 3.19 e4 (2.19 MB, text/plain) 2015-03-09 12:05 UTC, Alin M Elena	no flags	Details
kernel 3.19 e5 (2.19 MB, text/plain) 2015-03-09 12:06 UTC, Alin M Elena	no flags	Details
kernel 4.0.0-rc3 e1 (2.52 MB, text/plain) 2015-03-09 14:13 UTC, Alin M Elena	no flags	Details
kernel 4.0.0-rc3 e2 (2.19 MB, text/plain) 2015-03-09 14:14 UTC, Alin M Elena	no flags	Details
kernel 4.0.0-rc3 e3 (2.35 MB, text/plain) 2015-03-09 14:14 UTC, Alin M Elena	no flags	Details
kernel 4.0.0-rc3 e4 (2.19 MB, text/plain) 2015-03-09 14:16 UTC, Alin M Elena	no flags	Details
kernel 4.0.0-rc3 e1 dri disabled (2.04 MB, text/plain) 2015-03-09 15:08 UTC, Alin M Elena	no flags	Details
kernel 4.0.0-rc3 e2 dri disabled (2.04 MB, text/plain) 2015-03-09 15:14 UTC, Alin M Elena	no flags	Details
Just a random shot in the dark (949 bytes, patch) 2015-03-09 21:28 UTC, Chris Wilson	no flags	Details \| Splinter Review
kernel 4.0.0-rc3 dri enabled + patch t1 e1 (2.19 MB, text/plain) 2015-03-10 06:54 UTC, Alin M Elena	no flags	Details
kernel 4.0.0-rc3 dri enabled + patch t1 e2 (2.19 MB, text/plain) 2015-03-10 06:55 UTC, Alin M Elena	no flags	Details
kernel 4.0.0-rc3 dri enabled + patch t1 e3 (2.35 MB, text/plain) 2015-03-10 06:56 UTC, Alin M Elena	no flags	Details
kernel 4.0.0-rc3 dri enabled + patch t1 e4 (2.19 MB, text/plain) 2015-03-10 06:57 UTC, Alin M Elena	no flags	Details
kernel 4.0.0-rc3 dri enabled + patch t1 e5 (2.19 MB, text/plain) 2015-03-10 06:58 UTC, Alin M Elena	no flags	Details
kernel 4.0.0-rc3 dri enabled + patch t1 e6 (2.19 MB, text/plain) 2015-03-10 06:59 UTC, Alin M Elena	no flags	Details
kernel 4.0.0-rc3 dri enabled + patch t1 e7 (2.64 MB, text/plain) 2015-03-10 07:31 UTC, Alin M Elena	no flags	Details
kernel 4.0.0-rc3 + patch t1 dri disabled e1 (2.04 MB, text/plain) 2015-03-10 11:45 UTC, Alin M Elena	no flags	Details
kernel 4.0.0-rc3 dri enabled + patch t1 rc6=0 e1 (2.19 MB, text/plain) 2015-03-10 16:05 UTC, Alin M Elena	no flags	Details
kernel 4.0.0-rc3 dri enabled + patch t1 rc6=0 e2 (2.52 MB, text/plain) 2015-03-10 16:05 UTC, Alin M Elena	no flags	Details
kernel 4.0.0-rc3 dri enabled + patch t1 rc6=0 e3 (2.05 MB, text/plain) 2015-03-10 16:06 UTC, Alin M Elena	no flags	Details

Description Alin M Elena 2015-03-06 12:41:51 UTC

Created attachment 114084 [details]
first error

The desktop freezes for a few seconds and dmesg reports a GPU hang.

Error state attached.

In my Xorg configuration:
Section "Device"
   Identifier  "Intel Graphics"
   Driver      "intel"
#   Option     "AccelMethod"  "uxa"
   Option      "DRI" "False"
EndSection

When the freeze happens again, the code may change.

Comment 1 Alin M Elena 2015-03-06 12:42:44 UTC

Created attachment 114085 [details]
second error

Comment 2 Chris Wilson 2015-03-06 12:51:21 UTC

Please also attach your Xorg.0.log.

The first looks familiar...

Comment 3 Chris Wilson 2015-03-06 12:57:19 UTC

The second is batch buffer incoherence.

Comment 4 Chris Wilson 2015-03-06 13:01:04 UTC

Could you see if this makes the X hangs go away:

xf86-video-intel:
diff --git a/src/sna/kgem.c b/src/sna/kgem.c
index a5571aa..adf52d6 100644
--- a/src/sna/kgem.c
+++ b/src/sna/kgem.c
@@ -83,7 +83,7 @@ search_snoop_cache(struct kgem *kgem, unsigned int num_pages, unsigned flags);
 #define DBG_NO_FAST_RELOC 0
 #define DBG_NO_HANDLE_LUT 0
 #define DBG_NO_WT 0
-#define DBG_NO_WC_MMAP 0
+#define DBG_NO_WC_MMAP 1
 #define DBG_DUMP 0
 #define DBG_NO_MALLOC_CACHE 0

Comment 5 Alin M Elena 2015-03-06 13:08:02 UTC

I will try, Chris.
Is there a preferred place to get the kernel from?

I have enabled DRI and now I see more errors; I have attached them too.

The GPU hang is fairly old; I have had it since summer. 3.14 was probably the last kernel where things were OK.

regards,
Alin

Comment 6 Alin M Elena 2015-03-06 13:08:36 UTC

Created attachment 114086 [details]
kwin error 1

Comment 7 Alin M Elena 2015-03-06 13:09:02 UTC

Created attachment 114087 [details]
kwin_error 2

Comment 8 Alin M Elena 2015-03-06 13:09:24 UTC

Created attachment 114088 [details]
kwin error 3

Comment 9 Alin M Elena 2015-03-06 13:09:47 UTC

Created attachment 114089 [details]
plasma crash

Comment 10 Alin M Elena 2015-03-07 07:49:53 UTC

I have built and tried 3.10 and 3.11.

Both showed crashes: 3.10 quickly, 3.11 only once and after a while.
Now on 3.12.
I have attached the errors.

Alin

Comment 11 Alin M Elena 2015-03-07 07:50:37 UTC

Created attachment 114100 [details]
kernel 3.10 error

Comment 12 Alin M Elena 2015-03-07 07:52:07 UTC

Created attachment 114101 [details]
kernel 3.11

Comment 13 Alin M Elena 2015-03-07 07:59:50 UTC

Created attachment 114102 [details]
kernel 3.12 e1

Comment 14 Alin M Elena 2015-03-07 08:01:50 UTC

Created attachment 114103 [details]
kernel 3.12 e2

Comment 15 Chris Wilson 2015-03-07 09:48:02 UTC

Time for good/bad news. The errors on 3.10/3.11/3.12 are a different error originating from mesa. Could you make sure you have DRI "false" in your xorg.conf and see if the errors persist? Even if they do, could you run with kernels 3.13->4.0 and attach the error states, so I can see when the error switches over to the more troubling incoherence issue?

Comment 16 Alin M Elena 2015-03-07 09:51:15 UTC

That is great news. Yes, I will continue with the new kernels (some of them are already built) over the weekend and report.

regards,
Alin

Comment 17 Alin M Elena 2015-03-07 21:38:13 UTC

3.13 started to show the crashes I am used to. Errors added.

Comment 18 Alin M Elena 2015-03-07 21:39:25 UTC

Created attachment 114119 [details]
kernel 3.13 e1

Comment 19 Alin M Elena 2015-03-07 21:41:22 UTC

Created attachment 114120 [details]
kernel 3.13 e2

Comment 20 Alin M Elena 2015-03-07 21:42:58 UTC

Created attachment 114121 [details]
kernel 3.13 e3

Comment 21 Alin M Elena 2015-03-07 21:44:38 UTC

Created attachment 114122 [details]
kernel 3.13 e4

Comment 22 Alin M Elena 2015-03-07 21:46:28 UTC

Created attachment 114123 [details]
kernel 3.13 e5

Comment 23 Alin M Elena 2015-03-07 21:47:52 UTC

Kernel 3.14: the same freezes... I am uploading the errors.

Comment 24 Alin M Elena 2015-03-07 21:51:11 UTC

Created attachment 114124 [details]
kernel 3.14 e1

Comment 25 Alin M Elena 2015-03-07 21:52:42 UTC

Created attachment 114125 [details]
kernel 3.14 e2

Comment 26 Alin M Elena 2015-03-07 21:54:33 UTC

Created attachment 114126 [details]
kernel 3.14 e3

Comment 27 Alin M Elena 2015-03-07 21:55:47 UTC

Created attachment 114127 [details]
kernel 3.14 e4

Comment 28 Chris Wilson 2015-03-08 09:09:41 UTC

3.13/3.14 has started to die after the context switch and before the batchbuffer start. Different again. This is fun!

Comment 29 Alin M Elena 2015-03-08 09:11:17 UTC

Great, time to move to 3.15.

Comment 30 Alin M Elena 2015-03-08 09:18:13 UTC

With 3.15 the crashes come fast... states added.

Comment 31 Alin M Elena 2015-03-08 09:19:14 UTC

Created attachment 114130 [details]
kernel 3.15 e1

Comment 32 Alin M Elena 2015-03-08 09:20:23 UTC

Created attachment 114131 [details]
kernel 3.15 e2

Comment 33 Alin M Elena 2015-03-09 12:03:00 UTC

kernel 3.19 error state

Comment 34 Alin M Elena 2015-03-09 12:03:58 UTC

Created attachment 114156 [details]
kernel 3.19 e1

Comment 35 Alin M Elena 2015-03-09 12:04:29 UTC

Created attachment 114157 [details]
kernel 3.19 e2

Comment 36 Alin M Elena 2015-03-09 12:04:58 UTC

Created attachment 114158 [details]
kernel 3.19 e3

Comment 37 Alin M Elena 2015-03-09 12:05:28 UTC

Created attachment 114159 [details]
kernel 3.19 e4

Comment 38 Alin M Elena 2015-03-09 12:06:00 UTC

Created attachment 114160 [details]
kernel 3.19 e5

Comment 39 Chris Wilson 2015-03-09 12:12:31 UTC

The 3.19 error states all look to be mesa hangs. Could you capture a fresh 4.0 error state as well?

Comment 40 Alin M Elena 2015-03-09 14:12:36 UTC

latest 4.0.0-rc3 states attached

Comment 41 Alin M Elena 2015-03-09 14:13:26 UTC

Created attachment 114164 [details]
kernel 4.0.0-rc3 e1

Comment 42 Alin M Elena 2015-03-09 14:14:17 UTC

Created attachment 114165 [details]
kernel 4.0.0-rc3 e2

Comment 43 Alin M Elena 2015-03-09 14:14:46 UTC

Created attachment 114166 [details]
kernel 4.0.0-rc3 e3

Comment 44 Alin M Elena 2015-03-09 14:16:10 UTC

Created attachment 114167 [details]
kernel 4.0.0-rc3 e4

Comment 45 Chris Wilson 2015-03-09 14:23:39 UTC

Now we have a split between death inside mesa (similar to the earlier error states) for kwin_x11 and a death on context restore like from around 3.13/3.14.

Comment 46 Alin M Elena 2015-03-09 15:07:52 UTC

DRI disabled, kernel 4.0.0-rc3

Comment 47 Alin M Elena 2015-03-09 15:08:30 UTC

Created attachment 114168 [details]
kernel 4.0.0-rc3 e1 dri disabled

Comment 48 Alin M Elena 2015-03-09 15:14:48 UTC

Created attachment 114169 [details]
kernel 4.0.0-rc3 e2 dri disabled

Comment 49 Chris Wilson 2015-03-09 15:29:23 UTC

(In reply to Alin M Elena from comment #47)
> Created attachment 114168 [details]
> kernel 4.0.0-rc3 e1 dri disabled

Hangcheck be broken.

Comment 50 Chris Wilson 2015-03-09 15:33:59 UTC

(In reply to Alin M Elena from comment #48)
> Created attachment 114169 [details]
> kernel 4.0.0-rc3 e2 dri disabled

The GPU appears to have fallen asleep.

Comment 51 Chris Wilson 2015-03-09 16:13:57 UTC

(In reply to Chris Wilson from comment #49)
> (In reply to Alin M Elena from comment #47)
> > Created attachment 114168 [details]
> > kernel 4.0.0-rc3 e1 dri disabled
> 
> Hangcheck be broken.

Ok, it's a bit deeper than that:

last seqno write in ringbuffer: 1c2d
last seqno in hws: 182d

Comment 52 Chris Wilson 2015-03-09 16:40:50 UTC

Every batch before and after the last successful hws is using the same batch, which implies it must have been considered idle (i.e. seqno had advanced). Again something fishy with obj->active.

Comment 53 Chris Wilson 2015-03-09 16:50:09 UTC

Another interesting tidbit:

SYNC_1: 0x00001c2d which corresponds with the ringbuffer.

So it appears that the hws ended up with a stale value long after "later" writes landed.
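
As a worked check of the two values quoted above (ringbuffer/SYNC_1 seqno 0x1c2d, hws seqno 0x182d), here is a minimal sketch in C. The helper names are hypothetical and are not i915 functions; the comparison uses the usual signed-difference idiom so that it stays correct across 32-bit wraparound. It is only arithmetic on the quoted numbers and does not by itself explain the stale value.

```c
#include <stdint.h>

/* Hypothetical helper: count the bit positions in which two seqnos differ. */
int seqno_bits_differing(uint32_t a, uint32_t b)
{
    uint32_t x = a ^ b;
    int n = 0;

    while (x) {
        x &= x - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical helper: wrap-safe test for "a is at or past b". */
int seqno_reached(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) >= 0;
}
```

For these inputs, 0x1c2d ^ 0x182d is 0x0400, so the two values differ in exactly one bit (bit 10), a gap of 1024 seqnos. seqno_reached(0x182d, 0x1c2d) returns 0, i.e. the hws value is behind what the ring last wrote.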

Comment 54 Chris Wilson 2015-03-09 21:12:46 UTC

I don't have a good explanation, but we may as well try punching a few things and see what we can break:

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e5b3c6dbd467..c972f24d50cc 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1087,18 +1087,20 @@ gen6_add_request(struct intel_engine_cs *ring)
        int ret;
 
        if (ring->semaphore.signal)
-               ret = ring->semaphore.signal(ring, 4);
+               ret = ring->semaphore.signal(ring, 6);
        else
-               ret = intel_ring_begin(ring, 4);
+               ret = intel_ring_begin(ring, 6);
 
        if (ret)
                return ret;
 
+       intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_DISABLE);
        intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
        intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
        intel_ring_emit(ring,
                    i915_gem_request_get_seqno(ring->outstanding_lazy_request));
        intel_ring_emit(ring, MI_USER_INTERRUPT);
+       intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
        __intel_ring_advance(ring);
 
        return 0;
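
The patch above raises the reserved dword count from 4 to 6 because it emits two extra commands around the seqno write. A minimal sketch of that bookkeeping with a toy ring follows. Every name here is hypothetical, the command values are placeholders rather than real MI opcodes, and this is not the i915 ring implementation; it only shows why the reservation has to match the number of dwords emitted.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_DWORDS 64

struct toy_ring {
    uint32_t buf[TOY_RING_DWORDS];
    size_t tail;      /* index of the next dword to write */
    size_t reserved;  /* dwords still owed to the current begin() */
};

/* Reserve space for n dwords; returns 0 on success, -1 if it cannot. */
int toy_ring_begin(struct toy_ring *r, size_t n)
{
    if (r->reserved != 0 || r->tail + n > TOY_RING_DWORDS)
        return -1;
    r->reserved = n;
    return 0;
}

/* Emit one dword; returns -1 if no reserved space is left for it. */
int toy_ring_emit(struct toy_ring *r, uint32_t dw)
{
    if (r->reserved == 0)
        return -1;
    r->buf[r->tail++] = dw;
    r->reserved--;
    return 0;
}

/* Emit a six-dword sequence shaped like the patched request. */
int toy_add_request(struct toy_ring *r, uint32_t seqno)
{
    const uint32_t cmds[6] = {
        0xA0,   /* placeholder: arbitration off */
        0xB0,   /* placeholder: store dword index */
        0xB1,   /* placeholder: hws index */
        seqno,  /* the value that should land in the hws */
        0xC0,   /* placeholder: user interrupt */
        0xA1,   /* placeholder: arbitration on */
    };
    size_t i;

    if (toy_ring_begin(r, 6))
        return -1;
    for (i = 0; i < 6; i++)
        if (toy_ring_emit(r, cmds[i]))
            return -1;
    return 0;
}
```

In this toy model, reserving only 4 dwords and then emitting 6 fails on the fifth emit, which is the mismatch the 4 -> 6 change in the patch avoids.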

Comment 55 Chris Wilson 2015-03-09 21:28:34 UTC

Created attachment 114176 [details] [review]
Just a random shot in the dark

Comment 56 Alin M Elena 2015-03-10 06:34:35 UTC

DRI disabled: clean dmesg after one night.
DRI enabled... errors are still there.

Comment 57 Alin M Elena 2015-03-10 06:53:43 UTC

With DRI enabled there are still issues, the same as before. In addition:
[  742.562429] [drm] GPU HANG: ecode 7:0:0x97f4ffff, in chromium [2431], reason: Ring hung, action: reset
[  742.562440] ------------[ cut here ]------------
[  742.562486] WARNING: CPU: 1 PID: 2308 at /home/alin/lavello/linux/drivers/gpu/drm/i915/intel_display.c:9574 intel_mmio_flip_work_func+0x2ea/0x310 [i915]()
[  742.562487] WARN_ON(__i915_wait_request(mmio_flip->req, crtc->reset_counter, false, NULL, NULL) != 0)
[  742.562489] Modules linked in:
[  742.562490]  ctr ccm fuse ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet bnep dell_wmi sparse_keymap nls_iso8859_1 nls_cp437 vfat fat arc4 ath9k ath9k_common ath9k_hw snd_hda_codec_hdmi iTCO_wdt ath iTCO_vendor_support snd_hda_codec_realtek snd_hda_codec_generic mac80211 snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_timer kvm dm_mod snd crct10dif_pclmul ath3k btusb crc32_pclmul cfg80211 dell_laptop crc32c_intel uvcvideo dcdbas bluetooth ghash_clmulni_intel aesni_intel videobuf2_vmalloc aes_x86_64 videobuf2_memops glue_helper videobuf2_core lrw gf128mul ablk_helper rndis_host v4l2_common cryptd cdc_ether videodev usbnet mei_me joydev mii rfkill mei serio_raw pcspkr lpc_ich
[  742.562532]  i2c_i801 shpchp mfd_core soundcore tpm_tis tpm wmi thermal battery processor ac efivarfs xhci_pci xhci_hcd i915 i2c_algo_bit drm_kms_helper drm video button sg
[  742.562544] CPU: 1 PID: 2308 Comm: kworker/1:1 Tainted: G     U          4.0.0-rc3-1.gf264c86-desktop+ #1
[  742.562545] Hardware name: Dell Inc.          XPS L322X/0PJHXN, BIOS A10 08/28/2013
[  742.562561] Workqueue: events intel_mmio_flip_work_func [i915]
[  742.562563]  ffffffffa0193448 ffff8800b26a7ce8 ffffffff81678e34 0000000000000000
[  742.562565]  ffff8800b26a7d38 ffff8800b26a7d28 ffffffff810657aa ffff880235b70000
[  742.562568]  ffff88003f8f68b0 ffff8800b3fb9f40 ffff88003f8f6000 ffffe8ffffc41900
[  742.562570] Call Trace:
[  742.562577]  [<ffffffff81678e34>] dump_stack+0x4c/0x6e
[  742.562582]  [<ffffffff810657aa>] warn_slowpath_common+0x8a/0xc0
[  742.562585]  [<ffffffff81065826>] warn_slowpath_fmt+0x46/0x50
[  742.562602]  [<ffffffffa012e06a>] intel_mmio_flip_work_func+0x2ea/0x310 [i915]
[  742.562605]  [<ffffffff810893ed>] ? finish_task_switch+0x5d/0x100
[  742.562609]  [<ffffffff8107dfb5>] process_one_work+0x145/0x440
[  742.562611]  [<ffffffff8107e3d1>] worker_thread+0x121/0x450
[  742.562614]  [<ffffffff8107e2b0>] ? process_one_work+0x440/0x440
[  742.562616]  [<ffffffff810836f9>] kthread+0xc9/0xe0
[  742.562629]  [<ffffffff81083630>] ? kthread_create_on_node+0x180/0x180
[  742.562631]  [<ffffffff8167f798>] ret_from_fork+0x58/0x90
[  742.562634]  [<ffffffff81083630>] ? kthread_create_on_node+0x180/0x180
[  742.562636] ---[ end trace 5268aa6c476a71d0 ]---
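
When comparing many error states, it can help to pull the fields out of the "GPU HANG" line mechanically. A minimal sketch in C, assuming the line format shown in the log above; the struct and function names are hypothetical, and reading the first two ecode fields as GPU generation and ring id is an assumption, not something stated in this report.

```c
#include <stdio.h>
#include <string.h>

struct hang_info {
    int gen;             /* assumed meaning: GPU generation */
    int ring;            /* assumed meaning: ring id */
    unsigned int ecode;  /* third ecode field, printed in hex */
    char comm[32];       /* name of the process blamed for the hang */
    int pid;             /* pid printed in square brackets */
};

/* Parse one dmesg line; returns 0 on success, -1 if it is not a hang line. */
int parse_gpu_hang(const char *line, struct hang_info *out)
{
    const char *p = strstr(line, "GPU HANG: ecode ");

    if (!p)
        return -1;
    /* %x accepts the leading "0x"; %31s stops at the space before "[pid]". */
    if (sscanf(p, "GPU HANG: ecode %d:%d:%x, in %31s [%d]",
               &out->gen, &out->ring, &out->ecode,
               out->comm, &out->pid) != 5)
        return -1;
    return 0;
}
```

Fed the hang line above, this yields ecode 0x97f4ffff, process "chromium" and pid 2431; lines without "GPU HANG: ecode" are rejected.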

Comment 58 Alin M Elena 2015-03-10 06:54:58 UTC

Created attachment 114185 [details]
kernel 4.0.0-rc3 dri enabled + patch t1 e1

Comment 59 Alin M Elena 2015-03-10 06:55:44 UTC

Created attachment 114186 [details]
kernel 4.0.0-rc3 dri enabled + patch t1 e2

Comment 60 Alin M Elena 2015-03-10 06:56:26 UTC

Created attachment 114188 [details]
kernel 4.0.0-rc3 dri enabled + patch t1 e3

Comment 61 Alin M Elena 2015-03-10 06:57:34 UTC

Created attachment 114189 [details]
kernel 4.0.0-rc3 dri enabled + patch t1 e4

Comment 62 Alin M Elena 2015-03-10 06:58:58 UTC

Created attachment 114190 [details]
kernel 4.0.0-rc3 dri enabled + patch t1 e5

Comment 63 Alin M Elena 2015-03-10 06:59:44 UTC

Created attachment 114191 [details]
kernel 4.0.0-rc3 dri enabled + patch t1 e6

Comment 64 Alin M Elena 2015-03-10 07:01:26 UTC

kwin seems to generate a trace in the kernel too
[ 1133.003380] drm/i915: Resetting chip after gpu hang
[ 1147.017084] [drm] stuck on render ring
[ 1147.017546] [drm] GPU HANG: ecode 7:0:0x97f4ffff, in kwin_x11 [2782], reason: Ring hung, action: reset
[ 1147.017597] ------------[ cut here ]------------
[ 1147.017635] WARNING: CPU: 3 PID: 152 at /home/alin/lavello/linux/drivers/gpu/drm/i915/intel_display.c:9574 intel_mmio_flip_work_func+0x2ea/0x310 [i915]()
[ 1147.017637] WARN_ON(__i915_wait_request(mmio_flip->req, crtc->reset_counter, false, NULL, NULL) != 0)
[ 1147.017639] Modules linked in:
[ 1147.017640]  ctr ccm fuse ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet bnep dell_wmi sparse_keymap nls_iso8859_1 nls_cp437 vfat fat arc4 ath9k ath9k_common ath9k_hw snd_hda_codec_hdmi iTCO_wdt ath iTCO_vendor_support snd_hda_codec_realtek snd_hda_codec_generic mac80211 snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_timer kvm dm_mod snd crct10dif_pclmul ath3k btusb crc32_pclmul cfg80211 dell_laptop crc32c_intel uvcvideo dcdbas bluetooth ghash_clmulni_intel aesni_intel videobuf2_vmalloc aes_x86_64 videobuf2_memops glue_helper videobuf2_core lrw gf128mul ablk_helper rndis_host v4l2_common cryptd cdc_ether videodev usbnet mei_me joydev mii rfkill mei serio_raw pcspkr lpc_ich
[ 1147.017674]  i2c_i801 shpchp mfd_core soundcore tpm_tis tpm wmi thermal battery processor ac efivarfs xhci_pci xhci_hcd i915 i2c_algo_bit drm_kms_helper drm video button sg
[ 1147.017688] CPU: 3 PID: 152 Comm: kworker/3:1 Tainted: G     U  W       4.0.0-rc3-1.gf264c86-desktop+ #1
[ 1147.017689] Hardware name: Dell Inc.          XPS L322X/0PJHXN, BIOS A10 08/28/2013
[ 1147.017705] Workqueue: events intel_mmio_flip_work_func [i915]
[ 1147.017707]  ffffffffa0193448 ffff88003fb97ce8 ffffffff81678e34 0000000000000000
[ 1147.017709]  ffff88003fb97d38 ffff88003fb97d28 ffffffff810657aa ffff88023f2cd900
[ 1147.017710]  ffff88003f8f68b0 ffff88003f900640 ffff88003f8f6000 ffffe8ffffcc1900
[ 1147.017712] Call Trace:
[ 1147.017719]  [<ffffffff81678e34>] dump_stack+0x4c/0x6e
[ 1147.017723]  [<ffffffff810657aa>] warn_slowpath_common+0x8a/0xc0
[ 1147.017725]  [<ffffffff81065826>] warn_slowpath_fmt+0x46/0x50
[ 1147.017740]  [<ffffffffa012e06a>] intel_mmio_flip_work_func+0x2ea/0x310 [i915]
[ 1147.017743]  [<ffffffff810893ed>] ? finish_task_switch+0x5d/0x100
[ 1147.017746]  [<ffffffff8107dfb5>] process_one_work+0x145/0x440
[ 1147.017748]  [<ffffffff8107e3d1>] worker_thread+0x121/0x450
[ 1147.017750]  [<ffffffff8107e2b0>] ? process_one_work+0x440/0x440
[ 1147.017752]  [<ffffffff810836f9>] kthread+0xc9/0xe0
[ 1147.017754]  [<ffffffff81083630>] ? kthread_create_on_node+0x180/0x180
[ 1147.017756]  [<ffffffff8167f798>] ret_from_fork+0x58/0x90
[ 1147.017758]  [<ffffffff81083630>] ? kthread_create_on_node+0x180/0x180
[ 1147.017759] ---[ end trace 5268aa6c476a71d1 ]---

Comment 65 Alin M Elena 2015-03-10 07:31:07 UTC

Created attachment 114192 [details]
kernel 4.0.0-rc3 dri enabled + patch t1 e7

Comment 66 Chris Wilson 2015-03-10 08:09:32 UTC

That trace is just the -EIO issue that should have been fixed with #requests. To be clear, if you run with DRI disabled and with the shotgun patch, do we still see a hang? Give it a good long use.

Comment 67 Alin M Elena 2015-03-10 11:44:17 UTC

OK... with DRI disabled and the patch applied, I got an error in the end.
State attached.

Alin

Comment 68 Alin M Elena 2015-03-10 11:45:20 UTC

Created attachment 114196 [details]
kernel 4.0.0-rc3 + patch t1 dri disabled e1

Comment 69 Alin M Elena 2015-03-10 16:05:09 UTC

Created attachment 114204 [details]
kernel 4.0.0-rc3 dri enabled + patch t1 rc6=0 e1

Comment 70 Alin M Elena 2015-03-10 16:05:55 UTC

Created attachment 114205 [details]
kernel 4.0.0-rc3 dri enabled + patch t1 rc6=0 e2

Comment 71 Alin M Elena 2015-03-10 16:06:19 UTC

Created attachment 114206 [details]
kernel 4.0.0-rc3 dri enabled + patch t1 rc6=0 e3

Comment 72 Chris Wilson 2015-03-10 16:08:17 UTC

Same error that I was hoping was rc6 related. I need a new tree to bark at.

Comment 73 Alin M Elena 2015-04-11 10:14:53 UTC

Changed the memory and the bug is gone.

Alin

Comment 74 Chris Wilson 2015-04-11 13:53:05 UTC

/o\ Hoping this remains a hw issue.
