Summary: | [v4.13 ARCH] GPU HANG: DMAR: DRHD: handling fault status reg 3 (arch reverted the use of intel_iommu=igfx_off) | ||
---|---|---|---|
Product: | DRI | Reporter: | Eric Blau <eblau> |
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Status: | CLOSED WORKSFORME | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | major | ||
Priority: | medium | CC: | aetf, andyrtr, bbykov_1989, camtech075, cfernandez, crazymanjinn, ganadist, geromanas, innykto, intel-gfx-bugs, jkt, jonathan, like.the.23, linux, martin.stiborsky, mattkrll, pablo.doramas, pmenzel+bugs.freedesktop.org, quejacq, sgh, throwaway19587, vasyl.demin |
Version: | unspecified | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | BDW, SKL | i915 features: | GPU hang |
Description
Eric Blau
2017-10-03 13:04:18 UTC
It looks like this may be similar to bug 102870.

Also possibly similar to bug 103068.

*** Bug 103042 has been marked as a duplicate of this bug. ***
*** Bug 103034 has been marked as a duplicate of this bug. ***

DMAR and death is nothing new, see bug 89360. Standard practice is to disable iommu, with intel_iommu=igfx_off. But the question here is what happened in v4.13 to make it happen for more people?

*** Bug 102870 has been marked as a duplicate of this bug. ***
*** Bug 103068 has been marked as a duplicate of this bug. ***

It looks like the Arch Linux kernel maintainer changed the default config option to enable IOMMU: https://bugs.archlinux.org/task/55629
I will try with the kernel boot option you mention and report back. Thanks.

intel_iommu=igfx_off solves the problem on 4.13.3 for me. Thanks for the suggestion. I get an almost immediate lockup in X without the option.

*** Bug 103082 has been marked as a duplicate of this bug. ***
*** Bug 103106 has been marked as a duplicate of this bug. ***
*** Bug 102912 has been marked as a duplicate of this bug. ***
*** Bug 103121 has been marked as a duplicate of this bug. ***
*** Bug 103137 has been marked as a duplicate of this bug. ***
*** Bug 103139 has been marked as a duplicate of this bug. ***

Maybe we should close this bug as a duplicate of https://bugs.freedesktop.org/show_bug.cgi?id=89360 or did I miss something? The workaround "intel_iommu=igfx_off" works for me using Arch Linux, too.

*** Bug 103204 has been marked as a duplicate of this bug. ***
*** Bug 103230 has been marked as a duplicate of this bug. ***

(In reply to Ansgar Hegerfeld from comment #16)
> Maybe we should close this bug as a duplicate of
> https://bugs.freedesktop.org/show_bug.cgi?id=89360 or did I miss something?
> The workaround "intel_iommu=igfx_off" works for me using Arch Linux, too.

I don't think the problem is fixed. I'm on Sandybridge and have tested various kernels and configs.
4.9-LTS: OK
4.4-LTS: KINDA-OK, because it has atomic modesetting errors that were introduced in 4.2 and weren't fixed until 4.9-LTS (or earlier; I don't have a maintained kernel in between). This would be a nice kernel because it will be supported until 2020 IIUC, but the new atomic modesetting code is buggier than in 4.9.
4.12: OK, but EOL already
4.13: BAD, the most problematic drm version. On my Sandybridge machine I've disabled IOMMU in the BIOS and also added intel_iommu=igfx_off, on top of the 4.13.5+ kernel having CONFIG_INTEL_IOMMU_DEFAULT_ON not set anymore. Even though it's harder to trigger now in 4.13, I can still provoke GPU errors not present in either 4.12 or 4.9. I've been successfully using 4.9 for more than a day with heavy GPU and CPU utilization and haven't hit the same errors as in 4.13.
drm-tip from a week ago: No improvement over 4.13.4

I'm able to hit errors in 4.13 by running ffmpeg to encode a video, utilizing vaapi for decoding the input stream, using the cpu cores for encoding, and then starting a second VAAPI client or a browser with a compositor process like Firefox or Chrome. If I just run ffmpeg as the sole VAAPI client and no browser or mpv with vaapi decoding, there are no hangs. The minute I fire up a video to watch via vaapi and rendering with OpenGL, or use Firefox/Chrome, there's a GPU hang with reset.

Firefox:
[drm] GPU HANG: ecode 6:0:0x80202f7b, in Compositor [2620], reason: Hang on rcs0, action: reset
drm/i915: Resetting chip after gpu hang

Chrome:
drm/i915: Resetting chip after gpu hang
asynchronous wait on fence i915:[global]:a4255 timed out
drm/i915: Resetting chip after gpu hang

Summary: 4.9 stable for days, 4.4 not good, 4.12 good, 4.13 very bad.

While I appreciate new features like atomic modesetting or synchronization fences, the fallout from all the changes has left the drm drivers in a state of hit and miss. I mean, I would love to use the 4.9 drm drivers in a 4.13 kernel for stability reasons, but it's almost EOL.
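For anyone comparing hangs across kernels as in the comment above, here is a minimal Python sketch that pulls the fields out of a "[drm] GPU HANG" line. The regular expression is only modelled on the log lines quoted in this report, and the names `HANG_RE` and `parse_gpu_hang` are hypothetical helpers, not part of any i915 tooling.

```python
import re

# Assumption: the pattern mirrors the "[drm] GPU HANG" lines quoted in this
# report; other kernel versions may format the message differently.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[0-9a-fx:]+), "
    r"in (?P<process>.+?) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the hang fields as a dict, or None if the line is not a hang report."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])  # the PID is the only numeric field
    return info


sample = (
    "[drm] GPU HANG: ecode 6:0:0x80202f7b, in Compositor [2620], "
    "reason: Hang on rcs0, action: reset"
)
hang = parse_gpu_hang(sample)
print(hang["process"], hang["pid"], hang["ecode"])
```

Feeding it the kernel log line by line and counting results per process or per ecode would give a rough picture of which client triggers the hang on each kernel.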
Built a new drm-tip kernel today and 3d7ee91be487380ef6cad329fafbe424f6885372 is so far looking more promising than 4.13.6 has. But it's too early to declare success. drm-tip from a week ago wasn't this stable. Let's hope I can make it through the weekend without a GPU hang.

With that drm-tip kernel I can so far report the following, which isn't a GPU hang, but looks like a simple bug:

workqueue: PF_MEMALLOC task 41(khugepaged) is flushing !WQ_MEM_RECLAIM i915-userptr-release: (null)
WARNING: CPU: 3 PID: 41 at kernel/workqueue.c:2440 check_flush_dependency+0xe8/0xf0
[12787.495230] Call Trace:
[12787.495236] flush_workqueue+0x110/0x3c0
[12787.495242] ? finish_task_switch+0x70/0x1f0
[12787.495273] ? i915_gem_userptr_mn_invalidate_range_start+0x13f/0x150 [i915]
[12787.495296] i915_gem_userptr_mn_invalidate_range_start+0x13f/0x150 [i915]
[12787.495303] __mmu_notifier_invalidate_range_start+0x4a/0x70
[12787.495307] try_to_unmap_one+0x715/0x790
[12787.495311] rmap_walk_file+0xe4/0x230
[12787.495314] try_to_unmap+0x8e/0xe0
[12787.495317] ? page_remove_rmap+0x260/0x260
[12787.495319] ? page_not_mapped+0x10/0x10
[12787.495322] ? page_get_anon_vma+0x90/0x90
[12787.495325] migrate_pages+0x6d7/0x9a0
[12787.495329] ? isolate_freepages_block+0x320/0x320
[12787.495332] ? __ClearPageMovable+0x10/0x10
[12787.495335] compact_zone+0x568/0x660
[12787.495337] compact_zone_order+0x9b/0xc0
[12787.495341] ? try_to_compact_pages+0xb2/0x220
[12787.495344] try_to_compact_pages+0xb2/0x220
[12787.495348] __alloc_pages_direct_compact+0x45/0xe0
[12787.495351] __alloc_pages_slowpath+0xa66/0xc00
[12787.495354] ? finish_task_switch+0x70/0x1f0
[12787.495358] ? del_timer_sync+0x30/0x40
[12787.495361] ? schedule_timeout+0x177/0x2b0
[12787.495364] __alloc_pages_nodemask+0x1ab/0x1d0
[12787.495368] ? wait_woken+0x80/0x80
[12787.495372] khugepaged+0x296/0x1770
[12787.495375] ? wait_woken+0x80/0x80
[12787.495379] ? collapse_shmem.isra.39+0xa30/0xa30
[12787.495381] kthread+0x10d/0x130
[12787.495384] ? kthread_create_on_node+0x60/0x60
[12787.495387] ret_from_fork+0x22/0x30

Still no hang, but another different error:

[14485.810561] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=733534 end=733535) time 208 us, min 763, max 767, scanline start 761, end 771

Happened when mpv was finishing playing a video.

Finally caught one. Took longer than usual in 4.13+, but it's the same Firefox compositor process as before:

[18270.319058] [drm] GPU HANG: ecode 6:0:0x80203f7b, in Compositor [26396], reason: Hang on rcs0, action: reset

It always seems to take a while until the right conditions are met. Back to 4.9 again until I build another drm-tip snapshot.

Been trying to provoke it under two different Wayland compositors for several hours (same workload as yesterday), and it seems much harder to trigger there (same drm-tip kernel).

Is this regression related to https://bugs.freedesktop.org/show_bug.cgi?id=101237?

(In reply to Carsten Mattner from comment #25)
> Been trying to provoke it under two different Wayland compositors for
> several hours (same workload as yesterday), and it seems much harder to
> trigger there (same drm-tip kernel).

After Wayland took too long to trigger, I exited and entered Xorg. It didn't take long before I hit

workqueue: PF_MEMALLOC task 41(khugepaged) is flushing !WQ_MEM_RECLAIM i915-userptr-release: (NULL)

I suppose if I continue this session, it will repeat last night's events and finally hang RCS0 and reset the GPU. This looks like a pattern to me.

(In reply to Carsten Mattner from comment #25)
> Been trying to provoke it under two different Wayland compositors for
> several hours (same workload as yesterday), and it seems much harder to
> trigger there (same drm-tip kernel).

Despite the familiar PF_MEMALLOC fault, after having run a Wayland compositor for 3+ hours and having switched to Xorg after that, with no reboot in between, I still haven't hit the hang yet.
Uneducated speculation would be that using Wayland first after boot put the stack in a more forgiving state and increased the time and operations needed for it to trigger. Wayland was run natively via its drm backend. I'll reboot soon for other reasons, but if Wayland manages to hide the drm regression, I might have to use it as the main daily driver, although there's no drop-in replacement for my usual X11 window manager environment (yet).

4.9 has been the most stable DRM stack, since 4.4.92 can also be made to hang the GPU with the workload, as I noticed running it for hours today. Haven't seen a single hang with 4.9 yet. Too bad 4.4 will be Extended-LTS while 4.9 will be EOL soon. The concurrent use of VAAPI seems to trip things up eventually with kernels <4.9 and >4.9 but not 4.9.

Testing drm-tip commit ba1af442e4884a1148422a7f92ae2f978cfb26a1

With drm-tip ba1af442e4884a1148422a7f92ae2f978cfb26a1 it took 8 hours before the hangs happened. I managed to have ffmpeg and mpv reported as the processes that cause RCS0 hangs, both utilizing VAAPI, but once one hang has happened, anything (Firefox, Chrome) provokes the hang until restarting the DRM stack (aka kernel restart).

4.9.56 still hang free with same workload and more than 8 hours, as before.

Hi to all,
I'm new to this mailing list; I've been a Linux user since the year 2000. I can confirm this bug!
My system is openSUSE Tumbleweed 20171010. Kernel is 4.13.5. My GPU driver is Intel i915.
After the kernel update I get sudden random crashes. The system is completely stuck! No disk activity, no network activity, no report in the systemd journal!
The crash happens randomly after some time, 5 minutes or 30 minutes, but the bug is very boring!
A fix is to pass nomodeset.
Another fix is to revert to kernel 4.9 (I have installed 4.9.54-2-pf). With this kernel there are no more hangs, and Chrome with web Google Earth works like a charm!!! ;-)
intel_iommu=igfx_off does not work for me!
Kind Regards

So after running the same workload which happened to provoke the hangs with 4.4, 4.13 and 4.14-drm-tip, I've been trying to get it to hang with 4.9. Multiple days and still the same good result with the state of DRM in 4.9.56. The quickest was with 4.13, not even needing an hour before the workload described above triggers the bugs. 4.14-drm-tip takes anywhere from 2 to 8 hours before the GPU hangs. So I agree with Ivan, 4.9 has the most stable DRM right now and is sadly not the extended LTS release, so either 4.4 needs a backport of 4.9 DRM or 4.14 needs fixes for the regressions. Not sure what is more likely to happen.

Testing drm-tip 9dd506b9e3b79799503694e9c1bb5aba0d7d72eb

drm-tip 9dd506b9e3b79799503694e9c1bb5aba0d7d72eb same as before

My system with:
kernel 4.13.5-1-default
and
plymouth.enable=0 i915.semaphores=1 i915.enable_rc6=0 i915.enable_psr=0 intel_iommu=igfx_off
looks pretty stable... :-)

Ivan, the explicit, non-default options make no difference with 4.13.9 on Sandybridge. Still hangs after a few hours of VAAPI use.

Ivan, how extensively have you tested it? I can run CPU/GPU load for days with 4.9.57, but 4.13 and 4.14-drm-tip will hang eventually. And once it hangs for the first time, the GPU stack is in a state of easy, repeated hangs until kernel restart.

Carsten, not sure we are speaking about same issue at this point...

(In reply to Ivan Linty from comment #39)
> Carsten, not sure we are speaking about same issue at this point...

I'm talking about the GPU hangs. If I understand you correctly, you say that with those kernel flags you can't provoke hangs anymore. My observation is that 4.13 and newer is still susceptible if you test long enough with a mix of CPU and GPU use.

The original, immediate hang reported by Arch Linux users is fixed by not enabling DMAR by default, but I cannot make 4.9.59 GPU hang no matter how hard I try. With 4.13 (no DMAR) and newer it's easy, but takes a little time.
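Since several comments depend on exactly which boot options were in effect, a small sketch for checking a kernel command line against the workaround flags quoted in this thread may help when comparing reports. The helper names (`parse_cmdline`, `missing_workarounds`) and the `WORKAROUNDS` table are hypothetical; the flag values are simply the ones posted above, and whether each flag is honoured depends on the kernel version. The parser is deliberately naive and assumes no quoted values containing spaces.

```python
# Flags and values copied from the comments in this thread (not a recommendation).
WORKAROUNDS = {
    "intel_iommu": "igfx_off",
    "i915.semaphores": "1",
    "i915.enable_rc6": "0",
    "i915.enable_psr": "0",
}


def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def missing_workarounds(cmdline, wanted=WORKAROUNDS):
    """Return the sorted names of wanted flags that are absent or set differently."""
    params = parse_cmdline(cmdline)
    missing = []
    for name, value in wanted.items():
        # Some parameters accept comma-separated lists, e.g. intel_iommu=on,igfx_off
        current = (params.get(name) or "").split(",")
        if value not in current:
            missing.append(name)
    return sorted(missing)


example = "quiet video=SVIDEO-1:d i915.semaphores=1 intel_iommu=igfx_off"
print(missing_workarounds(example))
```

On a running system the string could be taken from /proc/cmdline instead of the hard-coded example.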
Another 4.13 regression: https://github.com/mpv-player/mpv/issues/5043

With 4.13.2 entering Xorg and leaving results in a failed atomic flip which then 2/3 of the time makes it impossible to restart the kernel cleanly.

This doesn't happen if a Wayland compositor is used and exited.

(In reply to Carsten Mattner from comment #42)
> With 4.13.2 entering Xorg and leaving results in a failed atomic flip which
> then 2/3 of the time makes it impossible to restart the kernel cleanly.
>
> This doesn't happen if a Wayland compositor is used and exited.

It's this atomic error: "flip_done timed out" when you exit Xorg.

There have been other updates in Arch Linux and if I try hard I can reproduce it on 4.9.61 as well now.

(In reply to Carsten Mattner from comment #43)
> It's this atomic error: "flip_done timed out" when you exit Xorg.
>
> There have been other updates in Arch Linux and if I try hard I can
> reproduce it on 4.9.61 as well now.

Adding video=SVIDEO-1:d to the kernel cmdline seems to fix the flip_done hang.

(In reply to Carsten Mattner from comment #44)
> Adding video=SVIDEO-1:d to the kernel cmdline seems to fix the flip_done
> hang.

Ivan, coming back to your suggestion and explicitly enabling semaphores and disabling framebuffer compression, rc6 sleep mode and (I don't know what it is) psr, in addition to video=SVIDEO-1:d seems to be working better than the other tests so far on 4.13.12.

Still testing this:

video=SVIDEO-1:d plymouth.enable=0 i915.semaphores=1 i915.enable_rc6=0 i915.enable_psr=0 intel_iommu=igfx_off

I don't think plymouth.enable=0 is needed on Arch Linux since I think it's a Red Hat graphical boot system, isn't it? I mean it doesn't hurt and is ignored, but I had to ask.

(In reply to Carsten Mattner from comment #45)
> Ivan, coming back to your suggestion and explicitly enabling semaphores and
> disabling framebuffer compression, rc6 sleep mode and (I don't know what it
> is) psr, in addition to video=SVIDEO-1:d seems to be working better than the
> other tests so far on 4.13.12.
>
> Still testing this:
>
> video=SVIDEO-1:d plymouth.enable=0 i915.semaphores=1 i915.enable_rc6=0
> i915.enable_psr=0 intel_iommu=igfx_off
>
> I don't think plymouth.enable=0 is needed on Arch Linux since I think it's a
> Red Hat graphical boot system, isn't it? I mean it doesn't hurt and is
> ignored, but I had to ask.

It took almost 19 hours, but I was able to provoke the RCS0 hang.
The flags certainly seem to hide the regression(s) well enough that one might get a work day's worth of use out of intel-drm, if one follows a strict routine of rebooting once or twice a day.

Like many Linux/BSD users I have an x220 and I've been following this and similar bug reports closely. I don't have much to add but signed up to confirm that I run into the same problems. It's great that I found this ticket and the boot flags that sorta keep the bugs at bay.

Is https://bugs.freedesktop.org/show_bug.cgi?id=101237 a duplicate?

I'm seeing different symptoms with this error now. I assume this is still the same underlying issue. I've been running with semaphores=1 lately, but it does not seem to help. Most of the time I get a hang with similar output to this when resuming from hibernate. Unfortunately I could not capture the error file this time because my laptop was completely unresponsive.

System Architecture: x86_64
Kernel Version: 4.13.12-1-ARCH
Linux Distribution: Arch Linux
Machine: MacBook Pro 12,1
Display Connector: Thunderbolt to DisplayPort (2 external monitors both connected via Thunderbolt, laptop display disabled)

Nov 21 08:30:09 eric-macbookpro kernel: asynchronous wait on fence i915:Xorg[3192]/0:cc435 timed out
Nov 21 08:30:11 eric-macbookpro kernel: [drm] GPU HANG: ecode 8:0:0xda91d857, in slack [10507], reason: Hang on rcs0, action: reset
Nov 21 08:30:11 eric-macbookpro kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Nov 21 08:30:11 eric-macbookpro kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Nov 21 08:30:11 eric-macbookpro kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Nov 21 08:30:11 eric-macbookpro kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Nov 21 08:30:11 eric-macbookpro kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
Nov 21 08:30:11 eric-macbookpro kernel: drm/i915: Resetting chip after gpu hang
Nov 21 08:30:17 eric-macbookpro kernel: drm/i915: Resetting chip after gpu hang
Nov 21 08:30:17 eric-macbookpro kernel: [drm:i915_reset [i915]] *ERROR* GPU recovery failed
Nov 21 08:30:27 eric-macbookpro kernel: [drm:drm_atomic_helper_swap_state [drm_kms_helper]] *ERROR* [CRTC:32:pipe A] hw_done timed out
Nov 21 08:30:28 eric-macbookpro kernel: asynchronous wait on fence i915:Xorg[3192]/0:cc437 timed out
Nov 21 08:30:37 eric-macbookpro kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:32:pipe A] hw_done timed out
Nov 21 08:30:38 eric-macbookpro kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:32:pipe A] hw_done timed out
Nov 21 08:30:47 eric-macbookpro kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:32:pipe A] flip_done timed out
Nov 21 08:30:48 eric-macbookpro kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:32:pipe A] flip_done timed out
Nov 21 08:30:48 eric-macbookpro kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
Nov 21 08:30:48 eric-macbookpro kernel: IP: __mutex_lock.isra.2+0x203/0x520
Nov 21 08:30:48 eric-macbookpro kernel: PGD 0
Nov 21 08:30:48 eric-macbookpro kernel: P4D 0
Nov 21 08:30:48 eric-macbookpro kernel:
Nov 21 08:30:48 eric-macbookpro kernel: Oops: 0002 [#1] PREEMPT SMP
Nov 21 08:30:48 eric-macbookpro kernel: Modules linked in: brcmfmac brcmutil cfg80211 mmc_core facetimehd(O) videobuf2_dma_sg videobuf2_memops videobuf2_v4l2 videobuf2_core videodev media tun asix usbnet mii libphy rfcomm ipt_MASQUERADE nf_nat_masquerade_ipv4
Nov 21 08:30:48 eric-macbookpro kernel: intel_powerclamp coretemp kvm_intel kvm irqbypass i2c_algo_bit intel_cstate drm_kms_helper intel_rapl_perf drm snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm pcspkr snd_timer i2c_i801 intel_pch_thermal mei_m
Nov 21 08:30:48 eric-macbookpro kernel: [last unloaded: brcmutil]
Nov 21 08:30:48 eric-macbookpro kernel: CPU: 0 PID: 17245 Comm: kworker/u8:73 Tainted: P U O 4.13.12-1-ARCH #1
Nov 21 08:30:48 eric-macbookpro kernel: Hardware name: Apple Inc. MacBookPro12,1/Mac-E43C1C25D4880AD6, BIOS MBP121.88Z.0167.B33.1706181928 06/18/2017
Nov 21 08:30:48 eric-macbookpro kernel: Workqueue: events_unbound intel_atomic_commit_work [i915]
Nov 21 08:30:48 eric-macbookpro kernel: task: ffff9a2a1eca6900 task.stack: ffffb7848560c000
Nov 21 08:30:48 eric-macbookpro kernel: RIP: 0010:__mutex_lock.isra.2+0x203/0x520
Nov 21 08:30:48 eric-macbookpro kernel: RSP: 0018:ffffb7848560fbd0 EFLAGS: 00010212
Nov 21 08:30:48 eric-macbookpro kernel: RAX: 0000000000000004 RBX: ffff9a2a1eca6900 RCX: 0000000000000002
Nov 21 08:30:48 eric-macbookpro kernel: RDX: ffff9a295a2e4198 RSI: ffff9a2a1eca6900 RDI: ffff9a295a2e41b0
Nov 21 08:30:48 eric-macbookpro kernel: RBP: ffffb7848560fc70 R08: 0000000000000022 R09: ffff9a2a223fe9c0
Nov 21 08:30:48 eric-macbookpro kernel: R10: 0000000000000210 R11: 0000000000000207 R12: ffffb7848560fc10
Nov 21 08:30:48 eric-macbookpro kernel: R13: 0000000000000002 R14: ffff9a295a2df000 R15: ffff9a295a2e41a0
Nov 21 08:30:48 eric-macbookpro kernel: FS: 0000000000000000(0000) GS:ffff9a2a2ec00000(0000) knlGS:0000000000000000
Nov 21 08:30:48 eric-macbookpro kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 21 08:30:48 eric-macbookpro kernel: CR2: 0000000000000004 CR3: 0000000257a09000 CR4: 00000000003406f0
Nov 21 08:30:48 eric-macbookpro kernel: Call Trace:
Nov 21 08:30:48 eric-macbookpro kernel: ? gen8_write32+0x104/0x260 [i915]
Nov 21 08:30:48 eric-macbookpro kernel: __mutex_lock_slowpath+0x13/0x20
Nov 21 08:30:48 eric-macbookpro kernel: ? __mutex_lock_slowpath+0x13/0x20
Nov 21 08:30:48 eric-macbookpro kernel: mutex_lock+0x25/0x30
Nov 21 08:30:48 eric-macbookpro kernel: ilk_initial_watermarks+0x28/0x120 [i915]
Nov 21 08:30:48 eric-macbookpro kernel: intel_pre_plane_update+0xa8/0x130 [i915]
Nov 21 08:30:48 eric-macbookpro kernel: intel_update_crtc+0xc1/0xe0 [i915]
Nov 21 08:30:48 eric-macbookpro kernel: intel_update_crtcs+0x5b/0x80 [i915]
Nov 21 08:30:48 eric-macbookpro kernel: intel_atomic_commit_tail+0x24b/0xf80 [i915]
Nov 21 08:30:48 eric-macbookpro kernel: ? dequeue_task_fair+0x49f/0x640
Nov 21 08:30:48 eric-macbookpro kernel: ? __switch_to+0x1fc/0x4d0
Nov 21 08:30:48 eric-macbookpro kernel: ? finish_task_switch+0x75/0x200
Nov 21 08:30:48 eric-macbookpro kernel: intel_atomic_commit_work+0x12/0x20 [i915]
Nov 21 08:30:48 eric-macbookpro kernel: process_one_work+0x1de/0x430
Nov 21 08:30:48 eric-macbookpro kernel: worker_thread+0x48/0x400
Nov 21 08:30:48 eric-macbookpro kernel: kthread+0x125/0x140
Nov 21 08:30:48 eric-macbookpro kernel: ? process_one_work+0x430/0x430
Nov 21 08:30:48 eric-macbookpro kernel: ? kthread_create_on_node+0x70/0x70
Nov 21 08:30:48 eric-macbookpro kernel: ret_from_fork+0x25/0x30
Nov 21 08:30:48 eric-macbookpro kernel: Code: 48 39 c6 0f 84 c1 02 00 00 49 8d 47 10 4c 8d 65 a0 48 89 c7 48 89 85 70 ff ff ff 49 8b 47 18 48 89 7d a0 4d 89 67 18 48 89 45 a8 <4c> 89 20 65 48 8b 04 25 00 d3 00 00 4d 39 67 10 48 89 45 b0 0f
Nov 21 08:30:48 eric-macbookpro kernel: RIP: __mutex_lock.isra.2+0x203/0x520 RSP: ffffb7848560fbd0
Nov 21 08:30:48 eric-macbookpro kernel: CR2: 0000000000000004
Nov 21 08:30:48 eric-macbookpro kernel: ---[ end trace 7df4d0d92d1ba7c4 ]---
Nov 21 08:30:48 eric-macbookpro kernel: note: kworker/u8:73[17245] exited with preempt_count 2
Nov 21 08:30:57 eric-macbookpro kernel: [drm:drm_atomic_helper_swap_state [drm_kms_helper]] *ERROR* [CRTC:32:pipe A] hw_done timed out
Nov 21 08:31:07 eric-macbookpro kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:32:pipe A] hw_done timed out
Nov 21 08:31:17 eric-macbookpro kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:32:pipe A] flip_done timed out
Nov 21 08:31:27 eric-macbookpro kernel: [drm:drm_atomic_helper_swap_state [drm_kms_helper]] *ERROR* [CRTC:32:pipe A] hw_done timed out
Nov 21 08:31:37 eric-macbookpro kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:32:pipe A] hw_done timed out
Nov 21 08:31:47 eric-macbookpro kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:32:pipe A] flip_done timed out

Hi, I found a way to reliably trigger the problem on my Arch Linux machine. When I test code inside a Vagrant box and play a YouTube video in Chrome, it hangs 99% of the time. E.g. acceptance tests for Puppet code like https://github.com/puppetlabs/puppetlabs-apache

First of all, sorry about the spam.
This is a mass update for our bugs.

Sorry if you find this annoying, but with this we are trying to understand whether the bug is still valid or not.
If the bug investigation is still in progress, please ignore this, and I apologize!

If you think this is no longer valid, please comment on the bug so that it can be closed.
If you haven't tested with our latest pre-upstream tree (drm-tip), can you do that as well to see whether the issue is still valid there? If you cannot see the issue there, please comment on the bug.

Just trying to understand if still valid. Closing, please re-open if the issue still exists.