Summary: | complete system stalls while changing displays resolutions on a hybrid (intel/radeon) system | ||
---|---|---|---|
Product: | DRI | Reporter: | Yaroslav Halchenko <debian> |
Component: | DRM/Radeon | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED MOVED | QA Contact: | |
Severity: | major | ||
Priority: | medium | CC: | debian, Rondom |
Version: | DRI git | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
Description
Yaroslav Halchenko
2015-11-19 15:09:36 UTC
Yep, please try Maarten's patch from bug 92258 for additional information. Also, can you narrow down which kernel version/change introduced the problem, ideally using git bisect?

Thank you, Michael, for your response!

Kernel: according to my IRC log on #intel-gfx, the stalls started with the upgrade to 4.2.0-1-amd64, and according to an old copy of the journal the kernel before that was

Oct 25 11:22:54 hopa kernel: Linux version 3.17-1-amd64 (debian-kernel@lists.debian.org) (gcc version 4.8.3 (Debian 4.8.3-13) ) #1 SMP Debian 3.17-1~exp1 (2014-10-14)

Bisection, I guess, will be the measure of last resort -- this laptop is my main workhorse and the halt is not 100% reproducible.

Patch: applied and rebuilding now. Will report as soon as it halts again (will do some forceful playful interaction with external displays tomorrow) or if I can't trigger the halt. Thanks!

Created attachment 119987 [details]
journalctl output (a bit anonymized) showing details of the session with the crash
Reporting on "success": after installing the new patched kernel and some upgrades (it kept crashing gnome, not the kernel, so I had to upgrade), I caused the stall with a slightly different but overall similar traceback (full output of journalctl for that boot is attached):

Nov 20 10:04:38 hopa kernel: [drm:intel_hdmi_detect] [CONNECTOR:53:HDMI-A-2]
Nov 20 10:04:38 hopa kernel: ffff88043c187858 0000000000000001 ffffffff81555c51 ffff88043c678080
Nov 20 10:04:38 hopa kernel: Call Trace:
Nov 20 10:04:38 hopa kernel: [<ffffffff8108678d>] ? wq_worker_sleeping+0xd/0x90
Nov 20 10:04:38 hopa kernel: [<ffffffff81555835>] ? __schedule+0x505/0x8f0
Nov 20 10:04:38 hopa kernel: [<ffffffff81555c51>] ? schedule+0x31/0x80
Nov 20 10:04:38 hopa kernel: [<ffffffff8107209c>] ? do_exit+0x72c/0xa90
Nov 20 10:04:38 hopa kernel: [<ffffffff810175ec>] ? oops_end+0x9c/0xd0
Nov 20 10:04:38 hopa kernel: [drm:intel_hdmi_detect] Live status not up!
Nov 20 10:04:38 hopa kernel: [drm:drm_helper_probe_single_connector_modes_merge_bits] [CONNECTOR:53:HDMI-A-2] disconnected
Nov 20 10:04:38 hopa kernel: [<ffffffff8155b5d8>] ? general_protection+0x28/0x30
Nov 20 10:04:38 hopa kernel: [<ffffffff8140689f>] ? reservation_object_test_signaled_rcu+0xcf/0x220
Nov 20 10:04:38 hopa kernel: [<ffffffff81406ef9>] ? reservation_object_wait_timeout_rcu+0x219/0x260
Nov 20 10:04:38 hopa kernel: [<ffffffffa0832b29>] ? ttm_bo_wait+0x29/0x50 [ttm]
Nov 20 10:04:38 hopa kernel: [<ffffffffa0833207>] ? ttm_bo_cleanup_refs_and_unlock+0x27/0x170 [ttm]
Nov 20 10:04:38 hopa kernel: [<ffffffffa083340f>] ? ttm_bo_delayed_delete+0xbf/0x200 [ttm]
Nov 20 10:04:38 hopa kernel: [<ffffffffa0833567>] ? ttm_bo_delayed_workqueue+0x17/0x40 [ttm]
Nov 20 10:04:38 hopa kernel: [<ffffffff810856ff>] ? process_one_work+0x19f/0x3d0
Nov 20 10:04:38 hopa kernel: [<ffffffff8108597d>] ? worker_thread+0x4d/0x450
Nov 20 10:04:38 hopa kernel: [<ffffffff81085930>] ? process_one_work+0x3d0/0x3d0
Nov 20 10:04:38 hopa kernel: [<ffffffff8108b47d>] ? kthread+0xbd/0xe0
Nov 20 10:04:38 hopa kernel: [<ffffffff8108b3c0>] ? kthread_create_on_node+0x170/0x170
Nov 20 10:04:38 hopa kernel: [<ffffffff8155984f>] ? ret_from_fork+0x3f/0x70
Nov 20 10:04:38 hopa kernel: [<ffffffff8108b3c0>] ? kthread_create_on_node+0x170/0x170
Nov 20 10:04:38 hopa kernel: Code: 48 c7 c7 b2 07 80 81 e8 83 39 fe ff e9 bf fe ff ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 40 04 00 00 <48> 8b 40 d8 c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f
Nov 20 10:04:38 hopa kernel: RIP [<ffffffff8108ba2c>] kthread_data+0xc/0x20
Nov 20 10:04:38 hopa kernel: RSP <ffff88043c187b98>
Nov 20 10:04:38 hopa kernel: CR2: ffffffffffffffd8
Nov 20 10:04:38 hopa kernel: ---[ end trace 01c0854cd2e7cf2f ]---
Nov 20 10:04:38 hopa kernel: Fixing recursive fault but reboot is needed!

To stall it, I had both displays connected, with the 2nd one just mirroring the first one. Then I turned off the 2nd display, which caused all the mess. What could be the next step? ;-)

BTW -- with this recent upgrade, the two attached monitors are now also seen as an extended desktop (3840x1200), which never happened before, and it actually works quite nicely. But then it also caused a crash (no traceback was recorded and I didn't have a remote session attached) using the same trick of turning the 2nd display off.

Created attachment 119998 [details]
cut/paste terminal output for the 2nd crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000042
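When comparing tracebacks from several stalls, it can help to reduce each journal excerpt to its list of stack frames. The following is only a minimal sketch: the regular expression and the `extract_frames` helper are illustrative names invented here (not part of any existing tool), and they assume frames are printed in the `[<address>] ? symbol+0xoff/0xlen [module]` form seen in this report.

```python
import re

# Matches stack frames such as
#   "[<ffffffffa0832b29>] ? ttm_bo_wait+0x29/0x50 [ttm]"
# Assumption: 16-digit lowercase hex addresses, as printed on x86-64 here.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?:\?\s+)?"
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:[ ]+\[(?P<mod>\w+)\])?"
)


def extract_frames(log_text):
    """Return (symbol, module) pairs for every stack frame in log_text.

    module is None for frames that belong to the core kernel image.
    """
    return [(m.group("sym"), m.group("mod")) for m in FRAME_RE.finditer(log_text)]


# Three lines copied from the traceback in this report.
sample = (
    "Nov 20 10:04:38 hopa kernel: [<ffffffff81406ef9>] ? "
    "reservation_object_wait_timeout_rcu+0x219/0x260\n"
    "Nov 20 10:04:38 hopa kernel: [<ffffffffa0832b29>] ? "
    "ttm_bo_wait+0x29/0x50 [ttm]\n"
    "Nov 20 10:04:38 hopa kernel: RIP [<ffffffff8108ba2c>] "
    "kthread_data+0xc/0x20\n"
)
frames = extract_frames(sample)
for symbol, module in frames:
    print(symbol, module or "(core kernel)")
```

Running the same helper over the journal of each crash and diffing the resulting lists gives a quick way to see whether two stalls went through the same code path (for example the `ttm` delayed-delete worker above).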
The beast crashed again... I don't remember if I had those before -- just that the screen went off due to inactivity (maybe it was also locked -- I was away from the laptop) and when I came back -- it was stalled. I had an ssh session open on another box watching journalctl -f (nothing in the logs on the drive after reboot). The last messages:
Nov 20 12:20:04 hopa kernel: [drm:drm_crtc_helper_set_config] attempting to set mode from userspace
Nov 20 12:20:04 hopa kernel: [drm:drm_mode_debug_printmodeline] Modeline 57:"" 0 296400 3840 3888 3920 4000 1200 1203 1209 1235 0x0 0x5
Nov 20 12:20:04 hopa kernel: [drm:radeon_encoder_set_active_device] setting active device to 00000008 from 00000008 00000008 for encoder 2
Nov 20 12:20:04 hopa kernel: [drm:drm_crtc_helper_set_mode] [CRTC:29]
Nov 20 12:20:04 hopa kernel: [drm:radeon_atom_encoder_dpms] encoder dpms 30 to mode 3, devices 00000080, active_devices 00000000
Nov 20 12:20:04 hopa kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000042
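The mode being set just before this oops can be decoded from the `drm_mode_debug_printmodeline` line. The sketch below is hedged: `parse_drm_modeline` is a hypothetical helper, and the field order (vrefresh, clock in kHz, four horizontal timings, four vertical timings, type, flags) is an assumption about how the kernel prints this debug line, not something stated in this report.

```python
import re

# Assumed layout: Modeline <id>:"<name>" <12 whitespace-separated numbers>
MODELINE_RE = re.compile(r'Modeline (\d+):"([^"]*)" (.+)$')


def parse_drm_modeline(line):
    """Decode a '[drm:drm_mode_debug_printmodeline]' log line into a dict."""
    match = MODELINE_RE.search(line.strip())
    if match is None:
        raise ValueError("no DRM modeline found in line")
    numbers = match.group(3).split()
    if len(numbers) != 12:
        raise ValueError("expected 12 numeric fields, got %d" % len(numbers))
    # The first ten fields are decimal; type and flags (last two) are hex.
    (_vrefresh, clock_khz, hdisplay, _hss, _hse, htotal,
     vdisplay, _vss, _vse, vtotal) = (int(n) for n in numbers[:10])
    return {
        "hdisplay": hdisplay,
        "vdisplay": vdisplay,
        "clock_khz": clock_khz,
        # Refresh rate = pixel clock / (total pixels per frame, incl. blanking).
        "refresh_hz": clock_khz * 1000 / (htotal * vtotal),
        "flags": int(numbers[11], 16),
    }


sample_line = (
    'Nov 20 12:20:04 hopa kernel: [drm:drm_mode_debug_printmodeline] '
    'Modeline 57:"" 0 296400 3840 3888 3920 4000 1200 1203 1209 1235 0x0 0x5'
)
mode = parse_drm_modeline(sample_line)
print(mode["hdisplay"], mode["vdisplay"], mode["refresh_hz"])
```

Under that assumed field order, the line describes a 3840x1200 mode (296400 kHz over 4000 x 1235 total pixels), which would match the extended desktop across both monitors mentioned earlier.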
Aha -- I think I found what triggered it, since I did it again and it stalled, probably identically (didn't have a remote console :-/). I ran

DISPLAY=:0 0install run -c http://gfxmonk.net/dist/0install/shellshape.xml --replace

to try shellshape, and once it finished downloading, it did something which triggered the bug, and the screens went blank. It is probably a different, although possibly related, issue, since during the original stalls I still had something on the screens. In this case they just go down into suspend mode etc. Do you think I should file a separate report on this one?

I see you have the same laptop as me, a zbook 14. The DP ports on the docking station are connected only to the AMD GPU. I have a DRI_PRIME=1 issue with newer kernels (probably starting with 3.19); maybe it is related: https://bugzilla.opensuse.org/show_bug.cgi?id=954783

For me, with DRI_PRIME=1 it sometimes does not even render at all... At first I thought it happened only with an external display, but nope -- it also happens straight on the laptop screen, unpredictably. But no crashes from that so far during my trials.

https://wiki.archlinux.org/index.php/PRIME
DRI_PRIME=1 needs xrandr compositing, and it crashes with 4.1 (3.19 and up) with multiple glmatrix instances running simultaneously after a few minutes, or with a game... With 4.3, glmatrix works fine for a couple of hours, and it crashes randomly during gameplay; same with 4.2. Maybe it is a completely different bug affecting 4.2/4.3 kernels.

-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/663.