Summary: | Kernel v3.13 hang during boot now that dpm is enabled for radeon driver - Radeon HD4870 | ||
---|---|---|---|
Product: | DRI | Reporter: | OmegaPhil |
Component: | DRM/Radeon | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED MOVED | QA Contact: | |
Severity: | major | ||
Priority: | medium | CC: | landjgregory, OmegaPhil |
Version: | unspecified | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Description
OmegaPhil
2014-03-17 18:32:46 UTC
Does booting an older kernel (e.g., 3.11 or 3.12) with radeon.dpm=1 work ok? Please attach your dmesg output with kms enabled.
I tried twice to get flushed dmesg output for v3.13, and both attempts failed. I confirmed the hang also happens with v3.12 and v3.11. It looks like REISUB does nothing, and no dmesg files were created for the relevant boots. Shall I set up netconsole and see what I can get? Just for my education, it looks like dmesg output is part of kern.log (https://stackoverflow.com/a/11413417/1188444) - is this correct?
Yes. dmesg prints out the kernel log for the current boot. Can you attach the log with radeon.dpm=0? The log in comment 0 is with modeset=0.
Created attachment 95969 [details]
dmesg from normal v3.13 boot with radeon.dpm=0
Normal boot dmesg attached
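Since the thread switches between radeon.modeset=0, radeon.dpm=0 and radeon.dpm=1, it helps to record which options a given boot actually used alongside each log. A minimal Python sketch, assuming the options were passed on the kernel command line (it will not see options set anywhere else); the function name and the sample command line are made up for illustration:

```python
def module_params(cmdline, module):
    """Collect `module.param=value` tokens from a kernel command line string.

    Only options passed on the command line are seen; this is a convenience
    for labelling logs, not a full view of the driver's configuration.
    """
    params = {}
    prefix = module + "."
    for token in cmdline.split():
        if token.startswith(prefix) and "=" in token:
            name, value = token[len(prefix):].split("=", 1)
            params[name] = value
    return params


# Hypothetical command line, for illustration only.
example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro radeon.dpm=0 quiet"
radeon_options = module_params(example, "radeon")
```

On a running system the string would normally come from reading /proc/cmdline.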
I am having a similar problem with my 4870. If I disconnect my second monitor, I can boot with radeon.dpm=1. If I reconnect the monitor, it locks up shortly after boot. A bisect brings up this commit: https://github.com/torvalds/linux/commit/ab70b1dde73ff4525c3cd51090c233482c50f21 This makes sense, since this commit enabled dpm by default on radeon x7xx series cards.
Created attachment 96138 [details]
Greg: Log where I managed to get system to boot to terminal session before crash happened.
Log from a kernel built from a bisected checkout of Linus's GitHub repository on a Linux machine. I added a few DRM_INFO() outputs in an attempt to find where the lockup is occurring. From what I can tell, the crash is preventing the logs from being written out.
Created attachment 96139 [details]
Kernel panic displayed after waiting out the lockup (very rare)
If I wait around long enough the system might kernel panic and display a few messages that don't seem to get added to the logs. I have a photo of them taken from my phone to attach.
Created attachment 96140 [details]
Disabled all but one core and hyper-threading. Helped with log loss
Log with more info. Disabled hyper-threading and all but one core.
Created attachment 96141 [details]
Image of kernel panic with single core only
Image of kernel panic with single core only
Created attachment 96170 [details] [review]
disable some dpm features
Does the attached kernel patch help? If so, can you narrow down what part of it helps?
Patch did not resolve the problem.
Created attachment 96182 [details]
Core Dump at insane logging level.
I enabled every debug line I could find. Got this.
Mar 21 10:00:54 endora systemd[1]: Received SIGCHLD from PID 584 (sd_cicero).
Mar 21 10:00:54 endora systemd[1]: Child 584 (sd_cicero) died (code=exited, status=1/FAILURE)
Mar 21 10:00:54 endora kernel: [drm:r600_irq_process], r600_irq_process start: rptr 18624, wptr 18640
Mar 21 10:00:54 endora kernel: [drm:drm_calc_vbltimestamp_from_scanoutpos], crtc 1 : v 13 p(521,-40)@ 28.771340 -> 28.771949 [e 0 us, 0 rep]
Mar 21 10:00:54 endora kernel: [drm:r600_irq_process], IH: D2 vblank
Mar 21 10:00:54 endora kernel: sd_festival[590]: segfault at 2d0 ip 00007fcc15b8d1a1 sp 00007ffff79b0a10 error 4 in libpthread-2.19.so[7fcc15b80000+18000]
Mar 21 10:00:54 endora kernel: potentially unexpected fatal signal 11.
Mar 21 10:00:54 endora kernel: CPU: 7 PID: 590 Comm: sd_festival Tainted: G I 3.14.0-rc7-ARCH-00059-g08edb33-dirty #1
Mar 21 10:00:54 endora kernel: Hardware name: /DX58SO, BIOS SOX5810J.86A.5600.2013.0729.2250 07/29/2013
Mar 21 10:00:54 endora kernel: task: ffff8801a537cf00 ti: ffff8800c1f2e000 task.ti: ffff8800c1f2e000
Mar 21 10:00:54 endora kernel: RIP: 0033:[<00007fcc15b8d1a1>] [<00007fcc15b8d1a1>] 0x7fcc15b8d1a1
Mar 21 10:00:54 endora kernel: RSP: 002b:00007ffff79b0a10 EFLAGS: 00010206
Mar 21 10:00:54 endora kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007fcc15b7a678
Mar 21 10:00:54 endora kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Mar 21 10:00:54 endora kernel: RBP: 00007ffff79b0bb8 R08: 00007fcc15b7a678 R09: 00007fcc15b7a670
Mar 21 10:00:54 endora kernel: R10: 00007ffff79b07e0 R11: 00007fcc15b8d1a0 R12: 0000000000405d6c
Mar 21 10:00:54 endora kernel: R13: 00007ffff79b0bb0 R14: 0000000000000000 R15: 0000000000000000
Mar 21 10:00:54 endora kernel: FS: 00007fcc16afe700(0000) GS:ffff8801aece0000(0000) knlGS:0000000000000000
Mar 21 10:00:54 endora kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 21 10:00:54 endora kernel: CR2: 00000000000002d0 CR3: 00000000c1cf8000 CR4: 00000000000007e0
Mar 21 10:00:54 endora kernel:
Mar 21 10:00:54 endora kernel: [drm:r600_irq_process], r600_irq_process start: rptr 18640, wptr 18656
Mar 21 10:00:54 endora kernel: [drm:drm_calc_vbltimestamp_from_scanoutpos], crtc 0 : v 13 p(1866,-47)@ 28.774347 -> 28.775032 [e 0 us, 0 rep]
Mar 21 10:00:54 endora kernel: [drm:r600_irq_process], IH: D1 vblank
Mar 21 10:00:54 endora systemd-coredump[591]: Process 590 (sd_festival) dumped core.
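With this much debug output, the r600_irq_process lines are easy to lose among the rest. A minimal Python sketch that pulls the rptr/wptr pairs out of a log like the one above and checks whether each handler run starts where the previous one ended; the function names are made up for illustration, and the result is only a reading aid, not a diagnosis of the lockup:

```python
import re

# Matches the debug line format seen in the log above.
IH_RE = re.compile(r"r600_irq_process start: rptr (\d+), wptr (\d+)")


def ih_pointers(lines):
    """Return (rptr, wptr) pairs from r600_irq_process debug lines, in order."""
    pairs = []
    for line in lines:
        match = IH_RE.search(line)
        if match:
            pairs.append((int(match.group(1)), int(match.group(2))))
    return pairs


def is_contiguous(pairs):
    """True if every rptr equals the wptr logged by the previous handler run."""
    return all(cur[0] == prev[1] for prev, cur in zip(pairs, pairs[1:]))
```

For the two handler runs logged above, ih_pointers yields (18624, 18640) and (18640, 18656), and is_contiguous returns True.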
Uninstalled the segfaulting package, Speech-Dispatcher (part of KDE accessibility). The segfault is resolved, but the system still locks up with radeon.dpm=1.
I noticed v3.13.7-1 has come to Debian testing today with the following in the changelog:
- drm/radeon: fix runpm disabling on non-PX harder (may fix #741619, #742507)
I can confirm it doesn't help when I get rid of 'radeon.dpm=0' on boot.
This is still a problem with kernel 3.14.4-1.
(In reply to comment #15)
> This is still a problem with kernel 3.14.4-1.
It's fixed in: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=76e6dcece841faebbee78895780e8209ff40d922
Doesn't look like that has hit 3.14 yet.
Thanks, but that's a workaround, not a fix - is DPM abandoned for these cards?
Created attachment 99834 [details] [review]
fix for 73911
this patch will make a key register's value correct and fix this bug
Comment on attachment 99834 [details] [review]
fix for 73911
jyliu: You've put the patch on the wrong ticket, this is for DPM on r600g cards.
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/467.