Summary: | [Radeon HD 5650] return from sleep state failed | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | DRI | Reporter: | Rafael Pereira <rnp> | ||||||||||||||||
Component: | DRM/Radeon | Assignee: | Default DRI bug account <dri-devel> | ||||||||||||||||
Status: | RESOLVED FIXED | QA Contact: | |||||||||||||||||
Severity: | normal | ||||||||||||||||||
Priority: | medium | CC: | florent.bondoux, johannes.hirte | ||||||||||||||||
Version: | XOrg 6.7.0 | ||||||||||||||||||
Hardware: | x86-64 (AMD64) | ||||||||||||||||||
OS: | Linux (All) | ||||||||||||||||||
Whiteboard: | |||||||||||||||||||
i915 platform: | i915 features: | ||||||||||||||||||
Attachments: |
|
Created attachment 108070 [details]
Kernel Config
Kernel config - 3.17.1
Created attachment 108071 [details]
lspci graphic card infos
lspci output for video card
What do you mean by sleep state? DPMS (turing the monitors off) or suspend? Is this a regression? If so, when was it last working and can you bisect? By sleep state I mean the S3 sleeping state (sleep, standby Suspend to RAM and DPMS), according the ACPI standard. Yes, it is a regression. The last working version is 3.16.6 (3.15.x, 3.14.x and so on, didn't have this problem). I noticed the bug was first introduced on version 3.17.0. I am working on the bisect. After bisecting, I found the first occurence (of the bug) on the following commit: git describe --contains 43c40df2c7fedce640a6c39fcdf58764f6bbac5c v3.17-rc1~83 (In reply to Rafael Pereira from comment #5) > After bisecting, I found the first occurence (of the bug) on the following > commit: > > git describe --contains 43c40df2c7fedce640a6c39fcdf58764f6bbac5c That's a merge of a linux-leds tree, so I'm afraid something went wrong during the bisection. Most likely, the problem doesn't happen every time even with affected commits, so you need to test several times before declaring a commit good. Terrebly sorry, I did the oposite. This is the bisect result: git bisect log # bad: [bfe01a5ba2490f299e1d2d5508cbbbadd897bbe9] Linux 3.17 # good: [acfaf47549491e5f804d30855c9055ebeb6ecc7b] Linux 3.16.6 git bisect start 'v3.17' 'v3.16.6' # good: [19583ca584d6f574384e17fe7613dfaeadcdc4a6] Linux 3.16 git bisect good 19583ca584d6f574384e17fe7613dfaeadcdc4a6 # good: [f2d7e4d4398092d14fb039cb4d38e502d3f019ee] checkpatch: add fix_insert_line and fix_delete_line helpers git bisect good f2d7e4d4398092d14fb039cb4d38e502d3f019ee # bad: [c309bfa9b481e7dbd3e1ab819271bf3009f44859] Merge tag 'for-linus-20140808' of git://git.infradead.org/linux-mtd git bisect bad c309bfa9b481e7dbd3e1ab819271bf3009f44859 So the last GOOD commit was this one: git describe --contains f2d7e4d4398092d14fb039cb4d38e502d3f019ee v3.17-rc1~94^2~11 I don't think you've actually finished the bisection yet. When it's finished, running bisect good/bad will say 'xyz is the first bad commit' followed by the log of commit xyz. When that happens, please provide that output. @Michel Dänzer: thanks for helping. I performed the bisection and that is result: %git bisect log # bad: [bfe01a5ba2490f299e1d2d5508cbbbadd897bbe9] Linux 3.17 # good: [acfaf47549491e5f804d30855c9055ebeb6ecc7b] Linux 3.16.6 git bisect start 'v3.17' 'v3.16.6' # good: [19583ca584d6f574384e17fe7613dfaeadcdc4a6] Linux 3.16 git bisect good 19583ca584d6f574384e17fe7613dfaeadcdc4a6 # good: [f2d7e4d4398092d14fb039cb4d38e502d3f019ee] checkpatch: add fix_insert_line and fix_delete_line helpers git bisect good f2d7e4d4398092d14fb039cb4d38e502d3f019ee # bad: [c309bfa9b481e7dbd3e1ab819271bf3009f44859] Merge tag 'for-linus-20140808' of git://git.infradead.org/linux-mtd git bisect bad c309bfa9b481e7dbd3e1ab819271bf3009f44859 # good: [43c40df2c7fedce640a6c39fcdf58764f6bbac5c] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds git bisect good 43c40df2c7fedce640a6c39fcdf58764f6bbac5c # bad: [b3345d7c57d70e6cb6749af25cdbe80515582e99] Merge tag 'soc-for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc git bisect bad b3345d7c57d70e6cb6749af25cdbe80515582e99 # bad: [03f62abd112d5150b6ce8957fa85d4f6e85e357f] drm/radeon: split PT setup in more functions git bisect bad 03f62abd112d5150b6ce8957fa85d4f6e85e357f # good: [2f295791312937d077cecec4cd8f453ce2ce93df] drm/i915: Reorder ctx unref on ppgtt cleanup git bisect good 2f295791312937d077cecec4cd8f453ce2ce93df # good: [60b61c2f9e13cb1cfa995984d3a3dca1fdb6490d] drm/exynos/ipp: remove type casting git bisect good 60b61c2f9e13cb1cfa995984d3a3dca1fdb6490d # bad: [920f946428b70494eb1c64e0de260da0d8bde040] Merge branch 'tda998x-devel' of git://ftp.arm.linux.org.uk/~rmk/linux-cubox into drm-next git bisect bad 920f946428b70494eb1c64e0de260da0d8bde040 # good: [2ee39452fa2fff1e8edb954ccb7e0daee9646557] drm: close race in connector registration (v2) git bisect good 2ee39452fa2fff1e8edb954ccb7e0daee9646557 # good: [b544021f18931258e4e1ef8d88d43dedb73a7ecf] drm/msm: fix double struct_mutex acquire git bisect good b544021f18931258e4e1ef8d88d43dedb73a7ecf # good: [944fc36c31ed685cf8d3d125eb681ae7198f06fc] drm/msm: use upstream iommu git bisect good 944fc36c31ed685cf8d3d125eb681ae7198f06fc # good: [a2fe6cdc03d7a9b0d048a7f32f9d8827e06c67fa] drm/msm/hdmi: fix HDMI_MUX_EN gpio request typo git bisect good a2fe6cdc03d7a9b0d048a7f32f9d8827e06c67fa # good: [c707c3619ca81f499a5ce032021405e989a96ff0] drm/i2c: tda998x: add component support git bisect good c707c3619ca81f499a5ce032021405e989a96ff0 # good: [eceb55a0ecd584ac5bebf667ffd8e859f261d0c0] Merge branch 'msm-next' of git://people.freedesktop.org/~robclark/linux into drm-next git bisect good eceb55a0ecd584ac5bebf667ffd8e859f261d0c0 # first bad commit: [920f946428b70494eb1c64e0de260da0d8bde040] Merge branch 'tda998x-devel' of git://ftp.arm.linux.org.uk/~rmk/linux-cubox into drm-next (In reply to Rafael Pereira from comment #9) > I performed the bisection and that is result: [...] > # first bad commit: [920f946428b70494eb1c64e0de260da0d8bde040] Merge branch > 'tda998x-devel' of git://ftp.arm.linux.org.uk/~rmk/linux-cubox into drm-next Still same answer as in comment #6, I'm afraid. That commit doesn't directly affect the radeon driver AFAICT, so it can't be the cause of your problem. Please try bisecting again, but make sure you test several (at least three, but the more the better) times before running git bisect good. I have the same bug with another GPU (Mobility HD 5430, code name is CEDAR). Log after suspend/resume : [ 139.610475] [drm] PCIE GART of 1024M enabled (table at 0x000000000025E000). [ 139.610628] radeon 0000:01:00.0: WB enabled [ 139.610633] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff8800b88d7c00 [ 139.610635] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c and cpu addr 0xffff8800b88d7c0c [ 139.611445] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc90005c9c418 [ 139.628149] [drm] ring test on 0 succeeded in 0 usecs [ 139.628174] [drm] ring test on 3 succeeded in 3 usecs [ 139.805679] [drm] ring test on 5 succeeded in 1 usecs [ 139.805683] [drm] UVD initialized successfully. [ 139.805727] [drm] ib test on ring 0 succeeded in 0 usecs [ 140.457604] [drm] ib test on ring 5 succeeded [ 152.692727] radeon 0000:01:00.0: ring 0 stalled for more than 10463msec [ 152.692737] radeon 0000:01:00.0: GPU lockup (current fence id 0x00000000000015b8 last fence id 0x00000000000016b8 on ring 0) [ 152.692786] radeon 0000:01:00.0: failed to get a new IB (-35) [ 152.692795] [drm:radeon_cs_ib_fill] *ERROR* Failed to get ib ! [ 152.750259] radeon 0000:01:00.0: Saved 8183 dwords of commands on ring 0. [ 152.750281] radeon 0000:01:00.0: GPU softreset: 0x00000008 [ 152.750284] radeon 0000:01:00.0: GRBM_STATUS = 0xA0003828 [ 152.750287] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000007 [ 152.750290] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007 [ 152.750293] radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0 [ 152.750296] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 [ 152.750299] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 [ 152.750302] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00010100 [ 152.750304] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00020182 [ 152.750307] radeon 0000:01:00.0: R_008680_CP_STAT = 0x80038243 [ 152.750311] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 [ 152.759991] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00004001 [ 152.760049] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100 [ 152.761216] radeon 0000:01:00.0: GRBM_STATUS = 0x00003828 [ 152.761219] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000007 [ 152.761222] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007 [ 152.761225] radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0 [ 152.761228] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 [ 152.761231] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 [ 152.761233] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000 [ 152.761236] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000 [ 152.761239] radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000 [ 152.761242] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 [ 152.761267] radeon 0000:01:00.0: GPU reset succeeded, trying to resume [ 152.784402] [drm] PCIE GART of 1024M enabled (table at 0x000000000025E000). [ 152.784520] radeon 0000:01:00.0: WB enabled [ 152.784523] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff8800b88d7c00 [ 152.784526] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c and cpu addr 0xffff8800b88d7c0c [ 152.785283] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc90005c9c418 [ 152.801936] [drm] ring test on 0 succeeded in 1 usecs [ 152.801945] [drm] ring test on 3 succeeded in 3 usecs [ 152.979461] [drm] ring test on 5 succeeded in 1 usecs [ 152.979468] [drm] UVD initialized successfully. [ 163.207936] radeon 0000:01:00.0: ring 0 stalled for more than 10234msec [ 163.207946] radeon 0000:01:00.0: GPU lockup (current fence id 0x00000000000015b9 last fence id 0x00000000000016b9 on ring 0) [ 163.208075] [drm:r600_ib_test] *ERROR* radeon: fence wait failed (-35). [ 163.208088] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-35). Bisecting gave commit [a3eb06dbca08e3fdad7039021ae03b46b215f22a] drm/radeon: Remove radeon_gart_restore() commit a3eb06dbca08e3fdad7039021ae03b46b215f22a Author: Michel Dänzer <michel.daenzer@amd.com> Date: Wed Jul 9 20:15:42 2014 +0200 drm/radeon: Remove radeon_gart_restore() Doesn't seem necessary, the GART table memory should be persistent. Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> That's the same commit as in bug #86267. The attached patches didn't help either. Created attachment 111361 [details] [review] Keep GART table mapped across suspend/resume Does this patch help? Note that it's just an incomplete proof of concept, but it should at least help narrow down the problem. Created attachment 111376 [details]
kernel log with patch
Hello Michel,
The patch didn't work (tested on 3.18.1).
Both warnings triggered, see attached suspend/resume log.
Same problem with a Mobility Radeon HD 5470 Cedar here. But I first noticed with 3.18 kernel. 3.18-rc1 seems to be fine. I've never seen a lockup with this version. Bisect points at least to 3.18.0-rc1-00005-g031d705. I've stopped bisecting here, cause this doesn't make sense. All commits here are from v4l tree and don't look like affecting graphics. Maybe 3.18-rc1 is bad too and I only didn't hit the bug with this version. the log shows me this with linux-3.18: Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10020msec Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x00000000000450cf last fence id 0x00000000000450ea on ring 0) Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: failed to get a new IB (-35) Jan 5 18:16:13 acer kernel: [drm:radeon_cs_ib_fill] *ERROR* Failed to get ib ! Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10020msec Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x00000000000450cf last fence id 0x00000000000450ea on ring 0) Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: failed to get a new IB (-35) Jan 5 18:52:54 acer kernel: [drm:radeon_cs_ib_fill] *ERROR* Failed to get ib ! Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: Saved 855 dwords of commands on ring 0. Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: GPU softreset: 0x00000008 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: GRBM_STATUS = 0xA0003828 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000007 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00010002 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00020186 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: R_008680_CP_STAT = 0x80038647 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: Saved 855 dwords of commands on ring 0. Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: GPU softreset: 0x00000008 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: GRBM_STATUS = 0xA0003828 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000007 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00010002 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00020186 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: R_008680_CP_STAT = 0x80038647 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00004001 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00004001 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: GRBM_STATUS = 0x00003828 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000007 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: GRBM_STATUS = 0x00003828 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000007 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume Jan 5 18:16:13 acer kernel: [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0 Jan 5 18:52:54 acer kernel: [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0 Jan 5 18:52:54 acer kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000000025E000). Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: WB enabled Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff8800de25bc00 Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c and cpu addr 0xffff8800de25bc0c Jan 5 18:16:13 acer kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000000025E000). Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: WB enabled Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff8800de25bc00 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c and cpu addr 0xffff8800de25bc0c Jan 5 18:52:54 acer kernel: radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc9000189c418 Jan 5 18:16:13 acer kernel: radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc9000189c418 Jan 5 18:52:54 acer kernel: [drm] ring test on 0 succeeded in 1 usecs Jan 5 18:52:54 acer kernel: [drm] ring test on 3 succeeded in 2 usecs Jan 5 18:16:13 acer kernel: [drm] ring test on 0 succeeded in 1 usecs Jan 5 18:16:13 acer kernel: [drm] ring test on 3 succeeded in 2 usecs Jan 5 18:52:55 acer kernel: [drm] ring test on 5 succeeded in 1 usecs Jan 5 18:52:55 acer kernel: [drm] UVD initialized successfully. Jan 5 18:16:13 acer kernel: [drm] ring test on 5 succeeded in 1 usecs Jan 5 18:16:13 acer kernel: [drm] UVD initialized successfully. Jan 5 18:16:24 acer kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10222msec Jan 5 18:16:24 acer kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x00000000000450d3 last fence id 0x00000000000450ea on ring 0) Jan 5 18:53:05 acer kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10222msec Jan 5 18:53:05 acer kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x00000000000450d3 last fence id 0x00000000000450ea on ring 0) Jan 5 18:53:05 acer kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10723msec Jan 5 18:53:05 acer kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x00000000000450d3 last fence id 0x00000000000450ea on ring 0) Jan 5 18:16:24 acer kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10723msec Jan 5 18:16:24 acer kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x00000000000450d3 last fence id 0x00000000000450ea on ring 0) Jan 5 18:16:25 acer kernel: [drm:r600_ib_test] *ERROR* radeon: fence wait failed (-35). Jan 5 18:16:25 acer kernel: [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-35). Jan 5 18:53:06 acer kernel: [drm:r600_ib_test] *ERROR* radeon: fence wait failed (-35). Jan 5 18:53:06 acer kernel: [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-35). Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: GPU softreset: 0x00000008 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: GRBM_STATUS = 0xA0003828 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000007 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00010002 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00020186 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: R_008680_CP_STAT = 0x80038647 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: GPU softreset: 0x00000008 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: GRBM_STATUS = 0xA0003828 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000007 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00010002 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00020186 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: R_008680_CP_STAT = 0x80038647 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00004001 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00004001 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: GRBM_STATUS = 0x00003828 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000007 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: GRBM_STATUS = 0x00003828 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000007 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume Jan 5 18:53:07 acer kernel: [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0 Jan 5 18:16:26 acer kernel: [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0 Jan 5 18:53:07 acer kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000000025E000). Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: WB enabled Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff8800de25bc00 Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c and cpu addr 0xffff8800de25bc0c Jan 5 18:16:26 acer kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000000025E000). Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: WB enabled Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff8800de25bc00 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c and cpu addr 0xffff8800de25bc0c Jan 5 18:53:07 acer kernel: radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc9000189c418 Jan 5 18:16:26 acer kernel: radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc9000189c418 Jan 5 18:16:26 acer kernel: [drm] ring test on 0 succeeded in 1 usecs Jan 5 18:16:26 acer kernel: [drm] ring test on 3 succeeded in 2 usecs Jan 5 18:53:07 acer kernel: [drm] ring test on 0 succeeded in 1 usecs Jan 5 18:53:07 acer kernel: [drm] ring test on 3 succeeded in 2 usecs Jan 5 18:16:26 acer kernel: [drm] ring test on 5 succeeded in 1 usecs Jan 5 18:16:26 acer kernel: [drm] UVD initialized successfully. Jan 5 18:53:07 acer kernel: [drm] ring test on 5 succeeded in 1 usecs Jan 5 18:53:07 acer kernel: [drm] UVD initialized successfully. Jan 5 18:53:08 acer kernel: [drm] ib test on ring 0 succeeded in 0 usecs Jan 5 18:53:08 acer kernel: [drm] ib test on ring 3 succeeded in 0 usecs Jan 5 18:16:27 acer kernel: [drm] ib test on ring 0 succeeded in 0 usecs Jan 5 18:16:27 acer kernel: [drm] ib test on ring 3 succeeded in 0 usecs Please take the bisect to completion. (In reply to Alex Deucher from comment #15) > Please take the bisect to completion. Will do so. This will take some time. From observation it needs more suspend/resume-cycles for every step closer to the bad commit. 3.18 fails in the first place every time, for 3.18.0-rc1-00005-g031d705 it took more than 10 trials to trigger the bug. Ok, 3.18-rc1 is affected too. Searching for a good version to restart bisecting. Created attachment 112529 [details] [review] Split off separate radeon_gart_get_page_entry ASIC hook Created attachment 112530 [details] [review] Reinstate radeon_gart_restore for when the GART table is in VRAM Do these two patches help? Yes they do help! The patches solve the bug for me on 3.19.0-rc5. Thanks The last two patches fix the problem for me too. Tested here on v3.19-rc5 + patches, and (after 15 trials) no sign of the bug. The fix is queued up in Alex Deucher's -fixes tree. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.
Created attachment 108069 [details] Kernel log The system is put from sleep from GUI (X.org + KDE). After "waking up" the from sleep state the GUI is not accessible anymore and the screen keeps switching from conlose with kernel log output and a black screen. Tested on Arch Linux, Kernel version 3.17.0, 3.17.1 Graphic card infos and kenrel log: see files attached. Okt 19 19:18:56 fourier kernel: [drm:radeon_cs_ib_fill] *ERROR* Failed to get ib ! Okt 19 19:18:56 fourier kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000000025D000). Okt 19 19:18:56 fourier kernel: [drm] UVD initialized successfully. Okt 19 19:18:56 fourier kernel: [drm] ring test on 0 succeeded in 1 usecs Okt 19 19:18:56 fourier kernel: [drm] ring test on 3 succeeded in 1 usecs Okt 19 19:18:56 fourier kernel: [drm] ring test on 5 succeeded in 1 usecs Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: GRBM_STATUS = 0x00003828 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: GRBM_STATUS = 0xA0003828 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000007 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000007 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00010002 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00020182 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: R_008680_CP_STAT = 0x80038647 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x000000000000140e last fence id 0x00000000000013f7 on ring 0) Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: GPU softreset: 0x00000008 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00004001 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: Saved 1202 dwords of commands on ring 0. Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: WB enabled Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: failed to get a new IB (-35) Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff88020cfc2c00 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff88020cfc2c0c Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc9000731c418 Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: ffff8801fa15d000 pin failed Okt 19 19:18:56 fourier kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10203msec Okt 19 19:19:06 fourier kernel: [drm:r600_ib_test] *ERROR* radeon: fence wait failed (-35). Okt 19 19:19:06 fourier kernel: [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-35). Okt 19 19:19:06 fourier kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000000025D000). Okt 19 19:19:06 fourier kernel: [drm] ring test on 0 succeeded in 1 usecs Okt 19 19:19:06 fourier kernel: [drm] ring test on 3 succeeded in 1 usecs