Created attachment 90773
vbios.rom

With kernel-3.12.0, the system fails to resume from suspend/hibernate.

- It always fails with radeon.dpm=1.
- It fails most times with radeon.dpm=0. Usually it works on the first try but fails on the second or third, but sometimes even the first try is unsuccessful.

The screen goes black before it starts reading the pageset2 data, and the numlock/scrolllock LEDs start blinking, indicating a kernel panic. A hard reset is required. There are no weird messages in dmesg (no_console_suspend=1).

As for the kernel version, I think the problems started around 3.7.0 or maybe one of its release candidates. Since there have been so many changes and quite a few problems with those versions, I'm not sure how to bisect this. The last reliable version that I used was 3.6.2.

I have also pulled the drm-next-3.13 patches into 3.12, but it didn't help.

Apart from the failure to resume, the driver works fine (even with dpm enabled); I have no lockups or crashes.

dmesg:
[drm] radeon kernel modesetting enabled.
[drm] initializing kernel modesetting (RV635 0x1002:0x9598 0x1462:0x1260).
[drm] register mmio base: 0xF0100000
[drm] register mmio size: 65536
ATOM BIOS: 113
radeon 0000:01:00.0: VRAM: 512M 0x0000000000000000 - 0x000000001FFFFFFF (512M used)
radeon 0000:01:00.0: GTT: 512M 0x0000000020000000 - 0x000000003FFFFFFF
[drm] Detected VRAM RAM=512M, BAR=256M
[drm] RAM width 128bits DDR
[TTM] Zone kernel: Available graphics memory: 1952880 kiB
[TTM] Initializing pool allocator
[TTM] Initializing DMA pool allocator
[drm] radeon: 512M of VRAM memory ready
[drm] radeon: 512M of GTT memory ready.
[drm] GART: num cpu pages 131072, num gpu pages 131072
[drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[drm] Loading RV635 Microcode
[drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
radeon 0000:01:00.0: WB enabled
radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff880112f67c00
radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c and cpu addr 0xffff880112f67c0c
[drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[drm] Driver supports precise vblank timestamp query.
radeon 0000:01:00.0: irq 41 for MSI/MSI-X
radeon 0000:01:00.0: radeon: using MSI.
[drm] radeon: irq initialized.
[drm] ring test on 0 succeeded in 1 usecs
[drm] ring test on 3 succeeded in 1 usecs
[drm] Enabling audio 0 support
[drm] ib test on ring 0 succeeded in 0 usecs
[drm] ib test on ring 3 succeeded in 0 usecs
[drm] Radeon Display Connectors
[drm] Connector 0:
[drm]   DVI-I-1
[drm]   HPD1
[drm]   DDC: 0x7e60 0x7e60 0x7e64 0x7e64 0x7e68 0x7e68 0x7e6c 0x7e6c
[drm]   Encoders:
[drm]     DFP1: INTERNAL_UNIPHY
[drm]     CRT2: INTERNAL_KLDSCP_DAC2
[drm] Connector 1:
[drm]   DIN-1
[drm]   Encoders:
[drm]     TV1: INTERNAL_KLDSCP_DAC2
[drm] Connector 2:
[drm]   DVI-I-2
[drm]   HPD2
[drm]   DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
[drm]   Encoders:
[drm]     CRT1: INTERNAL_KLDSCP_DAC1
[drm]     DFP2: INTERNAL_KLDSCP_LVTMA
== power state 0 ==
  ui class: none
  internal class: boot
  caps: video
  uvd vclk: 0 dclk: 0
    power level 0 sclk: 72500 mclk: 40000 vddc: 1250
    power level 1 sclk: 72500 mclk: 40000 vddc: 1250
    power level 2 sclk: 72500 mclk: 40000 vddc: 1250
  status: c r b
== power state 1 ==
  ui class: performance
  internal class: none
  caps: single_disp video
  uvd vclk: 0 dclk: 0
    power level 0 sclk: 11000 mclk: 25200 vddc: 900
    power level 1 sclk: 30000 mclk: 35000 vddc: 1000
    power level 2 sclk: 72500 mclk: 40000 vddc: 1250
  status:
== power state 2 ==
  ui class: none
  internal class: uvd
  caps: video
  uvd vclk: 40000 dclk: 30000
    power level 0 sclk: 60000 mclk: 40000 vddc: 1150
    power level 1 sclk: 60000 mclk: 40000 vddc: 1150
    power level 2 sclk: 60000 mclk: 40000 vddc: 1150
  status:
== power state 3 ==
  ui class: performance
  internal class: none
  caps: video
  uvd vclk: 0 dclk: 0
    power level 0 sclk: 30000 mclk: 40000 vddc: 1250
    power level 1 sclk: 30000 mclk: 40000 vddc: 1250
    power level 2 sclk: 72500 mclk: 40000 vddc: 1250
  status:
switching from power state:
  ui class: none
  internal class: boot
  caps: video
  uvd vclk: 0 dclk: 0
    power level 0 sclk: 72500 mclk: 40000 vddc: 1250
    power level 1 sclk: 72500 mclk: 40000 vddc: 1250
    power level 2 sclk: 72500 mclk: 40000 vddc: 1250
  status: c b
switching to power state:
  ui class: performance
  internal class: none
  caps: single_disp video
  uvd vclk: 0 dclk: 0
    power level 0 sclk: 11000 mclk: 25200 vddc: 900
    power level 1 sclk: 30000 mclk: 35000 vddc: 1000
    power level 2 sclk: 72500 mclk: 40000 vddc: 1250
  status: r
[drm] radeon: dpm initialized
[drm] fb mappable at 0xE0141000
[drm] vram apper at 0xE0000000
[drm] size 7299072
[drm] fb depth is 24
[drm] pitch is 6912
fbcon: radeondrmfb (fb0) is primary device
Console: switching to colour frame buffer device 210x65
radeon 0000:01:00.0: fb0: radeondrmfb frame buffer device
radeon 0000:01:00.0: registered panic notifier
[drm] Initialized radeon 2.35.0 20080528 for 0000:01:00.0 on minor 0

lspci -vvv:
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV635 [Radeon HD 3650/3750/4570/4580] (prog-if 00 [VGA controller])
    Subsystem: Micro-Star International Co., Ltd. Device 1260
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 41
    Region 0: Memory at e0000000 (64-bit, prefetchable) [size=256M]
    Region 2: Memory at f0100000 (64-bit, non-prefetchable) [size=64K]
    Region 4: I/O ports at 2100 [size=256]
    [virtual] Expansion ROM at f0120000 [disabled] [size=128K]
    Capabilities: [50] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal+ Unsupported-
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 128 bytes, MaxReadReq 128 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
            ClockPM- Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
            EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Address: 00000000fee0f00c Data: 4181
    Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
    Kernel driver in use: radeon

cat /sys/class/drm/card0/device/power_dpm_force_performance_level
auto

cat /sys/class/drm/card0/device/power_dpm_state
balanced

I have attached the video BIOS ROM in case it is of use.
Ok, more tries: I've forced the power state to high. Now the machine survived two hibernate/resume cycles in a row but failed on the third. Not sure how much this piece of information is worth.
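For anyone repeating this test, forcing and checking the level can be scripted. The following is only a minimal sketch: it assumes the two sysfs files quoted in the original report and the level names auto, low and high. The helper names `read_dpm` and `force_level` are made up for illustration, and writing to sysfs needs root.

```python
from pathlib import Path

# sysfs directory as quoted in the original report; passed as a parameter so
# the helpers can also be pointed at a scratch directory.
DEVICE_DIR = Path("/sys/class/drm/card0/device")

# Level names assumed to be accepted by power_dpm_force_performance_level.
LEVELS = ("auto", "low", "high")


def read_dpm(device_dir=DEVICE_DIR):
    """Return the current DPM state and forced performance level."""
    device_dir = Path(device_dir)
    state = (device_dir / "power_dpm_state").read_text().strip()
    level = (device_dir / "power_dpm_force_performance_level").read_text().strip()
    return {"state": state, "level": level}


def force_level(level, device_dir=DEVICE_DIR):
    """Write a performance level to sysfs (requires root on a real system)."""
    if level not in LEVELS:
        raise ValueError(f"unknown level {level!r}, expected one of {LEVELS}")
    target = Path(device_dir) / "power_dpm_force_performance_level"
    target.write_text(level + "\n")


if __name__ == "__main__":
    print(read_dpm())
```

Calling `force_level("high")` before each hibernate attempt and `read_dpm()` after each resume would show whether the forced level survives the cycle.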
Bisecting would be the best bet.
Status update: After updating to vanilla 3.18.1 and booting with radeon.dpm=0, suspend/resume now works reliably. Hibernate/resume still fails, but only and always on the second cycle; the first cycle now seems to work fine:

1) Boot with radeon.dpm=0 (radeon.dpm=1 seems to have its own stability troubles)
2) hibernate
3) resume
4) hibernate
5) resume => kernel panic after loading pages (counting from 0% to 100%).

It seems the kernel panics when trying to switch back to the X screen.

Unfortunately, for some reason I am no longer able to boot a 3.6/3.7 kernel (maybe because of a udev problem), so I cannot bisect. I am also not sure that any 3.6 kernel worked reliably, because I remember it had issues too.

I've tried to suspend/resume between the hibernation cycles, but that does not change anything; hibernate/resume still fails on the second resume attempt.
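The hibernate/resume steps above could be automated roughly as follows. This is a sketch under assumptions, not something that was run on the affected machine: it assumes `rtcwake -m disk -s <seconds>` can hibernate this system and wake it again via the RTC, and the function name `run_cycles` and the log file path are hypothetical. Each step is written to a log and synced to disk first, so after a panic and hard reset the log shows which cycle did not come back.

```python
import os
import subprocess


def _log(log_path, message):
    """Append one line and force it to disk so it survives a hard reset."""
    with open(log_path, "a") as log:
        log.write(message + "\n")
        log.flush()
        os.fsync(log.fileno())


def run_cycles(count, log_path, hibernate=None, wake_after=60):
    """Run `count` hibernate/resume cycles, logging before and after each."""
    if hibernate is None:
        # Assumption: rtcwake from util-linux is installed, we run as root,
        # and the RTC alarm can wake this machine from hibernation.
        def hibernate():
            subprocess.run(
                ["rtcwake", "-m", "disk", "-s", str(wake_after)], check=True
            )

    for cycle in range(1, count + 1):
        _log(log_path, f"cycle {cycle}: hibernating")
        hibernate()  # returns only after a successful resume
        _log(log_path, f"cycle {cycle}: resumed")
    return count


if __name__ == "__main__":
    # Hypothetical log location; two cycles match the failure described above.
    run_cycles(2, "/var/tmp/hibernate-cycles.log")
```

If the last line of the log after a reset reads "cycle 2: hibernating" with no matching "resumed", that confirms the second-cycle pattern described above.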
Is there any way you can get more information about the panic on resume, e.g. via a serial console or netconsole or some suspend/hibernate specific debugging mechanism?
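As one possible way to set up netconsole, the boot parameter can be assembled as below. This is a hedged sketch: the parameter layout follows the kernel's netconsole documentation (`netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]`), while the helper name `netconsole_param`, the interface name and all addresses are placeholders, not values from this report. Whether the network driver is up early enough during resume to capture this particular panic is not known.

```python
def netconsole_param(tgt_ip, tgt_port=6666, src_ip="", src_port=6665,
                     dev="eth0", tgt_mac=""):
    """Build a netconsole= kernel command line parameter.

    Empty src_ip or tgt_mac are left blank so the kernel falls back to its
    defaults (interface address, broadcast MAC).
    """
    if not tgt_ip:
        raise ValueError("a target IP address is required")
    return f"netconsole={src_port}@{src_ip}/{dev},{tgt_port}@{tgt_ip}/{tgt_mac}"


if __name__ == "__main__":
    # Placeholder addresses; replace with the real sender and receiver.
    print(netconsole_param("10.0.0.2", src_ip="10.0.0.1"))
```

On the receiving machine, any UDP listener on the target port would collect the messages; the parameter would be appended to the kernel command line alongside the console-suspend option already mentioned in the report.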
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/419.