Bug 109712

Summary: S2idle is not working on AMD Gigabyte platform
Product: DRI
Reporter: shahul <shahulhameed.481>
Component: DRM/AMDgpu
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED
QA Contact:
Severity: critical    
Priority: medium    
Version: DRI git   
Hardware: x86-64 (AMD64)   
OS: other   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
suspend/resume error log in s2idle none

Description shahul 2019-02-21 10:50:30 UTC
Created attachment 143427 [details]
suspend/resume error log in s2idle

I am porting Android N with kernel 4.19.2 to an AMD Gigabyte platform. The graphics card is gfx9.
The port itself is complete, and I am now working on suspend/resume.
Suspend/resume works fine in deep mode,
but it does not work in s2idle (suspend-to-idle) mode.
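
For reference, the kernel lists the suspend variants it supports in /sys/power/mem_sleep, with the currently selected one in square brackets (for example "s2idle [deep]"). Below is a minimal Python sketch for reading that file, which may help confirm which mode a test run actually used. The helper names parse_mem_sleep and read_mem_sleep are hypothetical and are not part of any kernel or Android tooling.

```python
def parse_mem_sleep(text):
    """Parse the content of /sys/power/mem_sleep.

    Returns (modes, active): the list of supported mode names and the
    one shown in square brackets, or None if no mode is bracketed.
    """
    modes = []
    active = None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]  # strip the brackets marking the active mode
            active = token
        modes.append(token)
    return modes, active


def read_mem_sleep(path="/sys/power/mem_sleep"):
    """Read and parse the sysfs file (only works on a Linux target)."""
    with open(path) as f:
        return parse_mem_sleep(f.read())
```

On the target, read_mem_sleep() would be called before each suspend attempt. Writing a mode name such as s2idle to the same file selects the variant that "echo mem > /sys/power/state" enters.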

During resume I see the following errors from the GPU driver:


[  105.862161] [drm:gfx_v9_0_hw_init [amdgpu]] *ERROR* KCQ enable failed (scratch(0xC040)=0xCAFEDEAD)
[  105.862187] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v9_0> failed -22
[  105.862210] [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-22).
[  105.862215] dpm_run_callback(): pci_pm_resume+0x0/0xd6 returns -22
[  105.862345] PM: Device 0000:03:00.0 failed to resume async: error -22
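
The -22 in these messages is a negated errno value (22 is EINVAL, "Invalid argument"). Below is a rough Python sketch for pulling the reporting function and error code out of drm *ERROR* lines like the ones above, so a long dmesg log can be summarized. The regular expressions and the parse_drm_error helper are illustrative assumptions based only on the line format seen in this report, not an official log format specification.

```python
import errno
import re

# Matches lines such as:
# [  105.862161] [drm:gfx_v9_0_hw_init [amdgpu]] *ERROR* KCQ enable failed ...
DRM_ERROR = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"\[drm:(?P<func>\w+)\s+\[(?P<module>\w+)\]\]\s+"
    r"\*ERROR\*\s+(?P<msg>.*)"
)
# A trailing negative number, optionally followed by ")" and ".".
NEG_CODE = re.compile(r"-(\d+)\)?\.?\s*$")


def parse_drm_error(line):
    """Return a dict describing a drm *ERROR* line, or None if no match."""
    # Tolerate the ">" quoting prefix used in Bugzilla attachment comments.
    m = DRM_ERROR.match(line.strip().lstrip(">"))
    if m is None:
        return None
    msg = m.group("msg")
    code = NEG_CODE.search(msg)
    name = errno.errorcode.get(int(code.group(1))) if code else None
    return {
        "time": float(m.group("ts")),
        "func": m.group("func"),
        "module": m.group("module"),
        "errno": name,  # symbolic name such as "EINVAL", or None
        "msg": msg,
    }
```

Feeding each line of the attached log through parse_drm_error and keeping the first non-None result with an errno set gives a quick pointer to where the resume path first failed.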

I have attached the complete dmesg log.

Could someone please help me resolve this issue?
Thanks and Regards,
Sk shahul.
Comment 1 shahul 2019-02-21 10:54:13 UTC
Comment on attachment 143427 [details]
suspend/resume error log in s2idle

>
>[  852.851628] smpboot: CPU 1 is now offline
>[  852.876714] smpboot: CPU 2 is now offline
>[  852.901690] smpboot: CPU 3 is now offline
>[  852.921911] smpboot: CPU 4 is now offline
>[  852.941588] smpboot: CPU 5 is now offline
>[  852.954720] smpboot: CPU 6 is now offline
>[  852.969625] smpboot: CPU 7 is now offline
>[  852.974523] PM: suspend entry (s2idle)
>[  852.974525] PM: Syncing filesystems ... done.
>[  852.987057] PM: Preparing system for sleep (s2idle)
>[  852.987335] Freezing user space processes ... (elapsed 0.002 seconds) done.
>[  852.990264] OOM killer disabled.
>[  852.990265] Freezing remaining freezable tasks ... (elapsed 0.000 seconds) done.
>[  852.990968] PM: Suspending system (s2idle)
>[  852.990969] PM:  platform_suspend_begin 1
>[  852.990970] PM:  platform_suspend_begin PM_SUSPEND_TO_IDLE
>[  852.990971] ACPI: acpi_s2idle_begin 
>[  852.990972] PM:  suspend_devices_and_enter
>[  852.990972] Suspending console(s) (use no_console_suspend to debug)
>[  852.992193] rtc_cmos 00:01: suspend, ctrl 02
>[  852.992892] [drm:amdgpu_pmops_suspend [amdgpu]] *ERROR*  in if amdgpu_pmops_suspend
>[  852.995310] amdgpu: [powerplay] smu firmware version too old, can not set dpm level
>[  853.011977] amdgpu: [powerplay] smu firmware version too old, can not set dpm level
>[  853.060914] amdgpu: [powerplay] smu firmware version too old, can not set dpm level
>[  853.101456] [drm:amdgpu_device_suspend [amdgpu]] *ERROR* before  amdgpu_device_suspend  curr=0, req=3 
>[  853.113257] amdgpu 0000:04:00.0:  curr=3, req=3 
>[  853.113261] amdgpu 0000:04:00.0:  dev->bus->self
>[  853.113355] [drm:amdgpu_device_suspend [amdgpu]] *ERROR* before  amdgpu_device_suspend  curr=3, req=3 
>[  853.122540] PM: suspend of devices complete after 131.420 msecs
>[  853.122543] PM:  1 suspend_devices_and_enter
>[  853.122546] PM: suspend devices took 0.131 seconds
>[  853.122547] PM:  2 suspend_devices_and_enter
>[  853.123153] PM: late suspend of devices complete after 0.604 msecs
>[  853.123155] ACPI: aacpi_s2idle_prepare  out 
>[  853.123156] PM: suspend-to-idle
>[  853.123299] ACPI: EC: interrupt blocked
>[  853.136027]     power-0276 __acpi_power_off      : Power resource [P0ST] turned off
>[  853.136032] device_pm-0236 device_set_power      : Device [SATA] transitioned to D3hot
>[  853.136061]     power-0189 power_get_state       : Resource [P0ST] is off
>[  853.136065]     power-0220 power_get_list_state  : Resource list is off
>[  853.136087]     power-0189 power_get_state       : Resource [P3ST] is off
>[  853.136090]     power-0220 power_get_list_state  : Resource list is off
>[  853.136093] device_pm-0125 device_get_power      : Device [SATA] power state is D3cold
>[  853.147966] PM: noirq suspend of devices complete after 24.762 msecs
>[  857.458149] PM: Timekeeping suspended for 4.809 seconds
>[  857.458248] ACPI: acpi_s2idle_wake  out 
>[  857.458485] ACPI: EC: interrupt unblocked 
>[  857.470478]     power-0235 __acpi_power_on       : Power resource [P0ST] turned on
>[  857.470503] device_pm-0236 device_set_power      : Device [SATA] transitioned to D0
>[  857.470516] ahci 0000:05:00.0:  Refused to change power state, currently in D3
>[  857.470517] ahci 0000:05:00.0:  curr=3, req=0 
>[  857.470518] ahci 0000:05:00.0:  dev->bus->self
>[  857.470534]     power-0189 power_get_state       : Resource [P0ST] is on
>[  857.470536]     power-0220 power_get_list_state  : Resource list is on
>[  857.470538] device_pm-0125 device_get_power      : Device [SATA] power state is D0
>[  857.494570] PM: noirq resume of devices complete after 36.315 msecs
>[  857.494797] ACPI: acpi_s2idle_sync out 
>[  857.495136] PM: resume from suspend-to-idle
>[  857.495136] ACPI: acpi_s2idle_restore 
>[  857.495451] PM: early resume of devices complete after 0.312 msecs
>[  857.495451] PM:  3 suspend_devices_and_enter
>[  857.496002] [drm:amdgpu_pmops_resume [amdgpu]] *ERROR*  in if amdgpu_pmops_resume
>[  857.496045] [drm:amdgpu_device_resume [amdgpu]] *ERROR* before  amdgpu_device_resume  curr=0, req=0 
>[  857.496091] [drm:amdgpu_device_resume [amdgpu]] *ERROR* after  amdgpu_device_resume  curr=0, req=0 
>[  857.496838] [drm] PCIE GART of 1024M enabled (table at 0x000000F400900000).
>[  857.496865] [drm] PSP is resuming...
>[  857.501901] rtc_cmos 00:01: resume, ctrl 22
>
>[  857.611273] amdgpu: [powerplay] dpm has been enabled
>[  857.611278] amdgpu: [powerplay] smu firmware version too old, can not set dpm level
>[  857.636777] [drm] SADs count is: 0, don't need to read it
>[  857.713206] nvme nvme0: Shutdown timeout set to 8 seconds
>[  857.764036] amdgpu: [powerplay] smu firmware version too old, can not set dpm level
>[  857.814556] ata1: SATA link down (SStatus 0 SControl 300)
>
>[  858.820040] [drm:gfx_v9_0_hw_init [amdgpu]] *ERROR* KCQ enable failed (scratch(0xC040)=0xCAFEDEAD)
>[  858.820130] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v9_0> failed -22
>[  858.820172] [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-22).
>[  858.820178] dpm_run_callback(): pci_pm_resume+0x0/0xb0 returns -22
>[  858.820401] PM: Device 0000:04:00.0 failed to resume async: error -22
>[  858.822359] PM: resume of devices complete after 1326.904 msecs
>[  858.822515] PM: resume devices took 1.327 seconds
>[  858.822518] ACPI: acpi_s2idle_end 
>[  858.822519] PM: Finishing wakeup.
>[  858.822520] OOM killer enabled.
>[  858.822520] Restarting tasks ... done.
>[  858.828914] video LNXVIDEO:02: Restoring backlight state
>[  858.829045] PM: suspend exit
>[  858.829522] x86: Booting SMP configuration:
>[  858.829523] smpboot: Booting Node 0 Processor 1 APIC 0x1
>[  858.842077] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
>[  858.842304] microcode: CPU1: patch_level=0x08100004
>[  858.842588] smpboot: Booting Node 0 Processor 2 APIC 0x2
>[  858.851116] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
>[  858.851419] microcode: CPU2: patch_level=0x08100004
>[  858.851593] evbug: Event. Dev: input7, Type: 2, Code: 1, Value: -1
>[  858.851595] evbug: Event. Dev: input7, Type: 0, Code: 0, Value: 0
>[  858.851803] smpboot: Booting Node 0 Processor 3 APIC 0x3
>[  858.861101] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
>[  858.861347] microcode: CPU3: patch_level=0x08100004
>[  858.861653] smpboot: Booting Node 0 Processor 4 APIC 0x4
>[  858.876201] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
>[  858.876476] microcode: CPU4: patch_level=0x08100004
>[  858.876866] smpboot: Booting Node 0 Processor 5 APIC 0x5
>[  858.883595] evbug: Event. Dev: input7, Type: 2, Code: 1, Value: -1
>[  858.883597] evbug: Event. Dev: input7, Type: 0, Code: 0, Value: 0
>[  858.896152] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
>[  858.896430] microcode: CPU5: patch_level=0x08100004
>[  858.896902] smpboot: Booting Node 0 Processor 6 APIC 0x6
>[  858.918446] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
>[  858.918720] microcode: CPU6: patch_level=0x08100004
>[  858.919169] smpboot: Booting Node 0 Processor 7 APIC 0x7
>[  858.936103] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
>[  858.936349] microcode: CPU7: patch_level=0x08100004
>[  858.943167] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.943222] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.943238] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.943286] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.943293] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.943341] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.943348] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.943395] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.943401] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.943448] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.943454] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.943500] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.943506] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.943553] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.943562] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.943609] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.947600] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.947657] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.947663] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.947726] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.947729] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.947776] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.947777] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.947824] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.947825] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.947872] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.947873] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.947920] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.947921] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.947968] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.948039] amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:222 vmid:5 pasid:32768, for process surfaceflinger pid 1509 thread surfaceflinger pid 1771
>[  858.948039] )
>[  858.948041] amdgpu 0000:04:00.0:   at address 0x00000001001f9000 from 27
>[  858.948042] amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x005009BD
>[  858.948047] amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:222 vmid:5 pasid:32768, for process surfaceflinger pid 1509 thread surfaceflinger pid 1771
>[  858.948047] )
>[  858.948049] amdgpu 0000:04:00.0:   at address 0x00000001001fa000 from 27
>[  858.948050] amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
>[  858.964994] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  858.965223] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  858.965282] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  858.965334] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  858.965401] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  858.965455] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  858.965506] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  858.965558] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  858.965833] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.965883] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.965968] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.966017] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.966040] amdgpu 0000:04:00.0: couldn't schedule ib on ring <sdma0>
>[  858.966088] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
>[  858.966589] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  858.966854] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  858.966910] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  858.966963] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  858.967036] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  858.967101] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  858.967159] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  858.967211] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  858.967267] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  858.967318] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>[  859.872807] r8169 0000:02:00.0 eth0: Link is Down
>[  860.365593] r8169 0000:02:00.0 eth0: Link is Up - 1Gbps/Full - flow control rx/tx
>[  869.728687] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=7433, emitted seq=7436
>[  869.728692] [drm] GPU recovery disabled.

>
Comment 2 Martin Peres 2019-11-19 09:15:15 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe to and participate in the new bug on our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/709.
