Created attachment 143453
dmesg

Hi,

I get frequent GPU hangs when playing Xonotic with the amdgpu video driver.

[ 9330.297589] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CRTC:47:crtc-0] flip_done timed out
[ 9340.537609] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [PLANE:45:plane-5] flip_done timed out
[ 9340.537682] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* amdgpu_dm_commit_planes: acrtc 0, already busy
[ 9340.537762] WARNING: CPU: 1 PID: 3733 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:4860 amdgpu_dm_atomic_commit_tail+0x1349/0x14f0 [amdgpu]

Full dmesg attached.

# uname -r
5.0.0-rc7+

# emerge mesa -pv

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild R ] media-libs/mesa-19.0.0_rc4::gentoo USE="classic dri3 egl gallium gbm gles2 llvm pic vaapi vdpau wayland -d3d9 -debug -gles1 -lm_sensors -opencl -osmesa -pax_kernel (-selinux) -test -unwind -valgrind -vulkan -xa -xvmc" ABI_X86="32 (64) (-x32)" VIDEO_CARDS="i965 intel radeon radeonsi (-freedreno) -i915 (-imx) -nouveau -r100 -r200 -r300 -r600 (-vc4) -virgl (-vivante) -vmware" 0 KiB

Total: 1 package (1 reinstall), Size of downloads: 0 KiB

# cat /etc/X11/xorg.conf.d/video.conf
#Section "Device"
#    Identifier "Intel Graphics"
#    Driver "modesetting"
#    #Option "GLXVBlank" "off"
#    Option "AccelMethod" "glamor"
#    Option "DRI" "3"
#EndSection

Section "Device"
    Identifier "AMD"
    Driver "amdgpu"
    Option "GLXVBlank" "off"
    Option "AccelMethod" "glamor"
    Option "DRI" "3"
EndSection

# lspci -vvv -s 01:00.0
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Tonga PRO [Radeon R9 285/380] (rev f1) (prog-if 00 [VGA controller])
    Subsystem: Gigabyte Technology Co., Ltd Tonga PRO [Radeon R9 285/380]
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 46
    Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
    Region 2: Memory at d0000000 (64-bit, prefetchable) [size=2M]
    Region 4: I/O ports at e000 [size=256]
    Region 5: Memory at dfd00000 (32-bit, non-prefetchable) [size=256K]
    Expansion ROM at dfd40000 [disabled] [size=128K]
    Capabilities: [48] Vendor Specific Information: Len=08 <?>
    Capabilities: [50] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
            RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
        LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 8GT/s (ok), Width x16 (ok)
            TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
            AtomicOpsCap: 32bit- 64bit- 128bitCAS-
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
            AtomicOpsCtl: ReqEn-
        LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
            EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Address: 00000000fee00558 Data: 0000
    Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
    Capabilities: [150 v2] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
        AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
            MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
        HeaderLog: 00000000 00000000 00000000 00000000
    Capabilities: [200 v1] Resizable BAR <?>
    Capabilities: [270 v1] Secondary PCI Express <?>
    Capabilities: [2b0 v1] Address Translation Service (ATS)
        ATSCap: Invalidate Queue Depth: 00
        ATSCtl: Enable-, Smallest Translation Unit: 00
    Capabilities: [2c0 v1] Page Request Interface (PRI)
        PRICtl: Enable- Reset-
        PRISta: RF- UPRGI- Stopped+
        Page Request Capacity: 00000020, Page Request Allocation: 00000000
    Capabilities: [2d0 v1] Process Address Space ID (PASID)
        PASIDCap: Exec+ Priv+, Max PASID Width: 10
        PASIDCtl: Enable- Exec- Priv-
    Capabilities: [328 v1] Alternative Routing-ID Interpretation (ARI)
        ARICap: MFVC- ACS-, Next Function: 1
        ARICtl: MFVC- ACS-, Function Group: 0
    Kernel driver in use: amdgpu
    Kernel modules: amdgpu
As I half remembered, this didn't happen in the past. I did some internet searches and there are a few similar bugs on this bugzilla:

https://bugzilla.freedesktop.org/show_bug.cgi?id=109461
https://bugzilla.freedesktop.org/show_bug.cgi?id=104624
https://bugzilla.freedesktop.org/show_bug.cgi?id=108309

I also found:

https://bbs.archlinux.org/viewtopic.php?id=239670

which allowed me to narrow down when it broke. I looked at the changes between 4.14 and 4.15 and chose to try reverting one of the commits with "flip" in its commit message. (Getting a bisect running on 4.14 with a too-new gcc is a pain...)

It seems that reverting 320a127437e5d3cbb7fc444f8769eb510d11d3b9 helps with the random freezes for me (although I have tested for only one day). However, from what I can see, reverting this commit is just a workaround...

If anyone wants to try to reproduce it, you can install Xonotic from xonotic.org (or your distribution's repositories) and either have some fun playing, or just create an infinite-time match with bots, click the left mouse button once after it starts to select the bot view, and leave it running.

One note: I start Xonotic from the command line with vblank_mode=0 prepended, like "vblank_mode=0 xonotic".
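
For anyone repeating the test, here is a rough sketch of how a run could be checked afterwards for the hang signature quoted in the first comment. The function name count_flip_timeouts and the file name dmesg.txt are made up for illustration; the only thing taken from this report is the "flip_done timed out" message text.

```shell
# Hypothetical helper: count "flip_done timed out" errors in a saved kernel log.
# $1 is the path to a text file holding dmesg output.
count_flip_timeouts() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the helper usable under "set -e".
    grep -c 'flip_done timed out' "$1" || true
}
```

Possible usage after a session started with "vblank_mode=0 xonotic": run "dmesg > dmesg.txt", then "count_flip_timeouts dmesg.txt". A count of 0 after a long run with the commit reverted would be consistent with the workaround helping, but it does not prove the hang is gone.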
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/710.