Bug 104527 - Amdgpu locks up occasionally when running 3d applications
Status: RESOLVED MOVED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/AMDgpu
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
Reported: 2018-01-07 15:07 UTC by Michal Suchanek
Modified: 2019-11-19 08:28 UTC

Attachments
kernel messages (105.21 KB, text/plain)
2018-01-07 15:07 UTC, Michal Suchanek
X log with "BUG: triggered 'if (in_input_thread())'" traces removed (113.30 KB, text/plain)
2018-01-07 15:55 UTC, Michal Suchanek

Description Michal Suchanek 2018-01-07 15:07:01 UTC
Created attachment 136597
kernel messages

Linux 4.14.0, libdrm 2.4.89, Mesa 17.3.1 on Debian.

01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Radeon RX 460] [1002:67ef] (rev cf)

After the lockup I see this message:

[150509.194713] amdgpu 0000:01:00.0: GPU fault detected: 147 0x00004802
[150509.194718] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[150509.194720] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0A048002
[150509.194722] amdgpu 0000:01:00.0: VM fault (0x02, vmid 5) at page 0, read from 'TC0' (0x54433000) (72)

However, a similar message earlier did not cause a lockup:

[112552.659698] amdgpu 0000:01:00.0: GPU fault detected: 147 0x07f04802
[112552.659702] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0003F8FE
[112552.659704] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0A048002
[112552.659706] amdgpu 0000:01:00.0: VM fault (0x02, vmid 5) at page 260350, read from 'TC0' (0x54433000) (72)

Earlier versions of the kernel and Mesa would occasionally lock up at random times while displaying garbage. I have not seen that for a while, but the card still occasionally locks up when running a 3D application. After a lockup the display keeps showing a static frame of whatever the application last rendered, plus a cursor that can still be moved. It *seems* to happen most often while some setup is in progress, such as loading a new scene.
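
For anyone triaging similar faults, the values in the two excerpts above can be cross-checked mechanically. What follows is a minimal Python sketch, not driver code. The bit positions in decode_status are an assumption, chosen because they reproduce the values the driver printed above (protections 0x02, vmid 5, read, client id 72); they are not stated anywhere in this report. All function names are hypothetical.

```python
import re

# Matches the two register lines that follow "GPU fault detected" in the log.
FAULT_RE = re.compile(
    r"VM_CONTEXT1_PROTECTION_FAULT_(ADDR|STATUS)\s+0x([0-9A-Fa-f]+)"
)


def decode_status(status):
    """Split the fault status word into fields.

    ASSUMPTION: this bit layout is inferred from the log excerpts in this
    report, where 0x0A048002 was printed as (0x02, vmid 5), read, (72).
    """
    return {
        "protections": status & 0xFF,         # assumed bits 0-7
        "client_id": (status >> 12) & 0xFF,   # assumed bits 12-19
        "write": bool((status >> 24) & 0x1),  # assumed bit 24, 0 means read
        "vmid": (status >> 25) & 0xF,         # assumed bits 25-28
    }


def client_name(tag):
    """Turn the 32-bit client tag into text: four ASCII bytes, NUL padded."""
    return tag.to_bytes(4, "big").rstrip(b"\x00").decode("ascii")


def parse_faults(log_text):
    """Return one {'addr': ..., 'status': ...} dict per ADDR/STATUS pair."""
    faults, current = [], {}
    for kind, value in FAULT_RE.findall(log_text):
        current[kind.lower()] = int(value, 16)
        if "addr" in current and "status" in current:
            faults.append(current)
            current = {}
    return faults


# The fault that did not lock up the card, copied from the excerpt above.
SAMPLE = """\
[112552.659702] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0003F8FE
[112552.659704] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0A048002
"""

if __name__ == "__main__":
    for fault in parse_faults(SAMPLE):
        print("page", fault["addr"], decode_status(fault["status"]))
    print(client_name(0x54433000))
```

Run against the sample, this gives page 260350 and client 'TC0', matching the "VM fault" summary line. Both faults carry the same status word (0x0A048002), so the only difference visible in these register values is the faulting page: page 0 for the fault that preceded the lockup, page 260350 for the one that did not.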
Comment 1 Michal Suchanek 2018-01-07 15:55:16 UTC
Created attachment 136598
X log with "BUG: triggered 'if (in_input_thread())'" traces removed
Comment 2 Michal Suchanek 2018-01-08 05:36:54 UTC
OK, AFAICT this problem only happens with Wine when it uses an "optimization" (on by default) that does not guarantee correct rendering order. In extreme cases (such as creating a new scene with many objects) this probably means that some objects/buffers are used before they are created/uploaded, resulting in bogus code and crashing the card.

Of course it should not crash whatever the user does, but given that the silicon is not impeccable, this might be difficult to avoid in practice, even on more recent cards.
Comment 3 Martin Peres 2019-11-19 08:28:23 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/289.

