Bug 111682

Summary: use-after-free in amdgpu_vm_update_pdes
Product: DRI
Reporter: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer>
Component: DRM/AMDgpu
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED
QA Contact:
Severity: not set
Priority: not set
Version: XOrg git
Hardware: Other
OS: All
Attachments:
- dmesg output (flags: none)
- dmesg when using cfdabd064b2d (drm/amdgpu: remove the redundant null checks) (flags: none)

Description Pierre-Eric Pelloux-Prayer 2019-09-13 10:39:00 UTC
Created attachment 145345
dmesg output

When using amdgpu.vm_update_mode=3 the following error appears after some time (ranging from a few minutes to a few hours):

BUG: KASAN: use-after-free in amdgpu_vm_update_directories

I attached the relevant dmesg part.
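
A minimal sketch for pulling such KASAN header lines out of a saved dmesg dump, assuming each header has the form shown above ("BUG: KASAN: <kind> in <symbol>"). The function name find_kasan_reports and the regular expression are illustrative assumptions, not part of any kernel tooling.

```python
import re

# Matches KASAN report headers such as:
#   BUG: KASAN: use-after-free in amdgpu_vm_update_directories
# Assumption: the symbol is a run of word characters or dots; anything
# after it on the line (offsets, module name) is ignored.
KASAN_RE = re.compile(r"BUG: KASAN: (?P<kind>[\w-]+) in (?P<symbol>[\w.]+)")


def find_kasan_reports(log_text):
    """Return a (kind, symbol) tuple for each KASAN header line in log_text."""
    reports = []
    for line in log_text.splitlines():
        match = KASAN_RE.search(line)
        if match:
            reports.append((match.group("kind"), match.group("symbol")))
    return reports


if __name__ == "__main__":
    # Sample input taken from the error line quoted in this report.
    sample = "BUG: KASAN: use-after-free in amdgpu_vm_update_directories\n"
    for kind, symbol in find_kasan_reports(sample):
        print(kind, symbol)
```

Running this over logs from different kernel builds makes it easy to see whether the report names amdgpu_vm_update_directories or a differently named function.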

Notes:
- happens on Navi10 and gfx9 (probably also on other cards but I didn't try)
- reproduced on 865b4ca43816e113996c3be571d4998b6daf5f1 and 20d6b9c3b7f40ec427af912d140f2be0de098d2d
Comment 1 Andrey Grodzovsky 2019-09-16 17:56:06 UTC
Which kernel branch are you using? I couldn't find amdgpu_vm_update_directories in the latest code in amd-staging-drm-next. It turns out it was renamed to amdgpu_vm_update_pdes in 78b20c2ee6788ba0df8b36b1369bc7e264262d3b back in March, so this seems to be very outdated code.
Comment 2 Pierre-Eric Pelloux-Prayer 2019-09-16 18:24:33 UTC
(In reply to Andrey Grodzovsky from comment #1)
> Which kernel branch are you using ? I couldn't find 
> amdgpu_vm_update_directories in latest code in amd-staging-drm-next and
> turns out it was renamed to amdgpu_vm_update_pdes in
> 78b20c2ee6788ba0df8b36b1369bc7e264262d3b back in March so seems like this is
> very outdated code.

I'm using amd-staging-drm-next from a few days ago.

But 78b20c2ee6788ba0df8b36b1369bc7e264262d3b (drm/amdgpu: allow direct submission of PDE updates v2) was only pushed to this branch recently, and it did indeed rename the function.

I'll rebuild a kernel and test if the issue is still there.
Comment 3 Pierre-Eric Pelloux-Prayer 2019-09-17 07:18:52 UTC
Created attachment 145387
dmesg when using cfdabd064b2d(drm/amdgpu: remove the redundant null checks)

With the latest commit from amd-staging-drm-next (cfdabd064b2d58f, "drm/amdgpu: remove the redundant null checks"), the use-after-free bug is still there.
Comment 4 Martin Peres 2019-11-19 09:51:17 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe to and participate in the new bug on our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/905.