| Summary: | forced EDID's can cause a amdgpu to null ptr deref | ||
|---|---|---|---|
| Product: | DRI | Reporter: | Edward O'Callaghan <funfunctor> |
| Component: | DRM/other | Assignee: | Default DRI bug account <dri-devel> |
| Status: | RESOLVED INVALID | QA Contact: | |
| Severity: | normal | ||
| Priority: | medium | CC: | fdsfgs |
| Version: | unspecified | ||
| Hardware: | Other | ||
| OS: | All | ||
| Whiteboard: | |||
| i915 platform: | i915 features: | ||
|
Description
Edward O'Callaghan
2017-03-24 10:36:39 UTC
set_root doesn't look directly related to amdgpu or drm, so this could be memory corruption. KASAN might give more information.

Does this only happen when forcing an invalid EDID?

(In reply to Michel Dänzer from comment #1)
> set_root doesn't look directly related to amdgpu or drm, so this could be
> memory corruption. KASAN might give more information.
>
> Does this only happen when forcing an invalid EDID?

Hi Michel, yes, it only happens on shutdown with an EDID blob passed at boot. Actually, I don't think the EDID blob passed is invalid; I don't know where it is getting the one in the trace from. That could perhaps be from the monitor itself. Is the kernel there trying to open the EDID blob on an unmounted fs?

Sounds plausible, in which case it's probably a core DRM or even lower level kernel issue.

Actually, I don't believe this has anything to do with the EDID, as not forcing an EDID makes no difference. The actual root cause is that if a page flip is in progress something races on that fd and causes the null ptr deref:

[ 18.281296] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 18.289158] IP: [<ffffffff81169a8d>] set_root+0x1d/0xa0
[ 18.294401] PGD 0
[ 18.296239]
[ 18.297739] Oops: 0000 [#1] SMP
[ 18.300885] Modules linked in: amdgpu blackmagic_io(PO) ttm backlight hid_sony led_class
[ 18.309086] CPU: 2 PID: 3595 Comm: hyperflow-engin Tainted: P O 4.9.16-gentoo #1
[ 18.317605] Hardware name: BIOSTAR Group A68N-5200/A68N-5200, BIOS 4.6.5 09/03/2015
[ 18.325248] task: ffff8802255755c0 task.stack: ffffc90008f30000
[ 18.331161] RIP: 0010:[<ffffffff81169a8d>] [<ffffffff81169a8d>] set_root+0x1d/0xa0
[ 18.338823] RSP: 0018:ffffc90008f33688 EFLAGS: 00010202
[ 18.344127] RAX: ffff8802255755c0 RBX: ffffc90008f337c0 RCX: ffff880218f12e00
[ 18.351252] RDX: ffffffff81c55e08 RSI: 0000000000000041 RDI: ffffc90008f337c0
[ 18.358376] RBP: ffffc90008f33698 R08: 0000000018f12e01 R09: ffff880218f12e00
[ 18.365501] R10: ffff88021432a024 R11: 0000000000000017 R12: 0000000000000000
[ 18.372626] R13: ffff88021432f01c R14: 0000000000000001 R15: ffff880218de8200
[ 18.379750] FS: 00007fee18f6d740(0000) GS:ffff88022ed00000(0000) knlGS:0000000000000000
[ 18.387827] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 18.393566] CR2: 0000000000000008 CR3: 0000000001a08000 CR4: 00000000000406e0
[ 18.400690] Stack:
[ 18.402701] ffffc90008f337c0 0000000000000041 ffffc90008f336d8 ffffffff81169dc9
[ 18.410155] ffff880219f7e300 ffff88021432f000 ffffc90008f337c0 ffffc90008f338cc
[ 18.417607] 0000000000000001 ffff880218de8200 ffffc90008f337b0 ffffffff8116c3aa
[ 18.425063] Call Trace:
[ 18.427510] [<ffffffff81169dc9>] path_init+0x1e9/0x330
[ 18.432735] [<ffffffff8116c3aa>] path_openat+0x6a/0x1480
[ 18.438137] [<ffffffff81079c3d>] ? default_wake_function+0xd/0x10
[ 18.444315] [<ffffffff8108ce3d>] ? __wake_up_common+0x4d/0x80
[ 18.450149] [<ffffffff8116f3c9>] do_filp_open+0x79/0xd0
[ 18.455463] [<ffffffff8134fba8>] ? acpi_driver_match_device+0x3d/0x5d
[ 18.461987] [<ffffffff813d7164>] ? platform_match+0x24/0xa0
[ 18.467639] [<ffffffff816039f1>] ? klist_next+0x21/0xf0
[ 18.472944] [<ffffffff8115e82f>] file_open_name+0xdf/0x100
[ 18.478515] [<ffffffff8115e87e>] filp_open+0x2e/0x50
[ 18.483560] [<ffffffff811657b1>] kernel_read_file_from_path+0x31/0x70
[ 18.490079] [<ffffffff813e094f>] _request_firmware+0x2ef/0x5a0
[ 18.495989] [<ffffffff813e0c32>] request_firmware+0x32/0x50
[ 18.501649] [<ffffffff813a9f14>] drm_load_edid_firmware+0x264/0x500
[ 18.507996] [<ffffffff8139ec0c>] drm_helper_probe_single_connector_modes+0x14c/0x4d0
[ 18.515822] [<ffffffff813aaf28>] drm_fb_helper_probe_connector_modes.isra.7+0x48/0x70
[ 18.523735] [<ffffffff813aca84>] drm_fb_helper_hotplug_event+0x94/0xd0
[ 18.530347] [<ffffffff813acc7c>] drm_fb_helper_restore_fbdev_mode_unlocked+0x1bc/0x2a0
[ 18.538370] [<ffffffffa01003d5>] amdgpu_fbdev_restore_mode+0x15/0x40 [amdgpu]
[ 18.545605] [<ffffffffa00ed8dd>] amdgpu_driver_lastclose_kms+0xd/0x10 [amdgpu]
[ 18.552909] [<ffffffff813b0bb6>] drm_lastclose+0x36/0xf0
[ 18.558300] [<ffffffff813b0f15>] drm_release+0x2a5/0x360
[ 18.563691] [<ffffffff811611ca>] __fput+0xda/0x1e0
[ 18.568561] [<ffffffff81161309>] ____fput+0x9/0x10
[ 18.573435] [<ffffffff8106e9a9>] task_work_run+0x79/0xa0
[ 18.578834] [<ffffffff8105738a>] do_exit+0x34a/0xaa0
[ 18.583886] [<ffffffff81058940>] do_group_exit+0x40/0xa0
[ 18.589277] [<ffffffff81062892>] get_signal+0x272/0x5e0
[ 18.594582] [<ffffffff8101bfd3>] do_signal+0x23/0x5b0
[ 18.599712] [<ffffffff81061978>] ? do_send_sig_info+0x58/0x70
[ 18.605537] [<ffffffff8100222e>] exit_to_usermode_loop+0x4e/0x80
[ 18.611620] [<ffffffff81002673>] syscall_return_slowpath+0x43/0x50
[ 18.617881] [<ffffffff81609a9f>] entry_SYSCALL_64_fastpath+0x92/0x94
[ 18.624327] Code: 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 55 65 48 8b 04 25 40 c4 00 00 48 89 e5 41 54 53 f6 47 38 40 4c 8b a0 68 05 00 00 74 39 <41> 8b 4c 24 08 f6 c1 01 75 6d 49 8b 54 24 20
[ 18.644280] RIP [<ffffffff81169a8d>] set_root+0x1d/0xa0
[ 18.649600] RSP <ffffc90008f33688>
[ 18.653086] CR2: 0000000000000008
[ 18.656398] ---[ end trace 506f9f2a94b80534 ]---
[ 18.661007] Fixing recursive fault but reboot is needed!

(In reply to Edward O'Callaghan from comment #4)
> The actual root cause is that if a page flip is in progress something races
> on that fd and causes the null ptr deref:

How did you determine that it's related to a page flip (or amdgpu in the first place)? I don't see the connection between that and set_root.

(I filed bug https://bugs.freedesktop.org/show_bug.cgi?id=102202 on what might be a related issue.)

Hi, Freedesktop's Bugzilla instance is EOLed and open bugs are about to be migrated to http://gitlab.freedesktop.org. To avoid migrating out of date bugs, I am now closing all the bugs that did not see any activity in the past year. If the issue is still happening, please create a new bug in the relevant project at https://gitlab.freedesktop.org/drm (use misc by default). Sorry about the noise!
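For anyone trying to follow the path from the file release down to set_root in the call trace above, a small script can strip the addresses and drop the frames marked with "?" (which the kernel flags as unreliable). This is only a rough sketch: the helper name `reliable_frames` and the regular expression are made up for illustration and are not part of any kernel tooling, and the sample is a shortened excerpt of the trace in this report.

```python
import re

# Matches entries such as "[<ffffffff81169dc9>] path_init+0x1e9/0x330".
# A "?" before the symbol marks a frame the unwinder could not verify.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(trace_text):
    """Return symbol names of frames not marked '?', innermost first."""
    return [
        m.group(2)
        for m in FRAME_RE.finditer(trace_text)
        if m.group(1) is None
    ]


# Shortened excerpt of the call trace from this report (timestamps kept
# to show that they are not mistaken for frames).
sample = """
[ 18.427510] [<ffffffff81169dc9>] path_init+0x1e9/0x330
[ 18.432735] [<ffffffff8116c3aa>] path_openat+0x6a/0x1480
[ 18.438137] [<ffffffff81079c3d>] ? default_wake_function+0xd/0x10
[ 18.450149] [<ffffffff8116f3c9>] do_filp_open+0x79/0xd0
[ 18.495989] [<ffffffff813e0c32>] request_firmware+0x32/0x50
[ 18.501649] [<ffffffff813a9f14>] drm_load_edid_firmware+0x264/0x500
[ 18.552909] [<ffffffff813b0bb6>] drm_lastclose+0x36/0xf0
[ 18.578834] [<ffffffff8105738a>] do_exit+0x34a/0xaa0
"""

if __name__ == "__main__":
    # Print outermost caller first so the chain reads top-down.
    for name in reversed(reliable_frames(sample)):
        print(name)
```

Read outermost first, the excerpt shows the firmware file open being reached from drm_lastclose while the task is already in do_exit, which is the ordering the "unmounted fs" question above is asking about.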