Summary: | Framebuffer corruption when a fb which is not being scanned out gets removed | ||
---|---|---|---|
Product: | DRI | Reporter: | Hans de Goede <jwrdegoede> |
Component: | DRM/AMDgpu | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED MOVED | QA Contact: | |
Severity: | not set | ||
Priority: | not set | ||
Version: | DRI git | ||
Hardware: | Other | ||
OS: | All | ||
Description
Hans de Goede
2019-09-08 10:00:19 UTC
I just realized I left out one bit of info which might be useful. To debug this, I added the following change to the kernel:

diff --git a/drivers/gpu/drm/drm_framebuffer.c b/drivers/gpu/drm/drm_framebuffer.c
index 57564318ceea..4712bfb9ae05 100644
--- a/drivers/gpu/drm/drm_framebuffer.c
+++ b/drivers/gpu/drm/drm_framebuffer.c
@@ -464,6 +464,7 @@ int drm_mode_rmfb(struct drm_device *dev, u32 fb_id,
 	if (drm_framebuffer_read_refcount(fb) > 1) {
 		struct drm_mode_rmfb_work arg;
 
+		pr_err("drm_modr_rmfb calling drm_framebuffer_remove\n");
 		INIT_WORK_ONSTACK(&arg.work, drm_mode_rmfb_work_fn);
 		INIT_LIST_HEAD(&arg.fbs);
 		list_add_tail(&fb->filp_head, &arg.fbs);
@@ -471,8 +472,10 @@ int drm_mode_rmfb(struct drm_device *dev, u32 fb_id,
 		schedule_work(&arg.work);
 		flush_work(&arg.work);
 		destroy_work_on_stack(&arg.work);
-	} else
+	} else {
+		pr_err("drm_modr_rmfb calling drm_framebuffer_put\n");
 		drm_framebuffer_put(fb);
+	}
 
 	return 0;
 
@@ -669,11 +672,13 @@ void drm_fb_release(struct drm_file *priv)
 	 */
 	list_for_each_entry_safe(fb, tfb, &priv->fbs, filp_head) {
 		if (drm_framebuffer_read_refcount(fb) > 1) {
+			pr_err("drm_fb_release calling drm_framebuffer_remove\n");
 			list_move_tail(&fb->filp_head, &arg.fbs);
 		} else {
 			list_del_init(&fb->filp_head);
 
 			/* This drops the fpriv->fbs reference. */
+			pr_err("drm_fb_release calling drm_framebuffer_put\n");
 			drm_framebuffer_put(fb);
 		}
 	}
@@ -863,6 +868,8 @@ static int atomic_remove_fb(struct drm_framebuffer *fb)
 		if (plane->state->fb != fb)
 			continue;
 
+		pr_err("atomic_remove_fb found plane still using fb\n");
+
 		plane_state = drm_atomic_get_plane_state(state, plane);
 		if (IS_ERR(plane_state)) {
 			ret = PTR_ERR(plane_state);

In the working case, i.e. where we let the kernel do the fb cleanup itself, I see:

Plymouth removes the fb it creates to test for 32bpp support:

kernel: drm_modr_rmfb calling drm_framebuffer_put

gdm starts and does page-flipping, resulting in a number of:

kernel: drm_modr_rmfb calling drm_framebuffer_put
kernel: drm_modr_rmfb calling drm_framebuffer_put
...

lines. And then plymouth exits without any cleanup, so we get:

kernel: drm_fb_release calling drm_framebuffer_put

Followed by more:

kernel: drm_modr_rmfb calling drm_framebuffer_put
kernel: drm_modr_rmfb calling drm_framebuffer_put
...

from gdm.

In the broken case, where ply_renderer_buffer_free() gets called on plymouth-quit, I only see:

kernel: drm_modr_rmfb calling drm_framebuffer_put
kernel: drm_modr_rmfb calling drm_framebuffer_put
...

lines, which is expected as the fb is rmfb-ed before the fd is closed.

Note that we never hit:

@@ -863,6 +868,8 @@ static int atomic_remove_fb(struct drm_framebuffer *fb)
 		if (plane->state->fb != fb)
 			continue;
 
+		pr_err("atomic_remove_fb found plane still using fb\n");
+
 		plane_state = drm_atomic_get_plane_state(state, plane);
 		if (IS_ERR(plane_state)) {
 			ret = PTR_ERR(plane_state);

So AFAICT userspace is doing everything correctly even in the broken case.
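For readers mapping the log lines back to the patch, the branch the instrumented code takes can be modelled in a few lines of plain C. This is only a toy userspace sketch, not kernel code: `fb_cleanup_path()` and `fb_cleanup_message()` are hypothetical names made up for illustration, and the model assumes the only input that matters is the value returned by drm_framebuffer_read_refcount(). A count of 1 is taken to mean that only the file's own reference is left, so the fb is simply put; a count above 1 is taken to mean something else (for instance a plane scanning it out) still holds a reference, so the drm_framebuffer_remove path runs.

```c
#include <stdio.h>

/* Toy model only, NOT kernel code. It mirrors the if/else structure of the
 * instrumented drm_mode_rmfb() and drm_fb_release() hunks in the diff. */

enum fb_path {
	FB_PATH_PUT,    /* refcount == 1: only the file's reference remains */
	FB_PATH_REMOVE  /* refcount > 1: someone else still references the fb */
};

/* Hypothetical helper: choose the cleanup path from a reference count. */
static enum fb_path fb_cleanup_path(unsigned int refcount)
{
	return refcount > 1 ? FB_PATH_REMOVE : FB_PATH_PUT;
}

/* Hypothetical helper: build the message the debug patch would print for
 * a given caller tag (the tags keep the "drm_modr_rmfb" spelling used in
 * the patch so they match the logs). */
static const char *fb_cleanup_message(const char *caller,
				      unsigned int refcount,
				      char *buf, size_t len)
{
	const char *callee = fb_cleanup_path(refcount) == FB_PATH_REMOVE ?
			     "drm_framebuffer_remove" : "drm_framebuffer_put";

	snprintf(buf, len, "%s calling %s", caller, callee);
	return buf;
}
```

Under this model, every line in both the working and the broken log corresponds to a refcount of 1 at the time of the call, which is consistent with the atomic_remove_fb message never showing up.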
(In reply to Hans de Goede from comment #0)
> 5) Plymouth internally calls src/plugins/renderers/drm/plugin.c:
> ply_renderer_buffer_free() this does:
> drmModeRmFB(...);
> munmap (buffer->map_address, buffer->map_size);
> destroy_dumb_buffer_request.handle = buffer->handle;
> drmIoctl (fd, DRM_IOCTL_MODE_DESTROY_DUMB, &destroy_dumb_buffer_request);
> Followed by calling close() on the fd.
> 6) Plymouth exits
> 7) 5 and/or 6 cause the gdm framebuffer being all messed up, it looks like a
> wrong pitch or tiling setting

Would be interesting if you could further narrow down which step (or even sub-step of 5) exactly triggers the problem.

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/902.