Summary: | gbm creates BO with wrong pitch when dri3_get_modifiers returns modifiers, causing drmModeAddFB2WithModifiers to fail | ||
---|---|---|---|
Product: | Mesa | Reporter: | Hans de Goede <jwrdegoede> |
Component: | Drivers/DRI/i965 | Assignee: | Intel 3D Bugs Mailing List <intel-3d-bugs> |
Status: | RESOLVED MOVED | QA Contact: | Intel 3D Bugs Mailing List <intel-3d-bugs> |
Severity: | normal | ||
Priority: | medium | CC: | intel-gfx-bugs |
Version: | 19.0 | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | Triaged, ReadyForDev | ||
i915 platform: | SKL | i915 features: | display/Other |
Attachments: |
Requested i915 debug log with add_fb2 failure.
isl: add pitch requirement for CCS display surface > 3840 isl: add matching display workaround to the kernel |
Description
Hans de Goede
2019-08-06 13:32:27 UTC
The pitch is determined by the driver. Could you run something like this : echo 15 > /sys/module/drm/parameters/debug; xrandr --fb <width>x<height>; sleep 0.1; echo 0 > /sys/module/drm/parameters/debug And attach the syslog for that period of time. That would help figuring out where the display code of i915 is throwing the xserver out. Thanks! Ok, I've done as you requested, the error you are looking for (I think) is: [ 801.852552] [drm:drm_ioctl [drm]] pid=3502, dev=0xe200, auth=1, DRM_IOCTL_MODE_ADDFB2 [ 801.852676] [drm:intel_framebuffer_init [i915]] plane 0 pitch (18816) must be at least 512 byte aligned [ 801.852705] [drm:drm_internal_framebuffer_create [drm]] could not create framebuffer [ 801.852715] [drm:drm_ioctl [drm]] pid=3502, ret = -22 I will attach the full log. Note that it runs for a bit longer then 0.1 seconds as I ended up copy & pasting a newline from your command list, so the xrandr --fb happened before I copy and pasted the echo 0. Created attachment 144979 [details]
Requested i915 debug log with add_fb2 failure.
Meeeeh... Looking at this I wonder how anything actually works. The first issue is that i965 doesn't appear to look at the GBM_BO_USE_SCANOUT flag in usage to compute the alignment restrictions of its surface. So if I get this right you should get the same issue without using modifiers. And we just happen to have things working by chance... Second I see a fundamental issue with the GBM api. The API doesn't let you specify a usage together with modifiers, but these are 2 sets of restrictions. It's perfectly legal to create compressed surfaces which alignment's properties don't match display requirements. Fixing the first issue is trivial. Not that easy with the second one. (In reply to Hans de Goede from comment #3) > Ok, I've done as you requested, the error you are looking for (I think) is: > > [ 801.852552] [drm:drm_ioctl [drm]] pid=3502, dev=0xe200, auth=1, > DRM_IOCTL_MODE_ADDFB2 > [ 801.852676] [drm:intel_framebuffer_init [i915]] plane 0 pitch (18816) > must be at least 512 byte aligned > [ 801.852705] [drm:drm_internal_framebuffer_create [drm]] could not create > framebuffer > [ 801.852715] [drm:drm_ioctl [drm]] pid=3502, ret = -22 > > I will attach the full log. Note that it runs for a bit longer then 0.1 > seconds as I ended up copy & pasting a newline from your command list, so > the xrandr --fb happened before I copy and pasted the echo 0. What size did you use in this particular case? (In reply to Lionel Landwerlin from comment #6) > What size did you use in this particular case? xrandr --fb 4704x1080 A bit weird width I know. I had that in my history from earlier comments. It is 3840 (my normal size) + the width of the micro-projecter which triggered this issue initially rounded up to a multiple of 32. This appears to work fine on my system. Could you give me the version of mesa/kernel/xserver you're reproducing with? mesa: [hans@shalem xserver]$ rpm -q mesa-libGL mesa-libGL-19.1.3-1.fc31.x86_64 kernel: [hans@shalem xserver]$ uname -a Linux shalem.localdomain 5.3.0-rc2+ #79 SMP Wed Aug 7 23:28:44 CEST 2019 x86_64 x86_64 x86_64 GNU/Linux xserver: I'm building and running the xserver from the master branch as mentioned in the original description you need the master branch xserver, not 1.20 to reproduce, as using modifiers is only enabled in the master branch. I think I see where we might be missing a workaround in Mesa/ISL : /* * Display WA #0531: skl,bxt,kbl,glk * * Render decompression and plane width > 3840 * combined with horizontal panning requires the * plane stride to be a multiple of 4. We'll just * require the entire fb to accommodate that to avoid * potential runtime errors at plane configuration time. */ if (IS_GEN(dev_priv, 9) && i == 0 && fb->width > 3840 && is_ccs_modifier(fb->modifier)) stride_alignment *= 4; I don't think we implement this so there is a mismatch with the surface created. Could confirm that a non multiple of 64 width inferior to 3840 doesn't cause any issue on your side? I can confirm that if I disable one of my monitors, so that my minimum fb size is 1920x1080 and then do: [hans@shalem ~]$ xrandr --fb 1952x1080 [hans@shalem ~]$ xrandr --fb 1936x1080 [hans@shalem ~]$ xrandr --fb 1928x1080 [hans@shalem ~]$ xrandr --fb 1924x1080 None of these cause the issue, where as: [hans@shalem ~]$ xrandr --fb 3872x1080 Does cause the issue with just the 1 monitor enabled (as well as with both monitors enabled). (In reply to Hans de Goede from comment #9) > mesa: > [hans@shalem xserver]$ rpm -q mesa-libGL > mesa-libGL-19.1.3-1.fc31.x86_64 > > kernel: > [hans@shalem xserver]$ uname -a > Linux shalem.localdomain 5.3.0-rc2+ #79 SMP Wed Aug 7 23:28:44 CEST 2019 > x86_64 x86_64 x86_64 GNU/Linux > > xserver: > I'm building and running the xserver from the master branch as mentioned in > the original description you need the master branch xserver, not 1.20 to > reproduce, as using modifiers is only enabled in the master branch. So I'm running a skylake system with 5.3.0-rc2+, master of xserver and mesa 19.1.3 and I still cannot repro. Is there any way you could put a trace in the xserver to figure what modifier is used when the ADDFB fails? Ok, so I added the following to my xserver: diff --git a/hw/xfree86/drivers/modesetting/drmmode_display.c b/hw/xfree86/drivers/modesetting/drmmode_display.c index e0d7cfb5c..c0c0d309b 100644 --- a/hw/xfree86/drivers/modesetting/drmmode_display.c +++ b/hw/xfree86/drivers/modesetting/drmmode_display.c @@ -985,6 +985,7 @@ drmmode_bo_import(drmmode_ptr drmmode, drmmode_bo *bo, strides[i] = gbm_bo_get_stride_for_plane(bo->gbm, i); offsets[i] = gbm_bo_get_offset(bo->gbm, i); modifiers[i] = gbm_bo_get_modifier(bo->gbm); + ErrorF("modifier %d %llx\n", i, modifiers[i]); } return drmModeAddFB2WithModifiers(drmmode->fd, bo->width, bo->height, And this prints: [ 2127.597] modifier 0 100000000000004 [ 2127.597] modifier 1 100000000000004 [ 2127.597] (WW) modeset(0): Present-flip: Page flip failed: Invalid argument With 100000000000004 (hex) corresponding to I915_FORMAT_MOD_Y_TILED_CCS Note I'm running gnome-shell on top of Xorg, that may be part of what is triggering this. I'm currently running mutter + gnome-shell master, but IIRC I could also reproduce this while running plain 3.32 of both. Also note I'm running Xorg master with the modesetting driver, not with xf86-video-intel! (In reply to Hans de Goede from comment #14) > Also note I'm running Xorg master with the modesetting driver, not with > xf86-video-intel! Same :) For me I915_FORMAT_MOD_Y_TILED_CCS/0x100000000000004 gets discarded in populate_format_modifiers(). On this particular condition : if (!(mod->formats & (1 << (i - mod->offset)))) continue; I don't really understand this function tbf :) (In reply to Lionel Landwerlin from comment #15) > > if (!(mod->formats & (1 << (i - mod->offset)))) > continue; That 1 << with a 64 bit integer, I'm not sure that's 100% right... That function builds a mapping from formats to supported-modifiers for each format. I've added the following debug: diff --git a/hw/xfree86/drivers/modesetting/drmmode_display.c b/hw/xfree86/drivers/modesetting/drmmode_display.c index e0d7cfb5c..249661985 100644 --- a/hw/xfree86/drivers/modesetting/drmmode_display.c +++ b/hw/xfree86/drivers/modesetting/drmmode_display.c @@ -2004,6 +2004,9 @@ populate_format_modifiers(xf86CrtcPtr crtc, const drmModePlane *kplane, for (j = 0; j < fmt_mod_blob->count_modifiers; j++) { struct drm_format_modifier *mod = &blob_modifiers[j]; + if (mod->modifier == I915_FORMAT_MOD_Y_TILED_CCS) + ErrorF("populate_format_modifiers i %d format %x formats %llx\n", i, blob_formats[i], mod->formats); + if ((i < mod->offset) || (i > mod->offset + 63)) continue; if (!(mod->formats & (1 << (i - mod->offset)))) Which for me generates: [ 3860.889] populate_format_modifiers i 0 format 20203843 formats 3c [ 3860.889] populate_format_modifiers i 1 format 36314752 formats 3c [ 3860.889] populate_format_modifiers i 2 format 34325258 formats 3c [ 3860.889] populate_format_modifiers i 3 format 34324258 formats 3c [ 3860.889] populate_format_modifiers i 4 format 34325241 formats 3c [ 3860.889] populate_format_modifiers i 5 format 34324241 formats 3c [ 3860.889] populate_format_modifiers i 6 format 30335258 formats 3c [ 3860.889] populate_format_modifiers i 7 format 30334258 formats 3c [ 3860.889] populate_format_modifiers i 8 format 56595559 formats 3c [ 3860.889] populate_format_modifiers i 9 format 55595659 formats 3c [ 3860.889] populate_format_modifiers i 10 format 59565955 formats 3c [ 3860.889] populate_format_modifiers i 11 format 59555956 formats 3c Note mod->offset always is 0 for me, I had that in the debug before but I dropped it. IOW for formats 2-5 in the format list part of the retreived blob, the I915_FORMAT_MOD_Y_TILED_CCS modifier is supported. Notice that those 4 formats are: DRM_FORMAT_XRGB8888 (XR24) DRM_FORMAT_XBGR8888 (XB24) DRM_FORMAT_ARGB8888 (AR24) DRM_FORMAT_ABGR8888 (AB24) So the fb ending up in one of these seems reasonable. I guess that for some reason the blob property describing the possible modifiers on the CRTC uses in your case is different. Are you using a skylake GPU? Perhaps things are different for you because you are using a (e)DP monitor/panel instead of the HDMI monitors I'm using? (In reply to Lionel Landwerlin from comment #16) > (In reply to Lionel Landwerlin from comment #15) > > > > if (!(mod->formats & (1 << (i - mod->offset)))) > > continue; > > That 1 << with a 64 bit integer, I'm not sure that's 100% right... Uh, no that is 100% wrong, good catch. Maybe you have more then 32 formats and you are hitting this? In that case changing the 1 to 1LL should fix this. (In reply to Hans de Goede from comment #17) > That function builds a mapping from formats to supported-modifiers for each > format. > > I've added the following debug: > > diff --git a/hw/xfree86/drivers/modesetting/drmmode_display.c > b/hw/xfree86/drivers/modesetting/drmmode_display.c > index e0d7cfb5c..249661985 100644 > --- a/hw/xfree86/drivers/modesetting/drmmode_display.c > +++ b/hw/xfree86/drivers/modesetting/drmmode_display.c > @@ -2004,6 +2004,9 @@ populate_format_modifiers(xf86CrtcPtr crtc, const > drmModePlane *kplane, > for (j = 0; j < fmt_mod_blob->count_modifiers; j++) { > struct drm_format_modifier *mod = &blob_modifiers[j]; > > + if (mod->modifier == I915_FORMAT_MOD_Y_TILED_CCS) > + ErrorF("populate_format_modifiers i %d format %x formats > %llx\n", i, blob_formats[i], mod->formats); > + > if ((i < mod->offset) || (i > mod->offset + 63)) > continue; > if (!(mod->formats & (1 << (i - mod->offset)))) > > Which for me generates: > > [ 3860.889] populate_format_modifiers i 0 format 20203843 formats 3c > [ 3860.889] populate_format_modifiers i 1 format 36314752 formats 3c > [ 3860.889] populate_format_modifiers i 2 format 34325258 formats 3c > [ 3860.889] populate_format_modifiers i 3 format 34324258 formats 3c > [ 3860.889] populate_format_modifiers i 4 format 34325241 formats 3c > [ 3860.889] populate_format_modifiers i 5 format 34324241 formats 3c > [ 3860.889] populate_format_modifiers i 6 format 30335258 formats 3c > [ 3860.889] populate_format_modifiers i 7 format 30334258 formats 3c > [ 3860.889] populate_format_modifiers i 8 format 56595559 formats 3c > [ 3860.889] populate_format_modifiers i 9 format 55595659 formats 3c > [ 3860.889] populate_format_modifiers i 10 format 59565955 formats 3c > [ 3860.889] populate_format_modifiers i 11 format 59555956 formats 3c > > Note mod->offset always is 0 for me, I had that in the debug before but I > dropped it. > > IOW for formats 2-5 in the format list part of the retreived blob, the > I915_FORMAT_MOD_Y_TILED_CCS modifier is supported. > > Notice that those 4 formats are: > DRM_FORMAT_XRGB8888 (XR24) > DRM_FORMAT_XBGR8888 (XB24) > DRM_FORMAT_ARGB8888 (AR24) > DRM_FORMAT_ABGR8888 (AB24) > > So the fb ending up in one of these seems reasonable. > > I guess that for some reason the blob property describing the possible > modifiers on the CRTC uses in your case is different. > > Are you using a skylake GPU? Perhaps things are different for you because > you are using a (e)DP monitor/panel instead of the HDMI monitors I'm using? Right yeah, I'm basically on DVI :) But Skylake GPU. Note my monitors are both connected over a DVI cable / connector on the monitor too, but "xrandr" lists the outputs as HDMI... About the "1 << with a 64 bit integer" issue you found, do you plan to submit a xserver patch fixing that upstream, or shall I? (In reply to Hans de Goede from comment #21) > About the "1 << with a 64 bit integer" issue you found, do you plan to > submit a xserver patch fixing that upstream, or shall I? Please go ahead and fix it :) (In reply to Hans de Goede from comment #20) > Note my monitors are both connected over a DVI cable / connector on the > monitor too, but "xrandr" lists the outputs as HDMI... Right, I have similar setup. hdmi connector on the computer, dvi connector on the monitor. I just tried another monitor with a fully hdmi<->hdmi setup but my testing NUC doesn't seem to deal with all the corners of HDMI 1.4 and cannot do 2560x1440... At this point I'm reasonably confident that we're missing a workaround in Mesa, but I can't really reproduce this issue here. Let's see if I can write a quick fix. Created attachment 144984 [details] [review] isl: add pitch requirement for CCS display surface > 3840 Let me know if this helps. Thanks! (In reply to Lionel Landwerlin from comment #24) > Created attachment 144984 [details] [review] [review] > isl: add pitch requirement for CCS display surface > 3840 > > Let me know if this helps. Thanks! I've build mesa-19.1.3 with this patch applied, logged out and logged in to get a new gnome-shell on Xorg session using the new mesa, and then tried the "xrandr --fb" commands. Unfortunately the patch does not fix the problem of the present-flip errors. Reassigning to i915 to check the workaround in intel_display.c : /* * Display WA #0531: skl,bxt,kbl,glk * * Render decompression and plane width > 3840 * combined with horizontal panning requires the * plane stride to be a multiple of 4. We'll just * require the entire fb to accommodate that to avoid * potential runtime errors at plane configuration time. */ if (IS_GEN(dev_priv, 9) && i == 0 && fb->width > 3840 && is_ccs_modifier(fb->modifier)) stride_alignment *= 4; I think there might be an issue there, because it applies a workaround that seem to be meant for the aux surface to the main surface. Created attachment 144992 [details] [review] isl: add matching display workaround to the kernel Meanwhile this patch allowed me to pass the ADDFB2 with a hacked up xserver to force CCS. (In reply to Lionel Landwerlin from comment #27) > Created attachment 144992 [details] [review] [review] > isl: add matching display workaround to the kernel > > Meanwhile this patch allowed me to pass the ADDFB2 with a hacked up xserver > to force CCS. Thank you for the updated patch and sorry for being a bit slow wrt getting around to testing it. I can confirm that this updated patch fixes the issue for me. (In reply to Hans de Goede from comment #28) > (In reply to Lionel Landwerlin from comment #27) > > Created attachment 144992 [details] [review] [review] [review] > > isl: add matching display workaround to the kernel > > > > Meanwhile this patch allowed me to pass the ADDFB2 with a hacked up xserver > > to force CCS. > > Thank you for the updated patch and sorry for being a bit slow wrt getting > around to testing it. > > I can confirm that this updated patch fixes the issue for me. Thanks a lot, no worries for the delay, I was away as well. We still need the i915/display people to check the workaround implemented in the kernel. I & Chris read the workaround description and could not figure out where it should apply (main surface plane or CCS surface plane). Cleared the kernel question with Ville on IRC. Switching back to mesa bug : https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1756 -- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1825. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.