Bug 76103 - [i915GM regression] black screen with self-refresh on fbcon but not X
Summary: [i915GM regression] black screen with self-refresh on fbcon but not X
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel (show other bugs)
Version: unspecified
Hardware: x86 (IA32) Linux (All)
: medium normal
Assignee: Daniel Vetter
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-03-13 08:15 UTC by Krzysztof Mazur
Modified: 2017-07-24 22:55 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
dmesg - 3f2dc5ac05714711fc14f2bf0ee5e42d5c08c581 + some unrelated patches (bad) (71.21 KB, text/plain)
2014-03-13 08:15 UTC, Krzysztof Mazur
no flags Details
dmesg - 0fb58223967fdf5acc2bdbfe50347841843131bc + some unrelated patches (good) (67.78 KB, text/plain)
2014-03-13 08:16 UTC, Krzysztof Mazur
no flags Details
[PATCH 1/3] drm/i915: Double the gen3 SR latency (1.23 KB, patch)
2014-03-13 11:07 UTC, Ville Syrjala
no flags Details | Splinter Review
[PATCH 2/3] drm/i915: Calculate gen3 SR watermark with 32bpp pixel (1.16 KB, patch)
2014-03-13 11:07 UTC, Ville Syrjala
no flags Details | Splinter Review
[PATCH 3/3] drm/i915: Enable/disable planes as the last/first thing (2.58 KB, patch)
2014-03-13 11:08 UTC, Ville Syrjala
no flags Details | Splinter Review
shuffle INSTPM writes (1.15 KB, patch)
2014-03-13 11:30 UTC, Daniel Vetter
no flags Details | Splinter Review
dmesg from X startup (48.20 KB, text/plain)
2014-03-13 11:53 UTC, Krzysztof Mazur
no flags Details
dmesg from gem_exec_nop (2.54 KB, text/plain)
2014-03-13 12:23 UTC, Krzysztof Mazur
no flags Details
[PATCH] drm/i915: Repartition the display FIFO manyally on gen3 (2.31 KB, patch)
2014-03-13 13:22 UTC, Ville Syrjala
no flags Details | Splinter Review
dmesg from gem_render_copy (2.95 KB, text/plain)
2014-03-14 09:57 UTC, Krzysztof Mazur
no flags Details
disable sr without tiling (653 bytes, patch)
2014-03-27 10:10 UTC, Daniel Vetter
no flags Details | Splinter Review
disable sr without tiling, v2 (658 bytes, patch)
2014-03-27 10:12 UTC, Daniel Vetter
no flags Details | Splinter Review
disable sr without tiling, v3 (651 bytes, patch)
2014-04-01 18:46 UTC, Ville Syrjala
no flags Details | Splinter Review

Description Krzysztof Mazur 2014-03-13 08:15:30 UTC
Created attachment 95693 [details]
dmesg - 3f2dc5ac05714711fc14f2bf0ee5e42d5c08c581 + some unrelated patches (bad)

The commit 3f2dc5ac05714711fc14f2bf0ee5e42d5c08c581 (drm/i915: Fix 915GM self-refresh enable/disable) on HP Compaq nc6120 causes black screen after boot with LVDS only. Connecting an external VGA monitor or starting X fixes the problem.

Reverting this commit fixes the problem (at least up to 3.14.0-rc6).
Comment 1 Krzysztof Mazur 2014-03-13 08:16:48 UTC
Created attachment 95694 [details]
dmesg - 0fb58223967fdf5acc2bdbfe50347841843131bc + some unrelated patches (good)
Comment 2 Chris Wilson 2014-03-13 08:59:09 UTC
Hmm, the bug description matches the behaviour of that bit (only applicable to single pipe) but I don't see any contraindications from the spec. The spec says it does automatic disabling of SR, so perhaps we should only be poking the bit during clock gating setup rather than every WM update.
Comment 3 Ville Syrjala 2014-03-13 11:06:23 UTC
Right, so previously we were failing to enable self refresh since we did the INSTPM write incorrectly, but now that we do enable it, something fails. I don't see any underrun reports in the logs though.

Can you run something like gem_exec_nop from intel-gpu-tool after a boot where the display went blank, but before starting X? That should cause the GPU to generate some interrupts which would trick the underrun report out if it was there.

Also can you attach a dmesg log which includes the part where you start X and the display comes back to life?

Chris, I suppose you could be right about the bit. Maybe it's not really safe to poke it while the pipes/planes are active. But if we change that then we probably need to change the watermark code to always compute the SR watermark regardless of how many pipes are active. Not quite sure what we should stick there though, probably the worst case value for all pipes. Also if we compute the SR watermark and it comes out invalid, we should still have a way to disable SR entirely, even with just one active pipe. I don't see any other bit for doing that. So for that reason I'm a bit sceptical about this idea.

I did spot a few potential issues with the way we calculate the SR watermark. I'll attach a few patches for those.
Comment 4 Ville Syrjala 2014-03-13 11:07:29 UTC
Created attachment 95710 [details] [review]
[PATCH 1/3] drm/i915: Double the gen3 SR latency
Comment 5 Ville Syrjala 2014-03-13 11:07:54 UTC
Created attachment 95711 [details] [review]
[PATCH 2/3] drm/i915: Calculate gen3 SR watermark with 32bpp pixel
Comment 6 Ville Syrjala 2014-03-13 11:08:18 UTC
Created attachment 95712 [details] [review]
[PATCH 3/3] drm/i915: Enable/disable planes as the last/first thing
Comment 7 Ville Syrjala 2014-03-13 11:10:30 UTC
The three patches I just attached might help. Well, actually 2/3 shouldn't have any effect unless we give a <32bpp framebuffer for the primary plane, which we shouldn't do normally.
Comment 8 Daniel Vetter 2014-03-13 11:29:40 UTC
Note that SNA does 15 bits on gen2/3 iirc, so patch 2/3 from Ville's series might still be relevant.
Comment 9 Daniel Vetter 2014-03-13 11:30:34 UTC
Created attachment 95713 [details] [review]
shuffle INSTPM writes

I don't have any real justification for this, but something else to try (maybe together with Ville's stuff, but please also test separately).
Comment 10 Krzysztof Mazur 2014-03-13 11:53:45 UTC
Created attachment 95715 [details]
dmesg from X startup

Linux v3.14-rc6-26-g33807f4 + 3 patches from Ville Syrjälä. Still does not work.

Just the fragment when starting X server fixed the problem.
Comment 11 Krzysztof Mazur 2014-03-13 12:06:29 UTC
Linux v3.14-rc6-26-g33807f4 + shuffle-INSTPM-write.patch also does not work.
Comment 12 Krzysztof Mazur 2014-03-13 12:23:54 UTC
Created attachment 95718 [details]
dmesg from gem_exec_nop

Linux v3.14-rc6-26-g33807f4 + shuffle-INSTPM-write.patch.

Just after boot, blank dislay.
Comment 13 Ville Syrjala 2014-03-13 13:04:19 UTC
Nothing in the X log AFAICS. The watermarks are still the same there. Maybe something X is doing prevents C3 entry.

Can you run powertop after boot, and then while X is running and see if the CPU enters C3 state in each case?

I'm thinking maybe the automagic max FIFO mode doesn't work on i915gm we'd need to manually repartition the FIFO...
Comment 14 Ville Syrjala 2014-03-13 13:22:07 UTC
Created attachment 95724 [details] [review]
[PATCH] drm/i915: Repartition the display FIFO manyally on gen3

A patch to try out the automagic max FIFO failure idea I had.
Comment 15 Krzysztof Mazur 2014-03-13 13:52:04 UTC
Linux v3.14-rc6-26-g33807f4 + 0001-drm-i915-Repartition-the-display-FIFO-manyally-on-ge.patch still does not work and introduces another regression:
during the VT change from a framebuffer to X some artifacts are visible and pipe underrun occurs:

...
[  388.668402] [drm:check_encoder_state], [ENCODER:6:LVDS-6]
[  388.668405] [drm:check_encoder_state], [ENCODER:12:DAC-12]
[  388.668408] [drm:check_encoder_state], [ENCODER:14:TV-14]
[  388.668411] [drm:check_crtc_state], [CRTC:3]
[  388.668413] [drm:check_crtc_state], [CRTC:4]
[  388.668560] [drm:drm_helper_probe_single_connector_modes], [CONNECTOR:13:SVIDEO-1] disconnected
[  388.967210] [drm:i915_irq_handler], pipe B underrun


The CPU enters C3 before and after X server startup. Before:
$ cat /sys/devices/system/cpu/cpu0/cpuidle/state*/time
15
3610
235270
15333732

BTW. Starting kmscon is also sufficient to fixup the display.
Comment 16 Daniel Vetter 2014-03-14 07:17:38 UTC
Hm, sounds like a simple render op is sufficient to fix up things ...

Can you pls try to run gem_render_copy from i-g-t before starting kmscon/X while the screen is still black and report what happens?
Comment 17 Krzysztof Mazur 2014-03-14 09:57:12 UTC
Created attachment 95785 [details]
dmesg from gem_render_copy

No, gem_render_copy is not sufficient, at least on the kernel with all above patches applied.
Comment 18 Daniel Vetter 2014-03-27 10:10:40 UTC
Created attachment 96446 [details] [review]
disable sr without tiling

Please test this patch.

As a second test please boot a broken kernel and change your xorg.conf to disable tiling with

Section "Device"
        Identifier      "Intel"
        Driver          "Intel"
        Option          "Tiling" "false"
EndSection

and report back whether that breaks X or not.
Comment 19 Daniel Vetter 2014-03-27 10:12:18 UTC
Created attachment 96448 [details] [review]
disable sr without tiling, v2

Actually compile-tested this time around ...
Comment 20 Chris Wilson 2014-03-27 10:14:02 UTC
Probably actually want Option "LinearFramebuffer" "true"
Comment 21 Krzysztof Mazur 2014-04-01 18:09:47 UTC
The X server with:

Option "Tiling" "false"
Option "LinearFramebuffer" "true"

still works.

With the "disable sr without tiling, v2" patch the kernel freezes during boot. The last messages on serial console are:
[    4.955552] agpgart-intel 0000:00:00.0: Intel 915GM Chipset
[    4.961213] agpgart-intel 0000:00:00.0: detected gtt size: 262144K total, 262144K mappable
[    4.970559] agpgart-intel 0000:00:00.0: detected 8192K stolen memory
[    4.977129] agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0xc0000000
[    4.984078] [drm] Initialized drm 1.1.0 20060810
[    4.989212] [drm] Memory usable by graphics device = 256M
[    4.995580] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    5.002256] [drm] Driver supports precise vblank timestamp query.
[    5.008480] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    5.021734] ACPI: Battery Slot [C174] (battery present)
[    5.027114] ACPI: Battery Slot [C173] (battery absent)
[    5.340973] [drm] initialized overlay support

Even sysrq does not work.
Comment 22 Ville Syrjala 2014-04-01 18:46:02 UTC
Created attachment 96743 [details] [review]
disable sr without tiling, v3

There was a slight bug in Daniel's patch. Here's a fixed version.
Comment 23 Krzysztof Mazur 2014-04-01 19:31:57 UTC
Thanks, the "disable sr without tiling, v3" fixes the bug. v3.14 + this patch works correctly.

If you like you can add:
Tested-by: Krzysztof Mazur <krzysiek@podlesie.net>
Comment 24 Daniel Vetter 2014-04-07 06:55:18 UTC
Ok, I've submitted the patch for inclusion (cc: stable), thanks for reporting this bug.
Comment 25 Chris Wilson 2014-04-10 12:02:56 UTC
commit 2ab1bc9df01dbc19b55b2271100db7407fa6bfdc
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Mon Apr 7 08:54:21 2014 +0200

    drm/i915: Disable self-refresh for untiled fbs on i915gm
    
    Apparently it doesn't work. X-tiled self-refresh works flawlessly
    otoh. Apparently X still works correctly with linear framebuffers, so
    might just be an issue with the initial modeset. It's unclear whether
    this just borked wm setup from our side or a hw restriction, but just
    disabling gets things going.
    
    Note that this regression was only brought to light with
    
    commit 3f2dc5ac05714711fc14f2bf0ee5e42d5c08c581
    Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Date:   Fri Jan 10 14:06:47 2014 +0200
    
        drm/i915: Fix 915GM self-refresh enable/disable
    
    before that self-refresh for i915GM didn't work at all.
    
    Kudos to Ville for spotting a little bug in the original patch I've
    attached to the bug.
    
    Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76103
    Tested-by: Krzysztof Mazur <krzysiek@podlesie.net>
    Cc: Krzysztof Mazur <krzysiek@podlesie.net>
    Cc: stable@vger.kernel.org
    Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    [Jani: rebase on top of drm-next with primary plane support.]
    Signed-off-by: Jani Nikula <jani.nikula@intel.com>


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.