Bug 100907 - [i965 bisected] boot hang on Dell D830 with kernel > 4.7
Status: RESOLVED DUPLICATE of bug 93782
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: low major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords: bisected
Depends on:
Blocks:
 
Reported: 2017-05-02 19:56 UTC by Markus Schauler
Modified: 2019-05-31 17:53 UTC

i915 platform: I965GM
i915 features: display/atomic


Attachments
dmesg for last good commit (680.82 KB, text/plain)
2017-05-02 19:56 UTC, Markus Schauler
dmesg for first bad commit (698.84 KB, text/plain)
2017-05-02 19:57 UTC, Markus Schauler
lspci -vvv (29.57 KB, text/plain)
2017-05-02 19:58 UTC, Markus Schauler
dmesg for v4.11 (457.46 KB, text/plain)
2017-05-05 16:19 UTC, Markus Schauler
dmesg v4.13rc1 (173.01 KB, text/plain)
2017-07-23 13:42 UTC, Markus Schauler
[PATCH] drm/i915: Revert ea0000f0 "Roll out the helper nonblock (2.21 KB, patch)
2017-07-25 00:57 UTC, Jim Rees
dmesg for patched version (293.18 KB, text/plain)
2017-07-26 17:19 UTC, Markus Schauler
dmesg patched 4.12 kernel, booting with enable.fbc=0 (68.73 KB, text/plain)
2017-12-21 21:46 UTC, Markus Schauler
dmesg patched 4.12 kernel, booting with enable.fbc=0, video=... (55.77 KB, text/plain)
2018-01-14 21:50 UTC, Markus Schauler

Description Markus Schauler 2017-05-02 19:56:49 UTC
Created attachment 131178 [details]
dmesg for last good commit

Using openSUSE Tumbleweed on a Dell Latitude D830 with GM965 graphics. After a kernel upgrade, the system freezes at boot and displays a DRM error message on screen.

To isolate the error, I switched to vanilla kernel and bisected.

Expected result: system boots and shows graphical plymouth-password prompt
Result with (bad) vanilla kernel: boot process freezes for about 10 seconds, then resumes and displays password prompt on text console.


Comparison of dmesg reveals that the bad kernels have this in the dmesg output:

[    5.859126] usb 7-1.2: Manufacturer: O2
[   15.120085] [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:26:pipe A] flip_done timed out
[   15.120095] [drm:drm_atomic_state_default_clear] Clearing atomic state ffff88007891a800
[   15.120104] [drm:drm_atomic_state_free] Freeing atomic state ffff88007891a800
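
The comparison above can be automated. The following is a minimal Python sketch, not part of the original report, that scans dmesg lines for the `flip_done timed out` message and reports the largest gap between consecutive timestamps. The function name `scan_dmesg` and both regular expressions are assumptions made for this illustration; the sample lines are copied from the excerpt above.

```python
import re

# Matches the leading "[   15.120085]" timestamp of a dmesg line.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")
# Matches the error text quoted in this report.
FLIP_TIMEOUT = re.compile(r"\[CRTC:(\d+):([^\]]+)\] flip_done timed out")


def scan_dmesg(lines):
    """Return (timeouts, largest_gap) for an iterable of dmesg lines.

    timeouts: list of (timestamp, crtc_id, pipe_name)
    largest_gap: (gap_seconds, stamp_before, stamp_after), or None
    """
    timeouts = []
    largest_gap = None
    previous = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if match is None:
            continue  # skip lines without a timestamp
        stamp = float(match.group(1))
        message = match.group(2)
        if previous is not None:
            gap = stamp - previous
            if largest_gap is None or gap > largest_gap[0]:
                largest_gap = (gap, previous, stamp)
        previous = stamp
        hit = FLIP_TIMEOUT.search(message)
        if hit:
            timeouts.append((stamp, int(hit.group(1)), hit.group(2)))
    return timeouts, largest_gap


sample = [
    "[    5.859126] usb 7-1.2: Manufacturer: O2",
    "[   15.120085] [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] "
    "*ERROR* [CRTC:26:pipe A] flip_done timed out",
    "[   15.120095] [drm:drm_atomic_state_default_clear] Clearing atomic state ffff88007891a800",
]
timeouts, largest_gap = scan_dmesg(sample)
print(timeouts)
print(largest_gap)
```

On the three sample lines, the largest gap is the roughly 9-second pause between the USB message and the timeout error, which matches the freeze of about 10 seconds described above.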


First bad commit:
commit faf68d925671a0f7c105fb122db2a82b25030abc                                                                              
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Tue Jun 14 14:24:20 2016 +0200

    Reapply "drm/i915: Pass atomic states to fbc update, functions."
    
    The patch was reverted as part of the original nonblocking commit
    support, but is required for any kind of nonblocking commit.
    
    This is required to let fbc updates run async. It has a lot of
    checks whether certain locks are taken, which can be removed when
    the relevant states are passed in as pointers.
    
    Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
    Link: http://patchwork.freedesktop.org/patch/msgid/1463490484-19540-17-git-send-email-maarten.lankhorst@linux.intel.com
    Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Link: http://patchwork.freedesktop.org/patch/msgid/28208c38-8738-abdf-0cce-8d8f266b9c28@linux.intel.com


Last good commit:
commit ea0000f0d369a59c2086fe9c489e0a2a86e080ba
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Mon Jun 13 16:13:46 2016 +0200

    drm/i915: Roll out the helper nonblock tracking
    
    Right now still all blocking, no worker anywhere to be seen.
    
    Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
    Link: http://patchwork.freedesktop.org/patch/msgid/1465827229-1704-2-git-send-email-daniel.vetter@ffwll.ch



Attached is the dmesg output (with drm.debug=0x1e) for both commits.
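
For readers unfamiliar with bisection, the search that produced the two commits above is essentially a binary search over the commit history. The sketch below illustrates the idea in Python under simplifying assumptions: the history is treated as linear, every commit after the first bad one is also bad, and the placeholder commit names are invented. Only the two abbreviated hashes come from this report, and the helper `bisect_first_bad` is hypothetical.

```python
def bisect_first_bad(commits, is_bad):
    """Return the index of the first bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    once a commit is bad every later commit is bad too.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        middle = (good + bad) // 2
        if is_bad(commits[middle]):
            bad = middle   # the first bad commit is at or before middle
        else:
            good = middle  # the first bad commit is after middle
    return bad


# Hypothetical linear history; the "older-*" and "newer-*" names are
# placeholders invented for this example.
history = ["older-1", "older-2", "ea0000f0d369", "faf68d925671", "newer-1", "newer-2"]
first_bad = history.index("faf68d925671")
index = bisect_first_bad(history, lambda commit: history.index(commit) >= first_bad)
print(history[index - 1], history[index])
```

In practice each `is_bad` check corresponds to building and booting a kernel at the chosen commit and checking whether the hang occurs.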
Comment 1 Markus Schauler 2017-05-02 19:57:26 UTC
Created attachment 131179 [details]
dmesg for first bad commit
Comment 2 Markus Schauler 2017-05-02 19:58:05 UTC
Created attachment 131180 [details]
lspci -vvv
Comment 3 Jani Nikula 2017-05-04 09:14:52 UTC
Please try v4.11.
Comment 4 Markus Schauler 2017-05-05 16:19:59 UTC
Created attachment 131229 [details]
dmesg for v4.11

Yes, v4.11 is still broken, see attached dmesg
Comment 5 Elizabeth 2017-07-20 20:51:28 UTC
(In reply to Markus Schauler from comment #4)
> Created attachment 131229 [details]
> dmesg for v4.11
> 
> Yes, v4.11 is still broken, see attached dmesg
Hello,
Still valid? Could you please try 4.13 https://www.kernel.org/ or drm-tip https://cgit.freedesktop.org/drm-tip
Thanks.
Comment 6 Markus Schauler 2017-07-23 13:42:29 UTC
Created attachment 132848 [details]
dmesg v4.13rc1
Comment 7 Markus Schauler 2017-07-23 13:43:44 UTC
(In reply to Elizabeth from comment #5)
> (In reply to Markus Schauler from comment #4)
> > Created attachment 131229 [details]
> > dmesg for v4.11
> > 
> > Yes, v4.11 is still broken, see attached dmesg
> Hello,
> Still valid? Could you please try 4.13 https://www.kernel.org/ or drm-tip
> https://cgit.freedesktop.org/drm-tip
> Thanks.

Yes, v4.13-rc1 is still broken; see the attached dmesg output.
Comment 8 Jim Rees 2017-07-25 00:57:02 UTC
Created attachment 132941 [details] [review]
[PATCH] drm/i915: Revert ea0000f0 "Roll out the helper nonblock

Can you try this patch?
Comment 9 Markus Schauler 2017-07-26 17:19:35 UTC
Created attachment 132997 [details]
dmesg for patched version
Comment 10 Markus Schauler 2017-07-26 17:22:16 UTC
(In reply to Jim Rees from comment #8)
> Created attachment 132941 [details] [review]
> [PATCH] drm/i915: Revert ea0000f0 "Roll out the helper nonblock
> 
> Can you try this patch?

The patch did not apply cleanly against v4.13-rc1, so I manually deleted the 10 lines.

Result: the boot no longer hangs, but the desktop (here: KDE) does not start properly: it takes a long time to start, the desktop background is not displayed properly, and the window manager is not working. The dmesg output is attached.
Comment 11 Jim Rees 2017-07-26 19:59:24 UTC
Can you try the patch on 4.12? It would be interesting to know if it's the same bug. I have not tested 4.13, but this patch fixes the flip_done timeout on 4.12 for me.
Comment 12 Markus Schauler 2017-07-27 20:25:46 UTC
(In reply to Jim Rees from comment #11)
> Can you try the patch on 4.12? It would be interesting to know if it's the
> same bug. I have not tested 4.13, but this patch fixes the flip_done timeout
> on 4.12 for me.

Yes, this patch fixes the flip_done timeout on v4.12 on my Dell. It also renders the system unusable.
Symptoms: X11/KDE seems to start fine, but virtual terminal vt7 shows only the boot graphics. The mouse pointer is displayed and follows mouse movements. Sometimes, by switching between vt1, vt2, vt3, vt4, vt5, vt6 and vt7, I get access to my proper desktop (which has been running somewhere in the background). Could it be that X11 draws into one buffer, but the patched display driver displays another buffer?
Comment 13 Maarten Lankhorst 2017-11-22 12:31:11 UTC
Can you boot with enable.fbc=0?
Comment 14 Maarten Lankhorst 2017-12-08 11:25:53 UTC
ping?
Comment 15 Markus Schauler 2017-12-12 01:05:30 UTC
Yes, still listening. I will not have access to my laptop until 19.12.2017.
Comment 16 Markus Schauler 2017-12-21 21:46:18 UTC
Created attachment 136351 [details]
dmesg patched 4.12 kernel, booting with enable.fbc=0
Comment 17 Markus Schauler 2017-12-21 21:51:13 UTC
Booting with enable.fbc=0 fixes part of the problem: the system boots into the correct console, BUT the system is unusable: KDE/X does not start properly.

Sometimes the window manager is not working (windows are restored from the previous session, but the window title bar is missing and windows cannot be resized or moved); sometimes the transition from the boot/startup animation (here: the openSUSE lightbulb) to the real desktop background does not happen.
Comment 18 Maarten Lankhorst 2018-01-09 15:51:18 UTC
Can you also try with i915.fbc=0 and the workaround from:

https://bugs.freedesktop.org/show_bug.cgi?id=93782#c65

?
Comment 19 Markus Schauler 2018-01-14 21:50:02 UTC
Said workaround solves some of the problems.
Booting with video=SVIDEO-1:d i915.fbc=0 does not display any errors when booting, and X11 / KDE Plasma 5 seems to start normally, but the KDE taskbar is not present.

I use KDE with the "restore session" option, this gives me at least a terminal window to work with. When booting with an old 4.4 kernel, everything is back to normal, so the problem seems not to be linked to the saved KDE session information.  

So, the workaround helps, but the result is still a system that is unusable.
dmesg output is attached.
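
When comparing boots like the ones above, it can help to confirm which parameters the kernel was actually started with. The following Python sketch, added here as an illustration only, splits a kernel command line into a dictionary. The helper `parse_cmdline` and the sample string are hypothetical; the two parameters `video=SVIDEO-1:d` and `i915.fbc=0` are copied verbatim from the comment above, and the sketch does not check whether the kernel recognizes them.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict of parameter -> value.

    Parameters without "=" map to None. If a parameter is repeated, the
    last occurrence wins. Quoted values containing spaces are not
    handled in this sketch.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


# Hypothetical command line; only the last two parameters come from the
# comment above, the rest is invented for the example.
example = "root=/dev/sda2 quiet video=SVIDEO-1:d i915.fbc=0"
params = parse_cmdline(example)
print(params.get("video"), params.get("i915.fbc"))

# On a running Linux system the live command line can be read instead:
# with open("/proc/cmdline") as handle:
#     params = parse_cmdline(handle.read())
```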
Comment 20 Markus Schauler 2018-01-14 21:50:52 UTC
Created attachment 136718 [details]
dmesg patched 4.12 kernel, booting with enable.fbc=0, video=...
Comment 21 Jani Saarinen 2018-03-29 07:11:25 UTC
First of all, sorry about the spam.
This is a mass update for our bugs.

Sorry if you find this annoying, but we are trying to understand whether this bug is still valid.
If the bug investigation is still in progress, please ignore this, and I apologize!

If you think this bug is no longer valid, please comment on the bug so that it can be closed.
If you haven't tested with our latest pre-upstream tree (drm-tip), can you do that as well to see whether the issue is still valid there? If you cannot see the issue there, please comment on the bug.
Comment 22 Markus Schauler 2018-04-09 16:45:19 UTC
The bug is still valid.
Comment 23 Francesco Balestrieri 2018-05-15 08:22:34 UTC
Maarten, any idea on how to proceed?
Comment 24 Simon Lee 2018-07-17 14:34:59 UTC
Hi Maarten,

Do you have any updates?
Comment 25 Ville Syrjala 2018-09-06 15:01:49 UTC
The TV vblank timeouts are bug #93782, but I can't quite figure out what other problems have been reported here so not sure if we want to mark this as a duplicate.

Lots of talk about fbc in earlier comments which doesn't make sense considering we don't enable fbc on these platforms by default.
Comment 26 Ville Syrjala 2019-05-31 17:53:44 UTC
No updates in a while. I'll just mark this as a dupe of 93782

*** This bug has been marked as a duplicate of bug 93782 ***

