Bug 103494 - Inescapable system freeze on initial X startup drm/i915
Summary: Inescapable system freeze on initial X startup drm/i915
Status: NEEDINFO
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2017-10-28 03:30 UTC by aun.sswick
Modified: 2018-12-04 08:07 UTC
CC List: 4 users

See Also:
i915 platform: BSW/CHT
i915 features: GPU hang


Attachments
lspci (8.44 KB, text/plain), 2017-10-28 03:30 UTC, aun.sswick
dmesg (120.30 KB, text/plain), 2017-10-29 01:54 UTC, aun.sswick
Xorg.log (26.01 KB, text/x-log), 2017-10-29 01:56 UTC, aun.sswick
disable wc (837 bytes, patch), 2018-04-25 14:43 UTC, Mika Kuoppala
panic image 1 (14.88 KB, image/png), 2018-05-19 15:12 UTC, aun.sswick
panic image 2 (13.12 KB, image/png), 2018-05-19 15:13 UTC, aun.sswick
the .config (211.17 KB, text/x-mpsub), 2018-05-19 15:17 UTC, aun.sswick

Description aun.sswick 2017-10-28 03:30:00 UTC
Created attachment 135138 [details]
lspci

After upgrading to a recent (4.13+) kernel, the system hangs completely on a black screen every time it boots into X.  SysRq keys do not work after this point.  Unfortunately I was not able to extract any logs from the hung system: nothing is written to disk, and nothing is found after a power cycle.

Architecture: i686
Kernel Version: anything after 4.12
Distro: Ubuntu 17.04
Machine: Laptop Aspire one 1-431 (AO1-431_106D_1.11)

Tested with drm-tip nightly: Same result.

Bisect Results:
3dc38eea665f383c84cc8d858b9a7645c0b29c54  bad
72affdf                                   good
309663ab7b4f0de1540aff212fd067e3dd92acf3  bad
8ee7c6e23bb1b3c37ef27e81395db056bd7eac53  bad
ec151f31cd81cc99b957d6b528709d6ecfb25801  bad
1188bc66eb33e64ac7452b5acd62ce0395204148  good
f0a22974acbdd17b03cff4bdee880e4f08cccf6d  bad
9231da70b338b336b982c74fad4afab5b55e6534  bad
8448661d65f6f5dbcdb9c5cba185b284f2464b65  bad
cbc4e9e6a6d31fcc44921d2be41104425be8ab01  good

dad@octo:~/src/linux$ git bisect good
8448661d65f6f5dbcdb9c5cba185b284f2464b65 is the first bad commit
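
The bisect above was necessarily manual, since each step meant building a kernel, booting it and checking whether X hung. The mechanics can be illustrated on a throwaway repository. This is only a sketch under assumptions, not the procedure used in this report: the repository, the `state` file, the `known-good` tag and the position of the regression are all hypothetical, and `git bisect run` stands in for the manual good/bad marking so that the example is self-contained.

```shell
#!/bin/sh
# Toy demonstration of a bisect; nothing here touches a kernel tree.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.invalid"   # placeholder identity
git config user.name "bisect demo"

# Eight commits; the hypothetical regression lands in commit 6.
i=1
while [ "$i" -le 8 ]; do
    if [ "$i" -ge 6 ]; then echo hang > state; else echo ok > state; fi
    echo "$i" > rev
    git add state rev
    git commit -q -m "commit $i"
    if [ "$i" -eq 1 ]; then git tag known-good; fi
    i=$((i + 1))
done

# Endpoints: HEAD is known bad, the tagged commit is known good.
git bisect start HEAD known-good >/dev/null

# The test command's exit status marks each step: 0 = good, 1 = bad.
git bisect run grep -q ok state >/dev/null

# When the bisect finishes, refs/bisect/bad points at the first bad commit.
first_bad=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad: $first_bad"

git bisect reset >/dev/null 2>&1
```

For a hang that needs a reboot to observe, the `git bisect run` step is replaced by marking each tested build by hand with `git bisect good` or `git bisect bad`, as in the session shown above.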
Comment 1 Chris Wilson 2017-10-28 16:08:06 UTC
Can you grab dmesg and logs from bad^ (the last known good commit)?
Comment 2 aun.sswick 2017-10-29 01:54:41 UTC
Created attachment 135147 [details]
dmesg
Comment 3 aun.sswick 2017-10-29 01:56:33 UTC
Created attachment 135148 [details]
Xorg.log
Comment 4 Jani Saarinen 2018-03-29 07:11:45 UTC
First of all, sorry about the spam.
This is a mass update for our bugs.

Sorry if you find this annoying, but we are trying to understand whether this bug is still valid.
If the investigation is still in progress, please ignore this, and I apologize!

If you think the bug is no longer valid, please comment that it can be closed.
If you haven't tested with our latest pre-upstream tree (drm-tip), please do that as well to see whether the issue is still present there; if you cannot see the issue there, please comment on the bug.
Comment 5 omega 2018-03-29 08:08:32 UTC
The issue is still present in a fully updated Ubuntu 17.10 with the newest mainline kernel, 4.16-rc7, from http://kernel.ubuntu.com/~kernel-ppa/mainline/
Comment 6 Jani Saarinen 2018-04-25 08:21:15 UTC
Any advice from you Mika/Chris?
Comment 7 Mika Kuoppala 2018-04-25 14:43:22 UTC
Created attachment 139098 [details] [review]
disable wc
Comment 8 Mika Kuoppala 2018-04-25 14:47:18 UTC
Reproduce with https://cgit.freedesktop.org/drm-tip

Then apply the attached 'disable wc' patch to check that the bisect is
valid. It should in effect revert the suspect parts of
8448661d65f6f5dbcdb9c5cba185b284f2464b65.

Is omega's machine also Braswell?
Comment 9 omega 2018-04-25 14:50:31 UTC
omega's machine is Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
Comment 10 aun.sswick 2018-04-27 20:19:06 UTC
Not able to get x86 builds of drm-tip from the last few days to boot.  Will try again next week.
Comment 11 Jani Saarinen 2018-05-04 10:04:01 UTC
Aun, any updates here?
Comment 12 Jani Saarinen 2018-05-17 08:48:00 UTC
ping.
Comment 13 aun.sswick 2018-05-19 15:12:53 UTC
Created attachment 139636 [details]
panic image 1
Comment 14 aun.sswick 2018-05-19 15:13:16 UTC
Created attachment 139637 [details]
panic image 2
Comment 15 aun.sswick 2018-05-19 15:16:55 UTC
I have been pulling and building every few days for over 2 weeks to test the patch, but can't get anything that won't panic on boot.  The screenshot is from a VM, but the real machine does the same thing.  I get the same behavior on Linus' master and drm-tip master.  Googling failed me.  Need help, any suggestions?
Comment 16 aun.sswick 2018-05-19 15:17:30 UTC
Created attachment 139638 [details]
the .config
Comment 17 Ville Syrjala 2018-05-21 14:41:16 UTC
(In reply to aun.sswick from comment #14)
> Created attachment 139637 [details]
> panic image 2

This looks like https://lkml.org/lkml/2018/5/17/482

Setting CMA=n should get you a bootable kernel if you don't want to patch it.
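
One way to read "CMA=n" is as the `CONFIG_CMA` option in the kernel `.config`; that mapping is an assumption here, as is the three-line stand-in file used below. A minimal sketch of the edit, run on a temporary copy rather than on a real kernel tree:

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
config="$work/.config"

# Stand-in for a real kernel .config (hypothetical contents).
cat > "$config" <<'EOF'
CONFIG_MMU=y
CONFIG_CMA=y
CONFIG_CMA_AREAS=7
EOF

# Kconfig records a disabled bool option as a "# ... is not set" comment.
sed 's/^CONFIG_CMA=y$/# CONFIG_CMA is not set/' "$config" > "$config.new"
mv "$config.new" "$config"

cat "$config"

# In a real tree, follow up with `make olddefconfig` so that options
# depending on CMA are re-resolved before building.
```

Inside a kernel source tree the same change can be made with `scripts/config --disable CMA` followed by `make olddefconfig`.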
Comment 18 aun.sswick 2018-05-22 02:08:16 UTC
Tried CMA=n but then initramfs couldn't find boot device.  Building with and without patch now.  Will be a few days before I can test again.
Comment 19 aun.sswick 2018-05-31 01:19:52 UTC
Got a build that boots on different hardware.
On the laptop with the issue, master and master+patch both hang.
master hangs after starting X, just after the _ in the top left goes away.
master+patch hangs after starting X with the _ still in the top left.
Comment 20 Lakshmi 2018-09-06 06:52:34 UTC
Aun, do you still see the issue? If so, please reproduce it using drm-tip (https://cgit.freedesktop.org/drm-tip) with the kernel parameters drm.debug=0x1e log_buf_len=4M, and if the problem persists attach the full dmesg from boot.
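
On a GRUB-based Ubuntu install, one way to add these parameters is through `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub`. The sketch below works on a temporary stand-in file so it is safe to run anywhere; the existing value `quiet splash` is an assumption, and the real file should be edited with care.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
grub="$work/grub"

# Stand-in for /etc/default/grub; the existing value is an assumption.
printf '%s\n' 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"' > "$grub"

# Parameters requested in the comment above.
params='drm.debug=0x1e log_buf_len=4M'

# Append the parameters just before the closing quote.
sed "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $params\"/" "$grub" > "$grub.new"
mv "$grub.new" "$grub"

cat "$grub"

# On the real system, after editing /etc/default/grub:
#   sudo update-grub        # regenerate the GRUB configuration
#   (reboot, reproduce)
#   dmesg > dmesg.txt       # full log from boot, to attach to the bug
```

If the machine hangs hard, `dmesg` cannot be run afterwards, so the log may have to come from a previous-boot journal or a remote console instead.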
Comment 21 Lakshmi 2018-10-17 16:14:03 UTC
Aun, any updates here?
Comment 22 Lakshmi 2018-11-05 10:37:52 UTC
No feedback for more than a month. Reducing the priority of the bug.
Comment 23 omega 2018-11-05 17:32:08 UTC
I did not open the issue but I had a similar one which is resolved:

https://bugs.freedesktop.org/show_bug.cgi?id=103509

There are still occasional freezes which are not yet solved:

https://bugs.freedesktop.org/show_bug.cgi?id=108357
https://bugs.freedesktop.org/show_bug.cgi?id=108358
Comment 24 Francesco Balestrieri 2018-12-04 08:07:30 UTC
The two bugs above don't seem related (the second one is not even specific to Intel HW). 

Aun, do you continue seeing the original issue you reported?

