Bug 94934

Summary: [bdw] kernel crashes randomly
Product: DRI
Component: DRM/Intel
Status: CLOSED WORKSFORME
Severity: normal
Priority: medium
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Reporter: Konstantin Demin <rockdrilla>
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
CC: intel-gfx-bugs
Whiteboard:
i915 platform: BDW
i915 features: GEM/PPGTT

Description Konstantin Demin 2016-04-14 16:27:21 UTC
System architecture: x86_64.
Kernels:
- 4.4.6-1 (Debian sid/unstable),
- 4.5.1 (own build),
- 4.6.0-rc3 from "drm-intel-nightly", HEAD @ c0a9d3a8bc
Distribution: Debian GNU/Linux sid/unstable
Machine: Acer Aspire E5-571G-5881
Display connector: eDP

GFX stack (Debian sid/unstable):
xorg: 1.18
xserver: 7.7
xf86-video-intel: 2:2.99.917+git20160325-1
mesa: 11.1.2-1
libdrm: 2.4.67-1

---

It seems that the kernel crashes because of i915, at random uptimes and load averages.
The only way to boot and survive a hang is to use the boot parameters "i915.semaphores=1" and "i915.enable_ppgtt=0".
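
To confirm that both workaround options are actually in effect on a booted system, the kernel command line can be inspected. The following is a minimal Python sketch and not part of the original report: the helper name `i915_params` is hypothetical, and it assumes the options were passed as `i915.<name>=<value>` on the kernel command line (options set through modprobe configuration would not show up there).

```python
# Sketch: list the i915.* options present on a kernel command line.
# Assumption: the options were passed at boot as "i915.<name>=<value>".
# The function name is illustrative, not an existing tool or API.

def i915_params(cmdline: str) -> dict:
    """Return {name: value} for every i915.<name>=<value> token."""
    params = {}
    for token in cmdline.split():
        if token.startswith("i915.") and "=" in token:
            # Strip the "i915." prefix, then split once on "=".
            name, value = token[len("i915."):].split("=", 1)
            params[name] = value
    return params
```

On a running Linux system the input would typically be the contents of /proc/cmdline, for example `i915_params(open("/proc/cmdline").read())`; with the workaround applied, the result should contain both `semaphores` and `enable_ppgtt`.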

A (fast) way to reproduce bug:
1) open Google Chrome
2) go to video-streaming website (e.g. YouTube)
3) expand video in fullscreen
4) collapse to windowed mode
5) repeat steps 3 and 4 a few times
Result: the system stops responding (even Magic SysRq doesn't work) and the last few seconds of sound repeat in a loop.

---

Attachments: none; please go to the shared folder on my Google Drive and pick the files you need (they're named accordingly):
drive.google.com/folderview?id=0B6WlgYCtySEQTUhIcnZlMi1YejQ
Comment 1 Konstantin Demin 2016-04-14 16:28:57 UTC
Kernel details:
- 4.4.6-1 - reproducible,
- 4.5.1 - reproducible,
- 4.6.0-rc3 - N/A (ref: dmesg).
Comment 2 Chris Wilson 2016-04-14 21:10:21 UTC
Looking at the error state for the i915.semaphores=1 i915.enable_ppgtt=0 case, the hang there is in the MI_SEMAPHORE_SIGNAL code. I've put some fixes for gen8 semaphores at https://cgit.freedesktop.org/~ickle/linux-2.6/log/?h=semaphores

Could you please see if they make the workaround (i915.semaphores=1 i915.enable_ppgtt=0) case more stable?
Comment 3 Konstantin Demin 2016-04-15 21:56:18 UTC
I've built a kernel from the "semaphores" branch with your recent patches.

Added files with prefix "4.6.0-rc3-k0_custom_" to G.Drive:
- kernel config
- kernel build log
- dmesg (warning: unpacked size is about 160M)

There are random visual artefacts under (heavy?!) workloads: I've opened two "about:blank" tabs in Google Chrome. :)

My quick and dirty way to reproduce the bug doesn't work at the moment: maybe it's not so fast anymore?..

BTW: may I ask you to provide the same changeset for the 4.5.1 kernel? It seems to be really more stable than 4.6.0-rc3.
Comment 4 Konstantin Demin 2016-04-15 22:09:24 UTC
I forgot: the kernel (4.6.0-rc3 from the "semaphores" branch) unexpectedly suspended under load, and I resumed it with a keystroke.
Comment 5 Jari Tahvanainen 2017-03-09 11:58:51 UTC
I'm sorry about the delay in getting to this.
Konstantin - based on the last two comments this appears to be fixed. Or is it?
Chris - has the code been submitted to drm-tip? If so, please provide the commit ID.
Comment 6 Konstantin Demin 2017-03-09 20:55:31 UTC
Hi!
As far as I can see, the issue was fixed approximately six months ago.
Now I'm using 4.9.13 and no hard lockups have been detected.

GFX stack (Debian sid/unstable):
xorg: 1.19.2
xserver: 7.7
xf86-video-intel: 2:2.99.917+git20161206-1
mesa: 13.0.5-1
libdrm: 2.4.74-1
Comment 7 yann 2017-03-10 07:23:08 UTC
(In reply to Konstantin Demin from comment #6)
> Hi!
> As far as I can see, the issue was fixed approximately six months ago.
> Now I'm using 4.9.13 and no hard lockups have been detected.
> 
> GFX stack (Debian sid/unstable):
> xorg: 1.19.2
> xserver: 7.7
> xf86-video-intel: 2:2.99.917+git20161206-1
> mesa: 13.0.5-1
> libdrm: 2.4.74-1

Thanks Konstantin
