Bug 100231 - Freezing Linux machine by accessing site with WebGL shader
Summary: Freezing Linux machine by accessing site with WebGL shader
Status: RESOLVED WORKSFORME
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: 13.0
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-03-16 13:57 UTC by Andrei Lascu
Modified: 2018-04-06 15:19 UTC

See Also:
i915 platform:
i915 features:


Attachments
glxinfo output from affected machine (22.27 KB, text/x-log)
2017-03-16 13:57 UTC, Andrei Lascu
GLSL shader (41.08 KB, text/plain)
2017-03-22 16:27 UTC, Andrei Lascu
Running script (151 bytes, application/x-shellscript)
2017-03-22 16:28 UTC, Andrei Lascu
Webpage running GLSL shader (46.04 KB, text/html)
2017-03-22 16:31 UTC, Andrei Lascu
crash dump (904.45 KB, text/plain)
2017-05-02 17:08 UTC, Hugues Evrard
attachment-32112-0.html (2.07 KB, text/html)
2018-04-06 12:34 UTC, Alastair Donaldson

Description Andrei Lascu 2017-03-16 13:57:13 UTC
Created attachment 130259 [details]
glxinfo output from affected machine

We have found a WebGL shader which, when rendered in Chrome, consistently hangs the browser and sometimes freezes the entire system (at which point the machine no longer responds to ping). We were able to replicate the machine freeze consistently with a script that launches Chrome repeatedly, each time attempting to render the shader. We were unable to reproduce the bug in Firefox (where the tab crashed every time, but with no system-wide issues), and reported it to Chrome. After an analysis by the Chrome group, we were advised to report it here.

The machine has an Intel Core i7-2600 and runs Arch Linux with a manually installed Mesa 13.0.3. The reported renderer string is "Mesa DRI Intel(R) Sandybridge Desktop". The full output of "glxinfo" is attached.
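
For anyone comparing their own machine against the one described here, the following is a minimal Python sketch for pulling the renderer string out of saved "glxinfo" output. The function names are made up for illustration, and the substring check for "sandybridge" is an assumption based only on the renderer strings quoted in this report.

```python
import re


def renderer_from_glxinfo(text):
    """Return the value of the 'OpenGL renderer string' line, or None."""
    match = re.search(r"^OpenGL renderer string:\s*(.+)$", text, re.MULTILINE)
    return match.group(1).strip() if match else None


def is_sandybridge(renderer):
    """Heuristic (assumption): the renderer name mentions Sandybridge."""
    return renderer is not None and "sandybridge" in renderer.lower()
```

It can be fed the contents of the attached glxinfo log or the output of running "glxinfo" locally.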

We have prepared a webpage which contains the source of the WebGL shader and attempts to render it shortly after being opened, as well as the script which launches Chrome repeatedly. Because of the potential for misuse, we are reluctant to make the shader public. Should we upload the shader to this tracker, or is there an alternative channel that we should use?
Comment 1 Andrei Lascu 2017-03-22 16:27:33 UTC
Created attachment 130379 [details]
GLSL shader
Comment 2 Andrei Lascu 2017-03-22 16:28:46 UTC
Created attachment 130380 [details]
Running script

Using this script, we were able to reproduce the freezing issue deterministically, when the browser launches and attempts to render the shader for the second time.
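
For readers without access to the attachment, the idea of launching the browser several times in succession can be sketched in Python as below. This is not the attached script: the function name, the default of two runs, and the timeout are assumptions chosen for illustration.

```python
import subprocess


def launch_repeatedly(command, runs=2, timeout_s=120, runner=subprocess.run):
    """Run `command` `runs` times in sequence and return a list of outcomes.

    Each outcome is the process return code, or the string "timeout" when
    the process did not exit within `timeout_s` seconds (subprocess.run
    kills the child before raising TimeoutExpired).
    """
    outcomes = []
    for _ in range(runs):
        try:
            result = runner(command, timeout=timeout_s)
            outcomes.append(result.returncode)
        except subprocess.TimeoutExpired:
            outcomes.append("timeout")
    return outcomes
```

A call might look like launch_repeatedly(["google-chrome", "/path/to/page.html"]), where both the browser binary name and the page path are placeholders. On an affected machine the whole system may freeze, so the function may never return.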
Comment 3 Andrei Lascu 2017-03-22 16:31:11 UTC
Created attachment 130381 [details]
Webpage running GLSL shader

Automatically attempts to render the GLSL shader after a few seconds.
Comment 4 Andrei Lascu 2017-03-22 16:32:22 UTC
We have made the shader public. I have attached the shader, a webpage containing it that attempts to render it shortly after being opened, and the script with which we were able to trigger the freeze deterministically.
Comment 5 Francis Herne 2017-04-15 20:09:37 UTC
Reproducible with Mesa 17.0.3, kernel 4.10.8 on Sandy Bridge.

At the least this kills the browser; sometimes it hangs the entire desktop.
Comment 6 Mark Janes 2017-05-02 13:17:37 UTC
I don't get a hang when rendering this webpage on my Skylake system.  It uses an awful lot of memory, though.

Can you attach the dmesg output from the system after the crash?
Comment 7 Hugues Evrard 2017-05-02 17:08:25 UTC
Created attachment 131175 [details]
crash dump
Comment 8 Hugues Evrard 2017-05-02 17:10:23 UTC
Reproduced on Mesa development (2017-04-28, git commit 24011ea).
On the first crash of Chrome, dmesg says:
[ 1069.720573] [drm] GPU HANG: ecode 6:0:0x8588fff8, in chrome [1849], reason: Hang on render ring, action: reset
[ 1069.720574] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 1069.720575] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 1069.720575] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 1069.720575] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 1069.720576] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 1069.720616] drm/i915: Resetting chip after gpu hang

See the previous attachment for the crash dump.
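
When comparing hangs across machines, the fields of the "GPU HANG" line can be extracted with a small Python sketch such as the one below. The regular expression is written against the single log line quoted in this comment; other kernel versions may format the message differently, so treat the pattern and the function name as assumptions.

```python
import re

# Pattern modelled on the one "GPU HANG" line quoted in this report.
GPU_HANG = re.compile(
    r"\[drm\] GPU HANG: ecode (?P<ecode>\S+), "
    r"in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return a dict of hang fields from a dmesg line, or None if no match."""
    match = GPU_HANG.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    return info
```

Applied to the first dmesg line above, this yields the error code, the offending process (chrome), its PID, the reason and the action taken by the driver.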
Comment 9 Andriy Khulap 2018-04-04 12:06:15 UTC
I've tested the page from comment #3 and saw no hangs or crashes. The browsers became unresponsive for the duration of rendering (around 40 seconds), then normal operation resumed.

Tested browsers:
Chrome Version 65.0.3325.181 (Official Build) (64-bit)
Firefox ESR 52.6.0 (64-bit)

With the following mesa versions:
Mesa 17.2.0-devel (git-24011ead71)
Mesa 17.3.6 (git-b3e5a3f35b)
Mesa 18.1.0-devel (git-1beb80cb56)
Comment 10 Mark Janes 2018-04-04 16:33:54 UTC
To close this bug, we ought to confirm that Sandy Bridge can survive the web page with an up-to-date kernel and Mesa.
Comment 11 Andriy Khulap 2018-04-06 12:14:20 UTC
Tested on Sandy Bridge:
- Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
- Intel(R) Sandybridge Mobile  (0x126)
- Linux Mint 18.3-mate (Kernel 4.10.0-38-generic)
- Mesa 17.2.8 (from default install) and Mesa 18.1.0-devel (git-1beb80cb56)
- Firefox 59.0.2 (64-bit) and Chrome 65.0.3325.181

In all combinations the system hung completely about 40 seconds after opening the page. Only a power-off helped, so I couldn't collect any debug info.

I will try with a newer kernel next.
The Sandy Bridge machine has the same amount of memory as the Skylake machine from my post above: 8GB.
Comment 12 Alastair Donaldson 2018-04-06 12:34:49 UTC
Created attachment 138651 [details]
attachment-32112-0.html

I am on annual leave until Monday 16 April.

Best wishes

Ally
Comment 13 Andriy Khulap 2018-04-06 12:53:08 UTC
Updated kernels on SandyBridge:
- 4.13.0-041300-generic #201709031731 SMP Sun Sep 3 21:33:09 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
- 4.16.0-041600-generic #201804012230 SMP Sun Apr 1 22:31:39 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

With these kernels the hang is gone. The system now becomes unresponsive for the duration of rendering but then operates normally, the same as on Skylake in comment #9 (that machine runs 4.13.0-38; sorry for not mentioning that).

So to summarize:
- System hangs with Kernel 4.10.0-38-generic
- System continues to operate with Kernels 4.13.0-041300-generic and 4.16.0-041600-generic
- Tested with Mesa 17.2.8 and 18.1.0-devel (git-1beb80cb56), with both Firefox and Chrome.
- Core(TM) i5-2520M, Sandybridge Mobile  (0x126)
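
The summary above can be captured as a small lookup, sketched here in Python. The table holds only the kernel series reported in this thread for Sandy Bridge; anything else is returned as "untested", since the thread does not establish which kernel release fixed the hang. The names are illustrative.

```python
import re

# Kernel series reported in this thread on Sandy Bridge; not exhaustive.
REPORTED = {
    (4, 10): "hang",
    (4, 13): "ok",
    (4, 16): "ok",
}


def kernel_series(release):
    """Extract (major, minor) from a release string such as `uname -r` prints."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def reported_result(release):
    """Return 'hang', 'ok', or 'untested' for the given kernel release."""
    return REPORTED.get(kernel_series(release), "untested")
```

For example, reported_result("4.10.0-38-generic") returns "hang", matching the first bullet of the summary.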
Comment 14 Mark Janes 2018-04-06 15:19:43 UTC
Closing, since this works with a newer kernel.

