Bug 103485 - [HSW] GPU HANG: ecode 7:0:0x87d3bffa, in java [2730], reason: Hang on render ring, action: reset
Summary: [HSW] GPU HANG: ecode 7:0:0x87d3bffa, in java [2730], reason: Hang on render ...
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium critical
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-10-27 15:03 UTC by Lars Taubert
Modified: 2017-12-18 09:09 UTC

See Also:
i915 platform: HSW
i915 features: GPU hang


Attachments
/sys/class/drm/card0/error (30.04 KB, text/plain)
2017-10-27 15:03 UTC, Lars Taubert
/var/log/Xorg.0.log (20.43 KB, text/x-log)
2017-10-27 15:05 UTC, Lars Taubert
/etc/X11/xorg.conf.d/01-intel-graphics.conf (124 bytes, text/plain)
2017-10-27 15:07 UTC, Lars Taubert

Description Lars Taubert 2017-10-27 15:03:48 UTC
Created attachment 135107
/sys/class/drm/card0/error

Java is rendering RFB content via OpenGL, and the OpenGL output hangs at some point.
Sometimes this happens 5 minutes after boot and application start, sometimes after 4 days.

System: Shuttle Inc. DS81D
OS: CentOS 7.4.1708
CPU: Intel(R) Celeron(R) CPU G1840 @ 2.80GHz
Host bridge: Intel Corporation 4th Gen Core Processor DRAM Controller (rev 06)
VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
Kernel: 3.10.0-693.2.2.el7.x86_64

Screen 0: minimum 8 x 8, current 3840 x 1080, maximum 32767 x 32767
DP1 connected 1920x1080+1920+0 (normal left inverted right x axis y axis) 640mm x 360mm
DP2 connected primary 1920x1080+0+0 (normal left inverted right x axis y axis) 640mm x 360mm
Comment 1 Lars Taubert 2017-10-27 15:05:26 UTC
Created attachment 135108
/var/log/Xorg.0.log
Comment 2 Lars Taubert 2017-10-27 15:07:12 UTC
Created attachment 135109
/etc/X11/xorg.conf.d/01-intel-graphics.conf
Comment 3 Elizabeth 2017-10-31 17:45:39 UTC
Hello, this seems to be a Mesa bug, so I am reassigning it. If possible, please install a more recent kernel so that we can get more information from the error state. Also, which version of Mesa are you using? Thank you.

From error state: 
    ...
    ERROR: 0x00000101
    TLB page fault error (GTT entry not valid)
    Cacheline containing a PD was marked as invalid
    ...
  
render command stream:
  START: 0x00001000
  HEAD:  0x97410c68 [0x00010c00]
    head = 0x00010c68, wraps = 1210
  TAIL:  0x00010d30 [0x00010c88, 0x00010c98]
  CTL:   0x0001f001
    len=131072, enabled
  MODE:  0x00004000
  HWS:   0x7fff0000
  ACTHD: 0x00000000 97410c68
    at ring: 0x00000000
  IPEIR: 0x00000000
  IPEHR: 0x780c0000
  INSTDONE: 0xffdfbffa
    busy: SVG
    busy: VS
  SC_INSTDONE: 0xfffffffe
  SAMPLER_INSTDONE[0][0]: 0xffffffff
  ROW_INSTDONE[0][0]: 0xffffffff
  batch: [0x00000000_096f5000, 0x00000000_096f9000]
  BBADDR: 0x00000000_096f4d90
  BB_STATE: 0x00000000
  INSTPS: 0x00000500
  INSTPM: 0x00006080
  FADDR: 0x00000000 00011c68
  RC PSMI: 0x00000010
  FAULT_REG: 0x000000c5
    Valid
    Unloaded PD Fault (PPGTT)
    Address 0x00000000
    Source ID 24
  SYNC_0: 0x00000000
  SYNC_1: 0x00000000
  SYNC_2: 0x00000000
  GFX_MODE: 0x00002a00
  PP_DIR_BASE: 0x7fdf0000
  seqno: 0x00563817
  last_seqno: 0x0056381b
  waiting: yes
  ring->head: 0x0001a288
  ring->tail: 0x00010d30
  hangcheck: hung [42]
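
The decoded values in the dump above can be cross-checked by hand. The following is a minimal Python sketch, not the kernel's or the decoder's actual code. The bit layouts are assumptions chosen because they reproduce the figures printed in this report: wrap count in HEAD bits 31:21, ring length in 4 KiB pages minus one in CTL bits 20:12, and the ecode in the bug title taken as IPEHR XOR INSTDONE. The function names are hypothetical.

```python
# Register values copied from the error state above.
HEAD = 0x97410C68
CTL = 0x0001F001
IPEHR = 0x780C0000
INSTDONE = 0xFFDFBFFA

PAGE_SIZE = 4096  # assumption: ring length is counted in 4 KiB pages


def decode_ring_head(head):
    """Split HEAD into (offset, wraps). Assumed layout: wraps in bits 31:21."""
    offset = head & 0x001FFFFC  # dword-aligned offset within the ring
    wraps = head >> 21
    return offset, wraps


def decode_ring_ctl(ctl):
    """Return (length in bytes, enabled).

    Assumed layout: (pages - 1) in bits 20:12, enable flag in bit 0.
    """
    pages = ((ctl >> 12) & 0x1FF) + 1
    return pages * PAGE_SIZE, bool(ctl & 1)


def hang_ecode(ipehr, instdone):
    """Assumed derivation of the ecode in the hang message: IPEHR XOR INSTDONE."""
    return ipehr ^ instdone


if __name__ == "__main__":
    offset, wraps = decode_ring_head(HEAD)
    print(f"head = 0x{offset:08x}, wraps = {wraps}")
    length, enabled = decode_ring_ctl(CTL)
    print(f"len={length}, {'enabled' if enabled else 'disabled'}")
    print(f"ecode = 0x{hang_ecode(IPEHR, INSTDONE):08x}")
```

Under these assumptions the sketch yields head = 0x00010c68 with 1210 wraps, len=131072 with the ring enabled, and ecode 0x87d3bffa, which match the dump and the bug title.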

From batch:
...
0x096f52c0:      0x78090003: 3DSTATE_VERTEX_ELEMENTS
0x096f52c4:      0x02850000:    buffer 0: valid, type 0x0085, src offset 0x0000 bytes
0x096f52c8:      0x11230000:    (X, Y, 0.0, 1.0), dst offset 0x00 bytes
0x096f52cc:      0x0200000c:    buffer 0: valid, type 0x0000, src offset 0x000c bytes
0x096f52d0:      0x11110000:    (X, Y, Z, W), dst offset 0x00 bytes
0x096f52d4:      0x780c0000: 3D UNKNOWN: 3d_965 opcode = 0x780c
0x096f52d8:      0x00000000: MI_NOOP
0x096f52dc:      0x7b000005: 3DPRIMITIVE: 
0x096f52e0:      0x00000006:    tri fan sequential
0x096f52e4:      0x00000004:    vertex count
0x096f52e8:      0x00000000:    start vertex
0x096f52ec:      0x00000001:    instance count
0x096f52f0:      0x00000000:    start instance
0x096f52f4:      0x00000000:    index bias
...
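
The command at 0x096f52d4 is the one the decoder could not name, and its header matches IPEHR (0x780c0000). As a rough aid for reading it, here is a minimal Python sketch that splits a 3D command header into its fields. The field positions (type in bits 31:29, subtype in 28:27, opcode in 26:24, sub-opcode in 23:16, length in 7:0 with a bias of 2) are assumptions that agree with the two commands the decoder did recognise in this batch. The function name is hypothetical.

```python
def decode_3d_header(dword):
    """Split a 3D command header dword into its fields (assumed layout)."""
    return {
        "type": (dword >> 29) & 0x7,
        "subtype": (dword >> 27) & 0x3,
        "opcode": (dword >> 24) & 0x7,
        "subopcode": (dword >> 16) & 0xFF,
        # The length field is assumed to hold the total dword count minus 2.
        "total_dwords": (dword & 0xFF) + 2,
    }


if __name__ == "__main__":
    # Headers copied from the batch excerpt above.
    for header in (0x78090003, 0x780C0000, 0x7B000005):
        fields = decode_3d_header(header)
        print(f"0x{header:08x}: {fields}")
```

With this layout 3DSTATE_VERTEX_ELEMENTS spans 5 dwords and 3DPRIMITIVE spans 7, both consistent with the addresses in the excerpt. The unknown 0x780c command would span 2 dwords, so the zero dword at 0x096f52d8 shown as MI_NOOP may be its payload rather than a separate no-op. On Haswell this opcode appears to correspond to 3DSTATE_VF, but that identification is an assumption and is not confirmed anywhere in this report.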
Comment 4 Lars Taubert 2017-11-01 13:48:49 UTC
I'm using mesa-dri-drivers in version 17.0.1-6.20170307.
My kernel is nailed down to what's included in CentOS.
Comment 5 Elizabeth 2017-12-08 22:42:42 UTC
Hello again Lars, 
This may be hard, but can you identify a reliable way to trigger the issue? The new Mesa release 17.3 may make a difference. It would also be helpful if you could test with the combination of 17.3 and the latest stable kernel from https://www.kernel.org.
Thank you.
Comment 6 Lars Taubert 2017-12-18 09:09:49 UTC
Fixed with new Mesa.

Many Thanks!

