Bug 94707 - Repeatable GPU HANG: ecode 7:0:0x85fffffd
Summary: Repeatable GPU HANG: ecode 7:0:0x85fffffd
Status: CLOSED INVALID
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other Linux (All)
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-03-25 20:09 UTC by Todd
Modified: 2017-04-11 14:51 UTC
CC List: 2 users

See Also:
i915 platform: BYT
i915 features: GPU hang


Attachments
GPU crash dump from /sys/class/drm/card0/error (2.13 MB, text/plain)
2016-03-25 20:10 UTC, Todd
dmesg output (49.43 KB, text/plain)
2016-03-25 20:17 UTC, Todd

Description Todd 2016-03-25 20:09:23 UTC

    
Comment 1 Todd 2016-03-25 20:10:55 UTC
Created attachment 122569
GPU crash dump from /sys/class/drm/card0/error
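
For anyone reproducing this, a minimal Python sketch of capturing the same error state to a file for attachment. The function name `save_gpu_error_state` and the `dest` argument are illustrative assumptions; only the sysfs path comes from the attachment description. Reading that file typically requires root, and the card index may differ on other systems.

```python
from pathlib import Path

# Path taken from the attachment description; "card0" may differ elsewhere.
DEFAULT_ERROR_STATE = "/sys/class/drm/card0/error"


def save_gpu_error_state(dest, src=DEFAULT_ERROR_STATE):
    """Copy the GPU error state from src to dest.

    Returns the number of bytes written, or None if src does not exist
    (for example on a system without this sysfs entry).
    """
    src_path = Path(src)
    if not src_path.is_file():
        return None
    data = src_path.read_bytes()      # read the whole dump in one go
    Path(dest).write_bytes(data)      # write an exact copy for attaching
    return len(data)
```

A call such as `save_gpu_error_state("gpu_error.txt")` right after the hang would produce a file suitable for attaching to the report.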
Comment 2 Todd 2016-03-25 20:16:56 UTC
This is a repeatable crash using one of our existing software apps. This application runs as-is on other hardware.

Linux ikisNuc 4.3.0-1-686-pae #1 SMP Debian 4.3.3-5 (2016-01-04) i686 GNU/Linux

motherboard is an Intel NUC DE3815TYBE

OS is Debian Stretch
All versions are equal to or newer than those in the most recent Stack release from Jan 04, 2016.


Here is a backtrace from the application that hangs the GPU:

#0  0x00000099 in ?? ()
#1  0xb7e7dbff in GLView::DeleteChildren() () from /usr/local/lib/libGLSupport.so
#2  0xb7e7dc55 in GLView::~GLView() () from /usr/local/lib/libGLSupport.so
#3  0x08097d6d in CMeterScreen::~CMeterScreen (this=0x840efd8, __in_chrg=<optimized out>) at Screen.c:377
#4  0x08097ddb in CMeterScreen::~CMeterScreen (this=0x840efd8, __in_chrg=<optimized out>) at Screen.c:389
#5  0x0809728a in exit_func () at Screen.c:221
#6  0xb7843933 in __run_exit_handlers (status=1, listp=0xb79c93dc <__exit_funcs>, run_list_atexit=true) at exit.c:82
#7  0xb784398f in __GI_exit (status=1) at exit.c:104
#8  0xb6afc2ae in ?? () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
#9  0xb6ac99f5 in ?? () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
#10 0xb68e3eee in ?? () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
#11 0xb68ced6d in ?? () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
#12 0xb68e0b45 in ?? () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
#13 0xb685ec3b in ?? () from /usr/lib/i386-linux-gnu/dri/i965_dri.so
#14 0x080636d7 in CGLView_CueMeter::drawMonoWaveform (this=0x859b008) at CGLView_CueMeter.cpp:458
#15 0x080622c8 in CGLView_CueMeter::DrawGL (this=0x859b008) at CGLView_CueMeter.cpp:71
#16 0xb7e7e648 in GLView::DrawChildren() [clone .localalias.5] () from /usr/local/lib/libGLSupport.so
#17 0x0805910c in CGLView_ChanStrip_Input::DrawGL (this=0x8599a38) at CGLView_ChanStrip_Input.cpp:278
#18 0x080549cf in CGLView_ChannelStrip::DrawGL (this=0xc715270) at CGLView_ChannelStrip.cpp:450
#19 0xb7e7e648 in GLView::DrawChildren() [clone .localalias.5] () from /usr/local/lib/libGLSupport.so
#20 0x080536ba in CGLView_ChannelStrips::DrawGL (this=0x8514600) at CGLView_ChannelStrip.cpp:192
#21 0xb7e7e648 in GLView::DrawChildren() [clone .localalias.5] () from /usr/local/lib/libGLSupport.so
#22 0x08097f2d in CMeterScreen::DrawGL (this=0x840efd8) at Screen.c:418
#23 0x0809706c in draw () at Screen.c:172
#24 0xb7eb351c in ?? () from /usr/lib/i386-linux-gnu/libglut.so.3
#25 0xb7eb704f in fgEnumWindows () from /usr/lib/i386-linux-gnu/libglut.so.3
#26 0xb7eb3a5e in glutMainLoopEvent () from /usr/lib/i386-linux-gnu/libglut.so.3
#27 0xb7eb42ac in glutMainLoop () from /usr/lib/i386-linux-gnu/libglut.so.3
#28 0x08097473 in main (argc=1, argv=0xbfffef54) at Screen.c:273

Line 458 of CGLView_CueMeter.cpp is a call to glPopMatrix().

That is the last call inside our application before the crash (frame #14).
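
As a rough aid for reading traces like the one above, here is a Python sketch that finds the first frame with source information that called into a given library. The function name `caller_of_library` and the matching heuristics are illustrative assumptions, not part of gdb or of our tooling, and it only handles frame lines in the gdb format shown here.

```python
import re

FRAME_RE = re.compile(r"^#(\d+)\s")          # frame number at line start
FROM_RE = re.compile(r" from (\S+)$")        # trailing "from <library>"
AT_RE = re.compile(r" at (\S+):(\d+)$")      # trailing "at <file>:<line>"


def caller_of_library(backtrace, library_substring):
    """Return (frame_number, source_file, line) of the first frame with
    source info that appears below a frame from the given library.

    Returns None if no such frame is found.
    """
    seen_library = False
    for raw in backtrace.splitlines():
        line = raw.strip()
        frame = FRAME_RE.match(line)
        if not frame:
            continue  # skip anything that is not a frame line
        lib = FROM_RE.search(line)
        if lib and library_substring in lib.group(1):
            seen_library = True  # we are inside the library's frames
            continue
        src = AT_RE.search(line)
        if seen_library and src:
            return int(frame.group(1)), src.group(1), int(src.group(2))
    return None
```

Called with the backtrace text and `"i965_dri.so"`, this walks outward from the driver frames and stops at the first application frame that has a file and line number.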
Comment 3 Todd 2016-03-25 20:17:21 UTC
Created attachment 122570
dmesg output
Comment 4 yann 2017-03-17 15:23:15 UTC
(In reply to Todd from comment #3)
> Created attachment 122570
> dmesg output

We seem to have neglected the bug a bit, apologies.

Todd, improvements have since been pushed to the kernel that should benefit your system, so please re-test with the latest kernel. Mark the bug as REOPENED if you can still reproduce it (and attach a fresh kernel log), or as RESOLVED/* if you cannot.
Comment 5 Chris Wilson 2017-04-08 19:49:16 UTC
Likely fixed with mesa updates.
Comment 6 Todd 2017-04-11 14:51:48 UTC
I will recheck when I have some time and reopen if this is still an issue.

