Bug 108358 - [hsw] GPU HANG: ecode 7:0:0x87d73c1a, in urbanterror
Status: RESOLVED WORKSFORME
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
Whiteboard: Triaged
Reported: 2018-10-14 20:08 UTC by holgersson
Modified: 2019-09-04 10:55 UTC
i915 platform: HSW
i915 features: GPU hang


Attachments
crash dump from /sys/class/drm/card0/error (88.17 KB, text/plain)
2018-10-14 20:08 UTC, holgersson
UrbanTerror game config (16.62 KB, text/x-csrc)
2018-10-23 13:24 UTC, holgersson

Description holgersson 2018-10-14 20:08:53 UTC
Created attachment 142024
crash dump from /sys/class/drm/card0/error

Hi,

I ran the first-person shooter urbanterror and the system hung for several seconds. Afterwards I inspected the kernel ring buffer and found the following in dmesg:

[ 8856.096653] [drm] GPU HANG: ecode 7:0:0x87d73c1a, in urbanterror [10986], reason: hang on rcs0, action: reset
[ 8856.096655] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 8856.096656] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 8856.096657] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 8856.096658] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 8856.096659] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 8856.096705] i915 0000:00:02.0: Resetting chip for hang on rcs0
[ 8864.189283] i915 0000:00:02.0: Resetting chip for hang on rcs0
[ 8865.701281] sysrq: SysRq : Keyboard mode set to system default
[ 8872.085862] i915 0000:00:02.0: Resetting chip for hang on rcs0
[ 8880.189122] i915 0000:00:02.0: Resetting chip for hang on rcs0
[ 8888.082397] i915 0000:00:02.0: Resetting chip for hang on rcs0

The sysrq line comes from my attempt to regain control and switch to a TTY
to inspect the issue, as I first thought it was an OOM situation.

I'm running Gentoo with KDE Plasma 5 and xorg-server-1.20.1.

My kernel is 4.18.14 from upstream with some of the genkernel patches, but only the ones for additional menus, CPU detection (-march=native) and so on; nothing related to graphics components.

Please let me know if you need further details.
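
For anyone triaging similar reports, the dmesg excerpt above can be picked apart mechanically. The following is a minimal sketch only: the regular expression is modelled on the single "GPU HANG" line quoted in this report, and the names `parse_gpu_hang`, `GpuHang` and `count_resets` are hypothetical helpers, not part of any i915 or Mesa tooling. Other kernel versions may format the line differently.

```python
import re
from typing import NamedTuple, Optional

# Pattern modelled on the "GPU HANG" line quoted in this report (assumption:
# other kernel versions may word or order these fields differently).
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm\] GPU HANG: "
    r"ecode (?P<ecode>\S+), in (?P<comm>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


class GpuHang(NamedTuple):
    timestamp: float  # seconds since boot, as printed by dmesg
    ecode: str        # kept as an opaque string, e.g. "7:0:0x87d73c1a"
    process: str
    pid: int
    reason: str
    action: str


def parse_gpu_hang(line: str) -> Optional[GpuHang]:
    """Return the fields of a '[drm] GPU HANG' line, or None if it does not match."""
    m = HANG_RE.search(line)
    if m is None:
        return None
    return GpuHang(
        timestamp=float(m["ts"]),
        ecode=m["ecode"],
        process=m["comm"],
        pid=int(m["pid"]),
        reason=m["reason"],
        action=m["action"],
    )


def count_resets(lines):
    """Count 'Resetting chip for hang' messages in an iterable of dmesg lines."""
    return sum("Resetting chip for hang" in line for line in lines)
```

Feeding the lines of the excerpt above into `count_resets` gives the number of resets i915 attempted, which is a quick way to see that the hang recurred several times within about half a minute.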
Comment 1 Lakshmi 2018-10-15 06:09:07 UTC
Reporter, how often do you see this issue?
Can you verify this issue with the latest drm-tip (https://cgit.freedesktop.org/drm-tip)?
If the problem exists with the latest drm-tip, set the kernel parameters drm.debug=0x1e log_buf_len=4M and reboot.
Try to reproduce the issue and attach the GPU crash dump and dmesg log.
This way we will see more information about the bug.
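
As a side note on what drm.debug=0x1e selects: the value is a bitmask of debug categories. The sketch below decodes it, assuming the bit assignments documented for the drm module's debug parameter in kernels of this era (core, driver, kms, prime, atomic, vbl); `decode_drm_debug` is a hypothetical helper name and newer kernels define additional bits that are not listed here.

```python
# Bit assignments for the drm.debug module parameter (assumption: matches
# the kernel in use; newer kernels add further categories).
DRM_DEBUG_BITS = {
    0x01: "core",
    0x02: "driver",
    0x04: "kms",
    0x08: "prime",
    0x10: "atomic",
    0x20: "vbl",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by a drm.debug mask."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


if __name__ == "__main__":
    # Show which categories the value requested in this comment turns on.
    print(decode_drm_debug(0x1E))
```

Under that assumption, 0x1e enables the driver, kms, prime and atomic categories and leaves the noisier core and vblank messages off.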
Comment 2 Lionel Landwerlin 2018-10-15 09:14:58 UTC
It would be interesting to know what version of Mesa you're using.
The instruction pointer is outside of the batchbuffer, so I think this might be fixed by this patch:

commit 1c9f1a28c0738a0b1cb8626af431d18eeee3f4f1
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Fri Jan 5 12:07:20 2018 -0800

    i965: Require space for MI_BATCHBUFFER_END.
Comment 3 holgersson 2018-10-15 12:19:13 UTC
(In reply to Lakshmi from comment #1)

Thanks for your fast response.

This was the first time I experienced the issue. Actually, I have had some hangs with urbanterror in the past, but never looked at dmesg.

Due to my schedule, I can only try to reproduce this at the end of the week, following the steps you mentioned.


(In reply to Lionel Landwerlin from comment #2)

I’m using media-libs/mesa-18.2.2-r1, compiled with gcc-8.2.
You can find more details about revisions etc. at https://packages.gentoo.org/packages/sys-libs/mesa.

I'll also paste the output of "emerge --info", which gives plenty of details about other essential system components like the CPU (with the corresponding GPU), CFLAGS and so on; "::gentoo" means the installed package comes from the official Gentoo repository:

Portage 2.3.51 (python 3.6.6-final-0, default/linux/amd64/17.1/no-multilib/hardened, gcc-8.2.0, glibc-2.26-r7, 4.18.14-x240 x86_64)
=================================================================
                         System Settings
=================================================================
System uname: Linux-4.18.14-x240-x86_64-Intel-R-_Core-TM-_i5-4300U_CPU_@_1.90GHz-with-gentoo-2.6
KiB Mem:     8098064 total,    220348 free
KiB Swap:    8373568 total,   8373568 free
Timestamp of repository gentoo: Sat, 13 Oct 2018 21:04:27 +0000
Head commit of repository gentoo: 1f76cb925845c062be88bfa28a4a78ef1ebbbb8a

Timestamp of repository holgersson-overlay: Sat, 13 Oct 2018 21:19:55 +0000
Head commit of repository holgersson-overlay: a1811950542a8eceb7e0f876a681d46f4376162c

Timestamp of repository kde: Sat, 13 Oct 2018 21:19:53 +0000
Timestamp of repository mozilla: Sat, 13 Oct 2018 21:19:43 +0000
sh bash 4.4_p23
ld GNU ld (Gentoo 2.31.1 p3) 2.31.1
app-shells/bash:          4.4_p23::gentoo
dev-lang/perl:            5.26.2::gentoo
dev-lang/python:          2.7.15::gentoo, 3.6.6::gentoo, 3.7.0::gentoo
dev-util/cmake:           3.12.3::gentoo
dev-util/pkgconfig:       0.29.2::gentoo
sys-apps/baselayout:      2.6-r1::gentoo
sys-apps/openrc:          0.38.2::gentoo
sys-apps/sandbox:         2.13::gentoo
sys-devel/autoconf:       2.13::gentoo, 2.69-r4::gentoo
sys-devel/automake:       1.15.1-r2::gentoo, 1.16.1-r1::gentoo
sys-devel/binutils:       2.31.1-r1::gentoo
sys-devel/gcc:            8.2.0-r3::gentoo
sys-devel/gcc-config:     2.0::gentoo
sys-devel/libtool:        2.4.6-r5::gentoo
sys-devel/make:           4.2.1-r4::gentoo
sys-kernel/linux-headers: 4.17::gentoo (virtual/os-headers)
sys-libs/glibc:           2.26-r7::gentoo
Repositories:

gentoo
    location: /var/db/repos/gentoo
    sync-type: git
    sync-uri: https://anongit.gentoo.org/git/repo/sync/gentoo.git
    sync-user: portage-sync
    priority: -1000
    sync-git-verify-commit-signature: true
    sync-git-clone-extra-opts: --branch master

g-cpan
    location: /var/db/repos/g-cpan-overlay
    masters: gentoo
    priority: 0

local
    location: /var/db/repos/overlay
    masters: gentoo
    priority: 1

holgersson-overlay
    location: /var/lib/layman/holgersson-overlay
    sync-type: git
    sync-uri: https://git.holgersson.xyz/holgersson-overlay
    masters: gentoo
    priority: 50

kde
    location: /var/lib/layman/kde
    sync-type: laymansync
    sync-uri: https://anongit.gentoo.org/git/proj/kde.git
    masters: gentoo
    priority: 50

mozilla
    location: /var/lib/layman/mozilla
    sync-type: laymansync
    sync-uri: git://anongit.gentoo.org/proj/mozilla.git
    masters: gentoo
    priority: 50

Installed sets: @SDR, @custom_KDE, @electronics, @games
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe -march=native -frecord-gcc-switches -ffunction-sections -fdata-sections -fstack-clash-protection -fcf-protection=full -mindirect-branch=thunk -mfunction-return=thunk -mindirect-branch-register"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/config /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-O2 -pipe -march=native -frecord-gcc-switches -ffunction-sections -fdata-sections -fstack-clash-protection -fcf-protection=full -mindirect-branch=thunk -mfunction-return=thunk -mindirect-branch-register"
DISTDIR="/var/cache/distfiles"
EMERGE_DEFAULT_OPTS="--quiet-build --autounmask=n --binpkg-respect-use=y"
ENV_UNSET="DBUS_SESSION_BUS_ADDRESS DISPLAY PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-O2 -pipe -frecord-gcc-switches"
FEATURES="assume-digests binpkg-logs buildpkg cgroup clean-logs compress-build-logs config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news nodoc noinfo parallel-fetch preserve-libs protect-owned sandbox sfperms sign strict strict-keepdir unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe -frecord-gcc-switches"
GENTOO_MIRRORS="http://ftp.halifax.rwth-aachen.de/gentoo/"
LANG="de_DE.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed -Wl,--gc-sections"
LINGUAS="de de_DE"
MAKEOPTS="-j5 -l5"
PKGDIR="/var/cache/packages"
PORTAGE_COMPRESS="xz"
PORTAGE_COMPRESS_FLAGS="-6 -T4"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="X acl acpi alsa amd64 apparmor bash-completion bluetooth bzip2 caps cgroups crypt cryptsetup cups cxx dbus djvu egl elogind exif fam firefox flac fontconfig git glamor gles gpg gpm graphicsmagick gstreamer hardened hunspell iconv icu int64 ipv6 jit jpeg kde kipi lapack libinput libtirpc lzma ncurses networkmanager nftables nls nptl offensive ogg opengl openmp openmpi openmpi2 opus pam pcre pdf phonon pic pie pkcs11 plasma png policykit postscript pulseaudio qml qt5 readline sasl sdl seccomp semantic-desktop smp sound spell ssh ssl ssp startup-notification svg theora threads tiff truetype udev udisks unicode upower usb v4l vaapi video vim-syntax vorbis vpx wavpack wayland widgets x264 x265 xattr xcb xcomposite xkb xtpax xv xvid zlib zsh-completion" ABI_X86="64" ALSA_CARDS="hda-intel" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon plan sheets stage words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="evdev synaptics wacom" KERNEL="linux" L10N="de de_DE" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LLVM_TARGETS="BPF X86" 
OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-6 php7-1" POSTGRES_TARGETS="postgres9_5 postgres10" PYTHON_SINGLE_TARGET="python3_6" PYTHON_TARGETS="python2_7 python3_6" RUBY_TARGETS="ruby23" USERLAND="GNU" VIDEO_CARDS="intel i965" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, INSTALL_MASK, LC_ALL, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_RSYNC_EXTRA_OPTS

=================================================================
                        Package Settings
=================================================================

media-libs/mesa-18.2.2-r1::gentoo was built with the following:
USE="classic dri3 egl gallium gbm gles2 llvm pic vaapi wayland -d3d9 -debug -gles1 -lm_sensors -opencl -osmesa -pax_kernel (-selinux) -test -unwind -valgrind -vdpau -vulkan -xa -xvmc" VIDEO_CARDS="i965 intel (-freedreno) -i915 (-imx) -nouveau -r100 -r200 -r300 -r600 -radeon -radeonsi (-vc4) -virgl (-vivante) -vmware"
Comment 4 Lionel Landwerlin 2018-10-15 16:54:30 UTC
False alarm on the missing MI_BATCH_BUFFER_END, just a bug in the tool.
Comment 5 Lionel Landwerlin 2018-10-15 17:51:16 UTC
Could you also provide the settings you're using for UrbanTerror? Thanks!
Comment 6 holgersson 2018-10-23 13:24:02 UTC
I could no longer reproduce the hang, even though I have not updated any graphics-related packages in the meantime. Unless it helps you to investigate it further,
I suggest closing it as RESOLVED INVALID and reopening it if the same issue occurs (reproducibly) again.


(In reply to Lionel Landwerlin from comment #5)
> Could you also provide the settings you're using for UrbanTerror? Thanks!

For the game engine I have set USE="altgamma client curl opus skeetshootmod voip vorbis -debug -mumble -openal -server". Read it as: everything is enabled except the flags prefixed with "-".
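
The reading rule for the USE string can be written down as a tiny sketch. `split_use_flags` is a hypothetical helper, not part of Portage, and it deliberately ignores the parenthesised forms such as "(-selinux)" that "emerge --info" also prints.

```python
def split_use_flags(use):
    """Split a USE string into (enabled, disabled) flag lists.

    Flags prefixed with "-" are disabled; everything else is enabled.
    """
    enabled, disabled = [], []
    for flag in use.split():
        if flag.startswith("-"):
            disabled.append(flag[1:])
        else:
            enabled.append(flag)
    return enabled, disabled


if __name__ == "__main__":
    use = ("altgamma client curl opus skeetshootmod voip vorbis "
           "-debug -mumble -openal -server")
    enabled, disabled = split_use_flags(use)
    print("enabled: ", enabled)
    print("disabled:", disabled)
```

For the string quoted above this yields seven enabled flags and four disabled ones (debug, mumble, openal, server).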

As I’m not sure whether you meant these or the game settings with OpenGL configs etc., I’ll attach those as well.
Comment 7 holgersson 2018-10-23 13:24:51 UTC
Created attachment 142148
UrbanTerror game config
Comment 8 Lionel Landwerlin 2018-10-23 13:45:41 UTC
(In reply to holgersson from comment #6)
> I couldn't reproduce the hang anymore without updating any graphic related
> packages in the mean time. Unless it helps you if investigate it further
> I suggest to close it as resolved invalid and reopen if the same issue
> occurs (reproducable) again.
> 
> 
> (In reply to Lionel Landwerlin from comment #5)
> > Could you also provide the settings you're using for UrbanTerror? Thanks!
> 
> For the game engine I have set USE="altgamma client curl opus skeetshootmod
> voip vorbis -debug -mumble -openal -server". Read it as everything is
> enabled beside ones prefixed with "-".
> 
> As I’m not sure if you meant these or the game settings with OpenGL configs
> etc. I’ll attached this aswell.

I meant the OpenGL settings indeed. I'll give your settings a try later and will close this if no hang is reproduced.

I have one device (x240 Lenovo laptop) with the exact same PCI id as yours.
Unfortunately, even after a few hours of running the game (version 4.3.4: https://www.urbanterror.info/downloads/), I couldn't reproduce any hang.
Comment 9 holgersson 2018-10-24 07:01:10 UTC
(In reply to Lionel Landwerlin from comment #8)
> [...]
> I meant the OpenGL settings indeed. I'll give a try to your settings later
> and will close if no hang is reproduced.
> 
> I have one device (x240 Lenovo laptop) with the exact same PCI id as yours.
> Unfortunately even after a few hours of running the game (version 4.3.4 :
> https://www.urbanterror.info/downloads/), I didn't run reproduce any hang.

My hardware is an x240 as well. On Gentoo we use a different game engine, because at the time of that decision the official one had several issues. AFAIK those are fixed now, but it is still slightly different software:
https://github.com/mickael9/ioq3

However, I don't expect that you can reproduce it with this one either. Maybe it was a hiccup or a heisenbug :-)
Comment 10 Marina Chernish 2018-10-25 13:22:41 UTC
I’ve tried to reproduce this hang with the mentioned game version (downloaded from https://github.com/mickael9/ioq3) using the attached settings, but observed no hangs on mesa-18.2.2 or the newest 18.3.0. I explored several maps while playing.
The only issue I caught is that the picture of the weapon disappears: it just turns black. It happened on the Blitzkrieg map. I’m currently trying to create an apitrace of it.
My environment: Haswell: CPU: Intel Core i5-4300M; GPU: Intel® HD Graphics 4600
Ubuntu 16.04; kernel 4.15.0-36-generic;
Mesa 18.2.2 and 18.3.0 were used.
Comment 11 vadym 2018-10-25 16:28:20 UTC
(In reply to Marina Chernish from comment #10)
> I’ve tried to reproduce this hang on mentioned game version (downloaded from
> https://github.com/mickael9/ioq3) using attached settings, but observed no
> hangs on mesa-18.2.2 and the newest 18.3.0. I explored several maps while
> playing. 
> The only issue I’ve caught is disappearing of picture of weapon. It just
> gets black. It happened on Blitzkrieg map. I’m trying to create apitrace of
> it for now.
> My environment: Haswell: CPU: Intel Core i5-4300M; GPU: Intel® HD Graphics
> 4600
> Ubuntu 16.04; kernel 4.15.0-36-generic;
> Mesa 18.2.2 and 18.3.0 were used.

The black weapon issue is reproducible on Intel, Radeon and the software renderer, so it looks like the issue is in the game. It populates two uniforms, u_LightOrigin and u_DirectedLight, with zeroes, which leads to a solid black weapon.
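
Why a zeroed light uniform gives a black result can be illustrated with a toy diffuse term. This is not the game's actual shader: it is a plain Lambert model with no ambient contribution (both assumptions), and `shade` is a hypothetical function written only for this illustration.

```python
def shade(albedo, directed_light, normal, light_dir):
    """Toy Lambert diffuse term: albedo * directed_light * max(N.L, 0).

    All arguments are 3-component tuples; normal and light_dir are assumed
    to be unit vectors. There is no ambient term in this sketch.
    """
    n_dot_l = max(sum(n * l for n, l in zip(normal, light_dir)), 0.0)
    return tuple(a * d * n_dot_l for a, d in zip(albedo, directed_light))


if __name__ == "__main__":
    albedo = (0.8, 0.7, 0.6)
    facing = (0.0, 0.0, 1.0)
    # White directed light, surface facing the light: the albedo comes through.
    print(shade(albedo, (1.0, 1.0, 1.0), facing, facing))
    # Directed light set to zero: every channel is multiplied by zero.
    print(shade(albedo, (0.0, 0.0, 0.0), facing, facing))
```

In such a model the light colour multiplies every channel, so a zero directed light yields black regardless of the texture or surface orientation.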
Comment 12 Denis 2019-09-04 10:55:06 UTC
Hi, as Marina and Vadim tried to reproduce this issue without success, and the reporter also mentioned no longer being able to reproduce it, I am closing it as WORKSFORME. Please feel free to reopen it if the issue occurs again or if new steps to reproduce it are found.

