Created attachment 143991 [details]
i915_error_state

KERNEL RELEASE: 4.19.5
KERNEL VERSION: #1 SMP Thu Nov 29 10:58:58 UTC 2018
PLATFORM: Ubuntu 18.04.1LTS BDW
GPU GT : GT3 (0x1622)
CPU model name : Intel(R) Core(TM) i7-5850HQ CPU @ 2.70GHz

cat /proc/cmdline:
\boot\vmlinuz-4.19.5 root=LABEL=TARGET_OS ro vconsole.font=latarcyrheb-sun16 crashkernel=128M vconsole.keymap=us biosdevname=0 LANG=en_US.UTF-8 systemd.debug modprobe.blacklist=ast,mgag200 intel_pstate=disable i915.enable_rc6=0 intel_idle.max_cstate=1 initrd=boot\initrd.img-4.19.5

GPU hang doesn't reproduce on Kernel 4.14.20.
(In reply to Emelianova Svetlana from comment #1)
> Created attachment 143991 [details]
> i915_error_state

Looks like an ordinary userspace hang.
I added drm.debug=0xe parameter

>> cat /proc/cmdline
\boot\vmlinuz-4.19.5 root=LABEL=TARGET_OS ro vconsole.font=latarcyrheb-sun16 crashkernel=128M vconsole.keymap=us biosdevname=0 LANG=en_US.UTF-8 systemd.debug modprobe.blacklist=ast,mgag200 intel_pstate=disable i915.enable_rc6=0 intel_idle.max_cstate=1 drm.debug=0xe initrd=boot\initrd.img-4.19.5

dmesg after GPU hang
>> dmesg -e
[Apr17 11:24] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[ +0.001032] [drm:intel_gpu_reset [i915]] rcs0: timed out on STOP_RING
[ +0.000034] [drm:i915_gem_reset_engine [i915]] client mfx_transcoder[6254]/2: gained 1 ban score, now 1
[ +4.031095] i915 0000:00:02.0: Resetting vcs0 for hang on vcs0
[ +0.001028] [drm:intel_gpu_reset [i915]] vcs0: timed out on STOP_RING
[ +0.000022] [drm:i915_gem_reset_engine [i915]] client mfx_transcoder[6254]/2: gained 1 ban score, now 2
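For triaging logs like the one in this comment, the engine resets and ban-score lines can be tallied with a short script. This is only a sketch: the regular expressions are assumptions derived from the message wording in this dmesg excerpt, not from any documented i915 log format, and `summarize_hangs` is a hypothetical helper name.

```python
import re
from collections import Counter

# Patterns inferred from the dmesg excerpt in this report (an assumption,
# not a stable kernel interface).
RESET_RE = re.compile(r"Resetting (\w+) for hang on (\w+)")
BAN_RE = re.compile(r"client (\S+)\[(\d+)\]/\d+: gained (\d+) ban score, now (\d+)")


def summarize_hangs(dmesg_text):
    """Return (resets per engine, latest ban score per (client, pid))."""
    resets = Counter()
    ban_scores = {}
    for line in dmesg_text.splitlines():
        reset = RESET_RE.search(line)
        if reset:
            # Count the engine that was reported as hung.
            resets[reset.group(2)] += 1
        ban = BAN_RE.search(line)
        if ban:
            # Keep the most recent "now N" value for each client.
            ban_scores[(ban.group(1), int(ban.group(2)))] = int(ban.group(4))
    return dict(resets), ban_scores


# The dmesg lines quoted in this comment.
SAMPLE = """\
[Apr17 11:24] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[ +0.001032] [drm:intel_gpu_reset [i915]] rcs0: timed out on STOP_RING
[ +0.000034] [drm:i915_gem_reset_engine [i915]] client mfx_transcoder[6254]/2: gained 1 ban score, now 1
[ +4.031095] i915 0000:00:02.0: Resetting vcs0 for hang on vcs0
[ +0.001028] [drm:intel_gpu_reset [i915]] vcs0: timed out on STOP_RING
[ +0.000022] [drm:i915_gem_reset_engine [i915]] client mfx_transcoder[6254]/2: gained 1 ban score, now 2
"""
```

Calling `summarize_hangs(SAMPLE)` on the excerpt groups the two resets by engine (rcs0 and vcs0) and reports the last ban score seen for the mfx_transcoder client.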
Created attachment 144035 [details] binary for reproducing
Created attachment 144036 [details] script for reproducing
I built the latest drm kernel from https://anongit.freedesktop.org/git/drm/drm.git (commit f06ddb5). The GPU hang appears there too.

I attached sample_encode and a bash script that runs it. In the script, replace "(path/to/stream)" with the real stream path and set the correct resolution (-w -h). The stream should be in YUV format with a resolution of at least 4K. The Media SDK environment needs to be built from https://github.com/Intel-Media-SDK/MediaSDK.

Reproducing the hang requires multiple simultaneous launches. I ran it with this command line:
"./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh"
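The command line in this comment starts 13 copies of catch_GPU_HANG.sh at the same time. A minimal sketch of the same launch pattern follows, assuming the script sits in the current directory and is executable; `run_parallel` is a hypothetical helper name, not part of the attached script.

```python
import subprocess


def run_parallel(cmd, copies):
    """Start `copies` instances of `cmd` at once and return their exit codes."""
    # Launch everything first so the instances really run concurrently.
    procs = [subprocess.Popen(cmd) for _ in range(copies)]
    # Then wait for each instance to finish.
    return [proc.wait() for proc in procs]


# Example (assumes ./catch_GPU_HANG.sh exists and is executable):
#     exit_codes = run_parallel(["./catch_GPU_HANG.sh"], 13)
```

Unlike the shell one-liner, this waits for every instance to exit, which makes it easier to collect dmesg once the whole batch is done.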
(In reply to Emelianova Svetlana from comment #6)
> I built the latest drm kernel from
> https://anongit.freedesktop.org/git/drm/drm.git f06ddb5 commit. GPU HANG
> appears too.
> I attached sample_encode and bash script which runs it. Need to replace
> "(path/to/stream)" to real stream path and replace correct resolution (-w
> -h) in script. Stream should be YUV format and has not less 4k resolution.
> Need to build mediasdk environment from
> https://github.com/Intel-Media-SDK/MediaSDK.
> For reproducing it is necessary a multiple launch, I ran with command line:
> "./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh &
> ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh &
> ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh &
> ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh & ./catch_GPU_HANG.sh &
> ./catch_GPU_HANG.sh"

Can you please attach the error state file and a dmesg from boot with the latest drm-tip?
Created attachment 144076 [details] log, output stream and dmesg
(In reply to Emelianova Svetlana from comment #8)
> Created attachment 144076 [details]
> log, output stream and dmesg

The attached logs are from kernel 4.19. Can you please verify the issue with drm-tip (kernel 5.1) (https://cgit.freedesktop.org/drm-tip)?
(In reply to Lakshmi from comment #9)
> (In reply to Emelianova Svetlana from comment #8)
> > Created attachment 144076 [details]
> > log, output stream and dmesg
>
> Attached logs are from kernel 4.19, Can you please verify the issue with
> drmtip (Kernel 5.1) (https://cgit.freedesktop.org/drm-tip) ?

The latest attachment 144076 [details] has logs and dmesg from the latest drmtip (build is based on f06ddb5 commit). "Linux version 5.1.0-rc5+" from dmessg_27128.txt
(In reply to Emelianova Svetlana from comment #10)
> (In reply to Lakshmi from comment #9)
> > (In reply to Emelianova Svetlana from comment #8)
> > > Created attachment 144076 [details]
> > > log, output stream and dmesg
> >
> > Attached logs are from kernel 4.19, Can you please verify the issue with
> > drmtip (Kernel 5.1) (https://cgit.freedesktop.org/drm-tip) ?
>
> The latest attachment 144076 [details] has logs and dmesg from the latest
> drmtip (build is based on f06ddb5 commit). "Linux version 5.1.0-rc5+" from
> dmessg_27128.txt

Can you please report this bug against the VAAPI driver?
https://github.com/intel/intel-vaapi-driver/issues/new

Closing this as NOTOURBUG.
Hi Lakshmi,

Yes, we can submit a ticket against the media driver (but at https://github.com/intel/media-driver/, not https://github.com/intel/intel-vaapi-driver). But can you please say what makes you think it's a media driver issue? The GPU hangs appeared once we moved to 4.19.5, while the user stack remained the same. So technically it now looks like a kernel regression (although of course it's possible that kernel changes revealed an issue on the media driver side).
Lakshmi, Can you please reply?
Dmitry,

Sorry for the late response. There is nothing here that indicates this is a kernel issue. I would recommend debugging the userspace.
I do agree with Dmitry that it is indeed strange that a kernel update (with the user space being the same) has resulted in these hangs. However, I do think that we should keep this ticket open, file another ticket against the media driver (with a link to this ticket), and have them suggest any ideas about what may have gone wrong. These hangs don't appear to be driver related; rather, the GPU HW itself has hung. I don't know enough, but one thing which probably changed with the kernel update is the HuC firmware, so that definitely seems to me to be something which should be looked into. Perhaps the media team can help with that?
>> There is no clue that indicates this is a kernel issue. I would recommend to debug the userspace.

Okay. Then let's close this one. Svetlana, please file a bug against https://github.com/intel/media-driver and put cross links here and in the future GitHub ticket.