Bug 97573 - [IVB/SNB/BYT/HSW/BDW] GuC boot kernel command lines are causing regressions
Summary: [IVB/SNB/BYT/HSW/BDW] GuC boot kernel command lines are causing regressions
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86 (IA32) Linux (All)
Importance: high major
Assignee: Elio
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-09-02 07:42 UTC by cprigent
Modified: 2016-10-25 09:47 UTC
CC List: 5 users

See Also:
i915 platform: BSW/CHT, BYT, IVB, SNB
i915 features: firmware/guc


Attachments
IVB_igt-basic_guc-commands_kern.log (384.40 KB, text/x-log)
2016-09-02 07:42 UTC, cprigent
IVB_igt-basic_guc-commands_output (12.95 KB, text/plain)
2016-09-02 07:46 UTC, cprigent

Description cprigent 2016-09-02 07:42:30 UTC
Created attachment 126168
IVB_igt-basic_guc-commands_kern.log

Platform: IVB
CPU: Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz (Family 6, Model 58, Stepping 9)
Motherboard version: DH77EB
GPU: Intel® HD Graphics 4000 - Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller

Software
Bios: BH7710H.86A.0096.2012.1012.1645
Linux distribution: Ubuntu 16.04 64 bits

Kernel: 4.8.0-rc4 f4f46e5 from http://cgit.freedesktop.org/drm-intel/
  commit f4f46e5544894b2198cdfd5a226ee587d9834cc4
  Author: Daniel Vetter <daniel.vetter@ffwll.ch>
  Date: Mon Aug 29 16:09:42 2016 +0200
  drm-intel-nightly: 2016y-08m-29d-14h-09m-23s UTC integration manifest
libdrm-2.4.70-2 b214b05 from git://anongit.freedesktop.org/mesa/drm
mesa: mesa-11.2.2 3a9f628 from git://anongit.freedesktop.org/mesa/mesa
cairo 1.15.2 db8a7f1 from git://anongit.freedesktop.org/cairo
xorg-server-1.18.0-535 25e4f9e from git://git.freedesktop.org/git/xorg/xserver
xf86-video-intel 2.99.917-698 71d3273 from git://git.freedesktop.org/git/xorg/driver/xf86-video-intel
libva-1.7.0-47 2ebf897 from git://git.freedesktop.org/git/vaapi/libva 
vaapi-intel-driver: 1.7.0-95 1817bee from git://git.freedesktop.org/git/vaapi/intel-driver
Intel-Gpu-Tools 1.15 572a770 from http://anongit.freedesktop.org/git/xorg/app/intel-gpu-tools.git

External screen: DELL U2311Hb

Steps:
------
1. Boot with kernel boot command lines: 
i915.enable_guc_loading=2 i915.enable_guc_submission=2
2. Execute some IGT tests

Actual results:
---------------
2. Trying to load GuC is causing 69 regressions on the basic IGT tests

Expected result:
---------------
2. Trying to load GuC has no impact
Comment 1 cprigent 2016-09-02 07:46:42 UTC
Created attachment 126169
IVB_igt-basic_guc-commands_output
Comment 2 cprigent 2016-09-02 07:49:02 UTC
Reproduced on all families before BSW (not reproduced on BSW)
Comment 3 Dave Gordon 2016-09-05 17:48:36 UTC
> Steps:
> ------
> 1. Boot with kernel boot command lines: 
i915.enable_guc_loading=2 i915.enable_guc_submission=2

Please note that these variables are marked as UNSAFE.
That means that:
(a) you shouldn't set them unless you know what you're doing, and what they mean, and
(b) the kernel is *tainted* when you do so, which in turn means,
(c) whatever happens, it's not a bug, because you asked for it.

Secondly, the number "2" for these variables means that you are telling the driver that loading and use of the GuC is *mandatory*, and the driver *must* fail startup (leaving the GPU wedged) if *for any reason* those mandatory requirements cannot be fulfilled.

Good reasons to fail here include
(a) you don't have the (correct) GuC firmware installed, or
(b) the driver doesn't know what GuC firmware would be appropriate, or
(c) you don't have a GuC!

Therefore, this part of the bug report is wrong
> Expected result:
> ---------------
> 2. Trying to load GuC has no impact

The expected result of telling the driver that it *must* use the GuC on a platform that doesn't have one, is that it will develop paranoid schizophrenia ^W^W^W log an error and leave the GPU wedged. Which, according to the attached log, is exactly what it is doing :)

Note that setting either or both of these variables to "1" is much more sensible. That means "try to load/use the GuC, but continue in execlist mode if you can't". Or even "-1", meaning "try to use the GuC *if it's supported on this platform*, but fall back to execlist mode if it isn't supported, or the firmware is missing, or there's any other reason it can't be used".
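The parameter values described in this comment can be summarised as a small decision table. The following is only an illustrative Python model, not driver code: the names `Outcome` and `resolve_guc` are hypothetical, the single `guc_usable` flag lumps together "platform has a GuC", "firmware is known" and "firmware is installed", and the meaning of 0 ("do not try the GuC at all") is an assumption that is not stated in this report.

```python
from enum import Enum


class Outcome(Enum):
    GUC = "GuC in use"
    EXECLISTS = "continued in execlist mode"
    WEDGED = "startup failed, GPU left wedged"


def resolve_guc(param: int, guc_usable: bool) -> Outcome:
    """Model the result of one enable_guc_* value (hypothetical helper)."""
    if param not in (-1, 0, 1, 2):
        raise ValueError(f"value not covered by this model: {param}")
    if param == 0:
        # Assumption: 0 disables the GuC, so the driver stays on execlists.
        return Outcome.EXECLISTS
    if guc_usable:
        return Outcome.GUC
    if param == 2:
        # 2 means "mandatory": any failure must fail startup.
        return Outcome.WEDGED
    # 1 and -1 both fall back to execlist mode when the GuC cannot be used.
    return Outcome.EXECLISTS
```

Under this model, `resolve_guc(2, False)` corresponds to the situation reported here on IVB (no GuC, mandatory use requested), while `resolve_guc(1, False)` and `resolve_guc(-1, False)` both continue in execlist mode.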
Comment 4 cprigent 2016-09-14 11:31:09 UTC
I understand the reason and your point of view but adding a kernel boot command should not break the system even if what you are doing is not expected. This is a negative case. It is happening because of "(c) you don't have a GuC!" and I don't know why this is not happening on BSW.

drivers/gpu/drm/i915/intel_guc_loader.c
What about adding a check for platforms without a GuC, so that the driver does nothing regardless of the kernel boot command line?
I ask because I will also test 1 and -1, and I will also install some firmware on platforms for which it is not intended.
Comment 5 Jani Nikula 2016-09-15 08:40:45 UTC
I think the module parameters should be no-ops on platforms that do not have GuC.
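One way to picture the "no-op on platforms without GuC" idea is a sanitising step that runs before the parameters are acted on. This is a minimal Python sketch of that idea only, not the patch that was later merged: `GucParams` and `sanitize_guc_params` are hypothetical names, and treating 0 as "disabled" is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GucParams:
    enable_guc_loading: int
    enable_guc_submission: int


def sanitize_guc_params(params: GucParams, has_guc: bool) -> GucParams:
    """Ignore the user's GuC settings when the platform has no GuC."""
    if not has_guc:
        # Assumption: 0 means "disabled", so both requests become no-ops.
        return GucParams(enable_guc_loading=0, enable_guc_submission=0)
    # On platforms with a GuC the requested values are kept unchanged.
    return params
```

With such a step, booting a platform without a GuC with both parameters set to 2 would behave as if they had not been set, instead of leaving the GPU wedged.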
Comment 6 Rodrigo Vivi 2016-09-21 18:22:47 UTC
https://patchwork.freedesktop.org/patch/111622/

Please check if this helps.
Comment 7 cprigent 2016-10-06 15:12:33 UTC
I confirm.
Thanks
Comment 8 yann 2016-10-17 14:03:33 UTC
Please retest with https://patchwork.freedesktop.org/series/13815/
Comment 9 Paulo Zanoni 2016-10-18 14:20:58 UTC
(In reply to yann from comment #8)
> Please retest with https://patchwork.freedesktop.org/series/13815/

Patch merged. I came here to close the bug but it was already closed...
Comment 10 cprigent 2016-10-25 09:47:36 UTC
I launched IGT Basic on HSW, IVB, BDW with kernel boot command lines:  i915.enable_guc_loading=2 i915.enable_guc_submission=2
There is no timeout.
I see in kernel log: 
[drm:intel_device_info_dump [i915]] i915 device info: has_guc: no
[drm:intel_guc_setup [i915]] GuC fw status: path (null), fetch NONE, load NONE

Kernel: 4.9.0-rc2 194359e from http://cgit.freedesktop.org/drm-intel/
  commit 194359e4a31ff988c7a290093820c5ef28d3752b
  Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
  Date:   Mon Oct 24 17:44:02 2016 -0200
  drm-intel-nightly: 2016y-10m-24d-19h-42m-14s UTC integration manifest
libdrm-2.4.71 9e24d0c from git://anongit.freedesktop.org/mesa/drm
mesa: mesa-12.0.0 8b06176 from git://anongit.freedesktop.org/mesa/mesa
cairo 1.15.2 db8a7f1 from git://anongit.freedesktop.org/cairo
xorg-server-1.18.99.901-80 5dcb066 from git://git.freedesktop.org/git/xorg/xserver
xf86-video-intel 2.99.917-720 388fd4a from git://git.freedesktop.org/git/xorg/driver/xf86-video-intel
libva-1.7.2-38 3b7e499 from git://git.freedesktop.org/git/vaapi/libva 
vaapi-intel-driver: 1.7.2-140 852cea1 from git://git.freedesktop.org/git/vaapi/intel-driver
IGT: intel-gpu-tools-1.16-96 93437cb from http://anongit.freedesktop.org/git/xorg/app/intel-gpu-tools.git

So closed.

