Bug 90190 - [IVB Bisected]fail to start X and gpu hang
Summary: [IVB Bisected]fail to start X and gpu hang
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: All Linux (All)
Importance: highest critical
Assignee: Mika Kuoppala
QA Contact: Intel GFX Bugs mailing list
 
Reported: 2015-04-27 06:51 UTC by lu hua
Modified: 2017-10-06 14:30 UTC



Attachments
dmesg (119.29 KB, text/plain)
2015-04-27 06:51 UTC, lu hua
Xorg.0.log (12.85 KB, text/plain)
2015-04-27 06:51 UTC, lu hua
drm/i915: Arm cmd parser with aliasng ppgtt only (1.20 KB, patch)
2015-04-29 14:55 UTC, Mika Kuoppala
drm/i915: Clear vma->bound on unbinding (963 bytes, patch)
2015-04-29 14:56 UTC, Mika Kuoppala

Description lu hua 2015-04-27 06:51:32 UTC
Created attachment 115363
dmesg

==System Environment==
--------------------------
Regression: YES

Non-working platforms: IVB

==kernel==
--------------------------
drm-intel-nightly/92bb36c80e561f82b1f4b63cc269a71833137841
commit 92bb36c80e561f82b1f4b63cc269a71833137841
Author: Jani Nikula <jani.nikula@intel.com>
Date:   Fri Apr 24 00:26:53 2015 +0300

    drm-intel-nightly: 2015y-04m-23d-21h-26m-24s UTC integration manifest

==Bug detailed description==
-----------------------------
X fails to start on IVB with the drm-intel-nightly and drm-intel-next-queued kernels. It does not happen with the drm-intel-fixes kernel.

Bisect shows that 0875546c5318c85c13d07014af5350e9000bc9e9 is the first bad commit.
commit 0875546c5318c85c13d07014af5350e9000bc9e9
Author:     Daniel Vetter <daniel.vetter@ffwll.ch>
AuthorDate: Mon Apr 20 09:04:05 2015 -0700
Commit:     Daniel Vetter <daniel.vetter@ffwll.ch>
CommitDate: Thu Apr 23 21:06:39 2015 +0200

    drm/i915: Fix up the vma aliasing ppgtt binding

    Currently we have the problem that the decision whether ptes need to
    be (re)written is splattered all over the codebase. Move all that into
    i915_vma_bind. This needs a few changes:
    - Just reuse the PIN_* flags for i915_vma_bind and do the conversion
      to vma->bound in there to avoid duplicating the conversion code all
      over.
    - We need to make binding for EXECBUF (i.e. pick aliasing ppgtt if
      around) explicit, add PIN_USER for that.
    - Two callers want to update ptes, give them a PIN_UPDATE for that.

    Of course we still want to avoid double-binding, but that should be
    taken care of:
    - A ppgtt vma will only ever see PIN_USER, so no issue with
      double-binding.
    - A ggtt vma with aliasing ppgtt needs both types of binding, and we
      track that properly now.
    - A ggtt vma without aliasing ppgtt could be bound twice. In the
      lower-level ->bind_vma functions hence unconditionally set
      GLOBAL_BIND when writing the ggtt ptes.

    There's still a bit room for cleanup, but that's for follow-up
    patches.

    v2: Fixup fumbles.

    v3: s/PIN_EXECBUF/PIN_USER/ for clearer meaning, suggested by Chris.

    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

output:
[root@x-ivb9 ~]# xinit&
[1] 3724
[root@x-ivb9 ~]#

X.Org X Server 1.17.99
Release Date: 2015-02-02
X Protocol Version 11, Revision 0
Build Operating System: Linux 3.19.0-rc7+ x86_64
Current Operating System: Linux x-ivb9 4.0.0_drm-intel-nightly_92bb36_20150424+ #38 SMP Fri Apr 24 08:59:42 EDT 2015 x86_64
Kernel command line: BOOT_IMAGE=kernels//nightly_parents/2015_04_27/drm-intel-nightly/92bb36c80e561f82b1f4b63cc269a71833137841/bzImage_x86_64 root=/dev/sda2 drm.debug=0xe hostname=x-ivb9 modules_path=kernels//nightly_parents/2015_04_27/drm-intel-nightly/92bb36c80e561f82b1f4b63cc269a71833137841/modules_x86_64/lib/modules/4.0.0_drm-intel-nightly_92bb36_20150424+ kexec_jump_back_entry=0xcccccccc
Build Date: 26 April 2015  05:20:34PM

Current version of pixman: 0.33.1
        Before reporting problems, check http://wiki.x.org
        to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
        (++) from command line, (!!) notice, (II) informational,
        (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/opt/X11R7/var/log/Xorg.0.log", Time: Mon Apr 27 02:47:22 2015
(==) Using config directory: "/etc/X11/xorg.conf.d"
(==) Using system config directory "/opt/X11R7/share/X11/xorg.conf.d"
xterm: cannot load font '-misc-fixed-medium-r-semicondensed--13-120-75-75-c-60-iso10646-1'

[root@x-ivb9 ~]# glxinfo
name of display: :0.0
libGL: OpenDriver: trying /opt/X11R7/lib/dri/tls/i965_dri.so
libGL: OpenDriver: trying /opt/X11R7/lib/dri/i965_dri.so

dmesg:
[  159.708718] [drm] stuck on render ring
[  159.709116] [drm] GPU HANG: ecode 7:0:0xabcff7fb, in glxinfo [3813], reason: Ring hung, action: reset
[  159.709118] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[  159.709119] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[  159.709119] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[  159.709120] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[  159.709121] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[  159.709140] [drm:i915_reset_and_wakeup] resetting chip
[  159.710708] drm/i915: Resetting chip after gpu hang
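
When triaging many logs, the hang line above can be picked apart mechanically. The following Python sketch assumes the exact line shape shown in this dmesg excerpt (other kernel versions may format the message differently); the function name `parse_gpu_hang` and the field names are made up for illustration.

```python
import re

# Matches only the "GPU HANG" line shape seen in this report.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>\S+), in (?P<comm>.+?) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the hang fields as a dict, or None if the line does not match."""
    m = HANG_RE.search(line)
    if m is None:
        return None
    info = m.groupdict()
    info["pid"] = int(info["pid"])
    return info
```

Applied to the second dmesg line above, this yields the ecode string, the process name (glxinfo), its pid, the reason and the action taken by the driver.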


==Reproduce steps==
---------------------------- 
1. clean boot system
2. xinit
3. glxinfo
Comment 1 lu hua 2015-04-27 06:51:53 UTC
Created attachment 115364
Xorg.0.log
Comment 2 lu hua 2015-04-27 07:21:56 UTC
On BSW, this bisect commit causes bug 90191.
Comment 3 Chris Wilson 2015-04-27 08:02:49 UTC
Please try: http://patchwork.freedesktop.org/patch/47858/
Comment 4 lu hua 2015-04-27 08:30:31 UTC
(In reply to Chris Wilson from comment #3)
> Please try: http://patchwork.freedesktop.org/patch/47858/

Which commit is the best base for this patch? I tried to apply it on the latest drm-intel-nightly:
patching file drivers/gpu/drm/i915/i915_gem.c
Hunk #1 succeeded at 3067 (offset 1 line).
Hunk #2 succeeded at 3589 (offset 1 line).
patching file drivers/gpu/drm/i915/i915_gem_execbuffer.c
patching file drivers/gpu/drm/i915/i915_gem_gtt.c
Hunk #1 FAILED at 1949.
Hunk #2 FAILED at 2813.
Hunk #3 succeeded at 2819 with fuzz 1 (offset -15 lines).
Hunk #4 FAILED at 2845.
3 out of 4 hunks FAILED -- saving rejects to file drivers/gpu/drm/i915/i915_gem_gtt.c.rej
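
To see at a glance which files need manual attention, output like the above can be summarised per file. This is a minimal Python sketch that assumes the `patching file ...` and `Hunk #N FAILED at ...` line shapes shown here; the helper name `failed_hunks` is hypothetical.

```python
import re


def failed_hunks(patch_output):
    """Map each patched file to the list of hunk numbers that FAILED."""
    result = {}
    current = None
    for line in patch_output.splitlines():
        line = line.strip()
        if line.startswith("patching file "):
            current = line[len("patching file "):]
            result[current] = []
        else:
            m = re.match(r"Hunk #(\d+) FAILED at \d+", line)
            if m and current is not None:
                result[current].append(int(m.group(1)))
    return result
```

For the output above, only drivers/gpu/drm/i915/i915_gem_gtt.c ends up with a non-empty list (hunks 1, 2 and 4).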
Comment 5 Mika Kuoppala 2015-04-29 14:55:47 UTC
Created attachment 115452
drm/i915: Arm cmd parser with aliasng ppgtt only
Comment 6 Mika Kuoppala 2015-04-29 14:56:34 UTC
Created attachment 115453
drm/i915: Clear vma->bound on unbinding
Comment 7 Mika Kuoppala 2015-04-29 14:58:34 UTC
Bits from diff that Chris sent.
Comment 8 ye.tian 2015-04-30 02:17:08 UTC
Tested on the latest nightly kernel with these two patches; the issue no longer occurs.
Comment 9 Jani Nikula 2015-04-30 10:33:04 UTC
Fixed by

commit 245054a1fe33c06ad233e0d58a27ec7b64db9284
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Tue Apr 14 17:35:22 2015 +0200

    drm/i915: Enable cmd parser to do secure batch promotion for aliasing ppgtt

in drm-intel-next-queued. Please reopen if the problem persists.
Comment 10 lu hua 2015-05-04 07:16:00 UTC
Verified. Fixed.
Comment 11 Elizabeth 2017-10-06 14:30:23 UTC
Closing old verified.

