Bug 73276

Summary: Ubuntu 12.04 GPU crash
Product: DRI
Reporter: saymon
Component: DRM/Intel
Assignee: Ben Widawsky <ben>
Status: CLOSED INVALID
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: intel-gfx-bugs, przanoni
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description                   Flags
crashdump                     none
dmesg                         none
Xorg.0.log                    none
/sys/class/drm/card0/error    none

Description saymon 2014-01-04 11:59:51 UTC
Created attachment 91480 [details]
crashdump

laptop Acer E1-572G-74508G1TMnkk
# lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 12.04.3 LTS
Release:	12.04
Codename:	precise
# uname -a
Linux laptop 3.13.0-031300rc6-generic #201312291935 SMP Mon Dec 30 00:37:05 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

In dmesg:

[   55.857037] [drm] stuck on render ring
[   55.857039] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[   55.857039] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[   55.857040] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[   55.857040] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[   55.857041] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[   55.859428] [drm:intel_pipe_set_base] *ERROR* pin & fence failed
[   55.859431] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x9a9000 ctx 0) at 0x9a9004
[   55.859433] detected fb_set_par error, error code: -5
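
For anyone triaging similar logs, the reset-status line above can be pulled apart mechanically. The following is a minimal sketch, assuming the line format is exactly what appears in this dmesg excerpt (the pattern is inferred from the log text, not taken from the kernel source); `parse_hang` and the field names are hypothetical helpers introduced only for illustration.

```python
import re

# Pattern inferred from the dmesg line in this report; other kernel versions
# may word the message differently.
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm:i915_set_reset_status\] \*ERROR\* "
    r"(?P<ring>\w+) ring hung inside bo "
    r"\((?P<bo>0x[0-9a-f]+) ctx (?P<ctx>\d+)\) at (?P<addr>0x[0-9a-f]+)"
)


def parse_hang(line):
    """Return the fields of a 'ring hung inside bo' line, or None if no match."""
    m = HANG_RE.search(line)
    if m is None:
        return None
    return {
        "timestamp": float(m.group("ts")),
        "ring": m.group("ring"),
        "bo_offset": int(m.group("bo"), 16),
        "ctx": int(m.group("ctx")),
        "hang_addr": int(m.group("addr"), 16),
    }


sample = (
    "[   55.859431] [drm:i915_set_reset_status] *ERROR* render ring hung "
    "inside bo (0x9a9000 ctx 0) at 0x9a9004"
)
info = parse_hang(sample)
# Distance of the hang address from the start of the buffer object, in bytes.
print(info["ring"], hex(info["hang_addr"] - info["bo_offset"]))
```

For the line in this report, the hang address is 4 bytes past the start of the buffer object on the render ring.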
Comment 1 saymon 2014-01-04 12:02:38 UTC
lspci 
00:00.0 Host bridge: Intel Corporation Haswell-ULT DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 09)
00:03.0 Audio device: Intel Corporation Device 0a0c (rev 09)
00:14.0 USB controller: Intel Corporation Lynx Point-LP USB xHCI HC (rev 04)
00:16.0 Communication controller: Intel Corporation Lynx Point-LP HECI #0 (rev 04)
00:1b.0 Audio device: Intel Corporation Lynx Point-LP HD Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation Lynx Point-LP PCI Express Root Port 3 (rev e4)
00:1c.3 PCI bridge: Intel Corporation Lynx Point-LP PCI Express Root Port 4 (rev e4)
00:1c.4 PCI bridge: Intel Corporation Lynx Point-LP PCI Express Root Port 5 (rev e4)
00:1d.0 USB controller: Intel Corporation Lynx Point-LP USB EHCI #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation Lynx Point-LP LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation Lynx Point-LP SATA Controller 1 [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation Lynx Point-LP SMBus Controller (rev 04)
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM57786 Gigabit Ethernet PCIe (rev 01)
01:00.1 SD Host controller: Broadcom Corporation BCM57765/57785 SDXC/MMC Card Reader (rev 01)
02:00.0 Network controller: Qualcomm Atheros QCA9565 / AR9565 Wireless Network Adapter (rev 01)
03:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Mars [Radeon HD 8670A/8750M]

cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 69
model name	: Intel(R) Core(TM) i7-4500U CPU @ 1.80GHz
stepping	: 1
microcode	: 0x10
cpu MHz		: 2410.312
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
bogomips	: 4788.99
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 69
model name	: Intel(R) Core(TM) i7-4500U CPU @ 1.80GHz
stepping	: 1
microcode	: 0x10
cpu MHz		: 2401.312
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
bogomips	: 4788.99
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 2
vendor_id	: GenuineIntel
cpu family	: 6
model		: 69
model name	: Intel(R) Core(TM) i7-4500U CPU @ 1.80GHz
stepping	: 1
microcode	: 0x10
cpu MHz		: 2400.093
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 2
apicid		: 2
initial apicid	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
bogomips	: 4788.99
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 69
model name	: Intel(R) Core(TM) i7-4500U CPU @ 1.80GHz
stepping	: 1
microcode	: 0x10
cpu MHz		: 2400.000
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 2
apicid		: 3
initial apicid	: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
bogomips	: 4788.99
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:
Comment 2 Paulo Zanoni 2014-01-06 11:38:12 UTC
Hi

What were you doing when this crash happened? Can you reliably reproduce it? How?

Did you try other Kernels? Does any other Kernel work?

Thanks,
Paulo
Comment 3 Daniel Vetter 2014-01-08 16:55:59 UTC
blt commands in the render ring on hsw. Impressive.

Chris, when did we stop that madness from shipping?
Comment 4 Chris Wilson 2014-01-08 17:28:37 UTC
Honestly, I didn't think this was possible with any of our releases. Historically, we would only send commands to a recognised GPU, which meant that for HSW we would have known about the split rings. I smell external factors.
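
As background on how a BLT command can be spotted in a ring dump: a rough sketch, assuming the instruction client lives in bits 31:29 of the command header dword, with 0 for MI, 2 for the blitter and 3 for the render pipeline (this matches my reading of the INSTR_*_CLIENT defines in the i915 headers, but treat it as an assumption). The sample dwords are constructed for illustration and are not taken from the attached crash dump; `instr_client` is a hypothetical helper.

```python
# Assumed layout: client field in bits 31:29 of the command header dword.
INSTR_CLIENT_SHIFT = 29
INSTR_CLIENT_MASK = 0x7
CLIENT_NAMES = {0x0: "MI", 0x2: "BLT (2D)", 0x3: "render (3D)"}


def instr_client(dword):
    """Name the client encoded in a 32-bit command header dword."""
    client = (dword >> INSTR_CLIENT_SHIFT) & INSTR_CLIENT_MASK
    return CLIENT_NAMES.get(client, "unknown (%d)" % client)


# Illustrative dwords only: low bits are arbitrary, top bits set the client.
mi_noop = 0x00000000
blt_like = (0x2 << 29) | 0x6
render_like = (0x3 << 29) | 0x3

for dword in (mi_noop, blt_like, render_like):
    print("0x%08x -> %s" % (dword, instr_client(dword)))
```

On hardware with split rings, a dword that decodes as a BLT client inside a render-ring batch is the kind of mismatch described above.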
Comment 5 Chris Wilson 2014-01-10 20:01:54 UTC
Can you please attach your Xorg.0.log, dmesg and glxinfo?
Comment 6 Daniel Vetter 2014-01-14 13:46:07 UTC
Please boot with drm.debug=0xe to capture the dmesg with additional driver information.
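
For reference, drm.debug is a bitmask. A minimal sketch of what 0xe selects, assuming the category bits of kernels from this period (0x1 core, 0x2 driver, 0x4 KMS, 0x8 PRIME); later kernels add more bits, and `decode_drm_debug` is a hypothetical helper used only for illustration.

```python
# Assumed drm.debug category bits for kernels around 3.13.
DRM_DEBUG_BITS = {0x01: "core", 0x02: "driver", 0x04: "kms", 0x08: "prime"}


def decode_drm_debug(mask):
    """List the debug categories enabled by a drm.debug bitmask."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


print(decode_drm_debug(0xE))
```

Under that assumption, 0xe enables driver, KMS and PRIME messages while leaving the very verbose core messages off.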
Comment 7 Hohahiu 2014-01-22 01:09:21 UTC
I have a similar error in dmesg while using kernel 3.13.0 and earlier rc versions:

[   42.037883] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[   42.037884] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[   42.037885] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[   42.037886] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[   42.037887] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.

But my hardware is different:

#lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Chelsea LP [Radeon HD 7730M] (rev ff)

# cpuinfo 
Intel(R) Processor information utility, Version 4.1 Update 1 Build 20130522
Copyright (C) 2005-2013 Intel Corporation.  All rights reserved.

=====  Processor composition  =====
Processor name    : Intel(R) Core(TM) i7-3610QM  

Should I open a new bug report for this?
Comment 8 Hohahiu 2014-01-22 01:12:03 UTC
Created attachment 92555 [details]
dmesg
Comment 9 Hohahiu 2014-01-22 01:16:57 UTC
Created attachment 92556 [details]
Xorg.0.log

Forgot to mention that my OS is openSUSE with kernel 3.13.0.
Mesa, libdrm, xf86-video-intel are from git.
X server is 1.15.0.
Comment 10 Hohahiu 2014-01-22 01:19:05 UTC
Created attachment 92557 [details]
/sys/class/drm/card0/error
Comment 11 Chris Wilson 2014-01-22 09:19:37 UTC
(In reply to comment #7)
> I have a similar error in dmesg while using kernel 3.13.0 and earlier rc
> versions:

Oh no you don't! In fact you have bug 73261.
Comment 12 Hohahiu 2014-01-22 22:30:39 UTC
(In reply to comment #11)
> (In reply to comment #7)
> > I have a similar error in dmesg while using kernel 3.13.0 and earlier rc
> > versions:
> 
> Oh no you don't! In fact you have bug 73261,

So should I attach all these files to that bug report?
My kernel version and GPU seem to be different. Also I didn't have this error message while using 3.12:
[   51.015929] [drm] stuck on render ring
[   51.015937] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[   51.015939] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[   51.015940] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[   51.015941] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[   51.015942] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Comment 13 Chris Wilson 2014-01-23 13:01:37 UTC
(In reply to comment #12)
> (In reply to comment #11)
> > (In reply to comment #7)
> > > I have a similar error in dmesg while using kernel 3.13.0 and earlier rc
> > > versions:
> > 
> > Oh no you don't! In fact you have bug 73261,
> 
> So should I attach all these files to that bug report?

I would say there was not a significant difference in the error state to worry about. Most relevant are your machine details (manufacturer, model, BIOS) so that we have some idea of how many different machines are affected.
Comment 14 Hohahiu 2014-01-24 01:17:04 UTC
(In reply to comment #13)
> (In reply to comment #12)
> > (In reply to comment #11)
> > > (In reply to comment #7)
> > > > I have a similar error in dmesg while using kernel 3.13.0 and earlier rc
> > > > versions:
> > > 
> > > Oh no you don't! In fact you have bug 73261,
> > 
> > So should I attach all these files to that bug report?
> 
> I would say there was not a significant difference in the error state to
> worry about. Most relevant are your machine details (manufacturer, model,
> bios) so that we have some idea of how many different machines are affected.

Oh, I got it. So my laptop is a Hewlett-Packard HP ENVY 15 3200 CTO, BIOS F.08 08/15/2012 (according to dmesg) or F.0B (according to the HP web site).
Given how rarely HP updates the BIOS on its laptops, it's unlikely that they will release a new version.
Comment 15 Ben Widawsky 2014-01-29 18:50:57 UTC
i915.i915_enable_rc6=7 is not well supported. Please do not use it.

Can you reproduce without this command line addition?
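
For context on why 7 is the aggressive setting: a small sketch, assuming the bitmask meaning given in the module parameter description for kernels of this period (1 = rc6, 2 = deep rc6, 4 = deepest rc6, so 7 enables everything). The command line string below is a made-up example, not the one from the attached dmesg, and both helper functions are hypothetical.

```python
# Assumed meaning of the i915_enable_rc6 bits (kernels around 3.13).
RC6_STAGES = {1: "rc6", 2: "deep rc6", 4: "deepest rc6"}


def rc6_from_cmdline(cmdline):
    """Return the i915.i915_enable_rc6 value from a kernel command line, or None."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "i915.i915_enable_rc6" and sep:
            return int(value, 0)
    return None


def decode_rc6(mask):
    """List the RC6 stages a positive bitmask enables (empty for 0 or negative)."""
    if mask is None or mask <= 0:
        return []
    return [name for bit, name in sorted(RC6_STAGES.items()) if mask & bit]


# Hypothetical command line for illustration; on a live system the real one
# can be read from /proc/cmdline.
example = "root=/dev/sda2 quiet splash i915.i915_enable_rc6=7"
print(decode_rc6(rc6_from_cmdline(example)))
```

Removing the parameter entirely lets the driver fall back to its per-chip default, which is what the question above is asking to be tested.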
Comment 16 Hohahiu 2014-01-31 01:55:48 UTC
(In reply to comment #15)
> i915.i915_enable_rc6=7 is not well supported. Please do not use it.
> 
> Can you reproduce without this command line addition?

Just tried it. Nothing has changed, the GPU is still crashing. Also, the 3.11 and 3.12 kernels worked fine. At least these kinds of errors didn't show up. What else can I do in order to help fix this bug?
Comment 17 Ben Widawsky 2014-02-02 09:54:13 UTC
(In reply to comment #16)
> (In reply to comment #15)
> > i915.i915_enable_rc6=7 is not well supported. Please do not use it.
> > 
> > Can you reproduce without this command line addition?
> 
> Just tried it. Nothing has changed, GPU is still crashing. Also 3.11 and
> 3.12 kernels worked fine. At least these kinds of errors didn't show up. What
> else can I do in order to help fix this bug?

I missed that you're not the original filer. This is not necessarily the same bug, and as you've pointed out, haven't attached the requisite files. Please file a new bug, and if they do indeed turn out to be the same we'll mark them as duplicates later.

Thanks.

Saymon, are you still here?
Comment 18 Ben Widawsky 2014-02-12 02:36:34 UTC
Saymon, are you gone? If so we'll close this, and Hohahiu can open a new bug report.
Comment 19 Hohahiu 2014-02-19 00:36:27 UTC
(In reply to comment #18)
> Saymon, are you gone? If so we'll close this, and Hohahiu can open a new bug
> report.

It looks like the problem is resolved for me after updating to kernel 3.14-rc3.
Comment 20 Hohahiu 2014-02-26 00:54:56 UTC
I was wrong. The bug is still there. I filed a new bug report (https://bugs.freedesktop.org/show_bug.cgi?id=75514).
Comment 21 Chris Wilson 2014-02-26 10:31:57 UTC
We still have no information on what software was being used in the original bug that inserted BLT commands into the render ring.
