Bug 64764

Summary: [845G regression] crtc active state doesn't match with hw state
Product: DRI Reporter: mlsemon35
Component: DRM/IntelAssignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: minor    
Priority: medium    
Version: unspecified   
Hardware: x86 (IA32)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
dmesg #1 from kernel 3.9.3, standard framebuffer settings
none
boot, check E-mail in X11, go back go console, shut monitor off, wait...
none
Additional dump of regs from fresh boot, xset command not used
none
quirk the pipe A quirk none

Description mlsemon35 2013-05-19 18:14:05 UTC
Hi!  For a while, I've been battling a problem that I call "xfstests + DRM + console blanking = bad results that show when lock debugging is turned on in the kernel."  This applies strongly to the i845g framebuffer and somewhat to the old i810 framebuffer as well.  However, everybody's been hard at work fixing bugs for the past 6-8 months, and helpful tty fixes have been rolling into stable kernels since 3.8.8.  What's left on i845g are two closely related "soft" oopses when the PC is left alone in console mode for long periods of time.  At this point, it happens reliably on an idle system.

If I knew how to keep the i915 DRM framebuffer setup from using power management on the monitor, I wouldn't bother filing a bug report.  This could go into the category of "informative message, don't even worry about this ancient hardware."

The kernel here is a recent kernel git with an XFS git merged in--advertised as "3.10.0-rc1+"--plus a few file system patches.  The installed graphics packages in this slackware-current (not updated this week) are these:

libdrm-2.4.42-i486-1
intel-gpu-tools-1.3-i486-1
xf86-video-intel-2.21.6-i486-2
mesa-9.1.1-i486-1

For the next week, a kernel bisect would be a bad idea that would wreak havoc here.  However, I could be convinced to pull in some DRM git code, if it would help and still be within the range of a novice git user.

Anyway, a push in the right direction would be appreciated.  Thanks!

Michael

[Note: The PC was rebooted between the first oops and the second oops.]

------------[ cut here ]------------
WARNING: at drivers/gpu/drm/i915/intel_display.c:7942 intel_modeset_check_state+0x512/0x714()
crtc active state doesn't match with hw state (expected 0, found 1)
CPU: 0 PID: 16362 Comm: kworker/0:0 Not tainted 3.10.0-rc1+ #11
Hardware name: Dell Computer Corporation Dimension 2350/07W080, BIOS A01 12/17/2002
Workqueue: events console_callback
 e0e61b90 e0e61b90 e0e61b58 c14cb944 e0e61b80 c1025bd1 c163e148 e0e61bac
 00001f06 c130e9b8 c130e9b8 ef414000 eed70654 ef53a000 e0e61b98 c1025c38
 00000009 e0e61b90 c163e148 e0e61bac e0e61da8 c130e9b8 c163d05c 00001f06
Call Trace:
 [<c14cb944>] dump_stack+0x16/0x18
 [<c1025bd1>] warn_slowpath_common+0x54/0x6b
 [<c130e9b8>] ? intel_modeset_check_state+0x512/0x714
 [<c130e9b8>] ? intel_modeset_check_state+0x512/0x714
 [<c1025c38>] warn_slowpath_fmt+0x2e/0x30
 [<c130e9b8>] intel_modeset_check_state+0x512/0x714
 [<c1311f70>] intel_crt_dpms+0x63/0x87
 [<c12cc905>] drm_fb_helper_dpms.isra.9+0xcc/0x114
 [<c10401a1>] ? __blocking_notifier_call_chain+0x42/0x56
 [<c12cd740>] drm_fb_helper_blank+0x58/0x5f
 [<c1265eb5>] fb_blank+0x45/0x8b
 [<c126faf9>] fbcon_blank+0x21d/0x2be
 [<c10438ef>] ? sched_clock_local.constprop.3+0x39/0x131
 [<c10446e2>] ? update_curr.constprop.48+0x55/0x95
 [<c1044906>] ? dequeue_task_fair+0xbd/0xc5
 [<c1041f35>] ? dequeue_task+0x39/0x40
 [<c14ce8bc>] ? __schedule+0x192/0x3e7
 [<c12b79bd>] do_blank_screen+0x112/0x21d
 [<c12b7fc9>] console_callback+0x5a/0x145
 [<c1037483>] process_one_work+0xd7/0x305
 [<c10366c0>] ? destroy_worker+0x5c/0x88
 [<c10370be>] ? manage_workers.isra.33+0x18a/0x219
 [<c103779b>] worker_thread+0xea/0x2f8
 [<c10376b1>] ? process_one_work+0x305/0x305
 [<c103c113>] kthread+0x8e/0x90
 [<c14cfcb7>] ret_from_kernel_thread+0x1b/0x28
 [<c103c085>] ? __kthread_parkme+0x58/0x58
---[ end trace 983ed25759656968 ]---

------------[ cut here ]------------
WARNING: at drivers/gpu/drm/i915/intel_display.c:7942 intel_modeset_check_state+0x517/0x719()
crtc active state doesn't match with hw state (expected 0, found 1)
CPU: 0 PID: 3558 Comm: kworker/0:2 Not tainted 3.10.0-rc1+ #28
Hardware name: Dell Computer Corporation Dimension 2350/07W080, BIOS A01 12/17/2002
Workqueue: events console_callback
 de9f7b38 de9f7b38 de9f7b00 c158bd4b de9f7b28 c1029e91 c1766824 de9f7b54
 00001f06 c13a00b5 c13a00b5 eec15000 eecf34c0 ee2cc000 de9f7b40 c1029f02
 00000009 de9f7b38 c1766824 de9f7b54 de9f7d50 c13a00b5 c1765738 00001f06
Call Trace:
 [<c158bd4b>] dump_stack+0x16/0x18
 [<c1029e91>] warn_slowpath_common+0x59/0x70
 [<c13a00b5>] ? intel_modeset_check_state+0x517/0x719
 [<c13a00b5>] ? intel_modeset_check_state+0x517/0x719
 [<c1029f02>] warn_slowpath_fmt+0x33/0x35
 [<c13a00b5>] intel_modeset_check_state+0x517/0x719
 [<c13a3727>] intel_crt_dpms+0x68/0x8c
 [<c1369fec>] ? drm_modeset_lock_all+0x3d/0x4c
 [<c1358eb5>] drm_fb_helper_dpms+0xd5/0x11d
 [<c104ba26>] ? __blocking_notifier_call_chain+0x47/0x5b
 [<c1359de1>] drm_fb_helper_blank+0x22/0x53
 [<c12ea0e9>] fb_blank+0x4a/0x90
 [<c12f42cc>] fbcon_blank+0x220/0x2c1
 [<c10528bb>] ? update_curr.constprop.41+0x9f/0x162
 [<c104b56b>] ? down+0x13/0x3e
 [<c134299e>] ? console_callback+0x11/0x154
 [<c13423bd>] do_blank_screen+0x11c/0x227
 [<c13429ec>] console_callback+0x5f/0x154
 [<c1040e05>] process_one_work+0x18e/0x427
 [<c1040dc8>] ? process_one_work+0x151/0x427
 [<c1041191>] worker_thread+0xf3/0x316
 [<c104109e>] ? process_one_work+0x427/0x427
 [<c104694b>] kthread+0x9e/0xa0
 [<c1592bb7>] ret_from_kernel_thread+0x1b/0x28
 [<c10468ad>] ? __kthread_parkme+0x5b/0x5b
---[ end trace 3c7867f395b5fea7 ]---
Comment 1 Daniel Vetter 2013-05-20 18:27:04 UTC
Bisecting won't be useful since this check has only been added in 3.7. So very likely the issue has existed since forever.

Two things:
- Does anything wrt modesetting not work on your machine?
- Can you please compile the latest stable kernel (just so that we have all the latest debug output), boot with drm.debug=0xe added to your kernel commandline and attach the complete dmesg after you've reproduced the issue? Please make sure that the dmesg contains everything since boot-up, you might need to stich it together from logfiles or increase the buffer size with the log_buf_len kernel option (e.g. log_buf_len=4M).
Comment 2 mlsemon35 2013-05-20 19:56:33 UTC
(In reply to comment #1)
> Bisecting won't be useful since this check has only been added in 3.7. So
> very likely the issue has existed since forever.
> 
> Two things:
> - Does anything wrt modesetting not work on your machine?
> - Can you please compile the latest stable kernel (just so that we have all
> the latest debug output), boot with drm.debug=0xe added to your kernel
> commandline and attach the complete dmesg after you've reproduced the issue?
> Please make sure that the dmesg contains everything since boot-up, you might
> need to stich it together from logfiles or increase the buffer size with the
> log_buf_len kernel option (e.g. log_buf_len=4M).

WRT modesetting, I'm happy with it.  Bouncing back and forth from X to console and back, doing silly things like playing with the OpenGL screen savers and such, the GPU gets wedged so rarely now that it's effectively unreproducible.  It never crashes or locks up.  Also, KMS gripes about a bad monitor EDID from time to time, not nearly as much as it used to do it.  As far as booting to a console goes, just straight KMS with no interference from X11, it seems very good, other than for the console blanking (wakeup?) issue.  On older kernels, it used to oops in such situations.  Now, it gives something much more benign.

I'll try to brew up a 3.9.3 kernel.  Bandwidth is limited here, so I'll have to check it out from a git that's already here.
Comment 3 mlsemon35 2013-05-20 21:04:05 UTC
Created attachment 79591 [details]
dmesg #1 from kernel 3.9.3, standard framebuffer settings
Comment 4 mlsemon35 2013-05-20 21:05:45 UTC
(In reply to comment #3)
> Created attachment 79591 [details]
> dmesg #1 from kernel 3.9.3, standard framebuffer settings

This is from a linux-3.9.tar.xz tarball + patch-3.9.3.xz.  The setup is a simple `make oldnoconfig` of my normal setup, only the kernel log file size was set to "21" because it wouldn't let me enter a higher number.
Comment 5 Daniel Vetter 2013-05-20 21:30:52 UTC
Hm, that dmesg doesn't contain the WARN with the backtrace. I need all the debug noise up to the first WARN to be able to reconstruct what (and hopefully why) things exactly blew up.
Comment 6 mlsemon35 2013-05-21 00:37:49 UTC
(In reply to comment #5)
> Hm, that dmesg doesn't contain the WARN with the backtrace. I need all the
> debug noise up to the first WARN to be able to reconstruct what (and
> hopefully why) things exactly blew up.

Whoops.  [I'm smashing a pie to my own face so you don't have to do it.]  I was reading too fast and sent you just the initial dmesg, not everything leading to the soft oops.  I'll get you another dmesg tomorrow morning.  Usually, this kicks up when the PC is left overnight, two or three nights straight on 3.10.0-rc1.  I'll try to remember to reboot to the 3.9.3 kernel to collect it.

As for that first backtrace, that popped up with the debug stuff, maybe a self-test of the kmemleak facility.  If you'd like me to shut some of these off, let me know:

mls:~$ zcat /proc/config.gz | grep DEBUG | grep -v "^#" | grep -v HAVE | grep -v SUPPORT
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_PNP_DEBUG_MESSAGES=y
CONFIG_XFS_DEBUG=y
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_KERNEL=y
CONFIG_SCHED_DEBUG=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=400
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_PI_LIST=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_DEBUG_LOCKING_API_SELFTESTS=y
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_LIST=y
CONFIG_ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS=y
CONFIG_DEBUG_PAGEALLOC=y
CONFIG_WANT_PAGE_DEBUG_FLAGS=y
CONFIG_DYNAMIC_DEBUG=y
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_DEBUG_RODATA=y
CONFIG_DEBUG_RODATA_TEST=y
CONFIG_KEYS_DEBUG_PROC_KEYS=y
Comment 7 mlsemon35 2013-05-21 18:30:01 UTC
Try #1: Failure.  A simple boot to console with kernel 3.9.3 and leaving it to make a slow download and do overnight cron jobs does not cause the problem to happen.  Good for kernel 3.9.3, bad for reproducing the problem.

The current kernel here is 3.10.0-rc2+ with only one XFS patchset to manage, so the ability to bisect is a little better now...should it come to that.
Comment 8 Daniel Vetter 2013-05-21 18:45:30 UTC
The dmesg from 3.10 should also be good - I've only asked for 3.9 since sometimes bug reporters freak out a bit when asked to test -rcs ;-)
Comment 9 mlsemon35 2013-05-22 17:30:07 UTC
Created attachment 79675 [details]
boot, check E-mail in X11, go back go console, shut monitor off, wait...

OK, the issue seems to be narrowed down to the powerdown section.  The Slackware /etc/rc.d/rc.M has this in it:

/bin/setterm -blank 15 -powersave powerdown -powerdown 60

Perhaps this isn't the best setting on my part, to power down a monitor that is already off at the time the code is executed?

Anyway, perhaps X11 doesn't leave the monitor in the best state to be shut off.  The lockdep you see is from me mounting an F2FS-formatted USB stick in order to do a slow download.  The dmesg timestamp of 27862.877640 is from me turning on the monitor the next morning, then hitting the Alt key.

Thanks!

Michael
Comment 10 mlsemon35 2013-05-22 17:34:41 UTC
Reading that dmesg again, maybe I didn't use X11 at all there.  Sorry.
Comment 11 Daniel Vetter 2013-05-22 17:41:12 UTC
(In reply to comment #9)
> Anyway, perhaps X11 doesn't leave the monitor in the best state to be shut
> off.  The lockdep you see is from me mounting an F2FS-formatted USB stick in
> order to do a slow download.  The dmesg timestamp of 27862.877640 is from me
> turning on the monitor the next morning, then hitting the Alt key.

Hm, I didn't see any signs of X running. Is that true? Also what happens when you do a dpms off in X:

xset dpms force off
Comment 12 mlsemon35 2013-05-22 20:38:04 UTC
(In reply to comment #11)
> (In reply to comment #9)
> > Anyway, perhaps X11 doesn't leave the monitor in the best state to be shut
> > off.  The lockdep you see is from me mounting an F2FS-formatted USB stick in
> > order to do a slow download.  The dmesg timestamp of 27862.877640 is from me
> > turning on the monitor the next morning, then hitting the Alt key.
> 
> Hm, I didn't see any signs of X running. Is that true? Also what happens
> when you do a dpms off in X:

In the submitted dmesg, I may not have run X11.  I forgot that I had good luck and actually reproduced the problem without X11.  There was a rather confusing X11-based dmesg gathered as well, and I must have thought I was working off of that.  Early mornings...

> xset dpms force off

The monitor goes blank.  The power light on the monitor stays green and does not turn off or turn orange.  This is added to the dmesg output:

[10323.560048] [drm:intel_modeset_check_state], [ENCODER:5:DAC-5]
[10323.560060] [drm:intel_modeset_check_state], [CRTC:3]
[10323.560067] ------------[ cut here ]------------
[10323.560084] WARNING: at drivers/gpu/drm/i915/intel_display.c:7942 intel_modeset_check_state+0x517/0x719()
[10323.560089] crtc active state doesn't match with hw state (expected 0, found 1)
[10323.560098] CPU: 0 PID: 812 Comm: X Tainted: G        W    3.10.0-rc2+ #32
[10323.560102] Hardware name: Dell Computer Corporation Dimension 2350/07W080, BIOS A01 12/17/2002
[10323.560108]  e5b1dba8 e5b1dba8 e5b1db70 c158b3eb e5b1db98 c1029e91 c1765888 e5b1dbc4
[10323.560125]  00001f06 c139f755 c139f755 eec15000 eecf34c0 ee2cc000 e5b1dbb0 c1029f02
[10323.560142]  00000009 e5b1dba8 c1765888 e5b1dbc4 e5b1ddc0 c139f755 c176479c 00001f06
[10323.560159] Call Trace:
[10323.560171]  [<c158b3eb>] dump_stack+0x16/0x18
[10323.560185]  [<c1029e91>] warn_slowpath_common+0x59/0x70
[10323.560195]  [<c139f755>] ? intel_modeset_check_state+0x517/0x719
[10323.560203]  [<c139f755>] ? intel_modeset_check_state+0x517/0x719
[10323.560211]  [<c1029f02>] warn_slowpath_fmt+0x33/0x35
[10323.560220]  [<c139f755>] intel_modeset_check_state+0x517/0x719
[10323.560230]  [<c13a2dc7>] intel_crt_dpms+0x68/0x8c
[10323.560239]  [<c136d179>] drm_mode_obj_set_property_ioctl+0x35e/0x379
[10323.560248]  [<c10c029a>] ? might_fault+0x3e/0x94
[10323.560255]  [<c136d194>] ? drm_mode_obj_set_property_ioctl+0x379/0x379
[10323.560263]  [<c136d1c7>] drm_mode_connector_property_set_ioctl+0x33/0x3a
[10323.560271]  [<c13602b1>] drm_ioctl+0x4c6/0x56c
[10323.560279]  [<c136d194>] ? drm_mode_obj_set_property_ioctl+0x379/0x379
[10323.560288]  [<c1066115>] ? lock_release_non_nested+0xa1/0x2b7
[10323.560295]  [<c135fdeb>] ? drm_lastclose+0x22b/0x22b
[10323.560303]  [<c10e2bc0>] do_vfs_ioctl+0x2f6/0x4cc
[10323.560310]  [<c10c029a>] ? might_fault+0x3e/0x94
[10323.560317]  [<c10c02cd>] ? might_fault+0x71/0x94
[10323.560324]  [<c10c02e4>] ? might_fault+0x88/0x94
[10323.560332]  [<c1002664>] ? do_device_not_available+0x27/0x2a
[10323.560340]  [<c1591bd3>] ? restore_all+0xf/0xf
[10323.560347]  [<c10e2dd5>] SyS_ioctl+0x3f/0x6a
[10323.560354]  [<c1591ba0>] syscall_call+0x7/0xb
[10323.560359] ---[ end trace 6ba5403cd5f92711 ]---
[10323.560376] [drm:drm_mode_setcrtc], [CRTC:3]
[10323.560384] [drm:intel_crtc_set_config], [CRTC:3] [NOFB]
[10323.560393] [drm:intel_modeset_stage_output_state], [CONNECTOR:4:VGA-1] to [NOCRTC]
[10323.560399] [drm:intel_modeset_stage_output_state], encoder changed, full mode switch
[10323.560404] [drm:intel_modeset_stage_output_state], crtc changed, full mode switch
[10323.560410] [drm:__intel_set_mode], set mode pipe masks: modeset: 0, prepare: 0, disable: 1
[10323.560423] [drm:intel_modeset_check_state], [ENCODER:5:DAC-5]
[10323.560429] [drm:intel_modeset_check_state], [CRTC:3]
[10323.560434] ------------[ cut here ]------------
[10323.560444] WARNING: at drivers/gpu/drm/i915/intel_display.c:7942 intel_modeset_check_state+0x517/0x719()
[10323.560449] crtc active state doesn't match with hw state (expected 0, found 1)
[10323.560455] CPU: 0 PID: 812 Comm: X Tainted: G        W    3.10.0-rc2+ #32
[10323.560461] Hardware name: Dell Computer Corporation Dimension 2350/07W080, BIOS A01 12/17/2002
[10323.560465]  e5b1db18 e5b1db18 e5b1dae0 c158b3eb e5b1db08 c1029e91 c1765888 e5b1db34
[10323.560482]  00001f06 c139f755 c139f755 eec15000 eecf34c0 ee2cc000 e5b1db20 c1029f02
[10323.560498]  00000009 e5b1db18 c1765888 e5b1db34 e5b1dd30 c139f755 c176479c 00001f06
[10323.560514] Call Trace:
[10323.560522]  [<c158b3eb>] dump_stack+0x16/0x18
[10323.560533]  [<c1029e91>] warn_slowpath_common+0x59/0x70
[10323.560543]  [<c139f755>] ? intel_modeset_check_state+0x517/0x719
[10323.560551]  [<c139f755>] ? intel_modeset_check_state+0x517/0x719
[10323.560558]  [<c1029f02>] warn_slowpath_fmt+0x33/0x35
[10323.560567]  [<c139f755>] intel_modeset_check_state+0x517/0x719
[10323.560576]  [<c139f9dd>] intel_set_mode+0x30/0x39
[10323.560584]  [<c13a0083>] intel_crtc_set_config+0x5bc/0x7f8
[10323.560592]  [<c136b14a>] drm_mode_set_config_internal+0x1c/0x3f
[10323.560599]  [<c136b379>] drm_mode_setcrtc+0xde/0x5d9
[10323.560606]  [<c10c029a>] ? might_fault+0x3e/0x94
[10323.560613]  [<c10c02cd>] ? might_fault+0x71/0x94
[10323.560620]  [<c136b29b>] ? drm_framebuffer_remove+0x12e/0x12e
[10323.560627]  [<c136b29b>] ? drm_framebuffer_remove+0x12e/0x12e
[10323.560634]  [<c13602b1>] drm_ioctl+0x4c6/0x56c
[10323.560641]  [<c136b29b>] ? drm_framebuffer_remove+0x12e/0x12e
[10323.560648]  [<c135fdeb>] ? drm_lastclose+0x22b/0x22b
[10323.560655]  [<c10e2bc0>] do_vfs_ioctl+0x2f6/0x4cc
[10323.560662]  [<c10c029a>] ? might_fault+0x3e/0x94
[10323.560669]  [<c10c02cd>] ? might_fault+0x71/0x94
[10323.560676]  [<c10c02e4>] ? might_fault+0x88/0x94
[10323.560683]  [<c1002664>] ? do_device_not_available+0x27/0x2a
[10323.560690]  [<c1591bd3>] ? restore_all+0xf/0xf
[10323.560697]  [<c10e2dd5>] SyS_ioctl+0x3f/0x6a
[10323.560704]  [<c1591ba0>] syscall_call+0x7/0xb
[10323.560708] ---[ end trace 6ba5403cd5f92712 ]---
[10323.560796] [drm:drm_mode_setcrtc], [CRTC:3]
[10323.560804] [drm:intel_crtc_set_config], [CRTC:3] [NOFB]
[10323.583784] [drm:drm_mode_setcrtc], [CRTC:3]
[10323.583811] [drm:drm_mode_setcrtc], [CONNECTOR:4:VGA-1]
[10323.583819] [drm:intel_crtc_set_config], [CRTC:3] [FB:19] #connectors=1 (x y) (0 0)
[10323.583827] [drm:intel_set_config_compute_mode_changes], crtc has no fb, full mode set
[10323.583833] [drm:intel_modeset_stage_output_state], encoder changed, full mode switch
[10323.583839] [drm:intel_modeset_stage_output_state], [CONNECTOR:4:VGA-1] to [CRTC:3]
[10323.583844] [drm:intel_modeset_stage_output_state], crtc changed, full mode switch
[10323.583849] [drm:intel_crtc_set_config], attempting to set mode from userspace
[10323.583854] [drm:drm_mode_debug_printmodeline], Modeline 20:"" 0 108000 1280 1328 1440 1688 1024 1025 1028 1066 0x0 0x5
[10323.583868] [drm:intel_modeset_pipe_config], [CRTC:3]
[10323.583873] [drm:intel_modeset_pipe_config], plane bpp: 24, pipe bpp: 24, dithering: 0
[10323.583878] [drm:__intel_set_mode], set mode pipe masks: modeset: 1, prepare: 1, disable: 0
[10323.588443] [drm:i9xx_crtc_mode_set], Mode for pipe A:
[10323.588449] [drm:drm_mode_debug_printmodeline], Modeline 0:"" 0 108000 1280 1328 1440 1688 1024 1025 1028 1066 0x0 0x5
[10323.660035] [drm:intel_wait_for_vblank], vblank wait timed out
[10323.660053] [drm:i9xx_update_plane], Writing base 00800000 00000000 0 0 8192
[10323.660072] [drm:intel_crtc_mode_set], [ENCODER:5:DAC-5] set [MODE:0:]
[10323.660082] [drm:i845_get_fifo_size], FIFO size - (0x0000005f) A: 23
[10323.660088] [drm:intel_calculate_wm], FIFO entries required for mode: 68
[10323.660094] [drm:intel_calculate_wm], FIFO watermark level: -47
[10323.660099] [drm:i830_update_wm], Setting FIFO watermarks - A: 1
[10323.740033] [drm:intel_wait_for_vblank], vblank wait timed out
[10323.740092] [drm:intel_connector_check_state], [CONNECTOR:4:VGA-1]
[10323.740101] [drm:intel_modeset_check_state], [ENCODER:5:DAC-5]
[10323.740108] [drm:intel_modeset_check_state], [CRTC:3]
Comment 13 Daniel Vetter 2013-05-23 07:52:15 UTC
Can you please grab the latest intel-gpu-tools git from http://cgit.freedesktop.org/xorg/app/intel-gpu-tools/ and then grab the output of intel_reg_dumper both while the screen is on and once it's off (i.e. after the WARN hit dmesg)?

If you don't have ssh to do grab the 2nd dump you can do

xset dpms force off ; sleep 2; intel_reg_dumper > reg_dump.off
Comment 14 mlsemon35 2013-05-23 08:30:26 UTC
(In reply to comment #13)
> Can you please grab the latest intel-gpu-tools git from
> http://cgit.freedesktop.org/xorg/app/intel-gpu-tools/ and then grab the
> output of intel_reg_dumper both while the screen is on and once it's off
> (i.e. after the WARN hit dmesg)?
> 
> If you don't have ssh to do grab the 2nd dump you can do
> 
> xset dpms force off ; sleep 2; intel_reg_dumper > reg_dump.off

It looks like I did not mention that the monitor comes back up, as if `xset dpms force off` does only a temporary blanking of the screen, barely 2 seconds.

intel-gpu-tools-1.3 is part of slackware-current, so I used it, and it gave me this (as root)...

root:~$ xset dpms force off; sleep 2; intel_reg_dumper > dumper.out
Gen2/3 Ranges are not supported. Please use unsafe access.Aborted

...which I saw last year when I compiled intel-gpu-tools from source and tried to use it.

As for tools that might work here, there's a /sys/kernel/debug/dri directory available, and the i915 tracers are available.  After I go back home tomorrow, serial console/capture will be available as well.

This DRM debug and DPMS debugging is causing X11 to become unstartable sometimes.  It acts like it's going to start Xfce but stops at a blank screen.  When I switch back to the console from which I typed `startx`, this shows up over and over:

__kgem_bo_map__gtt: failed to retrieve GTT offset for handle=26: 5

Is this a message from the Intel X11 driver?  Ctrl-c stops X11 and the printing of these messages, but it takes a reboot in order to use X11 again.

Thanks for bearing with me!  [Reminder to self:  More coffee is needed in the morning.]

Michael
Comment 15 Daniel Vetter 2013-05-23 10:05:49 UTC
On Thu, May 23, 2013 at 10:30 AM,  <bugzilla-daemon@freedesktop.org> wrote:
> root:~$ xset dpms force off; sleep 2; intel_reg_dumper > dumper.out
> Gen2/3 Ranges are not supported. Please use unsafe access.Aborted

This bug is why I've asked you to use the latest git ;-)
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Comment 16 mlsemon35 2013-05-23 18:48:15 UTC
(In reply to comment #15)
> On Thu, May 23, 2013 at 10:30 AM,  <bugzilla-daemon@freedesktop.org> wrote:
> > root:~$ xset dpms force off; sleep 2; intel_reg_dumper > dumper.out
> > Gen2/3 Ranges are not supported. Please use unsafe access.Aborted
> 
> This bug is why I've asked you to use the latest git ;-)
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

Gotcha.  Nice git, BTW.  Not everybody puts master builds into a convenient tarball.

After installing Python 3.3.2 and the latest git intel-gpu-tools, the Gen2/3 error message remains, but the output file is not empty this time.

A note on monitor behavior:  It looks like the first time this is run, the monitor does indeed blank, stay off, power light is orange, etc., until the Alt key is pressed to wake the monitor.  It's the later tries that cause the monitor to blank and quickly unblank.

                           DCC: 0x00000000 ()
                     CHDECMISC: 0x16f000e8 (XOR bank, ch2 enh disabled, ch1 enh enabled, ch0 enh disabled, flex disabled, ep not present)
                        C0DRB0: 0x00000000 (0x0000)
                        C0DRB1: 0x00000000 (0x0000)
                        C0DRB2: 0x00000000 (0x0000)
                        C0DRB3: 0x00000000 (0x0000)
                        C1DRB0: 0x5f1090a0 (0x90a0)
                        C1DRB1: 0x694c5f10 (0x5f10)
                        C1DRB2: 0x6f4c694c (0x694c)
                        C1DRB3: 0x00066f4c (0x6f4c)
                       C0DRA01: 0x00000000 (0x0000)
                       C0DRA23: 0x00000000 (0x0000)
                       C1DRA01: 0x01560006 (0x0006)
                       C1DRA23: 0x03810156 (0x0156)
                    PGETBL_CTL: 0x2ffe0001
             VCLK_DIVISOR_VGA0: 0x00021207 (n = 2, m1 = 18, m2 = 7)
             VCLK_DIVISOR_VGA1: 0x00031406 (n = 3, m1 = 20, m2 = 6)
                 VCLK_POST_DIV: 0x0000888b (vga0 p1 = 13, p2 = 4, vga1 p1 = 10, p2 = 4)
                     DPLL_TEST: 0x00000000 (, DPLLA input buffer disabled, DPLLB input buffer disabled)
                  CACHE_MODE_0: 0x00000000
                       D_STATE: 0x00000000
                 DSPCLK_GATE_D: 0x00000008 (clock gates disabled: OVRUNIT)
                RENCLK_GATE_D1: 0x00000000
                RENCLK_GATE_D2: 0x00000000
                         SDVOB: 0x00000000 (disabled, pipe A, stall disabled, not detected)
                         SDVOC: 0x00000000 (disabled, pipe A, stall disabled, not detected)
                       SDVOUDI: 0x00000000
                        DSPARB: 0x0000005f
                        DSPFW1: 0x00000000
                        DSPFW2: 0x00000000
                        DSPFW3: 0x00000000
                          ADPA: 0x00000c18 (disabled, pipe A, +hsync, +vsync)
                          LVDS: 0x00003001 (disabled, pipe A, 18 bit, 1 channel)
                          DVOA: 0x00000000 (disabled, pipe A, no stall, -hsync, -vsync)
                          DVOB: 0x00000000 (disabled, pipe A, no stall, -hsync, -vsync)
                          DVOC: 0x00000000 (disabled, pipe A, no stall, -hsync, -vsync)
                   DVOA_SRCDIM: 0x00000000
                   DVOB_SRCDIM: 0x00000000
                   DVOC_SRCDIM: 0x00000000
                   BLC_PWM_CTL: 0x00000000
                  BLC_PWM_CTL2: 0x00000000
                    PP_CONTROL: 0x00000000 (power target: off)
                     PP_STATUS: 0x00000000 (off, not ready, sequencing idle)
                  PP_ON_DELAYS: 0x00000000
                 PP_OFF_DELAYS: 0x00000000
                    PP_DIVISOR: 0x00000000
                  PFIT_CONTROL: 0x00000000
               PFIT_PGM_RATIOS: 0x00000000
               PORT_HOTPLUG_EN: 0x00000000
             PORT_HOTPLUG_STAT: 0x00000000
                      DSPACNTR: 0x58000000 (disabled, pipe A)
                    DSPASTRIDE: 0x00002000 (8192 bytes)
                       DSPAPOS: 0x00000000 (0, 0)
                      DSPASIZE: 0x00000000 (1, 1)
                      DSPABASE: 0x00800000
                      DSPASURF: 0x10200000
                   DSPATILEOFF: 0x00000000
                     PIPEACONF: 0x80000000 (enabled, single-wide)
                      PIPEASRC: 0x04ff03ff (1280, 1024)
                     PIPEASTAT: 0x00000207 (status: VSYNC_INT_STATUS SVBLANK_INT_STATUS VBLANK_INT_STATUS OREG_UPDATE_STATUS)
             PIPEA_GMCH_DATA_M: 0x00000000
             PIPEA_GMCH_DATA_N: 0x00000000
               PIPEA_DP_LINK_M: 0x00000000
               PIPEA_DP_LINK_N: 0x00000000
                 CURSOR_A_BASE: 0x01000000
              CURSOR_A_CONTROL: 0x04000000
             CURSOR_A_POSITION: 0x007e018b
                          FPA0: 0x0003160d (n = 3, m1 = 22, m2 = 13)
                          FPA1: 0x0003160d (n = 3, m1 = 22, m2 = 13)
                        DPLL_A: 0x90810000 (enabled, non-dvo, default clock, DAC/serial mode, p1 = 3, p2 = 4)
                     DPLL_A_MD: 0x00000000
                      HTOTAL_A: 0x069704ff (1280 active, 1688 total)
                      HBLANK_A: 0x069704ff (1280 start, 1688 end)
                       HSYNC_A: 0x059f052f (1328 start, 1440 end)
                      VTOTAL_A: 0x042903ff (1024 active, 1066 total)
                      VBLANK_A: 0x042903ff (1024 start, 1066 end)
                       VSYNC_A: 0x04030400 (1025 start, 1028 end)
                     BCLRPAT_A: 0x00000000
                  VSYNCSHIFT_A: 0x00000000
                      DSPBCNTR: 0x00000000 (disabled, pipe A)
                    DSPBSTRIDE: 0x00000000 (0 bytes)
                       DSPBPOS: 0x00000000 (0, 0)
                      DSPBSIZE: 0x00000000 (1, 1)
                      DSPBBASE: 0x00000000
                      DSPBSURF: 0x00000000
                   DSPBTILEOFF: 0x00000000
                     PIPEBCONF: 0x00000000 (disabled, single-wide)
                      PIPEBSRC: 0x00000000 (1, 1)
                     PIPEBSTAT: 0x00000000 (status:)
             PIPEB_GMCH_DATA_M: 0x00000000
             PIPEB_GMCH_DATA_N: 0x00000000
               PIPEB_DP_LINK_M: 0x00000000
               PIPEB_DP_LINK_N: 0x00000000
                 CURSOR_B_BASE: 0x00000000
              CURSOR_B_CONTROL: 0x00000000
             CURSOR_B_POSITION: 0x00000000
                          FPB0: 0x00000000 (n = 0, m1 = 0, m2 = 0)
                          FPB1: 0x00000000 (n = 0, m1 = 0, m2 = 0)
                        DPLL_B: 0x00000000 (disabled, non-dvo, VGA, default clock, DAC/serial mode, p1 = 2, p2 = 2)
                     DPLL_B_MD: 0x00000000
                      HTOTAL_B: 0x00000000 (1 active, 1 total)
                      HBLANK_B: 0x00000000 (1 start, 1 end)
                       HSYNC_B: 0x00000000 (1 start, 1 end)
                      VTOTAL_B: 0x00000000 (1 active, 1 total)
                      VBLANK_B: 0x00000000 (1 start, 1 end)
                       VSYNC_B: 0x00000000 (1 start, 1 end)
                     BCLRPAT_B: 0x00000000
                  VSYNCSHIFT_B: 0x00000000
             VCLK_DIVISOR_VGA0: 0x00021207
             VCLK_DIVISOR_VGA1: 0x00031406
                 VCLK_POST_DIV: 0x0000888b
                      VGACNTRL: 0x80000000 (disabled)
                        TV_CTL: 0x069704ff
                        TV_DAC: 0x069704ff
                      TV_CSC_Y: 0x042903ff
                     TV_CSC_Y2: 0x04030400
                      TV_CSC_U: 0x00000000
                     TV_CSC_U2: 0x04ff03ff
                      TV_CSC_V: 0x00000000
                     TV_CSC_V2: 0x00000000
                  TV_CLR_KNOBS: 0x00000000
                  TV_CLR_LEVEL: 0x00000000
                    TV_H_CTL_1: 0x00000000
                    TV_H_CTL_2: 0x00000000
                    TV_H_CTL_3: 0x00000000
                    TV_V_CTL_1: 0x00000000
                    TV_V_CTL_2: 0x00000000
                    TV_V_CTL_3: 0x00000000
                    TV_V_CTL_4: 0x00000000
                    TV_V_CTL_5: 0x00000000
                    TV_V_CTL_6: 0x00000000
                    TV_V_CTL_7: 0x00000000
                   TV_SC_CTL_1: 0x000248c3
                   TV_SC_CTL_2: 0x0014cc0c
                   TV_SC_CTL_3: 0x00000000
                    TV_WIN_POS: 0x00000000
                   TV_WIN_SIZE: 0x00000000
               TV_FILTER_CTL_1: 0x00000000
               TV_FILTER_CTL_2: 0x00000000
               TV_FILTER_CTL_3: 0x00000000
                 TV_CC_CONTROL: 0x00000000
                    TV_CC_DATA: 0x00000000
                   TV_H_LUMA_0: 0x00000000
                  TV_H_LUMA_59: 0x00000000
                 TV_H_CHROMA_0: 0x069704ff
                TV_H_CHROMA_59: 0x00000000
                  FBC_CFB_BASE: 0x00000000
                   FBC_LL_BASE: 0x00000000
                   FBC_CONTROL: 0x00000000
                   FBC_COMMAND: 0x00000000
                    FBC_STATUS: 0x00000000
                  FBC_CONTROL2: 0x00000000
                 FBC_FENCE_OFF: 0x00000000
                   FBC_MOD_NUM: 0x00000000
                       MI_MODE: 0x00000000
                  MI_ARB_STATE: 0x00000000
                MI_RDRET_STATE: 0x00000000
                       ECOSKPD: 0x00000307
                          DP_B: 0x00000000
                DPB_AUX_CH_CTL: 0x00000000
              DPB_AUX_CH_DATA1: 0x00000000
              DPB_AUX_CH_DATA2: 0x00000000
              DPB_AUX_CH_DATA3: 0x00000000
              DPB_AUX_CH_DATA4: 0x00000000
              DPB_AUX_CH_DATA5: 0x00000000
                          DP_C: 0x069704ff
                DPC_AUX_CH_CTL: 0x042903ff
              DPC_AUX_CH_DATA1: 0x04030400
              DPC_AUX_CH_DATA2: 0x00000000
              DPC_AUX_CH_DATA3: 0x04ff03ff
              DPC_AUX_CH_DATA4: 0x00000000
              DPC_AUX_CH_DATA5: 0x00000000
                          DP_D: 0x00000000
                DPD_AUX_CH_CTL: 0x00000000
              DPD_AUX_CH_DATA1: 0x00000000
              DPD_AUX_CH_DATA2: 0x00000000
              DPD_AUX_CH_DATA3: 0x00000000
              DPD_AUX_CH_DATA4: 0x00000000
              DPD_AUX_CH_DATA5: 0x00000000
                    AUD_CONFIG: 0x069704ff
              AUD_HDMIW_STATUS: 0x00000000
                AUD_CONV_CHCNT: 0x00000000
                 VIDEO_DIP_CTL: 0x00000000
                 AUD_PINW_CNTR: 0x00000000
                   AUD_CNTL_ST: 0x00000000
                   AUD_PIN_CAP: 0x00000000
                  AUD_PINW_CAP: 0x00000000
            AUD_PINW_UNSOLRESP: 0x00000000
              AUD_OUT_DIG_CNVT: 0x00000000
                 AUD_OUT_CWCAP: 0x00000000
                   AUD_GRP_CAP: 0x00000000
                      FENCE  0: 0x00800461 (enabled, X tiled, 32768 pitch, 0x00800000 - 0x01800000 (16384kb))
                      FENCE  1: 0x01b80041 (enabled, X tiled, 8192 pitch, 0x01b00000 - 0x01c00000 (1024kb))
                      FENCE  2: 0x01880021 (enabled, X tiled, 2048 pitch, 0x01800000 - 0x01900000 (1024kb))
                      FENCE  3: 0x02800021 (enabled, X tiled, 2048 pitch, 0x02800000 - 0x02900000 (1024kb))
                      FENCE  4: 0x01780021 (enabled, X tiled, 2048 pitch, 0x01700000 - 0x01800000 (1024kb))
                      FENCE  5: 0x02880051 (enabled, X tiled, 16384 pitch, 0x02800000 - 0x02900000 (1024kb))
                      FENCE  6: 0x01c80051 (enabled, X tiled, 16384 pitch, 0x01c00000 - 0x01d00000 (1024kb))
                      FENCE  7: 0x01f00041 (enabled, X tiled, 8192 pitch, 0x01f00000 - 0x02000000 (1024kb))
                      FENCE  8: 0x00000000 (disabled)
                      FENCE  9: 0x00000000 (disabled)
                     FENCE  10: 0x00000000 (disabled)
                     FENCE  11: 0x00000000 (disabled)
                     FENCE  12: 0x00000000 (disabled)
                     FENCE  13: 0x00000000 (disabled)
                     FENCE  14: 0x00000000 (disabled)
                     FENCE  15: 0x00000000 (disabled)
                 FENCE START 0: 0x00000000 (disabled)
                   FENCE END 0: 0x00000000 (disabled)
                 FENCE START 1: 0x00000000 (disabled)
                   FENCE END 1: 0x00000000 (disabled)
                 FENCE START 2: 0x00000000 (disabled)
                   FENCE END 2: 0x00000000 (disabled)
                 FENCE START 3: 0x00000000 (disabled)
                   FENCE END 3: 0x00000000 (disabled)
                 FENCE START 4: 0x00000000 (disabled)
                   FENCE END 4: 0x00000000 (disabled)
                 FENCE START 5: 0x00000000 (disabled)
                   FENCE END 5: 0x00000000 (disabled)
                 FENCE START 6: 0x00000000 (disabled)
                   FENCE END 6: 0x00000000 (disabled)
                 FENCE START 7: 0x00000000 (disabled)
                   FENCE END 7: 0x00000000 (disabled)
                 FENCE START 8: 0x00000000 (disabled)
                   FENCE END 8: 0x00000000 (disabled)
                 FENCE START 9: 0x00000000 (disabled)
                   FENCE END 9: 0x00000000 (disabled)
                FENCE START 10: 0x00000000 (disabled)
                  FENCE END 10: 0x00000000 (disabled)
                FENCE START 11: 0x00000000 (disabled)
                  FENCE END 11: 0x00000000 (disabled)
                FENCE START 12: 0x00000000 (disabled)
                  FENCE END 12: 0x00000000 (disabled)
                FENCE START 13: 0x00000000 (disabled)
                  FENCE END 13: 0x00000000 (disabled)
                FENCE START 14: 0x00000000 (disabled)
                  FENCE END 14: 0x00000000 (disabled)
                FENCE START 15: 0x00000000 (disabled)
                  FENCE END 15: 0x00000000 (disabled)
                       INST_PM: 0x00000000
pipe A dot 108000 n 3 m1 22 m2 13 p1 3 p2 4
pipe B dot 72000 n 0 m1 0 m2 0 p1 2 p2 2
Comment 17 mlsemon35 2013-05-23 20:31:29 UTC
Created attachment 79721 [details]
Additional dump of regs from fresh boot, xset command not used

Here's a fresh set of "before" regs, in case they're needed.
Comment 18 Daniel Vetter 2013-05-24 20:11:18 UTC
Created attachment 79773 [details] [review]
quirk the pipe A quirk

Please test the attached patch, thanks.
Comment 19 mlsemon35 2013-05-26 21:57:47 UTC
(In reply to comment #18)
> Created attachment 79773 [details] [review] [review]
> quirk the pipe A quirk
> 
> Please test the attached patch, thanks.

Ugh, I'm still being stupid, but here's as close as I can get to a test:

Your patch does not patch using `git am`, so I trimmed it down to a normal patch and used `patch`.  The first hunk had an offset of around 200 lines, the second hunk of about -162 lines.  The kernel compile was good.

I can run `xset dpms force off` and intel_reg_dumper all day long and not worry too much about restarting X11 or getting those GTT errors.  I did get one GTT error, but that may have been due to my normal-user account overrunning its quota, and X11 couldn't touch a file in /tmp to set up the keyboard.  With quota issues resolved, I couldn't reproduce the previous behavior on X11.

However, I still get the same backtrace when leaving the console alone for an hour.  So for now, I'm taking this line in slackware-current's /etc/rc.d/rc.M...

/bin/setterm -blank 15 -powersave powerdown -powerdown 60

...and trying to get a value for -powersave that won't instigate this behavior.  The first try will be this...

/bin/setterm -blank 15 -powersave off

...and the settings for hsync blanking and vsync blanking will be tried as well.

Now that I have a decent Internet connection, I'll refresh my kernel git and try your patch again as well.

Thanks!

Michael
Comment 20 Daniel Vetter 2013-05-26 22:11:27 UTC
So if you do a

$ xset dpms force off

in X, do you still see the backtrace or not?

If you still get it, please attach the output of dmesg with drm.debug=0xe
Comment 21 mlsemon35 2013-05-26 23:58:13 UTC
(In reply to comment #20)
> So if you do a
> 
> $ xset dpms force off
> 
> in X, do you still see the backtrace or not?
> 
> If you still get it, please attach the output of dmesg with drm.debug=0xe

I'll try again when I get home.  I did test for this from a fresh boot, getting no backtrace in multiple tries.  However, it's very possible that the backtrace happened because the backtrace had already happened on the console.  This is 
a tentative opinion.

Michael
Comment 22 mlsemon35 2013-05-29 04:33:03 UTC
(In reply to comment #20)
> So if you do a
> 
> $ xset dpms force off
> 
> in X, do you still see the backtrace or not?
> 
> If you still get it, please attach the output of dmesg with drm.debug=0xe

It's looking very good right now, no issues in three days.  My experiment with the setterm -powerdown settings did not give the expected results, going like this:

powerdown: stack trace
off: nothing
hsync: nothing
vsync: nothing
powerdown: nothing

There were a few reboots in there, and I can't help but wonder if one of them let your fix do its job.  This PC is more than capable of creating confusing results like this.

I'm keeping this patch, despite the lack of absolute science behind the great success so far.

Thanks!

Michael
Comment 23 Daniel Vetter 2013-05-29 09:19:12 UTC
Fix merged to drm-intel-fixes as:

commit 69c2bafec6897830d5f8954d7c6588e3eb15aac8
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed May 29 10:41:29 2013 +0200

    drm/i915: Quirk the pipe A quirk in the modeset state checker

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.