While working at the machine, I get the following message in /var/log/messages roughly every 22.5 minutes. The remainder and the raw EDID block change every time:
-----
Mar 28 18:06:04 test114 kernel: [23757.099468] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 179
Mar 28 18:06:04 test114 kernel: [23757.099470] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Mar 28 18:06:04 test114 kernel: [23757.099473] <3>00 ff ff ff ff ff ff 00 1a b3 52 05 2c d0 07 00 ..........R.,...
Mar 28 18:06:04 test114 kernel: [23757.099475] <3>1b 11 01 03 80 26 1e 78 2a ee 95 a3 54 4c 97 ff .....&.x*...TL..
Mar 28 18:06:04 test114 kernel: [23757.099477] <3>ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
Mar 28 18:06:04 test114 kernel: [23757.099479] <3>ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
Mar 28 18:06:04 test114 kernel: [23757.099480] <3>ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
Mar 28 18:06:04 test114 kernel: [23757.099482] <3>ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
Mar 28 18:06:04 test114 kernel: [23757.099484] <3>ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
Mar 28 18:06:04 test114 kernel: [23757.099486] <3>ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
-----
Most of the time this is followed by several of these:
-----
Mar 28 18:06:04 test114 kernel: [23757.164114] i2c i2c-10: sendbytes: NAK bailout.
Mar 28 18:06:14 test114 kernel: [23767.188782] i2c i2c-10: sendbytes: NAK bailout.
Mar 28 18:06:14 test114 kernel: [23767.198106] i2c i2c-10: sendbytes: NAK bailout.
-----
Sometimes the monitor (or, in a multi-head setup, one of the monitors) blanks and the screen never comes back.
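For anyone wanting to check a dump by hand: as far as I understand it, the "remainder" is just the sum of all 128 bytes of the EDID block modulo 256, which is 0 for a valid block. The following is a minimal Python sketch under that assumption, not the kernel's code; the function and constant names are made up for illustration. The bytes are copied from the first dump above, and they do give the 179 that the kernel printed. The long run of ff bytes suggests the transfer was cut short after the first two rows, though that is only an inference from the dump.

```python
"""Recompute the EDID checksum remainder for one 128-byte block (sketch)."""

EDID_BLOCK_SIZE = 128  # size of a single EDID block in bytes


def edid_checksum_remainder(block: bytes) -> int:
    """Return the sum of all bytes in the block modulo 256.

    A valid EDID block yields 0; any other value is the "remainder".
    """
    if len(block) != EDID_BLOCK_SIZE:
        raise ValueError(f"expected {EDID_BLOCK_SIZE} bytes, got {len(block)}")
    return sum(block) % 256


# First dump from the log: two rows of data followed by six rows of 0xff.
FIRST_DUMP = bytes.fromhex(
    "00 ff ff ff ff ff ff 00 1a b3 52 05 2c d0 07 00 "
    "1b 11 01 03 80 26 1e 78 2a ee 95 a3 54 4c 97 ff"
) + b"\xff" * 96

if __name__ == "__main__":
    # Prints "remainder: 179", matching the kernel message.
    print("remainder:", edid_checksum_remainder(FIRST_DUMP))
```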
Today I also saw the following kernel oops:
-----
Mar 28 18:06:24 test114 kernel: [23777.387046] ------------[ cut here ]------------
Mar 28 18:06:24 test114 kernel: [23777.388002] kernel BUG at /usr/src/packages/BUILD/kernel-desktop-2.6.37.1/linux-2.6.37/drivers/gpu/drm/i915/i915_gem.c:4201!
Mar 28 18:06:24 test114 kernel: [23777.388002] invalid opcode: 0000 [#1] PREEMPT SMP
Mar 28 18:06:24 test114 kernel: [23777.388002] last sysfs file: /sys/devices/pci0000:00/0000:00:02.0/drm/card0/card0-VGA-1/status
Mar 28 18:06:24 test114 kernel: [23777.388002] CPU 1
Mar 28 18:06:24 test114 kernel: [23777.388002] Modules linked in: fuse md5 des_generic cbc vboxnetadp vboxnetflt vboxdrv autofs4 snd_pcm_oss snd_mixer_oss edd rpcsec_gss_krb5 nfs lockd fscache nfs_acl auth_rpcgss sunrpc af_packet cpufreq_conservative microcode cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf tg3 floppy container sg sr_mod cdrom iTCO_wdt iTCO_vendor_support fschmd i2c_i801 serio_raw pcspkr snd_hda_codec_realtek tpm_tis tpm snd_hda_intel snd_hda_codec tpm_bios snd_hwdep snd_pcm snd_timer snd shpchp pci_hotplug soundcore snd_page_alloc ext4 jbd2 crc16 linear i915 drm_kms_helper drm i2c_algo_bit button video dm_snapshot dm_mod fan processor thermal thermal_sys
Mar 28 18:06:24 test114 kernel: [23777.388002]
Mar 28 18:06:24 test114 kernel: [23777.388002] Pid: 27, comm: kworker/1:1 Not tainted 2.6.37.1-1.2-desktop #1 FUJITSU SIEMENS CELSIUS W /D2317-A2
Mar 28 18:06:24 test114 kernel: [23777.388002] RIP: 0010:[<ffffffffa00f77c7>] [<ffffffffa00f77c7>] i915_gem_object_pin+0x187/0x1b0 [i915]
Mar 28 18:06:24 test114 kernel: [23777.388002] RSP: 0018:ffff88007a2558a0 EFLAGS: 00010246
Mar 28 18:06:24 test114 kernel: [23777.388002] RAX: ffff880077c54000 RBX: ffff880037c83c00 RCX: 0000000000000000
Mar 28 18:06:24 test114 kernel: [23777.388002] RDX: 0000000000000000 RSI: 0000000000020000 RDI: ffff880037c83c00
Mar 28 18:06:24 test114 kernel: [23777.388002] RBP: 0000000000020000 R08: ffff88007a254000 R09: 00000000000f73aa
Mar 28 18:06:24 test114 kernel: [23777.388002] R10: 0000000000000001 R11: 00000000ffffffff R12: ffff88003742c000
Mar 28 18:06:24 test114 kernel: [23777.388002] R13: 000000000003c47c R14: 000000000003c000 R15: 0000000000000000
Mar 28 18:06:24 test114 kernel: [23777.388002] FS: 0000000000000000(0000) GS:ffff88007f500000(0000) knlGS:0000000000000000
Mar 28 18:06:24 test114 kernel: [23777.388002] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Mar 28 18:06:24 test114 kernel: [23777.388002] CR2: 00007fea72395c60 CR3: 000000007973b000 CR4: 00000000000006e0
Mar 28 18:06:24 test114 kernel: [23777.388002] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 28 18:06:24 test114 kernel: [23777.388002] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Mar 28 18:06:24 test114 kernel: [23777.388002] Process kworker/1:1 (pid: 27, threadinfo ffff88007a254000, task ffff88007a252700)
Mar 28 18:06:24 test114 kernel: [23777.388002] Stack:
Mar 28 18:06:24 test114 kernel: [23777.388002] ffff880037c83c00 0000000000000000 0000000000000000 0000000000000000
Mar 28 18:06:24 test114 kernel: [23777.388002] 0000000000000000 ffffffffa0104f98 0000000000000000 dead000000200200
Mar 28 18:06:24 test114 kernel: [23777.388002] 0000000101663c8b ffff88007795e800 ffff880077c54000 ffffffffa0105121
Mar 28 18:06:24 test114 kernel: [23777.388002] Call Trace:
Mar 28 18:06:24 test114 kernel: [23777.388002] [<ffffffffa0104f98>] intel_pin_and_fence_fb_obj+0x48/0x100 [i915]
Mar 28 18:06:24 test114 kernel: [23777.388002] [<ffffffffa0105121>] intel_pipe_set_base+0xd1/0x290 [i915]
Mar 28 18:06:24 test114 kernel: [23777.388002] [<ffffffffa0109028>] intel_crtc_mode_set+0x938/0x1e70 [i915]
Mar 28 18:06:24 test114 kernel: [23777.388002] [<ffffffffa00d22cd>] drm_crtc_helper_set_mode+0x13d/0x3c0 [drm_kms_helper]
Mar 28 18:06:24 test114 kernel: [23777.388002] [<ffffffffa00d320d>] drm_crtc_helper_set_config+0x83d/0xa00 [drm_kms_helper]
Mar 28 18:06:24 test114 kernel: [23777.388002] [<ffffffffa00d1141>] drm_fb_helper_set_par+0x71/0xe0 [drm_kms_helper]
Mar 28 18:06:24 test114 kernel: [23777.388002] [<ffffffffa00d130e>] drm_fb_helper_single_fb_probe+0x15e/0x2e0 [drm_kms_helper]
Mar 28 18:06:24 test114 kernel: [23777.388002] [<ffffffffa00d10a2>] drm_fb_helper_hotplug_event+0xf2/0x120 [drm_kms_helper]
Mar 28 18:06:24 test114 kernel: [23777.388002] [<ffffffffa00d2146>] output_poll_execute+0x1a6/0x1b0 [drm_kms_helper]
Mar 28 18:06:24 test114 kernel: [23777.388002] [<ffffffff81074630>] process_one_work+0x110/0x490
Mar 28 18:06:24 test114 kernel: [23777.455024] [<ffffffff81075345>] worker_thread+0x165/0x340
Mar 28 18:06:24 test114 kernel: [23777.455024] [<ffffffff81079956>] kthread+0x96/0xa0
Mar 28 18:06:24 test114 kernel: [23777.455024] [<ffffffff81003d74>] kernel_thread_helper+0x4/0x10
Mar 28 18:06:24 test114 kernel: [23777.455024] Code: 00 00 00 00 e8 2b a2 ff ff 89 c5 e9 f4 fe ff ff 0f 1f 40 00 89 ee 48 89 df e8 66 ba ff ff 85 c0 0f 84 0e ff ff ff e9 42 ff ff ff <0f> 0b 41 89 e8 48 c7 c2 e0 f7 12 a0 be 72 10 00 00 48 c7 c7 b0
Mar 28 18:06:24 test114 kernel: [23777.455024] RIP [<ffffffffa00f77c7>] i915_gem_object_pin+0x187/0x1b0 [i915]
Mar 28 18:06:24 test114 kernel: [23777.455024] RSP <ffff88007a2558a0>
-----
Here are some further details:
-- chipset: 965Q
-- system architecture: x86_64
-- intel=2.14.0 xserver=1.9.3 mesa=7.10 libdrm=2.4.23-9.1
-- kernel version: 2.6.37.1-1.2-desktop
-- Linux distribution: OpenSuSE 11.4
-- Machine or mobo model: Fujitsu-Siemens Celsius W350
-- Display connector: DMS-59 to dual DVI connector
In order to debug the problem, I set the drm.debug=0x06 flag and got the following out of the logs:
------
Mar 29 09:53:41 test114 kernel: [52202.024240] [drm:intel_sdvo_debug_write], SDVOB: W: 0B (SDVO_CMD_GET_ATTACHED_DISPLAYS)
Mar 29 09:53:41 test114 kernel: [52202.028840] [drm:intel_sdvo_read_response], SDVOB: R: (Success) 01 00
Mar 29 09:53:41 test114 kernel: [52202.032242] [drm:intel_sdvo_detect], SDVO response 1 0 [1]
Mar 29 09:53:41 test114 kernel: [52202.032247] [drm:intel_sdvo_debug_write], SDVOB: W: 7A 02 (SDVO_CMD_SET_CONTROL_BUS_SWITCH)
Mar 29 09:53:41 test114 kernel: [52202.038037] [drm:intel_sdvo_debug_write], SDVOB: W: 7A 02 (SDVO_CMD_SET_CONTROL_BUS_SWITCH)
Mar 29 09:53:41 test114 kernel: [52202.086011] [drm:intel_crt_detect], CRT not detected via hotplug
Mar 29 09:53:41 test114 kernel: [52202.086014] [drm:output_poll_execute], [CONNECTOR:5:VGA-1] status updated from 2 to 2
Mar 29 09:53:41 test114 kernel: [52202.086017] [drm:intel_sdvo_debug_write], SDVOB: W: 0B (SDVO_CMD_GET_ATTACHED_DISPLAYS)
Mar 29 09:53:41 test114 kernel: [52202.088083] [drm:intel_sdvo_write_cmd], I2c transfer returned -6
Mar 29 09:53:41 test114 kernel: [52202.088086] [drm:output_poll_execute], [CONNECTOR:8:DVI-D-1] status updated from 1 to 3
Mar 29 09:53:41 test114 kernel: [52202.088089] [drm:intel_sdvo_debug_write], SDVOC: W: 0B (SDVO_CMD_GET_ATTACHED_DISPLAYS)
Mar 29 09:53:41 test114 kernel: [52202.090152] [drm:intel_sdvo_write_cmd], I2c transfer returned -6
Mar 29 09:53:41 test114 kernel: [52202.090155] [drm:output_poll_execute], [CONNECTOR:10:DVI-D-2] status updated from 2 to 3
Mar 29 09:53:41 test114 kernel: [52202.092102] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 92
Mar 29 09:53:41 test114 kernel: [52202.092104] [drm:drm_edid_block_valid] *ERROR* Raw EDID:
Mar 29 09:53:41 test114 kernel: [52202.092107] <3>00 ff ff ff ff ff ff 00 1a b3 52 05 2c d0 07 00 ..........R.,...
Mar 29 09:53:41 test114 kernel: [52202.092109] <3>1b 11 01 03 80 26 1e 78 2a ee 95 a3 54 4c 99 26 .....&.x*...TL.&
Mar 29 09:53:41 test114 kernel: [52202.092111] <3>0f 50 54 a5 4b 00 81 80 01 01 01 01 01 01 01 01 .PT.K...........
Mar 29 09:53:41 test114 kernel: [52202.092113] <3>01 01 01 01 01 01 30 2a 00 98 51 00 2a 40 30 70 ......0*..Q.*@0p
Mar 29 09:53:41 test114 kernel: [52202.092115] <3>13 00 78 2d 11 00 00 1e 00 00 00 fd 00 38 4c 1e ..x-.........8L.
Mar 29 09:53:41 test114 kernel: [52202.092117] <3>52 0e 00 0a 20 20 20 20 20 20 00 00 00 fc 00 50 R...      .....P
Mar 29 09:53:41 test114 kernel: [52202.092118] <3>31 39 2d 32 0a 20 20 20 20 20 20 20 00 00 00 ff 19-2.       ....
Mar 29 09:53:41 test114 kernel: [52202.092120] <3>01 80 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
Mar 29 09:53:41 test114 kernel: [52202.092122]
Mar 29 09:53:41 test114 kernel: [52202.092123] [drm:intel_sdvo_debug_write], SDVOB: W: 7A 02 (SDVO_CMD_SET_CONTROL_BUS_SWITCH)
Mar 29 09:53:41 test114 kernel: [52202.146071] [drm:intel_sdvo_debug_write], SDVOC: W: 0B (SDVO_CMD_GET_ATTACHED_DISPLAYS)
Mar 29 09:53:41 test114 kernel: [52202.150644] [drm:intel_sdvo_read_response], SDVOC: R: (Success) 00 00
Mar 29 09:53:41 test114 kernel: [52202.154019] [drm:intel_sdvo_detect], SDVO response 0 0 [1]
Mar 29 09:53:41 test114 kernel: [52202.160037] [drm:intel_crt_detect], CRT not detected via hotplug
Mar 29 09:53:41 test114 kernel: [52202.160436] [drm:intel_sdvo_debug_write], SDVOB: W: 0B (SDVO_CMD_GET_ATTACHED_DISPLAYS)
Mar 29 09:53:41 test114 kernel: [52202.165024] [drm:intel_sdvo_read_response], SDVOB: R: (Success) 01 00
Mar 29 09:53:41 test114 kernel: [52202.168397] [drm:intel_sdvo_detect], SDVO response 1 0 [1]
Mar 29 09:53:41 test114 kernel: [52202.168400] [drm:intel_sdvo_debug_write], SDVOB: W: 7A 02 (SDVO_CMD_SET_CONTROL_BUS_SWITCH)
Mar 29 09:53:41 test114 kernel: [52202.174171] [drm:intel_sdvo_debug_write], SDVOB: W: 7A 02 (SDVO_CMD_SET_CONTROL_BUS_SWITCH)
Mar 29 09:53:41 test114 kernel: [52202.228104] [drm:intel_sdvo_debug_write], SDVOC: W: 0B (SDVO_CMD_GET_ATTACHED_DISPLAYS)
Mar 29 09:53:41 test114 kernel: [52202.232676] [drm:intel_sdvo_read_response], SDVOC: R: (Success) 00 00
Mar 29 09:53:41 test114 kernel: [52202.236050] [drm:intel_sdvo_detect], SDVO response 0 0 [1]
Mar 29 09:53:41 test114 kernel: [52202.242037] [drm:intel_crt_detect], CRT not detected via hotplug
Mar 29 09:53:51 test114 kernel: [52212.134016] [drm:intel_crt_detect], CRT not detected via hotplug
Mar 29 09:53:51 test114 kernel: [52212.134022] [drm:output_poll_execute], [CONNECTOR:5:VGA-1] status updated from 2 to 2
Mar 29 09:53:51 test114 kernel: [52212.134028] [drm:intel_sdvo_debug_write], SDVOB: W: 0B (SDVO_CMD_GET_ATTACHED_DISPLAYS)
Mar 29 09:53:51 test114 kernel: [52212.138627] [drm:intel_sdvo_read_response], SDVOB: R: (Success) 01 00
Mar 29 09:53:51 test114 kernel: [52212.142017] [drm:intel_sdvo_detect], SDVO response 1 0 [1]
Mar 29 09:53:51 test114 kernel: [52212.142022] [drm:intel_sdvo_debug_write], SDVOB: W: 7A 02 (SDVO_CMD_SET_CONTROL_BUS_SWITCH)
Mar 29 09:53:51 test114 kernel: [52212.147821] [drm:intel_sdvo_debug_write], SDVOB: W: 7A 02 (SDVO_CMD_SET_CONTROL_BUS_SWITCH)
Mar 29 09:53:51 test114 kernel: [52212.201919] [drm:output_poll_execute], [CONNECTOR:8:DVI-D-1] status updated from 3 to 1
Mar 29 09:53:51 test114 kernel: [52212.201924] [drm:intel_sdvo_debug_write], SDVOC: W: 0B (SDVO_CMD_GET_ATTACHED_DISPLAYS)
Mar 29 09:53:51 test114 kernel: [52212.206513] [drm:intel_sdvo_read_response], SDVOC: R: (Success) 00 00
Mar 29 09:53:51 test114 kernel: [52212.209905] [drm:intel_sdvo_detect], SDVO response 0 0 [1]
Mar 29 09:53:51 test114 kernel: [52212.209909] [drm:output_poll_execute], [CONNECTOR:10:DVI-D-2] status updated from 3 to 2
Mar 29 09:53:51 test114 kernel: [52212.211953] [drm:intel_sdvo_debug_write], SDVOB: W: 0B (SDVO_CMD_GET_ATTACHED_DISPLAYS)
Mar 29 09:53:51 test114 kernel: [52212.216525] [drm:intel_sdvo_read_response], SDVOB: R: (Success) 01 00
Mar 29 09:53:51 test114 kernel: [52212.219900] [drm:intel_sdvo_detect], SDVO response 1 0 [1]
Mar 29 09:53:51 test114 kernel: [52212.219903] [drm:intel_sdvo_debug_write], SDVOB: W: 7A 02 (SDVO_CMD_SET_CONTROL_BUS_SWITCH)
Mar 29 09:53:51 test114 kernel: [52212.225675] [drm:intel_sdvo_debug_write], SDVOB: W: 7A 02 (SDVO_CMD_SET_CONTROL_BUS_SWITCH)
Mar 29 09:53:51 test114 kernel: [52212.279601] [drm:intel_sdvo_debug_write], SDVOC: W: 0B (SDVO_CMD_GET_ATTACHED_DISPLAYS)
Mar 29 09:53:51 test114 kernel: [52212.284181] [drm:intel_sdvo_read_response], SDVOC: R: (Success) 00 00
Mar 29 09:53:51 test114 kernel: [52212.287553] [drm:intel_sdvo_detect], SDVO response 0 0 [1]
Mar 29 09:53:51 test114 kernel: [52212.293038] [drm:intel_crt_detect], CRT not detected via hotplug
------
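The numbers in the "status updated from X to Y" lines are values of the kernel's `enum drm_connector_status` (1 = connected, 2 = disconnected, 3 = unknown), and the -6 from the i2c transfer is -ENXIO on Linux. To make the flapping easier to see, here is a small Python sketch that pulls the status transitions out of such a log. The regular expression and function name are only illustrative and assume the message format shown in the debug output above.

```python
import re

# Values of enum drm_connector_status in the kernel's DRM headers.
CONNECTOR_STATUS = {1: "connected", 2: "disconnected", 3: "unknown"}

# Matches e.g. "[CONNECTOR:8:DVI-D-1] status updated from 1 to 3".
STATUS_RE = re.compile(
    r"\[CONNECTOR:(\d+):([^\]]+)\] status updated from (\d+) to (\d+)"
)


def decode_status_changes(log_text):
    """Return (connector name, old status, new status) for every real change."""
    changes = []
    for match in STATUS_RE.finditer(log_text):
        _conn_id, name, old, new = match.groups()
        if old == new:
            continue  # "from 2 to 2" is just a poll with no change
        changes.append((
            name,
            CONNECTOR_STATUS.get(int(old), old),
            CONNECTOR_STATUS.get(int(new), new),
        ))
    return changes


if __name__ == "__main__":
    sample = (
        "[drm:output_poll_execute], [CONNECTOR:5:VGA-1] status updated from 2 to 2\n"
        "[drm:output_poll_execute], [CONNECTOR:8:DVI-D-1] status updated from 1 to 3\n"
        "[drm:output_poll_execute], [CONNECTOR:8:DVI-D-1] status updated from 3 to 1\n"
    )
    for name, old, new in decode_status_changes(sample):
        print(f"{name}: {old} -> {new}")
```

Run over the log above, this shows DVI-D-1 going from connected to unknown at 09:53:41 and back to connected ten seconds later, which lines up with the failed i2c transfers.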
Created attachment 44984: Xorg log
Both of these should have been addressed in 2.6.38+:

commit 007c80a5497a3f9c8393960ec6e6efd30955dcb1
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Mar 15 11:40:00 2011 +0000

    drm: Hold the mode mutex whilst probing for sysfs status

    As detect will use hw registers and may modify structures, it needs to
    be serialised by use of the dev->mode_config.mutex. Make it so.

    Otherwise, we may cause random crashes as the sysfs file is queried
    whilst a concurrent hotplug poll is being run.

and

commit 4819d2e4310796c4e9eef674499af9b9caf36b5a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Mar 15 11:04:41 2011 +0000

    drm: Retry i2c transfer of EDID block after failure

    Usually EDID retrieval is fine. However, sometimes, especially when the
    machine is loaded, it fails, but succeeds after a few retries.

    Based on a patch by Michael Buesch.

    Reported-by: Michael Buesch <mb@bu3sch.de>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>
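To illustrate the idea behind the second commit (retry the EDID read a few times before giving up), here is a rough Python sketch. It is not the kernel patch, which is C and may differ in detail. The names `read_edid_with_retry` and `read_block`, the default of 5 attempts, and the extra checksum check on each attempt are all assumptions made for the example.

```python
from typing import Callable, Optional

EDID_BLOCK_SIZE = 128  # size of a single EDID block in bytes


def block_is_valid(block: bytes) -> bool:
    """A block is treated as valid if it is 128 bytes summing to 0 modulo 256."""
    return len(block) == EDID_BLOCK_SIZE and sum(block) % 256 == 0


def read_edid_with_retry(
    read_block: Callable[[], bytes],
    attempts: int = 5,  # hypothetical default, not taken from the patch
) -> Optional[bytes]:
    """Call read_block() until it returns a valid block or attempts run out.

    read_block is a hypothetical callable that performs one transfer and
    raises OSError when the transfer itself fails (e.g. a NAK on the bus).
    """
    for _ in range(attempts):
        try:
            block = read_block()
        except OSError:
            continue  # transfer failed, try again
        if block_is_valid(block):
            return block
    return None  # every attempt failed or returned a corrupt block
```

With something like this in place, a single failed or corrupt read under load no longer has to surface as a disconnected monitor; only a run of consecutive failures does.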
Closing a really old resolved bug.