[76973.700434] BUG: unable to handle kernel paging request at 0000000000001000 [76973.700468] IP: [<ffffffff811fb84d>] strnlen+0xd/0x40 [76973.700494] PGD 10387b067 PUD 15411a067 PMD 0 [76973.700515] Oops: 0000 [#1] SMP [76973.700529] Modules linked in: cpufreq_userspace cpufreq_powersave cpufreq_stats cpufreq_conservative binfmt_misc nfs lockd fscache sunrpc coretemp crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper iTCO_wdt iTCO_vendor_support microcode i2c_i801 lpc_ich mfd_core acpi_cpufreq mperf evdev processor ext4 crc16 jbd2 mbcache sd_mod crc_t10dif ahci libahci libata scsi_mod thermal fan [76973.700718] CPU: 2 PID: 13279 Comm: intel_error_dec Not tainted 3.10.0-rc6+ #99 [76973.700746] Hardware name: Intel Corporation Shark Bay Client platform/Flathead Creek Crb, BIOS HSWLPTU1.86C.0109.R03.1301282055 01/28/2013 [76973.700793] task: ffff880153fa2100 ti: ffff880152310000 task.ti: ffff880152310000 [76973.700820] RIP: 0010:[<ffffffff811fb84d>] [<ffffffff811fb84d>] strnlen+0xd/0x40 [76973.700849] RSP: 0018:ffff880152311cb0 EFLAGS: 00010286 [76973.700869] RAX: ffffffff81774000 RBX: ffff880154000072 RCX: fffffffffffffffe [76973.700895] RDX: 0000000000001000 RSI: ffffffffffffffff RDI: 0000000000001000 [76973.700920] RBP: ffff880152311cb0 R08: 000000000000ffff R09: 000000000000ffff [76973.700948] R10: 0000000000000000 R11: 000000000000000f R12: 0000000000001000 [76973.700974] R13: ffff880154001000 R14: 00000000ffffffff R15: 0000000000000000 [76973.701000] FS: 00007f00fc942880(0000) GS:ffff88015e300000(0000) knlGS:0000000000000000 [76973.701029] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [76973.701050] CR2: 0000000000001000 CR3: 000000015939a000 CR4: 00000000001407e0 [76973.701075] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [76973.701103] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [76973.701128] Stack: [76973.701136] ffff880152311ce8 ffffffff811fdafb ffff880154000072 ffff880154001000 [76973.701167] ffff880152311d58 ffffffff817a703e ffffffff817a703e ffff880152311d48 [76973.701198] ffffffff811fe8a9 0000000a0000ffff 0000000000000001 ffff880154000000 [76973.701229] Call Trace: [76973.701242] [<ffffffff811fdafb>] string.isra.5+0x3b/0xf0 [76973.701264] [<ffffffff811fe8a9>] vsnprintf+0x1d9/0x670 [76973.701286] [<ffffffff812d13c9>] i915_error_printf+0xb9/0x160 [76973.701311] [<ffffffff812d15fc>] print_error_buffers+0x18c/0x1e0 [76973.701335] [<ffffffff812d1a5e>] i915_error_state_read+0x40e/0x8e0 [76973.701361] [<ffffffff81114527>] vfs_read+0x97/0x160 [76973.701381] [<ffffffff81114f74>] SyS_read+0x44/0x90 [76973.701403] [<ffffffff8141a552>] system_call_fastpath+0x16/0x1b [76973.701424] Code: c0 01 80 38 00 75 f7 48 29 f8 5d c3 31 c0 5d c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 8d 4e ff 48 85 f6 48 89 e5 74 2a <80> 3f 00 74 25 48 89 f8 31 d2 eb 10 0f 1f 80 00 00 00 00 48 83 [76973.701535] RIP [<ffffffff811fb84d>] strnlen+0xd/0x40 [76973.701550] RSP <ffff880152311cb0> [76973.701559] CR2: 0000000000001000 The patch I sent doesn't fix the bug - it seems to be using "%s", (void *)4096. Bizarre.
This is just stack corruption. If I break the long err_printf() of the buffer object into multiple calls, then It Just Works (TM).
commit c7815d42dfce7439cde3f23766e7dea59ffa7e1a Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Sat Jun 29 23:26:50 2013 +0100 drm/i915: Break up the large vsnprintf() in print_error_buffers() So it appears that I have encountered some bogosity when trying to call i915_error_printf() with many arguments from print_error_buffers(). The symptom is that the vsnprintf parser tries to interpret an integer arg as a character string, the resulting OOPS indicating stack corruption. Replacing the single call with its 13 format specifiers and arguments with multiple calls to i915_error_printf() worked fine. This patch goes one step further and introduced i915_error_puts() to pass the strings simply. It may not fix the root cause, but it does prevent my box from dying and I think helps make print_error_buffers() more friendly. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66077 Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch
commit e29bb4ebbf000ff9ac081d29784a3331618f012e Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Sep 20 10:20:59 2013 +0100 drm/i915: Use a temporary va_list for two-pass string handling In commit edc3d8848dc9fe2a470316363dab8ef211d77e01 Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Thu May 23 13:55:35 2013 +0300 drm/i915: avoid big kmallocs on reading error state we introduce a two-pass mechanism for splitting long strings being formatted into the error-state. The first pass finds the length, and the second pass emits the right portion of the string into the accumulation buffer. Unfortunately we use the same va_list for both passes, resulting in the second pass reading garbage off the end of the argument list. As the two passes are only used for boundaries between read() calls, the corruption is only rarely seen. This fixes the root cause behind commit baf27f9b17bf2f369f3865e38c41d2163e8d815d Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Sat Jun 29 23:26:50 2013 +0100 drm/i915: Break up the large vsnprintf() in print_error_buffers() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.