Created attachment 131990 [details]
crash log.

[ 6.992374] Bluetooth: L2CAP socket layer initialized
[ 6.992386] Bluetooth: SCO socket layer initialized
[ 22.000173] general protection fault: 0000 [#1] PREEMPT SMP
[ 22.000181] Modules linked in: uinput acpi_als kfifo_buf industrialio bluetooth ecdh_generic lzo zram fuse cfg80211(O) compat(O) ip6table_filter asix usbnet mii
[ 22.000202] CPU: 1 PID: 733 Comm: chrome Tainted: G U O 4.12.0-rc4-cros-be-ga02eede86890-dirty #1
[ 22.000205] Hardware name: Intel glkrvp/glkrvp, BIOS Intel_glkrvp.9623.0.2017_06_09_1457 06/09/2017
[ 22.000208] task: ffffa27fb7ac4100 task.stack: ffffae7001044000
[ 22.000216] RIP: 0010:per_file_stats+0x6a/0xc3
[ 22.000219] RSP: 0018:ffffae7001047c88 EFLAGS: 00010287
[ 22.000222] RAX: deacffffffffff20 RBX: ffffa27fa5a51e60 RCX: dead000000000100
[ 22.000224] RDX: ffffae7001047d18 RSI: ffffa27fb79317d0 RDI: ffffa27fb7b71500
[ 22.000226] RBP: ffffae7001047c88 R08: ffffa27fb7b713a8 R09: 0000000000000000
[ 22.000229] R10: ffffa27fba14c4e0 R11: 00000000000000c0 R12: ffffa27fb66d8c38
[ 22.000231] R13: ffffffff8ebf1f08 R14: ffffae7001047d18 R15: ffffa27fba148108
[ 22.000234] FS: 00007f5251d8c780(0000) GS:ffffa27fbfc80000(0000) knlGS:0000000000000000
[ 22.000236] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 22.000239] CR2: 00002eb770bd7000 CR3: 000000017794e000 CR4: 00000000003406e0
[ 22.000241] Call Trace:
[ 22.000248] idr_for_each+0x4a/0xd1
[ 22.000252] i915_gem_object_info+0x28c/0x36e
[ 22.000258] seq_read+0x1a9/0x38d
[ 22.000264] full_proxy_read+0x5c/0x8b
[ 22.000269] __vfs_read+0x35/0xc0
[ 22.000273] ? fsnotify_perm+0x64/0x6f
[ 22.000276] ? security_file_permission+0x3b/0x42
[ 22.000280] vfs_read+0xa9/0xc5
[ 22.000283] SyS_read+0x5f/0xa3
[ 22.000289] entry_SYSCALL_64_fastpath+0x13/0x94
[ 22.000292] RIP: 0033:0x7f52521ddf4d
[ 22.000295] RSP: 002b:00007ffdcf82d400 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
[ 22.000298] RAX: ffffffffffffffda RBX: 00002eb76f1bcd80 RCX: 00007f52521ddf4d
[ 22.000300] RDX: 0000000000010000 RSI: 00002eb770bd7000 RDI: 0000000000000072
[ 22.000302] RBP: 00007ffdcf82d440 R08: 00007f5251d8c780 R09: 00002eb770bd7000
[ 22.000304] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
[ 22.000306] R13: 0000000000000000 R14: 0000000000000463 R15: ffffffffffffffff
[ 22.000309] Code: 48 8b 86 d8 00 00 00 48 01 42 28 48 8b 86 10 01 00 00 48 81 c6 10 01 00 00 48 2d e0 01 00 00 48 8d 88 e0 01 00 00 48 39 f1 74 55 <f6> 80 98 00 00 00 01 74 3d f6 80 e1 00 00 00 01 74 0a 48 8b 48
[ 22.000349] RIP: per_file_stats+0x6a/0xc3 RSP: ffffae7001047c88
[ 22.000352] ---[ end trace 9d32ae44854cdd18 ]---
[ 22.003475] Kernel panic - not syncing: Fatal exception
[ 22.003499] Kernel Offset: 0xd800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 22.006554] ACPI MEMORY or I/O RESET_REG.
This happens after 500 cycles of cold reboot.
Hello Abhay, could you please attach dmesg with parameter drm.debug=0xe and kern.log? Is the problem 100% reproducible? Could you, if possible, add more information about software and hardware environment? Thank you.
(In reply to elizabethx.de.la.torre.mena from comment #2)
> Hello Abhay, could you please attach dmesg with parameter drm.debug=0xe and
> kern.log? Is the problem 100% reproducible? Could you, if possible, add more
> information about software and hardware environment? Thank you.

There's no need. The oops is completely sufficient.
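One possible reading of why the oops alone is enough (an interpretation, not something stated in this thread): RCX holds 0xdead000000000100, which matches the kernel's LIST_POISON1 value on x86_64 when the illegal-pointer delta is 0xdead000000000000, and the Code bytes `48 2d e0 01 00 00` decode to `sub rax, 0x1e0`. The sketch below only reproduces that arithmetic in user space. The macro and function names are hypothetical, and treating 0x1e0 as the offset of the list link inside the entry is an assumption drawn from the disassembly.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of LIST_POISON1 on x86_64 (0x100 + 0xdead000000000000). */
#define POISON1_X86_64 UINT64_C(0xdead000000000100)

/* Assumption: the list link sits 0x1e0 bytes into the entry, as suggested
 * by the "sub rax, 0x1e0" instruction in the oops Code line. */
#define LINK_OFFSET UINT64_C(0x1e0)

/* container_of-style arithmetic: entry address = link address - offset. */
static uint64_t entry_from_link(uint64_t link, uint64_t member_offset)
{
    return link - member_offset;
}

/* Compile-time check that the poisoned link yields the RAX value
 * reported in the oops (deacffffffffff20). */
static_assert(POISON1_X86_64 - LINK_OFFSET == UINT64_C(0xdeacffffffffff20),
              "poisoned link minus offset should equal the faulting RAX");
```

If that reading holds, the faulting instruction dereferenced an entry derived from an already-unlinked list node, which would fit a list being modified while it is walked.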
We had a similar crash a long time ago: https://bugs.freedesktop.org/show_bug.cgi?id=81712. Here it looks like we are locking up in file permission.
commit 0caf81b5c53d9bd332a95dbcb44db8de0b397a7c
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Sat Jun 17 12:57:44 2017 +0100

    drm/i915: Hold struct_mutex for per-file stats in debugfs/i915_gem_object

    As we walk the obj->vma_list in per_file_stats(), we need to hold
    struct_mutex to prevent alteration of that list.
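The locking rule in that commit message can be illustrated with a minimal user-space sketch. This is not the i915 code: `struct object`, `struct vma_node`, and both functions are hypothetical stand-ins, and a pthread mutex plays the role of struct_mutex. The point is only that the reader and every writer take the same lock, so the list cannot change under the walk.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an entry on obj->vma_list. */
struct vma_node {
    struct vma_node *next;
    size_t size;
};

/* Hypothetical stand-in for the object that owns the list. */
struct object {
    pthread_mutex_t lock;       /* plays the role of struct_mutex */
    struct vma_node *vma_list;  /* plays the role of obj->vma_list */
};

/* Reader: sum the node sizes while holding the lock, so that no
 * writer can link or unlink a node in the middle of the walk. */
size_t object_total_size(struct object *obj)
{
    size_t total = 0;

    pthread_mutex_lock(&obj->lock);
    for (struct vma_node *n = obj->vma_list; n != NULL; n = n->next)
        total += n->size;
    pthread_mutex_unlock(&obj->lock);

    return total;
}

/* Writer: takes the same lock before altering the list. */
void object_add_vma(struct object *obj, struct vma_node *n)
{
    assert(n != NULL);

    pthread_mutex_lock(&obj->lock);
    n->next = obj->vma_list;
    obj->vma_list = n;
    pthread_mutex_unlock(&obj->lock);
}
```

A caller would initialize `lock` with `PTHREAD_MUTEX_INITIALIZER`, add nodes through `object_add_vma()`, and read the total through `object_total_size()`. Walking the list without the lock is the pattern the commit removes.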
(In reply to Chris Wilson from comment #5)
> commit 0caf81b5c53d9bd332a95dbcb44db8de0b397a7c
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date: Sat Jun 17 12:57:44 2017 +0100
>
> drm/i915: Hold struct_mutex for per-file stats in debugfs/i915_gem_object
>
> As we walk the obj->vma_list in per_file_stats(), we need to hold
> struct_mutex to prevent alteration of that list.

Hi, after testing this kernel commit and running some cold reboots on my GLK, I noticed some relevant kernel messages:

kern :emerg : [Sun Dec 4 14:33:10 2016] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: a600000000020408
kern :emerg : [Sun Dec 4 14:33:10 2016] mce: [Hardware Error]: TSC 0 ADDR fef4c9a0
kern :emerg : [Sun Dec 4 14:33:10 2016] mce: [Hardware Error]: PROCESSOR 0:706a0 TIME 0 SOCKET 0 APIC 0 microcode 1c
kern :err : [Sun Dec 4 14:33:10 2016] ACPI Error: Invalid type (RegionField) for target of Scope operator [SSP2] (Cannot override) (20170303/dswload-273)
kern :err : [Sun Dec 4 14:33:10 2016] ACPI Exception: AE_AML_OPERAND_TYPE, During name lookup/catalog (20170303/psobject-241)
kern :err : [Sun Dec 4 14:33:10 2016] ACPI Exception: AE_AML_OPERAND_TYPE, (SSDT: RVPRtd3) while loading table (20170303/tbxfload-228)
kern :err : [Sun Dec 4 14:33:10 2016] ACPI Error: 1 table load failures, 11 successful (20170303/tbxfload-246)
kern :err : [Sun Dec 4 14:33:19 2016] uvesafb: Getting VBE info block failed (eax=0x4f00, err=1)
kern :err : [Sun Dec 4 14:33:19 2016] uvesafb: vbe_init() failed with -22
kern :err : [Sun Dec 4 14:33:19 2016] atkbd serio0: Failed to deactivate keyboard on isa0060/serio0
kern :err : [Sun Dec 4 14:33:20 2016] atkbd serio0: Failed to enable keyboard on isa0060/serio0

Please see the attached dmesg.log. I am not sure whether this is the same failure as in this bug; I will wait for a response on whether I should create a separate bug for it.
BTW, I tested with the latest drm-intel and saw the same failure:

commit bf26e1dbbba24a7697559f1131d4be99747b7646
Author: Martin Peres <martin.peres@linux.intel.com>
AuthorDate: Tue Jun 27 16:59:42 2017 +0300
Commit: Martin Peres <martin.peres@linux.intel.com>
CommitDate: Tue Jun 27 16:59:42 2017 +0300

    drm-tip: 2017y-06m-27d-13h-59m-07s UTC integration manifest
Created attachment 132287 [details] dmesg.log
The MCE is just that, ACPI wouldn't be ACPI without a failure, and the circular locking bug has nothing to do with i915.