Bug 105479

Summary: [BAT] igt@debugfs_test@read_all_entries - BUG: unable to handle kernel paging request at 000003030000b16f
Product: DRI
Component: DRM/Intel
Version: DRI git
Hardware: Other
OS: All
Status: CLOSED FIXED
Severity: normal
Priority: medium
Reporter: Marta Löfstedt <marta.lofstedt>
Assignee: Marta Löfstedt <marta.lofstedt>
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
CC: intel-gfx-bugs, martin.peres
See Also: https://bugs.freedesktop.org/show_bug.cgi?id=105480
          https://bugs.freedesktop.org/show_bug.cgi?id=105481
Whiteboard: ReadyForDev
i915 platform: CNL
i915 features:

Description Marta Löfstedt 2018-03-13 10:44:41 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3916/fi-cnl-drrs/igt@debugfs_test@read_all_entries.html

<7>[   43.209197] [IGT] Reading file "i915_sseu_status"
<1>[   43.209777] BUG: unable to handle kernel paging request at 000003030000b16f
<1>[   43.209827] IP: intel_runtime_pm_put+0x4/0x90 [i915]
<6>[   43.209830] PGD 0 P4D 0 
<4>[   43.209837] Oops: 0000 [#1] PREEMPT SMP PTI
<0>[   43.209842] Dumping ftrace buffer:
...
<4>[   43.212710] Modules linked in: snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul snd_hda_intel snd_hda_codec snd_hwdep ghash_clmulni_intel snd_hda_core snd_pcm e1000e mei_me mei prime_numbers
<4>[   43.212743] CPU: 1 PID: 1257 Comm: debugfs_test Not tainted 4.16.0-rc5-CI-CI_DRM_3916+ #1
<4>[   43.212746] Hardware name: Intel Corporation CannonLake Client Platform/CannonLake Y LPDDR4 RVP, BIOS CNLSFWR1.R00.X114.B11.1712190231 12/19/2017
<4>[   43.212798] RIP: 0010:intel_runtime_pm_put+0x4/0x90 [i915]
<4>[   43.212801] RSP: 0018:ffffc90000ab3d20 EFLAGS: 00010282
<4>[   43.212806] RAX: 0000000000000006 RBX: 0000030300000303 RCX: 0000000000000008
<4>[   43.212810] RDX: 0000000000000004 RSI: 0000000000000000 RDI: 0000030300000303
<4>[   43.212813] RBP: 0000030300000303 R08: 0000000000000001 R09: 0000000000000006
<4>[   43.212816] R10: 0000000000000006 R11: 0000000000000000 R12: ffffc90000ab3d52
<4>[   43.212819] R13: 0000000000000000 R14: 0000000000000001 R15: ffff88025b1bb9d8
<4>[   43.212823] FS:  00007ff3c5054980(0000) GS:ffff880271080000(0000) knlGS:0000000000000000
<4>[   43.212826] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   43.212830] CR2: 000003030000b16f CR3: 0000000259bd2005 CR4: 0000000000760ee0
<4>[   43.212833] PKRU: 55555554
<4>[   43.212836] Call Trace:
<4>[   43.212889]  i915_sseu_status+0xf3/0x350 [i915]
<4>[   43.212897]  ? seq_read+0x28a/0x3b0
<4>[   43.212904]  seq_read+0xe8/0x3b0
<4>[   43.212911]  ? debug_check_no_obj_freed+0x11f/0x230
<4>[   43.212918]  full_proxy_read+0x4b/0x70
<4>[   43.212924]  __vfs_read+0x1e/0x120
<4>[   43.212930]  ? do_sys_open+0x197/0x1f0
<4>[   43.212936]  ? rcu_read_lock_sched_held+0x6f/0x80
<4>[   43.212941]  ? kmem_cache_free+0x285/0x310
<4>[   43.212945]  vfs_read+0x9e/0x150
<4>[   43.212951]  SyS_read+0x40/0xa0
<4>[   43.212956]  do_syscall_64+0x65/0x1a0
<4>[   43.212962]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
<4>[   43.212966] RIP: 0033:0x7ff3c49ed34e
<4>[   43.212969] RSP: 002b:00007ffef04964f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
<4>[   43.212974] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff3c49ed34e
<4>[   43.212978] RDX: 0000000000000200 RSI: 00007ffef0496540 RDI: 0000000000000006
<4>[   43.212981] RBP: 00007ffef0496750 R08: 0000000000000000 R09: 0000000000000020
<4>[   43.212984] R10: 0000000000000000 R11: 0000000000000246 R12: 0000555e27cf61d0
<4>[   43.212987] R13: 00007ffef0496c00 R14: 0000000000000000 R15: 0000000000000000
<4>[   43.212995] Code: b2 f1 e0 0f 0b eb c5 80 3d ec e8 1e 00 00 75 c6 48 c7 c7 50 cf 2a a0 c6 05 dc e8 1e 00 01 e8 04 b2 f1 e0 0f 0b eb af 41 54 55 53 <80> bf 6c ae 00 00 00 48 89 fb 48 8b af 10 04 00 00 4c 8d a5 a0 
<1>[   43.213128] RIP: intel_runtime_pm_put+0x4/0x90 [i915] RSP: ffffc90000ab3d20
<4>[   43.213131] CR2: 000003030000b16f
<4>[   43.213136] ---[ end trace b8e18379265bbb83 ]---
<3>[   43.242052] BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:34
<3>[   43.242056] in_atomic(): 0, irqs_disabled(): 1, pid: 1257, name: debugfs_test
<4>[   43.242059] INFO: lockdep is turned off.
<4>[   43.242061] irq event stamp: 133638
<4>[   43.242069] hardirqs last  enabled at (133637): [<00000000bedb49a0>] _raw_spin_unlock_irqrestore+0x4c/0x60
<4>[   43.242072] hardirqs last disabled at (133638): [<000000005163ed21>] error_entry+0x78/0xf0
<4>[   43.242076] softirqs last  enabled at (133340): [<000000007d0a2c70>] __do_softirq+0x3a1/0x4aa
<4>[   43.242079] softirqs last disabled at (133319): [<0000000048cf38b4>] irq_exit+0xa4/0xb0
<4>[   43.242083] CPU: 1 PID: 1257 Comm: debugfs_test Tainted: G      D          4.16.0-rc5-CI-CI_DRM_3916+ #1
<4>[   43.242085] Hardware name: Intel Corporation CannonLake Client Platform/CannonLake Y LPDDR4 RVP, BIOS CNLSFWR1.R00.X114.B11.1712190231 12/19/2017
<4>[   43.242088] Call Trace:
<4>[   43.242094]  dump_stack+0x5f/0x86
<4>[   43.242098]  ___might_sleep+0x1d9/0x240
<4>[   43.242104]  exit_signals+0x1b/0x2a0
<4>[   43.242108]  do_exit+0x93/0xcb0
<4>[   43.242113]  ? SyS_read+0x40/0xa0
<4>[   43.242119]  rewind_stack_do_exit+0x17/0x20
Comment 2 Chris Wilson 2018-03-13 11:42:56 UTC
*** Bug 105481 has been marked as a duplicate of this bug. ***
Comment 3 Chris Wilson 2018-03-13 11:43:44 UTC
*** Bug 105480 has been marked as a duplicate of this bug. ***
Comment 4 Marta Löfstedt 2018-03-13 11:48:15 UTC
testimony from IRC:

<ickle> https://bugs.freedesktop.org/show_bug.cgi?id=105479 is my fault
Comment 5 Chris Wilson 2018-03-13 12:38:03 UTC
commit c7fb3c6c1893fddbbd39e13066489050c29397c1
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Mar 13 11:31:49 2018 +0000

    drm/i915: Use sseu size for determining eu_regs[]
    
    eu_regs[] is written 2*max_slices times (like s_reg[]) but oddly read
    2*max_slices + max_subslices/2 times. Allocate the array large enough
    for the writes to avoid overwriting our stack and worry about the logic
    later.
    
    Fixes: 7aa0b14ede64 ("drm/i915: Remove variable length arrays from sseu debugfs printers")
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105479
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Cc: Matthew Auld <matthew.auld@intel.com>
    Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180313113149.1094-1-chris@chris-wilson.co.uk
Comment 6 Marta Löfstedt 2018-03-13 13:47:54 UTC
I will monitor and close if/when results are OK
Comment 7 Marta Löfstedt 2018-03-14 06:24:51 UTC
Fix integrated into CI_DRM_3817; all is green.
Comment 8 Martin Peres 2018-03-14 07:26:05 UTC
*** Bug 105498 has been marked as a duplicate of this bug. ***
