Bug 111021 - [kernel >=5.2.x][amdgpu][CIK] BUG: KASAN: null-ptr-deref in amdgpu_ib_schedule+0x82/0x790 [amdgpu]
Status: RESOLVED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/AMDgpu
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
 
Reported: 2019-06-28 22:52 UTC by erhard_f
Modified: 2019-10-02 13:28 UTC


Attachments
kernel .dmesg (5.2-rc6) (57.10 KB, text/plain)
2019-06-28 22:52 UTC, erhard_f
kernel .config (5.2-rc6) (99.59 KB, text/plain)
2019-06-28 22:54 UTC, erhard_f
shaders (2.30 KB, application/gzip)
2019-06-28 22:55 UTC, erhard_f
kernel dmesg (5.2.1) (55.63 KB, text/plain)
2019-07-21 13:21 UTC, erhard_f
kernel .config (5.2.1) (102.00 KB, text/plain)
2019-07-21 13:22 UTC, erhard_f
kernel dmesg (5.2.5) (59.58 KB, text/plain)
2019-08-02 18:26 UTC, erhard_f
kernel .config (5.3-rc2) (101.27 KB, text/plain)
2019-08-02 18:28 UTC, erhard_f
kernel dmesg (5.3-rc2) (62.90 KB, text/plain)
2019-08-02 18:29 UTC, erhard_f
kernel .config (5.3-rc7) (101.48 KB, text/plain)
2019-09-06 17:01 UTC, erhard_f
kernel dmesg (5.3-rc7) (62.75 KB, text/plain)
2019-09-06 17:01 UTC, erhard_f


Description erhard_f 2019-06-28 22:52:28 UTC
Created attachment 144678
kernel .dmesg (5.2-rc6)

[...]
[  440.685185] cp queue preemption time out
[  440.685338] Resetting wave fronts (nocpsch) on dev 00000000feee3825
[  440.685426] BUG: kernel NULL pointer dereference, address: 0000000000000038
[  440.685432] #PF: supervisor read access in kernel mode
[  440.685436] #PF: error_code(0x0000) - not-present page
[  440.685440] PGD 0 P4D 0 
[  440.685448] Oops: 0000 [#1] SMP NOPTI
[  440.685455] CPU: 3 PID: 1026 Comm: xmr-stak Not tainted 5.2.0-rc6 #1
[  440.685459] Hardware name: System manufacturer System Product Name/M5A78L-M LX3, BIOS 1401    05/05/2016
[  440.685610] RIP: 0010:amdgpu_ib_schedule+0x4b/0x520 [amdgpu]
[  440.685616] Code: 89 f5 49 89 ff 48 89 54 24 08 0f b6 87 38 04 00 00 48 85 c9 0f 84 5d 03 00 00 48 8b 91 b0 00 00 00 48 89 54 24 10 48 8b 51 10 <48> 8b 52 38 48 89 14 24 84 c0 0f 84 09 e2 17 00 48 83 7c 24 10 00
[  440.685621] RSP: 0018:ffffac368c2a7ad0 EFLAGS: 00010286
[  440.685626] RAX: 0000000000000001 RBX: ffff97d66533dc00 RCX: ffff97d66533dc00
[  440.685630] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff97d685fe7d48
[  440.685634] RBP: 0000000000000001 R08: ffffac368c2a7b48 R09: 0000000000000001
[  440.685638] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000007
[  440.685642] R13: 0000000000ffd000 R14: ffff97d685fe0000 R15: ffff97d685fe7d48
[  440.685647] FS:  00007f2115109700(0000) GS:ffff97d6a6ac0000(0000) knlGS:0000000000000000
[  440.685651] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  440.685655] CR2: 0000000000000038 CR3: 00000003e4236000 CR4: 00000000000406e0
[  440.685659] Call Trace:
[  440.685669]  ? rcu_read_lock_sched_held+0x50/0x60
[  440.685807]  amdgpu_amdkfd_submit_ib+0xb6/0x170 [amdgpu]
[  440.685949]  deallocate_vmid.isra.12+0xe4/0xf0 [amdgpu]
[  440.686091]  destroy_queue_nocpsch_locked+0x176/0x190 [amdgpu]
[  440.686233]  process_termination_nocpsch+0x5e/0x130 [amdgpu]
[  440.686373]  kfd_process_dequeue_from_all_devices+0x36/0x50 [amdgpu]
[  440.686512]  kfd_process_notifier_release+0xf4/0x180 [amdgpu]
[  440.686519]  __mmu_notifier_release+0x65/0x110
[  440.686527]  exit_mmap+0x3b/0x170
[  440.686534]  mmput+0x45/0x110
[  440.686539]  do_exit+0x27d/0xb90
[  440.686546]  ? find_held_lock+0x2d/0x90
[  440.686551]  ? get_signal+0xcc/0xaa0
[  440.686556]  do_group_exit+0x42/0xb0
[  440.686561]  get_signal+0x119/0xaa0
[  440.686568]  do_signal+0x3e/0x620
[  440.686574]  ? find_held_lock+0x2d/0x90
[  440.686580]  exit_to_usermode_loop+0x4b/0xa0
[  440.686585]  do_syscall_64+0x149/0x1a0
[  440.686591]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  440.686596] RIP: 0033:0x7f212b976f6c
[  440.686604] Code: Bad RIP value.
[  440.686608] RSP: 002b:00007f2115108d30 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
[  440.686614] RAX: fffffffffffffe00 RBX: 00007f211d838c48 RCX: 00007f212b976f6c
[  440.686618] RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f211d838c70
[  440.686622] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f2115109700
[  440.686626] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000010
[  440.686630] R13: 00007f211d838c20 R14: 0000000000000000 R15: 00007f211d838c70
[  440.686634] Modules linked in: fuse sha256_ssse3 sha256_generic cfg80211 rfkill dm_crypt nhpoly1305_sse2 nhpoly1305 chacha_x86_64 chacha_generic adiantum poly1305_generic algif_skcipher af_alg ext4 crc16 mbcache jbd2 input_leds led_class joydev hid_generic usbhid hid crct10dif_pclmul crc32_generic crc32_pclmul ghash_generic gf128mul gcm xts ctr dm_mod cbc amdgpu ecb evdev gpu_sched ohci_pci i2c_algo_bit ttm snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi drm_kms_helper ehci_pci ohci_hcd cfbfillrect syscopyarea snd_hda_intel cfbimgblt k10temp sysfillrect ehci_hcd aesni_intel sysimgblt fb_sys_fops snd_hda_codec cfbcopyarea fb snd_hwdep usbcore aes_x86_64 snd_hda_core fam15h_power hwmon i2c_piix4 usb_common font glue_helper crypto_simd sr_mod snd_pcm cryptd fbdev cdrom button snd_timer drm acpi_cpufreq snd alx drm_panel_orientation_quirks soundcore processor backlight mdio lzo nfsd auth_rpcgss lockd grace zstd sunrpc sg zram zsmalloc
[  440.686714] CR2: 0000000000000038
[  440.686720] ---[ end trace 39cfe5e575b273f7 ]---
[  440.686847] RIP: 0010:amdgpu_ib_schedule+0x4b/0x520 [amdgpu]
[  440.686852] Code: 89 f5 49 89 ff 48 89 54 24 08 0f b6 87 38 04 00 00 48 85 c9 0f 84 5d 03 00 00 48 8b 91 b0 00 00 00 48 89 54 24 10 48 8b 51 10 <48> 8b 52 38 48 89 14 24 84 c0 0f 84 09 e2 17 00 48 83 7c 24 10 00
[  440.686857] RSP: 0018:ffffac368c2a7ad0 EFLAGS: 00010286
[  440.686862] RAX: 0000000000000001 RBX: ffff97d66533dc00 RCX: ffff97d66533dc00
[  440.686866] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff97d685fe7d48
[  440.686869] RBP: 0000000000000001 R08: ffffac368c2a7b48 R09: 0000000000000001
[  440.686873] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000007
[  440.686877] R13: 0000000000ffd000 R14: ffff97d685fe0000 R15: ffff97d685fe7d48
[  440.686882] FS:  00007f2115109700(0000) GS:ffff97d6a6ac0000(0000) knlGS:0000000000000000
[  440.686887] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  440.686890] CR2: 00007f212b976f42 CR3: 00000003e4236000 CR4: 00000000000406e0
[  440.686894] Fixing recursive fault but reboot is needed!

This happens every time xmr-stak 2.10.5 (with ROCm 2.5) tries to compile shaders for this R9 290X. An ~/.AMD archive is generated, but the compilation never finishes. When I close the shell while xmr-stak is still running (Ctrl-C does not stop xmr-stak), I get this kernel BUG. I used a 5.2-rc6 debug kernel, but it also happens on 5.1.15.

The card is a Sapphire Radeon R9 290X Tri-X OC (11226-18-20G). Additional information about the system:

Machine:   Type: Desktop Mobo: ASUSTeK model: M5A78L-M LX3 v: Rev X.0x serial: <root required> BIOS: American Megatrends 
           v: 1401 date: 05/05/2016 
CPU:       6-Core: AMD FX-6300 type: MCP speed: 3817 MHz min/max: 1400/3800 MHz 
Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Hawaii XT / Grenada XT [Radeon R9 290X/390X] driver: amdgpu v: kernel 
           Display: x11 server: X.Org 1.20.4 driver: amdgpu,ati unloaded: modesetting,radeon resolution: 1920x1080~60Hz 
           OpenGL: renderer: AMD Radeon R9 200 Series (HAWAII DRM 3.30.0 5.1.15-gentoo LLVM 8.0.0) v: 4.5 Mesa 19.0.8
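
A note on reading the trace above: the "Code:" line marks the faulting instruction with angle brackets, and here it is "<48> 8b 52 38". The following is only a minimal Python sketch, not a general disassembler. It assumes the simple encoding seen in this report (REX.W prefix 0x48, opcode 0x8b, a ModRM byte with an 8-bit displacement and no SIB byte), and the helper name decode_faulting_mov is hypothetical. It shows how the bytes map to a load from [rdx+0x38], which matches RDX = 0 and CR2 = 0x38 in the register dump.

```python
import re

# Register names indexed by their 3-bit encoding (no REX.R/REX.B extension).
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_faulting_mov(code_line):
    """Decode the <..>-marked instruction of an oops 'Code:' line.

    Handles only 'mov r64, [r64 + disp8]': REX.W (0x48), opcode 0x8b,
    ModRM with mod=01 and no SIB byte. Anything else raises ValueError.
    Returns (destination register, base register, displacement).
    """
    m = re.search(r"<([0-9a-f]{2})>((?: [0-9a-f]{2})+)", code_line)
    if m is None:
        raise ValueError("no marked instruction found")
    raw = [int(m.group(1), 16)] + [int(b, 16) for b in m.group(2).split()]
    if len(raw) < 4:
        raise ValueError("not enough bytes after the marker")
    rex, opcode, modrm, disp = raw[:4]
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("not a REX.W mov r64, r/m64")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:
        raise ValueError("only [reg + disp8] without SIB is handled")
    if disp >= 0x80:  # disp8 is a signed byte
        disp -= 0x100
    return REGS[reg], REGS[rm], disp


# Tail of the Code: line from the 5.2-rc6 oops in this report.
dest, base, disp = decode_faulting_mov("Code: 48 8b 51 10 <48> 8b 52 38 48 89 14 24")
print(f"mov {dest}, [{base}+{disp:#x}]")
```

With the base register holding 0, the read lands at address 0x38, so the pointer loaded by the preceding instruction was NULL.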
Comment 1 erhard_f 2019-06-28 22:54:26 UTC
Created attachment 144679
kernel .config (5.2-rc6)
Comment 2 erhard_f 2019-06-28 22:55:30 UTC
Created attachment 144680
shaders
Comment 3 erhard_f 2019-07-21 13:21:57 UTC
Created attachment 144832
kernel dmesg (5.2.1)

Kernel 5.2.1 still affected.
Comment 4 erhard_f 2019-07-21 13:22:29 UTC
Created attachment 144833
kernel .config (5.2.1)
Comment 5 erhard_f 2019-08-02 18:26:15 UTC
Created attachment 144931
kernel dmesg (5.2.5)

More detailed debug info with KASAN.
Comment 6 erhard_f 2019-08-02 18:28:21 UTC
Created attachment 144932
kernel .config (5.3-rc2)
Comment 7 erhard_f 2019-08-02 18:29:57 UTC
Created attachment 144933
kernel dmesg (5.3-rc2)

[...]
[  214.315038] cp queue preemption time out
[  214.315406] Resetting wave fronts (nocpsch) on dev 00000000c3d0b577
[  214.316011] ==================================================================
[  214.316631] BUG: KASAN: null-ptr-deref in amdgpu_ib_schedule+0x7c/0x7f0 [amdgpu]
[  214.316664] Read of size 8 at addr 0000000000000038 by task xmr-stak/1130

[  214.316724] CPU: 5 PID: 1130 Comm: xmr-stak Not tainted 5.3.0-rc2 #1
[  214.316754] Hardware name: System manufacturer System Product Name/M5A78L-M LX3, BIOS 1401    05/05/2016
[  214.316783] Call Trace:
[  214.316818]  dump_stack+0x7c/0xc0
[  214.317258]  ? amdgpu_ib_schedule+0x7c/0x7f0 [amdgpu]
[  214.317696]  ? amdgpu_ib_schedule+0x7c/0x7f0 [amdgpu]
[  214.317730]  __kasan_report.cold.6+0x5/0x3c
[  214.318168]  ? amdgpu_ib_schedule+0x7c/0x7f0 [amdgpu]
[  214.318606]  amdgpu_ib_schedule+0x7c/0x7f0 [amdgpu]
[  214.318640]  ? kasan_unpoison_shadow+0x30/0x40
[  214.318672]  ? __kasan_kmalloc.constprop.7+0xc1/0xd0
[  214.319110]  ? amdgpu_sync_create+0x32/0x50 [amdgpu]
[  214.319568]  amdgpu_amdkfd_submit_ib+0x13c/0x230 [amdgpu]
[  214.320026]  ? amdgpu_amdkfd_get_num_gws+0x20/0x20 [amdgpu]
[  214.320487]  ? dbgdev_wave_control_diq+0x280/0x280 [amdgpu]
[  214.320520]  ? wake_up_klogd+0x2b/0x30
[  214.320550]  ? vprintk_emit+0xdc/0x260
[  214.320581]  ? memset+0x1f/0x40
[  214.321040]  deallocate_vmid.isra.12+0x25a/0x270 [amdgpu]
[  214.321503]  destroy_queue_nocpsch_locked+0x33d/0x360 [amdgpu]
[  214.321962]  ? init_mqd_sdma+0x90/0x90 [amdgpu]
[  214.322424]  process_termination_nocpsch+0xb1/0x280 [amdgpu]
[  214.322886]  kfd_process_dequeue_from_all_devices+0x79/0xa0 [amdgpu]
[  214.323345]  kfd_process_notifier_release+0x1ab/0x250 [amdgpu]
[  214.323382]  __mmu_notifier_release+0x9d/0x1c0
[  214.323414]  ? check_chain_key+0x1d7/0x2e0
[  214.323446]  exit_mmap+0x7c/0x280
[  214.323479]  ? __ia32_sys_munmap+0x30/0x30
[  214.323512]  ? aio_poll_wake+0x3c0/0x3c0
[  214.323543]  ? lock_downgrade+0x390/0x390
[  214.323574]  ? up_read+0x12c/0x370
[  214.323606]  ? rwlock_bug.part.2+0x50/0x50
[  214.323638]  mmput+0x99/0x1f0
[  214.323671]  do_exit+0x3cc/0x12e0
[  214.323703]  ? queued_spin_lock_slowpath+0x366/0x420
[  214.323735]  ? check_chain_key+0x1d7/0x2e0
[  214.323766]  ? mm_update_next_owner+0x340/0x340
[  214.323798]  ? lock_downgrade+0x390/0x390
[  214.323830]  ? do_raw_spin_lock+0x10e/0x1d0
[  214.323861]  ? match_held_lock+0x2e/0x240
[  214.323892]  do_group_exit+0x86/0x130
[  214.323925]  get_signal+0x1bc/0xeb0
[  214.323958]  ? refcount_sub_and_test_checked+0xaf/0x150
[  214.323992]  do_signal+0x9e/0xad0
[  214.324024]  ? wake_up_q+0x72/0x90
[  214.324054]  ? rwsem_wake.isra.9+0xb3/0xf0
[  214.324085]  ? rwsem_mark_wake+0x4d0/0x4d0
[  214.324116]  ? setup_sigcontext+0x250/0x250
[  214.324149]  ? __x64_sys_futex+0x1d3/0x240
[  214.324179]  ? down_read_nested+0x2b0/0x2b0
[  214.324211]  ? trace_hardirqs_on_thunk+0x1a/0x20
[  214.324242]  ? mark_held_locks+0x29/0xa0
[  214.324272]  ? exit_to_usermode_loop+0x41/0x130
[  214.324303]  exit_to_usermode_loop+0x59/0x130
[  214.324334]  do_syscall_64+0x1fd/0x250
[  214.324368]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  214.324398] RIP: 0033:0x7fd134c26f6c
[  214.324433] Code: Bad RIP value.
[  214.324462] RSP: 002b:00007fd11b7fdd30 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
[  214.324496] RAX: fffffffffffffe00 RBX: 00007fd125838c48 RCX: 00007fd134c26f6c
[  214.324525] RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007fd125838c74
[  214.324554] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007fd108000b20
[  214.324582] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000007
[  214.324611] R13: 00007fd125838c20 R14: 0000000000000000 R15: 00007fd125838c74
[  214.324640] ==================================================================
[  214.324666] Disabling lock debugging due to kernel taint
[  214.324680] BUG: kernel NULL pointer dereference, address: 0000000000000038
[  214.324691] #PF: supervisor read access in kernel mode
[  214.324700] #PF: error_code(0x0000) - not-present page
[  214.324708] PGD 0 P4D 0 
[  214.324722] Oops: 0000 [#1] SMP KASAN NOPTI
[  214.324736] CPU: 5 PID: 1130 Comm: xmr-stak Tainted: G    B             5.3.0-rc2 #1
[  214.324746] Hardware name: System manufacturer System Product Name/M5A78L-M LX3, BIOS 1401    05/05/2016
[  214.325166] RIP: 0010:amdgpu_ib_schedule+0x7c/0x7f0 [amdgpu]
[  214.325180] Code: 00 00 49 8d 7d 70 e8 e3 d0 73 df 49 8b 45 70 49 8d 7d 10 48 89 44 24 38 e8 d1 d0 73 df 49 8b 6d 10 48 8d 7d 38 e8 c4 d0 73 df <48> 8b 45 38 48 89 44 24 20 45 84 e4 0f 84 e8 21 30 00 48 83 7c 24
[  214.325191] RSP: 0018:ffff888378a9f6b0 EFLAGS: 00010286
[  214.325204] RAX: 0000000000000000 RBX: ffff88837a5884d8 RCX: ffffffffa0105081
[  214.325214] RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffffffffa1968f34
[  214.325224] RBP: 0000000000000000 R08: fffffbfff42e638d R09: fffffbfff42e638d
[  214.325234] R10: fffffbfff42e638c R11: ffffffffa1731c63 R12: 0000000000000001
[  214.325244] R13: ffff8883475050a8 R14: 0000000000000001 R15: 0000000000ffd000
[  214.325255] FS:  00007fd11b7fe700(0000) GS:ffff8883e6880000(0000) knlGS:0000000000000000
[  214.325265] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  214.325275] CR2: 0000000000000038 CR3: 0000000373628000 CR4: 00000000000406e0
[  214.325283] Call Trace:
[  214.325299]  ? kasan_unpoison_shadow+0x30/0x40
[  214.325312]  ? __kasan_kmalloc.constprop.7+0xc1/0xd0
[  214.325729]  ? amdgpu_sync_create+0x32/0x50 [amdgpu]
[  214.326163]  amdgpu_amdkfd_submit_ib+0x13c/0x230 [amdgpu]
[  214.326597]  ? amdgpu_amdkfd_get_num_gws+0x20/0x20 [amdgpu]
[  214.327035]  ? dbgdev_wave_control_diq+0x280/0x280 [amdgpu]
[  214.327048]  ? wake_up_klogd+0x2b/0x30
[  214.327059]  ? vprintk_emit+0xdc/0x260
[  214.327070]  ? memset+0x1f/0x40
[  214.327507]  deallocate_vmid.isra.12+0x25a/0x270 [amdgpu]
[  214.327946]  destroy_queue_nocpsch_locked+0x33d/0x360 [amdgpu]
[  214.328382]  ? init_mqd_sdma+0x90/0x90 [amdgpu]
[  214.328819]  process_termination_nocpsch+0xb1/0x280 [amdgpu]
[  214.329257]  kfd_process_dequeue_from_all_devices+0x79/0xa0 [amdgpu]
[  214.329694]  kfd_process_notifier_release+0x1ab/0x250 [amdgpu]
[  214.329709]  __mmu_notifier_release+0x9d/0x1c0
[  214.329721]  ? check_chain_key+0x1d7/0x2e0
[  214.329732]  exit_mmap+0x7c/0x280
[  214.329746]  ? __ia32_sys_munmap+0x30/0x30
[  214.329758]  ? aio_poll_wake+0x3c0/0x3c0
[  214.329771]  ? lock_downgrade+0x390/0x390
[  214.329782]  ? up_read+0x12c/0x370
[  214.329795]  ? rwlock_bug.part.2+0x50/0x50
[  214.329808]  mmput+0x99/0x1f0
[  214.329820]  do_exit+0x3cc/0x12e0
[  214.329834]  ? queued_spin_lock_slowpath+0x366/0x420
[  214.329846]  ? check_chain_key+0x1d7/0x2e0
[  214.329858]  ? mm_update_next_owner+0x340/0x340
[  214.329871]  ? lock_downgrade+0x390/0x390
[  214.329884]  ? do_raw_spin_lock+0x10e/0x1d0
[  214.329896]  ? match_held_lock+0x2e/0x240
[  214.329908]  do_group_exit+0x86/0x130
[  214.329921]  get_signal+0x1bc/0xeb0
[  214.329934]  ? refcount_sub_and_test_checked+0xaf/0x150
[  214.329947]  do_signal+0x9e/0xad0
[  214.329959]  ? wake_up_q+0x72/0x90
[  214.329970]  ? rwsem_wake.isra.9+0xb3/0xf0
[  214.329981]  ? rwsem_mark_wake+0x4d0/0x4d0
[  214.329994]  ? setup_sigcontext+0x250/0x250
[  214.330006]  ? __x64_sys_futex+0x1d3/0x240
[  214.330017]  ? down_read_nested+0x2b0/0x2b0
[  214.330029]  ? trace_hardirqs_on_thunk+0x1a/0x20
[  214.330041]  ? mark_held_locks+0x29/0xa0
[  214.330052]  ? exit_to_usermode_loop+0x41/0x130
[  214.330064]  exit_to_usermode_loop+0x59/0x130
[  214.330076]  do_syscall_64+0x1fd/0x250
[  214.330089]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  214.330100] RIP: 0033:0x7fd134c26f6c
[  214.330112] Code: Bad RIP value.
[  214.330121] RSP: 002b:00007fd11b7fdd30 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
[  214.330134] RAX: fffffffffffffe00 RBX: 00007fd125838c48 RCX: 00007fd134c26f6c
[  214.330143] RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007fd125838c74
[  214.330153] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007fd108000b20
[  214.330162] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000007
[  214.330171] R13: 00007fd125838c20 R14: 0000000000000000 R15: 00007fd125838c74
[  214.330181] Modules linked in: fuse cfg80211 rfkill dm_crypt nhpoly1305_sse2 nhpoly1305 chacha_x86_64 chacha_generic adiantum poly1305_generic algif_skcipher crct10dif_pclmul crc32_generic crc32_pclmul ghash_generic gf128mul gcm dm_mod input_leds led_class xts joydev hid_generic ctr usbhid hid cbc ext4 crc16 mbcache jbd2 ecb amdgpu aesni_intel aes_x86_64 glue_helper crypto_simd evdev cryptd k10temp fam15h_power sr_mod cdrom hwmon gpu_sched snd_hda_codec_realtek i2c_algo_bit snd_hda_codec_generic ttm snd_hda_codec_hdmi ohci_pci drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea fb font fbdev snd_hda_intel drm snd_hda_codec alx mdio snd_hwdep drm_panel_orientation_quirks backlight ehci_pci snd_hda_core ohci_hcd snd_pcm ehci_hcd acpi_cpufreq snd_timer usbcore button processor snd usb_common soundcore i2c_piix4 lzo sg zstd zram zsmalloc
[  214.330329] CR2: 0000000000000038
[  214.330342] ---[ end trace c1688762b8700f92 ]---
[  214.330760] RIP: 0010:amdgpu_ib_schedule+0x7c/0x7f0 [amdgpu]
[  214.330773] Code: 00 00 49 8d 7d 70 e8 e3 d0 73 df 49 8b 45 70 49 8d 7d 10 48 89 44 24 38 e8 d1 d0 73 df 49 8b 6d 10 48 8d 7d 38 e8 c4 d0 73 df <48> 8b 45 38 48 89 44 24 20 45 84 e4 0f 84 e8 21 30 00 48 83 7c 24
[  214.330784] RSP: 0018:ffff888378a9f6b0 EFLAGS: 00010286
[  214.330796] RAX: 0000000000000000 RBX: ffff88837a5884d8 RCX: ffffffffa0105081
[  214.330806] RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffffffffa1968f34
[  214.330816] RBP: 0000000000000000 R08: fffffbfff42e638d R09: fffffbfff42e638d
[  214.330826] R10: fffffbfff42e638c R11: ffffffffa1731c63 R12: 0000000000000001
[  214.330835] R13: ffff8883475050a8 R14: 0000000000000001 R15: 0000000000ffd000
[  214.330847] FS:  00007fd11b7fe700(0000) GS:ffff8883e6880000(0000) knlGS:0000000000000000
[  214.330857] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  214.330866] CR2: 00007fd134c26f42 CR3: 0000000373628000 CR4: 00000000000406e0
[  214.330876] Fixing recursive fault but reboot is needed!
Comment 8 erhard_f 2019-09-06 17:01:03 UTC
Created attachment 145282
kernel .config (5.3-rc7)
Comment 9 erhard_f 2019-09-06 17:01:43 UTC
Created attachment 145283
kernel dmesg (5.3-rc7)
Comment 10 erhard_f 2019-10-02 13:27:27 UTC
As of kernel 5.4-rc1 (and ROCm 2.8.0) the null-ptr-deref in amdgpu_ib_schedule+0x82/0x790 [amdgpu] is gone. Now the kernel reports bug #111881 when starting xmr-stak.
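
A note on comparing the traces in this report: the offset inside amdgpu_ib_schedule differs between logs (+0x4b/0x520 on 5.2-rc6, +0x7c/0x7f0 on the 5.3-rc2 KASAN build, +0x82/0x790 in the bug title), presumably because each kernel was built from a different version and config. A small Python sketch, with the hypothetical helper names parse_frame and same_fault_site, that extracts the symbol, offset, function size and module from a trace line so that two logs can be matched by symbol rather than by raw offset:

```python
import re

# Matches "symbol+0xOFF/0xSIZE" with an optional " [module]" suffix,
# as printed in RIP:, KASAN and call-trace lines.
FRAME_RE = re.compile(
    r"([A-Za-z_][\w.]*)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?: \[(\w+)\])?"
)


def parse_frame(line):
    """Return (symbol, offset, size, module) for the first frame in line.

    module is None for symbols in the core kernel; the whole result is
    None if the line contains no symbol+offset/size expression.
    """
    m = FRAME_RE.search(line)
    if m is None:
        return None
    symbol, offset, size, module = m.groups()
    return symbol, int(offset, 16), int(size, 16), module


def same_fault_site(line_a, line_b):
    """True if both lines name the same symbol in the same module."""
    a, b = parse_frame(line_a), parse_frame(line_b)
    return a is not None and b is not None and (a[0], a[3]) == (b[0], b[3])


# Two lines taken from the logs in this report.
old = "[  440.685610] RIP: 0010:amdgpu_ib_schedule+0x4b/0x520 [amdgpu]"
new = "[  214.316631] BUG: KASAN: null-ptr-deref in amdgpu_ib_schedule+0x7c/0x7f0 [amdgpu]"
print(parse_frame(old), parse_frame(new), same_fault_site(old, new))
```

Matching on symbol and module only tells you the fault is in the same function; it does not prove the same source line is involved.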

