Calling amdgpu_read_mm_registers(dev, 0x8010 / 4, -1, 0xffffffff, 0, out) from libdrm_amdgpu leads to this dump:

WARNING: CPU: 3 PID: 30278 at mm/page_alloc.c:4377 __alloc_pages_nodemask+0x241/0x2b0
CPU: 3 PID: 30278 Comm: radeontop Not tainted 4.19.0-5-amd64 #1 Debian 4.19.37-5+deb10u1
RIP: 0010:__alloc_pages_nodemask+0x241/0x2b0
Code: 89 f7 89 ee 45 31 f6 e8 bd d5 ff ff e9 fb fe ff ff e8 e3 ac 01 00 e9 cb fe ff ff 45 31 f6 81 e7 00 02 00 00 0f 85 e7 fe ff ff <0f> 0b e9 e0 fe ff ff 31 c0 e9 6a fe ff ff 65 48 8b 04 25 40 5c 01
RSP: 0018:ffffb64a01c27a58 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8b4853df0000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000016 RDI: 0000000000000000
RBP: 00000003fffffffc R08: 0000000000000001 R09: ffffffffc0f01ebf
R10: 0000000000000000 R11: 0000000000000000 R12: 00000000006000c0
R13: ffffb64a01c27d98 R14: 0000000000000000 R15: 0000000000000008
FS: 00007fa12fe5f280(0000) GS:ffff8b4856f80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa12f5f45b0 CR3: 000000010e498000 CR4: 00000000000406e0
Call Trace:
 kmalloc_order+0x14/0x30
 kmalloc_order_trace+0x1d/0xa0
 amdgpu_info_ioctl+0x908/0x1290 [amdgpu]
 ? get_page_from_freelist+0x7be/0x11b0
 ? unix_destruct_scm+0x80/0xa0
 ? select_idle_sibling+0x22/0x3a0
 ? kmem_cache_free+0x1a7/0x1d0
 ? free_unref_page_commit+0x91/0x100
 ? amdgpu_firmware_info.isra.5+0x210/0x210 [amdgpu]
 drm_ioctl_kernel+0xa1/0xf0 [drm]
 drm_ioctl+0x206/0x3a0 [drm]
 ? amdgpu_firmware_info.isra.5+0x210/0x210 [amdgpu]
 ? tlb_finish_mmu+0x1f/0x30
 ? unmap_region+0xdd/0x110
 amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
 do_vfs_ioctl+0xa4/0x630
 ksys_ioctl+0x60/0x90
 __x64_sys_ioctl+0x16/0x20
 do_syscall_64+0x53/0x110
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fa12faa8427
Code: 00 00 90 48 8b 05 69 aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 aa 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffc737ffda8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00005561b8c625b0 RCX: 00007fa12faa8427
RDX: 00007ffc737ffdf0 RSI: 0000000040206445 RDI: 0000000000000003
RBP: 00007ffc737ffdf0 R08: 0000000000000000 R09: 00005561b8c6a950
R10: fffffffffffffd06 R11: 0000000000000246 R12: 0000000040206445
R13: 0000000000000003 R14: 00007ffc7380002b R15: 0000000000000000
---[ end trace e7c99a8c5897d841 ]---

libdrm's amdgpu_read_mm_registers() calls drmCommandWrite(DRM_AMDGPU_INFO) with the AMDGPU_INFO_READ_MMR_REG query, which ends up in the kernel's amdgpu_info_ioctl() in amdgpu_kms.c.

It is not always reproducible, but it seems I can trigger it once per boot.

The system is:
Debian 10 buster amd64
Linux 4.19.37
libdrm 2.4.97
chipset KAVERI

Tell me if you need more info.

Thanks!
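
To make the size of the request concrete, here is a minimal sketch of the arithmetic, assuming the kernel allocates one 32-bit slot per requested register. mmr_read_bytes is a hypothetical helper written for illustration, not a libdrm or kernel function. With count passed as -1, i.e. 0xffffffff as an unsigned 32-bit value, the product is 0x3fffffffc bytes, which appears to match the RBP value in the dump above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of libdrm or the kernel):
 * bytes needed to hold `count` 32-bit register values.
 * The multiplication is done in 64 bits so it cannot wrap.
 */
static uint64_t mmr_read_bytes(uint32_t count)
{
    return (uint64_t)count * sizeof(uint32_t);
}
```

mmr_read_bytes((uint32_t)-1) evaluates to 0x3fffffffc, roughly 16 GiB, far more than kmalloc can serve.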
Created attachment 145226
possible fix

The proposed fix is tested on the latest git. I'm unsure whether 65536 is a good limit: it could be as small as 64, but even though the longest run of consecutive registers is currently 48, that number may grow in the future and no one will remember to raise the limit. In any case it should not be larger than the PCI BAR area for memory-mapped registers, which on my KAVERI is 256K, thus 65536 registers.

ciao!
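
The 65536 figure follows from dividing the BAR size by the register width. A small sketch of that calculation, assuming 4-byte registers (regs_in_bar is a hypothetical name used only here):

```c
#include <stdint.h>

/*
 * Hypothetical helper: how many 32-bit registers fit in a
 * memory-mapped register BAR of `bar_bytes` bytes.
 */
static uint64_t regs_in_bar(uint64_t bar_bytes)
{
    return bar_bytes / sizeof(uint32_t);
}
```

For the 256K BAR reported on KAVERI, regs_in_bar(256 * 1024) gives 65536.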
Created attachment 145229
possible fix v2

Thanks to agd5f_, here is the patch with the limit updated to 128.
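
For readers without the attachment at hand, the idea of the limit can be sketched as below. This is not the attached patch: the names MMR_READ_MAX_COUNT and mmr_read_check_count are hypothetical, and returning -EINVAL is an assumption about how an out-of-range count would be rejected. Only the value 128 comes from this comment.

```c
#include <errno.h>
#include <stdint.h>

/* Cap on registers per request; 128 is the limit named in this comment. */
#define MMR_READ_MAX_COUNT 128u

/*
 * Hypothetical validator: reject a register count above the cap
 * before any buffer is sized from it.
 * Returns 0 when the count is acceptable, -EINVAL otherwise.
 */
static int mmr_read_check_count(uint32_t count)
{
    if (count > MMR_READ_MAX_COUNT)
        return -EINVAL;
    return 0;
}
```

With such a check in front of the allocation, a count of -1 (0xffffffff) would be refused instead of turning into a multi-gigabyte kmalloc request.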
Applied. thanks!