Summary: | HP HP ENVY x360, amdgpu, Call Trace tgn10_lock at startup, VMC page fault during runtime
---|---
Product: | DRI
Component: | DRM/AMDgpu
Status: | RESOLVED FIXED
Severity: | normal
Priority: | medium
Version: | unspecified
Hardware: | x86-64 (AMD64)
OS: | Linux (All)
Reporter: | cd <chris>
Assignee: | Default DRI bug account <dri-devel>
CC: | harry.wentland
Looks like two separate issues, which should be tracked separately. Should I split this up? The first error doesn't cause any issues when using the laptop. Only the second error actually freezes the graphics. I have now enabled amdgpu.dc_log=1 and drm.debug=6. When it crashes, I'll attach a new dmesg.
For the full dmesg, see the dmesg2 attachment.
--snip
[ 3.139563] [drm:construct [amdgpu]] *ERROR* construct: Invalid Connector ObjectID from Adapter Service for connector index:2!
--snip
[ 3.190734] [drm] Initialized amdgpu 3.23.0 20150101 for 0000:04:00.0 on minor 0
[ 3.217577] [drm:generic_reg_wait [amdgpu]] *ERROR* REG_WAIT timeout 1us * 100 tries - tgn10_lock line:566
--snip
[ 3.217682] Hardware name: HP HP ENVY x360 Convertible 15-bq1xx/83C6, BIOS F.13 11/10/2017
[ 3.217711] RIP: 0010:generic_reg_wait+0xee/0x120 [amdgpu]
[ 3.217712] RSP: 0018:ffffa4030192fa38 EFLAGS: 00010297
[ 3.217713] RAX: 0000000000000001 RBX: 0000000000000065 RCX: 0000000000000001
[ 3.217713] RDX: 0000000000000000 RSI: ffffffffbaeaa34a RDI: 00000000ffffffff
[ 3.217714] RBP: 0000000000000001 R08: ffffffffba49c4f0 R09: 00000000000003ff
[ 3.217715] R10: 0000000000000002 R11: ffffffffbb5c1f2d R12: ffff924545a78400
[ 3.217715] R13: 000000000000504d R14: 0000000000000100 R15: 0000000000000001
[ 3.217717] FS: 00007f8e35de1740(0000) GS:ffff92455ecc0000(0000) knlGS:0000000000000000
[ 3.217717] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.217718] CR2: 00007f8e3568010f CR3: 0000000205a76000 CR4: 00000000003406e0
[ 3.217719] Call Trace:
[ 3.217754] tgn10_lock+0x9e/0xb0 [amdgpu]
[ 3.217785] program_all_pipe_in_tree+0x1387/0x1440 [amdgpu]
[ 3.217817] dcn10_apply_ctx_for_surface+0x4a0/0x510 [amdgpu]
[ 3.217845] dc_commit_state+0x281/0x4b0 [amdgpu]
[ 3.217878] amdgpu_dm_atomic_commit_tail+0x2ab/0x9a0 [amdgpu]
[ 3.217901] ? amdgpu_bo_pin_restricted+0x1ac/0x290 [amdgpu]
[ 3.217904] ? kmem_cache_alloc_trace+0x1a9/0x1c0
[ 3.217934] ? dm_plane_helper_prepare_fb+0x1d1/0x240 [amdgpu]
[ 3.217942] commit_tail+0x3d/0x70 [drm_kms_helper]
[ 3.217947] drm_atomic_helper_commit+0xfc/0x110 [drm_kms_helper]
[ 3.217952] restore_fbdev_mode_atomic+0x181/0x1f0 [drm_kms_helper]
[ 3.217957] drm_fb_helper_restore_fbdev_mode_unlocked.part.25+0x23/0x70 [drm_kms_helper]
[ 3.217980] amdgpu_fbdev_restore_mode+0x1b/0x40 [amdgpu]
[ 3.218002] amdgpu_driver_lastclose_kms+0xe/0x20 [amdgpu]
[ 3.218011] drm_lastclose+0x37/0xf0 [drm]
[ 3.218018] drm_release+0x2c5/0x380 [drm]
[ 3.218021] __fput+0x9d/0x1e0
[ 3.218025] task_work_run+0x84/0xa0
[ 3.218028] exit_to_usermode_loop+0x96/0xa0
[ 3.218029] do_syscall_64+0x18a/0x190
[ 3.218033] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 3.218034] RIP: 0033:0x7f8e359f89f4
[ 3.218035] RSP: 002b:00007ffc2dd7f588 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[ 3.218036] RAX: 0000000000000000 RBX: 00007ffc2dd7f680 RCX: 00007f8e359f89f4
[ 3.218037] RDX: 00007ffc2dd7f5a0 RSI: 00000000c04064a0 RDI: 0000000000000004
[ 3.218037] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
[ 3.218038] R10: 0000000000000133 R11: 0000000000000246 R12: 00007ffc2dd7f5a0
[ 3.218038] R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000004
[ 3.218039] Code: 48 c7 c7 42 8d 10 c1 52 4c 8b 4c 24 58 48 c7 c2 e8 12 10 c1 44 8b 44 24 50 e8 ef e1 9b ff 41 83 7c 24 20 01 58 8b 44 24 08 74 02 <0f> 0b 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 c7 44 24 0c
[ 3.218061] ---[ end trace e459452396c96ef1 ]---
--snip
[ 190.230953] Hardware name: HP HP ENVY x360 Convertible 15-bq1xx/83C6, BIOS F.13 11/10/2017
[ 190.230996] RIP: 0010:generic_reg_wait+0xee/0x120 [amdgpu]
[ 190.230998] RSP: 0018:ffffa40303f7f898 EFLAGS: 00010297
[ 190.231000] RAX: 0000000000000001 RBX: 0000000000000065 RCX: 0000000000000001
[ 190.231001] RDX: 0000000000000000 RSI: ffffffffbaeaa34a RDI: 00000000ffffffff
[ 190.231002] RBP: 0000000000000001 R08: ffffffffba49c4f0 R09: 000000000000043f
[ 190.231003] R10: 0000000000000002 R11: ffffffffbb5c1f2d R12: ffff924545a78400
[ 190.231003] R13: 000000000000504d R14: 0000000000000100 R15: 0000000000000001
[ 190.231005] FS: 00007f94d9f22940(0000) GS:ffff92455ec40000(0000) knlGS:0000000000000000
[ 190.231006] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 190.231007] CR2: 00007f94c0000010 CR3: 0000000208392000 CR4: 00000000003406e0
[ 190.231008] Call Trace:
[ 190.231060] tgn10_lock+0x9e/0xb0 [amdgpu]
[ 190.231108] program_all_pipe_in_tree+0x1387/0x1440 [amdgpu]
[ 190.231114] ? __mod_zone_page_state+0x66/0xa0
[ 190.231160] dcn10_apply_ctx_for_surface+0x4a0/0x510 [amdgpu]
[ 190.231211] ? generic_reg_get+0x21/0x30 [amdgpu]
[ 190.231272] dc_commit_state+0x281/0x4b0 [amdgpu]
[ 190.231342] amdgpu_dm_atomic_commit_tail+0x2ab/0x9a0 [amdgpu]
[ 190.231347] ? preempt_count_add+0x68/0xa0
[ 190.231351] ? _raw_spin_lock_irq+0x1a/0x40
[ 190.231353] ? _raw_spin_unlock_irq+0x1d/0x30
[ 190.231356] ? wait_for_common+0x151/0x180
[ 190.231358] ? _raw_spin_unlock_irq+0x1d/0x30
[ 190.231361] ? wait_for_common+0x151/0x180
[ 190.231374] commit_tail+0x3d/0x70 [drm_kms_helper]
[ 190.231385] drm_atomic_helper_commit+0xfc/0x110 [drm_kms_helper]
[ 190.231395] drm_atomic_helper_set_config+0x80/0x90 [drm_kms_helper]
[ 190.231413] __drm_mode_set_config_internal+0x67/0x110 [drm]
[ 190.231431] drm_mode_setcrtc+0x3fb/0x5b0 [drm]
[ 190.231451] ? drm_mode_getcrtc+0x170/0x170 [drm]
[ 190.231466] drm_ioctl_kernel+0x5b/0xb0 [drm]
[ 190.231481] drm_ioctl+0x2d5/0x370 [drm]
[ 190.231499] ? drm_mode_getcrtc+0x170/0x170 [drm]
[ 190.231546] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[ 190.231551] do_vfs_ioctl+0xa4/0x630
[ 190.231557] ? SyS_futex+0x12d/0x180
[ 190.231560] SyS_ioctl+0x74/0x80
[ 190.231565] do_syscall_64+0x74/0x190
[ 190.231568] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 190.231572] RIP: 0033:0x7f94d77e6d87
[ 190.231573] RSP: 002b:00007ffe0bbf8f88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 190.231576] RAX: ffffffffffffffda RBX: 00007ffe0bbf8fc0 RCX: 00007f94d77e6d87
[ 190.231577] RDX: 00007ffe0bbf8fc0 RSI: 00000000c06864a2 RDI: 000000000000000d
[ 190.231578] RBP: 00007ffe0bbf8fc0 R08: 0000000000000000 R09: 0000556e6a0dc9e0
[ 190.231580] R10: 00007ffe0bbf9080 R11: 0000000000000246 R12: 00000000c06864a2
[ 190.231581] R13: 000000000000000d R14: 0000000000000000 R15: 0000556e6a0dc9e0
[ 190.231584] Code: 48 c7 c7 42 8d 10 c1 52 4c 8b 4c 24 58 48 c7 c2 e8 12 10 c1 44 8b 44 24 50 e8 ef e1 9b ff 41 83 7c 24 20 01 58 8b 44 24 08 74 02 <0f> 0b 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 c7 44 24 0c
[ 190.231629] ---[ end trace e459452396c96ef2 ]---
--snip
The freeze then occurs at 604. This time it does not seem to be a graphics-related error.
Created attachment 137848 [details]
dmesg2
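For readers unfamiliar with the drm.debug=6 setting mentioned in this comment: drm.debug is a bitmask of DRM log categories. The following is a minimal sketch of how such a mask can be decoded. The bit-to-name table is an assumption based on the kernel's DRM debug categories (CORE, DRIVER, KMS, PRIME, ATOMIC, VBL) and may differ between kernel versions; `decode_drm_debug` is a hypothetical helper, not part of any tool.

```python
# Assumed mapping of drm.debug bits to category names (may vary by kernel version).
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask):
    """Return the category names enabled by a drm.debug bitmask."""
    names = [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]
    # Report any bits that are not in the assumed table as a hex remainder.
    unknown = mask & ~sum(DRM_DEBUG_BITS)
    if unknown:
        names.append(hex(unknown))
    return names
```

Under this assumed table, the value 6 used above would select the DRIVER and KMS categories.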
Can you check whether this is fixed in the branch drm-next-4.17-wip of Alex's repo (https://cgit.freedesktop.org/~agd5f/linux/?h=amd-staging-drm-next)?
Created attachment 137867 [details]
journalctl-amd-drm-staging.txt
I hope I took the correct version, 4.16.0-rc1-085145ebf0e9
The screen freezes while showing the boot log; no further output appears and gdm keeps trying to launch.
I attached the complete output of journalctl -b -1, with only the repeated gdm retries removed.
It looks like you built the driver without DC or DCN support. Please make sure the following are set in your .config:
CONFIG_DRM_AMD_DC=y
CONFIG_DRM_AMD_DC_DCN1_0=y
Created attachment 137924 [details]
4.16.0-rc1-085145ebf0e9-color-temp
I forgot to enable them. I recompiled and tested. In 1 out of 10 reboots I got a blank screen, see attachment 4.16.0-rc1-085145ebf0e9-blank-screen
--snip
Mar 08 10:38:07 bb8 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=5, last emitted seq=8
--snip
In the other cases it ran stably, but the colors were crazy, see 4.16.0-rc1-085145ebf0e9-color-temp.
--snip
Mar 08 10:39:05 bb8 kernel: [drm:construct [amdgpu]] *ERROR* construct: Invalid Connector ObjectID from Adapter Service for connector index:2! type 0 expected 3
--snip
I don't know if this has anything to do with amdgpu; maybe it's related to this: https://bugs.archlinux.org/task/54207
--snip
gsd-color[574]: failed to set screen _ICC_PROFILE: Failed to open file
--snip
Created attachment 137925 [details]
4.16.0-rc1-085145ebf0e9-blank-screen
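The .config check requested in this exchange can be sketched as a small script. This is only an illustration: `missing_options` is a hypothetical helper, and it assumes the usual `NAME=y` line format of a kernel .config, treating anything other than `=y` (including `=m` or "is not set" comments) as not built in.

```python
# Options named in the comment above; both must be built in (=y).
REQUIRED = ("CONFIG_DRM_AMD_DC", "CONFIG_DRM_AMD_DC_DCN1_0")


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not set to 'y' in config_text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and value == "y":
            enabled.add(name)
    return [opt for opt in required if opt not in enabled]
```

To use it, read the .config of the kernel tree being built and pass its contents to `missing_options`; an empty list means both options are enabled.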
A freeze occurred while working with the laptop:
uname -r
4.16.0-rc1-085145ebf0e9
[20516.321517] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=448660, last emitted seq=448663
[20516.321526] [drm] No hardware hang detected. Did some blocks stall?
This was solved by using https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next
Created attachment 137826 [details]
dmesg
I'm using Linux version 4.15.7-1-ARCH on an HP ENVY x360. The error below shows up when booting. GNOME still starts and I can work with it. After some time the graphics freeze, see below.
[ 2.280593] Hardware name: HP HP ENVY x360 Convertible 15-bq1xx/83C6, BIOS F.13 11/10/2017
[ 2.280633] RIP: 0010:generic_reg_wait+0xee/0x120 [amdgpu]
[ 2.280634] RSP: 0018:ffffb9edc135f3c0 EFLAGS: 00010297
[ 2.280635] RAX: 0000000000000001 RBX: 0000000000000065 RCX: 0000000000000001
[ 2.280636] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000292
[ 2.280636] RBP: 0000000000000001 R08: 0000000432672164 R09: 000000000000005e
[ 2.280637] R10: 0000000000000002 R11: 0000000000000000 R12: ffff9bc4c4991e00
[ 2.280637] R13: 000000000000504d R14: 0000000000000100 R15: 0000000000000001
[ 2.280638] FS: 00007f932f8a28c0(0000) GS:ffff9bc4d6a40000(0000) knlGS:0000000000000000
[ 2.280639] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2.280640] CR2: 000055b45015fee8 CR3: 00000002139da000 CR4: 00000000003406e0
[ 2.280641] Call Trace:
[ 2.280688] tgn10_lock+0x9e/0xb0 [amdgpu]
[ 2.280731] program_all_pipe_in_tree+0x1387/0x1440 [amdgpu]
[ 2.280775] dcn10_apply_ctx_for_surface+0x4a0/0x510 [amdgpu]
[ 2.280815] dc_commit_state+0x281/0x4b0 [amdgpu]
[ 2.280860] amdgpu_dm_atomic_commit_tail+0x2ab/0x9a0 [amdgpu]
[ 2.280896] ? amdgpu_bo_pin_restricted+0x1ac/0x290 [amdgpu]
[ 2.280899] ? kmem_cache_alloc_trace+0x1a9/0x1c0
[ 2.280943] ? dm_plane_helper_prepare_fb+0x1d1/0x240 [amdgpu]
[ 2.280951] commit_tail+0x3d/0x70 [drm_kms_helper]
[ 2.280959] drm_atomic_helper_commit+0xfc/0x110 [drm_kms_helper]
[ 2.280967] restore_fbdev_mode_atomic+0x181/0x1f0 [drm_kms_helper]
[ 2.280976] drm_fb_helper_restore_fbdev_mode_unlocked.part.25+0x23/0x70 [drm_kms_helper]
[ 2.280990] drm_fb_helper_set_par+0x3e/0x70 [drm_kms_helper]
[ 2.280993] fbcon_init+0x482/0x660
[ 2.280997] visual_init+0xd5/0x130
[ 2.280998] do_bind_con_driver+0x1f4/0x400
[ 2.281001] do_take_over_console+0x7b/0x190
[ 2.281002] do_fbcon_takeover+0x58/0xb0
[ 2.281004] notifier_call_chain+0x47/0x70
[ 2.281006] blocking_notifier_call_chain+0x3e/0x60
[ 2.281009] ? down+0x12/0x50
[ 2.281011] register_framebuffer+0x248/0x350
[ 2.281016] __drm_fb_helper_initial_config_and_unlock+0x20e/0x420 [drm_kms_helper]
[ 2.281052] amdgpu_fbdev_init+0xc4/0xf0 [amdgpu]
[ 2.281087] amdgpu_device_init+0x1110/0x15e0 [amdgpu]
[ 2.281121] amdgpu_driver_load_kms+0x86/0x2d0 [amdgpu]
[ 2.281131] drm_dev_register+0x132/0x1c0 [drm]
[ 2.281164] amdgpu_pci_probe+0x10a/0x140 [amdgpu]
[ 2.281167] local_pci_probe+0x42/0xa0
[ 2.281169] ? pci_match_device+0xd9/0x100
[ 2.281170] pci_device_probe+0x146/0x1b0
[ 2.281173] driver_probe_device+0x315/0x480
[ 2.281175] __driver_attach+0xa0/0xe0
[ 2.281176] ? driver_probe_device+0x480/0x480
[ 2.281177] bus_for_each_dev+0x6b/0xb0
[ 2.281179] bus_add_driver+0x1c2/0x260
[ 2.281180] ? 0xffffffffc0e78000
[ 2.281182] driver_register+0x57/0xc0
[ 2.281183] ? 0xffffffffc0e78000
[ 2.281185] do_one_initcall+0x4e/0x190
[ 2.281187] ? kmem_cache_alloc_trace+0xa1/0x1c0
[ 2.281190] do_init_module+0x5b/0x205
[ 2.281192] load_module+0x26ad/0x2b30
[ 2.281195] ? vmap_page_range_noflush+0x27b/0x380
[ 2.281197] ? SyS_init_module+0x163/0x1a0
[ 2.281199] SyS_init_module+0x163/0x1a0
[ 2.281201] do_syscall_64+0x74/0x190
[ 2.281204] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 2.281206] RIP: 0033:0x7f932f1f46ca
[ 2.281206] RSP: 002b:00007ffc090cc3e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[ 2.281207] RAX: ffffffffffffffda RBX: 0000563d60680570 RCX: 00007f932f1f46ca
[ 2.281208] RDX: 00007f932eaabcb5 RSI: 000000000057dfb0 RDI: 0000563d61064370
[ 2.281209] RBP: 00007f932eaabcb5 R08: 0000000000000006 R09: 0000000800000001
[ 2.281209] R10: 0000000000000005 R11: 0000000000000246 R12: 0000563d61064370
[ 2.281210] R13: 0000563d6069b200 R14: 0000000000020000 R15: 00007ffc090ccee0
[ 2.281211] Code: 48 c7 c7 42 7d 11 c1 52 4c 8b 4c 24 58 48 c7 c2 e8 02 11 c1 44 8b 44 24 50 e8 ef c1 9c ff 41 83 7c 24 20 01 58 8b 44 24 08 74 02 <0f> 0b 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 c7 44 24 0c
[ 2.281229] ---[ end trace eb239795106c8f7c ]---
---cut, see full dmesg
During runtime, the graphics freeze. I can still log in using ssh:
[ 1112.083668] amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vm_id:4 pas_id:0)
[ 1112.083677] amdgpu 0000:04:00.0: at page 0x0000000103c18000 from 27
[ 1112.083679] amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00401031
[ 1112.083686] amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vm_id:4 pas_id:0)
[ 1112.083689] amdgpu 0000:04:00.0: at page 0x0000000103c16000 from 27
[ 1112.083691] amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 1112.083697] amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vm_id:4 pas_id:0)
[ 1112.083699] amdgpu 0000:04:00.0: at page 0x0000000103c01000 from 27
[ 1112.083701] amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
---cut, see full dmesg
[ 1475.551038] Call Trace:
[ 1475.551049] ? __schedule+0x24b/0x8c0
[ 1475.551054] schedule+0x32/0x90
[ 1475.551057] schedule_timeout+0x213/0x480
[ 1475.551129] ? generic_reg_get+0x21/0x30 [amdgpu]
[ 1475.551204] ? tgn10_get_crtc_scanoutpos+0x68/0xa0 [amdgpu]
[ 1475.551209] dma_fence_default_wait+0x1ea/0x280
[ 1475.551213] ? dma_fence_default_wait+0x280/0x280
[ 1475.551216] dma_fence_wait_timeout+0x38/0x110
[ 1475.551220] reservation_object_wait_timeout_rcu+0x187/0x360
[ 1475.551294] amdgpu_dm_do_flip+0x109/0x360 [amdgpu]
[ 1475.551372] amdgpu_dm_atomic_commit_tail+0x8a1/0x9a0 [amdgpu]
[ 1475.551385] commit_tail+0x3d/0x70 [drm_kms_helper]
[ 1475.551392] process_one_work+0x1ce/0x410
[ 1475.551395] worker_thread+0x2b/0x3d0
[ 1475.551398] ? process_one_work+0x410/0x410
[ 1475.551400] kthread+0x113/0x130
[ 1475.551403] ? kthread_create_on_node+0x70/0x70
[ 1475.551406] ret_from_fork+0x22/0x40
[ 1475.551471] INFO: task amdgpu_cs:0:734 blocked for more than 120 seconds.
[ 1475.551474] Tainted: G WC 4.15.7-1-ARCH #1
[ 1475.551476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1475.551478] amdgpu_cs:0 D 0 734 731 0x00000000
[ 1475.551481] Call Trace:
[ 1475.551485] ? __schedule+0x24b/0x8c0
[ 1475.551492] ? ttm_bo_mem_compat+0x23/0x60 [ttm]
[ 1475.551495] schedule+0x32/0x90
[ 1475.551498] schedule_timeout+0x213/0x480
[ 1475.551500] ? _raw_spin_lock+0x13/0x40
[ 1475.551502] ? _raw_spin_unlock+0x16/0x30
[ 1475.551551] ? amdgpu_vm_update_directories+0x475/0x600 [amdgpu]
[ 1475.551555] dma_fence_default_wait+0x1ea/0x280
[ 1475.551557] ? dma_fence_default_wait+0x280/0x280
[ 1475.551560] dma_fence_wait_timeout+0x38/0x110
[ 1475.551610] amdgpu_ctx_wait_prev_fence+0x49/0x80 [amdgpu]
[ 1475.551661] amdgpu_cs_ioctl+0x2a4/0x1b20 [amdgpu]
[ 1475.551667] ? dequeue_entity+0x389/0x990
[ 1475.551718] ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]
[ 1475.551735] drm_ioctl_kernel+0x5b/0xb0 [drm]
[ 1475.551751] drm_ioctl+0x2d5/0x370 [drm]
[ 1475.551799] ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]
[ 1475.551843] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[ 1475.551850] do_vfs_ioctl+0xa4/0x630
[ 1475.551856] ? SyS_futex+0x12d/0x180
[ 1475.551859] SyS_ioctl+0x74/0x80
[ 1475.551865] do_syscall_64+0x74/0x190
[ 1475.551868] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
---cut, see full dmesg
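Because this report mixes several distinct symptoms (the REG_WAIT timeout at startup, the VMC page faults, and the later hung tasks), it can help to separate them by timestamp when reading a full dmesg. The sketch below is one possible way to do that. The signature strings are taken from the log excerpts in this report; the `classify` helper and the category names are hypothetical, and lines without a leading `[ seconds ]` dmesg timestamp (such as journalctl output) are skipped.

```python
import re

# Message fragments copied from the excerpts in this report.
SIGNATURES = {
    "reg_wait_timeout": re.compile(r"REG_WAIT timeout"),
    "vmc_page_fault": re.compile(r"VMC page fault"),
    "ring_timeout": re.compile(r"ring \w+ timeout"),
    "hung_task": re.compile(r"blocked for more than \d+ seconds"),
}

# Leading dmesg timestamp, e.g. "[ 3.217577]" or "[20516.321517]".
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def classify(dmesg_text):
    """Map each signature name to the timestamps (seconds) where it appears."""
    hits = {name: [] for name in SIGNATURES}
    for line in dmesg_text.splitlines():
        match = TIMESTAMP.match(line)
        if not match:
            continue  # not a timestamped dmesg line
        stamp = float(match.group(1))
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits[name].append(stamp)
    return hits
```

Feeding the output of `dmesg` to `classify` gives a quick overview of which of the symptoms occurred and when, which makes it easier to decide whether they belong in separate bug reports.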