Ever since I upgraded to a 4.8-based kernel, I keep seeing error messages like the ones at the end of the log below. They were not present in 4.7. FYI, the same messages are present in drm-next-4.9-wip.

[ 15.091381] [drm] amdgpu kernel modesetting enabled.
[ 15.091394] vga_switcheroo: detected switching method \_SB_.PCI0.GFX0.ATPX handle
[ 15.091496] ATPX version 1, functions 0x00000003
[ 15.091567] ATPX Hybrid Graphics
[ 15.146747] CRAT table not found
[ 15.146748] Finished initializing topology ret=0
[ 15.146762] kfd kfd: Initialized module
[ 15.146945] amdgpu 0000:01:00.0: enabling device (0006 -> 0007)
[ 15.147246] [drm] initializing kernel modesetting (TOPAZ 0x1002:0x6900 0x103C:0x811C 0x83).
[ 15.147256] [drm] register mmio base: 0xE2000000
[ 15.147257] [drm] register mmio size: 262144
[ 15.147262] [drm] doorbell mmio base: 0xE0000000
[ 15.147263] [drm] doorbell mmio size: 2097152
[ 15.147270] [drm] probing gen 2 caps for device 8086:9d10 = 1724843/e
[ 15.147272] [drm] probing mlw for device 8086:9d10 = 1724843
[ 15.147276] vga_switcheroo: enabled
[ 15.150833] ATOM BIOS: HP/Quanta
[ 15.150847] [drm] GPU not posted. posting now...
[ 15.154089] [drm] Changing default dispclk from 0Mhz to 600Mhz
[ 15.217613] amdgpu 0000:01:00.0: VRAM: 2048M 0x0000000000000000 - 0x000000007FFFFFFF (2048M used)
[ 15.217615] amdgpu 0000:01:00.0: GTT: 2048M 0x0000000080000000 - 0x00000000FFFFFFFF
[ 15.217616] [drm] Detected VRAM RAM=2048M, BAR=256M
[ 15.217617] [drm] RAM width 64bits DDR3
[ 15.217688] [TTM] Zone kernel: Available graphics memory: 4027936 kiB
[ 15.217689] [TTM] Zone dma32: Available graphics memory: 2097152 kiB
[ 15.217690] [TTM] Initializing pool allocator
[ 15.217709] [TTM] Initializing DMA pool allocator
[ 15.217724] [drm] amdgpu: 2048M of VRAM memory ready
[ 15.217724] [drm] amdgpu: 2048M of GTT memory ready.
[ 15.217735] [drm] GART: num cpu pages 524288, num gpu pages 524288
[ 15.218660] [drm] PCIE GART of 2048M enabled (table at 0x0000000000040000).
[ 15.218689] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 15.218689] [drm] Driver supports precise vblank timestamp query.
[ 15.218720] amdgpu 0000:01:00.0: amdgpu: using MSI.
[ 15.218744] [drm] amdgpu: irq initialized.
[ 15.492088] iwlwifi 0000:03:00.0: L1 Enabled - LTR Enabled
[ 15.510118] amdgpu 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000010, cpu addr 0xffff880231649010
[ 15.510142] amdgpu 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000080000020, cpu addr 0xffff880231649020
[ 15.510159] amdgpu 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000080000030, cpu addr 0xffff880231649030
[ 15.510175] amdgpu 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000040, cpu addr 0xffff880231649040
[ 15.510191] amdgpu 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000080000050, cpu addr 0xffff880231649050
[ 15.510220] amdgpu 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000080000060, cpu addr 0xffff880231649060
[ 15.510238] amdgpu 0000:01:00.0: fence driver on ring 6 use gpu addr 0x0000000080000070, cpu addr 0xffff880231649070
[ 15.510253] amdgpu 0000:01:00.0: fence driver on ring 7 use gpu addr 0x0000000080000080, cpu addr 0xffff880231649080
[ 15.510272] amdgpu 0000:01:00.0: fence driver on ring 8 use gpu addr 0x0000000080000090, cpu addr 0xffff880231649090
[ 15.557848] ieee80211 phy0: Selected rate control algorithm 'iwl-mvm-rs'
[ 15.753033] amdgpu 0000:01:00.0: fence driver on ring 9 use gpu addr 0x00000000800000a0, cpu addr 0xffff8802316490a0
[ 15.753116] amdgpu 0000:01:00.0: fence driver on ring 10 use gpu addr 0x00000000800000b0, cpu addr 0xffff8802316490b0
[ 15.996625] [drm] ring test on 0 succeeded in 15 usecs
[ 15.996840] [drm] ring test on 1 succeeded in 19 usecs
[ 15.996872] [drm] ring test on 2 succeeded in 15 usecs
[ 15.996883] [drm] ring test on 3 succeeded in 3 usecs
[ 15.996889] [drm] ring test on 4 succeeded in 2 usecs
[ 15.996913] [drm] ring test on 5 succeeded in 2 usecs
[ 15.996919] [drm] ring test on 6 succeeded in 2 usecs
[ 15.996925] [drm] ring test on 7 succeeded in 2 usecs
[ 15.996946] [drm] ring test on 8 succeeded in 2 usecs
[ 15.996989] [drm] ring test on 9 succeeded in 6 usecs
[ 15.996995] [drm] ring test on 10 succeeded in 4 usecs
[ 15.997193] [drm] ib test on ring 0 succeeded
[ 15.997326] [drm] ib test on ring 1 succeeded
[ 15.997374] [drm] ib test on ring 2 succeeded
[ 15.997405] [drm] ib test on ring 3 succeeded
[ 15.997435] [drm] ib test on ring 4 succeeded
[ 15.997466] [drm] ib test on ring 5 succeeded
[ 15.997495] [drm] ib test on ring 6 succeeded
[ 15.997526] [drm] ib test on ring 7 succeeded
[ 15.997556] [drm] ib test on ring 8 succeeded
[ 15.997612] [drm:sdma_v2_4_ring_test_ib [amdgpu]] *ERROR* amdgpu: fence wait failed (1000).
[ 15.997654] [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* amdgpu: failed testing IB on ring 9 (1000).
[ 15.997713] [drm:sdma_v2_4_ring_test_ib [amdgpu]] *ERROR* amdgpu: fence wait failed (1000).
[ 15.997749] [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* amdgpu: failed testing IB on ring 10 (1000).
Can you bisect?
bbec97aae660adafa5208c5defc54e3cbbe6b129 is the first bad commit
commit bbec97aae660adafa5208c5defc54e3cbbe6b129
Author: Christian König <christian.koenig@amd.com>
Date: Tue Jul 5 21:07:17 2016 +0200

    drm/amdgpu: add a fence timeout for the IB tests v2

    10ms should be enough for now.

    v2: fix some typos in CIK code

    Signed-off-by: Christian König <christian.koenig@amd.com>
    Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
    Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yeah that is a known issue. David already raised the timeout to 1s because of this. On the other hand I would really like to know why 10ms isn't enough for VCE to come up.
(In reply to Christian König from comment #3)
> Yeah that is a known issue. David already raised the timeout to 1s because
> of this.
>
> On the other hand I would really like to know why 10ms isn't enough for VCE
> to come up.

Could you please point out the commit which raises the timeout? Thanks.
That's commit e0d079679705b02407cccea1f0e48bff39befce5, "increase timeout of IB test". It should be available in Alex's repository.
(In reply to Christian König from comment #5)
> That's commit e0d079679705b02407cccea1f0e48bff39befce5 increase timeout of
> IB test.
>
> Should be available in Alex repository.

patching file drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
Reversed (or previously applied) patch detected! Skipping patch.
1 out of 1 hunk ignored -- saving rejects to file drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c.rej

That patch is already applied on top of 4.8-rc2. However, I still see the problem.
Created attachment 125847
Possible fix

Indeed it isn't the timeout value. Instead there is a stupid typo in the return check. Please try the attached patch.
(In reply to Christian König from comment #7)
> Created attachment 125847
> Possible fix
>
> Indeed it isn't the timeout value. Instead there is a stupid typo in the
> return check.
>
> Please try the attached patch.

The attached patch makes the error message go away.
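For readers following along, here is a minimal user-space sketch of the kind of return-check mistake described in comment #7. It is not the attached patch, and it does not reproduce the driver code. It assumes a timed wait that returns a positive remaining timeout on success, 0 on timeout, and a negative error code on failure; under that assumption, the "(1000)" in the log is consistent with a successful wait being reported as a failure. The names `fake_wait_timeout`, `check_buggy` and `check_fixed` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for a timed wait (assumed convention):
 *   > 0  : signaled; the value is the remaining timeout
 *   == 0 : timed out
 *   < 0  : error code
 */
long fake_wait_timeout(int signaled, long timeout)
{
    assert(timeout > 0);            /* a zero timeout would be ambiguous here */
    return signaled ? timeout : 0;
}

/* Buggy check: every non-zero value, including success, is an error. */
int check_buggy(long r)
{
    if (r == 0)
        return -ETIMEDOUT;          /* timed out */
    else if (r)                     /* wrong: also true when r > 0 */
        return -EIO;
    return 0;                       /* unreachable in practice */
}

/* Fixed check: only negative values are errors. */
int check_fixed(long r)
{
    if (r == 0)
        return -ETIMEDOUT;          /* timed out */
    else if (r < 0)
        return (int)r;              /* propagate the error code */
    return 0;                       /* r > 0: the wait succeeded */
}
```

With a successful wait that has 1000 units of timeout left, `check_buggy(fake_wait_timeout(1, 1000))` reports a failure while `check_fixed` returns 0. Both versions agree on the timeout case.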
Good, the patch should show up in the next rc. Thanks for testing, Christian.
Same problem on my laptop. I've tested Linux 4.8 rc1, rc2, and now also rc3.

@Christian König: "Good, the patch should show up in the next rc." Do you mean rc4?
The patch will go upstream in the -fixes pull this week.