Created attachment 71476 [details]
errors in kern log with dma

I haven't had time to test the latest drm-next, but I will post this now as I may be AFK tomorrow.

After finding a place on mesa where etqw seems OK with drm-fixes, I am getting errors with drm-next.

On yesterday's head + the wb patch I got attachment 1. With the tree reset to before the dma changes which required the patch (drm/ttm: remove no_wait_reserve, v3) I got attachment 2, with the last lines repeating for 400k lines and the log also getting filled with junk.
Created attachment 71477 [details] errors in kern log before dma changes
Hmm I see using the word a t t a c h m e n t does strange things - 1 and 2 are not mine.
It seems that ttm_mem_evict_first is called way more often in a nested fashion than is healthy there. Could you resolve the ttm_mem_evict_first address where it ends up calling itself back to a specific line?
It looks nasty though, could you also dump mem_type for each time it calls ttm_mem_evict_first?
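One way to quantify how deeply a function is nested in an oops is to count how often its symbol appears in the call trace. The following is only a minimal sketch in Python: the helper name `count_frames` and the sample trace lines are hypothetical (the addresses and offsets are made up, not taken from the attachments), and the regular expression assumes the usual ` [<address>] symbol+0xoff/0xsize [module]` call-trace layout.

```python
import re
from collections import Counter

# Matches the symbol name in a kernel call-trace line such as
# " [<c0123456>] ? some_function+0x12/0x34 [module]".
# A leading "?" marks an entry the unwinder is not sure about; it is counted too.
TRACE_RE = re.compile(
    r"\]\s+(?:\?\s+)?([A-Za-z_][A-Za-z0-9_.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def count_frames(trace_lines):
    """Return a Counter mapping each symbol to its number of trace entries."""
    counts = Counter()
    for line in trace_lines:
        match = TRACE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample; addresses and offsets are invented for illustration.
sample = [
    " [<f8a01000>] ttm_mem_evict_first+0x100/0x200 [ttm]",
    " [<f8a02000>] ttm_bo_mem_space+0x80/0x300 [ttm]",
    " [<f8a01000>] ? ttm_mem_evict_first+0x100/0x200 [ttm]",
    " [<f8a03000>] radeon_bo_create+0x40/0x100 [radeon]",
]

if __name__ == "__main__":
    for symbol, hits in count_frames(sample).most_common():
        print(symbol, hits)
```

Feeding the real kernel log through `count_frames` would show whether ttm_mem_evict_first dominates the trace, which is a rough proxy for recursion depth rather than an exact measurement.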
Do any of your local patches touch radeon_evict_flags or radeon_ttm_placement_from_domain? I don't see why it would recurse so deeply otherwise. A full public git tree to reproduce the problem and seeing what patches are applied would also be nice.
Created attachment 71507 [details] [review] fix Should be fixed with this patch.
(In reply to comment #6)
> Created attachment 71507 [details] [review]
> fix
>
> Should be fixed with this patch.

Probably :-)

It seems that current drm-next head + fix has a different issue which makes etqw die quite quickly. drm-next reset onto drm/ttm: remove no_wait_reserve, v3 + the fix is now stable with etqw.

The head issue is -

EE r600_texture.c:697 r600_texture_transfer_map - failed to create temporary texture to hold untiled copy
Mesa: User error: GL_OUT_OF_MEMORY in glTexSubImage
radeon: The kernel rejected CS, see dmesg for more information.
double fault: 'Segmentation fault', bailing out

in dmesg -

[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
[drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM object (8192, 2, 4096, -12)
[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
etqw.x86[2478]: segfault at 0 ip af5142ad sp bff8b310 error 4 in gamex86.so[af23f000+948000]
Could you please run a git bisection to see where that error has been introduced, then?
Created attachment 71530 [details] gpu lock + oops on use async dma for ttm buffer moves on 6xx-SI
(In reply to comment #8)
> Could you please run a git bisection to see where that error has been
> introduced, then?

It seems that "drm/radeon: use async dma for ttm buffer moves on 6xx-SI" is the first non-working commit, but it fails differently from head. Log attached.
A patch referencing this bug report has been merged in Linux v3.8-rc1:

commit dd54fee7d440c4a9756cce2c24a50c15e4c17ccb
Author: Dave Airlie <airlied@redhat.com>
Date: Fri Dec 14 21:04:46 2012 +1000

    radeon: fix regression with eviction since evict caching changes
Make sure your kernel has this patch: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=0953e76e91f4b6206cef50bd680696dc6bf1ef99
(In reply to comment #12)
> Make sure your kernel has this patch:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=0953e76e91f4b6206cef50bd680696dc6bf1ef99

I tested drm-next head when that went in and got the same results.

I've just rebuilt it to be sure and with etqw I get a segfault after about 10 secs and in dmesg -

[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!

I've also managed to reproduce the GPU lock + oops I reported earlier - this time with nexuiz on current drm-next head.

I am not getting ttm errors any more so I guess this bug should be closed?
(In reply to comment #13)
> (In reply to comment #12)
> > Make sure your kernel has this patch:
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=0953e76e91f4b6206cef50bd680696dc6bf1ef99
>
> I tested drm-next head when that went in and got the same results.
>
> I've just rebuilt it to be sure and with etqw I get a segfault after about
> 10 secs and in dmesg -
>
> [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
>
> I've also managed to reproduce the GPU lock + oops I reported earlier - this
> time with nexuiz on current drm-next head.
>
> I am not getting ttm errors any more so I guess this bug should be closed?

FWIW I tried current drm-next + patch -

0003-drm-radeon-fix-dma-copy-on-r6xx-r7xx-evergen-ni-si-g.patch

And I still fail with etqw after about 10 secs, but do get more info.

radeon: The kernel rejected CS, see dmesg for more information.
radeon: The kernel rejected CS, see dmesg for more information.
radeon: The kernel rejected CS, see dmesg for more information.
radeon: Failed to allocate a buffer:
radeon: size : 7168 bytes
radeon: alignment : 256 bytes
radeon: domains : 2
EE r600_texture.c:697 r600_texture_transfer_map - failed to create temporary texture to hold untiled copy
Mesa: User error: GL_OUT_OF_MEMORY in glTexSubImage
radeon: The kernel rejected CS, see dmesg for more information.
radeon: The kernel rejected CS, see dmesg for more information.
radeon: The kernel rejected CS, see dmesg for more information.
radeon: The kernel rejected CS, see dmesg for more information.
radeon: The kernel rejected CS, see dmesg for more information.
radeon: The kernel rejected CS, see dmesg for more information.
radeon: The kernel rejected CS, see dmesg for more information.
double fault: 'Segmentation fault', bailing out
shutdown terminal support
/home/andy/bin/etqw: line 1: 2472 Segmentation fault /usr/local/games/etqw/etqw

dmesg -

[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
[drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM object (8192, 2, 4096, -12)
[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
etqw.x86[2472]: segfault at 0 ip af5292ad sp bfbe3250 error 4 in gamex86.so[af254000+948000]
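The -12 in the dmesg lines above is a negated errno value; on Linux, errno 12 is ENOMEM, which matches the GL_OUT_OF_MEMORY reported by Mesa. The Python sketch below decodes the two message shapes seen in this report. The helper names are hypothetical, and two things are assumptions rather than facts confirmed in this thread: that the four numbers in the GEM message are (size, domain, alignment, error) in that order, and that the domain bits are CPU=0x1, GTT=0x2, VRAM=0x4.

```python
import errno
import re

GEM_FAIL_RE = re.compile(
    r"Failed to allocate GEM object \((\d+), (\d+), (\d+), (-?\d+)\)"
)
RELOC_FAIL_RE = re.compile(r"Failed to parse relocation (-?\d+)!")

# Assumption: radeon GEM domain bit values; verify against the kernel headers.
DOMAIN_BITS = {0x1: "CPU", 0x2: "GTT", 0x4: "VRAM"}


def errno_name(code):
    """Map a (possibly negated) errno value to its symbolic name."""
    num = abs(code)
    return errno.errorcode.get(num, f"errno {num}")


def decode(line):
    """Decode one dmesg line; return None if it is not a known message."""
    match = GEM_FAIL_RE.search(line)
    if match:
        # Assumed field order: size, domain, alignment, error.
        size, domain, alignment, err = (int(g) for g in match.groups())
        domains = [name for bit, name in DOMAIN_BITS.items() if domain & bit]
        return {
            "kind": "gem_create",
            "size": size,
            "domains": domains,
            "alignment": alignment,
            "error": errno_name(err),
        }
    match = RELOC_FAIL_RE.search(line)
    if match:
        return {"kind": "relocation", "error": errno_name(int(match.group(1)))}
    return None


if __name__ == "__main__":
    print(decode("[drm:radeon_gem_object_create] *ERROR* "
                 "Failed to allocate GEM object (8192, 2, 4096, -12)"))
    print(decode("[drm:radeon_cs_ioctl] *ERROR* "
                 "Failed to parse relocation -12!"))
```

Under those assumptions, the failing allocation is an 8192-byte buffer in the GTT domain, which agrees with the "domains : 2" line printed by userspace.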
Current drm-fixes is working for me now. The remaining etqw issue was fixed by - Revert "drm/radeon: do not move bo to different placement at each cs"