Summary: | Poor xterm/exa perf with ColorTiling on. | ||
---|---|---|---|
Product: | xorg | Reporter: | Andy Furniss <adf.lists> |
Component: | Server/Acceleration/EXA | Assignee: | Xorg Project Team <xorg-team> |
Status: | RESOLVED WONTFIX | QA Contact: | Xorg Project Team <xorg-team> |
Severity: | normal | ||
Priority: | medium | CC: | alexander, freedesktop, hramrach, iusty, maxi, petr.pisar, ranma+freedesktop |
Version: | git | ||
Hardware: | x86 (IA32) | ||
OS: | Linux (All) | ||
Description
Andy Furniss
2011-02-19 12:28:08 UTC
xterm is doing something which requires software fallbacks which are slower with tiling enabled since the driver has to de-tile the pixmap so the CPU can access it.

(In reply to comment #1)
> xterm is doing something which requires software fallbacks which are slower
> with tiling enabled since the driver has to de-tile the pixmap so the CPU can
> access it.

OK - I retested, resetting the ddx properly :-) using git log to pick rather than cgit, and the commit this starts with is:

commit fccdca8db34010f566bd068c74cdef0f4a8cb7f5
Author: Alex Deucher <alexdeucher@gmail.com>
Date: Wed Nov 17 17:37:25 2010 -0500

radeon/kms: allow tiled front buffer on 6xx/7xx

Use UTS/DFS to tile/untile as appropriate for sw access.
Also enables pageflipping with tiling enabled.

If I reset to the commit before this I get the same perf with or without tiling, so whatever the s/w fallback did then was a lot faster.

(In reply to comment #2)
> If I reset to the commit before this I get the same perf with or without
> tiling, so whatever the s/w fallback did then was a lot faster.

Of course maybe it didn't tile for this case before the commit anyway.

(In reply to comment #2)
> commit fccdca8db34010f566bd068c74cdef0f4a8cb7f5
> Author: Alex Deucher <alexdeucher@gmail.com>
> Date: Wed Nov 17 17:37:25 2010 -0500
>
> radeon/kms: allow tiled front buffer on 6xx/7xx
>
> Use UTS/DFS to tile/untile as appropriate for sw access.
> Also enables pageflipping with tiling enabled.

That commit enabled tiling and pageflipping to work together; both the front and back buffers are tiled so pageflipping will work. Prior to that, only the back buffer was tiled. However, since the front buffer is tiled, software cannot access the front buffer directly; the hw needs to convert it to a linear buffer first. Unfortunately, there's not really a good fast way to support software fallbacks and tiling. Your best bet is to use a terminal or xterm configuration that uses a hw path (e.g., don't use bitmap fonts).
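To illustrate why a tiled buffer has to be converted before the CPU can draw into it, here is a minimal Python sketch. It assumes a hypothetical 8x8-pixel tile layout and made-up helper names (`linear_offset`, `tiled_offset`, `tile`, `detile`); the real R6xx/R7xx tiling modes are more involved, and the driver does the conversion on the GPU (UTS/DFS), not in a CPU loop like this. The point is only that the same (x, y) pixel lives at a different offset in each layout, so software rendering code written for linear memory needs a full detile/retile round trip around every fallback.

```python
TILE = 8  # hypothetical tile edge in pixels; real hardware layouts differ


def linear_offset(x, y, width):
    """Offset of pixel (x, y) in a plain row-major (linear) buffer."""
    return y * width + x


def tiled_offset(x, y, width):
    """Offset of pixel (x, y) when the buffer is stored as TILE x TILE blocks."""
    tiles_per_row = width // TILE
    tile_index = (y // TILE) * tiles_per_row + (x // TILE)
    return tile_index * TILE * TILE + (y % TILE) * TILE + (x % TILE)


def tile(linear, width, height):
    """Copy a linear buffer into the tiled layout."""
    tiled = [0] * (width * height)
    for y in range(height):
        for x in range(width):
            tiled[tiled_offset(x, y, width)] = linear[linear_offset(x, y, width)]
    return tiled


def detile(tiled, width, height):
    """Copy a tiled buffer back into the linear layout the CPU code expects."""
    linear = [0] * (width * height)
    for y in range(height):
        for x in range(width):
            linear[linear_offset(x, y, width)] = tiled[tiled_offset(x, y, width)]
    return linear


if __name__ == "__main__":
    width, height = 16, 8
    image = list(range(width * height))
    stored = tile(image, width, height)
    # The pixel right of the first tile sits far away in tiled memory.
    print(linear_offset(8, 0, width), tiled_offset(8, 0, width))
    # A software fallback only sees correct pixels after detiling.
    print(detile(stored, width, height) == image)
```

Every software fallback pays for touching the whole pixmap twice in this model, which is consistent with the slowdown described above for many small glyph draws.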
See bug 30679.

Order of magnitude slowdown :/

ColorTiling enabled
Moving a window in fluxbox: http://youtu.be/e__5dKZobxA
xterm redraw/scroll speed: http://youtu.be/zx7K5213R7o

ColorTiling disabled
Moving a window in fluxbox: http://youtu.be/FSiQ34OHDbA
xterm redraw/scroll speed: http://youtu.be/li9CTseCUqUh

At least now I know how to work around the issue :)

*** Bug 48310 has been marked as a duplicate of this bug. ***

After some research I added this xterm config on my system to get fast xterm performance again:

    $ cat .Xresources
    xterm*faceName: Bitstream Vera Sans Mono
    xterm*faceSize: 10
    $ xrdb -merge .Xresources

This changes xterm to use a TrueType font and rendering is now way faster for me. On the downside, it seems like the TrueType fonts don't look as nice as the bitmap fonts for small font sizes.

(In reply to comment #7)
> On the downside, it seems like the TrueType fonts don't look as nice as the
> bitmap fonts for small font sizes.

You should be able to use the same bitmap fonts with the right fontconfig setup.

(In reply to comment #8)
> (In reply to comment #7)
> > On the downside, it seems like the TrueType fonts don't look as nice as the
> > bitmap fonts for small font sizes.
>
> You should be able to use the same bitmap fonts with the right fontconfig
> setup.

I just tried that. Yes you can. But it's just as slow as the non-fontconfig bitmap font rendering. For me anyway. Caveat: I'm using the pcf fonts from the xfonts-efont-unicode Debian package for proper Unicode and Japanese character support (listed as 'Biwidth' in the fc-list output). :/ With a pure TrueType font I do indeed get no speed difference between ColorTiled and non-ColorTiled mode though. I can haz fast bitmap fonts in my terminal?

(In reply to comment #9)
> But it's just as slow as the non-fontconfig bitmap font rendering.

When using Xft, the X server only receives pixel data for the font glyphs, so it couldn't discriminate between different font types even if it wanted to.
So I suspect either your terminal still ends up not using Xft for the font you choose, or it/Xft/... is doing something else differently between different font types that makes the difference.

FWIW in urxvt the performance with TrueType fonts is also poor. Randomly, text is sometimes drawn fast and at other times slow. Letting lots of text scroll by always makes text rendering slow eventually. Using the -letsp option in urxvt, which changes letter spacing, makes font rendering slower overall. The fast case almost never happens then.

Hi everyone. I need this bug/feature so much that I'm willing to pay 100.00 bucks for it. This offer is registered at FreedomSponsors (http://www.freedomsponsors.org/core/issue/262/poor-xtermexa-perf-with-colortiling-on). Once you solve it (according to the acceptance criteria described there), just create a FreedomSponsors account and mark it as resolved (oh, you'll need a Paypal account too). I'll then check it out and will gladly pay up! If anyone else would like to throw in a few bucks to elevate the priority on this issue, you should check out FreedomSponsors!

FWIW real TrueType fonts are just as slow as bitmap fonts with xterm. And a performance issue elsewhere has recently forced me to re-enable ColorTiling. :/

Trying oprofile, I get the following with ColorTiling on and xterm continuously updating. I guess almost all the time spent in r100_mm_rreg is busy-waiting on a ring flush or something.
CPU: AMD64 family15h, speed 3.4e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (CPU Clocks not Halted) with a unit mask of 0x00 (No unit mask) count 100000
samples % linenr info image name app name symbol name
1444865 49.9498 r100.c:4059 vmlinux vmlinux r100_mm_rreg
293257 10.1381 processor_idle.c:777 vmlinux vmlinux acpi_idle_enter_simple
62363 2.1559 clear_page_64.S:11 vmlinux vmlinux clear_page_c
59976 2.0734 memcpy-ssse3.S:59 libc-2.17.so libc-2.17.so __memcpy_ssse3
37058 1.2811 fbbits.h:502 libfb.so libfb.so fbGlyph32
27747 0.9592 (no location information) xterm xterm /usr/bin/xterm
25504 0.8817 (no location information) opreport opreport /usr/bin/opreport
22180 0.7668 libahci.c:1845 vmlinux vmlinux ahci_interrupt
21730 0.7512 evergreen_cs.c:1092 vmlinux vmlinux evergreen_cs_check_reg
17961 0.6209 processor_idle.c:122 vmlinux vmlinux acpi_safe_halt
17897 0.6187 page_alloc.c:1846 vmlinux vmlinux get_page_from_freelist
17266 0.5969 (no location information) libdrm_radeon.so.1.0.1 libdrm_radeon.so.1.0.1 /usr/lib/x86_64-linux-gnu/libdrm_radeon.so.1.0.1
17015 0.5882 evergreen_cs.c:1791 vmlinux vmlinux evergreen_packet3_check
14750 0.5099 radeon_cs.c:611 vmlinux vmlinux radeon_get_ib_value
14100 0.4874 ahci.h:381 vmlinux vmlinux ahci_port_intr
12130 0.4193 slab.c:3919 vmlinux vmlinux kfree
10780 0.3727 amd_iommu.c:2668 vmlinux vmlinux __map_single
9674 0.3344 radeon_cs.c:646 vmlinux vmlinux radeon_cs_packet_parse
9538 0.3297 slab.c:3757 vmlinux vmlinux kmem_cache_alloc_trace
9107 0.3148 swap.c:162 vmlinux vmlinux put_page
8939 0.3090 (no location information) oprofiled oprofiled /usr/bin/oprofiled
8766 0.3030 (no location information) libglib-2.0.so.0.3700.0 libglib-2.0.so.0.3700.0 /lib/x86_64-linux-gnu/libglib-2.0.so.0.3700.0
8451 0.2922 tsc.c:762 vmlinux vmlinux read_tsc
8390 0.2900 ttm_page_alloc_dma.c:864 vmlinux vmlinux ttm_dma_populate
8133 0.2812 slab.c:3864 vmlinux vmlinux __kmalloc
8057 0.2785 copy_user_64.S:183 vmlinux vmlinux copy_user_generic_string
7473 0.2583 amd_iommu.c:2958 vmlinux vmlinux alloc_coherent
6931 0.2396 fbglyph.c:313 libfb.so libfb.so fbImageGlyphBlt
6691 0.2313 evergreen_cs.c:2562 vmlinux vmlinux evergreen_cs_parse
6247 0.2160 malloc.c:3241 libc-2.17.so libc-2.17.so _int_malloc
6201 0.2144 pixman-region.c:760 libpixman-1.so.0.28.2 libpixman-1.so.0.28.2 pixman_op
6074 0.2100 malloc.c:3732 libc-2.17.so libc-2.17.so _int_free
5935 0.2052 drm_drv.c:375 vmlinux vmlinux drm_ioctl
5904 0.2041 ttm_memory.c:513 vmlinux vmlinux ttm_mem_global_alloc_zone.isra.4
5732 0.1982 ring_buffer.c:3295 vmlinux vmlinux rb_get_reader_page
5727 0.1980 slab.c:3744 vmlinux vmlinux kmem_cache_alloc
5598 0.1935 bitmap.c:278 vmlinux vmlinux bitmap_set
5546 0.1917 wcwidth.h:36 libc-2.17.so libc-2.17.so wcwidth
5340 0.1846 pixman-sse2.c:3331 libpixman-1.so.0.28.2 libpixman-1.so.0.28.2 sse2_fill
5293 0.1830 page_alloc.c:2576 vmlinux vmlinux __alloc_pages_nodemask
5200 0.1798 page_alloc.c:1316 vmlinux vmlinux free_hot_cold_page
5138 0.1776 page_alloc.c:635 vmlinux vmlinux free_pcppages_bulk
4953 0.1712 libahci.c:544 vmlinux vmlinux ahci_scr_read
4863 0.1681 ring_buffer.c:3731 vmlinux vmlinux ring_buffer_consume
4576 0.1582 ttm_page_alloc_dma.c:936 vmlinux vmlinux ttm_dma_unpopulate
4499 0.1555 (no location information) libfontconfig.so.1.7.0 libfontconfig.so.1.7.0 /usr/lib/x86_64-linux-gnu/libfontconfig.so.1.7.0
4467 0.1544 raid1.c:2383 vmlinux vmlinux sync_request
4372 0.1511 core.c:2876 vmlinux vmlinux __schedule
4349 0.1503 cayman_accel.c:49 radeon_drv.so radeon_drv.so cayman_set_default_state
4264 0.1474 core.c:3151 vmlinux vmlinux __wake_up
4175 0.1443 page_alloc.c:2684 vmlinux vmlinux __free_pages
4002 0.1384 amd_iommu.c:1018 vmlinux vmlinux iommu_queue_command_sync
3985 0.1378 page_alloc.c:5648 vmlinux vmlinux get_pageblock_flags_group
3983 0.1377 fair.c:4090 vmlinux vmlinux update_blocked_averages
3982 0.1377 amd_iommu.c:1737 vmlinux vmlinux dma_ops_area_alloc
3904 0.1350 find_next_bit.c:25 vmlinux vmlinux find_next_bit
3896 0.1347 mutex.c:112 vmlinux vmlinux mutex_unlock
3738 0.1292 entry_64.S:614 vmlinux vmlinux system_call
3689 0.1275 vfprintf.c:235 libc-2.17.so libc-2.17.so vfprintf
3652 0.1263 entry_64.S:622 vmlinux vmlinux system_call_after_swapgs
3606 0.1247 page_alloc.c:1098 vmlinux vmlinux __rmqueue
3590 0.1241 file.c:768 vmlinux vmlinux fget_light
3577 0.1237 ttm_bo.c:174 vmlinux vmlinux ttm_bo_add_to_lru
3550 0.1227 core.c:3027 vmlinux vmlinux mutex_spin_on_owner
3531 0.1221 (no location information) libz.so.1.2.8 libz.so.1.2.8 /lib/x86_64-linux-gnu/libz.so.1.2.8
3522 0.1218 blk-merge.c:117 vmlinux vmlinux __blk_segment_map_sg
3515 0.1215 (no location information) find find /usr/bin/find
3352 0.1159 malloc.c:2845 libc-2.17.so libc-2.17.so malloc
3318 0.1147 memory.c:1322 vmlinux vmlinux unmap_single_vma
3074 0.1063 (no location information) libpng12.so.0.49.0 libpng12.so.0.49.0 /lib/x86_64-linux-gnu/libpng12.so.0.49.0
3073 0.1062 mutex.c:85 vmlinux vmlinux mutex_lock
3070 0.1061 fair.c:4822 vmlinux vmlinux find_busiest_group
3052 0.1055 (no location information) libcairo.so.2.11200.14 libcairo.so.2.11200.14 /usr/lib/x86_64-linux-gnu/libcairo.so.2.11200.14
3049 0.1054 menu.c:312 vmlinux vmlinux menu_select
3038 0.1050 entry_64.S:1472 vmlinux vmlinux page_fault
2968 0.1026 (no location information) libperl.so.5.14.2 libperl.so.5.14.2 /usr/lib/libperl.so.5.14.2
2943 0.1017 fair.c:4123 vmlinux vmlinux tg_load_down
2926 0.1012 bio.c:499 vmlinux vmlinux __bio_add_page.part.18
2916 0.1008 iommu-helper.c:23 vmlinux vmlinux iommu_area_alloc
2899 0.1002 tlb.c:186 vmlinux vmlinux flush_tlb_mm_range
2874 0.0994 ring_buffer.c:3423 vmlinux vmlinux rb_advance_reader
2852 0.0986 ttm_memory.c:428 vmlinux vmlinux ttm_check_swapping
2794 0.0966 drm_gem.c:393 vmlinux vmlinux drm_gem_object_lookup
2790 0.0965 ttm_memory.c:452 vmlinux vmlinux ttm_mem_global_free_zone

And with ColorTiling disabled and fast xterm redraw:

CPU: AMD64 family15h, speed 3.4e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (CPU Clocks not Halted) with a unit mask of 0x00 (No unit mask) count 100000
samples % linenr info image name app name symbol name
3047367 36.3998 IfEvent.c:47 libX11.so.6.3.0 libX11.so.6.3.0 XIfEvent
2201532 26.2966 xcb_io.c:374 libX11.so.6.3.0 libX11.so.6.3.0 _XReadEvents
640128 7.6461 pixman-sse2.c:3331 libpixman-1.so.0.28.2 libpixman-1.so.0.28.2 sse2_fill
443063 5.2922 fbbits.h:502 libfb.so libfb.so fbGlyph32
313618 3.7461 (no location information) xterm xterm /usr/bin/xterm
121204 1.4477 processor_idle.c:777 vmlinux vmlinux acpi_idle_enter_simple
120839 1.4434 fbglyph.c:313 libfb.so libfb.so fbImageGlyphBlt
101344 1.2105 r100.c:4059 vmlinux vmlinux r100_mm_rreg
66918 0.7993 wcwidth.h:36 libc-2.17.so libc-2.17.so wcwidth
44306 0.5292 (no location information) opreport opreport /usr/bin/opreport
43471 0.5192 (no location information) libapt-pkg.so.4.12.0 libapt-pkg.so.4.12.0 /usr/lib/x86_64-linux-gnu/libapt-pkg.so.4.12.0
34944 0.4174 pixman-region.c:2118 libpixman-1.so.0.28.2 libpixman-1.so.0.28.2 pixman_region_contains_rectangle
32001 0.3822 (no location information) find find /usr/bin/find
24110 0.2880 (no location information) aptitude-curses aptitude-curses /usr/bin/aptitude-curses
23485 0.2805 fbglyph.c:41 libfb.so libfb.so fbGlyphIn
23019 0.2750 memset.S:43 libc-2.17.so libc-2.17.so __memset_sse2
21346 0.2550 (no location information) libXfont.so.1.4.1 libXfont.so.1.4.1 /usr/lib/libXfont.so.1.4.1
20557 0.2455 (no location information) oprofiled oprofiled /usr/bin/oprofiled
19707 0.2354 malloc.c:3732 libc-2.17.so libc-2.17.so _int_free
18680 0.2231 n_tty.c:2015 vmlinux vmlinux n_tty_write
18160 0.2169 copy_user_64.S:183 vmlinux vmlinux copy_user_generic_string
17814 0.2128 entry_64.S:622 vmlinux vmlinux system_call_after_swapgs
16754 0.2001 entry_64.S:614 vmlinux vmlinux system_call
16680 0.1992 (no location information) libglib-2.0.so.0.3700.0 libglib-2.0.so.0.3700.0 /lib/x86_64-linux-gnu/libglib-2.0.so.0.3700.0
16571 0.1979 (no location information) dpkg dpkg /usr/bin/dpkg
15907 0.1900 vfprintf.c:235 libc-2.17.so libc-2.17.so vfprintf
15877 0.1896 memcpy-ssse3.S:59 libc-2.17.so libc-2.17.so __memmove_ssse3
14403 0.1720 malloc.c:3241 libc-2.17.so libc-2.17.so _int_malloc
14388 0.1719 ring_buffer.c:3295 vmlinux vmlinux rb_get_reader_page
12060 0.1441 fbimage.c:145 libfb.so libfb.so fbPutXYImage
11892 0.1420 ring_buffer.c:3731 vmlinux vmlinux ring_buffer_consume
11378 0.1359 core.c:3027 vmlinux vmlinux mutex_spin_on_owner
10921 0.1304 af_unix.c:1898 vmlinux vmlinux unix_stream_recvmsg
10908 0.1303 (no location information) libperl.so.5.14.2 libperl.so.5.14.2 /usr/lib/libperl.so.5.14.2
10738 0.1283 malloc.c:2845 libc-2.17.so libc-2.17.so malloc
10521 0.1257 processor_idle.c:122 vmlinux vmlinux acpi_safe_halt
10408 0.1243 drm_drv.c:375 vmlinux vmlinux drm_ioctl
10371 0.1239 file.c:768 vmlinux vmlinux fget_light
10274 0.1227 strcmp-sse42.S:117 libc-2.17.so libc-2.17.so __strncasecmp_l_avx
9752 0.1165 strlen-sse4.S:26 libc-2.17.so libc-2.17.so __strlen_sse42
9681 0.1156 (no location information) libstdc++.so.6.0.18 libstdc++.so.6.0.18 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.18
9576 0.1144 entry_64.S:655 vmlinux vmlinux sysret_check
9289 0.1110 (no location information) chrome chrome /opt/google/chrome/chrome
9160 0.1094 (no location information) [vdso] (tgid:9490 range:0x7ffffa200000-0x7ffffa201000) Xorg [vdso] (tgid:9490 range:0x7ffffa200000-0x7ffffa201000)
8197 0.0979 (no location information) libz.so.1.2.8 libz.so.1.2.8 /lib/x86_64-linux-gnu/libz.so.1.2.8
7767 0.0928 ring_buffer.c:3423 vmlinux vmlinux rb_advance_reader
7636 0.0912 (no location information) libfontconfig.so.1.7.0 libfontconfig.so.1.7.0 /usr/lib/x86_64-linux-gnu/libfontconfig.so.1.7.0
7549 0.0902 recv.c:28 libc-2.17.so libc-2.17.so recv
7391 0.0883 (no location information) libXt.so.6.0.0 libXt.so.6.0.0 /usr/lib/x86_64-linux-gnu/libXt.so.6.0.0
7350 0.0878 core.c:2876 vmlinux vmlinux __schedule
7169 0.0856 memcpy-ssse3.S:59 libc-2.17.so libc-2.17.so __memcpy_ssse3
7129 0.0852 mutex.c:112 vmlinux vmlinux mutex_unlock
7035 0.0840 syscall-template.S:81 libc-2.17.so libc-2.17.so ioctl
6311 0.0754 drm_gem.c:393 vmlinux vmlinux drm_gem_object_lookup
6199 0.0740 pthread_mutex_lock.c:47 libpthread-2.17.so libpthread-2.17.so pthread_mutex_lock
6091 0.0728 memchr.S:25 libc-2.17.so libc-2.17.so memchr
5999 0.0717 process_64.c:273 vmlinux vmlinux __switch_to
5941 0.0710 socket.c:802 vmlinux vmlinux sock_recvmsg
5892 0.0704 core.c:1440 vmlinux vmlinux try_to_wake_up
5675 0.0678 core.c:3151 vmlinux vmlinux __wake_up
5649 0.0675 (no location information) libcairo.so.2.11200.14 libcairo.so.2.11200.14 /usr/lib/x86_64-linux-gnu/libcairo.so.2.11200.14
5607 0.0670 (no location information) libdrm_radeon.so.1.0.1 libdrm_radeon.so.1.0.1 /usr/lib/x86_64-linux-gnu/libdrm_radeon.so.1.0.1
5561 0.0664 (no location information) libxcb.so.1.1.0 libxcb.so.1.1.0 /usr/lib/x86_64-linux-gnu/libxcb.so.1.1.0
5532 0.0661 tty_buffer.c:258 vmlinux vmlinux tty_insert_flip_string_fixed_flag
5500 0.0657 (no location information) libpng12.so.0.49.0 libpng12.so.0.49.0 /lib/x86_64-linux-gnu/libpng12.so.0.49.0
5363 0.0641 radeon_gem.c:372 vmlinux vmlinux radeon_gem_wait_idle_ioctl
4872 0.0582 clear_page_64.S:11 vmlinux vmlinux clear_page_c
4857 0.0580 mutex.c:85 vmlinux vmlinux mutex_lock
4752 0.0568 fileops.c:1290 libc-2.17.so libc-2.17.so _IO_file_xsputn@@GLIBC_2.2.5
4599 0.0549 select.c:819 vmlinux vmlinux do_sys_poll
4539 0.0542 io.c:194 Xorg Xorg ReadRequestFromClient
4422 0.0528 pthread_mutex_unlock.c:36 libpthread-2.17.so libpthread-2.17.so __pthread_mutex_unlock_usercnt
4361 0.0521 regexec.c:621 libc-2.17.so libc-2.17.so re_search_internal
4226 0.0505 strchrnul.S:26 libc-2.17.so libc-2.17.so strchrnul
4212 0.0503 entry_64.S:1472 vmlinux vmlinux page_fault
3983 0.0476 hash.c:140 vmlinux vmlinux ext4fs_dirhash
3958 0.0473 genops.c:448 libc-2.17.so libc-2.17.so _IO_default_xsputn
3758 0.0449 exa_unaccel.c:323 libexa.so libexa.so ExaCheckImageGlyphBlt
3734 0.0446 exa.c:284 libexa.so libexa.so ExaDoPrepareAccess
3716 0.0444 slab.c:3864 vmlinux vmlinux __kmalloc
3705 0.0443 radeon_object.c:601 vmlinux vmlinux radeon_bo_wait
3638 0.0435 ttm_bo.c:368 vmlinux vmlinux ttm_bo_unreserve

This patch fixes the performance issue for me: http://lists.x.org/archives/xorg-devel/2011-May/022275.html
But it was never applied because it "breaks the x11perf status line". (For me the x11perf status line has a rendering issue, but that was even before applying this patch; I don't see any breakage related to it.)

Dupe of this issue: https://bugs.freedesktop.org/show_bug.cgi?id=35197

while opening my gmail account it is not opening

This is ancient and I can't test exa with my current hardware. If anyone can test/is still affected you could file a new bug.

I have:
AMD Radeon HD 7660D
linux-4.14.0
xorg-server-1.19.5
xf86-video-ati-7.9.0
and I confirm that it's still slow if I switch to EXA with ColorTiling enabled.

The fundamental issue in EXA will likely never be fixed at this point. The recommended solution is to use glamor where possible.
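For anyone re-testing, the two oprofile listings in this report can be compared mechanically. The following Python sketch is only an illustration: `parse_opreport` and the two sample strings are made up for this example, the rows are copied from the listings above, and the parser assumes each data row starts with a sample count and a percentage and ends with the symbol name (rows whose last field is a path or an address range are keyed by that field instead).

```python
def parse_opreport(text):
    """Map the last field of each data row (usually the symbol) to its percentage."""
    shares = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows begin with an integer sample count; skip headers and blanks.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        try:
            percent = float(fields[1])
        except ValueError:
            continue
        shares[fields[-1]] = shares.get(fields[-1], 0.0) + percent
    return shares


# Rows copied from the two listings in this report.
TILING_ON = """\
samples % linenr info image name app name symbol name
1444865 49.9498 r100.c:4059 vmlinux vmlinux r100_mm_rreg
37058 1.2811 fbbits.h:502 libfb.so libfb.so fbGlyph32
"""

TILING_OFF = """\
samples % linenr info image name app name symbol name
443063 5.2922 fbbits.h:502 libfb.so libfb.so fbGlyph32
101344 1.2105 r100.c:4059 vmlinux vmlinux r100_mm_rreg
"""

if __name__ == "__main__":
    on = parse_opreport(TILING_ON)
    off = parse_opreport(TILING_OFF)
    for symbol in sorted(set(on) | set(off)):
        print(symbol, on.get(symbol, 0.0), off.get(symbol, 0.0))
```

With the full listings pasted in, the jump in r100_mm_rreg between the two runs stands out immediately, which matches the busy-waiting guess made earlier in the thread.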