OS is current Debian Sid 32bit, card 1002:5960...
So this is on r200. I spotted it when playing any video file with the gl renderer (many videos in games are also affected). Bisecting says it started with this code in ttm:
And it performs the same in the 3.14-rc5 kernel :). It consumes circa 40% or more CPU power after that commit... how much *more* depends on the video file and on whether it is played in a game or with mplayer -vo gl.
The Mesa version doesn't seem to matter; I tried 9.2 git, 10.0 git, 10.1 git and current git master.
Created attachment 95043 [details]
Created attachment 95046 [details]
Created attachment 95047 [details]
Disabling or enabling ColorTiling doesn't help; it is the same plus 40% CPU usage either way.
In both cases, even though it uses more CPU for gl video playback, playback is not smooth anymore.
Created attachment 95050 [details]
Example from games where a video is shown: the main menu of the OpenJK game https://github.com/JACoders/OpenJK . A video file plays in the center circle.
CPU usage is 50% and it gets ~50 FPS.
Created attachment 95051 [details]
With this commit:
CPU usage is higher, 90%, but it gets just ~12 FPS :).
So in this case it uses 700 MHz more CPU power, but gives no FPS in return :).
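To put the two measurements side by side, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: it assumes the CPU percentages and FPS figures quoted above are directly comparable between the two runs, and the clock speed is merely inferred from "40 percentage points more is about 700 MHz", not measured.

```python
# Numbers taken from the two measurements above (OpenJK main menu).
good_cpu, good_fps = 50.0, 50.0   # before the commit: 50% CPU, ~50 FPS
bad_cpu, bad_fps = 90.0, 12.0     # with the commit: 90% CPU, ~12 FPS


def cpu_per_frame(cpu_percent, fps):
    """CPU percentage points spent per frame-per-second delivered."""
    return cpu_percent / fps


good_cost = cpu_per_frame(good_cpu, good_fps)
bad_cost = cpu_per_frame(bad_cpu, bad_fps)
slowdown = bad_cost / good_cost

# Inference, not a measurement: if 40 percentage points equal ~700 MHz,
# the full clock would be about 700 * 100 / 40 MHz.
extra_points = bad_cpu - good_cpu
implied_clock_mhz = 700 * 100 / extra_points

print(f"cost per frame: good={good_cost:.2f}, bad={bad_cost:.2f}")
print(f"per-frame CPU cost grew by a factor of {slowdown:.1f}")
print(f"implied CPU clock: ~{implied_clock_mhz:.0f} MHz")
```

Under these assumptions each delivered frame costs roughly 7.5 times more CPU with the commit than without it.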
There's really nothing in this commit that should affect anything but metadata, and if anything, CPU usage should've decreased.
So if CPU usage is drastically increased by this commit (or rather this merge), IMHO that points to a bug somewhere else, but triggered by this merge; possibly in radeon / TTM caching setup or in the x86 PAT memory region tracking code.
In any case, I'm out of office this week and can't look at this before next week, but in the meantime it would be helpful if you could perform a couple of additional checks:
1) Try to find out which of the commits in the merge causes this. If that turns out to be hard, could you change all occurrences of VM_PFNMAP in ttm_bo_mmap() (ttm_bo_vm.c) to VM_MIXEDMAP and check if that fixes the problem you are seeing?
2) If you're familiar with oprofile, it would be very helpful to see a profile of a "good" and a "bad" run, including kernel symbols.
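The change suggested in 1) is a plain textual substitution. A minimal Python sketch of it follows. The helper names are hypothetical, and the sketch replaces every whole-word occurrence in the text it is given, not only those inside ttm_bo_mmap(), so the resulting diff should be reviewed before building the kernel.

```python
import re
from pathlib import Path


def swap_pfnmap_for_mixedmap(source: str) -> str:
    """Return source with every whole-word VM_PFNMAP replaced by VM_MIXEDMAP."""
    return re.sub(r"\bVM_PFNMAP\b", "VM_MIXEDMAP", source)


def patch_file(path: str) -> None:
    """Rewrite the given file in place (hypothetical helper; pass ttm_bo_vm.c)."""
    p = Path(path)
    p.write_text(swap_pfnmap_for_mixedmap(p.read_text()))


# Small demonstration on a made-up line of C, not the real kernel source.
example = "vma->vm_flags |= VM_PFNMAP;"
print(swap_pfnmap_for_mixedmap(example))
```

Calling patch_file() on the path of ttm_bo_vm.c inside a kernel tree would apply the substitution to the whole file.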
(In reply to comment #8)
> could you change all occurrences of VM_PFNMAP in
> ttm_bo_mmap() (ttm_bo_vm.c) to VM_MIXEDMAP and check if that fixes the
> problem you are seeing?
A quick try and yes, that helps :). Seems like r200 likes VM_MIXEDMAP, but not VM_PFNMAP.
One thing to point out which is rather strange to me, but maybe it can help someone figure out what this is... GL game rendering and its CPU usage are normal and not affected by this bug!!! ONLY gl video playback triggers it, both in players and in games.
So it takes two things in combination: VM_PFNMAP and GL videos don't play well together on r200 :).
I am not into this, but... could it be that videos played with gl are some kind of special case which maybe needs COW and thus needs MIXEDMAP, or something like that? :)
Created attachment 95181 [details]
BAD: 3.14-rc5 and VM_PFNMAP
(In reply to comment #8)
> 2) If you're familiar with oprofile, it would be very helpful to see a
> profile of a "good" and a "bad" run, including kernel symbols.
Not very familiar, but I gave it a try :). This is a proper 3.14-rc5 kernel with pfnmap, which is the bad case here, and then the same kernel profiled with oprofile with pfnmap changed to mixedmap, which is the good one :).
Created attachment 95182 [details]
GOOD: 3.14-rc5 and VM_MIXEDMAP
(In reply to comment #8)
> So if CPU usage is drastically increased by this commit (or rather this
> merge), IMHO that points to a bug somewhere else, but triggered by this
> merge; possibly in radeon / TTM caching setup or in the x86 PAT memory
> region tracking code.
Seems like that is the case... Using VM_PFNMAP instead of VM_MIXEDMAP means I now need the 'nopat' boot option; that is also a workaround...
So with VM_PFNMAP only the 'nopat' option works well here, or, like before, VM_MIXEDMAP works well with PAT... VM_PFNMAP with PAT is really unusable here for gl videos :).
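For anyone trying to reproduce this, a small Python sketch that records the combinations reported so far and checks whether the running kernel was booted with 'nopat'. The function names are hypothetical, and the results table only restates what was observed in this report on r200; the VM_MIXEDMAP plus 'nopat' combination was not tested and is left out.

```python
from pathlib import Path

# Results reported in this thread for GL video playback on r200.
RESULTS = {
    ("VM_PFNMAP", "PAT"): "bad",
    ("VM_PFNMAP", "nopat"): "good",
    ("VM_MIXEDMAP", "PAT"): "good",
}


def pat_disabled(cmdline: str) -> bool:
    """True if 'nopat' appears as its own option on a kernel command line."""
    return "nopat" in cmdline.split()


def running_kernel_has_nopat(proc_cmdline: str = "/proc/cmdline") -> bool:
    """Check the command line of the running Linux kernel."""
    return pat_disabled(Path(proc_cmdline).read_text())


# Demonstration on a made-up command line, not taken from the report.
print(pat_disabled("root=/dev/sda1 ro quiet nopat"))
```

The check matches whole options only, so an unrelated parameter that merely contains the substring "nopat" does not count.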
(In reply to comment #14)
> (In reply to comment #8)
> > So if CPU usage is drastically increased by this commit (or rather this
> > merge), IMHO that points to a bug somewhere else, but triggered by this
> > merge; possibly in radeon / TTM caching setup or in the x86 PAT memory
> > region tracking code.
> Seems like that is the case... VM_PFNMAP usage instead of VM_MIXEDMAP
> triggers that i now need 'nopat' boot option it is also workaround...
> So only 'nopat' option work here good with VM_PFNMAP or like before
> VM_MIXEDMAP is good with PAT... VM_PFNMAP with PAT here is really unusable
> for gl videos :).
Interesting, although it's not the first time x86 pat/caching has busted graphics performance-wise, which might appear a bit strange considering graphics is a major user.
In any case, I'll put together a patch to revert the VM_PFNMAP usage until I've figured out what's really happening (it's only there to boost performance anyway).
Meanwhile, the fact that this is visible also points to an area of potential improvement in the radeon driver. Apparently video playback causes a lot of buffer maps / unmaps. Could the maps somehow be cached in the driver?
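To illustrate what caching the maps could mean, here is a purely conceptual Python model, not radeon or TTM code. All names (MapCache, fake_map, fake_unmap) are hypothetical, and it assumes that keeping a buffer mapped between uses is acceptable. The idea is that the expensive map happens once per buffer and later map/unmap pairs are served from the cache.

```python
class MapCache:
    """Keeps a buffer mapped after the first map() so repeats are cheap."""

    def __init__(self, do_map, do_unmap):
        self._do_map = do_map        # hypothetical expensive map operation
        self._do_unmap = do_unmap    # hypothetical expensive unmap operation
        self._mapped = {}            # buffer handle -> cached mapping

    def map(self, handle):
        # Only call the expensive operation the first time a handle is seen.
        if handle not in self._mapped:
            self._mapped[handle] = self._do_map(handle)
        return self._mapped[handle]

    def unmap(self, handle):
        # Deliberately a no-op: the mapping stays cached for the next map().
        pass

    def destroy(self, handle):
        # The real unmap happens once, when the buffer goes away.
        mapping = self._mapped.pop(handle, None)
        if mapping is not None:
            self._do_unmap(mapping)


calls = {"map": 0, "unmap": 0}


def fake_map(handle):
    calls["map"] += 1
    return ("mapping", handle)


def fake_unmap(mapping):
    calls["unmap"] += 1


cache = MapCache(fake_map, fake_unmap)
for _ in range(100):      # e.g. one map/unmap pair per video frame
    cache.map(7)
    cache.unmap(7)
cache.destroy(7)
print(calls)
```

In this model 100 map/unmap pairs on the same buffer cost one real map and one real unmap. Whether that trade-off is acceptable in a real driver depends on address space and lifetime constraints that this sketch ignores.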