Bug 75719 - mplayer -vo gl consume more CPU on r200
Summary: mplayer -vo gl consume more CPU on r200
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Radeon
Version: XOrg git
Hardware: x86 (IA32) Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-03-03 15:09 UTC by smoki
Modified: 2014-03-10 09:41 UTC

Attachments
dmesg (30.38 KB, text/plain)
2014-03-03 17:17 UTC, smoki
glxinfo (9.98 KB, text/plain)
2014-03-03 17:18 UTC, smoki
Xorg.0.log (39.96 KB, text/plain)
2014-03-03 17:19 UTC, smoki
Before commit (776.32 KB, image/png)
2014-03-03 18:04 UTC, smoki
With this commit (774.16 KB, image/png)
2014-03-03 18:06 UTC, smoki
BAD: 3.14-rc5 and VM_PFNMAP (83.37 KB, text/plain)
2014-03-05 18:49 UTC, smoki
GOOD: 3.14-rc5 and VM_MIXEDMAP (85.20 KB, text/plain)
2014-03-05 18:50 UTC, smoki

Description smoki 2014-03-03 15:09:52 UTC
OS is current Debian Sid 32bit, card 1002:5960...
  
  I spotted this on r200 when playing any video file with the gl renderer (many videos in games are also affected). Bisecting says it started with this code in ttm:
 
 http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-next&id=a095c60bd06f204c98527aafd5fda6ef42b53eb5
 
 It behaves the same in the 3.14-rc5 kernel :). It consumes circa 40% or more CPU after that commit... how much *more* depends on the video file and on whether it is a video in a game or a file played with mplayer -vo gl.
 
  The Mesa version doesn't seem to matter; I tried 9.2 git, 10.0 git, 10.1 git and current git master.
Comment 1 smoki 2014-03-03 17:17:41 UTC
Created attachment 95043 [details]
dmesg
Comment 2 smoki 2014-03-03 17:18:31 UTC
Created attachment 95046 [details]
glxinfo
Comment 3 smoki 2014-03-03 17:19:18 UTC
Created attachment 95047 [details]
Xorg.0.log
Comment 4 smoki 2014-03-03 17:22:55 UTC
 Disabling or enabling ColorTiling doesn't help; it is the same extra 40% CPU usage.

 In both cases, even though it uses more CPU for gl video playback, playback is not smooth anymore.
Comment 5 smoki 2014-03-03 18:04:26 UTC
Created attachment 95050 [details]
Before commit


 Example from a game that contains video: the main menu of the OpenJK game https://github.com/JACoders/OpenJK . A video file is playing in the center circle.

 CPU usage is 50% and it has ~50 FPS
Comment 6 smoki 2014-03-03 18:06:38 UTC
Created attachment 95051 [details]
With this commit


 With this commit:

 CPU usage is higher, 90%, but it has just ~12 FPS :).
Comment 7 smoki 2014-03-03 18:08:44 UTC
 So in this case it uses 700 MHz more CPU power, but gains nothing in FPS :).
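
The two OpenJK measurements in comments 5 and 6 can be put on a common scale by dividing CPU usage by frame rate. This is only a rough sketch using the approximate figures reported above; the helper name is made up for illustration.

```python
def cpu_percent_per_fps(cpu_percent: float, fps: float) -> float:
    """CPU usage (in percent) needed for each frame per second delivered."""
    return cpu_percent / fps


# Approximate figures from the two screenshots (comments 5 and 6).
before = cpu_percent_per_fps(50, 50)  # before the commit: 50% CPU, ~50 FPS
after = cpu_percent_per_fps(90, 12)   # with the commit: ~90% CPU, ~12 FPS

# How much more CPU each delivered frame costs with the commit applied.
cost_ratio = after / before
print(before, after, cost_ratio)
```

With these rounded inputs, each delivered frame costs several times more CPU with the commit than without it, which is consistent with the choppy playback described in comment 4.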
Comment 8 Thomas Hellström 2014-03-04 13:06:32 UTC
There's really nothing in this commit that should affect anything but metadata, and if anything, CPU usage should've decreased.

So if CPU usage is drastically increased by this commit (or rather this merge), IMHO that points to a bug somewhere else, but triggered by this merge; possibly in radeon / TTM caching setup or in the x86 PAT memory region tracking code.

In any case, I'm out of office this week and can't look at this before next week, but in the meantime it would be helpful if you could perform a couple of additional checks:

1) Try to find out which of the commits in the merge causes this. If that turns out to be hard, could you change all occurrences of VM_PFNMAP in ttm_bo_mmap() (ttm_bo_vm.c) to VM_MIXEDMAP and check if that fixes the problem you are seeing?

2) If you're familiar with oprofile, it would be very helpful to see a profile of a "good" and a "bad" run, including kernel symbols.

Thanks,

/Thomas
Comment 9 smoki 2014-03-04 14:37:59 UTC
(In reply to comment #8)
> could you change all occurrences of VM_PFNMAP in
> ttm_bo_mmap() (ttm_bo_vm.c) to VM_MIXEDMAP and check if that fixes the
> problem you are seeing?

 Quick try, and yes, that helps :). Seems like r200 likes VM_MIXEDMAP, but not VM_PFNMAP.
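
The experiment requested in comment 8 amounts to a textual substitution limited to one function. Below is a minimal sketch of such a substitution, assuming plain C source with balanced braces and no braces hidden in strings or comments. The helper `swap_flag_in_function` and the `SAMPLE` text are invented for illustration; `SAMPLE` is not the real content of ttm_bo_vm.c.

```python
import re


def swap_flag_in_function(source: str, func_name: str, old: str, new: str) -> str:
    """Replace the identifier `old` with `new` only inside the body of `func_name`."""
    # A definition is the name, an argument list, then an opening brace.
    # A plain call such as "foo(a, b);" does not match because no brace follows.
    header = re.search(r"\b" + re.escape(func_name) + r"\s*\([^)]*\)\s*\{", source)
    if header is None:
        raise ValueError(f"definition of {func_name!r} not found")

    start = header.end()  # first character after the opening brace
    depth = 1
    pos = start
    # Walk forward until the brace that opened the function body is closed.
    while pos < len(source) and depth > 0:
        if source[pos] == "{":
            depth += 1
        elif source[pos] == "}":
            depth -= 1
        pos += 1
    if depth != 0:
        raise ValueError("unbalanced braces in source")

    body = source[start:pos]
    body = re.sub(r"\b" + re.escape(old) + r"\b", new, body)
    return source[:start] + body + source[pos:]


# Invented sample input, NOT the actual kernel source.
SAMPLE = """
int other(struct vm_area_struct *vma)
{
	vma->vm_flags |= VM_PFNMAP;
	return 0;
}

int ttm_bo_mmap(struct file *filp, struct vm_area_struct *vma)
{
	if (vma->vm_pgoff) {
		vma->vm_flags |= VM_PFNMAP;
	}
	vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND;
	return 0;
}
"""

patched = swap_flag_in_function(SAMPLE, "ttm_bo_mmap", "VM_PFNMAP", "VM_MIXEDMAP")
print(patched)
```

Restricting the change to one function keeps any other use of VM_PFNMAP in the file untouched, which matches the wording of the request ("in ttm_bo_mmap()").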
Comment 10 smoki 2014-03-04 15:47:59 UTC
 One thing to point out, which is rather strange to me but may help someone figure out what this is... GL rendering in games and its CPU usage are normal and not affected by this bug!!! ONLY gl video playback triggers it, both with players and in games.

 So two things are in this combination: VM_PFNMAP and GL videos don't play well together on r200 :).
Comment 11 smoki 2014-03-04 17:37:17 UTC
 I am not familiar with this, but... could it be that videos played with gl are some kind of special case which maybe needs COW and thus needs MIXEDMAP, or something like that? :)
Comment 12 smoki 2014-03-05 18:49:23 UTC
Created attachment 95181 [details]
BAD: 3.14-rc5 and VM_PFNMAP

(In reply to comment #8)
> 
> 2) If you're familiar with oprofile, it would be very helpful to see a
> profile of a "good" and a "bad" run, including kernel symbols.
>

 Not very familiar, but I gave it a try :). This is an unmodified 3.14-rc5 kernel with pfnmap, which is the bad case here, and then the same kernel oprofiled with pfnmap changed to mixedmap, which is the good one :).
Comment 13 smoki 2014-03-05 18:50:41 UTC
Created attachment 95182 [details]
GOOD: 3.14-rc5 and VM_MIXEDMAP
Comment 14 smoki 2014-03-06 04:40:27 UTC
(In reply to comment #8)
> So if CPU usage is drastically increased by this commit (or rather this
> merge), IMHO that points to a bug somewhere else, but triggered by this
> merge; possibly in radeon / TTM caching setup or in the x86 PAT memory
> region tracking code.
> 

 Seems like that is the case... Using VM_PFNMAP instead of VM_MIXEDMAP means I now need the 'nopat' boot option, which is also a workaround...

 So with VM_PFNMAP only the 'nopat' option works well here, or, as before, VM_MIXEDMAP works well with PAT... VM_PFNMAP with PAT is really unusable here for gl videos :).
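
When comparing runs like these, it helps to confirm which configuration is actually booted. A small sketch, assuming a Linux system that exposes the kernel command line at /proc/cmdline; the helper names are made up for illustration.

```python
def has_boot_option(cmdline: str, option: str) -> bool:
    """Return True if `option` appears as a whole word on the kernel command line."""
    return option in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


if __name__ == "__main__":
    try:
        booted_with_nopat = has_boot_option(read_cmdline(), "nopat")
        print("nopat active:", booted_with_nopat)
    except OSError:
        # Not on Linux, or /proc is not mounted.
        print("kernel command line not available")
```

Splitting on whitespace avoids false matches on longer words that merely contain the string "nopat".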
Comment 15 Thomas Hellström 2014-03-10 09:41:25 UTC
(In reply to comment #14)
> (In reply to comment #8)
> > So if CPU usage is drastically increased by this commit (or rather this
> > merge), IMHO that points to a bug somewhere else, but triggered by this
> > merge; possibly in radeon / TTM caching setup or in the x86 PAT memory
> > region tracking code.
> > 
> 
>  Seems like that is the case... Using VM_PFNMAP instead of VM_MIXEDMAP
> means I now need the 'nopat' boot option, which is also a workaround...
> 
>  So with VM_PFNMAP only the 'nopat' option works well here, or, as before,
> VM_MIXEDMAP works well with PAT... VM_PFNMAP with PAT is really unusable
> here for gl videos :).


Interesting, although it's not the first time x86 PAT/caching has busted graphics performance-wise, which might appear a bit strange considering graphics is a major user.

In any case, I'll put together a patch to revert the VM_PFNMAP usage until I've figured out what's really happening (it's only there to boost performance anyway).

Meanwhile, the fact that this is visible also points to an area of potential improvement in the radeon driver. Apparently video playback causes a lot of buffer maps / unmaps. Could the maps somehow be cached in the driver?

Thanks,
Thomas
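
The idea of caching maps raised in the last comment can be illustrated in user-space terms. This is a sketch under stated assumptions only: `MapCache`, `fake_map` and `fake_unmap` are hypothetical names, the "mapping" is a plain string, and nothing here reflects how the radeon driver or TTM actually manage buffer mappings.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable


class MapCache:
    """Keep the most recently used mappings alive instead of unmapping each time."""

    def __init__(self, map_fn: Callable[[Hashable], Any],
                 unmap_fn: Callable[[Hashable, Any], None], capacity: int = 4):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._map_fn = map_fn
        self._unmap_fn = unmap_fn
        self._capacity = capacity
        self._maps: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, buffer_id: Hashable) -> Any:
        """Return a mapping for `buffer_id`, creating it only on a cache miss."""
        if buffer_id in self._maps:
            self._maps.move_to_end(buffer_id)  # mark as most recently used
            return self._maps[buffer_id]
        mapping = self._map_fn(buffer_id)
        self._maps[buffer_id] = mapping
        if len(self._maps) > self._capacity:
            # Evict the least recently used mapping.
            old_id, old_mapping = self._maps.popitem(last=False)
            self._unmap_fn(old_id, old_mapping)
        return mapping

    def flush(self) -> None:
        """Unmap everything, for example at teardown."""
        while self._maps:
            old_id, old_mapping = self._maps.popitem(last=False)
            self._unmap_fn(old_id, old_mapping)


# Stand-ins for the expensive operations, recording how often they run.
map_calls = []
unmap_calls = []


def fake_map(buffer_id):
    map_calls.append(buffer_id)
    return f"mapping-of-{buffer_id}"


def fake_unmap(buffer_id, mapping):
    unmap_calls.append(buffer_id)


cache = MapCache(fake_map, fake_unmap, capacity=2)
for frame in range(100):
    cache.get(frame % 2)  # two upload buffers used alternately, one per frame

print(len(map_calls), len(unmap_calls))
```

In this toy workload the expensive map operation runs once per buffer rather than once per frame. Whether a similar scheme is feasible in the driver depends on constraints not discussed in this report, such as buffer eviction and address-space pressure.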

