Bug 94596

Summary: xserver-1.18.2 uses one core fully with dri3
Product: xorg Reporter: Laurent carlier <lordheavym>
Component: Driver/AMDgpuAssignee: xf86-video-ati maintainers <xorg-driver-ati>
Status: RESOLVED FIXED QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium CC: alexander, linux, mrader3940, shawn.starr
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
perf file with "perf record --pid <xorg>"
none
perf file with 'perf report --stdio | head -1500'
none
perf file with 'perf report -G --stdio | head -500'
none
perf file with 'perf report -G --stdio | head -500' - second try
none
perf file with 'perf report -G --stdio | head -20000'
none
trace with sysprof (for Michel)
none
A nice screenshot
none
Only requeue for the next MSC none

Description Laurent carlier 2016-03-17 16:44:50 UTC
Created attachment 122381 [details]
perf file with "perf record --pid <xorg>"

- xserver-1.18.2
- xf86-video-amdgpu-1.0.1 (same problem with git)
- linux kernel-4.5.0

afaik it was working with xserver-1.18.1, back to normal when dri3 is disabled. Joined file is recorded with perf.
Comment 1 Chris Wilson 2016-03-17 20:47:44 UTC
perf.data is local to the system it was generated on, it depends upon the local libraries for symbol resolution.

Please do something like "perf report --stdio | head -1500 | upload"
Comment 2 Laurent carlier 2016-03-17 20:57:57 UTC
Created attachment 122390 [details]
perf file with 'perf report --stdio | head -1500'
Comment 3 Chris Wilson 2016-03-17 21:09:06 UTC
Spinning on poll. Ok, next up is perf report -G --stdio | head -500

My bet is the busywait in ms_present_queue_vblank() when its gets more than 128 vblanks behind the clients.
Comment 4 Laurent carlier 2016-03-17 21:13:30 UTC
Created attachment 122391 [details]
perf file with 'perf report -G --stdio | head -500'
Comment 5 Chris Wilson 2016-03-17 21:42:03 UTC
* scratches head, -G doesn't work with --stdio?

Oh, did you record with perf record -g? If not, can you please grab a new perf.data with callgraphs (-g)?
Comment 6 Laurent carlier 2016-03-17 21:43:37 UTC
Will do, currently, i'm bisecting......
Comment 7 Laurent carlier 2016-03-17 22:14:31 UTC
Created attachment 122394 [details]
perf file with 'perf report -G --stdio | head -500' - second try
Comment 8 Laurent carlier 2016-03-17 22:27:02 UTC
Bisecting result:
e457e93e5d33cbec387e40051fbabf6cf76d6af3 is the first bad commit
commit e457e93e5d33cbec387e40051fbabf6cf76d6af3
Author: Michel Dänzer <michel.daenzer@amd.com>
Date:   Wed Feb 24 16:52:59 2016 +0900

    present: Only requeue if target MSC is not reached after an unflip
    
    While present_pixmap decrements target_msc by 1 for present_queue_vblank,
    it leaves the original vblank->target_msc intact. So incrementing the
    latter for requeueing resulted in the requeued presentation being
    executed too late.
    
    Also, no need to requeue if the target MSC is already reached.
    
    This further reduces stutter when a popup menu appears on top of a
    flipping fullscreen window.
    
    Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
    (cherry picked from commit b4ac7b142fa3c536e9b283cfd34b94d82c03aac6)

:040000 040000 ed8bcba1d57c5f2070241b926eb60abcb15dc19f 0a73bffafa12376ef064352c520e962e3c7e351e M      present

Bisect log:
git bisect start
# good: [23e60f917a5af652cd83b8a3a9951c79838721b3] xserver 1.18.1
git bisect good 23e60f917a5af652cd83b8a3a9951c79838721b3
# bad: [93d4224ccf3dee5a51815a66f76c905450419b50] xserver 1.18.2
git bisect bad 93d4224ccf3dee5a51815a66f76c905450419b50
# bad: [c1eb12f89c2391cd99a834da6b342f104da1eb5c] xwin: Update to XRANDR 1.2 internal interface to ensure an output is reported by XRANDR
git bisect bad c1eb12f89c2391cd99a834da6b342f104da1eb5c
# good: [68bd172432de069aa242d21b3af505867417ae45] glamor: Reuse the glamor_program_alpha_* enums for Render.
git bisect good 68bd172432de069aa242d21b3af505867417ae45
# good: [29d1c5a8e4127d5fd1a0d41f8e1474e9dfdb9638] randr: Send ConfigNotify when manual monitor list changes
git bisect good 29d1c5a8e4127d5fd1a0d41f8e1474e9dfdb9638
# good: [eb5108b87017128f394ae31b5b7cd85dd8819bca] present: Requeue if flip driver hook fails and target MSC not reached
git bisect good eb5108b87017128f394ae31b5b7cd85dd8819bca
# bad: [1e143cff16835dd69559ca5646f313dad3db75ab] xwin: Improve handling of no-decoration motif hint
git bisect bad 1e143cff16835dd69559ca5646f313dad3db75ab
# bad: [f0c2a3527797cf2bfbfa4697c294ae087c1e4423] xwin: XGetWMNormalHints() returns non-zero on success
git bisect bad f0c2a3527797cf2bfbfa4697c294ae087c1e4423
# bad: [e457e93e5d33cbec387e40051fbabf6cf76d6af3] present: Only requeue if target MSC is not reached after an unflip
git bisect bad e457e93e5d33cbec387e40051fbabf6cf76d6af3
# first bad commit: [e457e93e5d33cbec387e40051fbabf6cf76d6af3] present: Only requeue if target MSC is not reached after an unflip
Comment 9 Laurent carlier 2016-03-17 22:57:37 UTC
(In reply to Laurent carlier from comment #8)
> Bisecting result:
> e457e93e5d33cbec387e40051fbabf6cf76d6af3 is the first bad commit
> commit e457e93e5d33cbec387e40051fbabf6cf76d6af3
> Author: Michel Dänzer <michel.daenzer@amd.com>
> Date:   Wed Feb 24 16:52:59 2016 +0900

No, i still have the problem with this commit reverted, will bisect again
Comment 10 Chris Wilson 2016-03-18 00:12:40 UTC
(In reply to Laurent carlier from comment #7)
> Created attachment 122394 [details]
> perf file with 'perf report -G --stdio | head -500' - second try

Looks to be interesting, but with the calldepth, head -500 is far too short. Is head -10000 uploadable?
Comment 11 Michel Dänzer 2016-03-18 00:48:56 UTC
Can you get debugging symbols for amdgpu_drv.so as well?

Also, how do you trigger the problem? Haven't seen it myself.
Comment 12 Michel Dänzer 2016-03-18 00:49:56 UTC
BTW, personally I'd prefer a sysprof profile to look at.
Comment 13 Laurent carlier 2016-03-18 10:02:17 UTC
Created attachment 122416 [details]
perf file with 'perf report -G --stdio | head -20000'
Comment 14 Laurent carlier 2016-03-18 10:05:25 UTC
(In reply to Michel Dänzer from comment #11)
> Can you get debugging symbols for amdgpu_drv.so as well?
> 
> Also, how do you trigger the problem? Haven't seen it myself.

There is not reproducible. It can produce in some minutes or after 20 minutes; I'm with kde desktop
Comment 15 Laurent carlier 2016-03-18 10:07:07 UTC
Created attachment 122417 [details]
trace with sysprof (for Michel)
Comment 16 Laurent carlier 2016-03-18 10:18:58 UTC
Created attachment 122418 [details]
A nice screenshot
Comment 17 Laurent carlier 2016-03-18 13:53:35 UTC
New bisect result:
eb5108b87017128f394ae31b5b7cd85dd8819bca is the first bad commit
commit eb5108b87017128f394ae31b5b7cd85dd8819bca
Author: Michel Dänzer <michel.daenzer@amd.com>
Date:   Wed Feb 24 16:52:58 2016 +0900

    present: Requeue if flip driver hook fails and target MSC not reached                                                                                                                                         
                                                                                                                                                                                                                  
    For flipping, we wait for the MSC before the target MSC and then call                                                                                                                                         
    the driver flip hook. If the latter fails, we have to wait for the                                                                                                                                            
    target MSC before falling back to a copy, or else it's executed too                                                                                                                                           
    early.                                                                                                                                                                                                        
                                                                                                                                                                                                                  
    Fixes glxgears running at unbounded framerate (not synchronized to the                                                                                                                                        
    refresh rate) in fullscreen if the driver flip hook fails.                                                                                                                                                    
                                                                                                                                                                                                                  
    Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>                                                                                                                                                          
    Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>                                                                                                                                                         
    (cherry picked from commit e7a35b9e16aa12970908f5d55371bb1b862f8f24)                                                                                                                                          
                                                                                                                                                                                                                  
:040000 040000 f429593ed2928acc4a99b18f2b47fc9bf418cfdd ed8bcba1d57c5f2070241b926eb60abcb15dc19f M      present                                                                                                 
  
Bisect log:
git bisect start
# good: [23e60f917a5af652cd83b8a3a9951c79838721b3] xserver 1.18.1
git bisect good 23e60f917a5af652cd83b8a3a9951c79838721b3
# bad: [93d4224ccf3dee5a51815a66f76c905450419b50] xserver 1.18.2
git bisect bad 93d4224ccf3dee5a51815a66f76c905450419b50
# bad: [c1eb12f89c2391cd99a834da6b342f104da1eb5c] xwin: Update to XRANDR 1.2 internal interface to ensure an output is reported by XRANDR
git bisect bad c1eb12f89c2391cd99a834da6b342f104da1eb5c
# good: [68bd172432de069aa242d21b3af505867417ae45] glamor: Reuse the glamor_program_alpha_* enums for Render.
git bisect good 68bd172432de069aa242d21b3af505867417ae45
# good: [29d1c5a8e4127d5fd1a0d41f8e1474e9dfdb9638] randr: Send ConfigNotify when manual monitor list changes
git bisect good 29d1c5a8e4127d5fd1a0d41f8e1474e9dfdb9638
# bad: [eb5108b87017128f394ae31b5b7cd85dd8819bca] present: Requeue if flip driver hook fails and target MSC not reached
git bisect bad eb5108b87017128f394ae31b5b7cd85dd8819bca
# good: [10e46504711427ed5280fdeb974a08b888f47034] xwayland: Prefix shm tmp file names with xwayland
git bisect good 10e46504711427ed5280fdeb974a08b888f47034
# good: [8869f45a965a7b12bb7fe03ba226167e1f821c1e] kdrive/evdev: update keyboard LEDs (#22302)
git bisect good 8869f45a965a7b12bb7fe03ba226167e1f821c1e
# good: [cf30b7cfd6d91da16c762332617fc165d82f17d2] present: Move msc_is_(equal_or_)after to the top of present.c
git bisect good cf30b7cfd6d91da16c762332617fc165d82f17d2
# first bad commit: [eb5108b87017128f394ae31b5b7cd85dd8819bca] present: Requeue if flip driver hook fails and target MSC not reached
Comment 18 Martin Peres 2016-03-18 14:03:14 UTC
I also bisected an issue to this commit, but this time not only was one core fully utilized, it was also leaking RAM at an impressive rate (600 MB/s).
Comment 19 Martin Peres 2016-03-18 14:04:23 UTC
(In reply to Martin Peres from comment #18)
> I also bisected an issue to this commit, but this time not only was one core
> fully utilized, it was also leaking RAM at an impressive rate (600 MB/s).

This would be when using the intel ddx.
Comment 20 Chris Wilson 2016-03-18 15:17:08 UTC
(In reply to Martin Peres from comment #19)
> (In reply to Martin Peres from comment #18)
> > I also bisected an issue to this commit, but this time not only was one core
> > fully utilized, it was also leaking RAM at an impressive rate (600 MB/s).
> 
> This would be when using the intel ddx.

That happens under the same scenario as the busyspin - if we continually requeue multiple events to the same target vblank, we keep adding the new event to the currently queued vblank (by reallocating the event array). That will show up in the full-debug log quite nicely, I think.
Comment 21 Bernd Steinhauser 2016-03-21 17:42:51 UTC
*** Bug 94624 has been marked as a duplicate of this bug. ***
Comment 22 Bernd Steinhauser 2016-03-21 17:46:31 UTC
As mentioned in bug 94624, I do see this, too.
GPU is Kaveri, driver is radeon. I could try to reproduce it with amdgpu, but I don't really see the point as it was proven already.

For me, there is no delay in the effect. Immediately after login I do see Xorg using one core fully.
That made it quite easy to bisect.

I also can remove the effect by reverting said commit on top of 1.18.2.
Comment 23 Michel Dänzer 2016-03-23 04:18:43 UTC
Created attachment 122489 [details] [review]
Only requeue for the next MSC

While it's looking from bug 94515 like this is actually an issue with clients passing invalid MSC parameters and/or drivers not checking them thoroughly enough, this xserver patch should make Xorg more robust against this. Please test.
Comment 24 Shawn Starr 2016-03-23 16:53:05 UTC
Michel, your patch seems to fix issue, still testing, but my normal trigger is not showing one core being pegged.
Comment 25 Laurent carlier 2016-03-23 20:37:19 UTC
(In reply to Michel Dänzer from comment #23)
> Created attachment 122489 [details] [review] [review]
> Only requeue for the next MSC
> 
> While it's looking from bug 94515 like this is actually an issue with
> clients passing invalid MSC parameters and/or drivers not checking them
> thoroughly enough, this xserver patch should make Xorg more robust against
> this. Please test.

Your patch fixes the problem
Comment 26 Bernd Steinhauser 2016-03-23 20:48:45 UTC
Yes, that fixes the issue.
Comment 27 Maxim Sheviakov 2016-03-24 08:50:53 UTC
(In reply to Michel Dänzer from comment #23)
> Created attachment 122489 [details] [review] [review]
> Only requeue for the next MSC
> 
> While it's looking from bug 94515 like this is actually an issue with
> clients passing invalid MSC parameters and/or drivers not checking them
> thoroughly enough, this xserver patch should make Xorg more robust against
> this. Please test.

Also fixes the problem: https://bugs.freedesktop.org/show_bug.cgi?id=94611
Could you please send it upstream?
Comment 28 Michel Dänzer 2016-03-24 08:53:25 UTC
*** Bug 94611 has been marked as a duplicate of this bug. ***
Comment 29 Michel Dänzer 2016-03-24 08:56:55 UTC
(In reply to Maxim Sheviakov from comment #27)
> Could you please send it upstream?

Done: https://patchwork.freedesktop.org/patch/78155/
Comment 30 Michel Dänzer 2016-03-25 02:44:39 UTC
The fix has landed on the master branch; leaving this report open until it lands on the 1.18 branch.
Comment 31 Michel Dänzer 2016-03-30 02:21:46 UTC
Fix landed in Git server-1.18-branch:

commit 07ad2fde78f07e98caaf3e9b6b67af15359fefe4
Author: Michel Dänzer <michel.daenzer@amd.com>
Date:   Thu Mar 24 17:42:47 2016 +0900

    present: Only requeue for next MSC after flip failure

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.