Created attachment 122381 [details] perf file with "perf record --pid <xorg>" - xserver-1.18.2 - xf86-video-amdgpu-1.0.1 (same problem with git) - linux kernel-4.5.0 afaik it was working with xserver-1.18.1, back to normal when dri3 is disabled. Joined file is recorded with perf.
perf.data is local to the system it was generated on, it depends upon the local libraries for symbol resolution. Please do something like "perf report --stdio | head -1500 | upload"
Created attachment 122390 [details] perf file with 'perf report --stdio | head -1500'
Spinning on poll. Ok, next up is perf report -G --stdio | head -500 My bet is the busywait in ms_present_queue_vblank() when its gets more than 128 vblanks behind the clients.
Created attachment 122391 [details] perf file with 'perf report -G --stdio | head -500'
* scratches head, -G doesn't work with --stdio? Oh, did you record with perf record -g? If not, can you please grab a new perf.data with callgraphs (-g)?
Will do, currently, i'm bisecting......
Created attachment 122394 [details] perf file with 'perf report -G --stdio | head -500' - second try
Bisecting result: e457e93e5d33cbec387e40051fbabf6cf76d6af3 is the first bad commit commit e457e93e5d33cbec387e40051fbabf6cf76d6af3 Author: Michel Dänzer <michel.daenzer@amd.com> Date: Wed Feb 24 16:52:59 2016 +0900 present: Only requeue if target MSC is not reached after an unflip While present_pixmap decrements target_msc by 1 for present_queue_vblank, it leaves the original vblank->target_msc intact. So incrementing the latter for requeueing resulted in the requeued presentation being executed too late. Also, no need to requeue if the target MSC is already reached. This further reduces stutter when a popup menu appears on top of a flipping fullscreen window. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> (cherry picked from commit b4ac7b142fa3c536e9b283cfd34b94d82c03aac6) :040000 040000 ed8bcba1d57c5f2070241b926eb60abcb15dc19f 0a73bffafa12376ef064352c520e962e3c7e351e M present Bisect log: git bisect start # good: [23e60f917a5af652cd83b8a3a9951c79838721b3] xserver 1.18.1 git bisect good 23e60f917a5af652cd83b8a3a9951c79838721b3 # bad: [93d4224ccf3dee5a51815a66f76c905450419b50] xserver 1.18.2 git bisect bad 93d4224ccf3dee5a51815a66f76c905450419b50 # bad: [c1eb12f89c2391cd99a834da6b342f104da1eb5c] xwin: Update to XRANDR 1.2 internal interface to ensure an output is reported by XRANDR git bisect bad c1eb12f89c2391cd99a834da6b342f104da1eb5c # good: [68bd172432de069aa242d21b3af505867417ae45] glamor: Reuse the glamor_program_alpha_* enums for Render. git bisect good 68bd172432de069aa242d21b3af505867417ae45 # good: [29d1c5a8e4127d5fd1a0d41f8e1474e9dfdb9638] randr: Send ConfigNotify when manual monitor list changes git bisect good 29d1c5a8e4127d5fd1a0d41f8e1474e9dfdb9638 # good: [eb5108b87017128f394ae31b5b7cd85dd8819bca] present: Requeue if flip driver hook fails and target MSC not reached git bisect good eb5108b87017128f394ae31b5b7cd85dd8819bca # bad: [1e143cff16835dd69559ca5646f313dad3db75ab] xwin: Improve handling of no-decoration motif hint git bisect bad 1e143cff16835dd69559ca5646f313dad3db75ab # bad: [f0c2a3527797cf2bfbfa4697c294ae087c1e4423] xwin: XGetWMNormalHints() returns non-zero on success git bisect bad f0c2a3527797cf2bfbfa4697c294ae087c1e4423 # bad: [e457e93e5d33cbec387e40051fbabf6cf76d6af3] present: Only requeue if target MSC is not reached after an unflip git bisect bad e457e93e5d33cbec387e40051fbabf6cf76d6af3 # first bad commit: [e457e93e5d33cbec387e40051fbabf6cf76d6af3] present: Only requeue if target MSC is not reached after an unflip
(In reply to Laurent carlier from comment #8) > Bisecting result: > e457e93e5d33cbec387e40051fbabf6cf76d6af3 is the first bad commit > commit e457e93e5d33cbec387e40051fbabf6cf76d6af3 > Author: Michel Dänzer <michel.daenzer@amd.com> > Date: Wed Feb 24 16:52:59 2016 +0900 No, i still have the problem with this commit reverted, will bisect again
(In reply to Laurent carlier from comment #7) > Created attachment 122394 [details] > perf file with 'perf report -G --stdio | head -500' - second try Looks to be interesting, but with the calldepth, head -500 is far too short. Is head -10000 uploadable?
Can you get debugging symbols for amdgpu_drv.so as well? Also, how do you trigger the problem? Haven't seen it myself.
BTW, personally I'd prefer a sysprof profile to look at.
Created attachment 122416 [details] perf file with 'perf report -G --stdio | head -20000'
(In reply to Michel Dänzer from comment #11) > Can you get debugging symbols for amdgpu_drv.so as well? > > Also, how do you trigger the problem? Haven't seen it myself. There is not reproducible. It can produce in some minutes or after 20 minutes; I'm with kde desktop
Created attachment 122417 [details] trace with sysprof (for Michel)
Created attachment 122418 [details] A nice screenshot
New bisect result: eb5108b87017128f394ae31b5b7cd85dd8819bca is the first bad commit commit eb5108b87017128f394ae31b5b7cd85dd8819bca Author: Michel Dänzer <michel.daenzer@amd.com> Date: Wed Feb 24 16:52:58 2016 +0900 present: Requeue if flip driver hook fails and target MSC not reached For flipping, we wait for the MSC before the target MSC and then call the driver flip hook. If the latter fails, we have to wait for the target MSC before falling back to a copy, or else it's executed too early. Fixes glxgears running at unbounded framerate (not synchronized to the refresh rate) in fullscreen if the driver flip hook fails. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> (cherry picked from commit e7a35b9e16aa12970908f5d55371bb1b862f8f24) :040000 040000 f429593ed2928acc4a99b18f2b47fc9bf418cfdd ed8bcba1d57c5f2070241b926eb60abcb15dc19f M present Bisect log: git bisect start # good: [23e60f917a5af652cd83b8a3a9951c79838721b3] xserver 1.18.1 git bisect good 23e60f917a5af652cd83b8a3a9951c79838721b3 # bad: [93d4224ccf3dee5a51815a66f76c905450419b50] xserver 1.18.2 git bisect bad 93d4224ccf3dee5a51815a66f76c905450419b50 # bad: [c1eb12f89c2391cd99a834da6b342f104da1eb5c] xwin: Update to XRANDR 1.2 internal interface to ensure an output is reported by XRANDR git bisect bad c1eb12f89c2391cd99a834da6b342f104da1eb5c # good: [68bd172432de069aa242d21b3af505867417ae45] glamor: Reuse the glamor_program_alpha_* enums for Render. git bisect good 68bd172432de069aa242d21b3af505867417ae45 # good: [29d1c5a8e4127d5fd1a0d41f8e1474e9dfdb9638] randr: Send ConfigNotify when manual monitor list changes git bisect good 29d1c5a8e4127d5fd1a0d41f8e1474e9dfdb9638 # bad: [eb5108b87017128f394ae31b5b7cd85dd8819bca] present: Requeue if flip driver hook fails and target MSC not reached git bisect bad eb5108b87017128f394ae31b5b7cd85dd8819bca # good: [10e46504711427ed5280fdeb974a08b888f47034] xwayland: Prefix shm tmp file names with xwayland git bisect good 10e46504711427ed5280fdeb974a08b888f47034 # good: [8869f45a965a7b12bb7fe03ba226167e1f821c1e] kdrive/evdev: update keyboard LEDs (#22302) git bisect good 8869f45a965a7b12bb7fe03ba226167e1f821c1e # good: [cf30b7cfd6d91da16c762332617fc165d82f17d2] present: Move msc_is_(equal_or_)after to the top of present.c git bisect good cf30b7cfd6d91da16c762332617fc165d82f17d2 # first bad commit: [eb5108b87017128f394ae31b5b7cd85dd8819bca] present: Requeue if flip driver hook fails and target MSC not reached
I also bisected an issue to this commit, but this time not only was one core fully utilized, it was also leaking RAM at an impressive rate (600 MB/s).
(In reply to Martin Peres from comment #18) > I also bisected an issue to this commit, but this time not only was one core > fully utilized, it was also leaking RAM at an impressive rate (600 MB/s). This would be when using the intel ddx.
(In reply to Martin Peres from comment #19) > (In reply to Martin Peres from comment #18) > > I also bisected an issue to this commit, but this time not only was one core > > fully utilized, it was also leaking RAM at an impressive rate (600 MB/s). > > This would be when using the intel ddx. That happens under the same scenario as the busyspin - if we continually requeue multiple events to the same target vblank, we keep adding the new event to the currently queued vblank (by reallocating the event array). That will show up in the full-debug log quite nicely, I think.
*** Bug 94624 has been marked as a duplicate of this bug. ***
As mentioned in bug 94624, I do see this, too. GPU is Kaveri, driver is radeon. I could try to reproduce it with amdgpu, but I don't really see the point as it was proven already. For me, there is no delay in the effect. Immediately after login I do see Xorg using one core fully. That made it quite easy to bisect. I also can remove the effect by reverting said commit on top of 1.18.2.
Created attachment 122489 [details] [review] Only requeue for the next MSC While it's looking from bug 94515 like this is actually an issue with clients passing invalid MSC parameters and/or drivers not checking them thoroughly enough, this xserver patch should make Xorg more robust against this. Please test.
Michel, your patch seems to fix issue, still testing, but my normal trigger is not showing one core being pegged.
(In reply to Michel Dänzer from comment #23) > Created attachment 122489 [details] [review] [review] > Only requeue for the next MSC > > While it's looking from bug 94515 like this is actually an issue with > clients passing invalid MSC parameters and/or drivers not checking them > thoroughly enough, this xserver patch should make Xorg more robust against > this. Please test. Your patch fixes the problem
Yes, that fixes the issue.
(In reply to Michel Dänzer from comment #23) > Created attachment 122489 [details] [review] [review] > Only requeue for the next MSC > > While it's looking from bug 94515 like this is actually an issue with > clients passing invalid MSC parameters and/or drivers not checking them > thoroughly enough, this xserver patch should make Xorg more robust against > this. Please test. Also fixes the problem: https://bugs.freedesktop.org/show_bug.cgi?id=94611 Could you please send it upstream?
*** Bug 94611 has been marked as a duplicate of this bug. ***
(In reply to Maxim Sheviakov from comment #27) > Could you please send it upstream? Done: https://patchwork.freedesktop.org/patch/78155/
The fix has landed on the master branch; leaving this report open until it lands on the 1.18 branch.
Fix landed in Git server-1.18-branch: commit 07ad2fde78f07e98caaf3e9b6b67af15359fefe4 Author: Michel Dänzer <michel.daenzer@amd.com> Date: Thu Mar 24 17:42:47 2016 +0900 present: Only requeue for next MSC after flip failure
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.