Bug 35452

Summary: Pageflipping + nouveau + compiz + fullscreen == FAIL
Product: xorg
Reporter: Mike Lothian <mike>
Component: Driver/nouveau
Assignee: Nouveau Project <nouveau>
Status: RESOLVED FIXED
QA Contact: Xorg Project Team <xorg-team>
Severity: normal
Priority: medium
CC: dark_mail, hramrach, mario.kleiner, maximlevitsky, mike, mss, oreaus
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Bug Depends on:
Bug Blocks: 32534
Attachments (description, then flags):
dmesg of card initialisation (flags: none)
Invalidate DRI2 buffers for all windows with the same pixmap on swap. (flags: none)
Fix pageflip completion handling and timestamping on noveau ddx. (flags: none)
Bugfix for nouveau, a direct translation of the bugfix in ati-ddx for same problem. (flags: none)

Description Mike Lothian 2011-03-19 14:25:26 UTC
Created attachment 44620
dmesg of card initialisation

When I run Warcraft 3 under Wine on my 64-bit system, the screen shows some corruption and then displays the start screen while flashing in and out, and the game complains that there is no audio hardware.

Audio and video are output via HDMI. Sound works, and glxgears (32-bit and 64-bit) runs without issue at about 300 fps with vblank_mode=0.

Using the keyboard I can close the game, but the visual glitches carry over to the desktop (I run kwin).

The lspci -nn output for the device is:

01:05.0 VGA compatible controller [0300]: ATI Technologies Inc M880G [Mobility Radeon HD 4200] [1002:9712]

I compiled the 32-bit libdrm and Mesa libraries myself from the latest git to make sure the issue hasn't already been fixed, and I've also tried the latest kernel, 2.6.39-rc0.

With the classic r600 driver the game doesn't even start; I have to ssh in to kill it, and there are no graphical errors afterwards.

I'll attach the dmesg output from the card initialisation.

The errors look like this:

[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -35!
radeon 0000:01:05.0: GPU lockup CP stall for more than 10040msec
[drm] Disabling audio support
radeon 0000:01:05.0: GPU softreset 
radeon 0000:01:05.0:   R_008010_GRBM_STATUS=0xB2333030
radeon 0000:01:05.0:   R_008014_GRBM_STATUS2=0x00000003
radeon 0000:01:05.0:   R_000E50_SRBM_STATUS=0x20000040
radeon 0000:01:05.0:   R_008020_GRBM_SOFT_RESET=0x00007FEE
radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00000001
radeon 0000:01:05.0:   R_008010_GRBM_STATUS=0xA0003030
radeon 0000:01:05.0:   R_008014_GRBM_STATUS2=0x00000003
radeon 0000:01:05.0:   R_000E50_SRBM_STATUS=0x20008040
radeon 0000:01:05.0: GPU reset succeed
radeon 0000:01:05.0: WB enabled
[drm] ring test succeeded in 0 usecs
[drm] ib test succeeded in 1 usecs
[drm] Enabling audio support
[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -35!
radeon 0000:01:05.0: GPU lockup CP stall for more than 10040msec
[drm] Disabling audio support
radeon 0000:01:05.0: GPU softreset 
radeon 0000:01:05.0:   R_008010_GRBM_STATUS=0xB2333030
radeon 0000:01:05.0:   R_008014_GRBM_STATUS2=0x00000003
radeon 0000:01:05.0:   R_000E50_SRBM_STATUS=0x20000040
radeon 0000:01:05.0:   R_008020_GRBM_SOFT_RESET=0x00007FEE
radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00000001
radeon 0000:01:05.0:   R_008010_GRBM_STATUS=0xA0003030
radeon 0000:01:05.0:   R_008014_GRBM_STATUS2=0x00000003
radeon 0000:01:05.0:   R_000E50_SRBM_STATUS=0x20008040
radeon 0000:01:05.0: GPU reset succeed
radeon 0000:01:05.0: WB enabled
[drm] ring test succeeded in 0 usecs
[drm] ib test succeeded in 1 usecs
[drm] Enabling audio support
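
One way to read the register dumps above is to compare each value before and after the soft reset. The following is a minimal sketch that parses the lines as they appear in this report and XORs the first and last value seen for each register; the function name `register_changes` is made up for illustration, and no meaning is assigned to the individual bits.

```python
import re

# Register lines copied from the log above (one lockup / reset cycle).
LOG = """\
radeon 0000:01:05.0:   R_008010_GRBM_STATUS=0xB2333030
radeon 0000:01:05.0:   R_008014_GRBM_STATUS2=0x00000003
radeon 0000:01:05.0:   R_000E50_SRBM_STATUS=0x20000040
radeon 0000:01:05.0:   R_008020_GRBM_SOFT_RESET=0x00007FEE
radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00000001
radeon 0000:01:05.0:   R_008010_GRBM_STATUS=0xA0003030
radeon 0000:01:05.0:   R_008014_GRBM_STATUS2=0x00000003
radeon 0000:01:05.0:   R_000E50_SRBM_STATUS=0x20008040
"""

# Matches e.g. "R_008010_GRBM_STATUS=0xB2333030" -> ("GRBM_STATUS", "B2333030")
PATTERN = re.compile(r"R_[0-9A-F]{6}_(\w+)=0x([0-9A-Fa-f]{8})")

def register_changes(log):
    """Map register name -> (first value, last value, changed-bits mask)."""
    first, last = {}, {}
    for name, hex_value in PATTERN.findall(log):
        value = int(hex_value, 16)
        first.setdefault(name, value)  # keep the earliest value seen
        last[name] = value             # overwrite with the latest value seen
    return {n: (first[n], last[n], first[n] ^ last[n]) for n in first}

for name, (before, after, changed) in register_changes(LOG).items():
    print(f"{name}: 0x{before:08X} -> 0x{after:08X} (changed bits 0x{changed:08X})")
```

For this log the sketch reports that GRBM_STATUS2 is identical before and after the reset, while GRBM_STATUS and SRBM_STATUS differ.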


With the old kernel and user space I used to get errors about invalid command streams too:

Mar 19 20:29:33 quark kernel: [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -35!
Mar 19 20:29:33 quark kernel: radeon 0000:01:05.0: texture bo too small (1920 1080 26 0 -> 8294400 have 1720320)
Mar 19 20:29:33 quark kernel: radeon 0000:01:05.0: alignments 1920 1 1 1
Mar 19 20:29:33 quark kernel: [drm:radeon_cs_ioctl] *ERROR* Invalid command stream !
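
The numbers in the "texture bo too small" message can be checked by hand: 1920 x 1080 texels at 4 bytes each is exactly the 8294400 bytes the kernel expected. The sketch below reproduces that arithmetic. The 4 bytes per texel is an assumption inferred from the numbers, not something the log states, and `texture_size` is a helper invented for this illustration.

```python
# Values reported in the kernel message:
# "texture bo too small (1920 1080 26 0 -> 8294400 have 1720320)"
WIDTH, HEIGHT = 1920, 1080
REQUIRED_BYTES = 8294400
AVAILABLE_BYTES = 1720320

# Assumption: 4 bytes per texel (for example a 32-bit colour format).
BYTES_PER_TEXEL = 4

def texture_size(width, height, bytes_per_texel):
    """Bytes needed for a tightly packed 2D texture with a single level."""
    return width * height * bytes_per_texel

computed = texture_size(WIDTH, HEIGHT, BYTES_PER_TEXEL)
shortfall = REQUIRED_BYTES - AVAILABLE_BYTES

print(computed == REQUIRED_BYTES)  # True
print(shortfall)                   # 6574080
```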


Please let me know if there are any extra messages you require, or reruns with debugging turned on.
Comment 1 Mike Lothian 2011-03-22 01:25:24 UTC
I can confirm commit e4b040c2b922ff1887651cbf658b06b48b5992c5 fixes the -35 issue for me.

Audio now works fine, but the blinking / flashing still happens, i.e. the screen goes black and then comes back with the next few frames.

The biggest difference is that when I exit the game it now stops flashing but doesn't take me back to the desktop. The wine and game processes end, but the start screen remains on screen and I can't see or use my desktop. I have my desktop cursor back, it is responsive, and I can ssh in. There's nothing in dmesg though.

Is there any extra debugging I can do to pinpoint the problem?
Comment 2 Michel Dänzer 2011-03-22 05:29:23 UTC
After some discussion on IRC, the problem only occurs when page flipping is enabled and kwin is configured to unredirect fullscreen windows.

One possibly related problem is that flipping an unredirected fullscreen window doesn't update the DRI2 buffer information for the root / composite overlay window, possibly tossing the compositing manager into an inconsistent state.
Comment 3 Michel Dänzer 2011-03-23 04:30:02 UTC
Created attachment 44749
Invalidate DRI2 buffers for all windows with the same pixmap on swap.

(In reply to comment #2)
> One possibly related problem is that flipping an unredirected fullscreen window
> doesn't update the DRI2 buffer information for the root / composite overlay
> window, possibly tossing the compositing manager into an inconsistent state.

This xserver patch seems to fix this problem for me. Does it help for your remaining issues?
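
The idea behind the patch title can be illustrated with a toy model. This is not xserver code: the `Window` class, the pixmap ids and all function names below are hypothetical, and the "before" behaviour is only a reading of the description in comment #2. The sketch shows why windows that share a pixmap with the flipped window end up with stale buffer information unless all of them are invalidated on swap.

```python
from dataclasses import dataclass

@dataclass
class Window:
    name: str
    pixmap_id: int              # windows sharing a pixmap carry the same id
    buffers_valid: bool = True  # does the client still trust its buffer info?

def swap_invalidate_one(windows, swapped):
    """Assumed old behaviour: only the swapped window is invalidated."""
    swapped.buffers_valid = False

def swap_invalidate_shared(windows, swapped):
    """Behaviour named in the patch title: invalidate every window on the pixmap."""
    for w in windows:
        if w.pixmap_id == swapped.pixmap_id:
            w.buffers_valid = False

def stale(windows, swapped):
    """Windows sharing the swapped pixmap that still trust old buffer info."""
    return [w.name for w in windows
            if w.pixmap_id == swapped.pixmap_id and w.buffers_valid]

def make_windows():
    # An unredirected fullscreen game shares the screen pixmap (id 1 here)
    # with the root and composite overlay windows; "panel" is unrelated.
    return [Window("game", 1), Window("root", 1),
            Window("overlay", 1), Window("panel", 2)]

ws = make_windows()
swap_invalidate_one(ws, ws[0])
print(stale(ws, ws[0]))   # ['root', 'overlay']

ws = make_windows()
swap_invalidate_shared(ws, ws[0])
print(stale(ws, ws[0]))   # []
```

In the model, the compositing manager corresponds to whoever renders to "root" or "overlay"; with the first strategy it keeps using buffer information that the flip has made obsolete.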
Comment 4 Mike Lothian 2011-03-24 18:47:24 UTC
Sorry for taking so long to test this patch.

Yes, I can confirm this fixes things for me on xorg-server 1.10.

I still see graphical corruption when the app is first launched, but that could be a kwin issue and I'm not sure how to pinpoint it.

Warcraft 3 is now running under Linux for me for the first time with open drivers; hopefully it will play just as well as under Windows.

Thanks everyone for your help with this.
Comment 5 dark_mail 2011-04-05 15:59:59 UTC
I can confirm that this patch solves a different issue regarding fullscreen flash video (posted it on the gentoo bugtracker).
https://bugs.gentoo.org/show_bug.cgi?id=359569

I haven't found the proposed patch in the freedesktop git repo.
Is the patch production-grade or a quick fix? In the former case, integrating it would be pretty nice.
Comment 6 Michel Dänzer 2011-04-06 01:20:58 UTC
(In reply to comment #5)
> Is the patch production-grade or a quick fix? In the former case integrating it
> would be pretty nice.

It would be, wouldn't it? See http://lists.x.org/archives/xorg-devel/2011-March/020799.html .
Comment 7 dark_mail 2011-04-06 12:12:32 UTC
Hm, I see. For those interested, the "other" discussion mentioned is found under
http://lists.x.org/archives/xorg-devel/2011-March/020716.html

If there's a future patch to be tested, just post/link it here.
Comment 8 Ian Pilcher 2011-04-25 19:32:45 UTC
Changing the component, as this is an X server problem that affects at least
radeon and intel (and presumably nouveau as well).
Comment 9 maximlevitsky 2011-05-22 03:19:55 UTC
I confirm that this patch fixes compiz being unusable after running a full-screen game on nouveau.
Comment 10 maximlevitsky 2011-05-22 03:20:18 UTC
With page flipping enabled, that is.
Comment 11 Scott Moreau 2011-05-31 04:09:18 UTC
*** Bug 37781 has been marked as a duplicate of this bug. ***
Comment 12 Mike Lothian 2011-06-30 15:37:13 UTC
Has this fix been applied to xorg-server yet?
Comment 13 maximlevitsky 2011-08-30 18:15:48 UTC
Any update? Patch works for me.
Comment 14 Michel Dänzer 2011-08-31 02:53:42 UTC
(In reply to comment #13)
> Any update? Patch works for me.

The problem was fixed differently in xf86-video-ati Git.
Comment 15 maximlevitsky 2011-09-26 10:03:10 UTC
I think you mean this commit:
http://cgit.freedesktop.org/xorg/driver/xf86-video-ati/commit/?id=9493563c1ef4b51af0ee8a44cb4e7c5bb280347e

Now, I guess the same hack should be put into nouveau, or would common code be better?
(I am still using the xserver patch, and it works.)
Comment 16 maximlevitsky 2011-12-16 15:46:54 UTC
Any update on this? I want the 10FPS back :-)
Comment 17 Mario Kleiner 2012-01-04 08:21:24 UTC
Created attachment 55116
Fix pageflip completion handling and timestamping on noveau ddx.

Patch 0002, the proposed bugfix for nouveau, trivially depends on this one at the moment.
Comment 18 Mario Kleiner 2012-01-04 08:23:06 UTC
Created attachment 55117
Bugfix for nouveau, a direct translation of the bugfix in ati-ddx for same problem.
Comment 19 Mario Kleiner 2012-01-04 08:34:01 UTC
Hi,

Can you try the two patches I just attached against the nouveau ddx?

They are direct translations of what is done in the ati-ddx and intel-ddx. They fix the same problem for me. Technically they are independent of each other, but the 2nd one (the fix for your problem) doesn't apply without the first one at the moment. I also wanted to resubmit them for review anyway, so you can be my first independent tester.

I tested both successfully a couple of months ago and have just rebased them against nouveau master. They compile successfully, but I haven't retested them yet.

The first patch in the series was held back because Francisco Jerez didn't want it before the X server implemented a proper swaplimit API, but xorg-1.12rc has the API now, so I wanted to resubmit this series anyway.

-mario
Comment 20 maximlevitsky 2012-01-04 09:40:18 UTC
Just one question: do I need to update the xserver for this? I am using some unknown xserver revision from git, and updating the xserver is usually quite a painful process.

I currently use this revision plus the patch that 'fixes' this issue but slows everything down, as I said before.

I am on:


commit 4020cab88f5cf3164fc83cf912f94f288aa5a45d
Author: Michel Dänzer <michel.daenzer@amd.com>
Date:   Wed Aug 10 11:36:16 2011 +0200

    EXA/mixed: Update sys_pitch in MPH even when there's no system memory copy.
    
    Otherwise sys_pitch will be stale when a system memory copy is allocated.
    
    Fixes https://bugs.freedesktop.org/show_bug.cgi?id=38322 and a crash when
    unlocking the screen with xscreensaver, reported by Janne Huttunen.
    
    Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
    Tested-by: Janne Huttunen <jahuttun@gmail.com>
    Tested-by: Jan Kriho <Erbureth@gmail.com>
    Signed-off-by: Keith Packard <keithp@keithp.com>
Comment 21 maximlevitsky 2012-01-04 09:57:22 UTC
Just tested, and it appears to be working. FPS didn't increase as much as I expected; I will test this more thoroughly.
Comment 22 maximlevitsky 2012-01-04 10:22:36 UTC
OK, here are the full results:

All tests were done without any compositing manager, using a specific scene from Neverball, with an accuracy of ±1 FPS:

1. Without any patches: 100 FPS
2. With your patch: 94 FPS
3. With the ugly xserver workaround I used before: 84 FPS
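
The three figures above can be turned into relative overheads against the unpatched baseline. The sketch below does only that arithmetic; the labels and the `overhead_percent` helper are invented for this illustration, and with measurements accurate to about 1 FPS the percentages are correspondingly rough.

```python
BASELINE_FPS = 100  # no patches applied

# FPS measured with each fix, as listed above.
RESULTS = {
    "nouveau ddx patches": 94,
    "xserver workaround": 84,
}

def overhead_percent(baseline, measured):
    """Frame-rate loss relative to the baseline, in percent."""
    return (baseline - measured) * 100 / baseline

for label, fps in RESULTS.items():
    print(f"{label}: {overhead_percent(BASELINE_FPS, fps):.0f}% slower than unpatched")
```

By this measure the ddx patches cost about 6% and the xserver workaround about 16% in this scene.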

Then I tested with kwin and the 'unredirect full screen windows' option, and I confirm that your patch does fix the issue of kwin going crazy after I end the game.

(I had this with compiz as well and had to stop using it. It's just too buggy nowadays, and it's not nouveau's fault, I checked. Besides, kwin is finally good both performance- and feature-wise, and integrates well into KDE.)

I also noticed that if I don't redirect full screen windows (and I don't, as I prefer just to disable compositing, which is easy with kwin, and I have had enough problems with this option), everything works fine without any patches. So if I forget to disable compositing, the game runs, albeit slowly, at 67 FPS.

So thanks, but it would be great to somehow eliminate the overhead when no compositing manager is running, which is not the case yet.

Thanks a lot for looking into this,
Best regards,
     Maxim Levitsky
Comment 23 Mario Kleiner 2012-01-04 14:51:30 UTC
Good. You can try with the 2nd patch only if you want. That should fix your bug and should give you an idea of the performance you'll get. The 2nd patch does not apply cleanly because of some context it needs from the 1st patch, but you can fix this manually. ("Bugfix for nouveau, a direct translation of the bugfix in ati-ddx for same problem.")

The 1st patch fixes other bugs, but potentially reduces fps a little by taking kms-flips from triple-buffered to double-buffered, which probably accounts for the 6 fps you lost relative to having no fix applied at all.

The 1st patch wasn't applied to nouveau because of this. But Xserver 1.12 has a new swaplimit API that should allow the 1st patch to be applied without performance loss, once we have a patch that uses the swaplimit API.

I'll prepare those additional patches in the next couple of days and submit them to nouveau-devel.
Comment 24 Mike Lothian 2012-06-30 04:35:26 UTC
Seems to be working fine these days - closing this off
