This
Play a video using xine, and then right-click on the video window to bing up the context menu. Select "Stream / Information (OSD)". KMS badness happens. On r100, the dmesg log fills us with messages like: [drm:r100_cs_packet_parse] *ERROR* Packet (0:0:99) end after CS buffer (34) ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:r100_cs_packet_parse] *ERROR* Packet (0:0:99) end after CS buffer (90) ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:r100_cs_packet_parse] *ERROR* Packet (0:0:99) end after CS buffer (34) ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! On rv280, the Xserver stops responding entirely. Note that this bug is similar to this RedHat bug for R300: https://bugzilla.redhat.com/show_bug.cgi?id=540133
Can you try mesa from git master or the 7.7 branch? Dave fixed some issues there: http://cgit.freedesktop.org/mesa/mesa/commit/?id=554043bff72ced41b2a5e03e61cbc087bb41bd3d http://cgit.freedesktop.org/mesa/mesa/commit/?id=42f2880ffd0b847df7cb56b7f7f0747287e0b08f
(In reply to comment #2) > Can you try mesa from git master or the 7.7 branch? Dave fixed some issues > there: Both these fixes seem to be one-liners in a much stabler part of Mesa, and so I patched them both by hand into Fedora's SRPM and then installed the new mesa-libGL and mesa-dri-drivers package on the PC with the rv100. And it made absolutely no difference whatsoever. Mind you, the comment for the radeon_cs_legacy.c patch suggests that it is for the NON-KMS path, so maybe that's no surprise? (This bug is about the KMS path.)
This bug is still present with the following configuration: Stock Linux kernel 2.6.32.5 And Fedora 12 packages: mesa-dri-drivers-7.7-2.fc12.i686 mesa-libGL-7.7-2.fc12.i686 libdrm-2.4.17-1.fc12.i686 xorg-x11-drv-ati-6.13.0-0.20.20091221git4b05c47ac.fc12.i686 However, the dmesg errors have now changed to an endless stream of: [drm:r100_cs_packet_parse] *ERROR* Unknown packet type 1 at 0 ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! This bug is also present with my rv280, and my Radeon 9550 (although the Xserver doesn't stop responding with this card - the video just temporarily freezes instead), but curiously not at all with my HD4890.
This bug is still present with vanilla Linux 2.6.33.1.
xine is using clip regions to render the text top of video. The number of clip regions is too high for single command stream which causes trouble. Maybe better implementation would be using VBO for vertex data so we could handle large number of clip regions better. (and possible only emit them when there is change) In any case it would be better option to render the text on top of video would be providing way of rendering some image on top of video frame.
(In reply to comment #6) > Maybe better implementation would be using VBO for vertex data so we could > handle large number of clip regions better. (and possible only emit them when > there is change) Hmm, this bug exists on lowly Radeon 7000 cards as well; do they support VBOs? > The number of clip regions is too high for single command stream which causes > trouble. Does the maximum size of a single command stream increase in subsequent Radeon iterations? Because on rv100 and rv280, the X server must be restarted, but the X server recovers with my RV350 and handles the stream fine with my RV790. (BTW, how does a "stream" have a "maximum size" anyway? By mere definition of a "stream", I'd expect it to "start" and then chatter away until it met an "end" marker. So I suspect it doesn't work like that really ... ;-).)
Pauli, please elaborate. How exactly are 'too many cliprects' causing the symptoms described in this report?
> --- Comment #8 from Michel Dänzer <michel@daenzer.net> 2010-03-19 07:11:33 PST --- > Pauli, please elaborate. How exactly are 'too many cliprects' causing the > symptoms described in this report? Because r200 textured video is rendering the clipping regions using vertices embed into command stream. So end result is that vertex data takes more space than single CS array can hold. For xine case that I tested it took about 28k dwords. Single buffer can hold only 16k dwords.
On Fri, 2010-03-19 at 16:20 +0200, Pauli Nieminen wrote: > > --- Comment #8 from Michel Dänzer <michel@daenzer.net> 2010-03-19 07:11:33 PST --- > > Pauli, please elaborate. How exactly are 'too many cliprects' causing the > > symptoms described in this report? > > Because r200 textured video is rendering the clipping regions using > vertices embed into command stream. So end result is that vertex data > takes more space than single CS array can hold. For xine case that I > tested it took about 28k dwords. > > Single buffer can hold only 16k dwords. So the textured video code needs to be fixed to use smaller draw packets in several CS. Even when using VBOs limits like this need to be dealt with somehow.
For reference, with vanilla 2.6.33.1 and my RV350, my dmesg log now fills up with messages like this: [drm:r100_cs_track_check] *ERROR* [drm] No buffer for z buffer ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! But the X server still recovers when the on-screen display disappears, and then normal playback resumes.
(In reply to comment #10) > So the textured video code needs to be fixed to use smaller draw packets > in several CS. Even when using VBOs limits like this need to be dealt > with somehow. I have a minor shed-load of different Radeons to test any changes with, BTW ;-).
Created attachment 34246 [details] [review] possible fix Does this patch help?
(In reply to comment #13) > Does this patch help? I'll test wth the rv100 and rv280 tomorrow, but the results with the RV350 at least are unchanged: [drm:r100_cs_track_check] *ERROR* [drm] No buffer for z buffer ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream !
(In reply to comment #13) > Does this patch help? I've just tested with my rv280, and: - xine-ui's on-screen display appears correctly in the window - the dmesg log does not fill up with errors - the X server does not need restarting afterwards. However, the video window is now completely black too :-)!
(In reply to comment #13) > Does this patch help? Same results with my rv100 as with my rv280 - xine-ui's on-screen display now behaves correctly, but the movie windows is now completely black too.
(In reply to comment #16) > (In reply to comment #13) > > Does this patch help? > > Same results with my rv100 as with my rv280 - xine-ui's on-screen display now > behaves correctly, but the movie windows is now completely black too. > Anything in dmesg?
(In reply to comment #17) > Anything in dmesg? No, dmesg is "clean" for both rv100 and rv280.
That's an odd for-loop: + if (radeon_cs_space_remaining(pScrn) > (nBox * 3 * vtx_count + 5)) { + loops = 1; + nBox_loop = nBox; + } else { + loops = nBox; + nBox_loop = 1; + } + + for (i = 0; i > loops; i++) { Surely this loop can never execute...?
(In reply to comment #19) > That's an odd for-loop: > > + if (radeon_cs_space_remaining(pScrn) > (nBox * 3 * vtx_count + 5)) { > + loops = 1; > + nBox_loop = nBox; > + } else { > + loops = nBox; > + nBox_loop = 1; > + } > + > + for (i = 0; i > loops; i++) { > > Surely this loop can never execute...? > Whoops, yeah, reverse the logic on those.
(In reply to comment #20) > Whoops, yeah, reverse the logic on those. OK, I've just tested my rv100 again and the video is now back in the window :-). But activating the on-screen display now crashes the X server instead. The Xorg.0.log file confirms the crash is in radeon_drv.so, but then we probably guessed that.. There's nothing in the dmesg log except: Process 3650(Xorg) has RLIMIT_CORE set to 0 Aborting core
if (radeon_cs_space_remaining(pScrn) > (nBox * 3 * vtx_count + 5)) { loops = 1; nBox_loop = nBox; } else { loops = nBox; nBox_loop = 1; } for (i = 0; i < loops; i++) { ... while (nBox_loop--) { ... Does this mean that nBox_loop will be zero when the for-loop executes for the *second* time? Which means that the while-loop isn't going to stop again until it wraps that 32 bit integer round.
I added the following line directly beneath the while-loop: nBox_loop = 1; The idea is that this line won't have any effect unless the for-loop executes more than once. And now the X server doesn't crash, and the on-screen display appears :-)! But it's still not right. In fact, it is now misbehaving more like my RV350 misbehaves instead. The PC with the rv100 is running vanilla 2.6.32.10, and the dmesg log is now full of messages like this: [drm:r100_cs_track_check] *ERROR* [drm] No buffer for color buffer 0 ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! And as with the RV350, the video window is freezing while the on-screen-display is visible, apart from a small band at the very top.
The rv280's X server is also surviving xine-ui's on-screen display now with the revised patch, but it looks like the R200's command parser still has an issue with IMMDs. (Whatever they are...) [drm:r100_cs_track_check] *ERROR* IMMD draw 14932 dwors but needs 18 dwords [drm:r100_cs_track_check] *ERROR* VAP_VF_CNTL.NUM_VERTICES 3, VTX_SIZE 6 [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:r100_cs_track_check] *ERROR* [drm] No buffer for z buffer ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:r100_cs_track_check] *ERROR* [drm] No buffer for z buffer ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! [drm:r100_cs_packet_parse] *ERROR* Packet (0:3:14932) end after CS buffer (9306) ! [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! The "No buffer for z buffer" error is consistent with the RV350's results. (The rv280 is in a PC running vanilla 2.6.33.1.)
Created attachment 34269 [details] [review] revised patch - much better, but no cigar yet I've fixed some bugs in the original patch, and now the rv100 and rv280 misbehave more like the RV350. (The patch also exposes another bug in the R200's command parser.) But the X server no longer needs to be restarted after activating xine-ui's on-screen display, which is something.
(In reply to comment #24) > The rv280's X server is also surviving xine-ui's on-screen display now with the > revised patch, but it looks like the R200's command parser still has an issue > with IMMDs. (Whatever they are...) > > [drm:r100_cs_track_check] *ERROR* IMMD draw 14932 dwors but needs 18 dwords > [drm:r100_cs_track_check] *ERROR* VAP_VF_CNTL.NUM_VERTICES 3, VTX_SIZE 6 > [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! > [drm:r100_cs_track_check] *ERROR* [drm] No buffer for z buffer ! > [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! > [drm:r100_cs_track_check] *ERROR* [drm] No buffer for z buffer ! > [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! > [drm:r100_cs_packet_parse] *ERROR* Packet (0:3:14932) end after CS buffer > (9306) ! > [drm:radeon_cs_ioctl] *ERROR* Invalid command stream ! > > The "No buffer for z buffer" error is consistent with the RV350's results. (The > rv280 is in a PC running vanilla 2.6.33.1.) > The problem is you need to re-emit all the previous 3D state (default state and state setup in the textured video functions) if you cross over into a new command buffer. I'm not sure of a clean way to handle this without some sort of state tracking. We should probably track the state like the mesa driver does and mark it as dirty when we flush so it can be re-emitted in the next command buffer.
(In reply to comment #26) > The problem is you need to re-emit all the previous 3D state (default state and > state setup in the textured video functions) if you cross over into a new > command buffer. So does this mean that R300+ was already splitting command buffers, like the patch does for R100 and R200? Note that my RV790 has no problems with xine-ui's on-screen display. Is this because it already correctly splits across command buffers, or because its default buffer size is big enough not to need splitting in the first place?
(In reply to comment #27) > (In reply to comment #26) > > The problem is you need to re-emit all the previous 3D state (default state and > > state setup in the textured video functions) if you cross over into a new > > command buffer. > > So does this mean that R300+ was already splitting command buffers, like the > patch does for R100 and R200? R3xx->R5xx only emits one set of verts per draw command since it doesn't have a native rect primitive but rather uses clipped triangles, so it has to update the clipping info between draws. > > Note that my RV790 has no problems with xine-ui's on-screen display. Is this > because it already correctly splits across command buffers, or because its > default buffer size is big enough not to need splitting in the first place? > R6xx+ is less of an issue since it uses vertex buffers rather than embedding verts in the command stream (r6xx+ doesn't support inline verts like older asics). None of the ddx accel code (EXA, Xv) tracks 3D state which would be needed to solve this properly.
(In reply to comment #28) > None of the ddx accel code (EXA, Xv) tracks 3D state which would be needed to > solve this properly. Interestingly, I've also noticed that sometimes the "struct radeon_renderbuffer" is either NULL or incorrect in Mesa R300's emit_zb_offset() function. (This is with KMS enabled.) Could it just be a coincidence that both textured video and Mesa are complaining about bad/missing z buffers now?
(In reply to comment #29) > (In reply to comment #28) > > None of the ddx accel code (EXA, Xv) tracks 3D state which would be needed to > > solve this properly. > > Interestingly, I've also noticed that sometimes the "struct > radeon_renderbuffer" is either NULL or incorrect in Mesa R300's > emit_zb_offset() function. (This is with KMS enabled.) Could it just be a > coincidence that both textured video and Mesa are complaining about bad/missing > z buffers now? > They are unrelated. The command check just makes sure the zbuffer is valid. In the mesa case there's a bug in either mesa or the 3d driver that causes the depth buffer pointer to be incorrect. This results in a command buffer with an invalid z buffer which the CS checker flags. In the ddx, when we flush in the Xv drawing code, the new command buffer is missing the 3D state that was set earlier in the function, so the CS checker flags it as well. two different bugs that result in the same drm error.
Fixed in git master. commmits: c1136f94b80346065893f8a43c0fbf60c8c9b057 - 6c07816025f70e522986614c29c300ea13e5d932
(In reply to comment #31) > Fixed in git master. commmits: > c1136f94b80346065893f8a43c0fbf60c8c9b057 - > 6c07816025f70e522986614c29c300ea13e5d932 I have now tested this: - The RV350 is fine. - The rv280 now shows the on-screen display correctly and there are no messages written to the dmesg log. So I would agree that this is fixed, except that the video frame rate while the on-screen display is active seems to drop considerably. - The r100 also shows the on-screen display correctly without filling the dmesg log, but the video picture now "ghosts" horribly like an old TV set with a poor signal. This "ghosting" now happens regardless of whether the on-screen display is active or not. The CPU usage for video playback is also a lot higher.
The CPU usage will be higher because the state and huge numbers of vertexes will have to be send to the drm. The real fix is probably to either use vertex buffers for Xv or fix xine to not use clipping for the OSD. I'll take a look at the r1xx issues.
(In reply to comment #33) > I'll take a look at the r1xx issues. The rv100 issue now looks fixed with the following commit: commit 426114b4a99d37b394efe3336968bb0ab9b6e9ae Author: Dave Airlie <airlied@redhat.com> Date: Tue Jun 8 11:34:35 2010 +1000 xv: fix Xv on M6/RV100 under KMS. pRADEONEnt->HasCRTC2 wasn't setup under KMS.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.