The i915 driver seems to have had a major speed regression since rotation support was added. Using Xorg 7.0 and Mesa 3D from just before rotation support, I get 1110 FPS. With the latest driver and Mesa from just after rotation support, I only get 780 FPS.
btw that's with glxgears... but it has been reported in other apps.
The problem here is batchbuffers. Due to rotation being able to shuffle memory around it wasn't possible to support batchbuffers in agp space as the 2D driver could rip up memory allocation upon a rotation event. So the 3D driver falls back to the cmdbuffer path from system memory (rather than AGP memory) which can be guaranteed. To support batchbuffers again we'll need the new memory manager.
to activate the batchbuffer path again for now though, you should be able to do... INTEL_BATCH=1 glxgears
This will undoubtedly crash things if you rotate though.
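As a rough illustration of the opt-in described above, an environment-variable toggle such as INTEL_BATCH can be read with getenv. This is a minimal sketch, not the driver's actual code: the function names batch_path_requested and use_batchbuffers are hypothetical, and treating any non-empty value as "on" is an assumption.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: decide from the raw variable value whether the
 * batchbuffer path was requested. Assumption: unset or empty means "off",
 * anything else means "on". */
bool batch_path_requested(const char *value)
{
    return value != NULL && value[0] != '\0';
}

/* Hypothetical wrapper: look up INTEL_BATCH in the process environment. */
bool use_batchbuffers(void)
{
    return batch_path_requested(getenv("INTEL_BATCH"));
}
```

With a check of this shape, running `INTEL_BATCH=1 glxgears` would select the batchbuffer path for that one process only, which fits the warning that it is unsafe once rotation is in use.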
well I've no intention of rotating anything :-), okay with that I get glxgears back up to 890FPS, which still isn't the 1190 I was getting before, but better than the non-batch.
Might be worth checking MTRR's.
Weirdly, on my Xorg 7.0 build I don't have MTRRs that work, so I don't get MTRRs at all when I get the 1100 FPS; however, I do get MTRRs okay for the newer code... If I disable MTRRs on the latest trees, I get 694 FPS instead of the 888 with batchbuffers enabled. I'll see what I can discover, I've got a day or two to look into this for a project.
there is still another problem here... I'll track down where the FPS go, bit by bit, The changes to the batchbuffer code seem to be suspect for me at the moment, adding the code for emit_invarient stuff I lose 100FPS from my 1100...
(In reply to comment #2) > To support batchbuffers again we'll need the new memory manager. Generally the new memory manager or something that someone is working on? If the former case is true then I think there is a place for config option whether to enable rotation at all (as I believe many ordinary users will prefer faster 3D than rotation).
(In reply to comment #9) > (In reply to comment #2) > > > To support batchbuffers again we'll need the new memory manager. > > Generally the new memory manager or something that someone is working on? > If the former case is true then I think there is a place for config option > whether to enable rotation at all (as I believe many ordinary users will prefer > faster 3D than rotation). It's also worth checking if tiling is being set up correctly for the depth and back buffers. Unfortunately I'm not able to do more than make suggestions at this point...
(In reply to comment #10)
> It's also worth checking if tiling is being set up correctly for the depth and
> back buffers. Unfortunately I'm not able to do more than make suggestions at
> this point...
Looks like they are:
(II) I810(0): MakeTiles failed for the FRONT buffer
(II) I810(0): Activating tiled memory for the back buffer.
(II) I810(0): Activating tiled memory for the depth buffer.
However this may be related. Using the same config, Xorg 6.9.0 allocates memory like this:
(II) I810(0): Allocating at least 1280 scanlines for pixmap cache
(II) I810(0): Initial framebuffer allocation size: 8192 kByte
...
(II) I810(0): Allocated 3072 kB for the back buffer at 0xf800000.
(II) I810(0): Allocated 3072 kB for the depth buffer at 0xf400000.
(II) I810(0): Allocated 32 kB for the logical context at 0xf3f8000.
(II) I810(0): Allocated 50432 kB for textures at 0x880000
while Xorg 7.x.x allocates memory like this:
(II) I810(0): Allocating at least 1248 scanlines for pixmap cache
(II) I810(0): Initial framebuffer allocation size: 20224 kByte
...
(II) I810(0): Allocated 32 kB for the logical context at 0xffe2000.
(II) I810(0): Allocated 6400 kB for the back buffer at 0xf000000.
(II) I810(0): Allocated 6400 kB for the depth buffer at 0xe800000.
(II) I810(0): Allocated 31104 kB for textures at 0xc9a0000
Sounds to me like you may be running different resolutions between those two runs.
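The link between resolution and buffer size can be checked with simple arithmetic. The sketch below assumes a tightly packed, untiled buffer with no pitch alignment, which a real driver may not use; buffer_size_kb is a hypothetical helper, and the resolutions of the two runs are not stated in the logs, so the inputs are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: size in kB of a width x height buffer, assuming
 * tight packing (no tiling padding or pitch alignment). */
size_t buffer_size_kb(size_t width, size_t height, size_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return (width * height * bytes_per_pixel) / 1024;
}
```

Under these assumptions, 1024x768 at 4 bytes per pixel comes to 3072 kB, the same figure the 6.9.0 log reports for the back and depth buffers, so a 6400 kB buffer would correspond to a larger mode.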
Created attachment 5550 [details] log 7.0.0
(In reply to comment #12)
> Sounds to me like you may be running different resolutions between those two runs.
Yes, sorry. However, setting the same resolution (using BIOS hack) I got:
6.9.0:
(II) I810(0): Allocating at least 1248 scanlines for pixmap cache
(II) I810(0): Initial framebuffer allocation size: 16384 kByte
7.0.0:
(II) I810(0): Allocating at least 1248 scanlines for pixmap cache
(II) I810(0): Initial framebuffer allocation size: 20224 kByte
(although this can be caused by non-HDTV XV support in 6.9.0)
it's not any of the obvious things, I've been messing with these drivers for long enough to know that :-), I'm not getting back to this until tomorrow at the earliest, but on my system when I left it last night, I had an 1100FPS Mesa tree running on the rotated DDX driver, I've just ported over the necessary interface changes to get gears to run with batch buffers, However the last piece of the patch I applied last night was the code to emit_invarient_state and allocate the batch buffer differently, once I applied that I lost 100FPS, I'm expecting I'll find another 100 tomorrow at some point...
Okay I invalidated some of my previous test results due to me being a dumbass, and having some drm debug turned on for some of them. But I tracked it down. Lukas, can you test the latest Mesa tree but comment out
intel_batchbuffer.c:768
/* KW: temporary - this make crashes & lockups more frequent, so
 * leave in until they are solved.
 */
//intel->alloc.size = 8 * 1024;
Is what I have, that plus INTEL_BATCH has gotten me back most of my FPS in gears..
Well, I guess you don't get more guilty than that. Sorry for the hassle and thanks Dave for tracking it down...
(In reply to comment #16)
> But I tracked it down, Lukas can you test the latest Mesa tree but comment out
>
> intel_batchbuffer.c:768
> /* KW: temporary - this make crashes & lockups more frequent, so
> * leave in until they are solved.
> */
> //intel->alloc.size = 8 * 1024;
>
> Is what I have, that plus INTEL_BATCH has gotten me back most of my FPS in gears..
I can confirm glxgears is back to 1100FPS but ppracer still does only 12FPS :( Is there an additional copy of the frame buffer? (compared to the 6.9.0 version)
Okay that's one thing back, I'm not sure what affects ppracer, I've got ppracer running at 800x600x32-bit at 16FPS without batch and about 20 with batch...
(In reply to comment #19) > Okay that's one thing back, I'm not sure what affects ppracer, I've got ppracer > running at 800x600x32-bit at 16FPS without batch and about 20 with batch... I'm running at 1024x768@32bit. Stencil buffer enabled, Show UI Snow, Reflections, Shadows, FPS, Fog. Progress bar is disabled. For me, it's about 11FPS without batch and 13FPS with batch. Anyway, thanks for tracking down the back buffer issue.
okay tomorrow I'll go track down the ppracer issue.. I'm seeing ppracer with 1024x768-32 Xorg 7.0 + Mesa 6.4.2 + MTRR enabled by hand = about 35FPS. same setup with latest trees is 14FPS.. I need to sort out these issues for a customer app anyways..
(In reply to comment #21) > okay tomorrow I'll go track down the ppracer issue.. > > I'm seeing ppracer with 1024x768-32 Xorg 7.0 + Mesa 6.4.2 + MTRR enabled by hand > = about 35FPS. This should be correct speed, it's the same I got using Xorg 6.9.0 + Mesa 6.4.1. Btw, did you use INTEL_BATCH quirks for it or Xorg 7.0 + Mesa 6.4.2 + MTRR works just fine?
> > This should be correct speed, it's the same I got using Xorg 6.9.0 + Mesa 6.4.1. > > Btw, did you use INTEL_BATCH quirks for it or Xorg 7.0 + Mesa 6.4.2 + MTRR works > just fine? Xorg 7.0 + Mesa 6.4.2 + hand adding the MTRR. The INTEL_BATCH stuff only matters after rotation support was added; before that it was the default.... I've got a nice git tree with the Mesa stuff in it and I can move between branches very easily...
(In reply to comment #23) > Xorg 7.0 + Mesa 6.4.2 + hand adding the MTRR, the INTEL_BATCH stuff only matters > when after rotation support was added before that it was the default.... > > I've got a nice git tree with the Mesa stuff in it and I can move between > branches very easily... Ah, I thought that rotation support has been included in Xorg since 7.0.
Okay I eventually found the ppracer regression thanks to the power of git, I went down a couple of bad alleys in my bisections but I tracked it down to http://webcvs.freedesktop.org/mesa/Mesa/src/mesa/main/stencil.c?r1=1.34&r2=1.35 causing most of the regression, perhaps Brian or Keith can comment and suggest a fix? literally this patch halves the performance of ppracer on i915.
Also, compiler optimisations on HEAD are turned off for some reason. It looks like an accidental commit by Brian; should they be on?
maybe something to do with the check = 0xff in intel_ioctl.c??
Okay I've checked in two fixes to the i915 driver in CVS: one removes Keith's temporary batchbuffer comment/fix and the other fixes the stencil thing to use ~0U instead of 0xff.
I think SiS and unichrome might be broken similarly... but I've no hw to test those on..
Nice catch Dave. Maybe worth sending an email off to the mesa3d lists, as people might not be subscribed to dri-devel.
(In reply to comment #27) > also compiler optimisation on HEAD are turned off for some reason, it looks like > an accidental commit by Brian should they be on? Personally, I tweak compiler options like this: -O3 -fomit-frame-pointer -msse2 -mmmx -ffast-math -march=pentium4 -mfpmath=sse Thanks for tracking the bug.
(In reply to comment #29) > Okay I've checked in two fixes to i915 driver in CVS, one removes Keiths > temporary batchbuffer comment/fix and the other fixes the stencil thing to use > ~0U instead of 0xff. I can confirm that ppracer FPS is doubled. However, it's still 2-3FPS slower than 6.9.0 version, but this can be due to rotation overhead or something like that.
(In reply to comment #26)
> Okay I eventually found the ppracer regression thanks to the power of git, I
> went down a couple of bad alleys in my bisections but I tracked it down to
>
> http://webcvs.freedesktop.org/mesa/Mesa/src/mesa/main/stencil.c?r1=1.34&r2=1.35
>
> causing most of the regression, perhaps Brian or Keith can comment and suggest a
> fix?
The code you've committed looks good, though I think for generality it's better to do something like
#define I915_STENCIL_MASK 0xff
and use the condition:
if ((ctx->Stencil.Xyz & I915_STENCIL_MASK) == I915_STENCIL_MASK)
The trouble with directly testing against ~0u is that an application might specify 0xff explicitly for that value (as it knows how deep the stencil buffer is) and we want the same behaviour in either case - the hardware only cares about the low 8 bits. You're correct that several other drivers are affected by the change also.
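The difference between the two tests can be shown in isolation. This is a minimal sketch of the comparison suggested above, not the committed driver code: I915_STENCIL_MASK comes from the comment, while the function names mask_is_full_strict and mask_is_full_masked are hypothetical, and a plain unsigned int stands in for the ctx->Stencil field.

```c
#include <stdbool.h>

/* From the suggestion above: the hardware only cares about the low 8 bits. */
#define I915_STENCIL_MASK 0xffu

/* Strict test: only an all-ones value counts as a full mask, so an
 * application that passes 0xff explicitly is not recognised. */
bool mask_is_full_strict(unsigned int mask)
{
    return mask == ~0u;
}

/* Masked test: compare only the bits the hardware uses, so 0xff and ~0u
 * are treated the same way. */
bool mask_is_full_masked(unsigned int mask)
{
    return (mask & I915_STENCIL_MASK) == I915_STENCIL_MASK;
}
```

With the masked form, both 0xff and ~0u count as a full mask, while a partial mask such as 0x0f does not, which is the "same behaviour in either case" the comment asks for.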
Dave, I'm not sure which accidental commit you're referring to. I reordered/cleaned-up CFLAGS in a few places but I don't think I made any net changes to the flags. Go ahead and fix whatever you think needs to be done.
well maybe it wasn't accidental just the checkin message didn't reflect the checkin :) http://webcvs.freedesktop.org/mesa/Mesa/configs/linux-dri?r1=1.34&r2=1.35 you dropped the -O from the OPT_FLAGS... the checkin doesn't mention that... (In reply to comment #35) > Dave, I'm not sure which accidental commit you're referring to. I > reordered/cleaned-up CFLAGS in a few places but I don't think I made any net > changes to the flags. > > Go ahead and fix whatever you think needs to be done. >
OK, I've restored the -O flag. I totally didn't see that. Thanks, Dave.
(In reply to comment #37) > OK, I've restored the -O flag. I totally didn't see that. Thanks, Dave. > Has this patch already been committed to CVS? Or do the OPT_FLAGS really affect anything other than speed? I ask because my source was updated on May 5 and I still get the bug. My FreeBSD box, with Xorg 7, Mesa, and drm from CVS, crashes when exiting from an Xorg session. The chip is i915 and the OS is FreeBSD 7.
> > Does this patch already commited to the cvs ? Or the OPT_FLAGS really effect the > result other then speed ? > Cause my source was updated on May 5 and I still got the bug. My FreeBSD box > with xorg 7, Mesa , drm from cvs, crash when exiting from Xorg session. The chip > is i915 and freebsd 7. please don't add a bug to this bug, this bug never mentions Xorg crashing on exit, so I don't know how you think anything on this bug can cause/fix it.... I'll close this when Keith checks in the stencil fixes.
Okay, Keith has checked in his fixes... there might be a minor regression, but I'm not sure it is worth tracking down.