Bugzilla – Bug 7697
r300_check_offset fails on PCI-E R420 (5D4F)
Last modified: 2006-12-14 10:33:55 UTC
I'm using an AMD64 computer with X800 GTO PCI-Express gfx card (5D4F). I'm using
both a) Ubuntu 6.06 LTS default configuration and b) Ubuntu edgy development
branch with kernel 2.6.17 drm modules, mesa 6.5.0.cvs.20060725,
xserver-xorg-driver-ati-6.6.1 and libdrm cvs. Glxinfo gives me "Direct
rendering: Yes", but glxgears gives: drmRadeonCmdBuffer: -22 (exiting)
In both configurations, dmesg shows the following after trying to run glxgears:
[ 1849.338955] [drm:r300_emit_carefully_checked_packet0] *ERROR* Offset failed
range check (reg=4e28 sz=1)
[ 1849.338960] [drm:r300_do_cp_cmdbuf] *ERROR* r300_emit_packet0 failed
For some reason, r300_check_offset in r300_cmdbuf.c is failing while it probably
should not. At least if I force that function to return 0; (in the Ubuntu edgy
configuration), glxgears runs without problems (ca. 2800 fps) and so does a
couple of screensavers.
Any idea what's going wrong? If you need more information, just tell me what I
should find out.
Mesa 6.5.x releases have broken PCIE support. I think r300 support in HEAD is
now as stable as it used to be when 6.5 went out.
Now Ubuntu dev branch has mesa 6.5.1~20060817 and xserver-xorg-video-ati 6.6.2.
Additionally, I've installed libdrm and linux-core/drm.ko + linux-core/radeon.ko
from mesa-drm git HEAD. I'm still getting the same error messages when running,
e.g., glxgears. If I apply the following:
diff --git a/shared-core/r300_cmdbuf.c b/shared-core/r300_cmdbuf.c
index c65ffd5..561f614 100644
@@ -259,7 +259,7 @@ static __inline__ int r300_check_offset(
if (offset >= dev_priv->gart_vm_start &&
offset < (dev_priv->gart_vm_start + dev_priv->gart_size))
- return 1;
+ return 0;
static __inline__ int r300_emit_carefully_checked_packet0(drm_radeon_private_t
and install the new radeon.ko generated, the error message disappears (offset
check always returns 0) and 3D starts to work. Glxgears runs, GL* Gnome
screensavers run etc.
The problem also happens when running the x86 version of Linux instead of
AMD64. Please tell me if I can gather any relevant information for you.
This looks like you're essentially reducing the function to
and thus narrowing down GART boundary checking. I doubt this is the right
solution, I rather suppose that gart_vm_start and gart_size are not set
correctly for PCIE cards.
I'm also having issues with a Xegl / radeon mobility M300 setup (i.e.
R300/CHIP_RV380). I'm trying to understand the DRM code and it actually looks
like some of the code assumes that the GART is right behind the framebuffer, but
it might be hard under some circumstances to get the actual FB size (cf.
http://lists.freedesktop.org/archives/xorg/2005-May/007671.html). For instance,
the kernel seems to limit the VRAM to MAX_VRAM_SIZE, and I fear that the GART
position might be located somewhere inside this VRAM.
It would also be interesting to load the drm module with "debug=1" and check
whether the shell command
dmesg | grep "Setting GART location based on old memory map"
returns a matching line.
For me, forcing dev_priv->new_memmap to 1 in radeon_drv.c, together with another
modification, got my PCIE card past a segfault and made it display "vertical
stripes", which seems to be a well-known radeonfb issue. It's quite hard to
determine whether bogus microcode is involved or not, and not many experts seem
to be available for the radeon driver.
For instance, I've spent roughly 15 hours on this issue and found no valuable
documentation explaining the detailed design decisions for FB implementations
and radeonFB layout. I'm CCing Dave Airlie because I still don't understand the
There's a good chance this is fixed in xf86-video-ati git commit
6671c1b01bf29d8f1cacf9306ef658b967d8a3cf (not in any release yet), please test.
(In reply to comment #4)
> There's a good chance this is fixed in xf86-video-ati git commit
> 6671c1b01bf29d8f1cacf9306ef658b967d8a3cf (not in any release yet), please test.
Does not seem to help, I installed ati_drv.so, atimisc_drv.so, r128_drv.so and
radeon_drv.so as GIT versions.
As to Chris's comments, I only got "Setting GART location based on new memory
map" (not old) on one drm module load, and now I got:
[ 251.148528] [drm] Initialized drm 1.1.0 20060810
[ 253.021470] [drm:drm_init]
[ 253.022024] [drm:drm_get_dev]
[ 253.022069] ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 18 (level, low) -> IRQ
[ 253.022076] PCI: Setting latency timer of device 0000:01:00.0 to 64
[ 253.022238] [drm:radeon_driver_load] PCIE card detected
[ 253.022295] [drm:drm_ctxbitmap_next] drm_ctxbitmap_next bit : 0
[ 253.022351] [drm:drm_ctxbitmap_init] drm_ctxbitmap_init : 0
[ 253.022354] [drm:drm_get_head]
[ 253.022787] [drm:drm_get_head] new minor assigned 0
[ 253.022791] [drm] Initialized radeon 1.25.0 20060524 on minor 0:
Created attachment 7431 [details]
dmesg output with drm debug=1
dmesg output with drm module loaded with debug=1, when glxgears is executed
Interesting that the same error code is also shown for a person using AGP Radeon
Not that I'd know whether it helps anyone pinpoint the actual problem, but the
problem seems (in one way or another) to occur outside of r300_cmdbuf.c as well.
(In reply to comment #7)
> Interesting that the same error code is also shown for a person using AGP Radeon
That's bug 7595. This one might be a duplicate; could everybody make sure
they're using the DRM from kernel >= 2.6.19 (where the fix was integrated) or
from git. If it still happens with that, please add some debugging output that
shows what the offending offset is and why it gets rejected.
Created attachment 7985 [details]
r300_cmdbuf offset error
(In reply to comment #8)
> That's bug 7595. This one might be a duplicate; could everybody make sure
> they're using the DRM from kernel >= 2.6.19 (where the fix was integrated) or
> from git. If it still happens with that, please add some debugging output that
> shows what the offending offset is and why it gets rejected.
Happens with 2.6.19 and git version (for me). I added a DRM_ERROR debug output
in the check_offset function, so there's "Offset=nnn" in the attached log in the
place the checking fails. Do you need more debug output?
(In reply to comment #10)
> I added a DRM_ERROR debug output in the check_offset function, so there's
> "Offset=nnn" in the attached log in the place the checking fails.
> Do you need more debug output?
It would be nice if the offset was printed in hex and if it also printed the
value compared against.
Please also attach the corresponding full X log file.
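For reference, a minimal userspace sketch of the kind of output being asked for here: the offset in hex together with the window it was compared against. The helper name format_offset_failure and all values in the usage note are hypothetical, and the fields are assumed to be 32 bits wide; in the kernel the string would go through DRM_ERROR rather than snprintf.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the DRM): format a rejected offset
 * together with the half-open window [start, start + size) it was
 * compared against. The end of the window is computed in 64 bits so
 * that it prints correctly even when start + size exceeds 32 bits.
 */
static int format_offset_failure(char *buf, size_t len, uint32_t offset,
                                 uint32_t start, uint32_t size)
{
    uint64_t end = (uint64_t)start + size;

    return snprintf(buf, len,
                    "offset=0x%08" PRIx32 " range=[0x%08" PRIx32 ", 0x%" PRIx64 ")",
                    offset, start, end);
}
```

With made-up values, format_offset_failure(buf, sizeof buf, 0xF8000000u, 0xF0000000u, 0x10000000u) makes it visible that the upper bound needs more than 32 bits.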
Created attachment 8039 [details]
r300_cmdbuf offset error, new version
Here's a new version of the log file.
Created attachment 8040 [details] [review]
offset check fix
While trying to get the hex output, I noticed that I had to do a lot of casting
to get correct (long enough) values printed out. Next I noticed that actually
the check should be returning ok according to the debug output (see previous
attachment), but it did not... so doing similar casts in the check itself, like
with the patch attached, the problem goes away and 3D works!
You probably know how to make a better patch, but just for reference. Do you
need the Xorg.log anymore?
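To illustrate why the casts matter, here is a minimal standalone sketch. It is not the attached patch: the function names are hypothetical, and start and size are assumed to be stored as 32-bit values. When a window ends exactly at the 4 GB boundary, start + size wraps to 0 in 32-bit arithmetic, so the naive upper-bound comparison rejects every offset.

```c
#include <stdint.h>

/*
 * Naive check: start + size is evaluated in 32 bits, so a window that
 * ends exactly at 2^32 yields an upper bound of 0 and nothing passes.
 */
static int in_range_naive(uint32_t offset, uint32_t start, uint32_t size)
{
    return offset >= start && offset < start + size;
}

/*
 * Widened check: the same comparison, but the upper bound is computed
 * in 64 bits and therefore cannot wrap.
 */
static int in_range_widened(uint32_t offset, uint32_t start, uint32_t size)
{
    return offset >= start && (uint64_t)offset < (uint64_t)start + size;
}
```

For a made-up window starting at 0xF0000000 with size 0x10000000, the naive version rejects offset 0xF8000000 while the widened version accepts it.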
(In reply to comment #13)
> While trying to get the hex output, I noticed that I had to do a lot of casting
> to get correct (long enough) values printed out. Next I noticed that actually
> the check should be returning ok according to the debug output (see previous
> attachment), but it did not... so doing similar casts in the check itself, like
> with the patch attached, the problem goes away and 3D works!
So there is a problem if the fb gets mapped exactly at the end of the 32-bit
address space. Couldn't this happen on 32-bit systems too? And with the gart
area? The same bug is certainly present in radeon_state.c too, and radeon_cp.c
uses the same calculation. Just storing fb_size - 1 instead of fb_size (and
changing the comparisons accordingly) might work too, instead of the casts all
over the place; we just need to make sure the fb_size wasn't 0 before (which shouldn't
Again, see bug 7595; unfortunately, I completely forgot about the r300 DRM being
a whole parallel DRM within radeon when I fixed that. Ideally, everything should
use a single function for this.
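A rough sketch of what a single shared check could look like, combining the "size - 1" idea from the previous comment with one helper used for both the framebuffer and the GART window. This is only an illustration under assumptions, not the actual DRM code: struct mem_window, window_contains and offset_rejected are hypothetical names, and the fields are assumed to be 32 bits wide. As with r300_check_offset in this thread, 0 means the offset is accepted.

```c
#include <stdint.h>

/* Hypothetical description of one aperture (framebuffer or GART). */
struct mem_window {
    uint32_t start;
    uint32_t size;
};

/*
 * Returns 1 if offset lies inside the window. Comparing the distance
 * from start against size - 1 avoids computing start + size, so no
 * 32-bit wraparound can occur. A zero-sized window contains nothing.
 */
static int window_contains(const struct mem_window *w, uint32_t offset)
{
    if (w->size == 0)
        return 0;
    return offset >= w->start && offset - w->start <= w->size - 1;
}

/* Returns 0 if the offset falls in either window, 1 if it is rejected. */
static int offset_rejected(const struct mem_window *fb,
                           const struct mem_window *gart, uint32_t offset)
{
    return !(window_contains(fb, offset) || window_contains(gart, offset));
}
```

Every caller going through one helper like this would get the same boundary behaviour, which is the point of unifying the checks.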
(In reply to comment #15)
> Again, see bug 7595; unfortunately, I completely forgot about the r300 DRM being
> a whole parallel DRM within radeon when I fixed that. Ideally, everything should
> use a single function for this.
Ah right, I looked at old code and missed that it is already fixed for radeon.
You're right, ideally it should use the same function, though r300 doesn't have
to worry about old broken clients.
Created attachment 8080 [details] [review]
Unify offset checking
Does this patch work for you as well?
Yes, the patch works. Thanks! Please add a note here when you commit the change so
I can mark this as fixed (it seems bugzilla's verified/closed are not used much here).
Fixed in drm git commit aefc7a34431a8f1540b261e23d8b8d05d824b60a.