Summary: | r300_check_offset fails on PCI-E R420 (5D4F)
---|---
Product: | DRI
Component: | DRM/other
Status: | RESOLVED FIXED
Severity: | normal
Priority: | high
Version: | DRI git
Hardware: | x86 (IA32)
OS: | Linux (All)
Reporter: | Timo Jyrinki <timo.jyrinki>
Assignee: | Default DRI bug account <dri-devel>
CC: | airlied
Description
Timo Jyrinki
2006-07-30 12:02:47 UTC
Mesa 6.5.x releases have broken PCIE support. I think r300 support in HEAD is now as stable as it used to be when 6.5 went out. The Ubuntu dev branch now has mesa 6.5.1~20060817 and xserver-xorg-video-ati 6.6.2. Additionally, I've installed libdrm and linux-core/drm.ko + linux-core/radeon.ko from mesa-drm git HEAD. I'm still getting the same error messages when trying e.g. glxgears.

If I apply the following:

diff --git a/shared-core/r300_cmdbuf.c b/shared-core/r300_cmdbuf.c
index c65ffd5..561f614 100644
--- a/shared-core/r300_cmdbuf.c
+++ b/shared-core/r300_cmdbuf.c
@@ -259,7 +259,7 @@ static __inline__ int r300_check_offset(
     if (offset >= dev_priv->gart_vm_start &&
         offset < (dev_priv->gart_vm_start + dev_priv->gart_size))
         return 0;
-    return 1;
+    return 0;
 }
 static __inline__ int r300_emit_carefully_checked_packet0(drm_radeon_private_t *

and install the newly generated radeon.ko, the error message disappears (the offset check always returns 0) and 3D starts to work. Glxgears runs, GL* Gnome screensavers run etc.

The problem also happens when running the x86 version of Linux instead of AMD64.

Please tell me if I can gather relevant information for you or something.

This looks like you're essentially reducing the function to return 0; and thus disabling GART boundary checking. I doubt this is the right solution; I rather suppose that gart_vm_start and gart_size are not set correctly for PCIE cards.

I'm also having issues with a Xegl / radeon mobility M300 setup (i.e. R300/CHIP_RV380). I'm trying to understand the DRM code, and it actually looks like some of the code assumes that the GART is right behind the framebuffer, but it might be hard under some circumstances to get the actual FB size (cf. http://lists.freedesktop.org/archives/xorg/2005-May/007671.html). For instance, the kernel seems to limit the VRAM to MAX_VRAM_SIZE, and I fear that the GART position might be located somewhere inside this VRAM.

It would also be interesting to load the drm module with "debug=1" and check whether the

dmesg | grep "Setting GART location based on old memory map"

shell command returns a matching line. For me, together with another modification, forcing dev_priv->new_memmap in radeon_drv.c to 1 got my PCIE card beyond a segfault and made it display "vertical stripes", which seems to be a well-known radeonfb issue. It's quite hard to determine whether bogus microcode is involved or not, and not many experts seem to be available for the radeon driver. For instance, I've spent roughly 15 hours on this issue and found no valuable documentation explaining the detailed design decisions for FB implementations and radeonFB layout.

I'm CCing Dave Airlie because I still don't understand the whole code.

There's a good chance this is fixed in xf86-video-ati git commit 6671c1b01bf29d8f1cacf9306ef658b967d8a3cf (not in any release yet), please test.

(In reply to comment #4)
> There's a good chance this is fixed in xf86-video-ati git commit
> 6671c1b01bf29d8f1cacf9306ef658b967d8a3cf (not in any release yet), please test.

Does not seem to help. I installed ati_drv.so, atimisc_drv.so, r128_drv.so and radeon_drv.so as GIT versions.

As to Chris's comments, I only got "Setting GART location based on new memory map" (not old) on one drm module load, and now I got:

[ 251.148528] [drm] Initialized drm 1.1.0 20060810
[ 253.021470] [drm:drm_init]
[ 253.022024] [drm:drm_get_dev]
[ 253.022069] ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 18 (level, low) -> IRQ 233
[ 253.022076] PCI: Setting latency timer of device 0000:01:00.0 to 64
[ 253.022238] [drm:radeon_driver_load] PCIE card detected
[ 253.022295] [drm:drm_ctxbitmap_next] drm_ctxbitmap_next bit : 0
[ 253.022351] [drm:drm_ctxbitmap_init] drm_ctxbitmap_init : 0
[ 253.022354] [drm:drm_get_head]
[ 253.022787] [drm:drm_get_head] new minor assigned 0
[ 253.022791] [drm] Initialized radeon 1.25.0 20060524 on minor 0:

Created attachment 7431 [details]
dmesg output with drm debug=1
dmesg output with drm module loaded with debug=1, when glxgears is executed
Interesting that the same error code is also shown for a person using an AGP Radeon 9200: https://bugs.launchpad.net/distros/ubuntu/+source/xserver-xorg-video-ati/+bug/65605

Not that I'd know if it helps anyone to pinpoint the actual problem, but the problem seems (in one way or another) to occur outside of r300_cmdbuf.c as well.

(In reply to comment #7)
> Interesting that the same error code is also shown for a person using AGP Radeon
> 9200,
> https://bugs.launchpad.net/distros/ubuntu/+source/xserver-xorg-video-ati/+bug/65605

That's bug 7595. This one might be a duplicate; could everybody make sure they're using the DRM from kernel >= 2.6.19 (where the fix was integrated) or from git? If it still happens with that, please add some debugging output that shows what the offending offset is and why it gets rejected.

Created attachment 7985 [details]
r300_cmdbuf offset error
(In reply to comment #8)
> That's bug 7595. This one might be a duplicate; could everybody make sure
> they're using the DRM from kernel >= 2.6.19 (where the fix was integrated) or
> from git. If it still happens with that, please add some debugging output that
> shows what the offending offset is and why it gets rejected.

Happens with 2.6.19 and git version (for me). I added a DRM_ERROR debug output in the check_offset function, so there's "Offset=nnn" in the attached log in the place the checking fails. Do you need more debug output?

(In reply to comment #10)
> I added a DRM_ERROR debug output in the check_offset function, so there's
> "Offset=nnn" in the attached log in the place the checking fails.

Thanks.

> Do you need more debug output?

It would be nice if the offset was printed in hex and if it also printed the value compared against. Please also attach the corresponding full X log file.

Created attachment 8039 [details]
r300_cmdbuf offset error, new version
Here's a new version of the log file.
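For readers following along, the kind of debug line requested above (offset in hex, together with the values it is compared against) can be sketched in userspace C. This is only an illustration: the helper name format_rejected_offset and the sample values in the note below are hypothetical and not taken from the attached log, and the kernel code would go through DRM_ERROR rather than snprintf. It assumes 32-bit offsets and uses the <inttypes.h> format macros so that no casts are needed to print full-width values.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the DRM): writes the rejected offset and
 * the window it was checked against into buf, all as zero-padded 32-bit hex.
 * Returns the number of characters snprintf would have written.
 */
static int format_rejected_offset(char *buf, size_t len, uint32_t offset,
                                  uint32_t start, uint32_t size)
{
    return snprintf(buf, len,
                    "offset=0x%08" PRIx32 " start=0x%08" PRIx32
                    " size=0x%08" PRIx32,
                    offset, start, size);
}
```

Called with the made-up values offset 0xf8000000, start 0xf0000000 and size 0x10000000, the helper produces "offset=0xf8000000 start=0xf0000000 size=0x10000000". Printing the start and size separately, rather than their sum, keeps the line readable even when the sum does not fit in 32 bits.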
Created attachment 8040 [details] [review]
offset check fix

While trying to get the hex output, I noticed that I had to do a lot of casting to get correct (long enough) values printed out. Next I noticed that the check should actually be returning ok according to the debug output (see previous attachment), but it did not... so doing similar casts in the check itself, like with the patch attached, the problem goes away and 3D works! You probably know how to make a better patch, but just for reference.

Do you need the Xorg.log anymore?

(In reply to comment #13)
> While trying to get the hex output, I noticed that I had to do a lot of casting
> to get correct (long enough) values printed out. Next I noticed that actually
> the check should be returning ok according to the debug output (see previous
> attachment), but it did not... so doing similar casts in the check itself, like
> with the patch attached, the problem goes away and 3D works!

So there is a problem if the fb gets mapped exactly at the end of the 32-bit address space. Couldn't this happen on 32-bit systems too? And with the GART area? The same bug is certainly present in radeon_state.c too, and radeon_cp.c uses the same calculation. Just storing fb_size - 1 instead of fb_size (and changing the comparisons accordingly) might work too, instead of the casts all over the place; we just need to make sure that fb_size wasn't 0 before (which shouldn't really happen)...

Again, see bug 7595; unfortunately, I completely forgot about the r300 DRM being a whole parallel DRM within radeon when I fixed that. Ideally, everything should use a single function for this.

(In reply to comment #15)
> Again, see bug 7595; unfortunately, I completely forgot about the r300 DRM being
> a whole parallel DRM within radeon when I fixed that. Ideally, everything should
> use a single function for this.

Ah right, I looked at old code and missed that it is already fixed for radeon. You're right, ideally it should use the same function, though r300 doesn't have to worry about old broken clients.

Created attachment 8080 [details] [review]
Unify offset checking

Does this patch work for you as well?

Yes, the patch works. Thanks! Please put a note when you commit the change so I can mark this as fixed (it seems bugzilla's verified/closed are not used much here).

Fixed in drm git commit aefc7a34431a8f1540b261e23d8b8d05d824b60a.
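The wraparound discussed in this thread can be reproduced in a few lines of userspace C. The sketch below is not the committed patch and not the DRM source: the function names are hypothetical, all values are assumed to be 32-bit, and the functions return 1 for "inside the window" (the driver's r300_check_offset uses 0 for an accepted offset). It shows the naive comparison next to the two remedies mentioned in the comments, namely widening before the addition and comparing against the last valid address (start + size - 1).

```c
#include <assert.h>
#include <stdint.h>

/* Naive form: start + size is evaluated in 32 bits. If the window ends
 * exactly at 4 GiB, the sum wraps to 0 and every offset is rejected. */
static int in_window_naive(uint32_t offset, uint32_t start, uint32_t size)
{
    return offset >= start && offset < start + size;
}

/* Remedy 1 (the casts): do the addition in 64 bits so it cannot wrap. */
static int in_window_wide(uint32_t offset, uint32_t start, uint32_t size)
{
    return offset >= start && (uint64_t)offset < (uint64_t)start + size;
}

/* Remedy 2 (size - 1): compare against the last valid address instead of
 * one past the end. This requires a non-zero size, as noted in the thread. */
static int in_window_last(uint32_t offset, uint32_t start, uint32_t size)
{
    assert(size != 0);
    return offset >= start && offset <= start + (size - 1);
}
```

With a made-up window of start 0xf0000000 and size 0x10000000, which ends exactly at the top of the 32-bit address space, in_window_naive rejects the offset 0xf8000000 while both remedies accept it. For a window that does not touch the 4 GiB boundary, all three functions agree.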