Kernel version where the bug doesn't exist: 2.6.21-rc4-git5
Distribution: Gentoo
Hardware: VIA EPIA M10000, CLE266 chipset with integrated CastleRock graphics
Good "via" module version seems to be: 2.11.0 20061227
Bad "via" module version seems to be: 2.11.1 20070202

Gentoo's x11-drm package replaced the kernel "drm" and "via" modules with versions from git. After a reboot the X screen was trashed: no window backgrounds, no window frames. Only text was visible, but it was printed over previous lines.

The problem depends on the "EnableAGPDMA" option being set in xorg.conf, and I can't reproduce it with my current kernel. I can't reach the broken commit with git-bisect or git-reset because I'm using Linux 2.6.21-rc4 and I need the "fix build for 2.6.21-rc1" patch to compile the drm modules.

With older commits the bug seems to occur only when OpenGL is used (about 10 s). The gears drawn by glxgears are no longer visible once it hits. Switching to a text console and back causes a lockup.

Output of the "lspci -v" command:

01:00.0 VGA compatible controller: VIA Technologies, Inc. VT8623 [Apollo CLE266] integrated CastleRock graphics (rev 03) (prog-if 00 [VGA])
Subsystem: VIA Technologies, Inc.
VT8623 [Apollo CLE266] integrated CastleRock graphics Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 11 Memory at d0000000 (32-bit, prefetchable) [size=64M] Memory at d4000000 (32-bit, non-prefetchable) [size=16M] [virtual] Expansion ROM at d5000000 [disabled] [size=64K] After trying today's git I found these errors in log (this is first time, no errors earlier): Mar 27 21:38:01 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 0 next_addr 80100 Mar 27 21:38:01 elke [drm:via_cmdbuf_jump] *ERROR* via_cmdbuf_jump failed Mar 27 21:38:01 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 30 next_addr 80230 Mar 27 21:38:01 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 100 next_addr 80300 Mar 27 21:38:01 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80302 Mar 27 21:38:01 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80302 Mar 27 21:38:01 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80302 Mar 27 21:38:02 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80348 Mar 27 21:38:02 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80398 Mar 27 21:38:02 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 803e8 Mar 27 21:38:02 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80430 Mar 27 21:38:03 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80478 Mar 27 21:38:03 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 804c0 Mar 27 21:38:03 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80508 Mar 27 21:38:03 elke [drm:via_cmdbuf_wait] *ERROR* 
via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80550 Mar 27 21:38:04 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80598 Mar 27 21:38:04 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 805e0 Mar 27 21:38:04 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80628 Mar 27 21:38:04 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80670 Mar 27 21:38:05 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 806b8 Mar 27 21:38:05 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80700 Mar 27 21:38:05 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80748 Mar 27 21:38:06 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80790 Mar 27 21:38:06 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 807d8 Mar 27 21:38:06 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80820 Mar 27 21:38:06 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80868 Mar 27 21:38:07 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 808b0 Mar 27 21:38:07 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 808f8 Mar 27 21:38:07 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80950 Mar 27 21:38:07 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 809a0 Mar 27 21:38:08 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 809f8 Mar 27 21:38:08 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 
200 next_addr 80a48 Mar 27 21:38:08 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80a98 Mar 27 21:38:08 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80ae0 Mar 27 21:38:09 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80b28 Mar 27 21:38:09 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80b70 Mar 27 21:38:09 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80bb8 Mar 27 21:38:09 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80c00 Mar 27 21:38:10 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80c48 Mar 27 21:38:10 elke [drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw 21600 cur_addr 200 next_addr 80c90 [...]
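In the log above the hw value stays at 21600 on every line while next_addr keeps growing. A minimal sketch for pulling those three fields out of a saved log line, assuming they are printed as hexadecimal; the struct and function names are hypothetical and are not part of the driver:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the three values printed by the error line. */
struct via_wait_sample {
	unsigned int hw;
	unsigned int cur_addr;
	unsigned int next_addr;
};

/*
 * Parse one syslog line of the form
 *   "... via_cmdbuf_wait timed out hw 21600 cur_addr 0 next_addr 80100"
 * Returns 1 on success, 0 if the line is not a timeout message.
 * Assumption: all three fields are hexadecimal without a 0x prefix.
 */
static int parse_via_wait(const char *line, struct via_wait_sample *out)
{
	const char *p;

	assert(line != NULL && out != NULL);

	p = strstr(line, "timed out hw ");
	if (p == NULL)
		return 0;

	return sscanf(p, "timed out hw %x cur_addr %x next_addr %x",
		      &out->hw, &out->cur_addr, &out->next_addr) == 3;
}
```

Feeding each line of the log through parse_via_wait() and comparing successive hw values is one way to confirm that the hardware read pointer is not advancing.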
I had some time, so I started bisecting with Linux 2.6.20. It looks like commit 6c04185857694b2293046b7ea1d4515404a740c3:

> Author: Thomas Hellstrom <thomas-at-tungstengraphics-dot-com>
> Date: Fri Feb 2 09:15:44 2007 +0100
>
> via: Try to improve command-buffer chaining.
>
> Bump driver date and patchlevel.

broke DRI on my machine. DRI is OK one commit before.
Hunk #2 of "via: Try to improve command-buffer chaining" seems to be the cause. Reverting hunk #2 solves my problem.
The patch below solves my problem or, at least, makes it very hard to reproduce. It is the reverted part of hunk 2. With this change I can use glxgears again. Tested with 2.6.20 and 2.6.21-rc5.

Btw. I don't know what this patch is doing. Please make it correct.

diff --git a/shared-core/via_dma.c b/shared-core/via_dma.c
--- a/shared-core/via_dma.c
+++ b/shared-core/via_dma.c
@@ -419,7 +419,6 @@ static inline uint32_t *via_get_dma(drm_via_private_t * dev_priv)
  * modifying the pause address stored in the buffer itself. If
  * the regulator has already paused, restart it.
  */
-
 static int via_hook_segment(drm_via_private_t *dev_priv,
 			    uint32_t pause_addr_hi, uint32_t pause_addr_lo,
 			    int no_pci_fire)
@@ -430,12 +429,20 @@ static int via_hook_segment(drm_via_private_t *dev_priv,
 
 	paused = 0;
 	via_flush_write_combine();
+	while(! *(via_get_dma(dev_priv)-1));
 	*dev_priv->last_pause_ptr = pause_addr_lo;
 	via_flush_write_combine();
+	/*
+	 * The below statement is inserted to really force the flush.
+	 * Not sure it is needed.
+	 */
+
+	while(! *dev_priv->last_pause_ptr);
 	reader = *(dev_priv->hw_addr_ptr);
 	ptr = ((volatile char *)paused_at - dev_priv->dma_ptr) +
 		dev_priv->dma_offset + (uint32_t) dev_priv->agpAddr + 4;
 	dev_priv->last_pause_ptr = via_get_dma(dev_priv) - 1;
+	while(! *dev_priv->last_pause_ptr);
 
 	if ((ptr - reader) <= dev_priv->dma_diff ) {
 		count = 10000000;
Created attachment 9537
Patch that adds a printout.

Hi,

The patch adds a printout to the kernel log at X server start if AGPDMA is enabled. It looks like "DMA DIFF is " followed by a number. Can you check that number and report back?

Regards,
Thomas
Marking as needinfo
Below is the log from my "sure" branch.

X + ion3 - good
glxgears - good
glaxium - good

[drm] Initialized drm 1.1.0 20060810
ACPI: PCI Interrupt 0000:01:00.0[A] -> Link [LNKA] -> GSI 11 (level, low) -> IRQ 11
[drm] Initialized via 2.11.1 20070202 on minor 0
agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V2 device at 0000:00:00.0 into 4x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 4x mode
[drm:via_cmdbuf_start] *ERROR* DMA DIFF is 0x00000000
[drm:via_cmdbuf_start] *ERROR* DMA DIFF is 0x00000000
[drm:via_cmdbuf_start] *ERROR* DMA DIFF is 0x00000000
[drm:via_cmdbuf_start] *ERROR* DMA DIFF is 0x00000000
ACPI: PCI interrupt for device 0000:01:00.0 disabled
[drm] Module unloaded

Below is the log from the "master" branch.

X + ion3 - good
glxgears - bad

[drm] Initialized drm 1.1.0 20060810
ACPI: PCI Interrupt 0000:01:00.0[A] -> Link [LNKA] -> GSI 11 (level, low) -> IRQ 11
[drm] Initialized via 2.11.1 20070202 on minor 0
agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V2 device at 0000:00:00.0 into 4x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 4x mode
[drm:via_cmdbuf_start] *ERROR* DMA DIFF is 0x00000000
(In reply to comment #3)
> Patch below is solving my problem or, at least, is making it very hard to
> reproduce. It is reverted part of hunk 2. With this change I can use glxgears
> again. Tested with 2.6.20 and 2.6.21-rc5. Btw. I don't know what this patch is
> doing. Please make it correct.
>
> diff --git a/shared-core/via_dma.c b/shared-core/via_dma.c
> --- a/shared-core/via_dma.c
> +++ b/shared-core/via_dma.c
> @@ -419,7 +419,6 @@ static inline uint32_t *via_get_dma(drm_via_private_t *
> dev_priv)
>   * modifying the pause address stored in the buffer itself. If
>   * the regulator has already paused, restart it.
>   */
> -
>  static int via_hook_segment(drm_via_private_t *dev_priv,
>  			    uint32_t pause_addr_hi, uint32_t pause_addr_lo,
>  			    int no_pci_fire)
> @@ -430,12 +429,20 @@ static int via_hook_segment(drm_via_private_t *dev_priv,
>
>  	paused = 0;
>  	via_flush_write_combine();
> +	while(! *(via_get_dma(dev_priv)-1));
>  	*dev_priv->last_pause_ptr = pause_addr_lo;
>  	via_flush_write_combine();
> +	/*
> +	 * The below statement is inserted to really force the flush.
> +	 * Not sure it is needed.
> +	 */
> +
> +	while(! *dev_priv->last_pause_ptr);
>  	reader = *(dev_priv->hw_addr_ptr);
>  	ptr = ((volatile char *)paused_at - dev_priv->dma_ptr) +
>  		dev_priv->dma_offset + (uint32_t) dev_priv->agpAddr + 4;
>  	dev_priv->last_pause_ptr = via_get_dma(dev_priv) - 1;
> +	while(! *dev_priv->last_pause_ptr);
>
>  	if ((ptr - reader) <= dev_priv->dma_diff ) {
>  		count = 10000000;
>

Hi,

The code you have added (except the last +) is used to flush write-combining registers in the processor. I thought DRM_MEMORYBARRIER() should be sufficient for that, but apparently not. I have added some equivalent code in drm_git. Can you try it and see if it runs?

/Thomas
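The write-then-read-back pattern discussed above can be sketched in isolation as follows. This is only an illustration under stated assumptions, not the code that went into drm git: write_and_read_back() is a hypothetical helper, and whether the read actually drains a write-combining buffer depends on the CPU and on how the memory is mapped. On ordinary cached memory the loop exits on its first check.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not driver code: store a non-zero value through a
 * volatile pointer, then read the same location until the value is seen.
 */
static uint32_t write_and_read_back(volatile uint32_t *slot, uint32_t value)
{
	assert(slot != 0);
	assert(value != 0);	/* with a zero value the spin below would never end */

	*slot = value;

	/* Same shape as "while(! *dev_priv->last_pause_ptr);" in the patch. */
	while (!*slot)
		;

	return *slot;
}
```

The volatile qualifier keeps the compiler from removing the read; it says nothing about ordering on the bus, which is why the patch also calls via_flush_write_combine() around the store.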
Yes. It is working for me. Thanks.
I have a VIA CN700 and I have the same nightmare. After trying all the patches on the net, I also tried the latest libdrm/drm (git clone git://anongit.freedesktop.org/git/mesa/drm) and mesa 7.0.1.

However, I am still getting the message below after 30 seconds of using googleearth. Then sometimes the system freezes.

[drm:via_cmdbuf_wait] *ERROR* via_cmdbuf_wait timed out hw c9e00 cur_addr 49e00 next_addr ca000

Any new ideas?

My board is an EPIA EN12000EG.

lspci -v:

01:00.0 VGA compatible controller: VIA Technologies, Inc. UniChrome Pro IGP (rev 01) (prog-if 00 [VGA])
Subsystem: VIA Technologies, Inc. UniChrome Pro IGP
Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 21
Memory at f4000000 (32-bit, prefetchable) [size=64M]
Memory at fb000000 (32-bit, non-prefetchable) [size=16M]
[virtual] Expansion ROM at fc000000 [disabled] [size=64K]
Capabilities: [60] Power Management version 2
Capabilities: [70] AGP version 3.0

Thank you in advance for your answer,
Octavian

P.S. I have been trying to solve this problem for more than half a year.