Bug 8060 - r200: enabling/disabling of vertex programs (even in subsequently run apps) may cause lockups
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/r200
Version: git
Hardware: x86 (IA32) Linux (All)
Importance: high major
Assignee: Roland Scheidegger
Duplicates: 8027
Reported: 2006-08-29 05:52 UTC by Roland Scheidegger
Modified: 2009-08-24 12:24 UTC

Attachments
more vap state flushs to potentially prevent gpu lockups (809 bytes, patch)
2006-08-29 06:03 UTC, Roland Scheidegger
disable the really ugly vp disable hack (959 bytes, patch)
2006-08-29 06:06 UTC, Roland Scheidegger
OpenGL Warcraft, with the new shader handling. (286.66 KB, image/png)
2006-09-01 16:13 UTC, Chris Rankin
Source code for filter shared object (1.31 KB, text/plain)
2006-09-03 06:55 UTC, Chris Rankin
R200_DEBUG=all for celestia 1.4.1 (COLD) (403.87 KB, application/octet-stream)
2006-09-04 17:01 UTC, Chris Rankin
R200_DEBUG=all for celestia 1.4.1 (WARM) (380.72 KB, application/octet-stream)
2006-09-04 17:13 UTC, Chris Rankin
R200_DEBUG=all for celestia 1.4.1 (THE FIX) (309.70 KB, application/octet-stream)
2006-09-04 17:14 UTC, Chris Rankin

Description Roland Scheidegger 2006-08-29 05:52:26 UTC
There seems to be an initialization problem with vertex programs: once an
application has used them, applications run afterwards may cause lockups. This
was reported in bug #8009, by running celestia and then running World of
Warcraft (with wine). (This is a new bug because the old one was really about a
different problem.) Without WoW I'm unable to reproduce the exact same problem,
but it's probably really nasty.
Comment 1 Roland Scheidegger 2006-08-29 06:03:12 UTC
Created attachment 6735
more vap state flushs to potentially prevent gpu lockups

Could you try the attached patch (against drm)? It's kinda ugly and I don't
quite know why it would fix things, but it allowed me to remove another
really ugly hack in the dri driver without getting lockups in doom3 (see next
patch). I suspect that if you enable/disable vertex programs you need a
vap state flush, similar to the one you sometimes need when changing tcl
parameters.
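The suspicion above can be illustrated with a minimal sketch. It is not the
attached patch and not real drm code: the `cmd_stream` type, the `CMD_*` tokens
and `emit_vp_enable` are hypothetical names, used only to show the idea of
emitting a flush whenever the vertex-program enable state changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command tokens; real drm command packets look different. */
enum cmd { CMD_VAP_FLUSH, CMD_SET_VP_ENABLE };

#define STREAM_MAX 16

struct cmd_stream {
    enum cmd cmds[STREAM_MAX];
    size_t count;
    bool vp_enabled;   /* last vertex-program enable state written */
};

static void push(struct cmd_stream *s, enum cmd c)
{
    assert(s->count < STREAM_MAX);
    s->cmds[s->count++] = c;
}

/* Write the vertex-program enable state, preceded by a flush if it changes. */
void emit_vp_enable(struct cmd_stream *s, bool enable)
{
    assert(s != NULL);
    if (enable != s->vp_enabled)
        push(s, CMD_VAP_FLUSH);
    push(s, CMD_SET_VP_ENABLE);
    s->vp_enabled = enable;
}
```

Writing the same state twice adds no second flush in this sketch; whether the
hardware needs one in other situations is not something the sketch settles.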
Comment 2 Roland Scheidegger 2006-08-29 06:06:20 UTC
Created attachment 6736
disable the really ugly vp disable hack

This patch removes the ugly, non-understood hack required to prevent lockups in
doom3. You will experience lockups without the previous patch. Actually, I was
sure I tried something along the lines of the previous patch earlier and it
didn't work at that time :-(.
Comment 3 Chris Rankin 2006-08-29 14:02:15 UTC
Nope, those patches didn't help:

- Replaced both drm/radeon kernel modules, as well as r200_dri.so object.
- Rebooted
- Enabled GL_ARB_vertex_program in celestia.cfg file
- Ran celestia with planet rendering enabled
- Ran World of Warcraft in OpenGL mode
- *GPU LOCKUP*
Comment 4 Chris Rankin 2006-08-30 01:11:09 UTC
The problem with this bug is that I can't be sure which application is
responsible for crashing my Radeon's GPU: celestia or Warcraft. Running celestia
1.4.1 once such that it uses the hardware's GL_ARB_vertex_program feature and
renders planets is enough such that running Warcraft under Wine in OpenGL mode
later will lock things up.

HOWEVER, you should keep bug #8027 in mind. #8027 has two screen shots from
Warcraft running "successfully" in both Direct3D mode and OpenGL mode, and
you'll notice that the OpenGL screen is not exactly correct! More importantly,
while Wine has no choice but to implement Direct3D using OpenGL primitives, the
Direct3D mode doesn't cause lockups at all.

Warcraft OpenGL has already been found to call the GetProgramiv() function for a
vertex program that hasn't been loaded yet, and so it's possible that celestia
is blameless in this bug and that Warcraft is performing another (unhandled)
illegal thing. Maybe any vertex-using OpenGL application run before Warcraft
would cause the same lock-up? Certainly any insight into what could be making
Warcraft's OpenGL screen misrender the buttons (when pressed) and cursors would
be welcome.
Comment 5 Roland Scheidegger 2006-08-30 02:44:06 UTC
(In reply to comment #4)
> HOWEVER, you should keep bug #8027 in mind. #8027 has two screen shots from
> Warcraft running "successfully" in both Direct3D mode and OpenGL mode, and
> you'll notice that the OpenGL screen is not exactly correct! More importantly,
> while Wine has no choice but to implement Direct3D using OpenGL primitives, the
> Direct3D mode doesn't cause lockups at all.
Well, the gl commands generated may be quite different - wine even has options
controlling whether it should emulate vertex shaders.

> Warcraft OpenGL has already been found to call the GetProgramiv() function for a
> vertex program that hasn't been loaded yet, and so it's possible that celestia
> is blameless in this bug and that Warcraft is performing another (unhandled)
> illegal thing. Maybe any vertex-using OpenGL application run before Warcraft
> would cause the same lock-up?
I'd suspect something like that, I'll need to double check the driver never
"half-enables" vertex programs. Just to be clear, if you run celestia, but WoW
not immediately thereafter but after something else (say glxgears) it still
locks up?
Comment 6 Chris Rankin 2006-08-30 15:30:39 UTC
(In reply to comment #5)
> Just to be clear, if you run celestia, but WoW not immediately thereafter but
> after something else (say glxgears) it still locks up?

Yes:
- Enable GL_ARB_vertex_program for celestia
- Run celestia with planet rendering enabled, then exit
- Run glxgears, then exit
- Run Warcraft in OpenGL mode
- *GPU lockup*

This is as expected. After all, I can run Warcraft in Direct3D mode after
celestia without locking up the GPU, and that is still an OpenGL program under
the covers.

BTW, I have also grabbed r200_vertprog.c 1.11 from CVS, but that hasn't affected
anything either. And I've backed out the two patches (DRM and Mesa) that you
posted as part of this bug because they had no effect.
Comment 7 Roland Scheidegger 2006-08-30 15:35:43 UTC
(In reply to comment #6)
> Yes:
> - Enable GL_ARB_vertex_program for celestia
> - Run celestia with planet rendering enabled, then exit
> - Run glxgears, then exit
> - Run Warcraft in OpenGL mode
> - *GPU lockup*
Guess it's time for that 3GB download :-(.

> BTW, I have also grabbed r200_vertprog.c 1.11 from CVS, but that hasn't affected
> anything either.
Yes, that's expected, just fixes some other little things.

> And I've backed out the two patches (DRM and Mesa) that you
> posted as part of this bug because they had no effect.
Those will go into the respective development repos soon, as it's "the right
thing to do". I'm really happy to see the workaround in the dri driver go:
workarounds that work for reasons you don't understand are not such a good idea
if you can actually fix the real problem... even if it doesn't help with this
bug.

Comment 8 Chris Rankin 2006-08-30 15:39:13 UTC
About those two patches: Can the patched kernel module be used with the
unpatched Mesa code? I already know that the GPU crashes if you patch Mesa
without patching the kernel. (Your warning, plus personal experience...)
Comment 9 Chris Rankin 2006-08-30 15:56:16 UTC
Have just run arbvptest1:

libGL warning: 3D driver claims to not support visual 0x4b
Mesa: CPU vendor: GenuineIntel
Mesa: CPU name:                   Intel(R) Xeon(TM) CPU 2.66GHz
Mesa: MMX cpu detected.
Mesa: SSE cpu detected.
Mesa: Not testing OS support for SSE, leaving enabled.
arbvptest1: r200_vertprog.c:564: r200_translate_vertex_program: Assertion
`mesa_vp->Base.OutputsWritten & (1 << 0)' failed.
Aborted (core dumped)

Oops.
Comment 10 Roland Scheidegger 2006-08-30 16:01:17 UTC
(In reply to comment #8)
> About those two patches: Can the patched kernel module be used with the
> unpatched Mesa code? I already know that the GPU crashes if you patch Mesa
> without patching the kernel. (Your warning, plus personal experience...)
Yes. It won't help, however. And if it does I'm going to cry.
Comment 11 Chris Rankin 2006-08-30 16:09:59 UTC
Other things to report, from Mesa's test programs:

- arbvptorus is another program like celestia that will lock Warcraft up.

- arbvpwarpmesh does *not* lock Warcraft up later, although I was worried that
it might. The Warcraft login screen took slightly longer to appear once I had
run arbvpwarpmesh, but appear it did...
Comment 12 Brian Paul 2006-08-30 16:19:07 UTC
The failing assertion:

arbvptest1: r200_vertprog.c:564: r200_translate_vertex_program: Assertion
`mesa_vp->Base.OutputsWritten & (1 << 0)' failed.

might be an invalid assertion.  I just double-checked the GL_ARB_v_p spec and
apparently it's not an error to use a vertex program that does not write to
result.position.  It IS an error for GL_NV_v_p.  The spec says the results will
be undefined if result.position is not written (but we shouldn't crash).

The arbvptest1.c program tests several program strings, one of which does not
write result.position.  That's probably the one that's failing.
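As a rough sketch of the alternative to asserting, the check can return a
decision instead of aborting. This is not the driver's code: `vp_info`,
`vp_path` and `choose_vp_path` are hypothetical names, and the sketch assumes
that bit 0 of the outputs-written mask stands for result.position, as the
failing assertion suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: bit 0 of the mask stands for result.position, as the failing
 * assertion `OutputsWritten & (1 << 0)` suggests. */
#define OUTPUT_POSITION_BIT (UINT32_C(1) << 0)

/* Hypothetical stand-in for the program's bookkeeping. */
struct vp_info {
    uint32_t outputs_written;
};

enum vp_path { VP_PATH_HARDWARE, VP_PATH_FALLBACK };

/* Pick a path for the program instead of asserting on its outputs. */
enum vp_path choose_vp_path(const struct vp_info *vp)
{
    assert(vp != NULL);
    if ((vp->outputs_written & OUTPUT_POSITION_BIT) == 0)
        return VP_PATH_FALLBACK;   /* legal under ARB_vp, results undefined */
    return VP_PATH_HARDWARE;
}
```

A program that writes only other outputs (say a mask of 0x2) would take the
fallback path here rather than crash the application.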
Comment 13 Chris Rankin 2006-08-30 16:21:20 UTC
arbvptest3 is Warcraft-friendly. Having arbvptest1 core-dump on me doesn't hurt
Warcraft either.
Comment 14 Roland Scheidegger 2006-08-30 16:27:15 UTC
(In reply to comment #9)
> arbvptest1: r200_vertprog.c:564: r200_translate_vertex_program: Assertion
> `mesa_vp->Base.OutputsWritten & (1 << 0)' failed.
> Aborted (core dumped)
> Oops.
It's not really that big a problem, and moreover it's not exactly new; it
just wasn't exposed previously. arbvptest1 just loads some vertex programs and
never executes them. Before the last commits such programs were never
translated to hw, but now they are. The driver doesn't handle vertex progs which
don't output position data; we'd need to either output some generated position
data or just do a fallback. Such programs are apparently legal, so there
shouldn't be an abort(), though they likely don't exist in practice.

(In reply to comment #11)
> Other things to report, from Mesa's test programs:
> 
> - arbvptorus is another program like celestia that will lock Warcraft up.
I'd guess any program which will actually run a vertex prog on the hardware will.

> - arbvpwarpmesh does *not* lock Warcraft up later, although I was worried that
> it might. The Warcraft login screen took slightly longer to appear once I had
> run arbvpwarpmesh, but appear it did...
arbvpwarpmesh uses generic attribs which aren't supported right now in the
driver, thus this test never runs a vertex program on the hardware, just
triggers a fallback. Not sure why arbvptest3 won't lock up later, probably we
get lucky...
Comment 15 Chris Rankin 2006-08-30 16:29:35 UTC
I have removed the assertion from r200_vertprog.c, and can now say that
arbvptest1 is Warcraft-friendly as well. So it's just arbvptorus that damages
things, although arbvpwarpmesh seems to rattle them slightly (non-lethally) too.
Comment 16 Chris Rankin 2006-08-30 16:45:06 UTC
(In reply to comment #14)
> Not sure why arbvptest3 won't lock up later, probably we get lucky...

Could it depend on what the vertex program was supposed to do? Something
presumably has to parse them, assign parameters etc.

And arbvptest1 just puts a black window on my screen. Is it supposed to do that?
arbvptest3 gives me a multi-hued triangle.
Comment 17 Brian Paul 2006-08-30 16:55:04 UTC
Don't worry about the output of the arb[vf]ptest?.c programs.  They were
intended for testing the program parser during development.
Comment 18 Roland Scheidegger 2006-08-30 16:59:32 UTC
(In reply to comment #12)
> The failing assertion:
> 
> arbvptest1: r200_vertprog.c:564: r200_translate_vertex_program: Assertion
> `mesa_vp->Base.OutputsWritten & (1 << 0)' failed.
> 
> might be an invalid assertion.  I just double-checked the GL_ARB_v_p spec and
> apparently it's not an error to use a vertex program that does not write to
> result.position.  It IS an error for GL_NV_v_p.  The spec says the results will
> be undefined if result.position is not written (but we shouldn't crash).
> 
> The arbvptest1.c program tests several program strings, one of which does not
> write result.position.  That's probably the one that's failing.
Yes. I'm going to add a fallback instead. I'll also fix up writing to the BFC
outputs just the same (this is actually strange, r200 can handle that with tnl,
but I _never_ got fglrx to write a vertex prog to the hardware which writes that
- the instruction to write it gets optimized away completely, and if you
actually enable two-side lighting for vertex progs it runs a lot slower so
presumably hits a fallback).
Comment 19 Chris Rankin 2006-08-30 17:16:17 UTC
I've just discovered a workaround for the Warcraft lock-up bug:

- Run celestia with GL_ARB_vertex_program enabled and planet rendering, then exit.
- Switch to a console via (e.g) Alt-F6.
- Switch back into X.
- Run Warcraft in OpenGL mode.
- Ta Da!

I don't know if it helps, but I've also applied the patch in comment #1, and not
applied the patch in comment #2.
Comment 20 Roland Scheidegger 2006-08-30 18:13:49 UTC
(In reply to comment #12)
> The failing assertion:
> 
> arbvptest1: r200_vertprog.c:564: r200_translate_vertex_program: Assertion
> `mesa_vp->Base.OutputsWritten & (1 << 0)' failed.
> 
> might be an invalid assertion.
OK, that's fixed: the driver now just uses a fallback (if the shader is
actually run) instead.
Comment 21 Roland Scheidegger 2006-09-01 12:03:12 UTC
(In reply to comment #19)
> I've just discovered a workaround for the Warcraft lock-up bug:
> 
> - Run celestia with GL_ARB_vertex_program enabled and planet rendering, then exit.
> - Switch to a console via (e.g) Alt-F6.
> - Switch back into X.
> - Run Warcraft in OpenGL mode.
Since this seems to be some kind of initialization bug, that is probably due to
the chip reset which is done when switching back to X. Actually I think I know
what could cause it, I'll check in a patch soon.
Comment 22 Roland Scheidegger 2006-09-01 14:04:56 UTC
(In reply to comment #19)
OK, hopefully fixed in cvs (and if not, at least another not-yet-discovered bug
got fixed...). If WoW not only queried but also tried to execute an unspecified
shader, a lockup probably would have happened, since the driver would not have
set up a vertex program but enabled it anyway, so random junk (tnl data) would
have been executed (except after a chip reset, where it probably defaults to
program length 0). If you can verify it no longer locks up you can close this
bug. Note however that WoW most likely will still render wrong if this indeed
fixes the lockup; I can't imagine that it would try to use non-existing shaders
on purpose.
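The failure mode described above (vertex-program mode enabled although no
program was ever set up) can be sketched as a simple guard. The `vp_state`
struct and `hw_vp_should_be_enabled` are hypothetical names for illustration;
the actual fix in cvs may be structured quite differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping; the real r200 context tracks this differently. */
struct vp_state {
    bool app_enabled;       /* application enabled vertex-program mode */
    size_t program_length;  /* instructions set up; 0 = nothing specified */
};

/* Only treat hardware vertex-program mode as usable when a program exists. */
bool hw_vp_should_be_enabled(const struct vp_state *st)
{
    assert(st != NULL);
    return st->app_enabled && st->program_length > 0;
}
```

With a guard of this kind, an application that enables vertex programs
without specifying one would leave the hardware on the fixed-function path
instead of executing whatever happens to be in program memory.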
Comment 23 Chris Rankin 2006-09-01 15:02:38 UTC
Yes, you fixed the lockup. And WoW renders even more wrongly too, because
there's now a curious item on the left of the login screen. Possibly this email
might explain wine's behaviour:

http://www.winehq.org/pipermail/wine-patches/2006-September/030335.html

However, what worries me is that the flickering shadow on Saturn's rings is back
in celestia 1.4.1.
Comment 24 Roland Scheidegger 2006-09-01 15:56:50 UTC
(In reply to comment #23)
> Yes, you fixed the lockup. And WoW renders even more wrongly too, because
> there's now a curious item on the left of the login screen. Possibly this email
> might explain wine's behaviour:
> 
> http://www.winehq.org/pipermail/wine-patches/2006-September/030335.html
Maybe. In any case, doesn't appear to be a mesa bug then.

> However, what worries me is that the flickering shadow on Saturn's rings is back
> in celestia 1.4.1.
You mean with the arb_vp path? Strange. Nothing in that patch should really
change anything, as long as you don't specify a bogus shader. Works just fine
for me. Are you sure it's picking up the right driver file? The default location
changed somewhat recently.
Comment 25 Chris Rankin 2006-09-01 16:11:07 UTC
I'm not sure what is causing my "flickering ring shadow" problem this time. It
could be a strange kind of graphics-card crash that can survive a warm reboot.
(There is also a moth which has gone AWOL somewhere in this room...)

I have noticed that the r200_context.h file has changed, and have recompiled the
r200_dri.so file accordingly. I am running the FC6 1.1.1-34 version of the XOrg
server with the 6.6.2-1 version of the ATI driver, both compiled locally,
because this is the only way to run XOrg 7.1 on FC5. In order to compile the
latest version of Mesa, I have downloaded the 6.5.1-0.rc2 Mesa packages, patched
the following files from CVS, and rebuilt r200_dri.so (only):

r200_vertprog.c
r200_state.c
r200_context.h
r200_texstate.c

Everything is working OK at the moment, but when the ring-shadow flickering
starts in celestia, I know that switching to a text console will crash the video.

WoW is working *much* better in OpenGL mode now. I will attach a new screenshot
for comparison.
Comment 26 Chris Rankin 2006-09-01 16:13:41 UTC
Created attachment 6789
OpenGL Warcraft, with the new shader handling.

Warcraft, once the "lockup" fix has been applied to the r200_dri.so module.
Compare with the screenshot in #8027.
Comment 27 Chris Rankin 2006-09-02 03:20:04 UTC
Something's still wrong. I cold-booted this morning, ran celestia with
GL_ARB_vertex_program enabled and looked at Saturn's rings. Saturn's shadow on
its rings was flickering as if GL_NV_vertex_program was still enabled (although
it is not).

There's obviously something I did in order to make it behave last night, because
the *exact* same software setup was working fine then. I think something is no
longer being initialised correctly.
Comment 28 Chris Rankin 2006-09-02 08:07:58 UTC
This new initialisation problem is not being caused by the "lock-up" fix,
because I stumbled across it before last night. However, I attributed it to a
build problem at the time.

I have currently managed to make the problem go away by moving the new
r200_dri.so to one side, reinstalling the package's version, running celestia
with GL_ARB_vertex_program enabled, exiting celestia and then restoring the
patched r200_dri.so. After this, everything seems to be fine with both celestia
and WoW. (By which I mean WoW still has the misrendered artifact on the login
screen, but at least it doesn't lock up.)
Comment 29 Chris Rankin 2006-09-02 17:09:45 UTC
I'm really confused by the "flickering shadow" problem now, because I have just
tried to repeat my "work-around" of moving the original r200_dri.so back into
place, running celestia and then installing the patched r200_dri.so again, only
to find that it didn't work this time!

My latest theory for how I'm managing to work around this issue is that I need
to run celestia once with GL_ARB_vertex_program *disabled* via the celestia.cfg
file. After that, I can reenable GL_ARB_vertex_program and everything seems fine
until I shut down again.

Could the Xorg server be failing to initialise my graphics card completely after
a cold boot? I am using these FC6 development packages, which I have compiled
myself so that I can use them on FC5:

xorg-x11-server-Xorg-1.1.1-34.i386.rpm
xorg-x11-drv-ati-6.6.2-1.i386.rpm
Comment 30 Chris Rankin 2006-09-03 06:45:05 UTC
I have managed to get Warcraft to render perfectly in OpenGL mode, but only by
creating a filter shared object that stops Wine and Warcraft from seeing the
GL_ARB_vertex_program extension. I don't know who wrote this filter code, but
once compiled, the following command line works:

LD_PRELOAD="/path/to/shared/object/filter_ext.so" /opt/wine/World\ of\ Warcraft/WoW.exe -opengl
Comment 31 Chris Rankin 2006-09-03 06:55:46 UTC
Created attachment 6800
Source code for filter shared object

Compile with:

$ gcc -o filter_ext.so -Wall -Wextra -O2 -shared filter_ext.c -ldl

The developer who suggested trying this code to me originally told me to filter
GL_ARB_vertex_buffer_object. However, that didn't help Warcraft at all.
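For readers without the attachment, the core of such a filter can be sketched
as a string routine that drops one name from a space-separated extension list.
This is not the attached source: `filter_extension` is a hypothetical helper,
and the surrounding wiring (an interposed glGetString that looks up the real
function with dlsym and RTLD_NEXT and filters the GL_EXTENSIONS result) is
assumed and left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the space-separated list `in` to `out`, skipping every token equal to
 * `hide`. Returns 0 on success, -1 if `out` (of size `out_size`) is too small. */
int filter_extension(const char *in, const char *hide,
                     char *out, size_t out_size)
{
    assert(in != NULL && hide != NULL && out != NULL);
    size_t hide_len = strlen(hide);
    size_t used = 0;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    while (*in != '\0') {
        size_t len = strcspn(in, " ");            /* length of this token */
        int skip = (len == hide_len && strncmp(in, hide, len) == 0);

        if (len > 0 && !skip) {
            size_t need = len + (used > 0 ? 1 : 0);
            if (used + need + 1 > out_size)
                return -1;
            if (used > 0)
                out[used++] = ' ';
            memcpy(out + used, in, len);
            used += len;
            out[used] = '\0';
        }
        in += len;
        in += strspn(in, " ");                    /* skip separators */
    }
    return 0;
}
```

Tokens are compared whole, so hiding GL_ARB_vertex_program leaves
GL_ARB_vertex_buffer_object in the list even though the names share a prefix.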
Comment 32 Chris Rankin 2006-09-04 13:07:57 UTC
I can confirm that the "flickering shadow" problem only happens if I cold-boot
my machine. I.e. once I apply the fix (I run celestia with GL_ARB_vertex_program
ignored), the fix survives a warm boot.

The recent change to r200_state_init.c has not helped either the flickering, or
the mis-rendering of Warcraft's login screen. Also, wine's OpenGL developer has
managed to get this same screen to render correctly with his NVIDIA graphics
card, so I'm thinking that there's still something going wrong in r200_dri.so.

In his opinion:
"I would recommend to only disable vertex buffer objects. The extension provides
a different way to upload geometry data to the gpu. If it isn't there another
sometimes less efficient mechanism is used. The game will still use vertex
shaders and other 3d effects.

If disabling ARB_vertex_program helps as well it means that there's a bug in the
use of VBOs in ARB_vertex_program."
Comment 33 Chris Rankin 2006-09-04 16:56:25 UTC
I have discovered the R200_DEBUG environment variable, and have run celestia
both after a cold boot, and after having fixed the flickering problem as well.
Both debug logs complain of corrupt texture memory:

COLD:
...
leaving r200SanityCmdBuffer


driValidateTextureHeaps: blocks_in_mempool = 2, last_end = 58720256, p->ofs = 0
r200FlushCmdBufLocked: texture memory is inconsistent - expect mangled textures

Syncing in r200FlushCmdBufLocked
...

WARM:
...
leaving r200SanityCmdBuffer


driValidateTextureHeaps: blocks_in_mempool = 43, last_end = 58720256, p->ofs = 0
r200FlushCmdBufLocked: texture memory is inconsistent - expect mangled textures

Syncing in r200FlushCmdBufLocked

r200FlushCmdBufLocked from r200Flush
...

So the "cold" scenario has only 2 blocks in the memory pool before p->ofs = 0,
while the "warm" one has 43. And as for expecting mangled textures, could that
explain what is happening in WoW?

I have attached both R200 debug logs (compressed).
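To compare the two logs line by line, the numbers can be pulled out of each
driValidateTextureHeaps message with sscanf. This is only a small helper
sketch: `heap_report` and `parse_heap_report` are hypothetical names, and the
format string is copied from the two lines quoted above rather than from the
driver source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct heap_report {
    unsigned blocks_in_mempool;
    unsigned long last_end;
    unsigned long ofs;
};

/* Parse one driValidateTextureHeaps line; returns 1 on success, 0 otherwise. */
int parse_heap_report(const char *line, struct heap_report *out)
{
    assert(line != NULL && out != NULL);
    int n = sscanf(line,
                   "driValidateTextureHeaps: blocks_in_mempool = %u, "
                   "last_end = %lu, p->ofs = %lu",
                   &out->blocks_in_mempool, &out->last_end, &out->ofs);
    return n == 3;
}
```

Run over both logs, this would make it easy to list every point where the
block count differs between the cold and warm runs.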
Comment 34 Chris Rankin 2006-09-04 17:01:21 UTC
Created attachment 6813
R200_DEBUG=all for celestia 1.4.1 (COLD)

Immediately after a cold reboot:

R200_DEBUG=all celestia

with the following URL:
cel://Follow/Saturn/2006-08-29T09:35:30.41946?x=+0NycYGJxGNWDA&y=m2Y3cow4O1oC&z=kQLl9SiJ7Lma/////////w&ow=-0.442510&ox=-0.299259&oy=-0.843421&oz=0.057180&select=Saturn&fov=20.361984&ts=1.000000&ltd=0&rf=22423&lm=0
Comment 35 Chris Rankin 2006-09-04 17:13:07 UTC
Created attachment 6814
R200_DEBUG=all for celestia 1.4.1 (WARM)

Same as attachment 6813, except that "the fix" has already been applied.
Comment 36 Chris Rankin 2006-09-04 17:14:59 UTC
Created attachment 6815
R200_DEBUG=all for celestia 1.4.1 (THE FIX)

Same as attachments 6813 and 6814, except this is "the fix" actually being
applied.
Comment 37 Roland Scheidegger 2006-09-04 18:00:43 UTC
(In reply to comment #33)
> I have discovered the R200_DEBUG environment variable, and have run celestia
> both after a cold boot, and after having fixed the flickering problem as well.
> Both debug logs complain of corrupt texture memory:
> driValidateTextureHeaps: blocks_in_mempool = 2, last_end = 58720256, p->ofs = 0
> r200FlushCmdBufLocked: texture memory is inconsistent - expect mangled textures
I'll need to look into it, but I suspect it's more a bug in the validation code
itself than an actual texture memory inconsistency (the code is only used by
r200 currently and only if debugging is enabled).

> So the "cold" scenario has only 2 blocks in the memory pool before p->ofs = 0,
> while the "warm" one has 43. And as for expecting mangled textures, could that
> explain what is happening in WoW?
Doesn't look like corrupted textures to me.

> I have attached both R200 debug logs (compressed).
I'll look into it.
Comment 38 Roland Scheidegger 2006-09-05 12:45:24 UTC
(In reply to comment #32)
> I can confirm that the "flickering shadow" problem only happens if I cold-boot my
> machine. I.e. once I apply the fix (I run celestia with GL_ARB_vertex_program
> ignored), the fix survives a warm boot.
> 
> The recent change to r200_state_init.c has not helped either the flickering, or
> the mis-rendering of Warcraft's login screen.
I'd have been surprised if it had helped...
I can confirm the celestia problem, although here the shadow on the ring
doesn't flicker; it is just missing instead. However, this is actually not our
bug, it is a celestia bug. Celestia's shaders do not properly initialize the
output registers, so the q tex coordinate ends up uninitialized. This is an
important difference from NV_vertex_program, where you do not need to
initialize the output (and temp) regs. In particular, in rings_vp.arb only the
s and t (x,y) coords are written to tex coord sets 0 and 1; r and q (z,w) are
not touched. (Other shaders have the same problem, but it doesn't seem to cause
trouble there.) Apparently, if you use the multitexturing path, that coordinate
is written (even if it doesn't really work), and it seems that this state even
survives a reboot.
Here's the warning from the ARB_vp extension:
"If conventional OpenGL texture mapping operations are performed, a
program should always write to the "w" coordinate of any texture
coordinates result registers it needs to use.  Conventional OpenGL
texture accesses always use projective texture coordinates (e.g.,
s/q, t/q, r/q), even though q is almost always 1.0.  An undefined q
coordinate (coming from the "w" component of the result register)
may produce undefined coordinates on the texture lookup."
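The effect of the projective divide in that warning can be shown with a few
lines of arithmetic. This is an illustration only, not driver or celestia
code; `texcoord2` and `project_texcoord` are hypothetical names.

```c
#include <assert.h>

struct texcoord2 { float s, t; };

/* Projective lookup coordinates (s/q, t/q) from a result register.
 * `q` is the register's "w" component; callers must pass a non-zero value. */
struct texcoord2 project_texcoord(float s, float t, float q)
{
    assert(q != 0.0f);
    struct texcoord2 c = { s / q, t / q };
    return c;
}
```

With q = 1 the lookup uses s and t unchanged. Any other leftover value scales
both coordinates, so (0.5, 0.25) with q = 2 samples the texture at
(0.25, 0.125) instead, which would be consistent with a shadow texture ending
up in the wrong place or off the ring entirely.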

> Also, wine's OpenGL developer has
> managed to get this same screen to render correctly with his NVIDIA graphics
> card, so I'm thinking that there's still something going wrong in r200_dri.so.
> 
> In his opinion:
> "I would recommend to only disable vertex buffer objects. The extension provides
> a different way to upload geometry data to the gpu. If it isn't there another
> sometimes less efficient mechanism is used. The game will still use vertex
> shaders and other 3d effects.
> 
> If disabling ARB_vertex_program helps as well it means that there's a bug in the
> use of VBOs in ARB_vertex_program."
It's quite possible there are bugs related to vertex programs in either core
Mesa or the driver code, though those weird vertex prog related queries suggest
that something goes fundamentally wrong. You could compile mesa with debug
options (add -DDEBUG somewhere to your config); that would at least tell you
whether mesa thinks there was a user (i.e. WoW) error.
 
Comment 39 Chris Rankin 2006-09-05 14:09:31 UTC
Mesa does indeed believe that there are "user errors", although I'm guessing
that ATI's and NVIDIA's drivers disagree:

StringARB: [!!ARBvp1.0
PARAM c0 = { 255.002, 3, 0, 1 };
PARAM c94 = { 0, 1, 0, 0 };
PARAM c1 = { 2, 0, 0, 0 };
TEMP R0, R1, R2, R3;
ADDRESS A0;
ATTRIB v25 = vertex.texcoord[1];
ATTRIB v24 = vertex.texcoord[0];
ATTRIB v7 = vertex.attrib[7];
ATTRIB v17 = vertex.weight;
ATTRIB v18 = vertex.normal;
ATTRIB v16 = vertex.position;
PARAM c28[2] = { program.env  [28..29] };
PARAM c2[4] = { program.env  [2..5] };
PARAM c31[65]={program.env[31..95]};
        MOV result.fogcoord.x, c0.z;
        MUL R3, v7.zyxw, c0.x;
        FRC R0, R3;
        SLT R2, -R0, R0;
        SLT R1, R3, -R3;
        ADD R0, -R0, R3;
        MAD R0.z, R2, R1, R0;
        MUL R1.x, R0.z, c0.y;
        ARL A0.x, R1.x;
        DP4 R0.x, c31[A0.x], v16;
        DP4 R0.y, c31[A0.x + 1], v16;
        DP4 R0.z, c31[A0.x + 2], v16;
        MOV R0.w, c0.w;
        DP4 result.position.x, c2[0], R0;
        DP4 result.position.y, c2[1], R0;
        DP4 result.position.z, c2[2], R0;
        DP4 result.position.w, c2[3], R0;
        MOV R0, c28[0];
        ADD result.color.front.primary, R0, c28[1];
        MOV result.texcoord[0].xy, v24.xyxx;
        MOV result.texcoord[0].zw, c94.xyxy;
        MOV result.texcoord[1].xy, v25.xyxx;
        MOV result.texcoord[1].zw, c94.xyxy;
END
]
Mesa: User error: GL_INVALID_OPERATION in glProgramStringARB(syntax error)

StringARB: [!!ARBvp1.0
PARAM c0 = { 255.002, 3, 0, 1 };
PARAM c94 = { 0, 1, 0, 0 };
PARAM c1 = { 2, 0, 0, 0 };
TEMP R0, R1, R2, R3;
ADDRESS A0;
ATTRIB v25 = vertex.texcoord[1];
ATTRIB v24 = vertex.texcoord[0];
ATTRIB v7 = vertex.attrib[7];
ATTRIB v17 = vertex.weight;
ATTRIB v18 = vertex.normal;
ATTRIB v16 = vertex.position;
PARAM c28[2] = { program.env  [28..29] };
PARAM c2[4] = { program.env  [2..5] };
PARAM c31[65]={program.env[31..95]};
        MUL R3, v7.zyxw, c0.x;
        FRC R0, R3;
        SLT R2, -R0, R0;
        SLT R1, R3, -R3;
        ADD R0, -R0, R3;
        MAD R0.z, R2, R1, R0;
        MUL R1.x, R0.z, c0.y;
        ARL A0.x, R1.x;
        DP4 R0.x, c31[A0.x], v16;
        DP4 R0.y, c31[A0.x + 1], v16;
        DP4 R0.z, c31[A0.x + 2], v16;
        MOV result.fogcoord.x, R0.z;
        MOV R0.w, c0.w;
        DP4 result.position.x, c2[0], R0;
        DP4 result.position.y, c2[1], R0;
        DP4 result.position.z, c2[2], R0;
        DP4 result.position.w, c2[3], R0;
        MOV R0, c28[0];
        ADD result.color.front.primary, R0, c28[1];
        MOV result.texcoord[0].xy, v24.xyxx;
        MOV result.texcoord[0].zw, c94.xyxy;
        MOV result.texcoord[1].xy, v25.xyxx;
        MOV result.texcoord[1].zw, c94.xyxy;
END
]
Mesa: User error: GL_INVALID_OPERATION in glProgramStringARB(syntax error)

StringARB: [!!ARBvp1.0
PARAM c0 = { 255.002, 3, 0, 1 };
PARAM c94 = { 0, 1, 0, 0 };
PARAM c1 = { 2, 0, 0, 0 };
TEMP R0, R1, R2, R3, R4;
ADDRESS A0;
ATTRIB v25 = vertex.texcoord[1];
ATTRIB v24 = vertex.texcoord[0];
ATTRIB v7 = vertex.attrib[7];
ATTRIB v17 = vertex.weight;
ATTRIB v18 = vertex.normal;
ATTRIB v16 = vertex.position;
PARAM c28[2] = { program.env  [28..29] };
PARAM c17[11] = { program.env  [17..27] };
PARAM c10[7] = { program.env  [10..16] };
PARAM c2[4] = { program.env  [2..5] };
PARAM c31[65]={program.env[31..95]};
        MUL R3, v7.zyxw, c0.x;
        FRC R0, R3;
        SLT R2, -R0, R0;
        SLT R1, R3, -R3;
        ADD R0, -R0, R3;
        MAD R0.z, R2, R1, R0;
        MUL R0.x, R0.z, c0.y;
        ARL A0.x, R0.x;
        DP4 R0.y, c31[A0.x], v16;
        DP4 R0.z, c31[A0.x + 1], v16;
        DP4 R0.w, c31[A0.x + 2], v16;
        MOV result.fogcoord.x, R0.w;
        MOV R1.xyz, R0.yzww;
        MOV R1.w, c0.w;
        DP4 result.position.x, c2[0], R1;
        DP4 result.position.y, c2[1], R1;
        DP4 result.position.z, c2[2], R1;
        DP4 result.position.w, c2[3], R1;
        ADD R3.xyz, c17[4].xyzx, -R0.yzwy;
        DP3 R1.z, R3.xyzx, R3.xyzx;
        ADD R2.xyz, c17[5].xyzx, -R0.yzwy;
        DP3 R1.w, R2.xyzx, R2.xyzx;
        RSQ R1.x, R1.z;
        RSQ R1.y, R1.w;
        MUL R0.yz, R1.zzwz, R1.xxyx;
        MUL R0.yz, R0.yyzy, c17[9].xxyx;
        MAD R0.yz, R1.zzwz, c17[10].xxyx, R0.yyzy;
        ADD R0.yz, R0.yyzy, c17[8].xxyx;
        RCP R4.x, R0.y;
        RCP R4.y, R0.z;
        DP3 R0.y, c31[A0.x].xyzx, v18.xyzx;
        DP3 R0.z, c31[A0.x + 1].xyzx, v18.xyzx;
        DP3 R0.w, c31[A0.x + 2].xyzx, v18.xyzx;
        DP3 R0.x, R0.yzwy, R0.yzwy;
        RSQ R0.x, R0.x;
        MUL R0.xyz, R0.x, R0.yzwy;
        DP3 R1.z, R3.xyzx, R0.xyzx;
        DP3 R1.w, R2.xyzx, R0.xyzx;
        MUL R1.xy, R1.zwzz, R1.xyxx;
        MAX R1.xy, R1.xyxx, c0.z;
        MUL R4.xy, R1.xyxx, R4.xyxx;
        MUL R1, R0.xyzz, R0.yzzx;
        DP4 R3.x, c10[3], R1;
        DP4 R3.y, c10[4], R1;
        DP4 R3.z, c10[5], R1;
        MOV R2.xyz, R0.xyzz;
        MOV R2.w, c0.w;
        DP4 R1.x, c10[0], R2;
        DP4 R1.y, c10[1], R2;
        DP4 R1.z, c10[2], R2;
        ADD R1.xyz, R1.xyzx, R3.xyzx;
        MUL R0.w, R0.y, R0.y;
        MAD R0.x, R0.x, R0.x, -R0.w;
        MAD R0.xyz, c10[6].xyzx, R0.x, R1.xyzx;
        MAD R0.xyz, c17[0].xyzx, R4.x, R0.xyzx;
        MAD R0.xyz, c17[1].xyzx, R4.y, R0.xyzx;
        MIN R0.xyz, c0.w, R0.xyzx;
        MAX R1.xyz, c0.z, R0.xyzx;
        MOV R1.w, c0.w;
        MOV R0, c28[0];
        MAD result.color.front.primary, R0, R1, c28[1];
        MOV result.texcoord[0].xy, v24.xyxx;
        MOV result.texcoord[0].zw, c94.xyxy;
        MOV result.texcoord[1].xy, v25.xyxx;
        MOV result.texcoord[1].zw, c94.xyxy;
END
]
Mesa: User error: GL_INVALID_OPERATION in glProgramStringARB(syntax error)

StringARB: [!!ARBvp1.0
PARAM c0 = { 255.002, 3, 0, 1 };
PARAM c94 = { 0, 1, 0, 0 };
PARAM c1 = { 2, 0, 0, 0 };
TEMP R0, R1, R2, R3;
ADDRESS A0;
ATTRIB v25 = vertex.texcoord[1];
ATTRIB v24 = vertex.texcoord[0];
ATTRIB v7 = vertex.attrib[7];
ATTRIB v17 = vertex.weight;
ATTRIB v18 = vertex.normal;
ATTRIB v16 = vertex.position;
PARAM c6[4] = { program.env  [6..9] };
PARAM c28[2] = { program.env  [28..29] };
PARAM c2[4] = { program.env  [2..5] };
PARAM c31[65]={program.env[31..95]};
        MOV result.fogcoord.x, c0.z;
        MUL R3, v7.zyxw, c0.x;
        FRC R0, R3;
        SLT R2, -R0, R0;
        SLT R1, R3, -R3;
        ADD R0, -R0, R3;
        MAD R0.z, R2, R1, R0;
        MUL R1.x, R0.z, c0.y;
        ARL A0.x, R1.x;
        DP4 R0.x, c31[A0.x], v16;
        DP4 R0.y, c31[A0.x + 1], v16;
        DP4 R0.z, c31[A0.x + 2], v16;
        MOV R0.w, c0.w;
        DP4 result.position.x, c2[0], R0;
        DP4 result.position.y, c2[1], R0;
        DP4 result.position.z, c2[2], R0;
        DP4 result.position.w, c2[3], R0;
        MOV R0, c28[0];
        ADD result.color.front.primary, R0, c28[1];
        DP4 result.texcoord[0].x, c6[0], v24;
        DP4 result.texcoord[0].y, c6[1], v24;
        MOV result.texcoord[0].zw, c94.xyxy;
        MOV result.texcoord[1].xy, v25.xyxx;
        MOV result.texcoord[1].zw, c94.xyxy;
END
]
Mesa: User error: GL_INVALID_OPERATION in glProgramStringARB(syntax error)
Comment 40 Brian Paul 2006-09-05 14:53:31 UTC
Those vertex programs reference the vertex.weight attribute.  That's illegal if
the GL_EXT_vertex_weighting / GL_ARB_vertex_blend extensions are not supported
(as is the case with Mesa).  Another celestia bug.
Comment 41 Chris Rankin 2006-09-05 15:27:19 UTC
This isn't Mesa, it's Warcraft. So it sounds as if the Mesa drivers cannot allow
Warcraft to use GL_ARB_vertex_program until Mesa also supports the
GL_ARB_vertex_blend extension. (Assuming that you didn't mean that Mesa would
have to support both.)

According to this link, ATI's own drivers do not support GL_EXT_vertex_weighting:

http://www.phoronix.com/lch/?k=entry&l=102&t=ati%209200
Comment 42 Chris Rankin 2006-09-05 15:34:20 UTC
(In reply to comment #41)
> This isn't Mesa, it's Warcraft.

Oops, I meant it isn't celestia.
Comment 43 Brian Paul 2006-09-05 15:41:56 UTC
Mesa is not going to support GL_ARB_vertex_blend (or GL_EXT_vertex_weighting). 
Those extensions are pretty much obsolete now that we have vertex programs.

Warcraft should check if GL_ARB_vertex_blend or GL_EXT_vertex_weighting are
supported with glGetString() before defining vertex programs that use vertex.weight.

This probably can't be readily fixed in Warcraft (it's not open-source, right?)
so we may have to bend the rules in the Mesa parser.
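
For illustration, the check described above could look roughly like the following. This is only a sketch: has_extension() and may_use_vertex_weight() are hypothetical helper names, and the extension string is passed in as a parameter (in a real application it would come from glGetString(GL_EXTENSIONS)). The match is done on whole space-separated tokens, because a plain strstr() would also accept a longer extension name that merely starts with the one being looked for.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if `name` appears as a complete, space-separated token in
 * `extensions`, 0 otherwise. `extensions` is assumed to have the format
 * of the string returned by glGetString(GL_EXTENSIONS). */
static int has_extension(const char *extensions, const char *name)
{
    size_t len;
    const char *p;

    if (extensions == NULL || name == NULL)
        return 0;
    len = strlen(name);
    if (len == 0 || strchr(name, ' ') != NULL)
        return 0;

    p = extensions;
    while ((p = strstr(p, name)) != NULL) {
        /* Accept the hit only if it is delimited on both sides. */
        int starts = (p == extensions) || (p[-1] == ' ');
        int ends = (p[len] == ' ') || (p[len] == '\0');
        if (starts && ends)
            return 1;
        p += len;
    }
    return 0;
}

/* Hypothetical helper: may a vertex program reference vertex.weight? */
static int may_use_vertex_weight(const char *extensions)
{
    return has_extension(extensions, "GL_ARB_vertex_blend") ||
           has_extension(extensions, "GL_EXT_vertex_weighting");
}
```

An application would call may_use_vertex_weight() once after context creation and pick a program variant without the vertex.weight binding when it returns 0.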
Comment 44 Chris Rankin 2006-09-05 15:54:15 UTC
(In reply to comment #43)
> This probably can't be readily fixed in Warcraft (it's not open-source, right?)
Correct. Warcraft is not Open Source and almost certainly never will be.

> so we may have to bend the rules in the Mesa parser.
OK, sounds good, thanks. Especially if ARB vertex programs have a "booby
attribute" that Mesa will never support but whose careless use renders the
entire extension inactive ;-).
Comment 45 Ian Romanick 2006-09-05 16:08:46 UTC
(In reply to comment #41)
> This isn't Mesa, it's Warcraft. So it sounds as if the Mesa drivers cannot allow
> Warcraft to use GL_ARB_vertex_program until Mesa also supports the
> GL_ARB_vertex_blend extension. (Assuming that you didn't mean that Mesa would
> have to support both.)
> 
> According to this link, ATI's own drivers do not support GL_EXT_vertex_weighting:

Right, but they *do* support ARB_vertex_blend.  The odd thing is that Nvidia
doesn't support either extension, and I assume that WoW works on Nvidia
hardware.  I suspect that Nvidia might also be working around this problem *or*
WoW might use GLSL on Nvidia hardware.  The other possibility is that the
problems in those programs are generated by Wine, but that seems unlikely.
Comment 46 Brian Paul 2006-09-05 16:17:51 UTC
I've checked in a hack to Mesa so that vertex.weight can be used in a program
(though the results will be undefined).  Warcraft binds that attribute to a
named parameter but doesn't actually use it.

NVIDIA doesn't support either vertex blend/weight extension but allows the
vertex program to compile, BTW.
Comment 47 Chris Rankin 2006-09-05 16:36:21 UTC
Well, Warcraft writes these error messages now:

Mesa warning: Application error: vertex program uses 'vertex.weight' but
GL_ARB_vertex_blend not supported.
Mesa warning: Application error: vertex program uses 'vertex.weight' but
GL_ARB_vertex_blend not supported.
Mesa warning: Application error: vertex program uses 'vertex.weight' but
GL_ARB_vertex_blend not supported.
Mesa warning: Application error: vertex program uses 'vertex.weight' but
GL_ARB_vertex_blend not supported.

but unfortunately the login screen doesn't render any differently. Is there
anything in my .drirc which could affect things?

<driconf>
    <device screen="0" driver="r200">
        <application name="Default">
            <option name="force_s3tc_enable" value="false" />
            <option name="no_rast" value="false" />
            <option name="fthrottle_mode" value="2" />
            <option name="tcl_mode" value="3" />
            <option name="texture_depth" value="0" />
            <option name="def_max_anisotropy" value="1.0" />
            <option name="no_neg_lod_bias" value="false" />
            <option name="texture_units" value="6" />
            <option name="dither_mode" value="0" />
            <option name="hyperz" value="false" />
            <option name="round_mode" value="0" />
            <option name="allow_large_textures" value="1" />
            <option name="nv_vertex_program" value="false" />
            <option name="color_reduction" value="1" />
            <option name="vblank_mode" value="1" />
            <option name="texture_blend_quality" value="1.0" />
        </application>
    </device>
</driconf>
Comment 48 Chris Rankin 2006-09-05 17:09:41 UTC
(In reply to comment #46)
> I've checked in a hack to Mesa so that vertex.weight can be used in a program
> (though the results will be undefined).  Warcraft binds that attribute to a
> named parameter but doesn't actually use it.

FWIW, I have managed to make Warcraft's login screen render correctly, if
slowly, by replacing the "return 1" at arbprogparse.c:1502 with "break". So we
*have* identified the problem after all, if not the best solution ;-).
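
To make the difference between the two behaviours concrete, here is a minimal sketch of a "strict" versus "lenient" attribute check. It is not the actual code from arbprogparse.c; bind_vertex_attrib(), its parameters and the PARSE_* constants are made up for illustration. The strict path corresponds to returning an error from the parser (the program is rejected), the lenient path to warning and carrying on.

```c
#include <stdio.h>
#include <string.h>

enum { PARSE_OK = 0, PARSE_ERROR = 1 };

/* Hypothetical stand-in for the step that binds an ATTRIB declaration.
 * `binding` is the right-hand side of the declaration, for example
 * "vertex.weight". `have_vertex_blend` says whether the driver exposes
 * GL_ARB_vertex_blend. With `lenient` set, an unsupported binding only
 * produces a warning; otherwise the whole program is rejected. */
static int bind_vertex_attrib(const char *binding, int have_vertex_blend,
                              int lenient, int *warnings)
{
    if (strcmp(binding, "vertex.weight") == 0 && !have_vertex_blend) {
        fprintf(stderr, "warning: vertex program uses 'vertex.weight' but "
                        "GL_ARB_vertex_blend not supported.\n");
        (*warnings)++;
        if (!lenient)
            return PARSE_ERROR;  /* strict: caller reports a syntax error */
        /* lenient: fall through and accept; the attribute value is undefined */
    }
    return PARSE_OK;
}
```

In lenient mode the program compiles, but anything that actually reads the attribute gets undefined data, so this only helps applications that declare vertex.weight without using it.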
Comment 49 Chris Rankin 2006-09-05 17:14:22 UTC
Well, I say that the login screen now renders "slowly", but I suppose I really
mean that the animations (fog, swirling lights, two small fires) are a lot
jerkier with the "break" statement than on the Direct3D screen. The OpenGL
screen itself appears promptly enough, however.
Comment 50 Roland Scheidegger 2006-09-05 17:26:32 UTC
(In reply to comment #49)
> Well, I say that the login screen now renders "slowly", but I suppose I really
> mean that the animations (fog, swirling lights, two small fires) are a lot
> jerkier with the "break" statement than on the Direct3D screen. The OpenGL
> screen itself appears promptly enough, however.
Defining vertex.weight as an input is probably enough to trigger a tcl fallback
in the r200 driver, so the vert prog would be handled in software (which is slow,
unless you set MESA_EXPERIMENTAL, which might be buggy). Try R200_DEBUG=fall
(preferably together with tcl_mode=1 to avoid reporting the vtxfmt fallbacks
too; the vtxfmt code won't do anything useful with optimized apps anyway) to see
if that's the case.
Comment 51 Chris Rankin 2006-09-05 18:11:30 UTC
Well deduced:

Mesa warning: Application error: vertex program uses 'vertex.weight' but
GL_ARB_vertex_blend not supported.
can't handle vert prog inputs 0x800307
R200 begin tcl fallback Vertex program
Mesa warning: Application error: vertex program uses 'vertex.weight' but
GL_ARB_vertex_blend not supported.
can't handle vert prog inputs 0x800307
Mesa warning: Application error: vertex program uses 'vertex.weight' but
GL_ARB_vertex_blend not supported.
can't handle vert prog inputs 0x800307
Mesa warning: Application error: vertex program uses 'vertex.weight' but
GL_ARB_vertex_blend not supported.
can't handle vert prog inputs 0x800307
R200 end tcl fallback Vertex program
R200 end tcl fallback
R200 begin tcl fallback Vertex program
R200 end tcl fallback Vertex program
R200 end tcl fallback
R200 begin tcl fallback Vertex program
R200 end tcl fallback Vertex program
R200 end tcl fallback
R200 begin tcl fallback Vertex program
R200 end tcl fallback Vertex program
R200 end tcl fallback
R200 begin tcl fallback Vertex program
R200 end tcl fallback Vertex program
R200 end tcl fallback
R200 begin tcl fallback Vertex program
R200 end tcl fallback Vertex program
R200 end tcl fallback
R200 begin tcl fallback Vertex program
R200 end tcl fallback Vertex program
R200 end tcl fallback
R200 begin tcl fallback Vertex program
R200 end tcl fallback Vertex program
R200 end tcl fallback
R200 begin tcl fallback Vertex program
R200 end tcl fallback Vertex program
R200 end tcl fallback
...
Comment 52 Roland Scheidegger 2006-09-05 18:24:28 UTC
Actually, it will definitely cause a tcl fallback, since it uses generic
attribs, which right now aren't supported (it's on my todo list). And
unfortunately, it actually uses them too, not just in the declaration... A pity,
since at least one of them would be quite a good test: it's the first
"real-life" vert prog I've encountered with over 64 instructions, with a good
instruction mix, even using ARL and outputting to fogcoord (which is probably
broken). Running this in software with more than about 3 vertices will
definitely be slow.
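
For reference, the input mask from the fallback message (0x800307) can be decoded with something like the sketch below. The bit layout is an assumption on my part (bit 0 position, bit 1 weight, bit 2 normal, bits 8..15 texcoords, bits 16 and up generic attribs), and describe_inputs() and uses_generic_attribs() are made-up names, not driver functions. The layout does at least agree with the ATTRIB declarations in the program quoted above.

```c
#include <stddef.h>
#include <stdio.h>

#define GENERIC0_BIT 16  /* assumed position of the first generic attrib */

/* Write a space-separated list of the inputs set in `mask` into `out`.
 * Returns the number of characters written (excluding the terminator).
 * If the buffer is too small, the list is cut at a token boundary. */
static size_t describe_inputs(unsigned long mask, char *out, size_t out_size)
{
    static const char *const fixed[8] = {
        "position", "weight", "normal", "color0",
        "color1", "fogcoord", "index6", "index7"
    };
    size_t used = 0;
    int bit;

    if (out_size == 0)
        return 0;
    out[0] = '\0';

    for (bit = 0; bit < 32; bit++) {
        const char *sep = used ? " " : "";
        int n;

        if (!(mask & (1UL << bit)))
            continue;
        if (bit < 8)
            n = snprintf(out + used, out_size - used, "%s%s", sep, fixed[bit]);
        else if (bit < GENERIC0_BIT)
            n = snprintf(out + used, out_size - used, "%stexcoord[%d]", sep, bit - 8);
        else
            n = snprintf(out + used, out_size - used, "%sattrib[%d]", sep,
                         bit - GENERIC0_BIT);
        if (n < 0 || (size_t)n >= out_size - used) {
            out[used] = '\0';  /* drop the partially written token */
            break;
        }
        used += (size_t)n;
    }
    return used;
}

/* Non-zero if any generic attrib bit is set in the mask. */
static int uses_generic_attribs(unsigned long mask)
{
    return (mask & 0xFFFF0000UL) != 0;
}
```

Under that assumed layout, 0x800307 decodes to position, weight, normal, texcoord[0], texcoord[1] and attrib[7], so the one generic attrib in the mask is vertex.attrib[7], which the program reads in its very first MUL.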
Comment 53 Chris Rankin 2006-09-06 01:02:08 UTC
(In reply to comment #52)
> Actually, it will definitely cause a tcl fallback, since it uses generic
> attribs, which right now aren't supported (it's on my todo list).

Oh well - bang goes OpenGL WoW for me, then (for now anyway). I guess I'll
either have to use the filter object to hide GL_ARB_vertex_program from WoW, or
wait until ATI fixes their Linux driver to recognise a DFP monitor attached to
an rv280.

Presumably, Wine's Direct3D layer can't be using GL_ARB_vertex_program either.

Thanks for ploughing through all this.
Comment 54 Roland Scheidegger 2006-09-06 05:55:13 UTC
Put this to rest. The original problem has long been fixed, along with unrelated
stuff.
Comment 55 Chris Rankin 2006-09-06 14:59:39 UTC
*** Bug 8027 has been marked as a duplicate of this bug. ***
Comment 56 Adam Jackson 2009-08-24 12:24:15 UTC
Mass version move, cvs -> git

