Created attachment 25021
Bug detailed description:
Start X and run ut2004-demo. With both EXA and UXA it hangs after about 1 minute.
2. Run the ut2004 demo (benchmark.sh)
Isn't this like bug#19231? (just needs shorter time to reproduce?)
Does this impact the auto benchmarking?
(In reply to comment #1)
> Isn't this like bug#19231? (just needs shorter time to reproduce?)
> Does this impact the auto benchmarking?
No. What I mean is just its benchmark. Maybe it is caused by its kernel.
Created attachment 25023
gpu dump info
Jian, does this still exist with Q2 RC1 package?
Eric, this blocks our nightly performance regression testing on this machine (for stable release branch).
(In reply to comment #4)
> Jian, does this still exist with Q2 RC1 package?
> Eric, this blocks our nightly performance regression testing on this machine
> (for stable release branch).
Yes, it still exists with the Q2 RC1 package. Its GPU dump is attached.
Created attachment 26793
gpu dump after ut2004 hangs X
Spent a bunch of time today looking into this.
Running with always_flush_batch=true always_flush_cache=true INTEL_DEBUG=sync narrowed it down to a small batchbuffer in the dump, where everything looked sane as far as I can tell. (New bits are pushed to intel_gpu_dump for better decoding, too.)
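For anyone trying to repeat that narrowing step, here is a minimal sketch of launching a program with those three settings. It assumes, as the phrasing of the comment suggests, that all three are passed as environment variables. The helper names debug_env and run_with_debug are made up for illustration, and the path to benchmark.sh is hypothetical, since the report only names the script.

```python
import os
import subprocess

# Settings quoted in the comment above. Assumption: all three are read
# from the process environment by the driver.
DEBUG_SETTINGS = {
    "always_flush_batch": "true",
    "always_flush_cache": "true",
    "INTEL_DEBUG": "sync",
}


def debug_env(base=None):
    """Return a copy of the environment with the debug settings applied."""
    env = dict(os.environ if base is None else base)
    env.update(DEBUG_SETTINGS)
    return env


def run_with_debug(command):
    """Run command (a list of arguments) under the debug environment."""
    return subprocess.run(command, env=debug_env(), check=False)


# Hypothetical usage; adjust the path to wherever the demo script lives:
#   run_with_debug(["./benchmark.sh"])
```

Copying the environment rather than modifying os.environ keeps the settings local to the launched process, so other programs started from the same session are not slowed down by the forced flushing.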
My current guess is that the VS program is failing (see updated intel_gpu_top output at hang time) and the instruction parser error is a red herring.
Today I found that the G965 doesn't exhibit the problem while the G/GM45 does. Based on that, other things that don't help:
- Switching G4x to use only 256 URB register pairs
- Enabling the -RHW workaround
- Cutting max VS threads to 16 like 965
- Cutting max WM threads to 32 like 965
What did help was forcing the minimum URB allocation. More experimenting with this tomorrow.
*** Bug 19231 has been marked as a duplicate of this bug. ***
Author: Eric Anholt <firstname.lastname@example.org>
Date:   Tue Jun 30 14:26:06 2009 -0700

    i965: Increase G4X default VS URB allocation to actually allow 32 threads.

    This improves the performance of my GLSL demo by 30%.  It also fixes the
    VS deadlock that ut2004 had, for reasons I can't explain.  Bug #21330.
This will be cherry-picked to 7.5 if the next set of regression testing comes out OK.
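To illustrate why a larger URB allocation can "actually allow 32 threads", here is a rough sketch of the arithmetic. It is a model under stated assumptions, not the driver code. It assumes the usable VS thread count is the number of VS URB entries divided by a fixed per-thread requirement, clamped to the chip maximum. The factor of 2 entries per thread and the entry counts in the usage note are illustrative and are not given in this report. Only the 32-thread G4X maximum comes from the commit subject.

```python
def max_vs_threads(nr_vs_urb_entries, chip_max_threads=32, entries_per_thread=2):
    """Estimate usable VS threads for a given URB allocation.

    Assumed model: every VS thread needs entries_per_thread URB entries,
    and the result is clamped to the range [1, chip_max_threads].
    """
    threads = nr_vs_urb_entries // entries_per_thread
    return max(1, min(threads, chip_max_threads))
```

Under this model an allocation of 32 entries gives max_vs_threads(32) == 16, so half of the 32 threads stay idle, while 64 entries gives max_vs_threads(64) == 32. That would be consistent with the performance gain described in the commit message, but it does not explain the deadlock itself.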
It works well on the master branch now. Verified.
Eric, can you cherry-pick this fix to the Mesa 7.5 branch?