Summary: | Insanely slow startup on Radeon X300 SE | ||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | xorg | Reporter: | Alon Ziv <alonz> | ||||||||||||||||||
Component: | Driver/Radeon | Assignee: | xf86-video-ati maintainers <xorg-driver-ati> | ||||||||||||||||||
Status: | RESOLVED FIXED | QA Contact: | Xorg Project Team <xorg-team> | ||||||||||||||||||
Severity: | normal | ||||||||||||||||||||
Priority: | high | ||||||||||||||||||||
Version: | 7.0.0 | ||||||||||||||||||||
Hardware: | x86 (IA32) | ||||||||||||||||||||
OS: | Linux (All) | ||||||||||||||||||||
Whiteboard: | |||||||||||||||||||||
i915 platform: | i915 features: | ||||||||||||||||||||
Attachments: |
|
Description
Alon Ziv
2006-08-04 13:34:27 UTC
Please post your xorg.conf and xorg log. Do you have the same problem if you use the newest version of the ati xorg driver (6.6.1)? Created attachment 6476 [details]
xorg.conf
My current xorg.conf (redirecting ATI drivers to home directory)
Created attachment 6477 [details]
Latest Xorg.log
Latest Xorg.log
Note that the ATI driver is the bog-standard version 6.6.0--it incorrectly
identifies itself as version 6.5.7
(In reply to comment #1) > Do you have the same problem if you use the newest version of the ati xorg > driver (6.6.1)? I cannot build version 6.6.1 of the driver with my current version of X (7.0.0), apparently due to API changes related to xf86_libc.h. (In reply to comment #4) > I cannot build version 6.6.1 of the driver with my current version of X (7.0.0), > apparently due to API changes related to xf86_libc.h. Current xf86-video-ati git should build and work with 7.0. If it doesn't, you could try the ati-1-0-branch. If the problem persists, please try to isolate where exactly it's spending so much time by adding additional output between the lines you mentioned. (In reply to comment #5) > Current xf86-video-ati git should build and work with 7.0. Well, it doesn't. Dave Aielie's latest cleanup patches (commit 58c6aac0) broke it; I succeeded in compiling the latest by reverting some of these (removing some extra #include's and re-introducing the #undef's at the head of theatre_detect.c). Should I open a separate bug on this? (I don't think so, but...) And now to actually testing this latest driver... (In reply to comment #5) > If the problem persists, please try to isolate > where exactly it's spending so much time by adding additional output between the > lines you mentioned. OK, done that... The culprit is RADEONWaitForVerticalSync2(). I do not have a second monitor (as correctly identified by the driver--the log lists "Monitor -- NONE" for the secondary), and yet this function waits for its vsync until it times out--2000 calls to usleep(1). And given that my kernel is configured with HZ=250, every such usleep(1) appears to end up using two time slices of 4ms each... Resulting in a total wait of 16 seconds. I am not sure how to correctly solve this; as an experiment, I have commented out this call, and indeed the radeon driver's startup time went down to 1.5 seconds (total X startup time went down to ~5 seconds). I'm afraid it's not so simple; this is used to make sure the secondary display controller no longer makes any requests to the memory controller, which could theoretically happen even if no secondary display is actually being used. If we can't come up with a better test for the secondary controller triggering vertical blank events, the loop should probably time out after a fixed amount of time, say 100ms. Sorry about the phenomenal bug spam, guys. Adding xorg-team@ to the QA contact so bugs don't get lost in future. OK, I'm finally back to hacking on this bug... I am attaching a different take on the fix: this time, I simply change the loop in RADEONWaitForVerticalSync2 so it actually waits only RADEON_TIMEOUT microseconds. The older code assumed a usleep(1) really sleeps for 1us, which is very wrong--the minimum actual sleep period on Linux appears to be around 2ms. IMO all wait loops in this code (and elsewhere...) should change to this kind of waiting; as they stand they all wait for far longer than designed (or desired). Created attachment 10834 [details] [review] Sane looping patch + if (tv2.tv_sec - tv.tv_sec > RADEON_TIMEOUT/1000000 || + (tv2.tv_sec - tv.tv_sec == 0 && tv2.tv_usec > tv.tv_usec)) The second part of this would seem to generally pass as soon as tv2.tv_usec != tv.tv_usec. It might be better to compute the usec difference first and then test that. Created attachment 10871 [details] [review] Sane looping patch, take 2 OK, this version of the patch abstracts timeout handling a bit better... I also changed the timeout to 200ms, instead of 2 seconds. IMO even this is too much (I can't see any hardware taking more than ~10ms to respond), but whatever. (In reply to comment #13) > OK, this version of the patch abstracts timeout handling a bit better... I'm afraid it still doesn't handle tv_usec wraparounds properly though. There are also some minor cosmetic concerns such as using uppercase for the names of proper functions. > I also changed the timeout to 200ms, instead of 2 seconds. IMO even this is > too much (I can't see any hardware taking more than ~10ms to respond), but > whatever. In this particular case, waiting for vertical blank can take up to 16 ms (at 60 Hz refresh rate). Other cases such as waiting for engine idle might legitimately take much longer. I'd suggest passing the timeout value as an argument to radeon_timedout() (and using an internal struct timeval in both functions so not every caller has to provide its own). I believe you are wrong regarding wraparound handling: wraparound is handled in RADEON_INIT_TIMEOUT, and the following comparison (in RADEON_TIMEDOUT) is modeled exactly after the timercmp #define from /usr/include/sys/time.h. Re usage of an internal timeval value: it would need to be a global variable, as it is shared between the two functions. Such a change will also break the possibility to use two timeout values in parallel (e.g. nested). IMO it is a bad idea. I am attaching a further-cleaned-up patch. Created attachment 10882 [details] [review] Take 3 (In reply to comment #15) > I believe you are wrong regarding wraparound handling: wraparound is handled in > RADEON_INIT_TIMEOUT, and the following comparison (in RADEON_TIMEDOUT) is > modeled exactly after the timercmp #define from /usr/include/sys/time.h. Ah yes, I see now. > Re usage of an internal timeval value: it would need to be a global variable, > as it is shared between the two functions. That shouldn't be a problem as long as the timeout is handled within a single driver entrypoint, as the X server is single threaded. > Such a change will also break the possibility to use two timeout values in > parallel (e.g. nested). Is there any existing such case, or any expectation it'll be useful in the future? OTOH, I guess leaving the struct timeval argument and passing in a global or RADEONInfoRec member by default wouldn't be too bad. > I am attaching a further-cleaned-up patch. I don't see any difference to take 2, did you upload the right file? Created attachment 10888 [details] [review] Take 3, for real this time :-/ Ouch... I goofed with the previous attachment. This is the real take 3. Created attachment 10909 [details] [review] Take 4 Further cleanups: RADEONWaitForVerticalSync() and RADEONWaitForVerticalSync2() are now handled with the same logic. Also documented the rationale behind the patch and added a signoff line (I know, X.org doesn't use these... But I do.) Looks good, I'd like to push it but there are still some minor issues: * The minimum actual sleep time probably depends on CONFIG_HZ and related factors. * The timeout value in RADEONWaitForVerticalSync() is missing a digit. Might be worth using a define such as RADEON_VSYNC_TIMEOUT to avoid mistakes like this. * In standard cases like this, I still think it would be better to use a global struct timeval or one in RADEONInfoRec instead of adding a struct timeval declaration for each caller. Do you want to do another spin, or are you fine with me pushing with these fixed? Ah... I would better leave it in your capable hands now; it's probably easier for you, being more familiar with this code base. After all, I have proved my point, in the proper open-source way (i.e., with code) :) Created attachment 10918 [details] [review] Take 5: this actually works :-/ Take 4 didn't actually work (I found out I didn't install it, by accident). Spelling error :-( This one does a few other cleanups as well, e.g. by using symbolic constants for the timeout values. I still use a separate timeval struct, though; I detest globals too much. ping? Pushed (with some cosmetical changes to the log and a compile warning fix) to GIT master and 6.7 branch, thanks. Am I imagining things, or is the fix in the committed version broken? +#include <sys/time.h> /* For +#include <time.h> * gettimeofday() */ The second #include is entirely commented out... Right, so the #include <time.h> seems superfluous... I've removed it on the master branch. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.