Bug 95192 - [EXA] X server intermittently hangs for short moments with rv710 card
Status: RESOLVED WORKSFORME
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/Radeon
Version: git
Hardware: x86 (IA32) Linux (All)
Importance: medium normal
Assignee: xf86-video-ati maintainers
QA Contact: Xorg Project Team
Reported: 2016-04-28 16:55 UTC by aceman
Modified: 2016-09-17 10:44 UTC


Attachments
sample Xorg.log (32.16 KB, text/plain)
2016-04-30 10:31 UTC, aceman

Description aceman 2016-04-28 16:55:55 UTC
For at least a year now I have been observing random temporary hangs of the Xorg server and desktop. The hangs last 1-10 seconds, after which everything continues normally. It never freezes permanently. The mouse and keyboard do not react during a hang (the cursor does not move), but the presses are queued and executed once the server unfreezes. The hangs happen several times per hour.

This has been happening for a long time, with many versions of the Xorg server as they arrive (from the distro) and with many kernels (custom compiled).

My observation is that this mostly happens during large graphics operations that affect a large number of pixels, e.g. switching virtual desktops or dragging large bitmaps in Firefox (think panning a map on openstreetmap.org).

My setup:
-an AMD CPU desktop, not a laptop
-RV710 card (Radeon HD 4350)
-Xorg 1.18.3 (and older)
-Mesa from git (custom compiled)
-radeon DDX driver from git (custom compiled)
-old KDE 3 desktop (from distro)
-linux kernel 4.5.1 and older (custom compiled)
-radeon.modeset=1 radeon.audio=0 radeon.dynpm=1 radeon.dpm=1 on kernel command line

I've tried the various Vsync options but nothing seems to fix these hangs completely.

My latest hint is that since I set AccelMethod to glamor in xorg.conf, I have NOT observed the hangs for several days. This is NOT the default on RV710; I had used EXA until now. I think EXA's job is to accelerate those movements of large bitmaps.
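
For readers who want to try the same switch, here is a minimal sketch of the kind of Device section this refers to, rendered by a small Python helper so the two acceleration methods are easy to swap. The helper name `radeon_device_section` and the identifier string "Radeon" are made up for this example; `Driver "radeon"` and `Option "AccelMethod"` are the driver name and option discussed in this report. Where the snippet goes (xorg.conf itself or a file under /etc/X11/xorg.conf.d/) depends on the distro and is an assumption here.

```python
def radeon_device_section(accel_method="glamor", identifier="Radeon"):
    """Render an xorg.conf Device section for the radeon DDX driver.

    accel_method: "glamor" or "EXA" (the two methods compared in this report).
    identifier:   arbitrary label, chosen only for this example.
    """
    lines = [
        'Section "Device"',
        f'    Identifier "{identifier}"',
        '    Driver "radeon"',
        f'    Option "AccelMethod" "{accel_method}"',
        'EndSection',
    ]
    return "\n".join(lines) + "\n"


# Print the glamor variant; pass "EXA" to render the EXA variant instead.
print(radeon_device_section("glamor"))
```

The X server has to be restarted for the change to take effect, and the Xorg log should then indicate which acceleration method was actually initialized.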
Comment 1 Michel Dänzer 2016-04-30 06:22:29 UTC
Please attach the Xorg log file and output of dmesg, preferably captured after at least one such hang occurred.
Comment 2 aceman 2016-04-30 10:27:45 UTC
Yes, I have watched those files over the years, but in these cases there is nothing in the logs.

On other occasions (e.g. when testing vdpau) there were GPU freezes where the driver had to reset the GPU, and that was in the logs (or at least the freeze was much longer and the screen went black). So I can distinguish the cases. But for these short hangs, there is nothing.
Comment 3 aceman 2016-04-30 10:31:23 UTC
Created attachment 123363
sample Xorg.log

This is a sample Xorg.log file with EXA enabled. It was NOT captured after a hang, but I think there is no difference. Also, it has TearFree enabled, but that does not affect the hangs.
Comment 4 Michel Dänzer 2016-05-01 09:30:11 UTC
First of all, I consider using glamor a very good workaround. In fact, I'm planning to try enabling glamor by default on >= R(V)6xx once DRI3 is enabled by default (which I'm planning to try soon).

If the mouse cursor is frozen as well during the hangs, it indicates that the Xorg process is hung so badly that it can't even react to SIGIO generated by the mouse input device (actually I'm not 100% sure that's still true with the mouse driver; does the mouse cursor also freeze during the hangs using the evdev driver instead of the mouse driver?). It might be interesting to see a backtrace of the Xorg process during a freeze.

Does the problem also occur without the kernel parameter vmalloc=384000000? And without any non-default options in xorg.conf?

P.S. It would probably have been easier to identify the change or at least component which introduced the problem if it was reported sooner after it started happening.
Comment 5 aceman 2016-05-01 15:41:36 UTC
(In reply to Michel Dänzer from comment #4)
> First of all, I consider using glamor a very good workaround. In fact, I'm
> planning to try enabling glamor by default on >= R(V)6xx once DRI3 is
> enabled by default (which I'm planning to try soon).

But on such a low-end card, wouldn't EXA always be faster than anything OpenGL?

> If the mouse cursor is frozen as well during the hangs, it indicates that
> the Xorg process is hung so badly that it can't even react to SIGIO
> generated by the mouse input device (actually I'm not 100% sure that's still
> true with the mouse driver; does the mouse cursor also freeze during the
> hangs using the evdev driver instead of the mouse driver?). It might be
> interesting to see a backtrace of the Xorg process during a freeze.

Maybe, but I can't produce one, as I can't do anything during the freeze. And the freeze is not long enough for me to switch to a virtual terminal and run a trace.

> Does the problem also occur without the kernel parameter vmalloc=384000000?
> And without any non-default options in xorg.conf?

I'll try.

> P.S. It would probably have been easier to identify the change or at least
> component which introduced the problem if it was reported sooner after it
> started happening.

Sure. But I tried to find out what was causing it and ran various lengthy experiments before I finally filed the bug.

E.g. I have an overclocked CPU on water cooling, and because of this the kernel disables the TSC clock source. So the current clock source is HPET (which is supposedly slow). But I also tried acpi_pm without success.

Also of interest could be that the kernel is 64-bit, but all of the userland (X.org and Mesa too) is 32-bit.
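
For anyone wanting to check the same thing on their own machine, the clock source the kernel is currently using, and the ones it considers available, can be read from sysfs. A minimal sketch, assuming the standard Linux sysfs layout under /sys/devices/system/clocksource/clocksource0; the helper name `read_clocksource` is made up for this example.

```python
from pathlib import Path

# Standard sysfs location for clock source information on Linux.
SYSFS_CLOCKSOURCE = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(base=SYSFS_CLOCKSOURCE):
    """Return (current, available) clock sources as reported by sysfs.

    current:   name of the clock source in use, e.g. "hpet"
    available: list of clock sources the kernel offers, e.g. ["hpet", "acpi_pm"]
    """
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available
```

Calling `read_clocksource()` on a system in the state described above would be expected to report hpet as the current source, with tsc missing from the available list if the kernel has marked it unstable.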
Comment 6 Michel Dänzer 2016-05-02 01:45:36 UTC
(In reply to aceman from comment #5)
> (In reply to Michel Dänzer from comment #4)
> > In fact, I'm planning to try enabling glamor by default on >= R(V)6xx once
> > DRI3 is enabled by default (which I'm planning to try soon).
> 
> But on such a low card, wouldn't EXA be always faster than anything openGL?

No, why? glamor is generally on par with EXA and for some operations such as text rendering (which is among the most important in practice) significantly faster. If there are any cases where glamor is significantly slower than EXA for you, please file bug reports against the Server/Acceleration/glamor component.


> > It might be interesting to see a backtrace of the Xorg process during a
> > freeze.
> 
> Maybe, but I can't produce it as I can't do anything during the freeze. And
> it is not that long enough that I would manage to switch to virtual terminal
> and run some trace.

If you can ssh in from another machine, that might allow you to get more information about the hangs.
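
Because the hangs only last a few seconds, attaching a debugger by hand is hard to do in time. One possible approach is to start a small watchdog in the ssh session beforehand and let it grab the backtrace automatically. The following is only a sketch under several assumptions that do not come from this report: `xset q` is used as a cheap round trip to the X server (so DISPLAY and X authorization must be set up in the ssh session), gdb is installed and allowed to attach to the Xorg process (typically as root), and the function names are made up for this example.

```python
import subprocess
import time


def x_request_stalled(threshold_s, probe_cmd=("xset", "q")):
    """Return True if one round trip to the X server exceeds threshold_s seconds."""
    try:
        subprocess.run(list(probe_cmd), stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, timeout=threshold_s)
    except subprocess.TimeoutExpired:
        return True  # the server did not answer in time
    return False


def backtrace_command(pid):
    """Build a gdb batch invocation that dumps backtraces of all threads of pid."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def capture_backtrace(pid, log_path):
    """Append a gdb backtrace of the given process to log_path."""
    with open(log_path, "a") as log:
        subprocess.run(backtrace_command(pid), stdout=log,
                       stderr=subprocess.STDOUT)


def watch(is_stalled, capture, interval_s=0.5, max_checks=None):
    """Poll is_stalled(); call capture() whenever a stall is seen.

    Runs forever unless max_checks is given. Returns the number of captures.
    """
    captures = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        if is_stalled():
            capture()
            captures += 1
        checks += 1
        time.sleep(interval_s)
    return captures
```

With the PID of the X server in `xorg_pid` (for example from `pidof Xorg`), it could be wired up as `watch(lambda: x_request_stalled(1.0), lambda: capture_backtrace(xorg_pid, "xorg-hang.log"))`. If the server has already recovered by the time gdb attaches, the backtrace will only show the idle main loop, so the threshold should be well below the typical hang duration.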


> Also of interest could be that the kernel is 64bit, but all the userland
> (X.org and mesa too) are 32bit.

Any chance you could try if the problem also occurs with 64-bit Xorg?
Comment 7 aceman 2016-05-03 20:11:41 UTC
(In reply to Michel Dänzer from comment #6)
> > But on such a low card, wouldn't EXA be always faster than anything openGL?
> 
> No, why? glamor is generally on par with EXA and for some operations such as
> text rendering (which is among the most important in practice) significantly
> faster. If there are any cases where glamor is significantly slower than EXA
> for you, please file bug reports against the Server/Acceleration/glamor
> component.

So far I've seen that fullscreen glxgears (full HD) does not reach the usual 60 fps with glamor that it reaches with EXA. With glamor it is about 30 fps. But I still need to determine whether that is caused by TearFree.

> If you can ssh in from another machine, that might allow you to get more
> information about the hangs.

I am not able to ssh to the machine and run a tracer in 10 seconds ;)

> > Also of interest could be that the kernel is 64bit, but all the userland
> > (X.org and mesa too) are 32bit.
> 
> Any chance you could try if the problem also occurs with 64-bit Xorg?

That would require upgrading the whole distro. But I am planning to do that within the next 1-2 months.

>> Does the problem also occur without the kernel parameter vmalloc=384000000?
>> And without any non-default options in xorg.conf?

Yes, it does. I removed these non-default options, but the hangs are back (the default switched from glamor back to EXA).
Comment 8 aceman 2016-09-17 10:44:51 UTC
(In reply to aceman from comment #7)
> (In reply to Michel Dänzer from comment #6)
> > > But on such a low card, wouldn't EXA be always faster than anything openGL?
> > 
> > No, why? glamor is generally on par with EXA and for some operations such as
> > text rendering (which is among the most important in practice) significantly
> > faster. If there are any cases where glamor is significantly slower than EXA
> > for you, please file bug reports against the Server/Acceleration/glamor
> > component.
> 
> So far I've seen glxgears fullscreen (full HD) does not manage the normal
> 60fps that EXA does. With glamor it is like 30fps. But I need to determine
> if it isn't caused by Tearfree.

After the upgrade mentioned below, glamor with TearFree now also manages 60 fps.

> > > Also of interest could be that the kernel is 64bit, but all the userland
> > > (X.org and mesa too) are 32bit.
> > 
> > Any chance you could try if the problem also occurs with 64-bit Xorg?
> 
> That would require upgrading the whole distro. But I have that in planning
> in the next 1-2 months.

I have now upgraded the distro to full 64-bit (kernel and apps). That involved a forced upgrade of the KDE desktop to 4.0, which has the compositor set to OpenGL 3.1.

I have never seen the intermittent hangs on this upgraded system. Everything in the desktop is smooth and performs fine (except real OpenGL apps of course, due to the low-end GPU :)). I tried both glamor and EXA. Maybe KDE 4 is using the GPU more properly.

So I've settled on glamor, since you say it is the way forward.

Let's close the bug until I can reproduce it again. Thanks for the cooperation.

