Bug 99618 - AVX Intrinsics Run in GUI thread only
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/swr
Version: 13.0
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: mesa-dev
QA Contact: mesa-dev
 
Reported: 2017-01-31 17:20 UTC by chris
Modified: 2017-04-12 00:25 UTC

Description chris 2017-01-31 17:20:34 UTC
I have been implementing a couple of threaded algorithms using QtConcurrent::blockingMap(indices, Functor).

The functor uses AVX/AVX2 instructions to do its work.

With the Gallium llvmpipe driver, all 8 of my CPU cores are used while the algorithm runs. When I switch to the Gallium swr driver, CPU0 is the only CPU being used. The only thing that differs is the driver.

Any ideas?
Comment 1 Bruce Cherniak 2017-02-02 20:58:13 UTC
We've seen a similar problem with apps using TBB as their threading model.  OpenSWR creates its thread pool when the OpenGL context is created and, by default, binds threads to physical cores.

Two things you might try:
1) Create your application thread pool before creating the OpenGL context.
2) Set the KNOB_MAX_WORKER_THREADS environment variable to limit the number of threads that OpenSWR creates.  This also tells OpenSWR to not bind threads to cores.  Setting it to 0 may work in your situation and still enable the full number of OpenSWR threads.
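
A possible explanation for why suggestion 1 helps, offered as an assumption rather than something confirmed in this report: on Linux, a new thread inherits the CPU affinity mask of the thread that creates it, so workers spawned from a thread that has already been bound to one core all end up on that core. The sketch below shows that inheritance using plain Linux calls only. The helper names are hypothetical and nothing here is an OpenSWR API; pinning the calling thread merely stands in for whatever narrowed its mask.

```cpp
#include <sched.h>   // sched_getaffinity, sched_setaffinity, CPU_* macros (Linux/glibc)
#include <thread>

// Number of CPUs the calling thread is allowed to run on, or -1 on error.
static int allowed_cpus() {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0) return -1;
    return CPU_COUNT(&set);
}

// Hypothetical stand-in for "something bound this thread to one core":
// pins the calling thread to the first CPU in its current mask.
static bool pin_calling_thread_to_one_cpu() {
    cpu_set_t current;
    CPU_ZERO(&current);
    if (sched_getaffinity(0, sizeof(current), &current) != 0) return false;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &current)) {
            cpu_set_t one;
            CPU_ZERO(&one);
            CPU_SET(cpu, &one);
            return sched_setaffinity(0, sizeof(one), &one) == 0;
        }
    }
    return false;
}

// Allowed-CPU count as seen by a thread spawned from the calling thread.
static int allowed_cpus_in_new_thread() {
    int count = -1;
    std::thread worker([&count] { count = allowed_cpus(); });
    worker.join();
    return count;
}
```

Calling allowed_cpus_in_new_thread() before and after pin_calling_thread_to_one_cpu() shows the difference: a worker created after the pin reports a single allowed CPU. Logging allowed_cpus() from inside the application's own worker threads is a cheap way to check whether this is what is happening in a given setup.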
Comment 2 Bruce Cherniak 2017-02-10 14:26:37 UTC
I did some experimenting and reproduced a similar situation.  Try the knob; it should work for you.

export KNOB_MAX_WORKER_THREADS=<# threads SWR can use>

You might need to play with the number a little to get a good balance for your situation.
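
If exporting the variable in the shell is inconvenient, the application can set it for its own process instead. This is a minimal sketch under two assumptions: the driver reads KNOB_MAX_WORKER_THREADS no later than OpenGL context creation (comment 1 says the thread pool is created then), and the call is made before that point. The function name is hypothetical; setenv is the standard POSIX call.

```cpp
#include <cstdlib>    // std::getenv
#include <stdlib.h>   // setenv (POSIX)
#include <string>

// Sets KNOB_MAX_WORKER_THREADS for this process unless the user already
// exported it, and returns the value that is in effect afterwards.
// Call this before the OpenGL context is created (assumption, see above).
static std::string set_swr_worker_threads(unsigned count) {
    const char* name = "KNOB_MAX_WORKER_THREADS";
    const std::string value = std::to_string(count);
    // overwrite = 0: a value exported in the shell takes precedence
    setenv(name, value.c_str(), 0);
    const char* effective = std::getenv(name);
    return effective ? std::string(effective) : std::string();
}
```

Leaving an already exported value in place keeps the shell as the place to experiment with different counts without rebuilding the application.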
Comment 3 Bruce Cherniak 2017-04-06 21:54:42 UTC
Any chance you've been able to try the suggestion I made (KNOB_MAX_WORKER_THREADS) to resolve this issue?
Comment 4 chris 2017-04-07 15:39:10 UTC
(In reply to Bruce Cherniak from comment #3)
> Any chance you've been able to try the suggestion I made
> (KNOB_MAX_WORKER_THREADS) to resolve this issue?

Hi Bruce, my apologies for the late reply. Yes, I fired up this project today and tried setting KNOB_MAX_WORKER_THREADS to the number of physical cores.

I do get a balanced CPU load now; it definitely worked. As you said, performance varies a bit with different values, so I will experiment with it.
Comment 5 Bruce Cherniak 2017-04-12 00:25:08 UTC
I'm glad that works for you.  We will continue to work on threading models that don't require external intervention for optimal performance.  But, in the meantime, I'm going to close this bug as resolved.

Please let us know of any other troubles you observe and thank you for your interest in OpenSWR.

