Hi, I am facing preemption issues involving PREEMPT_RT realtime tasks and the Intel video driver on the following chipsets:

* Intel J1900
* 4300U

Under heavy X11 load (e.g. lots of widgets created at application startup, moving windows, hiding/showing windows...), my realtime task is interrupted for about 500 us to 1 ms.

My realtime task is isolated on a CPU, so I suppose the problem is related to memory access: the driver locks/accesses memory and my application can't access it. (There is only one NUMA node available.) If I run the same graphics applications through an SSH connection, the problem does not happen.

I would like to know whether there are workarounds, and what the preemption status of the Intel drivers is. Thanks
I don't suppose you have a handy recipe for setting up a machine with an isolated RT CPU?

https://gitlab.freedesktop.org/drm/igt-gpu-tools/blob/master/benchmarks/gem_syslatency.c is a tool we use to try to assess the damage we cause to the system by measuring the latency of an RT thread. But I've never set up an isolated CPU before...

Just to check, do you mean the CPU isolation feature in the kernel, or are you just using cpusets?

And also to confirm: you are talking about CPU preemption and not GPU preemption?
Intel J1900: https://ark.intel.com/content/www/us/en/ark/products/78867/intel-celeron-processor-j1900-2m-cache-up-to-2-42-ghz.html

So Baytrail. Hmm. That shouldn't be as bad for RT, as we don't need interrupt processing to keep the GPU fed (I automatically assumed the complaint would be about execlists, which have a much more noticeable impact on RT). On the other hand, we don't perform GPU preemption.
One should also note that memory contention between the GPU and the CPU cores is a real issue; there should be some MSR (don't ask me!) for configuring the relative priorities and allotments, as tanstaafl ("there ain't no such thing as a free lunch").
(In reply to Chris Wilson from comment #1)
> I don't suppose you have a handy recipe for setting up a machine with an
> isolated RT CPU?

If I understand the question correctly: the system is started with CPU 0 isolated thanks to the isolcpus=1 kernel boot parameter. When the RT threads are started, they are migrated to CPU 0 using pthread_setaffinity_np().

> https://gitlab.freedesktop.org/drm/igt-gpu-tools/blob/master/benchmarks/
> gem_syslatency.c
>
> is a tool we use to try and assess the damage we cause to the system by
> measuring the latency of an RT thread. But I've never before setup an
> isolated cpu...

The problem can be observed more easily if 2 RT tasks are used: the first one starts periodically (e.g. every 3 ms) and lasts e.g. 300 us; the second one is chained from the first and lasts e.g. 100 us. Adding 50 us for each context switch, the last task unfortunately finishes at almost 1.3 ms when the problem has occurred.

I can monitor task activation using a serial COM port and a scope meter. Monitoring the times with clock_gettime() in the threads is accurate too.

> Just to check, do you mean the cpu isolation feature in the kernel, or are
> just using cpuset?
>
> And also to confirm you are talking about CPU preemption and not GPU
> preemption?

If I understand correctly, that is CPU preemption; the realtime task does not access the GPU. However, the graphics tasks (non-RT tasks) are loading the GPU.
I forgot to indicate: 19-inch display with a resolution of 1280x1024, in vertical orientation.
Just a quick check on the baseline: the results of our cycletest for measuring RT thread latency, maximum latency measured during a 120 s period:

x baseline-max.lt
+ i915-max.lt

[ministat histogram of the two distributions elided; column alignment lost]

    N        Min        Max     Median        Avg      Stddev
x 120         11         47         18  20.533333   7.7924045
+ 120         23        105         39  50.716667   26.207377

in microseconds.

So even in the best case, the worst-case impact of submitting nops is on average 50 us. Now, this does not take into account any impact memory contention has on the RT thread, since we are not stressing the GPU in that manner. Nor does it set up an isolated CPU. Just establishing expectations for the J1900.
I should provide some unit tests similar to my RT tasks so that you can measure the impact on them.
One easy thing for you to check would be how the GPU frequency affects the memory contention:

$ cd /sys/class/drm/card0
$ cat gt_RPn_freq_mhz > gt_min_freq_mhz
$ cat gt_RPn_freq_mhz > gt_max_freq_mhz
$ cat gt_RPn_freq_mhz > gt_boost_freq_mhz

(gt_RPn_freq_mhz is the minimum frequency, so this pins the GPU to its lowest clock.)
Using the commands you provided, I can observe an impact.

The complete cycle time of my process is normally around 700 us without graphics interaction.

When the GPU uses its default settings and there is graphics activity, this time rises to 1.3 ms.

Reducing the frequency with the commands you provided reduces this time to 1 ms.
(In reply to ANCELOT Stéphane from comment #9)
> Using the command you provided, I can watch an impact.
>
> The complete cycletime of my process is normally around 700us with non
> graphic interaction
>
> When the GPU uses default settings, and there is graphic activity this time
> raises up to 1.3ms.
>
> Reducing the frequency with the command you provided reduces this time to
> 1ms.

@Chris, what are the next steps here?
(In reply to ANCELOT Stéphane from comment #9)
> Using the command you provided, I can watch an impact.
>
> The complete cycletime of my process is normally around 700us with non
> graphic interaction
>
> When the GPU uses default settings, and there is graphic activity this time
> raises up to 1.3ms.
>
> Reducing the frequency with the command you provided reduces this time to
> 1ms.

The GPU causes memory contention with the RT app. There is nothing we can do to help here, even though the RT app is very important to you.

To me, the system works as expected and no changes are needed in this case. I would like to close this bug as WORKSFORME.
(In reply to Lakshmi from comment #11)
> (In reply to ANCELOT Stéphane from comment #9)
> > Using the command you provided, I can watch an impact.
> >
> > The complete cycletime of my process is normally around 700us with non
> > graphic interaction
> >
> > When the GPU uses default settings, and there is graphic activity this time
> > raises up to 1.3ms.
> >
> > Reducing the frequency with the command you provided reduces this time to
> > 1ms.
>
> GPU causes memory contention with a RT app. There is nothing we can help if
> the RT app is very important to you.
>
> To me, system works as expected and no changes needed in this case. I would
> like to close this bug as WORKSFORME.

As said above, the system works as expected. Resolving this bug as NOTABUG. Thanks!