Summary: | Long-running OpenCL kernels cause ring stalls and GPU lockups on Kabini when radeon.lockup_timeout is enabled | ||
---|---|---|---|
Product: | Mesa | Reporter: | Vedran Miletić <vedran> |
Component: | Drivers/Gallium/radeonsi | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED MOVED | QA Contact: | Default DRI bug account <dri-devel> |
Severity: | major | ||
Priority: | medium | ||
Version: | git | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Bug Blocks: | 99553 |
Description
Vedran Miletić
2017-01-07 18:25:19 UTC
If you have not already done so, try disabling the watchdog timer:

MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms (default 10000 = 10 seconds, 0 = disable)");
module_param_named(lockup_timeout, radeon_lockup_timeout, int, 0444);

As part of HSA/ROC development we dropped the priority of compute work relative to graphics which improved interactivity and *almost* eliminated timeouts without having to disable the timer - when I get back in the office I'll dig up the changes. In the meantime, I think disabling the timer will do what you need although you will still have sluggish graphics while long-running kernels are active.

Lowering the priority of compute waves across the board won't be a fully general solution because there are going to be some cases (eg Valve's recent work with using high priority compute to improve VR smoothness) where compute will need to be *higher* priority than graphics but it should cover most cases other than "simultaneously running GROMACS and VR".

(In reply to John Bridgman from comment #1)
> If you have not already done so, try disabling the watchdog timer:
>
> MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms (default 10000 = 10 seconds, 0 = disable)");
> module_param_named(lockup_timeout, radeon_lockup_timeout, int, 0444);

Yup, that works around the problem.

> As part of HSA/ROC development we dropped the priority of compute work
> relative to graphics which improved interactivity and *almost* eliminated
> timeouts without having to disable the timer - when I get back in the
> office I'll dig up the changes. In the meantime, I think disabling the timer
> will do what you need although you will still have sluggish graphics while
> long-running kernels are active.

Eager to hear the details.

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1246.
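
For anyone trying the workaround from comment #1: the parameter is declared with mode 0444, so it is read-only at runtime and has to be set when the module loads, for example with `radeon.lockup_timeout=0` on the kernel command line or an `options` line in a file under /etc/modprobe.d/. The following is a minimal sketch, not part of the original thread, that reads the live value from sysfs and prints the matching `options` line. The function names are made up for illustration, and it assumes the usual /sys/module/<module>/parameters/ layout.

```python
from pathlib import Path

# Sysfs file for the parameter; it exists only while the radeon module is
# loaded. Mode 0444 means it can be read here but not written at runtime.
PARAM_PATH = Path("/sys/module/radeon/parameters/lockup_timeout")


def describe_lockup_timeout(raw: str) -> str:
    """Interpret the raw parameter text (milliseconds, 0 = disabled)."""
    timeout_ms = int(raw.strip())
    if timeout_ms == 0:
        return "lockup detection disabled"
    return f"lockup timeout {timeout_ms} ms ({timeout_ms / 1000:g} s)"


def modprobe_options_line(timeout_ms: int) -> str:
    """Build a modprobe.d 'options' line that sets the value at load time."""
    return f"options radeon lockup_timeout={timeout_ms}"


def read_current_setting(path: Path = PARAM_PATH) -> str:
    """Describe the live setting, or report that the parameter file is absent."""
    try:
        return describe_lockup_timeout(path.read_text())
    except FileNotFoundError:
        return "radeon module not loaded (parameter file not found)"


if __name__ == "__main__":
    # Hypothetical helper names; only the sysfs path and the option syntax
    # come from the kernel's standard module-parameter conventions.
    print(read_current_setting())
    print(modprobe_options_line(0))
```

The new value only takes effect once the radeon module is loaded again, which in practice usually means a reboot.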