Bug 99048

Summary: piglit.spec.ext_framebuffer_object.fbo-maxsize OOM disables GPU
Product: Mesa
Reporter: Mark Janes <mark.a.janes>
Component: Drivers/DRI/i965
Assignee: Intel 3D Bugs Mailing List <intel-3d-bugs>
Status: RESOLVED MOVED
QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal
Priority: medium
CC: intel-gfx-bugs
Version: git
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features: GEM/Other

Description Mark Janes 2016-12-10 14:33:56 UTC
This test is killed by OOM.  Bisection underway.
Comment 1 Mark Janes 2016-12-10 18:46:28 UTC
*** Bug 99036 has been marked as a duplicate of this bug. ***
Comment 2 Mark Janes 2016-12-10 18:47:40 UTC
Quite likely to be caused by:

e9133dd90ec498cfb6a23fa22504e06488352c51
Author:     Jordan Justen <jordan.l.justen@intel.com>
CommitDate: Wed Dec 7 09:00:49 2016 -0800

i965: Increase max texture to 16k for gen7+
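
For a rough sense of scale, the footprint of a single surface at the new limit can be estimated as below. This is only a back-of-the-envelope sketch: the 4 bytes per texel figure is an assumption (an RGBA8-style format), tiling and padding are ignored, the helper name `surface_bytes` is made up for illustration, and the format and number of surfaces that fbo-maxsize actually allocates are not stated in this report.

```python
# Rough footprint of one square surface at a given maximum dimension.
# Assumption: 4 bytes per texel; padding, tiling and mipmaps are ignored.
def surface_bytes(dim, bytes_per_texel=4):
    """Return the size in bytes of a dim x dim surface."""
    return dim * dim * bytes_per_texel


new_limit = surface_bytes(16384)      # 16k x 16k
half_dim = surface_bytes(16384 // 2)  # half the dimension, for comparison

print(new_limit // 2**20, "MiB")      # 1024 MiB, i.e. 1 GiB
print(new_limit // half_dim, "x")     # doubling the dimension quadruples the size
```

Under that assumption, one 16k x 16k surface alone is on the order of 1 GiB.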
Comment 3 Mark Janes 2016-12-11 01:12:52 UTC
Reducing the memory requirements allows the test to complete:

https://lists.freedesktop.org/archives/piglit/2016-December/021581.html

However, the lack of error recovery seems like a kernel bug.  After this test is killed, subsequent GPU operations fail until reboot.  Without reboot, the system will eventually hard-hang.
Comment 4 Mark Janes 2016-12-12 20:18:11 UTC
Assigning to the kernel team, because it appears that OOM can kill a process while it holds a lock, generating an unrecoverable error.

To reproduce:

 1) use a device with 1GB RAM
 2) use Mesa after e9133dd9, when texture sizes were increased
 3) use piglit before 12b5938, when Ken reduced the memory footprint of fbo-maxsize

You should see dmesg errors:
  "Unable to purge GPU memory due lock contention"

Subsequently, other GPU workloads will fail intermittently.
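
A minimal sketch for checking whether the message quoted above shows up in the kernel log after a run. It only searches for the string and makes no claim about what the message means. The helper names are hypothetical, and it assumes a `dmesg` binary is available and readable by the current user (on some systems that requires elevated privileges).

```python
import subprocess

# Message text as quoted in the reproduction steps above.
MESSAGE = "Unable to purge GPU memory due lock contention"


def find_purge_failures(log_text):
    """Return the kernel log lines that contain MESSAGE."""
    return [line for line in log_text.splitlines() if MESSAGE in line]


def read_kernel_log():
    """Return the output of dmesg, or an empty string if it cannot be run."""
    try:
        result = subprocess.run(
            ["dmesg"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return ""
    return result.stdout


if __name__ == "__main__":
    for line in find_purge_failures(read_kernel_log()):
        print(line)
```

Running it before and after the test gives a quick way to tell whether the message was logged during that run.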
Comment 5 Chris Wilson 2016-12-13 08:22:03 UTC
(In reply to Mark Janes from comment #4)
> Assigning to the kernel team, because it appears that OOM can kill a process
> while it holds a lock, generating an unrecoverable error.
> 
> To reproduce:
> 
>  1) use a device with 1GB RAM
>  2) use Mesa after e9133dd9, when texture sizes were increased
>  3) use piglit before 12b5938, when Ken reduced the memory footprint of
> fbo-maxsize
> 
> You should see dmesg errors:
>   "Unable to purge GPU memory due lock contention"
> 
> Subsequently, other GPU workloads will fail intermittently.

That is not what that means at all.
Comment 6 Mark Janes 2016-12-13 16:46:31 UTC
Updating title to reflect that this issue is not platform dependent.
Comment 7 Matt Turner 2016-12-13 19:26:34 UTC
(In reply to Chris Wilson from comment #5)
> That is not what that means at all.

I can't even tell what you're responding to.

We're not mind readers. Why don't you tell us what you're thinking?
Comment 8 GitLab Migration User 2019-09-25 18:59:17 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1553.
