Bug 93721 - Tonga [drm:amdgpu_vm_init [amdgpu]] *ERROR* Cannot allocate memory for page table array
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/AMDgpu
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Default DRI bug account
 
Reported: 2016-01-14 23:10 UTC by Andy Furniss
Modified: 2016-02-05 02:06 UTC

Attachments
dmesg showing mem fail (109.50 KB, text/plain)
2016-01-14 23:10 UTC, Andy Furniss
mpv-vdpau-fail (10.31 KB, text/plain)
2016-01-15 00:53 UTC, Andy Furniss
drm/amdgpu: Use drm_calloc_large for VM page_tables array (1.96 KB, patch)
2016-01-15 03:45 UTC, Michel Dänzer

Description Andy Furniss 2016-01-14 23:10:24 UTC
Created attachment 121052 [details]
dmesg showing mem fail

Been testing the latest agd5f drm-next-4.5 with powerplay=1.

Nothing's changed from previously reported issues: UVD still breaks powerplay and itself, and lockups with UVD are possible.

On this particular boot I hadn't touched UVD or anything GL, except that glamor is always in use.

Was testing/benching vce which was going OK (apart from previously reported issue with UHD content).

After I had finished testing I switched the GPU/CPU, which had been set high, back to auto/down.

memclk stayed stuck high. I then tried glxgears just to see if it would change the clocks.

I got - 

libGL error: failed to open drm device: Cannot allocate memory
libGL error: failed to load driver: radeonsi

Tried as user: same. Tried glxinfo: same. Output of free:

              total        used        free      shared  buff/cache   available
Mem:        8139284      244600      340524     1519216     7554160     6314420
Swap:       4605948           0     4605948

Looks OK, but as I had 1.5 GB in a ramdisk I deleted the file, and then GL started working again. Strange, as I've done many tests with 6 GB in there, and it just doesn't look like I was low on memory. Maybe you can see why from the memory dumps in the dmesg.

I haven't been able to reproduce this.
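
The `free` output above reports how much memory is free or reclaimable, not how that memory is split into contiguous blocks. A request for several physically contiguous pages can fail even with gigabytes available if the free memory is fragmented; whether that is what happened here is an assumption, not something this report establishes. As a minimal sketch of one way to check, the Python below summarises `/proc/buddyinfo`-style text (free block counts per order, per zone). The sample text, the function names and the order-4 threshold are made up for illustration and are not taken from the attached dmesg.

```python
# Sketch: summarise free blocks per allocation order from /proc/buddyinfo text.
# SAMPLE is invented for illustration; it is NOT data from this bug report.

def parse_buddyinfo(text):
    """Return {(node, zone): [free block count for order 0, 1, 2, ...]}."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected shape: "Node 0, zone Normal c0 c1 c2 ..."
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = int(parts[1].rstrip(","))
        zones[(node, parts[3])] = [int(c) for c in parts[4:]]
    return zones


def free_kib_at_or_above(counts, min_order, page_kib=4):
    """KiB held in free blocks of at least 2**min_order contiguous pages.

    page_kib=4 assumes 4 KiB pages.
    """
    return sum(n * (2 ** order) * page_kib
               for order, n in enumerate(counts) if order >= min_order)


SAMPLE = """\
Node 0, zone      DMA      1      1      1      0      2      1      1      0      1      1      3
Node 0, zone   Normal  52000   9000    300     12      0      0      0      0      0      0      0
"""

if __name__ == "__main__":
    for (node, zone), counts in parse_buddyinfo(SAMPLE).items():
        total = free_kib_at_or_above(counts, 0)
        big = free_kib_at_or_above(counts, 4)  # order 4 is an arbitrary threshold
        print(f"node {node} zone {zone}: {total} KiB free, "
              f"{big} KiB in blocks of order >= 4")
```

On a live system the same functions could be fed the contents of `/proc/buddyinfo` read shortly after a failure, to see whether large-order blocks had run out while small ones remained.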
Comment 1 Andy Furniss 2016-01-14 23:27:50 UTC
I should add that I haven't tested VCE for a while, but it isn't the only "new" thing: I was also using a network bridge at the time, which I don't normally do, so maybe that is relevant.
Comment 2 Andy Furniss 2016-01-15 00:53:14 UTC
Created attachment 121053 [details]
mpv-vdpau-fail

Same boot, and I still can't reproduce with OpenGL. I can run gears and Unigine Valley repeatedly OK, but I can still randomly trigger it using VDPAU.
Comment 3 Michel Dänzer 2016-01-15 03:45:41 UTC
Created attachment 121055 [details] [review]
drm/amdgpu: Use drm_calloc_large for VM page_tables array

Does this patch help?
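
For context on why the allocator choice might matter: a kmalloc-style allocation of an array needs 2^order physically contiguous pages, while a vmalloc-style allocation only needs individual pages that are mapped contiguously in kernel virtual address space. `drm_calloc_large`, as it existed in kernels of that period, is understood to use the former only for requests up to one page and the latter otherwise. The sketch below just works the order arithmetic in Python; the 4 KiB page size, the entry count and the entry size are hypothetical and not read from the amdgpu source.

```python
# Sketch: how many contiguous pages a single-block allocation would need.
# All numbers here are hypothetical, chosen only to illustrate the arithmetic.

PAGE_SIZE = 4096  # assumption: 4 KiB pages


def alloc_order(nbytes, page_size=PAGE_SIZE):
    """Smallest order such that 2**order contiguous pages can hold nbytes."""
    pages = max(1, -(-nbytes // page_size))  # ceiling division
    return (pages - 1).bit_length()


if __name__ == "__main__":
    entries, entry_size = 4096, 16  # hypothetical array dimensions
    nbytes = entries * entry_size
    order = alloc_order(nbytes)
    print(f"{nbytes} bytes -> order {order} ({2 ** order} contiguous pages)")
```

The higher the order, the more likely the request is to fail on a long-running system with fragmented memory, which is the situation a vmalloc-style fallback avoids.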
Comment 4 Ernst Sjöstrand 2016-01-15 08:37:04 UTC
I think I also had a UVD related hang on Fiji with Powerplay when setting
EnableLinuxHWVideoDecode = 1 in /etc/adobe/mms.cfg and watching flash video.
Not sure if my Firefox settings affect the result, but I had a 100% hang rate. Couldn't find any log messages though.
Comment 5 Andy Furniss 2016-01-15 12:37:44 UTC
(In reply to Michel Dänzer from comment #3)
> Created attachment 121055 [details] [review]
> drm/amdgpu: Use drm_calloc_large for VM page_tables array
> 
> Does this patch help?

Running now - OK so far, but then it took some time/luck to show up on the unpatched kernel.
Comment 6 Andy Furniss 2016-01-15 12:41:29 UTC
(In reply to Ernst Sjöstrand from comment #4)
> I think I also had a UVD related hang on Fiji with Powerplay when setting
> EnableLinuxHWVideoDecode = 1 in /etc/adobe/mms.cfg and watching flash video.
> Not sure if my Firefox settings affect the result, but I had a 100% hang rate.
> Couldn't find any log messages though.

On older powerplay versions I could get a logless UVD lockup - not on current drm-next yet (it took many hours to provoke).

I can trigger a different one with mpv. To get logging you need to wait a few minutes before using sysrq, as the logging comes from the kernel's hung task timeout.
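
The wait mentioned above is governed by the kernel's hung task detector, whose timeout is exposed at `/proc/sys/kernel/hung_task_timeout_secs` on kernels built with that detector (commonly 120 seconds; 0 disables it). A small sketch, assuming Python is available on the test machine, that reports how long to wait before reaching for sysrq; the function name is illustrative only:

```python
# Sketch: read the hung task timeout so you know how long to wait for the log.
from pathlib import Path

HUNG_TASK_SYSCTL = Path("/proc/sys/kernel/hung_task_timeout_secs")


def hung_task_timeout(path=HUNG_TASK_SYSCTL):
    """Return the timeout in seconds, or None if unavailable or disabled (0)."""
    try:
        secs = int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None
    return secs or None


if __name__ == "__main__":
    secs = hung_task_timeout()
    if secs is None:
        print("hung task detector unavailable or disabled")
    else:
        print(f"wait at least {secs} s after the hang before using sysrq")
```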

