Created attachment 105127 [details]
Ubuntu 14.04 with the Oibaf repository for GPU drivers.
Radeon 7870 (Pitcairn)
When I run glCopyBufferSubData on large buffers (128 MiB works, 200 MiB does not), my desktop freezes and I can only move the mouse around.
The mouse cursor image stays the same after that, e.g. a normal pointer or a text marker icon.
As far as I can tell, there is no relevant message in dmesg.0 or Xorg.1.log.
Minimal example code that triggers the freeze (after using SDL2 to init a core GL 3.3 context and window):
int bufferSize = 200 * 1024 * 1024;
GLuint clientBufferId, serverBufferId;
glGenBuffers(1, &clientBufferId);
glGenBuffers(1, &serverBufferId);
glBindBuffer(GL_COPY_READ_BUFFER, clientBufferId);
glBufferStorage(GL_COPY_READ_BUFFER, bufferSize, 0, /*GL_DYNAMIC_STORAGE_BIT | GL_MAP_WRITE_BIT | GL_MAP_READ_BIT | GL_MAP_PERSISTENT_BIT | GL_CLIENT_STORAGE_BIT*/0);
glBindBuffer(GL_COPY_WRITE_BUFFER, serverBufferId);
glBufferStorage(GL_COPY_WRITE_BUFFER, bufferSize, 0, 0);
glCopyBufferSubData(GL_COPY_READ_BUFFER, GL_COPY_WRITE_BUFFER, 0, 0, bufferSize);
Can you flesh out the minimal reproducer so we can just compile and run it?
Created attachment 105143 [details]
minimal example only needs sdl2 to compile
Created attachment 105144 [details]
Created attachment 105145 [details]
Here is a minimal example; it only needs SDL2. CMake files are included.
Sometimes I had to run the executable 2-3 times before it froze my system.
I tried to narrow down the size that triggers the freeze, and I'm down to 128 MiB. (It could still be lower.)
Because it often needs several tries before it freezes, I switched from rerunning my program in the console to a simple loop around the C++ code (generating, copying, and deleting the buffers).
Looping over 64 MiB buffers now gave me errors in the program console and in dmesg.
Created attachment 105177 [details]
Ignore the ext4 errors please ;]
Created attachment 105178 [details]
Created attachment 105179 [details]
main.cpp with looping (generating, copy, deleting) buffers
This triggers the error messages in the application console and dmesg
OK, forget the errors I reported, except for the freezing.
The driver was just running out of memory because I had no glFinish and gave it no time to actually copy and delete the buffers before creating new ones.
There is no exact buffer size that causes the freeze.
If I try often enough, the smallest buffer I have managed to freeze it with so far is 123032 KiB.
You're saying you can still get the freeze even with glFinish() in the loop? And it never recovers from that?
Can you still switch to a console with Ctrl-Alt-Fx after the freeze, or log in via SSH? If so, can you attach the dmesg output from at least 10 seconds after the freeze?
Does this still happen with a newer kernel?
Trying to copy 200 MiB, only the screen freezes. SSH works fine.
I can just kill -9 the program and everything is ok again. After that I have this line in dmesg
[ 1851.802876] radeon 0000:01:00.0: failed to get a new IB (-512)
When I try to switch to a console while my program is still running, nothing appears to happen, but after I kill -9 the program via SSH, I am in the console.
I also tried to move a window around during the freeze, but that didn't work.
Sometimes the mouse pointer freezes, too.
Also, the size of the buffer does not matter; what matters is how many bytes get copied by glCopyBufferSubData.
And the bigger the copy is, the more likely the freeze seems to be.
With a 256 MiB copy I managed to get a kernel bug message in dmesg and could not recover by killing the program.
Will bake me some fresh kernels now.
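Since the freeze seems to depend on how many bytes a single glCopyBufferSubData call copies, one thing that could be tried on an affected kernel is splitting a large copy into smaller pieces. The sketch below only plans the sub-ranges; it is untested against this bug and nothing in this report confirms that it avoids the freeze. The names CopyChunk and planCopyChunks are hypothetical, and any chunk size passed in is an arbitrary choice.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// One sub-range of a larger buffer-to-buffer copy (hypothetical helper type).
struct CopyChunk {
    std::int64_t offset; // byte offset, used for both source and destination
    std::int64_t size;   // number of bytes in this chunk
};

// Split the range [0, totalBytes) into consecutive chunks of at most
// maxChunkBytes each. Returns an empty plan for non-positive arguments.
std::vector<CopyChunk> planCopyChunks(std::int64_t totalBytes,
                                      std::int64_t maxChunkBytes) {
    std::vector<CopyChunk> chunks;
    if (totalBytes <= 0 || maxChunkBytes <= 0) {
        return chunks;
    }
    for (std::int64_t offset = 0; offset < totalBytes; offset += maxChunkBytes) {
        // The last chunk may be shorter than maxChunkBytes.
        chunks.push_back({offset, std::min(maxChunkBytes, totalBytes - offset)});
    }
    return chunks;
}

// Possible use with the reproducer's buffers (assumes both are already bound):
//   for (const CopyChunk& c : planCopyChunks(bufferSize, 32 * 1024 * 1024))
//       glCopyBufferSubData(GL_COPY_READ_BUFFER, GL_COPY_WRITE_BUFFER,
//                           c.offset, c.offset, c.size);
```

With a 200 MiB copy and a 64 MiB limit this yields three full chunks plus a final 8 MiB chunk. Whether the driver actually handles several smaller copies better than one large one would have to be checked on the affected setup.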
Created attachment 105283 [details]
dmesg when trying to copy 256 MiB
(In reply to comment #13)
> Created attachment 105283 [details]
> dmesg when trying to copy 256 MiB
That's a known issue with older kernels, please update your kernel and make sure it has "drm/radeon: update IB size estimation for VM".
Yes, it works with 3.17.0-rc2.
For anyone else, here is a link to the Kernel Bug Tracker:
Thanks, please open up a new bug report if you have further issues.