Right now, most of the transfer functions do something roughly in between blocking and non-blocking transfers, and the blocking boolean parameter is ignored. We should implement support for non-blocking transfers and clearly separate the two.
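To make the intended separation concrete, here is a minimal sketch of a transfer call that actually honours its blocking flag. It is not the code under discussion: `Transfer`, `enqueue_copy`, and the simulated copy latency are all hypothetical names and assumptions made up for illustration.

```python
import threading
import time


class Transfer:
    """Hypothetical handle for an in-flight copy (illustration only)."""

    def __init__(self, src, dst):
        self._done = threading.Event()
        self._thread = threading.Thread(target=self._run, args=(src, dst))
        self._thread.start()

    def _run(self, src, dst):
        time.sleep(0.05)   # stand-in for the real copy latency (assumption)
        dst[:] = src       # perform the copy
        self._done.set()   # signal completion to any waiter

    def wait(self):
        """Block until the copy has completed."""
        self._done.wait()

    def is_done(self):
        return self._done.is_set()


def enqueue_copy(src, dst, blocking):
    """Start a copy and honour `blocking` instead of ignoring it.

    blocking=True:  return only after `dst` holds the data.
    blocking=False: return immediately; the caller must call wait().
    """
    transfer = Transfer(src, dst)
    if blocking:
        transfer.wait()
    return transfer
```

With `blocking=True` the destination is safe to read as soon as the call returns; with `blocking=False` the caller gets the handle back right away and has to wait on it before touching the destination.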
Created attachment 130217: Attempt at a fix
Benchmarking ftp://ftp.gromacs.org/pub/benchmarks/water_GMX50_bare.tar.gz with GROMACS 2016.3, I get the following results (columns follow the cycle accounting table in the GROMACS md.log):

Without the patch:

Computing         Ranks  Threads  Call count  Wall time (s)  Giga-cycles      %
Launch GPU ops.       1        8       66001         16.905      460.911   67.1
Wait GPU local        1        8       66001          2.569       70.041   10.2
Launch GPU ops.       1        8       66001         20.110      548.283   65.2
Wait GPU local        1        8       66001          2.586       70.511    8.4
Launch GPU ops.       1        8       33001         10.902      297.224   61.0
Wait GPU local        1        8       33001          1.309       35.695    7.3
Launch GPU ops.       1        8       16001          6.460      176.126   54.4
Wait GPU local        1        8       16001          0.642       17.499    5.4
Launch GPU ops.       1        8        8001          4.624      126.062   47.7
Wait GPU local        1        8        8001          0.361        9.833    3.7
Launch GPU ops.       1        8        4001          3.471       94.623   39.5
Wait GPU local        1        8        4001          0.246        6.699    2.8
Launch GPU ops.       1        8        2001          3.648       99.469   38.5
Wait GPU local        1        8        2001          0.185        5.054    2.0
Launch GPU ops.       1        8        1001          3.396       92.583   35.5
Wait GPU local        1        8        1001          0.106        2.902    1.1
Launch GPU ops.       1        8         501          6.106      166.478   46.2
Wait GPU local        1        8         501          0.064        1.745    0.5
Launch GPU ops.       1        8         251          4.909      133.845   35.1
Wait GPU local        1        8         251          0.040        1.089    0.3
Launch GPU ops.       1        8         126          3.936      107.314   26.5
Wait GPU local        1        8         126          0.021        0.576    0.1
Launch GPU ops.       1        8          66          3.314       90.362   18.5
Wait GPU local        1        8          66          0.011        0.296    0.1
Launch GPU ops.       1        8          34          3.139       85.581   14.4
Wait GPU local        1        8          34          0.005        0.144    0.0

With the patch:

Computing         Ranks  Threads  Call count  Wall time (s)  Giga-cycles      %
Launch GPU ops.       1        8       66001         18.298      498.894   66.8
Wait GPU local        1        8       66001          2.519       68.691    9.2
Launch GPU ops.       1        8       66001         22.010      600.094   66.4
Wait GPU local        1        8       66001          2.543       69.343    7.7
Launch GPU ops.       1        8       33001         11.510      313.821   60.9
Wait GPU local        1        8       33001          1.261       34.368    6.7
Launch GPU ops.       1        8       16001          6.928      188.899   56.0
Wait GPU local        1        8       16001          0.626       17.077    5.1
Launch GPU ops.       1        8        8001          4.850      132.245   48.3
Wait GPU local        1        8        8001          0.350        9.531    3.5
Launch GPU ops.       1        8        4001          3.657       99.696   40.2
Wait GPU local        1        8        4001          0.272        7.415    3.0
Launch GPU ops.       1        8        2001          3.791      103.350   38.9
Wait GPU local        1        8        2001          0.191        5.202    2.0
Launch GPU ops.       1        8        1001          3.514       95.813   36.0
Wait GPU local        1        8        1001          0.106        2.886    1.1
Launch GPU ops.       1        8         501          6.102      166.361   46.0
Wait GPU local        1        8         501          0.063        1.716    0.5
Launch GPU ops.       1        8         251          4.936      134.578   35.1
Wait GPU local        1        8         251          0.040        1.086    0.3
Launch GPU ops.       1        8         126          3.941      107.435   26.5
Wait GPU local        1        8         126          0.021        0.559    0.1
Launch GPU ops.       1        8          66          3.335       90.920   18.3
Wait GPU local        1        8          66          0.011        0.297    0.1
Launch GPU ops.       1        8          34          3.173       86.500   14.6
Wait GPU local        1        8          34          0.005        0.147    0.0

So it's not an improvement.
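For a quick per-run comparison, the "Launch GPU ops." figures reported above can be put side by side. This is only a sketch: it assumes the fifth field of each row is the wall time in seconds, and the values are copied by hand in the order the runs appear.

```python
# "Launch GPU ops." wall times (s), in the order the runs are listed above.
# Assumption: the fifth field of each row is the wall time.
without_patch = [16.905, 20.110, 10.902, 6.460, 4.624, 3.471, 3.648,
                 3.396, 6.106, 4.909, 3.936, 3.314, 3.139]
with_patch = [18.298, 22.010, 11.510, 6.928, 4.850, 3.657, 3.791,
              3.514, 6.102, 4.936, 3.941, 3.335, 3.173]


def relative_change(before, after):
    """Percentage change of each 'after' value relative to 'before'."""
    return [(a - b) / b * 100.0 for b, a in zip(before, after)]


changes = relative_change(without_patch, with_patch)
slower = sum(1 for c in changes if c > 0)

for b, a, c in zip(without_patch, with_patch, changes):
    print(f"{b:8.3f} s -> {a:8.3f} s  ({c:+.1f}%)")
print(f"slower with the patch in {slower} of {len(changes)} runs")
```

A positive percentage means the launch phase took longer with the patch applied.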
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/138.