Created attachment 112474 (Test case). Running on "Intel(R) HD Graphics Haswell GT2 Mobile" (Gen7.5). Tested with Release_v1.0.0 and with the latest master (786da41). The attached patch adds a test that essentially computes the dot product of two 16-element arrays, unrolled by a factor of two so that it does two multiplications per loop iteration. The inputs are 16-bit and the sum is 64-bit. Every work item does exactly the same computation. It runs with a global and local work size of 16. The output shows that the first 8 work items get the correct result, but the next 8 get the wrong result. If I un-unroll the loop (change "i += 2" to "i += 1" and remove the "sum += b0 * b1"), it gives the correct output. If I change "long sum = 0" to "int sum = 0", it also gives the correct output.
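For readers without the attachment, here is a rough host-side reference in plain C for the computation described above. It is a sketch, not the attached test: the function names, the array names `src0`/`src1`, the use of signed 16-bit inputs, and the widening to 64 bits before each multiply are all assumptions. Only `sum`, `b0`, `b1`, and the `i += 2` step come from the report.

```c
#include <stdint.h>

#define DOT_LEN 16  /* array length stated in the report */

/* Unrolled form: two multiplications per iteration, 64-bit accumulator.
 * Signedness and the point at which values are widened are assumptions. */
static int64_t dot16_unrolled(const int16_t *src0, const int16_t *src1)
{
    int64_t sum = 0;
    for (int i = 0; i < DOT_LEN; i += 2) {
        int64_t a0 = src0[i],     a1 = src1[i];
        int64_t b0 = src0[i + 1], b1 = src1[i + 1];
        sum += a0 * a1;
        sum += b0 * b1;
    }
    return sum;
}

/* "Un-unrolled" form: one multiplication per iteration. */
static int64_t dot16_rolled(const int16_t *src0, const int16_t *src1)
{
    int64_t sum = 0;
    for (int i = 0; i < DOT_LEN; i += 1) {
        sum += (int64_t)src0[i] * (int64_t)src1[i];
    }
    return sum;
}
```

Both functions should return the same value for any input, and every one of the 16 work items would be expected to produce that value, since they all do the same computation.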
This looks like a bug in the post-register-allocation instruction scheduling. You can disable that scheduling pass by setting the following environment variable and then trying again:
# export OCL_POST_ALLOC_INSN_SCHEDULE=0
Note that this will cause about an 8% performance regression. I will fix it soon. Thanks for reporting this.
I just submitted a patch to the mailing list: http://lists.freedesktop.org/archives/beignet/2015-January/004935.html Could you try it on your side?
That seems to fix it for me - thanks!
The following patch has been pushed to the master and Release_v1.0 branches.

commit 1511fe52ef956634075eae4d5a5d58c9b5ffdde1
Author: Zhigang Gong <zhigang.gong@intel.com>
Date:   Tue Jan 20 14:40:39 2015 +0800

    GBE: fix an ACC register related instruction scheduling bug

    Some instructions modify the ACC register in the gen_context stage
    which's not regonized by current instruction scheduling algorithm.
    This patch fix this bug by checking all the possible SEL_OPs which
    may change the ACC implicitly.

    The corresponding bugzilla link is as below:
    https://bugs.freedesktop.org/show_bug.cgi?id=88587

    Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
    Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
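
To illustrate the kind of problem the commit message describes, here is a toy model in C of a scheduler's ordering check. It is not Beignet code: the `toy_insn` type, the `ACC_BIT` mask, and the function names are hypothetical, and the real scheduler is likely structured quite differently. The idea is only that an instruction which touches the accumulator implicitly looks independent of an accumulator user unless the check accounts for that hidden write.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register bitmask; the top bit stands for the accumulator. */
#define ACC_BIT (1u << 31)

/* Toy instruction: explicit operands plus a flag for an implicit ACC write. */
typedef struct {
    uint32_t reads;              /* registers listed as sources */
    uint32_t writes;             /* registers listed as destinations */
    bool     implicit_acc_write; /* ACC is modified without being listed */
} toy_insn;

/* Registers written, optionally including the implicit accumulator write. */
static uint32_t effective_writes(const toy_insn *in, bool track_implicit_acc)
{
    uint32_t w = in->writes;
    if (track_implicit_acc && in->implicit_acc_write)
        w |= ACC_BIT;
    return w;
}

/* True if `later` must not be moved above `earlier`
 * (read-after-write, write-after-read, or write-after-write hazard). */
static bool must_stay_ordered(const toy_insn *earlier, const toy_insn *later,
                              bool track_implicit_acc)
{
    uint32_t ew = effective_writes(earlier, track_implicit_acc);
    uint32_t lw = effective_writes(later, track_implicit_acc);
    bool raw = (ew & later->reads) != 0;
    bool war = (earlier->reads & lw) != 0;
    bool waw = (ew & lw) != 0;
    return raw || war || waw;
}
```

With `track_implicit_acc` set to false, an instruction that reads the accumulator and a following instruction that implicitly overwrites it appear reorderable. With it set to true, the write-after-read hazard is detected and the order is preserved.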