88587 – [HSW][GT2] Incorrect code generation with long in loop

Status:	CLOSED FIXED

Alias:	None

Product:	Beignet
Classification:	Unclassified
Component:	Beignet
Version:	unspecified
Hardware:	Other All

Importance:	medium normal
Assignee:	Zhigang Gong

Reported:	2015-01-19 14:39 UTC by Philip Taylor
Modified:	2015-01-21 10:19 UTC

Attachments
Test case (3.04 KB, text/plain) 2015-01-19 14:39 UTC, Philip Taylor

Description Philip Taylor 2015-01-19 14:39:06 UTC

Created attachment 112474
Test case

Running on "Intel(R) HD Graphics Haswell GT2 Mobile" (Gen7.5). Tested with Release_v1.0.0 and with latest master (786da41).

The attached patch adds a test which essentially computes the dot-product of two 16-element arrays, slightly unrolled so it does two multiplications per loop iteration. The inputs are 16-bit and the sum is 64-bit.

Every work item does exactly the same computation. It runs with global and local work size 16.

The output shows the first 8 work items get the correct result, but the next 8 get the wrong result.

If I un-unroll the loop (change "i += 2" to "i += 1" and remove the "sum += b0 * b1"), it gives the correct output.

If I change "long sum = 0" to "int sum = 0", then it gives the correct output.
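
For readers without the attachment, here is a minimal host-side sketch in C of the computation described above. It is not the attached test case: the function names and the element-wise pairing are assumptions for illustration, and the explicit 64-bit cast stands in for the kernel's arithmetic (for 16-bit inputs each product fits in 32 bits, so the result is the same either way). It can serve as a reference value to compare each work item's output against.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference for the described computation: a dot product of 16-bit inputs
 * accumulated into a 64-bit sum, two products per loop iteration.
 * Names are hypothetical and not taken from the attached test case. */
int64_t dot_unrolled2(const int16_t *a, const int16_t *b, size_t n)
{
    assert(n % 2 == 0); /* the unrolled loop consumes elements in pairs */
    int64_t sum = 0;
    for (size_t i = 0; i < n; i += 2) {
        sum += (int64_t)a[i] * b[i];
        sum += (int64_t)a[i + 1] * b[i + 1];
    }
    return sum;
}

/* Count how many work items produced a value different from the reference.
 * Every work item runs the same computation, so all should match. */
size_t count_mismatches(const int64_t *results, size_t work_items,
                        int64_t expected)
{
    size_t bad = 0;
    for (size_t w = 0; w < work_items; w++) {
        if (results[w] != expected) {
            bad++;
        }
    }
    return bad;
}
```

With all 16 elements of both arrays set to 32767, `dot_unrolled2` returns 17178820624, which does not fit in 32 bits, so inputs like that exercise the 64-bit accumulator. For the failure described here, `count_mismatches` over the 16 work items would report 8.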

Comment 1 Zhigang Gong 2015-01-20 02:38:23 UTC

This looks like a post-register-allocation instruction scheduling bug. You can disable post-register-allocation instruction scheduling by setting the following environment variable:
# export OCL_POST_ALLOC_INSN_SCHEDULE=0

Then try again. Note that this will cause a performance regression of about 8%.

I will fix it soon. Thanks for reporting this.

Comment 2 Zhigang Gong 2015-01-20 07:40:16 UTC

I just submitted a patch to the mailing list. The patch is at:
http://lists.freedesktop.org/archives/beignet/2015-January/004935.html

Could you try it on your side?

Comment 3 Philip Taylor 2015-01-20 11:48:39 UTC

That seems to fix it for me - thanks!

Comment 4 Zhigang Gong 2015-01-21 10:18:59 UTC

The following patch has been pushed to the master and Release_v1.0 branches.

commit 1511fe52ef956634075eae4d5a5d58c9b5ffdde1
Author: Zhigang Gong <zhigang.gong@intel.com>
Date:   Tue Jan 20 14:40:39 2015 +0800

    GBE: fix an ACC register related instruction scheduling bug

    Some instructions modify the ACC register in the gen_context
    stage, which is not recognized by the current instruction scheduling
    algorithm. This patch fixes this bug by checking all the possible
    SEL_OPs which may change the ACC implicitly.

    The corresponding bugzilla link is as below:
    https://bugs.freedesktop.org/show_bug.cgi?id=88587

    Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
    Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
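
The commit message describes a general class of scheduling bug: an instruction writes a register implicitly, and the scheduler's dependency check only looks at explicit operands. The sketch below is a toy model in C of that idea, not Beignet's code. The `ToyInsn` type, its fields, and `must_stay_ordered` are all hypothetical, and whether the actual fix uses the same hazard rule is an assumption.

```c
#include <stdbool.h>

enum { REG_NONE = -1 }; /* hypothetical marker for "no register" */

/* Toy instruction: explicit operands plus flags for implicit ACC use. */
typedef struct {
    int dst;                 /* explicit destination, or REG_NONE */
    int src0;                /* explicit source, or REG_NONE */
    int src1;                /* explicit source, or REG_NONE */
    bool implicit_acc_write; /* clobbers the accumulator as a side effect */
    bool implicit_acc_read;  /* consumes the accumulator as a side effect */
} ToyInsn;

static bool writes_reg(const ToyInsn *in, int reg)
{
    return reg != REG_NONE && in->dst == reg;
}

static bool reads_reg(const ToyInsn *in, int reg)
{
    return reg != REG_NONE && (in->src0 == reg || in->src1 == reg);
}

static bool touches_acc(const ToyInsn *in)
{
    return in->implicit_acc_write || in->implicit_acc_read;
}

/* Returns true if `second` must not be moved ahead of `first`.
 * With track_implicit_acc == false, only explicit operands are checked,
 * which models a scheduler that is unaware of implicit ACC writes. */
bool must_stay_ordered(const ToyInsn *first, const ToyInsn *second,
                       bool track_implicit_acc)
{
    if (reads_reg(second, first->dst)) return true;  /* read after write */
    if (reads_reg(first, second->dst)) return true;  /* write after read */
    if (writes_reg(second, first->dst)) return true; /* write after write */
    if (!track_implicit_acc) return false;
    /* Implicit hazard: both touch ACC and at least one of them writes it. */
    return (first->implicit_acc_write && touches_acc(second)) ||
           (second->implicit_acc_write && touches_acc(first));
}
```

In this model, two instructions with disjoint explicit operands that both clobber ACC look freely reorderable when `track_implicit_acc` is false, and become ordered once it is true.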
