A piglit test which can be used to reproduce the problem: https://lists.freedesktop.org/archives/piglit/2018-May/024378.html
There is a similar bug filed for Nvidia driver https://devtalk.nvidia.com/default/topic/971005/linux/glviewportindexedf-wrongly-rounding-down-x-and-y-parameters/ .
I wasn't able to find the exact specification for such case but e.g. Wine expects coordinates not to be shifted (see wined3d_caps_gl_ctx_test_viewport_subpixel_bits function in Wine).
The closest line in specs is in the specification of ARB_viewport_array extension:
> Direct3D 11 specifies that rasterization along the one-pixel edges of
fractional viewports to be undefined. If implementations want defined
behavior with fractional viewports, they can program a slightly wider viewport
and scissor away the pixels along the edge of the expanded viewport.
Viewport's width and height is already ceiled on radeon driver and i965 behaves the same before gen8.
I've made and sent the patch to fix the issue.
This is a very subtle and tricky bug. I'm pretty sure we do have something wrong in our driver when it comes to rounding but it's not entirely clear to me exactly what is wrong. The problem comes from a mismatch between the way GL/Vulkan define clipping and the way it actually happens in hardware. In GL and Vulkan, cliipping is defined in what they call "clip space" which happens before the viewport transform. The x, y, and z components of the position which come out of the vertex shader are compared against the w component of the position and are clipped such that:
-pos.w <= pos.x <= pos.w
-pos.w <= pos.y <= pos.w
0 <= pos.z <= pos.w
The clip volume is further restricted by the clip distances (if any) and I believe the formula there is 0 <= clip[i] <= 1. AFTER clipping is complete, the viewport transform is applied (divide each vertex position by it's w coordinate and apply the viewport matrx) and if that lands you with some fractional thing, that's what happens. It has nothing to do with clipping.
In hardware, however, it almost never works this way. It turns out that 3D clipping in floating-point is really expensive and somewhat lossy because you're modifying the floating point values provided by the client. For these two reasons, you try to avoid it at all costs. Instead, the actual HW clip stage just does a very fast accept/reject pass on the incoming based on the viewport what's called the guard band and only does a full floating-point clip for extremely large primitives which cross the guardband. There's a very good picture of this in the Ivy Bridge PRM Vol. 2 Pt. 1 section 9.2.3 "Guard Band". Anything which passes the initial guard band clip is sent through the viewport transform and on to rasterization.
It's in the rasterization stage, after the viewport transform has been applied and things have been quantized to fixed-point and the hardware is trying to determine the actual pixel coverage that most clipping is done. It may sound counter-intuitive but it's actually way faster to just rasterize everything and clip as part of 2D rasterization than to do 3D clipping in floating point. In a sense, this 2D clipping is actually scissoring and not really clipping. There are three types of scissoring that are done at this point: The viewport clip, the drawing rectangle, and client-specified scissors.
Ok, now that you've taken Clipping Theory 101, we have to ask what's going on in this bug. First of all, I don't really know why the -1's are in there when we define the viewport clip volume. That seems completely wrong to me at least from a floating-point viewport volume perspective. That said, I strongly suspect it's there for a reason. If so, this likely means that the hardware rounds it and treats it as a maximum pixel coordinate. There are a couple of things that make me think that the "X/Y Min/Max ViewPort" fields are only ever used for the screen-space 2D clip and have nothing to do with clip-space 3D clipping:
1. The guard band test has been around since forever but the "X/Y Min/Max ViewPort" fields have only been around since Broadwell.
2. The the guardband is specified in clip space and you don't really need the viewport transform for the view volume clip if you truely do it before applying the viewport transform.
This leads me to think that the only point of the "X/Y Min/Max ViewPort" fields is for the 2D screen-space viewport clip. (It's really a scissor and not a clip but that's irrelevant.) Given that this is it's only use, I have no idea why it's specified in floating-point and not as an integer unless it's for multi-sampling.
What we need are a few things:
1. Some VERY good tests for both Vulkan and GL which stress all sorts of interesting fractional clipping and viewport corner-cases.
2. Figure out exactly what's up with the -1 in a floating-point field in a hardware packet. In particular, understand the relationship between the viewport clip and multisampling.
3. Write a correct patch which fixes all the various places in GL and Vulkan (I'm pretty sure there's more than one) so that we actually get clipping right.
In order to test this properly, we'll need to use MSAA with as many samples as possible (gen9+ supports 16x, gen8 8x, and gen7 and earlier only 4x) and we have to know exact sample positions so we can test precisely. With Vulkan, this likely means relying on standard sample positions. For GL, there's some way you can query the sample positions (which, for us, are just the Vulkan standard sample positions.)