Created attachment 130512 [details]
Noticeable jumps when drawing diagonal line
Versions 1.5 and higher seem to have introduced jumps in the pointer motion for data reported by the RMI4 driver. One example: when moving in a descending diagonal line from right to left, the pointer jumps down and then back into position. It may be that the new acceleration code is amplifying the motion. Here is the link to the kernel mailing list discussing this issue.
There is probably a kernel component which is contributing to this. But, since the jumps are much more noticeable on newer versions of libinput I thought it was worth looking into on the libinput side.
Created attachment 130513 [details]
Additional jumps when pointing
Created attachment 130522 [details]
Visualisation of attachment #130512 [details]
These are the three coordinates libinput deals with: 'device' for input in
device units, 'accel' for accelerated data in device units,
'normalized' for the data we pass to the caller, normalized to 1000dpi.
Note that they are *not* the same scale because of the slowdown we apply to
touchpads, but it still shows the issue.
The data was generated by simply adding each of the three deltas to its
respective static variable and printing the current value.
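As a rough illustration of how such a plot can be produced, here is a minimal Python sketch that accumulates per-event deltas into running positions. The function name and the sample deltas are made up for illustration; the actual data came from a small libinput patch, not from this script.

```python
from itertools import accumulate

def running_position(deltas):
    """Turn per-event (dx, dy) deltas into cumulative (x, y) positions."""
    deltas = list(deltas)  # allow generators; we iterate twice
    xs = accumulate(dx for dx, _ in deltas)
    ys = accumulate(dy for _, dy in deltas)
    return list(zip(xs, ys))

# Hypothetical stepping input: only one axis changes per event.
device_deltas = [(-2, 0), (0, 2), (-2, 0), (0, 2)]
for x, y in running_position(device_deltas):
    print(x, y)
```

Running the same accumulation over the 'device', 'accel' and 'normalized' deltas gives three lines that can be plotted against each other.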
There are a couple of issues here that all likely play into each other:
first, the device data *should* be diagonal for the input given but it's a
stepping motion. A quick skim of the evemu data shows this is correct, lots of
events with only one axis updating, only for the next axis to update in the
next event. To avoid this, we'd probably have to average through the last
two, three motion events to guess at the correct vector. Either way, the
input data is less than ideal here.
second, the triangle poking out is a libinput bug. I'm not 100% sure what's
happening there, it's the last of 3 triangles in a sequence in the input
data (at ca -60/45). Until then, the mapping is ca one-to-one, but that one
triggered something to poke out like this. Could be a timing-related thing,
I don't know yet.
third, because of the stepping input motion, any time acceleration is
applied, that stepping motion is just amplified (see the 'normalized' line
at ca -20/10).
So judging from this recording: less-than-ideal input but the big jump you
see is definitely libinput and should be fixable.
I do wonder if cfdaaa32a73b is partially responsible here.
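The idea of averaging over the last two or three motion events can be sketched as a simple moving average over deltas. This is only an illustration under assumed names (`smooth_deltas`, `window`); it is not what libinput does, and a real implementation would also have to deal with timestamps and the lag that averaging introduces.

```python
from collections import deque

def smooth_deltas(deltas, window=3):
    """Average each (dx, dy) delta with up to window-1 preceding deltas."""
    history = deque(maxlen=window)
    smoothed = []
    for dx, dy in deltas:
        history.append((dx, dy))
        n = len(history)
        smoothed.append((sum(d[0] for d in history) / n,
                         sum(d[1] for d in history) / n))
    return smoothed

# Hypothetical stepping input: x and y alternate instead of moving together.
stepping = [(-2, 0), (0, 2), (-2, 0), (0, 2)]
print(smooth_deltas(stepping, window=2))
```

With a window of two, the alternating one-axis deltas collapse into a steady diagonal vector after the first event, which is the effect the averaging is meant to achieve.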
 Note that for devices with xres != yres, y is normalized to the x
resolution. This does not apply here, x/y both have a resolution of 12.
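For reference, the normalization described in this note can be sketched as follows. This assumes the evdev convention that the axis resolution is given in units per mm, and it ignores the touchpad slowdown mentioned above; the function name is hypothetical and this is not libinput's actual code.

```python
MM_PER_INCH = 25.4
TARGET_DPI = 1000  # the "normalized to 1000dpi" target from the description

def normalize_delta(dx, dy, xres, yres):
    """Convert a device-unit delta to 1000dpi units.

    xres/yres are assumed to be in device units per mm.
    """
    # Scale y to the x resolution first, so both axes share one unit.
    dy_scaled = dy * xres / yres
    # Device units -> mm -> 1000dpi units.
    scale = (TARGET_DPI / MM_PER_INCH) / xres
    return dx * scale, dy_scaled * scale

# For the device in this bug both axes have a resolution of 12,
# so the y scaling step is a no-op.
print(normalize_delta(12, 12, 12, 12))
```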
Created attachment 130523 [details]
Visualisation of attachment #130513 [details]
see comment #2 for a description of the data
This one doesn't have a smoking gun like attachment #130522 [details], in part
because it's messier and a bit harder to compare. The best comparison is if
you follow back from the end of the 'normalized' line bottom right
(200/-140) past the first curve, there are two steps (250/0 and 190/70).
These are present in the input as stepping motion but nowhere near as bad.
There are others where little edges in the input data are amplified greatly,
causing the visible jumps you're describing.
So in summary, yes, we're looking at a libinput bug here but I don't quite
know how to fix this yet. It's not necessarily a pointer acceleration bug, I
rather suspect some more input delta smoothing will make these disappear.
I'll move this to the top of my priority list.
Created attachment 130524 [details]
Visualisation of attachment #130513 [details]
sorry, attached the wrong file at first
Created attachment 130525 [details]
libinput diff for the data generation - use with evemu-play to reproduce data file
looking at this again, there is another issue: we have some velocity prediction in libinput where we look at the movement and average the velocity over (a short) time. That resets whenever the direction changes significantly. The stepping motion does just that, it changes 90 degrees, resetting the acceleration history. That likely makes the pointer motion less predictable than it could be.
Not sure whether the timestamps have much of an influence, though they seem to be quite variable (baseline is 7ms).
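To illustrate why the stepping motion hurts here, below is a simplified sketch of a velocity average that resets its history on a significant direction change. The octant quantization, the reset threshold and all names are assumptions made for this example; libinput's real tracker code differs in detail.

```python
import math

def direction(dx, dy):
    """Quantize a delta to one of 8 octants (0 = +x), or None for no motion."""
    if dx == 0 and dy == 0:
        return None
    return round(math.atan2(dy, dx) / (math.pi / 4)) % 8

def averaged_velocities(events, max_octant_diff=1):
    """events: iterable of (dt_ms, dx, dy).

    Returns a list of (average velocity, history length) per event. The
    history is dropped whenever the direction changes by more than
    max_octant_diff octants.
    """
    history, last_dir, out = [], None, []
    for dt_ms, dx, dy in events:
        d = direction(dx, dy)
        if d is None or dt_ms <= 0:
            continue
        if last_dir is not None:
            diff = abs(d - last_dir) % 8
            if min(diff, 8 - diff) > max_octant_diff:
                history.clear()  # significant direction change: start over
        history.append(math.hypot(dx, dy) / dt_ms)
        last_dir = d
        out.append((sum(history) / len(history), len(history)))
    return out

# Hypothetical data at the 7ms baseline: stepping vs. a true diagonal.
stepping = [(7, -2, 0), (7, 0, 2), (7, -2, 0), (7, 0, 2)]
diagonal = [(7, -2, 2)] * 4
print([n for _, n in averaged_velocities(stepping)])
print([n for _, n in averaged_velocities(diagonal)])
```

In this sketch the stepping input never builds up more than one sample of history because every event turns by 90 degrees, while the true diagonal accumulates all four.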
Created attachment 130629 [details]
A diagonal line on system with Benjamin's kernel patch applied
I'm having this issue with Arch Linux kernel 4.11.7 on a Razer laptop and libinput 1.7.3
The pointer is very jumpy and barely usable at all. Is there any known workaround ?
Thanks for the help.
If you need help testing, I'm available. I have my netbook sitting in the corner, doing nothing.
It's not something exclusive to Wayland. Xorg too. Any distro I tried, except Debian with its older packages.
I'm still having this issue on 1.10. Any ideas on how to fix it?
This is bothering me enough that I would like to give it a try if you can offer some guidance :)
(In reply to Oscar Morante from comment #10)
> I'm still having this issue on 1.10. Any ideas on how to fix it?
> This is bothering me enough that I would like to give it a try if you can
> offer some guidance :)
For me, the combination of libinput 1.10 and kernel 4.15 is quite usable now (although it still has an occasional jump, but much less often than previously).
Another workaround is available – kernel 4.15 also handles PS/2-to-RMI switchover a bit better, so blacklisting i2c_hid (in my case) will make the touchpad remain in PS/2 legacy mode, which has no such issues.
(Which is fortunate, because 4.16 came with RMI device detection pre-broken and PS/2 is the only mode which still works.)
fwiw, with the omnidirectional hysteresis (1.10.2) any effects libinput had on the stepping motion for small movements should be gone now. Doesn't change the underlying data problem though.
On my laptop (Razer Blade late 2016) it doesn't only happen with small movements.
If I try to move in circles the cursor jumps all over the screen. I don't know how to measure the quality of the data but mtview doesn't seem to have any problem with it.
Oscar, I'll need another evemu recording for that then please, thanks
Created attachment 139543 [details]
Diagonal line on Razer Blade (late 2016)
There you go, sorry it took me so long.
Created attachment 139544 [details]
Circles on Razer Blade (late 2016)
This is me trying to go in circles in the center of the touchpad. This used to work fine and the pointer would stay around the center of the screen. But "now" (it's probably been like this for a year or more) it moves around the screen erratically.
sigh, this is garbage data coming out of the device. On the per slot data, the x and y data is split across two events, but the single touch emulation is correct.
see for example:
E: 0.026424 0003 0035 1718 # EV_ABS / ABS_MT_POSITION_X 1718
E: 0.026424 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +8ms
E: 0.026436 0003 0036 1041 # EV_ABS / ABS_MT_POSITION_Y 1041
E: 0.026436 0003 0000 1718 # EV_ABS / ABS_X 1718
E: 0.026436 0003 0001 1041 # EV_ABS / ABS_Y 1041
E: 0.026436 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +0ms
This explains why the synaptics driver doesn't have the same zig-zag motion, it only used ABS_X/ABS_Y, not the per-slot data so it never sees that bug.
Now that I fixed up my analysis scripts to identify this I found one jump like that in Andrew's original recording too, which may match that big triangle we saw in attachment 130522 [details].
Note how the events above show 0ms for the second frame, the delta is actually nonzero but always (?) less than 1 ms. This means we could potentially quirk this and put some heuristics in place, but a hwdb quirk will be needed.
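A rough sketch of how an analysis script might spot this pattern in an evemu recording: group the `E:` lines into frames at each SYN_REPORT, then flag a frame that updates only ABS_MT_POSITION_X when the next frame, less than 1 ms later, updates only ABS_MT_POSITION_Y. The function names and the exact heuristic are assumptions for illustration, not the actual scripts used here; the event codes are the standard evdev ones.

```python
EV_SYN, EV_ABS = 0x00, 0x03
SYN_REPORT = 0x00
ABS_MT_POSITION_X, ABS_MT_POSITION_Y = 0x35, 0x36

def parse_frames(lines):
    """Group evemu 'E:' lines into frames of (timestamp_s, {(type, code)})."""
    frames, current = [], set()
    for line in lines:
        if not line.startswith("E:"):
            continue
        fields = line.split("#", 1)[0].split()
        time_s = float(fields[1])
        ev_type, ev_code = int(fields[2], 16), int(fields[3], 16)
        if ev_type == EV_SYN and ev_code == SYN_REPORT:
            frames.append((time_s, current))
            current = set()
        else:
            current.add((ev_type, ev_code))
    return frames

def find_split_frames(frames, max_gap_s=0.001):
    """Indices of frames with slot X only, followed closely by slot Y only."""
    x, y = (EV_ABS, ABS_MT_POSITION_X), (EV_ABS, ABS_MT_POSITION_Y)
    hits = []
    for i in range(len(frames) - 1):
        (t0, ev0), (t1, ev1) = frames[i], frames[i + 1]
        if x in ev0 and y not in ev0 and y in ev1 and x not in ev1 \
                and (t1 - t0) < max_gap_s:
            hits.append(i)
    return hits

sample = """\
E: 0.026424 0003 0035 1718 # EV_ABS / ABS_MT_POSITION_X 1718
E: 0.026424 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +8ms
E: 0.026436 0003 0036 1041 # EV_ABS / ABS_MT_POSITION_Y 1041
E: 0.026436 0003 0000 1718 # EV_ABS / ABS_X 1718
E: 0.026436 0003 0001 1041 # EV_ABS / ABS_Y 1041
E: 0.026436 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +0ms
""".splitlines()

print(find_split_frames(parse_frames(sample)))
```

Since evdev only reports axes that changed, a frame with only one axis is not a bug by itself; the sub-millisecond gap to the following frame is what makes the pair suspicious in this sketch.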
Anyway, the first quick write-up of a potential fix is here:
Give this a try please and let me know how it goes. It's full of fixmes and will break some bluetooth mice, but for confirming whether I'm on the right path it's enough. Thanks!
Oscar, can you give the branch linked in comment #17 a try please? thanks
Created attachment 139691 [details] [review]
Andrew, could you try the attached patch on your kernel and report whether it works properly? This has a high chance of fixing the pointer jumps, though I am not 100% sure it's correct.
I'm running Peter's branch right now and it's a massive improvement! Thank you so much! I'm gonna test it for a couple of days and then I can send you another recording if you think it helps for fine-tuning.
Oscar, please try Benjamin's patch as well if you can. This is a 20-line fix in the kernel (if it works), compared to complex heuristics etc. in libinput. So the kernel patch should make the problem go away even without the libinput bits.
Sure, that's going to take me a bit longer but I'll report back as soon as I can try it.
I just tried Benjamin's patch with libinput's stable release and it also solves the issue for me. Thanks a lot for taking a look at it!
Thanks Oscar for testing this. I just submitted the patch upstream: https://patchwork.kernel.org/patch/10427341/
Great :) No problem adding my name/email, do I need to do anything?
(In reply to Oscar Morante from comment #25)
> Great :) No problem adding my name/email, do I need to do anything?
You don't have anything to do now, I already answered on the LKML :)
Closing as MOVED because we have the kernel fix in the queue now. Benjamin, please re-open this one if the kernel bug gets rejected for some reason, otherwise I'm going to assume this one is fixed.
Looks like the patch was accepted for v4.18 as:
(I guess it won't go to stable 4.16/4.17 though?)
Created attachment 140112 [details]