Bug 100436

Summary: Pointer jumps when using the rmi4 driver for precision touchpads
Product: Wayland Reporter: Andrew Duggan <aduggan>
Component: libinput Assignee: Peter Hutterer <peter.hutterer>
Status: RESOLVED MOVED QA Contact:
Severity: normal    
Priority: medium CC: alquimista.ds, benjamin.tissoires, grawity, nate, peter.hutterer, spacepluk
Version: 1.5.0   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments: Noticeable jumps when drawing diagonal line
Additional jumps when pointing
Visualisation of attachment #130512
Visualisation of attachment #130513
Visualisation of attachment #130513
libinput diff for the data generation - use with evemu-play to reproduce data file
A diagonal line on system with Benjamin's kernel patch applied
Diagonal line on Razer Blade (late 2016)
Circles on Razer Blade (late 2016)
0001-HID-rmi-use-HID_QUIRK_NO_INPUT_SYNC.patch
email garbage

Description Andrew Duggan 2017-03-28 18:51:46 UTC
Created attachment 130512 [details]
Noticeable jumps when drawing diagonal line

Versions 1.5 and higher seem to have introduced jumps in the pointer motion for data reported by the RMI4 driver. One example: when moving in a descending diagonal line from right to left, the pointer jumps down and then back into position. It may be that the new acceleration code is amplifying the motion. Here is the link to the kernel mailing list thread discussing this issue.

https://marc.info/?l=linux-kernel&m=148998603125837&w=2

There is probably a kernel component which is contributing to this. But, since the jumps are much more noticeable on newer versions of libinput I thought it was worth looking into on the libinput side.
Comment 1 Andrew Duggan 2017-03-28 18:52:29 UTC
Created attachment 130513 [details]
Additional jumps when pointing
Comment 2 Peter Hutterer 2017-03-29 05:53:45 UTC
Created attachment 130522 [details]
Visualisation of attachment #130512 [details]

These are the three coordinates libinput deals with: 'device' for input in
device units [1], 'accel' for accelerated data in device units [2],
'normalized' for the data we pass to the caller, normalized to 1000dpi.

Note that they are *not* the same scale because of the slowdown we apply to
touchpads, but it still shows the issue. 

The data was generated by simply adding each of the three deltas to its
respective static variable and printing the current value.
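
A minimal sketch of that accumulation, for readers who want to rebuild the plots from their own deltas. This is not the diff from the attachment: the function name and the sample deltas are hypothetical.

```python
from itertools import accumulate

def cumulative_path(deltas):
    """Turn a list of (dx, dy) deltas into running (x, y) positions."""
    xs = accumulate(dx for dx, _ in deltas)
    ys = accumulate(dy for _, dy in deltas)
    return list(zip(xs, ys))

# Hypothetical stepping input: only one axis updates per event.
device_deltas = [(-2, 0), (0, 2), (-2, 0), (0, 2)]
for x, y in cumulative_path(device_deltas):
    print(x, y)
```

Running the same function over the 'device', 'accel' and 'normalized' deltas gives three paths that can be plotted against each other.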

There are a couple of issues here that all likely play into each other:
first, the device data *should* be diagonal for the input given but it's a
stepping motion. A quick skim of the evemu data confirms this: lots of
events with only one axis updating, only for the other axis to update in the
next event. To avoid this, we'd probably have to average over the last
two or three motion events to guess at the correct vector. Either way, the
input data is less than ideal here.

second, the triangle poking out is a libinput bug. I'm not 100% sure what's
happening there, it's the last of 3 triangles in a sequence in the input
data (at ca -60/45). Until then, the mapping is ca one-to-one, but that one
triggered something to poke out like this. Could be a timing-related thing,
I don't know yet.

third, because of the stepping input motion, any time acceleration is
applied, that stepping motion is just amplified (see the 'normalized' line
at ca -20/10).
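
The averaging idea from the first point can be sketched as below. This is an illustration only, not libinput code: the function name, the window size and the sample deltas are assumptions.

```python
from collections import deque

def smooth_deltas(deltas, window=3):
    """Average each (dx, dy) delta with up to window-1 preceding deltas."""
    history = deque(maxlen=window)
    smoothed = []
    for dx, dy in deltas:
        history.append((dx, dy))
        n = len(history)
        smoothed.append((sum(d[0] for d in history) / n,
                         sum(d[1] for d in history) / n))
    return smoothed

# Hypothetical stepping input for a right-to-left descending diagonal.
steps = [(-2, 0), (0, 2), (-2, 0), (0, 2)]
for dx, dy in smooth_deltas(steps, window=2):
    print(dx, dy)
```

With a window of two, every delta after the first points along the diagonal instead of alternating between the axes. The cost is a small amount of lag, which is why the window would have to stay short.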

So judging from this recording: less-than-ideal input but the big jump you
see is definitely libinput and should be fixable.

I do wonder if cfdaaa32a73b is partially responsible here.

[1] Note that for devices with xres != yres, y is normalized to the x
resolution. This does not apply here, x/y both have a resolution of 12.
Comment 3 Peter Hutterer 2017-03-29 05:59:37 UTC
Created attachment 130523 [details]
Visualisation of attachment #130513 [details]

see comment #2 for a description of the data

This one doesn't have a smoking gun like attachment #130522 [details], in part
because it's messier and a bit harder to compare. The best comparison is if
you follow back from the end of the 'normalized' line bottom right
(200/-140) past the first curve, there are two steps (250/0 and 190/70).
These are present in the input as stepping motion but nowhere near as bad.

There are others where little edges in the input data are amplified greatly,
causing the visible jumps you're describing.

So in summary, yes, we're looking at a libinput bug here but I don't quite
know how to fix this yet. It's not necessarily a pointer acceleration bug, I
rather suspect some more input delta smoothing will make these disappear.

I'll move this to the top of my priority list.
Comment 4 Peter Hutterer 2017-03-29 06:00:20 UTC
Created attachment 130524 [details]
Visualisation of attachment #130513 [details]

sorry, attached the wrong file at first
Comment 5 Peter Hutterer 2017-03-29 06:07:18 UTC
Created attachment 130525 [details]
libinput diff for the data generation - use with evemu-play to reproduce data file
Comment 6 Peter Hutterer 2017-03-31 08:08:32 UTC
looking at this again, there is another issue: we have some velocity prediction in libinput where we look at the movement and average the velocity over (a short) time. That resets whenever the direction changes significantly. The stepping motion does just that, it changes 90 degrees, resetting the acceleration history. That likely makes the pointer motion less predictable than it could be.
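
A rough sketch of why the stepping motion defeats that averaging. It does not reproduce libinput's tracker: the angle test, the 60-degree threshold and all names are assumptions chosen for illustration.

```python
import math

def angle_between(a, b):
    """Unsigned angle in degrees between two 2D vectors."""
    dot = a[0] * b[0] + a[1] * b[1]
    cross = a[0] * b[1] - a[1] * b[0]
    return abs(math.degrees(math.atan2(cross, dot)))

def history_lengths(deltas, max_angle=60.0):
    """Return, per event, how many samples the averaging history holds."""
    history = []
    lengths = []
    for delta in deltas:
        if history and angle_between(history[-1], delta) > max_angle:
            history.clear()  # direction changed: forget earlier samples
        history.append(delta)
        lengths.append(len(history))
    return lengths

diagonal = [(-1, 1)] * 4                      # what the finger actually does
stepping = [(-2, 0), (0, 2), (-2, 0), (0, 2)]  # what the device reports
print(history_lengths(diagonal))
print(history_lengths(stepping))
```

For the true diagonal the history keeps growing, so the velocity estimate stabilises. For the stepping input every event turns by 90 degrees, so the history never holds more than one sample.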

Not sure whether the timestamps have much of an influence, though they seem to be quite variable (baseline is 7ms).
Comment 7 Andrew Duggan 2017-03-31 22:08:59 UTC
Created attachment 130629 [details]
A diagonal line on system with Benjamin's kernel patch applied
Comment 8 zganus 2017-07-05 00:06:19 UTC
I'm having this issue with Arch Linux kernel 4.11.7 on a Razer laptop and libinput 1.7.3
The pointer is very jumpy and barely usable at all. Is there any known workaround?
Thanks for the help.
Comment 9 Douglas 2017-07-18 13:07:12 UTC
If you need help testing, I'm available. I have my netbook sitting in the corner, doing nothing.
It's not something exclusive to Wayland. Xorg too. Any distro I tried, except Debian with its older packages.
Comment 10 Oscar Morante 2018-04-19 07:55:46 UTC
I'm still having this issue on 1.10.  Any ideas on how to fix it?

This is bothering me enough that I would like to give it a try if you can offer some guidance :)
Comment 11 Mantas Mikulėnas 2018-04-19 08:29:39 UTC
(In reply to Oscar Morante from comment #10)
> I'm still having this issue on 1.10.  Any ideas on how to fix it?
> 
> This is bothering me enough that I would like to give it a try if you can
> offer some guidance :)

For me, the combination of libinput 1.10 and kernel 4.15 is quite usable now (although it still has an occasional jump, much less often than previously).

Another workaround is available – kernel 4.15 also handles PS/2-to-RMI switchover a bit better, so blacklisting i2c_hid (in my case) will make the touchpad remain in PS/2 legacy mode, which has no such issues.

(Which is fortunate, because 4.16 came with RMI device detection pre-broken and PS/2 is the only mode which still works.)
Comment 12 Peter Hutterer 2018-04-20 00:12:16 UTC
fwiw, with the omnidirectional hysteresis (1.10.2) any effects libinput had on the stepping motion for small movements should be gone now. Doesn't change the underlying data problem though.
Comment 13 Oscar Morante 2018-04-20 10:03:26 UTC
On my laptop (Razer Blade late 2016) it doesn't only happen with small movements.

If I try to move in circles the cursor jumps all over the screen.  I don't know how to measure the quality of the data but mtview doesn't seem to have any problem with it.
Comment 14 Peter Hutterer 2018-04-27 07:09:26 UTC
Oscar, I'll need another evemu recording for that then please, thanks
Comment 15 Oscar Morante 2018-05-13 15:04:51 UTC
Created attachment 139543 [details]
Diagonal line on Razer Blade (late 2016)

There you go, sorry it took me so long.
Comment 16 Oscar Morante 2018-05-13 15:09:58 UTC
Created attachment 139544 [details]
Circles on Razer Blade (late 2016)

This is me trying to go in circles in the center of the touchpad.  This used to work fine and the pointer would stay around the center of the screen. But "now" (it's been probably a year or more like this) it moves around the screen erratically.
Comment 17 Peter Hutterer 2018-05-17 05:32:09 UTC
sigh, this is garbage data coming out of the device. On the per slot data, the x and y data is split across two events, but the single touch emulation is correct.

see for example:

E: 0.026424 0003 0035 1718	# EV_ABS / ABS_MT_POSITION_X    1718
E: 0.026424 0000 0000 0000	# ------------ SYN_REPORT (0) ---------- +8ms
E: 0.026436 0003 0036 1041	# EV_ABS / ABS_MT_POSITION_Y    1041
E: 0.026436 0003 0000 1718	# EV_ABS / ABS_X                1718
E: 0.026436 0003 0001 1041	# EV_ABS / ABS_Y                1041
E: 0.026436 0000 0000 0000	# ------------ SYN_REPORT (0) ---------- +0ms

This explains why the synaptics driver doesn't have the same zig-zag motion, it only used ABS_X/ABS_Y, not the per-slot data so it never sees that bug.

Now that I fixed up my analysis scripts to identify this I found one jump like that in Andrew's original recording too, which may match that big triangle we saw in attachment 130522 [details].
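
A minimal sketch of how such frames could be flagged in an evemu recording. This is not the analysis script mentioned above: the function name is hypothetical, and it assumes event lines of the form shown in the excerpt, with type and code in hex.

```python
EV_SYN, EV_ABS = 0x00, 0x03
SYN_REPORT = 0x00
ABS_MT_POSITION_X, ABS_MT_POSITION_Y = 0x35, 0x36

def find_split_frames(lines):
    """Return timestamps of frames updating only one of the per-slot X/Y axes."""
    split, codes = [], set()
    for line in lines:
        if not line.startswith("E:"):
            continue
        fields = line.split("#", 1)[0].split()  # drop the trailing comment
        if len(fields) != 5:
            continue
        _, timestamp, ev_type, ev_code, _value = fields
        ev_type, ev_code = int(ev_type, 16), int(ev_code, 16)
        if ev_type == EV_ABS:
            codes.add(ev_code)
        elif ev_type == EV_SYN and ev_code == SYN_REPORT:
            # Exactly one of the two per-slot position axes was seen.
            if (ABS_MT_POSITION_X in codes) != (ABS_MT_POSITION_Y in codes):
                split.append(float(timestamp))
            codes.clear()
    return split

sample = [
    "E: 0.026424 0003 0035 1718\t# EV_ABS / ABS_MT_POSITION_X    1718",
    "E: 0.026424 0000 0000 0000\t# ------------ SYN_REPORT (0) ---------- +8ms",
    "E: 0.026436 0003 0036 1041\t# EV_ABS / ABS_MT_POSITION_Y    1041",
    "E: 0.026436 0003 0000 1718\t# EV_ABS / ABS_X                1718",
    "E: 0.026436 0003 0001 1041\t# EV_ABS / ABS_Y                1041",
    "E: 0.026436 0000 0000 0000\t# ------------ SYN_REPORT (0) ---------- +0ms",
]
print(find_split_frames(sample))
```

A frame with only one axis is not proof of the bug, since a purely horizontal or vertical movement legitimately updates a single axis. The result is a list of candidates to check against the frame timing.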

Note how the events above show 0ms for the second frame, the delta is actually nonzero but always (?) less than 1 ms. This means we could potentially quirk this and put some heuristics in place, but a hwdb quirk will be needed.
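
One possible shape for such a heuristic, sketched under stated assumptions: frames are modelled as (timestamp, updates) pairs, the 1 ms threshold comes from the observation above, and the function name and the first sample frame are hypothetical. It is not the code in the branch linked below.

```python
def merge_split_frames(frames, threshold=0.001):
    """Fold a frame into its predecessor if it arrives within `threshold` seconds."""
    merged = []
    for timestamp, updates in frames:
        if merged and timestamp - merged[-1][0] < threshold:
            _, prev_updates = merged[-1]
            # Later values win; the merged frame takes the later timestamp.
            merged[-1] = (timestamp, {**prev_updates, **updates})
        else:
            merged.append((timestamp, dict(updates)))
    return merged

frames = [
    # Hypothetical earlier frame, roughly 8 ms before the excerpt.
    (0.018000, {"ABS_MT_POSITION_X": 1700, "ABS_MT_POSITION_Y": 1030}),
    # The two frames from the excerpt above.
    (0.026424, {"ABS_MT_POSITION_X": 1718}),
    (0.026436, {"ABS_MT_POSITION_Y": 1041, "ABS_X": 1718, "ABS_Y": 1041}),
]
for timestamp, updates in merge_split_frames(frames):
    print(timestamp, updates)
```

Merging on timing alone would also batch events from devices that legitimately send frames less than 1 ms apart, so it would need to be limited to affected devices, which is the reason a hwdb quirk is needed.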

Anyway, the first quick write-up of a potential fix is here:
https://github.com/whot/libinput/tree/wip/0ms-batch-processing
Give this a try please and let me know how it goes. It's full of fixmes and will break some bluetooth mice, but for confirming whether I'm on the right path it's enough. Thanks!
Comment 18 Peter Hutterer 2018-05-23 07:32:22 UTC
Oscar, can you give the branch linked in comment #17 a try please? thanks
Comment 19 Benjamin Tissoires 2018-05-23 09:13:28 UTC
Created attachment 139691 [details] [review]
0001-HID-rmi-use-HID_QUIRK_NO_INPUT_SYNC.patch

Andrew, could you try the attached patch on your kernel and report if this is working properly? This has a high chance of fixing the pointer jumps though I am not 100% sure it's correct.
Comment 20 Oscar Morante 2018-05-23 09:46:21 UTC
I'm running Peter's branch right now and it's a massive improvement! Thank you so much! I'm gonna test it for a couple of days and then I can send you another recording if you think it helps for fine-tuning.
Comment 21 Peter Hutterer 2018-05-23 10:01:14 UTC
Oscar, please try Benjamin's patch as well if you can. This is a 20-line fix in the kernel (if it works), compared to complex heuristics etc. in libinput. So the kernel patch should make the problem go away even without the libinput bits.
Comment 22 Oscar Morante 2018-05-23 10:18:47 UTC
Sure, that's going to take me a bit longer but I'll report back as soon as I can try it.
Comment 23 Oscar Morante 2018-05-24 15:12:27 UTC
I just tried Benjamin's patch with libinput's stable release and it also solves the issue for me.  Thanks a lot for taking a look at it!
Comment 24 Benjamin Tissoires 2018-05-25 12:53:02 UTC
Thanks Oscar for testing this. I just submitted the patch upstream: https://patchwork.kernel.org/patch/10427341/
Comment 25 Oscar Morante 2018-05-25 18:55:27 UTC
Great :)  No problem adding my name/email, do I need to do anything?
Comment 26 Benjamin Tissoires 2018-05-28 16:38:26 UTC
(In reply to Oscar Morante from comment #25)
> Great :)  No problem adding my name/email, do I need to do anything?

You don't have anything to do now, I already answered on the LKML :)
Comment 27 Peter Hutterer 2018-05-28 22:56:41 UTC
Closing as MOVED because we have the kernel fix in the queue now. Benjamin, please re-open this one if the kernel bug gets rejected for some reason, otherwise I'm going to assume this one is fixed.
Comment 28 Mantas Mikulėnas 2018-06-11 07:12:53 UTC
Looks like the patch was accepted for v4.18 as:
https://git.kernel.org/linus/c94ba060112ad24fa29b2bdafc0c32173e1f1959

(I guess it won't go to stable 4.16/4.17 though?)
Comment 29 Andrew Duggan 2018-06-11 07:13:05 UTC
Created attachment 140112 [details]
email garbage
