This has several steps:
- add screenshooting to the wl_test extension
- implement it in the test plugin
- make sure it works with the headless backend
In the test framework:
- add a way to fetch screenshots
- add a function to compute a checksum from the screen capture
- add an environment variable that, when set to a directory, causes all screenshots to be written there as image files, named after the test they came from
In the applicable tests, mainly those that test that the Pixman renderer and the flavors of the GL renderer work with the headless backend:
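A minimal sketch of the two framework helpers listed above, under stated assumptions: the function names, the environment variable name `WESTON_TEST_SCREENSHOT_DIR`, the `.png` suffix and the choice of FNV-1a as the checksum are all hypothetical and not taken from Weston. The pixel data is assumed to be 32 bits per pixel with a row stride given in bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical variable name; the task does not name one. */
#define SCREENSHOT_DIR_ENV "WESTON_TEST_SCREENSHOT_DIR"

/* FNV-1a over the visible pixels only. Row padding is skipped so that
 * the stride of the capture buffer does not influence the result.
 * Assumes 4 bytes per pixel and a stride in bytes. */
static uint32_t
screenshot_checksum(const uint8_t *pixels, int width, int height, int stride)
{
	uint32_t hash = 2166136261u;
	int x, y;

	for (y = 0; y < height; y++) {
		const uint8_t *row = pixels + (size_t)y * stride;

		for (x = 0; x < width * 4; x++) {
			hash ^= row[x];
			hash *= 16777619u;
		}
	}

	return hash;
}

/* Build "<dir>/<test_name>.png" when the variable is set.
 * Returns 0 on success, -1 if dumping is disabled or the path
 * does not fit into buf. */
static int
screenshot_dump_path(const char *test_name, char *buf, size_t len)
{
	const char *dir = getenv(SCREENSHOT_DIR_ENV);
	int n;

	if (!dir || !*dir)
		return -1;

	n = snprintf(buf, len, "%s/%s.png", dir, test_name);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A test would call `screenshot_checksum()` on the captured buffer and compare the result against a constant recorded after the image was checked by hand.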
- take a screenshot at the end
- manually check that the recorded image is correct
- hardcode the expected checksum into the code, and make the test fail if the computed checksum from the screenshot differs
You also need to make sure that Weston's rendering is completely deterministic:
- disable random window placement
- ensure the wallpaper is always the same
- ensure the pointer cursors are always the same
- show a fake time in the weston-desktop-shell panel
The end goal is that we will be able to easily write more tests that check the real output on screen. This tests the whole path from client rendering to the compositor, through compositing, and onto the (fake) screen.
We will also be able to run these tests reliably without any particular hardware requirements.
As Derek pointed out on IRC, GL is not pixel-perfect between implementations, so we cannot use checksumming there.
I wonder if it would be feasible to generate the reference images with the Pixman renderer at test runtime, so that we don't need to store reference images in the repository?
Listing checksums for each GL implementation is not feasible. I also would not want to clutter the Weston repository with reference images, as that would likely be a lot of data.
Some plan for the GL-renderer would be needed. Using Pixman generated images as the reference will limit GL testing to features that Pixman renderer supports, but maybe that is enough?
This also reminds us that we need to verify with Pixman upstream whether Pixman-rendered results are really intended to be pixel-perfect between Pixman releases.
If we cannot do checksumming at all, I fear we need a reference image repository.
We don't have to require pixel-precision to do absolutely everything, mind. We can get a lot of the way there by using solid colours and subregions:
- deterministic background colour
- clients composed of deterministic coloured rectangles
- test subregions of where clients are supposed to be (i.e. avoiding borders) to ensure the colours are as they are supposed to be
That alone should get us enough to test the transform and rendering stack fully.
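The subregion idea could be sketched roughly as below. This is only an illustration: `struct rect` and `region_is_solid()` are hypothetical names, and the screenshot is assumed to be an array of 32-bit pixels with the stride given in pixels.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rectangle type, in screenshot pixel coordinates. */
struct rect {
	int x, y, width, height;
};

/* Return true if every pixel inside r equals the expected colour.
 * The caller picks r well inside the client area, so that borders
 * and edge filtering are never sampled. */
static bool
region_is_solid(const uint32_t *pixels, int stride_px,
		const struct rect *r, uint32_t expected)
{
	int x, y;

	for (y = r->y; y < r->y + r->height; y++) {
		const uint32_t *row = pixels + (size_t)y * stride_px;

		for (x = r->x; x < r->x + r->width; x++) {
			if (row[x] != expected)
				return false;
		}
	}

	return true;
}
```

A test would then assert one such region per client rectangle, plus one over an area where only the background colour should be visible.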
I think we can close this as fixed by now.
We have the framework for doing screenshot tests on the Pixman renderer in place. We use reference images stored in git instead of checksums.
Some of the open things are:
- merging the test shell, so that we get deterministic window placement
- adding GL support
- ensuring wallpaper and panel are deterministic
- fixing races against weston-desktop-shell start-up if necessary
- fuzzy image matching
- ensuring cursor images are deterministic
But I think all those are a matter for other tasks.
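For reference, the fuzzy image matching item in the list above could look roughly like this. It is a sketch under assumptions, not Weston's implementation: the function name is hypothetical, both images are assumed to be tightly packed 32-bit pixels of equal size, and the tolerance is applied to each 8-bit channel separately.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Compare two images pixel by pixel, allowing each 8-bit channel
 * to differ by at most `tolerance`. A tolerance of 0 is an exact match. */
static bool
images_match_fuzzy(const uint32_t *a, const uint32_t *b,
		   size_t n_pixels, int tolerance)
{
	size_t i;
	int shift;

	for (i = 0; i < n_pixels; i++) {
		for (shift = 0; shift < 32; shift += 8) {
			int ca = (a[i] >> shift) & 0xff;
			int cb = (b[i] >> shift) & 0xff;
			int diff = ca > cb ? ca - cb : cb - ca;

			if (diff > tolerance)
				return false;
		}
	}

	return true;
}
```

A small tolerance would let one reference image serve renderers that are not pixel-perfect with each other, at the cost of missing very subtle colour errors.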