This allows handling them without ever needing to offscreen for losing
the clip, because the clip can always be transformed.
Also, all the optimizations keep working, like occlusion culling,
clears, and so on.
The main benefit of this work is the ability for offloading to now
handle dihedral transforms of the video buffer.
The other big advantage is that we can now start our rendering with a
dihedral transform from the compositor.
Using clear avoids the shader engine (see last commit), so if we can get
pixels out of it, we should.
So we detect the overlap with the rounded corners of the clip region and
emit shaders for those, but then use Clear() for the rest.
With this in place, widget-factory on my integrated Intel TigerLake gets
a 60% performance boost.
The idea is that for a rectangle intersection, each corner of the
result is either entirely part of one original rectangle or it is
an intersection point.
By detecting those 2 cases and treating them differently, we can
simplify the code to compare rounded rectangles.
This includes a copy of the diff(1) algorithm used by git diff by Davide
Libenzi.
It's used for the common case ofcontainer nodes having only very few
changes for the few nodes of child widgets that changed (like a button
lighting up when hilighted or a spinning spinner).
This is now tracking the clips added by the clip nodes.
If any particular node can't deal with a clip, it falls back to Cairo
rendering. But if it can, it will render it directly.