If one of the descriptor sets doesn't have any items, don't include it
in the sets passed to vkUpdateDescriptorSets().
This has no effect right now, because we either have both images and
samplers or neither, but it will become relevant once we also support
buffers.
Respect the matrix in use at time of encountering a repeat node so that
the offscreen uses roughly the same device pixel density as the target.
Fixes the handling of the clipped-repeat test.
Sometimes the GPU is still busy when the next frame starts (like when
no-vsync benchmarking), so we need to keep all those resources alone and
create new ones.
That's what the render object is for, so we just create another one.
However, when we create too many, we'll starve the CPU. So we'll limit
it. Currently, that limit is at 4, but I've never reached it (I've also
not starved the GPU yet), so that number may want to be set lower/higher
in the future.
Note that this is different from the number of outstanding buffers, as
those are not busy on the GPU but on the compositor, and as such a
buffer may have not finished rendering but have been returend from the
compositor (very busy GPU) or have finished rendering but not been
returned from the compositor (very idle GPU).
The idea here is that we can do more complex combinations and use that
to support texture-scale nodes or use fancy texture formats (suc as
YUV).
I'm not sure this is actually necessary, but for now it gives more
flexibility.
For blend and crossfade nodes, one of the children may exist and
influence the rendering, while the other does not.
Previously, we would skip the node, which would cause the required
rendering to not happen. We now send a valid texture id for the
invalid offscreen, thereby actually rendering the required parts.
Fixes the blend-invisible-child compare test
Current state for compare tests:
Ok: 397
Expected Fail: 0
Fail: 26
Unexpected Pass: 0
Skipped: 2
Timeout: 0
Instead of having a descriptor set per operation, we just have one
descriptor set and bind all our images into it.
Then the shaders get to use an index into the large texture array
instead.
Getting this to work - because it's a Vulkan extension that needs to be
manually enabled, even though it's officially part of Vulkan 1.2 - is
insane.
If we have a rectangular clip without transforms, we can use
scissoring. This works particularly well because it allows intersecting
rounded rectangles with regular rectangles in all cases:
Use the scissor rect for the rectangle and the normal clipping code for
the rounded rectangle.
The idea is to use it for clip nodes when they are integer-aligned.
To do that, we need to track the scissor rect in the parse state, so we
do that, too.
Also move the viewport offset out of the projection matrix, as it is
part of the transform between clip and scissor, so it needs to live in
the offset.
We align the data to a multiple of vertex stride, that way we use more
memory, but we could compute an offset into the vertex buffer without
changing the offset.
We can set the vertex offset while counting the data, this gets rid of
the need of passing all the counting machinery into the actual data
collection code.
When attempting a complex transform, check if the clip can be ignored
and do that if possible.
That way we don't cause fallbacks when transforming the clip is too
complex.
The idea is that for a rectangle intersection, each corner of the
result is either entirely part of one original rectangle or it is
an intersection point.
By detecting those 2 cases and treating them differently, we can
simplify the code to compare rounded rectangles.
Instead of emitting the render commands once per rectangle of the clip
region, just emit them once with the region's extents.
This is generally faster because it emits fewer commands to the GPU,
even though it may touch significantly more pixels.
For a proper method, we'd need to record the commands per clip rectangle
instead of emitting all of them all the time.
The border and color shaders - the ones that do AA - now multiply their
coordinates by the scale factor, which gives them better rounding
capabilities.
This in particular improves the case where they are used in fractional
scaling situations, where the scale is defined at the root element.
Previously, we just used the defaultscale factor, but now that we're
having it available in push constants, we can read it back for creating
offscreens and rendering fallbacks.
So do that.
It's a 1:1 replacement for GskVulkanPushConstants, just without the
indirection through a different file.
GskVulkanPushConstants as a struct is gone now.
The file still exists to handle the push_constants operation.
1. Use a graphene_vec2_t
2. Ensure it's always positive
3. Don't break with fallback
The scale value is nothing more than an indication of how many pixels to
assume per unit of a node.
We don't want to render the offscreen trnsformed, we want to render it
as-is.
We lose the correct scale factor, but that requires some separate work,
so for now it gets a bit blurry on hidpi.
This introduces the rect object and adds a rect_distance() and
rect_coverage() function.
_distance() returns the signed distance tp the rectangle.
_coverage() returns the coverage of a pixel centered at that position.
Note that the pixel size is computed using dFdx/dFdy.
When the node bounds were a non-integer size, the texture would get
ceil()ed pixels, but various viewport or scissor computations might
floor() instead, leaving the right/bottom row of pixels untouched.
Make sure those functions ceil(), too.
This is not the optimal way of doing it: we're
reuploading the texture with client-side conversion.
But it fits nicely into our current handling of mipmaps.
We can do better once we use shaders for colorspace
conversions.
Make the callers of this function check for
straight alpha themselves, and only do the
version compatibility check here. This makes
the function usable in contexts where straight
alpha is acceptable.
When the command queue is out of batches, there is
no point in doing further work like allocating uniforms.
This helps us avoid assertions in the uniform code
that we would hit when we run out of uniform space
too.
When we start ignoring batches, we must do it everywhere,
or we may run into assertions. This was triggered by an
enormous text node tree produced by tests/rendernode-create.
Vulkan has a different initial coordinate system to GL.
GL:
(-1, 1, -1) +------+.
|`. | `.
| `·--|---·
| : | :
+------+. :
`. : `.:
`·------· (1, -1, 1)
Vulkan:
(-1, -1, 0) +------+.
|`. | `.
| `·--|---·
| : | :
+------+. :
`. : `.:
`·------· (1, 1, 1)
so adjust the near and far plane we pass to
graphene_matrix_init_ortho() to make it end up with the same
projection as the GL renderer.
This was the intention, but the object data by itself
does not achieve that: We do run dispose on the display
when it is closed, but object data is only cleared in
finalize. So listen to the ::closed signal and remove
the driver ourselves.
Fix up the drivers dispose implementation enough for
that to actually work.