Images can now have samplers - meaning they must be rendered with that
sampler. It also means that sampler must be handled as an immutable
sampler in descriptorsets.
These samplers can be created with a samplerYcbcrConversion, so code has
been added to pass that conversion when creating the imageview.
Also add code to GskVulkanFrame to track immutable samplers.
Nobody is making use of this yet.
Define an array with a compile-time-constant variable size for the
immutable samplers.
A bunch of work is necessary to ensure that at least one element is in
the sampler array, because the GLSL code
sampler2D immutable_textures[0];
is invalid.
This allows having different layouts sothat we can support immutable
samplers, whcih are required for multiplane and YUV formats.
We don't use them yet.
use it to collect the optional features we are interested in and turn
them on only if available.
For now we add the dmabuf features, but we don't use them yet.
The main reason here is that we want to not fail when the texture size
is larger than the supported GpuImage size.
When that happens, for now we just fallback slowly - ulitmately to
drawing with Cairo, which is going to be clipped.
There's multiple uses I want it for:
1. Generating the box-shadow area for blurring
2. Generating masks for rounded-rect masking
3. Optimizing the common use case of rounded-clip + color
Only the last one is implemented in this commit.
Don't try to use all those fancy GL features like glMapBuffer() and
such. Just malloc() some buffer memory and glBufferSubData() it later.
That works everywhere and is faster than (almost?) any combination of
fancy new buffer APIs. And yes I'm frustrated because I played with
those flags and none of them were better than this.
Doubles the framerate on my discrete AMD GPU.
Introduce a new GskGpuImageDescriptors object that tracks descriptors
for a set of images that can be managed by the GPU.
Then have each GskGpuShaderOp just reference the descriptors object they are
using, so that the coe can set things up properly.
To reference an image, the ops now just reference their descriptor -
which is the uint32 we've been sending to the shaders since forever.
Use glDrawArraysInstancedBaseInstance() to draw. (Yay for GL naming.)
That allows setting up the offset in the vertex array without having to
glVertexAttribPointer() everything again.
However, this is only supported since GL 4.2 and not at all in stock GLES,
so we need to have code that can work without it.
Fortunately, it is mandatory in Vulkan, so every recent GPU supports it.
And if that GPU has a proper driver, it will also expose the GL extension
for it.
(Hint: You can check https://opengles.gpuinfo.org/listextensions.php for
how many proper drivers exist outside of Mesa.)
The env var allows skipping various optimizations in the GPU shader.
This is useful for testing during development when trying to figure
out how to make a renderer as fast as possible.
We could also use it to enable/disable optimizations depending on GL
version or so, but I didn't think about that too much yet.
When drawing opaque color regions that are large enough, use
vkCmdClearAttachments()/glClear() instead of a shader. This speeds up
background rendering on particular on older GPUs.
See the commit messages of
bb2cd7225ece042f7ba10edd7547c1
for a further discussion of performance impacts.
The previous algorithm would reverse the order of subpasses, whcih leads
to unexpected behavior if dependent subpasses are not added as children
of a subpass, but just as a previous subpass - like when a subpass is
used multiple times later.
An example for this is a shadow node with multiple shadows - the source
of the shadow is used by the multiple shadows.
So ensure that adjacent subpasses stay in the same order.
The code generated by glslc -O is optimized worse by Mesa than
code generated unoptimized.
So generate unoptimized code until somebody figures out what's going
wrong here.
They're done using the pattern shader.
The pattern shader now gained a stack where vec4's can be pushed and
popped back later, which allows storing the position before computing
the new position inside the repeat node's child.
Due to GLES and old GL not allowing non-constant texture array
lookups,we need to turn the array lookup into a big switch statementin
those versions, and that requires putting the texture() call into that
switch.
But with that trick, we can use texture IDs in GLSL.
... and use it for glyphs.
The name is a slight variation of the "coloring" name from the GL
renderer.
The functionality is exactly what the "glyph" shader from the Vulkan
renderer does.
1. Compute the fwidth() twice with offset offsets
That way, we avoid glitches at the boundary between 0.0 and 1.0,
because by offsetting it by 0.5, that boundary goes away.
Then we take the min() of both which gives us the one we care about.
2. Set the gradient to repeating
By doing that, we don't get values at the 0.0/1.0 boundary clamped,
but things smoothly transition.
This smoothes the line at that boundary and makes it look just like
every other line.
Instead of strictly rounding to the given clip rectangle, increase the
rectangle to the next pixel boundary.
Also add docs that the clip_bounds do not influence the actual size of
the returned image.
It's just an object that encapsulates everything needed to create (the
data for) a pattern op.
It also clarifies which code does what, because now the NodeProcessor
and the PatternWriter are 2 different things.
Pretty much a copy of the Vulkan border shader.
A notable change is that the input arguments are changed, because GL
gets confused if you put a mat4 at the end.
when doing get_node_as_image(), that may spawn a new buffer writer that
writes into the samme buffer when rendering an offscreen with patterns.
So as a more or less hacky workaround, we now abort the current buffer
write and restart it once we've created the image.