Take a rendernode as source and a GskPath and GskStroke,
and fill the area that is covered when stroking the path
with the given stroke parameters, like cairo_stroke() would.
Instead of scale and whatnot, pass:
1. The image size
2. The viewport to map to that image size
and compute everything else from there.
In particular, we set the Vulkan viewport to the image dimensions
instead of the viewport size.
All of this makes things a lot simpler while keeping the required
functionality.
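A minimal sketch of how scale and offset could be derived from just the image size and the viewport. The Rect and Mapping types and both function names are made up for illustration and are not the actual GSK code:

```c
#include <assert.h>

/* Hypothetical types for illustration; not the actual GSK structs. */
typedef struct { float x, y, width, height; } Rect;
typedef struct { float scale_x, scale_y, offset_x, offset_y; } Mapping;

/* Derive scale and offset so that the viewport rectangle maps onto an
 * image of image_width x image_height pixels. */
static Mapping
mapping_from_viewport (int image_width, int image_height, const Rect *viewport)
{
  Mapping m;

  assert (viewport->width > 0 && viewport->height > 0);

  m.scale_x = image_width / viewport->width;
  m.scale_y = image_height / viewport->height;
  m.offset_x = -viewport->x;
  m.offset_y = -viewport->y;

  return m;
}

/* Map a point from viewport coordinates to image pixel coordinates. */
static void
mapping_apply (const Mapping *m, float x, float y, float *out_x, float *out_y)
{
  *out_x = (x + m->offset_x) * m->scale_x;
  *out_y = (y + m->offset_y) * m->scale_y;
}
```

With this, the viewport origin lands on pixel (0, 0) and the opposite viewport corner lands on (image_width, image_height).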
When a GdkMemoryFormat is not supported natively and there's
postprocessing required, add a way to mark a VulkanImage as such via the
new postprocess flags.
Also allow creating such images only with new_for_upload(), detect
when that is the case, and then run a postprocessing step that converts
the image to a suitable format.
This is done with a new "convert" shader/op.
This now supports all formats natively; no conversions happen on the CPU
anymore (unless the GPU is old).
The API was using regions because it always had. But all the code ever
did was get the extents of the region.
So simplify everything by using rectangles everywhere.
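For reference, "the extents of the region" is just the bounding box of its rectangles. A small sketch with a hypothetical IntRect type and function name, not the real API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical integer rectangle; stands in for whatever the API uses. */
typedef struct { int x, y, width, height; } IntRect;

/* Compute the bounding box of a non-empty list of rectangles. This is
 * all the information the old region-based code ever consumed. */
static IntRect
rects_get_extents (const IntRect *rects, size_t n_rects)
{
  IntRect result;
  int x1, y1, x2, y2;
  size_t i;

  assert (n_rects > 0);

  x1 = rects[0].x;
  y1 = rects[0].y;
  x2 = rects[0].x + rects[0].width;
  y2 = rects[0].y + rects[0].height;

  for (i = 1; i < n_rects; i++)
    {
      if (rects[i].x < x1) x1 = rects[i].x;
      if (rects[i].y < y1) y1 = rects[i].y;
      if (rects[i].x + rects[i].width > x2) x2 = rects[i].x + rects[i].width;
      if (rects[i].y + rects[i].height > y2) y2 = rects[i].y + rects[i].height;
    }

  result.x = x1;
  result.y = y1;
  result.width = x2 - x1;
  result.height = y2 - y1;

  return result;
}
```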
These days, we can query the context with gsk_vulkan_render_get_context().
Makes quite a few functions require one less argument.
And it also makes the GskVulkanRenderPass empty. Gotta figure out what
to do with it.
For small regions, the optimization doesn't matter that much, so we
don't need to do lots of work on the CPU.
In particular, this should catch icons and their backgrounds (32x32),
but I was generous in selecting the number.
Gets my discrete AMD on widget-factory back to the 1900fps it had before
this optimization while making the driver clock the GPU's shader at
1.7GHz instead of the 2.1GHz it used before.
Using clear avoids the shader engine (see last commit), so if we can get
pixels out of it, we should.
So we detect the overlap with the rounded corners of the clip region and
emit shaders for those, but then use Clear() for the rest.
With this in place, widget-factory on my integrated Intel TigerLake gets
a 60% performance boost.
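A rough sketch of the splitting idea, simplified to one radius for all four corners (an assumption; real rounded rects have per-corner sizes). IntRect and split_rounded_rect() are hypothetical names:

```c
#include <assert.h>

/* Hypothetical integer rectangle in device pixels. */
typedef struct { int x, y, width, height; } IntRect;

/* Split a rounded rectangle into four corner squares that still need a
 * shader and up to three plain rectangles that a clear can fill.
 * Simplification: one radius shared by all corners. */
static void
split_rounded_rect (const IntRect *bounds,
                    int            radius,
                    IntRect        clear[3],
                    IntRect        corners[4])
{
  int x = bounds->x, y = bounds->y;
  int w = bounds->width, h = bounds->height;
  int r = radius;

  assert (r >= 0 && 2 * r <= w && 2 * r <= h);

  /* top strip between the two top corners */
  clear[0] = (IntRect) { x + r, y, w - 2 * r, r };
  /* full-width band between the corner rows */
  clear[1] = (IntRect) { x, y + r, w, h - 2 * r };
  /* bottom strip between the two bottom corners */
  clear[2] = (IntRect) { x + r, y + h - r, w - 2 * r, r };

  corners[0] = (IntRect) { x, y, r, r };
  corners[1] = (IntRect) { x + w - r, y, r, r };
  corners[2] = (IntRect) { x, y + h - r, r, r };
  corners[3] = (IntRect) { x + w - r, y + h - r, r, r };
}
```

The seven rectangles tile the bounds exactly, so only the small corner squares go through the shader path.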
The op emits a vkCmdClearAttachments() with a given color. That can be
used with color nodes that are pixel-aligned and opaque to significantly
speed up rendering when the window background is a solid color.
However, currently this fails a bit outside of fullscreen when rounded
clip rectangles are in use to draw rounded corners.
This is a massive refactoring because it collects all the renderops
of all renderpasses into one long array in the Render object.
Lots of code in there is still flaky and needs cleanup. That will
follow in further commits.
Other than that it does work fine though.
Instead of recreating the same renderpass object in every frame and for
every offscreen, just reuse it.
Technically, we can save this per-renderer or even per-display (it
should really be cached by VkDevice), but we have no infrastructure for
that.
The function name gsk_vulkan_render_get_pipeline() had been used for
GskVulkanPipeline. Since those are gone now, we can use that name for
VkPipelines.
Renderpasses get recreated every frame, but we keep render objects
around. So if we keep the vertex buffer in the render object, we can
also keep it around and just reuse it.
Also, we only need one buffer for all the render passes, which is
another bonus.
The initial buffer size is chosen at 128kB. Maximized Nautilus,
gnome-text-editor with an open file and widget-factory take ~100kB when
doing a full redraw. Other apps are between 30-50kB usually.
So I chose a value that is not too big, but catches ~90% of cases.
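A sketch of the reuse pattern with plain malloc() standing in for the GPU buffer. The VertexBuffer struct, the function name and the doubling growth policy are assumptions for illustration; only the 128kB initial size comes from the text above:

```c
#include <assert.h>
#include <stdlib.h>

#define INITIAL_VERTEX_BUFFER_SIZE (128 * 1024)

/* Hypothetical stand-in for the buffer kept in the render object. */
typedef struct {
  unsigned char *data;      /* stands in for the GPU buffer */
  size_t size;              /* current capacity in bytes */
  unsigned n_allocations;   /* how often we had to (re)allocate */
} VertexBuffer;

/* Return storage for at least `needed` bytes. The existing buffer is
 * reused across frames whenever it is large enough. */
static unsigned char *
vertex_buffer_ensure (VertexBuffer *buf, size_t needed)
{
  if (buf->size < needed)
    {
      size_t new_size = buf->size ? buf->size : INITIAL_VERTEX_BUFFER_SIZE;

      /* assumption: grow by doubling until the request fits */
      while (new_size < needed)
        new_size *= 2;

      /* vertex data is rewritten every frame, so nothing is preserved */
      free (buf->data);
      buf->data = malloc (new_size);
      assert (buf->data != NULL);
      buf->size = new_size;
      buf->n_allocations++;
    }

  return buf->data;
}
```

With the sizes quoted above, a ~100kB full redraw and the usual 30-50kB frames all fit into the first allocation.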
Interning strings is slow, especially if we can instead do direct
pointer compares.
Also refactor the pipeline lookup code a bit to make use of the
refactored code.
Set it after creating all the ops and then use it for iterating.
Note that we cannot set it while creating the ops because the array may
be realloc()ed into a different memory region which would invalidate all
the pointers.
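The realloc() hazard can be sketched like this: ops are appended into one growing byte array, and the next pointers are only filled in once the array has stopped moving. Op, OpArray and both functions are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical op header; real ops would carry more data. */
typedef struct Op Op;
struct Op {
  size_t size;   /* bytes this op occupies in the array */
  Op *next;      /* only valid after op_array_link() */
  int value;
};

typedef struct {
  unsigned char *data;
  size_t len;
  size_t capacity;
} OpArray;

/* Append an op. realloc() may move the whole array, so no pointer
 * into it is stored at this stage. */
static void
op_array_append (OpArray *array, int value)
{
  Op *op;

  if (array->len + sizeof (Op) > array->capacity)
    {
      array->capacity = array->capacity ? array->capacity * 2 : sizeof (Op);
      array->data = realloc (array->data, array->capacity);
      assert (array->data != NULL);
    }

  op = (Op *) (array->data + array->len);
  op->size = sizeof (Op);
  op->next = NULL;
  op->value = value;
  array->len += op->size;
}

/* Call once all ops exist: the array no longer moves, so the next
 * pointers can be set safely. Returns the first op or NULL. */
static Op *
op_array_link (OpArray *array)
{
  Op *prev = NULL;
  size_t pos = 0;

  while (pos < array->len)
    {
      Op *op = (Op *) (array->data + pos);

      if (prev != NULL)
        prev->next = op;
      prev = op;
      pos += op->size;
    }

  return array->len > 0 ? (Op *) array->data : NULL;
}
```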
It currently has no use, but that will come later.
Also put the typedefs into headers in gsk/vulkan, they have nothing to do
outside that directory.
Remove the function to add a node from both the GskVulkanRender and the
GskVulkanRenderPass.
That means they are both now meant to draw exactly one node.
This is a rudimentary - but working - port.
Glyph uploads are still using the old machinery, a bunch of functions
still exist that probably aren't necessary anymore, and each glyph emits
its own node.
This will need to be improved in further commits.
This shader is an updated version of the mask shader, but I want to use
the mask name for the mask node and that's a different functionality.
Also, add an operation for it and partially implement the mask node
using it, so we can test that this shader works.
Replacing the shader used for text rendering is the next step.
The benefit here is that we can now properly cross-fade when one of
start/end is fully clipped out by just replacing it with an opacity op
for the other.
This was not possible with the old way we did things.
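A sketch of the decision, assuming the usual cross-fade definition (result = start * (1 - progress) + end * progress). The enum, struct and function names are made up, and the opacity values follow from that formula rather than from the actual code:

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
  PLAN_NOTHING,         /* both children clipped out */
  PLAN_OPACITY_START,   /* only start visible: fade it out */
  PLAN_OPACITY_END,     /* only end visible: fade it in */
  PLAN_CROSS_FADE       /* both visible: real cross-fade */
} PlanType;

typedef struct {
  PlanType type;
  float opacity;        /* only meaningful for the opacity plans */
} Plan;

/* Pick the cheapest op that produces the same pixels as the
 * cross-fade when one side contributes nothing. */
static Plan
plan_cross_fade (bool start_visible, bool end_visible, float progress)
{
  Plan plan = { PLAN_NOTHING, 0.0f };

  assert (progress >= 0.0f && progress <= 1.0f);

  if (start_visible && end_visible)
    plan.type = PLAN_CROSS_FADE;
  else if (start_visible)
    {
      plan.type = PLAN_OPACITY_START;
      plan.opacity = 1.0f - progress;
    }
  else if (end_visible)
    {
      plan.type = PLAN_OPACITY_END;
      plan.opacity = progress;
    }

  return plan;
}
```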
Instead of creating a pipeline GObject, just ask for the VkPipeline.
And instead of having the Op handle it, just let the renderpass look
up/create the relevant pipeline while creating commands so that it can
insert vkCmdBindPipeline() calls as needed.
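Roughly, the renderpass can track the currently bound pipeline while walking the ops and only emit a bind when it changes. In this sketch pipelines are plain ints and commands are recorded into an array instead of a VkCommandBuffer; all names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical op: only knows which pipeline it needs. */
typedef struct { int pipeline_id; } Op;

typedef enum { CMD_BIND_PIPELINE, CMD_DRAW } CommandType;
typedef struct { CommandType type; int pipeline_id; } Command;

/* Record commands for all ops, inserting a bind only when the
 * required pipeline differs from the one currently bound. In real
 * code the bind would be a vkCmdBindPipeline() call.
 * Returns the number of commands written. */
static size_t
record_commands (const Op *ops,
                 size_t    n_ops,
                 Command  *commands,
                 size_t    max_commands)
{
  int current = -1;  /* nothing bound yet */
  size_t n = 0;
  size_t i;

  for (i = 0; i < n_ops; i++)
    {
      assert (ops[i].pipeline_id >= 0);

      if (ops[i].pipeline_id != current)
        {
          assert (n < max_commands);
          commands[n].type = CMD_BIND_PIPELINE;
          commands[n].pipeline_id = ops[i].pipeline_id;
          n++;
          current = ops[i].pipeline_id;
        }

      assert (n < max_commands);
      commands[n].type = CMD_DRAW;
      commands[n].pipeline_id = current;
      n++;
    }

  return n;
}
```

Consecutive ops that share a pipeline then cost a single bind.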
This reverts most of commit f420c143e0 again because it turns out GPUs
like combined images and samplers.
But: The one thing we don't revert is allowing the C code to select any
combination of sampler and image:
gsk_vulkan_render_get_image_descriptor() now takes a 2nd argument
specifying the sampler.
This allows the same flexibility as before, we just combine things
early.
This change was inspired by
https://developer.nvidia.com/blog/vulkan-dos-donts/
Instead of creating the op manually, just pass in the renderpass and
have the op created from there.
This way ops aren't really initialized anymore; they are rather appended
to the queue, so instead of foo_op_init() we can just call the function
foo_op().