This node essentially implements the feColorMatrix SVG filter. I got the
idea yesterday after looking at the opacity implementation.
It can be used for opacity (not sure if we want to) and to implement a
bunch of the CSS filters.
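To illustrate: an opacity of 0.5 can be expressed as a color matrix
that just scales the alpha row. The sketch below uses the constructor
signature GSK has today (it may not match this commit exactly) and
assumes the matrix is applied to unpremultiplied colors; child is an
existing render node.

  graphene_matrix_t matrix;
  graphene_vec4_t offset;
  GskRenderNode *node;

  /* identity on RGB, scale alpha by 0.5 */
  graphene_matrix_init_from_float (&matrix,
                                   (float[16]) { 1, 0, 0, 0,
                                                 0, 1, 0, 0,
                                                 0, 0, 1, 0,
                                                 0, 0, 0, 0.5 });
  graphene_vec4_init (&offset, 0, 0, 0, 0);

  node = gsk_color_matrix_node_new (child, &matrix, &offset);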
...but disable them for now. Configs will be added for the projects to
support Vulkan-enabled builds, which will then enable the builds of
these sources. Extra commands and items will be needed for the GSK
resources, along with ensuring that GSK_RENDERER_GSK is defined for the
build of GDK, GDK-Win32 and GSK, so that Vulkan-enabled builds can be
done properly.
Filter out the Vulkan sources from the 'dist hook' rules in
gsk/Makefile.am, as we don't want them to end up in the projects twice
when 'make dist' is run on a system with Vulkan builds enabled.
One cannot use #if...#endif within macro calls in Visual Studio and
possibly other compilers, and there are more uses of VLAs that need to be
replaced with g_newa().
There were also checks for the clip type in gskvulkanrenderpass.c that
were possibly not done right (taking the address of the type value when
checking for a type value), which triggered errors because a pointer
was being compared to an enum/int value.
https://bugzilla.gnome.org/show_bug.cgi?id=773299
Use g_newa() instead of VLAs, as VLAs may never be supported by some
compilers now that they are optional in C11, and there are concerns
about their implementations even in compilers that do support them.
https://bugzilla.gnome.org/show_bug.cgi?id=773299
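A minimal sketch of the replacement pattern (the function and variable
names are made up for illustration):

  #include <glib.h>

  static void
  example_emit_vertices (guint n_vertices)
  {
    /* was: float coords[n_vertices * 2];  -- a VLA, optional since C11
     *      and unsupported by MSVC */
    float *coords = g_newa (float, n_vertices * 2);

    for (guint i = 0; i < 2 * n_vertices; i++)
      coords[i] = 0.0f;

    /* ... fill and use coords; g_newa() allocates on the stack via
     * alloca(), so nothing needs to be freed */
  }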
Forces a full redraw every frame.
This is done generically, so it's supported on every renderer.
For widget-factory first page (with the spinner spinning and progressbar
pulsing), I get these numbers per frame:
action              clipped       full redraw
snapshot            0ms           7-10ms
cairo rendering     0ms           10-15ms
Vulkan rendering    3-5ms         18-20ms
Vulkan expected *   0ms           1-2ms
GL rendering        unsupported   55-62ms
* expected means disabling rendering of unsupported render nodes,
instead of doing fallback drawing. So it overestimates the performance,
because borders and box-shadows are disabled.
It's faster to render once for every rectangle in the clip region than
rendering the outline of the clip region.
Especially because this reduces the time necessary to build up the frame
data.
In widget-factory (where we have 3 rectangles), this leads to a 5x
speedup in rendering time alone.
Snapshotting time goes from 10ms to ~1ms, which is another huge
improvement.
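Roughly, the idea looks like this (sketch only; render_rect() is a
hypothetical helper, not the real renderer code):

  int n = cairo_region_num_rectangles (clip);

  for (int i = 0; i < n; i++)
    {
      cairo_rectangle_int_t rect;

      cairo_region_get_rectangle (clip, i, &rect);
      render_rect (renderer, root, &rect);   /* hypothetical helper */
    }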
Note: We interpolate premultiplied colors as per the CSS spec. This is
different from Cairo, which interpolates unpremultiplied colors.
So in testcases with translucent gradients, it's actually Cairo that is
wrong.
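For example, halfway between opaque white (1,1,1, alpha 1) and
transparent black (0,0,0, alpha 0), premultiplied interpolation gives
(0.5, 0.5, 0.5) at alpha 0.5, which unpremultiplies back to white at
50% opacity; unpremultiplied interpolation gives mid-gray at 50%
opacity, which is why Cairo's gradients darken towards the transparent
stop.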
This is now tracking the clips added by the clip nodes.
If any particular node can't deal with a clip, it falls back to Cairo
rendering. But if it can, it will render it directly.
... and implement it for the Cairo renderer.
It's an API that instructs a renderer to render to a texture.
So far this is mostly meant to be used for testing, but I could imagine
it being useful for rendering DND icons.
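Usage is along these lines (the signature shown is the one GSK has
today and may not match this commit exactly):

  graphene_rect_t viewport;
  GdkTexture *texture;

  graphene_rect_init (&viewport, 0, 0, width, height);
  texture = gsk_renderer_render_texture (renderer, node, &viewport);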
That code doesn't do anything.
And what the code should be doing (clearing the background) isn't
necessary, as cairo drawing is guaranteed to clear the surface.
This does a conversion to/from GBytes and is intended for writing tests.
It's really crude but it works.
And that probably means Alex will (ab)use it for broadway.
I had originally thought I'd use GskShadow for box-shadow, but didn't in
the end.
So now it's only used for text-shadow and icon-shadow, and those don't
have a spread.
Instead of a separate allocation for any arrays in the render node
we allocate these as part of the render node itself, using C99
flexible arrays.
This leads to fewer allocations, which is nice, but the major reason
for this is that it allows us to change the allocation scheme further
in the future. For instance, we want to do stack-like allocation so
that all the render-nodes for an entire frame are allocated in one
(or a few) chunks.
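A sketch of the idea with made-up struct names (the real GskRenderNode
structs differ):

  #include <glib.h>

  typedef struct _ExampleNode ExampleNode;

  struct _ExampleNode {
    int ref_count;
    /* ... common node fields ... */
  };

  typedef struct {
    ExampleNode parent;
    gsize n_children;
    ExampleNode *children[];   /* C99 flexible array member */
  } ExampleContainerNode;

  static ExampleContainerNode *
  example_container_node_alloc (gsize n_children)
  {
    /* one allocation covers both the node and its child array */
    return g_malloc0 (sizeof (ExampleContainerNode)
                      + n_children * sizeof (ExampleNode *));
  }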
Instead of constantly recalculating this (especially recursively for
parents!) we do it only on construction, because everything is
immutable anyway. Also, most nodes had a bounds already and can
use the new parent member instead.
We also do direct access to the node bounds rather than calling
gsk_render_node_get_bounds in various places, which means
we do less copying.
... and make the icon rendering code use it.
This requires moving even more shadow rendering code into GSK, but so
be it. At least the "shadows not implemented" warning is now gone!
The node draws a solid CSS border, which can be used to cover everything
but dashed and dotted borders (double, groove, inset, ...).
For different border styles, we overlay multiple nodes and set their
colors to transparent for sides with non-matching styles.
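For example, a border whose top, right and left are solid but whose
bottom is dashed could emit one solid border node like this, with the
dashed bottom drawn by a separate node on top (sketch only; the
constructor signature is the current one, the side order is
top/right/bottom/left, and outline/top/right/left are assumed to be an
existing GskRoundedRect and GdkRGBA values):

  GdkRGBA transparent = { 0, 0, 0, 0 };
  GdkRGBA colors[4] = { top, right, transparent, left };
  float widths[4] = { 1, 1, 1, 1 };
  GskRenderNode *solid;

  solid = gsk_border_node_new (&outline, widths, colors);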
This way we can pass the command pool around.
And that allows us to allocate and submit custom buffers.
And that is necessary to make staging images work.
This code makes renderers fall back to Cairo rendering if they don't
know how to handle a render node's type.
This allows adding new render nodes with impunity.
Instead of appending a container node and adding the nodes to it as they
come in, we now collect the nodes until gtk_snapshot_pop() is called and
then hand them out in a container node.
The caller of gtk_snapshot_push() is then responsible for doing
whatever they want with the created node.
Another addition is the keep_coordinates flag to gtk_snapshot_push(),
which allows callers to keep the current offset and clip region or
discard them. Discarding is useful when doing transforms, keeping them
is useful when inserting effect nodes (like the ones I'm about to add).
Instead of having a setter for the transform, have a GskTransformNode.
Most of the operations that GTK does do not require a transform, so it
doesn't make sense to have it as a primary attribute.
Also, changing the transform requires updating the uniforms of the GL
renderer, so we're happy if we can avoid that.
gsk_render_node_get_bounds() still exists and is computed via vfunc
call:
- containers dynamically compute the bounds from their children
- surface and texture nodes get bounds passed on construction
In the brave new world of refactored render nodes, this function doesn't
really make any sense anymore. We could turn it into a vfunc, but I
don't think it's useful.
Especially because even in the brave old world, this function was
causing a vast overallocation of nodes when the GL renderer needed
render targets.
If we ever feel we need this function again, we can re-add it later.
But nobody is using it other than for overriding opacity. And you can
just override opacity directly if you care.
Creating render nodes is fire-and-forget, so all one should do is create
a container, append, append, append and then send it off to the
renderer. So there's no need to replace, insert between or anything
else.
- Recognize "gl" as well as "opengl" for the GL renderer
- GSK_RENDERER=help now works
- g_warning() for an unrecognized renderer (typo detection!)
- g_print() the actual renderer that is used (and error messages when
selecting) when a GSK_RENDERER is given, so you'll notice if your
renderer isn't taken.
By creating unlimited render objects, we would never wait on the GPU.
This would mean that if the GPU was the bottleneck, we would fill its
queue with render commands faster than it could process them.
And because the nvidia binary driver and my code work surprisingly well
and bug-free, this led to exhaustion of RAM. I had 50GB of swap
configured and my hard disk was quicker as swap storage than my GPU was
at processing the commands, so stuff still filled up.
At that point my computer became rather unresponsive and I decided to
reboot it, so that I could write this patch.
Add SURFACE and TEXTURE operations. This way, we actually render more
than one node every frame because not everything is a fallback node
anymore that gets composited with its children into a cairo surface.
Instead of pushing the root matrix, push the world matrix for the
current node. That way, the bounds we emit as vertices are actually
properly transformed.
First, we collect all the info about descriptor sets into a hash table,
then we use its size to determine the amount of sets and allocate those
before we finally go ahead and use the hash table's contents to
initialize the descriptor sets.
And then we're ready to render.
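In Vulkan terms, the allocation step boils down to something like this
(simplified sketch; error handling and layout setup are omitted and the
variable names are illustrative):

  guint n_sets = g_hash_table_size (descriptor_infos);
  VkDescriptorSet *sets = g_newa (VkDescriptorSet, n_sets);

  VkDescriptorSetAllocateInfo alloc_info = {
    .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO,
    .descriptorPool = pool,
    .descriptorSetCount = n_sets,
    .pSetLayouts = layouts       /* one layout per set */
  };

  vkAllocateDescriptorSets (device, &alloc_info, sets);

  /* then walk the hash table again and vkUpdateDescriptorSets()
   * each set */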
We can let the GPU do its stuff without waiting. The GPU knows what it's
doing.
Which means we now get a lot of time to spend on doing CPU things (read:
we're way better in benchmarks).
The old behavior is safer, so we want to keep it around for debugging.
It can be reenabled with GSK_RENDERING_MODE=sync.
And move the actual rendering code there.
A RenderPass is a collection of operations on the same target that
get executed one after another. It roughly targets VkRenderPass or
rather the subpasses of a VkRenderPass.
For now, only the infrastructure is there. No real stuff is happening.
This is refactoring work.
GskVulkanRender is supposed to be the global object for a render
operation, ie GskVulkanRenderer.render() will create this object for
what it does.
The object will be split into stages that perform the operations
necessary to create a drawing.
Instead of using a staging image, we require the final image to be
linearly allocated and have host-visible memory.
This improves performance quite a bit.
The old code is still there and can be enabled with a simple change
to a #define in gskvulkanimage.h
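The direct path looks roughly like this (heavily simplified; memory
type selection, layout transitions and error handling are omitted and
the variable names are illustrative):

  VkImageCreateInfo info = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
    .imageType = VK_IMAGE_TYPE_2D,
    .tiling = VK_IMAGE_TILING_LINEAR,
    .initialLayout = VK_IMAGE_LAYOUT_PREINITIALIZED,
    /* ... format, extent, usage, ... */
  };

  vkCreateImage (device, &info, NULL, &image);
  /* allocate memory with VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
   * and bind it, then write pixels directly: */
  vkMapMemory (device, memory, 0, size, 0, &data);
  memcpy (data, pixels, size);
  vkUnmapMemory (device, memory);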
We can now upload vertices.
And we use this to draw a yellow background. Which is clearly superior
to not drawing anything.
Also, we have shaders now. If you modify them, you need glslc installed
so they can be recompiled into SPIR-V bytecode.
This is a way to query the damaged area of the backbuffer.
The GL renderer uses this to compute the extents of that damage region
(computed via buffer age) and use them to minimize the area to redraw.
This changes the semantics of GL rendering to "When calling
gdk_window_begin_frame() with a GL context, the area returned by
gdk_gl_context_get_damage() needs to be redrawn and every other pixel
of the backbuffer is guaranteed to be correct. After
gdk_window_end_frame() on a GL-drawn window, the whole backbuffer must
be correct."
We can always glXSwapBuffers() now because of this.
... instead of a gl context.
This requires some refactoring in the way we mark the shared context as
drawing: We now call begin_frame/end_frame() on it and ignore the call
on the main context.
Unfortunately we need to do this check in all vfuncs, which sucks. But I
haven't found a better way.
Reenable GL drawing, but do it without Cairo.
Now, the context passed to gdk_window_begin_draw_frame() decides how
drawing is going to happen. If it is NULL, Cairo is used like before.
If a context is passed, Cairo may not be used for drawing and
gdk_drawing_context_get_cairo_context() is going to return NULL.
Instead, the GL renderer must draw to the GL backbuffer and
end_draw_frame() is then swapping that to the front.
The GskGLRenderer has lost the texture it used to render to and has
been adapted to render directly to the backbuffer instead.
The only thing missing is for GtkGLArea to gain back a performant way
to render. But it hasn't had one since the introduction of GSK, so this
patchset doesn't change anything about that.
The new rendering avoids two indirections (the GSK renderer's texture
and the GDK double buffering surface).
It improves icon count in the fishbowl demo by 30%.