This node essentially implements the feColorMatrix SVG filter. I got the
idea yesterday after looking at the opacity implementation.
It can be used for opacity (not sure if we want to) and to implement a
bunch of the CSS filters.
...but disable them for now. Configs will be added for the projects to
support Vulkan-enabled builds which will then enable the builds of these
sources. Extra commands and items will be needed for the GSK resources
along with ensuring GSK_RENDERER_GSK being defined for the build of GDK,
GDK-Win32 and GSK so that the builds of Vulkan-enabled builds can be done
properly.
Filter out the Vulkan sources from the 'dist hook' rules in
gsk/Makefile.am as we don't want to in turn include them twice in the
projects when the 'make dist' is performed on a system with Vulkan
builds enabled.
One cannot use #if...#endif within macro calls in Visual Studio and
possibly other compilers, and there are more uses of VLAs that need to be
replaced with g_newa().
There were also checks for the clip type in gskvulkanrenderpass.c which
were possibly not done right (using the address of the type value to check
for a type value), which triggered errors as one is attempting to compare
a pointer type to an enum/int type.
https://bugzilla.gnome.org/show_bug.cgi?id=773299
Use g_newa() instead of VLAs, as VLAs may never be supported by some
compilers as it became optional in C11 and there are concerns about their
implementations in compilers that do support it.
https://bugzilla.gnome.org/show_bug.cgi?id=773299
Forces a full redraw every frame.
This is done generically, so it's supported on every renderer.
For widget-factory first page (with the spinner spinning and progressbar
pulsing), I get these numbers per frame:
action clipped full redraw
snapshot 0ms 7-10ms
cairo rendering 0ms 10-15ms
Vulkan rendering 3-5ms 18-20ms
Vulkan expected * 0ms 1-2ms
GL rendering unsupported 55-62ms
* expected means disabling rendering of unsupported render nodes,
instead of doing fallback drawing. So it overestimates the performance,
because borders and box-shadows are disabled.
It's faster to render once for every rectangle in the clip region than
rendering the outline of the clip region.
Especially because this reduces the time necessary to build up the frame
data.
In widget-factory (where we have 3 rectangles), this leads to a 5x
speedup in the rendering time rendering alone.
Snapshotting time goes from 10ms to ~1ms, which is another huge
improvement.
Note: We interpolate premultiplied colors as per the CSS spec. This i
different from Cairo, which interpolates unpremultiplied.
So in testcases with translucent gradients, it's actually Cairo that is
wrong.
This is now tracking the clips added by the clip nodes.
If any particular node can't deal with a clip, it falls back to Cairo
rendering. But if it can, it will render it directly.
... and implement it for the Cairo renderer.
It's an API that instructs a renderer to render to a texture.
So far this is mostly meant to be used for testing, but I could imagine
it being useful for rendering DND icons.
That code doesn't do anything.
And what the code should be doing (clearing the abckground) isn't
necessary as cairo drawing is guaranteed to clear the surface.
This does a conversion to/from GBytes and is intended for writing tests.
It's really crude but it works.
And that probably means Alex will (ab)use it for broadway.
I had originally thought I'd use GskShadow for box-shadow, but didn't in
the end.
So now it's only used for text-shadow and icon-shadow, and those don't
have a spread.
Instead of a separate allocation for any arrays in the render node
we allocate these as part of the render node itself, using C99
flexible arrays.
This leads to less allocations, which is nice, but the major reason
for this is that it allows us to change the allocation scheme further
in the future. For instance, we want to do stack-like allocation so
that all the render-nodes for an entire frame are allocated in one
(or a few) chunks.
Instead of constantly recalculating this (especially recursively for
parents!) we do it only on construction, because everything is
immutable anyway. Also, most nodes had a bounds already and can
use the new parent member instead.
We also do direct access to the node bounds rather than calling
gsk_render_node_get_bounds in various places, which means
we do less copying.
... and make the icon rendering code use it.
This requires moving even more shadow renering code into GSK, but so be
it. At least the "shadows not implemented" warning is now gone!
The node draws a solid CSS border, which can be used to cover everything
but dashed and dotted borders (double, groove, inset, ...).
For different border styles, we overlay multiple nodes and set their
colors to transparent for sides with non-matching styles.
This way we can pass the command pool around.
And that allows us to allocate and submitcustom buffers.
And that is necessary to make staging images work.
This code makes renderers fall back to Cairo rendering if they don't
know how to handle a render node's type.
This allows adding new render nodes with impunity.
Instead of appending a container node and adding the nodes to it as they
come in, we now collect the nodes until gtk_snapshot_pop() is called and
then hand them out in a container node.
The caller of gtk_snapshot_push() is then responsible for doing whatever
he wants with the created node.
Another addigion is the keep_coordinates flag to gtk_snapshot_push()
which allows callers to keep the current offset and clip region or
discard it. Discarding is useful when doing transforms, keeping it is
useful when inserting effect nodes (like the ones I'm about to add).
Instead of having a setter for the transform, have a GskTransformNode.
Most of the oprations that GTK does do not require a transform, so it
doesn't make sense to have it as a primary attribute.
Also, changing the transform requires updating the uniforms of the GL
renderer, so we're happy if we can avoid that.
gsk_render_node_get_bounds() still exists and is computed via vfunc
call:
- containers dynamically compute the bounds from their children
- surface and texture nodes get bounds passed on construction
In the brave new world of refactored render nodes, this function doesn't
really make any sense anymore. We could turn it into a vfunc, but I
don't think it's useful.
Especially because even in the brave old world, this function was
causing a vastl overallocation of nodes when the GL renderer needed render
targets.
If we ever feel, we need this function again, we can readd it later.
But nobody is using it other than for overriding opactiy. And you can
just override opacity directly if you care.
Creating render nodes is fire-and-forget, so all one should do is create
a container, append, append, append and then send it off to the
renderer. So there's no need to replace, insert between or anything
else.
- Recognize "gl" as well as "opengl" for the GL renderer
- GSK_RENDERER=help now works
- g_warning() for an unrecognized renderer (typo detection!)
- g_print() the actual renderer that is used (and error messages when
selecting) when a GSK_RENDERER is given, so you'll notice if your
renderer isn't taken.
By creating unlimited render objects, we would never wait on the GPU.
This would mean that if the GPU was the bottleneck, we would fill its
queue with render commands faster than it could process them.
And because the nvidia binary driver and my code work surprisingly well
and bugfree, this lead to exhaustion of RAM. I had 50GB of swap
configured and my hard disk was quicker as swap storage than my GPU was
at processing the commands, so stuff still filled up.
At that point my computer became rather unresponsive and I decided to
reboot it, so I that could write this patch.
Add SURFACE and TEXTURE operations. This way, we actually render more
than one node every frame because not everything is a fallback node
anymore that gets composited with its children into a cairo surface.
Instead of pushing the root matrix, push the world matrix for the
current node. That way, the bounds we emit as vertices are actually
properly transformed.
First, we collect all the info about descriptor sets into a hash table,
then we use its size to determine the amount of sets and allocate those
before we finally go ahead and use the hash table's contents to
initialize the descriptor sets.
And then we're ready to render.
We can let the GPU do its stuff without waiting. The GPU knows what it's
doing.
Which means we now get a lot of time to spend on doing CPU things (read:
we're way better in benchmarks).
The old behavior is safer, so we want to keep it around for debugging.
It can be reenabled with GSK_RENDERING_MODE=sync.
And move the actual rendering code there.
A RenderPass is a collection of operations on the same target that
get executed one after another. It roughly targets VkRenderPass or
rather the subpasses of a VkRenderPass.
For now, only the infrastructure is there. No real stuff is happening.
This is refactoring work.
GskVulkanRender is supposed to be the global object for a render
operation, ie GskVulkanRenderer.render() will create this object for
what it does.
The object will be split into stages that perform the operations
necessary to create a drawing.
Instead of using a staging iamge, we require the final image to be
linearly allocated and have host-visible memory.
This improves performance quite a bit.
The old code is still there and can be enabled with a simple change
to a #define in gskvulkanimage.h
We can now upload vertices.
And we use this to draw a yellow background. Which is clearly superior
to not drawing anything.
Also, we have shaders now. If you modify them, you need glslc installed
so they can be recompiled into Spir-V bytecode.
This is a way to query the damaged area of the backbuffer.
The GL renderer uses this to compute the extents of that damage region
(computed via buffer age) and use them to minimize the area to redraw.
This changes the semantics of GL rendering to "When calling
gdk_window_begin_frame() with a GL context, the area by
gdk_gl_context_get_damage() needs to be redrawn and every other pixel of
the backbuffer is guaranteed to be correct.
After gdk_window_end_frame() on a GL-drawn window, the whole backbuffer
must be correct.
We can always glXBufferSwap() now because of this.
... instead of a gl context.
This requires some refactoring in the way we mark the shared context as
drawing: We now call begin_frame/end_frame() on it and ignore the call
on the main context.
Unfortunately we need to do this check in all vfuncs, which sucks. But I
haven't found a better way.
Reenable GL drawing, but do it without Cairo.
Now, the context passed to gdk_window_begin_draw_frame() decides how
drawing is going to happen. If it is NULL, Cairo is used like before.
If a context is passed, Cairo may not be used for drawing and
gdk_drawing_context_get_cairo_context() is going to return NULL.
Instead, the GL renderer must draw to the GL backbuffer and
end_draw_frame() is then swapping that to the front.
The GskGLRenderer has lost the texture it used to render to and adapted
to render directly to the backbuffer instead.
The only thing missing is for GtkGLArea to gain back a performant way to
render. But it didn't have one since the introduction of GSK, this
patchset doesn't change anything about it.
The new rendering avoids two indirections (the GSK renderer's texture
and the GDK double buffering surface).
It improves icon count in the fishbowl demo by 30%.
This way, we don't spam criticals when GL is not available. Instead, we
print a useful debug message to stderr and continue with the Cairo renderer.
Signed-off-by: Emmanuele Bassi <ebassi@gnome.org>
and remove gsk_renderer_get_for_display().
This new function returns a realized renderer. Because of that, GSK can
catch failures to realize, destroy the renderer and try another one.
Or in short: I can finally use GTK on Weston with the nvidia binary
drivers again.
Signed-off-by: Emmanuele Bassi <ebassi@gnome.org>
Instead of having a gsk_renderer_set_window() call, pass the window to
realize(). This way, the realization can fail with the wrong window.
Signed-off-by: Emmanuele Bassi <ebassi@gnome.org>
This allows renderers (or anyone really) to attach "render data" to
textures. Only the first render data sticks.
You can gsk_texture_set_render_data() with the key you will use to
look the data up again, and if no data has been set yet, yours will be
set.
You can retrieve this data via gsk_texture_get_render_data() later on.
If your data has been cleared, NULL will be returned.
When gsk_texture_clear_render_data() is called (which the texture will
call when it is finalized), your destory notify will be called and you
have to release your render data.
The GL driver uses this to attach texture ids to GskTextures.
We do no longer bind textures to a renderer, instead they are a way for
applications to provide texture data.
For now, that's it. We've reverted to uploading it from scratch every
frame.
This happens in regular code paths for example when trying to render the
empty text string. We don't want to store a surface on the render
node in such a case (so actual rendering isn't slowed down), but we do
want to return a working cairo context that is not in an error state
(so the cairo rendering can continue without error messages).
Now that GTK+ is built as a single DLL, and the .lib that is built is
gtk-4.lib, we need to update the autotools sections in generating the
NMake Makefile snippets so that we can have the correct commands and flags
for building the .gir files, which will all now link to gtk-4-vsXX.dll (or
so).
Now that the autotools build folded the GDK/GSK bits into the main GTK+
DLL, there are some updates that need to be done for this. We need to:
-Fold the DllMain() of GDK-Win32 into the main GTK+ DllMain(), as we need
the HINSTANCE to register the window. We can't have two DllMain()'s in a
single DLL.
-Remove the GDK rc(.in) files, as that is not used anymore. Make the GTK+
.rc(.in) file load the gtk.ico GTK+ logo file instead so that we still
get the GTK+ logo for the application icon by default. Update the
autotools build files as well.
-Revert commit b9f9980 as LRN pointed out in comment 25 in bug 773299, as
GTK+ is now a monolithic DLL, and we ought not to export this private
function.
https://bugzilla.gnome.org/show_bug.cgi?id=773299
gtk/inspector/rendernodeview.c calls this private function from GSK, so we
need to ensure that this function is exported so that GTK+ can link
properly on compilers that do not support automatic exporting.
https://bugzilla.gnome.org/show_bug.cgi?id=773299
The GLSL versions are:
OpenGL 2.1: #version 110
OpenGL 3.0: #version 130
OpenGL 3.2: #version 150
OpenGLES 2.0: #version 100
OpenGLES 3.0: #version 300 es
So we need to check the version of the GdkGLContext if we want use the
appropriate version, especially for legacy OpenGL contexts, which can be
both 3.x and 2.x.
We want to have the coordinate system of the created cairo surface to be
identical to the coordinate system of the node's bounds. For that, we
need to translate the cairo surface by the bounds' origin.
We need an overridable entry point for GskRenderer to create Cairo
surfaces.
Implementations of GskRenderer can override create_cairo_surface() to
create efficient surfaces, possibly with zero copies involved, depending
on the GDK backend.
This merged gtk, gdk and gsk into one library, making it possible to
have internal private APIs between gtk them, as well as producing more
efficient code.
https://bugzilla.gnome.org/show_bug.cgi?id=773100
This adds the initial MSVC build items needed to build GSK under Visual Studio,
this is part of it that is required, we need to add items to the property sheets
to generate the code that is generated via glib-mkenums and glib-compile-resources.
This set includes, with the autotools scripts for the complete:
-GSK project files, which is integrated into the gtk+-4.sln.
-The NMake snippets to build the introspection files for GSK.
-The .bat files to call glib-mkenums to generate the enumeration sources.
While porting GTK to GskRenderer we noticed that the current fallback
code for widgets using Cairo to draw is not enough to cover all the
possible cases.
For instance, if a container widget still uses GtkWidget::draw to render
its children, and at least one of them has been ported to using render
nodes instead, the container won't know how to draw it.
For this reason we want to provide to layers above GSK the ability to
create a "fallback" renderer instance, created using a "parent"
GskRenderer instance, but using a Cairo context as the rendering target
instead of a GdkDrawingContext.
GTK will use this inside the gtk_widget_draw() implementation, if a
widget implements GtkWidgetClass.get_render_node().
We're going to need to allow rendering on a specific cairo_t in order to
implement fallback code paths inside GTK; this means that there will be
times when we have a transient GskRenderer instance that does not have a
GdkDrawingContext to draw on.
Instead of adding a new render() implementation for those cases and then
decide which one to use, we can remove the drawing context argument from
the virtual function itself, and allow using a NULL GdkDrawingContext
when calling gsk_renderer_render(). A later commit will add a generic
function to create a transient GskRenderer with a cairo_t attached to
it.
Renderers inside GSK will have to check whether we have access to a
GdkDrawingContext, in which case we're going to use it; or if we have
access to a cairo_t and a window.
GskRenderNode is, at its core, a write-only API; you're supposed to set
up the render nodes instead of querying them for state.
Querying render nodes is left to the GskRenderer implementation.
We store the vertices in (unscaled) window coords (but the item size
is still scaled to match the texture size). Also, the
projection/model-view multiplication order is switched so that the scale
is applied at the right place.
The renderer will always use nearest-neighbor filters because it renders
at 1:1 pixel to texel ratio.
On the other hand, render nodes may be scaled, so we need to offer a way
to control the minification and magnification filters.
If we already have a GL texture we definitely don't want to use
gdk_cairo_draw_from_gl() to draw on a Cairo context if we're going
to take the Cairo surface to which we draw and put it into an OpenGL
texture.
The details of the modelview and projection matrices are only useful for
the GL renderer; there's really no point in having those details
available in the generic API — especially as the Cairo fallback renderer
cannot really set up a complex modelview or a projection matrix.
Just like we reuse texture ids with the same size we can, at the expense
of a little memory, reuse vertex buffers if they reference the same
attributes and contain the same data.
Each VAO is marked as free at the end of the frame, and if it's not
reused in the following frame, it gets dropped.
The child-transform is useful only if we also provide clipping to the
parent nodes, otherwise children will just be drawn outside of the
parent's bounds.
We'll introduce child transforms either at a higher layer, or once we
add clipping support to GskRenderNode.
I don't think this should stay in the code long-term, but it
is useful for debugging. It helped me track down some suspicious
placements of render nodes.
Instead of passing the size of the buffer, we should pass the number of
quads; we know what the size of a single quad structure is, so we can do
the multiplication internally when creating the VAO.
This allows us to print the quads for debugging purposes.
The naming is consistent with other scene graph libraries, as it
represents an additional translation transformation applied on top of
the provided transformation matrices.
We can also simplify the implementation by applying the translation when
we compute the world matrix.
We keep the textures used inside a frame around until the end of the
following frame; whenever we need a texture with the same size, and
it's not marked in use, then we just reuse the existing texture.
This was overwhelming other useful debug output, so make it
opt-in. We print the render items for both opengl and transforms,
since the matrices bleed into each other, otherwise.
Since we use an FBO to render the contents of the render node tree, the
coordinate space is going to be flipped in GL. We can undo the flip by
using an appropriate projection matrix, instead of changing the sampling
coordinates in the shaders and updating all our coordinates at render
time.
We need to apply a scaling factor whenever we deal with user-supplied
coordinates, like:
- when creating textures
- when setting up the viewport
- when submitting the scene
Render nodes need access to rendering information like scaling factors.
If we keep render nodes separate from renderers until we submit a nodes
tree for rendering we're going to have to duplicate all that information
in a way that makes the API more complicated and fuzzier on its
semantics.
By having GskRenderer create GskRenderNode instances we can tie nodes
and renderers together; since higher layers will also have access to
the renderer instance, this does not add any burden to callers.
Additionally, if memory measurements indicate that we are spending too
much time in the allocation of new render nodes, we can now easily
implement a free-list or a renderer-specific allocator without breaking
the API.
If a node is non-opaque and has a non-zero opacity we need to paint its
contents and children first to an off screen buffer, and then render the
resulting texture at the desired opacity — otherwise the opacities will
combine and result in the wrong rendering.
We're not going to use separate rendering lists soon, and the way we
render items is less similar to a gaming engine and more similar to a
simpler compositor. This means we don't need to perform a two pass
rendering — opaque items first, transparent items later.
Use appropriate names, and annotate the names with the types — 'u' for
uniforms, 'a' for attributes. The common preambles for shaders are split
from the bodies, so we need some way to distinguish the uniforms and the
attributes just from their name.
We want the GL driver to cache as many resources as possible, so we can
always ensure that we're in a consistent state, and we can handle state
transitions more appropriately.
Drop the texture parameters handling from the texture creation, and
associate them with the contents upload. Also, rename the function to
something more in line with what it does.
We want to add the list of FBOs tied to a texture; this means we cannot
trivally copy the Texture structure when adding it to a GArray. We're
also going to have more textures than VAOs, so it makes more sense to
use a O(1) access data structure for them.
We can use the GL_ARB_timer_query extension (available since OpenGL
3.2, and part of the OpenGL specification since version 3.3) to query
the time elapsed when drawing each frame. This allows us to gather
timing information on our use of the GPU.
For the root node we do not need to use blending, as it does not have
any backdrop to blend into. We can use a simpler 'blit' program that
only takes the content of the source and fills the texture quad with
it.
We should use ShaderBuilder to create and store programs for the GL
renderer. This allows us to simplify the creation of programs (by moving
the compilation phase into the ShaderBuilder::create_program() method),
and move towards the ability to create multiple programs and just keep a
reference to the program id.
We should keep the ShaderBuilder around and use it to query the various
uniform and attribute locations when needed, instead of storing those
offsets into the Renderer instance, and copying them. This allows a bit
more flexibility, once we have more than one program built into the
renderer.
The GL renderer should build the GLSL shaders using GskShaderBuilder.
This allows us to separate the common parts into separate files, and
assemble them as necessary, instead of shipping one big shader per type
of GL API (GL3, GL legacy, and GLES).