By creating unlimited render objects, we would never wait on the GPU.
This would mean that if the GPU was the bottleneck, we would fill its
queue with render commands faster than it could process them.
And because the nvidia binary driver and my code work surprisingly well
and bugfree, this lead to exhaustion of RAM. I had 50GB of swap
configured and my hard disk was quicker as swap storage than my GPU was
at processing the commands, so stuff still filled up.
At that point my computer became rather unresponsive and I decided to
reboot it, so I that could write this patch.
Add SURFACE and TEXTURE operations. This way, we actually render more
than one node every frame because not everything is a fallback node
anymore that gets composited with its children into a cairo surface.
Instead of pushing the root matrix, push the world matrix for the
current node. That way, the bounds we emit as vertices are actually
properly transformed.
First, we collect all the info about descriptor sets into a hash table,
then we use its size to determine the amount of sets and allocate those
before we finally go ahead and use the hash table's contents to
initialize the descriptor sets.
And then we're ready to render.
We can let the GPU do its stuff without waiting. The GPU knows what it's
doing.
Which means we now get a lot of time to spend on doing CPU things (read:
we're way better in benchmarks).
The old behavior is safer, so we want to keep it around for debugging.
It can be reenabled with GSK_RENDERING_MODE=sync.
And move the actual rendering code there.
A RenderPass is a collection of operations on the same target that
get executed one after another. It roughly targets VkRenderPass or
rather the subpasses of a VkRenderPass.
For now, only the infrastructure is there. No real stuff is happening.
This is refactoring work.
GskVulkanRender is supposed to be the global object for a render
operation, ie GskVulkanRenderer.render() will create this object for
what it does.
The object will be split into stages that perform the operations
necessary to create a drawing.
Instead of using a staging iamge, we require the final image to be
linearly allocated and have host-visible memory.
This improves performance quite a bit.
The old code is still there and can be enabled with a simple change
to a #define in gskvulkanimage.h
We can now upload vertices.
And we use this to draw a yellow background. Which is clearly superior
to not drawing anything.
Also, we have shaders now. If you modify them, you need glslc installed
so they can be recompiled into Spir-V bytecode.
This is a way to query the damaged area of the backbuffer.
The GL renderer uses this to compute the extents of that damage region
(computed via buffer age) and use them to minimize the area to redraw.
This changes the semantics of GL rendering to "When calling
gdk_window_begin_frame() with a GL context, the area by
gdk_gl_context_get_damage() needs to be redrawn and every other pixel of
the backbuffer is guaranteed to be correct.
After gdk_window_end_frame() on a GL-drawn window, the whole backbuffer
must be correct.
We can always glXBufferSwap() now because of this.
... instead of a gl context.
This requires some refactoring in the way we mark the shared context as
drawing: We now call begin_frame/end_frame() on it and ignore the call
on the main context.
Unfortunately we need to do this check in all vfuncs, which sucks. But I
haven't found a better way.
Reenable GL drawing, but do it without Cairo.
Now, the context passed to gdk_window_begin_draw_frame() decides how
drawing is going to happen. If it is NULL, Cairo is used like before.
If a context is passed, Cairo may not be used for drawing and
gdk_drawing_context_get_cairo_context() is going to return NULL.
Instead, the GL renderer must draw to the GL backbuffer and
end_draw_frame() is then swapping that to the front.
The GskGLRenderer has lost the texture it used to render to and adapted
to render directly to the backbuffer instead.
The only thing missing is for GtkGLArea to gain back a performant way to
render. But it didn't have one since the introduction of GSK, this
patchset doesn't change anything about it.
The new rendering avoids two indirections (the GSK renderer's texture
and the GDK double buffering surface).
It improves icon count in the fishbowl demo by 30%.
This way, we don't spam criticals when GL is not available. Instead, we
print a useful debug message to stderr and continue with the Cairo renderer.
Signed-off-by: Emmanuele Bassi <ebassi@gnome.org>
and remove gsk_renderer_get_for_display().
This new function returns a realized renderer. Because of that, GSK can
catch failures to realize, destroy the renderer and try another one.
Or in short: I can finally use GTK on Weston with the nvidia binary
drivers again.
Signed-off-by: Emmanuele Bassi <ebassi@gnome.org>
Instead of having a gsk_renderer_set_window() call, pass the window to
realize(). This way, the realization can fail with the wrong window.
Signed-off-by: Emmanuele Bassi <ebassi@gnome.org>
This allows renderers (or anyone really) to attach "render data" to
textures. Only the first render data sticks.
You can gsk_texture_set_render_data() with the key you will use to
look the data up again, and if no data has been set yet, yours will be
set.
You can retrieve this data via gsk_texture_get_render_data() later on.
If your data has been cleared, NULL will be returned.
When gsk_texture_clear_render_data() is called (which the texture will
call when it is finalized), your destory notify will be called and you
have to release your render data.
The GL driver uses this to attach texture ids to GskTextures.
We do no longer bind textures to a renderer, instead they are a way for
applications to provide texture data.
For now, that's it. We've reverted to uploading it from scratch every
frame.
This happens in regular code paths for example when trying to render the
empty text string. We don't want to store a surface on the render
node in such a case (so actual rendering isn't slowed down), but we do
want to return a working cairo context that is not in an error state
(so the cairo rendering can continue without error messages).
Now that GTK+ is built as a single DLL, and the .lib that is built is
gtk-4.lib, we need to update the autotools sections in generating the
NMake Makefile snippets so that we can have the correct commands and flags
for building the .gir files, which will all now link to gtk-4-vsXX.dll (or
so).
Now that the autotools build folded the GDK/GSK bits into the main GTK+
DLL, there are some updates that need to be done for this. We need to:
-Fold the DllMain() of GDK-Win32 into the main GTK+ DllMain(), as we need
the HINSTANCE to register the window. We can't have two DllMain()'s in a
single DLL.
-Remove the GDK rc(.in) files, as that is not used anymore. Make the GTK+
.rc(.in) file load the gtk.ico GTK+ logo file instead so that we still
get the GTK+ logo for the application icon by default. Update the
autotools build files as well.
-Revert commit b9f9980 as LRN pointed out in comment 25 in bug 773299, as
GTK+ is now a monolithic DLL, and we ought not to export this private
function.
https://bugzilla.gnome.org/show_bug.cgi?id=773299
gtk/inspector/rendernodeview.c calls this private function from GSK, so we
need to ensure that this function is exported so that GTK+ can link
properly on compilers that do not support automatic exporting.
https://bugzilla.gnome.org/show_bug.cgi?id=773299
The GLSL versions are:
OpenGL 2.1: #version 110
OpenGL 3.0: #version 130
OpenGL 3.2: #version 150
OpenGLES 2.0: #version 100
OpenGLES 3.0: #version 300 es
So we need to check the version of the GdkGLContext if we want use the
appropriate version, especially for legacy OpenGL contexts, which can be
both 3.x and 2.x.
We want to have the coordinate system of the created cairo surface to be
identical to the coordinate system of the node's bounds. For that, we
need to translate the cairo surface by the bounds' origin.
We need an overridable entry point for GskRenderer to create Cairo
surfaces.
Implementations of GskRenderer can override create_cairo_surface() to
create efficient surfaces, possibly with zero copies involved, depending
on the GDK backend.
This merged gtk, gdk and gsk into one library, making it possible to
have internal private APIs between gtk them, as well as producing more
efficient code.
https://bugzilla.gnome.org/show_bug.cgi?id=773100
This adds the initial MSVC build items needed to build GSK under Visual Studio,
this is part of it that is required, we need to add items to the property sheets
to generate the code that is generated via glib-mkenums and glib-compile-resources.
This set includes, with the autotools scripts for the complete:
-GSK project files, which is integrated into the gtk+-4.sln.
-The NMake snippets to build the introspection files for GSK.
-The .bat files to call glib-mkenums to generate the enumeration sources.
While porting GTK to GskRenderer we noticed that the current fallback
code for widgets using Cairo to draw is not enough to cover all the
possible cases.
For instance, if a container widget still uses GtkWidget::draw to render
its children, and at least one of them has been ported to using render
nodes instead, the container won't know how to draw it.
For this reason we want to provide to layers above GSK the ability to
create a "fallback" renderer instance, created using a "parent"
GskRenderer instance, but using a Cairo context as the rendering target
instead of a GdkDrawingContext.
GTK will use this inside the gtk_widget_draw() implementation, if a
widget implements GtkWidgetClass.get_render_node().
We're going to need to allow rendering on a specific cairo_t in order to
implement fallback code paths inside GTK; this means that there will be
times when we have a transient GskRenderer instance that does not have a
GdkDrawingContext to draw on.
Instead of adding a new render() implementation for those cases and then
decide which one to use, we can remove the drawing context argument from
the virtual function itself, and allow using a NULL GdkDrawingContext
when calling gsk_renderer_render(). A later commit will add a generic
function to create a transient GskRenderer with a cairo_t attached to
it.
Renderers inside GSK will have to check whether we have access to a
GdkDrawingContext, in which case we're going to use it; or if we have
access to a cairo_t and a window.
GskRenderNode is, at its core, a write-only API; you're supposed to set
up the render nodes instead of querying them for state.
Querying render nodes is left to the GskRenderer implementation.
We store the vertices in (unscaled) window coords (but the item size
is still scaled to match the texture size). Also, the
projection/model-view multiplication order is switched so that the scale
is applied at the right place.
The renderer will always use nearest-neighbor filters because it renders
at 1:1 pixel to texel ratio.
On the other hand, render nodes may be scaled, so we need to offer a way
to control the minification and magnification filters.
If we already have a GL texture we definitely don't want to use
gdk_cairo_draw_from_gl() to draw on a Cairo context if we're going
to take the Cairo surface to which we draw and put it into an OpenGL
texture.
The details of the modelview and projection matrices are only useful for
the GL renderer; there's really no point in having those details
available in the generic API — especially as the Cairo fallback renderer
cannot really set up a complex modelview or a projection matrix.
Just like we reuse texture ids with the same size we can, at the expense
of a little memory, reuse vertex buffers if they reference the same
attributes and contain the same data.
Each VAO is marked as free at the end of the frame, and if it's not
reused in the following frame, it gets dropped.
The child-transform is useful only if we also provide clipping to the
parent nodes, otherwise children will just be drawn outside of the
parent's bounds.
We'll introduce child transforms either at a higher layer, or once we
add clipping support to GskRenderNode.
I don't think this should stay in the code long-term, but it
is useful for debugging. It helped me track down some suspicious
placements of render nodes.
Instead of passing the size of the buffer, we should pass the number of
quads; we know what the size of a single quad structure is, so we can do
the multiplication internally when creating the VAO.
This allows us to print the quads for debugging purposes.
The naming is consistent with other scene graph libraries, as it
represents an additional translation transformation applied on top of
the provided transformation matrices.
We can also simplify the implementation by applying the translation when
we compute the world matrix.
We keep the textures used inside a frame around until the end of the
following frame; whenever we need a texture with the same size, and
it's not marked in use, then we just reuse the existing texture.
This was overwhelming other useful debug output, so make it
opt-in. We print the render items for both opengl and transforms,
since the matrices bleed into each other, otherwise.
Since we use an FBO to render the contents of the render node tree, the
coordinate space is going to be flipped in GL. We can undo the flip by
using an appropriate projection matrix, instead of changing the sampling
coordinates in the shaders and updating all our coordinates at render
time.
We need to apply a scaling factor whenever we deal with user-supplied
coordinates, like:
- when creating textures
- when setting up the viewport
- when submitting the scene
Render nodes need access to rendering information like scaling factors.
If we keep render nodes separate from renderers until we submit a nodes
tree for rendering we're going to have to duplicate all that information
in a way that makes the API more complicated and fuzzier on its
semantics.
By having GskRenderer create GskRenderNode instances we can tie nodes
and renderers together; since higher layers will also have access to
the renderer instance, this does not add any burden to callers.
Additionally, if memory measurements indicate that we are spending too
much time in the allocation of new render nodes, we can now easily
implement a free-list or a renderer-specific allocator without breaking
the API.
If a node is non-opaque and has a non-zero opacity we need to paint its
contents and children first to an off screen buffer, and then render the
resulting texture at the desired opacity — otherwise the opacities will
combine and result in the wrong rendering.
We're not going to use separate rendering lists soon, and the way we
render items is less similar to a gaming engine and more similar to a
simpler compositor. This means we don't need to perform a two pass
rendering — opaque items first, transparent items later.
Use appropriate names, and annotate the names with the types — 'u' for
uniforms, 'a' for attributes. The common preambles for shaders are split
from the bodies, so we need some way to distinguish the uniforms and the
attributes just from their name.
We want the GL driver to cache as many resources as possible, so we can
always ensure that we're in a consistent state, and we can handle state
transitions more appropriately.
Drop the texture parameters handling from the texture creation, and
associate them with the contents upload. Also, rename the function to
something more in line with what it does.
We want to add the list of FBOs tied to a texture; this means we cannot
trivally copy the Texture structure when adding it to a GArray. We're
also going to have more textures than VAOs, so it makes more sense to
use a O(1) access data structure for them.
We can use the GL_ARB_timer_query extension (available since OpenGL
3.2, and part of the OpenGL specification since version 3.3) to query
the time elapsed when drawing each frame. This allows us to gather
timing information on our use of the GPU.
For the root node we do not need to use blending, as it does not have
any backdrop to blend into. We can use a simpler 'blit' program that
only takes the content of the source and fills the texture quad with
it.
We should use ShaderBuilder to create and store programs for the GL
renderer. This allows us to simplify the creation of programs (by moving
the compilation phase into the ShaderBuilder::create_program() method),
and move towards the ability to create multiple programs and just keep a
reference to the program id.
We should keep the ShaderBuilder around and use it to query the various
uniform and attribute locations when needed, instead of storing those
offsets into the Renderer instance, and copying them. This allows a bit
more flexibility, once we have more than one program built into the
renderer.
The GL renderer should build the GLSL shaders using GskShaderBuilder.
This allows us to separate the common parts into separate files, and
assemble them as necessary, instead of shipping one big shader per type
of GL API (GL3, GL legacy, and GLES).
GskShaderBuilder is an ancillary, private type that deals with the
internals of taking GLSL shaders from resources and building them,
with the additional feature of being able to compose shaders from a
common preamble, as well as adding conditional defines (useful for
enabling debugging code in the shaders themselves).
Using GObject as the base type for a transient tree may prove to be too
intensive, especially when creating a lot of node instances. Since we
don't need properties or signals, and we don't need complex destruction
semantics, we can use GTypeInstance directly as the base type for
GskRenderNode.
This commit changes the way GskRenderer and GskRenderNode interact and
are meant to be used.
GskRenderNode should represent a transient tree of rendering nodes,
which are submitted to the GskRenderer at render time; this allows the
renderer to take ownership of the render tree. Once the toolkit and
application code have finished assembling it, the render tree ownership
is transferred to the renderer.
Whenever the render tree changes we want to drop the RenderItem arrays,
as each item contains a pointer to the GskRenderNode which becomes
dangling once the root node changed.
GSK is conceptually split into two scene graphs:
* a simple rendering tree of operations
* a complex set of logical layers
The latter is built on the former, and adds convenience and high level
API for application developers.
The lower layer, though, is what gets transformed into the rendering
pipeline, as it's simple and thus can be transformed into appropriate
rendering commands with minimal state changes.
The lower layer is also suitable for reuse from more complex higher
layers, like the CSS machinery in GTK, without necessarily port those
layers to the GSK high level API.
This lower layer is based on GskRenderNode instances, which represent
the tree of rendering operations; and a GskRenderer instance, which
takes the render nodes and submits them (after potentially reordering
and transforming them to a more appropriate representation) to the
underlying graphic system.