2851 Commits

Author SHA1 Message Date
Matthias Clasen
f26efd9adf gsk: Add a profiler mark for pipeline creation
This is the Vulkan equivalent of shader compilation. It could be
expensive, so let's add a mark around it.
2024-04-22 20:47:25 -04:00
Matthias Clasen
b21708c5e4 offload: Consolidate logging a bit
Spew a bit less per-frame. Unfortunately, we still spew for
every frame, and fixing that would require more extensive
refactoring to centralize all logging in gskoffload.c
2024-04-21 20:18:38 -04:00
Matthias Clasen
c9c29d8bde gsk: Only prefer Vulkan on Wayland
Make Vulkan the default on Vulkan-friendly platforms.
For now, that list only includes Wayland.
2024-04-20 18:10:21 -04:00
Matthias Clasen
582ad79088 gsk: Change the default renderer
The intent of this change is to get wider testing and verify that the
Vulkan drivers we get to use in the wild are good enough for our
needs. If significant problems show up, we will revert this change
for 4.16.

The new preference order is vulkan > ngl > gl > cairo.

The gl renderer is still there because we need it to support gles2.

If you need to override the default renderer choice, you can
still use the GSK_RENDERER environment variable.

Fixes: #6537
2024-04-19 13:50:40 -04:00
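A minimal sketch of the preference order described in 582ad79088, not the actual GSK selection code. `pick_renderer` and the availability bitmask are hypothetical; only the order and the `GSK_RENDERER` variable come from the commit message, and treating any known name as a valid override is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Preference order quoted in the commit message. */
static const char *const preference[] = { "vulkan", "ngl", "gl", "cairo" };

/* Hypothetical helper, not GSK API.
 * override:  value a caller would read with getenv ("GSK_RENDERER"), or NULL.
 * available: bit i is set when preference[i] can be used on this system.
 * A known override name wins; otherwise the first available renderer
 * in preference order is returned, or NULL if none is available. */
static const char *
pick_renderer (const char *override, unsigned available)
{
  size_t n = sizeof preference / sizeof preference[0];
  size_t i;

  if (override != NULL)
    for (i = 0; i < n; i++)
      if (strcmp (override, preference[i]) == 0)
        return preference[i];

  for (i = 0; i < n; i++)
    if (available & (1u << i))
      return preference[i];

  return NULL;
}
```

With every renderer available and no override, this picks "vulkan"; clearing bit 0 falls back to "ngl".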
Matthias Clasen
ec9cdb74ef gsk: Actually punch transparent holes
In a57f7e3935 I accidentally replaced { 0, 0, 0, 0 } with
GDK_RGBA_BLACK instead of GDK_RGBA_TRANSPARENT. Oops.

Fixes: #6634
2024-04-17 20:09:13 -04:00
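To illustrate why the mix-up in ec9cdb74ef matters, here is a small sketch. The `Demo*` names are hypothetical stand-ins, not the GDK definitions, and it assumes opaque black carries an alpha of 1.

```c
/* Hypothetical stand-in for an RGBA color, for illustration only. */
typedef struct { float red, green, blue, alpha; } DemoRGBA;

/* Both colors have zero color channels; only alpha tells them apart. */
#define DEMO_RGBA_BLACK       ((DemoRGBA) { 0, 0, 0, 1 })
#define DEMO_RGBA_TRANSPARENT ((DemoRGBA) { 0, 0, 0, 0 })

/* A "hole" must be fully transparent, i.e. alpha == 0. */
static int
demo_rgba_is_clear (DemoRGBA color)
{
  return color.alpha == 0;
}
```

The open-coded `{ 0, 0, 0, 0 }` corresponds to the transparent value, so swapping in black paints an opaque rectangle where a hole was intended.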
Matthias Clasen
0a5a720fe1 Use gdkrgbaprivate.h in more places
This gets inline functions used where it matters.
2024-04-15 22:57:01 -04:00
Matthias Clasen
e583e823b5 Use our defines for color
We have GDK_RGBA_WHITE, GDK_RGBA_BLACK and GDK_RGBA_TRANSPARENT,
let's use them instead of open-coding them.
2024-04-15 22:57:01 -04:00
Matthias Clasen
4aac64edf0 offload: Some renaming
Rename things to be more in line with the subsurface api.
2024-04-15 19:53:46 -04:00
Matthias Clasen
c97bbfdfb1 offload: Use subsurface bounds for diffing
When adding the whole subsurface to the diff, use the subsurface
bounds, which takes both the texture and the background into
account.
2024-04-15 19:53:46 -04:00
Matthias Clasen
933a0e5a98 subsurface: Some api revision and documentation
Rename things so they make more sense. The dest/source naming got
a bit unclear when we added background into the mix. Now we're going
for:

source_rect - the texture region to display
texture_rect - dimensions of the subsurface showing the texture
background_rect - dimensions of the background subsurface
bounds - union of texture_rect and background_rect

Also use this opportunity to add some api docs.
2024-04-15 19:53:46 -04:00
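The "bounds" definition in 933a0e5a98 is a plain rectangle union. A minimal sketch, using a hypothetical `DemoRect` type rather than the real subsurface API, and assuming an empty rect contributes nothing to the union:

```c
/* Hypothetical rectangle type, for illustration only. */
typedef struct { float x, y, width, height; } DemoRect;

#define DEMO_MIN(a, b) ((a) < (b) ? (a) : (b))
#define DEMO_MAX(a, b) ((a) > (b) ? (a) : (b))

static int
demo_rect_is_empty (DemoRect r)
{
  return r.width <= 0 || r.height <= 0;
}

/* bounds = union of texture_rect and background_rect.
 * An empty rect (e.g. no background) is ignored. */
static DemoRect
demo_rect_union (DemoRect a, DemoRect b)
{
  DemoRect u;
  float right, bottom;

  if (demo_rect_is_empty (a))
    return b;
  if (demo_rect_is_empty (b))
    return a;

  u.x = DEMO_MIN (a.x, b.x);
  u.y = DEMO_MIN (a.y, b.y);
  right = DEMO_MAX (a.x + a.width, b.x + b.width);
  bottom = DEMO_MAX (a.y + a.height, b.y + b.height);
  u.width = right - u.x;
  u.height = bottom - u.y;

  return u;
}
```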
Matthias Clasen
0108a5f56d offload: Use subsurface background optimization
Detect a black color node below the texture node and pass that
information to the subsurface, to take advantage of the single-pixel
buffer optimization.

To make this work, we need to stop using the bounds of the subsurface
node for sizing the offload, and instead use either the clip or
the texture node for that.
2024-04-15 19:53:46 -04:00
Matthias Clasen
3f9bdaa4c8 Add background to subsurfaces
Make it possible for subsurfaces to have a black background on a
secondary subsurface below the actual subsurface. Using a single-pixel
buffer for that background increases the chances that the compositor
will use direct scanout for the actual subsurface.

This changes the private subsurface API. All callers have been
updated to pass an empty background rect.
2024-04-15 19:53:46 -04:00
Matthias Clasen
ce030b1b36 gsk: Fix a minor type mismatch
Use the same types in the declaration of gsk_standard_contour_init.

Fixes: #6628
2024-04-13 22:28:48 -04:00
Philip Withnall
707e492f0d gsk: Fix a maybe-uninitialized warning
The compiler (gcc 13.2) thinks that `t` could be used uninitialised.
That’s obviously not the case, because there’s always going to be at
least one loop iteration due to the initial values of `t1` and `t2`.

Change the loop to a `do…while` to make that a bit clearer to the
compiler without making any functional changes to the code.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
2024-04-12 12:08:03 +01:00
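The pattern from 707e492f0d can be sketched as follows. This is an illustrative bisection, not the GSK code: `demo_find_t` and the function being bisected are assumptions. The point is that a `do…while` makes the first assignment to `t` unconditional, which the compiler can see.

```c
/* Bisect for the t in [0, 1] where t * t reaches target.
 * With a plain while loop the compiler cannot prove that t is
 * assigned before the return; do…while runs the body at least once. */
static float
demo_find_t (float target)
{
  float t1 = 0.f, t2 = 1.f;
  float t;

  do
    {
      t = (t1 + t2) / 2;
      if (t * t < target)
        t1 = t;
      else
        t2 = t;
    }
  while (t2 - t1 > 0.001f);

  return t;
}
```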
Matthias Clasen
cc8db1805d gsk: Be safer against bad font options
Some combinations of hint-style and hint-metrics lead to bad glyph
placement in the glyph cache, so avoid them.
2024-04-09 19:12:49 -04:00
Benjamin Otte
3080e2974d gpu: ceil() offscreen size before generating offscreen
The goal is to generate an offscreen at 1x scale.
When not ceil()ing the numbers the offscreen code would do it *and*
adjust the scale accordingly, so we'd end up with something like a
1.01x scale.

And that would cause the code to reenter this codepath with the goal to
generate an offscreen at 1x scale.
And indeed, this would lead to infinite recursion.

Tests included.

Fixes #6553
2024-04-09 17:39:32 +02:00
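A rough numeric sketch of the problem fixed in 3080e2974d, assuming the offscreen code rounds the pixel size up and derives the effective scale from it. Both helpers are hypothetical and only handle non-negative sizes.

```c
#include <assert.h>

/* Round a non-negative value up to the next integer. */
static int
demo_ceil (float v)
{
  int i;

  assert (v >= 0);
  i = (int) v;
  return v > (float) i ? i + 1 : i;
}

/* Scale that results when the pixel size is rounded up
 * but the logical size is left as it was. */
static float
demo_effective_scale (float size, float scale)
{
  return (float) demo_ceil (size * scale) / size;
}
```

A size of 99.5 at 1x ends up at roughly 1.005x, while a size that was ceil()ed to 100 beforehand stays at exactly 1x, so the 1x code path is not entered again.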
Benjamin Otte
9fe9ea34fd vulkan: Handle generating mipmaps for 1x1 images
Testcase included.
2024-04-08 21:06:54 +02:00
Matthias Clasen
704ee6a9d0 offload: Determine output transforms
When we look for the texture to attach to the subsurface, keep
track of transforms we see along the way, and look at their scale
component to determine if the texture needs to be flipped.
We currently don't allow rotations here.

This fixes glarea rendering being upside-down when offloaded.
2024-04-07 11:12:06 -04:00
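One way to picture the flip detection in 704ee6a9d0, as a sketch under assumptions: the scale components of the transforms seen along the way are multiplied together, and a negative product means the content is mirrored on that axis. `DemoFlip` and `demo_flip_from_scales` are hypothetical, and rotations are not handled, matching the restriction stated in the commit message.

```c
#include <stddef.h>

/* Hypothetical result type: nonzero means "mirrored on this axis". */
typedef struct { int flip_x, flip_y; } DemoFlip;

/* sx[i], sy[i] are the scale components of the i-th transform
 * encountered between the subsurface node and the texture node. */
static DemoFlip
demo_flip_from_scales (const float *sx, const float *sy, size_t n)
{
  float x = 1.f, y = 1.f;
  DemoFlip flip;
  size_t i;

  for (i = 0; i < n; i++)
    {
      x *= sx[i];
      y *= sy[i];
    }

  flip.flip_x = x < 0;
  flip.flip_y = y < 0;
  return flip;
}
```

Two vertical flips in a row cancel out, which is why the product is used instead of looking at each transform on its own.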
Matthias Clasen
72e9f30937 subsurface: Add transforms
Allow specifying a D₂ transform when attaching a texture to a
subsurface, to handle flipped and rotated content. The Wayland
implementation handles these transforms by setting a buffer
transform on the subsurface.

All callers have been updated to pass GDK_TEXTURE_TRANSFORM_NORMAL.
2024-04-07 11:02:40 -04:00
Matthias Clasen
8d5633cb88 Drop code handling old cairo
We require cairo 1.18 now, so we know that we will get subpixel
positioning for text, and have a little less uncertainty in
our font rendering.
2024-04-05 09:00:22 +02:00
Matthias Clasen
23a336df0e Bump the pango dep
Require pango 1.52, and drop the fallback code.

Fixes: #6554
2024-04-04 00:56:24 +02:00
Matthias Clasen
cc24401dfb Drop unused private API
We are not using gsk_get_unhinted_glyph_string_extents anymore.
2024-04-03 10:53:55 +02:00
Matthias Clasen
f445d8b518 gsk: Use hinted extents
This works better for cff fonts, where hinting is not as local as
what the autohinter does for ttf fonts, and it does not seem to
have negative effects.

Fixes: #6577
Fixes: #6568
2024-04-03 10:52:13 +02:00
Matthias Clasen
d50b780551 gsk: Keep metrics hinting on when rendering
It turns out that we mispositioned glyphs with some cff fonts
when metrics hinting is off, and hinting is on. Since we don't
fully understand the interactions of these settings at this point,
let's preserve metrics hinting as it was on the font we got.

This at least gives folks a workaround for when they experience
clipped rendering with cff fonts: Turn on hint-metrics.

We forced hint metrics off here because it made Pango do some
creative wfh for hex boxes at small sizes, but I've dropped that
on the Pango side.
2024-04-02 09:10:46 +02:00
Matthias Clasen
f0f3ea1b3e Fix build without fontconfig
We were missing some ifdefs for Windows builds.

Fixes: #6591
2024-03-31 13:08:01 +02:00
Matthias Clasen
a973e8ea8d Merge branch 'gl-offload-fixes' into 'main'
gl: Handle offloads in offscreen context better

Closes #6551

See merge request GNOME/gtk!7053
2024-03-18 15:22:22 +00:00
Matthias Clasen
1e83a44c93 gl: Handle offloads in offscreen context better
Back out of offloading below if we are in an offscreen context,
since the hole will get lost in the offscreen.

Fixes: #6551
2024-03-18 08:41:31 -04:00
Matthias Clasen
144cd2d91c gsk: Avoid some allocations
We can use a static font options object and allocate it only once.
2024-03-17 21:30:36 -04:00
Benjamin Otte
195ebf6848 Merge branch 'wip/otte/gl-map-buffer' into 'main'
Add GLBuffer implementation w/ persistent mapping

See merge request GNOME/gtk!7042
2024-03-17 00:27:51 +00:00
Benjamin Otte
aff34e8d1b gpu: Sort passes correctly
In a very particular situation, it could happen that our renderpass
reordering did not work out.
Consider this nesting of renderpasses (indentation indicates subpasses):

pass A
  subpass of A
pass B
  subpass of B

Our reordering code would reorder this as:

subpass of B
subpass of A
pass A
pass B

Which doesn't sound too bad, the subpasses happen before the passes
after all.

However, a subpass might be a pass that converts the image for a texture
stored in the texture cache and then updates the cached image.
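If "subpass of A" is such a pass *and* if "subpass of B" then renders
with exactly this texture, then "subpass of B" will use the result of
"subpass of A" as a source. With the old order, "subpass of B" ran
first and read that image before "subpass of A" had produced it.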

The fix is to ensure that subpasses stay ordered, too.

The new order moves subpasses right before their parent pass, so the
order of the example now looks like:

subpass of A
pass A
subpass of B
pass B

The place where this would happen most commonly was when drawing thumbnail
images in Nautilus, the GTK filechooser or Fractal.
Those images are usually PNG files, which are straight alpha. They are then
drawn with a drop shadow, which requires an offscreen for drawing as
well as those images as premultiplied sources, so lots of subpasses happen.
If there is then a redraw with a somewhat tricky subregion, then the
slicing of the region code could end up generating 2 passes that each draw
half of the thumbnail image - the first pass drawing the top half and the
second pass drawing the bottom half.
And due to the bug the bottom half would then be drawn from the
offscreen before the actual contents of the offscreen would be drawn,
leading to a corrupt bottom part of the image.

Test included.

Fixes: #6318
2024-03-16 23:44:59 +01:00
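The new ordering from aff34e8d1b can be sketched like this. It is not the GSK implementation: `DemoPass`, the parent-index representation and both functions are assumptions made for the example. Each pass is emitted right after its own subpasses, and passes on the same level keep their recorded order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pass record: parent is the index of the enclosing
 * pass in the recorded array, or -1 for a top-level pass. */
typedef struct {
  const char *name;
  int parent;
} DemoPass;

/* Emit all subpasses of `index` (recursively), then `index` itself.
 * Returns the next free position in `out`. */
static size_t
demo_emit_pass (const DemoPass *passes, size_t n, int index, int *out, size_t pos)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (passes[i].parent == index)
      pos = demo_emit_pass (passes, n, (int) i, out, pos);

  if (index >= 0)
    out[pos++] = index;

  return pos;
}

/* Fill `out` (n entries) with pass indices in submission order. */
static void
demo_sort_passes (const DemoPass *passes, size_t n, int *out)
{
  size_t written = demo_emit_pass (passes, n, -1, out, 0);

  /* Every pass must be reachable from the top level. */
  assert (written == n);
  (void) written;
}
```

For the example in the commit message this yields: subpass of A, pass A, subpass of B, pass B.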
Benjamin Otte
47307dc7c1 vulkan: Prefer cached buffer memory
We write the buffers in small chunks, and we even sometimes read them. So
prefer memory that is cached.

Speeds up the text benchmarks by a factor of 3x on my dedicated GPU.
2024-03-16 22:32:49 +01:00
Benjamin Otte
96b800fa0c gl: Add buffer implementation using persistent mapping
If glBufferStorage() is available, we can replace our usage of
glBufferSubData() with persistently mapped storage via
glMapBufferRange().

This has 1 disadvantage:

1. It's not supported everywhere, it requires GL 4.4 or
   GL_EXT_buffer_storage. But every GPU of the last 10 years should
   implement it. So we check for it and keep the old code.
   The old code can also be forced via GDK_GL_DISABLE=buffer-storage.

But it has 2 advantages:

1. It is what Vulkan does, so it unifies the two renderers' buffer
   handling.

2. It is a significant performance boost in use cases with large vertex
   buffers. Those are pretty rare, but do happen with lots of text at a
   small font size. An example would be a small font in a maximized VTE
   terminal or the overview in gnome-text-editor.

A custom benchmark tailored for this problem can be created with:

  tests/rendernode-create-tests 1000000 text.node

This creates a node file called "text.node" that draws 1 million text
nodes.
(Creating that test takes a minute or so. A smaller number may be useful
on less powerful hardware than my Intel Tigerlake laptop.)
The difference can then be compared via:

  tools/gtk4-rendernode-tool benchmark --runs=20 text.node
and
  GDK_GL_DISABLE=buffer-storage tools/gtk4-rendernode-tool benchmark --runs=20 text.node

For my laptop, the difference is:
before: 1.1s
after:  0.8s

Related: !7021
2024-03-16 20:55:26 +01:00
Benjamin Otte
e7a2baf78c gpu: Remove unused arguments
It's not just unused, it's also wrong.

We are reading from the buffer when reallocating the vertex buffer
and memcpy()ing the old into the new buffer - at that point we read from
it.
2024-03-16 19:46:37 +01:00
Matthias Clasen
438d86fcf5 gsk: Move the buffer upload counter
Move the sysprof counter for buffer uploads to the generic
code, so it works for both ngl and Vulkan. This partially
reverts commit ecf1b7c18a.
2024-03-16 19:39:16 +01:00
Matthias Clasen
1cbdf88b0f Merge branch 'debug-cleanup' into 'main'
gsk: Fix a typo

See merge request GNOME/gtk!7039
2024-03-16 14:41:16 +00:00
Matthias Clasen
b1fb7cd4ae gsk: Drop unused debug flags
The 'surface', 'sync' and 'opengl' flags are not used anywhere.
2024-03-16 09:44:57 -04:00
Matthias Clasen
fd90b56df6 gsk: Move and clarify a debug message
Move the only error message in the OPENGL category to RENDERER,
and make it clearer what and how.
2024-03-16 09:44:57 -04:00
Benjamin Otte
43373e6350 gpu: Rename env var GSK_GPU_SKIP to GSK_GPU_DISABLE
See previous commits.
2024-03-16 14:11:08 +01:00
Benjamin Otte
f725bdad25 gl: Move GL_ARB_base_instance check
It's a GLContext feature check, not a GpuRenderer thing.

So put it there.
2024-03-16 13:52:28 +01:00
Benjamin Otte
cfbe3709bf gpu: Respect the GDK_GL_DISABLE flag
It's now possible to disable sync support.
2024-03-16 13:52:21 +01:00
Benjamin Otte
141769fb46 gl: Turn has_foo flags into GdkGLFeatures
The goal is to have it mirror GdkVulkanFeatures, and in particular
having an environment variable to turn individual flags off.
2024-03-16 13:44:02 +01:00
Benjamin Otte
93cdcc5e88 gpu: Merge multiple ops into one ShaderOp
When ops get allocated that use the same state as the last op, put them
into the same ShaderOp. This reduces the number of ShaderOps we need to
record, which has 3 benefits:

1. It's less work when iterating over all the ops.
   This isn't a big win, but it makes submit() and print() run a bit
   faster.
2. We don't need to manage data per-op.
   This is a large win because we don't need to ref/unref descriptors
   as much anymore, and refcounting is visible on profiles.
3. We save memory.
   This is a pretty big win because we iterate over ops a lot, and when
   the array is large enough (I've managed to write testcases that make
   it grow to over 4GB) it kills all the caches and that's bad.

The main beneficiaries of all this are glyphs, which used to emit 1 ShaderOp
per glyph and can now end up with 1 ShaderOp for multiple text nodes,
even if those text nodes use different fonts or colors - because they
can all share the same ColorizeOp.
2024-03-15 20:25:02 +01:00
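A minimal sketch of the merging idea in 93cdcc5e88, under heavy simplification: whatever the previous op was recorded with is reduced to a single integer key, and the fixed-size array is for brevity. `DemoShaderOp`, `DemoOpList` and `demo_add_op` are hypothetical; only the `n_ops` counter is named in the commits.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_OPS 64

/* One recorded ShaderOp covering n_ops consecutive ops. */
typedef struct {
  int key;       /* stands in for shader, clip, descriptors, ... */
  size_t n_ops;
} DemoShaderOp;

typedef struct {
  DemoShaderOp ops[DEMO_MAX_OPS];
  size_t n;
} DemoOpList;

/* Append an op; if it matches the last recorded ShaderOp,
 * grow that one instead of recording a new entry. */
static void
demo_add_op (DemoOpList *list, int key)
{
  if (list->n > 0 && list->ops[list->n - 1].key == key)
    {
      list->ops[list->n - 1].n_ops++;
      return;
    }

  assert (list->n < DEMO_MAX_OPS);
  list->ops[list->n].key = key;
  list->ops[list->n].n_ops = 1;
  list->n++;
}
```

Only adjacent ops are merged, so a run of glyphs that share a key collapses into one entry while the overall order is preserved.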
Matthias Clasen
d51912c0b4 gsk: Add gsk_gpu_frame_get_last_op
This function will be used in the future to find the previous
op during node processing, so we can make optimization decisions
based on that.
2024-03-15 20:25:02 +01:00
Benjamin Otte
bad6e1e102 gpu: Change the way we merge draw calls
With potentially multiple ops per ShaderOp, we may encounter situations
where 1 ShaderOp contains more ops than we want to merge. (With
GSK_GPU_SKIP=merge, we don't want to merge at all.)

So we still merge the ShaderOps (now unconditionally), but we then run
a loop that potentially splits the merged ops again - exactly at the
point we want to.

This way we can merge ops inside of ShaderOps and merge ShaderOps, but
still have the draw calls contain the exact number of ops we want.
2024-03-15 20:25:02 +01:00
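The splitting step in bad6e1e102 amounts to chunking a merged run of ops into draw calls of a bounded size. A sketch with a hypothetical `demo_split_draws`; the limit would be 1 when merging is disabled, which is an assumption about how GSK_GPU_SKIP=merge maps onto this model.

```c
#include <assert.h>
#include <stddef.h>

/* Split n_ops merged ops into draw calls of at most max_per_draw ops.
 * Writes the size of each draw call into sizes[] (up to max_sizes
 * entries) and returns the number of draw calls written. */
static size_t
demo_split_draws (size_t n_ops, size_t max_per_draw,
                  size_t *sizes, size_t max_sizes)
{
  size_t n_draws = 0;

  assert (max_per_draw > 0);

  while (n_ops > 0 && n_draws < max_sizes)
    {
      size_t chunk = n_ops < max_per_draw ? n_ops : max_per_draw;

      sizes[n_draws++] = chunk;
      n_ops -= chunk;
    }

  return n_draws;
}
```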
Benjamin Otte
28a8dc5a14 gpu: Add GskGpuShaderOp.n_ops
This just introduces the variable and sets it to 1 everywhere.

The ultimate goal is to allow one ShaderOp to collect multiple ops into
one, thereby saving memory in the ops array and leading to faster
performance.
2024-03-15 19:49:17 +01:00
Benjamin Otte
975cdd8c30 gpu: Remove unused return value from function
Technically, an alloc() function should return what it allocated. But
the return value is never used.

Maybe we should rename the function?
2024-03-15 19:49:17 +01:00
Benjamin Otte
153b78e2bc gpu: Add a ShaderOp.print_instance vfunc
... and add gsk_shader_op_print() to do the generic stuff.
2024-03-15 19:49:17 +01:00
Benjamin Otte
de2b10e46c gpu: Set variable to NULL after freeing
Saw this while reviewing code.
2024-03-15 19:49:17 +01:00
Benjamin Otte
30dddf2412 gpu: Refactor waiting for frames
Instead of having renderer API to wait for any number of frames, just
have gsk_gpu_frame_wait() to wait for a single frame.

This unifies behavior on Vulkan and GL, because unlike Vulkan, GL does
not allow waiting for multiple fences.

To make up for it, we replace waiting for multiple frames with finding
the frame with the earliest timestamp and waiting for that one.

Also implement wait() for GL.
2024-03-14 06:06:33 +01:00
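The replacement strategy in 30dddf2412 (wait for the oldest frame instead of waiting on several fences) can be sketched as a simple minimum search. `DemoFrameInfo` and `demo_find_oldest_frame` are hypothetical, and the sketch assumes every frame in the array is still in flight.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-frame record. */
typedef struct {
  long long timestamp;  /* submission time, any monotonic unit */
} DemoFrameInfo;

/* Return the index of the frame with the earliest timestamp,
 * i.e. the one most likely to finish first. */
static size_t
demo_find_oldest_frame (const DemoFrameInfo *frames, size_t n)
{
  size_t oldest = 0;
  size_t i;

  assert (n > 0);

  for (i = 1; i < n; i++)
    if (frames[i].timestamp < frames[oldest].timestamp)
      oldest = i;

  return oldest;
}
```

The caller would then do a single-frame wait on the returned frame, which works the same way on Vulkan and GL.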
Benjamin Otte
b43950d0f7 gpu: Don't reuse frames while they're in use
This copies the Vulkan idea of using a fence at the end of command
submission and waiting until it gets signaled before reusing the frame.

This frees the GL driver from the work of making buffers etc. reusable.
Instead, new ones get allocated while the old ones are still in use,
which is a pretty massive performance win.
2024-03-14 04:53:12 +01:00