Commit Graph

3355 Commits

Author SHA1 Message Date
Benjamin Otte
67b9fb43d0 gpu: Completely revamp YCbCr handling
There is now a GskGpuYcbcr struct that maintains all the Vulkan
machinery related to YCbCrConversions.
It's a GskGpuCached, so it will make itself go away when it is no longer
used, ie a video stopped playing.
2024-07-22 19:40:24 +02:00
Benjamin Otte
762b981dfe gpu: Make the device queryable from the cache
That's needed in cached subclasses during destruction, when they want to
destroy system resources.
2024-07-22 19:40:24 +02:00
Benjamin Otte
5e027ae5d9 gpu: Allow creating GskGpuCached objects externally
Export the GskGpuCached and GskGpuCachedClass objects in the header, and
make gsk_gpu_cached_new() available.
2024-07-22 19:40:24 +02:00
Benjamin Otte
7dd3680d7d gpu: Refactor code a bit
Turn the 2 ways to construct cached items into 2 constructors.

Useful for next commit.
2024-07-22 19:40:24 +02:00
Benjamin Otte
14a7b4b4b4 vulkan: Remove unused features
Now that we don't use the fancy features anymore, we don't need to
enable them.
And that also means we don't need an env var to disable it for testing.
2024-07-22 19:40:24 +02:00
Benjamin Otte
f5096fd11a vulkan: No need for different shaders anymore
Now that we don't do fancy texture stuff anymore, we don't need fancy
shaders either, so we can just compile against Vulkan 1.0 again.

And that means we need no fallback shaders for Vulkan 1.0 anymore.
2024-07-22 19:40:24 +02:00
Benjamin Otte
9c5ac13301 gpu: Remove now unused variables
No need to track them anymore.
2024-07-22 19:40:24 +02:00
Benjamin Otte
ecc33d6e62 gpu: Add the same cache as the GL shader uses
This avoids unnecessary rebinds of textures.

I can't really measure a performance change with it though.
2024-07-22 19:40:24 +02:00
Benjamin Otte
03c34021af gpu: Completely revamp descriptor set handling
Instead of trying to cram all descriptors into one large array and only
binding it at the start, we now keep 1 descriptor set per image+sampler
combo and just rebind it every time we switch textures.

This is the very dumb solution that essentially maps to what GL does,
but the performance impact is negligible compared to the complicated
dance we were attempting before.
2024-07-22 19:40:24 +02:00
Benjamin Otte
1b2156493b gpu: Remove descriptors
They are no longer a thing with the new way we manage textures.
2024-07-22 19:40:24 +02:00
Benjamin Otte
7b76170f46 gpu: Flip the big switch
Rewrite all shaders to use 2 predefined samplers called GSK_TEXTURE0 and
GSK_TEXTURE1 instead of wrapper functions.

On GL and Vulkan compat mode, these map directly to samplers.
On Vulkan proper, they map to 2 indices into the texture array, like
before.

From now on, the old nvidia GPUs - ie the 3xx drivers - should start
working again.

Fixes: #6564
Fixes: #6574
Fixes: #6654
2024-07-22 19:40:14 +02:00
Benjamin Otte
8109e8e3b6 gpu: Add GskGpuShaderFlags
This is just blowing up GskGpuShaderClip to hold more information so
that we don't need even more specialization constants.
2024-07-22 18:37:07 +02:00
Benjamin Otte
163278af0d gpu: Add infrastructure to write texture vertex data
This allows GskGpuFrame implementations to store data per vertex
attribute.

This is just the plumbing, no actual implementation is done in this
commit.
2024-07-22 18:37:07 +02:00
Benjamin Otte
677b6c1a81 gpu: Force new descriptors every time
This guarantees that the images get ID 0 and 1 (on GL), which is going
to be quite important for the next steps.

Just for funsies, here's fps numbers on my desktop for this change:
NGL     1500 => 1400
Vulkan  2650 => 2250
2024-07-22 18:37:07 +02:00
Benjamin Otte
1331a10e88 gpu: Remove buffer handling
We don't use buffers atm, and if we want to bring them back later, we
can just look at reverting this commit.

And it's in the way while refactoring.
2024-07-22 18:37:07 +02:00
Benjamin Otte
dc9f0869b1 gpu: Pass used images to shader ops
This by itself is just more work refcounting all those images, but
there's actually a goal here, that will become visible in future
commits.

But this is split out for correctness and benchmarking purposes (the
overhead from refcounting seems to be negligible on my computer).
2024-07-22 18:37:07 +02:00
Benjamin Otte
ea6253c1df gpu: Add a member in ShaderOpClass for number of textures
This just puts the number from the header into a strcut where it can be
accessed.
2024-07-22 18:37:07 +02:00
Benjamin Otte
b481fd854f gpu: Encode number of textures use in every shader
Just define GSK_N_TEXTURES in every glsl file, extract that #define in
the python parser and emit a static const uint variable
"{shader_name}_n_textures" in the generated header.
2024-07-22 18:37:07 +02:00
Benjamin Otte
68baa93460 gpu: Use GskGpuShaderImage for blur ops 2024-07-22 18:37:07 +02:00
Benjamin Otte
52db54e803 gpu: Use GskGpuShaderImage for crossfade ops 2024-07-22 18:37:07 +02:00
Benjamin Otte
71e412e8f8 gpu: Use GskGpuShaderImage for convert ops 2024-07-22 18:37:07 +02:00
Benjamin Otte
9644fc2e8f gpu: Use GskGpuShaderImage for mask ops 2024-07-22 18:37:07 +02:00
Benjamin Otte
0795d86df7 gpu: Use GskGpuShaderImage for colormatrix ops 2024-07-22 18:37:07 +02:00
Benjamin Otte
21988ea700 gpu: Use GskGpuShaderImage for blendmode ops 2024-07-22 18:37:07 +02:00
Benjamin Otte
23081d2bc4 gpu: Use GskGpuShaderImage for texture ops 2024-07-22 18:37:07 +02:00
Benjamin Otte
b1e441d18a gpu: Introduce GskGpuShaderImage
It's a struct collecting all relevant info for a texture passed to a
shader.

The ultimate goal is to get rid of the descriptors and let ops
manage them on thir own.
2024-07-22 18:37:07 +02:00
Benjamin Otte
4639b3bc4c gpu: Make cache keep track of time
If GskGpuCache has an idea of what time it is, cached items can use that
time to update their last-use time instead of having to carry it around
throught function calls everywhere.
2024-07-22 17:10:37 +02:00
Benjamin Otte
821eb92dfb gpu: Handle corner-case
Port an optimization of the GL renderer where it fast-paths crossfades
with progress <= 0 and >=1 - which should really never happen because
nobody should emit them in the first place, but oh well.
2024-07-22 17:10:37 +02:00
Benjamin Otte
0370043775 gpu: Add missing string
This made debug output kinda not so good.
2024-07-22 17:10:37 +02:00
Benjamin Otte
82bcc05ca1 gpu: Move code around
Move the atlas code to the top of the file, so that other code can use
it.

No functional changes.
2024-07-22 17:10:37 +02:00
Benjamin Otte
c1e008fa86 gpu: Improve cache stats printing
We no longer hardcode the few different classes we have, but generically
walk over all classes.

As a side effect we now get new classes added to stats automatically.

The content itself did not change.
2024-07-22 02:03:00 +02:00
Benjamin Otte
c47a3c54fd vulkan: Make images track the device
Now that the cache is a separate object, there are no longer cyclic
uncollectable references, so images can use the device like everyone
else.
2024-07-22 01:28:40 +02:00
Benjamin Otte
3bc1e0534f gpu: Clean up headers
After the device/cache split, this was forgotten.
2024-07-22 01:28:40 +02:00
Benjamin Otte
7fb11dfeb0 gpu: Print filename in exceptions
I want to know which shader I screwed up.
2024-07-22 01:28:40 +02:00
Benjamin Otte
6d09eed90e gpu: Remove unused argument
It's always passing NULL.
2024-07-22 01:28:40 +02:00
Benjamin Otte
2a9056b49e ngl: Fix crash at startup
Commit 1580490670 included a reordering of
acquiring the frame before making the context current.

Sometimes (like at startup) new frames need to be created.

Setting up a new frame assumed the GL context was current.

Change it so that we delay the one GL setup we do in frames until later.
2024-07-19 21:37:48 +02:00
Benjamin Otte
300639e537 vulkan: Use right check for waiting on external image semaphore
Commit 3aa6c27c26 changed the initial layout of imported dmabuf images,
but did not adapt this check.
2024-07-17 22:59:23 +02:00
Benjamin Otte
ad218f0786 gpu: Pass the pass to frame_submit()
We will need that in the next commit.
2024-07-17 22:59:23 +02:00
Benjamin Otte
4966f8cdf8 vulkan: Add an acquire semaphore to frames
Vulkan requires us waiting on the image acquired from
vkAcquireNextImageKHR() before we start rendering to it, as that
function is allowed to return images that are still in use by the
compositor.
Because of that requirement, vkAcquireNextImageKHR() requires a
semaphore or fence to be passed that it can signal once it's done.

We now use a side channel to begin_frame() - calling
set_draw_semaphore() - to pass that semaphore so that the
vkAcquireNextImageKHR() call inside begin_frame() can use it, and then
we can wait on it later when we submit.

And yes, this is insanely convoluted, the Vulkan developers should
totally have thought about GTK's internal designs before coming up
with that idea.
2024-07-17 22:59:23 +02:00
Benjamin Otte
1580490670 gpu: add gsk_gpu_frame_begin/end()
These are just factoring out gdk_draw_context_begin/end_frame() so I can
add one tiny thing there later.

And I did both even though I only need one, because it felt wrong to
just do one.
2024-07-17 22:59:23 +02:00
Benjamin Otte
3cf5e8cf4e gpu: Move gc calls further to the edges of the function
Make the function look like that:

1. handle special case
2. maybe GC
3. draw
4. queue next gc
5. cleanup

This seems like the sanest approach to avoid gc() collecting things
necessary for drawing in the future.

And I need to refactor stuff, so having it out of the way is a good
idea.
2024-07-17 22:59:23 +02:00
Benjamin Otte
d21ac80178 gpu: Simplify a function
Now that we only ever use 2 images max per shader due to the removal of
the ubershader, we can just hardcode it in the function.
2024-07-17 22:59:23 +02:00
Benjamin Otte
11543a229a texturedownloader: Add color state
... and plumb the color state through the downloading machinery, where
no matter what path it takes it ends up in
gdk_memory_convert_color_state() or gdk_memory_convert().

The 2nd of those has been expanded to optionally do colorstate
conversion when the 2 colorstates are different.
2024-07-16 21:23:44 +02:00
Benjamin Otte
37bea9d162 gpu: Don't transition invalid cache items
When a cache item is invalid, don't move it into the hash table.
Instead, just delete it.

Something like this could happen:

1. A texture is cached
In the case of #6867 this would be a webpage in epiphany.

2. The texture cache item is garbage-collected
For example, epiphany might switch to a new tab, and the previous page's
texture will remain. After 15s or so, we collect our item for that
texture.

3. The texture is cached again, but in the target colorspace
We now decide we need the texture again, but not in any colorspace, we
need it in the target colorspace. This might be because we run an
effect on it (like a crossfade) or because we want mipmaps (like in the
overview map, where its zoomed out).

4. The old invalid item is transitioned into the hash table
We now have an invalid item in the hash table. This is extra bad,
because it had only one reference (from the texture), but we treat it
like it has 2 (from us in the hash table and from the texture).
So depending on if the texture is freed before we reuse it, we get
different results: If it was free, we get invalid memory accesses, if it
was not freed, we treat it like a valid cache item and think the image
inside is still valid.

Fixes #6867
2024-07-16 03:15:36 +02:00
Michael Catanzaro
4c40395a38 gpu: fix memory corruption in cache_gc_cb()
gsk_gpu_device_gc() may release the last ref on the GskGpuDevice,
leading to memory corruption when setting priv->cache_gc_source = 0.

Includes a bit of refactoring, so the ref/unref wraps nicely around the
actual code.

Fixes crashes seen after using the inspector and closing the window,
thereby closing all windows of a display and releasing all references to
the device.

Fixes #6861
2024-07-14 21:54:57 +02:00
Benjamin Otte
5f8e83d75d gpu: Fix memleak in texture-scale code 2024-07-14 21:54:40 +02:00
Matthias Clasen
54e5cc296f colorstate: Add rec2100-pq and rec2100-linear
These are wide-gamut, HDR colorstates that we will need for HDR support.
2024-07-13 15:11:07 -04:00
Matthias Clasen
457fd68168 gpu: Make color conversion extensible
Change the glsl convert_color function to proceed in stages:
- first unpremultiply
- then linearize
- then transform linearly
- then delinearize
- then premultiply
All the steps are only taken if needed.
2024-07-13 15:09:12 -04:00
Benjamin Otte
648c780e91 gpu: Respect colorstate for offscreens
We want to render in at least the minimum required depth of the used
colorstate.
2024-07-13 14:51:49 -04:00
Benjamin Otte
6d263f8680 gpu: Add GSK_GPU_BLEND_NONE
Allows writing without blending. This is useful when copying/converting
textures.

In particular, we use it for colorspace conversions.
2024-07-13 10:56:47 +02:00
Benjamin Otte
d9ab6495ef gpu: Fix color convert path to not crash
The occlusion culling reorganization messed up this branch.

Make it work again.
2024-07-13 10:56:47 +02:00
Benjamin Otte
d54b68b93c gpu: Convert values to float[4] from GdkRGBA
We need to make sure our clear values are in the right colorstate, not
in sRGB.

The occluision culling managed to sneak through the big transition for
that.
2024-07-13 02:07:15 +02:00
Benjamin Otte
761346ed5a gpu: Remove unused macro
This is a leftover from the pre-color-managed times
2024-07-13 02:07:15 +02:00
Benjamin Otte
1c1b78aa1c gpu: Implement tiling for texture-scale nodes
This is actually the node Loupe is using, so having tiling work with it
is important.

Because of the previous commit, different filters are supported fine.

Fixes: #6324
2024-07-12 18:09:46 +02:00
Benjamin Otte
cdb2308ddd gpu: Add filter support to tiled images
This allows mipmapping if downscaled a lot, like we do for non-tiled
images.

A side effect is that due to the simpler caching for tiles, we can only
cache the mipmapped images in one colorstate. But we need to pick a
potentially non-default one, because we want to mipmap in a linear
colorstate.

So this is somewhat suboptimal. Patches with improvements accepted.
2024-07-12 17:31:36 +02:00
Benjamin Otte
c581f722bd gpu: Split out a function
We'll need mapping scaling filters to samplers elsewhere soon.
2024-07-12 17:31:36 +02:00
Benjamin Otte
340c98c6cd gpu: Split a function
Split drawing the tiles from setting up the offscreen for drawing the
tiles.
2024-07-12 17:31:36 +02:00
Benjamin Otte
39f5c5bf49 gpu: Implement tiling for texture nodes
Use the new cache feature to split oversized textures into tiles the
size given by the new device API.

Then number those tiles from left to right and top to bottom and use
that number as the tile id.
2024-07-12 17:31:36 +02:00
Benjamin Otte
392f6855ca gpu: Add gsk_gpu_device_get_tile_size()
This allows managing tiling of images. And I'd like this value to live
somewhere prominent instead of as a hardcoded number in the
nodeprocessor.
2024-07-12 17:31:36 +02:00
Benjamin Otte
1cae48ab93 gpu: Add a tile cache
Nobody is using it yet, but it's the API.

It's very simple and just allows adding tiles by an index. What that
index means is up to the caller.
2024-07-12 17:31:36 +02:00
Benjamin Otte
d0f8ef09a0 gpu: Do a GC run after every tile of large images
When we draw large images, we absolutely do not want to keep memory that
we do not need. So do a GC run after every tile. That otentially slows
down things, but it also improves the chances of not running out of
memory.

Here's the node for the image I managed to create after I applied this
patch:

repeat {
  bounds: 0 0 50000 50000;
  child: text {
    font: "Noto Color Emoji 10000px";
    glyphs: 661 0 0 0 color;
    offset: 0 10000;
    hint-style: none;
  }
}
2024-07-12 16:57:23 +02:00
Benjamin Otte
0516dca116 vulkan: Don't try to use nonexisting formats
Handle the error that new rgba format exists.
2024-07-12 16:56:23 +02:00
Benjamin Otte
ebc6a043c9 gpu: Cleanups 2024-07-12 16:56:23 +02:00
Benjamin Otte
8e2ae79875 gpu: Change function to (transfer full)
Functions should behave as I expect, and I just spent an hour debugging
a refcount issue because I assumed our image creation functions return
refrences. Which is a very sane assumption.
2024-07-12 16:55:59 +02:00
Benjamin Otte
27ac764653 gpu: Don't multiply by 1/x, divide by x
This is less error-prone with floating point math, even though it is
somewhat slower.
2024-07-12 16:55:59 +02:00
Benjamin Otte
e40ad5faa5 gpu: Cache textures when doing copies
The texture and texture-scale node code is creating image copies
for mipmaps and to adapt to the compositing colorstate.

Those texture should be cached.
2024-07-11 14:57:20 +02:00
Benjamin Otte
dd393a4a0e gpu: Split out texture lookup function
It's unused in 3 function and has become somewhat unwieldy.
2024-07-11 14:57:20 +02:00
Benjamin Otte
881954dfca gpu: Rework texture caching
We want to cache textures in the compositing color state, not in their
original color state. However, the compositing color state may change
(think multimonitor setups).

So we additionally keep a cache per colorstate.

That means texture lookup is now a 3-step process:

1. Look up in the compositing colorstate's cache

2. Look up in the general cache

3. Upload
2024-07-11 14:57:20 +02:00
Benjamin Otte
cf12503fec gpu: Don't replace cache items
Instead, keep them. This is not useful yet, but will become so in the
next commits.
2024-07-11 14:57:20 +02:00
Benjamin Otte
ad757cccb6 Don't use GL_SRGB for premultiplied textures
GL_SRGB is doing postmultiplied alpha, so if the texture is
premultiplied, we can't use this optimization.

The optimization still works for unpremultiplied and opaque images,
because those don't do that step.
2024-07-11 14:57:20 +02:00
Benjamin Otte
b801eae00f gpu: All ops obey the ccs now
Remove the macro used for the not-yet converted ops.
2024-07-11 14:57:20 +02:00
Benjamin Otte
2fe9ff7918 gpu: Make mask op obey ccs
No colorstate conversions allowed here, though technically we could use
the alternate color state for the source most of the time, as the mask's
colorstate is only relevant for luminance.
2024-07-11 14:57:20 +02:00
Benjamin Otte
c407582096 gpu: Make blur op obey ccs
Blend ops don't do colorspace conversion, so this commit just hardcodes
that and rewrites the shader to use recent APIs.
2024-07-11 14:57:20 +02:00
Benjamin Otte
0fc3dbaa9b gpu: Make texture op obey ccs
Well, texture ops actually don't do any colorspace stuff, but let's
explicitly hardcode that.
2024-07-11 14:57:20 +02:00
Benjamin Otte
77ed264714 gpu: Introduce gsk_gpu_color_states_create_equal()
This is a function that's meant to be used whenever both color states
of the shader are equal. In that case no colorspace conversion code
needs to be created and shaders can be shared.
2024-07-11 14:57:20 +02:00
Benjamin Otte
a14efce914 gpu: Make blur op obey ccs 2024-07-11 14:57:20 +02:00
Benjamin Otte
08f90a188c gl: Force U8 textures for U8_SRGB depths
The GL renderer is using FLOAT32 instead of GL_SRGB, which is screwing
up the node-editor by making it turn on high bit depth unconditionally.

So until someone fixes the GL renderer properly, do this quickfix.
2024-07-11 14:57:20 +02:00
Benjamin Otte
df46eeafdb gpu: Make colormatrix op obey ccs
The colormatrix needs to be applied to unpremultiplied values, so we use
the alternative colorstate for that.
2024-07-11 14:57:20 +02:00
Benjamin Otte
0029286a1e gpu: Remove unused function
The colormatrix shader is no longer used for opacity.
2024-07-11 14:57:20 +02:00
Benjamin Otte
0e46d4eb98 gpu: Pass sampler to the image op
That way, we can use it in one other place where we want to use mipmaps.

I don't really like it because it adds yet another argument,
but then the one new caller was selecting suboptimal shaders, and that's
worse.
2024-07-11 14:57:20 +02:00
Benjamin Otte
f80e3fff92 gpu: Don't use the colormatrix shader for opacity
The colormatrix shade does a whole matrix multiplication, which is
absolutely not necessary.

The convert shader has builtin opacity handling and when the colorstates
match will do no conversion.
2024-07-11 14:57:20 +02:00
Benjamin Otte
504c6ba792 gpu: Don't use color matrix for opacity
We can use the regular image op which will select the fastest shader.
2024-07-11 14:57:20 +02:00
Benjamin Otte
c0aba9aee1 gpu: Make crossfade op obey ccs
I didn't have an idea what to use the alternate color state for, so I
don't use it.
2024-07-11 14:57:20 +02:00
Benjamin Otte
b97cf2b9d9 gpu: Track image colorstate
So far we only track the image colorstate and convert if necessary.

There is no caching of the converted images happening.
2024-07-11 14:57:20 +02:00
Benjamin Otte
f81e7b2112 gpu: Make conic gradient op obey ccs
Straight copy of the linear gradient changes.
2024-07-11 14:57:20 +02:00
Benjamin Otte
b59e4a929e gpu: Make radial gradient op obey ccs
Straight copy of the linear gradient changes.
2024-07-11 14:57:20 +02:00
Benjamin Otte
099d72f037 gpu: Make box shadow op obey ccs 2024-07-11 14:57:20 +02:00
Benjamin Otte
73acb41931 gpu: Make colorize op obey ccs 2024-07-11 14:57:20 +02:00
Benjamin Otte
ebb7fdb099 gpu: Make linear gradient op obey ccs
The alternative color state is used as the interpolation color state.
Colors are transformed into that space on the CPU.

For now we set the interpolation color state to SRGB, because ultimately
we want to let callers specify it, so having something that's easy to
map to that behavior is desirable.
Otherwise we might have chosen to interpolate in the compositing
colorstate.

It also means that we need to premultiply colors on the CPU now because
of the limitations of the shader colorstates APIs.
2024-07-11 14:57:20 +02:00
Benjamin Otte
ec3cb0ad9a gpu: Make border op obey ccs 2024-07-11 14:57:20 +02:00
Benjamin Otte
7ffef6792f gpu: Make rounded-color op obey ccs
This is the same as the color op.
2024-07-11 14:57:20 +02:00
Benjamin Otte
383148dc31 gpu: Make color op obey ccs
This makes use of the GskGpuColorStates by setting the ccs as output
colorstate and the color's colorstate as alternative color state.

The shader adaption is very straightforward because of that.
2024-07-11 14:57:20 +02:00
Benjamin Otte
a31601ccfc gpu: Make clear op obey ccs
This is the first op to obey the compositing color state. This means
from now on until all ops obey the ccs rendering is broken when ccs is
not set to linear.

I'll keep individual ops in seperate commits for easier review, because
they all need different adaptations.
2024-07-11 14:57:20 +02:00
Benjamin Otte
a587492cad gpu: Handle target not being composite colorstate
Render to an offscreen and add a final conversion if the target
colorstate is not a rendering colorstate.

This now allows the GPU renderer to render to any colorstate.
2024-07-11 14:57:20 +02:00
Benjamin Otte
f65d4914e4 gpu: Port convert op to GskGpuColorStates
Make it handle straight alpha, too, by checking if the alt colorspace is
premultiplied - which is the colorspace of the source.
2024-07-11 14:57:20 +02:00
Benjamin Otte
88dc49a5b6 gpu: Print the color states of shader ops
Makes the verbose output (a lot) more verbose, but it makes the
colorstates used in the shaders very visible.

And it will be relevant once people start using different colorstates
everywhere (like oklab for gradients/colors and so on).
2024-07-11 14:57:20 +02:00
Benjamin Otte
91d970e9c5 gpu: Add shaders for the new specialization constant
This adds the following functions:

output_color_from_alt()
alt_color_from_output()
  Converts between the two colors

output_color_alpha()
alt_color_alpha()
  Multiplies a color with an alpha value
2024-07-11 14:57:20 +02:00
Benjamin Otte
6c5ae48a05 gpu: Pass color states as specialization constant
This adds a GdkColorStates that encodes 2 of the default GdkColorStates
and wether their values are premultiplied or not.

Neither do the shaders do anything with this information yet, nor do the
shaders do anything with it yet, this is just the plumbing.
2024-07-11 14:57:20 +02:00
Benjamin Otte
d85ec2cbb4 gpu: create SRGB images
If desired, try creating GL_SRGB images. Pass a try_srgb boolean down to
the image creation functions and have them attempt to create images like
that.

When it is not possible to create srgb images in the given format, just
fall back to regular images. The calling code is meant to check the
GSK_GPU_IMAGE_SRGB flags to determine the actual format of the resulting
image.
2024-07-11 14:57:20 +02:00
Benjamin Otte
05b79bc378 gpu: Handle SRGB in render_texture()
When GDK_MEMORY_U8_SRGB is desired by the node, and a SRGB image is
created, pick SRGB_LINEAR as the colorspace to pass to frame_render().
2024-07-11 14:57:20 +02:00
Matthias Clasen
3ba63315d5 gpu: Pass compositing color states
Make the node processor and the pattern writer track the current
compositing color state. Color state nodes change it. We pass
the surface color state down via the frame apis.

The name of the variable is "ccs" for "compositing color space". It's an
unused variable name and it's common enough to deserve a short and sweet
name.
2024-07-11 14:57:20 +02:00
Benjamin Otte
eccdb594eb gpu: Remove straightalpha shader
As the new convert shader can do everything this shader could, use it
instead.
2024-07-11 14:57:20 +02:00
Matthias Clasen
a78796f22c gpu: Add a color convert shader
This shader converts between two color states, by using the
same functions that we use on the cpu. The conversion to perform
is passed as part of the variation.

As premultiplication is part of color states on the shader, we also
encode the premultiplication in the shader.
And because opacity is a useful optimization, we also allow setting
opacity.

For now, the only possible color states are srgb and srgb-linear.
2024-07-11 14:57:20 +02:00
Benjamin Otte
4fa6f791f4 cairo: Use the draw context's color state
This just passes through the sRGB set by the GDK backends instead of
hardcoding sRGB, so no functional changes.
2024-07-11 14:57:20 +02:00
Benjamin Otte
6287eaa745 cairo: Add colorstate to GskRenderNode::draw and use it
This adds the following:
- ccs argument to GskRenderNode::draw
  This is the compositing color state to use when drawing.

- make implementations use the CCS argument
  FIXME: Some implementations are missing

- gsk_render_node_draw_with_color_state()
  Draws a node with any color state, by switching to its compositing
  color state, drawing in that color state and then converting to the
  desired color state.
  This does draw the result OVER the previous contents in the passed in
  color state, so this function should be called with the target being
  empty.

- gsk_render_node_draw_ccs()
  This needs to be passed a css and then draws with that ccs.
  The main use for this is chaining up in rendernode draw()
  implementations.

- split out shared Cairo functions into gdkcairoprivate.h
  gskrendernode.c and gskrendernodeimpl.c need the same functions.
  Plus, there's various code in GDK that wants to use it, so put it in
  gdk/ not in gsk/

gsk_render_node_draw() now calls gsk_render_node_draw_with_color_state()
with GDK_COLOR_STATE_SRGB.
2024-07-11 14:57:20 +02:00
Benjamin Otte
ab7d969700 gdk: Add color state arg to gdk_texture_download_surface()
All callers set it to SRGB at the moment.
2024-07-11 14:57:20 +02:00
Matthias Clasen
5a7d7cc9f5 gsk: Show srgb information in verbose output
Show which offscreens are using an srgb format.
2024-07-11 14:57:20 +02:00
Benjamin Otte
1bbf5f7a17 gdk: Add GDK_MEMORY_NONE depth
That's basically the "undefined" value. We need that when drawing
nothing, which so far only happens with empty container nodes.

But empty container nodes can be children of other nodes, and that makes
things propagate. So instead of catching them, force the whole rest of
the code to deal with an undefined depth.

We also can't just set a random depth, because that will cause merging
to fail.
2024-07-11 14:57:20 +02:00
Matthias Clasen
db3b3c62bb ngl: Mark backbuffers as srgb
When the surface tells us that a surface is using an sRGB backbuffer,
set the corresponding flag on the backbuffer.
2024-07-11 14:57:20 +02:00
Matthias Clasen
de76045939 vulkan: Mark swapchain images as GSK_GPU_IMAGE_SRGB
Detect if an SRGB format is in use and mark the images as such.

So far this doesn't happen, but once it does, things will work.
2024-07-11 14:57:20 +02:00
Benjamin Otte
16c29a7db5 texture: Add gdk_texture_get_depth()
... and use it.
2024-07-11 14:57:20 +02:00
Benjamin Otte
b5b0002d24 rendernode: Return the right depth for colors
This makes sure we return U8_SRGB for GdkRGBA colors with linear
compositing.
2024-07-11 14:57:19 +02:00
Benjamin Otte
a7ceb8ce66 gdk: Add GDK_MEMORY_U8_SRGB depth
This is an experiment for now, but it seems that encoding srgb inside
the depth makes sense, as we not just use depth to decide on the
GL fbconfigs/Vulkan formats to pick, depth also encodes how the [0...1]
color values are quantized when stored.

Let's see where this goes.
2024-07-11 14:57:19 +02:00
Benjamin Otte
6dea23128a gpu: Add the GSK_GPU_IMAGE_SRGB flag
This commit just adds the flag, but I wanted to make it an individual
commit to explain the purpose:

The SRGB flag is meant to be used for images that have an SRGB format.
In Vulkan terms, that means VK_FORMAT_*_SRGB.
In GL, it means GL_SRGB or GL_SRGB_ALPHA.

As these formats have been madatory since GL 3.0, we can (ab)use them
uncoditionally. Images in these formats are renderable, too, so it's
not just usable for uploading.

What these images allow is treating the data as sRGB while shaders
access them as linear, thereby getting sRGB<=>linear conversions for
free.

It is also possible to switch off the linearization of these images and
treat them as sRGB, which allows all sorts of shenanigans, though one
has to be careful if that turning off applies to the relevant GL/Vulkan
code in question.
2024-07-11 14:57:19 +02:00
Benjamin Otte
2f4e19d514 gdk: Allow querying GL SRGB formats
Nobody is using this yet.
2024-07-11 14:57:19 +02:00
Benjamin Otte
1273413c7b vulkan: Import GL textures via dmabufs
If the GL texture is exportable to a dmabuf, we can just use our dmabuf
importing code to get that texture into Vulkan.
There is no need to go via host memory in that case.

And if it doesn't work, we just fall back, like before.
2024-07-11 14:14:35 +02:00
Benjamin Otte
ef3f48a2be vulkan: Refactor gsk_vulkan_image_new_for_dmabuf()
It now works with just a dmabuf and doesn't take a texture anymore.

Which means it can be used from other codepaths in the future.
2024-07-11 14:14:35 +02:00
Benjamin Otte
fff78b60e9 gpu: All nodes are implemented
Unimplemented nodes are a failure now.

We make this a soft failure with a g_warning() so that during
development when adding new nodes, the renderer doesn't instantly crash,
but instead prnts a warning.

But we do consider unimplemented nodes a bug now.

Because of that, add_fallback_node() is now renamed to add_cairo_node().
2024-07-11 13:34:37 +02:00
Benjamin Otte
9abc7fc80b gpu: Don't hand Cairo invalid nodes
When encountering an invalid node, exit asap. Don't draw it with Cairo,
Cairo won't know what to do with it either.
2024-07-11 13:34:37 +02:00
Benjamin Otte
d8059ebdd2 gpu: "Implement" GL shader nodes
Instead of falling back to Cairo, draw the pink error rectangle
directly.
2024-07-11 13:34:36 +02:00
Matthias Clasen
8c988a7b07 Take all transforms into account
When determining which way is up for the offloaded texture, we
must take all transforms into account - the ones outside the
subsurface node, and the ones inside.
2024-07-10 22:11:13 +02:00
Benjamin Otte
4f2b639a24 gpu: We can handle 90 degree rotations quite easily 2024-07-10 22:06:24 +02:00
Benjamin Otte
3e01924ca3 gpu: Handle dihedral transforms in occlusion culling 2024-07-10 22:06:24 +02:00
Matthias Clasen
4d2884bde7 offload: Support dihedral transforms
When looking for the texture transform, allow dihedral transforms,
and pass them along when attaching to the subsurface.
2024-07-10 21:34:12 +02:00
Benjamin Otte
6065165060 Add gdk_dihedral_swaps_xy
This function determines if a dihedral transform swaps x and y.
2024-07-10 21:34:12 +02:00
Benjamin Otte
9272bf96f6 gpu: Handle affine transforms without touching matrix
By moving negative affines to be treated like dihedrals, because they
also need support of the modelview, we can free up the affine branch for
doing work without it.

Not a big win I guess, but it makes scaling more efficient.
2024-07-10 21:34:12 +02:00
Benjamin Otte
ae3efb2d2f gpu: Implement transform support for dihedral transforms
This allows handling them without ever needing to offscreen for losing
the clip, because the clip can always be transformed.

Also, all the optimizations keep working, like occlusion culling,
clears, and so on.

The main benefit of this work is the ability for offloading to now
handle dihedral transforms of the video buffer.

The other big advantage is that we can now start our rendering with a
dihedral transform from the compositor.
2024-07-10 21:34:12 +02:00
Benjamin Otte
b9ecae84f5 gpu: Shuffle some transform flags around
No need to check for negative numbers now that we can just use the
category that doesn't give us any.
2024-07-10 21:34:12 +02:00
Benjamin Otte
cd3286c71a gpu: Switch to GskFineTransformCategory
This is purely replacing the enums, no functional changes.
2024-07-10 21:34:12 +02:00
Benjamin Otte
83c2766e3f rendernode: Handle rotation transforms in opacity calculations
Tests included.
2024-07-10 21:34:12 +02:00
Benjamin Otte
5f719c8ea3 transform: Implement transform_point() for dihedrals
No longer using the default path and risking rounding issues.
2024-07-10 21:34:12 +02:00
Benjamin Otte
555f7d5404 transform: Implement transform_bounds() for dihedrals
No longer using the default path and risking rounding issues.
2024-07-10 21:34:12 +02:00
Benjamin Otte
3bcfc5539c transform: Add gsk_transform_to_dihedral()
I hate everything about this.

Is is xy or yx now?
Do I need to combine(a, b) or combine(b, a)?
2024-07-10 14:38:48 +02:00
Benjamin Otte
3221a8bdab gdk: Rename GdkTextureTransform to GdkDihedral
... and put it into its own header.

We want to use it in more places and the name and location are awkward.
2024-07-10 12:36:07 +02:00
Benjamin Otte
930f059f5e transform: Introduce GskFineTransformCategory
This category does a finer-grained categorization than
GskTransformCategory, but it is deliberatedly made to allow
easy backwards compatibility.

The reason for the categories is that they fit our renderers more
fine.

In particular, it allows implementing wl_output_transform support more
efficiently, thereby allowing rendering buffers the right way for
rotated phone screens or monitors.
2024-07-10 12:36:07 +02:00
Benjamin Otte
9a6e61e510 rect: There's no coverage witout overlap
The rectangles need to touch/overlap in both directions, otherwise
there's no coverage that covers both rectangles.

Test included.

Fixes rendering glitches in various apps when redrawing.

Fixes: #6849
2024-07-10 00:28:33 +02:00
Matthias Clasen
ed4f51436c Merge branch 'matthiasc/for-main' into 'main'
Add a missing include

Closes #6842

See merge request GNOME/gtk!7433
2024-07-09 03:04:25 +00:00
Matthias Clasen
7837354c6c Add a missing include
When building without Vulkan, we don't get this include for free.
So add it explicitly.

Fixes: #6842
2024-07-08 22:18:11 -04:00
Benjamin Otte
5dc6f134c5 gpu: Make offscreening code use process()
... instead of init_draw(); add_node(); finish_node();

We hook into the infrastructure one step earlier and close to where the
default renderer_render() and renderer_render_texture() arrive in the
nodeprocessor.

Why is this relevant?
Because process() does occlusion culling.

TL;DR: offscreens do culling now
2024-07-08 23:22:55 +02:00
Benjamin Otte
6daeb7e504 gpu: Transition exported textures into GENERAL layout
We import them as general, so they should be exported like that.

This was a longstanding issue that I never got around to fixing and I'm
touching this code anyway atm.
See commit 3aa6c27c26 for more details.
2024-07-08 23:22:41 +02:00
Benjamin Otte
fcf59ad135 gpu: Allow NULL as clear color
NULL disables clearing. We only implement this for GL as in Vulkan we'd
need to create different renderpasses with different attachment
descriptions and that would require more plumbing.
2024-07-08 23:07:36 +02:00
Benjamin Otte
1dd905d976 gpu: Fix wrong rect check in occlusion fallback path
We need to check that the clip is inside the opaque region, not that the
opaque region is inside the clip.

Test included, using the only not that hits the fallback path with an
opaque region smaller than its bounds.
2024-07-08 23:07:36 +02:00
Benjamin Otte
155f7cdeec gpu: Chceck if a container node is opaque as fallback
Sometimes container nodes contain lots of overlapping opaque items. In
that case we can use the container node itself as the first node even
though none of the children cover the whole paint area.

The use case for this is a grid of cells like in a terminal where all
the cells are opaque and we want to avoid drawing the background behind
them.
2024-07-08 23:07:36 +02:00
Benjamin Otte
6d564075f3 gsk: texture-scale nodes with non-integer bounds aren't opaque
Due to the way the intermediate offscreen gets drawn, we might end up
with seams at the edges.

And I don't think it's worth spending more time on than saying "not
opaque".

Fixes the compare-render testsuite

New testcase included.
2024-07-08 15:28:14 +02:00
Benjamin Otte
4cdf7aa65a rendernode: Add containernode->opaque and fill it at startup
We want to operate with opacities, so it makes sense to have this radily
available.

And we're doing a walk over all children on creation anyway, so why not
just capture the rect there.
2024-07-08 15:28:14 +02:00
Benjamin Otte
329dc9e0cf gpu: Change get_opaque() implementation of containers
We want to be able to express opaque grids. This means that the app
provides either a row of columns of opaque nodes or a column of rows,
and then the containers will magically figure it out.

The main use case for this is terminals, which are uilt using cells. And
when there's a transparent background configured but the contents are
opaque, it'd be nice if we could figure that out.

Also remove the 80% requirement. It is rather arbitrary and while it
helps for some cases, the aforementioned grid would suffer.
2024-07-08 15:28:14 +02:00
Benjamin Otte
e02de45537 gpu: Add GSK_GPU_DISABLE=occlusion
This simply disables add_first_node() usage.

Useful to find bugs in its implementation or track performance
with/without it.
2024-07-08 15:28:14 +02:00
Benjamin Otte
29fbda49bb gpu: Implement add_first_node() for rounded clip nodes
This is a bit more expensive than clip nodes, because we have to check
the rounded edges are outside of the clip.
2024-07-08 15:28:14 +02:00
Benjamin Otte
1b155341bd gpu: Implement add_first_node() for clip nodes
Clip nodes often appear in the widget tree.

And the implementation can be trivial because of the sanity checks
already performed before calling the vfunc.
2024-07-08 15:28:14 +02:00
Benjamin Otte
96c02c1eb4 gpu: Implement add_first_node() for transform nodes
This is required because transform nodes appear everywhere.

We just exit for all transforms that can't transform the clip rect
losslessly. Both because they are rare and because we'd make the
coverage possibilities much lower.
2024-07-08 15:28:14 +02:00
Benjamin Otte
116d662e0f gpu: Add early exit to add_first_node()
A node must cover the full clip region to be eligible for being the
first opaque node.

Do an early exit for all nodes that aren't big enough for that.
2024-07-08 15:28:14 +02:00
Benjamin Otte
d81cd4751f gpu: Add add_first_node for colors
Color nodes can set the default background of the renderpass, instead of
doing a clear op or running a shader.
2024-07-08 15:28:14 +02:00
Benjamin Otte
dd33068943 gpu: Implement add_first_node for containers
Containers can walk the list of children back to front, trying to find
the topmost node that fully covers the viewport.

And then they can skip drawing all the nodes before that one.
2024-07-08 15:28:14 +02:00
Benjamin Otte
09c1e51b8a gpu: Add gsk_gpu_node_processor_add_first_node()
Asks a node to add itself if it is fully covering the clip rectangle.
In that case, it is the first node that needs to be added.

If the node is not fully covering the clip, it should not draw itself,
because there might be stuff needing to be drawn below.

If a node adds itself, it should call gsk_gpu_render_pass_begin_op().
2024-07-08 15:28:14 +02:00
Benjamin Otte
af9a9422c4 gpu: Allow passing a background color to renderpasses
It's not used yet, everybody is passing GDK_RGBA_TRANSPARENT.
2024-07-08 15:28:14 +02:00
Benjamin Otte
df3c85ea7f gpu: Move renderpass handling into the nodeprocessor
There's no need for the frame to do this.
2024-07-08 15:28:14 +02:00
Benjamin Otte
b14a115fc0 container: Implement get_opaque_rect()
We find the first child that covers >80% of the container and return
that.

This is a nice speedup for the common case of a GtkWindow being covered
by a large opaque background.
It will fall apart for fancy themes that play with transparency or for
small windows because the shadow region gets too large.

But then we just scan the whole node tree.

We could think about adapting the 80% number, because that wasn't chosen
with any real scientific data behind it.
2024-07-08 15:28:14 +02:00
Benjamin Otte
355a88d002 roundedclip: Implement get_opaque_rect
This takes both the vertical and horizontal rectangle that isn't
covering the rounded corners and intersects both with the child's opaque
rect.
And then it returns the larger of the two.

This means the small slices of a window near the left/right (or
top/bottom) will never be covered, but if we wanted that, we'd need to
use something else than a rectangle - either a region or actually a
rounded rect.

But that is a lot more expensive to implement.
2024-07-08 15:28:14 +02:00
Benjamin Otte
1014b3d972 rendernode: Implement a bunch of simple opque_rect getters
Not much to write home about here.
2024-07-08 15:28:14 +02:00
Benjamin Otte
1a2b1575d6 rendernode: Add fully-opaque flag
Allows rendernodes to advertise themselves as fully opaque.

This is used as a fast path for get_opaque_rect().
2024-07-08 15:28:14 +02:00
Benjamin Otte
9c032bec89 gsk: Add gsk_render_node_get_opaque_rect()
Gets a rectangle inside the rendernode that is opaque.

This function only adds the API. So far, no implementation exists.
2024-07-08 15:28:14 +02:00
Matthias Clasen
5a59548d72 Merge branch 'fix-offload-transforms' into 'main'
offload: Don't assert things that might fail

Closes #6824

See merge request GNOME/gtk!7407
2024-07-08 12:58:28 +00:00
Matthias Clasen
880c33a777 offload: Don't assert things that might fail
We can in fact meet complex transforms here. Asserting that they
are simple doesn't make it so. Instead, simply bail out if a
transform is too complex; in this case we can't offload anyway,
so no need to walk the tree further.

Test included.

Fixes: #6824
2024-07-08 08:15:02 -04:00
Matthias Clasen
3a97b44321 Merge branch 'glshader-docs-typo' into 'main'
gskglshader: Fix typo in deprecation docs

See merge request GNOME/gtk!7426
2024-07-08 10:58:33 +00:00
Benjamin Otte
058252e895 vulkan: Can't blit to/from formats with a swizzle
Fixes grayscale images appearing red on some hardware.

Related: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11467
2024-07-08 10:51:37 +02:00
Benjamin Otte
c594de8302 vulkan: Code more defensively
Check for identity swizzle too, even though we don't use it.
2024-07-08 10:51:37 +02:00
Sebastian Dröge
8eaf05a3ef gskglshader: Fix typo in deprecation docs 2024-07-08 09:21:43 +03:00
Benjamin Otte
ba05963729 gpu: Make get_node_as_image() always return premultiplied images
We wanted premultiplied images in all cases anyway, and moving that
requirement means we can also move the caching code for re-caching
textures into the texture specific code.
2024-07-07 14:39:07 +02:00
Benjamin Otte
f14f7e7df6 gpu: Add GSK_GPU_DISABLE=offscreen
This uses offscreens for every call to get_node_as_image().

This is useful both for benchmarking benefits of those implementations
as well as checking that the node-specific paths produce identical
results.
2024-07-07 14:39:07 +02:00
Benjamin Otte
daa97d4b79 gpu: Refactor functions
gsk_gpu_node_processor_ensure_image() was a weird amalgamation of stuff
withe weird required and disallowed flags.

Refactor it to make the two operations we actually do there more
explicit: Removing straight alpha and generating mipmaps.

This untangling is also desirable in the future when we also want to
handle colorstates here.
2024-07-07 14:39:07 +02:00
Benjamin Otte
4822b85cb0 gpu: Make gsk_gpu_node_processor_get_node_as_image() more restrictive
Always return premultiplied images.

2 fallback cases for clip and transform nodes did not require that. If
those cases turn out to be important, they can call
gsk_gpu_get_node_as_image() directly as that's the more flexible option.
2024-07-07 14:39:07 +02:00
Benjamin Otte
7f61d7ac8b gpu: Implement get_node_as_image() for subsurface nodes
Pass through to the child instead of offscreening.

I mainly implemented it for the assertion, because this might be a
sneaky way to introduce bugs without exhaustive checking that we don't
offload stuff that is offscreened.

No actual bugs that I'm aware of, so no tests.
Strictly defensive coding.
2024-07-07 14:39:04 +02:00
Benjamin Otte
010ca5feef gpu: Implement get_node_as_image() for debug nodes
Just pass through to the child instead of offscreening.
2024-07-07 12:54:05 +02:00
Benjamin Otte
31a907be35 gpu: Make sure textures used as image are mipmapped
When getting a texture as image, we were always returning the texture
unconditionally.

However, we want to mipmap textures when the scale factor is too large,
and this code path did not do that.

The same codepath on the GL renderer doesn't do that either, so the test
is disabled for it.
2024-07-07 12:54:05 +02:00
Benjamin Otte
ab37fed974 gpu: vfuncify get_node_as_image()
The switch statement was ugly.

Plus, the code should be close to the add_node() vfunc implementation,
so they can be modified together.

See future commits for an example where this matters.
2024-07-07 12:54:05 +02:00
Benjamin Otte
d7308f2d73 gsk: Rename GSK_DEBUG=glyphcache to GSK_DEBUG=cache
1. I mistype it all the time
2. It's shorter
3. We use it for all caching these days, not just glyphs.
2024-07-07 05:24:45 +02:00
Benjamin Otte
012c4b9425 gpu: Remove the ubershader
It didn't bring any noticable benefits and it isn't compatible with the
way we intend to do colorstate support.

And nobody seems to want to spend time on it, so let's get rid of it.

We can bring it back later if someone wants to work on it.
2024-07-07 05:19:32 +02:00
Benjamin Otte
16c2e0a642 gl: Use GdkTextureDownloader
Cairo surfaces are so outdated.
2024-07-05 13:55:18 +02:00
Benjamin Otte
32625381fa gsk: Deprecate GskGLShader and the rendernode
The new renderers don't support them due to the required complexity of
integrating them with Vulkan and the assumptions those nodes make about
the renderer (the GL renderer exports its internal APIs into the
GLShader).

There haven't been any complaints that I'm aware of since 4.14 was
released where the default renderer does not support the nodes, so usage
in public seems to be close to nonexistant.

The 2 uses I know of were workarounds about missing features in GTK that
have stopped since GTK now supports them:

1. GStreamer used in to do premultiplication when the old GL renderer
   did not do so in hardware but on the CPU.
2. Adwaita used it for masking before the mask node wa added in 4.10.
2024-07-04 21:28:06 +02:00
Benjamin Otte
0f55b3bc18 Merge branch 'wip/otte/for-main' into 'main'
gpu: Split out the cache

See merge request GNOME/gtk!7411
2024-07-03 19:51:10 +00:00
Benjamin Otte
9c249fefc3 gpu: Improve periodic cache debug message
I was watching the log in my terminal and nothing happened.
And I wasn't sure if that was because nothing was printed or because the
same thing was printed every few seconds.

Fix that by printing a timestamp, so that in a few seconds something
else will be printed.
2024-07-03 20:36:56 +02:00
Benjamin Otte
eae7ee6c25 gpu: Track atlas differently
Previously we tracked the dead pixels, but that meant we didn't know the
alive pixels (because there's also unused pixels never accounted for).
And we would free the current atlas randomly due to that.

Now we track if any pixels are alive, and if so, we never gc the current
atlas.
2024-07-03 20:36:56 +02:00
Benjamin Otte
bf7f302ff5 gpu: gc atlas, too
After 60s, we gc the atlas, too. This ensures that after that time, we
free all cache resources, so if an application gets moved to the
background, it will no longer use GPU resources. (Well, at least the
cache won't.)
2024-07-03 19:55:18 +02:00
Benjamin Otte
0d6981bd54 gpu: Make the cache track if it's empty
Only if a non-stale item is in the cache do we consider the cache not
empty.

Once the cache is empty, the device frees it and stops running the
periodic GC.
2024-07-03 19:55:18 +02:00
Benjamin Otte
148d7bcc25 gpu: Don't remove gc timeout unless cache is empty
If the cache isn't empty, we want to rerun the GC.
2024-07-03 19:55:18 +02:00
Benjamin Otte
71161b6352 gpu: Split cache and device
This is for 3 reasons:

1. Separation of concerns
   The device is meant to manage the Vulkan/GL device and check stuff
   like image sizes.
   Caching is not part of that.

2. Refcounting
   Images etc want to reference the device, but the cache wants to
   reference images. If the cache is the device, that's a refcycle.

3. Flexibility
   It's now easier to implement >1 cache, say one per depth or one per
   color state.
2024-07-03 19:55:15 +02:00
Benjamin Otte
dd33a2f280 rendernode: Make sure depth variable has enough bits
This will be relevant when we add new values to it.
2024-07-03 19:55:15 +02:00
Emmanuele Bassi
554045de6e Use the appropriate annotations for a callback closure
The (closure) annotation with an parameter is meant to be used on the
callback argument, and point to the user data argument.
2024-07-03 17:21:04 +01:00
Matthias Clasen
f9856d547f gsk: Avoid a crash
The subsurface in subsurface nodes can be NULL, so check before
poking at it.
2024-07-02 15:52:51 -04:00
Benjamin Otte
9f7254a2d0 gl: Fix wrong drawing of mask node corner case
An empty mask with inverted-alpha means the source is visible.
2024-07-02 02:04:06 +02:00
Benjamin Otte
43c2b53811 rendernode: Remove default empty draw function
No node ever wants to draw nothing. So make sure code crashes if the
draw func is not set instead of silently drawing nothing.
2024-07-01 20:30:25 +02:00
Benjamin Otte
33d87bc22e gdk: Add GDK_N_DEPTHS to the enum
We are indexing arrays by the number of depths, and when adding depths
we don't want to forget to grow those arrays.
2024-07-01 20:07:22 +02:00
Maximiliano Sandoval
9126bf2c9d
docs: gsk: Add docstring for BroadwayRenderer 2024-06-29 15:14:33 +02:00
Maximiliano Sandoval
c73ff197e0
gsk: meson: Document renderers
These contain docstrings.
2024-06-29 15:14:32 +02:00
Benjamin Otte
72a4fae8dc vulkan: More slight refactoring
This applies the same refactoring as commit
5fbdec2a29 to another function.
2024-06-29 07:13:27 -04:00
Matthias Clasen
36c7d05445 gpu: Keep actual clear values in clear op
Keeping the GdkRGBA requires doing later conversions, which isn't
necessary if we just keep the already converted float[4].

It also prepares for future color states, where the color will need to
be converted using the colorstate.
2024-06-29 07:12:28 -04:00
Matthias Clasen
f3d5683f67 rendernode: Fix container nodes preferred depth
The current code injects an implicit GDK_MEMORY_U8 into this.
It is harmless now, but may not be in future. So avoid it.
2024-06-29 07:12:10 -04:00
Benjamin Otte
120887979d rendernode: Set proper values for fill and stroke nodes
Fill and stroke nodes were not reporting proper offscreen-for-opacity
and preferred depth.

This was unlikely to have been noticed as their child is usually a solid
color.
2024-06-29 07:12:10 -04:00
Matthias Clasen
51012c1802 ngl: Export dmabuf textures from render_texture
We want dmabufs because we can import them into Vulkan, amongst
other things.
2024-06-22 08:02:31 -04:00
Matthias Clasen
6f5c610858 gsk: Give up automatic font subsetting
Despite my best effort, it seems impossible to make ci and local
builds agree on what font subsetter and fonts to use, so make this
opt-in for now: If you want to produce a node file with embedded
fonts, set GSK_SUBSET_FONTS=1.
2024-06-21 18:45:32 -04:00
Matthias Clasen
b1a840bec0 Try to avoid aliasing with system fonts
The rendernode parser creates its own fontmap for the fonts that
we deserialize from blobs. But we were using the system fontconfig
configuration for it, leading to system fonts still being found.
This is bad, and causes test failures in ci. Try with an empty
fontconfig configuration instead.
2024-06-21 18:17:11 -04:00
Benjamin Otte
5fbdec2a29 vulkan: Slight refactoring for future changes
No functional changes.
2024-06-21 19:53:46 +02:00
Matthias Clasen
d1d4d80a1d Simply some internal api
The only caller of gdk_memory_texture_from_texture doesn't use
the second argument.
2024-06-19 02:06:14 -04:00
Matthias Clasen
a63a201812 Fix a crash in rendernode serialization
This snuck in with a6ffd6b3b2.
2024-06-17 22:09:30 -04:00
Matthias Clasen
f09caced9b Drop debug spew 2024-06-17 21:44:33 -04:00
Matthias Clasen
36993ac707 gpu: Print some more details
Print the variations of mask and blendmode operations.
Just because we can.
2024-06-15 14:00:46 -04:00
Matthias Clasen
34fb08af6e Fix a copy-paste error
This was obviously meant to compare two different colors.
2024-06-14 12:30:06 -04:00
Matthias Clasen
0ec29c4176 gsk: Pass the memory format for back buffer
We can now get this information from the Vulkan context,
so use it to accurately represent the back buffer.

Related: #6767
2024-06-09 15:59:56 -04:00
Matthias Clasen
18b3b4feed gpu: Print more info for images
Show the memory format.

This helps debugging our depth selection.
2024-06-09 15:59:32 -04:00
Matthijs Velsink
721be8fe9f gsktransform: Document consuming functions
Since GskTransform is immutable, a lot of the documented "methods" are
more like "functions", in the sense that they don't keep the instance
alive but rather consume it.

This is annotated with `(transfer full)`, but since these functions are
listed as methods, their first argument is not shown.

Instead, let's add a line to the docs of each consuming function that
clarifies this behavior.
2024-06-05 15:59:49 +02:00
Matthias Clasen
bf1a434d5c Merge branch 'font-subsetting-in-node-files' into 'main'
Use font subsetting in serialized nodes

See merge request GNOME/gtk!7227
2024-06-03 12:47:59 +00:00
Matthias Clasen
9256b5b552 rendernode tool: Add an extract command
This lets one extract the data urls from a node file.
2024-06-03 08:28:21 -04:00
Matthias Clasen
577e4afb3c Improve font deserialization
Even if we disable font fallback, after adding Cantarell Regular
to the custom fontmap, fontconfig will helpfully synthesize
Cantarell Bold for us. So, just don't check for the font at all.
If there is a url, add it to the fontmap and leave it up to the
serializing code to ensure that we don't end up with duplicate
fonts.
2024-06-03 07:45:57 -04:00
Matthias Clasen
2a05c04db7 Use the hb face as key when tracking fonts
The hb face is is a wrapper around the font file, which is what
we need to track here, since we want to subset and serialize each
used font file exactly once.
2024-06-03 07:44:16 -04:00
Matthias Clasen
a6ffd6b3b2 nodeparser: Subset fonts
When serializing nodes, collect the glyphs that are used from
each font, subset the font to that set of glyphs, and embed it
into the node file. We are careful to preserve the glyph IDs,
so our text nodes transparently work with the subsettted fonts.
2024-06-03 07:38:51 -04:00
Matthias Clasen
5ffa2b757c Merge branch 'less-vulkan' into 'main'
Don't use Vulkan without dmabufs

See merge request GNOME/gtk!7220
2024-06-02 23:58:24 +00:00
Matthias Clasen
cfcc5c5c0b Merge branch 'docs-add-missing-returns-args' into 'main'
docs: Add missing returns and parameter annotations

See merge request GNOME/gtk!7325
2024-06-02 15:26:50 +00:00
Maximiliano Sandoval
0bacde8e0a
gskpathbuilder: Document add_cairo_path path arg 2024-06-01 10:01:19 +02:00
Maximiliano Sandoval
949cd45bb7
gskstroke: Add missing return annotations 2024-06-01 10:01:19 +02:00
Maximiliano Sandoval
a931335f24
gskglshader: Correct typo in source property 2024-06-01 09:04:00 +02:00
Maximiliano Sandoval
1ef320a9ec
gsktransform: Document constructor 2024-06-01 09:04:00 +02:00
Matthias Clasen
4961241f26 gsk: Use gdk_rgba_print when suitable 2024-05-31 21:39:04 -04:00
Matthias Clasen
90e1ce0906 Merge branch 'new-docstrings' into 'main'
Add missing docstrings

See merge request GNOME/gtk!7321
2024-05-31 19:57:53 +00:00
Maximiliano Sandoval
3d1f914271
gskglrenderer: Document GL renderers 2024-05-31 11:47:30 +02:00
Maximiliano Sandoval
7bb0639a75
gskrendernode: Document serialization error quark 2024-05-31 11:47:29 +02:00
Maximiliano Sandoval
f8f38aab63
gskpathpoint: Document copy and free 2024-05-31 11:47:29 +02:00
gayathri.berli@ibm.com
ba92ce342e Merge branch 'main' into memoryfix 2024-05-30 18:21:06 +05:30
Chun-wei Fan
9dbdbaca43 gskvulkandevice.c: Put Vk[Pipeline|RenderPass] in structures
This way, we can simply duplicate the keys as separate pointers to store
the corresponding Vulkan handles so that we can safely hash them, as
Vulkan handles may or may not be pointers depending on the target
platform.

This will fix builds on 32-bit Windows at least.
2024-05-29 18:16:22 +08:00
Chun-wei Fan
4c677e4dcd gskvulkanmemory.c: Use VK_NULL_HANDLE for VkDeviceMemory
...rather than NULL, so that things will build fine on non-LLP, non-64-bit
systems.
2024-05-29 12:57:07 +08:00
Chun-wei Fan
be2ff60787 gsk: Call glDeleteSync() directly
This function does not use the standard __cdecl calling convention on
Windows, meaning using g_clear_pointer() on it directly will cause
crashes on 32-bit Windows.  Just call it directly if the GLsync it uses
exists.
2024-05-25 11:07:37 +08:00
Alejandro Piñeiro
130a6fe0cf gsk: use the correct memory type index
https://gitlab.gnome.org/GNOME/gtk/-/issues/6726
2024-05-22 19:43:03 +02:00
Matthias Clasen
690c06109e gsk: Speed up mask nodes with cairo
Switch symbolc icon drawing from color-matrix to mask nodes
make the performance of the iconscroll demo crater (from 60fps
to 10fps).

Apply the same optimization we already have for color-matrix
nodes when drawing mask nodes. This gets us back to 60fps.

Fixes: #6700
2024-05-10 07:24:25 -04:00
Matthias Clasen
32a4f805b8 gsk: Require dmabuf support for Vulkan
Don't use the Vulkan renderer if Vulkan doesn't support any
dmabuf formats.
2024-05-05 15:19:17 -04:00
Matthias Clasen
76b0687467 Put newlines before base64 blobs
This makes things look a bit cleaner in the node editor, since
the first line no longer sticks out.
2024-05-05 10:16:01 -04:00
Georges Basile Stavracas Neto
c45a6ad52d gsk/gpu: Use G_GSIZE_FORMAT for printing gsizes
On Windows, gsize is a long long unsigned. The compiler complains about
that.

Use G_GSIZE_FORMAT which translates to %llu on Windows, %lu on most
platforms, and sometimes just %u on rare cases.
2024-05-03 12:30:39 -03:00
Matthias Clasen
9904259661 gsk: Serialize hint metrics too
We need this to ensure that we properly roundtrip text nodes
without any changes.
2024-05-01 14:00:18 -04:00
Matthias Clasen
ef1ff8313f gsk: Improve logging
Log the shader compilation with sufficient detail.
2024-04-30 07:36:42 -04:00
Matthias Clasen
744016e768 Merge branch 'matthiasc/for-main' into 'main'
gsk: Drop statistics from GSK_DEBUG=renderer

See merge request GNOME/gtk!7203
2024-04-29 18:30:59 +00:00
Matthias Clasen
c23d3c4406 gsk: Add debug spew to renderer selection
Reshuffle things a bit and make GSK_DEBUG=renderer print out
more information about why renderers were skipped or selected.
2024-04-29 12:28:15 -04:00
Matthias Clasen
1e2eae4ddf gsk: More detailed debug spew
Include information about why a renderer was selected in
GSK_DEBUG=renderer.
2024-04-29 12:12:22 -04:00
Matthias Clasen
77eb3df7c0 gsk: Drop statistics from GSK_DEBUG=renderer
This only works with the old gl renderer, and it interferes with
the setup information that is printed by that category.
2024-04-29 12:12:22 -04:00
Benjamin Otte
e6700405c9 dmabuf: Use narrow range instead of full range
It's way more common, and Mutter uses it, too.

Avoid visual glitches when going in/out of offload.

Fixes #6672
2024-04-29 14:30:56 +02:00
Matthias Clasen
8b59771cd1 Merge branch 'matthiasc/for-main' into 'main'
inspector: Cosmetics

See merge request GNOME/gtk!7199
2024-04-29 10:27:26 +00:00
Matthias Clasen
a3bd0a3e17 gsk: Cosmetics
Tweak a profiler counter name.
2024-04-28 23:54:55 -04:00
Joshua Lee
d069eb173a path builder: Fix doc typo 2024-04-28 22:24:38 +01:00
Benjamin Otte
719021e1f4 gpu: Handle tiny offscreens
Due to rounding errors, it is possible after intersecting a lot of
rectangles to end up with a tiny size for an offscreen. And because we
allow an epsilon before ceil()ing to an integer (see commit afc7b46264
for details) it is now possible that we end up with a size of 0.

Avoid that by always enforcing a minimum size of 1px.

Test included

The test uses a different codepath to arrive at the same problem - it
specifies the small size instead of triggering it via rounding errors
and clipping like the original bug (and most likely the more common case
to encounter this problem.

Fixes #6656
2024-04-28 13:51:42 +02:00
Benjamin Otte
4856e115a9 path: document enum 2024-04-28 08:33:03 +02:00
Matthias Clasen
c45199e388 gsk: Fix a profiler mark
I messed this up in f26efd9adf.
2024-04-27 10:23:45 -04:00
Matthias Clasen
a1fdf06d80 gsk: Add a warning for inefficient texture import
With GSK_DEBUG=fallback, warn if a non-memory texture has to be
downloaded for importing it into Vulkan or GL.
2024-04-26 11:04:47 -04:00
Matthias Clasen
3fc9de7539 offload: Make logging more compact
Use a format of

[XXX] SYMBOL DETAILS

where SYMBOL indicates the offloading status:
 🗙 - no offload
 ▲ - offload above, with background
 △ - offload above, no background
 ▼ - offload below, with background
 ▽ - offload below, no background
2024-04-24 22:01:56 -04:00
Matthias Clasen
1c9a55d185 Merge branch 'vulkan-msvc' into 'main'
gskvulkandescriptors.c: Don't return value from void-rettype function

See merge request GNOME/gtk!7175
2024-04-25 01:37:49 +00:00
Georges Basile Stavracas Neto
3aa6c27c26 vulkan/image: Use GENERAL for initial layout of DMA-BUF textures
The VK_IMAGE_LAYOUT_UNDEFINED layout means that the data hold by the
texture can be discarded, and we don't want to discard it. Because the
Vulkan spec is unclear (see [1] for a discussion), err on the side of
caution and use VK_IMAGE_LAYOUT_GENERAL.

Fixes import failures with WebKit.

[1] https://github.com/ValveSoftware/gamescope/issues/356
2024-04-24 17:21:51 -03:00
Chun-wei Fan
016354b6dd gskvulkandescriptors.c: Don't return value from void-rettype function
Fixes builds on Visual Studio with Vulkan enabled, as later GLib releases
consider this as an error on Visual Studio builds.
2024-04-24 16:19:43 +08:00
Matthias Clasen
f26efd9adf gsk: Add a profiler mark for pipeline creation
This is the Vulkan equivalent of shader compilation, it could be
expensive, so lets add a mark around it.
2024-04-22 20:47:25 -04:00
Matthias Clasen
b21708c5e4 offload: Consolidate logging a bit
Spew a bit less per-frame. Unfortunately, we still spew for
every frame, and fixing that would require more extensive
refactoring to centralize all logging in gskoffload.c
2024-04-21 20:18:38 -04:00
Matthias Clasen
c9c29d8bde gsk: Only prefer Vulkan on Wayland
Make Vulkan the default on Vulkan-friendly platforms.
For now, that list only includes Wayland.
2024-04-20 18:10:21 -04:00
Matthias Clasen
582ad79088 gsk: Change the default renderer
The intent of this change to get wider testing and verify that the
Vulkan drivers we get to use in the wild are good enough for our
needs. If significant problems show up, we will revert this change
for 4.16.

The new preference order is vulkan > ngl > gl > cairo.

The gl renderer is still there because we need it to support gles2.

If you need to override the default renderer choice, you can
still use the GSK_RENDERER environment variable.

Fixes: #6537
2024-04-19 13:50:40 -04:00
gayathri.berli@ibm.com
06351844bb changes to fix the memorytexture regression 2024-04-18 17:13:41 +05:30
Matthias Clasen
ec9cdb74ef gsk: Actually punch transparent holes
In a57f7e3935 I accidentally replaced { 0, 0, 0, 0 } with
GDK_RGBA_BLACK instead of GDK_RGBA_TRANSPARENT. Oops.

Fixes: #6634
2024-04-17 20:09:13 -04:00
Matthias Clasen
0a5a720fe1 Use gdkrgbaprivate.h in more places
This gets inline functions used where it matters.
2024-04-15 22:57:01 -04:00
Matthias Clasen
e583e823b5 Use our defines for color
We have GDK_RGBA_WHITE, GDK_RGBA_BLACK and GDK_RGBA_TRANSPARENT,
lets use it instead of open-coding it.
2024-04-15 22:57:01 -04:00
Matthias Clasen
4aac64edf0 offload: Some renaming
Rename things to be more in line with the subsurface api.
2024-04-15 19:53:46 -04:00
Matthias Clasen
c97bbfdfb1 offload: Use subsurface bounds for diffing
When adding the whole subsurface to the diff, use the subsurface
bounds, which takes both the texture and the background into
account.
2024-04-15 19:53:46 -04:00
Matthias Clasen
933a0e5a98 subsurface: Some api revision and documentation
Rename things so they make more sense. The dest/source naming got
a bit unclear when we added background into the mix. Now we're going
for:

source_rect - the texture region to display
texture_rect - dimensions of the subsurface showing the texture
background_rect - dimensions of the background subsurface
bounds - union of texture_rect and background_rect

Also use this opportunity to add some api docs.
2024-04-15 19:53:46 -04:00
Matthias Clasen
0108a5f56d offload: Use subsurface background optimization
Detect a black color node below the texture node and pass that
information to the subsurface, to take advange of the single-pixel
buffer optimization.

To make this work, we need to stop using the bounds of the subsurface
node for sizing the offload, and instead use either the clip or
the texture node for that.
2024-04-15 19:53:46 -04:00
Matthias Clasen
3f9bdaa4c8 Add background to subsurfaces
Make it possible for subsurfaces to have a black background on a
secondary subsurface below the actual subsurface. Using a single-pixel
buffer for that background increases the changes that the compositor
will use direct scanout for the actual subsurface.

This changes the private subsurface API. All callers have been
updated to pass an empty background rect.
2024-04-15 19:53:46 -04:00
Matthias Clasen
ce030b1b36 gsk: Fix a minor type mismatch
Use the same types in the declaration of gsk_standard_contour_init.

Fixes: #6628
2024-04-13 22:28:48 -04:00
Philip Withnall
707e492f0d
gsk: Fix a maybe-uninitialized warning
The compiler (gcc 13.2) thinks that `t` could be used uninitialised.
That’s obviously not the case, because there’s always going to be at
least one loop iteration due to the initial values of `t1` and `t2`.

Change the loop to a `do…while` to make that a bit clearer to the
compiler without making any functional changes to the code.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
2024-04-12 12:08:03 +01:00
Matthias Clasen
cc8db1805d gsk: Be safer against bad font options
Some combinations of hint-style and hint-metrics lead to bad glyph
placement in the glyph cache, so avoid them.
2024-04-09 19:12:49 -04:00
Benjamin Otte
3080e2974d gpu: ceil() offscreen size before generating offscreen
The goal is to generate an offscreen at 1x scale.
When not ceil()ing the numbers the offscreen code would do it *and*
adjust the scale accordingly, so we'd end up with something like a
1.01x scale.

And that would cause the code to reenter this codepath with the goal to
generate an offscreen at 1x scale.
And indeed, this would lead to infinite recursion.

Tests included.

Fixes #6553
2024-04-09 17:39:32 +02:00
Benjamin Otte
9fe9ea34fd vulkan: Handle generating mipmaps for 1x1 images
Testcase included.
2024-04-08 21:06:54 +02:00
Matthias Clasen
704ee6a9d0 offload: Determine output transforms
When we look for the texture to attach to the subsurface, keep
track of transforms we see along the way, and look at their scale
component to determine if the texture needs to be flipped.
We currently don't allow rotations here.

This fixes glarea rendering being upside-down when offloaded.
2024-04-07 11:12:06 -04:00
Matthias Clasen
72e9f30937 subsurface: Add transforms
Allow to specify a D₂ transform when attaching a texture to a
subsurface, to handle flipped and rotated content. The Wayland
implementation handles these transforms by setting a buffer
transform on the subsurface.

All callers have been updated to pass GDK_TEXTURE_TRANSFORM_NORMAL.
2024-04-07 11:02:40 -04:00
Matthias Clasen
8d5633cb88 Drop code handling old cairo
We require cairo 1.18 now, so know that we will get subpixel
positioning for text, and have a little less uncertainty in
our font rendering.
2024-04-05 09:00:22 +02:00
Matthias Clasen
23a336df0e Bump the pango dep
Require pango 1.52, and drop the fallback code.

Fixes: #6554
2024-04-04 00:56:24 +02:00
Matthias Clasen
cc24401dfb Drop unused private API
We are not using gsk_get_unhinted_glyph_string_extents anymore.
2024-04-03 10:53:55 +02:00
Matthias Clasen
f445d8b518 gsk: Use hinted extents
This works better for cff fonts, where hinting is not as local as
what the autohinter does for ttf fonts, and it does not seem to
have negative effects.

Fixes: #6577
Fixes: #6568
2024-04-03 10:52:13 +02:00
Matthias Clasen
d50b780551 gsk: Keep metrics hinting on when rendering
It turns out that we mispositioned glyphs with some cff fonts
when metrics hinting is off, and hinting is on. Since we don't
fully understand the interactions of these settings at this point,
lets preserve metrics hinting as it was on the font we got.

This at least gives folks a workaround for when they experience
clipped rendering with cff fonts: Turn on hint-metrics.

We forced hint metrics off here because it made Pango do some
creative wfh for hex boxes at small sizes, but I've dropped that
on the Pango side.
2024-04-02 09:10:46 +02:00
Matthias Clasen
f0f3ea1b3e Fix build without fontconfig
We were missing some ifdefs for Windows builds.

Fixes: #6591
2024-03-31 13:08:01 +02:00
Matthias Clasen
a973e8ea8d Merge branch 'gl-offload-fixes' into 'main'
gl: Handle offloads in offscreen context better

Closes #6551

See merge request GNOME/gtk!7053
2024-03-18 15:22:22 +00:00
Matthias Clasen
1e83a44c93 gl: Handle offloads in offscreen context better
Back out of offloading below if we are in an offscreen context,
since the hole will get lost in the offscreen.

Fixes: #6551
2024-03-18 08:41:31 -04:00
Matthias Clasen
144cd2d91c gsk: Avoid some allocations
We can use a static font options object and allocate it only once.
2024-03-17 21:30:36 -04:00
Benjamin Otte
195ebf6848 Merge branch 'wip/otte/gl-map-buffer' into 'main'
Add GLBuffer implementation w/ persistent mapping

See merge request GNOME/gtk!7042
2024-03-17 00:27:51 +00:00
Benjamin Otte
aff34e8d1b gpu: Sort passes correctly
In a very particular situation, it could happen that our renderpass
reordering did not work out.
Consider this nesting of renderpasses (indentation indicates subpasses):

pass A
  subpass of A
pass B
  subpass of B

Out reordering code would reorder this as:

subpass of B
subpass of A
pass A
pass B

Which doesn't sound too bad, the subpasses happen before the passes
after all.

However, a subpass might be a pass that converts the image for a texture
stored in the texture cache and then updates the cached image.
If "subpass of A" is such a pass *and* if "subpass of B" then renders
with exactly this texture, then "subpass of B" will use the result of
"subpass of A" as a source.

The fix is to ensure that subpasses stay ordered, too.

The new order moves subpasses right before their parent pass, so the
order of the example now looks like:

subpass of A
pass A
subpass of B
pass B

The place where this would happen most common was when drawing thumbnail
images in Nautilus, the GTK filechooser or Fractal.
Those images are usually PNG files, which are straight alpha. They are then
drawn with a drop shadow, which requires an offscreen for drawing as
well as those images as premultipled sources, so lots of subpasses happen.
If there is then a redraw with a somewhat tricky subregion, then the
slicing of the region code could end up generating 2 passes that each draw
half of the thumbnail image - the first pass drawing the top half and the
second pass drawing the bottom half.
And due to the bug the bottom half would then be drawn from the
offscreen before the actual contents of the offscreen would be drawn,
leading to a corrupt bottom part of the image.

Test included.

Fixes: #6318
2024-03-16 23:44:59 +01:00
Benjamin Otte
47307dc7c1 vulkan: Prefer cached buffer memory
We write the buffers in small chunks, and we even sometimes read it. So
prefer it when it's cached.

Speeds up the text benchmarks by a factor of 3x on my dedicated GPU.
2024-03-16 22:32:49 +01:00
Benjamin Otte
96b800fa0c gl: Add buffer implementation using persistent mapping
If glBufferStorage() is available, we can replace our usage of
glBufferSubData() with persistently mapped storage via
glMappedBufferRange().

This has 1 disadvantage:

1. It's not supported everywhere, it requires GL 4.4 or
   GL_EXT_buffer_storage. But every GPU of the last 10 years should
   implement it. So we check for it and keep the old code.
   The old code can also be forced via GDK_GL_DISABLE=buffer-storage.

But it has 2 advantages:

1. It is what Vulkan does, so it unifies the two renderers' buffer
   handling.

2. It is a significant performance boost in use cases with large vertex
   buffers. Those are pretty rare, but do happen with lots of text at a
   small font size. An example would be a small font in a maximized VTE
   terminal or the overview in gnome-text-editor.

A custom benchmark tailored for this problem can be created with:

  tests/rendernode-create-tests 1000000 text.node

This creates a node file called "text.node" that draws 1 million text
nodes.
(Creating that test takes a minute or so. A smaller number may be useful
on less powerful hardware than my Intel Tigerlake laptop.)
The difference can then be compared via:

  tools/gtk4-rendernode-tool benchmark --runs=20 text.node
and
  GDK_GL_DISABLE=buffer-storage tools/gtk4-rendernode-tool benchmark --runs=20 text.node

For my laptop, the difference is:
before: 1.1s
after:  0.8s

Related: !7021
2024-03-16 20:55:26 +01:00
Benjamin Otte
e7a2baf78c gpu: Remove unused arguments
It's not just unused, it's also wrong.

We are reading from the buffer when reallocating the vertex buffer
and memcpy()ing the old into the new buffer - at that point we read from
it.
2024-03-16 19:46:37 +01:00
Matthias Clasen
438d86fcf5 gsk: Move the buffer upload counter
Move the sysprof counter for buffer uploads to the generic
code, so it works for both ngl and Vulkan. This partially
reverts commit ecf1b7c18a.
2024-03-16 19:39:16 +01:00
Matthias Clasen
1cbdf88b0f Merge branch 'debug-cleanup' into 'main'
gsk: Fix a typo

See merge request GNOME/gtk!7039
2024-03-16 14:41:16 +00:00
Matthias Clasen
b1fb7cd4ae gsk: Drop unused debug flags
The 'surface', 'sync' and 'opengl' flags are not used anywhere.
2024-03-16 09:44:57 -04:00
Matthias Clasen
fd90b56df6 gsk: Move and clarify a debug message
Move the only error message in the OPENGL category to RENDERER,
and make it clearer what and how.
2024-03-16 09:44:57 -04:00
Benjamin Otte
43373e6350 gpu: Rename env var GSK_GPU_SKIP to GSK_GPU_DISABLE
See previous commits.
2024-03-16 14:11:08 +01:00
Benjamin Otte
f725bdad25 gl: Move GL_ARB_base_instance check
It's a GLContext feature check, not a GpuRenderer thing.

So put it there.
2024-03-16 13:52:28 +01:00
Benjamin Otte
cfbe3709bf gpu: Respect the GDK_GL_DISABLE flag
It's now possible to disable sync support.
2024-03-16 13:52:21 +01:00
Benjamin Otte
141769fb46 gl: Turn has_foo flags into GdkGLFeatures
The goal is to have it mirror GdkVulkanFeatures, and in particular
having an environment variable to turn individual flags off.
2024-03-16 13:44:02 +01:00
Benjamin Otte
93cdcc5e88 gpu: Merge multiple ops into one ShaderOp
When ops get allocated that use the same stats as the last op, put them
into the same ShaderOp. This reduces the number of ShaderOps we need to
record, which has 3 benefits:

1. It's less work when iterating over all the ops.
   This isn't a big win, but it makes submit() and print() run a bit
   faster.
2. We don't need to manage data per-op.
   This is a large win because we don't need to ref/unref descriptors
   as much anymore, and refcounting is visible on profiles.
3. We save memory.
   This is a pretty big win because we iterate over ops a lot, and when
   the array is large enough (I've managed to write testcases that makes
   it grow to over 4GB) it kills all the caches and that's bad.

The main benefit of all this are glyphs, which used to emit 1 ShaderOp
per glyph and can now end up with 1 ShaderOp for multiple text nodes,
even if those text nodes use different fonts or colors - because they
can all share the same ColorizeOp.
2024-03-15 20:25:02 +01:00
Matthias Clasen
d51912c0b4 gsk: Add gsk_gpu_frame_get_last_op
This function will be used in the future to find the previous
op during node processing, so we can make optimization decisions
based on that.
2024-03-15 20:25:02 +01:00
Benjamin Otte
bad6e1e102 gpu: Change the way we merge draw calls
With potentially multiple ops per ShaderOp, we may encounter situations
where 1 ShaderOp contains more ops than we want to merge. (With
GSK_GPU_SKIP=merge, we don't want to merge at all.)

So we still merge the ShaderOps (now unconditionally), but we then run
a loop that potentially splits the merged ops again - exactly at the
point we want to.

This way we can merge ops inside of ShaderOps and merge ShaderOps, but
still have the draw calls contain the exact number of ops we want.
2024-03-15 20:25:02 +01:00
Benjamin Otte
28a8dc5a14 gpu: Add GskGpuShaderOp.n_ops
This just introduces the variable and sets it to 1 everywhere.

The ultimate goal is to allow one ShaderOp to collect multiple ops into
one, thereby saving memory in the ops array and leading to faster
performance.
2024-03-15 19:49:17 +01:00
Benjamin Otte
975cdd8c30 gpu: Remove unused return value from function
Technically, an alloc() function should return what it allocated. But
the return value is never used.

Maybe we should rename the function?
2024-03-15 19:49:17 +01:00