Commit Graph

618 Commits

Author SHA1 Message Date
Matthias Clasen
7acc1c0125 Make dmabuf initialization lazier
Only initialize the Vulkan or EGL parts where possible.

When dmabufs or dmabuf formats are actually used, we still
initialize fully by creating both a Vulkan and EGL downloader.

This shortens the time to first commit from 149ms to 108ms.
2024-09-17 09:46:01 +02:00
Benjamin Otte
f2e75529cb Merge branch 'wip/otte/for-main' into 'main'
vulkan: Actually return the preferred memory format

See merge request GNOME/gtk!7713
2024-09-15 08:19:53 +00:00
Benjamin Otte
eed58b4051 gpu: Split out rect grid snapping function
We might want to use it outside of the nodeprocessor.

The function is now called gsk_rect_snap_to_grid().
2024-09-15 02:31:02 +02:00
Benjamin Otte
28a01ca954 gpu: Print some tex rects in verbose output
In the colorize and texture ops, print the tex rect. This is useful
because when adding new features with textures (like atlas usage), these
are the ops that I use for testing.
2024-09-15 02:31:02 +02:00
Benjamin Otte
fa86bfcb55 gpu: Use existing frame in render_texture()
Instead of recreating frames from scratch every time, use an existing one.

This ensures that renderers don't need to recreate GPU resources every
time (like buffers and everything else that frames manage). It also
speeds up occasional render_texture() calls in default renderers.

This speeds up in particular the Vulkan renderer.
2024-09-15 02:31:01 +02:00
Benjamin Otte
c3d9f4a9ac gpu: Move a function
No functional changes.
2024-09-15 02:31:01 +02:00
Benjamin Otte
d11ac3585d gpu: Clean up the frame after we're done waiting for it
This is useful because cleaning up will do the final copies of texture
data.

It also means we use less memory, as we're going to release images that
were used in ops.
2024-09-15 02:31:01 +02:00
Benjamin Otte
8f78a0f809 gpu: Add an internal is_clean() check
If no ops are recorded, then we don't need to wait for any ops to
finish.

Also fix the initial fence creation on Vulkan: we no longer need to
create it fixed, because the random cleanup() call at startup no
longer happens.
2024-09-15 02:30:59 +02:00
Benjamin Otte
3286d9f1b5 vulkan: Actually return the preferred memory format
We were just exiting the loop, but not remembering the index.

Speeds up various memory operations, sometimes by quite a lot.
2024-09-15 00:07:47 +02:00
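
The bug class described in 3286d9f1b5 (leaving a search loop without recording which entry matched) can be illustrated with a minimal sketch. The helper name, the flag values, and the "return n when nothing matches" convention are all assumptions made for illustration; this is not the GTK Vulkan code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not GTK code: return the index of the first
 * entry that has all bits in `wanted` set, or `n` if none match. */
static size_t
find_preferred_index (const unsigned *flags, size_t n, unsigned wanted)
{
  size_t found = n;   /* sentinel: nothing found yet */
  size_t i;

  assert (flags != NULL || n == 0);

  for (i = 0; i < n; i++)
    {
      if ((flags[i] & wanted) == wanted)
        {
          found = i;  /* remember the index before leaving the loop */
          break;
        }
    }

  return found;
}
```

Without the `found = i` assignment the loop still exits early, but the caller only ever sees the sentinel, which matches the symptom the commit message describes.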
Matthias Clasen
e5705bd141 gsk: Add another profiler mark
This lets us compare the cost of realizing the new renderers vs
the old GL renderer.
2024-09-13 11:02:52 -04:00
Benjamin Otte
e9735f0c35 gpu: Move GskGpuClip declaration into types header
Makes it possible to use it in multiple places.
2024-09-11 08:34:41 +02:00
Benjamin Otte
d320373262 gpu: Clarify a function
It is functionally the same as before, but it is now more clear how it
works.

As a bonus, it will now trigger for -Wswitch-enum, too.
2024-09-11 08:33:22 +02:00
Benjamin Otte
a8748598b6 gpu: verbose-print if shaders are inside merged ops
This is useful when trying to get more ops merged for performance
reasons.
2024-09-11 08:17:58 +02:00
Benjamin Otte
748acaf654 gpu: Use a better character for debug print
Instead of 🞨 which isn't supported in many places, use ⬚.

⬚ also fits better with □ and ▢ for describing clip regions.
2024-09-11 08:17:58 +02:00
Benjamin Otte
e35670a014 gpu: Don't crash when there's no ops
In the rare situation (read: I triggered it with obscure hacks) where no
ops are emitted, we could end up pointing into invalid memory and
crashing.

Don't do that.
2024-09-11 08:17:43 +02:00
Benjamin Otte
9ba41ed6e8 Merge branch 'wip/otte/blur-and-blit' into 'main'
Fix blur for opaque textures

Closes #6980

See merge request GNOME/gtk!7697
2024-09-09 04:46:43 +00:00
Benjamin Otte
56fc8f0077 gpu: Blur opaque textures correctly
Opaque textures don't clamp to transparent but instead to black.
We didn't consider this, so we were blurring their edges into blackness
not into transparency.

Fix this by adding the GSK_GPU_AS_IMAGE_SAMPLED_OUT_OF_BOUNDS flag
and respecting it in the implementation that uses it.

Test included.

Fixes #6980
2024-09-09 05:10:51 +02:00
Benjamin Otte
85abff343e ngl: Images are not blittable if they have a swizzle
Swizzling is not respected for blitting.
See commit 058252e895 for the same change in Vulkan.
Apparently that never made it to ngl.

The next commit will have a test for this.
2024-09-09 04:18:13 +02:00
Benjamin Otte
03230181ce gpu: Add GskGpuAsImageFlags
I've had a need for flags for the get_as_image() call but so far have
been able to work around it. But now it seems I might finally need it.

This just introduces the flags but doesn't add any.

Related: #6980
2024-09-09 01:25:03 +02:00
Matthias Clasen
76a13596aa gpu: Reduce per-glyph overhead
Pull the shader clip computation out of the loop in the common
case that the entire node is contained in the clip.
2024-09-08 12:57:31 -04:00
Matthias Clasen
c18cd6050b gpu: Use gsk_gpu_colorize_op2
This reduces the cost of these calls significantly, and this is
the inner loop of the node processor.
2024-09-08 12:43:02 -04:00
Matthias Clasen
60f5d4c93e gpu: Add a variant of gsk_gpu_colorize_op
This variant takes the color_states, instead of computing it
anew from the ccs and the color state of the color. This will
be used to pull this work out of the loop in add_glyph_node.
2024-09-08 12:41:48 -04:00
Matthias Clasen
03840151ac gsk: Drop an unused variable
We're not using last_image for anything.
2024-09-08 11:48:43 -04:00
Matthias Clasen
2e44f3e2ff gsk: Get the text node color once
We don't need to do this in the loop.
2024-09-08 11:48:43 -04:00
Matthias Clasen
4f9fd5cf1d Add gsk_text_node_get_font_hint_style
Getting the hint style is one of the more expensive calls we do
when adding glyph nodes, so cache this information in the node.
2024-09-08 11:48:43 -04:00
Matthias Clasen
c505a08e46 gsk: Small optimization
Avoid calling gsk_container_node_get_child in a loop.
2024-09-08 11:48:43 -04:00
Matthias Clasen
1ae9cdb4c9 gpu: Print blur colors
Relevant information when debugging shadows.
2024-09-07 12:34:04 -04:00
Benjamin Otte
896ea5b753 memoryformat: Add linear/nearest choice for mipmapping
linear will average all the pixels for the lod, nearest will just pick
one (using the same method as OpenGL/Vulkan, picking bottom right
center).

This doesn't really make linear/nearest filtering work as it should
(because it's still a form of mipmaps), but it has 2 advantages:

1. it gets closer to the desired effect

2. it is a lot faster

Because only 1 pixel is chosen from the original image, instead of
averaging all pixels, a lot less memory needs to be accessed, and
because memory access is the bottleneck for large images, the speedup is
almost linear with the number of pixels not accessed.
And that means that even for lod level 3, aka 1/8th scale, only 1/64 of
the pixels need to be accessed, and everything is 50x faster.

Switching gtk4-demo --run=image_scaling to linear/nearest makes all the
lag go away for me, even with a 64k x 64k image.
2024-09-06 15:47:35 -04:00
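
The difference between the two modes in 896ea5b753 can be sketched for a single-channel 8-bit image. The function names, the single-channel layout, and the requirement that the image dimensions are multiples of `1 << lod` are assumptions for illustration; this is not the gdk_memory_mipmap() implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not GTK code. Each output pixel (x, y) at a
 * given lod corresponds to a block of n x n source pixels, n = 1 << lod. */

/* "linear": average every source pixel in the block. */
static uint8_t
mipmap_pixel_linear (const uint8_t *src, size_t stride,
                     size_t x, size_t y, unsigned lod)
{
  size_t n = (size_t) 1 << lod;
  uint64_t sum = 0;
  size_t dx, dy;

  assert (src != NULL && lod < 16);

  for (dy = 0; dy < n; dy++)
    for (dx = 0; dx < n; dx++)
      sum += src[(y * n + dy) * stride + (x * n + dx)];

  return (uint8_t) ((sum + n * n / 2) / (n * n));  /* rounded average */
}

/* "nearest": read a single pixel, the one to the bottom right of the
 * block center (assuming y grows downward). */
static uint8_t
mipmap_pixel_nearest (const uint8_t *src, size_t stride,
                      size_t x, size_t y, unsigned lod)
{
  size_t n = (size_t) 1 << lod;
  size_t half = n / 2;

  assert (src != NULL && lod < 16);

  return src[(y * n + half) * stride + (x * n + half)];
}
```

At lod 3 the block is 8 x 8, so the linear variant reads 64 source pixels per output pixel while the nearest variant reads one, which is the 1/64 figure quoted in the commit message.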
Benjamin Otte
cea961f4f4 memoryformat: Take src_format and dest_format
Why do we need this? Because RGB images are provided in RGB format but
GPUs can't handle RGB, only RGBA, so we need to convert.

And we need to do that without allocating too much memory, because
allocating memory is slow. Which means in particular we need to do the
conversion after mipmapping, not before (like we were doing).
2024-09-06 15:47:34 -04:00
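
The RGB to RGBA step mentioned in cea961f4f4 can be sketched as a plain pixel expansion. The function name, the tightly packed 8-bit layout, and the fully opaque alpha value are assumptions for illustration; the actual GTK conversion handles many more formats.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not GTK code: expand tightly packed 8-bit RGB
 * pixels into RGBA with an opaque alpha channel. `rgba` must have room
 * for 4 * n_pixels bytes. */
static void
rgb_to_rgba (const uint8_t *rgb, uint8_t *rgba, size_t n_pixels)
{
  size_t i;

  assert (rgb != NULL && rgba != NULL);

  for (i = 0; i < n_pixels; i++)
    {
      rgba[4 * i + 0] = rgb[3 * i + 0];
      rgba[4 * i + 1] = rgb[3 * i + 1];
      rgba[4 * i + 2] = rgb[3 * i + 2];
      rgba[4 * i + 3] = 0xFF;  /* opaque */
    }
}
```

Running this on the already mipmapped data means the RGBA buffer only needs to hold the reduced image, which is the ordering the commit message argues for.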
Benjamin Otte
848c6815d3 gpu: Allow uploading of mipmap levels when tiling
This allows uploading less memory but requires computing lod levels on
the CPU, which is slow because it reads through all of the memory and is
so far entirely unoptimized.

However, it uses significantly less VRAM.

This is done by adding a gdk_memory_mipmap() function that does this
task.
The texture upload op now accepts a lod level and if that is >0 it uses
gdk_memory_mipmap() on the source texture.
2024-09-06 15:47:34 -04:00
Benjamin Otte
563cce5530 gsk: Use gsk_rect_init_offset() everywhere
... and make it use a graphene_point_t as argument, because that's what
the callers want.
2024-09-02 00:22:37 +02:00
Benjamin Otte
6a1cd87480 gpu: Use builder for memory texture
2024-09-01 22:49:34 +02:00
Benjamin Otte
49ee69f316 gpu: Use right GL context when exporting texture
We want to use the display's context on the resulting texture,
but we do not want to use it for the stuff we need to do while
exporting - most importantly the GLsync.

Fixes #6976
2024-09-01 22:49:34 +02:00
Benjamin Otte
0defdc4af5 gpu: Plug fd leak in fallback path
If we can't construct a dmabuf texture, we need to clear the dmabuf fd.
2024-08-26 20:31:08 +02:00
Benjamin Otte
ea9b47f1b6 gpu: Use common cleanup function
Just simple cleanup, both functions do the same thing.
2024-08-26 20:31:08 +02:00
Benjamin Otte
65c8320a32 gpu: Fix fd leak in GL dmabuf export
The texture ID is not deleted on dmabuf export; a copy is made, the
GskGpuImage retains ownership.

However when doing GL export, the texture *does* take ownership, so we
need the stealing semantics for that case.
2024-08-26 20:31:08 +02:00
Benjamin Otte
4be1d754b7 vulkan: Don't spam stderr on failed Vulkan import
We write a debug message and then handle things using fallback.

Fixes error messages when trying to import incompatible dmabufs.
(in my case: llvmpipe dmabufs into radv)
2024-08-23 22:53:13 +02:00
Benjamin Otte
73c94cf1d6 gpu: Use the shared GL context when creating GL textures
The non-shared context's surface must survive the lifetime of the
GL texture, and when the renderer gets unrealized the surface goes away,
but we cannot guarantee that all GL textures have been destroyed by
then.

So better use a context we know will survive because it isn't bound to a
surface.

This is the same fix for NGL as f3ac0535f8
was for GL.
2024-08-23 20:04:46 +02:00
Benjamin Otte
6a986f03b6 gpu: Simplify the blur op a bit
I was looking through it and thought this looks better.

It's also 21 lines of code less.
2024-08-23 19:01:05 +02:00
Benjamin Otte
54758bee1f gpu: Only run a single renderpass
Instead of running one renderpass per clip region, run one renderpass for
the whole clip extents, and just set the scissor to the individual clip
rects.

This means that we need to use LOAD_OP_LOAD in cases where we don't
redraw the full extents, but nonetheless, the performance wins of
avoiding renderpasses are worth it, in particular on tilers like the
Raspberry Pi or other mobile chips and the Apple M1/2.
2024-08-21 21:13:34 +02:00
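
The idea in 54758bee1f (one renderpass over the clip extents, with per-rect scissors) can be sketched without any GPU API. The `Rect` and `LoadOp` types and both function names are hypothetical, the rects are assumed to be non-empty and non-overlapping, and choosing DONT_CARE when the rects cover the full extents is a guess made for this sketch; the commit message only states that LOAD is needed when the full extents are not redrawn.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types, not the GTK ones. */
typedef struct { int x, y, width, height; } Rect;
typedef enum { LOAD_OP_LOAD, LOAD_OP_CLEAR, LOAD_OP_DONT_CARE } LoadOp;

/* Bounding box of all clip rects: the area the single renderpass covers. */
static Rect
rects_get_extents (const Rect *rects, size_t n)
{
  Rect e;
  int x2, y2;
  size_t i;

  assert (rects != NULL && n > 0);

  e = rects[0];
  x2 = e.x + e.width;
  y2 = e.y + e.height;
  for (i = 1; i < n; i++)
    {
      if (rects[i].x < e.x) e.x = rects[i].x;
      if (rects[i].y < e.y) e.y = rects[i].y;
      if (rects[i].x + rects[i].width > x2) x2 = rects[i].x + rects[i].width;
      if (rects[i].y + rects[i].height > y2) y2 = rects[i].y + rects[i].height;
    }
  e.width = x2 - e.x;
  e.height = y2 - e.y;
  return e;
}

/* If the (non-overlapping) rects do not fill the extents, the old
 * contents between them must be preserved, so LOAD is required. */
static LoadOp
choose_load_op (const Rect *rects, size_t n)
{
  Rect e = rects_get_extents (rects, n);
  long covered = 0;
  size_t i;

  for (i = 0; i < n; i++)
    covered += (long) rects[i].width * rects[i].height;

  return covered == (long) e.width * e.height ? LOAD_OP_DONT_CARE
                                              : LOAD_OP_LOAD;
}
```

A caller would begin one pass over `rects_get_extents()` with the chosen load op and then set the scissor to each rect in turn before drawing.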
Benjamin Otte
cfe0da1eed gpu: Add GskGpuLoadOp
We want to differentiate between CLEAR, DONT_CARE and LOAD in the
future, and the current boolean doesn't allow that.

Also implement support for the different ops in the Vulkan
renderpass code.
2024-08-21 21:13:34 +02:00
Benjamin Otte
1198dc76a4 gpu: Add gsk_gpu_first_node_begin_rendering()
This starts the renderpass at the given scissor rect.

It just splits out the gsk_gpu_render_pass_begin_op() call into a
simpler function, so it's harder to mess up.
2024-08-21 21:13:34 +02:00
Benjamin Otte
bed3e9918b gpu: Add GskGpuFirstNodeInfo
Just encapsulate all the data for the add_first_node() call into a
single struct.
2024-08-21 21:13:34 +02:00
Benjamin Otte
d76bb2991a gpu: Refactoring: Pull out nodeprocessor
Add gsk_gpu_node_processor_set_scissor() that allows resetting the
nodeprocessor's scissor and clip rectangle.
That in turn allows using the same nodeprocessor instance for all the
rects we draw for the clip region.
2024-08-21 21:13:34 +02:00
Benjamin Otte
e962e86fcd gpu: Split out a function
converting an image to any colorstate (not just ccs-capable default
colorstates) can go in its own function.
2024-08-21 21:13:34 +02:00
Benjamin Otte
c7fabe2897 gpu: Be more aggressive about GC'ing dead textures
When we encounter many dead textures, we want to GC. Textures can take
up fds (potentially even more than 1) and when we are not collecting
them quickly enough, we may run out of fds.

So GC when more than 50 dead textures exist.

An example for this happening is recent Mesa with llvmpipe udmabuf
support, which takes 2 fds per texture and the test in
testsuite/gdk/memorytexture that creates 800 textures of ~1 pixel each,
which it diligently releases but doesn't GC.

Related: #6917
2024-08-21 09:45:05 +02:00
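
The threshold policy from c7fabe2897 can be sketched as a counter that triggers a collection once more than 50 dead textures have accumulated. The `TextureCache` struct and both functions are hypothetical stand-ins; only the threshold of 50 comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEAD_TEXTURES 50  /* threshold from the commit message */

/* Hypothetical cache state, not the GTK structure. */
typedef struct
{
  size_t dead_textures;  /* dead textures not yet collected */
  size_t gc_runs;        /* how often a collection ran */
} TextureCache;

/* Stand-in for the real collection: release everything that is dead. */
static void
texture_cache_gc (TextureCache *cache)
{
  cache->dead_textures = 0;
  cache->gc_runs++;
}

/* Record one more dead texture; collect once the threshold is exceeded.
 * Returns true if a collection ran. */
static bool
texture_cache_note_dead (TextureCache *cache)
{
  assert (cache != NULL);

  cache->dead_textures++;
  if (cache->dead_textures > MAX_DEAD_TEXTURES)
    {
      texture_cache_gc (cache);
      return true;
    }

  return false;
}
```

With this policy the number of uncollected dead textures never exceeds 50 after a call returns, so the number of fds they hold stays bounded even when a test creates and releases hundreds of textures in a row.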
Matthias Clasen
740965016f Merge branch 'matthiasc/for-main' into 'main'
Add a GDK_DISABLE env var

See merge request GNOME/gtk!7632
2024-08-20 01:58:54 +00:00
Matthias Clasen
26a2966a7b gdk: Beef up gdk_parse_debug_var
Add a docstring for the variable itself, and print it as part
of the help message. Update all callers to provide a docstring.
2024-08-19 20:40:32 -04:00
Benjamin Otte
f6a8ba0ccb gpu: The colorstate op doesn't need a colorstates arg
It's using the same colorstate all the time: any premultiplied.

So just hardcode it.
2024-08-20 01:05:20 +02:00
Matthias Clasen
436989d745 gsk: Apply the same transfer fixes
These are copied from gdkcolordefs.h.
2024-08-14 13:21:28 -04:00