Commit Graph

593 Commits

Author SHA1 Message Date
Matthias Clasen
c505a08e46 gsk: Small optimization
Avoid calling gsk_container_node_get_child in a loop.
2024-09-08 11:48:43 -04:00
Matthias Clasen
1ae9cdb4c9 gpu: Print blur colors
Relevant information when debugging shadows.
2024-09-07 12:34:04 -04:00
Benjamin Otte
896ea5b753 memoryformat: Add linear/nearest choice for mipmaping
linear will average all the pixels for the lod, nearest will just pick
one (using the same method as OpenGL/Vulkan, picking bottom right
center).

This doesn't really make linear/nearest filtering work as it should
(because it's still a form of mipmaps), but it has 2 advantages:

1. it gets closer to the desired effect

2. it is a lot faster

Because only 1 pixel is chosen from the original image, instead of
averaging all pixels, a lot less memory needs to be accessed, and
because memory access is the bottleneck for large images, the speedup is
almost linear with the number of pixels not accessed.
And that means that even for lot level 3, aka 1/8th scale, only 1/64 of
the pixels need to be accessed, and everything is 50x faster.

Switching gtk4-demo --run=image_scaling to linear/nearest makes all the
lag go away for me, even with a 64k x 64k image.
2024-09-06 15:47:35 -04:00
Benjamin Otte
cea961f4f4 memoryformat: Take src_format and dest_format
Why do we need this? Because RGB images are provided in RGB format but
GPUs can't handle RGB, only RGBA, so we need to convert.

And we need to do that without allocating too much memory, because
allocating memory is slow. Which means in aprticular we need to do the
conversion after mipmapping, not before (like we were doing).
2024-09-06 15:47:34 -04:00
Benjamin Otte
848c6815d3 gpu: Allow uploading of mipmap levels when tiling
This allows uploading less memory but requires computing lod levels on
the CPU which is slow because it reads through all of the memory and so
far entirely not optimized.

However, it uses significantly less VRAM.

This is done by adding a gdk_memory_mipmap() function that does this
task.
The texture upload op now accepts a lod level and if that is >0 it uses
gdk_memory_mipmap() on the source texture.
2024-09-06 15:47:34 -04:00
Benjamin Otte
563cce5530 gsk: Use gsk_rect_init_offset() everywhere
... and make it use a graphene_point_t as argument, because that's what
the callers want.
2024-09-02 00:22:37 +02:00
Benjamin Otte
6a1cd87480 gpu: Use builder for memory texture 2024-09-01 22:49:34 +02:00
Benjamin Otte
49ee69f316 gpu: Use right GL context when exporting texture
We want to use the display's context on the resulting texture,
but we do not want to use it for the stufff we need to do while
exporting - most importantly the GLsync.

Fixes #6976
2024-09-01 22:49:34 +02:00
Benjamin Otte
0defdc4af5 gpu: Plug fd leak in fallback path
If we can't construct a dmabuf texture, we need to clear the dmabuf fd.
2024-08-26 20:31:08 +02:00
Benjamin Otte
ea9b47f1b6 gpu: Use common cleanup function
Just simple cleanup, both functions do the same thing.
2024-08-26 20:31:08 +02:00
Benjamin Otte
65c8320a32 gpu: Fix fd leak in GL dmabuf export
The texture ID is not deleted on dmabuf export; a copy is made, the
GskGpuImage retains ownership.

However when doing GL export, the texture *does* take ownership, so we
need the stealing semantics for that case.
2024-08-26 20:31:08 +02:00
Benjamin Otte
4be1d754b7 vulkan: Don't spam stderr on failed Vulkan import
We write a debug message and then handle things using fallback.

Fixes error messages when trying to import incompatible dmabufs.
(in my case: llvmpipe dmabufs into radv)
2024-08-23 22:53:13 +02:00
Benjamin Otte
73c94cf1d6 gpu: Use the shared GL context when creating GL textures
The non-shared context's surface must survive the lifetime of the
GL texture, and when the renderer gets unrealized the surface goes away,
but we cannot guarantee that all GL textures have been destroyed by
then.

So better use a context we know will survive becuase it isn't bound to a
surface.

This is the same fix for NGL as f3ac0535f8
was for GL.
2024-08-23 20:04:46 +02:00
Benjamin Otte
6a986f03b6 gpu: Simplify the blur op a bit
I was looking through it and thought this looks better.

It's also 21 lineas of code less.
2024-08-23 19:01:05 +02:00
Benjamin Otte
54758bee1f gpu: Only run a single renderpass
Instead of running one renderpass per clip region, run one renderpass for
the whole clip extents, and just set the scissor to the individual clip
rects.

This means that we need to use LOAD_OP_LOAD in cases where we don't
redraw the full extents, but nonetheless, the eprformance wins of
avoiding renderpasses are worth it, in particualr on tilers like the
Raspberry Pi or other mobile chips and the Apple M1/2.
2024-08-21 21:13:34 +02:00
Benjamin Otte
cfe0da1eed gpu: Add GskGpuLoadOp
We want to differentiate between CLEAR, DONT_CARE and LOAD in the
future, and the current boolean doesn't allow that.

Also implement support for the the different ops in the Vulkan
renderpass code.
2024-08-21 21:13:34 +02:00
Benjamin Otte
1198dc76a4 gpu: Add gsk_gpu_first_node_begin_rendering()
This starts the renderpass at the given scissor rect.

It just splits out the gsk_gpu_render_pass_begin_op() call into a
simpler function, so it's harder to mess up.
2024-08-21 21:13:34 +02:00
Benjamin Otte
bed3e9918b gpu: Add GskGpuFirstNodeInfo
Just encapsulate all the data for the add_first_node() call into a
single struct.
2024-08-21 21:13:34 +02:00
Benjamin Otte
d76bb2991a gpu: Refactoring: Pull out nodeprocessor
Add gsk_gpu_node_processor_set_scissor() that allows resetting the
nodeprocessor's scissor and clip rectangle.
That in turn allows using the same nodeprocessor instance for all the
rects we draw for the clip region.
2024-08-21 21:13:34 +02:00
Benjamin Otte
e962e86fcd gpu: Split out a function
converting an image to any colorstate (not just ccs-capable default
colorstates) can go in its own function.
2024-08-21 21:13:34 +02:00
Benjamin Otte
c7fabe2897 gpu: Be more aggressive about GC'ing dead textures
When we encounter many dead textures, we want to GC. Textures can take
up fds (potentially even more than 1) and when we are not collecting
them quickly enough, we may run out of fds.

So GC when more then 50 dead textures exist.

An example for this happening is recent Mesa with llvmpipe udmabuf
support, which takes 2 fds per texture and the test in
testsuite/gdk/memorytexture that creates 800 textures of ~1 pixel each,
which it diligently releases but doesn't GC.

Related: #6917
2024-08-21 09:45:05 +02:00
Matthias Clasen
740965016f Merge branch 'matthiasc/for-main' into 'main'
Add a GDK_DISABLE env var

See merge request GNOME/gtk!7632
2024-08-20 01:58:54 +00:00
Matthias Clasen
26a2966a7b gdk: Beef up gdk_parse_debug_var
Add a docstring for the variable itself, and print it as part
of the help message. Update all callers to provide a docstring.
2024-08-19 20:40:32 -04:00
Benjamin Otte
f6a8ba0ccb gpu: The colorstate op doesn't need a colorstates arg
It's using the same colorstate all the time: any premultiplied.

So just hardcode it.
2024-08-20 01:05:20 +02:00
Matthias Clasen
436989d745 gsk: Apply the same transfer fixes
These are copied from gdkcolordefs.h.
2024-08-14 13:21:28 -04:00
Benjamin Otte
824ccfc562 gpu: Add an assertion
If the wrong color states get passed, we need to figure that out, before
we run shaders with junk values doing god knows what.
2024-08-14 08:30:55 +02:00
Benjamin Otte
79c2df8392 color: Handle negative values in all transfer functions
Make sure that for all eotfs/oetfs, eotf(x) == -eotf(-x)

In particular, don't pass negative values to pow() and cause undefined
behavior.
2024-08-13 20:47:40 +02:00
Matthias Clasen
13d415f2f6 Add a comment 2024-08-11 16:31:18 -04:00
Matthias Clasen
ca7aa30bc5 gsk: Use private text node api 2024-08-10 07:37:01 -04:00
Benjamin Otte
04ee41d7b0 Merge branch 'wip/otte/for-main' into 'main'
lots of small things

See merge request GNOME/gtk!7585
2024-08-10 01:10:12 +00:00
Benjamin Otte
23af1cd8ad gpu: When transforming to simpler transform, we might be all clipped
When transforming back from a complex transform to a simpler transform,
the resulting clip might turn out to clip everything, because clips can
grow while transforming, but the scissor rect won't. So when this
process happens, we can end up with an empty clip by transforming:

1. Set a clip that also sets the scissor
2. transform in a way that grows the clip, say rotate(45)
3. modify the clip to shrink it
4. transform in a way that simplifies the transform, say another
   rotate(45)
5. Figure out that this clip and the scissor rect do no longer overlap

Catch this case and avoid drawing anything.
2024-08-10 02:38:13 +02:00
Matthias Clasen
5551f30400 gsk: Use private shadow node api 2024-08-09 20:09:31 -04:00
Matthias Clasen
44fe51247c gsk: Change the blur op api
Pass the ccs, opacity and GdkColor to the op to let it make
decisions about color conversion.

Update the callers.
2024-08-09 20:09:31 -04:00
Matthias Clasen
3a99f1e9f1 gsk: Change the colorize op api
Pass the ccs, opacity and GdkColors to the op to let it make
decisions about color conversion.

Update the callers.
2024-08-09 20:09:30 -04:00
Benjamin Otte
16c7003acb gdk: Pass the opaque rect to begin_frame() actually
We know it at begin_frame() time, so if we pass it there instead of
end_frame(), we can use it then to make decisions about opacity.

For example, we could notice that the whole surface is opaque and choose
an RGBx format.
We don't do that yet, but now we could.
2024-08-10 01:40:46 +02:00
Benjamin Otte
a35f8d52d6 gsk: Clear current context after unrealize()
Make sure both GL renderers don't leave their contexts alive via the
current context, but ensure they dispose of them properly.

Fixes issues when the corresponding GL resources in the surfaces they
were attached to go away.
2024-08-10 01:40:45 +02:00
Benjamin Otte
b08ccc0bec ngl: Stop crashing with zink and llvmpipe
We were not calling make_current() early enough anymore after the
Vulkan validation-layer fixes in !7468

Change that by calling it earlier.
2024-08-10 01:40:45 +02:00
Benjamin Otte
523cd0dff7 glcontext: Add a surface_attached flag
GLContexts marked as surface_attached are always attached to the surface
in make_current().
Other contexts continue to only get attached to their surface between
begin_frame() and end_frame().

All our renderer use surface-attached contexts now.
Public API only gives out non-surface-attached contexts.

The benefit here is that we can now choose whenever we want to
call make_current() because it will not cause a re-make_current() if we
call it outside vs inside the begin/end_frame() region.

Or in other words: I want to call make_current() before begin_frame()
without a performance penalty, and now I can.
2024-08-10 01:40:45 +02:00
Benjamin Otte
0b2275774f gdk: Add gdk_draw_context_end_frame_full()
... and pass the opaque region of the node.

We don't do anything with it yet, this is just the plumbing.

The original function still exists, it passes NULL which is the value
for no opaque region at all.
2024-08-10 01:40:45 +02:00
Benjamin Otte
904b1815b5 nodeprocessor: Consult scissor after rounded clip
If we apply a rounded clip, we might change the clip in a way that makes
it intersectable with the scissor again, if both had diverged before.

So try and intersect with the clip.
2024-08-10 01:40:45 +02:00
Benjamin Otte
d6322c6389 gsk: Switch box shadow nodes to use offsets
Instead of
  float dx;
  float dy;
have a
  graphene_point_t offset;
in the object and in all APIs relating to them.
2024-08-10 01:40:45 +02:00
Matthias Clasen
d3b9eb7fc8 gsk: Use private box shadow api 2024-08-08 15:43:49 -04:00
Matthias Clasen
070ddcd14b Change box shadow op api
Pass the ccs, opacity and GdkColor and let the op decide about
color conversions. Update all callers.
2024-08-08 15:43:49 -04:00
Benjamin Otte
99d291eb69 Merge branch 'wip/otte/occlusion' into 'main'
Implement advanced occlusion culling

See merge request GNOME/gtk!7570
2024-08-08 14:22:22 +00:00
Benjamin Otte
3313fd4e2b vulkan: Turn debug messages into warnings
Vulkan errors are quire critical, so we want to see them.

Our code is good enough to handle all non-critical errors.
2024-08-08 04:41:16 +02:00
Matthias Clasen
56c02dd7d1 Merge branch 'css-color-hookup-2' into 'main'
Make non-srgb css colors work for borders

See merge request GNOME/gtk!7550
2024-08-07 23:14:35 +00:00
Benjamin Otte
4e4ed1e2d5 gpu: Implement occlusion for subsurface nodes
In the case of no offloading, we want to pass through to the child
(which is likely a big texture doing occlusion).

In the case of punching a hole, we want to punch the hole and not draw
anything behind it, so we start an occlusion pass with transparency.

And in the final case with offloading active, we don't draw anything,
so we don't draw anything.

This should fix concerns about drawing the background behind the video
as mentioned for example in
https://github.com/Rafostar/clapper/issues/343#issuecomment-1445425004
2024-08-07 23:26:03 +02:00
Benjamin Otte
82aa2cb5c2 gpu: Implement add_first_node for debug nodes
As always, we want to have no influence on any results
from these nodes.
2024-08-07 23:26:03 +02:00
Benjamin Otte
e9944148d5 gpu: Add GSK_DEBUG=occlusion
Draws a semi-transparent white overlay over all regions that have
been chosen for occlusion.
2024-08-07 23:26:03 +02:00
Benjamin Otte
afa4eb7d35 gpu: Make containers check opaque size for early exit
Container nodes save their opaque region, so it's quick to access. Use
that to check if the largest opaque region even qualifies for culling -
and if not, just exit.

Speeds up walking node trees by a lot.
2024-08-07 23:26:03 +02:00