On Windows with NVIDIA drivers at least, when we create a legacy context
via wglCreateContext(), we may still get a (W)GL 4.x context. Allow
such contexts to also use GLSL version 130 instead of 110, so that
things continue to work.
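Roughly, the idea (a sketch; the helper name is made up):

  /* Pick the GLSL version from the GL version the context actually
   * reports, not from how the context was created. GL 3.0 and later
   * can do GLSL 130. */
  static const char *
  pick_glsl_version (int gl_major)
  {
    if (gl_major >= 3)
      return "130";

    return "110";
  }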
But don't call it too early; we only want to call it once we have
prepared the target.
This way, we guarantee that a GL context is always available and that it
is bound to the correct target.
Don't pass texture + rect, but instead have
gdk_memory_texture_new_subtexture()
and use it to generate subtextures and pass them.
This has the advantage of downloading a too-large texture only once
instead of N times.
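A sketch of the resulting tiling loop (private API, signatures
approximate; upload_tile() is a hypothetical helper):

  static void
  upload_in_tiles (GdkMemoryTexture *memtex,
                   int               width,
                   int               height,
                   int               max_size) /* e.g. GL_MAX_TEXTURE_SIZE */
  {
    for (int y = 0; y < height; y += max_size)
      for (int x = 0; x < width; x += max_size)
        {
          GdkTexture *tile =
            gdk_memory_texture_new_subtexture (memtex, x, y,
                                               MIN (max_size, width - x),
                                               MIN (max_size, height - y));

          upload_tile (tile, x, y);
          g_object_unref (tile);
        }
  }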
It does not belong in GdkGLContext, it's a renderer thing.
It's also the only user of that API.
Introduce gdk_gl_context_check_version() private API to make version
checks simpler.
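In essence (a sketch; the real private signature may differ):

  static gboolean
  gdk_gl_context_check_version (GdkGLContext *context,
                                int required_gl_major, int required_gl_minor,
                                int required_gles_major, int required_gles_minor)
  {
    int major, minor;

    gdk_gl_context_get_version (context, &major, &minor);

    if (gdk_gl_context_get_use_es (context))
      return major > required_gles_major ||
             (major == required_gles_major && minor >= required_gles_minor);

    return major > required_gl_major ||
           (major == required_gl_major && minor >= required_gl_minor);
  }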
It turns out glReadPixels() cannot convert pixels, and only specific
values may be passed for its format and type arguments. You need to
know which ones, or things will explode.
GL is great.
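For illustration, on GLES only GL_RGBA + GL_UNSIGNED_BYTE is
guaranteed to work; one extra combination is implementation-defined
and has to be queried (sketch):

  static void
  read_pixels_checked (int width, int height,
                       GLenum wanted_format, GLenum wanted_type,
                       guchar *data)
  {
    GLint format, type;

    glGetIntegerv (GL_IMPLEMENTATION_COLOR_READ_FORMAT, &format);
    glGetIntegerv (GL_IMPLEMENTATION_COLOR_READ_TYPE, &type);

    if ((GLenum) format == wanted_format && (GLenum) type == wanted_type)
      glReadPixels (0, 0, width, height, wanted_format, wanted_type, data);
    else
      /* fall back to the mandated pair and convert on the CPU */
      glReadPixels (0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, data);
  }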
Pass a format to GdkTextureClass::download(). That way we can download
data in any format.
Also replace gdk_texture_download_texture() with
gdk_memory_texture_from_texture() which again takes a format.
The old functionality is still there for code that wants it: Just pass
gdk_texture_get_format (texture) as the format argument.
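For example (private API; exact return type approximate):

  /* old behavior: download in the texture's own format */
  GdkMemoryTexture *copy =
    gdk_memory_texture_from_texture (texture,
                                     gdk_texture_get_format (texture));

  /* new possibility: force a specific format for the download */
  GdkMemoryTexture *rgba =
    gdk_memory_texture_from_texture (texture,
                                     GDK_MEMORY_R8G8B8A8_PREMULTIPLIED);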
Make a deep texture if the render nodes have
high depth content.
For now, we use 32F here for the deep format,
since using 16F causes small rounding errors
that break the memorytexture roundtrip tests.
Look at the framebuffer and the rendernode to
determine what format to use for intermediate
textures.
Our preference here is to use fp16, if we have it
and it makes sense for the framebuffer we're given.
Add private api to find out if the content
of a render node should be considered 'deep'.
The information is collected at creation time,
so there is no tree-walking involved when we
are using this information in the renderer.
Currently, this comes down to whether there are
any texture nodes with high depth textures in the subtree.
In the future, we may want to allow marking gradient
nodes in this way as well.
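Sketched out, with names approximating the private API:

  /* The flag is computed at creation time by OR-ing the children's
   * flags, so the renderer's query is a plain field read. */
  gboolean
  gsk_render_node_prefers_high_depth (const GskRenderNode *node)
  {
    return node->prefers_high_depth;
  }

  /* in a container node's constructor: */
  self->prefers_high_depth = FALSE;
  for (guint i = 0; i < n_children; i++)
    self->prefers_high_depth |= gsk_render_node_prefers_high_depth (children[i]);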
For MemoryTexture, this is a simple change.
For GLTexture, we need to query the format at texture creation. This
sounds like a bad idea and extra work until one realizes that we'd
need to do that anyway when using the texture the first time - either
when downloading, or when trying to use it in a rendernode, where we
will soon need that information to determine if the texture prefers high
depth.
Also, now make gdk_memory_convert() the only conversion function
and allow conversions between any 2 formats by going via a float[4].
This could be optimized via fast-paths, but so far it isn't.
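The scheme, simplified (the real code works per row and handles
premultiplication; the names here are illustrative):

  typedef void (* DecodeFunc) (const guchar *src, float rgba[4]);
  typedef void (* EncodeFunc) (const float rgba[4], guchar *dest);

  static void
  convert (guchar *dest, gsize dest_stride, EncodeFunc encode, gsize dest_bpp,
           const guchar *src, gsize src_stride, DecodeFunc decode, gsize src_bpp,
           gsize width, gsize height)
  {
    for (gsize y = 0; y < height; y++)
      for (gsize x = 0; x < width; x++)
        {
          float rgba[4];

          decode (src + y * src_stride + x * src_bpp, rgba);
          encode (rgba, dest + y * dest_stride + x * dest_bpp);
        }
  }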
We never put large icons into the icon cache,
so all its items are always atlased, but we do
put large glyphs into the glyph cache, and we
were never freeing those items, even when they
go unused. Fix that.
This reverts commit 87af45403a.
I've found that this change is needed to ensure that the
bounding boxes of text nodes encompass all the glyph drawing.
Without it, we overdraw the widget boundaries and cut off
glyphs.
We are rendering the glyphs on a larger surface,
and we should avoid introducing unnecessary
rounding errors here. Also, I've found that
we always need to enlarge the surface by one
pixel in each direction to avoid cutting off
the tops of large glyphs.
For 2D transforms, we can read the scale
factors more directly off the matrix.
This should eventually be moved out into a
function to decompose a 2D transform into
scale + rotation + skew + translation.
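For reference, with a 2D matrix laid out as

  | xx xy tx |
  | yx yy ty |

the scale factors are the lengths of the two columns (assuming no
skew; with skew, scale_y would instead be det / scale_x):

  #include <math.h>

  static void
  matrix_2d_get_scale (double xx, double yx, double xy, double yy,
                       double *scale_x, double *scale_y)
  {
    *scale_x = hypot (xx, yx);
    *scale_y = hypot (xy, yy);
  }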
We avoid an offscreen if we know the child node
can 'handle' the transform. Shadow nodes can if their
child node does - either the child node is a text node
in which case the shortcuts we take for shadow nodes
will work fine with the transform (we just render the
text node offset), or the child is not a text node,
in which case we render the shadow to an offscreen
anyway.
This change makes the label-shadows reftest pass with
the GL renderer, not by fixing the issue but by avoiding
it.
For shadow nodes, we try pretty hard to avoid
rendering shadows, and we have a shortcut
that just renders text offset, but we can try
harder to do nothing - if the text is offset
by zero, we don't need to draw it at all.
We need to use an offscreen whenever there are overlapping
children somewhere in the tree below; just checking the
direct child of the opacity node is not enough.
Fixes: #4261
The cairo_t that we create to render glyphs for
the glyph cache needs to match the font options
that are supposed to govern how glyphs are
drawn.
Since we allow font options to be different per
widget in gtk, we need to have them at least at
the level of individual render nodes. Adding them
to the lookup key for the glyph cache has the
side effect of solving another problem: We are
not flushing the cache when font options change.
Since font options affect how the glyphs get rendered,
we need to pass the font options down from the gtk level
to where the glyph cache is populated.
Add a new gsk_text_node_new_full api that takes a
cairo_font_options_t in addition to the other parameters.
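The shape of the new constructor (a sketch mirroring
gsk_text_node_new() with the options parameter added; the exact
signature may differ):

  GskRenderNode *
  gsk_text_node_new_full (PangoFont                  *font,
                          PangoGlyphString           *glyphs,
                          const cairo_font_options_t *options,
                          const GdkRGBA              *color,
                          const graphene_point_t     *offset);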
This is needed as GskRenderNode is its own fundamental type and has its
own GValue infrastructure. And I want to put render nodes into the
clipboard, which uses GValues.
Long time ago, Cairo shadows in both GTK3 and 4 were drawn at a size about
twice their radius. Eventually this was fixed but the shadow extents are
still calculated for the previous size and appear unreasonably large: for
example, 141px for a 50px radius shadow. This can get very noticeable in
places such as the invisible window frame, which gets included in screenshots.
https://gitlab.gnome.org/GNOME/gtk/-/merge_requests/3419 just divides the
radius by 2 when drawing a shadow with Cairo; do the same when calculating
extents.
See https://gitlab.gnome.org/GNOME/gtk/-/issues/3841
harfbuzz has all the information we need, so we
can avoid poking directly at freetype apis. Also
drop the caching of color glyph information until
it turns out to be a problem.
The vfunc is called to initialize GL and it returns a "base" context
that GDK then uses as the context all others are shared with. So the GL
context share tree now looks like:
+ context from init_gl
- context1
- context2
...
So this is a flat tree now; the complexity is gone.
The only caveat is that backends now need to create a GL context when
initializing GL so some refactoring was needed.
Two new functions have been added:
* gdk_display_prepare_gl()
This is public API and can be used to ensure that GL has been
initialized or if not, retrieve an error to display (or debug-print).
* gdk_display_get_gl_context()
This is a private function to retrieve the base context from
init_gl(). It replaces gdk_surface_get_shared_data_context().
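Typical use of the public half (sketch):

  static gboolean
  ensure_gl (GdkDisplay *display)
  {
    GError *error = NULL;

    if (!gdk_display_prepare_gl (display, &error))
      {
        g_warning ("GL not available: %s", error->message);
        g_clear_error (&error);
        return FALSE;
      }

    return TRUE;
  }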
Scale factors can be negative, but we were not
looking out for that, triggering an assertion when
trying to create a render target with negative
width or height. Avoid that.
Fixes: #4096
We are pretty good at batching commands now, and we can easily
produce batches that exceed the maximum number of elements per
draw call that the hw can handle. Query that number, and respect
it when merging batches.
This fixes the rendering of the overview map in GtkSourceView.
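Roughly (the Batch struct here is hypothetical):

  typedef struct { int n_vertices; /* ... */ } Batch;

  static gboolean
  can_merge (const Batch *a, const Batch *b)
  {
    static GLint max_vertices = 0;

    if (max_vertices == 0)
      glGetIntegerv (GL_MAX_ELEMENTS_VERTICES, &max_vertices);

    return a->n_vertices + b->n_vertices <= max_vertices;
  }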
Remove a boatload of "or %NULL" from nullable parameters
and return values. gi-docgen generates suitable text from
the annotation that we don't need to duplicate.
This adds a few missing nullable annotations too.
Make gsk_ngl_texture_library_pack always return
the position including the padding. And compute
texture coordinates accurately in all cases (we
were fudging the padding for standalone textures).
We can't use this flag for any code that may get run
outside the __builtin_cpu_supports() check, and meson
doesn't allow per-file cflags. So we have to split this
code off into its own static library.
When we clean up the uniform allocations after a frame,
it can happen that our space requirements actually increase,
due to padding that depends on the order of allocations.
Instead of asserting that it doesn't happen, just make
it work by growing our allocation.
Fixes: #3853
We need to use __cpuid() to check for the presence of F16C instructions on
Visual Studio builds, and call the half_to_float4() or float_to_half4()
implementation accordingly, as the __builtin_cpu...() functions are strictly
for GCC or Clang only.
Also, since __m128i_u is not a standard intrinsics type across the board, just
use __m128i on Visual Studio as it is safe to do so there for use for
_mm_loadl_epi64().
As on Darwin, we cannot use the alias __attribute__, since
__attribute__ is also for GCC and Clang only.
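On the Visual Studio side, the detection boils down to CPUID
leaf 1, ECX bit 29 (a sketch):

  #include <intrin.h>

  static gboolean
  have_f16c (void)
  {
    int cpu_info[4];

    __cpuid (cpu_info, 1);

    return (cpu_info[2] & (1 << 29)) != 0; /* cpu_info[2] is ECX */
  }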
When 9-slicing shadows, omit the center tile when it is
entirely contained in the outline (that is not always
the case, depending on corners and offsets).
gsk_rounded_rect_contains_rect was calling
gsk_rounded_rect_contains_point, which potentially
checks all four corners, for a total of up to 16
corner/point checks. But there is no need to do
more than 4 such checks to answer the question.
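A sketch of the cheaper check (corner_contains_point() is a
hypothetical helper testing a point against one specific rounded
corner): each corner of the rect can only fall into the rounded
corner on its own side, so four point checks suffice:

  static gboolean
  rounded_rect_contains_rect (const GskRoundedRect  *self,
                              const graphene_rect_t *rect)
  {
    float x = rect->origin.x, y = rect->origin.y;
    float w = rect->size.width, h = rect->size.height;

    if (!graphene_rect_contains_rect (&self->bounds, rect))
      return FALSE;

    return corner_contains_point (self, GSK_CORNER_TOP_LEFT,     x,     y)
        && corner_contains_point (self, GSK_CORNER_TOP_RIGHT,    x + w, y)
        && corner_contains_point (self, GSK_CORNER_BOTTOM_RIGHT, x + w, y + h)
        && corner_contains_point (self, GSK_CORNER_BOTTOM_LEFT,  x,     y + h);
  }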
Opportunistically use the coloring program for
drawing underlines instead of the color program.
This avoids program changes in the middle of
text.
For the Emoji text scrolling benchmark, this reduces
the program changes per frame from > 1000 to around 100.
Use an IFUNC resolver to determine whether we can use
intrinsics for FP16 conversion. This requires the functions
to be no longer inline.
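The pattern, as a sketch (GNU-specific; guint16 from glib assumed):

  static void float_to_half4_c    (const float f[4], guint16 h[4]);
  static void float_to_half4_f16c (const float f[4], guint16 h[4]);

  /* the resolver runs once, at load time, and picks the
   * implementation the symbol gets bound to */
  static void (* resolve_float_to_half4 (void)) (const float *, guint16 *)
  {
    __builtin_cpu_init ();

    if (__builtin_cpu_supports ("f16c"))
      return float_to_half4_f16c;

    return float_to_half4_c;
  }

  void float_to_half4 (const float f[4], guint16 h[4])
    __attribute__ ((ifunc ("resolve_float_to_half4")));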
Sadly, it turns out that __builtin_cpu_supports ("f16c")
doesn't compile on the very systems where we need it to keep
us from getting a SIGILL at runtime.