This attempts to improve the accuracy for the "presentation_time" of an
individual GdkFrameTimings. That information is currently filled in as soon
as we get a frame callback. However, if presentation-time wayland protocol
is available, that will be used to supliment a more accurate time which
may improve future presentation-time predictions within GdkFrameClockIdle.
The protocol states that all related and sub surfaces will receive the
same information so it is safe that this could be registered for more
than just the toplevel. The information becomes idempotent.
When no action is selected, use the default cursor, and only
switch to one of the action-indicating cursors when we are over
a drop target.
Fixes: #6337Fixes: #6511
If glBufferStorage() is available, we can replace our usage of
glBufferSubData() with persistently mapped storage via
glMappedBufferRange().
This has 1 disadvantage:
1. It's not supported everywhere, it requires GL 4.4 or
GL_EXT_buffer_storage. But every GPU of the last 10 years should
implement it. So we check for it and keep the old code.
The old code can also be forced via GDK_GL_DISABLE=buffer-storage.
But it has 2 advantages:
1. It is what Vulkan does, so it unifies the two renderers' buffer
handling.
2. It is a significant performance boost in use cases with large vertex
buffers. Those are pretty rare, but do happen with lots of text at a
small font size. An example would be a small font in a maximized VTE
terminal or the overview in gnome-text-editor.
A custom benchmark tailored for this problem can be created with:
tests/rendernode-create-tests 1000000 text.node
This creates a node file called "text.node" that draws 1 million text
nodes.
(Creating that test takes a minute or so. A smaller number may be useful
on less powerful hardware than my Intel Tigerlake laptop.)
The difference can then be compared via:
tools/gtk4-rendernode-tool benchmark --runs=20 text.node
and
GDK_GL_DISABLE=buffer-storage tools/gtk4-rendernode-tool benchmark --runs=20 text.node
For my laptop, the difference is:
before: 1.1s
after: 0.8s
Related: !7021
We cannot depend on the exact event, since some events (e.g. for popups)
are rewritten. Therefore we need to determine the NSEvent based on
heuristics. The usual suspects are event type, device and timestamp.
This allows us to fix IMContext for popups.
Keep at least 1 second of frame timings.
This is necessary for 2 reasons - a real one and a fun one.
First, with the difference in monitor refresh rates, we can have 48Hz
latops as well as 240Hz high refresh rate monitors. That's a factor of
4, and tracking frame rates in both situations reliably is kind of hard
- either we track over too many frames and the fps take a lot of time to
adjust, or we track too little time and the fps fluctuate wildly.
Second, when benchmarking with GDK_DEBUG=no-vsync with a somewhat fast
renderer (*cough*Vulkan*cough*) frame rates can go into insane dimensions
and only very few frames are actually getting presentation times
reported. So to report accurate frame rates in those cases, we need a
*very* large history that can be 1000s of times larger than the usual
history. And that's just a waste for normal usage.
Previously, our reported fps numbers could be too low when the start
timings weren't complete. In that case we would use the frame time, but
the frame time is the time when the frame was rendered, which is quite a
few milliseconds before it is presented.
So in that case we would not report the difference in presentation
times, but the difference from start of rendering. However, those times
are way more variable and can smear over the whole frame because they
depend on when we received the frame callbacks to high priority GSources
as well as our own render time predictions.
This happened in particular with GDK_DEBUG=no-vsync and could report
number that are off by a factor of 2.
Now we skip any incomplete frames, because those frames never have
presentation times reported. This makes it theoretically more likely to
not being able to report fps at all, but I'd rather have no fps than fps
off by a factor of 2.
This is done with a NSCursor whose content is an NSImage. Image pixels are filled by a NSBitmap, and the format is premultiplied RGBA. So we can just use the texture downloader with GDK_MEMORY_R8G8B8A8_PREMULTIPLIED format.
While it’s documented as being safe, it triggers warnings from ubsan.
While we work out the best way to deal with that inside the
implementation of `G_ADD_PRIVATE` in GLib, let’s pragmatically just
short-circuit the code which triggers the warning here. This is helpful
because `gdk_display_get_debug_flags()` is called from a number of
locations within GTK, so is likely to be hit if anyone is running a UI
app under ubsan.
See https://gitlab.gnome.org/GNOME/glib/-/issues/3267#note_2033550
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: https://gitlab.gnome.org/GNOME/glib/-/issues/3267
It turns out that the workaround in 7b380b2ffc was insufficient.
During initialization, we end up calling apply_monitor_changes()
while xdg_output is set, but xdg_output_geometry isn't. Be more
careful and prevent that from wreaking havoc with negative scales.
Fixes: #6472
unsigned char is promoted to int, which lacks the 32nd bit to
make 0xff << 24 work. Explicitly cast to unsigned int to make
it clear what we want to happen.