This reverts commit 84a304e66e.
This produces marks that are confusing to me. They don't correlate
with actual gaps in the frame cycle and often overlap with regular
'window presented' marks. Also, the function we are emitting these
marks from is called from the get_frame_time getter, and we
definitely don't want to emit marks from there.
Keep at least 1 second of frame timings.
This is necessary for 2 reasons - a real one and a fun one.
First, with the difference in monitor refresh rates, we can have 48Hz
latops as well as 240Hz high refresh rate monitors. That's a factor of
4, and tracking frame rates in both situations reliably is kind of hard
- either we track over too many frames and the fps take a lot of time to
adjust, or we track too little time and the fps fluctuate wildly.
Second, when benchmarking with GDK_DEBUG=no-vsync with a somewhat fast
renderer (*cough*Vulkan*cough*) frame rates can go into insane dimensions
and only very few frames are actually getting presentation times
reported. So to report accurate frame rates in those cases, we need a
*very* large history that can be 1000s of times larger than the usual
history. And that's just a waste for normal usage.
It started out as busywork, but it does many separate things. If I could
start over, I'd take them apart into multiple commits:
1. Remove G_ENABLE_DEBUG around GDK_DEBUG_*() calls
This is not needed at all, the calls themselves take care of it.
2. Remove G_ENABLE_DEBUG around profiling code
This now enables profiling support in release builds.
3. Stop poking _gdk_debug_flags and use GDK_DEBUG_CHECK()
This was old code that was never updated.
4. Make !G_ENABLE_DEBUG turn off GDK_DEBUG_CHECK()
The code used to
#define GDK_DEBUG_CHECK(...) false
#define GDK_DEBUG(...)
which would compile away all the code inside those macros. This
means a lot of variable definitions and debug utility functions
would suddenly no longer be used and cause compiler errors.
Don't return to the main loop, instead force a run of the paint idle.
The paint idle will know to skip all the phases that aren't requested.
This is critically important becuase gdksurface.c assumes the
FLUSH_EVENTS and RESUME_EVENTS phases are matched, and we cannot
guarantee that if we return to the main loop and let various reentrant
code change the frame clock state.
This would lead to bugs with events being paused and never unpaused
again or even crashes.
Fixes#4941
Fix scheduling of the frame clock when we don't receive "frame drawn"
messages from the compositor.
If we received "frame drawn" events recently, then the "smooth frame
time" would be in sync with the vsync time. When we don't receive frame
drawn events, the "smooth frame time" is simply incremented by constant
multiples of the refresh interval. In both cases we can use this smooth
time as the basis for scheduling the next clock cycle.
By only using the "smooth frame time" as a basis we also benefit from
more consistent scheduling cadence. If, for example, we got "frame
drawn" events, then didn't receive them for a few frames, we would still
be in sync when we start receiving these events again.
When an animation is started while the application is idle, that often
happens as a result of some external event. This can be an input event,
an expired timer, data arriving over the network etc. The result is that
the first animation clock cycle could be scheduled at some random time,
as opposed to follow up cycles which are usually scheduled right after a
vsync.
Since the frame time we report to the application is correlated to the
time when the frame clock was scheduled to run, this can result in
uneven times reported in the first few animation frames. In order to fix
that, we measure the phase of the first clock cycle - i.e. the offset
between the first cycle and the preceding vsync. Once we start receiving
"frame drawn" signals, the cadence of the frame clock scheduling becomes
tied to the vsync. In order to maintain the regularity of the reported
frame times, we adjust subsequent reported frame times with the
aforementioned phase.
When the application does not receive "frame drawn" signals we schedule
the clock to run more or less at intervals equal to the last known
refresh interval. In order to minimize clock skew we have to aim for
exact intervals.
We try to step the frame clock in whole refresh_interval steps, but to
avoid drift and rounding issues we additionally try to converge it to
be synced to the physical vblank (actually the time we get the
frame-drawn message from the compositor, but these are tied together).
However, the convergence to vsync only really makes sense if the new
frame_time actually is tied to the vsync. It may very well be that
some other kind of event (say a network or mouse event) triggered
the redraw, and not a vsync presentation.
We used to assume that all frames that are close in time (< 4 frames
apart) were regular and thus tied to the vsync, but there is really no
guarantee of that. Even non regular times could be rapid.
This commit changes the code to only do the convergence-to-real-time
if the cause of the clock cycle was a thaw (i.e. last frame drawn and
animating). Paint cycles for any other kind of reason are always
scheduled an integer number of frames after the last cycle that was
caused by a thaw.
When we get to a paint cycle we now know if this was caused by a
thaw, which typically means last frame was drawn, or some other event.
In the first case the time of the cycle is tied to the vblank in some
sense, and in the others it is essentially random. We can use this
information to compute better frame times. (Will be done in later
commits.)
When we run the frameclock RUN_FLUSH_IDLE idle before the paint,
then gdk_frame_clock_flush_idle() sets
```
priv->phase = GDK_FRAME_CLOCK_PHASE_BEFORE_PAINT
```
at the end if there is a paint comming.
But, before doing the paint cycle it may handle other X events, and
during that time the phase is set to BEFORE_PAINT. This means that the
current check on whether we're inside a paint is wrong:
```
if (priv->phase != GDK_FRAME_CLOCK_PHASE_NONE &&
priv->phase != GDK_FRAME_CLOCK_PHASE_FLUSH_EVENTS)
return priv->smoothed_frame_time_base;
```
This caused us to sometimes use this smoothed_frame_time_base even
though we previously reported a later value during PHASE_NONE, thus
being non-monotonic.
We can't just additionally check for the BEGIN_PAINT phase though,
becasue if we are in the paint loop actually doing that phase we
should use the time base. Instead we check for `!(BEFORE_PAINT &&
in_paint_idle)`.
A call to frame gdk_frame_clock_get_frame_time() outside of the paint
cycle could report an un-error-corrected frame time, and later a
corrected value could be earlier than the previously reported value.
We now always store the latest reported time so we can ensure
monotonicity.
In commit c6901a8b, the frame clock reported time was changed from
simply reporting the time we ran the frame clock cycle to reporting a
smoothed value that increased by the frame interval each time it was
called.
However, this change caused some problems, such as:
https://gitlab.gnome.org/GNOME/gtk/-/merge_requests/1415https://gitlab.gnome.org/GNOME/gtk/-/merge_requests/1416https://gitlab.gnome.org/GNOME/gtk/-/merge_requests/1482
I think a lot of this is caused by the fact that we just overwrote the
old frame time with the smoothed, monotonous timestamp, breaking
some things that relied on knowing the actual time something happened.
This is a new approach to doing the smoothing that is more explicit.
The "frame_time" we store is the actual time we ran the update cycle,
and then we separately compute and store the derived smoothed time and
its period, allowing us to easily return a smoothed time at any time
by rounding the time difference to an integer number of frames.
The initial frame_time can be somewhat arbitrary, as it depends on the
first cycle which is not driven by the frame clock. But follow-up
cycles are typically tied to the the compositor sending the drawn
signal. It may happen that the initial frame is exactly in the middle
between two frames where jitter causes us to randomly round in
different directions when rounding to nearest frame. To fix this we
additionally do a quadratic convergence towards the "real" time,
during presentation driven clock cycles (i.e. when the frame times are
small).
This drops the marks for before/after-paint as they are internal
things that very rarely use any time, and also flush/resume-events
as any events reported here will get separate marks so will be easy
to see anyway.
Also, we rename the entire frameclock cycle to "frameclock cycle"
rather than "paint_idle" which is rather cryptic.
These don't take a duration, instead they call g_get_monotonic_time() to
and subtract the start time for it.
Almost all our calls are like this, and this makes the callsites clearer
and avoids inlining the clock call into the call site.
When we use if (GDK_PROFILER_IS_RUNNING) this means we get an
inlined if (FALSE) when the compiler support is not compiled in, which
gets rid of all the related code completely.
We also expand to G_UNLIKELY(gdk_profiler_is_running ()) in the supported
case which might cause somewhat better code generation.
usec is the scale of the monotonic timer which is where we get almost
all the times from. The only actual source of nsec is the opengl
GPU time (but who knows what the actual resulution of that is).
Changing this to usec allows us to get rid of " * 1000" in a *lot* of
places all over the codebase, which are ugly and confusing.
Instead of reporting the frame clock phases as defined,
report the duration of the signal emissions, which is more
useful for tracking down what is taking time.
We were adding incomplete frame timings to the
profile, which lead to occasional nonsense
numbers. Instead, only add timings to the profile
once we marked them as complete. This also
gives us an opportunity to add the presentation
time as a marker.
Since commit 3b2f9395, the frame time may be set into the future, so
only ensure monotonicity, and don't store the offset. This prevents the
frame time from becoming out of sync with g_get_monotonic_time().
Fixes#1612
The main GDK thread lock is not portable and deprecated.
The only reason why gdk_threads_add_timeout() and
gdk_threads_add_timeout_full() exist is to allow invoking a callback
with the GDK lock held, in case 3rd party libraries still use the
deprecated gdk_threads_enter()/gdk_threads_leave() API.
Since we're removing the GDK lock, and we're releasing a new major API,
such code cannot exist any more; this means we can use the GLib API for
installing timeout callbacks.
https://bugzilla.gnome.org/show_bug.cgi?id=793124
This fixes stuttering in animations that rely on the regularity of
gdk_frame_clock_get_frame_time.
https://bugzilla.gnome.org/show_bug.cgi?id=787665
BEFORE
gdkgears:
58 FPS and visibly stuttering
gnome-maps on a 59.95Hz monitor:
"paint" g_get_monotonic_time +17278μs, gdk_frame_clock_get_frame_time +17278μs
"paint" g_get_monotonic_time +17449μs, gdk_frame_clock_get_frame_time +17426μs
"paint" g_get_monotonic_time +17620μs, gdk_frame_clock_get_frame_time +17600μs
AFTER
gdkgears:
60 FPS and smoother
gnome-maps on a 59.95Hz monitor:
"paint" g_get_monotonic_time +18228μs, gdk_frame_clock_get_frame_time +16680μs
"paint" g_get_monotonic_time +15010μs, gdk_frame_clock_get_frame_time +16680μs
"paint" g_get_monotonic_time +17134μs, gdk_frame_clock_get_frame_time +16680μs
This patch makes that work using 1 of 2 options:
1. Add all missing enums to the switch statement
or
2. Cast the switch argument to a uint to avoid having to do that (mostly
for GdkEventType).
I even found a bug while doing that: clearing a GtkImage with a surface
did not notify thae surface property.
The reason for enabling this flag even though it is tedious at times is
that it is very useful when adding values to an enum, because it makes
GTK immediately warn about all the switch statements where this enum is
relevant.
And I expect changes to enums to be frequent during the GTK4 development
cycle.
These were showing up higher in Sysprof profiles.
The simple fix is to avoid the emit_by_name() and let the interface emit
the signals directly. No function preconditions are provided since these
are internal API.