If the size changes, we need to relayout the tiles. Otherwise we can keep
using what we had before. Generally, that shouldn't happen, but the
previous check was failing in a number of ways.
It looks like, particularly on the M1, we might need to double buffer the
contents of the IOSurface<->OpenGL texture bindings. This doesn't appear
to show up on the Intel macbooks I've tried, but I've seen it in the wild
on an M1.
If we have a 2x scale laptop with a 1x scale external display, we would
need to create a new IOSurface for the external display once it crosses
a boundary, otherwise we won't have something capable of displaying
correctly on the second monitor.
This provides a major shift in how we draw, both with accelerated OpenGL
and with software rendering using Cairo. In short, it uses tiles of Core
Animation's CALayer to display contents from an OpenGL or Cairo rendering
so that the window can provide partial damage updates. Partial damage is
not generally available when using OpenGL as the whole buffer is flipped
even if you only submitted a small change using a scissor rect.
Thankfully, this speeds up Cairo rendering a bit too by using IOSurface to
upload contents to the display server. We use the same tiling system as for
OpenGL, which reduces overall complexity and the differences between them.
A New Buffer
============
GdkMacosBuffer is a wrapper around an IOSurfaceRef. The term buffer was
used because 1) surface is already used and 2) it loosely maps to a
front/back buffer semantic.
However, it appears that IOSurfaceRef contents are being retained in
some fashion (likely in the compositor result) so we can update the same
IOSurfaceRef without flipping as long as we're fast. This appears to be
what Chromium does as well, but Firefox uses two IOSurfaceRef and flips
between them. We would like to avoid two surfaces because it doubles the
GPU VRAM requirements of the application.
Changes to Windows
==================
Previously, the NSWindow would dynamically change between different
types of NSView based on the renderer being used. This is no longer
necessary as we just have a single NSView type, GdkMacosView, which
inherits from GdkMacosBaseView just to keep the tedious stuff separate
from the machinery of GdkMacosView. We can merge those someday if we
are okay with that.
Changes to Views
================
GdkMacosCairoView, GdkMacosCairoSubView, GdkMacosGLView have all been
removed and replaced with GdkMacosView. This new view has a single
CALayer (GdkMacosLayer) attached to it which itself has sublayers.
The contents of the CALayer are populated with an IOSurfaceRef which
we allocated with the GdkMacosSurface. The surface is replaced when
the NSWindow resizes.
Changes to Layers
=================
We now have a dedicated GdkMacosLayer which contains sublayers of
GdkMacosTile. The tile has a maximum size of 128x128 pixels in device
units.
The set of GdkMacosTile sublayers is partitioned by splitting both the
transparent region (window bounds minus opaque area) and the opaque
area into tiles.
A tile has either translucent contents (and therefore is not opaque) or
has opaque contents (and therefore is opaque). An opaque tile never
contains transparent contents. As such, the opaque tiles contain a black
background so that Core Animation will consider the tile's bounds as
opaque. This can be verified with "Quartz Debug -> Show opaque regions".
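A minimal sketch of how a rectangle could be cut into tiles of at most
128x128 device pixels. The Rect type and tile_rect() are hypothetical
names for illustration, not the GdkMacosLayer code, which additionally
tiles the transparent and opaque regions separately.

```c
#include <assert.h>
#include <stddef.h>

#define TILE_MAX 128 /* maximum tile edge in device pixels, per the text */

/* Hypothetical rectangle type, for illustration only. */
typedef struct { int x, y, width, height; } Rect;

/* Cut `area` into tiles of at most TILE_MAX x TILE_MAX, row by row.
 * Stores up to `max_tiles` tiles in `tiles` and returns the number of
 * tiles the area needs, which may be larger than `max_tiles`. */
size_t
tile_rect (Rect area, Rect *tiles, size_t max_tiles)
{
  size_t n = 0;

  assert (area.width >= 0 && area.height >= 0);

  for (int y = 0; y < area.height; y += TILE_MAX)
    {
      for (int x = 0; x < area.width; x += TILE_MAX)
        {
          if (n < max_tiles)
            {
              int w = area.width - x;
              int h = area.height - y;

              tiles[n].x = area.x + x;
              tiles[n].y = area.y + y;
              /* Tiles on the right/bottom edge may be smaller. */
              tiles[n].width = w < TILE_MAX ? w : TILE_MAX;
              tiles[n].height = h < TILE_MAX ? h : TILE_MAX;
            }
          n++;
        }
    }

  return n;
}
```

Running this once per region (transparent, then opaque) would keep every
tile entirely opaque or entirely translucent, as described above.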
Changes to Cairo
================
GTK 4 cannot currently use cairo-quartz because of how CSS borders are
rendered. It simply causes errors in the cairo_quartz_surface_t backend.
Since we are restricted to using cairo_image_surface_t (which happens to
be faster anyway) we can use the IOSurfaceBaseAddress() to obtain a
mapping of the IOSurfaceRef in user-space. It always uses BGRA 32-bit
with alpha channel even if we will discard the alpha channel as that is
necessary to hit the fast paths in other parts of the platform. Note
that while Cairo says CAIRO_FORMAT_ARGB32, it is really 32-bit BGRA on
little-endian as we expect.
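To illustrate the byte-order note, here is a small sketch showing that
a native-endian ARGB32 pixel is laid out in memory as B, G, R, A on a
little-endian host. The helper names are made up for this example and
premultiplied alpha is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack one pixel as a native-endian 32-bit value with alpha in the top
 * byte, which is how CAIRO_FORMAT_ARGB32 is defined. Premultiplication
 * is left out; this sketch only looks at byte order. */
uint32_t
pack_argb32 (uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
  return ((uint32_t) a << 24) | ((uint32_t) r << 16) |
         ((uint32_t) g << 8) | (uint32_t) b;
}

/* Copy the pixel's in-memory bytes, lowest address first, into `out`. */
void
pixel_bytes (uint32_t pixel, uint8_t out[4])
{
  assert (out != NULL);
  memcpy (out, &pixel, sizeof pixel);
}

/* Returns 1 on a little-endian host, 0 otherwise. */
int
is_little_endian (void)
{
  const uint32_t one = 1;
  uint8_t first;

  memcpy (&first, &one, 1);
  return first == 1;
}
```

On a little-endian machine pixel_bytes() yields the blue byte first,
which is the BGRA ordering the IOSurfaceRef uses.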
OpenGL will render flipped (Quartz Native Co-ordinates) while Cairo
renders with 0,0 in the top-left. We could use cairo_translate() and
cairo_scale() to reverse this, but it looks like some cairo things may
not look quite right if we do so. To reduce the chances of one-off
bugs this continues to draw as Cairo would normally, but instead uses
a CGAffineTransform in the tiles and some CGRect translation when
swapping buffers to get the same effect.
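The rectangle translation amounts to mirroring the Y axis within the
surface height. A minimal sketch of that arithmetic, using a
hypothetical Rect type and flip_rect_y() helper rather than the actual
CGRect code:

```c
#include <assert.h>

/* Hypothetical rectangle type, for illustration only. */
typedef struct { int x, y, width, height; } Rect;

/* Convert a rectangle between a top-left origin (Cairo) and a
 * bottom-left origin (Quartz) inside a surface of `surface_height`.
 * The mapping is its own inverse, so it works in both directions. */
Rect
flip_rect_y (Rect r, int surface_height)
{
  assert (surface_height >= 0);

  r.y = surface_height - (r.y + r.height);
  return r;
}
```

Applying the function twice returns the original rectangle, which makes
it easy to sanity check damage rectangles in either coordinate space.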
Changes to OpenGL
=================
To simplify things, all NSOpenGL* related components have been removed
and we strictly use the Core GL (CGL*) API. This probably should have
been done long ago anyway.
Most examples found in the browsers to use IOSurfaceRef with OpenGL are
using Legacy GL and there is still work underway to make this fit in
with the rest of how the GSK GL renderer works.
Since IOSurfaceRef bound to a texture/framebuffer will not have a
default framebuffer ID of 0, we needed to add a default framebuffer id
to the GdkGLContext. GskGLRenderer can use this to setup the command
queue in such a way that our IOSurface destination has been
glBindFramebuffer() as if it were the default drawable.
This stuff is pretty sleight-of-hand, so where things are and what needs
flushing when and where has been a bit of an experiment to see what
actually works to get synchronization across subsystems.
Efficient Damages
=================
After we draw with Cairo, we unlock the IOSurfaceRef and the contents
are uploaded to the GPU. To make the contents visible to the app,
we must clear the tile's contents with `layer.contents=nil;` and then
re-apply the IOSurfaceRef. Since the buffer has likely not changed, we
only do this if the tile overlaps the damage region.
This gives the effect of having more tightly controlled damage regions
even though updating the layer would damage the whole window (as it
is with OpenGL/Metal today with the exception of scissor-rect).
This too can be verified using "Quartz Debug -> Flash screen updates".
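The per-tile check can be sketched as a plain rectangle overlap test.
This is only an illustration under simplifying assumptions: the names
are hypothetical and the damage is a single rectangle, whereas the real
damage is a region made of several rectangles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rectangle type, for illustration only. */
typedef struct { int x, y, width, height; } Rect;

/* True if the two rectangles share at least one pixel. Rectangles that
 * only touch along an edge do not overlap. */
bool
rects_overlap (Rect a, Rect b)
{
  return a.x < b.x + b.width && b.x < a.x + a.width &&
         a.y < b.y + b.height && b.y < a.y + a.height;
}

/* Flag every tile that intersects `damage` in `needs_refresh` and
 * return how many tiles were flagged. Only those tiles would need
 * their contents cleared and re-applied. */
size_t
mark_damaged_tiles (const Rect *tiles,
                    size_t      n_tiles,
                    Rect        damage,
                    bool       *needs_refresh)
{
  size_t count = 0;

  assert (tiles != NULL && needs_refresh != NULL);

  for (size_t i = 0; i < n_tiles; i++)
    {
      needs_refresh[i] = rects_overlap (tiles[i], damage);
      if (needs_refresh[i])
        count++;
    }

  return count;
}
```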
Frame Synchronized Resize
=========================
In GTK 4, we have the ability to perform sizing changes from compute-size
during the layout phase. Since the macOS backend already tracks window
resizes manually, we can avoid doing the setFrame: immediately and instead
do it within the frame clock's layout phase.
Doing so gives us vastly better resize experience as we're more likely to
get the size-change and updated-contents in the same frame on screen. It
makes things feel "connected" in a way they weren't before.
Some additional effort to tweak gravity during the process is also
necessary but we were already doing that in the GTK 4 backend.
Backporting
===========
The design here has made an attempt to make it possible to backport by
keeping GdkMacosBuffer, GdkMacosLayer, and GdkMacosTile fairly
independent. There may be an opportunity to integrate this into GTK 3's
quartz backend with a fair bit of work. Doing so could improve the
situation for applications which are damage-rich such as The GIMP.
There are situations where our "default framebuffer" is not actually
zero, yet we still want to apply a scissor rect.
Generally, 0 is the default framebuffer. But on platforms where we need
to bind a platform-specific feature to a GL_FRAMEBUFFER, we might have a
default that is not 0. For example, on macOS we bind an IOSurfaceRef to
a GL_TEXTURE_RECTANGLE which then is assigned as the backing store for a
framebuffer. This is different than using gsk_gl_renderer_render_texture()
in that we don't want to incur an extra copy to the destination surface
nor do we even have a way to pass a texture_id into render_texture().
Previously, the popover would cause the window to go into the :backdrop
state which is not what we want for consistency with other platforms. This
fixes that by walking up the surface chain when we get notified of
losing or acquiring "key" input from the display server.
We might have panels with controls in them where the window is running in
another process. The control could have a wrapper window which we would
see from this process. This can happen with the GtkFileChooserNative, but
any NSSavePanel in macOS 10.15+ is out of process (not just sandboxed
applications).
This significantly cleans up how we handle various move-resize, compute-
size, and configure (notification of changes) in the macOS GDK backend.
Originally when prototyping this backend, there were some bits that came
over from the quartz backend and some bits which did not. It got confusing
and so this makes an attempt to knock down all that technical debt.
It is much simpler now in that the GdkMacosSurface makes requests of the
GdkMacosWindow, and the GdkMacosWindow notifies the GdkMacosSurface of
changes that happen.
User resizes are delayed until the next compute-size so that we are much
closer to the layout phase, reducing chances for in-between frames.
This also improves the situation of leaving maximized state so that a
grab and drag feels like you'd expect on other platforms.
I removed the opacity hack we had in before, because that is all coming
out anyway and it's a bit obnoxious to maintain through the async flows
here.
This fixes GTK's NSWindow for toplevels so that they are allowed to enter
fullscreen. We were already handling the state transitions from the
setStyleMask: helper, but we didn't previously tell the window we are
allowed to transition into that.
There is a bit of a mismatch here in that GTK doesn't have any such flag
that determines if a window is "allowed" by policy to enter fullscreen
since window managers on Linux are free to do that at will.
This more than halves the total runtime of this function since the
previous commit, from 8.36% to 4.02%, and is most likely memory
bandwidth limited on this specific board now.
I tried to do an SSE2 version as well, but couldn’t find any equivalent
of the LD4/ST4 ARM instruction.
On x86 on a Kaby Lake CPU, this makes it go from 6.63% of the total
execution time (loading some PNGs using the cairo backend) down to
3.20%.
On ARM on a Cortex-A7, on the same workload, this makes it go from 57%
to 8.36%.
We want our tracking area to be limited to the input region so that we
don't pass along events outside of it for the window. This improves the
chances that we can click out of a popover with a large shadow.
This still doesn't let us pass-through clicks for large shadows on top-
level windows though.
We should only be asserting in static functions. Furthermore, this function
did not need to have GDK_BEGIN_MACOS_ALLOC_POOL as nothing is being
allocated there which would cause pooling to get used.
This needs to handle the boundary case where the value is exactly equal
to the edge of a rectangle (which gdk_rectangle_contains_point() does not
consider to be containing). However, if there is a monitor in the list
that is a better match, we still want to prefer it.
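One way to express that preference, as a sketch only: return
immediately for a monitor that contains the point under the half-open
test, and otherwise remember the first monitor whose edge the point
lies on. The Rect type and function names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rectangle type, for illustration only. */
typedef struct { int x, y, width, height; } Rect;

/* Half-open test: the right and bottom edges are not contained. */
bool
contains_point (Rect r, int x, int y)
{
  return x >= r.x && x < r.x + r.width &&
         y >= r.y && y < r.y + r.height;
}

/* Inclusive test: the right and bottom edges also count. */
bool
touches_point (Rect r, int x, int y)
{
  return x >= r.x && x <= r.x + r.width &&
         y >= r.y && y <= r.y + r.height;
}

/* Return the index of the best monitor for (x, y), or -1 if none.
 * A monitor that really contains the point wins over one that the
 * point merely touches on its far edge. */
int
find_monitor (const Rect *monitors, size_t n_monitors, int x, int y)
{
  int fallback = -1;

  assert (monitors != NULL || n_monitors == 0);

  for (size_t i = 0; i < n_monitors; i++)
    {
      if (contains_point (monitors[i], x, y))
        return (int) i;
      if (fallback < 0 && touches_point (monitors[i], x, y))
        fallback = (int) i;
    }

  return fallback;
}
```

With two side-by-side monitors, a point on the shared edge resolves to
the right-hand monitor, while a point on the outermost edge still
resolves to the monitor it touches instead of to nothing.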
When using an external mouse on macOS, the scrolling behavior is
reversed from the user's scrolling preference. Additionally, it is
noticeably sluggish.
This commit fixes both issues by negating the deltas and multiplying
them by 32 before constructing a new scroll event. 32 seems to be the
"traditional" scaling factor according to [Druid], but I'm not sure
where that value actually comes from. Regardless, scaling the deltas by
this amount makes scrolling feel a lot more responsive in the GTK demos.
Scrolling with a trackpad is not affected by either issue because it
triggers a different code path that uses more precise deltas, and
already negates them.
[Druid]: https://linebender.gitbook.io/linebender-graphics-wiki/mouse-wheel#external-mouse-wheel-vs-trackpad
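The delta adjustment described above boils down to the following
sketch. ScrollDelta and adjust_wheel_delta() are hypothetical names;
only the negation and the factor of 32 come from the commit.

```c
#include <assert.h>
#include <math.h>

#define WHEEL_SCALE 32.0 /* the "traditional" factor mentioned above */

/* Hypothetical pair of scroll deltas, for illustration only. */
typedef struct { double dx, dy; } ScrollDelta;

/* Negate and scale raw deltas from an external mouse wheel. Trackpad
 * deltas take a different code path and must not go through this. */
ScrollDelta
adjust_wheel_delta (ScrollDelta raw)
{
  ScrollDelta out;

  assert (isfinite (raw.dx) && isfinite (raw.dy));

  out.dx = -raw.dx * WHEEL_SCALE;
  out.dy = -raw.dy * WHEEL_SCALE;
  return out;
}
```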
Some Windows keymaps have bogus mappings for the Ctrl modifier. !4423 attempted
to fix this by ignoring the Ctrl layer, but that was not enough. We also need to
ignore combinations of Ctrl with other modifiers, i.e. Ctrl + Shift. For example,
Ctrl + Shift + 6 is mapped to the character 0x1E on a US keyboard (but it should
be treated as Ctrl + ^). Basically, always ignore Ctrl unless it is used in
conjunction with Alt, i.e. as part of AltGr.
Related issue: #4667
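The rule can be sketched with a small bitmask helper. The MOD_* flags
and effective_modifiers() are hypothetical stand-ins and not the GDK
Win32 keymap code; they only illustrate "drop Ctrl unless Alt is also
held".

```c
#include <assert.h>

/* Hypothetical modifier flags, for illustration only. */
enum {
  MOD_SHIFT = 1 << 0,
  MOD_CTRL  = 1 << 1,
  MOD_ALT   = 1 << 2,
  MOD_ALL   = MOD_SHIFT | MOD_CTRL | MOD_ALT
};

/* Return the modifiers to use for the keymap lookup: Ctrl is ignored
 * unless it is combined with Alt, i.e. unless it is part of AltGr. */
unsigned
effective_modifiers (unsigned mods)
{
  assert ((mods & ~(unsigned) MOD_ALL) == 0);

  if ((mods & MOD_CTRL) && !(mods & MOD_ALT))
    mods &= ~(unsigned) MOD_CTRL;

  return mods;
}
```

With this, Ctrl + Shift + 6 is looked up as Shift + 6, so the bogus
0x1E mapping is never consulted.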
`free` is defined in `stdlib.h`, see for example
<https://pubs.opengroup.org/onlinepubs/009604499/functions/free.html>. Without
this include compilation can fail with the following error:
```
../gdk/loaders/gdkjpeg.c: In function ‘gdk_save_jpeg’:
../gdk/loaders/gdkjpeg.c:264:7: warning: implicit declaration of function ‘free’ [-Wimplicit-function-declaration]
free (data);
^
../gdk/loaders/gdkjpeg.c:264:7: warning: incompatible implicit declaration of built-in function ‘free’
../gdk/loaders/gdkjpeg.c:264:7: note: include ‘<stdlib.h>’ or provide a declaration of ‘free’
../gdk/loaders/gdkjpeg.c:302:67: error: ‘free’ undeclared (first use in this function)
return g_bytes_new_with_free_func (data, size, (GDestroyNotify) free, NULL);
^
../gdk/loaders/gdkjpeg.c:302:67: note: each undeclared identifier is reported only once for each function it appears in
../gdk/loaders/gdkjpeg.c:303:1: warning: control reaches end of non-void function [-Wreturn-type]
}
^
```
We don't want to risk having something really weird come out if we have a
WCG colorspace, so instead only do the performance hack on systems where
the output is likely reasonable.
We will want to eventually just be drawing in the appropriate colorspace,
but that is not available yet.