mirror of
https://gitlab.gnome.org/GNOME/gtk.git
synced 2024-11-14 12:41:07 +00:00
356 lines
9.9 KiB
Plaintext
General ideas of Pixops
=======================

 - Gain speed by special-casing the common case, and using
   generic code to handle the uncommon case.

 - Most of the time in scaling an image is in the center;
   however code that can handle edges properly is slow
   because it needs to deal with the possibility of running
   off the edge. So make the fast case code only handle
   the centers, and use generic, slow, code for the edges.

Structure of Pixops
===================

The code of pixops can roughly be grouped into four parts:

 - Filter computation functions

 - Functions for scaling or compositing lines and pixels
   using precomputed filters

 - pixops_process, the central driver that iterates through
   the image calling pixel or line functions as necessary

 - Wrapper functions (pixops_scale/composite/composite_color)
   that compute the filter, choose the line and pixel functions
   and then call pixops_process with the filter, line,
   and pixel functions.


pixops_process is a pretty scary looking function:

static void
pixops_process (guchar         *dest_buf,
                int             render_x0,
                int             render_y0,
                int             render_x1,
                int             render_y1,
                int             dest_rowstride,
                int             dest_channels,
                gboolean        dest_has_alpha,
                const guchar   *src_buf,
                int             src_width,
                int             src_height,
                int             src_rowstride,
                int             src_channels,
                gboolean        src_has_alpha,
                double          scale_x,
                double          scale_y,
                int             check_x,
                int             check_y,
                int             check_size,
                guint32         color1,
                guint32         color2,
                PixopsFilter   *filter,
                PixopsLineFunc  line_func,
                PixopsPixelFunc pixel_func)

(Some of the arguments should be moved into structures. It's basically
"all the arguments to pixops_composite_color plus three more".) The
arguments can be divided up into:


Information about the destination buffer

   guchar *dest_buf, int dest_rowstride, int dest_channels, gboolean dest_has_alpha,

Information about the source buffer

   const guchar *src_buf, int src_rowstride, int src_channels, gboolean src_has_alpha,
   int src_width, int src_height,

Information on how to scale the source buf and the region of the scaled source
to render onto the destination buffer

   int render_x0, int render_y0, int render_x1, int render_y1,
   double scale_x, double scale_y,

Information about a constant color or check pattern onto which to composite

   int check_x, int check_y, int check_size, guint32 color1, guint32 color2,

Information precomputed to use during the scale operation

   PixopsFilter *filter, PixopsLineFunc line_func, PixopsPixelFunc pixel_func


Filter computation
==================

The PixopsFilter structure looks like:

struct _PixopsFilter
{
  int *weights;
  int n_x;
  int n_y;
  double x_offset;
  double y_offset;
};


'weights' is an array of size:

  weights[SUBSAMPLE][SUBSAMPLE][n_x][n_y]

SUBSAMPLE is a constant - currently 16 in pixops.c.


In order to compute a scaled destination pixel we convolve
an array of n_x by n_y source pixels with one of
the SUBSAMPLE * SUBSAMPLE filter matrices stored
in weights. The choice of filter matrix is determined
by the fractional part of the source location.

To compute dest[i,j] we do the following:

  x = i * scale_x + x_offset;
  y = j * scale_y + y_offset;
  x_int = floor(x)
  y_int = floor(y)

  C = weights[SUBSAMPLE*(x - x_int)][SUBSAMPLE*(y - y_int)]
  total = sum[l=0..n_x-1, m=0..n_y-1] (C[l,m] * src[x_int + l, y_int + m])

The filter weights are integers scaled so that the total of the
weights in each filter matrix is equal to 65536.

When the source does not have alpha, we simply compute each channel
as above, so total is in the range [0,255*65536]

  dest = total / 65536

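The steps above can be sketched in C for a single channel without alpha.
This is only an illustration, not the code in pixops.c: the function names,
the flat [x_phase][y_phase][l][m] layout of the weights array and the
single-channel source layout are all assumptions made for the example, and
the caller is assumed to keep every source read inside the image.

```c
#define SUBSAMPLE 16 /* value quoted above from pixops.c */

/* Floor of v as an int, written out so that <math.h> is not needed. */
static int
sketch_floor (double v)
{
  int t = (int) v;
  return (v < t) ? t - 1 : t;
}

/*
 * Compute dest[i,j] for one channel of a single-channel source image.
 * ASSUMPTION: weights is flat, indexed [x_phase][y_phase][l][m] with
 * l in 0..n_x-1 and m in 0..n_y-1, and the weights are non-negative.
 */
static unsigned char
sketch_scale_channel (const int *weights, int n_x, int n_y,
                      double x_offset, double y_offset,
                      const unsigned char *src, int src_rowstride,
                      double scale_x, double scale_y,
                      int i, int j)
{
  double x = i * scale_x + x_offset;
  double y = j * scale_y + y_offset;
  int x_int = sketch_floor (x);
  int y_int = sketch_floor (y);

  /* The fractional part of the source location selects the filter matrix. */
  int x_phase = (int) (SUBSAMPLE * (x - x_int));
  int y_phase = (int) (SUBSAMPLE * (y - y_int));
  const int *C = weights + (x_phase * SUBSAMPLE + y_phase) * n_x * n_y;

  int total = 0; /* at most 255 * 65536, so an int is wide enough */
  int l, m;

  for (l = 0; l < n_x; l++)
    for (m = 0; m < n_y; m++)
      total += C[l * n_y + m] * src[(y_int + m) * src_rowstride + (x_int + l)];

  return (unsigned char) (total / 65536);
}
```

With n_x = n_y = 2 and every weight set to 16384, each destination pixel
comes out as the average of a 2 by 2 block of source pixels.
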
When the source does have alpha, then we need to compute using
"pre-multiplied alpha":

  a_total = sum (C[l,m] * src_a[x_int + l, y_int + m])
  c_total = sum (C[l,m] * src_a[x_int + l, y_int + m] * src_c[x_int + l, y_int + m])

This gives us a result for c_total in the range of [0,255*a_total]

  c_dest = c_total / a_total


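A minimal sketch of the premultiplied-alpha sums, assuming RGBA source
pixels stored as four consecutive bytes and a single filter matrix C that
has already been selected. The helper name, the C[l * n_y + m] layout, the
handling of a_total == 0 and the a_total / 65536 destination alpha are
assumptions for the example, not taken from pixops.c.

```c
#include <stdint.h>

/*
 * Scale one RGBA pixel.  src points at source pixel (x_int, y_int).
 * ASSUMPTION: the weights in C are non-negative and sum to 65536, so
 * a_total <= 255 * 65536 and c_total <= 255 * 255 * 65536, which
 * both fit in 32 unsigned bits.
 */
static void
sketch_scale_rgba (const int *C, int n_x, int n_y,
                   const unsigned char *src, int src_rowstride,
                   unsigned char dest[4])
{
  uint32_t a_total = 0;
  uint32_t c_total[3] = { 0, 0, 0 };
  int l, m, c;

  for (l = 0; l < n_x; l++)
    for (m = 0; m < n_y; m++)
      {
        const unsigned char *p = src + m * src_rowstride + l * 4;
        uint32_t wa = (uint32_t) C[l * n_y + m] * p[3]; /* weight * alpha */

        a_total += wa;
        for (c = 0; c < 3; c++)
          c_total[c] += wa * p[c];
      }

  if (a_total == 0)
    {
      /* Fully transparent result: the color is undefined, so store zero. */
      dest[0] = dest[1] = dest[2] = dest[3] = 0;
      return;
    }

  for (c = 0; c < 3; c++)
    dest[c] = (unsigned char) (c_total[c] / a_total);
  dest[3] = (unsigned char) (a_total / 65536);
}
```

Averaging an opaque red pixel with a fully transparent blue one gives pure
red at roughly half alpha. The transparent pixel contributes nothing to the
color, which is the reason for weighting each channel by alpha.
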
Mathematical aside:

The process of producing a destination filter consists
of:

 - Producing a continuous approximation to the source
   image via interpolation.

 - Sampling that continuous approximation with a filter.

This is representable as:

 S(x,y) = sum[i=-inf,inf; j=-inf,inf] A(frac(x),frac(y))[i,j] * S[floor(x)+i,floor(y)+j]

 D[i,j] = Integral(s=-inf,inf; t=-inf,inf) B(i+s,j+t) S((i+s)/scale_x,(j+t)/scale_y)

By reordering the sums and integrals, you get something of the form:

 D[i,j] = sum[l=-inf,inf; m=-inf,inf] C[l,m] S[i+l,j+m]

The arrays in weights are the C[l,m] above, and are thus
determined by the interpolating algorithm in use and the
sampling filter:

                                   INTERPOLATE          SAMPLE
 ART_FILTER_NEAREST                nearest neighbour    point
 ART_FILTER_TILES                  nearest neighbour    box
 ART_FILTER_BILINEAR (scale < 1)   nearest neighbour    box
 ART_FILTER_BILINEAR (scale > 1)   bilinear             point
 ART_FILTER_HYPER                  bilinear             box


Pixel Functions
===============

typedef void (*PixopsPixelFunc) (guchar *dest, int dest_x, int dest_channels, int dest_has_alpha,
                                 int src_has_alpha,
                                 int check_size, guint32 color1, guint32 color2,
                                 int r, int g, int b, int a);

The arguments here are:

 dest: location to store the output pixel
 dest_x: x coordinate of destination (for handling checks)
 dest_has_alpha, dest_channels: Information about the destination pixbuf
 src_has_alpha: Information about the source pixbuf

 check_size, color1, color2: Information for color background for composite_color variant

 r,g,b,a - scaled red, green, blue and alpha

 r,g,b are premultiplied by alpha.

 a is in [0,65536*255]
 r is in [0,255*a]
 g is in [0,255*a]
 b is in [0,255*a]

 If src_has_alpha is false, then a will be 65536*255, allowing optimization.


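As an illustration of how a pixel function for plain scaling might turn
these totals back into bytes, here is a hedged sketch. It is not the
function from pixops.c: the name is made up, the check and color arguments
are dropped, and r, g, b, a are taken as uint32_t rather than int because
255 * a can exceed the range of a signed 32 bit integer.

```c
#include <stdint.h>

/*
 * Store one scaled pixel.  r, g, b are premultiplied totals in
 * [0, 255 * a] and a is in [0, 65536 * 255], as described above.
 */
static void
sketch_scale_pixel (unsigned char *dest, int dest_has_alpha,
                    int src_has_alpha,
                    uint32_t r, uint32_t g, uint32_t b, uint32_t a)
{
  unsigned char alpha;

  if (src_has_alpha)
    {
      if (a == 0)
        {
          /* Nothing visible was accumulated; avoid dividing by zero. */
          dest[0] = dest[1] = dest[2] = 0;
        }
      else
        {
          dest[0] = (unsigned char) (r / a);
          dest[1] = (unsigned char) (g / a);
          dest[2] = (unsigned char) (b / a);
        }
      alpha = (unsigned char) (a >> 16); /* a / 65536 */
    }
  else
    {
      /* a is known to be 65536 * 255, so the divisor is a constant. */
      const uint32_t opaque = 65536u * 255u;

      dest[0] = (unsigned char) (r / opaque);
      dest[1] = (unsigned char) (g / opaque);
      dest[2] = (unsigned char) (b / opaque);
      alpha = 0xff;
    }

  if (dest_has_alpha)
    dest[3] = alpha;
}
```

The branch for sources without alpha shows the optimization mentioned
above: the per-pixel division by a becomes a division by a compile-time
constant.
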
Line functions
==============

typedef guchar *(*PixopsLineFunc) (int *weights, int n_x, int n_y,
                                   guchar *dest, int dest_x, guchar *dest_end, int dest_channels, int dest_has_alpha,
                                   guchar **src, int src_channels, gboolean src_has_alpha,
                                   int x_init, int x_step, int src_width,
                                   int check_size, guint32 color1, guint32 color2);

The arguments are:

 weights, n_x, n_y

   Filter weights for this row - dimensions weights[SUBSAMPLE][n_x][n_y]

 dest, dest_x, dest_end, dest_channels, dest_has_alpha

   The destination buffer. The function will start writing into *dest and
   increment by dest_channels, until dest == dest_end. Reading from
   src for these pixels is guaranteed not to go outside of the
   buffer bounds.

 src, src_channels, src_has_alpha

   src[n_y] - an array of pointers to the start of the source rows
   for each filter coordinate.

 x_init, x_step

   Information about x positions in source image.

 src_width - unused

 check_size, color1, color2: Information for color background for composite_color variant

 The total for the destination pixel at dest + i is given by

   SUM (l=0..n_x - 1, m=0..n_y - 1)
     src[m][((x_init + i * x_step) >> SCALE_SHIFT) + l] * weights[m][l]


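The sum above can be sketched as a single-channel line function. This is a
simplified illustration under several assumptions: SCALE_SHIFT is taken to
be 16, one filter matrix weights[m][l] (stored flat as
weights[m * n_x + l]) is used for every pixel instead of picking one per
subsample position, and the check/color arguments are omitted. The function
name is hypothetical.

```c
#define SCALE_SHIFT 16 /* ASSUMPTION: 16.16 fixed-point x positions */

/*
 * Fill dest .. dest_end with scaled pixels from the n_y source rows in
 * src, and return the position after the last pixel written.  The
 * caller guarantees that no source read leaves the row buffers.
 */
static unsigned char *
sketch_scale_line (const int *weights, int n_x, int n_y,
                   unsigned char *dest, unsigned char *dest_end,
                   const unsigned char **src,
                   int x_init, int x_step)
{
  int x = x_init;

  while (dest < dest_end)
    {
      int x_src = x >> SCALE_SHIFT; /* integer part of the source x */
      int total = 0;
      int l, m;

      for (m = 0; m < n_y; m++)
        for (l = 0; l < n_x; l++)
          total += src[m][x_src + l] * weights[m * n_x + l];

      *dest++ = (unsigned char) (total / 65536);
      x += x_step;
    }

  return dest;
}
```

Stepping x in fixed point keeps the inner loop free of floating point
arithmetic, and the absence of bounds checks is what lets a line function
serve as the fast path for the center of the image.
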
Algorithms for compositing
==========================

Compositing alpha on non alpha:

 R = As * Rs + (1 - As) * Rd
 G = As * Gs + (1 - As) * Gd
 B = As * Bs + (1 - As) * Bd

This can be regrouped as:

 Cd + As * (Cs - Cd)

Compositing alpha on alpha:

 A = As + (1 - As) * Ad
 R = (As * Rs + (1 - As) * Rd * Ad) / A
 G = (As * Gs + (1 - As) * Gd * Ad) / A
 B = (As * Bs + (1 - As) * Bd * Ad) / A

The way to think of this is in terms of the "area":

The final pixel is composed of area As of the source pixel
and (1 - As) * Ad of the target pixel. So the final pixel
is a weighted average with those weights.

Note that the weights do not add up to one - hence the
non-constant division.


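The alpha on alpha formulas can be written directly in floating point. The
sketch below uses a hypothetical SketchColor type with non-premultiplied
components in [0,1]. It illustrates the "area" weighting only, and is not
the integer code that pixops uses.

```c
/* Hypothetical color type: non-premultiplied components in [0, 1]. */
typedef struct
{
  double r, g, b, a;
} SketchColor;

/* Composite source s over destination d, both with alpha. */
static SketchColor
sketch_composite_over (SketchColor s, SketchColor d)
{
  SketchColor out;
  double w_s = s.a;               /* area covered by the source       */
  double w_d = (1.0 - s.a) * d.a; /* remaining area showing the dest  */

  out.a = w_s + w_d;
  if (out.a == 0.0)
    {
      /* Both inputs are fully transparent; the color is arbitrary. */
      out.r = out.g = out.b = 0.0;
      return out;
    }

  /* The weights do not sum to one, so divide by their sum. */
  out.r = (w_s * s.r + w_d * d.r) / out.a;
  out.g = (w_s * s.g + w_d * d.g) / out.a;
  out.b = (w_s * s.b + w_d * d.b) / out.a;
  return out;
}
```

When d.a is 1 the sum of the weights is 1 and the result reduces to the
alpha on non alpha case, Cd + As * (Cs - Cd).
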
Integer tricks for compositing
==============================



MMX Code
========

Line functions are provided as MMX functions for a few special
cases:

 n_x = n_y = 2

   src_channels = 3              dest_channels = 3             op = scale
   src_channels = 4 with alpha   dest_channels = 4 no alpha    op = composite
   src_channels = 4 with alpha   dest_channels = 4 no alpha    op = composite_color

For the case n_x = n_y = 2 - primarily hit when scaling up with bilinear
scaling, we can take advantage of the fact that multiple destination
pixels will be composed from the same source pixels.

That is, a destination pixel is a linear combination of the source
pixels around it:


  S0                     S1




        D  D' D'' ...




  S2                     S3

Each MMX register is 64 bits wide, so we can unpack a source pixel
into the low 8 bits of each of four 16 bit words, and store it in an MMX
register.

For each destination pixel, we first make sure that we have pixels S0
... S3 loaded into registers mm0 ... mm3. (This will often involve not
doing anything, or moving mm1 and mm3 into mm0 and mm2 and then reloading
mm1 and mm3 with new values.)

Then we load up the appropriate weights for the 4 corner pixels
based on the offsets of the destination pixel within the source
pixels.

We have preexpanded the weights to 64 bits wide and truncated the
range to 8 bits, so an original filter value of

 0x5321 would be expanded to

 0x0053005300530053

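The expansion in this example can be sketched in portable C. The helper
name is hypothetical, and the sketch assumes the original filter value
fits in 16 bits, as 0x5321 does.

```c
#include <stdint.h>

/*
 * Keep the top 8 bits of a 16 bit filter weight and replicate them
 * into the low byte of each of the four 16 bit words of a 64 bit value.
 */
static uint64_t
sketch_expand_weight (uint16_t weight)
{
  uint64_t w8 = (uint64_t) (weight >> 8); /* truncate the range to 8 bits */

  return w8 | (w8 << 16) | (w8 << 32) | (w8 << 48);
}
```

Having the same weight in all four words lets one packed multiply scale
every channel of an unpacked pixel at once.
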
For source buffers without alpha, we simply do a multiply-add
of the weights, giving us a 16 bit quantity for the result
that we shift right by 8 and store in the destination buffer.

When the source buffer has alpha, then things become more
complicated - when we load up mm0 ... mm3, we premultiply
the alpha, so they contain:

 (a*0xff >> 8) (r*a >> 8) (g*a >> 8) (b*a >> 8)

Then when we multiply by the weights, and add we end up
with premultiplied r,g,b,a in the range of 0 .. 0xff * 0xff,
call them A,R,G,B

We then need to composite with the dest pixels - which
we do by:

 r_dest = (R + ((0xff * 0xff - A) >> 8) * r_dest) >> 8

 (0xff * 0xff)