skia2/bazel/rbe
Kevin Lubick c4872ce644 [bazel] Add support for Macs to make Linux RBE builds
The big change here is having the C++ toolchain use
Bazel platforms instead of the C++ specific flags/setup.
In Bazel, platforms are a general purpose way to define
things like os, cpu architecture, etc. We were not using
platforms previously, because the best documentation at
the time focused on the old ways.

However, the old ways were clumsy/difficult when trying
to manage cross-compilation, specifically when trying
to have a Mac host trigger a build on our Linux RBE
system targeting a Linux x64 system. Thus, rather than
keep investing in the legacy system, this CL migrates
us to using platforms where possible.

Suggested background reading to better understand this CL:
 - https://bazel.build/concepts/platforms-intro
 - https://bazel.build/docs/platforms
 - https://bazel.build/docs/toolchains#registering-building-toolchains

The hermetic toolchain itself is not changing in this CL
(and likely does not need to), only how we tell Bazel
about it (i.e. registering it) and how Bazel decides
to use it (i.e. resolving toolchains).

Here is my understanding of how platforms and toolchains
interact (supported by some evidence from [1][2])
 - Bazel needs to resolve platforms for the Host, Execution,
   and Target.
   - If not specified via flags, these are the machine from
     which Bazel is invoked, aka "@local_config_platform//:host".
   - With this CL, the Host could be a Mac laptop, the Execution
     platform is our Linux RBE pool, and the Target is "a Linux
     system with a x64 CPU"
 - To specify the Host, that is, describe to Bazel the
   capabilities of the system it is running on, one can
   set --host_platform [3] with a label pointing to a platform()
   containing the appropriate settings. Tip: have this
   platform inherit from @local_config_platform//:host
   so it can add to any of the constraint_settings and
   constraint_values that Bazel deduces automatically.
 - To specify the Target platform(s), that is, the system
   on which a final output resides and can execute, one
   can set the --platforms flag with a label referencing
   a platform().
 - Bazel will then choose an execution platform to fulfill
   that request. Bazel will look through a list of available
   platforms, which can be augmented* with the
   --extra_execution_platforms flag. Platforms specified by this
   flag take priority over the default platforms!
 - Having selected the appropriate platforms, Bazel now
   needs to select a toolchain to actually run the actions
   of the appropriate type.
 - Bazel looks through the list of available toolchains
   and finds one that "matches" the Execution and the Target
   platform. This means the toolchain's exec_compatible_with
   is a subset of the Execution platform's constraints and
   the toolchain's target_compatible_with is a subset of the
   Target platform's constraints. To register toolchains* (i.e. add
   them to the resolution list), we use --extra_toolchains.
   Once Bazel finds a match, it stops looking.
   Using --toolchain_resolution_debug=".*" makes Bazel log
   how it is resolving these toolchains and what execution
   platform it picked.

* We can also register execution platforms and toolchains in
  WORKSPACE.bazel [4], but the flags come with higher priority
  and that made resolution a bit tricky. Also, when we want
  to conditionally add them (e.g. --config=linux_rbe), we
  cannot remove them conditionally in the WORKSPACE.bazel file.
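
The matching rule in the resolution steps above can be sketched as a small Python model. This is only an illustration under simplifying assumptions: constraints are plain strings, the execution platform is already chosen, and all platform, toolchain, and constraint names are hypothetical. Bazel's real resolution also iterates over candidate execution platforms and toolchain types.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Sequence


@dataclass(frozen=True)
class Platform:
    name: str
    constraints: FrozenSet[str]


@dataclass(frozen=True)
class Toolchain:
    name: str
    exec_compatible_with: FrozenSet[str]
    target_compatible_with: FrozenSet[str]


def resolve_toolchain(
    toolchains: Sequence[Toolchain], execution: Platform, target: Platform
) -> Optional[Toolchain]:
    """Return the first toolchain whose constraints both platforms satisfy."""
    for toolchain in toolchains:
        if (toolchain.exec_compatible_with <= execution.constraints
                and toolchain.target_compatible_with <= target.constraints):
            return toolchain  # First match wins; later entries are ignored.
    return None


# Hypothetical constraint names, chosen only for this illustration.
LINUX_X64 = frozenset({"os:linux", "cpu:x86_64"})
HERMETIC_LINUX_X64 = LINUX_X64 | {"use_hermetic_toolchain"}

rbe_linux = Platform("rbe_linux", LINUX_X64)
linux_x64_hermetic = Platform("linux_x64_hermetic", HERMETIC_LINUX_X64)
plain_linux_x64 = Platform("plain_linux_x64", LINUX_X64)

toolchains = [
    Toolchain("mac_hermetic", frozenset({"os:macos"}),
              frozenset({"os:macos", "use_hermetic_toolchain"})),
    Toolchain("linux_hermetic", LINUX_X64, HERMETIC_LINUX_X64),
]

chosen = resolve_toolchain(toolchains, rbe_linux, linux_x64_hermetic)
print(chosen.name if chosen else "no match")  # linux_hermetic
```

In this model, a target platform without the extra "use_hermetic_toolchain" value matches neither toolchain, which mirrors how the extra constraint is used to force the hermetic toolchain.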

The above resolution flow directly necessitated the changes
in this CL.

Example usage of the new configs and platforms:

    # Can be run on a x64 Linux host and uses the hermetic toolchain.
    bazel build //:skia_public

    # Can be run on Mac or Linux and uses the Linux RBE system along
    # with the hermetic toolchain to compile a binary for Linux x64.
    bazel build //:skia_public --config=linux_rbe --config=for_linux_x64

    # Shorthand for above
    bazel build //:skia_public --config=for_linux_x64_with_rbe

Notice we don't have to type out --config=clang_linux anymore!
That was due to me reading the Bazel docs more carefully and
realizing we can set options for *all* Bazel build commands.

Current Limitations:
 - Targets which require a py_binary (e.g. Dawn's genrules)
   will not work on RBE when cross compiling because the
   python runtime we download is for the host machine, not
   the executor. This means //example:hello_world_dawn does
   not work on Mac when cross-compiling via linux_rbe.
 - Mac M1 linking not quite working with SkOpts settings.
   Probably need to set -target [5]

Suggested Review order:
 - toolchain/BUILD.bazel Notice how we replace
   cc_toolchain_suite with toolchain. These have the same
   role: giving Bazel the information about where a toolchain
   can run. The platforms one is more expressive (IMO), allowing
   us to say both where to run the toolchain and what it can
   make. In order to more easily force the use of our hermetic
   toolchain, but also allow the hermetic toolchain to be used
   on RBE, we specify "use_hermetic_toolchain" only on the target,
   because the RBE image does not have the hermetic toolchain
   on it by default (but can certainly run it).
 - bazel/platform/BUILD.bazel to see the custom constraint_setting
   and corresponding constraint_value. The names for both of these
   are completely arbitrary - they do not need to have any deeper
   meaning or relation to any file or Docker image or system or
   any other constraints. Think of the constraint_setting as
   an Enum and the constraint_value being the one and only member.
   We need to pass around a constant value, not a type, so we
   need to provide the constraint_value (e.g. in toolchain/BUILD.bazel)
   but not a constraint_setting. However we need a
   constraint_setting declared so we can make a constraint_value
   of that "type".
   Notice the platform declared here - it allows us to force
   Bazel to use the hermetic toolchain because of the extra
   constraint_value.
 - .bazelrc I set a few flags that will be on for all
   bazel build commands. Importantly, this causes the C++
   build logic to use platforms and not the old, bespoke way.
   I also found a way to avoid using the local toolchain on
   the host, which will hopefully lead to clearer errors
   if platforms are mis-specified instead of odd compile
   errors because the host toolchain is too old or something.
   There are also a few RBE settings tweaked to be a bit
   more modern, as well as the new shorthands for specifying
   target platforms (e.g. for_linux_x64).
 - bazel/buildrc where we have to turn off the platforms
   logic for emscripten (see https://github.com/emscripten-core/emsdk/issues/984).
 - bazel/rbe/BUILD.bazel for a fix in the platform description
   that makes it work on Mac.
 - Notice that _m1 has been removed from the mac-related toolchain
   files because the same toolchain should work on both
   architectures.
 - All other changes in any order.
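
The Enum analogy used for bazel/platform/BUILD.bazel in the list above can be made concrete with a short Python sketch. It is an analogy only, not Bazel code: each Enum class stands in for a constraint_setting, each member for a constraint_value, and all names here are hypothetical. It also models the Bazel rule that a platform may carry at most one value per setting.

```python
from enum import Enum


class Os(Enum):
    """Stands in for a constraint_setting with several constraint_values."""
    LINUX = "linux"
    MACOS = "macos"


class HermeticToolchain(Enum):
    """Stands in for a constraint_setting with a single constraint_value."""
    USE_HERMETIC_TOOLCHAIN = "use_hermetic_toolchain"


def make_platform(*values: Enum) -> dict:
    """Build a platform as a mapping from setting (Enum class) to value (member)."""
    platform = {}
    for value in values:
        setting = type(value)
        if setting in platform:
            raise ValueError(f"two values given for {setting.__name__}")
        platform[setting] = value
    return platform


def satisfies(platform: dict, required: list) -> bool:
    """True if the platform carries every required constraint value."""
    return all(platform.get(type(value)) is value for value in required)


linux_hermetic = make_platform(Os.LINUX, HermeticToolchain.USE_HERMETIC_TOOLCHAIN)
plain_linux = make_platform(Os.LINUX)
hermetic_requirements = [Os.LINUX, HermeticToolchain.USE_HERMETIC_TOOLCHAIN]

print(satisfies(linux_hermetic, hermetic_requirements))  # True
print(satisfies(plain_linux, hermetic_requirements))     # False
```

What gets passed around is always a member (the value), never the Enum class (the setting). The class exists only so the member has something to belong to.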

[1] https://bazel.build/docs/toolchains#debugging-toolchains
[2] https://bazel.build/docs/toolchains#toolchain-resolution
[3] https://bazel.build/reference/command-line-reference
[4] https://bazel.build/docs/toolchains#registering-building-toolchains
[5] 17dc3f16fc/gn/skia/BUILD.gn (L258-L271)
Change-Id: I515c114099d659639a808f74e47d489a68b7af62
Bug: skia:12541
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/549737
Reviewed-by: Erik Rose <erikrose@google.com>
Reviewed-by: Jorge Betancourt <jmbetancourt@google.com>
2022-06-23 12:00:43 +00:00
gce_linux [bazel] Run buildifier on BUILD.bazel files 2022-04-14 18:13:43 +00:00
gce_linux_container [bazel] Add RBE support using hermetic Linux Clang toolchain 2022-03-28 13:56:16 +00:00
BUILD.bazel [bazel] Add support for Macs to make Linux RBE builds 2022-06-23 12:00:43 +00:00
Makefile [bazel] Add RBE support using hermetic Linux Clang toolchain 2022-03-28 13:56:16 +00:00
README.md [bazel] Add RBE support using hermetic Linux Clang toolchain 2022-03-28 13:56:16 +00:00

RBE configurations

Some subdirectories of this folder are generated. For example, gce_linux was generated by running make generate_linux_config. Those generated files describe the C++ and Java toolchains in the RBE Docker image; these toolchains are required to run Bazel, but they are not the toolchains we use to compile our code.

We build our own bare-bones Docker image to use on RBE. We intend to use a hermetic toolchain (see //toolchain) that specifies everything necessary to compile and link Skia. Using the hermetic toolchain both on and off RBE makes the build reproducible and consistent across machines, and means the build does not require internet access (assuming the toolchain has been cached at least once). This setup has the desirable property that we do not need to rebuild and upload RBE Docker images when we change a small detail of our toolchain.

The only requirement we have of our Docker image (beyond the minimum requirements to run Bazel) is that it have sufficient runtime libraries to run our toolchain. For example, the Linux RBE image has at least glibc 2.32, which is the current minimum requirement of the Linux binaries in our toolchain. The same requirement applies to any developer who builds Skia locally using Bazel.

Getting rbe_configs_gen

We suggest downloading a prebuilt rbe_configs_gen binary from GitHub and putting it on your PATH.

Creating/Updating the RBE image

In accordance with SLSA level 1, we want a scripted way of building our image that specifies exactly what artifacts are in it. To accommodate this, we specify the exact sha256 hash of the base Docker image we use and the exact versions of the packages we install on top of that. If we need to add a package or update things, it is best to build the image without these qualifiers to see what was actually used, and then respecify them so that anyone who builds the Docker image again is likely to get the same image.

This process is:

  1. Modify the appropriate Dockerfile (e.g. gce_linux_container/Dockerfile) to not have the version or hash qualifiers. Also increment the appropriate VERSION variable in Makefile.
  2. Add any new packages or make any changes.
  3. Run make build_linux_container to build the image locally. One may verify it works by running something like docker run -it gcr.io/skia-public/rbe_linux:v2 /bin/bash.
  4. Note the versions and base image hash that were used. Modify the Dockerfile to use these.
  5. Run make push_linux_container to rebuild the container and push it to GCS where it can be used by our RBE workers. Note the sha256 hash of the created container.
  6. Modify the appropriate generate step in Makefile (e.g. generate_linux_config) to refer to the correct toolchain_container. Then, run that step.
  7. Modify the RBE platform in ./BUILD.bazel to refer to the new container_image.

We chose not to use Bazel rules for this container step, as that could be difficult to bootstrap without Bazel already set up. Additionally, Make is a simple and sufficient way to script the steps for SLSA purposes.
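
The pinning described in the steps above (step 4 in particular) could be sanity-checked with a small script like the following sketch. It is not part of the repository. It assumes the Dockerfile installs packages with apt-get install and pins them as name=version; the package names, versions, and digest below are placeholders.

```python
import re


def unpinned_items(dockerfile_text: str) -> list:
    """Return base images without a sha256 digest and apt packages without a version."""
    # Join backslash-continued lines so each instruction is one logical line.
    logical = re.sub(r"\\\s*\n", " ", dockerfile_text)
    problems = []
    for line in logical.splitlines():
        words = line.split()
        if len(words) < 2:
            continue
        if words[0].upper() == "FROM":
            if not re.fullmatch(r"\S+@sha256:[0-9a-f]{64}", words[1]):
                problems.append(words[1])
        elif "apt-get" in words and "install" in words:
            for word in words[words.index("install") + 1:]:
                if word in ("&&", ";"):
                    break  # End of the install command.
                if not word.startswith("-") and "=" not in word:
                    problems.append(word)
    return problems


# Placeholder Dockerfile contents; not the real image definition.
PINNED = "\n".join([
    "FROM debian@sha256:" + "a" * 64,
    "RUN apt-get update && apt-get install -y \\",
    "    libfoo=1.2.3 \\",
    "    libbar=4.5",
])
UNPINNED = "\n".join([
    "FROM debian:bookworm",
    "RUN apt-get update && apt-get install -y libfoo",
])

print(unpinned_items(PINNED))    # []
print(unpinned_items(UNPINNED))  # ['debian:bookworm', 'libfoo']
```

Running such a check before make push_linux_container would catch a Dockerfile that was left in its temporarily unpinned state from step 1.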

Defining our own Bazel RBE platforms

While the generated files do have a platform we can use (e.g. //bazel/rbe/gce_linux/config:platform), we do not use it because we cannot easily customize it without a risk that the changes will be lost when we update the image. Thankfully, we can specify our own platforms, which we do in ./BUILD.bazel.

More details

https://docs.google.com/document/d/14xMZCKews69SSTfULhE8HDUzT5XvPwZ4CvRufEvcZ74/edit

RBE Metrics

http://go/skia-rbe-metrics