[wasm] Syntax- and Type-aware Fuzzer

This is the beginning of a new fuzzer that generates
correct-by-construction Wasm modules. This should allow us to better
exercise the compiler and correctness aspects of fuzzing. It is based on
ahaas' original Wasm fuzzer.

At the moment, it can generate expressions made up of most binops, and
also nested blocks with unconditional breaks. Future CLs will add
additional constructs, such as br_if, loops, memory access, etc.

The fuzzer starts with an array of arbitrary data provided by libFuzzer
and uses that data to generate an expression, taking care to make use of
the entire string. The generator has a set of grammar-like rules for how
to construct an expression of a given type. For example, an i32 can be
made by adding two other i32s, or by wrapping an i64. The process then
continues recursively until all the data is consumed.
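
As a rough illustration of such grammar-like rules (not the fuzzer's actual code), the sketch below keeps one table of rules per result type, where each rule names the operand types it needs, and maps a selector byte onto that table. The names `Type`, `Rule`, `RulesFor`, and `SelectRule` are hypothetical, and the opcode strings are current Wasm text-format names chosen for readability.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical value types; the real fuzzer also covers f32 and f64.
enum class Type { I32, I64 };

// One grammar rule: an opcode plus the operand types it requires.
struct Rule {
  std::string opcode;
  std::vector<Type> args;
};

// Returns the rules that produce a value of the requested type.
inline const std::vector<Rule>& RulesFor(Type result) {
  static const std::vector<Rule> i32_rules = {
      {"i32.add", {Type::I32, Type::I32}},  // i32 from two i32s
      {"i32.wrap_i64", {Type::I64}},        // i32 by wrapping an i64
  };
  static const std::vector<Rule> i64_rules = {
      {"i64.add", {Type::I64, Type::I64}},
      {"i64.extend_i32_s", {Type::I32}},
  };
  return result == Type::I32 ? i32_rules : i64_rules;
}

// Reduces the selector modulo the table size, so every byte picks a rule.
inline const Rule& SelectRule(Type result, uint8_t selector) {
  const std::vector<Rule>& rules = RulesFor(result);
  assert(!rules.empty());
  return rules[selector % rules.size()];
}
```

Because the selector is reduced modulo the table size, no input byte is ever rejected; `SelectRule(Type::I32, 1)` picks the wrapping rule, whose single operand must then be generated as an i64.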

We generate an expression from a slice of data as follows:
* If the slice is no longer than the size of the type (e.g. 4 bytes for
  i32), then it will emit the entire slice as a constant.
* Otherwise, it will consume the first 4 bytes of the slice and use
  this to select which rule to apply. Each rule then consumes the
  remainder of the slice in an appropriate way. For example:
  * Unary ops use the remainder of the slice to generate the argument.
  * Binary ops consume another four bytes and mod this with the length
    of the remaining slice to split the slice into two parts. Each of
    these subslices is then used to generate one of the arguments to
    the binop.
  * Blocks are basically like a unary op, but a stack of block types is
    maintained to facilitate branches. For blocks that end in a break,
    the first four bytes of a slice are used to select the break depth
    and the stack determines what type of expression to generate.
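
The steps above can be sketched as a small stand-alone generator that emits Wasm text instead of bytecode. This is a simplified illustration under stated assumptions, not the fuzzer's implementation: `Slice`, `Take`, `Split`, and `GenerateI32` are hypothetical names, only i32 is modelled (so the stack of block types reduces to a count of enclosing blocks), and bytes are assembled least-significant first so the result does not depend on host endianness.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical view over the not-yet-consumed fuzzer input.
struct Slice {
  const uint8_t* data;
  size_t size;

  // Consumes up to 4 bytes; missing bytes are treated as zero.
  uint32_t Take() {
    uint32_t value = 0;
    const size_t n = std::min(sizeof(value), size);
    for (size_t i = 0; i < n; ++i) {
      value |= static_cast<uint32_t>(data[i]) << (8 * i);
    }
    data += n;
    size -= n;
    return value;
  }

  // Consumes a split point, then divides the rest into two sub-slices.
  std::pair<Slice, Slice> Split() {
    const uint32_t raw = Take();
    const size_t index = size > 0 ? raw % size : 0;
    return {Slice{data, index}, Slice{data + index, size - index}};
  }
};

// Generates an i32 expression; `blocks` counts the enclosing blocks.
inline std::string GenerateI32(Slice s, int blocks = 0) {
  assert(blocks >= 0);
  if (s.size <= sizeof(uint32_t)) {
    // Short slice: emit everything that is left as a constant.
    return "(i32.const " + std::to_string(s.Take()) + ")";
  }
  switch (s.Take() % 4) {  // The first bytes select the rule.
    case 0:  // Unary op: the remainder generates the single argument.
      return "(i32.clz " + GenerateI32(s, blocks) + ")";
    case 1: {  // Binary op: split the remainder between both arguments.
      std::pair<Slice, Slice> parts = s.Split();
      const std::string lhs = GenerateI32(parts.first, blocks);
      const std::string rhs = GenerateI32(parts.second, blocks);
      return "(i32.add " + lhs + " " + rhs + ")";
    }
    case 2:  // Block: like a unary op, one level deeper.
      return "(block (result i32) " + GenerateI32(s, blocks + 1) + ")";
    default: {  // Block ending in a break to one of the enclosing blocks.
      const uint32_t depth = s.Take() % static_cast<uint32_t>(blocks + 1);
      const std::string value = GenerateI32(s, blocks + 1);
      return "(block (result i32) (br " + std::to_string(depth) + " " +
             value + "))";
    }
  }
}
```

For instance, the five bytes `00 00 00 00 05` select the unary rule and leave one byte for its argument, giving `(i32.clz (i32.const 5))`. Every rule consumes whatever remains of its slice, so any input, including an empty one, yields a well-typed expression.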

The goal is that once this generator is complete, it will provide a
one-to-one mapping between binary strings and valid Wasm modules.
Review-Url: https://codereview.chromium.org/2658723006
Cr-Commit-Position: refs/heads/master@{#43289}
2017-02-17 17:06:29 +00:00

// Copyright 2017 the V8 project authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#include <algorithm>

#include "include/v8.h"
#include "src/isolate.h"
#include "src/objects-inl.h"
#include "src/objects.h"
#include "src/ostreams.h"
#include "src/wasm/wasm-interpreter.h"
#include "src/wasm/wasm-module-builder.h"
#include "src/wasm/wasm-module.h"
#include "test/common/wasm/test-signatures.h"
#include "test/common/wasm/wasm-module-runner.h"
#include "test/fuzzer/fuzzer-support.h"
#include "test/fuzzer/wasm-fuzzer-common.h"

typedef uint8_t byte;

using namespace v8::internal;
using namespace v8::internal::wasm;
using namespace v8::internal::wasm::fuzzer;

namespace {

class DataRange {
  const uint8_t* data_;
  size_t size_;

 public:
  DataRange(const uint8_t* data, size_t size) : data_(data), size_(size) {}

  size_t size() const { return size_; }

  std::pair<DataRange, DataRange> split(uint32_t index) const {
    return std::make_pair(DataRange(data_, index),
                          DataRange(data_ + index, size() - index));
  }

  std::pair<DataRange, DataRange> split() {
    uint16_t index = get<uint16_t>();
    if (size() > 0) {
      index = index % size();
    } else {
      index = 0;
    }
    return split(index);
  }

  template <typename T>
  T get() {
    if (size() == 0) {
      return T();
    } else {
      // We want to support the case where we have less than sizeof(T) bytes
      // remaining in the slice. For example, if we emit an i32 constant, it's
      // okay if we don't have a full four bytes available, we'll just use what
      // we have. We aren't concerned about endianness because we are generating
      // arbitrary expressions.
      const size_t num_bytes = std::min(sizeof(T), size());
      T result = T();
      memcpy(&result, data_, num_bytes);
      data_ += num_bytes;
      size_ -= num_bytes;
      return result;
    }
  }
};

class WasmGenerator {
  template <WasmOpcode Op, ValueType... Args>
  std::function<void(DataRange)> op() {
    return [this](DataRange data) {
      Generate<Args...>(data);
      builder_->Emit(Op);
    };
  }

  template <ValueType T>
  std::function<void(DataRange)> block() {
    return [this](DataRange data) {
      blocks_.push_back(T);
      builder_->EmitWithU8(
          kExprBlock, static_cast<uint8_t>(WasmOpcodes::ValueTypeCodeFor(T)));
      Generate<T>(data);
      builder_->Emit(kExprEnd);
      blocks_.pop_back();
    };
  }

  template <ValueType T>
  std::function<void(DataRange)> block_br() {
    return [this](DataRange data) {
      blocks_.push_back(T);
      builder_->EmitWithU8(
          kExprBlock, static_cast<uint8_t>(WasmOpcodes::ValueTypeCodeFor(T)));

      // Branch depths count outward from the innermost block, which is the
      // last entry of blocks_, so look up the target's type from the back.
      const uint32_t target_block = data.get<uint32_t>() % blocks_.size();
      const ValueType break_type = blocks_[blocks_.size() - 1 - target_block];

      Generate(break_type, data);
      builder_->EmitWithI32V(kExprBr, target_block);
      builder_->Emit(kExprEnd);
      blocks_.pop_back();
    };
  }

 public:
  WasmGenerator(v8::internal::wasm::WasmFunctionBuilder* fn) : builder_(fn) {}

  void Generate(ValueType type, DataRange data);

  template <ValueType T>
  void Generate(DataRange data);

  template <ValueType T1, ValueType T2, ValueType... Ts>
  void Generate(DataRange data) {
    const auto parts = data.split();
    Generate<T1>(parts.first);
    Generate<T2, Ts...>(parts.second);
  }

 private:
  v8::internal::wasm::WasmFunctionBuilder* builder_;
  std::vector<ValueType> blocks_;
};

template <>
void WasmGenerator::Generate<kWasmI32>(DataRange data) {
  if (data.size() <= sizeof(uint32_t)) {
    builder_->EmitI32Const(data.get<uint32_t>());
  } else {
    const std::function<void(DataRange)> alternates[] = {
        op<kExprI32Eqz, kWasmI32>(),  //
        op<kExprI32Eq, kWasmI32, kWasmI32>(),
        op<kExprI32Ne, kWasmI32, kWasmI32>(),
        op<kExprI32LtS, kWasmI32, kWasmI32>(),
        op<kExprI32LtU, kWasmI32, kWasmI32>(),
        op<kExprI32GeS, kWasmI32, kWasmI32>(),
        op<kExprI32GeU, kWasmI32, kWasmI32>(),

        op<kExprI64Eqz, kWasmI64>(),  //
        op<kExprI64Eq, kWasmI64, kWasmI64>(),
        op<kExprI64Ne, kWasmI64, kWasmI64>(),
        op<kExprI64LtS, kWasmI64, kWasmI64>(),
        op<kExprI64LtU, kWasmI64, kWasmI64>(),
        op<kExprI64GeS, kWasmI64, kWasmI64>(),
        op<kExprI64GeU, kWasmI64, kWasmI64>(),

        op<kExprF32Eq, kWasmF32, kWasmF32>(),
        op<kExprF32Ne, kWasmF32, kWasmF32>(),
        op<kExprF32Lt, kWasmF32, kWasmF32>(),
        op<kExprF32Ge, kWasmF32, kWasmF32>(),

        op<kExprF64Eq, kWasmF64, kWasmF64>(),
        op<kExprF64Ne, kWasmF64, kWasmF64>(),
        op<kExprF64Lt, kWasmF64, kWasmF64>(),
        op<kExprF64Ge, kWasmF64, kWasmF64>(),

        op<kExprI32Add, kWasmI32, kWasmI32>(),
        op<kExprI32Sub, kWasmI32, kWasmI32>(),
        op<kExprI32Mul, kWasmI32, kWasmI32>(),

        op<kExprI32DivS, kWasmI32, kWasmI32>(),
        op<kExprI32DivU, kWasmI32, kWasmI32>(),
        op<kExprI32RemS, kWasmI32, kWasmI32>(),
        op<kExprI32RemU, kWasmI32, kWasmI32>(),

        op<kExprI32And, kWasmI32, kWasmI32>(),
        op<kExprI32Ior, kWasmI32, kWasmI32>(),
        op<kExprI32Xor, kWasmI32, kWasmI32>(),
        op<kExprI32Shl, kWasmI32, kWasmI32>(),
        op<kExprI32ShrU, kWasmI32, kWasmI32>(),
        op<kExprI32ShrS, kWasmI32, kWasmI32>(),
        op<kExprI32Ror, kWasmI32, kWasmI32>(),
        op<kExprI32Rol, kWasmI32, kWasmI32>(),

        op<kExprI32Clz, kWasmI32>(),  //
        op<kExprI32Ctz, kWasmI32>(),  //
        op<kExprI32Popcnt, kWasmI32>(),

        op<kExprI32ConvertI64, kWasmI64>(),  //
        op<kExprI32SConvertF32, kWasmF32>(),
        op<kExprI32UConvertF32, kWasmF32>(),
        op<kExprI32SConvertF64, kWasmF64>(),
        op<kExprI32UConvertF64, kWasmF64>(),
        op<kExprI32ReinterpretF32, kWasmF32>(),

        block<kWasmI32>(),
        block_br<kWasmI32>()};

    static_assert(arraysize(alternates) < std::numeric_limits<uint8_t>::max(),
                  "Too many alternates. Replace with a bigger type if needed.");
    const auto which = data.get<uint8_t>();

    alternates[which % arraysize(alternates)](data);
  }
}

template <>
void WasmGenerator::Generate<kWasmI64>(DataRange data) {
  if (data.size() <= sizeof(uint64_t)) {
    builder_->EmitI64Const(data.get<int64_t>());
  } else {
    const std::function<void(DataRange)> alternates[] = {
        op<kExprI64Add, kWasmI64, kWasmI64>(),
        op<kExprI64Sub, kWasmI64, kWasmI64>(),
        op<kExprI64Mul, kWasmI64, kWasmI64>(),

        op<kExprI64DivS, kWasmI64, kWasmI64>(),
        op<kExprI64DivU, kWasmI64, kWasmI64>(),
        op<kExprI64RemS, kWasmI64, kWasmI64>(),
        op<kExprI64RemU, kWasmI64, kWasmI64>(),

        op<kExprI64And, kWasmI64, kWasmI64>(),
        op<kExprI64Ior, kWasmI64, kWasmI64>(),
        op<kExprI64Xor, kWasmI64, kWasmI64>(),
        op<kExprI64Shl, kWasmI64, kWasmI64>(),
        op<kExprI64ShrU, kWasmI64, kWasmI64>(),
        op<kExprI64ShrS, kWasmI64, kWasmI64>(),
        op<kExprI64Ror, kWasmI64, kWasmI64>(),
        op<kExprI64Rol, kWasmI64, kWasmI64>(),

        op<kExprI64Clz, kWasmI64>(),
        op<kExprI64Ctz, kWasmI64>(),
        op<kExprI64Popcnt, kWasmI64>(),

        block<kWasmI64>(),
        block_br<kWasmI64>()};

    static_assert(arraysize(alternates) < std::numeric_limits<uint8_t>::max(),
                  "Too many alternates. Replace with a bigger type if needed.");
    const auto which = data.get<uint8_t>();

    alternates[which % arraysize(alternates)](data);
  }
}

template <>
void WasmGenerator::Generate<kWasmF32>(DataRange data) {
  if (data.size() <= sizeof(float)) {
    builder_->EmitF32Const(data.get<float>());
  } else {
    const std::function<void(DataRange)> alternates[] = {
        op<kExprF32Add, kWasmF32, kWasmF32>(),
        op<kExprF32Sub, kWasmF32, kWasmF32>(),
        op<kExprF32Mul, kWasmF32, kWasmF32>(),

        block<kWasmF32>(), block_br<kWasmF32>()};

    static_assert(arraysize(alternates) < std::numeric_limits<uint8_t>::max(),
                  "Too many alternates. Replace with a bigger type if needed.");
    const auto which = data.get<uint8_t>();

    alternates[which % arraysize(alternates)](data);
  }
}

template <>
void WasmGenerator::Generate<kWasmF64>(DataRange data) {
  if (data.size() <= sizeof(double)) {
    builder_->EmitF64Const(data.get<double>());
  } else {
    const std::function<void(DataRange)> alternates[] = {
        op<kExprF64Add, kWasmF64, kWasmF64>(),
        op<kExprF64Sub, kWasmF64, kWasmF64>(),
        op<kExprF64Mul, kWasmF64, kWasmF64>(),

        block<kWasmF64>(), block_br<kWasmF64>()};

    static_assert(arraysize(alternates) < std::numeric_limits<uint8_t>::max(),
                  "Too many alternates. Replace with a bigger type if needed.");
    const auto which = data.get<uint8_t>();

    alternates[which % arraysize(alternates)](data);
  }
}

void WasmGenerator::Generate(ValueType type, DataRange data) {
  switch (type) {
    case kWasmI32:
      return Generate<kWasmI32>(data);
    case kWasmI64:
      return Generate<kWasmI64>(data);
    case kWasmF32:
      return Generate<kWasmF32>(data);
    case kWasmF64:
      return Generate<kWasmF64>(data);
    default:
      UNREACHABLE();
  }
}
}  // namespace

class WasmCompileFuzzer : public WasmExecutionFuzzer {
  virtual bool GenerateModule(
      Isolate* isolate, Zone* zone, const uint8_t* data, size_t size,
      ZoneBuffer& buffer, int32_t& num_args,
      std::unique_ptr<WasmValue[]>& interpreter_args,
      std::unique_ptr<Handle<Object>[]>& compiler_args) override {
    TestSignatures sigs;
[wasm] Syntax- and Type-aware Fuzzer
This is the beginning of a new fuzzer that generates
correct-by-construction Wasm modules. This should allow us to better
exercise the compiler and correctness aspects of fuzzing. It is based off
of ahaas' original Wasm fuzzer.
At the moment, it can generate expressions made up of most binops, and
also nested blocks with unconditional breaks. Future CLs will add
additional constructs, such as br_if, loops, memory access, etc.
The way the fuzzer works is that it starts with an array of arbitrary
data provided by libfuzzer. It uses the data to generate an expression.
Care is taken to make use of the entire string. Basically, the
generator has a bunch of grammar-like rules for how to construct an
expression of a given type. For example, an i32 can be made by adding
two other i32s, or by wrapping an i64. The process then continues
recursively until all the data is consumed.
We generate an expression from a slice of data as follows:
* If the slice is less than or equal to the size of the type (e.g. 4
bytes for i32), then it will emit the entire slice as a constant.
* Otherwise, it will consume the first 4 bytes of the slice and use
this to select which rule to apply. Each rule then consumes the
remainder of the slice in an appropriate way. For example:
* Unary ops use the remainder of the slice to generate the argument.
* Binary ops consume another four bytes and mod this with the length
of the remaining slice to split the slice into two parts. Each of
these subslices are then used to generate one of the arguments to
the binop.
* Blocks are basically like a unary op, but a stack of block types is
maintained to facilitate branches. For blocks that end in a break,
the first four bytes of a slice are used to select the break depth
and the stack determines what type of expression to generate.
The goal is that once this generator is complete, it will provide a one
to one mapping between binary strings and valid Wasm modules.
Review-Url: https://codereview.chromium.org/2658723006
Cr-Commit-Position: refs/heads/master@{#43289}
2017-02-17 17:06:29 +00:00
|
|
|
|
2017-05-08 09:22:54 +00:00
|
|
|
WasmModuleBuilder builder(zone);
|
[wasm] Syntax- and Type-aware Fuzzer
This is the beginning of a new fuzzer that generates
correct-by-construction Wasm modules. This should allow us to better
exercise the compiler and correctness aspects of fuzzing. It is based off
of ahaas' original Wasm fuzzer.
At the moment, it can generate expressions made up of most binops, and
also nested blocks with unconditional breaks. Future CLs will add
additional constructs, such as br_if, loops, memory access, etc.
The way the fuzzer works is that it starts with an array of arbitrary
data provided by libfuzzer. It uses the data to generate an expression.
Care is taken to make use of the entire string. Basically, the
generator has a bunch of grammar-like rules for how to construct an
expression of a given type. For example, an i32 can be made by adding
two other i32s, or by wrapping an i64. The process then continues
recursively until all the data is consumed.
We generate an expression from a slice of data as follows:
* If the slice is less than or equal to the size of the type (e.g. 4
bytes for i32), then it will emit the entire slice as a constant.
* Otherwise, it will consume the first 4 bytes of the slice and use
this to select which rule to apply. Each rule then consumes the
remainder of the slice in an appropriate way. For example:
* Unary ops use the remainder of the slice to generate the argument.
* Binary ops consume another four bytes and mod this with the length
of the remaining slice to split the slice into two parts. Each of
these subslices are then used to generate one of the arguments to
the binop.
* Blocks are basically like a unary op, but a stack of block types is
maintained to facilitate branches. For blocks that end in a break,
the first four bytes of a slice are used to select the break depth
and the stack determines what type of expression to generate.
The goal is that once this generator is complete, it will provide a one
to one mapping between binary strings and valid Wasm modules.
Review-Url: https://codereview.chromium.org/2658723006
Cr-Commit-Position: refs/heads/master@{#43289}
2017-02-17 17:06:29 +00:00
|
|
|
|
2017-05-08 09:22:54 +00:00
|
|
|
v8::internal::wasm::WasmFunctionBuilder* f =
|
|
|
|
builder.AddFunction(sigs.i_iii());
|
[wasm] Syntax- and Type-aware Fuzzer
This is the beginning of a new fuzzer that generates
correct-by-construction Wasm modules. This should allow us to better
exercise the compiler and correctness aspects of fuzzing. It is based off
of ahaas' original Wasm fuzzer.
At the moment, it can generate expressions made up of most binops, and
also nested blocks with unconditional breaks. Future CLs will add
additional constructs, such as br_if, loops, memory access, etc.
The way the fuzzer works is that it starts with an array of arbitrary
data provided by libfuzzer. It uses the data to generate an expression.
Care is taken to make use of the entire string. Basically, the
generator has a bunch of grammar-like rules for how to construct an
expression of a given type. For example, an i32 can be made by adding
two other i32s, or by wrapping an i64. The process then continues
recursively until all the data is consumed.
We generate an expression from a slice of data as follows:
* If the slice is less than or equal to the size of the type (e.g. 4
bytes for i32), then it will emit the entire slice as a constant.
* Otherwise, it will consume the first 4 bytes of the slice and use
this to select which rule to apply. Each rule then consumes the
remainder of the slice in an appropriate way. For example:
* Unary ops use the remainder of the slice to generate the argument.
* Binary ops consume another four bytes and mod this with the length
of the remaining slice to split the slice into two parts. Each of
these subslices are then used to generate one of the arguments to
the binop.
* Blocks are basically like a unary op, but a stack of block types is
maintained to facilitate branches. For blocks that end in a break,
the first four bytes of a slice are used to select the break depth
and the stack determines what type of expression to generate.
The goal is that once this generator is complete, it will provide a one
to one mapping between binary strings and valid Wasm modules.
Review-Url: https://codereview.chromium.org/2658723006
Cr-Commit-Position: refs/heads/master@{#43289}
2017-02-17 17:06:29 +00:00
|
|
|
|
2017-05-08 09:22:54 +00:00
|
|
|
WasmGenerator gen(f);
|
|
|
|
gen.Generate<kWasmI32>(DataRange(data, static_cast<uint32_t>(size)));
|
[wasm] Syntax- and Type-aware Fuzzer
This is the beginning of a new fuzzer that generates
correct-by-construction Wasm modules. This should allow us to better
exercise the compiler and correctness aspects of fuzzing. It is based off
of ahaas' original Wasm fuzzer.
At the moment, it can generate expressions made up of most binops, and
also nested blocks with unconditional breaks. Future CLs will add
additional constructs, such as br_if, loops, memory access, etc.
The way the fuzzer works is that it starts with an array of arbitrary
data provided by libfuzzer. It uses the data to generate an expression.
Care is taken to make use of the entire string. Basically, the
generator has a bunch of grammar-like rules for how to construct an
expression of a given type. For example, an i32 can be made by adding
two other i32s, or by wrapping an i64. The process then continues
recursively until all the data is consumed.
We generate an expression from a slice of data as follows:
* If the slice is less than or equal to the size of the type (e.g. 4
bytes for i32), then it will emit the entire slice as a constant.
* Otherwise, it will consume the first 4 bytes of the slice and use
this to select which rule to apply. Each rule then consumes the
remainder of the slice in an appropriate way. For example:
* Unary ops use the remainder of the slice to generate the argument.
* Binary ops consume another four bytes and mod this with the length
of the remaining slice to split the slice into two parts. Each of
these subslices are then used to generate one of the arguments to
the binop.
* Blocks are basically like a unary op, but a stack of block types is
maintained to facilitate branches. For blocks that end in a break,
the first four bytes of a slice are used to select the break depth
and the stack determines what type of expression to generate.
The goal is that once this generator is complete, it will provide a one
to one mapping between binary strings and valid Wasm modules.
Review-Url: https://codereview.chromium.org/2658723006
Cr-Commit-Position: refs/heads/master@{#43289}
2017-02-17 17:06:29 +00:00
|
|
|
|
2017-05-08 09:22:54 +00:00
|
|
|
uint8_t end_opcode = kExprEnd;
|
|
|
|
f->EmitCode(&end_opcode, 1);
|
2017-05-12 11:06:25 +00:00
|
|
|
builder.AddExport(v8::internal::CStrVector("main"), f);
|
[wasm] Syntax- and Type-aware Fuzzer
This is the beginning of a new fuzzer that generates
correct-by-construction Wasm modules. This should allow us to better
exercise the compiler and correctness aspects of fuzzing. It is based off
of ahaas' original Wasm fuzzer.
At the moment, it can generate expressions made up of most binops, and
also nested blocks with unconditional breaks. Future CLs will add
additional constructs, such as br_if, loops, memory access, etc.
The way the fuzzer works is that it starts with an array of arbitrary
data provided by libfuzzer. It uses the data to generate an expression.
Care is taken to make use of the entire string. Basically, the
generator has a bunch of grammar-like rules for how to construct an
expression of a given type. For example, an i32 can be made by adding
two other i32s, or by wrapping an i64. The process then continues
recursively until all the data is consumed.
We generate an expression from a slice of data as follows:
* If the slice is less than or equal to the size of the type (e.g. 4
bytes for i32), then it will emit the entire slice as a constant.
* Otherwise, it will consume the first 4 bytes of the slice and use
this to select which rule to apply. Each rule then consumes the
remainder of the slice in an appropriate way. For example:
* Unary ops use the remainder of the slice to generate the argument.
* Binary ops consume another four bytes and mod this with the length
of the remaining slice to split the slice into two parts. Each of
these subslices are then used to generate one of the arguments to
the binop.
* Blocks are basically like a unary op, but a stack of block types is
maintained to facilitate branches. For blocks that end in a break,
the first four bytes of a slice are used to select the break depth
and the stack determines what type of expression to generate.
The goal is that once this generator is complete, it will provide a one
to one mapping between binary strings and valid Wasm modules.
Review-Url: https://codereview.chromium.org/2658723006
Cr-Commit-Position: refs/heads/master@{#43289}
2017-02-17 17:06:29 +00:00
|
|
|
|
2017-05-08 09:22:54 +00:00
|
|
|
builder.WriteTo(buffer);
|
[wasm] Syntax- and Type-aware Fuzzer
This is the beginning of a new fuzzer that generates
correct-by-construction Wasm modules. This should allow us to better
exercise the compiler and correctness aspects of fuzzing. It is based off
of ahaas' original Wasm fuzzer.
At the moment, it can generate expressions made up of most binops, and
also nested blocks with unconditional breaks. Future CLs will add
additional constructs, such as br_if, loops, memory access, etc.
The way the fuzzer works is that it starts with an array of arbitrary
data provided by libfuzzer. It uses the data to generate an expression.
Care is taken to make use of the entire string. Basically, the
generator has a bunch of grammar-like rules for how to construct an
expression of a given type. For example, an i32 can be made by adding
two other i32s, or by wrapping an i64. The process then continues
recursively until all the data is consumed.
We generate an expression from a slice of data as follows:
* If the slice is less than or equal to the size of the type (e.g. 4
bytes for i32), then it will emit the entire slice as a constant.
* Otherwise, it will consume the first 4 bytes of the slice and use
this to select which rule to apply. Each rule then consumes the
remainder of the slice in an appropriate way. For example:
* Unary ops use the remainder of the slice to generate the argument.
* Binary ops consume another four bytes and mod this with the length
of the remaining slice to split the slice into two parts. Each of
these subslices are then used to generate one of the arguments to
the binop.
* Blocks are basically like a unary op, but a stack of block types is
maintained to facilitate branches. For blocks that end in a break,
the first four bytes of a slice are used to select the break depth
and the stack determines what type of expression to generate.
The goal is that once this generator is complete, it will provide a one
to one mapping between binary strings and valid Wasm modules.
Review-Url: https://codereview.chromium.org/2658723006
Cr-Commit-Position: refs/heads/master@{#43289}
2017-02-17 17:06:29 +00:00
|
|
|
|
2017-05-08 09:22:54 +00:00
|
|
|
num_args = 3;
|
2017-07-14 13:49:01 +00:00
|
|
|
interpreter_args.reset(
|
|
|
|
new WasmValue[3]{WasmValue(1), WasmValue(2), WasmValue(3)});
|
[wasm] Syntax- and Type-aware Fuzzer
This is the beginning of a new fuzzer that generates
correct-by-construction Wasm modules. This should allow us to better
exercise the compiler and correctness aspects of fuzzing. It is based off
of ahaas' original Wasm fuzzer.
At the moment, it can generate expressions made up of most binops, and
also nested blocks with unconditional breaks. Future CLs will add
additional constructs, such as br_if, loops, memory access, etc.
The way the fuzzer works is that it starts with an array of arbitrary
data provided by libfuzzer. It uses the data to generate an expression.
Care is taken to make use of the entire string. Basically, the
generator has a bunch of grammar-like rules for how to construct an
expression of a given type. For example, an i32 can be made by adding
two other i32s, or by wrapping an i64. The process then continues
recursively until all the data is consumed.
We generate an expression from a slice of data as follows:
* If the slice is less than or equal to the size of the type (e.g. 4
bytes for i32), then it will emit the entire slice as a constant.
* Otherwise, it will consume the first 4 bytes of the slice and use
this to select which rule to apply. Each rule then consumes the
remainder of the slice in an appropriate way. For example:
* Unary ops use the remainder of the slice to generate the argument.
* Binary ops consume another four bytes and mod this with the length
of the remaining slice to split the slice into two parts. Each of
these subslices are then used to generate one of the arguments to
the binop.
* Blocks are basically like a unary op, but a stack of block types is
maintained to facilitate branches. For blocks that end in a break,
the first four bytes of a slice are used to select the break depth
and the stack determines what type of expression to generate.
The goal is that once this generator is complete, it will provide a one
to one mapping between binary strings and valid Wasm modules.
Review-Url: https://codereview.chromium.org/2658723006
Cr-Commit-Position: refs/heads/master@{#43289}
2017-02-17 17:06:29 +00:00
|
|
|
|
2017-05-08 09:22:54 +00:00
|
|
|
compiler_args.reset(new Handle<Object>[3]{
|
|
|
|
handle(Smi::FromInt(1), isolate), handle(Smi::FromInt(1), isolate),
|
|
|
|
handle(Smi::FromInt(1), isolate)});
|
|
|
|
return true;
|
[wasm] Syntax- and Type-aware Fuzzer
This is the beginning of a new fuzzer that generates
correct-by-construction Wasm modules. This should allow us to better
exercise the compiler and correctness aspects of fuzzing. It is based off
of ahaas' original Wasm fuzzer.
At the moment, it can generate expressions made up of most binops, and
also nested blocks with unconditional breaks. Future CLs will add
additional constructs, such as br_if, loops, memory access, etc.
The way the fuzzer works is that it starts with an array of arbitrary
data provided by libfuzzer. It uses the data to generate an expression.
Care is taken to make use of the entire string. Basically, the
generator has a bunch of grammar-like rules for how to construct an
expression of a given type. For example, an i32 can be made by adding
two other i32s, or by wrapping an i64. The process then continues
recursively until all the data is consumed.
We generate an expression from a slice of data as follows:
* If the slice is less than or equal to the size of the type (e.g. 4
bytes for i32), then it will emit the entire slice as a constant.
* Otherwise, it will consume the first 4 bytes of the slice and use
this to select which rule to apply. Each rule then consumes the
remainder of the slice in an appropriate way. For example:
* Unary ops use the remainder of the slice to generate the argument.
* Binary ops consume another four bytes and mod this with the length
of the remaining slice to split the slice into two parts. Each of
these subslices are then used to generate one of the arguments to
the binop.
* Blocks are basically like a unary op, but a stack of block types is
maintained to facilitate branches. For blocks that end in a break,
the first four bytes of a slice are used to select the break depth
and the stack determines what type of expression to generate.
The goal is that once this generator is complete, it will provide a one
to one mapping between binary strings and valid Wasm modules.
Review-Url: https://codereview.chromium.org/2658723006
Cr-Commit-Position: refs/heads/master@{#43289}
2017-02-17 17:06:29 +00:00
|
|
|
}
|
2017-05-08 09:22:54 +00:00
|
|
|
};
|
[wasm] Syntax- and Type-aware Fuzzer
This is the beginning of a new fuzzer that generates
correct-by-construction Wasm modules. This should allow us to better
exercise the compiler and correctness aspects of fuzzing. It is based off
of ahaas' original Wasm fuzzer.
At the moment, it can generate expressions made up of most binops, and
also nested blocks with unconditional breaks. Future CLs will add
additional constructs, such as br_if, loops, memory access, etc.
The way the fuzzer works is that it starts with an array of arbitrary
data provided by libfuzzer. It uses the data to generate an expression.
Care is taken to make use of the entire string. Basically, the
generator has a bunch of grammar-like rules for how to construct an
expression of a given type. For example, an i32 can be made by adding
two other i32s, or by wrapping an i64. The process then continues
recursively until all the data is consumed.
We generate an expression from a slice of data as follows:
* If the slice is less than or equal to the size of the type (e.g. 4
bytes for i32), then it will emit the entire slice as a constant.
* Otherwise, it will consume the first 4 bytes of the slice and use
this to select which rule to apply. Each rule then consumes the
remainder of the slice in an appropriate way. For example:
* Unary ops use the remainder of the slice to generate the argument.
* Binary ops consume another four bytes and mod this with the length
of the remaining slice to split the slice into two parts. Each of
these subslices are then used to generate one of the arguments to
the binop.
* Blocks are basically like a unary op, but a stack of block types is
maintained to facilitate branches. For blocks that end in a break,
the first four bytes of a slice are used to select the break depth
and the stack determines what type of expression to generate.
The goal is that once this generator is complete, it will provide a one
to one mapping between binary strings and valid Wasm modules.
Review-Url: https://codereview.chromium.org/2658723006
Cr-Commit-Position: refs/heads/master@{#43289}
2017-02-17 17:06:29 +00:00
|
|
|
|
2017-05-08 09:22:54 +00:00
|
|
|
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
|
|
|
|
return WasmCompileFuzzer().FuzzWasmModule(data, size);
|
[wasm] Syntax- and Type-aware Fuzzer
This is the beginning of a new fuzzer that generates
correct-by-construction Wasm modules. This should allow us to better
exercise the compiler and correctness aspects of fuzzing. It is based off
of ahaas' original Wasm fuzzer.
At the moment, it can generate expressions made up of most binops, and
also nested blocks with unconditional breaks. Future CLs will add
additional constructs, such as br_if, loops, memory access, etc.
The way the fuzzer works is that it starts with an array of arbitrary
data provided by libfuzzer. It uses the data to generate an expression.
Care is taken to make use of the entire string. Basically, the
generator has a bunch of grammar-like rules for how to construct an
expression of a given type. For example, an i32 can be made by adding
two other i32s, or by wrapping an i64. The process then continues
recursively until all the data is consumed.
We generate an expression from a slice of data as follows:
* If the slice is less than or equal to the size of the type (e.g. 4
bytes for i32), then it will emit the entire slice as a constant.
* Otherwise, it will consume the first 4 bytes of the slice and use
this to select which rule to apply. Each rule then consumes the
remainder of the slice in an appropriate way. For example:
* Unary ops use the remainder of the slice to generate the argument.
* Binary ops consume another four bytes and mod this with the length
of the remaining slice to split the slice into two parts. Each of
these subslices are then used to generate one of the arguments to
the binop.
* Blocks are basically like a unary op, but a stack of block types is
maintained to facilitate branches. For blocks that end in a break,
the first four bytes of a slice are used to select the break depth
and the stack determines what type of expression to generate.
The goal is that once this generator is complete, it will provide a one
to one mapping between binary strings and valid Wasm modules.
Review-Url: https://codereview.chromium.org/2658723006
Cr-Commit-Position: refs/heads/master@{#43289}
2017-02-17 17:06:29 +00:00
|
|
|
}
|
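The generator above picks a rule with `alternates[which % arraysize(alternates)]` and hands the rest of the input to that rule. The following is a minimal standalone sketch of that idea, not the fuzzer's actual code. `ByteRange` and `GenerateI32` are hypothetical names standing in for `DataRange` and `WasmGenerator`, the output is a text expression rather than Wasm bytes, and the one-byte split selector and little-endian constant decoding are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the fuzzer's DataRange: a view over input bytes.
class ByteRange {
 public:
  ByteRange(const uint8_t* data, size_t size) : data_(data), size_(size) {}

  size_t size() const { return size_; }

  // Consumes up to sizeof(T) bytes from the front and assembles them in
  // little-endian order. Bytes that are missing read as zero.
  template <typename T>
  T get() {
    T result = 0;
    for (size_t i = 0; i < sizeof(T) && size_ > 0; ++i) {
      result |= static_cast<T>(static_cast<uint64_t>(*data_) << (8 * i));
      ++data_;
      --size_;
    }
    return result;
  }

  // Consumes one selector byte, then cuts off and returns a prefix whose
  // length is selector % (remaining + 1). This range keeps the suffix.
  ByteRange split() {
    const uint8_t selector = get<uint8_t>();
    const size_t left_size = selector % (size_ + 1);
    assert(left_size <= size_);
    ByteRange left(data_, left_size);
    data_ += left_size;
    size_ -= left_size;
    return left;
  }

 private:
  const uint8_t* data_;
  size_t size_;
};

// Builds a textual i32 expression from the bytes in `data`.
std::string GenerateI32(ByteRange data) {
  // Base case: not enough data left to choose a rule, so emit a constant.
  if (data.size() <= sizeof(uint32_t)) {
    return "(i32.const " + std::to_string(data.get<uint32_t>()) + ")";
  }

  static const char* const kBinops[] = {"i32.add", "i32.sub", "i32.mul"};
  constexpr size_t kNumBinops = sizeof(kBinops) / sizeof(kBinops[0]);

  // One byte selects the rule; the modulo keeps the index in bounds.
  const uint8_t which = data.get<uint8_t>();

  // Each operand is generated from its own subslice of the remaining bytes.
  ByteRange left = data.split();
  const std::string lhs = GenerateI32(left);
  const std::string rhs = GenerateI32(data);

  return std::string("(") + kBinops[which % kNumBinops] + " " + lhs + " " +
         rhs + ")";
}
```

For example, the six bytes `{1, 2, 7, 9, 3, 4}` select `i32.sub` (1 % 3), split the remaining bytes into `{7, 9}` and `{3, 4}`, and yield `(i32.sub (i32.const 2311) (i32.const 1027))`. Every recursive call receives strictly fewer bytes than its caller, so generation terminates for any input.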