libFuzzer for Beginners

Introduction

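libFuzzer is an in-process, coverage-guided, evolutionary fuzzing engine and part of the LLVM project. It is linked with the library under test and feeds test cases to that library through a dedicated fuzzing entry point (the target function). The fuzzer tracks which code regions have already been reached and mutates the input corpus to maximize code coverage. The coverage information is provided by LLVM's SanitizerCoverage instrumentation.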

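Types of fuzzing:
1. Generation based: builds test cases from scratch by modeling the target protocol or file format, with no prior state
2. Mutation based: derives test cases from existing data samples or existing state according to a set of rules
3. Evolutionary: generation, mutation, or both, driven by in-process code coverage feedback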
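
To make the "evolutionary" idea concrete, here is a toy sketch of a coverage-guided loop. It is not how libFuzzer is implemented; the names `MatchedPrefix` and `ToyFuzz` are made up for illustration, and "coverage" is simulated by the number of leading bytes that match a magic value rather than by real instrumentation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <set>
#include <vector>

using Input = std::array<uint8_t, 4>;

// Toy target: reports how many leading bytes match "FUZZ". Each distinct
// prefix length stands in for one covered code region.
int MatchedPrefix(const Input& in) {
  static const uint8_t kMagic[4] = {'F', 'U', 'Z', 'Z'};
  int matched = 0;
  while (matched < 4 && in[matched] == kMagic[matched]) ++matched;
  return matched;
}

// Toy evolutionary loop: mutate one byte of a corpus entry and keep the
// mutant only if it reaches coverage that no earlier input reached.
// Returns the number of executions needed to match all four bytes,
// or 0 if the execution budget runs out first.
std::size_t ToyFuzz(std::uint32_t seed, std::size_t max_execs) {
  assert(max_execs > 0);
  std::mt19937 rng(seed);
  std::vector<Input> corpus;
  corpus.push_back(Input{{0, 0, 0, 0}});  // seed input, covers prefix length 0
  std::set<int> seen_coverage;
  seen_coverage.insert(0);

  for (std::size_t exec = 1; exec <= max_execs; ++exec) {
    Input candidate = corpus[rng() % corpus.size()];  // pick a parent
    const std::size_t pos = rng() % candidate.size();  // pick a byte to mutate
    candidate[pos] = static_cast<uint8_t>(rng() % 256);

    const int coverage = MatchedPrefix(candidate);
    if (seen_coverage.insert(coverage).second) {
      corpus.push_back(candidate);  // new coverage: keep the mutant
      if (coverage == 4) return exec;
    }
  }
  return 0;
}
```

Because every partial match is kept in the corpus, the loop only has to guess one byte at a time instead of all four at once, which is the intuition behind coverage feedback.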

Installation

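My test environment is VMware with Ubuntu 16.04 x64 (the officially recommended setup) and a recent version of the clang compiler, which can be installed with the checkout_build_install_llvm.sh script.
Other dependencies: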

$ sudo apt-get install -y make autoconf automake libtool pkg-config zlib1g-dev
$ git clone https://github.com/Dor1s/libfuzzer-workshop.git
$ cd libfuzzer-workshop/libFuzzer
$ Fuzzer/build.sh

Writing a fuzzer

Fuzzer example 1
Consider the following function (libfuzzer-workshop/lessons/04/vulnerable_functions.h):

bool VulnerableFunction1(const uint8_t* data, size_t size) {
  bool result = false;
  if (size >= 3) {
    result = data[0] == 'F' &&
             data[1] == 'U' &&
             data[2] == 'Z' &&
             data[3] == 'Z';
  }
  return result;
}

Test it with the following fuzz target:

#include <stdint.h>
#include <stddef.h>
#include "vulnerable_functions.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  VulnerableFunction1(data, size);
  return 0;
}
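
The fuzz target is an ordinary function, so it can also be called without the fuzzing engine, for example to replay a saved input in a build that does not link libFuzzer. The following is a minimal sketch under that assumption. `ReplayFile` and `g_last_size` are hypothetical names, and the `LLVMFuzzerTestOneInput` body here is a stand-in; in a real build that symbol would come from the fuzz target source file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Records the size of the last input so the sketch can be checked.
static std::size_t g_last_size = 0;

// Stand-in fuzz target (assumption): a real build would link the actual
// target instead of defining it here.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, std::size_t size) {
  assert(data != nullptr || size == 0);
  g_last_size = size;
  return 0;
}

// Hypothetical helper: reads one file into memory and passes its bytes to the
// fuzz target once. Returns false if the file cannot be opened.
bool ReplayFile(const std::string& path) {
  std::ifstream in(path, std::ios::binary);
  if (!in) return false;
  std::vector<uint8_t> bytes((std::istreambuf_iterator<char>(in)),
                             std::istreambuf_iterator<char>());
  LLVMFuzzerTestOneInput(bytes.data(), bytes.size());
  return true;
}
```

A small `main` that calls `ReplayFile(argv[1])` would turn this into a standalone replay tool.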

Compile the fuzzer as follows:

$ clang++ -g -std=c++11 -fsanitize=address -fsanitize-coverage=trace-pc-guard \
first_fuzzer.cc ../../libFuzzer/libFuzzer.a \
-o first_fuzzer

Memory Tools
1. AddressSanitizer: detects use-after-free, buffer overflows (heap, stack, globals), stack-use-after-return, and container-overflow
2. MemorySanitizer: detects uninitialized memory reads
3. UndefinedBehaviorSanitizer: detects many classes of bugs, especially type confusion, signed-integer-overflow, undefined shift, etc.

Create an empty directory to hold the corpus and run the fuzzer:

$ mkdir corpus1
$ ./first_fuzzer corpus1
INFO: Seed: 675115685
INFO: Loaded 1 modules (39 guards): [0x773ea0, 0x773f3c),
Loading corpus dir: corpus1
INFO: -max_len is not provided, using 64
#0 READ units: 4
#4 INITED cov: 7 ft: 6 corp: 4/131b exec/s: 0 rss: 29Mb
#100 NEW cov: 7 ft: 7 corp: 5/133b exec/s: 0 rss: 29Mb L: 2 MS: 1 EraseBytes-
=================================================================
==3898==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200001b133 at pc 0x000000515620 bp 0x7ffd8ed9b2b0 sp 0x7ffd8ed9b2a8
READ of size 1 at 0x60200001b133 thread T0
#0 0x51561f in VulnerableFunction1(unsigned char const*, unsigned long) /home/fanrong/Computer/libfuzzer-workshop/lessons/04/./vulnerable_functions.h:22:14
#1 0x515d4e in LLVMFuzzerTestOneInput /home/fanrong/Computer/libfuzzer-workshop/lessons/04/first_fuzzer.cc:10:3
#2 0x520993 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /home/fanrong/Computer/libfuzzer-workshop/libFuzzer/Fuzzer/FuzzerLoop.cpp:451:13
#3 0x520bc0 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long) /home/fanrong/Computer/libfuzzer-workshop/libFuzzer/Fuzzer/FuzzerLoop.cpp:408:3
#4 0x5215ab in fuzzer::Fuzzer::MutateAndTestOne() /home/fanrong/Computer/libfuzzer-workshop/libFuzzer/Fuzzer/FuzzerLoop.cpp:587:30

The fuzzer found a heap-buffer-overflow. To reproduce the crash:

$ ./first_fuzzer crash-0eb8e4ed029b774d80f2b66408203801cb982a60
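
The report matches what the code does: VulnerableFunction1 accepts any input with size >= 3 but then reads data[3], so a 3-byte input starting with "FUZ" triggers a one-byte read past the end of the buffer. A possible fix, sketched below, is to require four bytes before reading four. The name `FixedFunction1` is made up for this sketch and is not part of the workshop code.

```cpp
#include <cassert>
#include <stddef.h>
#include <stdint.h>

// Hypothetical corrected variant of VulnerableFunction1.
bool FixedFunction1(const uint8_t* data, size_t size) {
  assert(data != nullptr || size == 0);
  // Four bytes are compared below, so at least four must be present.
  if (size < 4) return false;
  return data[0] == 'F' &&
         data[1] == 'U' &&
         data[2] == 'Z' &&
         data[3] == 'Z';
}
```

With this check a 3-byte "FUZ" input is rejected before any out-of-bounds read can happen.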

Fuzzer example 2
Now consider another function:

constexpr auto kMagicHeader = "ZN_2016";
constexpr std::size_t kMaxPacketLen = 1024;
constexpr std::size_t kMaxBodyLength = 1024 - sizeof(kMagicHeader);

bool VulnerableFunction2(const uint8_t* data, size_t size, bool verify_hash) {
  if (size < sizeof(kMagicHeader))
    return false;

  std::string header(reinterpret_cast<const char*>(data), sizeof(kMagicHeader));
  std::array<uint8_t, kMaxBodyLength> body;

  if (strcmp(kMagicHeader, header.c_str()))
    return false;

  auto target_hash = data[--size];

  if (size > kMaxPacketLen)
    return false;
  if (!verify_hash)
    return true;

  std::copy(data, data + size, body.data());
  auto real_hash = DummyHash(body);
  return real_hash == target_hash;
}

This example is a little more complex. Test it with the simplest possible fuzz target (almost identical to the first one):

#include <stdint.h>
#include <stddef.h>
#include "vulnerable_functions.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  VulnerableFunction2(data, size, false);
  return 0;
}

Compile the fuzzer:

$ clang++ -g -std=c++11 -fsanitize=address -fsanitize-coverage=trace-pc-guard \
second_fuzzer.cc ../../libFuzzer/libFuzzer.a \
-o second_fuzzer

Run the fuzzer:

$ ./second_fuzzer corpus2
INFO: Seed: 844743323
INFO: Loaded 1 modules (39 guards): [0x773ea0, 0x773f3c),
Loading corpus dir: corpus2
INFO: -max_len is not provided, using 64
#0 READ units: 3
#3 INITED cov: 5 ft: 5 corp: 3/23b exec/s: 0 rss: 29Mb
#2097152 pulse cov: 5 ft: 5 corp: 3/23b exec/s: 1048576 rss: 161Mb
#4194304 pulse cov: 5 ft: 5 corp: 3/23b exec/s: 838860 rss: 293Mb
#8388608 pulse cov: 5 ft: 5 corp: 3/23b exec/s: 838860 rss: 547Mb
#16777216 pulse cov: 5 ft: 5 corp: 3/23b exec/s: 883011 rss: 549Mb
#33554432 pulse cov: 5 ft: 5 corp: 3/23b exec/s: 958698 rss: 549Mb
...

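The output is not very informative: coverage stays at 5 and nothing new is found. Next, modify the fuzz target so that it calls the function with both values of verify_hash: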

#include <stdint.h>
#include <stddef.h>
#include "vulnerable_functions.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  bool verify_hash_flags[] = { false, true };

  for (auto flag : verify_hash_flags)
    VulnerableFunction2(data, size, flag);
  return 0;
}

Compile the fuzzer:

$ clang++ -g -std=c++11 -fsanitize=address -fsanitize-coverage=trace-pc-guard \
third_fuzzer.cc ../../libFuzzer/libFuzzer.a \
-o third_fuzzer

Run the fuzzer on the same corpus:

$ ./third_fuzzer corpus2

The fuzzer finds a new path, but that still does not help much. Note this line:

INFO: -max_len is not provided, using 64

However, the maximum length of the ZN_2016 protocol packets that our target parses is:

constexpr std::size_t kMaxPacketLen = 1024;

So add the libFuzzer option -max_len=1024:

$ ./third_fuzzer corpus2 -max_len=1024
INFO: Seed: 2241499835
INFO: Loaded 1 modules (41 guards): [0x773ee0, 0x773f84),
Loading corpus dir: corpus2
#0 READ units: 3
#3 INITED cov: 26 ft: 26 corp: 3/23b exec/s: 0 rss: 31Mb
=================================================================
==5664==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffedef9ecc8 at pc 0x0000004c57d1 bp 0x7ffedef9e710 sp 0x7ffedef9dec0
WRITE of size 1023 at 0x7ffedef9ecc8 thread T0
#0 0x4c57d0 in __asan_memmove /home/fanrong/Downloads/src/llvm/projects/compiler-rt/lib/asan/asan_interceptors.cc:470
#1 0x5163f0 in unsigned char* std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<unsigned char>(unsigned char const*, unsigned char const*, unsigned char*) /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/bits/stl_algobase.h:384:6
#2 0x5162c2 in unsigned char* std::__copy_move_a<false, unsigned char const*, unsigned char*>(unsigned char const*, unsigned char const*, unsigned char*) /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/bits/stl_algobase.h:401:14
#3 0x516220 in unsigned char* std::__copy_move_a2<false, unsigned char const*, unsigned char*>(unsigned char const*, unsigned char const*, unsigned char*) /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/bits/stl_algobase.h:438:18
#4 0x516033 in unsigned char* std::copy<unsigned char const*, unsigned char*>(unsigned char const*, unsigned char const*, unsigned char*) /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/bits/stl_algobase.h:470:15
#5 0x515a52 in VulnerableFunction2(unsigned char const*, unsigned long, bool) /home/fanrong/Computer/libfuzzer-workshop/lessons/04/./vulnerable_functions.h:61:3
#6 0x515edc in LLVMFuzzerTestOneInput /home/fanrong/Computer/libfuzzer-workshop/lessons/04/third_fuzzer.cc:13:5

A stack-buffer-overflow is found at vulnerable_functions.h:61:3.
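
The overflow follows from the buffer sizes. `constexpr auto kMagicHeader = "ZN_2016"` deduces a `const char*`, so `sizeof(kMagicHeader)` is the size of a pointer (8 on this x64 setup) and `body` holds 1024 - 8 = 1016 bytes, while the length check lets up to 1024 bytes through to `std::copy`. The 1023-byte write in the report exceeds that capacity. Below is a minimal sketch of a bounds-checked copy. `kBodyCapacity` and `CopyBodyChecked` are hypothetical names, and this is one possible fix rather than the workshop's own.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// With `auto`, a string literal deduces to a pointer, not an array, so
// sizeof yields the pointer size rather than the string length.
constexpr auto kMagic = "ZN_2016";
static_assert(sizeof(kMagic) == sizeof(const char*),
              "auto deduces const char* for a string literal");

// Mirrors kMaxBodyLength from the vulnerable code (1016 on x86-64).
constexpr std::size_t kBodyCapacity = 1024 - sizeof(const char*);

// Hypothetical helper: copies the packet into `body` only when it fits.
bool CopyBodyChecked(const uint8_t* data, std::size_t size,
                     std::array<uint8_t, kBodyCapacity>& body) {
  assert(data != nullptr || size == 0);
  if (size > body.size()) return false;  // the check the original lacks
  std::copy(data, data + size, body.data());
  return true;
}
```

Comparing against the destination's own size, instead of a separate constant, keeps the check correct even if the buffer size changes later.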

Fuzzer example 3
Consider the following function:

constexpr std::size_t kZn2016VerifyHashFlag = 0x0001000;

bool VulnerableFunction3(const uint8_t* data, size_t size, std::size_t flags) {
  bool verify_hash = flags & kZn2016VerifyHashFlag;
  return VulnerableFunction2(data, size, verify_hash);
}

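This is really just a wrapper around the previous function, but the key point is that flags has a large range of possible values. Enumerating every possible combination in the fuzzer is impractical, and nothing guarantees that new values will not be added later.
In this situation we can use the data supplied by libFuzzer to derive pseudo-random values for flags: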

#include <stdint.h>
#include <stddef.h>
#include "vulnerable_functions.h"
#include <functional>
#include <string>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  std::string data_string(reinterpret_cast<const char*>(data), size);
  auto data_hash = std::hash<std::string>()(data_string);

  std::size_t flags = static_cast<size_t>(data_hash);
  VulnerableFunction3(data, size, flags);
  return 0;
}
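
Hashing the input makes flags an opaque function of the data, so the mutator cannot steer the flag bits directly. A common alternative, sketched here as an assumption and not as the workshop's approach, is to reserve the first bytes of the input for flags and pass the remainder as the payload. `StubFunction3`, `FuzzWithLeadingFlags`, and the `g_seen_*` variables are hypothetical stand-ins so the sketch can be compiled on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in for VulnerableFunction3 (assumption): records what it receives.
static std::size_t g_seen_flags = 0;
static std::size_t g_seen_size = 0;

bool StubFunction3(const uint8_t* data, std::size_t size, std::size_t flags) {
  (void)data;
  g_seen_size = size;
  g_seen_flags = flags;
  return true;
}

// Hypothetical fuzz target body: the first sizeof(size_t) bytes of the input
// become flags, the remaining bytes are the packet.
int FuzzWithLeadingFlags(const uint8_t* data, std::size_t size) {
  assert(data != nullptr || size == 0);
  std::size_t flags = 0;
  if (size < sizeof(flags)) return 0;  // too short to carry flags
  std::memcpy(&flags, data, sizeof(flags));
  StubFunction3(data + sizeof(flags), size - sizeof(flags), flags);
  return 0;
}
```

The trade-off is that the input layout changes, so corpus files written for the hash-based target no longer mean the same thing to this one.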

Compile and run the fuzzer:

$ clang++ -g -std=c++11 -fsanitize=address -fsanitize-coverage=trace-pc-guard \
fourth_fuzzer.cc ../../libFuzzer/libFuzzer.a \
-o fourth_fuzzer
$ mkdir corpus3
$ ./fourth_fuzzer corpus3/ -max_len=1024

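The same crash shows up quickly, but the fuzzer is now generic with respect to the value of flags.
That covers the basic usage of libFuzzer. Later posts will look at more practical use cases.

References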
https://github.com/Dor1s/libfuzzer-workshop
https://github.com/google/fuzzer-test-suite
http://llvm.org/docs/LibFuzzer.html
https://security.googleblog.com/2016/08/guided-in-process-fuzzing-of-chrome.html