Title: [Translation] AFL performance optimization tips
Category: Fuzzing
Created: 2023-08-25 07:25
Modified:
Link: http://0x2531.tech/fuzzing/202308250725.txt
--------------------------------------------------------------------------------

=================================
Tips for performance optimization
=================================

  This file provides tips for troubleshooting slow or wasteful fuzzing jobs.
  See README for the general instruction manual.

1) Keep your test cases small
-----------------------------

This is probably the single most important step to take! Large test cases do
not merely take more time and memory to be parsed by the tested binary, but
also make the fuzzing process dramatically less efficient in several other
ways.

To illustrate, let's say that you're randomly flipping bits in a file, one bit
at a time. Let's assume that if you flip bit #47, you will hit a security bug;
flipping any other bit just results in an invalid document.

Now, if your starting test case is 100 bytes long, you will have a 71% chance
of triggering the bug within the first 1,000 execs - not bad! (Translator's
note: 100 bytes is 800 bits, so the chance is 1 - (799/800)^1000.) But if the
test case is 1 kB long, the probability that we will randomly hit the right
pattern in the same timeframe goes down to 11%. And if it has 10 kB of
non-essential cruft, the odds plunge to 1%.

On top of that, with larger inputs, the binary may now be running 5-10x slower
than before - so the overall drop in fuzzing efficiency may easily be as high
as 500x or so.

In practice, this means that you shouldn't fuzz image parsers with your
vacation photos.
Generate a tiny 16x16 picture instead, and run it through jpegtran or pngcrunch
for good measure. The same goes for most other types of documents.
(Translator's note: these small tools use the same image-processing libraries
as the larger image parsers.)

There are plenty of small starting test cases in ../testcases/* - try them out
or submit new ones!

If you want to start with a larger, third-party corpus, run afl-cmin with an
aggressive timeout on that data set first.

2) Use a simpler target
-----------------------

Consider using a simpler target binary in your fuzzing work. For example, for
image formats, bundled utilities such as djpeg, readpng, or gifhisto are
considerably (10-20x) faster than the convert tool from ImageMagick - all while
exercising roughly the same library-level image parsing code.

Even if you don't have a lightweight harness for a particular target, remember
that you can always use another, related library to generate a corpus that will
then be manually fed to a more resource-hungry program later on.

3) Use LLVM instrumentation
---------------------------

When fuzzing slow targets, you can gain a 2x performance improvement by using
the LLVM-based instrumentation mode described in llvm_mode/README.llvm. Note
that this mode requires the use of clang and will not work with GCC.
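
As a rough sketch of what switching to the LLVM mode can look like for an
autotools-based target, the helper below points the build at AFL's clang
wrappers. The function name build_with_llvm_mode, the AFL_DIR variable, and the
paths are assumptions made for illustration; --disable-shared is the static
build option AFL's README suggests for library targets and may not apply to
every project.

```shell
# Sketch only: AFL_DIR and build_with_llvm_mode are hypothetical names.
AFL_DIR=${AFL_DIR:-/path/to/afl}   # assumed location of the built AFL tree

build_with_llvm_mode() {
    src_dir=$1   # source tree of an autotools-based target
    (
        cd "$src_dir" || exit 1
        # Point the build at AFL's clang wrappers instead of gcc/g++.
        CC="$AFL_DIR/afl-clang-fast" CXX="$AFL_DIR/afl-clang-fast++" \
            ./configure --disable-shared || exit 1
        make clean all
    )
}

# Usage (commented out so that sourcing this file builds nothing):
# build_with_llvm_mode ~/src/libfoo
```

The subshell keeps the cd from leaking into the calling shell, and the helper
stops early if ./configure fails.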

The LLVM mode also offers a "persistent", in-process fuzzing mode that can work
well for certain types of self-contained libraries, and for fast targets, can
offer performance gains up to 5-10x; and a "deferred fork server" mode that can
offer huge benefits for programs with high startup overhead (translator's note:
programs that perform a lot of initialization at startup that is unrelated to
the logic being tested). Both modes require you to edit the source code of the
fuzzed program, but the changes often amount to just strategically placing a
single line or two.

4) Profile and optimize the binary
----------------------------------

Check for any parameters or settings that obviously improve performance. For
example, the djpeg utility that comes with IJG jpeg and libjpeg-turbo can be
called with:

  -dct fast -nosmooth -onepass -dither none -scale 1/4

...and that will speed things up. There is a corresponding drop in the quality
of decoded images, but it's probably not something you care about.

In some programs, it is possible to disable output altogether, or at least use
an output format that is computationally inexpensive. For example, with image
transcoding tools, converting to a BMP file will be a lot faster than to PNG.

With some laid-back parsers, enabling "strict" mode (i.e., bailing out after
the first error) may result in smaller files and improved run time without
sacrificing coverage; for example, for sqlite, you may want to specify -bail.
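
To make the djpeg example above concrete, here is a minimal sketch of how those
switches might be passed through afl-fuzz. The run_djpeg_fuzz helper, the
AFL_FUZZ variable, the ./djpeg path, and the directory names are all
hypothetical; @@ is the placeholder that afl-fuzz replaces with the path of the
current test case.

```shell
# Sketch only: run_djpeg_fuzz and AFL_FUZZ are hypothetical names.
AFL_FUZZ=${AFL_FUZZ:-afl-fuzz}   # lets you point at a specific afl-fuzz binary

run_djpeg_fuzz() {
    in_dir=$1    # directory with small seed JPEG files
    out_dir=$2   # afl-fuzz output directory
    # The djpeg switches trade decoding quality for speed; '@@' is replaced
    # by afl-fuzz with the path of the current test case.
    "$AFL_FUZZ" -i "$in_dir" -o "$out_dir" \
        ./djpeg -dct fast -nosmooth -onepass -dither none -scale 1/4 @@
}

# Usage (commented out so that sourcing this file does not start a fuzzer):
# run_djpeg_fuzz seeds findings
```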

If the program is still too slow, you can use strace -tt or an equivalent
profiling tool to see if the targeted binary is doing anything silly.
Sometimes, you can speed things up simply by specifying /dev/null as the config
file, or disabling some compile-time features that aren't really needed for the
job (try ./configure --help). One of the notoriously resource-consuming things
would be calling other utilities via exec*(), popen(), system(), or equivalent
calls; for example, tar can invoke external decompression tools when it decides
that the input file is a compressed archive.

Some programs may also intentionally call sleep(), usleep(), or nanosleep();
vim is a good example of that. Other programs may attempt fsync() and so on
(translator's note: this causes repeated file I/O). There are third-party
libraries that make it easy to get rid of such code, e.g.:

  https://launchpad.net/libeatmydata

In programs that are slow due to unavoidable initialization overhead, you may
want to try the LLVM deferred forkserver mode (see llvm_mode/README.llvm),
which can give you speed gains up to 10x, as mentioned above.

Last but not least, if you are using ASAN and the performance is unacceptable,
consider turning it off for now, and manually examining the generated corpus
with an ASAN-enabled binary later on.
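
One way to do that later examination is sketched below: replay each file from
the fuzzer's queue through a separately built ASAN binary and list the inputs
that produce an AddressSanitizer report. The replay_corpus helper and the
example paths are hypothetical, and the sketch assumes the target takes the
input file path as its only argument.

```shell
# Sketch only: replay_corpus is a hypothetical helper name.
replay_corpus() {
    target=$1   # binary built with ASAN (assumed to take a file path argument)
    corpus=$2   # e.g. the queue/ subdirectory of an afl-fuzz output directory
    for input in "$corpus"/*; do
        [ -f "$input" ] || continue
        # Send stderr into the pipe, discard stdout, and look for the banner
        # that AddressSanitizer prints at the start of an error report.
        if "$target" "$input" 2>&1 >/dev/null | grep -q 'ERROR: AddressSanitizer'; then
            printf '%s\n' "$input"
        fi
    done
}

# Usage:
# replay_corpus ./target_asan findings/queue
```

Matching on the report banner rather than on the exit status avoids flagging
inputs that the target merely rejects as malformed.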

5) Instrument just what you need
--------------------------------

Instrument just the libraries you actually want to stress-test right now, one
at a time. Let the program use system-wide, non-instrumented libraries for any
functionality you don't actually want to fuzz. For example, in most cases, it
doesn't make sense to instrument libgmp just because you're testing a crypto
app that relies on it for bignum math.

Beware of programs that come with oddball third-party libraries bundled with
their source code (Spidermonkey is a good example of this). Check ./configure
options to use non-instrumented system-wide copies instead.

6) Parallelize your fuzzers
---------------------------

The fuzzer is designed to need ~1 core per job. This means that on a, say,
4-core system, you can easily run four parallel fuzzing jobs with relatively
little performance hit. For tips on how to do that, see parallel_fuzzing.txt.

The afl-gotcpu utility can help you understand if you still have idle CPU
capacity on your system. (It won't tell you about memory bandwidth, cache
misses, or similar factors, but they are less likely to be a concern.)

7) Keep memory use and timeouts in check
----------------------------------------

If you have increased the -m or -t limits more than truly necessary, consider
dialing them back down. (Translator's note: in most cases the defaults are
fine.)

For programs that are nominally very fast, but get sluggish for some inputs,
you can also try setting -t values that are more punishing than what afl-fuzz
dares to use on its own. On fast and idle machines, going down to -t 5 may be a
viable plan.
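
A minimal sketch of such an invocation follows. The fuzz_with_tight_timeout
helper, the AFL_FUZZ variable, and the directory and target names are
hypothetical; the -t value is given in milliseconds.

```shell
# Sketch only: fuzz_with_tight_timeout and AFL_FUZZ are hypothetical names.
AFL_FUZZ=${AFL_FUZZ:-afl-fuzz}   # lets you point at a specific afl-fuzz binary

fuzz_with_tight_timeout() {
    in_dir=$1       # seed directory
    out_dir=$2      # afl-fuzz output directory
    timeout_ms=$3   # per-execution timeout in milliseconds, e.g. 5
    shift 3         # what remains is the target command line
    "$AFL_FUZZ" -i "$in_dir" -o "$out_dir" -t "$timeout_ms" "$@"
}

# Usage (commented out so that sourcing this file does not start a fuzzer):
# fuzz_with_tight_timeout seeds findings 5 ./target @@
```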

The -m parameter is worth looking at, too. Some programs can end up spending a
fair amount of time allocating and initializing megabytes of memory when
presented with pathological inputs. Low -m values can make them give up sooner
and not waste CPU time.

8) Check OS configuration
-------------------------

There are several OS-level factors that may affect fuzzing speed:

  - High system load. Use idle machines where possible. Kill any non-essential
    CPU hogs (idle browser windows, media players, complex screensavers, etc).

  - Network filesystems, either used for fuzzer input / output, or accessed by
    the fuzzed binary to read configuration files (pay special attention to the
    home directory - many programs search it for dot-files).

  - On-demand CPU scaling. The Linux 'ondemand' governor performs its analysis
    on a particular schedule and is known to underestimate the needs of
    short-lived processes spawned by afl-fuzz (or any other fuzzer). On Linux,
    this can be fixed with:

      cd /sys/devices/system/cpu
      echo performance | tee cpu*/cpufreq/scaling_governor

    On other systems, the impact of CPU scaling will be different; when
    fuzzing, use OS-specific tools to find out if all cores are running at full
    speed.

  - Transparent huge pages. Some allocators, such as jemalloc, can incur a
    heavy fuzzing penalty when transparent huge pages (THP) are enabled in the
    kernel. You can disable this via:

      echo never > /sys/kernel/mm/transparent_hugepage/enabled

  - Suboptimal scheduling strategies.
    The significance of this will vary from one target to another, but on
    Linux, you may want to make sure that the following options are set:

      echo 1 >/proc/sys/kernel/sched_child_runs_first
      echo 1 >/proc/sys/kernel/sched_autogroup_enabled

    Setting a different scheduling policy for the fuzzer process - say
    SCHED_RR - can usually speed things up, too, but needs to be done with
    care.

9) If all other options fail, use -d
------------------------------------

For programs that are genuinely slow, in cases where you really can't escape
using huge input files, or when you simply want to get quick and dirty results
early on, you can always resort to the -d mode.

The mode causes afl-fuzz to skip all the deterministic fuzzing steps, which
makes output a lot less neat and can ultimately make the testing a bit less
in-depth, but it will give you an experience more familiar from other fuzzing
tools.