---
title: "Few performance facts about bgpgrep"
mobile_menu_title: "bgpgrep performance facts"
date: 2021-10-19
description: "After a complete codebase rewrite, some tweaks, and a lot of new features, it is time to look back and enjoy some benchmarking between bgpgrep and bgpscanner. Let's see how much things have improved and why."
series: [ "ubgpsuite - The Micro BGP Suite" ]
categories: [ "benchmarks", "development" ]
tags: [ "ubgpsuite", "bgpscanner", "bgpgrep", "Networking", "BGP", "benchmarks" ]
news_keywords: [ "ubgpsuite", "bgpscanner", "bgpgrep", "benchmarks" ]
---

## Few performance facts about bgpgrep

If you are a performance junkie like me, the first question that probably pops into your mind after a major code rewrite is something like: *Is it faster than before?*

Let's now satisfy this curiosity (of mine) with some benchmarking.

## Benchmark environment

* Processor: Intel® Core™ i7-8565U at 1.80GHz (4 physical cores, 8 logical cores with hyperthreading)
* Cache layout:
    - L1 data cache: 128 KiB (4 instances)
    - L1 instruction cache: 128 KiB (4 instances)
    - L2 cache: 1 MiB (4 instances)
    - L3 cache: 8 MiB (1 instance)
* Memory: 16 GB DDR4 RAM, in two 8 GB banks
* Hard disk: SAMSUNG MZALQ512HALU-000L1
* Kernel: Linux 5.10.62-1-lts SMP x86_64 GNU/Linux

To avoid adulterating the results, we also:

- disable `cron` and any other background file indexing service;
- force the performance CPU profile, disabling powersave mode;
- disable [Linux address space layout randomization](https://en.wikipedia.org/wiki/Address_space_layout_randomization) for the duration of our tests;
- increase the kernel performance events sample rate;
- drop filesystem caches and clean up any temporary files;
- run benchmarks in console mode, outside any desktop environment.

Both `bgpscanner` and `bgpgrep` have been compiled in release mode with full optimizations, as documented in their official build instructions, using `clang` version 12.0.1.
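
The preparation steps listed above might look roughly like the following script. This is only a sketch under stated assumptions, not the procedure actually used for these benchmarks: the `run` and `prepare_env` helpers and the `PERF_RATE` value are hypothetical, `cron` is assumed to be `cronie` managed by systemd, and the CPU governor is assumed to be set through `cpupower`. By default the script only prints the commands; setting `APPLY=1` as root would execute them. Temporary file cleanup is not covered.

```sh
#!/bin/sh
# Dry run by default: print the commands instead of executing them.
APPLY=${APPLY:-0}
# Hypothetical sample rate, chosen only for illustration.
PERF_RATE=${PERF_RATE:-200000}

run() {
    if [ "$APPLY" = 1 ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

prepare_env() {
    # Assumption: cron is the cronie service under systemd.
    run systemctl stop cronie.service
    # Force the performance CPU frequency governor.
    run cpupower frequency-set -g performance
    # Disable address space layout randomization until reset or reboot.
    run sysctl -w kernel.randomize_va_space=0
    # Raise the perf events sample rate limit.
    run sysctl -w kernel.perf_event_max_sample_rate="$PERF_RATE"
    # Flush dirty pages, then drop page cache, dentries and inodes.
    run sync
    run sysctl -w vm.drop_caches=3
}

prepare_env
```
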
For reference, we also run the benchmark with `bgpdump` version 1.6.2, as available from the [Arch User Repository (AUR)](https://aur.archlinux.org/packages/bgpdump/).

Results are calculated by averaging five runs of each command, immediately after one warmup round. MRT data is decompressed upfront to avoid accounting for decompression overhead, and the output is sent directly to `/dev/null` to avoid any disk write overhead.

## The show's on

We take the data for the first benchmark from RouteViews' [Sydney Route Collector](http://archive.routeviews.org/route-views.sydney/bgpdata), and pull the very first RIB of December 2020, along with any subsequent updates from the same month. This gives us 47.1 GB of uncompressed MRT data to work with.

We then run our benchmarks with the following commands:

```sh
bgpgrep sydney/2020-12/uncompressed.mrt >/dev/null
bgpscanner sydney/2020-12/uncompressed.mrt >/dev/null
bgpdump -mv sydney/2020-12/uncompressed.mrt >/dev/null
```

|            | Average (sec) | Best (sec) | Worst (sec) | Memory (KiB) |
|------------|---------------|------------|-------------|--------------|
| bgpgrep    | 404.45        | 401.62     | 411.38      | 2076         |
| bgpscanner | 453.59        | 451.93     | 455.13      | 2448         |
| bgpdump    | 2053.73       | 2037.19    | 2082.22     | 2316         |

On average, `bgpgrep` takes about 11% less time than `bgpscanner`, which is good. Since this benchmark operates mostly on MRT update dumps, let's try the same on a different dataset, mostly made of RIBs.

We pull nine RIBs from the RIPE NCC RIS [RRC00 Route Collector](https://data.ris.ripe.net/rrc00/2019.12/), and obtain 25.7 GB worth of uncompressed MRT data. This time the benchmark is limited to `bgpgrep` and `bgpscanner`.
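
As an aside on methodology, the measurement procedure described earlier (one warmup round, then five timed runs reduced to average, best and worst) could be scripted along these lines. This is a minimal sketch, not the harness actually used for the numbers in this post: the `bench` and `summarize` helpers and the `RUNS` variable are hypothetical names, timestamps assume GNU `date` with its `%N` format, and memory usage is not measured.

```sh
#!/bin/sh
# Hypothetical timing harness: one warmup round, then RUNS timed rounds.
# Assumes GNU date (the %N nanoseconds format is not in POSIX).
LC_ALL=C; export LC_ALL
RUNS=${RUNS:-5}

# Read one elapsed time in seconds per line; print average, best and worst.
summarize() {
    awk '
        NR == 1 { best = $1; worst = $1 }
        {
            sum += $1
            if ($1 < best) best = $1
            if ($1 > worst) worst = $1
        }
        END {
            if (NR > 0)
                printf "avg=%.2f best=%.2f worst=%.2f\n", sum / NR, best, worst
        }
    '
}

# bench CMD [ARGS...]: time CMD with its standard output discarded.
bench() {
    "$@" >/dev/null                 # warmup round, not measured
    i=0
    while [ "$i" -lt "$RUNS" ]; do
        start=$(date +%s.%N)
        "$@" >/dev/null
        end=$(date +%s.%N)
        awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f\n", e - s }'
        i=$((i + 1))
    done | summarize
}

# Only run when a command is given on the command line.
if [ "$#" -gt 0 ]; then
    bench "$@"
fi
```

Saved under a hypothetical name such as `bench.sh`, it would be invoked as `sh bench.sh bgpgrep sydney/2020-12/uncompressed.mrt`.
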
Executed commands and results for the RRC00 dataset:

```sh
bgpgrep rrc00/2019-12/rib-uncompressed.mrt >/dev/null
bgpscanner rrc00/2019-12/rib-uncompressed.mrt >/dev/null
```

|            | Average (sec) | Best (sec) | Worst (sec) | Memory (KiB) |
|------------|---------------|------------|-------------|--------------|
| bgpgrep    | 295.84        | 292.20     | 298.14      | 2112         |
| bgpscanner | 333.35        | 321.73     | 339.56      | 3016         |

The same trend is confirmed: `bgpgrep` again takes about 11% less time, indicating that the advantage is not data dependent.

Running our benchmarks under average system load, though, leads to an interesting surprise:

```sh
bgpgrep isolario/2021-07/rib-uncompressed.mrt >/dev/null
bgpscanner isolario/2021-07/rib-uncompressed.mrt >/dev/null
```

|            | Average (sec) | Best (sec) | Worst (sec) | Memory (KiB) |
|------------|---------------|------------|-------------|--------------|
| bgpgrep    | 344.90        | 342.88     | 347.03      | 2260         |
| bgpscanner | 411.39        | 405.13     | 412.70      | 2436         |

These runs have been performed under a regular GNOME desktop session, with other applications running. We used 60.8 GB worth of MRT data from the Isolario project [Dagobah Collector](https://isolario.it/Isolario_MRT_data/Dagobah/), from July 2021 (mostly RIBs).

It might strike us that the performance gain now approaches 20%: on average `bgpscanner` takes about 19% longer than `bgpgrep` or, equivalently, `bgpgrep` takes about 16% less time. The reason might be a smarter use of memory and the reduced chance of page faults.

You might have noticed from our results that `bgpgrep`'s memory requirements are modest compared to `bgpscanner`'s. What's less evident is that `bgpgrep` also keeps its data structures compact and avoids moving them around much. This lessens the memory pressure on the system (and makes the CPU cache happier).

The net effects of this aren't evident in the benchmarking environment, since `bgpgrep` and `bgpscanner`, each in turn, are the only resource-intensive task on the system. The initial warmup round contributes to their ideal performance.
When more tasks are concurrently fighting over memory, and processes might get migrated to different cores for various reasons, invalidating their cache, the value of `bgpgrep`'s approach becomes more prominent.

## Conclusion

`bgpgrep` seems to be a nice improvement over `bgpscanner`, and I am quite satisfied with the performance improvements, especially since they come with a more solid codebase.

In the next few weeks I intend to improve the filtering engine. In general, I'd like to pause for a bit and polish the codebase to make it more mature before moving on to implement more features.

As always, happy hacking to you all!

Lorenzo Cogotti