Improving Network Performance with Linux Flowtables

March 4, 2024 · 5 min read
Furkan Sahin
Senior Software Engineer

We’re building an open source alternative to AWS. Among other things, that means running a ton of VMs, which we do on Linux. We rely on Linux KVM for virtualization, and keep each VM in a separate namespace for isolation.

In a setup like this, the networking stack has to provide encryption in transit, dynamically assign public IPv4 addresses to VMs, and allow flexible firewall rules. For encryption, we can offload logic to the underlying network card, if the card supports it. This saves CPU cycles on the machine and improves VM performance. For IPv4 assignment and firewall rules, we use Linux’s Netfilter / Nftables. This subsystem provides a powerful way to handle packets addressed to the host.

We started to wonder if we could offload some of the packet processing logic to the network card, similarly to what we can do with encryption, in order to save even more CPU cycles. While investigating, we came across flowtables—a network acceleration feature in the kernel that works like a routing cache. That is, the kernel remembers the routing decisions for packets that belong to a particular connection.

When we introduced flowtables into our stack, it reduced network latencies by 7.5%, and all it took was a 7-line change. We thought that was remarkable and worth sharing! So in this post, we describe how we use Netfilter and flowtables for packet processing, and include a simple benchmark that shows flowtables’ benefits.

Background

Ubicloud uses an established pattern in building public cloud services. A control plane manages a data plane, where the data plane usually uses open source software. The control plane holds the data model, responds to web requests, and coordinates changes to the data plane (nodes).

For example, when the user wants to update firewall rules on a VM, we register this change with the control plane. The control plane then finds the corresponding bare metal instance running the VM and pushes changes to the data plane. Ubicloud’s data plane then reprograms the Linux hosts’ networking rules for the firewall changes to take effect.

To reprogram the host OS, Ubicloud uses the Netfilter project. Netfilter is a framework inside the Linux kernel that provides hooks for features like packet filtering, connection tracking, and network address translation (NAT). For firewall rules, we use Netfilter’s packet filtering feature. (For assigning IPv4 addresses to VMs, we use the NAT feature.) Let’s look at how firewall rules work in a bit more detail.

An Example: Implementing Firewall Rules

Classic forwarding path for a packet. Please see Acknowledgements, CC BY-SA 4.0 license

The above diagram displays the classic packet forwarding path within the Linux kernel, also identifying the Netfilter hooks. To make things more concrete, let’s consider an example where we implement firewall rules in Ubicloud. In this example, on a VM, we want to allow incoming TCP connections on port 5432 for all IP addresses, and reject other traffic.

The VM has an IPv4 address of 12.12.12.12. A packet then arrives at the host with the following header fields:

  1. Source address: 11.11.11.11
  2. Destination address: 12.12.12.12
  3. Destination port: 5432
  4. Protocol: TCP

The routing lookup on the host detects that the packet isn’t for itself and needs to be forwarded to the VM. As a result, the packet traverses the Netfilter forward hook. The forward stage then has the following filter (which we configured) that needs to be applied:


  ip saddr 0.0.0.0/0 tcp dport 5432 ip daddr 12.12.12.12 accept

This rule says: if the packet comes from any address, uses the TCP protocol, and has destination port 5432 and destination IPv4 address 12.12.12.12, accept it. So, the forward hook simply passes the packet to the post-routing hook to be sent to the destination address. If any of these fields doesn’t match the rule, Nftables checks the other rules in the chain. If no other rule matches, Nftables falls back to the chain’s default policy to take the appropriate action.
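
As a minimal sketch, the rule above could sit in a forward chain like the one below. The table name vm_fw, the chain name, the drop policy, and the connection-tracking rule for reply traffic are assumptions made for illustration; they are not Ubicloud’s actual ruleset.

```nft
# Hypothetical table and chain names, for illustration only.
table inet vm_fw {
    chain forward {
        # Base chain attached to the Netfilter forward hook.
        # Packets that match no rule fall through to the chain policy (drop).
        type filter hook forward priority 0; policy drop;

        # Assumed helper rule: let packets of already-accepted connections
        # through, including replies from the VM.
        ct state established,related accept

        # The rule from the example: allow TCP port 5432 to the VM from anywhere.
        ip saddr 0.0.0.0/0 tcp dport 5432 ip daddr 12.12.12.12 accept
    }
}
```

A chain policy can only be accept or drop. To actively reject unmatched traffic rather than silently drop it, a final reject rule could be appended to the chain instead.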

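What’s interesting here is that a lot of the work is done in the pre-routing, routing, forward, and post-routing stages, and the kernel repeats that work for every packet, even when the packets belong to the same connection.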

Flowtables: Optimizing Network Traffic Handling

Flowtables is an optimization to improve network packet throughput. By remembering and reusing connection-based packet processing decisions, flowtables reduces the number of repetitive processing steps for each packet. This in turn reduces CPU utilization and improves network latency and throughput.

Adding flowtables into a packet's forwarding path. Please see Acknowledgements, CC BY-SA 4.0 license

The previous diagram shows how Netfilter / Nftables work when flowtables are applied. Further, since flowtables integrate nicely with Netfilter, enabling them is straightforward. In Ubicloud’s case, enabling flowtables just took seven lines of code!
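
For illustration, a flowtable setup in nftables might look roughly like the sketch below. This is not Ubicloud’s actual seven-line change; the table name vm_fw, the flowtable name ft, and the device names eth0 and vm_tap0 are hypothetical placeholders, and the filter rule is the one from the firewall example above.

```nft
# Hypothetical names throughout; adjust devices to the host's real interfaces.
table inet vm_fw {
    # The flowtable hooks into ingress on the listed devices. Packets of
    # offloaded flows are forwarded from there and skip the classic path.
    flowtable ft {
        hook ingress priority 0
        devices = { eth0, vm_tap0 }
    }

    chain forward {
        type filter hook forward priority 0; policy drop;

        # Ask the kernel to add tracked TCP/UDP flows to the flowtable.
        # This statement is not a verdict, so evaluation continues below.
        meta l4proto { tcp, udp } flow add @ft

        # Assumed helper rule for packets of already-accepted connections.
        ct state established,related accept

        # Firewall rule from the earlier example.
        ip saddr 0.0.0.0/0 tcp dport 5432 ip daddr 12.12.12.12 accept
    }
}
```

The first packets of a connection still traverse the forward chain and must be accepted by the filter rules. Only later packets of a flow that has been added to the flowtable take the fast path, so the firewall decision is made once and then reused.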

Simple Latency Benchmarks with PostgreSQL

We designed a simple benchmark to measure our networking stack’s latency at the host level. For this, we created a Ubicloud PostgreSQL instance and installed pgbench on the host machine. pgbench is a simple benchmarking tool provided by PostgreSQL; it’s a nice fit for simple benchmarking because we can tweak pgbench’s parameters to focus on the networking overhead.

We first initialized pgbench using its initialization option (-i) and then ran pgbench for benchmarking:


  pgbench -i -s 100 demo-pg.postgresql.ubicloud.com

  pgbench -c 1 -j 1 -T 60 -P 1 -S demo-pg.postgresql.ubicloud.com

By keeping the client and thread counts at one, we could better isolate the flowtable optimization’s impact. Additionally, we ran pgbench directly from the host against Ubicloud’s managed PostgreSQL. This way, we could remove any variance associated with taking an actual network hop and only measure the end-to-end latency for one pgbench SELECT query.

Our observations were clear and consistent:

  • Without flowtables (our original Netfilter / Nftables implementation), the average latency was 0.127ms.
  • With flowtables, the average latency decreased to 0.118ms. This showed a latency improvement of 7.5%.

There are two ways to think about these improvements. First, the benchmark ran on the host itself, so it didn’t include any network latency between machines. In real-life setups, where that latency adds to the total, we’d expect the relative latency benefit to be lower.

Second, and on the flip side, most of the end-to-end latency was associated with the pgbench client sending a query to PostgreSQL and receiving the reply, and we didn’t try to measure that portion separately. Intuitively, we’d expect flowtables’ throughput benefits (the CPU cycles shaved off) to be more important than the latency benefits.

Conclusion

We use Netfilter / Nftables on our data plane bare metal instances to provide cloud networking services. These Linux kernel features are reliable and portable. Recently, we introduced flowtables into our networking setup at Ubicloud. The change took seven lines of code and improved latency by 7.5% in a simple application benchmark.

As we work on Ubicloud, we’re actively learning more about and making improvements to our networking layer. If you have any questions or feedback for us, we’d love to hear from you. Please feel free to drop us a line at [email protected].

Acknowledgements

As we worked to introduce flowtables into Ubicloud’s networking stack, a significant portion of our understanding came from Andrej Stender’s blog post, "Flowtables: A Netfilter nftables Fastpath". The post provides an in-depth look into flowtables and their benefits. We’d like to thank Andrej for that comprehensive work!