Dedicated Routing Tables in Linux

Jun 3, 2013 How To

Linux Routing

Linux is a very powerful platform, and it is the foundation for thousands of applications and software suites. Its flexibility when it comes to networking is second to none, especially for power users. Although other platforms like Windows support manual routing through multiple gateways, they do not support policy-based routing. This is where Linux has the upper hand.

Note: The steps below have been completed and tested on Ubuntu 12.04.2.

Multiple Interfaces

We recently built and installed a new server which has 4 network interfaces. Two of them are copper/RJ-45 ports and two of them are optical Gigabit Ethernet ports. Running lspci on this box lists the PCI hardware in the server; filtering the output for Ethernet gives our 4 interfaces.

$ lspci -nn | grep Ethernet
03:03.0 Ethernet controller [0200]: Broadcom Corporation NetXtreme II BCM5706 Gigabit Ethernet [14e4:164a] (rev 02)
07:01.0 Ethernet controller [0200]: Broadcom Corporation NetXtreme II BCM5706S Gigabit Ethernet [14e4:16aa] (rev 02)
07:02.0 Ethernet controller [0200]: Broadcom Corporation NetXtreme II BCM5706S Gigabit Ethernet [14e4:16aa] (rev 02)
07:03.0 Ethernet controller [0200]: Broadcom Corporation NetXtreme II BCM5706 Gigabit Ethernet [14e4:164a] (rev 02)

You can pair up the interfaces by their [vendor:device] IDs: the two 14e4:164a entries are the copper ports and the two 14e4:16aa entries are the optical ports.

To figure out the name of each interface, you can take its PCI address and grep dmesg:

dmesg | grep "07:03.0"

The response would look like:

[3.852851] bnx2 0000:07:03.0: eth3: Broadcom NetXtreme II BCM5706 1000Base-T (A2) PCI-X 64-bit 100MHz found at mem f6000000, IRQ 19, node addr 00:0a:ba:di:d3:a0

The optical Gigabit Ethernet interface at 07:02.0 looks like this:

[3.371963] bnx2 0000:07:02.0: eth2: HP NC370F Multifunction Gigabit Server Adapter (A2) PCI-X 64-bit 100MHz found at mem f8000000, IRQ 18, node addr 00:0a:ba:di:d3:a1

Right after the PCI address is the interface name. For the two examples above, the names are eth3 and eth2 respectively. We will need to know these interface names so we can configure them and manage the IP rules. The next step for you would be to configure your /etc/network/interfaces file or the files in /etc/sysconfig/network-scripts/, depending on your distribution.
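As an alternative to grepping dmesg, the same mapping can usually be read from sysfs. The sketch below assumes PCI network cards whose /sys/class/net/NAME/device link points at the PCI device directory; the function name list_iface_pci is made up for this example, and the optional argument exists only so the loop can be pointed at a different directory.

```shell
#!/bin/sh
# Print "interface pci-address" for every interface backed by a device.
# Assumption: for PCI NICs the "device" symlink resolves to a directory
# named after the PCI address (for example 0000:07:03.0).
list_iface_pci() {
    net_dir="${1:-/sys/class/net}"
    for path in "$net_dir"/*; do
        # Virtual interfaces such as lo have no device link; skip them.
        [ -e "$path/device" ] || continue
        pci=$(basename "$(readlink -f "$path/device")")
        printf '%s %s\n' "$(basename "$path")" "$pci"
    done
}

list_iface_pci
```

On a machine like the one above, each output line would pair an interface name with one of the addresses reported by lspci.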

Tip! You can always look at the contents of /etc/udev/rules.d/70-persistent-net.rules, which will show you the name assigned to each interface. You can rename your interfaces if you want. For example, above eth0 and eth3 are the same kind of card, but their names are separated by the two optical ports. We can rename them by changing the NAME value in the matching rule.
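A rename of this kind could be scripted roughly as follows. This is only a sketch: rename_iface is a hypothetical helper, it assumes each rule sits on one line containing ATTR{address}=="<mac>" and NAME="<name>" (the layout Ubuntu 12.04 generates), and it relies on GNU sed for the -i option. The MAC must be written exactly as it appears in the file, normally in lowercase.

```shell
#!/bin/sh
# rename_iface RULES_FILE MAC NEW_NAME
# Rewrites NAME="..." on the rule line that matches the given MAC address.
# Lines for other MAC addresses are left untouched.
rename_iface() {
    rules="$1"; mac="$2"; new="$3"
    sed -i "/ATTR{address}==\"$mac\"/ s/NAME=\"[^\"]*\"/NAME=\"$new\"/" "$rules"
}

# Example (work on a copy first, then install it with sudo and reboot):
#   cp /etc/udev/rules.d/70-persistent-net.rules /tmp/net.rules
#   rename_iface /tmp/net.rules 00:11:22:33:44:55 mgmt
```

The MAC address in the usage comment is a placeholder, not one of the cards from this server.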

Managing the Routing Table

To aid in explanation, here is our interface configuration:

$ sudo ifconfig -s
Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0 1500 0 2694 0 0 0 6 0 0 0 BMRU
fab0 1500 0 5517 0 0 0 3224 0 0 0 BMRU
fab1 1500 0 2682 0 0 0 6 0 0 0 BMRU
lo 16436 0 256 0 0 0 256 0 0 0 LRU
mgmt 1500 0 1109 0 0 0 75 0 0 0 BMRU

You can see we named our 4 interfaces.

Normally, when traffic is received on an interface, the reply is routed back out whichever interface holds the default route, which is not necessarily the one it arrived on. This can break IP flows, especially if you are crossing a firewall or using NAT. Many systems nowadays that use wireless LAN and Ethernet at the same time have an individual default route for each interface, which is why you do not see this issue there.

To resolve this, you would create an independent routing table for each interface.

Step 1) Create a new routing table (note the -a, which appends to the file instead of overwriting it):

echo "61 eth0" | sudo tee -a /etc/iproute2/rt_tables

Step 2) Add Default Routes for each Interface:

sudo ip route add default dev eth0 table eth0

Step 3) Add Routing Policy for each Interface:

$ sudo ip rule add from 10.1.101.61 table eth0

And that’s it! Now, traffic sent from 10.1.101.61, such as a reply to a connection made to that address, will be routed out the eth0 interface instead of the mgmt interface! Repeat the above steps for every interface you have.
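Repeating the three steps by hand gets tedious, so here is a small sketch that prints the commands for each interface without running them. The helper name policy_cmds and the table IDs 62 and 63 are assumptions made for illustration; only the "61 eth0" entry comes from the steps above. Any unused table ID should work in their place.

```shell
#!/bin/sh
# policy_cmds TABLE_ID IFACE ADDR
# Prints (does not execute) the three commands for one interface,
# using the interface name as the routing table name.
policy_cmds() {
    id="$1"; iface="$2"; addr="$3"
    printf 'echo "%s %s" | sudo tee -a /etc/iproute2/rt_tables\n' "$id" "$iface"
    printf 'sudo ip route add default dev %s table %s\n' "$iface" "$iface"
    printf 'sudo ip rule add from %s table %s\n' "$addr" "$iface"
}

policy_cmds 61 eth0 10.1.101.61
policy_cmds 62 fab0 10.1.101.62   # table ID 62 is an assumption
policy_cmds 63 fab1 10.1.101.63   # table ID 63 is an assumption
```

Printing first makes it easy to review the commands before pasting them into a root shell.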

If you run ip route you should see the main routing table (the eth0 table is kept separately and can be listed with ip route show table eth0):

~$ sudo ip route
default dev mgmt scope link
10.1.32.0/24 dev mgmt proto kernel scope link src 10.1.32.61
10.1.101.0/24 dev eth0 proto kernel scope link src 10.1.101.61
10.1.101.0/24 dev fab0 proto kernel scope link src 10.1.101.62
10.1.101.0/24 dev fab1 proto kernel scope link src 10.1.101.63
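The policy rule itself does not appear in ip route; it is listed by ip rule show. As a rough check, the sketch below reads that listing on standard input and reports whether a rule for a given source address and table exists. The function name has_rule is invented for this example, and it assumes the usual "PRIORITY: from ADDRESS lookup TABLE" line format.

```shell
#!/bin/sh
# has_rule ADDR TABLE
# Reads `ip rule show` output on stdin; exits 0 if a line of the form
# "<prio>: from ADDR lookup TABLE" is present, 1 otherwise.
has_rule() {
    awk -v a="$1" -v t="$2" '
        $2 == "from" && $3 == a && $4 == "lookup" && $5 == t { found = 1 }
        END { exit !found }
    '
}

# Example:
#   ip rule show | has_rule 10.1.101.61 eth0 && echo "rule present"
```

If the table name is missing from /etc/iproute2/rt_tables, ip rule show prints the numeric ID instead, so pass 61 rather than eth0 in that case.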

Implementation

The above steps work great, but the routes and rules do not survive a restart of the machine. In addition, the default route will be assigned to the first interface that comes up, which can differ from one boot to the next. We can prevent this by editing /etc/network/interfaces.

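Add the following lines to your /etc/network/interfaces. Each up line is attached to the iface stanza that precedes it, so lines appended to the end of the file run when the last interface defined there comes up:

# Adds default route for box
up route add default dev mgmt

# Adds default route for eth0 interface
up ip route add default dev eth0 table eth0
up ip rule add from 10.1.101.61 table eth0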

Now when you restart your machine, the above commands will be executed and restore your interface routes and rules.

Worth Mentioning

There are a few gotchas that may leave you scratching your head. Here are some tips and tricks to solve them:

Problem 1: You receive an "interface not configured" error when using ifdown:

ifdown: interface ethx not configured

The fix is to use the ip link command instead. You can change the state of the interface by issuing sudo ip link set ethx down.

Problem 2: Gigabit Fiber interface does not come up

sudo ethtool -r ethx

This uses ethtool, which is a very handy CLI networking tool. The -r option tells the interface to restart auto-negotiation. Sometimes on system start, the interface does not complete auto-negotiation and needs a little shove.
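If you only want to give that shove when the link is actually down, something like the following sketch could be used. The helper link_detected is hypothetical; it assumes ethtool prints a "Link detected: yes" line for an interface with link, and ethx stands in for your real interface name.

```shell
#!/bin/sh
# link_detected
# Reads `ethtool IFACE` output on stdin; exits 0 if the link is reported up.
link_detected() {
    grep -q 'Link detected: yes'
}

# Example: renegotiate only when no link is detected.
#   sudo ethtool ethx | link_detected || sudo ethtool -r ethx
```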

Problem 3: Multiple interfaces within the same subnet on a Virtual Machine Guest

This is a little trickier. Because of the way the interfaces interact with the host machine at Layer 2, you will need to apply ARP blocking on the non-desired interfaces. One utility for this is arptables. It works much like iptables, but on ARP traffic at Layer 2.

For example, if you do not want interface fab0 (10.1.101.62) responding to traffic destined to fab1 (10.1.101.63), you can do:

sudo arptables -A INPUT -j DROP -i fab0 ! -d 10.1.101.62
sudo arptables -A INPUT -j DROP -i fab1 ! -d 10.1.101.63

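For this to take effect, you must enable ARP filtering at the kernel level. The write has to happen as root, so pipe through sudo tee, then reload the settings:

echo "net.ipv4.conf.all.arp_filter = 1" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p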

To view your arptables rules, execute:

~$ sudo arptables -vnL
Chain INPUT (policy ACCEPT 988 packets, 27664 bytes)
-j DROP -i fab0 -o * ! -d 10.1.101.62 , pcnt=43899 -- bcnt=1229K
-j DROP -i fab1 -o * ! -d 10.1.101.63 , pcnt=44655 -- bcnt=1250K


Chain OUTPUT (policy ACCEPT 988 packets, 27664 bytes)


Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)

This will prevent the upstream switch from storing the wrong MAC address in its ARP table and subsequently sending traffic to the wrong interface.

