After my good experience with NIC bonding on Fedora 7, I thought I’d implement it on my CentOS 5.2 fileserver. Big mistake.

On top of the built-in 100BaseT NIC and the existing gigabit card, I installed a second gigabit NIC, which uses the r8169 driver the same as the other one but is a different brand of card. That of course caused the network cards to come up in a random order from one boot to the next, so bond0 wouldn’t come up. As you can’t put the MAC address in the config file when using bonding, and my motherboard is modern so it auto-assigns IRQs, I ended up having to blacklist the forcedeth module to prevent the built-in NIC coming up, leaving just the two gigabit cards.

So I got that working. However, the traffic coming out of bond0 was absurdly slow. I thought it might be ARP caching, so I rebooted the router and the other desktop machine: no difference. I changed the bond mode from round-robin (0) to XOR (2): no difference. I also noticed that if you try to restart the network service or ifdown bond0, the system hangs. Unplugging one NIC seems to kill the link altogether. Starting to think NIC bonding is not so stable on CentOS…

Anyway, after running iperf and finding that the bonded speed was the same as with just one NIC, I reversed all the settings and removed the extra network card, noticing as I did so that both NICs were very hot to the touch.
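One possible explanation for the iperf result, at least in XOR mode: the kernel bonding documentation describes the default transmit hash as (source MAC XOR destination MAC) modulo slave count, so every frame between the same two machines leaves through the same slave. The Python sketch below only illustrates that idea and is not the kernel's code. It assumes the hash works on the last byte of each MAC address, and the MAC addresses in it are made up for the example.

```python
def mac_to_bytes(mac: str) -> bytes:
    """Convert a MAC address like '00:11:22:33:44:55' to raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def xor_slave_index(src_mac: str, dst_mac: str, slave_count: int) -> int:
    """Pick a slave with (src XOR dst) modulo slave count.

    Assumption: only the last byte of each MAC address is hashed.
    """
    src = mac_to_bytes(src_mac)
    dst = mac_to_bytes(dst_mac)
    return (src[-1] ^ dst[-1]) % slave_count


# Hypothetical addresses, purely for illustration.
server = "00:11:22:33:44:55"
peers = {
    "desktop": "00:aa:bb:cc:dd:0e",
    "router": "00:aa:bb:cc:dd:01",
}

for name, peer_mac in peers.items():
    # The result depends only on the two MAC addresses, so a single
    # iperf run to one peer never spreads across both slaves.
    index = xor_slave_index(server, peer_mac, slave_count=2)
    print(f"{name}: slave {index}")
```

If that is what happened here, a single iperf run between two machines would never go faster than one NIC in XOR mode, however many cards are in the bond. It wouldn’t explain the slow round-robin result or the hangs though.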

So scrub that idea. Very strange though. I wonder if there was some IRQ/DMA weirdness going on with two nearly identical cards, especially given the overheating. I’ve got a feeling that one card was maybe being fed all the traffic bound for the two. I did a quick search and found at least one other person who found bonding on CentOS very slow, so maybe it’s the old 2.6.18 kernel’s bonding module.

I’m really leaning towards making my Pentium 4 (the F7 box) into a new fileserver. I could wait for CentOS 5.3 to be released, or stick Ubuntu Server 8.04.2 LTS or Debian 5.0 on there to get a newer 2.6.24/2.6.26 kernel but retain the 3-year support model. It just seems a waste to have a 3GB/3GHz machine as a fileserver, although it isn’t doing much right now…

I also used the new Acronis True Image Echo Server to back up the fileserver’s boot drive over the network, then I plugged in an identical 80GB IDE disk and cloned the boot drive to that for good measure. The disk clone took about 8 minutes; I think the network backup took about 40 minutes.