Troubleshooting Multi-WAN Load Balancing
This document provides troubleshooting steps for load balancing failures, focusing strictly on traffic that is not distributed properly across multiple WAN connections. The principal symptom is that all outbound traffic in a multi-WAN setup uses a single WAN interface, which defeats the purpose of load balancing: spreading traffic across several WANs for higher bandwidth and redundancy. The steps below help you verify the configuration, diagnose routing problems, and confirm that the multi-WAN service is functioning correctly.
Summary Table for Common Issues and Fixes
| Issue | Possible Cause | Recommended Action |
|---|---|---|
| All traffic on one WAN | Incorrect weight/metric or disabled interface | Check and correct weights and enable all intended interfaces |
| Load balancing not happening | mwan3 service stopped or misconfigured | Start/restart mwan3 service; verify config files |
| Ping fails on specific WAN | WAN connection problem or tracking IP unreachable | Check physical connection and ISP status; verify track IPs |
| Traceroute shows single path | Routing or policy misconfiguration | Check mwan3 policies and rules; ensure load balancing enabled |
| Logs show errors related to mwan3 | Config errors or interface issues | Review and fix errors from logs |
Troubleshooting Steps
- Cloud
- UCI
- Run-Time
- Testing
- Log
Cloud Configuration Verification
Verify Current Load Balancing Configuration:
Log into the CE device and gain root access:
sudo su -
Display the last applied load balancing configuration in a readable format:
cat /tmp/last_config_response.json | jq .multiWanV2
The following is example output only; the values on your device will differ.
Example Response
{
"enable": true,
"mode": "LOAD_BALANCE",
"notificationEmails": [],
"wanInterfaces": null,
"wanInterfacesConfig": {
"pppoe0": {
"interfaceName": "pppoe0",
"targetIps": [
"8.8.8.8",
"4.2.2.2"
],
"failureInterval": 5,
"recoveryInterval": 5,
"pingInterval": 5,
"pingTimeout": 2,
"multiWANMetric": 3,
"multiWANWeight": 2,
"enable": false
},
"eth1": {
"interfaceName": "eth1",
"targetIps": [
"8.8.8.8",
"4.2.2.2"
],
"failureInterval": 5,
"recoveryInterval": 5,
"pingInterval": 5,
"pingTimeout": 2,
"multiWANMetric": 1,
"multiWANWeight": 2,
"enable": true
},
"eth0": {
"interfaceName": "eth0",
"targetIps": [
"8.8.8.8",
"4.2.2.2"
],
"failureInterval": 5,
"recoveryInterval": 5,
"pingInterval": 5,
"pingTimeout": 2,
"multiWANMetric": 2,
"multiWANWeight": 2,
"enable": true
},
"wlm0": {
"interfaceName": "wlm0",
"targetIps": [
"8.8.8.8",
"4.2.2.2"
],
"failureInterval": 5,
"recoveryInterval": 5,
"pingInterval": 5,
"pingTimeout": 2,
"multiWANMetric": 2,
"multiWANWeight": 2,
"enable": false
}
}
}
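Check that the weight assigned to each WAN interface is set correctly. The weight determines the proportion of traffic routed through each connection, so incorrect weights lead to imbalanced traffic distribution. Also confirm that every interface intended to carry traffic has "enable": true in its own block; in the example above, pppoe0 and wlm0 are disabled and therefore carry no traffic.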
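As a rough illustration of how weights translate into traffic share, the sketch below assumes the usual mwan3 behavior in which members of a policy that share the same metric split traffic in proportion to their weights. The function name weight_shares and its "interface weight" input format are hypothetical and exist only for this example.

```shell
# Hypothetical helper: reads "interface weight" pairs on stdin and prints the
# share of traffic each interface should receive. Assumes every listed
# interface is enabled and uses the same metric.
weight_shares() {
    awk '
        NF == 2 { n++; name[n] = $1; weight[n] = $2; total += $2 }
        END {
            if (total == 0) { print "no weights given"; exit 1 }
            for (i = 1; i <= n; i++)
                printf "%s %d%%\n", name[i], weight[i] * 100 / total
        }
    '
}

# Weights of the two enabled interfaces from the example response.
printf 'eth1 2\neth0 2\n' | weight_shares
```

With equal weights the two interfaces each receive half of the traffic, which matches the 50% figures that mwan3 status reports for the LOAD_BALANCE policy. Percentages are truncated to whole numbers.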
UCI Configuration Verification
Check network config:
The network configuration file also plays a role:
cat /etc/config/network
The following is example output only; the values on your device will differ.
Example Response
config interface 'loopback'
option device 'lo'
option proto 'static'
option ipaddr '127.0.0.1'
option netmask '255.0.0.0'
config globals
option packet_steering '1'
config interface 'eth0'
option device 'eth0'
option proto 'dhcp'
option metric '1'
option ip4table '1'
option peerdns '0'
option default_wan '1'
list dns '8.8.8.8'
list dns '4.2.2.2'
option disabled '0'
option mtu '1500'
config interface 'eth2'
option device 'eth2'
option disabled '0'
option mtu '1500'
option proto 'static'
option ipaddr '172.1.30.3'
option netmask '255.255.255.0'
config rule
option priority '901'
option lookup 'main'
config interface 'eth1'
option disabled '0'
option device 'eth1'
option proto 'dhcp'
option metric '2'
option ip4table '2'
option mtu '1500'
config interface 'wlm0'
option disabled '1'
option proto '3g'
option pppname 'wlm0'
option device 'ttyUSB0'
option apn 'comgt'
option ipv6 '0'
option delegate '0'
option metric '3'
option ip4table '3'
config route '4f8253ad3b144cfca9f81e3223664117'
option target '172.1.30.3'
option netmask '255.255.255.0'
option gateway '172.1.30.1'
option metric '0'
option proto 'static'
option interface 'eth2'
This file defines each network interface and its associated settings, which determine how multi-WAN behaves. Ensure that every interface used in the multi-WAN configuration is correctly defined here.
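One quick way to spot an interface that cannot take part in multi-WAN is to list the sections marked as disabled. The following is a minimal sketch; the function name disabled_interfaces is hypothetical, and it assumes the file follows the "config interface" / "option disabled" layout shown in the example.

```shell
# Hypothetical helper: reads a UCI network config on stdin and prints the name
# of every "config interface" section that has option disabled set to 1.
disabled_interfaces() {
    awk -v q="'" '
        { gsub(q, "") }                                   # drop single quotes
        $1 == "config" { current = ($2 == "interface") ? $3 : "" }
        $1 == "option" && $2 == "disabled" && $3 == "1" && current != "" {
            print current
        }
    '
}

# Excerpt of the example configuration; prints "wlm0".
disabled_interfaces <<'EOF'
config interface 'eth1'
option disabled '0'
config interface 'wlm0'
option disabled '1'
EOF
```

On the device, the same function could be fed the real file with: disabled_interfaces < /etc/config/network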
Check mwan3 config:
The details of the mwan3 configuration can be checked in the following file:
cat /etc/config/mwan3
The following is example output only; the values on your device will differ.
Example Response
config globals 'globals'
option mmx_mask '0x3F00'
option local_source 'lan'
option mode 'LOAD_BALANCE'
option enabled '1'
config rule 'DEFAULT_HTTPS'
option family 'ipv4'
option sticky '1'
option proto 'tcp'
option dest_ip '0.0.0.0/0'
option dest_port '443'
option use_policy 'LOAD_BALANCE'
config rule 'DEFAULT_ANY'
option family 'ipv4'
option dest_ip '0.0.0.0/0'
option use_policy 'LOAD_BALANCE'
config interface 'eth1'
option enabled '1'
list track_ip '8.8.8.8'
list track_ip '4.2.2.2'
option interval '5'
option timeout '2'
option failure_interval '5'
option recovery_interval '5'
option down '1'
option up '3'
option initial_state 'online'
option track_method 'ping'
option reliability '1'
option count '1'
option size '56'
option max_ttl '60'
option check_quality '0'
config member 'eth1_m1_w1'
option interface 'eth1'
option metric '1'
option weight '1'
config policy 'FAIL_OVER'
list use_member 'eth1_m1_w1'
list use_member 'eth0_m2_w1'
config interface 'eth0'
option enabled '1'
list track_ip '8.8.8.8'
list track_ip '4.2.2.2'
option interval '5'
option timeout '2'
option failure_interval '5'
option recovery_interval '5'
option down '1'
option up '3'
option initial_state 'online'
option track_method 'ping'
option reliability '1'
option count '1'
option size '56'
option max_ttl '60'
option check_quality '0'
config member 'eth0_m2_w1'
option interface 'eth0'
option metric '2'
option weight '1'
config interface 'wlm0'
option enabled '0'
list track_ip '8.8.8.8'
list track_ip '4.2.2.2'
option interval '5'
option timeout '2'
option failure_interval '5'
option recovery_interval '5'
option down '1'
option up '3'
option initial_state 'online'
option track_method 'ping'
option reliability '1'
option count '1'
option size '56'
option max_ttl '60'
option check_quality '0'
config member 'eth1_m1_w2'
option interface 'eth1'
option metric '1'
option weight '2'
config policy 'LOAD_BALANCE'
list use_member 'eth1_m1_w2'
list use_member 'eth0_m1_w2'
config member 'eth0_m1_w2'
option interface 'eth0'
option metric '1'
option weight '2'
This file defines the multi-WAN rules, interfaces, members, and policies. Verify:
- option enabled '1' is set in the globals section.
- Each member section has the correct metric and weight values for its interface.
- Each rule's use_policy matches one of the defined policies.
- The track_ip addresses are reachable from each WAN interface.
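The check that every rule points at a defined policy can be automated. The sketch below is one possible approach, not part of mwan3 itself: the function name undefined_rule_policies is hypothetical, and the policy name BALANCED in the sample input is a deliberately wrong value used only to trigger the report.

```shell
# Hypothetical helper: reads an mwan3 UCI config on stdin and reports every
# rule whose use_policy does not name a "config policy" section. Policies may
# be defined after the rules, so the comparison happens at end of input.
undefined_rule_policies() {
    awk -v q="'" '
        { gsub(q, "") }                                   # drop single quotes
        $1 == "config" { type = $2; name = $3 }
        $1 == "config" && $2 == "policy" { defined[$3] = 1 }
        $1 == "option" && $2 == "use_policy" && type == "rule" {
            wanted[name] = $3
        }
        END {
            for (rule in wanted)
                if (!(wanted[rule] in defined))
                    print rule " -> " wanted[rule] " (policy not defined)"
        }
    '
}

# Sample input with one deliberately mistyped policy name.
undefined_rule_policies <<'EOF'
config rule 'DEFAULT_HTTPS'
option use_policy 'BALANCED'
config rule 'DEFAULT_ANY'
option use_policy 'LOAD_BALANCE'
config policy 'LOAD_BALANCE'
list use_member 'eth1_m1_w2'
EOF
```

On the device, the function could be run against the live file with: undefined_rule_policies < /etc/config/mwan3. No output means every rule references a defined policy.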
mwan3 Command-Line Interface
The uci show command prints the stored mwan3 configuration in flattened key=value form:
uci show mwan3
The following is example output only; the values on your device will differ.
Example Response
mwan3.globals=globals
mwan3.globals.mmx_mask='0x3F00'
mwan3.globals.enabled='1'
mwan3.globals.local_source='lan'
mwan3.globals.mode='FAIL_OVER'
mwan3.DEFAULT_HTTPS=rule
mwan3.DEFAULT_HTTPS.family='ipv4'
mwan3.DEFAULT_HTTPS.sticky='1'
mwan3.DEFAULT_HTTPS.proto='tcp'
mwan3.DEFAULT_HTTPS.dest_ip='0.0.0.0/0'
mwan3.DEFAULT_HTTPS.dest_port='443'
mwan3.DEFAULT_HTTPS.use_policy='FAIL_OVER'
mwan3.DEFAULT_ANY=rule
mwan3.DEFAULT_ANY.family='ipv4'
mwan3.DEFAULT_ANY.dest_ip='0.0.0.0/0'
mwan3.DEFAULT_ANY.use_policy='FAIL_OVER'
mwan3.eth0=interface
mwan3.eth0.enabled='1'
mwan3.eth0.track_ip='8.8.8.8' '4.2.2.2'
mwan3.eth0.interval='5'
mwan3.eth0.timeout='2'
mwan3.eth0.failure_interval='5'
mwan3.eth0.recovery_interval='5'
mwan3.eth0.down='1'
mwan3.eth0.up='3'
mwan3.eth0.initial_state='offline'
mwan3.eth0.track_method='ping'
mwan3.eth0.reliability='1'
mwan3.eth0.count='1'
mwan3.eth0.size='56'
mwan3.eth0.max_ttl='60'
mwan3.eth0.check_quality='0'
mwan3.eth0_m1_w1=member
mwan3.eth0_m1_w1.interface='eth0'
mwan3.eth0_m1_w1.metric='1'
mwan3.eth0_m1_w1.weight='1'
mwan3.FAIL_OVER=policy
mwan3.FAIL_OVER.use_member='eth0_m1_w1'
mwan3.FAIL_OVER.last_resort='default'
mwan3.wlm0=interface
mwan3.wlm0.enabled='0'
mwan3.wlm0.track_ip='8.8.8.8' '4.2.2.2'
mwan3.wlm0.interval='5'
mwan3.wlm0.timeout='2'
mwan3.wlm0.failure_interval='5'
mwan3.wlm0.recovery_interval='5'
mwan3.wlm0.down='1'
mwan3.wlm0.up='3'
mwan3.wlm0.initial_state='offline'
mwan3.wlm0.track_method='ping'
mwan3.wlm0.reliability='1'
mwan3.wlm0.count='1'
mwan3.wlm0.size='56'
mwan3.wlm0.max_ttl='60'
mwan3.wlm0.check_quality='0'
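This command prints the mwan3 configuration as currently stored in UCI, one option per line. Compare it with /etc/config/mwan3 to confirm that the expected settings are in place. In the example above, globals.mode is FAIL_OVER and both rules use the FAIL_OVER policy, which has a single member, so all traffic would leave through one WAN. For load balancing, the rules should reference the LOAD_BALANCE policy.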
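To see at a glance which policy each rule uses, the flattened uci show output can be filtered. This is a minimal sketch under the assumption that the intended policy is named LOAD_BALANCE, as in this document; the function name rule_policies is hypothetical.

```shell
# Hypothetical helper: reads "uci show mwan3" output on stdin, prints each
# section that has a use_policy option, and flags those that do not use the
# LOAD_BALANCE policy.
rule_policies() {
    sed -n "s/^mwan3\.\([^.=]*\)\.use_policy='\([^']*\)'\$/\1 \2/p" |
    while read -r rule policy; do
        if [ "$policy" = "LOAD_BALANCE" ]; then
            echo "$rule: $policy"
        else
            echo "$rule: $policy (not load balanced)"
        fi
    done
}

# Two lines taken from the example response.
printf '%s\n' \
    "mwan3.DEFAULT_HTTPS.use_policy='FAIL_OVER'" \
    "mwan3.DEFAULT_ANY.use_policy='FAIL_OVER'" | rule_policies
```

On the device, the equivalent call would be: uci show mwan3 | rule_policies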
Q:1 How do I verify that an interface is properly configured for multi-WAN?
To verify that an interface is properly configured for multi-WAN:
Check the network configuration file (/etc/config/network):
- Ensure the interface is defined with the correct device name (e.g., eth0, eth1).
- Confirm option disabled '0' (enabled).
- Verify that the IP addressing, netmask, and MTU values are correct.
- Check that metrics are set appropriately (metric '1' for primary, higher values for secondary).
Check the mwan3 configuration file (/etc/config/mwan3):
- Confirm option enabled '1' under the interface section.
- Ensure valid track_ip addresses (e.g., 8.8.8.8, 4.2.2.2) are reachable.
- Verify that the interval, timeout, failure_interval, and recovery_interval values are set correctly.
- Confirm the interface is included in a member and referenced in a policy (e.g., LOAD_BALANCE).
Use the command-line check:
uci show mwan3
- Verify the interface appears with the correct parameters (enabled, track_ip, metric, weight).
- Ensure it is part of the intended load balancing policy.
If all these checks pass, the interface is properly configured to participate in multi-WAN load balancing.
Q:2 What to do if network problems occur after running the above commands?
If network problems occur after running the verification commands:
- Check interface status: use mwan3 status to confirm that the interfaces are online and tracking properly.
- Restart services: restart the mwan3 service to apply configuration changes:
/etc/init.d/mwan3 restart
- Validate connectivity: run ping tests through each WAN interface to confirm reachability:
ping -I eth0 8.8.8.8
ping -I eth1 8.8.8.8
- Inspect logs: look for errors related to interface tracking, unreachable IPs, or misapplied policies:
logread -e mwan3
- Check rules and policies: ensure traffic rules are mapped to the load balancing policy (LOAD_BALANCE) and not mistakenly set to FAIL_OVER.
- Physical checks: verify cables, ISP connectivity, and that each WAN link is active.
By systematically checking configuration, service status, connectivity, and logs, you can isolate whether the issue is due to misconfiguration, service failure, or physical WAN problems.
Run-Time Configuration Verification
Check the current mwan3 status:
mwan3 status
The following is example output only; the values on your device will differ.
Example Response:
Interface status:
interface eth1 is online 00h:25m:17s, uptime 00h:25m:24s and tracking is active
interface eth0 is online 00h:25m:25s, uptime 00h:25m:37s and tracking is active
interface wlm0 is offline and tracking is down
Current ipv4 policies:
FAIL_OVER:
eth1 (100%)
LOAD_BALANCE:
eth0 (50%)
eth1 (50%)
Current ipv6 policies:
FAIL_OVER:
unreachable
LOAD_BALANCE:
unreachable
Directly connected ipv4 networks:
172.1.30.255
192.168.1.0
192.168.1.255
172.1.30.3
172.30.4.165
127.0.0.0/8
192.168.1.4
127.0.0.0
172.1.30.0
172.30.4.0
172.30.4.255
127.0.0.1
172.1.30.0/24
224.0.0.0/3
127.255.255.255
Directly connected ipv6 networks:
Active ipv4 user rules:
185 11100 S DEFAULT_HTTPS tcp -- * * 0.0.0.0/0 0.0.0.0/0 multiport dports 443
163 12612 - LOAD_BALANCE all -- * * 0.0.0.0/0 0.0.0.0/0
Active ipv6 user rules:
This provides a detailed overview of the mwan3 service's status, including interface states, connection statuses, and routing information. It can be more informative than the basic service status check.
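The policy section of this output is the quickest place to confirm the symptom. As a sketch, the hypothetical function below counts how many interfaces are listed with a percentage under a given ipv4 policy; it assumes the output layout shown in the example response.

```shell
# Hypothetical helper: reads "mwan3 status" output on stdin and prints the
# number of interfaces listed with a percentage under one ipv4 policy
# (default: LOAD_BALANCE).
policy_member_count() {
    awk -v policy="${1:-LOAD_BALANCE}" '
        /^Current ipv4 policies:/ { in_v4 = 1; next }
        /^Current ipv6 policies:/ { in_v4 = 0 }
        in_v4 && $1 ~ /:$/ { current = substr($1, 1, length($1) - 1); next }
        in_v4 && current == policy && /\([0-9]+%\)/ { count++ }
        END { print count + 0 }
    '
}

# Excerpt of the example response; prints 2.
policy_member_count LOAD_BALANCE <<'EOF'
Current ipv4 policies:
FAIL_OVER:
eth1 (100%)
LOAD_BALANCE:
eth0 (50%)
eth1 (50%)
Current ipv6 policies:
LOAD_BALANCE:
unreachable
EOF
```

On the device, this could be run as: mwan3 status | policy_member_count LOAD_BALANCE. A result of 0 or 1 is consistent with all traffic using a single WAN.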
mwan3 Service Status and Control
Check the status of the mwan3 service:
/etc/init.d/mwan3 status
The following is example output only; the values on your device will differ.
Example Response
running
This command reports whether the mwan3 service is running. Look for any error or failure indications in the output.
If the service is not running, start it:
/etc/init.d/mwan3 start
The following is example output only; the values on your device will differ.
Example Response:
running
This command starts the mwan3 service, which begins load balancing according to the configuration.
For troubleshooting, you can stop and restart the service:
/etc/init.d/mwan3 stop
/etc/init.d/mwan3 restart
The following is example output only; the values on your device will differ.
Example Response
- stop
running
- restart
running
Stopping and restarting the mwan3 service can resolve transient issues and applies any configuration changes.
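The status check and the conditional start can be combined into one step. The sketch below assumes that the status command prints exactly "running" when the service is up, as in the example response; the function name ensure_mwan3_running and its optional argument (an alternative command to call instead of the init script) are hypothetical.

```shell
# Hypothetical helper: starts mwan3 only when its status is not "running".
# The optional first argument overrides the init script, for example to try
# the logic against a stand-in command.
ensure_mwan3_running() {
    init_script="${1:-/etc/init.d/mwan3}"
    if "$init_script" status 2>/dev/null | grep -qx "running"; then
        echo "mwan3 is running"
    elif "$init_script" start; then
        echo "mwan3 was not running; start requested"
    else
        echo "mwan3 failed to start" >&2
        return 1
    fi
}

# On the device (as root): ensure_mwan3_running
```

Matching the whole line with grep -x avoids treating a message such as "not running" as a running service.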
Q:1 How to check the current status of the mwan3 service?
To check the current status of the mwan3 service:
- Log into the CE device with root privileges.
- Run the command:
/etc/init.d/mwan3 status
- Review the output. If it shows running, the mwan3 service is active and managing WAN connections. If it shows stopped or no output, the service is not running and load balancing will not function.
- If the service is not running, start it with:
/etc/init.d/mwan3 start
- For troubleshooting or applying new configurations, you can stop and restart the service:
/etc/init.d/mwan3 stop
/etc/init.d/mwan3 restart
This ensures the mwan3 process is active and able to execute load balancing and failover policies.
Q:2 How to verify load balancing and failover settings in mwan3?
To verify the load balancing and failover settings in mwan3:
- Run mwan3 status and review the "Current ipv4 policies" section. Under LOAD_BALANCE, each active WAN interface should be listed with its share of traffic, for example eth0 (50%) and eth1 (50%). Under FAIL_OVER, the interface currently in use is listed at 100%.
- In the "Active ipv4 user rules" section of the same output, confirm that the rules point to the intended policy.
- Run uci show mwan3 (or view /etc/config/mwan3) and confirm that globals.mode and each rule's use_policy name the intended policy, and that each member has the expected metric and weight.
- Confirm that the interface status lines show every intended WAN as online with tracking active.
If only one interface appears under LOAD_BALANCE, or the rules reference FAIL_OVER, traffic will use a single WAN.
Testing Verification
Validate WAN Selection with Traceroute:
Use the traceroute command to see the actual path that traffic takes to a destination, which helps verify which WAN interface outbound requests are using. Run a traceroute to a known external IP address, replacing x.x.x.x with an appropriate address such as a public DNS server.
traceroute -n x.x.x.x
If repeated traceroutes always leave through the same WAN interface even though load balancing is configured, there is a problem in the load balancing rules or the routing configuration.
- To test specifically via a particular WAN interface, use:
traceroute -i eth0 google.com
traceroute -i eth1 google.com
Ping via specific WAN:
Test connectivity through a specific interface using ping:
mwan3 use eth0 ping -4 google.com
The following is example output only; the values on your device will differ.
Example Response
could not find family for eth0. Using ipv4.
Running 'ping -4 google.com' with DEVICE=eth0 SRCIP=192.168.1.4 FWMARK=0x3f00 FAMILY=ipv4
PING google.com (142.250.76.206): 56 data bytes
64 bytes from 142.250.76.206: seq=0 ttl=60 time=14.012 ms
64 bytes from 142.250.76.206: seq=1 ttl=60 time=13.390 ms
64 bytes from 142.250.76.206: seq=2 ttl=60 time=14.927 ms
64 bytes from 142.250.76.206: seq=3 ttl=60 time=14.979 ms
64 bytes from 142.250.76.206: seq=4 ttl=60 time=12.831 ms
64 bytes from 142.250.76.206: seq=5 ttl=60 time=14.128 ms
^C
--- google.com ping statistics ---
6 packets transmitted, 6 packets received, 0% packet loss
round-trip min/avg/max = 12.831/14.044/14.979 ms
This command forces the ping to use eth0 and isolates WAN link connectivity issues. Examine the output for errors or unexpected behavior. If a ping fails on an interface, even if all the configurations appear correct, there may be a WAN connection issue.
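When several WAN interfaces need the same check, the per-interface ping can be wrapped in a loop. The following is a minimal sketch that uses the ping -I form rather than mwan3 use; the function name check_wans and the PING_CMD override variable are hypothetical, and the count of 3 and timeout of 2 seconds are illustrative values.

```shell
# Hypothetical helper: pings a target through each named interface and prints
# one result line per interface. PING_CMD may be set to substitute another
# command for ping, for example during a dry run.
check_wans() {
    target="$1"
    shift
    for ifname in "$@"; do
        if ${PING_CMD:-ping} -c 3 -W 2 -I "$ifname" "$target" >/dev/null 2>&1; then
            echo "$ifname: reachable"
        else
            echo "$ifname: unreachable"
        fi
    done
}

# On the device: check_wans 8.8.8.8 eth0 eth1
```

An interface reported as unreachable while the others succeed points to a problem on that WAN link rather than in the load balancing policy.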
Q:1 How to run traceroute for WAN interface?
To run traceroute for a WAN interface:
- Use the standard traceroute command to check the default path, replacing x.x.x.x with a known external IP (e.g., 8.8.8.8):
traceroute -n x.x.x.x
- To test specifically through a particular WAN interface, use the -i option:
traceroute -i eth0 google.com
traceroute -i eth1 google.com
- -i eth0 forces traceroute to use the eth0 interface.
- -i eth1 forces traceroute to use the eth1 interface.
This helps confirm whether traffic is correctly distributed across WAN interfaces or is stuck on a single path, which indicates a load balancing configuration issue.
Q:2 Why are both traceroute and ping necessary?
Both traceroute and ping are necessary because they test different aspects of WAN connectivity:
Traceroute:
- Shows the path traffic takes to reach a destination.
- Helps verify which WAN interface is being used for outbound traffic.
- Useful for detecting routing or policy misconfigurations in load balancing.
Ping:
- Tests basic connectivity and responsiveness through a specific WAN interface.
- Confirms whether the interface can successfully send and receive packets.
- Useful for isolating WAN link issues (e.g., ISP down, cable unplugged, unreachable tracking IP).
Together:
- Traceroute validates routing and policy behavior.
- Ping validates link health and connectivity.
Using both gives a complete check of whether load balancing is functioning correctly and whether each WAN interface is operational.
Log Verification
Checking logs can help you diagnose specific issues, such as failed authentication attempts or service errors.
System Log Inspection:
The system logs are a reliable source of information about the operation of the multi-WAN service. Reviewing them can reveal errors in interface configuration or routing that affect load balancing.
Check the system logs for entries related to mwan3:
logread -e mwan3
Investigate the log output for error messages, warnings, or unusual activity. Such logs can highlight specific issues with load balancing.
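On a busy device the mwan3 log can be long, so it may help to narrow it to lines that suggest a problem. The sketch below is only a starting point: the function name mwan3_log_problems is hypothetical, and the keyword list is an assumption rather than a list of actual mwan3 message strings, so adjust it to the messages seen on your device.

```shell
# Hypothetical helper: keeps only the log lines on stdin that contain one of
# a few problem-related keywords (case-insensitive). The keywords are an
# illustrative guess, not taken from mwan3 itself.
mwan3_log_problems() {
    grep -iE 'error|fail|offline|unreachable|disconnect'
}

# On the device: logread -e mwan3 | mwan3_log_problems
```

The function returns a non-zero status when no line matches, which can serve as a simple "nothing suspicious found" signal in a script.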
Q:1 Which command should be used to check the mwan3 logs?
To check the logs for mwan3, use:
logread -e mwan3
This command filters the system log for entries related to mwan3, showing warnings, errors, and interface state changes. It highlights issues such as unreachable tracking IPs, misconfigured policies, or disabled interfaces. Reviewing this output confirms whether the load balancing service is functioning correctly and helps pinpoint misconfigurations.