Strange behaviour of iptables with NAT and port forwarding

I have several dedicated servers hosted in several datacenters, and I want to migrate mail (pop3, imap, smtp and their TLS/SSL variants) services from one server to another.

For that purpose, I intend to temporarily set up NAT forwarding on the new server towards the old one, to cover the DNS propagation period.

So I defined the following iptables rules:

iptables -t nat -A PREROUTING -p tcp -m tcp --dport 25  -j DNAT --to-destination <my_remote_ip>:8025
iptables -t nat -A PREROUTING -p tcp -m tcp --dport 110 -j DNAT --to-destination <my_remote_ip>
iptables -t nat -A PREROUTING -p tcp -m tcp --dport 143 -j DNAT --to-destination <my_remote_ip>
iptables -t nat -A PREROUTING -p tcp -m tcp --dport 465 -j DNAT --to-destination <my_remote_ip>:8465
iptables -t nat -A PREROUTING -p tcp -m tcp --dport 587 -j DNAT --to-destination <my_remote_ip>:8587
iptables -t nat -A PREROUTING -p tcp -m tcp --dport 993 -j DNAT --to-destination <my_remote_ip>
iptables -t nat -A PREROUTING -p tcp -m tcp --dport 995 -j DNAT --to-destination <my_remote_ip>
iptables -t nat -A POSTROUTING -d <my_remote_ip> -p tcp -m tcp --dport 110  -j SNAT --to-source <my_local_ip>
iptables -t nat -A POSTROUTING -d <my_remote_ip> -p tcp -m tcp --dport 143  -j SNAT --to-source <my_local_ip>
iptables -t nat -A POSTROUTING -d <my_remote_ip> -p tcp -m tcp --dport 993  -j SNAT --to-source <my_local_ip>
iptables -t nat -A POSTROUTING -d <my_remote_ip> -p tcp -m tcp --dport 995  -j SNAT --to-source <my_local_ip>
iptables -t nat -A POSTROUTING -d <my_remote_ip> -p tcp -m tcp --dport 8025 -j SNAT --to-source <my_local_ip>:25
iptables -t nat -A POSTROUTING -d <my_remote_ip> -p tcp -m tcp --dport 8465 -j SNAT --to-source <my_local_ip>:465
iptables -t nat -A POSTROUTING -d <my_remote_ip> -p tcp -m tcp --dport 8587 -j SNAT --to-source <my_local_ip>:587

iptables -A FORWARD -d <my_remote_ip> -i eth0 -o eth0 -p tcp -m tcp --dport 110  -j ACCEPT
iptables -A FORWARD -d <my_remote_ip> -i eth0 -o eth0 -p tcp -m tcp --dport 143  -j ACCEPT
iptables -A FORWARD -d <my_remote_ip> -i eth0 -o eth0 -p tcp -m tcp --dport 993  -j ACCEPT
iptables -A FORWARD -d <my_remote_ip> -i eth0 -o eth0 -p tcp -m tcp --dport 995  -j ACCEPT
iptables -A FORWARD -d <my_remote_ip> -i eth0 -o eth0 -p tcp -m tcp --dport 8025 -j ACCEPT
iptables -A FORWARD -d <my_remote_ip> -i eth0 -o eth0 -p tcp -m tcp --dport 8465 -j ACCEPT
iptables -A FORWARD -d <my_remote_ip> -i eth0 -o eth0 -p tcp -m tcp --dport 8587 -j ACCEPT
iptables -A FORWARD -s <my_remote_ip> -i eth0 -o eth0 -p tcp -m tcp --sport 110  -j ACCEPT
iptables -A FORWARD -s <my_remote_ip> -i eth0 -o eth0 -p tcp -m tcp --sport 143  -j ACCEPT
iptables -A FORWARD -s <my_remote_ip> -i eth0 -o eth0 -p tcp -m tcp --sport 993  -j ACCEPT
iptables -A FORWARD -s <my_remote_ip> -i eth0 -o eth0 -p tcp -m tcp --sport 995  -j ACCEPT
iptables -A FORWARD -s <my_remote_ip> -i eth0 -o eth0 -p tcp -m tcp --sport 8025 -j ACCEPT
iptables -A FORWARD -s <my_remote_ip> -i eth0 -o eth0 -p tcp -m tcp --sport 8465 -j ACCEPT
iptables -A FORWARD -s <my_remote_ip> -i eth0 -o eth0 -p tcp -m tcp --sport 8587 -j ACCEPT

(This is actually simplified: the rules are duplicated for IPv4 and IPv6, on some servers the interface may be something other than eth0, and of course I masked the actual IP addresses.)

You may note that the POP3 and IMAP services are only NAT’ed, but that the SMTP-related services also have a port translation, paired with either the reverse translation or a listener on these specific ports on the destination server.

There is a compelling reason for that: my hosting provider monitors outgoing SMTP connections in order to detect and block spam sent from any server they host. But if I forward incoming connections on the SMTP port to another of my servers, incoming spam becomes outgoing spam (from the point of view of the datacenter) before it has any chance of being filtered (on the destination server), and the result is that my hosting provider immediately blocks the NAT’ing server.

So I also have to translate the port number in order to forward these connections.
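
As an illustration of the mapping described above, the DNAT part of the rule set can be sketched as a loop over "listening port:destination port" pairs. This is only a sketch: the address 203.0.113.10 is a placeholder standing in for <my_remote_ip>, the names PORT_MAP and dnat_rules are made up for this example, and the function only prints the rules instead of applying them.

```shell
#!/bin/sh
# Placeholder for <my_remote_ip> (documentation address, an assumption).
REMOTE_IP="203.0.113.10"

# listening_port:destination_port pairs taken from the rules above.
# Identical ports mean plain NAT; different ports mean NAT plus port translation.
PORT_MAP="25:8025 110:110 143:143 465:8465 587:8587 993:993 995:995"

# Hypothetical helper: prints one DNAT rule per pair, without applying it.
dnat_rules() {
    for pair in $PORT_MAP; do
        in_port="${pair%%:*}"
        out_port="${pair##*:}"
        if [ "$in_port" = "$out_port" ]; then
            dest="$REMOTE_IP"
        else
            dest="$REMOTE_IP:$out_port"
        fi
        echo "iptables -t nat -A PREROUTING -p tcp -m tcp --dport $in_port -j DNAT --to-destination $dest"
    done
}

dnat_rules
```

Piping the output to sh as root would apply the rules; printing them first makes it easy to review the port mapping before touching the firewall.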

Incoming packets and outgoing (NAT’ed) packets use the same interface (because these servers have only one network interface).

Actually this works more or less, except that the port forwarded connections (and only these ones, no problem on ports 110, 143, etc) have a strange behaviour: they work the first time I use them, but if I disconnect and reconnect immediately, the forwarding no longer works and I have to wait about 1 to 3 minutes before being able to connect again.

This seems to be related to the IP address, not to the port number: even if the previous connection was on port 110 (pop3, not port forwarded), I have to wait for the same delay before being able to connect to port 25.

I have verified that on several servers, all of them running Debian Linux (Wheezy, Jessie or Stretch), and on both IPv4 and IPv6 (except on Wheezy, which cannot NAT IPv6). Yes, I know that Wheezy is now old; this is why I am migrating.

All IP addresses are fully static.

And yes, I have set /proc/sys/net/ipv4/ip_forward (and its IPv6 equivalent) to 1.

I use telnet for testing the connections, and I check the forwarding with tcpdump. With the latter I can check that the forwarding is really not done, and that it is not the destination server which blocks the incoming connections.

Could someone please help me find the reason for this 1 to 3 minute blocking delay, and how I could fix it?



Your issue relates to peculiarities of the Linux connection tracker.

The quick answer: you cannot avoid this delay in your configuration. The only way to avoid the issue is to use -j SNAT without a port number in the --to-source option.

There is also a small trick that can help a little: in the -j SNAT rule, use a range of ports instead of a single port number. This lets you establish multiple connections. The number of simultaneous connections is the number of ports in the range. The rule would look like:

iptables -t nat -A POSTROUTING -d <my_remote_ip> -p tcp -m tcp --dport 8025 -j SNAT --to-source <my_local_ip>:10025-10125
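
To make both variants concrete, here is a small sketch that prints the SNAT rules for the three translated ports, first without any port in --to-source and then with a port range. The addresses are placeholders standing in for <my_remote_ip> and <my_local_ip>, the helper name snat_rule is made up for this example, and nothing is applied to the firewall.

```shell
#!/bin/sh
# Placeholders for <my_remote_ip> and <my_local_ip> (documentation addresses, assumptions).
REMOTE_IP="203.0.113.10"
LOCAL_IP="198.51.100.20"

# Hypothetical helper: prints (does not apply) one SNAT rule.
#   $1 = destination port on the remote server
#   $2 = optional source port range such as 10025-10125
snat_rule() {
    dport="$1"
    range="${2:-}"
    src="$LOCAL_IP"
    if [ -n "$range" ]; then
        src="$LOCAL_IP:$range"
    fi
    echo "iptables -t nat -A POSTROUTING -d $REMOTE_IP -p tcp -m tcp --dport $dport -j SNAT --to-source $src"
}

# Variant 1: no port in --to-source, so the source port is not pinned to one value.
snat_rule 8025
snat_rule 8465
snat_rule 8587

# Variant 2: a port range instead of a single port.
snat_rule 8025 10025-10125
```

The range 10025-10125 contains 101 ports, so with the second variant that is the upper bound on simultaneous connections to that destination port.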

If you want the bloody details, I can extend the answer.


To understand the details, you need some background. Read "Linux Kernel Networking: Implementation and Theory" by Rami Rosen. Mainly you need “Chapter 9. Netfilter. Connection tracker” from this book.

When packets pass through your Linux host, the Linux connection tracker (conntrack) analyzes them and stores information about the packet flows in a table (the conntrack table). Every packet flow is represented by a conntrack entry. Conntrack uses tuples to identify packet flows. A tuple consists of the L3 information of the flow (the IP addresses of both ends and the L4 protocol number) and the L4 information (for TCP, the port numbers of both ends).
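
As an illustration, the sketch below splits one conntrack entry into its two tuples, the original direction and the reply direction. The sample line is made up for this example: it uses documentation addresses (192.0.2.1 as the client, 198.51.100.20 as the Linux host, 203.0.113.10 as the target host), and the helper name conntrack_tuples is hypothetical.

```shell
#!/bin/sh
# Sample conntrack entry with made-up documentation addresses (an assumption,
# not taken from a real capture).
line='ipv4     2 tcp      6 119 TIME_WAIT src=192.0.2.1 dst=198.51.100.20 sport=40079 dport=22 src=203.0.113.10 dst=198.51.100.20 sport=22 dport=22 [ASSURED] mark=0 zone=0 use=2'

# Hypothetical helper: prints the original and reply tuples of one entry.
# Each "src=" field starts a new tuple; the dst/sport/dport fields that
# follow are appended to the tuple currently being built.
conntrack_tuples() {
    echo "$1" | awk '{
        n = 0
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^src=/) { n++; t[n] = $i }
            else if ($i ~ /^(dst|sport|dport)=/ && n > 0) { t[n] = t[n] " " $i }
        }
        print "original: " t[1]
        print "reply:    " t[2]
    }'
}

conntrack_tuples "$line"
```

The reply tuple is what conntrack expects to see coming back from the target host, so it already reflects the address and port rewriting done by the NAT rules.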

Conntrack has a module for each L4 protocol to track the transport-protocol-specific state of connections. The TCP part of conntrack implements the TCP finite state machine.

In the lab (kernel 4.14) this TCP part of conntrack shows a strange behaviour. Let me demonstrate it in a simple environment.

A client connects to the Linux host, which forwards this connection to another host. The Linux host also uses an SNAT rule like the one in your setup.

The tcpdump output:

14:47:32.036809 IP > Flags [S], seq 2159011818, win 29200, length 0
14:47:32.037346 IP > Flags [S.], seq 960236935, ack 2159011819, win 28960, options [mss 1460,sackOK,TS val 1415987649 ecr 3003498128,nop,wscale 5], length 0
14:47:32.037683 IP > Flags [.], ack 1, win 913, options [nop,nop,TS val 3003498129 ecr 1415987649], length 0
14:47:32.041407 IP > Flags [P.], seq 1:22, ack 1, win 905, options [nop,nop,TS val 1415987653 ecr 3003498129], length 21
14:47:32.041806 IP > Flags [.], ack 22, win 913, options [nop,nop,TS val 3003498133 ecr 1415987653], length 0
14:47:35.826919 IP > Flags [F.], seq 1, ack 22, win 913, options [nop,nop,TS val 3003501918 ecr 1415987653], length 0
14:47:35.827996 IP > Flags [F.], seq 22, ack 2, win 905, options [nop,nop,TS val 1415991440 ecr 3003501918], length 0
14:47:35.828386 IP > Flags [.], ack 23, win 913, options [nop,nop,TS val 3003501919 ecr 1415991440], length 0

In the conntrack table on the Linux host I see this:

ipv4     2 tcp      6 431999 ESTABLISHED src= dst= sport=40079 dport=22 src= dst= sport=22 dport=22 [ASSURED] mark=0 zone=0 use=2
ipv4     2 tcp      6 431998 ESTABLISHED src= dst= sport=40079 dport=22 src= dst= sport=22 dport=22 [ASSURED] mark=0 zone=0 use=2
ipv4     2 tcp      6 431997 ESTABLISHED src= dst= sport=40079 dport=22 src= dst= sport=22 dport=22 [ASSURED] mark=0 zone=0 use=2
ipv4     2 tcp      6 431996 ESTABLISHED src= dst= sport=40079 dport=22 src= dst= sport=22 dport=22 [ASSURED] mark=0 zone=0 use=2
ipv4     2 tcp      6 119 TIME_WAIT src= dst= sport=40079 dport=22 src= dst= sport=22 dport=22 [ASSURED] mark=0 zone=0 use=2
ipv4     2 tcp      6 0 TIME_WAIT src= dst= sport=40079 dport=22 src= dst= sport=22 dport=22 [ASSURED] mark=0 zone=0 use=2

As you can see, even though the connection has been closed correctly, the associated conntrack entry is still present in the conntrack table, in the TIME_WAIT state. And because we have only a single possible port for SNAT, and it is already busy, new connection attempts fail. Why can't this port be used again? Because the system could not distinguish the reply packets of the current flow in the TIME_WAIT state from those of the new flow.

I have not found out why conntrack puts the connection into the TIME_WAIT state instead of destroying it.
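
How long such an entry lingers is governed by a conntrack timeout that can be read from /proc. The sketch below only reads the value and changes nothing. The helper name time_wait_timeout is made up for this example, and the file exists only when the nf_conntrack module is loaded, so the function falls back to printing "unavailable".

```shell
#!/bin/sh
# Hypothetical helper: prints the conntrack TIME_WAIT timeout in seconds,
# or "unavailable" when the file cannot be read (for example when the
# nf_conntrack module is not loaded).
time_wait_timeout() {
    f=/proc/sys/net/netfilter/nf_conntrack_tcp_timeout_time_wait
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "unavailable"
    fi
}

echo "conntrack TIME_WAIT timeout: $(time_wait_timeout)"
```

On kernels where this value is 120, it would match the countdown from 119 to 0 seen in the TIME_WAIT entries and be roughly consistent with the delay described in the question.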

Source : Link , Question Author : GingkoFr , Answer Author : Community
