Virtual private network (VPN) technology lets two computers on different networks behave as if they were on the same LAN. In practice, though, the hardest step in actually connecting them is often getting through network address translation (NAT), that is, getting past the NAT devices that sit between them.
Communication obstacles caused by NAT
NAT is widely deployed in home routers and corporate firewalls. It lets multiple internal devices share one public IP address, but it also means that a host on the external network cannot directly initiate a connection to a specific internal host. When both devices that need to establish a VPN connection sit behind such NAT devices, direct IP-layer communication does not work.
Traditional IPsec was designed for end-to-end encryption and authentication. Its encapsulated packets do not expose the transport-layer port information that a NAT device relies on. To keep track of sessions, the NAT needs to read and rewrite ports, but in a standard IPsec packet the ports are typically either encrypted or covered by integrity protection, so they cannot be modified in transit. As a result, return packets cannot be forwarded correctly to the internal host, and the connection fails.
UDP encapsulation becomes key
To solve this problem, the industry has widely adopted the approach of encapsulating IPsec packets in UDP. The UDP header carries an explicit source port and destination port, so a NAT device can translate ports for it just as it does for ordinary traffic and build a working mapping table. This lets encrypted VPN traffic pass through most NAT devices.
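The encapsulation can be illustrated with a minimal sketch. It assumes the layout described in RFC 3948 (an ESP packet placed directly after a UDP header, UDP port 4500, checksum sent as zero over IPv4). The function names, port numbers in the usage, and the dummy ciphertext are illustrative only, and no real encryption is performed.

```python
import struct

NAT_T_PORT = 4500  # UDP port conventionally used for IPsec NAT traversal (assumed here)


def udp_encapsulate_esp(src_port: int, spi: int, seq: int, ciphertext: bytes,
                        dst_port: int = NAT_T_PORT) -> bytes:
    """Return UDP header + ESP packet. `ciphertext` stands in for the encrypted payload."""
    if spi == 0:
        # A zero value in this position is reserved to mark non-ESP (IKE) traffic.
        raise ValueError("SPI must be non-zero")
    esp = struct.pack("!II", spi, seq) + ciphertext  # ESP header: SPI, sequence number
    udp_length = 8 + len(esp)                        # UDP header is 8 bytes
    # Checksum 0 means "not computed", which UDP permits over IPv4.
    udp_header = struct.pack("!HHHH", src_port, dst_port, udp_length, 0)
    return udp_header + esp


def nat_rewrite_source_port(datagram: bytes, new_port: int) -> bytes:
    """What a NAT does to the datagram: overwrite the 16-bit source port only."""
    return struct.pack("!H", new_port) + datagram[2:]
```

For example, `nat_rewrite_source_port(udp_encapsulate_esp(51000, 0x1234, 1, b"\xaa" * 16), 62000)` changes only the first two bytes. The ESP portion is untouched, which is why the NAT can translate the packet without breaking IPsec.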
By comparison, TCP encapsulation has clear drawbacks. TCP brings a three-way handshake as well as flow control and retransmission. Layering these mechanisms on top of a VPN tunnel adds latency and overhead. IPsec already provides integrity verification, and the protocols carried inside the tunnel usually handle their own reliability, so wrapping everything in TCP again duplicates work and is inefficient.
Challenges under symmetric NAT
UDP encapsulation alone, however, does not solve every NAT traversal problem. When both parties sit behind a symmetric NAT or a strict firewall, things get complicated. A symmetric NAT assigns a separate port mapping for each external destination address. As a result, the port that the NAT opens when device A sends a packet to server S cannot be used by device B to send data directly to A.
This means that two VPN nodes behind such strict NATs cannot reliably learn the public address and port that the other side's NAT will assign for traffic between them, so they cannot initiate a direct peer-to-peer connection. In this situation UDP hole punching may fail, and another solution is needed.
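The difference can be made concrete with a toy model. The `ToyNat` class below is a hypothetical simplification, not how any real device is implemented: in symmetric mode it allocates one public port per (internal endpoint, destination) pair and only accepts replies from that destination, while in the other mode it behaves like a permissive full-cone NAT.

```python
from itertools import count


class ToyNat:
    """Toy NAT model for illustration only."""

    def __init__(self, public_ip: str, symmetric: bool, first_port: int = 40000):
        self.public_ip = public_ip
        self.symmetric = symmetric
        self._ports = count(first_port)
        self._out = {}  # mapping key -> public port
        self._in = {}   # public port -> (internal endpoint, only allowed source or None)

    def send(self, internal, dest):
        """Outbound packet: return the public (ip, port) that `dest` will observe."""
        key = (internal, dest) if self.symmetric else internal
        if key not in self._out:
            port = next(self._ports)
            self._out[key] = port
            self._in[port] = (internal, dest if self.symmetric else None)
        return (self.public_ip, self._out[key])

    def receive(self, public_port, source):
        """Inbound packet: return the internal endpoint, or None if it is dropped."""
        entry = self._in.get(public_port)
        if entry is None:
            return None
        internal, only_from = entry
        if only_from is not None and source != only_from:
            return None  # symmetric NAT: this mapping belongs to another destination
        return internal
```

With `symmetric=True`, the address that S observes for A is useless to B: `receive` drops B's packet, and a packet from A to B would leave through a different port that B has no way of knowing in advance. With `symmetric=False`, B can reuse the address that S observed, which is the situation hole punching depends on.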
Introducing third-party relay servers
When a direct point-to-point connection cannot be established, the remaining option is to relay data through a third-party server that has a public IP address. Each of the two nodes behind NAT establishes its own stable connection to the public server, and all traffic between them is forwarded through that server.
The process resembles passing messages through an intermediary: Lao Zhang and Lao Li cannot talk to each other directly, but both can reach Lao Wang. Lao Zhang tells Lao Wang what he wants to say, Lao Wang passes it on to Lao Li, and vice versa. In the VPN scenario, "Lao Wang" is the relay server, which forwards encrypted data between the two established tunnels.
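The forwarding logic of such a relay can be sketched in a few lines. The `Relay` class and its method names are hypothetical. Real relays also need authentication, timeouts, and actual socket I/O, all of which are left out here. The point is only that the relay treats the payload as opaque bytes and never decrypts it.

```python
class Relay:
    """Toy relay: pairs two registered nodes and forwards opaque payloads."""

    def __init__(self):
        self._endpoints = {}  # node id -> public (ip, port) the relay observed
        self._peers = {}      # node id -> id of the node it is paired with

    def register(self, node_id, endpoint):
        """Record where a node can be reached (its NAT-mapped address)."""
        self._endpoints[node_id] = endpoint

    def pair(self, node_a, node_b):
        """Declare that the two nodes want to talk to each other."""
        self._peers[node_a] = node_b
        self._peers[node_b] = node_a

    def forward(self, sender_id, payload: bytes):
        """Return (destination endpoint, payload), or None if it cannot be delivered."""
        peer = self._peers.get(sender_id)
        if peer is None or peer not in self._endpoints:
            return None
        return self._endpoints[peer], payload  # payload is passed through unchanged
```

In terms of the analogy, `register` is Lao Zhang and Lao Li each getting in touch with Lao Wang, and `forward` is Lao Wang passing a message along without opening it.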
Dynamic establishment of tunnel routes
In a complex VPN with many participating nodes, there is not necessarily a single communication path. To reach node C, node A may have to go through node B. This calls for a mechanism that dynamically discovers other nodes in the network, negotiates usable tunnels, and ultimately builds a tunnel routing table.
Every VPN node needs to maintain such a table, which records the tunnel to use for each destination network address. The routing information must be updated dynamically to cope with changes in node IP addresses (dynamic addressing) and changes in tunnel status (for example, a relay server going offline), so that the network stays resilient.
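A tunnel routing table of this kind might look like the following minimal sketch. It assumes longest-prefix matching, which the article does not specify, and the class name, tunnel names, and address ranges are made up for illustration. Discovery and negotiation are out of scope.

```python
import ipaddress


class TunnelRoutingTable:
    """Maps destination networks to tunnel names (longest prefix wins)."""

    def __init__(self):
        self._routes = {}  # ip_network -> tunnel name

    def add_route(self, cidr: str, tunnel: str):
        """Add or update the tunnel used for a destination network."""
        self._routes[ipaddress.ip_network(cidr)] = tunnel

    def remove_tunnel(self, tunnel: str):
        """Drop every route that uses a tunnel, e.g. when a relay goes offline."""
        self._routes = {net: t for net, t in self._routes.items() if t != tunnel}

    def lookup(self, address: str):
        """Return the tunnel for an address, or None if no route matches."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self._routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self._routes[best]
```

A node could, for instance, route `10.1.2.0/24` over a direct tunnel to C while keeping a broader `10.0.0.0/8` route through B. If the direct tunnel fails and is removed, lookups for C's addresses fall back to the path through B.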
The long road from principle to product
Even once core protocol problems such as NAT traversal, dynamic addressing, and tunnel routing are solved, there is still a large gap between that and a mature, easy-to-use VPN product. What comes next is a large amount of engineering detail and edge-case handling, which is often more complex and time-consuming than implementing the core principles.
For example, a product has to cope with compatibility quirks of NAT equipment from different vendors, include an efficient heartbeat mechanism that keeps NAT mappings alive, switch tunnels and recover from faults gracefully, offer an easy-to-use configuration management interface, and remain stable and secure in large-scale deployments. These complexities are the major challenges on the road to productization.
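As one small illustration of these details, a heartbeat can be reduced to a timer that sends a tiny packet only when the tunnel has been idle for too long. The sketch below is hypothetical: the class name is invented, the 20-second interval in the usage is an assumed value that would need tuning against real NAT timeouts, and the one-byte `0xFF` payload follows the keepalive format described in RFC 3948.

```python
KEEPALIVE_PAYLOAD = b"\xff"  # one-byte NAT keepalive (format assumed from RFC 3948)


class KeepaliveTimer:
    """Decides when an idle tunnel needs a keepalive to hold its NAT mapping open."""

    def __init__(self, interval: float):
        self.interval = interval  # seconds of silence allowed before a keepalive
        self._last_sent = None    # time of the most recent outbound packet

    def note_traffic(self, now: float):
        """Call whenever any packet is sent; real traffic also refreshes the mapping."""
        self._last_sent = now

    def due(self, now: float) -> bool:
        """True if nothing has been sent for at least `interval` seconds."""
        return self._last_sent is None or now - self._last_sent >= self.interval
```

A send loop would check `timer.due(time.monotonic())`, send `KEEPALIVE_PAYLOAD` if it returns true, and call `note_traffic` after every outbound packet, so busy tunnels generate no extra heartbeat traffic.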
When configuring or using a VPN, what is the most troublesome NAT-related or firewall-related problem you have run into? You are welcome to share your experience in the comments. If you found this article helpful, please give it a like.


