
NAT Problem with VRF
Open, Requires assessment | Public | BUG

Description

Hi,

I started using VRFs and stumbled over a nasty NAT bug:

Device:

eth0 192.168.0.100/24 gw 192.168.0.1 VRF OOBM
eth1 192.168.0.1/24 VRF default
eth2 no IP VRF default
pppoe0 dynamic public IP from ISP VRF default

eth0 and eth1 are connected to the same switch and can ping each other.

NAT RULE:

set nat source rule 100 outbound-interface 'pppoe0'
set nat source rule 100 protocol 'all'
set nat source rule 100 translation address 'masquerade'

NAT works for all other devices in 192.168.0.0/24, but all packets from 192.168.0.100 leave pppoe0 without being masqueraded.

Details

Difficulty level
Unknown (require assessment)
Version
1.3.0-rc4
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)

Event Timeline

rherold created this object in space S1 VyOS Public.
Viacheslav changed the subtype of this task from "Task" to "Bug". Jun 28 2021, 5:56 PM
Viacheslav added a project: VyOS 1.3 Equuleus.

Hi Ruben,

I ran several tests in our lab environment to isolate the issue. It can be reproduced with just a simple NAT rule (on vrf XX). The topology used:

 nat 
 |
RT-VYOS---- vrf-OOBM 
 |                   \
 |                    \
default  -------     switch

I found some kernel parameters that may be limiting the NAT translation. From the Linux kernel VRF documentation:

By default the scope of the port bindings for unbound sockets is
limited to the default VRF. That is, it will not be matched by packets
arriving on interfaces enslaved to an l3mdev and processes may bind to
the same port if they bind to an l3mdev.

TCP & UDP services running in the default VRF context (ie., not bound
to any VRF device) can work across all VRF domains by enabling the
tcp_l3mdev_accept and udp_l3mdev_accept sysctl options:

  sysctl -w net.ipv4.tcp_l3mdev_accept=1
  sysctl -w net.ipv4.udp_l3mdev_accept=1

These options are disabled by default so that a socket in a VRF is only
selected for packets in that VRF.
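
To see how these two options are currently set on a box before changing anything, here is a minimal sketch. It assumes a Linux kernel that exposes them under /proc/sys/net/ipv4; the function name show_l3mdev_accept is made up for this example and is not a VyOS command.

```shell
# Print the current value of both l3mdev accept options.
# Hypothetical helper; reads /proc directly so it needs no root.
show_l3mdev_accept() {
    for key in tcp_l3mdev_accept udp_l3mdev_accept; do
        path="/proc/sys/net/ipv4/$key"
        if [ -r "$path" ]; then
            printf '%s=%s\n' "$key" "$(cat "$path")"
        else
            # Kernel built without l3mdev support, or not Linux.
            printf '%s=unavailable\n' "$key"
        fi
    done
}

show_l3mdev_accept
```

Recording the values first makes it easy to revert after a test with the sysctl -w commands quoted above.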

However, I can't test this behavior properly, so we'll try a different topology that uses only VRFs (without default).

Hi,

As I wrote on Slack, from my point of view it is a kernel problem. It seems that conntrack in the kernel recognizes the packets even when they come in on an input interface in default, so the NAT code won't match, because for conntrack the outgoing interface is still eth0 (which is in VRF OOBM) instead of pppoe0.

I would expect to have conntrack entries for each vrf for this flow.

It seems that what I suspected is true. Running

vyos@gw-1:~$ sudo ip vrf exec OOBM telnet 62.104.56.93

and at the same time

root@gw-1:/home/vyos# conntrack -L |grep 62.104.56.93

conntrack v1.4.6 (conntrack-tools): 72 flow entries have been shown.
tcp 6 119 SYN_SENT src=192.168.0.100 dst=62.104.56.93 sport=47704 dport=23 [UNREPLIED] src=62.104.56.93 dst=192.168.0.100 sport=23 dport=47704 mark=0 use=1

I would expect to see two entries: one for VRF OOBM and one for VRF default.
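
The entry above can also be checked mechanically. If source NAT had been applied, the dst of the reply tuple would be the translated address rather than the original src. Here is a rough sketch, assuming the usual conntrack -L line layout with the original tuple first and the reply tuple second; the helper name is_untranslated is hypothetical.

```shell
# Succeeds (exit 0) when the reply tuple's dst equals the original tuple's src,
# i.e. no source translation was recorded for the flow.
is_untranslated() {
    printf '%s\n' "$1" | awk '
        {
            orig_src = ""; reply_dst = ""; ndst = 0
            for (i = 1; i <= NF; i++) {
                # First src= field belongs to the original direction.
                if ($i ~ /^src=/ && orig_src == "") orig_src = substr($i, 5)
                # Second dst= field belongs to the reply direction.
                if ($i ~ /^dst=/ && ++ndst == 2)    reply_dst = substr($i, 5)
            }
            exit (orig_src != "" && orig_src == reply_dst) ? 0 : 1
        }'
}

# The entry from the output above.
line='tcp 6 119 SYN_SENT src=192.168.0.100 dst=62.104.56.93 sport=47704 dport=23 [UNREPLIED] src=62.104.56.93 dst=192.168.0.100 sport=23 dport=47704 mark=0 use=1'
if is_untranslated "$line"; then
    echo "no source NAT recorded for this flow"
fi
```

This only looks at source translation; a flow that is destination-NATed would need the reply src compared against the original dst instead.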

Hi Ruben,

Could you configure the following:

set protocols vrf OOBM static route 0.0.0.0/0 next-hop 192.168.0.1 next-hop-vrf 'OOBM'

After that, please confirm what behavior you see. In my environment it gives the following result:

vyos@vyos-rt1:~$ ping 8.8.8.8 vrf OOBM interface 192.168.0.100
PING 8.8.8.8 (8.8.8.8) from 192.168.0.100 : 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=114 time=17.8 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=114 time=18.9 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=114 time=16.6 ms
^V64 bytes from 8.8.8.8: icmp_seq=4 ttl=114 time=19.2 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=114 time=16.8 ms
^C
--- 8.8.8.8 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 12ms
rtt min/avg/max/mdev = 16.627/17.860/19.206/1.072 ms
vyos@vyos-rt1:~$ conntrack -L

tcp      6 230 ESTABLISHED src=192.168.125.1 dst=192.168.125.61 sport=60444 dport=22 src=192.168.0.100 dst=192.168.125.1 sport=22 dport=60444 [ASSURED] mark=0 use=1  ////////////////// dnat

tcp      6 230 ESTABLISHED src=192.168.125.1 dst=192.168.0.100 sport=60444 dport=22 src=192.168.0.100 dst=192.168.125.1 sport=22 dport=29075 [ASSURED] mark=0 use=1

icmp     1 10 src=192.168.0.100 dst=8.8.8.8 type=8 code=0 id=3431 src=8.8.8.8 dst=192.168.0.100 type=0 code=0 id=3431 mark=0 use=1 ////// ICMP

It seems 1.4-rolling has this bug too.
I set up a VRF wg containing all WireGuard clients (with private IPs)
and set up a VRF leak to VRF default.
NAT didn't work on it:
it sends un-NATed packets out of eth0.

@zsdc
Please take a look at this.
Could it be an issue similar to the one addressed in this patch?
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=0fb4d21956f4a9af225594a46857ccf29bd747bc

I ask because PREROUTING will be called twice.

Hi @tj2852847,

Thanks for your comment. We are testing with @rherold first. I understand that your case is similar, but it's not the same (you have explicit route leaking between the default VRF and VRF X). So we also need to test it and make sure the version solves it.

Please take a look at the commit 9213ce6672582bc12f02c1530726fe97030d2cfe for kernel 5.13.