Join accept is not delivered when network latency is added

OTAA works normally when I use a low latency internet connection between gateway and server.

However, when I add a satellite link between gateway and server, which adds about 600 ms of delay, the join request is received by the server but the join accept no longer reaches the gateway.

In the gateway, under Semtech Forwarder Options, I increased the push timeout, but it made no difference.

I think it is noteworthy that the same scenario with TTN as the server (instead of Loraserver) works fine.

Thank you in advance!

That’s odd, but perhaps not in a way that really matters.

The irony is that the receive window for join accepts is much, much later than that for ordinary traffic.

600 ms of latency should not be challenging for getting the join accept back in time, but it’s going to pretty much kill any chance of operating with a standard 1 second delay for RX1.

So it’s odd that you are getting stuck where you are, but you shouldn’t really expect the system to actually work with that kind of latency.
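
To put rough numbers on that, here is a minimal sketch of the timing budget. The 1 s and 5 s values are the usual LoRaWAN defaults for RX1 and for the join accept RX1 window; the server processing time and gateway lead time are illustrative assumptions, not measured values, and the function name is made up for this example.

```python
# Rough downlink timing budget. Only the two receive delays are LoRaWAN
# defaults; every other number is an illustrative assumption.
RX1_DELAY_S = 1.0           # default RECEIVE_DELAY1 for data downlinks
JOIN_ACCEPT_DELAY1_S = 5.0  # default JOIN_ACCEPT_DELAY1


def downlink_margin_s(rx_delay_s, backhaul_rtt_s, server_s=0.2, gateway_lead_s=0.1):
    """Seconds to spare before the gateway has to start transmitting.

    backhaul_rtt_s: uplink plus downlink transit over the backhaul.
    server_s: assumed server-side handling time (dedup, scheduling).
    gateway_lead_s: assumed lead time the gateway needs before TX.
    """
    return rx_delay_s - backhaul_rtt_s - server_s - gateway_lead_s


if __name__ == "__main__":
    windows = (("RX1 data downlink", RX1_DELAY_S),
               ("join accept (RX1)", JOIN_ACCEPT_DELAY1_S))
    # 0.6 s if the 600 ms is the full round trip, 1.2 s if it is one-way.
    for rtt in (0.6, 1.2):
        for label, delay in windows:
            margin = downlink_margin_s(delay, rtt)
            print(f"rtt={rtt:.1f}s {label}: margin {margin:+.2f}s")
```

With these assumed numbers the join accept keeps several seconds of margin either way, while the 1 second RX1 window is left with about a tenth of a second at best and goes negative if the 600 ms is one-way.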

If the network is really isolated, perhaps you want to run the server there, and only share the resulting application feed back over the satellite.


Hi,

Responding to this old thread before asking a similar question myself. I have a cellular gateway with high latency and I’m seeing the same issue. It’s about a 400-500 ms ping to my gateway (Tektelic Kona Micro). I’m running ChirpStack Gateway Bridge, the latest Tektelic packet forwarders, and ChirpStack 3.16.1.

If I put my gateway on a hardwired, low-latency connection, the joins are accepted every time. As soon as I introduce latency, the OTAA join process fails 90% of the time. On the AS, I can see the JoinRequests coming in, but a JoinAccept is never sent and never shows up in the web UI. I also checked the logs of the bridge and the packet forwarder and see no downlinks. I know OTAA has a 5 second timeout, and I would think this process should be well within it, but what’s curious is that the JoinAccept is never even sent back. As soon as I switch to hardwired, the join requests are accepted every time. Once joined, I can put the gateway back on the high-latency connection and receive all the uplink packets without any drops, as well as confirmed uplinks.

Is there something on the Network Server that’s perhaps comparing the timestamp at which the gateway receives the join with the time the Network Server receives it? At the moment I’m completely stumped and any suggestions would be appreciated.

Thanks!

Just going through my usual network debugging checklist:
have you checked your MTU settings? You should be able to start the GWB with debug logging; that way you might see whether your frame has been scheduled for emission. I recently had a similar issue where the MTU of the mobile network had changed.

Interesting thought. No, I haven’t checked the MTU settings, but I don’t see anything in the GWB logs, even at debug level. It’s strange that the Join Request is going through but the AS isn’t showing any accept being sent back at all (the keys are correct, as it works on a low-latency connection). I just updated the NS to 3.16.2 and we’ll see if that makes a difference.

Out of curiosity, what did you set your MTU to? Something significantly smaller?

That depends on your network provider. For wired connections, 1500 is usually fine.
If you have a VPN tunnel, set the MTU lower on the interface providing the VPN, since each IP packet gets wrapped in another packet.

How you set that permanently on your gateway OS, I don’t know. That’s up to you to find out, and it might be the wrong path to the solution; I just thought it might be worth checking. Good luck!

On our mobile provider I had to lower the MTU of the WireGuard interface to 1330 for it to work.
The MTU of the WAN interface was somewhere around 1420.
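
For anyone wanting to sanity-check a tunnel MTU, here is a minimal sketch of the arithmetic. The header sizes are the standard ones for IPv4/IPv6, UDP, and WireGuard data packets; the function name is made up for this example, and a mobile network may add further encapsulation, so treat the result as an upper bound rather than the value to configure.

```python
# Upper bound for a WireGuard interface MTU given the MTU of the link
# underneath. Carrier-side encapsulation is NOT accounted for here.
IPV4_HEADER = 20
IPV6_HEADER = 40
UDP_HEADER = 8
WIREGUARD_HEADER = 16    # type/reserved (4) + receiver index (4) + counter (8)
WIREGUARD_AUTH_TAG = 16  # Poly1305 tag appended to the encrypted payload


def max_wireguard_mtu(wan_mtu, outer_ipv6=False):
    """Largest inner packet that fits in one outer packet without fragmenting."""
    outer_ip = IPV6_HEADER if outer_ipv6 else IPV4_HEADER
    return wan_mtu - outer_ip - UDP_HEADER - WIREGUARD_HEADER - WIREGUARD_AUTH_TAG


if __name__ == "__main__":
    for wan_mtu in (1500, 1420):
        print(wan_mtu,
              "-> IPv4 outer:", max_wireguard_mtu(wan_mtu),
              "| IPv6 outer:", max_wireguard_mtu(wan_mtu, outer_ipv6=True))
```

For a WAN MTU of about 1420 this gives 1360 over IPv4 and 1340 over IPv6, so a value like 1330 sits safely below both.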

Good to know, thanks! I suspect it’s not an MTU issue, but worth investigating regardless.

Turns out this was an oversight on my end. The gateway bridge was timing out because the ping frequency was too low. When I decreased the keepalive time in my gateway config, things started working much more reliably.

Thanks again for the tips!

What this probably means is that a firewall or some routing component is terminating the UDP “session” prematurely. UDP is stateless, yet the UDP packet forwarder requires downlinks to be sent back to the same port, so the “session” must be maintained along all hops. Making those heartbeats more frequent probably kept the “session” open in your case.
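
As a rough illustration of that heartbeat, here is a minimal sketch of the PULL_DATA keepalive frame from the Semtech UDP protocol, plus a check of the keepalive interval against a NAT/firewall idle timeout. The frame layout follows the protocol description (version 2, random token, identifier 0x02, gateway EUI); the timeout values and the `mapping_survives` helper are assumptions made up for this example. In the Semtech reference forwarder the interval is, as far as I know, the `keepalive_interval` setting in `global_conf.json`; other forwarders may name it differently.

```python
import os
import struct

PROTOCOL_VERSION = 2
PULL_DATA = 0x02  # identifier of the gateway-to-server keepalive frame


def build_pull_data(gateway_eui: bytes) -> bytes:
    """Build a 12-byte PULL_DATA frame: version, 2-byte token, id, 8-byte EUI."""
    if len(gateway_eui) != 8:
        raise ValueError("gateway EUI must be exactly 8 bytes")
    token = os.urandom(2)  # random token, echoed back by the server in PULL_ACK
    return struct.pack("!B2sB8s", PROTOCOL_VERSION, token, PULL_DATA, gateway_eui)


def mapping_survives(keepalive_s, nat_idle_timeout_s, tolerated_losses=1):
    """True if the UDP mapping stays open even when some keepalives are lost.

    nat_idle_timeout_s is whatever the firewall/NAT on the path uses; it is
    an assumption here and has to be found out for the actual network.
    """
    return keepalive_s * (tolerated_losses + 1) < nat_idle_timeout_s


if __name__ == "__main__":
    frame = build_pull_data(bytes.fromhex("0102030405060708"))
    print(frame.hex())
    for keepalive in (10, 30, 60):
        print(keepalive, "s keepalive vs 30 s idle timeout:",
              mapping_survives(keepalive, nat_idle_timeout_s=30))
```

The server sends downlinks (PULL_RESP) to the address and port the last PULL_DATA came from, which is why the mapping has to stay open between keepalives.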

If it were due to latency alone, the gateway should still have received the downlink. Whether it modulates it or not is another matter: if the downlink was received too late, the gateway would reject it.
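
A minimal sketch of that "too late" decision, assuming the gateway schedules downlinks against its free-running 32-bit microsecond counter (the `tmst` field in the Semtech UDP protocol). The minimum lead time and the function names are illustrative assumptions; real forwarders use their own margins.

```python
COUNTER_MODULUS = 1 << 32  # the tmst counter is 32 bits wide and wraps around


def time_until_tx_us(target_tmst_us, now_tmst_us):
    """Signed microseconds until the target, handling counter wraparound."""
    diff = (target_tmst_us - now_tmst_us) % COUNTER_MODULUS
    # A difference in the upper half of the range means the target has passed.
    return diff - COUNTER_MODULUS if diff >= COUNTER_MODULUS // 2 else diff


def classify_downlink(target_tmst_us, now_tmst_us, min_lead_us=30_000):
    """Return 'TOO_LATE' if there is not enough time left to transmit."""
    remaining = time_until_tx_us(target_tmst_us, now_tmst_us)
    return "TOO_LATE" if remaining < min_lead_us else "SCHEDULED"


if __name__ == "__main__":
    uplink_tmst = 1_000_000
    # Join accept: target is 5 s after the uplink, downlink arrives 0.7 s in.
    print(classify_downlink(uplink_tmst + 5_000_000, uplink_tmst + 700_000))
    # Data downlink on RX1: target is 1 s after the uplink, arrives 1.1 s in.
    print(classify_downlink(uplink_tmst + 1_000_000, uplink_tmst + 1_100_000))
```

The first case is scheduled and the second is rejected, which matches the pattern described above: joins can survive the latency while ordinary RX1 downlinks cannot.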
