Acks timing issues

Hi,

I set up LoRa server on RPi3 + IC880a. Basically, installation and configuration where fairly easy, but I’m experiencing some issues with receiving acks sent from server to nodes. Most of acks are sent successfully - I can see them on the node side and spectrum analyzer, but some are missing. Sniffing communication between packet-forwarder and lora-bridge I found that server is trying to send acks with delay longer than 1s. Logs below show 3s delay. Packet-forwarder seems to drop such requests. What may be a root cause of such behavior?

16:51:01.807140 IP localhost.48291 > localhost.1700: UDP, length 284
E…8F.@.@…$.7…C…'…{.{“rxpk”:[{“tmst”:1624278388,“time”:“2017-09-13T14:51:01.806692Z”,“chan”:1,“rfch”:1,“freq”:868.300000,“stat”:1,“modu”:“LORA”,“datr”:“SF9BW125”,“codr”:“4/5”,“lsnr”:12.5,“rssi”:-95,“size”:53,“data”:“gJkVASaAHAABDeuDOYz0kxGLPWQu5n4I/7NdgoYMSSLpRwe/YWh5H4UWW6o/m1OPaPiMrP0=”}]}
16:51:01.810586 IP localhost.1700 > localhost.48291: UDP, length 4
E… F.@.@…C.
16:51:04.140250 IP localhost.1700 > localhost.40318: UDP, length 170
E…Fc@.@…~…{“txpk”:{“imme”:false,“tmst”:1625278388,“freq”:868.3,“rfch”:0,“powe”:14,“modu”:“LORA”,“datr”:“SF9BW125”,“codr”:“4/5”,“ipol”:true,“size”:12,“data”:“YJkVASagog1VPIrs”}}

17:03:03.226140 IP localhost.48291 > localhost.1700: UDP, length 283
E…7.M@.@.kf…#.6…“…'…{.{“rxpk”:[{“tmst”:2345698979,“time”:“2017-09-13T15:03:03.225714Z”,“chan”:2,“rfch”:1,“freq”:868.500000,“stat”:1,“modu”:“LORA”,“datr”:“SF7BW125”,“codr”:“4/5”,“lsnr”:7.5,“rssi”:-93,“size”:53,“data”:“gJkVASaAJgABM7wY8ruGl6gP0jzK+QtaoWidRkSc4rsG7Umttm7OcSFNdlZuCBaY9H2AKCw=”}]}
17:03:03.228542 IP localhost.1700 > localhost.48291: UDP, length 4
E… .N@.@.l|…”.
17:03:06.440227 IP localhost.1700 > localhost.40318: UDP, length 170
E…@.@.k…~…{“txpk”:{“imme”:false,“tmst”:2346698979,“freq”:868.5,“rfch”:0,“powe”:14,“modu”:“LORA”,“datr”:“SF7BW125”,“codr”:“4/5”,“ipol”:true,“size”:12,“data”:“YJkVASagrw1+T/Y6”}}

And the good one:

17:07:44.789074 IP localhost.48291 > localhost.1700: UDP, length 284
E…8 !@.@.2…$.7.$…'…{.{“rxpk”:[{“tmst”:2627261820,“time”:“2017-09-13T15:07:44.788633Z”,“chan”:0,“rfch”:1,“freq”:868.100000,“stat”:1,“modu”:“LORA”,“datr”:“SF9BW125”,“codr”:“4/5”,“lsnr”:11.8,“rssi”:-93,“size”:53,“data”:“gJkVASaALAABUKoEkhfjTMeJzPfz9kJgl6Rqy412DCsZPUX5ILUJXC0g3b44+FExnunv/EU=”}]}
17:07:44.793059 IP localhost.1700 > localhost.48291: UDP, length 4
E… "@.@.3…$…
17:07:45.721514 IP localhost.1700 > localhost.40318: UDP, length 170
E… s@.@.2…~…{“txpk”:{“imme”:false,“tmst”:2628261820,“freq”:868.1,“rfch”:0,“powe”:14,“modu”:“LORA”,“datr”:“SF9BW125”,“codr”:“4/5”,“ipol”:true,“size”:12,“data”:“YJkVASagvA1G7Gt5”}}

Just to confirm, all components are on localhost (so there should be no network latency)?

That’s right. It’s local connection.

The only places where LoRa Server waits are:

  • for the de-duplication (as multiple gateways might receive the same frame)
  • getting payloads from LoRa App Server (reading the queue)
  • reading from Redis and PostgreSQL

Could you check the loraserver and lora-app-server logs to check the timestamps. Maybe that gives a clue where LoRa Server “hangs”?

My initial thought is that it might be I/O related. E.g. could it be a process performing writes to the sd-card when these delays occur?

  1. The good one.

mqtt:

gateway/b827ebfffe997bd3/rx {"rxInfo":{"mac":"b827ebfffe997bd3","time":"2017-09-13T14:24:42.478792Z","timestamp":44940492,"frequency":867100000,"channel":3,"rfChain":0,"crcStatus":1,"codeRate":"4/5","rssi":-89,"loRaSNR":11,"size":53,"dataRate":{"modulation":"LORA","spreadFactor":11,"bandwidth":125}},"phyPayload":"gJkVASaAAAABFaeUQWrq0W3ULm+ZQ3u7YpgaWCLS8iFbJrTOArhIhsSBO0n7uQeRNc5mef8="}
application/1/node/0010230450670890/rx {"applicationID":"1","applicationName":"Application","nodeName":"Node1","devEUI":"0010230450670890","rxInfo":[{"mac":"b827ebfffe997bd3","time":"2017-09-13T14:24:42.478792Z","rssi":-89,"loRaSNR":11,"name":"raspberry","latitude":0,"longitude":0,"altitude":0}],"txInfo":{"frequency":867100000,"dataRate":{"modulation":"LORA","bandwidth":125,"spreadFactor":11},"adr":true,"codeRate":"4/5"},"fCnt":0,"fPort":1,"data":"cJkVASYBAAEAHlRDRMwDAFBQG4MBAFIljBEAAEIllCYAAEJW2gEAAA=="}
gateway/b827ebfffe997bd3/tx {"txInfo":{"mac":"b827ebfffe997bd3","immediately":false,"timestamp":45940492,"frequency":867100000,"power":14,"dataRate":{"modulation":"LORA","spreadFactor":11,"bandwidth":125},"codeRate":"4/5","iPol":null},"phyPayload":"YJkVASagZQ0BLmPO"}

syslog:

Sep 13 16:24:42 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:24:42+02:00" level=info msg="gateway: received udp packet from gateway" addr=127.0.0.1:48291 protocol_version=1 type=PushData
**Sep 13 16:24:42 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:24:42+02:00" level=info msg="gateway: rxpk packet received" addr=127.0.0.1:48291 data="gJkVASaAAAABFaeUQWrq0W3ULm+ZQ3u7YpgaWCLS8iFbJrTOArhIhsSBO0n7uQeRNc5mef8=" mac=b827ebfffe997bd3**
Sep 13 16:24:42 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:24:42+02:00" level=info msg="backend: publishing packet" topic="gateway/b827ebfffe997bd3/rx"
Sep 13 16:24:42 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:24:42+02:00" level=info msg="gateway: sending udp packet to gateway" addr=127.0.0.1:48291 protocol_version=1 type=PushACK
Sep 13 16:24:42 ttn-gateway loraserver[13144]: time="2017-09-13T16:24:42+02:00" level=info msg="backend/gateway: rx packet received"
Sep 13 16:24:42 ttn-gateway loraserver[13144]: time="2017-09-13T16:24:42+02:00" level=info msg="node-session saved" dev_addr=26011599 dev_eui=0010230450670890
Sep 13 16:24:42 ttn-gateway loraserver[13144]: time="2017-09-13T16:24:42+02:00" level=warning msg="frame counters reset" dev_addr=26011599 dev_eui=0010230450670890
Sep 13 16:24:42 ttn-gateway loraserver[13144]: time="2017-09-13T16:24:42+02:00" level=info msg="packet(s) collected" dev_eui=0010230450670890 gw_count=1 gw_macs=b827ebfffe997bd3 mtype=ConfirmedDataUp
Sep 13 16:24:42 ttn-gateway loraserver[13144]: time="2017-09-13T16:24:42+02:00" level=info msg="rx info sent to network-controller" dev_eui=0010230450670890
Sep 13 16:24:42 ttn-gateway lora-app-server[13105]: time="2017-09-13T16:24:42+02:00" level=info msg="handler/mqtt: publishing data-up payload" topic="application/1/node/0010230450670890/rx"
Sep 13 16:24:42 ttn-gateway lora-app-server[13105]: time="2017-09-13T16:24:42+02:00" level=info msg="handler/http: publishing data-up payload" dev_eui=0010230450670890 url="https://192.168.1.29:8080/uplink"
Sep 13 16:24:43 ttn-gateway lora-app-server[13105]: 2017/09/13 16:24:43 http: TLS handshake error from 192.168.1.29:38450: remote error: tls: bad certificate
Sep 13 16:24:43 ttn-gateway lora-app-server[13105]: time="2017-09-13T16:24:43+02:00" level=error msg="handler *httphandler.Handler error: http request error: Post https://192.168.1.29:8080/uplink: x509: cannot validate certificate for 192.168.1.29 because it doesn't contain any IP SANs"
Sep 13 16:24:43 ttn-gateway loraserver[13144]: time="2017-09-13T16:24:43+02:00" level=info msg="node-session saved" dev_addr=26011599 dev_eui=0010230450670890
Sep 13 16:24:43 ttn-gateway lora-app-server[13105]: time="2017-09-13T16:24:43+02:00" level=info msg="data-down item requested by network-server, but queue is empty" dev_eui=0010230450670890
Sep 13 16:24:43 ttn-gateway loraserver[13144]: time="2017-09-13T16:24:43+02:00" level=info msg="backend/gateway: publishing tx packet" topic="gateway/b827ebfffe997bd3/tx"
Sep 13 16:24:43 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:24:43+02:00" level=info msg="backend: packet received" topic="gateway/b827ebfffe997bd3/tx"
**Sep 13 16:24:43 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:24:43+02:00" level=info msg="gateway: sending udp packet to gateway" addr=127.0.0.1:40318 protocol_version=1 type=PullResp**
Sep 13 16:24:43 ttn-gateway loraserver[13144]: time="2017-09-13T16:24:43+02:00" level=info msg="node-session saved" dev_addr=26011599 dev_eui=0010230450670890
Sep 13 16:24:50 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:24:50+02:00" level=info msg="gateway: received udp packet from gateway" addr=127.0.0.1:40318 protocol_version=1 type=PullData
Sep 13 16:24:50 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:24:50+02:00" level=info msg="gateway: sending udp packet to gateway" addr=127.0.0.1:40318 protocol_version=1 type=PullACK
Sep 13 16:24:59 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:24:59+02:00" level=info msg="gateway: received udp packet from gateway" addr=127.0.0.1:48291 protocol_version=1 type=PushData
Sep 13 16:24:59 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:24:59+02:00" level=info msg="gateway: stat packet received" addr=127.0.0.1:48291 mac=b827ebfffe997bd3
Sep 13 16:24:59 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:24:59+02:00" level=info msg="gateway: sending udp packet to gateway" addr=127.0.0.1:48291 protocol_version=1 type=PushACK
Sep 13 16:24:59 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:24:59+02:00" level=info msg="backend: publishing packet" topic="gateway/b827ebfffe997bd3/stats"
Sep 13 16:24:59 ttn-gateway loraserver[13144]: time="2017-09-13T16:24:59+02:00" level=info msg="backend/gateway: gateway stats packet received" mac=b827ebfffe997bd3
Sep 13 16:24:59 ttn-gateway loraserver[13144]: time="2017-09-13T16:24:59+02:00" level=info msg="gateway updated" mac=b827ebfffe997bd3
Sep 13 16:25:00 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:25:00+02:00" level=info msg="gateway: received udp packet from gateway" addr=127.0.0.1:40318 protocol_version=1 type=PullData
Sep 13 16:25:00 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:25:00+02:00" level=info msg="gateway: sending udp packet to gateway" addr=127.0.0.1:40318 protocol_version=1 type=PullACK
  1. The faulty one.

mqtt:

gateway/b827ebfffe997bd3/rx {"rxInfo":{"mac":"b827ebfffe997bd3","time":"2017-09-13T14:26:16.725086Z","timestamp":139191044,"frequency":868100000,"channel":0,"rfChain":1,"crcStatus":1,"codeRate":"4/5","rssi":-90,"loRaSNR":10,"size":53,"dataRate":{"modulation":"LORA","spreadFactor":11,"bandwidth":125}},"phyPayload":"gJkVASaAAQABMK9qEncJbwcWpqPfhz6EJMIqQxCoRWPvjg/yUcnDf35fv8uQmGcAVqWEntE="}
application/1/node/0010230450670890/rx {"applicationID":"1","applicationName":"Application","nodeName":"Node1","devEUI":"0010230450670890","rxInfo":[{"mac":"b827ebfffe997bd3","time":"2017-09-13T14:26:16.725086Z","rssi":-90,"loRaSNR":10,"name":"raspberry","latitude":0,"longitude":0,"altitude":0}],"txInfo":{"frequency":868100000,"dataRate":{"modulation":"LORA","bandwidth":125,"spreadFactor":11},"adr":true,"codeRate":"4/5"},"fCnt":1,"fPort":1,"data":"cJkVASYBAAEAHlRDN84DAFBQG4MBAFIlZREAAEIllCYAAEJW2gEAAA=="}
gateway/b827ebfffe997bd3/tx {"txInfo":{"mac":"b827ebfffe997bd3","immediately":false,"timestamp":140191044,"frequency":868100000,"power":14,"dataRate":{"modulation":"LORA","spreadFactor":11,"bandwidth":125},"codeRate":"4/5","iPol":null},"phyPayload":"YJkVASagZw09i37v"}

syslog:

Sep 13 16:26:16 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:26:16+02:00" level=info msg="gateway: received udp packet from gateway" addr=127.0.0.1:48291 protocol_version=1 type=PushData
**Sep 13 16:26:16 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:26:16+02:00" level=info msg="gateway: rxpk packet received" addr=127.0.0.1:48291 data="gJkVASaAAQABMK9qEncJbwcWpqPfhz6EJMIqQxCoRWPvjg/yUcnDf35fv8uQmGcAVqWEntE=" mac=b827ebfffe997bd3**
Sep 13 16:26:16 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:26:16+02:00" level=info msg="backend: publishing packet" topic="gateway/b827ebfffe997bd3/rx"
Sep 13 16:26:16 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:26:16+02:00" level=info msg="gateway: sending udp packet to gateway" addr=127.0.0.1:48291 protocol_version=1 type=PushACK
Sep 13 16:26:16 ttn-gateway loraserver[13144]: time="2017-09-13T16:26:16+02:00" level=info msg="backend/gateway: rx packet received"
Sep 13 16:26:18 ttn-gateway loraserver[13144]: time="2017-09-13T16:26:18+02:00" level=info msg="packet(s) collected" dev_eui=0010230450670890 gw_count=1 gw_macs=b827ebfffe997bd3 mtype=ConfirmedDataUp
Sep 13 16:26:18 ttn-gateway loraserver[13144]: time="2017-09-13T16:26:18+02:00" level=info msg="rx info sent to network-controller" dev_eui=0010230450670890
Sep 13 16:26:18 ttn-gateway lora-app-server[13105]: time="2017-09-13T16:26:18+02:00" level=info msg="handler/mqtt: publishing data-up payload" topic="application/1/node/0010230450670890/rx"
Sep 13 16:26:18 ttn-gateway lora-app-server[13105]: time="2017-09-13T16:26:18+02:00" level=info msg="handler/http: publishing data-up payload" dev_eui=0010230450670890 url="https://192.168.1.29:8080/uplink"
Sep 13 16:26:18 ttn-gateway lora-app-server[13105]: 2017/09/13 16:26:18 http: TLS handshake error from 192.168.1.29:38454: remote error: tls: bad certificate
Sep 13 16:26:18 ttn-gateway lora-app-server[13105]: time="2017-09-13T16:26:18+02:00" level=error msg="handler *httphandler.Handler error: http request error: Post https://192.168.1.29:8080/uplink: x509: cannot validate certificate for 192.168.1.29 because it doesn't contain any IP SANs"
Sep 13 16:26:18 ttn-gateway loraserver[13144]: time="2017-09-13T16:26:18+02:00" level=info msg="node-session saved" dev_addr=26011599 dev_eui=0010230450670890
Sep 13 16:26:19 ttn-gateway lora-app-server[13105]: time="2017-09-13T16:26:19+02:00" level=info msg="data-down item requested by network-server, but queue is empty" dev_eui=0010230450670890
Sep 13 16:26:19 ttn-gateway loraserver[13144]: time="2017-09-13T16:26:19+02:00" level=info msg="backend/gateway: publishing tx packet" topic="gateway/b827ebfffe997bd3/tx"
Sep 13 16:26:19 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:26:19+02:00" level=info msg="backend: packet received" topic="gateway/b827ebfffe997bd3/tx"
**Sep 13 16:26:19 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:26:19+02:00" level=info msg="gateway: sending udp packet to gateway" addr=127.0.0.1:40318 protocol_version=1 type=PullResp**
Sep 13 16:26:19 ttn-gateway loraserver[13144]: time="2017-09-13T16:26:19+02:00" level=info msg="node-session saved" dev_addr=26011599 dev_eui=0010230450670890
Sep 13 16:26:21 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:26:21+02:00" level=info msg="gateway: received udp packet from gateway" addr=127.0.0.1:40318 protocol_version=1 type=PullData
Sep 13 16:26:21 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:26:21+02:00" level=info msg="gateway: sending udp packet to gateway" addr=127.0.0.1:40318 protocol_version=1 type=PullACK
Sep 13 16:26:29 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:26:29+02:00" level=info msg="gateway: received udp packet from gateway" addr=127.0.0.1:48291 protocol_version=1 type=PushData
Sep 13 16:26:29 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:26:29+02:00" level=info msg="gateway: sending udp packet to gateway" addr=127.0.0.1:48291 protocol_version=1 type=PushACK
Sep 13 16:26:29 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:26:29+02:00" level=info msg="gateway: stat packet received" addr=127.0.0.1:48291 mac=b827ebfffe997bd3
Sep 13 16:26:29 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:26:29+02:00" level=info msg="backend: publishing packet" topic="gateway/b827ebfffe997bd3/stats"
Sep 13 16:26:29 ttn-gateway loraserver[13144]: time="2017-09-13T16:26:29+02:00" level=info msg="backend/gateway: gateway stats packet received" mac=b827ebfffe997bd3
Sep 13 16:26:30 ttn-gateway loraserver[13144]: time="2017-09-13T16:26:30+02:00" level=info msg="gateway updated" mac=b827ebfffe997bd3
Sep 13 16:26:31 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:26:31+02:00" level=info msg="gateway: received udp packet from gateway" addr=127.0.0.1:40318 protocol_version=1 type=PullData
Sep 13 16:26:31 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:26:31+02:00" level=info msg="gateway: sending udp packet to gateway" addr=127.0.0.1:40318 protocol_version=1 type=PullACK
Sep 13 16:26:38 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:26:38+02:00" level=info msg="gateway: received udp packet from gateway" addr=127.0.0.1:48291 protocol_version=1 type=PushData
Sep 13 16:26:38 ttn-gateway lora-gateway-bridge[13126]: time="2017-09-13T16:26:38+02:00" level=info msg="gateway: rxpk packet received" addr=127.0.0.1:48291 data="gJkVASaAAQABMK9qEncJbwcWpqPfhz6EJMIqQxCoRWPvjg/yUcnDf35fv8uQmGcAVq

I will also check I/Os.

Sep 13 16:26:16 ttn-gateway loraserver[13144]: time="2017-09-13T16:26:16+02:00" level=info msg="backend/gateway: rx packet received"
Sep 13 16:26:18 ttn-gateway loraserver[13144]: time="2017-09-13T16:26:18+02:00" level=info msg="packet(s) collected" dev_eui=0010230450670890 gw_count=1 gw_macs=b827ebfffe997bd3 mtype=ConfirmedDataUp

Unfortunately, it only shows the seconds, but it looks like between these two steps some time is wasted. This is either because you set a really long --deduplication-delay (config) or LoRa Server is waiting for the Redis query to return. As far as I know, that is the only thing that blocks LoRa Server in this stage.

I have default value - 200ms. Will try to use something lower.

That is the highest IOs I saw.

Actual DISK READ:       0.00 B/s | Actual DISK WRITE:      85.43 K/s
  PID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
   94 be/3 root          0.00 B    164.00 K  0.00 %  6.33 % [jbd2/mmcblk0p2-]
11891 be/4 postgres      0.00 B     88.00 K  0.00 % 11.91 % postgres: 9.6/main: lor~rver_ns ::1(45368) idle
11857 be/4 postgres      0.00 B    248.00 K  0.00 %  1.90 % postgres: 9.6/main: lor~rver_ns ::1(45328) idle
  462 be/4 postgres      0.00 B    324.00 K  0.00 %  1.43 % postgres: 9.6/main: checkpointer process
  378 be/4 root          0.00 B    160.00 K  0.00 %  0.54 % rsyslogd -n
  464 be/4 postgres      0.00 B     24.00 K  0.00 %  0.04 % postgres: 9.6/main: wal writer process
11169 be/4 root          0.00 B      0.00 B  0.00 %  0.01 % [kworker/u8:0]