Troubleshoot Bandwidth and Throughput Issues
For supported software information, see Supported Software Information below.
This article describes steps for troubleshooting bandwidth and throughput issues.
Check for Half-Duplex Issues and Link Speed
To check the link speed and to check for half-duplex issues, issue the following CLI command:
admin@cli > show interfaces detail interface-name
For example:
admin@CPE3-cli> show interfaces detail vni-0/0
Interface: vni-0/0
Tenant
Vlan-Id
Administrative status : up
Operational status : n/a
Protocols Down : up
Interface index : 1061
Interface Role : external
MAC address : 00:90:0b:54:e9:08
IP address : n/a
Obtained from DHCP : False
DHCP Server IP : n/a
DHCP Lease Time : n/a
DHCP Lease Expiry : n/a
Name Server 1 Address : n/a
Name Server 2 Address : n/a
Routing instance : Internet-Transport-VR (10)
Host interface : eth1
MTU : 1500
Duplex / Speed : half-duplex / 100mbps
RX packets:12767402441 errors:0
RX bytes:15555320808343
TX packets:22243079055 errors:2
TX bytes:27602690732046
The Duplex / Speed field in the command output shows that the interface is in half-duplex mode and operating at 100 Mbps. The interface should not be in half-duplex mode, and here the 100-Mbps link speed is also incorrect.
Fix any half-duplex and link-speed issues by correcting the configuration on the device to which the VOS device is connected. For the transmission modes configured on the ISP side, check for an auto/auto (autonegotiation) configuration.
Check for Asymmetrical SD-WAN Paths
Check that there are no asymmetrical SD-WAN paths. An example is a path on which traffic is transmitted on one transport network and returns on another transport that has different bandwidth.
To check whether traffic traverses an asymmetrical SD-WAN path, issue the following CLI command:
admin@cli > show orgs org organization-name sessions sdwan brief
For example:
admin@cli > show orgs org Tenant-Common sessions sdwan brief
VSN  VSN  SESS                                     SOURCE  DESTINATION
ID   VID  ID     SOURCE IP        DESTINATION IP   PORT    PORT         PROTOCOL  NATTED  SDWAN  APPLICATION  RX WAN CKT         TX WAN CKT
---------------------------------------------------------------------------------------------------------------------------------------------
0    2    33287  192.168.50.100   192.168.106.100  1024    1024         17        No      Yes    unknown_udp  Internet:Internet  Internet:Internet  Silver-Customer-CPE50
0    2    33293  192.168.55.100   192.168.106.100  1024    1024         17        No      Yes    unknown_udp  Internet:Internet  Internet:Internet  Silver-Customer-CPE55
0    2    33295  192.168.59.100   192.168.106.100  1024    1024         17        No      Yes    unknown_udp  Internet:Internet  Internet:Internet  Silver-Customer-CPE59
0    2    33306  192.168.2.100    192.168.106.100  1024    1024         17        No      Yes    unknown_udp  Internet:Internet  Internet:Internet  Silver-Customer-CPE2
0    2    33306  192.168.3.100    192.168.106.100  1024    1024         17        No      Yes    unknown_udp  Internet:Internet  Internet:Internet  Silver-Customer-CPE3
0    2    33307  192.168.5.100    192.168.106.100  1024    1024         17        No      Yes    unknown_udp  Internet:Internet  Internet:Internet  Silver-Customer-CPE5
0    2    33308  192.168.12.100   192.168.106.100  1024    1024         17        No      Yes    unknown_udp  Internet:Internet  Internet:Internet  Silver-Customer-CPE12
0    2    33320  192.168.19.100   192.168.106.100  1024    1024         17        No      Yes    unknown_udp  Internet:Internet  Internet:Internet  Silver-Customer-CPE19
0    2    33362  192.168.106.100  192.168.31.100   1024    1024         17        No      Yes    unknown_udp  Internet:Internet  Internet:Internet  Silver-Customer-CPE31
0    2    33363  192.168.106.100  192.168.33.100   1024    1024         17        No      Yes    unknown_udp  Internet:Internet  MPLS:MPLS          Silver-Customer-CPE33
0    2    33364  192.168.106.100  192.168.36.100   1024    1024         17        No      Yes    unknown_udp  Internet:Internet  MPLS-v6:MPLS-v6    Silver-Customer-CPE36
0    2    33365  192.168.106.100  192.168.61.100   1024    1024         17        No      Yes    unknown_udp  MPLS-v6:MPLS-v6    Internet:Internet  Silver-Customer-CPE61
0    2    33366  192.168.106.100  192.168.62.100   1024    1024         17        No      Yes    unknown_udp  MPLS-v6:MPLS-v6    MPLS:MPLS          Silver-Customer-CPE62
0    2    33367  192.168.106.100  192.168.66.100   1024    1024         17        No      Yes    unknown_udp  MPLS-v6:MPLS-v6    MPLS-v6:MPLS-v6    Silver-Customer-CPE66
0    2    33368  192.168.106.100  192.168.75.100   1024    1024         17        No      Yes    unknown_udp  Internet:Internet  Internet:Internet  Silver-Customer-CPE75
0    2    33369  192.168.106.100  192.168.81.100   1024    1024         17        No      Yes    unknown_udp  Internet:Internet  Internet:Internet  Gold-Customer-CPE81
If the SD-WAN paths are symmetrical, the circuits shown in the RX WAN CKT and TX WAN CKT fields must be the same on the local and remote sites. If, for example, a packet is transmitted to a remote branch on the Internet circuit and returns on an MPLS circuit, you may see different throughput based on the bandwidth available on the Internet and MPLS circuits.
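To illustrate this check, the following Python sketch flags sessions whose RX and TX WAN circuits differ. The session tuples are transcribed from the sample output above; the function name and data layout are illustrative, not part of the VOS software.

```python
# Hypothetical helper: flag asymmetric SD-WAN sessions from
# (session ID, RX WAN circuit, TX WAN circuit) tuples transcribed
# from "show orgs org ... sessions sdwan brief" output.
def find_asymmetric(sessions):
    """Return session IDs whose RX WAN circuit differs from the TX WAN circuit."""
    return [sid for sid, rx, tx in sessions if rx != tx]

sessions = [
    (33287, "Internet:Internet", "Internet:Internet"),  # symmetric
    (33363, "Internet:Internet", "MPLS:MPLS"),          # asymmetric
    (33366, "MPLS-v6:MPLS-v6", "MPLS:MPLS"),            # asymmetric
]
print(find_asymmetric(sessions))  # [33363, 33366]
```

Sessions flagged this way may show different throughput in each direction because each circuit has its own bandwidth.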
If you do not enable FEC and packet replication, the overhead added to packets is 82 bytes. Enabling FEC adds 12 bytes and enabling replication adds another 12 bytes, for a total of 24 bytes in addition to the 82 bytes. To account for these headers when performance testing traffic sent over tunnels, it is recommended that you use Spirent or Ixia tools to add 82 bytes to the packets.
For traffic going from one LAN to another over SD-WAN, the packet overhead is based on the packet size and type of traffic.
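The overhead arithmetic above can be sketched as follows. This is a minimal Python illustration of the byte counts cited in this article; the constant and function names are my own.

```python
# Per-packet SD-WAN tunnel overhead, per the figures in this article:
# 82 bytes base, plus 12 bytes each when FEC or replication is enabled.
BASE_OVERHEAD = 82
FEC_OVERHEAD = 12
REPLICATION_OVERHEAD = 12

def tunnel_overhead(fec=False, replication=False):
    """Total bytes added to each packet sent over the tunnel."""
    return (BASE_OVERHEAD
            + (FEC_OVERHEAD if fec else 0)
            + (REPLICATION_OVERHEAD if replication else 0))

print(tunnel_overhead())                            # 82
print(tunnel_overhead(fec=True, replication=True))  # 106
```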
Check that Packets Are Not Dropped by CoS
If a CoS shaper or rate limiter is configured on the VOS device, it may drop packets when the number of packets exceeds the configured shaping rate.
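To illustrate why a shaper drops traffic that exceeds its configured rate, here is a minimal byte-budget sketch in Python. The rates and packet sizes are made-up example values, not VOS internals.

```python
# Illustrative shaper: packets that arrive after the per-interval byte
# budget is exhausted are dropped, mirroring how a CoS shaper drops
# traffic that exceeds the configured shaping rate.
def shape(packet_sizes, rate_bytes_per_sec, duration_sec):
    """Return (sent, dropped) counts for one shaping interval."""
    budget = rate_bytes_per_sec * duration_sec  # bytes allowed this interval
    sent, dropped = 0, 0
    for size in packet_sizes:
        if size <= budget:
            budget -= size
            sent += 1
        else:
            dropped += 1
    return sent, dropped

# Ten 1500-byte packets against a 6000-byte/s shaper: only four fit.
print(shape([1500] * 10, rate_bytes_per_sec=6000, duration_sec=1))  # (4, 6)
```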
To check whether CoS is dropping packets, issue the following CLI commands:
admin@cli> show class-of-services interfaces brief
admin@cli> show class-of-services interfaces detail interface-name
admin@cli> show orgs org-services organization-name class-of-service qos-policies
admin@cli> show orgs org-services organization-name class-of-service app-qos-policies
For example:
admin@cli> show class-of-services interfaces brief
         TX       TX   TX                    TX       TX BYTES  QUEUE
NAME     PACKETS  PPS  DROPPED  TX BYTES     BPS      DROPPED   LEN
---------------------------------------------------------------------
vni-0/0  130475   726  465      1183016979   5069504  562734    0
vni-0/1  0        0    0        0            0        0         0
admin@cli> show class-of-services interfaces detail vni-0/0
Interface: vni-0/0
Traffic Stats:
  TX Packets         : 133214
  TX PPS             : 253
  TX Packets Dropped : 465
  TX Bytes           : 119712785
  TX bps             : 834688
  TX Bytes Dropped   : 562734
Port Stats:
  Traffic Class         TX Pkts  TX Dropped  TX Bytes   Bytes Dropped
  tc0 network-control   16289    0           11136743   0
  tc1 expedited-fwd     0        0           0          0
  tc2 assured-fwd       0        0           0          0
  tc3 best-effort       116925   465         108576042  562734
Pipe Stats:
  Pipe ID : 0
  Users   : [ vni-0/0.0 ]
  Traffic Class         TX Pkts  TX Dropped  TX Bytes   Bytes Dropped
  tc0 network-control   16289    0           11136743   0
  tc1 expedited-fwd     0        0           0          0
  tc2 assured-fwd       0        0           0          0
  tc3 best-effort       116925   465         108576042  562734
admin@cli> show orgs org-services Tenant-Common class-of-service qos-policies
                QOS RULE  HIT     QOS DROP      QOS DROP    FORWARD       FORWARD     SESSION     PPS POLICER   PPS POLICER    KBPS POLICER  KBPS POLICER
NAME            NAME      COUNT   PACKET COUNT  BYTE COUNT  PACKET COUNT  BYTE COUNT  DENY COUNT  PKTS DROPPED  BYTES DROPPED  PKTS DROPPED  BYTES DROPPED
------------------------------------------------------------------------------------------------------------------------------------------
Default-Policy  VOICE     123281  5193940       6210916995  201902        319018559   0           5193940       6210916994     0             0
admin@cli> show orgs org-services Tenant-Common class-of-service app-qos-policies
                APP QOS  RULE HIT  APP QOS DROP  APP QOS DROP  APP QOS FORWARD  APP QOS FORWARD
NAME            NAME     COUNT     PACKET COUNT  BYTE COUNT    PACKET COUNT     BYTE COUNT
-----------------------------------------------------------------------
Default-Policy  STREAM   30240     36371         50620084      998              1660980
To avoid packet drops, correct the CoS shaper and rate limiter configurations. If you are running throughput tests in a lab environment, remove the CoS configuration and verify the throughput.
Check that Traffic Sessions Use All Worker Cores
For the best throughput from a VOS device, all the CPU cores must be used.
The VOS software allocates separate CPU cores for control daemons, worker threads, and poller threads. The CPUs assigned to control cores run control daemons such as BGP and DHCPD. CPUs assigned to worker threads are responsible for tasks such as the forwarding plane, encryption, and decryption. CPUs assigned to poller threads are responsible for reading packets from NICs and passing them to worker threads, and for writing packets to NICs during transmission. The following figure illustrates the functioning of the CPU cores.
A session, as defined by a 5-tuple consisting of a source IP address, a destination IP address, a source port number, a destination port number, and a protocol, is processed by a single worker CPU.
To check that all the worker CPUs are being used for sessions, send enough sessions so that at least a few are processed by each worker CPU. It is recommended that you send traffic for at least 100 sessions while running throughput tests on an eight-core CPU so that at least a few sessions are processed by each core. The following sample output shows that the usage of each of the five worker CPUs is approximately balanced, although CPU 4 is being used somewhat less than the others.
$ vsh connect vsmd
vsm-vcsn0> show vsf per-thread nfp stats summary
Thr-Id    Sess-Active    Sess-Created    Sess-Closed
--------  -------------  --------------  -------------
0         25             1397618         1396726
1         18             1396258         1395344
2         20             1394215         1393289
3         15             1395297         1394376
4         22             1266916         1266045
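As a rough way to interpret output like the sample above, the following Python sketch checks whether active sessions are spread across worker threads. The 50 percent tolerance is an arbitrary illustration, not a VOS rule.

```python
# Illustrative balance check on per-thread active-session counts,
# as shown in "show vsf per-thread nfp stats summary" output.
def is_balanced(active_per_thread, tolerance=0.5):
    """True when no thread is idle and the spread stays within tolerance."""
    lo, hi = min(active_per_thread), max(active_per_thread)
    return lo > 0 and (hi - lo) / hi <= tolerance

# Counts transcribed from the sample output above: every core carries load.
print(is_balanced([25, 18, 20, 15, 22]))  # True
```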
Check for Underlay Throughput Issues
Ensure that the underlay is not dropping packets. For example, if a customer is trying to measure 10-Gbps throughput but the underlay switches are not capable of switching at 10 Gbps, packets are dropped.
Check that the SLA is not experiencing a PDU loss of 100 percent. For information about PDUs, see Configure SLA Profiles for SD-WAN Traffic Steering.
Check the input rate (in pps and bps) and the output rate (in pps and bps). To confirm, verify that packets transmitted out one side of a transport interface reach the other side of the transport.
To display TCP/IP and other packets being transmitted or received over a network, use the tcpdump utility on the WAN interface:
admin@cli> tcpdump vni-x/x filter remote-host
To check for drops on the circuit side, use the rapid ping utility on the VOS device, with a large count value, such as 1000:
admin@cli> ping ip-address rapid enable ?
Possible completions:
  count            - Number of pings to send
  df-bit           - Enable Do Not Fragment bit in IP header
  interface        - Source interface from where to send the ping
  packet-size      - Packet size to send
  record-route     - Displays the route buffer on returned packets
  routing-instance - Routing instance
  source           - Source IP address
Then check the interface statistics to ensure that the TX pps and bps counts on the local site match the RX pps and bps counts on the remote site. For example:
admin@cli> show interfaces port statistics brief
IF       HOST  OPER    RX        RX                RX      RX    TX        TX                TX      TX    RX     TX
NAME     INF   STATUS  PACKETS   PPS  RX BYTES     ERRORS  BPS   PACKETS   PPS  TX BYTES     ERRORS  BPS   USAGE  USAGE
-------------------------------------------------------------------------------------------------------------------
vni-0/0  eth1  up      22578104  1    3663729635   0       1376  23241202  1    4048636473   0       1056  0.0    0.0
vni-0/1  eth2  up      13188574  1    890447986    0       2216  1514288   1    112008904    0       160   0.0    0.0
vni-0/2  eth3  up      8959110   1    646170340    0       1192  8092802   1    566530672    0       1352  0.0    0.0
vni-0/3  eth4  down    0         1    0            0       0     0         1    0            0       0     0.0    0.0
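A simple way to compare the two sides is sketched below in Python. The 5 percent tolerance is an arbitrary example value, not a VOS threshold.

```python
# Illustrative comparison of local TX rate against remote RX rate, as read
# from "show interfaces port statistics brief" on each site.
def counters_match(local_tx_pps, remote_rx_pps, tolerance=0.05):
    """True when the remote RX rate is within tolerance of the local TX rate."""
    if local_tx_pps == 0:
        return remote_rx_pps == 0
    return abs(local_tx_pps - remote_rx_pps) / local_tx_pps <= tolerance

print(counters_match(1000, 990))  # True: ~1% difference, counters match
print(counters_match(1000, 700))  # False: ~30% of packets lost in the underlay
```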
To run ping and tcpdump from the Director GUI, see Access Monitoring Tools.
Check whether Application Offload Is Enabled
First, if NGFW or UTM is not configured, check whether application offload is enabled:
admin@cli> show configuration orgs org-services organization-name application-identification application-generic-options
Note that if NGFW or UTM is enabled, it is recommended that you disable application offload. If you enable it, with HTTP Version 1.1 or later, different transactions of a connection may be identified as different applications. For example, if a Facebook session later reuses the same connection to exchange chat messages, it might be identified as Facebook Messenger instead of as Facebook.
If application offload is not enabled, enable it:
admin@cli> set orgs org-services tenant-id application-identification application-generic-options offload enabled
Then, check whether isolcpu is enabled; isolcpu isolates the CPUs from the kernel scheduler. When you are doing performance throughput testing, if you want to achieve close to no packet loss (that is, a packet loss of < 0.01%), it is recommended that you enable isolcpu.
Check whether isolcpu is enabled:
admin@cli> request system isolate-cpu status
status isolcpu disabled
If isolcpu is not enabled, enable it:
admin@cli> request system isolate-cpu enable
status GRUB PARAMETERS HAVE CHANGED. PLEASE REBOOT THE SYSTEM FOR VERSA-FLEXVNF TO FUNCTION CORRECTLY.

admin@cli> request system isolate-cpu status
status isolcpu enabled with num-control-cpus 1
Check that Sessions Are Load-Balanced on All Workers
To check that traffic sessions are load balanced equally across all the worker cores, issue the vsh connect vsmd and show vsf per-thread nfp stats summary commands. For more information, see Check that Traffic Sessions Use All Worker Cores, above.
If sessions are not load-balanced across worker threads, issue the following commands to check the class of traffic being received:
$ vsh connect vsmd
vsm-vcsn0> show vsm anchor core map
+--------+--------+---------+--------+---------+
|H-Index | NC Core| EF Core | AF Core| BE Core |
+--------+--------+---------+--------+---------+
|   0    |   0    |    1    |   1    |    0    |
|   1    |   0    |    2    |   2    |    1    |
|   2    |   0    |    3    |   3    |    2    |
|   3    |   0    |    4    |   4    |    3    |
|   4    |   0    |    5    |   5    |    4    |
|   5    |   0    |    1    |   1    |    5    |
+--------+--------+---------+--------+---------+

vsm-vcsn0> show vsm cq stats
+-------+--------+-----------+-------+-------+
| W TID | CTRL   | DATA      | EF    | AF    |
+-------+--------+-----------+-------+-------+
| 0     | 528669 | 356169364 | 0     | 0     |
| 1     | 199    | 330210649 | 0     | 0     |
| 2     | 160    | 339295575 | 0     | 0     |
| 3     | 200    | 337426918 | 0     | 0     |
| 4     | 189    | 313042396 | 0     | 0     |
| 5     | 157    | 301416739 | 0     | 0     |
+-------+--------+-----------+-------+-------+
By default, the VOS software maps traffic for a given class to particular worker cores. You can configure changes to these mappings.
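The anchor core map can be thought of as a lookup table from hash index and traffic class to worker core. The following Python sketch uses values transcribed from the sample output above; the dictionary keys and function name are my own choices, not VOS identifiers.

```python
# Class-to-core lookup modeled on the "show vsm anchor core map" output.
# h_index rows 0-2 are transcribed from the sample; keys are illustrative.
ANCHOR_CORE_MAP = {
    0: {"nc": 0, "ef": 1, "af": 1, "be": 0},
    1: {"nc": 0, "ef": 2, "af": 2, "be": 1},
    2: {"nc": 0, "ef": 3, "af": 3, "be": 2},
}

def core_for(h_index, traffic_class):
    """Return the worker core handling this traffic class for this hash index."""
    return ANCHOR_CORE_MAP[h_index][traffic_class]

print(core_for(1, "ef"))  # 2: EF traffic with hash index 1 lands on core 2
```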
If the sessions are not equally distributed across worker cores and throughput is less than expected, contact Versa Network Customer Support.
Check for Fragmented Packets
Check that there are not too many fragmented packets. Fragmentation and reassembly are CPU-intensive tasks, so throughput decreases if there are too many fragments.
Tunnel overhead is added to traffic transiting an SD-WAN tunnel. If large packets arrive from the LAN network and must then be sent to the WAN, the VOS device may fragment the packets before sending them over the SD-WAN tunnel. Fragmented packets are reassembled at the remote site before they are sent to the customer LAN.
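The effect of tunnel overhead on fragmentation can be sketched as follows. This is a simplified Python illustration that ignores per-fragment IP header repetition; the 82-byte overhead figure comes from earlier in this article.

```python
# Simplified estimate of how many fragments a packet needs once tunnel
# overhead is added, assuming a 1500-byte MTU and 82 bytes of overhead.
import math

def fragments_needed(packet_len, mtu=1500, overhead=82):
    """Fragments required to carry packet_len bytes through the tunnel."""
    payload_per_fragment = mtu - overhead  # room left after tunnel headers
    return math.ceil(packet_len / payload_per_fragment)

print(fragments_needed(1400))  # 1: fits in a single tunnel packet
print(fragments_needed(1500))  # 2: a full-MTU packet must be fragmented
```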
The Director node adjusts the MSS for TCP packets transiting SD-WAN tunnels. If TCP MSS adjust is set for the tunnel, TCP packets are not fragmented. Instead, only larger UDP packets that may not fit into the SD-WAN tunnel are fragmented.
To check whether TCP MSS adjust is enabled, issue the following command:
admin@cli> show configuration system session tcp-adjust-mss
enable          true;
interface-types all;
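The MSS adjustment idea can be sketched arithmetically. This is an illustration only; the exact overhead the VOS software subtracts may differ from these example figures.

```python
# Illustrative MSS clamp: a full TCP segment plus tunnel overhead must
# still fit in the WAN MTU. The 40 bytes cover the IPv4 and TCP headers;
# 82 bytes is the base tunnel overhead cited in this article.
def adjusted_mss(mtu=1500, tunnel_overhead=82, ip_tcp_headers=40):
    """Largest TCP MSS that avoids fragmentation over the tunnel."""
    return mtu - tunnel_overhead - ip_tcp_headers

print(adjusted_mss())  # 1378
```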
To check the number of packets that have been fragmented and reassembled, issue the following commands:
$ vsh connect vsmd
vsm-vcsn0> show vsm statistics datapath
# Packets Punt to WT : 63784
# Fragments Received for Reassembly : 47692
# Packets Reassembled : 23846
# Packets FDT Action Error : 22
# Pipeline Session Lookup - 2nd time local : 1681
# Allowed - Filter Lookup : 333533
# Forward - Non-local tunneled pkt, decaps not done : 24
# Forwarded - Filter Lookup : 333533
# Forwarded - SFW No Match : 333533
# Sent - ARP to CT : 263
# Passed - Host-bound rate limit : 333533
# Injected - into VUNET : 46685
# Packets FDT Action Error : 22
# Packets Dropped - Interface disabled : 3327
# Packets Dropped - Filter Lookup Module Action Denied : 46761
# Packets Dropped - Tunnel Decaps pkt processing error : 5988
For customer traffic whose DF bit is set, when the traffic arrives on a LAN or WAN network but fragmentation is needed to send it over an SD-WAN tunnel, the VOS device sends the ICMP error message “DF bit set but fragmentation needed” to the sender. Most network devices react to this message by sending future packets in which the DF bit is not set. However, some SIP phones and legacy devices, such as RADIUS servers, do not respond to this ICMP error message and continue to send packets with the DF bit set. As a result, these packets are dropped. To handle these situations, configure the override-df-bit tunnels option. Then, when traffic requires fragmentation but the DF bit is set and the sender does not respond to the ICMP error message, the VOS device clears the DF bit, fragments the packets, and sends them over the SD-WAN tunnels. At the other end of the tunnel, the fragments are reassembled and the DF bit is reset.
To check whether the override-df-bit tunnels option is set, issue the following command:
admin@cli> show configuration orgs org-services organization-name options override-df-bit
For example:
admin@cli> show configuration orgs org-services Tenant-Common options override-df-bit
override-df-bit tunnels;
To check packet fragmentation on the SD-WAN tunnel, issue the following commands:
$ vsh connect vsmd
vsm-vcsn0> show vsf tunnel stats
--------------------------------------------------------------
Tunnel encap stats
--------------------------------------------------------------
Tunnel Encap Processing successful: 246827441
Tunnel Encap Processing dropped: 978
Tunnel IP-UDP transport encap forwarded: 246827441
Tunnel MPLSoGRE encap forwarded: 246827441
Tunnel VXLAN-GPE encap forwarded: 246827441
Tunnel IPsec-ESP encap forwarded: 234464119
Tunnel IPsec-ESP encap scheduled: 234464119
Tunnel Encap packet map not found, dropped: 24
Tunnel Encap Pre-processing dropped: 978
Tunnel Encap Pre-processing pre-fragmented: 14580958
Tunnel Encap Pre-processing Fragments: 29161916
Tunnel Encap Send completed: 246827441
Tunnel Encap ether output completed: 246827441
Tunnel Encap Invalid Access circuit dropped: 3
Tunnel Overhead Calculation failed: 951
Tunnel Pkts switched to valid AC: 112
Tunnel Pkts switched to mgmt tenant: 1304651
--------------------------------------------------------------
Check for Packet Punting across Worker Threads
The traffic for a session is processed by a single worker core. To anchor a session to a worker core, a 5-tuple is used, consisting of a source IP address, a destination IP address, a source port number, a destination port number, and a protocol. All the traffic between the local site and a remote site travels over a single SD-WAN tunnel, which has the same 5-tuple for all customer sessions carried in the tunnel.
To anchor a session on a core, the worker thread must perform decapsulation on the tunnel, which is a CPU-intensive operation. To achieve load balancing among worker threads at the remote end, the local site sends to the remote site a CRC of the 5-tuple in the encapsulation header. The remote site then anchors the session based on the CRC. It is possible that some sessions are anchored on an incorrect core and are then later punted to the correct core. If a large number of packets are being punted or if the rate of punting is high, the throughput might decrease.
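The CRC-based anchoring described above can be sketched in Python. Here, `zlib.crc32` stands in for whatever CRC the software actually computes, and the function name and key format are illustrative.

```python
# Illustrative session anchoring: hash the 5-tuple and map it to one of
# the worker cores, so every packet of a session lands on the same core.
import zlib

def anchor_core(src_ip, dst_ip, src_port, dst_port, proto, num_workers):
    """Pick a worker core deterministically from the session 5-tuple."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return zlib.crc32(key) % num_workers

core = anchor_core("192.168.50.100", "192.168.106.100", 1024, 1024, 17, 6)
print(0 <= core < 6)  # True: the session is pinned to one of six workers
```

Because the hash is deterministic, every packet of the same session maps to the same core; different sessions spread across the cores.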
To check the number of packets being punted to a different worker thread (WT), issue the following commands:
$ vsh connect vsmd
vsm-vcsn0> show vsm statistics datapath
# Packets Punt to WT : 1662602
# Fragments Received for Reassembly : 226354
# Packets Reassembled : 113177
# Packets FDT Action Error : 49
# Pipeline Session Lookup - 2nd time punt : 16
# Pipeline Session Lookup - 2nd time local : 2259530
# Allowed - Filter Lookup : 6945526
# Forward - Non-local tunneled pkt, decaps not done : 59
# Forwarded - Filter Lookup : 6945526
# Forwarded - SFW No Match : 81058422
# Sent - ARP to CT : 1479
# Passed - Host-bound rate limit : 6945526
# Injected - into VUNET : 904280
# Packets FDT Action Error : 49
# Packets Dropped - Interface disabled : 127521
# Packets Dropped - Interface disabled Reinject : 27
# Packets Dropped - Tuple Extract Failure : 22
# Packets Dropped - Filter Lookup Module Action Denied : 439633
# Packets Dropped - Tunnel Decaps pkt processing error : 68162
# Packets Dropped - Packet reinject ttl expired : 142307
If you have enabled NAT, firewall, and HA, it is expected that packets are punted between worker threads, so the output may report a large number of punted packets.
If packets are punted to different worker threads at a high rate and throughput is less than expected, contact Versa Networks Customer Support.
Check the Poller Count
Typically, a VOS device allocates one poller CPU for each 10 Gbps of Tx/Rx link capacity. For example, if there are 6x1-Gbps and 2x10-Gbps interfaces, the VOS device may assign three poller CPUs. The poller CPUs are assigned when Versa services come up during a boot, reboot, or restart. Even if some NICs are not connected or used, poller CPUs are assigned based on the number of NICs present in the VOS device.
To check the number of poller CPUs assigned, issue the following commands:
$ vsh connect vsmd
vsm-vcsn0> show vsm cpu info
VSM CPU info:
--------------------------------------
# of CPUs            : 8
# of poller threads  : 1
# of worker threads  : 6
# of control threads : 1
Used CPUs    : [ 0 1 2 3 4 5 6 7 ]
Poller CPUs  : [ 7 ]
Worker CPUs  : [ 1 2 3 4 5 6 ]
Control CPUs : [ 0 ]
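The one-poller-CPU-per-10-Gbps rule of thumb can be sketched in Python. This is an illustration of the arithmetic, not the exact VOS allocation algorithm.

```python
# Illustrative poller-count estimate: roughly one poller CPU per 10 Gbps
# of aggregate NIC capacity, with a minimum of one.
import math

def poller_cpus(interface_speeds_gbps):
    """Estimate poller CPUs needed for the given interface speeds (Gbps)."""
    total = sum(interface_speeds_gbps)
    return max(1, math.ceil(total / 10))

# 6 x 1G + 2 x 10G = 26 Gbps -> 3 poller CPUs, matching the example above.
print(poller_cpus([1] * 6 + [10] * 2))  # 3
```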
If some NICs are not used, you can reduce the number of poller CPUs assigned to make more CPUs available as worker cores. To change the number of poller CPUs, issue the following CLI command:
admin@cli> set system service-options poller-count number
Check Worker and Poller CPU Utilization and Drops
Check the worker and poller CPU usage. If the CPUs are already running at 100 percent, the VOS device has reached its maximum throughput, even if you have enabled all optimizations.
By default, VOS devices run in performance mode. However, if you change the run mode to hyper, the CPUs run at 100 percent even if there are no packets.
To check worker and poller CPU utilization, issue the following commands, pressing 1 to display per-CPU utilization:
admin@vos$ htop admin@vos$ top -H
To check for high memory usage, issue the following command:
admin@vos$ top -o %MEM
To check for high CPU usage, issue the following command:
admin@vos$ top -o %CPU
For example:
admin@vos$ htop
1 [||||||||||              ]   Tasks: 89, 69 thr; 3 running
2 [||||||||||||            ]   Load average: 0.68 0.72
3 [||||||||||||            ]   Uptime: 46 days, 23:41:30
4 [|||||||                 ]
Mem[||||||||||||||||||||||||||||||||||||||||||||||||2713/3007MB]
Swp[                                                           ]
admin@vos$ top -H
top - 14:29:09 up 46 days, 23:43, 2 users, load average: 0.92, 0.77, 0.74
Threads: 259 total, 5 running, 254 sleeping, 0 stopped, 0 zombie
%Cpu(s): 14.8 us, 2.7 sy, 0.0 ni, 82.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem:  3080100 total, 2998636 used, 81464 free, 109412 buffers
KiB Swap: 0 total, 0 used, 0 free, 111256 cached Mem

  PID USER   PR NI    VIRT    RES  SHR S %CPU %MEM     TIME+ COMMAND
20630 root   20  0 2609412 0.983g 8904 S 20.9 33.4  10374:38 worker-0
20631 root   20  0 2609412 0.983g 8904 S 20.6 33.4  10124:04 worker-1
20803 root   20  0 2609412 0.983g 8904 R 14.3 33.4   7090:53 ipsec-control
20632 root   20  0 2609412 0.983g 8904 S 10.0 33.4   4871:50 poller-0
19649 versa  20  0   50708  12248    0 S  2.0  0.4 668:51.32 versa-certd
19509 root   20  0 2609412 0.983g 8904 R  1.0 33.4 398:33.67 versa-vsmd
20798 root   20  0 2609412 0.983g 8904 R  1.0 33.4 501:34.86 ctrl-data-0
20779 root   20  0 2609412 0.983g 1524 R  0.7 33.4 306:05.67 vunet-timer
 2238 root   20  0   89320  15268 8904 S  0.3  0.5  40:09.24 vmtoolsd
19790 root   20  0  107656   2068 1384 S  0.3  0.1  35:50.74 monit
20507 root   20  0   39384   7724 1572 S  0.3  0.3 110:27.30 redis-server
    1 root   20  0   34096   3428 1436 S  0.0  0.1   0:07.61 init
    2 root   20  0       0      0    0 S  0.0  0.0   0:00.32 kthreadd
...
admin@vos$ top -o %MEM
top - 14:30:43 up 46 days, 23:44, 2 users, load average: 0.58, 0.69, 0.72
Tasks: 187 total, 1 running, 186 sleeping, 0 stopped, 0 zombie
%Cpu(s): 13.1 us, 2.8 sy, 0.0 ni, 84.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem:  3080100 total, 2995984 used, 84116 free, 109500 buffers
KiB Swap: 0 total, 0 used, 0 free, 111272 cached Mem

  PID USER   PR NI    VIRT    RES  SHR S %CPU %MEM     TIME+ COMMAND
19509 root   20  0 2609412 0.983g 8904 S 69.9 33.4  33669:05 versa-vsmd
19509 root   20  0 2609412 0.983g 8904 S 69.9 33.4  59:27.61 confd
19509 versa  20  0 2609412 0.983g 8904 S 69.9 33.4  11:57.38 versa-vmod
19509 versa  20  0 2609412 0.983g 8904 S 69.9 33.4 102:04.73 versa-acctmgrd
19509 versa  20  0 2609412 0.983g 8904 S 69.9 33.4  29:07.86 versa-rtd
19509 versa  20  0 2609412 0.983g 8904 S 69.9 33.4   0:00.02 versa-fltrmgr
19509 versa  20  0 2609412 0.983g 8904 S 69.9 33.4  11:47.17 versa-dhcpd
19509 versa  20  0 2609412 0.983g 8904 S 69.9 33.4   0:00.74 nodejs
...
admin@vos$ top -o %CPU
top - 02:34:38 up 195 days, 6:09, 2 users, load average: 3.22, 3.28, 3.32
Tasks: 208 total, 1 running, 207 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.4 us, 0.2 sy, 0.0 ni, 52.3 id, 0.0 wa, 0.0 hi, 0.0 si, 47.1 st
KiB Mem:  4045984 total, 3904600 used, 141384 free, 34712 buffers
KiB Swap: 0 total, 0 used, 0 free, 76992 cached Mem

  PID USER   PR NI    VIRT    RES  SHR S %CPU %MEM      TIME+ COMMAND
31529 root   20  0 3649752 1.969g 6288 S 66.8 51.0 174511:005 versa-vsmd
31671 versa  20  0   85912  40488 2048 S  1.7  1.0    1740:15 versa-acctmgrd
17560 admin  20  0   25208   2924 2368 R  1.3  0.1   00:00.20 top
32555 versa  20  0   47584  13796 1840 S  1.3  0.3    1738:28 redis-server
    7 root   20  0       0      0    0 S  1.0  0.0    4460:37 rcu_sched
   13 root   20  0       0      0    0 S  0.7  0.0  873:23.18 ksoftirqd/1
31693 versa  20  0   50708  13620 1364 S  0.7  0.3    4102:51 versa-certd
31655 versa  20  0  149312  36492 3592 S  0.3  0.9  205:57.88 versa-rtd
    1 root   20  0   33920   3392 1580 S  0.0  0.1    1:01.28 init
    2 root   20  0       0      0    0 S  0.0  0.0    0:18.98 kthreadd
    3 root   20  0       0      0    0 S  0.0  0.0  568:49.96 ksoftirqd/0
    5 root    0 -20       0      0    0 S  0.0  0.0    0:00.00 kworker/0:0H
    8 root   20  0       0      0    0 S  0.0  0.0    0:00.00 rcu_bh
    9 root   rt  0       0      0    0 S  0.0  0.0  198:10.12 migration/0
   10 root   rt  0       0      0    0 S  0.0  0.0    1:49.48 watchdog/0
   11 root   rt  0       0      0    0 S  0.0  0.0    1:33.12 watchdog/1
admin@vos$ top -H
Threads: 283 total, 9 running, 273 sleeping, 0 stopped, 1 zombie
%Cpu(s): 18.3 us, 4.1 sy, 0.0 ni, 77.2 id, 0.1 wa, 0.0 hi, 0.4 si, 0.0 st
KiB Mem:  16405444 total, 10084552 used, 6320892 free, 257940 buffers
KiB Swap: 16748540 total, 0 used, 16748540 free, 1741824 cached Mem

  PID USER   PR NI    VIRT    RES   SHR S %CPU %MEM     TIME+ COMMAND
29813 root   20  0 7949212 2.889g 85732 R 98.3 18.5 114:43.34 worker-1
29815 root   20  0 7949212 2.889g 85732 R 98.3 18.5 112:21.57 worker-3
29816 root   20  0 7949212 2.889g 85732 R 98.3 18.5 108:11.48 worker-4
29817 root   20  0 7949212 2.889g 85732 R 98.3 18.5 119:33.45 worker-5
29812 root   20  0 7949212 2.889g 85732 R 98.3 18.5 125:59.33 worker-0
29814 root   20  0 7949212 2.889g 85732 R 98.3 18.5 105:43.43 worker-2
29818 root   20  0 7949212 2.889g 85732 R 17.3 18.5 112:34.01 poller-0
The following flags are present in the output of the top -H command:
- us—User. Time running un-niced user processes.
- sy—System. Time running kernel processes.
- ni—Nice. Time running niced user processes.
- wa—IO-wait. Time waiting for I/O completion.
- hi—Time spent servicing hardware interrupts.
- si—Time spent servicing software interrupts.
- st—Steal. Time stolen from this VM by the hypervisor. If the KVM host or hypervisor is oversubscribed or has high CPU usage, this number is high.
If the worker and poller CPUs are running at 100 percent, packet drops may occur at the worker and poller. To check for worker and poller drops, issue the following commands:
$ vsh connect vsmd
vsm-vcsn0> show vsm statistics dropped
DPDK ERROR STATISTICS
~~~~~~~~~~~~~~~~~~~~~~~~~
DATAPATH ERROR STATISTICS
~~~~~~~~~~~~~~~~~~~~~~~~~
# Packets FDT Action Error : 22
# Packets Dropped - Interface disabled : 3382
# Packets Dropped - Stale Fragment Entry : 1634
# Packets Dropped - Filter Lookup Module Action Denied : 47571
# Packets Dropped - Tunnel Decaps pkt processing error : 5988
THRM ERROR STATISTICS
~~~~~~~~~~~~~~~~~~~~~
POLLER PID : 29818
# Drop Packets RX : 12694125
# Drop Packets TX : 40
NFP ERROR STATISTICS
~~~~~~~~~~~~~~~~~~~~
# Packets Dropped - Invalid session handle : 1
# Number of calls to icmp_error : 294
VSF ERROR STATISTICS
~~~~~~~~~~~~~~~~~~~~
# Sess Create Denied (mbuf sanity fail) : 58
# Route lookup failure (ip-out) : 24
VUNET ERROR STATISTICS
~~~~~~~~~~~~~~~~~~~~~~
# VN_MOD_IP_ERR_NO_ROUTE_CNT : 266
# VN_MOD_ETH_ERR_BAD_TYPE_CNT : 6764
COS DROPS
~~~~~~~~~~~~
# Shaper drops : 0
For more details about where the packets are dropped, issue the following command:
vsm-vcsn0> show vsm statistics thrm detail
If the worker and poller are not running at 100 percent but you still see packet drops in the poller or worker, and the throughput is less than expected after you have performed all the checks above, contact Versa Networks Customer Support for further debugging.
Verify Link Bandwidth
To verify link bandwidth, run the automatic bandwidth test. For more information, see Troubleshoot Link Bandwidth Issues.
Supported Software Information
Releases 20.2 and later support all content described in this article.