Troubleshoot Log Export Functionality Issues
For supported software information, see Supported Software Information, below.
The Versa Operating System™ (VOS™) log export functionality (LEF) forwards logs from VOS devices to log collector nodes that are part of an Analytics cluster. Typically, the logs are sent to the Analytics nodes using a Controller node as an intermediary. Logs are forwarded between branch devices, Controller nodes, and Analytics nodes over TCP or UDP connections, which are referred to as LEF connections.
If LEF connections are improperly configured or if operational errors occur on them, either the logs are never received by the Analytics nodes or the logs received produce an error. In both cases, the logs are never processed into the Analytics database or search engine on the Analytics cluster.
To identify and resolve log forwarding issues:
- Verify the LEF configuration on the VOS devices.
- Display LEF statistics to confirm that a VOS device is exporting logs.
- Trace LEF connections from VOS devices to log collector nodes.
- Verify that logs are received on log collector nodes.
- Verify that log collector nodes are not overwhelmed by too many LEF connections.
This article describes how to perform these verification and tracing operations.
Note that VOS devices can also create LEF connections to third-party log collection devices. In this case, the verification steps that you perform on Analytics log collector nodes do not apply to the troubleshooting process.
Verify the LEF Configuration on the VOS Device
When VOS devices do not properly forward logs or when Analytics nodes do not receive logs, the cause may be that LEF is configured incorrectly on the VOS branch or Controller device.
Each tenant on a VOS device uses a separate set of LEF profiles, groups, and collectors, so you must verify the LEF configuration for each tenant.
In typical configurations, the tenant's software features and services are configured to export logs to the active LEF collector associated with the default LEF profile. VOS devices automatically forward critical logs to this collector. Critical logs include those for alarms, LTE summary information, MOS summary information, SD-WAN SLA metrics, SD-WAN traffic steering, and system load. To capture these critical logs, it is recommended that for each tenant, you designate a LEF profile to be the default profile.
To ensure that LEF is configured for a tenant on a VOS device, first navigate to the LEF configuration screens:
- In Director view, select the Administration tab in the top menu bar.
- Select Appliances in the left menu bar.
- Select a VOS device in the main pane. The view changes to Appliance view.
- Select Configuration > Objects & Connectors > Connectors > Reporting > Log Export Functionality in the left menu bar.
- Select a tenant in the Organization field. The main pane displays LEF information for the tenant.
Verify that a LEF profile is present:
- Select the Profiles tab in the horizontal menu bar. The main pane displays a list of the LEF profiles that are already configured.
- Ensure that at least one profile is listed.
- Check the Default column, and ensure that one of the LEF profiles is designated as the default profile. Having a default profile ensures that critical logs are automatically logged. Only one profile can be designated as the default. The following screenshot shows that the profile called Default-Logging-Profile is the default LEF profile.
- If there is no default LEF profile, or if you want to change which LEF profile is the default, select one of the profiles listed in the Profiles tab. In the Edit Profile popup window, click the Default field, and then click OK. If you select a different profile to be the default, the existing default LEF profile is automatically unselected.
Verify that at least one LEF collector is associated with each LEF profile:
- In the Edit Profile popup window, if the Collector field is selected, note the name of the LEF collector. If the Collector Group field is selected, note the name of the collector group. In the screenshot above, the Collector Group field is selected and the collector group name is Default-Collector-Group.
- Click Cancel to close the popup window.
Make a note of LEF collector and collector group information for use later in the troubleshooting process:
- For a LEF collector, select the Collectors tab in the horizontal menu bar, and then note the collector's name, IP address, and port number. This information is useful when tracing LEF connections, as described in Trace LEF Connections, below.
- For a LEF collector group, select the Collector Groups tab, and then note the names of the LEF collectors listed in the Collectors column. This information is useful when tracing LEF connections, as described in Trace LEF Connections, below.
Finally, ensure that LEF is configured for each of the VOS services and software features that the tenant uses. For information about how to associate a LEF profile with a VOS feature or service and how to verify that LEF is configured for a feature or service, see Apply Log Export Functionality.
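The profile checks described in this section can be expressed as a small validation routine. The following Python sketch is illustrative only: the LefProfile class and the check_lef_profiles function are hypothetical names, and the data is assumed to have been copied by hand from the Profiles tab. It does not query a Director node or a VOS device.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LefProfile:
    """Hypothetical stand-in for one row of the Profiles tab."""
    name: str
    is_default: bool = False
    collector: Optional[str] = None
    collector_group: Optional[str] = None


def check_lef_profiles(profiles: List[LefProfile]) -> List[str]:
    """Return a list of problems for one tenant; an empty list means OK."""
    problems: List[str] = []
    if not profiles:
        # At least one profile must be listed for the tenant.
        return ["no LEF profile is configured"]

    # Only one profile can be designated as the default.
    defaults = [p.name for p in profiles if p.is_default]
    if not defaults:
        problems.append("no LEF profile is designated as the default")
    elif len(defaults) > 1:
        problems.append("more than one default profile: " + ", ".join(defaults))

    # Each profile needs a collector or a collector group.
    for p in profiles:
        if not (p.collector or p.collector_group):
            problems.append(f"profile {p.name} has no collector or collector group")
    return problems
```

Run the check once for each tenant, because each tenant uses a separate set of LEF profiles, groups, and collectors.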
Display LEF Statistics
You can ensure that log data is being forwarded by a VOS device by displaying LEF statistics. These statistics include a count of logs that LEF has exported from the device. Counts are listed by log category, giving you a general idea of whether the VOS device is exporting logs for a specific software feature or service.
After you display the statistics, you can wait a few minutes and then redisplay them to see whether the log counts have incremented. Doing this allows you to verify that logs are currently being exported by the VOS device. If the log counts do not increment for a specific log category, it is possible that that type of log is misconfigured on the device.
To display LEF statistics:
- Determine the identifier number for the tenant's active collector.
- Log in to the shell on the VOS device, and connect to the Versa Services Management Daemon (vsmd), which is a component of the VOS distributed data plane:
admin@Branch1$ vsh connect vsmd
vsm-vcsn0>
- To view a list of LEF profiles configured for each tenant, issue the show lef exporter config profile command. Locate the LEF profile name associated with the VOS software feature or service, and note the name of the collector group associated with the profile. You must know the tenant ID number to identify which entry belongs to the tenant. To look up the tenant ID in the Director GUI, in Director view, select Administration > Organizations, and then in the main pane locate the tenant in the Organization Name column. The tenant ID is listed in the Global Organization ID column.
The following example shows the output for tenant number 3. Here, the LEF profile name is Default-Logging-Profile.
vsm-vcsn0> show lef exporter config profile
...
TENANT [3]
=========================
Profile : Default-Logging-Profile
Id : 1
Collector Grp id : 1
Collector Grp Name : Default-Collector-Group
Collector ptr : (nil)
Collector grp ptr : 0x7faf39d40340
Flow mon en : TRUE
PCAP en : FALSE
HTTP en : FALSE
Def profile : TRUE
...
- To view a list of collector groups, issue the show lef exporter config coll-grp command. Locate the entry for the tenant's collector group. Then, note the name of the active collector and use this name to look up the number of the active collector, which is shown in parentheses in the Collector field.
The following output shows that for tenant number 3 and collector group Default-Collector-Group, the active collector is LEF-Collector-log_collector2. The Collector entry for LEF-Collector-log_collector2 shows a 2 in parentheses, which means that the active collector number is 2.
vsm-vcsn0> show lef exporter config coll-grp
...
TENANT [3]
=========================
Collector Group : Default-Collector-Group
Id : 1
Active Collector : LEF-Collector-log_collector2
Update time : 3928 sec ago
Timer Running : FALSE
Primary Collector : (0)
Fallback Collector : (0)
Collector : LEF-Collector-log_collector1(1)
Collector : LEF-Collector-log_collector2(2)
Coll Count : 2
Suspend Backup Coll : FALSE
Suspend Timer Running : FALSE
LTE DFIT Enabled : FALSE
Num Force Suspend Collectors : 0
All Collector Suspend TS : 0 sec ago
...
- To display the statistics for the active collector, issue the show lef exporter statistics collector command, specifying the tenant ID and the number of the active collector. In the following example, the tenant ID is 3 and the active collector number is 2.
vsm-vcsn0> show lef exporter statistics collector 3 2
Tenant : 2
Collector : LEF-Collector-log_collector2(2)
...
Logs processed, failed, bytes per type, avg log size:
flow_mon_v4_base : 3 0 0 0
idp_attack : 3 0 0 0
mon_stats : 2470800 350 283067929 114
urlf_config_log : 2 0 0 0
branch_info : 56 0 1311 23
acc_ckt_info : 674 2 5454 8
b2b_slam : 96390 12 1046229 10
bw_mon : 106092 8 2348928 22
tcp_app_mon : 16412724 31 3218573711 196
alarm_log : 31298 2 83816 2
event_log : 4032 34 181879 45
entitlement : 3441 1 155061 45
sdwan_path_cond_log : 6822 2 80639 11
secacc_user_stats_log : 88477 15 14993549 169
secacc_global_stats_log : 3441 1 87357 25
sdwan_health_log : 3441 1 165275 48
active_user : 4376 4 27481 6
unknown_user_stats : 41284 12 1532194 37
The following are some of the log categories shown in the command output:
- acc_ckt_cos—Quality of service (QoS) statistics
- access-policy—Firewall logs (search log)
- alarm_log—Alarm logs (search logs)
- b2b_slam—SLA metrics and violation statistics
- bw_mon—SD-WAN usage and access-circuit usage statistics
- flow_mon—Traffic-monitoring logs (search logs)
- intf_util—System and WAN interface logs
- mon_stats—Application and user statistics
- Wait 10 minutes, and then reissue the show lef exporter statistics collector command to confirm that the log counters are incrementing.
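Comparing two captures of the statistics output by eye is error-prone when many log categories are listed. The following Python sketch shows one way to automate the comparison. It assumes that each capture has been saved as text and that every counter line has the form "name : processed failed bytes avg-size", as in the sample output above. The function names are hypothetical and are not part of any Versa tool.

```python
import re
from typing import Dict, List, Tuple

# Matches counter lines such as "mon_stats : 2470800 350 283067929 114".
# Header lines such as "Tenant : 2" do not match because they lack four numbers.
_COUNTER = re.compile(r"^\s*([A-Za-z0-9_]+)\s*:\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def parse_lef_counters(text: str) -> Dict[str, Tuple[int, int]]:
    """Map each log category to its (processed, failed) counts."""
    counters: Dict[str, Tuple[int, int]] = {}
    for line in text.splitlines():
        m = _COUNTER.match(line)
        if m:
            counters[m.group(1)] = (int(m.group(2)), int(m.group(3)))
    return counters


def stalled_categories(before: Dict[str, Tuple[int, int]],
                       after: Dict[str, Tuple[int, int]]) -> List[str]:
    """Return categories whose processed count did not increase."""
    return sorted(
        name
        for name, (processed, _failed) in after.items()
        if processed <= before.get(name, (0, 0))[0]
    )
```

A category reported by stalled_categories is not necessarily misconfigured. The feature might simply have generated no logs during the interval, so treat the result as a starting point for investigation.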
Trace LEF Connections
When LEF is properly configured, VOS devices form LEF connections, either directly to an Analytics node or through an application delivery controller (ADC) on a Versa Controller node, which then forwards the logs to an Analytics node. If LEF connections are not successfully established, logs are never delivered to the Analytics cluster.
This section describes how to trace a LEF connection from a VOS device to an Analytics node, including how to trace the connection through an ADC on a Controller node. This section also describes how to correct problems you might discover during the trace and how to verify that an Analytics node is storing the logs received through the LEF connections.
To trace LEF connections:
- Log in to a shell on the VOS device. To open a shell window to a VOS device from the Director GUI, see Access the CLI on a VOS Device.
- Issue the date command, and confirm it shows the current time. The timezone does not matter. What is important is that the VOS device is configured with the current time.
admin@Branch1$ date
- If the date and time on the device do not reflect the current date and time, either configure NTP (see Configure Time Settings) on the device or set the date and time manually by issuing the sudo timedatectl and sudo date commands. Setting the time zone is optional. For example:
admin@Branch1$ sudo timedatectl set-timezone Africa/Cairo
admin@Branch1$ sudo date --set "6 Apr 2021 14:14:00"
- If you modified the time in Step 3, then wait for approximately 20 minutes, and then check the Analytics dashboards to see whether the charts and tables are populated with current data.
- To troubleshoot either an individual tenant on a VOS device or log forwarding for all tenants, choose one of the tenants on the VOS device to use for the investigation.
- Check the tenant configuration, and note the name of the LEF profile for the software feature or service.
Each tenant on a VOS device uses a separate set of LEF profiles. Typically, you configure features and services to use the default LEF profile for the tenant. You must also know the name of the LEF collector or collectors used by the LEF profile. If the LEF profile uses a collector group list, there can be many collectors, one or more for each collector group in the list. To determine the name of the LEF collector or collectors used by the LEF profile for the feature or service, refer to Step 1c in Display LEF Statistics, above.
For example, for Tenant1, the software feature or service might be configured with the following. Here, you trace LEF-Collector-log_collector1 or LEF-Collector-log_collector2, whichever you discover is the active collector in the next step.
- LEF profile—Default-LEF-Profile
- LEF collector group—Collector-Group-1
- LEF collectors—LEF-Collector-log_collector1 and LEF-Collector-log_collector2 (the two
- Trace the LEF connection from the VOS device to the Analytics local collector.
- Determine the active collector. If the LEF profile uses a collector group list, there can be multiple active collectors.
From the CLI on the VOS device, issue the show orgs org-services command for the tenant on each of the LEF collectors. There can be more than one LEF connection, and at least one of them should have the status Established.
The following output is for organization Tenant1 for collector LEF-Collector-log_collector1:
admin@Branch1$ cli
admin@Branch1> show orgs org-services Tenant1 lef collectors LEF-Collector-log_collector1 status
status 15
source-ip 10.20.64.106
source-port 1025
destination-ip 10.20.64.1
routing-instance provider-org-Control-VR
status Established
pending-msgs 0
flaps 0
last-flapped 15w5d00h
The output shows the following information:
- status—Status of the connection
- destination-ip—IP address of an ADC service configured on a Controller node, or the IP address for a local collector configured on an Analytics cluster node
- routing-instance—Routing instance that the connection uses, which is typically the provider's routing instance.
- source-ip—Source IP address of the LEF connection initiated by the tenant on the VOS device
- A status of Reconnect indicates an issue with the LEF connection in which the TCP session is not successfully established. If the LEF configuration exports logs to an ADC service, which is the typical configuration, continue with Step 7c. If the configuration exports logs directly to Analytics nodes, continue with Step 7d.
- Trace the connection through the ADC on the Controller node. Log in to the shell on the Controller node, and then issue the following show orgs org command to list the sessions for the provider's routing instance. Look for a NAT session that includes the source IP address from Step 7a.
admin@Controller1$ cli
admin@Controller1> show orgs org provider-org sessions nat brief | tab | match 1234
0 2 30 10.20.64.106 10.20.64.1 1439 1234 6 Yes No Analytics/(userdef) 192.168.71.2 192.168.95.2 10383 1234
0 2 31 10.20.64.106 10.20.64.1 1448 1234 6 Yes No Analytics/(userdef) 192.168.71.2 192.168.95.2 10177 1234
0 2 32 10.20.64.105 10.20.64.1 1457 1234 6 Yes No Analytics/(userdef) 192.168.71.2 192.168.95.2 36589 1234
The output shows the following information:
- Source IP address (10.20.64.106 in the first row)—IP address of the VOS device that initiated the connection.
- Destination IP address (10.20.64.1)—IP address of the ADC service on the Controller node.
- NAT source IP address and source port number (192.168.71.2 and 10383 in the first row)—Source IP address and port number used for connections to the Analytics node.
- NAT destination IP address and port number (192.168.95.2 and 1234)—IP address and port number of the local collector on the Analytics node that receives the logs.
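When the Controller node carries many sessions, it can help to extract the NAT source port for a given VOS device programmatically. The following Python sketch assumes that each session row has 15 whitespace-separated fields in the same positions as the sample output above. The column positions are inferred from that sample rather than taken from a documented format, and the NatSession and find_nat_sessions names are hypothetical.

```python
from typing import List, NamedTuple


class NatSession(NamedTuple):
    source_ip: str
    destination_ip: str
    nat_source_ip: str
    nat_source_port: int
    nat_destination_ip: str
    nat_destination_port: int


def find_nat_sessions(output: str, source_ip: str) -> List[NatSession]:
    """Return the NAT sessions that the given VOS device initiated.

    Assumed field positions (0-based), inferred from the sample rows:
    3 = source IP, 4 = destination IP, 11 = NAT source IP,
    12 = NAT destination IP, 13 = NAT source port, 14 = NAT destination port.
    """
    sessions: List[NatSession] = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 15 or fields[3] != source_ip:
            continue  # Skip prompts, headers, and other devices' sessions.
        sessions.append(
            NatSession(
                source_ip=fields[3],
                destination_ip=fields[4],
                nat_source_ip=fields[11],
                nat_source_port=int(fields[13]),
                nat_destination_ip=fields[12],
                nat_destination_port=int(fields[14]),
            )
        )
    return sessions
```

The nat_source_port value of a returned session is the port number to search for in the LCE connection list on the Analytics node.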
- Log in to the shell on the Analytics node that receives the logs. For configurations that use ADC services, this is the NAT destination shown in Step 7c. For configurations that export logs directly to Analytics nodes, this is the destination IP address shown in Step 7a. In both cases, this is typically the IP address for the southbound interface on the Analytics node.
- Verify that the connection has completed. To do this, log in to the Analytics node, and then enter the log collector exporter (LCE) debugger by issuing the following telnet command:
admin@Analytics$ telnet 0 9100
Trying 0.0.0.0...
Connected to 0.
Escape character is '^]'.
Versa Networks Log Collector Exporter Daemon
LCED-DBG> show lced connections | grep 10383
IP Address : 192.168.71.2, Port: 10383
LCED-DBG> exit
- At the LCED-DBG> prompt, enter the show lced connections | grep command. In the grep command, if you are using the ADC, use the NAT source port number in Step 7c (here, 10383). If you are connecting directly from the VOS device, use port number 1234.
- If you do not see a connection, this indicates a routing issue between the local collector on the Analytics node and the ADC service or, for direct LEF connections, the VOS device. Check to see whether Layer 2 or Layer 3 devices are dropping packets.
- From the CLI, verify that local collectors are configured with the correct IP address, which is the IP address you expect to receive the logs. This is the NAT destination IP in Step 7c.
admin@Analytics$ cli
admin@Analytics> show log-collector-exporter local collectors
collector1 {
address 192.168.95.2;
port 1234;
storage {
directory /var/tmp/log;
format syslog;
file-generation-interval 3600;
}
}
admin@Analytics> exit
- Issue the netstat -rn command on the log collector node, and then confirm that there is a return route for the Controller node's southbound subnet (when using the ADC) or the VOS device's subnet (when using a direct connection from VOS device to local collector).
admin@Analytics$ netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 10.48.0.1 0.0.0.0 UG 0 0 0 eth0
10.48.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth0
192.168.71.0 192.168.95.1 255.255.255.0 UG 0 0 0 eth1.703
192.168.72.0 192.168.95.1 255.255.255.0 UG 0 0 0 eth1.703
192.168.95.0 0.0.0.0 255.255.255.0 U 0 0 0 eth1.703
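The return-route check can also be scripted. The following Python sketch parses saved netstat -rn output with the standard ipaddress module and looks up the most specific route that covers a peer address, such as the NAT source IP address used by the ADC. The function names are hypothetical. By default the sketch ignores the default route, on the assumption that the default route points out of a management interface rather than the southbound interface. Pass allow_default=True if that assumption does not hold in your deployment.

```python
import ipaddress
from typing import List, Optional, Tuple

Route = Tuple[ipaddress.IPv4Network, str, str]  # (network, gateway, interface)


def parse_routes(netstat_output: str) -> List[Route]:
    """Parse the IPv4 rows of `netstat -rn` output into routes."""
    routes: List[Route] = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) != 8:
            continue  # Not a routing table row.
        try:
            network = ipaddress.IPv4Network(f"{fields[0]}/{fields[2]}")
        except ValueError:
            continue  # The column header row lands here.
        routes.append((network, fields[1], fields[7]))
    return routes


def find_return_route(routes: List[Route], peer_ip: str,
                      allow_default: bool = False) -> Optional[Route]:
    """Return the longest-prefix route that contains peer_ip, if any."""
    peer = ipaddress.IPv4Address(peer_ip)
    candidates = [
        route for route in routes
        if peer in route[0] and (allow_default or route[0].prefixlen > 0)
    ]
    return max(candidates, key=lambda route: route[0].prefixlen, default=None)
```

A result of None for the Controller node's southbound address or for the VOS device's address suggests that replies from the local collector have no specific path back to the sender.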
- Verify that the VOS device's software version is the same as or lower than the software version on the Analytics cluster nodes. If the VOS device version is higher, the Analytics node may not be able to parse the logs, in which case the logs are dropped.
- To view the software version for an Analytics node, issue the show system package-info command from the CLI. The Release field shows the software version.
admin@Analytics$ cli
admin@Analytics> show system package-info
Package Versa Analytics Software
Release 22.1.1
OS version bionic
Release Type GA
Release date 20230818
Package id 4a11002
Package name versa-analytics-20230818-050000-4a11002-22.1.1-B
Branch 22.1
Creator jenkins
- To view the VOS software version for a VOS device, in Director view, select Administration > Appliances. The Software Version column shows the software version running on the VOS device.
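The version rule can be checked mechanically once you have both release strings. The following Python sketch assumes that releases are dotted numeric strings such as 22.1.1, as in the Release field shown above, and it ignores any non-numeric suffix. The function names are hypothetical.

```python
import re
from typing import Tuple


def parse_release(release: str) -> Tuple[int, ...]:
    """Convert a release string such as '22.1.1' to a tuple of integers."""
    m = re.match(r"^(\d+(?:\.\d+)*)", release.strip())
    if not m:
        raise ValueError(f"unrecognized release string: {release!r}")
    return tuple(int(part) for part in m.group(1).split("."))


def device_version_ok(device_release: str, analytics_release: str) -> bool:
    """Return True if the VOS device release is not higher than the
    Analytics release."""
    device = parse_release(device_release)
    analytics = parse_release(analytics_release)
    # Pad the shorter tuple with zeros so that 22.1 compares equal to 22.1.0.
    width = max(len(device), len(analytics))
    device += (0,) * (width - len(device))
    analytics += (0,) * (width - len(analytics))
    return device <= analytics
```

For example, device_version_ok("22.1.2", "22.1.1") returns False, which indicates a device whose logs the Analytics node may not be able to parse.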
- To identify log parsing errors from the shell on an Analytics node, enter the LCE debugger by issuing the telnet 0 9100 command, and then display LCE statistics by issuing the show lced stats command at the LCED-DBG> prompt. Errors are reported in the command output.
- Restart the cluster processes to clear transient states. To do this, issue the vsh restart command on each node in the cluster. This command restarts the LCE, Analytics driver, and other services on the node. The database and search engine are not affected.
admin@Forwarder$ vsh restart
Verify Log Receipt on Analytics Nodes
To verify that an Analytics log collector node is receiving logs:
- Log in to the shell on the Analytics node.
- Issue the ls -lt command, specifying the directory that contains the log subdirectories for the tenant, Tenant1 in the following example. The directory contains a separate subdirectory for each VOS device, which is named VSN0-SDWAN-VOS-device-name.
admin@Analytics$ sudo ls -lt /var/tmp/log/tenant-Tenant1
total 232
drwx------ 2 root root 28672 Sep 25 17:24 VSN0-SDWAN-Branch4
drwx------ 2 root root 49152 Sep 25 17:23 VSN0-SDWAN-Controller1
drwx------ 2 root root 49152 Sep 25 17:19 VSN0-SDWAN-Controller2
drwx------ 2 root root 28672 Sep 25 17:18 VSN0-SDWAN-Branch1
drwx------ 2 root root 24576 Sep 25 17:08 VSN0-SDWAN-Branch5
drwx------ 2 root root 24576 Sep 25 17:02 VSN0-SDWAN-Branch2
drwxr-xr-x 9 root root 4096 Jul 24 2019 backup
drwx------ 2 root root 4096 Jul 24 2019 VSN0-versa
- Issue the ls -lt command, specifying the subdirectory for a VOS device, Branch1 in the following example. If any files are present, some logs have been received from the VOS device. Check timestamps to determine when the files were last updated. Files whose suffix is txt.tmp are log files for which logs have been received but that have not yet been processed by the Analytics driver.
admin@Analytics$ sudo ls -lt /var/tmp/log/tenant-Tenant1/VSN0-SDWAN-Branch1
total 16
-rw-r--r-- 1 root root 14427 Sep 25 17:29 20230925T171845.txt.tmp
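For tenants with many VOS devices, the timestamp check can be scripted. The following Python sketch walks a tenant log directory and reports each device subdirectory that is empty or whose newest file is older than a threshold. It assumes the VSN0-SDWAN- subdirectory naming shown above and must run with enough privileges to read the directories, which are owned by root in the example. The function names and the threshold are illustrative choices, not Versa defaults.

```python
import time
from pathlib import Path
from typing import Dict, Optional


def newest_file_mtime(directory: Path) -> Optional[float]:
    """Return the most recent modification time among the files in a
    directory, or None if the directory contains no files."""
    mtimes = [p.stat().st_mtime for p in directory.iterdir() if p.is_file()]
    return max(mtimes, default=None)


def stale_devices(tenant_dir: str, max_age_seconds: float,
                  now: Optional[float] = None) -> Dict[str, Optional[float]]:
    """Map each stale device subdirectory to the age in seconds of its
    newest file. An age of None means that the subdirectory is empty."""
    now = time.time() if now is None else now
    stale: Dict[str, Optional[float]] = {}
    for sub in sorted(Path(tenant_dir).iterdir()):
        # Skip entries such as "backup" that are not device directories.
        if not sub.is_dir() or not sub.name.startswith("VSN0-SDWAN-"):
            continue
        newest = newest_file_mtime(sub)
        age = None if newest is None else now - newest
        if age is None or age > max_age_seconds:
            stale[sub.name] = age
    return stale
```

For example, stale_devices("/var/tmp/log/tenant-Tenant1", 1800) lists the devices for which no log file has been updated in the last 30 minutes.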
Troubleshoot Issues with Too Many LEF Connections
Analytics nodes can accept only the maximum number of LEF connections that they are configured to receive. When the maximum number is reached, the node begins to refuse new connections. This situation can happen when one node in a cluster receives an unbalanced share of LEF connections requested by VOS devices, the configured maximum is too small, or not enough Analytics nodes are configured to handle the number of incoming LEF connections.
This section describes how to identify and resolve this issue.
If most log collector nodes in a cluster receive near or exactly the maximum number of LEF connections, the cluster is reaching its connection capacity. You can increase cluster connection capacity by adding new log collector nodes. For information about adding a node to an Analytics cluster, see https://support.versa-networks.com/s...r-in-20-2-21-x.
If the configured maximum number of LEF connections is too small, you can increase it. If one log collector node is receiving a large share of LEF connections compared to other nodes, you can restart the LCE. Doing this drops current LEF connections on the node and allows the ADC service to rebalance connections among the nodes in the cluster.
To view the number of active LEF connections, and view and modify the maximum number of connections:
- Log in to the shell on the node.
- List LCE statistics. The Clients Active field displays the number of LEF connections.
admin@Analytics$ telnet 0 9100
LCED-DBG> show lced stats
Local Collectors
===========================
Local Collector : collector1(0)
Clients Active : 124
Clients Connected : 124
Log Template Received Count : 1219
Log Data Received Count : 8175878
...
LCED-DBG> exit
- Verify the maximum connection value. In the following example, the maximum number of connections is 100. (The default maximum number of connections is 512.)
admin@Analytics$ cli
admin@Analytics> configure
admin@Analytics% show log-collector-exporter local collectors collector1 max-connections
max-connections 100;
- To modify the maximum number of connections:
admin@Analytics% set log-collector-exporter local collectors collector1 max-connections 512
admin@Analytics% commit
admin@Analytics> exit
admin@Analytics$
- To restart the LCE program:
admin@Analytics$ sudo service versa-lced restart
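To compare connection load across several nodes, you can extract the Clients Active value from saved show lced stats output and compare it with the configured maximum. The following Python sketch assumes the "Local Collector" and "Clients Active" line formats shown above. The function names are hypothetical, and the maximum is supplied by hand from the max-connections setting.

```python
import re
from typing import Dict

_COLLECTOR = re.compile(r"^\s*Local Collector\s*:\s*(\S+?)(?:\(\d+\))?\s*$")
_ACTIVE = re.compile(r"^\s*Clients Active\s*:\s*(\d+)\s*$")


def parse_clients_active(stats_output: str) -> Dict[str, int]:
    """Map each local collector name to its Clients Active count."""
    active: Dict[str, int] = {}
    current = None
    for line in stats_output.splitlines():
        m = _COLLECTOR.match(line)
        if m:
            current = m.group(1)  # Name without the "(0)" index suffix.
            continue
        m = _ACTIVE.match(line)
        if m and current is not None:
            active[current] = int(m.group(1))
    return active


def connection_headroom(active: int, max_connections: int) -> int:
    """Return how many more LEF connections the collector can accept.
    A value of zero or less means the collector is at or over its limit."""
    return max_connections - active
```

A node whose headroom is much smaller than that of its peers is a candidate for an LCE restart to rebalance connections. Uniformly small headroom across the cluster points to a low configured maximum or to too few log collector nodes.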
Supported Software Information
Releases 20.2 and later support all content described in this article.