Overview of Analytics Troubleshooting
For supported software information, see Supported Software Information.
Analytics clusters receive logs from branch and Controller Versa Operating System™ (VOS™) devices, extract and analyze the logs, and store them for incorporation into dashboards and reports. To troubleshoot Analytics operations, it is helpful to understand how logs are transferred to the cluster and the specific operations that are performed by the node types within a cluster. This article describes how logs are transferred to a cluster, the types and functions of its nodes, and how to view Analytics alarms.
Analytics Node Functions
Analytics cluster nodes perform the following functions:
- Log collection
- Log processing
- Database operations
- Search engine operations
- Analytics application operations
Cluster nodes are specialized into specific node types that perform one or more of these functions, and the functions affect cluster resources in different ways. The primary Analytics node types are analytics, search, and log forwarder. Analytics-type nodes manage a NoSQL database (Apache Cassandra), and search-type nodes manage a search engine (either Apache Cassandra or Apache Solr, depending on the Analytics release). Both analytics-type and search-type nodes can also perform log collection and log processing. Most Analytics clusters also contain log forwarder-type nodes, which perform log collection and processing only.
Each cluster node runs a copy of the Analytics application, which provides a GUI to manage the cluster, generate Analytics reports, and display dashboards and log screens.
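The division of functions among node types can be summarized as a small lookup table. The following is an illustrative sketch only. The names NODE_FUNCTIONS and node_types_for are hypothetical and are not part of any Versa software. The table covers the data-handling functions; the Analytics application runs on every node and is therefore omitted.

```python
# Illustrative mapping of Analytics node types to the data-handling functions
# described in this article. These names are hypothetical, not a Versa API.
NODE_FUNCTIONS = {
    "analytics": {"log collection", "log processing", "database"},
    "search": {"log collection", "log processing", "search engine"},
    "log forwarder": {"log collection", "log processing"},
}


def node_types_for(function):
    """Return the sorted node types that can perform the given function."""
    return sorted(t for t, funcs in NODE_FUNCTIONS.items() if function in funcs)


print(node_types_for("database"))
print(node_types_for("log collection"))
```

A lookup of this kind can help narrow a problem to the node types involved, for example, a database issue points to analytics-type nodes only.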
Log Collection and Processing
Nodes that perform log collection receive logs from VOS devices over TCP connections called log export functionality (LEF) connections. VOS devices either send logs directly to the log collector node or, more commonly, they use an ADC load balancer on a Controller node. In the latter case, the VOS device sends logs to the IP address of the Controller node's ADC. The Controller node then applies NAT to this traffic, rewriting the source IP address to that of the Controller node's egress interface and the destination IP address to that of the log collector node.
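The address rewriting performed on the Controller node can be pictured with a minimal sketch. The Packet class, the adc_nat function, and all IP addresses below are hypothetical and chosen for illustration (the addresses come from the reserved documentation ranges). The sketch models only the address fields, not the actual ADC implementation.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str


def adc_nat(packet, controller_egress_ip, collector_ip):
    """Rewrite source and destination addresses as described for LEF traffic."""
    return replace(packet, src_ip=controller_egress_ip, dst_ip=collector_ip)


# Made-up addresses: a branch VOS device sends to the Controller node's ADC.
from_branch = Packet(src_ip="192.0.2.10", dst_ip="198.51.100.1")
to_collector = adc_nat(from_branch, "203.0.113.5", "203.0.113.20")
print(to_collector)
```

One consequence worth keeping in mind when troubleshooting: in this model the log collector node sees the Controller node's egress address as the connection source, not the address of the VOS device.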
The data path for transporting logs sent by a VOS device to an Analytics cluster is as follows:
- Logs are generated by features and services on VOS devices. Each log contains a syslog identifier, which is referred to as its log type.
- Logs are sent by VOS devices over a LEF connection to a log collector node.
- Optionally, the ADC on a Controller node acts as an intermediary to distribute LEF connections among the log collector nodes in the cluster.
- On the log collector node, a program called the log collector exporter (LCE) listens for incoming LEF connections.
- Based on the log type, the LCE uses rules to determine whether logs should be streamed to a remote collector or stored locally.
- For local logs, the LCE parses them, determines the tenant and appliance name, and stores them in appropriately named directories under /var/tmp/log. For example, logs from tenant Tenant1 on VOS device Branch1 are stored in the directory /var/tmp/log/tenant-Tenant1/VSN0-SDWAN-Branch1.
- A program called the Analytics driver on the node processes the log files in the /var/tmp/log directory and transfers logs to the database and search engine. If the database or search engine is located on a different node, the Analytics driver contacts the node to perform the transfer.
- After the transfer, the Analytics driver moves log files to backup subdirectories in the /var/tmp/log directory. Once an hour, a cron job compresses these files and moves them to subdirectories in the /var/tmp/archive directory.
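The LCE decision and the local directory naming in the steps above can be sketched as follows. This is a simplified illustration, not the LCE's actual logic. The rule table STREAMED_LOG_TYPES, the log type names in it, and the function names are hypothetical. The appliance directory name is passed in as-is because the article shows only one example of it.

```python
from pathlib import PurePosixPath

LOG_ROOT = PurePosixPath("/var/tmp/log")

# Hypothetical rule table: log types that are streamed to a remote collector.
# Real rules are configured on the log collector; this entry is a placeholder.
STREAMED_LOG_TYPES = {"exampleStreamedLog"}


def lce_action(log_type):
    """Decide, by log type, whether a log is streamed or stored locally."""
    return "stream" if log_type in STREAMED_LOG_TYPES else "store"


def local_log_dir(tenant, appliance_dir):
    """Build the per-tenant, per-appliance directory under /var/tmp/log."""
    return LOG_ROOT / f"tenant-{tenant}" / appliance_dir


print(lce_action("exampleLocalLog"))
print(local_log_dir("Tenant1", "VSN0-SDWAN-Branch1"))
```

When logs for a device appear to be missing, computing the expected directory this way gives a concrete location to check on the log collector node.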
For information about troubleshooting log export functionality, see Troubleshoot Log Export Functionality Issues. For information about troubleshooting log processing, see Troubleshoot Log Processing and Archiving Issues.
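The hourly compress-and-move step at the end of the data path can be approximated with a short sketch. The article does not state the compression format or the exact subdirectory names, so the use of gzip, the directory name backup, and the function name archive_backups are all assumptions. The example runs against a temporary directory rather than the real /var/tmp paths.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def archive_backups(backup_dir, archive_dir):
    """Gzip every file in backup_dir into archive_dir, then remove the original."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    archived = []
    for src in sorted(p for p in backup_dir.iterdir() if p.is_file()):
        dst = archive_dir / (src.name + ".gz")
        with src.open("rb") as fin, gzip.open(dst, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        src.unlink()
        archived.append(dst)
    return archived


# Demonstrate on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    backup = Path(tmp) / "log" / "backup"
    backup.mkdir(parents=True)
    (backup / "sample.log").write_text("example log line\n")
    result = archive_backups(backup, Path(tmp) / "archive")
    archived_names = [p.name for p in result]
    remaining = list(backup.iterdir())
    with gzip.open(result[0], "rt") as f:
        restored = f.read()

print(archived_names, remaining)
```

The point of the sketch is the ordering: files leave the backup location only after a compressed copy exists in the archive location.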
Analytics Database Operations
Analytics-type nodes store data in a Cassandra database. Analytics data includes aggregated statistics, which are generated every 5 minutes by VOS devices. Aggregated statistics are typically counts, summaries, or averages of other statistics. Examples of aggregated statistics include logs of the following types:
- bwMonLog—Aggregated SD-WAN, DIA and access-circuit usage
- intfUtilLog—Aggregated WAN utilizations
- monStatLog—Aggregated user and application statistics
- qosLog—Aggregated QoS statistics
- slamLog—Aggregated SLA metrics
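To make the idea of aggregated statistics concrete, the following sketch groups raw samples into 5-minute intervals and computes a count, sum, and average for each interval. It illustrates the general technique only. The sample data, the aggregate function, and the output field names are made up and do not reflect the actual format of any VOS log type.

```python
from collections import defaultdict

INTERVAL_SECONDS = 5 * 60  # aggregation interval stated in this article


def aggregate(samples):
    """Group (timestamp, value) samples into 5-minute buckets.

    Returns {bucket_start: {"count": ..., "sum": ..., "avg": ...}}.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % INTERVAL_SECONDS].append(value)
    return {
        start: {"count": len(v), "sum": sum(v), "avg": sum(v) / len(v)}
        for start, v in sorted(buckets.items())
    }


# Made-up samples: (seconds since start, bytes transferred).
samples = [(0, 100), (120, 300), (299, 200), (300, 50), (450, 150)]
summary = aggregate(samples)
print(summary)
```

Here the first three samples fall in the interval starting at 0 seconds and the last two in the interval starting at 300 seconds, so five raw values are reduced to two aggregated records.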
The Analytics application uses the database data to generate dashboards and Analytics reports. Dashboards and newly generated reports reflect the current contents of the database.
Entries in the database are subject to configurable retention time and summarization limits. For information about troubleshooting database operations, see Troubleshoot Analytics Database Issues.
If disk usage is high on nodes running Cassandra, the database can crash. For information about troubleshooting disk-full conditions, see Troubleshoot Analytics Disk Storage Issues.
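A quick way to check disk usage on a node is sketched below using only the Python standard library. The 80 percent threshold and the function names are assumptions chosen for illustration. This article does not specify the usage level at which problems begin.

```python
import shutil


def disk_usage_percent(path="."):
    """Return the percentage of the filesystem containing path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def is_disk_high(percent, threshold=80.0):
    """Return True if usage meets or exceeds the (hypothetical) threshold."""
    return percent >= threshold


current = disk_usage_percent(".")
print(f"{current:.1f}% used, high={is_disk_high(current)}")
```

On a cluster node, the path argument would point at the filesystem that holds the database or the log directories.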
Search Engine Operations
Search-type nodes operate a search engine and store real-time logs that come in the form of alarm logs (alarmLog), firewall rule hits (accessLog), traffic monitoring logs (flowmonLog), DHCP logs (dhcpLog), and CGNAT rule hits (cgnatLog), as well as other log types. VOS devices can generate large volumes of these types of logs, and you can set daily log limits to avoid overfilling the search engine datastore. If a log collector node reaches its daily log limit, some log data is dropped until the start of the next day. Critical log data is not subject to daily log limits and continues to be stored.
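The behavior of a daily log limit can be modeled with a small counter. The sketch below is a simplification under stated assumptions: it drops every non-critical log once the limit is reached, it does not count critical logs toward the limit, and it treats "critical" as a simple flag. The class and method names are hypothetical, and the article does not define which logs count as critical.

```python
class DailyLogLimiter:
    """Accept logs until a daily limit is reached; critical logs always pass."""

    def __init__(self, daily_limit):
        self.daily_limit = daily_limit
        self.counted_today = 0
        self.dropped_today = 0

    def offer(self, is_critical=False):
        """Return True if the log is stored, False if it is dropped."""
        if is_critical:
            return True  # critical logs are not subject to the daily limit
        if self.counted_today < self.daily_limit:
            self.counted_today += 1
            return True
        self.dropped_today += 1
        return False

    def start_new_day(self):
        """Reset the counters at the start of the next day."""
        self.counted_today = 0
        self.dropped_today = 0


limiter = DailyLogLimiter(daily_limit=2)
results = [limiter.offer() for _ in range(3)]
critical_result = limiter.offer(is_critical=True)
dropped_before_reset = limiter.dropped_today
limiter.start_new_day()
after_reset = limiter.offer()
print(results, critical_result, dropped_before_reset, after_reset)
```

A gap in log screens that ends at a day boundary is consistent with this pattern and suggests checking whether the daily limit was reached.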
The Analytics application uses search-engine data to generate searchable log screens and Analytics reports. Log screens and newly generated reports reflect the current contents of the search engine.
The search engine is implemented differently on the DSE and Fusion Analytics platforms. The DSE platform uses the Cassandra database and the Fusion platform uses Apache Solr. The Fusion platform uses the Apache Zookeeper utility to keep track of node status. If Zookeeper holds incorrect node status, or if a cluster node loses communication with Zookeeper, Solr can become non-operational. To troubleshoot search engine issues for the Fusion platform, contact Versa Networks Customer Support. To troubleshoot search nodes on the DSE platform, see Troubleshoot Analytics Database Issues.
If disk usage is high on search-type nodes, the search engine may crash. For information about troubleshooting disk-full conditions, see Troubleshoot Analytics Disk Storage Issues.
Analytics Application Operations
The Analytics application allows you to manage Analytics cluster nodes, access Analytics dashboards and log screens, and generate Analytics reports. You can access the application using the Analytics tab in the Director GUI or directly using the IP address of a cluster node in a browser window. An instance of the Analytics application runs on each node in a cluster, and you can use any instance to manage all nodes in the cluster.
To access the application from the Director Analytics tab, you must configure an Analytics connector and self-signed key certificates. For information about troubleshooting the Analytics application and certificates, see Troubleshoot Analytics Access and Certificate Issues.
Display Analytics Alarms
Alarms warn that a particular resource is running low or is down, or that unauthorized or suspicious activity has been detected. Analytics node alarms display on the Alarm screen under the Analytics tab.
To view Analytics cluster alarms:
- In Director view, select the Analytics tab.
- Select an Analytics cluster node. For Releases 22.1.1 and later, hover over the Analytics tab and then select a node. For Releases 21.2 and earlier, select a node in the drop-down menu in the horizontal menu bar.
- Select Administration > System Status > Alarms in the left menu bar. Alarms display in the main pane.
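Because the node-selection step differs by release, it can help to express the rule explicitly. The following sketch compares a release string against the two ranges named in the procedure above. The function names are hypothetical. Treating any 21.2.x release as "21.2 and earlier" is an assumption, and the function returns None for releases that fall between the two stated ranges because the procedure does not cover them.

```python
def parse_release(release):
    """Convert a dotted release string such as '22.1.1' to a tuple of ints."""
    return tuple(int(part) for part in release.split("."))


def node_selection_method(release):
    """Return how to select an Analytics node for the given Director release."""
    version = parse_release(release)
    if version >= (22, 1, 1):
        return "hover over the Analytics tab, then select a node"
    if version[:2] <= (21, 2):
        return "select a node in the drop-down menu in the horizontal menu bar"
    return None  # release not covered by the procedure in this article


for rel in ("20.2", "21.2.3", "22.1.1", "22.1.3"):
    print(rel, "->", node_selection_method(rel))
```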
Supported Software Information
Releases 20.2 and later support all content described in this article.
Additional Information
Analytics Log Collector Log Type Overview
Troubleshoot Analytics Access and Certificate Issues
Troubleshoot Analytics Database Issues
Troubleshoot Analytics Disk Storage Issues
Troubleshoot Log Export Functionality Issues
Troubleshoot Log Processing and Archiving Issues