Versa Networks

Configure Appliance Monitoring Settings

For supported software information, see the Supported Software Information section.

You can configure appliance monitoring settings to control how Versa Director monitors appliances for reachability, sync status, and service status. Changes take effect dynamically when you save them; no restart is required.

To configure appliance monitoring settings in Versa Director:

  1. In Director view, select the Administration tab in the top menu bar.
  2. Select System > Settings in the left menu bar.
  3. Select the Advanced tab in the main pane.

  4. In the Appliance Monitoring Settings section, click the Edit icon. The Edit Appliance Monitoring Settings window displays.

     
  5. Enter information for the following fields.
     
    Field Description
    Polling Interval (seconds)

    Enter the polling interval, in seconds. The polling interval controls how often the Director node runs a full status check per appliance, including sync status and service status.

    Optimal Values: 180 through 600 seconds

    Default: 300 seconds (5 minutes)

    Concurrency Level

    Enter the number of appliances that can be polled in parallel during each full polling cycle. This setting controls the thread pool for full status checks only—it does not affect reachability ping batching.

    A higher concurrency level polls more appliances simultaneously, so a full polling cycle completes faster, but it increases Director resource usage. A lower concurrency level reduces resource consumption but lengthens each polling cycle.

    Optimal Values: 10 through 50 parallel threads

    Default: 10
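As a rough illustration of this trade-off, the following sketch estimates how long one full polling cycle takes. It assumes that checks run in waves of Concurrency Level appliances and that every check takes the same time. The function name and the per-check duration are hypothetical and are not Versa Director values.

```python
import math


def estimate_cycle_seconds(appliance_count: int, concurrency_level: int,
                           seconds_per_check: float) -> float:
    """Estimate the duration of one full polling cycle.

    Simplifying assumption: appliances are checked in waves of
    `concurrency_level`, and each check takes `seconds_per_check`.
    """
    if appliance_count < 0 or concurrency_level < 1:
        raise ValueError("appliance_count must be >= 0 and concurrency_level >= 1")
    waves = math.ceil(appliance_count / concurrency_level)
    return waves * seconds_per_check


if __name__ == "__main__":
    # 1000 appliances at an assumed 2 seconds per check.
    for level in (10, 50):
        print(level, estimate_cycle_seconds(1000, level, 2.0))
```

If the estimate approaches the polling interval, raising the concurrency level or the polling interval is one way to keep cycles from running back to back.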

    Keep Alive Timer (seconds)

    Enter how often the Director node performs a lightweight reachability ping across all appliances. This is independent of the full polling cycle.

    A lower keep-alive timer detects an offline appliance faster but slightly increases network traffic. A higher keep-alive timer reduces overhead but detects unreachable appliances more slowly. The keep-alive timer and the hold-time multiplier together determine the detection time:

    Time to mark UNREACHABLE = Keep Alive Timer × Hold Time Multiplier

    Optimal Values: 60 through 180 seconds

    Default: 60 seconds

    Hold Time Multiplier

    Enter the number of consecutive failed pings after which an appliance is marked as Unreachable. Until that count is reached, an appliance with failed pings is shown in an Unknown state.

    A higher hold-time multiplier results in more tolerance for network issues, which reduces false alarms but increases the time to detect when an appliance is down. A lower hold-time multiplier results in faster detection but is more susceptible to false positives from brief network issues.

    The examples in the following table are calculated with a keep-alive timer of 60 seconds and a hold-time multiplier of 3:

    Time         Ping Result  Appliance Status
    0 seconds    Success      Reachable
    60 seconds   Fail (1st)   Unknown
    120 seconds  Fail (2nd)   Unknown
    180 seconds  Fail (3rd)   Unreachable
     

    Optimal Values: 2 through 5 failures

    Default: 3 failures
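The status progression in the table above can be reproduced with a short simulation. This is a minimal sketch, not Director code: the function name is hypothetical, and it assumes that a successful ping resets the failure count, which follows from the word "consecutive" but is not stated explicitly.

```python
def status_timeline(keep_alive_s, hold_time_multiplier, ping_results):
    """Return (time_in_seconds, status) for each ping in sequence.

    `ping_results` is a list of booleans, True for a successful ping.
    Assumption: a successful ping resets the consecutive-failure count.
    """
    failures = 0
    timeline = []
    for index, succeeded in enumerate(ping_results):
        if succeeded:
            failures = 0
            status = "Reachable"
        else:
            failures += 1
            if failures >= hold_time_multiplier:
                status = "Unreachable"
            else:
                status = "Unknown"
        timeline.append((index * keep_alive_s, status))
    return timeline


if __name__ == "__main__":
    # Keep-alive timer of 60 seconds, hold-time multiplier of 3.
    for time_s, status in status_timeline(60, 3, [True, False, False, False]):
        print(f"{time_s} seconds: {status}")
```

With these inputs, the appliance is marked Unreachable at 180 seconds, which matches Keep Alive Timer × Hold Time Multiplier = 60 × 3.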

    Single Device Ping Timeout (milliseconds)

    Enter the timeout applied when a single appliance is being pinged for reachability, in milliseconds (ms). When only one device is checked (for example, an on-demand status check for a single appliance), this shorter timeout is used instead of the Bulk Devices Ping Timeout.

    This timeout is strictly enforced. If the device does not respond within this window, it is marked as unreachable.

    Direction                      Effect
    Higher (for example, 1000 ms)  Accommodates appliances on high-latency links, such as satellite and cross-region WAN links. Reduces false unreachable results for slower links.
    Lower (for example, 500 ms)    Faster failure detection, but appliances on slower links may be incorrectly flagged as unreachable.
     

    Optimal Values: 500 through 1000 ms

    Default: 500 ms

    Bulk Devices Ping Timeout (milliseconds)

    Enter the aggregate timeout for a batch ping operation, in milliseconds (ms). Appliances are pinged in batches of up to 100 devices, or up to 150 devices in deployments with more than 2000 appliances. This timeout is the maximum time the system waits for all devices in a single batch to respond.

    This timeout is strictly enforced. When it expires, the system immediately returns results for all devices that responded and marks the remaining devices as unreachable.

    Direction                      Effect
    Higher (for example, 3000 ms)  Gives more time for all appliances in a batch to respond. Useful for high-latency networks.
    Lower (for example, 2000 ms)   Ping cycles complete faster, but slow-responding appliances may be incorrectly marked as unreachable.
     

    Optimal Values: 2000 through 3000 ms

    Default: 2000 ms
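To get a feel for how batch size and this timeout interact, the sketch below counts the ping batches for a deployment and computes an upper bound on the time for one reachability sweep. The batch sizes come from the description above. The function names are hypothetical, and the upper bound assumes that batches run one after another, which this article does not state.

```python
import math


def batch_size_for(appliance_count: int) -> int:
    """Batch size per the field description: 150 above 2000 appliances, else 100."""
    return 150 if appliance_count > 2000 else 100


def ping_batches(appliance_count: int) -> int:
    """Number of batches needed to ping every appliance once."""
    return math.ceil(appliance_count / batch_size_for(appliance_count))


def worst_case_sweep_ms(appliance_count: int, bulk_timeout_ms: int) -> int:
    """Upper bound for one sweep, assuming sequential batches that all time out."""
    return ping_batches(appliance_count) * bulk_timeout_ms


if __name__ == "__main__":
    for count in (1500, 3000):
        print(count, ping_batches(count), worst_case_sweep_ms(count, 2000))
```

If the bound exceeds the keep-alive timer, sweeps could overlap under this assumption, so it is worth checking both values together.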

    Enable Monitoring

    Click the checkbox to enable or disable the appliance monitoring subsystem. When enabled, the Director node periodically checks all appliances for reachability and sync status.

    Disabling stops all monitoring. The appliance status in the UI becomes stale—devices are not marked unreachable even if they go down, and sync status does not update. Re-enabling resumes monitoring with the current settings. 

    Default: Enabled (checked)

  6. Click OK to save the configuration.
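Before you save, it can help to compare planned values with the optimal ranges listed in step 5. The following sketch does that check offline. The dictionary keys are hypothetical labels chosen for this example; they are not Versa Director API or configuration field names.

```python
# Optimal ranges and defaults as listed in this article.
# Key names are illustrative only, not Director identifiers.
OPTIMAL_RANGES = {
    "polling_interval_s": (180, 600),
    "concurrency_level": (10, 50),
    "keep_alive_timer_s": (60, 180),
    "hold_time_multiplier": (2, 5),
    "single_device_ping_timeout_ms": (500, 1000),
    "bulk_devices_ping_timeout_ms": (2000, 3000),
}

DEFAULTS = {
    "polling_interval_s": 300,
    "concurrency_level": 10,
    "keep_alive_timer_s": 60,
    "hold_time_multiplier": 3,
    "single_device_ping_timeout_ms": 500,
    "bulk_devices_ping_timeout_ms": 2000,
}


def out_of_range(settings: dict) -> dict:
    """Return {name: (value, low, high)} for values outside the optimal range.

    Settings that are not supplied fall back to the documented default.
    """
    problems = {}
    for name, (low, high) in OPTIMAL_RANGES.items():
        value = settings.get(name, DEFAULTS[name])
        if not low <= value <= high:
            problems[name] = (value, low, high)
    return problems


if __name__ == "__main__":
    planned = {"keep_alive_timer_s": 30, "hold_time_multiplier": 3}
    print(out_of_range(planned))
```

A value outside the optimal range is not necessarily invalid; the check only flags settings that deserve a second look.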

Supported Software Information

Releases 23.1.2 and later support all content described in this article.
