Monitoring Your Vodia PBX

Monitoring Your Vodia PBX (Overview): Ensuring Continuous Communication

Maintaining a healthy Vodia PBX is crucial for uninterrupted communication. 24/7 monitoring and proactive alerts are essential for addressing potential issues promptly. Vodia provides a suite of built-in tools and integration options to facilitate robust monitoring.

Vodia's Integrated Monitoring Capabilities

System Status Graphs

  • Access vital performance metrics for your PBX via the web interface.
  • Monitor key indicators such as call quality, registration changes, media, main CPU usage and HTTP/S statistics.
  • Refer to: System Status Graphs Documentation

Syslog/Logfile Analysis

  • Leverage comprehensive logging for in-depth troubleshooting.
  • Configure granular logging to capture specific events (e.g., webclient events, REGISTER requests).
  • Forward logs to an external syslog server (e.g., 1.1.1.1:555) for centralized analysis.
  • Utilize tools like Graylog or Splunk for historical log retention and advanced querying.
    • Example: Graylog can extract data like PBX domains, transport layers, client IPs, and geolocation for detailed analysis.
    • Example query: "All SIP register requests in the last 30 minutes for a specific client IP."
  • Refer to: Logging Documentation
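
Before relying on forwarded logs, it can help to confirm that datagrams sent from the PBX host actually reach the external syslog server. The following is a minimal sketch, not part of the Vodia tooling: the function names are hypothetical, it assumes bash (for its /dev/udp pseudo-device) and a server listening on UDP, and 1.1.1.1:555 is only the placeholder address from the list above. The priority value follows the standard syslog rule of facility * 8 + severity.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for sending one test datagram to a syslog server.

# Compute the syslog PRI value: facility * 8 + severity.
syslog_pri() {
    echo $(( $1 * 8 + $2 ))
}

# Send a single UDP test message to HOST PORT.
# Requires bash, which maps /dev/udp/HOST/PORT to a UDP socket.
send_test_syslog() {
    local host=$1 port=$2 msg=$3 pri
    pri=$(syslog_pri 16 6)   # facility local0 (16), severity informational (6)
    printf '<%s>%s\n' "$pri" "$msg" > "/dev/udp/$host/$port"
}
```

For example, `send_test_syslog 1.1.1.1 555 "vodia syslog path test"` should produce one entry on the receiving side. Because UDP is connectionless, a successful send does not prove delivery; check the server (for instance, the Graylog search page) for the message.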

Troubleshooting VoIP Calls with Call History (CDRs)

Vodia's tenant Call Log provides detailed Call Data Records (CDRs) for comprehensive VoIP troubleshooting. Each CDR contains:

  • Detailed Call Flow: Tracks the complete call progression.
  • SIP Packets: Captures SIP messaging for each call leg.
  • CDRQ Statistics: Presents quality metrics per call leg.
  • Device Information: Identifies involved devices.
  • Related Logs: Offers supplementary call-specific logging.
  • Refer to: https://doc.vodia.com/docs/cdr-troubleshooting

PCAP (Packet Capture) Tracking

  • Capture network traffic at the extension and trunk levels.
  • Download PCAP files directly from the call log page.
  • Enable TLS key logging for encrypted traffic analysis via the "TLS key log file" setting in the system security section.

Event Notifications

Registration Monitoring

SNMP Monitoring

  • Utilize pre-defined Object Identifiers (OIDs) for SNMP polling.
  • Use snmpget or snmpwalk to retrieve PBX metrics.
  • Integrate with existing monitoring systems (e.g., Nagios) to create custom SNMP checks and historical data visualization.
    • Example: A Nagios check every 5 minutes to monitor the number of active calls.
  • Refer to: SNMP Monitoring Documentation
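
The Nagios example above can be sketched as a small plugin script. This is an illustration under stated assumptions, not Vodia's own check: it assumes the net-snmp `snmpget` tool is installed, SNMP v2c with a community string, and warning/critical thresholds of 50 and 80 calls chosen arbitrarily. The function names are hypothetical, and the OID must be taken from the SNMP Monitoring Documentation; none is invented here.

```shell
#!/usr/bin/env bash
# Sketch of a Nagios-style check built on snmpget (net-snmp).

# Map a numeric value to a Nagios state. Prints the status line (with
# performance data after the "|") and returns the plugin exit code:
# 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
evaluate_calls() {
    local value=$1 warn=$2 crit=$3
    case $value in
        ''|*[!0-9]*) echo "UNKNOWN: no numeric value returned"; return 3 ;;
    esac
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL: $value active calls | calls=$value"; return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING: $value active calls | calls=$value"; return 1
    fi
    echo "OK: $value active calls | calls=$value"; return 0
}

# Poll the PBX and evaluate the result.
# Arguments: HOST COMMUNITY OID [WARN] [CRIT]
# -v2c / -c select the SNMP version and community; -Oqv prints only the value.
check_active_calls() {
    local host=$1 community=$2 oid=$3 value
    value=$(snmpget -v2c -c "$community" -Oqv "$host" "$oid" 2>/dev/null)
    evaluate_calls "$value" "${4:-50}" "${5:-80}"
}
```

A Nagios command definition would then call `check_active_calls` with the PBX address, the community string, and the active-calls OID from the Vodia documentation, scheduled every 5 minutes as described above.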

Prometheus Integration

External and System-Level Monitoring for Vodia PBX

To ensure the reliability and performance of your Vodia PBX, a multi-layered monitoring approach is recommended. This includes external port checks, host system metrics, and call testing.

External Port Monitoring

Utilize external monitoring services like Site24x7 or Pingdom to verify the accessibility of critical PBX ports.

Key ports to monitor:

  • 80 (HTTP)
  • 443 (HTTPS)
  • 5060 (SIP UDP/TCP)
  • 5061 (SIP TLS)
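
Where a hosted service is not available, the same check can be approximated from any external Linux host. The sketch below is an assumption-laden illustration: the function names are hypothetical, it needs bash (for /dev/tcp) and the coreutils `timeout` command, and it only tests TCP reachability. A TCP connect cannot verify SIP over UDP on port 5060; a test call is the more reliable check for that.

```shell
#!/usr/bin/env bash
# Sketch: probe the TCP side of the PBX ports from a monitoring host.

# Return 0 if a TCP connection to HOST PORT succeeds within TIMEOUT seconds
# (default 3). /dev/tcp is a bash feature, so bash is invoked explicitly.
tcp_port_open() {
    timeout "${3:-3}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Check every port from the list above and report each result.
# Returns 1 if at least one port is unreachable.
check_pbx_ports() {
    local host=$1 failed=0 port
    for port in 80 443 5060 5061; do
        if tcp_port_open "$host" "$port"; then
            echo "OK: $host:$port reachable"
        else
            echo "CRITICAL: $host:$port unreachable"
            failed=1
        fi
    done
    return $failed
}
```

Running `check_pbx_ports pbx.example.com` (a placeholder host name) prints one line per port and exits non-zero when any port fails, which makes it easy to call from cron or a monitoring agent.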

Server/Virtual Machine Monitoring

Monitor the underlying server or VM hosting the Vodia PBX for resource utilization and potential bottlenecks.

For Linux systems, track metrics such as:

  • CPU usage
  • Memory utilization
  • Disk I/O
  • Network traffic
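
As a rough illustration of where such metrics come from on Linux, the sketch below reads two of them directly from /proc. It assumes a kernel that exposes MemAvailable in /proc/meminfo (3.14 or later); the function names are hypothetical, and a full monitoring agent would normally collect these values instead of a hand-written script.

```shell
#!/usr/bin/env bash
# Sketch: derive basic host metrics from /proc (Linux only).

# Print used memory as an integer percentage:
# (MemTotal - MemAvailable) * 100 / MemTotal.
# An alternate file can be passed as $1; the default is /proc/meminfo.
mem_used_percent() {
    awk '/^MemTotal:/ {total=$2} /^MemAvailable:/ {avail=$2}
         END { if (total > 0) printf "%d\n", (total - avail) * 100 / total; else exit 1 }' \
        "${1:-/proc/meminfo}"
}

# Print the 1-minute load average, the first field of /proc/loadavg.
load_1min() {
    cut -d ' ' -f 1 "${1:-/proc/loadavg}"
}
```

Both functions accept an optional file argument so they can be exercised against a saved copy of the /proc files.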

Leverage monitoring tools:

Call Tester Script for Vodia PBX Monitoring

For environments with numerous Vodia PBX instances across multiple virtual machines, a call testing script helps identify SIP connectivity problems before users report them. The Perl script below registers (when a proxy or registrar is given), places a test call, and reports success or failure. Running it on a schedule against every Vodia server, from a monitoring tool such as Nagios, Icinga, or Prometheus, provides continuous health checks.

Install the necessary dependencies:

apt install perl
cpan Net::SIP IO::Socket::INET Getopt::Long
cpan Net::DNS

Create a Perl script called sip-call-tester.pl and make it executable (chmod +x sip-call-tester.pl):

#!/usr/bin/env perl
use strict;
use warnings;
use IO::Socket::INET;
use Getopt::Long qw(:config posix_default bundling);
use Net::SIP;
use Net::SIP::Util 'create_socket_to';
use Net::SIP::Debug;

# Subroutine to display usage information
sub usage {
    print STDERR "ERROR: @_\n" if @_;
    print STDERR <<EOS;
usage: $0 [options] FROM TO
Makes a SIP call from FROM to TO, optionally records RTP data,
and optionally hangs up after a specified time.

Options:
  -d|--debug [level]          Enable debugging
  -h|--help                   Display this help message
  -P|--proxy host[:port]      Use an outgoing proxy (register there unless a registrar is specified)
  -R|--registrar host[:port]  Register at the given address
  -O|--outfile filename       Write received RTP data to a file
  -T|--time interval          Hang up after the specified interval (in seconds)
  -L|--leg ip[:port]          Use the given local IP[:port] for the outgoing leg
  -C|--contact sipaddr        Use the given contact address for registration and invitation
  --username name             Username for authorization
  --password pass             Password for authorization
  --route host[:port]         Add a SIP route (can be specified multiple times)
  --prometheus                Output metrics in Prometheus format
  --nagios                    Output status in Nagios format

Examples:
  $0 -T 10 -O record.data sip:30\@192.168.178.4 sip:31\@192.168.178.1
  $0 --username 30 --password secret --proxy=192.168.178.3 sip:30\@example.com 31
  $0 --username 30 --password secret --leg 192.168.178.4 sip:30\@example.com 31

EOS
    exit(@_ ? 1 : 0);
}

# Subroutine to handle fatal errors
sub die_with_error {
    my ($message, $exit_code) = @_;
    print STDERR "$message\n";
    exit $exit_code;
}

###################################################
# Parse command-line options
###################################################

my ($proxy, $outfile, $registrar, $username, $password, $hangup, $local_leg, $contact);
my (@routes, $debug, $prometheus, $nagios);
GetOptions(
    'd|debug:i'     => \$debug,
    'h|help'        => sub { usage() },
    'P|proxy=s'     => \$proxy,
    'R|registrar=s' => \$registrar,
    'O|outfile=s'   => \$outfile,
    'T|time=i'      => \$hangup,
    'L|leg=s'       => \$local_leg,
    'C|contact=s'   => \$contact,
    'username=s'    => \$username,
    'password=s'    => \$password,
    'route=s'       => \@routes,
    'prometheus'    => \$prometheus,
    'nagios'        => \$nagios,
) or usage("Invalid option");

# Set debug level if specified
Net::SIP::Debug->level($debug || 1) if defined $debug;

# Validate arguments
my ($from, $to) = @ARGV;
usage("No target specified") unless $to;

# Use proxy as registrar if no registrar is specified
$registrar ||= $proxy;

###################################################
# Determine local leg (IP and port)
###################################################

my ($local_host, $local_port);
if ($local_leg) {
    ($local_host, $local_port) = split(/:/, $local_leg, 2);
} elsif (!$proxy) {
    # Extract local host and port from the FROM address if no proxy is specified
    ($local_host, $local_port) = $from =~ m{\@([\w\-\.]+)(?::(\d+))?}
        or die_with_error("Cannot find SIP domain in '$from'", 3);
}

my $leg;
if ($local_host) {
    my $addr = gethostbyname($local_host)
        or die_with_error("Cannot resolve IP for SIP domain '$local_host'", 3);
    $addr = inet_ntoa($addr);

    $leg = IO::Socket::INET->new(
        Proto     => 'udp',
        LocalAddr => $addr,
        LocalPort => $local_port || 5060,
    );

    # If no port was requested and 5060 is unavailable, try a random port
    if (!$leg && !$local_port) {
        $leg = IO::Socket::INET->new(
            Proto     => 'udp',
            LocalAddr => $addr,
            LocalPort => 0,
        ) or die_with_error("Cannot create leg at $addr: $!", 3);
    }

    # An explicitly requested port that cannot be bound is a fatal error
    die_with_error("Cannot create leg at $addr:$local_port: $!", 3) unless $leg;

    $leg = Net::SIP::Leg->new(sock => $leg);
}

###################################################
# SIP code starts here
###################################################

# Create necessary legs
my @legs;
push @legs, $leg if $leg;
foreach my $addr ($proxy, $registrar) {
    next unless $addr;
    unless (grep { $_->can_deliver_to($addr) } @legs) {
        my $sock = create_socket_to($addr)
            or die_with_error("Cannot create socket to $addr", 3);
        push @legs, Net::SIP::Leg->new(sock => $sock);
    }
}

# Create user agent
my $ua = Net::SIP::Simple->new(
    from           => $from,
    outgoing_proxy => $proxy,
    route          => \@routes,
    legs           => \@legs,
    $contact  ? (contact => $contact) : (),
    $username ? (auth => [$username, $password]) : (),
);

# Optional registration
if ($registrar && $registrar ne '-') {
    $ua->register(registrar => $registrar);
    die_with_error("Registration failed: " . $ua->error, 1) if $ua->error;
}

# Invite peer
my $peer_hangup; # Flag to check if the peer hung up
my $call = $ua->invite($to,
    init_media => $ua->rtp('recv_echo', $outfile, 0),
    recv_bye   => \$peer_hangup,
) or die_with_error("Invite failed: " . $ua->error, 2);
die_with_error("Invite failed (call): " . $call->error, 2) if $call->error;

# Main loop: wait for peer to hang up or timeout
my $stopvar;
$ua->add_timer($hangup, \$stopvar) if $hangup;
$ua->loop(\$stopvar, \$peer_hangup);

# Timeout: hang up
if ($stopvar) {
    $stopvar = undef;
    $call->bye(cb_final => \$stopvar);
    $ua->loop(\$stopvar);
}

# Output for Prometheus
if ($prometheus) {
    print "# HELP sip_call_status Status of the SIP call (1 = success, 0 = failure)\n";
    print "# TYPE sip_call_status gauge\n";
    print "sip_call_status{to=\"$to\"} ", ($peer_hangup || $stopvar) ? 1 : 0, "\n";
}

# Output for Nagios (exit codes: 0 = OK, 2 = CRITICAL)
if ($nagios) {
    if ($peer_hangup || $stopvar) {
        # The call was established and ended either by the peer or by our
        # own timed hangup; both count as success, matching the Prometheus output.
        print "OK: Call to $to completed successfully.\n";
        exit 0;
    } else {
        print "CRITICAL: Call to $to failed.\n";
        exit 2;
    }
}

print "Test call successful: $to\n";

Usage

./sip-call-tester.pl --username 80 --password sip-password --time 5 --leg 1.1.1.1 --proxy=pbx.vodia.com --prometheus sip:80@pbx.vodia.com 200

Parameters Explained:
--username 80: The extension number on the PBX, used for SIP authentication.
--password sip-password: The SIP password for extension 80.
--time 5: Hang up after 5 seconds if the peer has not ended the call first.
--leg 1.1.1.1: The local IP address of the server that runs the script.
--proxy=pbx.vodia.com: The tenant domain on the PBX, used as the outgoing proxy and, because no --registrar is given, as the registrar.
--nagios or --prometheus: Generates output compatible with Nagios or Prometheus monitoring systems.
sip:80@pbx.vodia.com: The SIP From address used in the INVITE request.
200: The destination extension or account on the tenant pbx.vodia.com; it should answer the call and play audio.

Example Output

With --nagios:

OK: Call to 200 completed successfully.

With --prometheus:

# HELP sip_call_status Status of the SIP call (1 = success, 0 = failure)
# TYPE sip_call_status gauge
sip_call_status{to="200"} 1
Test call successful: 200
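
One way to feed this output to Prometheus is the node_exporter textfile collector, which reads *.prom files from a directory. The wrapper below is a sketch under assumptions rather than a Vodia-provided tool: the function names, the TEXTFILE_DIR default, and the sip_call.prom file name are hypothetical, and the directory must match node_exporter's --collector.textfile.directory setting. Two details of the script motivate the filtering step: the trailing "Test call successful" line is not valid exposition format, and on a failed registration or invite the script exits with an error on STDERR without printing a metric, so the wrapper emits sip_call_status 0 itself.

```shell
#!/usr/bin/env bash
# Sketch: turn one run of sip-call-tester.pl into a textfile-collector file.

# Read the tester's stdout on stdin and print only valid metric lines.
# $1 = tester exit status, $2 = called target. A non-zero status or a
# missing sample yields sip_call_status 0.
filter_metrics() {
    local status=$1 target=$2 metrics
    metrics=$(grep -E '^(# (HELP|TYPE) )?sip_call_status' || true)
    if [ "$status" -eq 0 ] && printf '%s\n' "$metrics" | grep -q '^sip_call_status{'; then
        printf '%s\n' "$metrics"
    else
        echo '# HELP sip_call_status Status of the SIP call (1 = success, 0 = failure)'
        echo '# TYPE sip_call_status gauge'
        echo "sip_call_status{to=\"$target\"} 0"
    fi
}

# Run the tester and atomically replace the .prom file
# (write to a temporary file in the same directory, then rename).
# Arguments: TARGET followed by the tester options and the FROM address.
run_call_test() {
    local target=$1 dir tmp out rc=0
    shift
    dir=${TEXTFILE_DIR:-/var/lib/node_exporter/textfile_collector}  # assumed path
    tmp=$(mktemp "$dir/sip_call.XXXXXX") || return 1
    out=$(./sip-call-tester.pl --prometheus "$@" "$target" 2>/dev/null) || rc=$?
    printf '%s\n' "$out" | filter_metrics "$rc" "$target" > "$tmp"
    chmod 644 "$tmp"   # mktemp creates the file readable by its owner only
    mv "$tmp" "$dir/sip_call.prom"
}
```

A call such as `run_call_test 200 --username 80 --password sip-password --time 5 --leg 1.1.1.1 --proxy=pbx.vodia.com sip:80@pbx.vodia.com` reproduces the usage example above and can be scheduled from cron. Writing to a temporary file and renaming it keeps node_exporter from reading a half-written file.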