Monitoring and Analytics

Effective monitoring and analytics are essential for maintaining a healthy, secure, and optimized NexStorage environment. This guide covers the metrics NexStorage exposes, how to set up monitoring dashboards, configure alerts, and leverage advanced analytics capabilities.

Available Metrics

NexStorage exposes metrics at four levels: system, storage, performance, and per-bucket.

System-Level Metrics

| Metric | Description | Unit |
|---|---|---|
| system.cpu.usage | CPU utilization across all NexStorage nodes | Percentage |
| system.memory.usage | Memory utilization across all NexStorage nodes | Percentage |
| system.disk.usage | Storage utilization across all disks | Percentage |
| system.network.throughput | Network traffic throughput | Bytes/second |
| system.requests.total | Total number of API requests | Count |
| system.requests.active | Currently active API requests | Count |
| system.nodes.online | Number of online nodes | Count |
| system.nodes.total | Total number of nodes | Count |
| system.uptime | Time since system start | Seconds |

Storage Metrics

| Metric | Description | Unit |
|---|---|---|
| storage.capacity.total | Total storage capacity | Bytes |
| storage.capacity.used | Used storage capacity | Bytes |
| storage.capacity.free | Available storage capacity | Bytes |
| storage.objects.count | Total number of objects stored | Count |
| storage.objects.size.avg | Average object size | Bytes |
| storage.buckets.count | Number of buckets | Count |
| storage.replication.lag | Replication lag between nodes | Seconds |
| storage.healing.active | Number of active healing operations | Count |
| storage.healing.queued | Number of queued healing operations | Count |

Performance Metrics

| Metric | Description | Unit |
|---|---|---|
| performance.latency.read | Read operation latency | Milliseconds |
| performance.latency.write | Write operation latency | Milliseconds |
| performance.throughput.read | Read throughput | Bytes/second |
| performance.throughput.write | Write throughput | Bytes/second |
| performance.iops.read | Read operations per second | Count/second |
| performance.iops.write | Write operations per second | Count/second |
| performance.requests.success | Successful API requests | Count |
| performance.requests.error | Failed API requests | Count |
| performance.cache.hit_ratio | Cache hit ratio | Percentage |

Per-Bucket Metrics

| Metric | Description | Unit |
|---|---|---|
| bucket.{name}.size | Total size of objects in bucket | Bytes |
| bucket.{name}.objects | Number of objects in bucket | Count |
| bucket.{name}.bandwidth.in | Incoming bandwidth to bucket | Bytes/second |
| bucket.{name}.bandwidth.out | Outgoing bandwidth from bucket | Bytes/second |
| bucket.{name}.operations.read | Read operations on bucket | Count |
| bucket.{name}.operations.write | Write operations on bucket | Count |
| bucket.{name}.operations.delete | Delete operations on bucket | Count |

Metrics Exporters

NexStorage supports multiple metrics export formats for integration with your existing monitoring stack.

Prometheus Metrics

NexStorage exposes Prometheus-compatible metrics at the /metrics endpoint:

# Enable Prometheus metrics
nexstorage-admin config set metrics.prometheus.enabled true
nexstorage-admin config set metrics.prometheus.endpoint ":9000/metrics"

Example scrape configuration for Prometheus:

scrape_configs:
  - job_name: 'nexstorage'
    scrape_interval: 15s
    scheme: https
    tls_config:
      insecure_skip_verify: false
      ca_file: /path/to/ca.crt
    static_configs:
      - targets: ['nexstorage-server:9000']
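
To check what the endpoint returns before wiring it into Prometheus, it can help to parse the scraped text yourself. The following is a minimal Python sketch of a parser for the Prometheus text exposition format. The function name `parse_metrics` is hypothetical, and the sample metric names are borrowed from the queries used elsewhere in this guide; the names your deployment actually exports may differ.

```python
import re

# Matches simple exposition lines such as: name{label="v"} 12.5
# Limitation: label values containing "}" are not handled.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text-format samples into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot read
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


# Illustrative input only; values are made up.
SAMPLE_TEXT = """# HELP system_cpu_usage CPU utilization
# TYPE system_cpu_usage gauge
system_cpu_usage 42.5
bucket_size{bucket="my-bucket"} 1048576
"""

for name, labels, value in parse_metrics(SAMPLE_TEXT):
    print(name, labels, value)
```

For production use, a maintained client library is a better choice than a hand-written parser; this sketch is only meant for quick inspection.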

StatsD Metrics

For StatsD integration:

# Enable StatsD metrics export
nexstorage-admin config set metrics.statsd.enabled true
nexstorage-admin config set metrics.statsd.address "statsd.example.com:8125"
nexstorage-admin config set metrics.statsd.prefix "nexstorage"
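
To verify that your StatsD server accepts data under the configured prefix, you can send a test datagram yourself. This Python sketch builds a line in the standard StatsD wire format (`name:value|type`) and sends it over UDP. The helper names are hypothetical, and whether NexStorage emits a given metric as a gauge or a counter is an assumption here.

```python
import socket


def format_statsd(prefix, name, value, metric_type="g"):
    """Build one StatsD line: <prefix>.<name>:<value>|<type> ("g" gauge, "c" counter)."""
    return f"{prefix}.{name}:{value}|{metric_type}"


def send_statsd(line, host, port=8125):
    """Send a single StatsD line over UDP (fire-and-forget, no delivery guarantee)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


line = format_statsd("nexstorage", "system.cpu.usage", 42.5)
print(line)
# To send it: send_statsd(line, "statsd.example.com")
```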

JSON Metrics API

NexStorage also provides a JSON metrics API. Requests are authenticated with a bearer token:

# Authenticate and get a token
TOKEN=$(curl -s -X POST \
  -H "Content-Type: application/json" \
  -d '{"accessKey":"YOUR_ACCESS_KEY","secretKey":"YOUR_SECRET_KEY"}' \
  https://nexstorage.example.com/api/v1/auth | jq -r .token)

# Get system metrics
curl -s -H "Authorization: Bearer $TOKEN" \
  https://nexstorage.example.com/api/v1/metrics/system

# Get metrics for a specific bucket
curl -s -H "Authorization: Bearer $TOKEN" \
  https://nexstorage.example.com/api/v1/metrics/bucket/my-bucket
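
The same calls can be made from a script. The sketch below mirrors the curl commands using only the Python standard library. The helper names are hypothetical, and the shape of the metrics response is not documented here, so `fetch_json` simply returns whatever JSON the server sends. The only response field assumed is `token`, as read by `jq` above.

```python
import json
import urllib.request

BASE_URL = "https://nexstorage.example.com/api/v1"


def build_auth_request(access_key, secret_key):
    """Build the POST request that exchanges credentials for a token."""
    body = json.dumps({"accessKey": access_key, "secretKey": secret_key}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/auth",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_metrics_request(token, path):
    """Build a GET request for a metrics path such as 'system' or 'bucket/my-bucket'."""
    return urllib.request.Request(
        f"{BASE_URL}/metrics/{path}",
        headers={"Authorization": f"Bearer {token}"},
    )


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A typical call sequence would be `token = fetch_json(build_auth_request(key, secret))["token"]` followed by `fetch_json(build_metrics_request(token, "system"))`.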

Setting Up Monitoring Dashboards

Grafana Dashboard

NexStorage provides pre-built Grafana dashboards for comprehensive monitoring:

  1. Ensure Prometheus is configured to scrape NexStorage metrics

  2. Import the NexStorage dashboard into Grafana:

    • In Grafana, go to Dashboard → Import
    • Enter dashboard ID 12345 or upload the JSON file from the NexStorage resources
    • Select your Prometheus data source
    • Click Import
  3. The dashboard includes:

    • System health overview
    • Storage capacity and utilization
    • Performance metrics
    • Request statistics
    • Per-bucket analytics

Custom Dashboard Example

Example JSON for a custom Grafana dashboard panel:

{
  "datasource": "Prometheus",
  "fieldConfig": {
    "defaults": {
      "color": {
        "mode": "palette-classic"
      },
      "custom": {
        "axisLabel": "Bytes",
        "axisPlacement": "auto",
        "barAlignment": 0,
        "drawStyle": "line",
        "fillOpacity": 10,
        "gradientMode": "none",
        "hideFrom": {
          "legend": false,
          "tooltip": false,
          "viz": false
        },
        "lineInterpolation": "smooth",
        "lineWidth": 2,
        "pointSize": 5,
        "scaleDistribution": {
          "type": "linear"
        },
        "showPoints": "never",
        "spanNulls": true,
        "stacking": {
          "group": "A",
          "mode": "none"
        },
        "thresholdsStyle": {
          "mode": "off"
        }
      },
      "mappings": [],
      "thresholds": {
        "mode": "absolute",
        "steps": [
          {
            "color": "green",
            "value": null
          },
          {
            "color": "red",
            "value": 80
          }
        ]
      },
      "unit": "bytes"
    },
    "overrides": []
  },
  "options": {
    "legend": {
      "calcs": ["mean", "max", "min"],
      "displayMode": "table",
      "placement": "bottom"
    },
    "tooltip": {
      "mode": "single",
      "sort": "none"
    }
  },
  "targets": [
    {
      "expr": "sum(bucket_size{bucket=~\"$bucket\"}) by (bucket)",
      "interval": "",
      "legendFormat": "{{bucket}}",
      "refId": "A"
    }
  ],
  "title": "Bucket Size Over Time",
  "type": "timeseries"
}

Alerting

NexStorage supports multiple alerting mechanisms to notify you of potential issues or performance degradation.

Alert Configuration

Configure alerts directly in NexStorage:

# Set up an email alert for low disk space
nexstorage-admin alert create \
  --name "low-disk-space" \
  --condition "storage.capacity.free < 100GB" \
  --severity "warning" \
  --message "Storage space is running low" \
  --targets "email:admin@example.com"

# Set up a webhook alert for high error rates
nexstorage-admin alert create \
  --name "high-error-rate" \
  --condition "rate(performance.requests.error[5m]) > 10" \
  --severity "critical" \
  --message "Elevated error rate detected" \
  --targets "webhook:https://alerts.example.com/webhook"

Prometheus AlertManager Integration

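Example Prometheus alerting rules for NexStorage. Prometheus evaluates these rules and forwards firing alerts to Alertmanager, which handles routing and notification: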

groups:
  - name: nexstorage-alerts
    rules:
      - alert: NexStorageHighCPUUsage
        expr: avg(system_cpu_usage{job="nexstorage"}) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on NexStorage"
          description: "NexStorage system CPU usage is above 80% for 5 minutes"

      - alert: NexStorageLowDiskSpace
        expr: system_disk_free_bytes{job="nexstorage"} < 100 * 1024 * 1024 * 1024
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Low disk space on NexStorage"
          description: "NexStorage has less than 100GB free disk space"

      - alert: NexStorageHighErrorRate
        expr: rate(performance_requests_error{job="nexstorage"}[5m]) > 10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate on NexStorage"
          description: "NexStorage API error rate is above 10 per second"
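
The error-rate rule compares a per-second rate, not a raw count, against the threshold of 10. The Python sketch below shows the underlying arithmetic on two counter samples taken five minutes apart. It is a simplification: Prometheus's `rate()` also compensates for counter resets and extrapolates to the window boundaries, which this version does not. The function names and sample values are illustrative.

```python
def per_second_rate(samples):
    """Average per-second increase of a counter over (timestamp_s, value) samples.

    Simplified: ignores counter resets and does no window extrapolation.
    """
    if len(samples) < 2:
        return 0.0
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    elapsed = t_last - t_first
    if elapsed <= 0:
        return 0.0
    return (v_last - v_first) / elapsed


def error_rate_alert(samples, threshold=10.0):
    """True when the counter grew faster than `threshold` per second."""
    return per_second_rate(samples) > threshold


# 4500 new errors over 300 seconds is 15 errors per second.
window = [(0, 1000), (300, 5500)]
print(per_second_rate(window), error_rate_alert(window))
```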

PagerDuty Integration

Configure PagerDuty integration for critical alerts:

# Set up PagerDuty integration
nexstorage-admin alert integration create \
  --type "pagerduty" \
  --name "nexstorage-pagerduty" \
  --config "routing_key=YOUR_PAGERDUTY_KEY"

# Create alert using the PagerDuty integration
nexstorage-admin alert create \
  --name "node-failure" \
  --condition "system.nodes.online < system.nodes.total" \
  --severity "critical" \
  --message "Node failure detected" \
  --targets "pagerduty:nexstorage-pagerduty"

Usage Analytics

NexStorage provides usage analytics to help you understand how your storage is used and where costs can be reduced.

Access Logs Analysis

Enable detailed access logs for analytics:

# Enable access logs
nexstorage-admin config set log.access.enabled true
nexstorage-admin config set log.access.destination "file"
nexstorage-admin config set log.access.file.path "/var/log/nexstorage/access.log"

Example access log entry:

2023-06-15T14:32:45.123Z - 192.168.1.100 - TXID:a1b2c3d4 - "GET /my-bucket/image.jpg" 200 1048576 0.235 - Mozilla/5.0 - ACCESSKEY:AKIAIOSFODNN7EXAMPLE
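
For ad hoc analysis, entries in this format can be parsed with a regular expression. The Python sketch below is written against the single example entry above. The field interpretations (HTTP status, response size in bytes, duration in seconds) and the field names are assumptions inferred from that example, not a documented log schema.

```python
import re

ACCESS_LOG_RE = re.compile(
    r'^(?P<timestamp>\S+) - (?P<client_ip>\S+) - TXID:(?P<txid>\S+) - '
    r'"(?P<method>[A-Z]+) (?P<path>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size_bytes>\d+) (?P<duration_s>[\d.]+) - '
    r'(?P<user_agent>.*) - ACCESSKEY:(?P<access_key>\S+)$'
)


def parse_access_log_line(line):
    """Parse one access log line into a dict, or return None if it does not match."""
    match = ACCESS_LOG_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["size_bytes"] = int(entry["size_bytes"])
    entry["duration_s"] = float(entry["duration_s"])
    # Assumes path-style URLs, where the first path segment is the bucket name.
    entry["bucket"] = entry["path"].lstrip("/").split("/", 1)[0]
    return entry


EXAMPLE = (
    '2023-06-15T14:32:45.123Z - 192.168.1.100 - TXID:a1b2c3d4 - '
    '"GET /my-bucket/image.jpg" 200 1048576 0.235 - Mozilla/5.0 - '
    'ACCESSKEY:AKIAIOSFODNN7EXAMPLE'
)
print(parse_access_log_line(EXAMPLE))
```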

Usage Reports

Generate usage reports to understand storage patterns:

# Generate daily usage report
nexstorage-admin report generate \
  --type "usage" \
  --period "daily" \
  --output-format "csv" \
  --output-file "usage-report.csv"

Example report columns:

  • Bucket Name
  • Total Size (GB)
  • Object Count
  • GET Requests
  • PUT Requests
  • Bandwidth In (GB)
  • Bandwidth Out (GB)
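
Once a CSV report exists, a short script can rank buckets by size. This Python sketch assumes the CSV header row uses the column names listed above exactly as written ("Bucket Name", "Total Size (GB)"); check the first line of your report before relying on it. The function name is hypothetical.

```python
import csv


def top_buckets_by_size(csv_file, limit=3):
    """Return the `limit` largest buckets as (name, size_gb) pairs, largest first."""
    reader = csv.DictReader(csv_file)
    rows = sorted(reader, key=lambda row: float(row["Total Size (GB)"]), reverse=True)
    return [(row["Bucket Name"], float(row["Total Size (GB)"])) for row in rows[:limit]]
```

To run it against a generated report, open the file with `open("usage-report.csv", newline="")` and pass the file object to `top_buckets_by_size`.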

Cost Allocation

Set up cost allocation tags to track usage by department or project:

# Set tags on a bucket
nexstorage-client bucket tag set \
  --bucket "marketing-assets" \
  --tags "Department=Marketing,Project=WebsiteRedesign,CostCenter=MKT-123"

# Generate cost allocation report
nexstorage-admin report generate \
  --type "cost" \
  --group-by "tags" \
  --period "monthly" \
  --output-format "json" \
  --output-file "cost-report.json"
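
Grouping by tags amounts to splitting each tag string into key-value pairs and summing usage per tag value. The Python sketch below illustrates that idea on the `Key=Value,Key=Value` syntax used above. It does not read the cost report itself, whose JSON structure is not documented here; the function names and the "untagged" fallback label are assumptions.

```python
from collections import defaultdict


def parse_tags(tag_string):
    """Parse 'Key=Value,Key=Value' into a dict."""
    tags = {}
    for pair in tag_string.split(","):
        if not pair.strip():
            continue  # tolerate empty segments such as a trailing comma
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"tag is missing '=': {pair!r}")
        tags[key.strip()] = value.strip()
    return tags


def size_by_tag(buckets, tag_key):
    """Sum sizes per value of `tag_key` over (tag_string, size_bytes) pairs."""
    totals = defaultdict(int)
    for tag_string, size_bytes in buckets:
        owner = parse_tags(tag_string).get(tag_key, "untagged")
        totals[owner] += size_bytes
    return dict(totals)


print(parse_tags("Department=Marketing,Project=WebsiteRedesign,CostCenter=MKT-123"))
```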

Advanced Analytics

Object Lifecycle Analysis

Analyze object access patterns to optimize lifecycle policies:

# Generate object lifecycle report
nexstorage-admin report generate \
  --type "lifecycle" \
  --bucket "data-archive" \
  --output-format "json" \
  --output-file "lifecycle-report.json"

This report helps identify:

  • Frequently accessed objects
  • Rarely accessed objects
  • Objects that could be moved to lower-cost storage tiers
  • Objects that should be deleted based on retention policies

Data Insights

Enable the NexStorage Data Insights feature for advanced analytics:

# Enable Data Insights
nexstorage-admin insights enable

The Data Insights dashboard provides:

  • Storage usage trends and forecasting
  • Access pattern visualization
  • Performance hotspots
  • Cost optimization recommendations
  • Compliance risk detection

Integration with Business Intelligence Tools

NexStorage metrics and logs can be integrated with BI tools for custom analytics:

  1. Tableau Integration:

    • Connect Tableau to the NexStorage metrics database
    • Import pre-built Tableau workbooks from the NexStorage resources
    • Create custom visualizations for usage patterns
  2. PowerBI Integration:

    • Use the provided PowerBI templates
    • Connect to the NexStorage metrics API
    • Create dashboards for executives and storage administrators

Monitoring Best Practices

Follow these best practices for effective NexStorage monitoring:

  1. Baseline Establishment:

    • Monitor normal usage patterns for at least two weeks
    • Establish baselines for performance metrics
    • Document seasonal patterns in usage
  2. Comprehensive Alerts:

    • Set up alerts for capacity thresholds (70%, 80%, 90%)
    • Monitor performance degradation
    • Configure alerts for security events
    • Set up alerts for replication issues
  3. Dashboard Organization:

    • Create role-specific dashboards (Admin, Developer, Executive)
    • Group related metrics
    • Use consistent units and scales
    • Include context and documentation
  4. Regular Review:

    • Schedule weekly reviews of monitoring data
    • Adjust alerting thresholds based on patterns
    • Update dashboards as requirements change
    • Archive historical data for long-term analysis
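
The tiered capacity thresholds (70%, 80%, 90%) can be expressed as a small lookup. This Python sketch maps utilization, computed from storage.capacity.used and storage.capacity.total, to a severity. The "warning" and "critical" labels match the alert examples in this guide; "info" for the lowest tier and the function name are assumptions.

```python
def capacity_severity(used_bytes, total_bytes):
    """Return 'info', 'warning', or 'critical' for 70/80/90% utilization, else None."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    utilization = 100.0 * used_bytes / total_bytes
    # Check the highest threshold first so the most severe match wins.
    for threshold, severity in ((90, "critical"), (80, "warning"), (70, "info")):
        if utilization >= threshold:
            return severity
    return None


print(capacity_severity(85, 100))
```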

Troubleshooting with Metrics

Use NexStorage metrics to diagnose common issues:

Performance Problems

  1. Check performance.latency.read and performance.latency.write:

    • Increasing latency may indicate network, disk, or CPU issues
    • Compare against historical baselines
  2. Analyze system.cpu.usage and system.memory.usage:

    • High CPU or memory usage may indicate resource constraints
    • Check if specific nodes are experiencing higher load
  3. Review performance.cache.hit_ratio:

    • Low cache hit rates may indicate inefficient access patterns
    • Consider adjusting cache size or improving application access patterns
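
Comparing latency against a historical baseline can be automated with a simple statistical check. The Python sketch below flags a reading that exceeds the baseline mean by more than three standard deviations. The three-sigma cutoff is a common rule of thumb rather than a NexStorage recommendation, and the function name and sample values are illustrative.

```python
import statistics


def is_latency_anomalous(baseline_ms, current_ms, num_stddevs=3.0):
    """True if current_ms exceeds the baseline mean by more than num_stddevs deviations.

    baseline_ms needs at least two readings so a standard deviation exists.
    """
    mean = statistics.mean(baseline_ms)
    stddev = statistics.stdev(baseline_ms)
    return current_ms > mean + num_stddevs * stddev


# Made-up read latencies (ms) collected during normal operation.
baseline = [10, 12, 11, 9, 13, 10, 11, 12]
print(is_latency_anomalous(baseline, 20))
```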

Capacity Issues

  1. Monitor storage.capacity.free trend:

    • Project when you'll reach capacity limits
    • Identify buckets with highest growth rates
  2. Analyze bucket.{name}.size for each bucket:

    • Identify buckets consuming the most space
    • Look for unexpected growth
  3. Review storage.objects.size.avg:

    • Changes in average object size may indicate application changes
    • Very small objects can impact performance
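
Projecting when capacity runs out can be done with a straight-line fit over recent storage.capacity.free readings. The Python sketch below uses ordinary least squares and returns the number of days after the last sample until free space reaches zero. It assumes roughly linear consumption, which will not hold for bursty or seasonal workloads; the function name and sample values are illustrative.

```python
def days_until_full(samples):
    """Estimate days from the last sample until free capacity reaches zero.

    samples: list of (day, free_bytes) pairs. Returns None if free space
    is flat or growing according to the fitted line.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must cover more than one point in time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx  # bytes of free space gained per day (negative when shrinking)
    if slope >= 0:
        return None
    intercept = mean_y - slope * mean_x
    zero_day = -intercept / slope  # day at which the fitted line crosses zero
    return zero_day - samples[-1][0]


# Free space dropping by 100 units per day from 1000: seven days remain after day 3.
print(days_until_full([(0, 1000), (1, 900), (2, 800), (3, 700)]))
```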

Security Monitoring

  1. Track performance.requests.error:

    • Spikes in error rates may indicate unauthorized access attempts, such as repeated failed authentication
    • Look for patterns in access logs
  2. Monitor system.requests.active:

    • Unusual patterns may indicate DDoS attempts
    • Compare against historical patterns for your applications

Next Steps

Now that you've set up monitoring and analytics for your NexStorage environment, consider exploring: