Monitoring and Analytics
Effective monitoring and analytics are essential for maintaining a healthy, secure, and optimized NexStorage environment. This guide covers the metrics NexStorage exposes, how to set up monitoring dashboards, configure alerts, and leverage advanced analytics capabilities.
Available Metrics
NexStorage exposes metrics at four levels: system, storage, performance, and per-bucket.
System-Level Metrics
| Metric | Description | Unit |
|---|---|---|
| system.cpu.usage | CPU utilization across all NexStorage nodes | Percentage |
| system.memory.usage | Memory utilization across all NexStorage nodes | Percentage |
| system.disk.usage | Storage utilization across all disks | Percentage |
| system.network.throughput | Network traffic throughput | Bytes/second |
| system.requests.total | Total number of API requests | Count |
| system.requests.active | Currently active API requests | Count |
| system.nodes.online | Number of online nodes | Count |
| system.nodes.total | Total number of nodes | Count |
| system.uptime | Time since system start | Seconds |
Storage Metrics
| Metric | Description | Unit |
|---|---|---|
| storage.capacity.total | Total storage capacity | Bytes |
| storage.capacity.used | Used storage capacity | Bytes |
| storage.capacity.free | Available storage capacity | Bytes |
| storage.objects.count | Total number of objects stored | Count |
| storage.objects.size.avg | Average object size | Bytes |
| storage.buckets.count | Number of buckets | Count |
| storage.replication.lag | Replication lag between nodes | Seconds |
| storage.healing.active | Number of active healing operations | Count |
| storage.healing.queued | Number of queued healing operations | Count |
Performance Metrics
| Metric | Description | Unit |
|---|---|---|
| performance.latency.read | Read operation latency | Milliseconds |
| performance.latency.write | Write operation latency | Milliseconds |
| performance.throughput.read | Read throughput | Bytes/second |
| performance.throughput.write | Write throughput | Bytes/second |
| performance.iops.read | Read operations per second | Count/second |
| performance.iops.write | Write operations per second | Count/second |
| performance.requests.success | Successful API requests | Count |
| performance.requests.error | Failed API requests | Count |
| performance.cache.hit_ratio | Cache hit ratio | Percentage |
Per-Bucket Metrics
| Metric | Description | Unit |
|---|---|---|
| bucket.{name}.size | Total size of objects in bucket | Bytes |
| bucket.{name}.objects | Number of objects in bucket | Count |
| bucket.{name}.bandwidth.in | Incoming bandwidth to bucket | Bytes/second |
| bucket.{name}.bandwidth.out | Outgoing bandwidth from bucket | Bytes/second |
| bucket.{name}.operations.read | Read operations on bucket | Count |
| bucket.{name}.operations.write | Write operations on bucket | Count |
| bucket.{name}.operations.delete | Delete operations on bucket | Count |
Metrics Exporters
NexStorage supports multiple metrics export formats for integration with your existing monitoring stack.
Prometheus Metrics
NexStorage exposes Prometheus-compatible metrics at the /metrics endpoint:
# Enable Prometheus metrics
nexstorage-admin config set metrics.prometheus.enabled true
nexstorage-admin config set metrics.prometheus.endpoint ":9000/metrics"
Example scrape configuration for Prometheus:
scrape_configs:
  - job_name: 'nexstorage'
    scrape_interval: 15s
    scheme: https
    tls_config:
      insecure_skip_verify: false
      ca_file: /path/to/ca.crt
    static_configs:
      - targets: ['nexstorage-server:9000']
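To check what the endpoint returns before pointing Prometheus at it, a short script can parse the text exposition format. The following is a minimal sketch, not a substitute for a Prometheus client library. The sample payload and the metric names in it (system_cpu_usage, bucket_size) are assumptions based on names used elsewhere in this guide, and label values containing escaped quotes or closing braces are not handled.

```python
import re

# Matches sample lines such as: bucket_size{bucket="logs"} 2048
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Parse Prometheus text exposition into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


# Hypothetical payload; real output from the /metrics endpoint may differ.
EXAMPLE = """\
# HELP system_cpu_usage CPU utilization across all nodes
# TYPE system_cpu_usage gauge
system_cpu_usage 42.5
bucket_size{bucket="logs"} 2048
"""

for name, labels, value in parse_metrics(EXAMPLE):
    print(name, labels, value)
```

In practice the text would come from an HTTPS request to the configured endpoint rather than from an inline string.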
StatsD Metrics
For StatsD integration:
# Enable StatsD metrics export
nexstorage-admin config set metrics.statsd.enabled true
nexstorage-admin config set metrics.statsd.address "statsd.example.com:8125"
nexstorage-admin config set metrics.statsd.prefix "nexstorage"
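To verify that the StatsD listener is reachable and to see what a prefixed metric looks like on the wire, the sketch below formats a gauge in the standard StatsD line protocol and can send it over UDP. It assumes NexStorage joins the prefix and the metric name with a dot and reports values as gauges; this guide does not confirm either detail.

```python
import socket


def statsd_gauge(prefix, metric, value):
    """Format one gauge in the StatsD line protocol: <name>:<value>|g"""
    name = f"{prefix}.{metric}" if prefix else metric
    return f"{name}:{value}|g"


def send_gauge(host, port, prefix, metric, value):
    """Send a single gauge over UDP (fire-and-forget, no delivery guarantee)."""
    payload = statsd_gauge(prefix, metric, value).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


if __name__ == "__main__":
    print(statsd_gauge("nexstorage", "system.cpu.usage", 42.5))
    # Uncomment to send a test datagram to the address configured above:
    # send_gauge("statsd.example.com", 8125, "nexstorage", "system.cpu.usage", 42.5)
```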
JSON Metrics API
NexStorage also provides a comprehensive JSON metrics API:
# Authenticate and get a token
TOKEN=$(curl -s -X POST \
-H "Content-Type: application/json" \
-d '{"accessKey":"YOUR_ACCESS_KEY","secretKey":"YOUR_SECRET_KEY"}' \
https://nexstorage.example.com/api/v1/auth | jq -r .token)
# Get system metrics
curl -s -H "Authorization: Bearer $TOKEN" \
https://nexstorage.example.com/api/v1/metrics/system
# Get metrics for a specific bucket
curl -s -H "Authorization: Bearer $TOKEN" \
https://nexstorage.example.com/api/v1/metrics/bucket/my-bucket
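The same calls can be made from a script. The sketch below mirrors the curl commands using only the Python standard library. The endpoints, request body, and token field are taken from the commands above; the shape of the metrics response is not documented here, so the code returns the decoded JSON as-is. The function names are illustrative.

```python
import json
import urllib.request

BASE_URL = "https://nexstorage.example.com/api/v1"


def build_auth_request(access_key, secret_key):
    """Build the POST request that exchanges credentials for a token."""
    body = json.dumps({"accessKey": access_key, "secretKey": secret_key}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/auth",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_metrics_request(token, path):
    """Build an authenticated GET request, e.g. path='system' or 'bucket/my-bucket'."""
    return urllib.request.Request(
        f"{BASE_URL}/metrics/{path}",
        headers={"Authorization": f"Bearer {token}"},
    )


def fetch_json(request, timeout=10):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def fetch_system_metrics(access_key, secret_key):
    """Authenticate, then fetch system metrics (requires network access)."""
    token = fetch_json(build_auth_request(access_key, secret_key))["token"]
    return fetch_json(build_metrics_request(token, "system"))
```

Separating request construction from sending keeps the credentials handling easy to inspect and lets the request builders be checked without network access.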
Setting Up Monitoring Dashboards
Grafana Dashboard
NexStorage provides pre-built Grafana dashboards for comprehensive monitoring:
- Ensure Prometheus is configured to scrape NexStorage metrics
- Import the NexStorage dashboard into Grafana:
  - In Grafana, go to Dashboard → Import
  - Enter dashboard ID 12345 or upload the JSON file from the NexStorage resources
  - Select your Prometheus data source
  - Click Import
- The dashboard includes:
  - System health overview
  - Storage capacity and utilization
  - Performance metrics
  - Request statistics
  - Per-bucket analytics
Custom Dashboard Example
Example JSON for a custom Grafana dashboard panel:
{
  "datasource": "Prometheus",
  "fieldConfig": {
    "defaults": {
      "color": {
        "mode": "palette-classic"
      },
      "custom": {
        "axisLabel": "Bytes",
        "axisPlacement": "auto",
        "barAlignment": 0,
        "drawStyle": "line",
        "fillOpacity": 10,
        "gradientMode": "none",
        "hideFrom": {
          "legend": false,
          "tooltip": false,
          "viz": false
        },
        "lineInterpolation": "smooth",
        "lineWidth": 2,
        "pointSize": 5,
        "scaleDistribution": {
          "type": "linear"
        },
        "showPoints": "never",
        "spanNulls": true,
        "stacking": {
          "group": "A",
          "mode": "none"
        },
        "thresholdsStyle": {
          "mode": "off"
        }
      },
      "mappings": [],
      "thresholds": {
        "mode": "absolute",
        "steps": [
          {
            "color": "green",
            "value": null
          },
          {
            "color": "red",
            "value": 80
          }
        ]
      },
      "unit": "bytes"
    },
    "overrides": []
  },
  "options": {
    "legend": {
      "calcs": ["mean", "max", "min"],
      "displayMode": "table",
      "placement": "bottom"
    },
    "tooltip": {
      "mode": "single",
      "sort": "none"
    }
  },
  "targets": [
    {
      "expr": "sum(bucket_size{bucket=~\"$bucket\"}) by (bucket)",
      "interval": "",
      "legendFormat": "{{bucket}}",
      "refId": "A"
    }
  ],
  "title": "Bucket Size Over Time",
  "type": "timeseries"
}
Alerting
NexStorage supports multiple alerting mechanisms to notify you of potential issues or performance degradation.
Alert Configuration
Configure alerts directly in NexStorage:
# Set up an email alert for low disk space
nexstorage-admin alert create \
--name "low-disk-space" \
--condition "storage.capacity.free < 100GB" \
--severity "warning" \
--message "Storage space is running low" \
--targets "email:admin@example.com"
# Set up a webhook alert for high error rates
nexstorage-admin alert create \
--name "high-error-rate" \
--condition "rate(performance.requests.error[5m]) > 10" \
--severity "critical" \
--message "Elevated error rate detected" \
--targets "webhook:https://alerts.example.com/webhook"
Prometheus AlertManager Integration
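Example Prometheus alerting rules for NexStorage. Prometheus evaluates these rules and forwards firing alerts to Alertmanager for routing and notification: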
groups:
  - name: nexstorage-alerts
    rules:
      - alert: NexStorageHighCPUUsage
        expr: avg(system_cpu_usage{job="nexstorage"}) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on NexStorage"
          description: "NexStorage system CPU usage is above 80% for 5 minutes"
      - alert: NexStorageLowDiskSpace
        expr: system_disk_free_bytes{job="nexstorage"} < 100 * 1024 * 1024 * 1024
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Low disk space on NexStorage"
          description: "NexStorage has less than 100GB free disk space"
      - alert: NexStorageHighErrorRate
        expr: rate(performance_requests_error{job="nexstorage"}[5m]) > 10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate on NexStorage"
          description: "NexStorage API error rate is above 10 per second"
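The low-disk-space rule writes its threshold as 100 * 1024 * 1024 * 1024, which is 100 GB in binary (1024-based) multiples. The helper below is a small sketch for computing such byte thresholds when writing rules. Whether the "100GB" in the NexStorage alert condition is interpreted with binary or decimal multiples is not stated in this guide, so binary here is an assumption chosen to match the Prometheus rule.

```python
BINARY_UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def size_to_bytes(text):
    """Convert a size such as '100GB' to bytes using binary (1024-based) multiples."""
    text = text.strip().upper()
    for suffix, factor in BINARY_UNITS.items():
        if text.endswith(suffix):
            return int(float(text[: -len(suffix)]) * factor)
    return int(text)  # no suffix: already a byte count


print(size_to_bytes("100GB"))  # 107374182400
```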
PagerDuty Integration
Configure PagerDuty integration for critical alerts:
# Set up PagerDuty integration
nexstorage-admin alert integration create \
--type "pagerduty" \
--name "nexstorage-pagerduty" \
--config "routing_key=YOUR_PAGERDUTY_KEY"
# Create alert using the PagerDuty integration
nexstorage-admin alert create \
--name "node-failure" \
--condition "system.nodes.online < system.nodes.total" \
--severity "critical" \
--message "Node failure detected" \
--targets "pagerduty:nexstorage-pagerduty"
Usage Analytics
NexStorage provides comprehensive usage analytics to help you understand how your storage is being utilized and optimize costs.
Access Logs Analysis
Enable detailed access logs for analytics:
# Enable access logs
nexstorage-admin config set log.access.enabled true
nexstorage-admin config set log.access.destination "file"
nexstorage-admin config set log.access.file.path "/var/log/nexstorage/access.log"
Example access log entry:
2023-06-15T14:32:45.123Z - 192.168.1.100 - TXID:a1b2c3d4 - "GET /my-bucket/image.jpg" 200 1048576 0.235 - Mozilla/5.0 - ACCESSKEY:AKIAIOSFODNN7EXAMPLE
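For ad hoc analysis, log lines in this shape can be parsed with a regular expression. The sketch below is derived only from the single example entry above. The field names and their meanings (response size in bytes, duration in seconds, first path segment as the bucket) are assumptions, not a documented log schema.

```python
import re

LOG_PATTERN = re.compile(
    r'^(?P<timestamp>\S+) - (?P<client_ip>\S+) - TXID:(?P<txid>\S+) - '
    r'"(?P<method>[A-Z]+) (?P<path>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+) (?P<duration>[\d.]+) - '
    r'(?P<user_agent>.*) - ACCESSKEY:(?P<access_key>\S+)$'
)


def parse_access_line(line):
    """Return the fields of one access log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["size"] = int(entry["size"])
    entry["duration"] = float(entry["duration"])
    # Assumes path-style requests, where the first path segment is the bucket name.
    entry["bucket"] = entry["path"].lstrip("/").split("/", 1)[0]
    return entry


SAMPLE = (
    '2023-06-15T14:32:45.123Z - 192.168.1.100 - TXID:a1b2c3d4 - '
    '"GET /my-bucket/image.jpg" 200 1048576 0.235 - Mozilla/5.0 - '
    'ACCESSKEY:AKIAIOSFODNN7EXAMPLE'
)
print(parse_access_line(SAMPLE))
```

Lines that do not match return None, so a caller can count and inspect them instead of silently dropping entries in an unexpected format.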
Usage Reports
Generate usage reports to understand storage patterns:
# Generate daily usage report
nexstorage-admin report generate \
--type "usage" \
--period "daily" \
--output-format "csv" \
--output-file "usage-report.csv"
Example report columns:
- Bucket Name
- Total Size (GB)
- Object Count
- GET Requests
- PUT Requests
- Bandwidth In (GB)
- Bandwidth Out (GB)
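Once a CSV report exists, a few lines of code can rank buckets by size. The sketch below assumes the CSV header row uses the column names exactly as listed above ("Bucket Name", "Total Size (GB)"); the sample rows are made up for illustration.

```python
import csv
import io


def top_buckets_by_size(csv_file, limit=3):
    """Return (bucket, size_gb) pairs for the largest buckets in a usage report."""
    rows = csv.DictReader(csv_file)
    sizes = [(row["Bucket Name"], float(row["Total Size (GB)"])) for row in rows]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:limit]


# Made-up sample data; in practice, open "usage-report.csv" instead.
SAMPLE = io.StringIO(
    "Bucket Name,Total Size (GB),Object Count\n"
    "logs,120.5,90000\n"
    "marketing-assets,300.0,1500\n"
    "backups,80.25,40\n"
)

print(top_buckets_by_size(SAMPLE, limit=2))
```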
Cost Allocation
Set up cost allocation tags to track usage by department or project:
# Set tags on a bucket
nexstorage-client bucket tag set \
--bucket "marketing-assets" \
--tags "Department=Marketing,Project=WebsiteRedesign,CostCenter=MKT-123"
# Generate cost allocation report
nexstorage-admin report generate \
--type "cost" \
--group-by "tags" \
--period "monthly" \
--output-format "json" \
--output-file "cost-report.json"
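Grouping by tag can also be done client-side when combining data from several sources. The sketch below parses a tag string in the Key=Value,Key=Value form used above and sums sizes per tag value. The structure of the generated cost report is not documented in this guide, so the input here is a plain list of (tag string, size) pairs, which is a hypothetical shape.

```python
from collections import defaultdict


def parse_tags(tag_string):
    """Parse 'Key=Value,Key=Value' into a dict."""
    tags = {}
    for pair in tag_string.split(","):
        if not pair.strip():
            continue  # tolerate empty strings and trailing commas
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"malformed tag: {pair!r}")
        tags[key.strip()] = value.strip()
    return tags


def total_by_tag(buckets, tag_key):
    """Sum size per value of tag_key; buckets is a list of (tag_string, size_gb)."""
    totals = defaultdict(float)
    for tag_string, size_gb in buckets:
        totals[parse_tags(tag_string).get(tag_key, "untagged")] += size_gb
    return dict(totals)


print(parse_tags("Department=Marketing,Project=WebsiteRedesign,CostCenter=MKT-123"))
```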
Advanced Analytics
Object Lifecycle Analysis
Analyze object access patterns to optimize lifecycle policies:
# Generate object lifecycle report
nexstorage-admin report generate \
--type "lifecycle" \
--bucket "data-archive" \
--output-format "json" \
--output-file "lifecycle-report.json"
This report helps identify:
- Frequently accessed objects
- Rarely accessed objects
- Objects that could be moved to lower-cost storage tiers
- Objects that should be deleted based on retention policies
Data Insights
Enable the NexStorage Data Insights feature for advanced analytics:
# Enable Data Insights
nexstorage-admin insights enable
The Data Insights dashboard provides:
- Storage usage trends and forecasting
- Access pattern visualization
- Performance hotspots
- Cost optimization recommendations
- Compliance risk detection
Integration with Business Intelligence Tools
NexStorage metrics and logs can be integrated with BI tools for custom analytics:
- Tableau Integration:
  - Connect Tableau to the NexStorage metrics database
  - Import pre-built Tableau workbooks from the NexStorage resources
  - Create custom visualizations for usage patterns
- PowerBI Integration:
  - Use the provided PowerBI templates
  - Connect to the NexStorage metrics API
  - Create dashboards for executives and storage administrators
Monitoring Best Practices
Follow these best practices for effective NexStorage monitoring:
- Baseline Establishment:
  - Monitor normal usage patterns for at least two weeks
  - Establish baselines for performance metrics
  - Document seasonal patterns in usage
- Comprehensive Alerts:
  - Set up alerts for capacity thresholds (70%, 80%, 90%)
  - Monitor performance degradation
  - Configure alerts for security events
  - Set up alerts for replication issues
- Dashboard Organization:
  - Create role-specific dashboards (Admin, Developer, Executive)
  - Group related metrics
  - Use consistent units and scales
  - Include context and documentation
- Regular Review:
  - Schedule weekly reviews of monitoring data
  - Adjust alerting thresholds based on patterns
  - Update dashboards as requirements change
  - Archive historical data for long-term analysis
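The tiered capacity thresholds recommended above (70%, 80%, 90%) can be evaluated from the storage.capacity.used and storage.capacity.total values. The following is a minimal sketch; the function name is illustrative, and how each tier maps to an alert severity is left to the reader.

```python
THRESHOLDS = (70, 80, 90)  # percent used, from the capacity alerting guidance


def highest_threshold_crossed(used_bytes, total_bytes, thresholds=THRESHOLDS):
    """Return the highest threshold (in percent) that utilization has reached, or None."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    percent_used = 100.0 * used_bytes / total_bytes
    crossed = [t for t in thresholds if percent_used >= t]
    return max(crossed) if crossed else None


print(highest_threshold_crossed(850, 1000))  # 80
```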
Troubleshooting with Metrics
Use NexStorage metrics to diagnose common issues:
Performance Problems
- Check performance.latency.read and performance.latency.write:
  - Increasing latency may indicate network, disk, or CPU issues
  - Compare against historical baselines
- Analyze system.cpu.usage and system.memory.usage:
  - High CPU or memory usage may indicate resource constraints
  - Check if specific nodes are experiencing higher load
- Review performance.cache.hit_ratio:
  - Low cache hit rates may indicate inefficient access patterns
  - Consider adjusting cache size or improving application access patterns
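One simple way to compare a latency reading against a historical baseline is to flag values that sit well above the baseline mean. The sketch below uses a mean-plus-three-standard-deviations rule, which is a common heuristic chosen here for illustration and not a NexStorage feature; the sample values are made up.

```python
import statistics


def exceeds_baseline(history_ms, current_ms, num_stdevs=3.0):
    """Return True when current_ms is more than num_stdevs above the baseline mean."""
    if len(history_ms) < 2:
        raise ValueError("need at least two baseline samples")
    mean = statistics.mean(history_ms)
    stdev = statistics.stdev(history_ms)
    return current_ms > mean + num_stdevs * stdev


# Made-up read latency samples in milliseconds.
baseline = [10, 12, 11, 9, 10, 11, 10, 12]
print(exceeds_baseline(baseline, 30))
print(exceeds_baseline(baseline, 11))
```

Latency distributions are often skewed, so a percentile-based baseline may suit some workloads better than a mean-based one.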
Capacity Issues
- Monitor storage.capacity.free trend:
  - Project when you'll reach capacity limits
  - Identify buckets with highest growth rates
- Analyze bucket.{name}.size for each bucket:
  - Identify buckets consuming the most space
  - Look for unexpected growth
- Review storage.objects.size.avg:
  - Changes in average object size may indicate application changes
  - Very small objects can impact performance
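Projecting when capacity will run out can be approximated by fitting a straight line to recent storage.capacity.free samples. The sketch below uses an ordinary least-squares fit and assumes roughly linear consumption, which will not hold for bursty or seasonal workloads; the sample data is made up.

```python
def days_until_full(samples):
    """Estimate days from the last sample until free capacity reaches zero.

    samples is a list of (day, free_bytes) pairs in chronological order.
    Returns None when free capacity is flat or growing.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_f = sum(f for _, f in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    # Least-squares slope: change in free bytes per day.
    slope = sum((t - mean_t) * (f - mean_f) for t, f in samples) / var_t
    if slope >= 0:
        return None
    intercept = mean_f - slope * mean_t
    day_at_zero = -intercept / slope
    return day_at_zero - samples[-1][0]


# Made-up data: free capacity shrinking by 10 units per day.
print(days_until_full([(0, 100), (1, 90), (2, 80), (3, 70), (4, 60)]))
```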
Security Monitoring
- Track performance.requests.error:
  - Spikes in error rates may indicate security issues
  - Look for patterns in access logs
- Monitor system.requests.active:
  - Unusual patterns may indicate DDoS attempts
  - Compare against historical patterns for your applications
Next Steps
Now that you've set up monitoring and analytics for your NexStorage environment, consider exploring:
- Security Best Practices - Enhance your storage security
- Integration Guides - Connect with more tools and services
- Migration from AWS S3 - Move existing data to NexStorage