Monitoring and Analytics

Effective monitoring and analytics are essential for maintaining a healthy, secure, and optimized NexStorage environment. This guide covers the metrics NexStorage exposes, how to set up monitoring dashboards, configure alerts, and leverage advanced analytics capabilities.

Available Metrics

NexStorage exposes metrics at four levels: system, storage, performance, and per-bucket. The tables below list each metric with its description and unit.

System-Level Metrics

Metric	Description	Unit
`system.cpu.usage`	CPU utilization across all NexStorage nodes	Percentage
`system.memory.usage`	Memory utilization across all NexStorage nodes	Percentage
`system.disk.usage`	Storage utilization across all disks	Percentage
`system.network.throughput`	Network traffic throughput	Bytes/second
`system.requests.total`	Total number of API requests	Count
`system.requests.active`	Currently active API requests	Count
`system.nodes.online`	Number of online nodes	Count
`system.nodes.total`	Total number of nodes	Count
`system.uptime`	Time since system start	Seconds

Storage Metrics

Metric	Description	Unit
`storage.capacity.total`	Total storage capacity	Bytes
`storage.capacity.used`	Used storage capacity	Bytes
`storage.capacity.free`	Available storage capacity	Bytes
`storage.objects.count`	Total number of objects stored	Count
`storage.objects.size.avg`	Average object size	Bytes
`storage.buckets.count`	Number of buckets	Count
`storage.replication.lag`	Replication lag between nodes	Seconds
`storage.healing.active`	Number of active healing operations	Count
`storage.healing.queued`	Number of queued healing operations	Count

Performance Metrics

Metric	Description	Unit
`performance.latency.read`	Read operation latency	Milliseconds
`performance.latency.write`	Write operation latency	Milliseconds
`performance.throughput.read`	Read throughput	Bytes/second
`performance.throughput.write`	Write throughput	Bytes/second
`performance.iops.read`	Read operations per second	Count/second
`performance.iops.write`	Write operations per second	Count/second
`performance.requests.success`	Successful API requests	Count
`performance.requests.error`	Failed API requests	Count
`performance.cache.hit_ratio`	Cache hit ratio	Percentage

Per-Bucket Metrics

Metric	Description	Unit
`bucket.{name}.size`	Total size of objects in bucket	Bytes
`bucket.{name}.objects`	Number of objects in bucket	Count
`bucket.{name}.bandwidth.in`	Incoming bandwidth to bucket	Bytes/second
`bucket.{name}.bandwidth.out`	Outgoing bandwidth from bucket	Bytes/second
`bucket.{name}.operations.read`	Read operations on bucket	Count
`bucket.{name}.operations.write`	Write operations on bucket	Count
`bucket.{name}.operations.delete`	Delete operations on bucket	Count

Metrics Exporters

NexStorage can export metrics in three ways, so you can integrate it with your existing monitoring stack: a Prometheus endpoint, StatsD, and a JSON API.

Prometheus Metrics

NexStorage exposes Prometheus-compatible metrics at the /metrics endpoint:

# Enable Prometheus metrics
nexstorage-admin config set metrics.prometheus.enabled true
nexstorage-admin config set metrics.prometheus.endpoint ":9000/metrics"

Example scrape configuration for Prometheus:

scrape_configs:
  - job_name: 'nexstorage'
    scrape_interval: 15s
    scheme: https
    tls_config:
      insecure_skip_verify: false
      ca_file: /path/to/ca.crt
    static_configs:
      - targets: ['nexstorage-server:9000']
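
To check what a scrape would return, the Prometheus text exposition format can be read with a few lines of Python. This is a minimal sketch, not a full parser: it handles only plain `name{labels} value` lines (no timestamps, no escaped quotes in label values), and the sample metric names assume NexStorage maps dotted names such as `system.cpu.usage` to underscores, which this guide does not confirm.

```python
import re

# Matches `name{label="v",...} value` or `name value`; simplified on purpose.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hypothetical payload; the metric names NexStorage really emits may differ.
SAMPLE = """\
# HELP system_cpu_usage CPU utilization
# TYPE system_cpu_usage gauge
system_cpu_usage{node="node-1"} 42.5
storage_capacity_free 1.5e+11
"""

for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

In practice the text would come from the configured `/metrics` endpoint rather than an inline string.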

StatsD Metrics

For StatsD integration:

# Enable StatsD metrics export
nexstorage-admin config set metrics.statsd.enabled true
nexstorage-admin config set metrics.statsd.address "statsd.example.com:8125"
nexstorage-admin config set metrics.statsd.prefix "nexstorage"
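
To confirm that your StatsD server accepts data under the configured prefix, you can send a test gauge yourself. The sketch below uses the standard StatsD line format `<name>:<value>|g` over UDP. The metric name under the prefix is an assumption, since this guide does not state exactly how NexStorage names the metrics it exports to StatsD.

```python
import socket


def format_gauge(prefix, metric, value):
    """Build a StatsD gauge line: <prefix>.<metric>:<value>|g"""
    return f"{prefix}.{metric}:{value}|g"


def send_gauge(address, prefix, metric, value):
    """Send one gauge datagram over UDP (fire-and-forget, no reply expected)."""
    host, port = address.rsplit(":", 1)
    payload = format_gauge(prefix, metric, value).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, int(port)))


# Prints the line that would be sent; the metric name is illustrative only.
print(format_gauge("nexstorage", "system.cpu.usage", 42.5))
```

Calling `send_gauge("statsd.example.com:8125", "nexstorage", "system.cpu.usage", 42.5)` would send that line to the address configured above.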

JSON Metrics API

NexStorage also provides a JSON metrics API. Authenticate to obtain a bearer token, then query the system or per-bucket endpoints:

# Authenticate and get a token
TOKEN=$(curl -s -X POST \
  -H "Content-Type: application/json" \
  -d '{"accessKey":"YOUR_ACCESS_KEY","secretKey":"YOUR_SECRET_KEY"}' \
  https://nexstorage.example.com/api/v1/auth | jq -r .token)

# Get system metrics
curl -s -H "Authorization: Bearer $TOKEN" \
  https://nexstorage.example.com/api/v1/metrics/system

# Get metrics for a specific bucket
curl -s -H "Authorization: Bearer $TOKEN" \
  https://nexstorage.example.com/api/v1/metrics/bucket/my-bucket
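
The same calls can be made from a script. The following sketch mirrors the curl commands above using only the Python standard library. The endpoint paths and the `token` response field are taken from those commands; the shape of the metrics response is not documented here, so the sketch returns the decoded JSON as-is.

```python
import json
import urllib.request

BASE_URL = "https://nexstorage.example.com/api/v1"


def build_auth_request(access_key, secret_key):
    """POST /auth with the credentials as a JSON body."""
    body = json.dumps({"accessKey": access_key, "secretKey": secret_key})
    return urllib.request.Request(
        f"{BASE_URL}/auth",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_metrics_request(token, path):
    """GET /metrics/<path> with the bearer token, e.g. path='system'."""
    return urllib.request.Request(
        f"{BASE_URL}/metrics/{path}",
        headers={"Authorization": f"Bearer {token}"},
    )


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def get_bucket_metrics(access_key, secret_key, bucket):
    """Authenticate, then fetch metrics for one bucket."""
    token = fetch_json(build_auth_request(access_key, secret_key))["token"]
    return fetch_json(build_metrics_request(token, f"bucket/{bucket}"))
```

For example, `get_bucket_metrics("YOUR_ACCESS_KEY", "YOUR_SECRET_KEY", "my-bucket")` performs both requests in sequence.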

Setting Up Monitoring Dashboards

Grafana Dashboard

NexStorage provides pre-built Grafana dashboards for comprehensive monitoring:

1. Ensure Prometheus is configured to scrape NexStorage metrics.
2. Import the NexStorage dashboard into Grafana:
   - In Grafana, go to Dashboard → Import
   - Enter dashboard ID 12345 or upload the JSON file from the NexStorage resources
   - Select your Prometheus data source
   - Click Import

The dashboard includes:
- System health overview
- Storage capacity and utilization
- Performance metrics
- Request statistics
- Per-bucket analytics

Custom Dashboard Example

Example JSON for a custom Grafana dashboard panel:

{
  "datasource": "Prometheus",
  "fieldConfig": {
    "defaults": {
      "color": {
        "mode": "palette-classic"
      },
      "custom": {
        "axisLabel": "Bytes",
        "axisPlacement": "auto",
        "barAlignment": 0,
        "drawStyle": "line",
        "fillOpacity": 10,
        "gradientMode": "none",
        "hideFrom": {
          "legend": false,
          "tooltip": false,
          "viz": false
        },
        "lineInterpolation": "smooth",
        "lineWidth": 2,
        "pointSize": 5,
        "scaleDistribution": {
          "type": "linear"
        },
        "showPoints": "never",
        "spanNulls": true,
        "stacking": {
          "group": "A",
          "mode": "none"
        },
        "thresholdsStyle": {
          "mode": "off"
        }
      },
      "mappings": [],
      "thresholds": {
        "mode": "absolute",
        "steps": [
          {
            "color": "green",
            "value": null
          },
          {
            "color": "red",
            "value": 80
          }
        ]
      },
      "unit": "bytes"
    },
    "overrides": []
  },
  "options": {
    "legend": {
      "calcs": ["mean", "max", "min"],
      "displayMode": "table",
      "placement": "bottom"
    },
    "tooltip": {
      "mode": "single",
      "sort": "none"
    }
  },
  "targets": [
    {
      "expr": "sum(bucket_size{bucket=~\"$bucket\"}) by (bucket)",
      "interval": "",
      "legendFormat": "{{bucket}}",
      "refId": "A"
    }
  ],
  "title": "Bucket Size Over Time",
  "type": "timeseries"
}

Alerting

NexStorage supports multiple alerting mechanisms to notify you of potential issues or performance degradation.

Alert Configuration

Configure alerts directly in NexStorage:

# Set up an email alert for low disk space
nexstorage-admin alert create \
  --name "low-disk-space" \
  --condition "storage.capacity.free < 100GB" \
  --severity "warning" \
  --message "Storage space is running low" \
  --targets "email:admin@example.com"

# Set up a webhook alert for high error rates
nexstorage-admin alert create \
  --name "high-error-rate" \
  --condition "rate(performance.requests.error[5m]) > 10" \
  --severity "critical" \
  --message "Elevated error rate detected" \
  --targets "webhook:https://alerts.example.com/webhook"

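Prometheus Alertmanager Integration

Example Prometheus alerting rules for NexStorage. Prometheus evaluates these rules and forwards firing alerts to Alertmanager for routing and notification: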

groups:
- name: nexstorage-alerts
  rules:
  - alert: NexStorageHighCPUUsage
    expr: avg(system_cpu_usage{job="nexstorage"}) > 80
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High CPU usage on NexStorage"
      description: "NexStorage system CPU usage is above 80% for 5 minutes"

  - alert: NexStorageLowDiskSpace
    expr: storage_capacity_free{job="nexstorage"} < 100 * 1024 * 1024 * 1024
    for: 10m
    labels:
      severity: critical
    annotations:
      summary: "Low disk space on NexStorage"
      description: "NexStorage has less than 100 GiB of free storage capacity"
  
  - alert: NexStorageHighErrorRate
    expr: rate(performance_requests_error{job="nexstorage"}[5m]) > 10
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "High error rate on NexStorage"
      description: "NexStorage API error rate is above 10 per second"
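
The `rate(...[5m])` expression in the last rule is the per-second average increase of a counter over the window. The sketch below shows the underlying arithmetic on hypothetical samples. It is a simplification: it treats a drop in value as a counter reset, as Prometheus does, but omits the window-boundary extrapolation that Prometheus applies.

```python
def per_second_rate(samples):
    """Approximate per-second increase of a counter over a window.

    `samples` is a list of (unix_seconds, counter_value) pairs in time order.
    """
    if len(samples) < 2:
        return 0.0
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # After a reset the counter restarts near zero, so count the new value.
        increase += (curr - prev) if curr >= prev else curr
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else 0.0


# Hypothetical error-counter samples taken every 60 s over 5 minutes.
window = [(0, 100), (60, 700), (120, 1500), (180, 2300), (240, 3100), (300, 4000)]
rate = per_second_rate(window)
print(f"{rate:.1f} errors/s, alert={rate > 10}")
```

Here the counter grows by 3900 over 300 seconds, so the rate is 13 errors per second and the `> 10` condition holds.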

PagerDuty Integration

Configure PagerDuty integration for critical alerts:

# Set up PagerDuty integration
nexstorage-admin alert integration create \
  --type "pagerduty" \
  --name "nexstorage-pagerduty" \
  --config "routing_key=YOUR_PAGERDUTY_KEY"

# Create alert using the PagerDuty integration
nexstorage-admin alert create \
  --name "node-failure" \
  --condition "system.nodes.online < system.nodes.total" \
  --severity "critical" \
  --message "Node failure detected" \
  --targets "pagerduty:nexstorage-pagerduty"
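
To test a routing key independently of NexStorage, you can send an event to the PagerDuty Events API v2 directly. The sketch below builds a `trigger` event for the node-failure condition above. How NexStorage itself formats the events it sends is not documented in this guide, so the `summary`, `source`, and `dedup_key` values are illustrative assumptions.

```python
import json
import urllib.request

EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"
SEVERITIES = {"critical", "error", "warning", "info"}  # accepted by Events API v2


def build_trigger_event(routing_key, summary, source, severity="critical", dedup_key=None):
    """Build an Events API v2 trigger payload."""
    if severity not in SEVERITIES:
        raise ValueError(f"unsupported severity: {severity!r}")
    event = {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {"summary": summary, "source": source, "severity": severity},
    }
    if dedup_key:
        event["dedup_key"] = dedup_key  # repeated triggers group into one incident
    return event


def node_failure_event(routing_key, nodes_online, nodes_total):
    """Return an event if any node is offline, otherwise None."""
    if nodes_online >= nodes_total:
        return None
    summary = f"Node failure detected: {nodes_online}/{nodes_total} nodes online"
    return build_trigger_event(routing_key, summary, "nexstorage", dedup_key="node-failure")


def send_event(event, timeout=10):
    """POST the event to PagerDuty and return the decoded response."""
    request = urllib.request.Request(
        EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, `send_event(node_failure_event("YOUR_PAGERDUTY_KEY", 3, 4))` would open an incident for one offline node.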

Usage Analytics

NexStorage provides usage analytics through access logs, usage reports, and cost allocation tags, so you can see how your storage is used and where costs accrue.

Access Logs Analysis

Enable detailed access logs for analytics:

# Enable access logs
nexstorage-admin config set log.access.enabled true
nexstorage-admin config set log.access.destination "file"
nexstorage-admin config set log.access.file.path "/var/log/nexstorage/access.log"

Example access log entry:

2023-06-15T14:32:45.123Z - 192.168.1.100 - TXID:a1b2c3d4 - "GET /my-bucket/image.jpg" 200 1048576 0.235 - Mozilla/5.0 - ACCESSKEY:AKIAIOSFODNN7EXAMPLE
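
For ad hoc analysis, entries in this format can be parsed with a regular expression. The sketch below is derived from the single example entry above, so the field names and units are assumptions: `1048576` is read as the response size in bytes, `0.235` as the request duration in seconds, and the first path segment as the bucket name.

```python
import re

# Field names and units are assumptions inferred from the one example entry.
ACCESS_LOG_RE = re.compile(
    r'^(?P<timestamp>\S+) - (?P<client_ip>\S+) - TXID:(?P<txid>\S+) - '
    r'"(?P<method>[A-Z]+) (?P<path>[^"]*)" (?P<status>\d{3}) (?P<size>\d+) '
    r'(?P<duration>\d+(?:\.\d+)?) - (?P<user_agent>.*) - ACCESSKEY:(?P<access_key>\S+)$'
)


def parse_access_line(line):
    """Return a dict of typed fields, or None if the line does not match."""
    match = ACCESS_LOG_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["size"] = int(entry["size"])            # assumed: bytes
    entry["duration"] = float(entry["duration"])  # assumed: seconds
    # The first path segment is taken to be the bucket name.
    entry["bucket"] = entry["path"].lstrip("/").split("/", 1)[0]
    return entry


LINE = ('2023-06-15T14:32:45.123Z - 192.168.1.100 - TXID:a1b2c3d4 - '
        '"GET /my-bucket/image.jpg" 200 1048576 0.235 - Mozilla/5.0 - '
        'ACCESSKEY:AKIAIOSFODNN7EXAMPLE')
entry = parse_access_line(LINE)
print(entry["bucket"], entry["status"], entry["size"])
```

Parsed entries can then be grouped by `client_ip`, `access_key`, or `bucket` to find heavy users or unusual error patterns.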

Usage Reports

Generate usage reports to understand storage patterns:

# Generate daily usage report
nexstorage-admin report generate \
  --type "usage" \
  --period "daily" \
  --output-format "csv" \
  --output-file "usage-report.csv"

Example report columns:

- Bucket Name
- Total Size (GB)
- Object Count
- GET Requests
- PUT Requests
- Bandwidth In (GB)
- Bandwidth Out (GB)
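
Once generated, the CSV report can be processed with standard tooling. The sketch below ranks buckets by size. It assumes the CSV header row uses exactly the column names listed above, which this guide does not confirm, and the sample rows are invented for illustration.

```python
import csv
import io


def top_buckets_by_size(csv_file, limit=3):
    """Return the `limit` largest buckets as (name, size_gb) pairs."""
    rows = sorted(
        csv.DictReader(csv_file),
        key=lambda row: float(row["Total Size (GB)"]),
        reverse=True,
    )
    return [(row["Bucket Name"], float(row["Total Size (GB)"])) for row in rows[:limit]]


# Invented sample data standing in for usage-report.csv.
SAMPLE = io.StringIO(
    "Bucket Name,Total Size (GB),Object Count,GET Requests,PUT Requests,"
    "Bandwidth In (GB),Bandwidth Out (GB)\n"
    "marketing-assets,120.5,4200,98000,1500,3.2,45.1\n"
    "data-archive,950.0,120000,300,50,1.0,0.4\n"
    "logs,310.2,880000,1200,64000,12.7,0.9\n"
)

print(top_buckets_by_size(SAMPLE, limit=2))
```

To run it against a real report, pass `open("usage-report.csv", newline="")` instead of the in-memory sample.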

Cost Allocation

Set up cost allocation tags to track usage by department or project:

# Set tags on a bucket
nexstorage-client bucket tag set \
  --bucket "marketing-assets" \
  --tags "Department=Marketing,Project=WebsiteRedesign,CostCenter=MKT-123"

# Generate cost allocation report
nexstorage-admin report generate \
  --type "cost" \
  --group-by "tags" \
  --period "monthly" \
  --output-format "json" \
  --output-file "cost-report.json"
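
The grouping that a tag-based report performs can be illustrated in a few lines. The sketch below parses tag strings in the `Key=Value,Key=Value` form used above and totals bucket sizes per tag value. The structure of `cost-report.json` is not documented in this guide, so the input records here (`name`, `size_gb`, `tags`) are a hypothetical shape chosen for illustration.

```python
from collections import defaultdict


def parse_tags(tag_string):
    """Parse 'Key=Value,Key=Value' into a dict."""
    tags = {}
    for pair in tag_string.split(","):
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"malformed tag: {pair!r}")
        tags[key.strip()] = value.strip()
    return tags


def total_by_tag(buckets, tag_key):
    """Sum bucket sizes per value of `tag_key`; others go under 'untagged'."""
    totals = defaultdict(float)
    for bucket in buckets:
        tags = parse_tags(bucket["tags"]) if bucket["tags"] else {}
        totals[tags.get(tag_key, "untagged")] += bucket["size_gb"]
    return dict(totals)


# Hypothetical records; only the first tag string comes from this guide.
BUCKETS = [
    {"name": "marketing-assets", "size_gb": 120.5,
     "tags": "Department=Marketing,Project=WebsiteRedesign,CostCenter=MKT-123"},
    {"name": "campaign-video", "size_gb": 300.0, "tags": "Department=Marketing"},
    {"name": "scratch", "size_gb": 10.0, "tags": ""},
]

print(total_by_tag(BUCKETS, "Department"))
```

Buckets without the requested tag are kept in an `untagged` group so that unallocated usage stays visible.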

Advanced Analytics

Object Lifecycle Analysis

Analyze object access patterns to optimize lifecycle policies:

# Generate object lifecycle report
nexstorage-admin report generate \
  --type "lifecycle" \
  --bucket "data-archive" \
  --output-format "json" \
  --output-file "lifecycle-report.json"

This report helps identify:

- Frequently accessed objects
- Rarely accessed objects
- Objects that could be moved to lower-cost storage tiers
- Objects that should be deleted based on retention policies

Data Insights

Enable the NexStorage Data Insights feature for advanced analytics:

# Enable Data Insights
nexstorage-admin insights enable

The Data Insights dashboard provides:

- Storage usage trends and forecasting
- Access pattern visualization
- Performance hotspots
- Cost optimization recommendations
- Compliance risk detection

Integration with Business Intelligence Tools

NexStorage metrics and logs can be integrated with BI tools for custom analytics:

Tableau Integration:
- Connect Tableau to the NexStorage metrics database
- Import pre-built Tableau workbooks from the NexStorage resources
- Create custom visualizations for usage patterns

Power BI Integration:
- Use the provided Power BI templates
- Connect to the NexStorage metrics API
- Create dashboards for executives and storage administrators

Monitoring Best Practices

Follow these best practices for effective NexStorage monitoring:

Baseline Establishment:
- Monitor normal usage patterns for at least two weeks
- Establish baselines for performance metrics
- Document seasonal patterns in usage
Comprehensive Alerts:
- Set up alerts for capacity thresholds (70%, 80%, 90%)
- Monitor performance degradation
- Configure alerts for security events
- Set up alerts for replication issues
Dashboard Organization:
- Create role-specific dashboards (Admin, Developer, Executive)
- Group related metrics
- Use consistent units and scales
- Include context and documentation
Regular Review:
- Schedule weekly reviews of monitoring data
- Adjust alerting thresholds based on patterns
- Update dashboards as requirements change
- Archive historical data for long-term analysis
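
The tiered capacity thresholds recommended above (70%, 80%, 90%) can be expressed as a small classification function. This is a sketch only: the mapping of each threshold to a severity name is an assumption made for illustration, not a NexStorage default.

```python
# Assumed mapping of utilization thresholds to severities, highest first.
DEFAULT_THRESHOLDS = ((90, "critical"), (80, "warning"), (70, "info"))


def capacity_severity(used_bytes, total_bytes, thresholds=DEFAULT_THRESHOLDS):
    """Return the severity for the current utilization, or None if below all thresholds."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    used_pct = 100.0 * used_bytes / total_bytes
    for limit, severity in thresholds:
        if used_pct >= limit:
            return severity
    return None


# Inputs correspond to storage.capacity.used and storage.capacity.total.
print(capacity_severity(850, 1000))
```

Checking the thresholds from highest to lowest means each utilization level maps to exactly one severity.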

Troubleshooting with Metrics

Use NexStorage metrics to diagnose common issues:

Performance Problems

Check `performance.latency.read` and `performance.latency.write`:
- Increasing latency may indicate network, disk, or CPU issues
- Compare against historical baselines
Analyze `system.cpu.usage` and `system.memory.usage`:
- High CPU or memory usage may indicate resource constraints
- Check if specific nodes are experiencing higher load
Review `performance.cache.hit_ratio`:
- Low cache hit rates may indicate inefficient access patterns
- Consider adjusting cache size or improving application access patterns
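
Comparing a latency reading against a historical baseline can be sketched as a simple mean-plus-standard-deviation check. The three-sigma cutoff and the sample values below are illustrative assumptions; choose a cutoff that fits the variability of your own workload.

```python
import statistics


def is_anomalous(baseline, current, sigmas=3.0):
    """Return True if `current` exceeds the baseline mean by more than `sigmas` standard deviations."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    mean = statistics.fmean(baseline)
    stdev = statistics.stdev(baseline)
    return current > mean + sigmas * stdev


# Hypothetical performance.latency.read samples in milliseconds.
baseline_ms = [12, 14, 13, 15, 12, 13, 14]

print(is_anomalous(baseline_ms, 25))
print(is_anomalous(baseline_ms, 15))
```

With this baseline the cutoff is roughly 16.6 ms, so a 25 ms reading is flagged while 15 ms is not.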

Capacity Issues

Monitor the `storage.capacity.free` trend:
- Project when you'll reach capacity limits
- Identify buckets with the highest growth rates
Analyze `bucket.{name}.size` for each bucket:
- Identify buckets consuming the most space
- Look for unexpected growth
Review `storage.objects.size.avg`:
- Changes in average object size may indicate application changes
- Large numbers of very small objects can degrade performance
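
Projecting when capacity will run out can be approximated with a least-squares line through recent free-capacity samples. The sketch below assumes roughly linear growth, which will not hold for bursty workloads, and uses invented sample values.

```python
def days_until_full(samples):
    """Estimate days from the last sample until free capacity reaches zero.

    `samples` is a list of (day, free_capacity) pairs. Returns None when
    free capacity is not shrinking.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must span more than one point in time")
    # Least-squares slope: change in free capacity per day.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in samples) / sxx
    if slope >= 0:
        return None
    intercept = mean_y - slope * mean_x
    zero_day = -intercept / slope  # day on which the fitted line crosses zero
    return max(0.0, zero_day - samples[-1][0])


# Hypothetical daily storage.capacity.free readings, in GB.
readings = [(0, 1000), (1, 950), (2, 900), (3, 850)]
print(days_until_full(readings))
```

In this example free capacity falls by 50 GB per day, so about 17 days remain after the last reading.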

Security Monitoring

Track `performance.requests.error`:
- Spikes in error rates may indicate security issues, such as repeated failed authentication attempts
- Look for patterns in access logs, for example many errors from a single client IP or access key
Monitor `system.requests.active`:
- Unusual patterns may indicate DDoS attempts
- Compare against historical patterns for your applications

Next Steps

Now that you've set up monitoring and analytics for your NexStorage environment, consider exploring:

- Security Best Practices: Enhance your storage security
- Integration Guides: Connect with more tools and services
- Migration from AWS S3: Move existing data to NexStorage
