Rivellum Node Logging Guide

Overview

The Rivellum node implements structured logging for observability, debugging, and compliance. The logging system supports:

Structured log formats (JSON for production, pretty for development)
Dynamic log level adjustment via RPC endpoint (no restart required)
Automatic log rotation based on file size with configurable retention
Context propagation for distributed tracing (node_id, chain_id, batch_id, intent_id)
Security safeguards to protect sensitive data
Panic logging with full context and backtraces

Configuration

LogConfig Structure

[logging]
level = "info"                           # Log level: trace|debug|info|warn|error
format = "json"                           # Output format: json|pretty
file_path = "logs/rivellum-node.log"     # Log file path (optional)
max_file_size_mb = 100                   # Rotate when file exceeds this size
max_backup_files = 10                     # Number of rotated files to keep
retention_days = 30                       # Optional: Delete files older than N days
enable_console = true                     # Also log to stdout/stderr
debug_sensitive_logging = false           # DANGER: Log decrypted payloads (dev only)
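
For orientation, the [logging] table above could map onto a Rust struct along the following lines. This is a minimal sketch, not the node's actual definition: the field names, types, and defaults are taken from the Configuration Reference table in this guide, while the `validate` method and its rules are assumptions added for illustration.

```rust
/// Hypothetical mirror of the [logging] table; the node's real struct may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct LogConfig {
    pub level: String,
    pub format: String,
    pub file_path: Option<String>,
    pub max_file_size_mb: u64,
    pub max_backup_files: u32,
    pub retention_days: Option<u32>,
    pub enable_console: bool,
    pub debug_sensitive_logging: bool,
}

impl Default for LogConfig {
    /// Defaults as listed in the Configuration Reference table.
    fn default() -> Self {
        Self {
            level: "info".to_string(),
            format: "pretty".to_string(),
            file_path: None,
            max_file_size_mb: 50,
            max_backup_files: 5,
            retention_days: None,
            enable_console: true,
            debug_sensitive_logging: false,
        }
    }
}

impl LogConfig {
    /// Assumed validation: rejects values outside the sets this guide documents.
    pub fn validate(&self) -> Result<(), String> {
        const LEVELS: [&str; 5] = ["trace", "debug", "info", "warn", "error"];
        const FORMATS: [&str; 2] = ["json", "pretty"];
        if !LEVELS.contains(&self.level.as_str()) {
            return Err(format!("unknown log level: {}", self.level));
        }
        if !FORMATS.contains(&self.format.as_str()) {
            return Err(format!("unknown log format: {}", self.format));
        }
        if self.file_path.is_some() && self.max_file_size_mb == 0 {
            return Err("max_file_size_mb must be greater than 0".to_string());
        }
        Ok(())
    }
}
```

Calling `LogConfig::default().validate()` returns `Ok(())`, while a config with an unrecognized level such as "loud" is rejected before the logger is built.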

Environment Profiles

Development (Default)

[logging]
level = "debug"
format = "pretty"                         # Human-readable with colors
enable_console = true
# file_path omitted: no file logging in dev (TOML has no null value)
debug_sensitive_logging = false

Testnet

[logging]
level = "info"
format = "json"                           # Structured for log aggregation
file_path = "logs/testnet-node.log"
max_file_size_mb = 100
max_backup_files = 10
enable_console = true
debug_sensitive_logging = false

Mainnet (Production)

[logging]
level = "info"
format = "json"
file_path = "/var/log/rivellum/mainnet-node.log"
max_file_size_mb = 500                   # Larger files in production
max_backup_files = 30                     # 30 days of backups @ ~500MB/day
retention_days = 30                       # Automatic cleanup
enable_console = false                    # Console logs handled by systemd
debug_sensitive_logging = false           # NEVER enable in production

Environment Variable Overrides

Configuration can be overridden at runtime without modifying config files:

# Override log level
export RIVELLUM_LOG_LEVEL=debug

# Override log format
export RIVELLUM_LOG_FORMAT=json

# Override log file path
export RIVELLUM_LOG_FILE=/custom/path/node.log

# Start node (config file settings are overridden)
./target/release/rivellum-node server --config config/mainnet.toml
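
The precedence described above (environment variable wins over the config file) can be sketched as follows. The `LogSettings` struct and both function names are hypothetical; only the three `RIVELLUM_LOG_*` variable names come from this guide. The lookup is passed in as a closure so the logic can be exercised without touching the real process environment.

```rust
/// Hypothetical subset of the logging settings that the variables can override.
#[derive(Debug, Clone, PartialEq)]
pub struct LogSettings {
    pub level: String,
    pub format: String,
    pub file_path: Option<String>,
}

/// Applies RIVELLUM_LOG_* overrides on top of values loaded from the config file.
/// Any variable that `lookup` does not return leaves the config value untouched.
pub fn apply_env_overrides<F>(mut settings: LogSettings, lookup: F) -> LogSettings
where
    F: Fn(&str) -> Option<String>,
{
    if let Some(level) = lookup("RIVELLUM_LOG_LEVEL") {
        settings.level = level;
    }
    if let Some(format) = lookup("RIVELLUM_LOG_FORMAT") {
        settings.format = format;
    }
    if let Some(path) = lookup("RIVELLUM_LOG_FILE") {
        settings.file_path = Some(path);
    }
    settings
}

/// Convenience wrapper that reads the real process environment.
pub fn apply_process_env(settings: LogSettings) -> LogSettings {
    apply_env_overrides(settings, |name| std::env::var(name).ok())
}
```

With `RIVELLUM_LOG_LEVEL=debug` exported and nothing else set, only `level` changes; `format` and `file_path` keep their config-file values.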

Log Formats

JSON Format (Production)

Structured JSON lines optimized for log aggregation systems (Loki, ELK, Datadog):

{"timestamp":"2024-01-15T10:30:45.123Z","level":"INFO","target":"rivellum_node::rpc","message":"RPC server listening","fields":{"rpc_bind":"0.0.0.0:8080","node_id":"node-alpha","chain_id":"rivellum-mainnet","environment":"prod"}}
{"timestamp":"2024-01-15T10:30:46.234Z","level":"INFO","target":"rivellum_ledger","message":"Batch executed successfully","fields":{"batch_id":"batch-1234","height":42,"intent_count":150,"total_gas":1500000,"node_id":"node-alpha","chain_id":"rivellum-mainnet"}}
{"timestamp":"2024-01-15T10:30:47.345Z","level":"WARN","target":"rivellum_intents","message":"Intent validation failed","fields":{"intent_id":"0x789abc...","error":"InvalidNonce","expected_nonce":5,"got_nonce":3}}

Fields:

timestamp: RFC3339 format with millisecond precision
level: TRACE | DEBUG | INFO | WARN | ERROR
target: Rust module path (e.g., rivellum_node::rpc)
message: Human-readable message
fields: Structured context data (key-value pairs)
span: Hierarchical span information for tracing

Pretty Format (Development)

Human-readable colorized output for local development:

2024-01-15T10:30:45.123Z  INFO rivellum_node::rpc: RPC server listening
    rpc_bind: 0.0.0.0:8080
    node_id: node-alpha
    chain_id: rivellum-testnet
    
2024-01-15T10:30:46.234Z  INFO rivellum_ledger: Batch executed successfully
    batch_id: batch-1234
    height: 42
    intent_count: 150
    total_gas: 1500000
    
2024-01-15T10:30:47.345Z  WARN rivellum_intents: Intent validation failed
    intent_id: 0x789abc...
    error: InvalidNonce
    expected_nonce: 5
    got_nonce: 3

Features:

Color-coded log levels (ERROR=red, WARN=yellow, INFO=green, DEBUG=blue, TRACE=gray)
Indented field display for readability
File and line number annotations (in debug/trace)
Span hierarchy visualization

Log Rotation & Retention

Size-Based Rotation

When the current log file exceeds max_file_size_mb, it is automatically rotated:

logs/
├── rivellum-node.log        # Current active log
├── rivellum-node.log.1      # Most recent backup
├── rivellum-node.log.2      # Older backup
├── ...
└── rivellum-node.log.10     # Oldest backup (if max_backup_files=10)

Rotation Process:

Check file size before each write
If size > threshold:
- Close current file
- Rename file.log.N-1 → file.log.N, working from the highest N down so that no backup is overwritten
- Rename file.log → file.log.1
- Open new file.log
Delete file.log.{max_backup_files+1} if it exists
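
The rename order in the steps above matters: shifting must start from the oldest backup, otherwise a newer file overwrites an older one before it has been moved. The sketch below plans the operations as data rather than touching the filesystem. `RotationStep` and `plan_rotation` are hypothetical names, and the real writer may implement rotation differently.

```rust
/// One filesystem operation in a rotation pass.
#[derive(Debug, Clone, PartialEq)]
pub enum RotationStep {
    Rename { from: String, to: String },
    Delete(String),
}

/// Plans a rotation for `base` (e.g. "rivellum-node.log") that keeps at most
/// `max_backup_files` backups. An executor is expected to skip any step whose
/// source file does not exist.
pub fn plan_rotation(base: &str, max_backup_files: u32) -> Vec<RotationStep> {
    let mut steps = Vec::new();
    // One slot past the limit; whatever lands there is deleted at the end.
    let overflow = u64::from(max_backup_files) + 1;

    // Shift file.log.N-1 -> file.log.N, highest N first so nothing is overwritten.
    for n in (2..=overflow).rev() {
        steps.push(RotationStep::Rename {
            from: format!("{base}.{}", n - 1),
            to: format!("{base}.{n}"),
        });
    }
    // Move the active file into slot 1; the caller then opens a fresh `base`.
    steps.push(RotationStep::Rename {
        from: base.to_string(),
        to: format!("{base}.1"),
    });
    // Drop the backup that was pushed past the retention limit.
    steps.push(RotationStep::Delete(format!("{base}.{overflow}")));
    steps
}
```

For `max_backup_files = 2` the plan is: `.2 → .3`, `.1 → .2`, active file `→ .1`, then delete `.3`.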

Time-Based Retention (Optional)

If retention_days is set, old rotated files are deleted based on modification time:

retention_days = 30  # Delete files older than 30 days

Cleanup Schedule:

Runs during log rotation
Checks all *.log.* files in the directory
Deletes files whose age (current time minus mtime) exceeds retention_days
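
The age check behind the cleanup can be sketched as below. Both function names are hypothetical. Restricting the match to numeric suffixes is an assumption that is narrower than the `*.log.*` pattern stated above, and keeping files with a future mtime is a defensive choice, not documented behavior.

```rust
use std::time::{Duration, SystemTime};

/// Returns true when a file last modified at `modified` is older than
/// `retention_days` as of `now`.
pub fn is_expired(modified: SystemTime, now: SystemTime, retention_days: u32) -> bool {
    let max_age = Duration::from_secs(u64::from(retention_days) * 24 * 60 * 60);
    match now.duration_since(modified) {
        Ok(age) => age > max_age,
        // mtime lies in the future (clock skew): keep the file.
        Err(_) => false,
    }
}

/// Matches rotated backups such as "rivellum-node.log.3" but not the active
/// "rivellum-node.log". Assumes backups always carry a numeric suffix.
pub fn is_rotated_backup(file_name: &str) -> bool {
    match file_name.rsplit_once(".log.") {
        Some((stem, suffix)) => {
            !stem.is_empty()
                && !suffix.is_empty()
                && suffix.chars().all(|c| c.is_ascii_digit())
        }
        None => false,
    }
}
```

A cleanup pass would list the log directory, filter with `is_rotated_backup`, read each file's modification time via `std::fs::metadata(path)?.modified()?`, and remove the entries for which `is_expired` returns true.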

Disk Space Management

Estimate storage requirements:

Total disk space = max_file_size_mb * (max_backup_files + 1)

Example:

max_file_size_mb = 100
max_backup_files = 10
Total: ~1.1 GB (100MB × 11 files)

Production recommendations:

Mainnet: 500MB × 31 files (30 backups + active log) = ~15.5GB
Testnet: 100MB × 11 files (10 backups + active log) = ~1.1GB
Development: No file logging (console only)
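
The storage formula above is simple enough to check in code when sizing a log partition. The function name below is hypothetical; the arithmetic is the formula from this section, with the extra file accounting for the active log.

```rust
/// Upper bound on disk usage in MB: every backup plus the active file,
/// each assumed to be at the rotation threshold.
pub fn max_log_disk_usage_mb(max_file_size_mb: u64, max_backup_files: u32) -> u64 {
    max_file_size_mb.saturating_mul(u64::from(max_backup_files) + 1)
}
```

For the example settings above, `max_log_disk_usage_mb(100, 10)` evaluates to 1100, i.e. about 1.1 GB. Because the size check happens before each write, real usage can exceed this bound by a small margin.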

Dynamic Log Level Adjustment

Change log verbosity at runtime without restarting the node.

RPC Endpoint

POST /log/level

Request:

{
  "level": "debug"
}

Valid levels: trace, debug, info, warn, error

Response:

{
  "message": "Log level updated successfully",
  "old_level": "info",
  "new_level": "debug"
}
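
Server-side, the handler has to validate the requested level before swapping it in and reporting the old and new values. A minimal sketch of that step follows; the `Level` enum and `change_level` function are hypothetical, and accepting mixed-case input is an assumption. The actual handler most likely goes through the tracing filter instead of a plain enum.

```rust
use std::str::FromStr;

/// The five levels accepted by the endpoint, ordered from most to least verbose.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Trace,
    Debug,
    Info,
    Warn,
    Error,
}

impl Level {
    pub fn as_str(self) -> &'static str {
        match self {
            Level::Trace => "trace",
            Level::Debug => "debug",
            Level::Info => "info",
            Level::Warn => "warn",
            Level::Error => "error",
        }
    }
}

impl FromStr for Level {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Assumption: input is trimmed and matched case-insensitively.
        match s.trim().to_ascii_lowercase().as_str() {
            "trace" => Ok(Level::Trace),
            "debug" => Ok(Level::Debug),
            "info" => Ok(Level::Info),
            "warn" => Ok(Level::Warn),
            "error" => Ok(Level::Error),
            other => Err(format!("invalid log level: {other}")),
        }
    }
}

/// Validates `requested`, updates `current`, and returns (old_level, new_level)
/// for the response body. On invalid input `current` is left unchanged.
pub fn change_level(
    current: &mut Level,
    requested: &str,
) -> Result<(&'static str, &'static str), String> {
    let new_level: Level = requested.parse()?;
    let old_level = std::mem::replace(current, new_level);
    Ok((old_level.as_str(), new_level.as_str()))
}
```

Starting from `Level::Info`, a request for "debug" yields `("info", "debug")`, matching the `old_level` and `new_level` fields in the response above.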

Usage Examples

Using curl:

# Increase verbosity for debugging
curl -X POST http://localhost:8080/log/level \
  -H "Content-Type: application/json" \
  -d '{"level":"debug"}'

# Reduce verbosity after investigation
curl -X POST http://localhost:8080/log/level \
  -H "Content-Type: application/json" \
  -d '{"level":"info"}'

# Enable trace-level output (caution: very high volume)
curl -X POST http://localhost:8080/log/level \
  -H "Content-Type: application/json" \
  -d '{"level":"trace"}'

Using PowerShell:

$body = @{ level = "debug" } | ConvertTo-Json
Invoke-RestMethod -Method Post -Uri "http://localhost:8080/log/level" `
  -Body $body -ContentType "application/json"

Audit Logging

All log level changes are audited:

{"timestamp":"2024-01-15T10:35:00.000Z","level":"INFO","message":"Log level changed","fields":{"event_type":"audit","old_level":"info","new_level":"debug","changed_by":"admin"}}

Security Considerations

⚠️ Production Deployment:

Add authentication to the /log/level endpoint (not implemented by default)
Restrict access via firewall rules (e.g., allow only from monitoring network)
Use HTTPS/TLS for encrypted transport
Monitor for unauthorized level changes via audit logs

Context Fields & Distributed Tracing

Global Context

All logs include global context from NodeContext:

node_id: Unique identifier for this node instance
chain_id: Blockchain network identifier (mainnet, testnet, devnet)
environment: Deployment environment (prod, testnet, dev)

Structured Spans

Use tracing spans to add hierarchical context:

use tracing::{info, info_span};

// Top-level span for batch execution
let _span = info_span!("batch_execution",
    batch_id = %batch_id,
    height = height,
).entered();

info!("Starting batch execution");

for intent in &intents {  // borrow, so intents.len() is still usable after the loop
    // Nested span for individual intent
    let _intent_span = info_span!("intent_processing",
        intent_id = %intent.intent_id(),
        sender = %intent.sender,
    ).entered();
    
    info!("Executing intent");
    // ... execution logic ...
}

info!(intent_count = intents.len(), "Batch execution complete");

Resulting logs:

{"timestamp":"...","level":"INFO","message":"Starting batch execution","span":{"batch_id":"batch-1234","height":42},"fields":{"node_id":"node-alpha"}}
{"timestamp":"...","level":"INFO","message":"Executing intent","span":{"batch_id":"batch-1234","height":42,"intent_id":"0x789abc","sender":"0x123def"},"fields":{"node_id":"node-alpha"}}
{"timestamp":"...","level":"INFO","message":"Batch execution complete","span":{"batch_id":"batch-1234","height":42},"fields":{"intent_count":150,"node_id":"node-alpha"}}

Common Context Fields

Field	Description	Example
`node_id`	Node identifier	`"node-alpha"`
`chain_id`	Network identifier	`"rivellum-mainnet"`
`environment`	Deployment env	`"prod"`
`batch_id`	Execution batch ID	`"batch-1234"`
`height`	Ledger height	`42`
`intent_id`	Intent identifier	`"0x789abc..."`
`sender`	Intent sender address	`"0x123def..."`
`job_id`	PoUW job identifier	`"job-5678"`
`prover_id`	Prover node ID	`"prover-beta"`

Security & Privacy

Sensitive Data Protection

Encrypted intent payloads are NEVER logged in plaintext by default.

use tracing::{info, warn};

// ✅ SAFE: Only log metadata
info!(
    intent_id = %intent.intent_id(),
    sender = %intent.sender,
    payload_type = "encrypted",
    payload_size_bytes = intent.encrypted_payload.len(),
    "Intent received"
);

// ❌ DANGER: Never log raw payloads (unless debug_sensitive_logging=true)
if config.logging.debug_sensitive_logging {
    // Only in development with explicit flag
    warn!(
        intent_id = %intent.intent_id(),
        decrypted_payload = ?decrypted,
        "DEBUG: Decrypted intent payload (sensitive logging enabled)"
    );
}
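
One way to make the "metadata only" rule harder to break by accident is a wrapper type whose formatted output never contains the wrapped bytes. The `Redacted` type below is a hypothetical helper, not part of the node's codebase.

```rust
use std::fmt;

/// Wraps sensitive bytes so that formatting shows only their length.
pub struct Redacted<'a>(pub &'a [u8]);

impl fmt::Debug for Redacted<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Only the size is revealed; the contents never reach the formatter.
        write!(f, "[REDACTED {} bytes]", self.0.len())
    }
}

impl fmt::Display for Redacted<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Debug::fmt(self, f)
    }
}
```

Because the wrapper implements both `Debug` and `Display`, it can be passed to a tracing field with either sigil, for example `payload = ?Redacted(&intent.encrypted_payload)`, and the log line will carry the size but not the contents.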

debug_sensitive_logging Flag

⚠️ WARNING: Setting debug_sensitive_logging = true allows decrypted payloads to be logged.

When enabled:

Logs clear warnings on startup
Adds [SENSITIVE_LOGGING_ENABLED] prefix to all logs
Should NEVER be used in production
Useful for local debugging with test data

Startup warnings:

WARN rivellum_node::logging: ⚠️  SENSITIVE DATA LOGGING ENABLED ⚠️
    This mode may log decrypted intent payloads and private keys
    NEVER use in production or with real user data
    Set debug_sensitive_logging=false in config to disable

Audit Events

Security-relevant events are logged with event_type="audit":

{"timestamp":"...","level":"WARN","message":"Invalid signature detected","fields":{"event_type":"audit","intent_id":"0x789abc","sender":"0x123def","error":"SignatureVerificationFailed"}}
{"timestamp":"...","level":"INFO","message":"Log level changed","fields":{"event_type":"audit","old_level":"info","new_level":"debug"}}
{"timestamp":"...","level":"WARN","message":"Rate limit exceeded","fields":{"event_type":"audit","client_ip":"192.168.1.100","endpoint":"/submit-intent"}}

Panic Handling

Structured Panic Logs

When a panic occurs, full context is logged before termination:

{
  "timestamp": "2024-01-15T10:40:00.000Z",
  "level": "ERROR",
  "message": "PANIC",
  "fields": {
    "panic_message": "index out of bounds: the len is 5 but the index is 10",
    "panic_location": "src/execution.rs:234:15",
    "node_id": "node-alpha",
    "chain_id": "rivellum-mainnet",
    "environment": "prod",
    "thread": "tokio-runtime-worker-3",
    "backtrace": "... (full backtrace) ..."
  }
}

Backtrace capture:

Automatically enabled in release builds
Set RUST_BACKTRACE=1 for standard (abbreviated) backtraces
Set RUST_BACKTRACE=full for verbose backtraces that include every frame
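
A panic hook that gathers the fields shown above can be built on `std::panic::set_hook`. The sketch below is an assumption about how such a hook might look, using only the standard library and writing to stderr; the node presumably emits the same fields through its structured logger. `install_panic_hook` and `panic_message` are hypothetical names.

```rust
use std::any::Any;
use std::backtrace::Backtrace;
use std::panic;
use std::thread;

/// Extracts a readable message from a panic payload, which is usually a
/// `&'static str` or a `String`.
pub fn panic_message(payload: &(dyn Any + Send + 'static)) -> String {
    if let Some(s) = payload.downcast_ref::<&str>() {
        (*s).to_string()
    } else if let Some(s) = payload.downcast_ref::<String>() {
        s.clone()
    } else {
        "<non-string panic payload>".to_string()
    }
}

/// Installs a process-wide hook that reports panic context before unwinding.
pub fn install_panic_hook(node_id: String, chain_id: String) {
    panic::set_hook(Box::new(move |info| {
        let location = info
            .location()
            .map(|l| format!("{}:{}:{}", l.file(), l.line(), l.column()))
            .unwrap_or_else(|| "<unknown>".to_string());
        let current = thread::current();
        let thread_name = current.name().unwrap_or("<unnamed>");
        // Backtrace::capture honors RUST_BACKTRACE; it reports a disabled
        // backtrace when the variable is unset.
        let backtrace = Backtrace::capture();
        eprintln!(
            "PANIC panic_message={:?} panic_location={} node_id={} chain_id={} thread={}\nbacktrace:\n{}",
            panic_message(info.payload()),
            location,
            node_id,
            chain_id,
            thread_name,
            backtrace
        );
    }));
}
```

The hook should be installed once, early in startup and after the node identity is known, so that panics during initialization are also captured.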

Integration with Log Aggregation

Grafana Loki

Promtail configuration:

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: rivellum-node
    static_configs:
      - targets:
          - localhost
        labels:
          job: rivellum-node
          environment: prod
          __path__: /var/log/rivellum/*.log

    pipeline_stages:
      - json:
          expressions:
            level: level
            timestamp: timestamp
            target: target
            node_id: fields.node_id
            chain_id: fields.chain_id
      - labels:
          level:
          target:
          node_id:
          chain_id:
      - timestamp:
          source: timestamp
          format: RFC3339

LogQL queries:

# All errors from a specific node
{job="rivellum-node",level="ERROR",node_id="node-alpha"}

# Intent processing errors
{job="rivellum-node",target="rivellum_intents"} |= "error"

# Batch execution timeline (the json parser flattens nested keys with "_")
{job="rivellum-node"} | json | fields_batch_id="batch-1234"

# Security audit events
{job="rivellum-node"} | json | fields_event_type="audit"

Elasticsearch (ELK Stack)

Filebeat configuration:

filebeat.inputs:
  - type: log
    paths:
      - /var/log/rivellum/*.log
    json.keys_under_root: true
    json.add_error_key: true
    fields:
      service: rivellum-node
      environment: prod

output.elasticsearch:
  hosts: ["elasticsearch:9200"]
  index: "rivellum-logs-%{+yyyy.MM.dd}"

setup.ilm.enabled: true
setup.ilm.rollover_alias: "rivellum-logs"
setup.ilm.pattern: "{now/d}-000001"

Kibana queries:

# Errors in the last hour
level:ERROR AND @timestamp:[now-1h TO now]

# Specific batch execution
fields.batch_id:"batch-1234"

# High gas usage
fields.total_gas:>1000000 AND level:INFO

Datadog

Datadog Agent configuration:

logs:
  - type: file
    path: /var/log/rivellum/*.log
    service: rivellum-node
    source: rust
    sourcecategory: blockchain
    tags:
      - env:prod
      - chain:mainnet

Performance Considerations

Log Sampling (Future Feature)

For very high-volume logs (e.g., per-intent execution traces), implement sampling:

use tracing::trace;

if should_sample(intent_id, 0.1) {  // 10% sample rate
    trace!(intent_id = %intent_id, "Detailed execution trace");
}
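
Since sampling is listed as a future feature, `should_sample` does not exist yet. One possible shape is a deterministic, hash-based decision, so that all trace lines for a given intent are either kept or dropped together. The sketch below assumes the intent ID is available as a string and uses the standard library's `DefaultHasher`, whose algorithm is not guaranteed to stay the same across Rust releases.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Returns true for roughly `rate` (0.0 to 1.0) of all intent IDs.
/// The decision depends only on the ID, so repeated calls agree.
pub fn should_sample(intent_id: &str, rate: f64) -> bool {
    // Written this way so that NaN is also treated as "never sample".
    if !(rate > 0.0) {
        return false;
    }
    if rate >= 1.0 {
        return true;
    }
    let mut hasher = DefaultHasher::new();
    intent_id.hash(&mut hasher);
    // Map the hash onto 10,000 buckets in [0, 1) and compare with the rate.
    let bucket = (hasher.finish() % 10_000) as f64 / 10_000.0;
    bucket < rate
}
```

A random per-call decision would also hit the target rate, but it would scatter the surviving lines across intents and make individual traces incomplete.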

Asynchronous Logging

The logging system uses buffered writes to minimize performance impact:

File writes: Buffered via BufWriter (8KB buffer)
Rotation: Occurs synchronously but infrequently (only when size threshold hit)
JSON serialization: Optimized with serde_json streaming

Benchmarks (approximate):

JSON log write: ~5-10 µs per log line
Pretty log write: ~10-20 µs per log line
Log rotation: ~100-500 ms (depends on file size)

Production Tips

Tune log level by subsystem:

RUST_LOG=info,rivellum_execution=debug,rivellum_network=trace

Use JSON in production (pretty format has higher overhead)
Monitor disk I/O - High log volume can impact disk performance
Separate log directories - Use dedicated disk/partition for logs in high-volume deployments

Troubleshooting

Log Files Not Created

Symptom: No log file appears at configured path

Solutions:

Check directory permissions:

ls -ld /var/log/rivellum
# Should be writable by node user

Check enable_console is true to see startup errors:
```
enable_console = true  # See errors on console
```
Check for initialization errors in systemd logs:
```
journalctl -u rivellum-node -n 50
```

Log Rotation Not Working

Symptom: Log file grows beyond max_file_size_mb

Causes:

File size check happens before writes, so file may slightly exceed threshold
Rotation happens per-instance; if multiple processes share same log file, rotation may conflict

Solution:

Ensure only one node process writes to each log file
Use different file_path for each node instance

High Disk Usage

Symptom: Logs consuming excessive disk space

Solutions:

Reduce max_backup_files:

max_backup_files = 5  # Keep only 5 old files

Enable retention_days:

retention_days = 7  # Auto-delete files older than 7 days

Reduce log level:

curl -X POST http://localhost:8080/log/level -H "Content-Type: application/json" -d '{"level":"warn"}'

Missing Context Fields

Symptom: Logs don't include expected node_id, batch_id, etc.

Solution:

Ensure spans are entered:

let _span = info_span!("batch_execution", batch_id = %id).entered();

Check NodeContext is initialized correctly
Verify logging initialization includes context

Performance Degradation

Symptom: Node slows down with verbose logging

Solutions:

Reduce log level:

curl -X POST http://localhost:8080/log/level -H "Content-Type: application/json" -d '{"level":"info"}'

Disable file logging temporarily:
```
# Remove or comment out file_path to disable file logging (TOML has no null value)
enable_console = true
```
Use separate disk for logs (avoid contention with database)

Testing Logging Configuration

Manual Test

Start node with test config:

cargo run --bin rivellum-node -- server --config config/testnet.toml

Check log file created:
```
ls -lh logs/
```

Change log level dynamically:

curl -X POST http://localhost:8080/log/level -H "Content-Type: application/json" -d '{"level":"debug"}'

Watch logs in real-time:
```
tail -f logs/testnet-node.log | jq .
```

Trigger log rotation (fill file):

# Submit many intents to generate logs
for i in {1..10000}; do
  curl -X POST http://localhost:8080/submit-intent -d "@test-intent.json"
done

# Check rotated files
ls -lh logs/

Automated Tests

Run unit tests for logging components:

# Test log rotation
cargo test -p rivellum-node test_rotating_writer

# Test log configuration
cargo test -p rivellum-node test_log_config

# All logging tests
cargo test -p rivellum-node --lib -- logging::

Example Configurations

Local Development

[logging]
level = "debug"
format = "pretty"
enable_console = true
# file_path omitted: console only
debug_sensitive_logging = false

# Start node
cargo run --bin rivellum-node -- sandbox

CI/CD Pipeline

[logging]
level = "debug"
format = "json"
file_path = "logs/ci-test.log"
enable_console = true
max_file_size_mb = 10
max_backup_files = 2
debug_sensitive_logging = false

# Run tests with JSON logs
cargo test --all -- --nocapture 2>&1 | tee test-output.log

Docker Container

FROM rust:1.75 as builder
# ... build steps ...

FROM debian:bookworm-slim
RUN useradd --system --user-group appuser \
    && mkdir -p /var/log/rivellum && chown appuser:appuser /var/log/rivellum
COPY config/prod.toml /etc/rivellum/config.toml
USER appuser
CMD ["rivellum-node", "server", "--config", "/etc/rivellum/config.toml"]

# prod.toml
[logging]
level = "info"
format = "json"
file_path = "/var/log/rivellum/node.log"
enable_console = true  # Docker captures stdout
max_file_size_mb = 100
max_backup_files = 5

# View logs
docker logs rivellum-node-container --follow | jq .

# Change log level at runtime
docker exec rivellum-node-container \
  curl -X POST http://localhost:8080/log/level \
    -H "Content-Type: application/json" -d '{"level":"debug"}'

Kubernetes Deployment

apiVersion: v1
kind: ConfigMap
metadata:
  name: rivellum-config
data:
  config.toml: |
    [logging]
    level = "info"
    format = "json"
    enable_console = true
    # file_path omitted: use stdout for k8s logging
    
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rivellum-node
spec:
  template:
    spec:
      containers:
        - name: node
          image: rivellum/node:latest
          volumeMounts:
            - name: config
              mountPath: /etc/rivellum
          env:
            - name: RUST_LOG
              value: "info"
      volumes:
        - name: config
          configMap:
            name: rivellum-config

# View logs via kubectl
kubectl logs -f deployment/rivellum-node | jq .

# Change log level
kubectl exec deployment/rivellum-node -- \
  curl -X POST http://localhost:8080/log/level \
    -H "Content-Type: application/json" -d '{"level":"debug"}'

Reference

Log Levels

Level	Use Case	Volume	Examples
ERROR	System failures, unrecoverable errors	Very Low	Panic, database corruption, network failure
WARN	Degraded operation, recoverable errors	Low	Invalid intent, retry limit exceeded, config issue
INFO	Normal operation milestones	Medium	Batch committed, RPC server started, intent executed
DEBUG	Detailed operation info	High	Intent validation details, state transitions
TRACE	Very verbose execution traces	Very High	Per-instruction execution, network packets

Configuration Reference

Field	Type	Default	Description
`level`	String	`"info"`	Log level filter
`format`	String	`"pretty"`	Output format: `json` or `pretty`
`file_path`	Option<String>	`null`	Log file path (null = no file logging)
`max_file_size_mb`	u64	`50`	File size threshold for rotation (MB)
`max_backup_files`	u32	`5`	Number of rotated files to keep
`retention_days`	Option<u32>	`null`	Delete files older than N days
`enable_console`	bool	`true`	Log to stdout/stderr
`debug_sensitive_logging`	bool	`false`	DANGER: Log decrypted payloads

Environment Variables

Variable	Description	Example
`RIVELLUM_LOG_LEVEL`	Override configured log level	`debug`
`RIVELLUM_LOG_FORMAT`	Override configured log format	`json`
`RIVELLUM_LOG_FILE`	Override configured log file path	`/custom/path.log`
`RUST_LOG`	Standard Rust logging env var (overrides all)	`info,rivellum_node=debug`
`RUST_BACKTRACE`	Enable panic backtraces	`1` or `full`
