Structured Logging: From console.log to Production-Ready Observability
It was 3 AM and our checkout service was dropping 15% of transactions. I opened CloudWatch, searched for "payment error," and got back 47,000 results — a wall of unstructured text that told me nothing useful. Half were console.log("payment error", error) with no context about which user, which order, which payment provider, or which server instance. The other half were stack traces that had been truncated by the log collector.
We spent 90 minutes correlating timestamps across five services before finding the root cause: a third-party payment provider had changed their API response format, and our parser was silently failing. Ninety minutes of downtime because our logs were console.log statements scattered across the codebase like confetti.
The next week, I migrated our entire logging stack to structured JSON logs with correlation IDs. The following incident took 4 minutes to diagnose — because I could filter by service=payment AND level=error AND correlation_id=abc123 and see the exact sequence of events across all services.
Structured logging isn't glamorous. Nobody writes blog posts about switching from console.log to pino. But it's the single highest-ROI investment you can make in operational readiness. According to Splunk's 2024 State of Observability Report, organizations with mature logging practices resolve incidents 69% faster than those without.
What Makes Logging "Structured"?
Unstructured logging is human-readable text. Structured logging is machine-parseable data (typically JSON) with consistent fields. Here's the difference:
// Unstructured (bad for production)
console.log(`[${new Date().toISOString()}] ERROR: Payment failed for user ${userId} - ${error.message}`);
// Output: [2026-03-20T14:23:45.123Z] ERROR: Payment failed for user usr_abc - timeout
// Structured (good for production)
logger.error({
message: 'Payment processing failed',
userId: 'usr_abc',
orderId: 'ord_xyz',
paymentProvider: 'stripe',
errorCode: 'TIMEOUT',
errorMessage: error.message,
durationMs: 5023,
correlationId: req.correlationId,
service: 'payment-service',
environment: 'production'
});
// Output: {"level":"error","message":"Payment processing failed","userId":"usr_abc","orderId":"ord_xyz","paymentProvider":"stripe","errorCode":"TIMEOUT","durationMs":5023,...,"timestamp":"2026-03-20T14:23:45.123Z"}
The structured version is harder to read in a terminal — but it's infinitely easier to search, filter, aggregate, and alert on. You can write a query like "show me all errors from the payment service in the last hour where the payment provider is Stripe and duration exceeded 5 seconds" and get precise results in milliseconds.
The Five Properties of Production-Ready Logs
| Property | Description | Example |
|---|---|---|
| Structured | Machine-parseable format (JSON) | {"level":"error","message":"..."} |
| Contextual | Includes who, what, where, when | userId, orderId, service, environment |
| Correlated | Traceable across services | correlationId / traceId shared across services |
| Leveled | Severity classification | debug, info, warn, error, fatal |
| Sampled | Cost-controlled at high volume | Log 100% of errors, 10% of info |
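The sampling row deserves a concrete sketch. A minimal level-aware sampler might look like the following — a hypothetical helper, not part of pino or any library, with illustrative rates:

```javascript
// Minimal log sampler: always keep errors and warnings, sample debug.
// Rates are illustrative, not a recommendation.
const SAMPLE_RATES = {
  fatal: 1.0,
  error: 1.0,
  warn: 1.0,
  info: 1.0,
  debug: 0.1, // keep roughly 10% of debug lines
};

// `rand` is injectable so the decision is testable; defaults to Math.random.
function shouldLog(level, rand = Math.random) {
  const rate = SAMPLE_RATES[level] ?? 1.0;
  return rand() < rate;
}

// Usage: gate noisy calls before they hit the logger
// if (shouldLog('debug')) logger.debug({ cacheKey }, 'Cache miss');
```

The key design point: sampling decisions happen per line, before serialization, so dropped lines cost almost nothing.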
Choosing a Logging Library: Pino vs Winston vs Bunyan
In the Node.js ecosystem, three libraries dominate production logging:
| Feature | Pino | Winston | Bunyan |
|---|---|---|---|
| Performance (ops/sec) | ~180,000 | ~25,000 | ~35,000 |
| Output format | JSON (native) | JSON or text | JSON (native) |
| Child loggers | Yes (fast) | Yes | Yes |
| Transport plugins | Separate thread | In-process | Streams |
| Ecosystem | Growing | Largest | Stable (minimal updates) |
My recommendation: Pino. It's 5-7x faster than Winston in published benchmarks, produces structured JSON by default, and runs transports (log shipping) in a separate thread so logging never blocks your application's event loop.
Setting Up Pino: The Complete Configuration
// logger.js — Production-ready Pino setup
const pino = require('pino');
const logger = pino({
level: process.env.LOG_LEVEL || 'info',
// Base context added to every log line
base: {
service: process.env.SERVICE_NAME || 'api',
environment: process.env.NODE_ENV || 'development',
version: process.env.APP_VERSION || 'unknown',
hostname: require('os').hostname()
},
// Timestamp in ISO format
timestamp: pino.stdTimeFunctions.isoTime,
// Redact sensitive fields
redact: {
paths: ['req.headers.authorization', 'req.headers.cookie', 'password', 'token', 'creditCard'],
censor: '[REDACTED]'
},
// Serializers for common objects
serializers: {
err: pino.stdSerializers.err,
req: pino.stdSerializers.req,
res: pino.stdSerializers.res
},
// Pretty print in development only
transport: process.env.NODE_ENV === 'development'
? { target: 'pino-pretty', options: { colorize: true } }
: undefined
});
module.exports = logger;
Request-Scoped Logging with Child Loggers
// middleware/requestLogger.js
const { randomUUID } = require('crypto');
const logger = require('./logger');
const requestLogger = (req, res, next) => {
// Generate or extract correlation ID
const correlationId = req.headers['x-correlation-id'] || randomUUID();
req.correlationId = correlationId;
// Create a child logger with request context
req.log = logger.child({
correlationId,
requestId: randomUUID(),
method: req.method,
path: req.path,
userAgent: req.headers['user-agent'],
ip: req.ip
});
// Log request start
const startTime = Date.now();
req.log.info('Request started');
// Log request end
res.on('finish', () => {
req.log.info({
statusCode: res.statusCode,
durationMs: Date.now() - startTime,
contentLength: res.get('content-length')
}, 'Request completed');
});
// Pass correlation ID downstream (for microservices)
res.setHeader('x-correlation-id', correlationId);
next();
};
// Usage in route handlers
app.get('/api/orders/:id', async (req, res) => {
req.log.info({ orderId: req.params.id }, 'Fetching order');
try {
const order = await OrderService.findById(req.params.id);
req.log.info({ orderId: order.id, status: order.status }, 'Order found');
res.json(order);
} catch (err) {
req.log.error({ orderId: req.params.id, err }, 'Failed to fetch order');
res.status(500).json({ error: 'Internal server error' });
}
});
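Under the hood, a child logger is conceptually just context merging: the child carries its parent's bindings plus its own, and every line combines both with call-site fields. A toy illustration (not pino's actual implementation, which precomputes the serialized prefix for speed):

```javascript
// Toy child-logger sketch: bindings accumulate down the chain, and each
// log line merges base context + per-call fields into one JSON object.
function createLogger(bindings = {}, write = console.log) {
  return {
    child(extra) {
      return createLogger({ ...bindings, ...extra }, write);
    },
    info(fields, message) {
      write(JSON.stringify({
        level: 'info',
        ...bindings,
        ...fields,
        message,
        timestamp: new Date().toISOString(),
      }));
    },
  };
}

// Usage: a request-scoped child inherits service-level context
const base = createLogger({ service: 'payment-service' });
const reqLog = base.child({ correlationId: 'abc-123' });
reqLog.info({ orderId: 'ord_xyz' }, 'Order found');
```

This is why child loggers are cheap to create per request: no copying of log history, just a small merged context object.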
Log Levels: When to Use What
Log levels are not arbitrary. Each level has a specific purpose, and misusing them makes your logs useless:
| Level | When to Use | Example | Alert? |
|---|---|---|---|
| fatal | Application is crashing | Uncaught exception, out of memory | Page immediately |
| error | Operation failed, needs attention | Payment processing failed, DB connection lost | Alert within minutes |
| warn | Unexpected but recovered | Retry succeeded, deprecated API used, rate limit approaching | Dashboard / weekly review |
| info | Normal business operations | User logged in, order created, job completed | No |
| debug | Development / troubleshooting | SQL query, cache hit/miss, function arguments | No (off in prod) |
| trace | Granular debugging | Entry/exit of functions, loop iterations | No (off in prod) |
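Levels form an ordered threshold: setting LOG_LEVEL=info suppresses everything below it. A sketch of the numeric comparison most loggers perform (the values shown are pino's level numbers):

```javascript
// Numeric level values as used by pino: trace lowest, fatal highest.
const LEVELS = { trace: 10, debug: 20, info: 30, warn: 40, error: 50, fatal: 60 };

// A line is emitted only if its level meets the configured threshold.
function isEnabled(lineLevel, configuredLevel = 'info') {
  return LEVELS[lineLevel] >= LEVELS[configuredLevel];
}
```

With the default of info: isEnabled('debug') is false, isEnabled('warn') is true. This is why flipping LOG_LEVEL=debug in an incident instantly surfaces the detail you pre-wrote but normally suppress.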
The Log Pipeline: From Application to Dashboard
Your application produces logs. But logs are useless if they're sitting in a file on a server that nobody reads. A production log pipeline typically looks like this:
Application → stdout (JSON) → Log Collector → Log Aggregator → Dashboard/Alerts
Concrete stack examples:
1. Cloud-native:
App → stdout → CloudWatch/GCP Logging → Dashboard + Alerts
2. ELK Stack:
App → stdout → Filebeat → Logstash → Elasticsearch → Kibana
3. Modern lightweight:
App → stdout → Vector/Fluent Bit → Grafana Loki → Grafana
4. Managed:
App → stdout → Datadog Agent → Datadog → Dashboards + Alerts
Tool Comparison: Where to Send Your Logs
| Platform | Type | Cost (100 GB/mo) | Query Speed | Best For |
|---|---|---|---|---|
| ELK (Elasticsearch) | Self-hosted | $100-500 (infra) | Fast | Full control, complex queries |
| Grafana Loki | Self-hosted/Cloud | $50-200 | Good (label-based) | Cost-effective, Grafana users |
| Datadog | Managed | $1,000-3,000 | Very fast | Enterprise, full observability |
| AWS CloudWatch | Managed | $50-300 | Moderate | AWS-native workloads |
| Axiom | Managed | $25-200 | Fast | Startups, cost-sensitive |
Correlation IDs: Tracing Requests Across Services
In a microservices architecture, a single user request might touch 5-10 services. Without correlation IDs, connecting logs across services requires manual timestamp matching — which is slow, error-prone, and often impossible.
// Correlation ID propagation pattern
// Service A: Generates or extracts the correlation ID
const correlationId = req.headers['x-correlation-id'] || randomUUID();
// Service A calls Service B with the correlation ID
// Note: the method must be set explicitly — fetch rejects a GET request that has a body
const response = await fetch('http://service-b/api/orders', {
method: 'POST',
headers: {
'x-correlation-id': correlationId,
'Content-Type': 'application/json'
},
body: JSON.stringify(orderData)
});
// Service B extracts and propagates the correlation ID
app.use((req, res, next) => {
req.correlationId = req.headers['x-correlation-id'] || randomUUID();
req.log = logger.child({ correlationId: req.correlationId });
next();
});
// Now you can search all logs across all services with:
// correlationId = "abc-123-def"
// and see the complete request lifecycle
For full distributed tracing (not just log correlation), consider OpenTelemetry — it provides standardized trace context propagation, spans, and metrics alongside logs.
My Opinionated Logging Rules
1. Never use console.log in production code. console.log is synchronous, unstructured, and missing context. Replace every instance with a proper logger. This is not negotiable.
2. Log outcomes, not implementations. Bad: logger.info("Calling Stripe API"). Good: logger.info({ orderId, amount, provider: 'stripe' }, "Payment initiated"). Log what happened and why it matters, not what function you're about to call.
3. Never log sensitive data. Use Pino's redact option to automatically censor passwords, tokens, credit card numbers, and PII from logs. The OWASP Logging Cheat Sheet lists the data you should never log.
4. Every error log must include context to reproduce the issue. The error message alone is useless. Include: what operation was being performed, who triggered it, what inputs were provided, and what the system state was.
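To make rule 4 mechanical rather than aspirational, you can centralize error-context construction in a small helper. This is a hypothetical utility — adapt the field names to your own log schema:

```javascript
// Build a consistent error payload: the operation, the actor, the inputs,
// and the error itself. Field names here are illustrative, not a standard.
function errorContext(err, { operation, userId, input } = {}) {
  return {
    operation,            // what was being attempted
    userId,               // who triggered it
    input,                // the inputs involved (redact sensitive fields first!)
    errName: err.name,
    errMessage: err.message,
    stack: err.stack,
  };
}

// Usage in a handler:
// req.log.error(
//   errorContext(err, { operation: 'fetchOrder', userId, input: { orderId } }),
//   'Failed to fetch order'
// );
```

A shared helper like this also makes it easy to enforce the schema in code review: an error log without errorContext() stands out immediately.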
5. Logging is not free — budget for it. At scale, log storage costs more than compute. At BirJob, I log 100% of errors and warnings, 100% of request start/end, but only 10% of debug-level events. This keeps costs manageable while maintaining visibility.
Action Plan: Production Logging in 2 Weeks
Week 1: Foundation
- Replace all console.log calls with Pino (or your language's equivalent)
- Configure base context (service, environment, version)
- Add request-scoped child loggers with correlation IDs
- Enable sensitive data redaction
- Establish log level guidelines for the team
Week 2: Pipeline and Observability
- Set up log collection (Fluent Bit, Vector, or cloud-native)
- Deploy a log aggregation platform (Loki, ELK, or managed)
- Create dashboards for error rates, response times, and service health
- Set up alerts for error rate spikes and specific error patterns
- Document logging standards and share with the team
Sources and Further Reading
- Pino — Fast Node.js Logger
- Pino Benchmarks vs Winston, Bunyan
- Splunk — 2024 State of Observability Report
- OpenTelemetry — Distributed Tracing Standard
- Grafana Loki — Log Aggregation
- OWASP — Logging Cheat Sheet
- 12-Factor App — Logs as Event Streams
I'm Ismat, and I build BirJob — Azerbaijan's job aggregator scraping 80+ sources daily.
