Logging Standards
An opinionated, production-grade standard for application logging: what a log record must contain, how it is serialised, how it is emitted, and how it reaches an observability backend. It is deliberately language-agnostic - the rules apply equally to PowerShell, C#, Python, Go, TypeScript/Node, Bash, and Java - and grounded in the OpenTelemetry logs data model so that the same records ship to OpenTelemetry, Azure Monitor / Application Insights, or AWS CloudWatch without rework.
Scope: Application and automation logs - the diagnostic event stream a service or script emits about its own behaviour. It does not cover audit logs (immutable, compliance-retained, separate pipeline) or metrics and traces, except where logs must correlate with them. Examples use the canonical field names from the OpenTelemetry log data model.
Grounding: OpenTelemetry logs data model · OTel semantic conventions · RFC 5424 syslog severities · Azure Monitor Logs Ingestion API.
Why a logging standard?
Logs are the highest-cardinality, most ad-hoc signal a system produces, and the one engineers reach for first during an incident. Left ungoverned, every service invents its own format, timezone, level vocabulary, and field names - and the aggregator becomes a pile of opaque strings you can only grep. A standard turns logs into queryable, correlatable, machine-parseable events:
- An on-call engineer can filter by level, service.name, and trace_id across every service, not just the one they wrote.
- A single log line can be joined to the distributed trace and the metric spike that accompanied it.
- Backends index fields instead of re-parsing free text, so queries are fast and cheap.
- Alerting rules are stable because the level vocabulary and field names never change.
- Switching or adding a backend (OTel Collector, Azure Monitor, CloudWatch) is a transport change, not a rewrite, because the record shape is backend-neutral.
The cost of getting this wrong compounds: unstructured logs are re-parsed at ingestion (expensive and brittle), ambiguous timestamps make correlation impossible, and a leaked secret in a log line is a security incident with an indefinite blast radius.
Core principles
These are non-negotiable. Everything else in this document elaborates on them.
- Structured, not stringly-typed. Every log record is a structured object (a map of fields), serialised as JSON. Never log a pre-formatted human sentence as the only artifact.
- One JSON object per line. Emit newline-delimited JSON (NDJSON / jsonl) to a dedicated log stream - stdout for long-running services, stderr for CLIs and tools whose stdout carries real output. One event = one line = one parse.
- Machines first, humans second. The default format is machine-parseable JSON. A human-readable coloured format is a developer-experience opt-in for interactive terminals, never the production default.
- UTC, ISO-8601, always. Timestamps are RFC 3339 / ISO-8601 with millisecond (or finer) precision, in UTC, with an explicit offset.
- A fixed level vocabulary. Use the canonical levels below and nothing else. Levels map deterministically to OpenTelemetry SeverityNumber.
- Correlate by default. When a trace context is active, every record carries trace_id and span_id.
- Logs are not a data store and not a vault. Never log secrets or unmasked PII. Never use logs as the system of record for business data.
- Configure at the edge, log everywhere. Logging level, format, and destination are decided once at the process entry point (from environment/config), never hard-coded inside library functions.
- Logging must never break the program. Emitting a log - including at ERROR - is side-effect-only. It must not throw, must not corrupt the function’s return value, and must not block on the network in the hot path.
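The last two principles can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference logger: the names emit and min_level_from_env and the LOG_LEVEL environment variable are hypothetical, chosen only for this example.

```python
import json
import os
import sys
from datetime import datetime, timezone

# Severity numbers follow the level table in this document.
LEVELS = {"TRACE": 1, "DEBUG": 5, "INFO": 9, "WARN": 13, "ERROR": 17, "FATAL": 21}


def min_level_from_env(environ=None) -> str:
    """Read the floor once, at the process entry point (LOG_LEVEL is a hypothetical name)."""
    env = os.environ if environ is None else environ
    level = str(env.get("LOG_LEVEL", "INFO")).upper()
    return level if level in LEVELS else "INFO"


def emit(level: str, message: str, min_level: str = "INFO", stream=None, **fields) -> None:
    """Write one NDJSON line; swallow every failure so logging cannot break the caller."""
    try:
        if LEVELS.get(level, 9) < LEVELS.get(min_level, 9):
            return
        now = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
        record = {"timestamp": now.replace("+00:00", "Z"), "level": level,
                  "message": message, **fields}
        target = sys.stderr if stream is None else stream
        # default=str keeps values that are not JSON-serialisable from raising.
        target.write(json.dumps(record, default=str) + "\n")
    except Exception:
        pass  # side-effect-only: a logging failure is never propagated
```

The entry point would call min_level_from_env() once and pass the result down; library code only calls emit and never decides the level, format, or destination itself.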
The structured log record
A log record is a flat (or shallowly-nested) JSON object. The following fields are standardised. Names follow OpenTelemetry semantic conventions where one exists so records map cleanly to the OTel data model and to backend schemas.
| Field | Required | Type | Description |
|---|---|---|---|
| timestamp | Yes | string | Event time, RFC 3339 / ISO-8601, UTC, ≥ millisecond precision (e.g. YYYY-MM-DDTHH:mm:ss.sssZ). Maps to OTel Timestamp. |
| level | Yes | string | Canonical severity text: TRACE, DEBUG, INFO, WARN, ERROR, FATAL. Maps to OTel SeverityText. |
| severity_number | Recommended | int | OTel SeverityNumber (see mapping). Lets backends sort/filter by severity without parsing text. |
| message | Yes | string | Human-readable event description. The constant part; variable data goes in fields, not interpolated here. Maps to OTel Body. |
| service.name | Yes | string | Logical service or component emitting the record (e.g. payments-api, LibreDevOpsHelpers). Maps to OTel resource attribute. |
| service.version | Recommended | string | Build/semver of the emitter. Essential for “started after the 2.3.1 deploy” queries. |
| deployment.environment | Recommended | string | prod, staging, dev. |
| trace_id | When in trace | string | W3C trace id (32 hex chars). Joins the log to its distributed trace. Maps to OTel TraceId. |
| span_id | When in trace | string | W3C span id (16 hex chars). Maps to OTel SpanId. |
| host.name | Recommended | string | Machine/container/pod identity. |
| error.type | On error | string | Exception class / error category. |
| error.stack | On error | string | Stack trace, when logging a caught exception. |
| (attributes) | Optional | any | Any number of additional, domain-specific structured fields (e.g. order_id, resource_group, duration_ms). Map to OTel log Attributes. |
A canonical record:
{"timestamp":"YYYY-MM-DDTHH:mm:ss.sssZ","level":"ERROR","severity_number":17,"message":"Failed to create resource group","service.name":"LibreDevOpsHelpers","service.version":"2.1.0","deployment.environment":"prod","trace_id":"4bf92f3577b34da6a3ce929d0e0e4736","span_id":"00f067aa0ba902b7","resource_group":"rg-prod-uks","error.type":"Azure.RequestFailedException"}
Rule: message is the constant description of an event; identifying and variable data are separate fields. Log {"message":"order rejected","order_id":4412,"reason":"insufficient_funds"}, never {"message":"order 4412 rejected: insufficient funds"}. Constant messages are groupable, alertable, and translatable; interpolated ones are not.
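A short Python sketch of why the constant message matters when records are queried. The second and third sample lines (order ids 4413 and 4414, the card_expired reason) are invented for illustration only.

```python
import json
from collections import Counter

lines = [
    '{"level":"WARN","message":"order rejected","order_id":4412,"reason":"insufficient_funds"}',
    '{"level":"WARN","message":"order rejected","order_id":4413,"reason":"card_expired"}',
    '{"level":"WARN","message":"order 4414 rejected: insufficient funds"}',
]
records = [json.loads(line) for line in lines]

# Constant messages collapse into one group; the interpolated one forms its own.
by_message = Counter(r["message"] for r in records)

# Variable data is a typed field, so filtering needs no regular expression.
insufficient = [r["order_id"] for r in records if r.get("reason") == "insufficient_funds"]
```

Here by_message counts "order rejected" twice, while the interpolated line cannot be grouped with it or filtered by reason without parsing its text.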
Field naming
- Use a single, consistent convention per organisation. This standard uses OTel-style dotted namespaces for resource/semantic fields (service.name, host.name) and snake_case for domain attributes (order_id, duration_ms). Pick one and enforce it; do not mix userId, user_id, and user.id across services.
- Never reuse a field name for two different types (a field that is sometimes a string and sometimes an object breaks backend indexing).
- Reserve and never overload the standardised names above.
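These naming rules can be checked mechanically, for example in a CI lint over sample records. The sketch below is one possible check, assuming lowercase dotted namespaces and snake_case are the only accepted forms; the function names are hypothetical.

```python
import re

# Lowercase dotted namespaces (service.name) or snake_case (order_id).
FIELD_NAME = re.compile(r"^[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*)*$")


def naming_violations(record: dict) -> list:
    """Return the field names that break the naming convention, sorted."""
    return sorted(key for key in record if not FIELD_NAME.match(key))


def type_conflicts(records: list) -> dict:
    """Return fields that appear with more than one JSON type across records."""
    seen = {}
    for record in records:
        for key, value in record.items():
            seen.setdefault(key, set()).add(type(value).__name__)
    return {key: types for key, types in seen.items() if len(types) > 1}
```

naming_violations flags names such as userId, and type_conflicts reports a field that is a string in one record and an object in another.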
Why JSON, and why one object per line
Why structured JSON at all. Free-text logs must be re-parsed at ingestion with fragile regular expressions (Grok patterns), which break the moment a developer rewords a message. JSON moves parsing to the producer, where the schema is known: fields arrive typed, named, and ready to index. Every modern backend - the OTel Collector, Loki, Elasticsearch, Splunk, Azure Monitor, CloudWatch Logs - ingests JSON natively and exposes its fields as queryable columns.
Why one object per line (NDJSON). A log stream is unbounded and read incrementally. Newline-delimited JSON lets a collector tail the stream and parse each event independently: a newline is an unambiguous record boundary, no streaming JSON parser or array-close needed, and a single corrupt line is skipped without poisoning the rest. Pretty-printed (multi-line, indented) JSON breaks this - a log shipper reading line-by-line sees each line as a separate, invalid record. Indented JSON is a local-debugging affordance only and must never be the production output format.
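The record-boundary argument can be demonstrated with a small line-by-line reader. This is a sketch of how a consumer might behave, not any particular collector's implementation; read_ndjson is a hypothetical name.

```python
import io
import json


def read_ndjson(stream):
    """Yield one parsed record per line; skip lines that are not a JSON object."""
    for line in stream:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # one corrupt line does not poison the rest
        if isinstance(record, dict):
            yield record


stream = io.StringIO(
    '{"level":"INFO","message":"started"}\n'
    '{"level":"ERROR","message":"trunc\n'  # corrupt: cut off mid-record
    '{"level":"INFO","message":"stopped"}\n'
)
messages = [record["message"] for record in read_ndjson(stream)]
```

The truncated middle line is dropped and the two intact records survive. Feeding the same reader indented JSON yields nothing, because no single line of a pretty-printed object parses as a record.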
Why stdout. In the twelve-factor model a process writes its event stream, unbuffered, to stdout and stays oblivious to routing. The execution environment - the container runtime, systemd, the platform - captures the stream and the collector ships it. This decouples the application from the destination: the same binary logs to a file in dev, to the Docker json-file driver in CI, and to a DaemonSet collector in Kubernetes, with zero code change.
Rule: Diagnostics never pollute the data path. In shells and pipelines, the success/stdout channel may carry a function’s actual return value; route log lines to a dedicated stream (stderr, or a language’s information/log stream) so logging never corrupts a caller that captures output. The reference implementation below does exactly this.
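A minimal Python sketch of the same separation for a command-line tool whose stdout is its data path. The function names and the two resource group names are placeholders invented for the example.

```python
import json
import sys


def log(level: str, message: str, **fields) -> None:
    """Diagnostics go to stderr so they never mix with the tool's real output."""
    print(json.dumps({"level": level, "message": message, **fields}), file=sys.stderr)


def list_resource_groups() -> list:
    log("INFO", "listing resource groups", count=2)
    return ["rg-dev", "rg-prod"]


def main() -> None:
    # Only the result reaches stdout, so piping or capturing the output stays parseable.
    print(json.dumps(list_resource_groups()))
```

A caller that captures stdout from main() receives only the JSON array; the log line travels separately on stderr.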
Timestamps
YYYY-MM-DDTHH:mm:ss.sssZ ✅ ISO-8601 / RFC 3339, UTC, millisecond precision
YYYY-MM-DDTHH:mm:ss.sss+00:00 ✅ explicit zero offset is equally valid
YYYY-MM-DD HH:mm:ss ❌ no timezone - ambiguous and unjoinable
DD/MM/YYYY h:mm A ❌ locale-dependent, unsortable, lossy
1750516328 ❌ bare epoch - human-opaque, precision unclear
- UTC, always. Local time with DST shifts makes logs from different regions impossible to order and creates a duplicated/missing hour twice a year. Convert to local at display time only.
- ISO-8601 / RFC 3339 with an explicit offset (Z or +00:00). It is lexicographically sortable (string sort = chronological sort, provided every record uses the same offset and precision), unambiguous, and parsed natively by every backend.
- Millisecond precision minimum. Microsecond/nanosecond where available. Sub-second ordering matters when correlating events inside a single request.
- Stamp at the moment of the event, not at flush or ingestion time. The producer owns the timestamp; the backend’s receive-time is a separate, less trustworthy field.
Rule: One timestamp field, named consistently, in UTC ISO-8601. Backends expect this: Azure Monitor’s TimeGenerated accepts ISO-8601 directly, CloudWatch stores its event timestamp as epoch milliseconds (agents and collectors convert from RFC 3339), and OTel’s Timestamp is a UTC instant.
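The timestamp rules above can be sketched with the Python standard library alone. The function name rfc3339_utc and the sample date are illustrative choices, not part of the standard.

```python
from datetime import datetime, timedelta, timezone


def rfc3339_utc(moment: datetime) -> str:
    """Format an aware datetime as UTC RFC 3339 with millisecond precision."""
    if moment.tzinfo is None:
        raise ValueError("naive datetime: the timezone is ambiguous")
    utc = moment.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


# The same instant expressed in two zones normalises to one string.
utc_noon = datetime(2024, 3, 1, 12, 0, 0, 500000, tzinfo=timezone.utc)
same_instant_plus9 = utc_noon.astimezone(timezone(timedelta(hours=9)))
```

Both values format to 2024-03-01T12:00:00.500Z, and because every string shares the offset and precision, sorting the strings sorts the events.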
Log levels
Use exactly these six. Each corresponds to one OpenTelemetry SeverityNumber range (the table lists the first number of each range), and each of the eight RFC 5424 syslog severities can be mapped onto one of them, so records are portable across every backend.
| Level | OTel SeverityNumber | Meaning | Use for |
|---|---|---|---|
| TRACE | 1 | Finest-grained, per-step flow | Loop iterations, deep protocol detail. Off outside targeted debugging. |
| DEBUG | 5 | Developer diagnostics | Variable values, branch decisions. Off in production by default. |
| INFO | 9 | Normal, noteworthy events | Service started, request handled, deployment completed. The default production floor. |
| WARN | 13 | Recoverable / degraded | Retry succeeded, fell back to secondary, deprecated path hit. Nothing failed yet. |
| ERROR | 17 | Operation failed | A request, job, or transaction did not complete. Needs attention; service still up. |
| FATAL | 21 | Process is going down | Unrecoverable; the process is about to exit. Pages someone. |
Guidance:
- INFO is the production floor. Run production at INFO; raise to DEBUG/TRACE transiently and surgically when investigating. The minimum level is operator-controlled via environment/config (see the reference implementation), never a code change.
- WARN is not “minor error”. It means handled and recovered. If someone must act, it is ERROR. If nothing is wrong, it is INFO. An always-firing WARN is noise that trains responders to ignore the channel.
- ERROR means an operation failed, not “an exception was caught”. A caught-and-recovered exception is WARN or INFO. Reserve ERROR for work that did not complete.
- Semantic aliases collapse to a canonical level. A friendly outcome level such as SUCCESS is emitted at INFO severity (SeverityNumber 9) - it is presentation sugar, not a new severity. The reference implementation does precisely this.
- One level scale, everywhere. Do not invent NOTICE, VERBOSE, CRITICAL, SEVERE per service. If you need syslog’s NOTICE/CRITICAL, map them onto this scale at the boundary.
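A small Python sketch of the level scale, alias collapsing, and the threshold check. SUCCESS to INFO comes from this document, and WARNING/CRITICAL follow the stdlib mapping used in the Python reference implementation; NOTICE to INFO is an assumption made for the example, so choose whatever boundary mapping suits your sources.

```python
SEVERITY_NUMBER = {"TRACE": 1, "DEBUG": 5, "INFO": 9, "WARN": 13, "ERROR": 17, "FATAL": 21}

# Boundary aliases: foreign level names collapse onto the six canonical levels.
ALIASES = {"SUCCESS": "INFO", "NOTICE": "INFO", "WARNING": "WARN", "CRITICAL": "FATAL"}


def canonical_level(name: str) -> str:
    """Normalise a level name; reject anything outside the fixed vocabulary."""
    upper = name.strip().upper()
    level = ALIASES.get(upper, upper)
    if level not in SEVERITY_NUMBER:
        raise ValueError(f"unknown level: {name!r}")
    return level


def is_enabled(level: str, floor: str = "INFO") -> bool:
    """True when a record at `level` passes the operator-controlled floor."""
    return SEVERITY_NUMBER[canonical_level(level)] >= SEVERITY_NUMBER[canonical_level(floor)]
```

Rejecting unknown names at the boundary, rather than passing them through, is what keeps alerting rules stable.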
Correlation: tying logs to traces
A log line is far more valuable when it can be joined to the distributed trace it occurred within. The OpenTelemetry context propagation model makes this automatic if you cooperate:
- When a span is active, read the current trace context and stamp trace_id and span_id onto every record. Backends (Azure Monitor, CloudWatch, Grafana/Tempo, Jaeger) use these to render “show me the logs for this trace” with no extra work.
- Propagate the W3C traceparent header across service boundaries so the trace_id is stable end-to-end.
- Where logs and traces share a backend (Application Insights, an OTel-native stack), correlation is a built-in pivot; where they do not, trace_id is still the join key in your query language.
Rule: If your runtime has an ambient span (most do, via OTel SDKs or .NET Activity.Current), capture trace_id/span_id in the logging function once, centrally. Do not ask every call site to pass them.
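Where no OTel SDK is present, the "capture once, centrally" idea can still be approximated with the standard library. The sketch below parses an incoming W3C traceparent header at the request boundary and keeps the ids in a context variable; enter_request and render are hypothetical names. Note the assumption: the header's second id identifies the caller's span, so with an SDK you would read the active span instead.

```python
import contextvars
import json
import re
from typing import Optional

# version-traceid-parentid-flags, lowercase hex (W3C Trace Context).
_TRACEPARENT = re.compile(r"^[0-9a-f]{2}-([0-9a-f]{32})-([0-9a-f]{16})-[0-9a-f]{2}$")
_trace_ctx = contextvars.ContextVar("trace_ctx", default={})


def enter_request(traceparent: Optional[str]) -> None:
    """Call once at the request boundary; call sites never pass ids around."""
    match = _TRACEPARENT.match(traceparent or "")
    ids = {"trace_id": match.group(1), "span_id": match.group(2)} if match else {}
    _trace_ctx.set(ids)


def render(level: str, message: str, **fields) -> str:
    """Serialise a record, adding correlation ids only when a context is active."""
    return json.dumps({"level": level, "message": message, **_trace_ctx.get(), **fields})
```

After enter_request runs, every render call in that context carries trace_id and span_id without the call site mentioning them; a missing or malformed header simply produces uncorrelated records.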
What never goes in a log
Logs fan out to aggregators, dashboards, cold storage, and third-party SaaS, and are retained for months. Treat every record as potentially world-readable and permanent.
- Never log secrets - passwords, tokens, API keys, connection strings, private keys, SecureString/credential plaintext. Mask at the call site (****, last-4, or a hash). The log backend is not a vault.
- Never log unmasked PII / sensitive data - full card numbers, government IDs, health data, full auth headers. Redact, tokenise, or hash to satisfy GDPR/PCI-DSS/HIPAA. Prefer logging a stable opaque user id over an email address.
- Never build JSON by string concatenation. Always serialise via a real JSON encoder so values are escaped - hand-built JSON breaks on a quote or newline in user input and is a log-injection vector.
- Cap field sizes. Truncate large payloads (request bodies, blobs). Unbounded fields blow up ingestion cost and can DoS the pipeline.
Rule: Redaction is the producer’s job, applied before serialisation. Do not rely on the backend to scrub - by the time data reaches it, it has already crossed the wire and may be cached or indexed.
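A producer-side redaction step might look like the following Python sketch. The key list and the 256-character cap are illustrative assumptions, not recommended values, and key-name matching is only one of several possible redaction strategies.

```python
import json

SENSITIVE_KEYS = {"password", "token", "api_key", "authorization", "connection_string"}
MAX_FIELD_CHARS = 256  # illustrative cap
TRUNCATION_MARK = "...[truncated]"


def redact(fields: dict) -> dict:
    """Mask sensitive values and truncate oversized strings before serialisation."""
    clean = {}
    for key, value in fields.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = "****"
        elif isinstance(value, dict):
            clean[key] = redact(value)
        elif isinstance(value, str) and len(value) > MAX_FIELD_CHARS:
            clean[key] = value[:MAX_FIELD_CHARS] + TRUNCATION_MARK
        else:
            clean[key] = value
    return clean


def serialise(record: dict) -> str:
    # A real encoder escapes quotes and newlines, so user input cannot forge a second record.
    return json.dumps(redact(record))
```

Because redaction runs before json.dumps, the secret never exists in the serialised line, and an embedded newline in user input is escaped rather than splitting the record in two.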
Exporting: getting logs to a backend
Because the record shape is backend-neutral, the application’s job ends at “emit NDJSON to stdout.” A collector owns transport, batching, retries, and backend-specific formatting. This keeps the app free of vendor SDKs in the log path and lets you change or fan out destinations centrally.
OpenTelemetry (the neutral target)
Model logs on the OTel log data model and they ship anywhere. Two common patterns:
- Collector tails stdout (recommended for most services). The application writes NDJSON; the OpenTelemetry Collector’s filelog receiver tails it, a json parser lifts your fields into the record, and an operator maps level → SeverityNumber. From there an OTLP (or vendor) exporter forwards to any backend. The app needs no OTel SDK for logs.
# otel-collector - tail NDJSON, map our fields onto the OTel log model
receivers:
  filelog:
    include: [ /var/log/app/*.jsonl ]
    operators:
      - type: json_parser
        timestamp: { parse_from: attributes.timestamp, layout_type: gotime, layout: "2006-01-02T15:04:05.999Z07:00" }
        severity: { parse_from: attributes.level } # TRACE/DEBUG/INFO/WARN/ERROR/FATAL → SeverityNumber
exporters:
  otlp:
    endpoint: collector.observability:4317
service:
  pipelines:
    logs: { receivers: [filelog], exporters: [otlp] }
- SDK emits OTLP directly. A service already instrumented with an OTel SDK can use the logs SDK/bridge to export LogRecords over OTLP, populating Timestamp, SeverityNumber/SeverityText, Body, Attributes, Resource, and TraceId/SpanId from the same fields above. Prefer this only where you already run the SDK for traces; otherwise the Collector-tails-stdout pattern is simpler and decoupled.
Rule: Keep the application backend-agnostic. Emit the standard record; let the Collector translate. Do not scatter Azure.Monitor / CloudWatch SDK calls through business code.
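To make the field mapping concrete, the Python sketch below splits a canonical record into the top-level fields of the OTel log data model. It is a plain-dict illustration of the mapping only, not the OTLP wire format and not an SDK call; the function name is hypothetical, and treating these four keys as resource attributes is an assumption based on the OTel resource semantic conventions.

```python
RESOURCE_KEYS = {"service.name", "service.version", "deployment.environment", "host.name"}
TOP_LEVEL_KEYS = {"timestamp", "level", "severity_number", "message", "trace_id", "span_id"}


def to_otel_shape(record: dict) -> dict:
    """Split a canonical record into the OTel log data model's top-level fields."""
    reserved = RESOURCE_KEYS | TOP_LEVEL_KEYS
    shaped = {
        "Timestamp": record["timestamp"],
        "SeverityText": record["level"],
        "SeverityNumber": record.get("severity_number"),
        "Body": record["message"],
        "Resource": {k: v for k, v in record.items() if k in RESOURCE_KEYS},
        "Attributes": {k: v for k, v in record.items() if k not in reserved},
    }
    if "trace_id" in record:
        shaped["TraceId"] = record["trace_id"]
        shaped["SpanId"] = record.get("span_id")
    return shaped
```

Everything that is not a standardised field lands in Attributes, which is why domain fields need no per-backend handling.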
Azure Monitor / Application Insights
- Custom structured logs → Logs Ingestion API. Ship records to a custom table (*_CL) in a Log Analytics workspace via a Data Collection Endpoint (DCE) and Data Collection Rule (DCR). This is the modern, supported route (it supersedes the deprecated HTTP Data Collector API). The destination table requires a TimeGenerated column - map your timestamp to it. Authenticate with managed identity / workload identity holding Monitoring Metrics Publisher on the DCR; never a workspace shared key.
- Correlated traces + logs → Application Insights. App Insights does not accept raw OTLP on a public endpoint; use the Azure Monitor OpenTelemetry distro/exporter (or the Collector’s azuremonitor exporter) so trace_id/span_id light up the end-to-end transaction view. Configure via APPLICATIONINSIGHTS_CONNECTION_STRING from config; never paste an instrumentation key into source.
- Because your records already carry ISO-8601 UTC timestamps, canonical levels, and trace_id/span_id, the mapping into KQL (TimeGenerated, SeverityLevel, OperationId) is mechanical.
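The timestamp → TimeGenerated mapping is a pure renaming step, sketched here in Python. It makes no API call. The function names are hypothetical, and replacing dots with underscores assumes the custom table's column names allow only letters, digits, and underscores; the same reshaping could equally live in the DCR's transformation.

```python
import json


def to_log_analytics_row(record: dict) -> dict:
    """Rename timestamp to TimeGenerated and make the remaining keys column-safe."""
    row = {"TimeGenerated": record["timestamp"]}
    for key, value in record.items():
        if key != "timestamp":
            # Assumed column naming: service.name -> service_name
            row[key.replace(".", "_")] = value
    return row


def to_request_body(records: list) -> str:
    """Serialise rows as one JSON array, the body shape the Logs Ingestion API accepts."""
    return json.dumps([to_log_analytics_row(record) for record in records])
```

Keeping this at the edge (a shipper or collector) means the application still emits only the canonical record.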
AWS CloudWatch
- CloudWatch Logs auto-discovers JSON fields. Emit the same NDJSON; CloudWatch parses each line and exposes your fields to Logs Insights queries (fields @timestamp, level, message | filter level = "ERROR"). No format change.
- Collector route: the AWS Distro for OpenTelemetry (ADOT) Collector receives your logs/traces and exports to CloudWatch Logs and X-Ray, keeping the application vendor-neutral - the same NDJSON that feeds Azure feeds AWS.
- For metrics derived from logs, the CloudWatch Embedded Metric Format (EMF) is a JSON superset; the base record stays standard and EMF metadata is added at the edge when needed.
Rule: The application emits one record shape. Azure, AWS, and OTel are destinations selected by the collector/exporter, not formats the application knows about. This is what makes the standard portable.
Reference implementations
Every implementation below emits the same canonical NDJSON record - identical field names, the same level vocabulary, UTC ISO-8601 timestamps, and trace correlation. What differs is idiomatic plumbing, not the contract: a log line from the Go service and a log line from the Bash script land in the backend with the same shape and answer the same query. Each example uses the de-facto structured logger for its ecosystem rather than a bespoke one - the discipline is in configuring the standard logger correctly, not writing your own.
| Language | Recommended logger | Output stream |
|---|---|---|
| PowerShell | LibreDevOpsHelpers Write-LdoLog (or PSFramework) | information / error streams |
| Bash | jq-based helper (below) | stderr |
| Python | structlog or stdlib logging + JSON formatter | stderr |
| C# / .NET | Microsoft.Extensions.Logging + OpenTelemetry (or Serilog CLEF) | OTLP / stdout |
| Go | stdlib log/slog JSON handler | stderr |
| TypeScript / Node | pino | stdout |
stdout or stderr? Long-running services write their event stream to stdout (twelve-factor); CLIs and tools whose stdout carries real output write logs to stderr so diagnostics never collide with program output. Container runtimes and orchestrators capture both streams identically, so the backend sees no difference - the choice is purely about not corrupting a tool’s data path. The examples reflect each tool’s typical role.
PowerShell - Write-LdoLog (canonical worked example)
The LibreDevOpsHelpers module is a real-world implementation of this standard. Its Write-LdoLog is the single logging entry point for every function in the module. It is worth reading as a concrete, production application of every rule above - in a language (PowerShell) that has no built-in structured logger.
How it satisfies the standard:
| Standard | How Write-LdoLog implements it |
|---|---|
| Structured JSON, one object per line | Default format is compact JSON via ConvertTo-Json -Compress - one NDJSON object per call. |
| Machines first, humans opt-in | Json is the default; Text (coloured) and JsonIndented are explicit opt-ins for local debugging only. |
| UTC ISO-8601 timestamps | (Get-Date).ToUniversalTime().ToString('o') - round-trip o format is RFC 3339 UTC. |
| Fixed level vocabulary | DEBUG, INFO, SUCCESS, WARN, ERROR, validated by [ValidateSet]. |
| Semantic alias collapses to canonical | SUCCESS is gated at the same threshold as INFO and routed to the information stream - presentation sugar over INFO severity. |
| Configure at the edge | Level and format seed from LDO_LOG_LEVEL / LDO_LOG_FORMAT env vars, with Set-LdoLogLevel / Set-LdoLogFormat overrides - no code change to retune. |
| Logging never corrupts output | Each level routes to a non-success stream (Write-Debug/Write-Information/Write-Warning/Write-Error); nothing touches stdout/stream 1. |
| Logging never throws | ERROR is emitted with -ErrorAction Continue (explicitly non-terminating) even under $ErrorActionPreference = 'Stop'; the caller decides whether to throw. |
| Extensible attributes | -Data merges arbitrary structured properties (correlation ids, resource names) into the JSON record. |
Set-StrictMode -Version Latest
# Module-scoped minimum level. Messages below this are suppressed. DEBUG also
# respects $DebugPreference, so it stays hidden unless the caller opts in.
$script:LdoLogLevels = @{ DEBUG = 0; INFO = 1; SUCCESS = 1; WARN = 2; ERROR = 3 }
# Minimum level and output format. Both can be seeded from the environment so that
# operators can control logging in CI/CD without touching code, and both fall back to
# sensible defaults (show everything; structured JSON) when unset or invalid.
$script:LdoMinLogLevel = if ($env:LDO_LOG_LEVEL -and $script:LdoLogLevels.ContainsKey($env:LDO_LOG_LEVEL.ToUpperInvariant())) {
    $env:LDO_LOG_LEVEL.ToUpperInvariant()
}
else {
    'DEBUG'
}
$script:LdoLogFormat = switch -Regex ($env:LDO_LOG_FORMAT) {
    '^(?i)jsonindented$' { 'JsonIndented'; break }
    '^(?i)text$' { 'Text'; break }
    default { 'Json' } # covers 'json', unset, and any unrecognised value
}
function Write-LdoLog {
    <#
    .SYNOPSIS
        Writes a levelled, timestamped log message to the correct PowerShell stream.
    .DESCRIPTION
        The single logging entry point for all LibreDevOpsHelpers modules. By default
        each message is rendered as one compact JSON object (newline-delimited JSON)
        carrying a UTC ISO-8601 timestamp, level, invocation and message, plus any extra
        properties supplied via -Data. Each level is routed to a stream that never
        touches the success (output) pipeline, so the function is safe to call from
        inside other functions without corrupting their return values.
    #>
    [CmdletBinding()]
    [OutputType([void])]
    param(
        [Parameter(Mandatory)][ValidateSet('DEBUG', 'INFO', 'SUCCESS', 'WARN', 'ERROR')]
        [string]$Level,
        [Parameter(Mandatory)][AllowEmptyString()]
        [string]$Message,
        [string]$InvocationName,
        [hashtable]$Data,
        [ValidateSet('Json', 'JsonIndented', 'Text')]
        [string]$Format
    )
    # Default the invocation name to the immediate caller so records are attributable.
    if (-not $InvocationName) {
        $caller = (Get-PSCallStack)[1]
        $InvocationName = if ($caller -and $caller.Command) { $caller.Command } else { '<script>' }
    }
    # Honour the module-scoped minimum level - drop anything below the threshold.
    if ($script:LdoLogLevels[$Level] -lt $script:LdoLogLevels[$script:LdoMinLogLevel]) {
        return
    }
    if (-not $Format) { $Format = $script:LdoLogFormat }
    $now = Get-Date
    if ($Format -eq 'Text') {
        $timestamp = $now.ToString('yyyy-MM-dd HH:mm:ss')
        $line = '{0} [{1}] [{2}] {3}' -f $timestamp, $Level, $InvocationName, $Message
    }
    else {
        # ISO-8601 in UTC ("o" round-trip format) so downstream log systems can parse
        # an unambiguous, timezone-correct timestamp.
        $record = [ordered]@{
            timestamp  = $now.ToUniversalTime().ToString('o')
            level      = $Level
            invocation = $InvocationName
            message    = $Message
        }
        if ($Data) {
            foreach ($key in $Data.Keys) { $record[[string]$key] = $Data[$key] }
        }
        # Compact (one object per line) is the default for log ingestion. JsonIndented is an
        # opt-in for local debugging and is deliberately not newline-delimited.
        if ($Format -eq 'JsonIndented') {
            $line = $record | ConvertTo-Json -Depth 10
        }
        else {
            $line = $record | ConvertTo-Json -Depth 10 -Compress
        }
    }
    # Route every level to a stream that never touches stdout (stream 1), so logging
    # cannot corrupt the return value of a function that called us.
    # Write-LdoInfoLine is a helper that is not shown in this excerpt.
    switch ($Level) {
        'DEBUG' { Write-Debug $line }
        'INFO' { Write-LdoInfoLine -Line $line -Level $Level -Format $Format -Color Cyan }
        'SUCCESS' { Write-LdoInfoLine -Line $line -Level $Level -Format $Format -Color Green }
        'WARN' { Write-Warning $line }
        # Explicitly non-terminating: logging an error must never throw on its own,
        # even when the caller has $ErrorActionPreference = 'Stop'. The caller decides.
        'ERROR' { Write-Error $line -ErrorAction Continue }
    }
}
Usage - the same call works in interactive (Text) and CI (Json) contexts; only LDO_LOG_FORMAT changes:
Write-LdoLog -Level INFO -Message 'Starting deployment'
Write-LdoLog -Level INFO -Message 'Created resource group' -Data @{ resourceGroup = 'rg-prod'; correlationId = $cid }
# Keep the message constant; the exception text travels as a field (used inside a catch block).
Write-LdoLog -Level ERROR -Message 'Deployment failed' -Data @{ errorMessage = $_.Exception.Message; exitCode = $LASTEXITCODE }
A Json-mode record from the second call is ready for any backend:
{"timestamp":"YYYY-MM-DDTHH:mm:ss.sssZ","level":"INFO","invocation":"New-LdoResourceGroup","message":"Created resource group","resourceGroup":"rg-prod","correlationId":"e2c4..."}
Bridging to the canonical schema: the module keeps field names lean (invocation rather than service.name) for ergonomics. When shipping to a shared backend, add service.name / service.version / trace_id either via -Data at the entry point or with a Collector transform step that renames invocation and injects resource attributes - the application stays simple and the standard schema is satisfied at the edge.
Bash
Bash has no JSON type, so the one rule that matters is: never hand-build JSON - delegate encoding to jq so every value is escaped correctly. The helper seeds its level and service identity from the environment, drops anything below the threshold, and folds trailing key value pairs into the record. It writes to stderr because a shell script’s stdout usually carries real output.
#!/usr/bin/env bash
set -euo pipefail
# Canonical level -> OTel SeverityNumber. Used for filtering and the severity_number field.
# Associative arrays need Bash 4 or newer.
declare -A _LDO_LEVELS=( [TRACE]=1 [DEBUG]=5 [INFO]=9 [WARN]=13 [ERROR]=17 [FATAL]=21 )
# Operator-controlled, seeded from the environment - no code change to retune.
LDO_LOG_LEVEL="${LDO_LOG_LEVEL:-INFO}"
LDO_SERVICE_NAME="${LDO_SERVICE_NAME:-$(basename "${0}")}"
# Portable UTC ISO-8601 timestamp. %3N (millisecond precision) is GNU-only; BSD/macOS
# date leaves a literal "N" instead of expanding it. Prefer gdate, then detect whether
# the system date expanded %3N, and degrade to second precision when it did not.
_ldo_iso8601_utc_now() {
    if command -v gdate >/dev/null 2>&1; then
        gdate -u +%Y-%m-%dT%H:%M:%S.%3NZ
        return
    fi
    local ts; ts=$(date -u +%Y-%m-%dT%H:%M:%S.%3NZ)
    case $ts in
        *N*) date -u +%Y-%m-%dT%H:%M:%SZ ;; # BSD/macOS - no millisecond support
        *) printf '%s\n' "$ts" ;;           # GNU - %3N expanded to digits
    esac
}
# log LEVEL MESSAGE [key value]... -> one compact JSON object on stderr.
log() {
    local level="${1:?level required}"; local message="${2:?message required}"; shift 2
    # Drop anything below the configured threshold. Unknown levels are treated as INFO (9).
    (( ${_LDO_LEVELS[$level]:-9} < ${_LDO_LEVELS[$LDO_LOG_LEVEL]:-9} )) && return 0
    # jq encodes every value safely - never hand-build JSON in shell.
    # The :-9 default keeps an unknown level from tripping `set -u`, and `|| true`
    # keeps a jq failure from aborting the script under `set -e`.
    jq -cn \
        --arg ts "$(_ldo_iso8601_utc_now)" \
        --arg lvl "$level" \
        --arg sev "${_LDO_LEVELS[$level]:-9}" \
        --arg msg "$message" \
        --arg svc "$LDO_SERVICE_NAME" \
        --args '
        {
            timestamp: $ts,
            level: $lvl,
            severity_number: ($sev | tonumber),
            message: $msg,
            "service.name": $svc
        }
        + reduce range(0; ($ARGS.positional | length); 2) as $i
            ({}; .[$ARGS.positional[$i]] = $ARGS.positional[$i + 1])
        ' "$@" >&2 || true
}
# Usage - trailing pairs become structured fields.
log INFO "Starting deployment"
log INFO "Created resource group" resourceGroup rg-prod correlationId "${CID:-}"
log ERROR "terraform apply failed" exitCode "$?"
{"timestamp":"YYYY-MM-DDTHH:mm:ss.sssZ","level":"INFO","severity_number":9,"message":"Created resource group","service.name":"deploy.sh","resourceGroup":"rg-prod","correlationId":"e2c4..."}
Python
Use structlog in new services; where you must stay on the standard library, attach a JSON formatter to the root logger. Configure it once at the entry point - library modules just call logging.getLogger(__name__) and inherit it. The formatter stamps UTC ISO-8601, maps stdlib level names onto the canonical vocabulary (WARNING->WARN, CRITICAL->FATAL), and pulls trace_id/span_id from the active OpenTelemetry span automatically.
import datetime as dt
import json
import logging
import sys
from opentelemetry import trace
_SEVERITY = {"DEBUG": 5, "INFO": 9, "WARNING": 13, "ERROR": 17, "CRITICAL": 21}
_CANONICAL = {"WARNING": "WARN", "CRITICAL": "FATAL"}  # stdlib name -> canonical
class JsonFormatter(logging.Formatter):
    """Render each LogRecord as one canonical NDJSON object."""
    def __init__(self, service: str, version: str = "") -> None:
        super().__init__()
        self._base = {"service.name": service}
        if version:
            self._base["service.version"] = version
    def format(self, record: logging.LogRecord) -> str:
        ts = dt.datetime.fromtimestamp(record.created, dt.timezone.utc)
        out = {
            "timestamp": ts.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
            "level": _CANONICAL.get(record.levelname, record.levelname),
            "severity_number": _SEVERITY.get(record.levelname, 9),
            "message": record.getMessage(),
            **self._base,
        }
        span = trace.get_current_span().get_span_context()
        if span.is_valid:  # SpanContext.is_valid is a @property - no parentheses
            out["trace_id"] = format(span.trace_id, "032x")
            out["span_id"] = format(span.span_id, "016x")
        out.update(getattr(record, "attributes", {}))  # structured fields via extra=
        # exc_info=True outside an except block yields (None, None, None); guard against it
        # so the formatter itself never raises.
        if record.exc_info and record.exc_info[0] is not None:
            out["error.type"] = record.exc_info[0].__name__
            out["error.stack"] = self.formatException(record.exc_info)
        return json.dumps(out, default=str)
def configure_logging(service: str, level: str = "INFO", version: str = "") -> None:
    """Call once, at the process entry point - never inside library code."""
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(JsonFormatter(service, version))
    root = logging.getLogger()
    root.handlers[:] = [handler]
    root.setLevel(level)
# Usage
configure_logging("payments-api", level="INFO")
log = logging.getLogger(__name__)
log.info("Created resource group", extra={"attributes": {"resourceGroup": "rg-prod"}})
try:
    raise RuntimeError("terraform exited with code 1")  # stand-in for the real failure
except RuntimeError:
    # exc_info=True only carries a stack trace when called while handling an exception.
    log.error("apply failed", exc_info=True, extra={"attributes": {"exitCode": 1}})
C# / .NET
The most correct .NET route is Microsoft.Extensions.Logging with the OpenTelemetry logger provider: it maps every ILogger record onto the OTel log data model (Timestamp, SeverityNumber/SeverityText, Body, Attributes, TraceId/SpanId) and exports over OTLP - no hand-rolled JSON, and trace correlation is automatic from Activity.Current. Message-template placeholders become attributes; they are never string-interpolated into the message.
using System;
using Microsoft.Extensions.Logging;
using OpenTelemetry.Logs;
using OpenTelemetry.Resources;
// Configure once at startup. TraceId/SpanId attach automatically from Activity.Current.
using var loggerFactory = LoggerFactory.Create(builder =>
{
    builder.AddOpenTelemetry(o =>
    {
        o.SetResourceBuilder(ResourceBuilder.CreateDefault()
            .AddService(serviceName: "payments-api", serviceVersion: "2.1.0"));
        o.IncludeScopes = true;
        o.AddOtlpExporter(); // -> OTel Collector (4317) -> any backend
    });
    builder.SetMinimumLevel(LogLevel.Information); // operator-controlled floor
});
ILogger log = loggerFactory.CreateLogger("Deploy");
// Structured: {resourceGroup} becomes a field, not interpolated text.
log.LogInformation("Created resource group {resourceGroup}", "rg-prod");
try { /* ... */ }
catch (Exception ex) { log.LogError(ex, "apply failed with exit code {exitCode}", 1); }
When you need JSON on the console instead of OTLP - e.g. a container whose stdout a Collector tails - add Serilog with Serilog.Formatting.Compact.CompactJsonFormatter (CLEF) or the built-in AddJsonConsole(), and rename their native fields (@t/@l/@mt or Timestamp/LogLevel) to the canonical schema with a Collector transform - the same edge-bridging pattern as PowerShell above.
Go
The standard library’s log/slog (Go 1.21+) is a structured, levelled JSON logger out of the box - no third-party dependency. A ReplaceAttr hook renames slog’s default keys (time/level/msg) to the canonical names and forces UTC ISO-8601; slog.LevelWarn already serialises as WARN. Attach the trace context per-request with With.
package main

import (
	"context"
	"log/slog"
	"os"

	"go.opentelemetry.io/otel/trace"
)

// newLogger configures a JSON slog logger once, at startup.
func newLogger(service, version string, level slog.Level) *slog.Logger {
	h := slog.NewJSONHandler(os.Stderr, &slog.HandlerOptions{
		Level: level, // operator-controlled floor
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			// Only rename slog's built-in top-level keys; grouped attributes pass through.
			if len(groups) > 0 {
				return a
			}
			switch a.Key {
			case slog.TimeKey:
				// Value.Time() panics on a non-time value, so leave a
				// caller-supplied "time" attribute of another kind untouched.
				if a.Value.Kind() != slog.KindTime {
					return a
				}
				a.Key = "timestamp"
				// Z07:00 is the RFC3339 layout token: it emits "Z" for a zero (UTC)
				// offset and "±hh:mm" otherwise. .UTC() forces the former.
				a.Value = slog.StringValue(a.Value.Time().UTC().Format("2006-01-02T15:04:05.000Z07:00"))
			case slog.LevelKey:
				a.Key = "level" // INFO/WARN/ERROR already match the canonical vocabulary
			case slog.MessageKey:
				a.Key = "message"
			}
			return a
		},
	})
	return slog.New(h).With(
		slog.String("service.name", service),
		slog.String("service.version", version),
	)
}

// withTrace stamps the active trace context onto every record from the returned logger.
func withTrace(ctx context.Context, l *slog.Logger) *slog.Logger {
	sc := trace.SpanContextFromContext(ctx)
	if !sc.IsValid() {
		return l
	}
	return l.With(
		slog.String("trace_id", sc.TraceID().String()),
		slog.String("span_id", sc.SpanID().String()),
	)
}

func main() {
	log := newLogger("payments-api", "2.1.0", slog.LevelInfo)
	// Structured: key/value attributes, never interpolated into the message.
	log.Info("Created resource group", slog.String("resourceGroup", "rg-prod"))
	log.Error("apply failed", slog.Int("exitCode", 1))
}
TypeScript / Node.js
pino is the de-facto high-performance JSON logger for Node. It emits NDJSON to stdout (the service case), and its formatters, timestamp, mixin, and redact options shape records to the canonical schema, inject the active trace context into every line, and strip secrets before serialisation.
import { context, trace } from "@opentelemetry/api";
import pino from "pino";
// Configure once at startup. pino's level numbers (10..60) map linearly to OTel
// SeverityNumber via 0.4 * n - 3 (info 30 -> 9, error 50 -> 17, fatal 60 -> 21).
export const log = pino({
  level: process.env.LDO_LOG_LEVEL ?? "info",
  base: { "service.name": "payments-api", "service.version": "2.1.0" },
  messageKey: "message",
  timestamp: () => `,"timestamp":"${new Date().toISOString()}"`, // RFC 3339 UTC, ms precision
  formatters: {
    level: (label, number) => ({
      level: label.toUpperCase(), // INFO/WARN/ERROR -> canonical vocabulary
      severity_number: Math.round(0.4 * number - 3),
    }),
  },
  // Strip secrets before they are ever written.
  redact: { paths: ["req.headers.authorization", "password", "token"], censor: "[REDACTED]" },
  mixin() {
    const span = trace.getSpan(context.active());
    if (!span) return {};
    const { traceId, spanId } = span.spanContext();
    return { trace_id: traceId, span_id: spanId };
  },
});
// Structured: data object first, constant message second.
log.info({ resourceGroup: "rg-prod" }, "Created resource group");
log.error({ exitCode: 1 }, "apply failed");
Operational guidance
- Level by environment. prod at INFO; dev/debugging at DEBUG/TRACE. Toggle via env/config, never a redeploy.
- Sampling, not silence. For very high-volume INFO/DEBUG, sample (e.g. 1-in-N) rather than dropping a level wholesale, so you keep a representative trail. Always keep 100% of ERROR/FATAL.
- Cost is a function of volume × retention × cardinality. JSON fields are cheap to query but each indexed field has a cost; do not promote unbounded high-cardinality values (raw user input, GUIDs per request) to indexed columns unless you query them.
- Hot/cold tiering. Keep recent logs in a fast queryable tier (Log Analytics, CloudWatch); archive older logs to cheap storage (Blob/S3) per your retention policy. Audit/compliance logs follow a separate, immutable pipeline.
- Don’t block the request path. Buffer and batch to the backend asynchronously; flush on shutdown. A logging backend outage must degrade to local stdout, never stall or crash the app.
- Clocks must be synced. Correlation across services assumes NTP-disciplined clocks. Skew shows up as impossible event ordering.
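The "sampling, not silence" rule above could be sketched as a filter for Python's standard logging module. This is a minimal illustration, not a reference implementation: SamplingFilter and sample_every are hypothetical names, the counter-based 1-in-N scheme is one choice among several (random sampling works too), and keeping WARN at 100% alongside ERROR/FATAL is an assumption this standard does not spell out.

```python
import itertools
import logging


class SamplingFilter(logging.Filter):
    """Keep every Nth record at INFO or below; keep everything above INFO.

    Hypothetical helper: the name and the counter-based scheme are
    illustrative assumptions, not part of any library.
    """

    def __init__(self, sample_every: int) -> None:
        super().__init__()
        if sample_every < 1:
            raise ValueError("sample_every must be >= 1")
        self.sample_every = sample_every
        self._counter = itertools.count()

    def filter(self, record: logging.LogRecord) -> bool:
        # WARNING, ERROR and CRITICAL (FATAL) are never sampled away.
        if record.levelno > logging.INFO:
            return True
        # The first record passes, then every Nth one after it.
        return next(self._counter) % self.sample_every == 0
```

Attached with handler.addFilter(SamplingFilter(sample_every=100)), this keeps a representative 1% of INFO/DEBUG traffic while the level floor itself stays under operator control.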
Anti-patterns
- 🚨 Unstructured / free-text logs - "order 4412 failed for user bob@x.com". Unqueryable, un-alertable, and it just leaked PII. Emit fields, not sentences.
- 🚨 Logging secrets or unmasked PII - tokens, connection strings, card numbers in a record. The backend is not a vault and retains for months. Redact before serialising.
- 🚨 Hand-built JSON via string concatenation - breaks on a quote/newline in user input and is a log-injection vector. Always use a real JSON encoder.
- 🚨 Logging to stdout’s data channel in a pipeline - diagnostics that corrupt a function’s return value or a piped command’s output. Use a non-success stream (stderr / information).
- 🚨 Logging that can throw - an exception inside the logger takes down the caller. Logging is side-effect-only; ERROR records must not terminate.
- ⚠️ Local time / no timezone - unsortable and unjoinable across regions; DST duplicates or drops an hour. UTC ISO-8601 with an explicit offset, always.
- ⚠️ Pretty-printed JSON in production - multi-line records break line-based collectors. Compact NDJSON in prod; indented is local-debug only.
- ⚠️ Inventing per-service level names - NOTICE, VERBOSE, SEVERE. Use the one canonical six-level scale so alerts and filters are portable.
- ⚠️ WARN as “small error” / ERROR as “caught an exception” - trains responders to ignore the channel. WARN = recovered; ERROR = operation failed.
- ⚠️ Variable data interpolated into message - kills grouping and alerting. Constant message, variable data in fields.
- 🔬 Vendor SDK calls scattered through business code - couples the app to one backend. Emit the neutral record; let a collector translate to Azure/AWS/OTLP.
- 🔬 Configuring level/format inside library functions - makes behaviour untunable in production. Decide once at the process entry point from env/config.
- 🔬 Unbounded fields - logging whole request bodies/blobs balloons ingestion cost and can DoS the pipeline. Truncate and cap.
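Three of the anti-patterns above (hand-built JSON, logging that can throw, unbounded fields) can be avoided together at the encoding step. The sketch below is one possible shape, assuming Python's standard json module; encode_record, cap, and the 2048-character limit are hypothetical choices for illustration, not values this standard prescribes.

```python
import json

MAX_FIELD_CHARS = 2048  # hypothetical cap; tune to your ingestion budget


def cap(value, limit: int = MAX_FIELD_CHARS):
    """Truncate over-long strings and mark that truncation happened."""
    if isinstance(value, str) and len(value) > limit:
        return value[:limit] + "...[truncated]"
    return value


def encode_record(message: str, **fields) -> str:
    """Serialise one record as a single compact JSON line."""
    record = {"message": message}
    record.update({key: cap(value) for key, value in fields.items()})
    # A real encoder escapes quotes and newlines in user input, so a hostile
    # value cannot break the line or inject a second record.
    # default=str stringifies values json cannot encode instead of raising.
    return json.dumps(record, ensure_ascii=False, separators=(",", ":"), default=str)
```

A value such as 'bob"\n{"level":"INFO"}' comes out escaped inside one line rather than splitting the record in two, which is the log-injection case that string concatenation gets wrong.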
Compliance checklist
A service meets this standard when:
- Every log record is JSON, one object per line (NDJSON), on stdout.
- timestamp is RFC 3339 / ISO-8601, UTC, ≥ ms precision, stamped at event time.
- level is one of TRACE/DEBUG/INFO/WARN/ERROR/FATAL and maps to OTel SeverityNumber.
- message is constant; variable data is in separate fields.
- service.name (and ideally service.version, deployment.environment) is present.
- trace_id/span_id are present whenever a trace context is active.
- No secrets or unmasked PII appear in any field; redaction happens before serialisation.
- Minimum level and output format are set from environment/config at the entry point.
- The human-readable/coloured format is opt-in and never the production default.
- Logging is side-effect-only: it cannot throw, block the hot path, or corrupt return values.
- Records reach the backend via a collector/exporter; the application carries no vendor logging SDK in business code.
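The mechanically checkable items of this checklist could be enforced in CI against a sample of emitted lines. The following is a rough sketch under stated assumptions: check_line is a hypothetical helper, the field names are the canonical ones used in the examples above, and it only covers record shape. Items such as constant messages, redaction, and non-blocking emission still need review or dedicated tests.

```python
import json
import re

LEVELS = {"TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"}
# RFC 3339, UTC, at least millisecond precision, e.g. 2024-05-01T12:00:00.123Z
UTC_TIMESTAMP = re.compile(
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3,}(Z|\+00:00)$"
)


def check_line(line: str) -> list[str]:
    """Return the checklist violations found in one log line (empty = pass)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    if not isinstance(record, dict):
        return ["not a JSON object"]

    problems = []
    if not UTC_TIMESTAMP.match(str(record.get("timestamp", ""))):
        problems.append("timestamp is not RFC 3339 UTC with >= ms precision")
    level = record.get("level")
    if not isinstance(level, str) or level not in LEVELS:
        problems.append("level is not in the canonical vocabulary")
    if not record.get("message"):
        problems.append("message is missing")
    if not record.get("service.name"):
        problems.append("service.name is missing")
    # A record carrying only one half of the trace context cannot be joined.
    if ("trace_id" in record) != ("span_id" in record):
        problems.append("trace_id and span_id must appear together")
    return problems
```

Running it over a few hundred captured lines per service gives a cheap regression guard: an empty result for every line means the record shape still matches the schema.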
See Also
- OpenTelemetry logs data model - the canonical record shape this standard mirrors
- OpenTelemetry semantic conventions - standardised field names
- OpenTelemetry Collector - the receive/transform/export pipeline
- RFC 5424 - syslog severities - the level/severity lineage
- The Twelve-Factor App - Logs - logs as event streams on stdout
- Azure Monitor Logs Ingestion API - custom structured logs into Log Analytics
- Azure Monitor OpenTelemetry - correlated logs/traces in Application Insights
- AWS Distro for OpenTelemetry - exporting to CloudWatch Logs and X-Ray
- Per-language loggers: structlog (Python) · Serilog / .NET OpenTelemetry (C#) · log/slog (Go) · pino (Node) · PSFramework (PowerShell)
- LibreDevOpsHelpers on the PowerShell Gallery - the canonical reference implementation
- Language standards & cheatsheets: PowerShell Standards · Python Standards · Bash Standards · CI/CD Standards
- Cheatsheets: Bash · Python · .NET · Go · TypeScript · PowerShell