Cluster multi-process
VextJS has built-in multi-process Cluster management. A ClusterMaster manages multiple Worker processes to make full use of multi-core CPUs, and it supports production-grade features such as zero-downtime rolling restarts, heartbeat detection, and automatic failure recovery.
Quick Start
Enable via configuration
Enable via environment variables
You do not need to modify the configuration file; setting VEXT_CLUSTER=1 enables Cluster mode.
In Cluster mode, the Master first completes configuration detection and the port pre-check. The patch produced by the bootstrap config provider is then passed to the Workers and reused within the same startup cycle, so that the Master and the Workers (and the Workers among themselves) do not see different remote configuration results.
Startup effect
Architecture Overview
- Master process: does not handle HTTP requests; it manages the lifecycle of the Worker processes
- Worker process: Each Worker runs a complete VextJS application instance and handles HTTP requests independently
- IPC communication: Messages are exchanged between Master and Worker through Node.js’ built-in inter-process communication (IPC)
Configuration options
Configure Cluster-related options in config/default.ts.
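A minimal sketch of what such a configuration might look like, built only from option names and defaults mentioned on this page. The nesting of `healthCheck` under `cluster`, the top-level `shutdown` key, the `ClusterOptions` type name, and the use of milliseconds for time values are assumptions for illustration, not the confirmed VextJS schema; check the Configuration reference before copying it.

```typescript
// Hypothetical shape: option names come from this page, but the nesting,
// the type name, and the time unit (milliseconds) are assumptions.
interface ClusterOptions {
  enabled: boolean;
  workers: number | "auto";
  autoRestart: boolean;
  maxRestarts: number;
  restartWindow: number;
  sticky: "none" | "ip";
  healthCheck: { enabled: boolean; interval: number; timeout: number };
}

const config: { cluster: ClusterOptions; shutdown: { timeout: number } } = {
  cluster: {
    enabled: true,
    workers: "auto",
    autoRestart: true,
    maxRestarts: 5,        // stop restarting after 5 restarts ...
    restartWindow: 60000,  // ... within 60 seconds
    sticky: "none",
    healthCheck: { enabled: true, interval: 15000, timeout: 30000 },
  },
  shutdown: { timeout: 10000 },
};

// In config/default.ts this object would presumably be the default export.
```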
Worker quantity strategy
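As described elsewhere on this page, `workers` accepts either a number or `'auto'`, where `'auto'` follows the number of CPU cores. The sketch below shows one way such a setting could be resolved. The function name `resolveWorkerCount` is hypothetical, and rejecting invalid values (rather than clamping them) is an assumption about behavior this page does not specify.

```typescript
type WorkerSetting = number | "auto";

// Resolve the configured value to a concrete Worker count.
// cpuCount is passed in so the function stays pure; in Node.js it would
// typically come from os.availableParallelism() or os.cpus().length.
function resolveWorkerCount(setting: WorkerSetting, cpuCount: number): number {
  const cores = Math.max(1, Math.floor(cpuCount));
  if (setting === "auto") return cores;
  if (!Number.isInteger(setting) || setting < 1) {
    throw new RangeError(
      `workers must be a positive integer or "auto", got ${setting}`,
    );
  }
  return setting;
}
```

For example, `resolveWorkerCount("auto", 8)` yields 8, while `resolveWorkerCount(2, 8)` keeps the explicit value 2.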
CLI commands
The VextJS CLI provides a full set of Cluster management commands.
vext start — start
If cluster.enabled: true or VEXT_CLUSTER=1 is set in the configuration, vext start will automatically start in Cluster mode.
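The rule above is a simple OR between the configuration flag and the environment variable. A minimal sketch, assuming a hypothetical helper named `shouldUseCluster`; how VextJS treats other values of `VEXT_CLUSTER` (such as `0`) is not stated here, so this version only recognizes `1`.

```typescript
// Hypothetical helper: decides whether `vext start` should use Cluster mode.
// Only the two triggers named in the text are recognized.
function shouldUseCluster(
  clusterEnabled: boolean | undefined,
  env: Record<string, string | undefined>,
): boolean {
  return clusterEnabled === true || env["VEXT_CLUSTER"] === "1";
}
```

In a real process, the second argument would be `process.env`.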
vext stop — stop
vext stop finds the Master process by reading the PID file (default .vext.pid) and sends the SIGTERM signal to trigger graceful shutdown.
Shutdown sequence:
1. Master receives SIGTERM
2. Master sends shutdown instructions to all Workers
3. Each Worker executes the onClose hook (closes the database connection, etc.)
4. Worker stops accepting new requests and waits for existing requests to complete
5. Forced exit after timeout (controlled by shutdown.timeout)
6. After all Workers exit, the Master exits
7. The PID file is automatically deleted
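Steps 3 to 5 amount to racing the cleanup work against a timer. The sketch below illustrates that pattern on the Worker side; `shutdownWithTimeout` and the `cleanup` callback are hypothetical names, and this is not VextJS's actual implementation.

```typescript
type ShutdownResult = "graceful" | "forced";

// Hypothetical Worker-side shutdown: run the cleanup work (onClose hooks,
// draining in-flight requests), but give up once timeoutMs has elapsed.
async function shutdownWithTimeout(
  cleanup: () => Promise<void>,
  timeoutMs: number,
): Promise<ShutdownResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timedOut = new Promise<ShutdownResult>((resolve) => {
    timer = setTimeout(() => resolve("forced"), timeoutMs);
  });
  try {
    return await Promise.race([
      cleanup().then((): ShutdownResult => "graceful"),
      timedOut,
    ]);
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

A caller would typically exit normally on `"graceful"` and exit with a non-zero code on `"forced"`. If `cleanup` rejects, the error propagates to the caller.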
vext reload — rolling restart
vext reload performs a zero-downtime rolling restart:
- Master receives reload signal
- Restart Workers one by one (instead of restarting them all at once)
- After the new Worker is started and ready, close the old Worker
- Process all Workers in sequence
- There is always a Worker serving requests throughout the process
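The sequence above can be sketched as a loop that starts each replacement before stopping the Worker it replaces. The function and callback names (`rollingRestart`, `startAndWaitReady`, `stop`) are hypothetical; this illustrates the pattern and is not the ClusterMaster code.

```typescript
// Hypothetical rolling restart: for each old Worker, start a replacement,
// wait until it is ready, and only then stop the old one. Capacity never
// drops below the original Worker count.
async function rollingRestart<W>(
  oldWorkers: readonly W[],
  startAndWaitReady: () => Promise<W>,
  stop: (worker: W) => Promise<void>,
): Promise<W[]> {
  const fresh: W[] = [];
  for (const old of oldWorkers) {
    const replacement = await startAndWaitReady(); // new Worker is serving
    fresh.push(replacement);
    await stop(old); // old Worker drains and exits
  }
  return fresh;
}
```

If a replacement fails to start, the returned promise rejects and the remaining old Workers keep running, which is one reasonable way to avoid replacing healthy Workers with a broken build.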
Applicable scenarios:
- Putting a newly deployed version of the code into effect without downtime
- Reloading after a configuration update
- Applying a hotfix
vext reload requires cluster.reload to be configured (enabled by default). To disable rolling restart, remove the reload configuration item.
vext status — View status
Output example:
Automatic failure recovery
Worker crashes and restarts
When autoRestart: true (default), the Master automatically restarts a Worker after it crashes.
Exponential backoff
When crashes occur in quick succession, the restart delay increases exponentially (exponential backoff) so that frequent restarts do not consume system resources.
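A minimal sketch of exponential backoff with a cap. The base delay of 1 second and the cap of 30 seconds are illustrative assumptions; this page does not state the values VextJS uses.

```typescript
// Delay before restart number `attempt` (0-based). The delay doubles with
// each consecutive crash and is capped at maxMs.
// baseMs and maxMs are illustrative values, not VextJS defaults.
function restartDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  const safeAttempt = Math.max(0, Math.floor(attempt));
  return Math.min(maxMs, baseMs * Math.pow(2, safeAttempt));
}
```

With these values the delays for consecutive crashes are 1 s, 2 s, 4 s, 8 s, 16 s, and then 30 s for every later attempt.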
Crash loop protection
If the number of restarts reaches maxRestarts (default 5) within restartWindow (default 60 seconds), the Master stops restarting the Worker and logs an alert.
This prevents buggy code from causing infinite crash-restart loops.
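One common way to implement this kind of protection is a sliding-window counter, sketched below. The class name `RestartLimiter` is hypothetical, and treating the limit as "at most maxRestarts restarts per window" is an interpretation of the rule above, not confirmed behavior.

```typescript
// Hypothetical sliding-window limiter for Worker restarts.
class RestartLimiter {
  private timestamps: number[] = [];
  private readonly maxRestarts: number;
  private readonly windowMs: number;

  constructor(maxRestarts = 5, windowMs = 60000) {
    this.maxRestarts = maxRestarts;
    this.windowMs = windowMs;
  }

  // Records a restart at time `now` (ms) and reports whether it is allowed.
  // Refused attempts are not recorded.
  tryRestart(now: number): boolean {
    this.timestamps = this.timestamps.filter((t) => now - t < this.windowMs);
    if (this.timestamps.length >= this.maxRestarts) return false;
    this.timestamps.push(now);
    return true;
  }
}
```

The Master would call `tryRestart(Date.now())` on each crash and raise the alert when it returns `false`.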
Heartbeat detection
When healthCheck.enabled: true (default), the Master sends a heartbeat to each Worker every healthCheck.interval (default 15 seconds). If a Worker does not respond within healthCheck.timeout (default 30 seconds), for example because it is deadlocked or blocked, the Master force-kills and restarts it.
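The timeout check can be sketched as a pure function over the time of each Worker's last heartbeat reply. The name `findUnresponsiveWorkers` and the map-based bookkeeping are assumptions for illustration; the Master's real data structures are not described here.

```typescript
// Returns the ids of Workers whose last heartbeat reply is older than
// timeoutMs. lastReplyAt maps a Worker id to a timestamp in milliseconds.
function findUnresponsiveWorkers(
  lastReplyAt: ReadonlyMap<number, number>,
  now: number,
  timeoutMs = 30000,
): number[] {
  const stale: number[] = [];
  lastReplyAt.forEach((repliedAt, workerId) => {
    if (now - repliedAt > timeoutMs) stale.push(workerId);
  });
  return stale;
}
```

A Master loop would run this on every heartbeat interval and force-kill the returned Workers.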
PID file
When started in Cluster mode, the Master process writes a PID file (default .vext.pid), which the vext stop, vext reload, and vext status commands use to locate the process.
PID files are automatically managed at the following times:
- Create: when Master starts
- Delete: When Master exits normally
- Detection: Detect whether there is a running Cluster at startup
Add .vext.pid to .gitignore to avoid committing to version control.
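To illustrate how a command could locate a running Master from the PID file, here is a sketch using Node.js built-ins. It assumes the file contains only the PID as decimal text, which this page does not confirm; the function names are hypothetical.

```typescript
import { readFileSync } from "node:fs";

// Parses PID file content; returns null for anything but a positive integer.
function parsePid(content: string): number | null {
  const text = content.trim();
  if (!/^\d+$/.test(text)) return null;
  const pid = Number(text);
  return Number.isSafeInteger(pid) && pid > 0 ? pid : null;
}

// Signal 0 performs an existence check without delivering a signal.
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return (err as { code?: string }).code === "EPERM";
  }
}

// Returns the Master PID if the file exists and the process is running.
function findRunningMaster(pidFile = ".vext.pid"): number | null {
  let content: string;
  try {
    content = readFileSync(pidFile, "utf8");
  } catch {
    return null; // no PID file: no Cluster is running
  }
  const pid = parsePid(content);
  return pid !== null && isProcessAlive(pid) ? pid : null;
}
```

A stale file (one whose process no longer exists) yields `null` here, which is how a startup check can distinguish a running Cluster from a leftover file after a crash.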
Working with graceful shutdown
Graceful shutdown process in Cluster mode:
Timeout control:
- The Worker-level timeout is controlled by shutdown.timeout (default 10 seconds)
- A Worker is forcibly terminated (SIGKILL) after the timeout
Configure according to environment
For development, vext dev (hot reload mode) is recommended instead of Cluster mode. Cluster mode is mainly intended for multi-core utilization and high availability in production environments.
Inter-process communication
Master and Worker communicate through IPC messages. VextJS defines a standardized messaging protocol:
Worker → Master message
Master → Worker message
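As an illustration of both directions, a typed protocol of this kind is often modeled as discriminated unions with a runtime guard. The message names and fields below (`ready`, `heartbeat:ping`, `heartbeat:pong`, `shutdown`) are hypothetical, chosen to match the behaviors described on this page; they are not the actual VextJS message set.

```typescript
// Hypothetical message names; the real VextJS protocol may differ.
type WorkerToMaster =
  | { type: "ready"; pid: number }
  | { type: "heartbeat:pong"; sentAt: number };

type MasterToWorker =
  | { type: "heartbeat:ping"; sentAt: number }
  | { type: "shutdown"; timeoutMs: number };

// Runtime guard for untrusted IPC payloads arriving at the Master.
function isWorkerToMaster(msg: unknown): msg is WorkerToMaster {
  if (typeof msg !== "object" || msg === null) return false;
  const m = msg as Record<string, unknown>;
  if (m["type"] === "ready") return typeof m["pid"] === "number";
  if (m["type"] === "heartbeat:pong") return typeof m["sentAt"] === "number";
  return false;
}

// Worker-side reply to a Master message; returns null when no reply is due.
function replyTo(msg: MasterToWorker): WorkerToMaster | null {
  switch (msg.type) {
    case "heartbeat:ping":
      return { type: "heartbeat:pong", sentAt: msg.sentAt };
    case "shutdown":
      return null; // the Worker starts its graceful shutdown instead
  }
}
```

With Node.js IPC, a Worker would send such objects with `process.send(...)` and the Master would receive them through the `message` event of the worker object.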
Deploying with Docker
Things to note when using Cluster mode in Docker containers:
Dockerfile example
Suggestions
- Worker number: In Docker containers, set workers according to the allocated CPU resources instead of using 'auto' ('auto' detects the total number of CPU cores of the host)
- PID file: No special configuration is required for the PID file path in the container; just use the default .vext.pid
- Graceful shutdown: Make sure Docker's stop_grace_period is greater than VextJS's shutdown.timeout
- Single container, multiple processes: Cluster mode is a reasonable way to run multiple Workers in a single container, but if you use an orchestration tool such as Kubernetes, you can also choose single-process mode with multiple Pod replicas
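For the Worker number suggestion, one option is to derive the value from the container's CPU limit. On cgroup v2 systems the limit is exposed in `/sys/fs/cgroup/cpu.max` as `<quota> <period>`, or `max <period>` when unlimited. The sketch below parses that format; the function names are hypothetical, and nothing here implies that VextJS reads this file itself.

```typescript
// Parses the content of a cgroup v2 cpu.max file and returns the CPU limit
// in cores, or null when the limit is "max" (unlimited) or unparseable.
function parseCpuMax(content: string): number | null {
  const [quota, period] = content.trim().split(/\s+/);
  if (quota === undefined || period === undefined || quota === "max") return null;
  const q = Number(quota);
  const p = Number(period);
  if (!Number.isFinite(q) || !Number.isFinite(p) || q <= 0 || p <= 0) return null;
  return q / p;
}

// Picks a Worker count: the CPU limit rounded down, at least 1, and never
// more than the cores the host reports.
function workersForLimit(cpuLimit: number | null, hostCores: number): number {
  const cores = Math.max(1, Math.floor(hostCores));
  if (cpuLimit === null) return cores;
  return Math.max(1, Math.min(cores, Math.floor(cpuLimit)));
}
```

The result can then be passed as an explicit number to `workers`. For instance, a container limited to 2 CPUs on a 16-core host has `cpu.max` content `200000 100000`, which resolves to 2 Workers.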
FAQ
What should we pay attention to when using WebSocket/SSE in Cluster mode?
Long-lived connections (WebSocket, SSE) need sticky sessions in Cluster mode so that connections from the same client are routed to the same Worker whenever possible. Sticky allocation based on client IP can be enabled via cluster.sticky: "ip"; the default is "none".
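IP-based sticky allocation generally means hashing the client address to a Worker index, so the same address always lands on the same Worker. The sketch below uses a 32-bit FNV-1a hash as an example; the actual algorithm behind cluster.sticky: "ip" is not documented here, and `workerIndexForIp` is a hypothetical name.

```typescript
// Maps a client IP string to a Worker index using a 32-bit FNV-1a hash.
// Deterministic: the same IP and Worker count always give the same index.
function workerIndexForIp(ip: string, workerCount: number): number {
  if (!Number.isInteger(workerCount) || workerCount < 1) {
    throw new RangeError("workerCount must be a positive integer");
  }
  let hash = 0x811c9dc5; // FNV-1a offset basis
  for (let i = 0; i < ip.length; i++) {
    hash ^= ip.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return (hash >>> 0) % workerCount; // force unsigned before the modulo
}
```

Note that with a plain modulo, changing the number of Workers remaps most clients, and clients behind a shared NAT address all reach the same Worker.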
What is the appropriate number of Workers?
- CPU intensive: set to the number of CPU cores ('auto')
- I/O intensive: can be set to 1-2 times the number of CPU cores
- Mixed load: Start with the number of CPU cores and adjust based on actual monitoring data
How to monitor the status of each Worker?
Use the vext status command to view the running status, PID, uptime, and request count of each Worker. In a production environment, it is recommended to use Prometheus or another monitoring tool to collect more detailed metrics.
How is it different from PM2?
VextJS's built-in Cluster management is deeply integrated with framework features (such as onClose hooks, the configuration system, and hot reloading) and works out of the box with zero configuration. PM2 is a general-purpose process manager with broader functionality, but it is less tightly integrated with the framework. The two can be used together (with PM2 managing the Master process), but this is usually unnecessary.
Next step
- Read the detailed reference for Cluster-related commands in CLI Commands
- See the full list of Cluster configuration options in Configuration
- Learn the relationship between hot reload and Cluster
- Explore Cluster-related testing methods in Testing