How to Set Up ElasticJob with ZooKeeper for Distributed Job Scheduling
Purpose
When I needed distributed job scheduling for my Java application, I chose ElasticJob. But setting it up with ZooKeeper wasn’t straightforward. This post shows the minimal configuration needed to get ElasticJob running with ZooKeeper.
Environment
- Java 17+
- ElasticJob 3.0.5
- ZooKeeper 3.8.x
- Maven
The Problem
I wanted to run scheduled jobs across multiple servers. When one server goes down, another should pick up the work. Simple enough, right?
I started by reading ElasticJob documentation and got confused:
- What exactly does ZooKeeper do?
- Do I really need it?
- How do I configure it?
The documentation explains what ElasticJob does, but not clearly how to set up the coordination layer.
What ZooKeeper Actually Does
Before diving into code, I needed to understand WHY ZooKeeper is required:
    +------------------+        +------------------+
    |     Server A     |        |     Server B     |
    |  +------------+  |        |  +------------+  |
    |  | ElasticJob |  |        |  | ElasticJob |  |
    |  +-----+------+  |        |  +-----+------+  |
    |        |         |        |        |         |
    +--------+---------+        +--------+---------+
             |                           |
             v                           v
         +-----------------------------------+
         | ZooKeeper                         |  <-- Coordination
         |  - Leader election                |
         |  - Shard assignment               |
         |  - Failover detection             |
         |  - Job configuration              |
         +-----------------------------------+

ZooKeeper handles:
- Leader election: Which node triggers the job
- Shard assignment: Which node handles which data partition
- Failover: Detecting node failures and reassigning work
- Configuration storage: Job definitions and runtime state
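To make shard assignment and failover more concrete, here is a minimal sketch in plain Java of an even split of shards across live instances. It is only an illustration of the idea, not ElasticJob's actual code; the class name `ShardAssignmentSketch` and the instance names `server-a` and `server-b` are made up for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: a simplified even split of shards across instances.
// This is NOT ElasticJob's implementation; names here are hypothetical.
public class ShardAssignmentSketch {

    public static Map<String, List<Integer>> assign(List<String> instances, int shardingTotalCount) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        if (instances.isEmpty()) {
            return result;
        }
        // Every instance first gets the same number of consecutive shards.
        int perInstance = shardingTotalCount / instances.size();
        int next = 0;
        for (String instance : instances) {
            List<Integer> items = new ArrayList<>();
            for (int i = 0; i < perInstance; i++) {
                items.add(next++);
            }
            result.put(instance, items);
        }
        // Leftover shards are handed out one by one, starting with the first instance.
        for (String instance : instances) {
            if (next >= shardingTotalCount) {
                break;
            }
            result.get(instance).add(next++);
        }
        return result;
    }

    public static void main(String[] args) {
        // Two live instances share three shards.
        System.out.println(assign(List.of("server-a", "server-b"), 3)); // {server-a=[0, 2], server-b=[1]}
        // server-a goes down: the remaining instance takes over every shard.
        System.out.println(assign(List.of("server-b"), 3));             // {server-b=[0, 1, 2]}
    }
}
```

The second call is the failover case in miniature: once the coordination layer notices that an instance is gone, the same shards are redistributed over whoever is left.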
Step 1: Start ZooKeeper
First, I needed a running ZooKeeper instance. For local development, Docker is the easiest:
    docker run --rm -d -p 127.0.0.1:2181:2181 --name elasticjob-zookeeper zookeeper

Verify it’s running:

    docker ps | grep zookeeper

Output should show the container running.
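If you would rather check from Java than from the shell, a plain TCP connect is enough to tell whether anything is listening on the ZooKeeper port. This is a rough sketch using only the JDK; the class name `ZooKeeperProbe` is my own, and an open port only shows that something accepts connections, not that ZooKeeper is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks that a TCP port accepts connections.
public class ZooKeeperProbe {

    public static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // connect() throws if nothing is listening or the timeout expires.
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // localhost:2181 matches the port mapping in the docker command above.
        boolean up = isReachable("localhost", 2181, 2000);
        System.out.println(up
                ? "ZooKeeper port is open"
                : "Nothing is listening on localhost:2181");
    }
}
```

Running a check like this at application startup could give a clearer message than a connection error from deep inside the client library.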
Step 2: Add ElasticJob Dependency
I added the ElasticJob dependency to my Maven project:
    <dependency>
        <groupId>org.apache.shardingsphere.elasticjob</groupId>
        <artifactId>elasticjob-bootstrap</artifactId>
        <version>3.0.5</version>
    </dependency>

Step 3: Initialize Registry Center
Now comes the critical part - configuring the ZookeeperRegistryCenter:
    import org.apache.shardingsphere.elasticjob.reg.base.CoordinatorRegistryCenter;
    import org.apache.shardingsphere.elasticjob.reg.zookeeper.ZookeeperConfiguration;
    import org.apache.shardingsphere.elasticjob.reg.zookeeper.ZookeeperRegistryCenter;

    public class RegistryCenterConfig {

        public static CoordinatorRegistryCenter createRegistryCenter() {
            ZookeeperConfiguration config = new ZookeeperConfiguration("localhost:2181", "my-elasticjob-app");

            // Optional: Set session timeout (default is 60000ms)
            config.setSessionTimeoutMilliseconds(60000);
            // Optional: Set connection timeout (default is 15000ms)
            config.setConnectionTimeoutMilliseconds(15000);

            CoordinatorRegistryCenter registryCenter = new ZookeeperRegistryCenter(config);

            registryCenter.init(); // <-- Don't forget this!
            return registryCenter;
        }
    }

The key parameters are:
- Server list: localhost:2181, where ZooKeeper is running
- Namespace: my-elasticjob-app, which isolates this app’s data in ZooKeeper
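The server list is a comma-separated string of host:port pairs, so a ZooKeeper ensemble can be written as something like `zk1:2181,zk2:2181,zk3:2181`. A typo in that string tends to show up late as a connection problem, so one option is to sanity-check it before building the configuration. Here is a small sketch using only the JDK; the class `ServerListCheck` and the `zk1`/`zk2`/`zk3` hostnames are hypothetical, and IPv6 literals are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: validates a "host:port,host:port" server list.
public class ServerListCheck {

    public static List<String> parse(String serverList) {
        List<String> servers = new ArrayList<>();
        for (String entry : serverList.split(",")) {
            String trimmed = entry.trim();
            int colon = trimmed.lastIndexOf(':');
            // Require a non-empty host before the colon and something after it.
            if (colon <= 0 || colon == trimmed.length() - 1) {
                throw new IllegalArgumentException("Expected host:port but got '" + trimmed + "'");
            }
            int port;
            try {
                port = Integer.parseInt(trimmed.substring(colon + 1));
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Port is not a number in '" + trimmed + "'", e);
            }
            if (port < 1 || port > 65535) {
                throw new IllegalArgumentException("Port out of range in '" + trimmed + "'");
            }
            servers.add(trimmed);
        }
        return servers;
    }

    public static void main(String[] args) {
        System.out.println(parse("localhost:2181"));               // [localhost:2181]
        System.out.println(parse("zk1:2181, zk2:2181, zk3:2181")); // [zk1:2181, zk2:2181, zk3:2181]
    }
}
```

The validated entries can be joined back with commas and passed as the first argument of `ZookeeperConfiguration`.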
Why Namespace Matters
I made a mistake when I first set this up. I used the same namespace for development and production:
    // Dev environment
    new ZookeeperConfiguration("dev-zk:2181", "my-app")

    // Production environment
    new ZookeeperConfiguration("prod-zk:2181", "my-app")

This seemed fine until I realized both environments were actually sharing the same ZooKeeper cluster. With an identical namespace, jobs from dev were interfering with production!
The fix is simple:
    // Dev environment
    new ZookeeperConfiguration("zk-cluster:2181", "my-app-dev")

    // Production environment
    new ZookeeperConfiguration("zk-cluster:2181", "my-app-prod")

Step 4: Create a Simple Job
Now I need to define an actual job:
    import org.apache.shardingsphere.elasticjob.api.ShardingContext;
    import org.apache.shardingsphere.elasticjob.simple.job.SimpleJob;

    public class MySimpleJob implements SimpleJob {

        @Override
        public void execute(ShardingContext context) {
            // context.getShardingItem() tells you which shard this is
            // context.getShardingTotalCount() tells you total shards

            System.out.println("Job executing on shard: " + context.getShardingItem());

            // Your job logic here
            doWork();
        }

        private void doWork() {
            // Business logic
        }
    }

Step 5: Schedule the Job
Finally, schedule the job with ElasticJob:
    import org.apache.shardingsphere.elasticjob.api.JobConfiguration;
    import org.apache.shardingsphere.elasticjob.lite.api.bootstrap.impl.ScheduleJobBootstrap;
    import org.apache.shardingsphere.elasticjob.reg.base.CoordinatorRegistryCenter;

    public class JobScheduler {

        public static void main(String[] args) {
            // 1. Create registry center
            CoordinatorRegistryCenter registryCenter = RegistryCenterConfig.createRegistryCenter();

            // 2. Define job configuration
            JobConfiguration jobConfig = JobConfiguration.newBuilder("MySimpleJob", 3) // 3 shards
                    .cron("0/10 * * * * ?") // Every 10 seconds
                    .shardingItemParameters("0=Beijing,1=Shanghai,2=Guangzhou")
                    .overwrite(true) // Overwrite existing config on each startup
                    .build();

            // 3. Schedule the job
            ScheduleJobBootstrap scheduleJobBootstrap =
                    new ScheduleJobBootstrap(registryCenter, new MySimpleJob(), jobConfig);

            scheduleJobBootstrap.schedule();

            System.out.println("Job scheduled successfully!");
        }
    }

Common Mistakes I Made
Mistake 1: Forgot to Start ZooKeeper
I got this error when ZooKeeper wasn’t running:
    org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss

Always verify ZooKeeper is running before starting your application:

    # Check if ZooKeeper port is listening
    nc -zv localhost 2181

Mistake 2: Forgot to Call init()
I created the ZookeeperRegistryCenter but forgot to call init():
    CoordinatorRegistryCenter registryCenter = new ZookeeperRegistryCenter(
        new ZookeeperConfiguration("localhost:2181", "my-app")
    );
    // registryCenter.init() is missing!

The error was cryptic:

    IllegalStateException: Registry center has not been initialized

Mistake 3: Wrong ZooKeeper Address
When deploying to a server, I hardcoded localhost:
    new ZookeeperConfiguration("localhost:2181", "my-app")

This fails when the app runs on a different server than ZooKeeper. Use environment variables:

    String zkServers = System.getenv().getOrDefault("ZOOKEEPER_SERVERS", "localhost:2181");
    new ZookeeperConfiguration(zkServers, "my-app")

How to Verify It’s Working
After starting the application, I check ZooKeeper to verify the job was registered:
    # Connect to ZooKeeper CLI
    docker exec -it elasticjob-zookeeper zkCli.sh

    # List ElasticJob data
    ls /my-elasticjob-app

You should see something like:

    [MySimpleJob]

The Complete Setup Diagram
Here’s the complete flow:
    +-------------------+
    | Your Application  |
    |                   |
    |  +-------------+  |
    |  | MySimpleJob |  |
    |  +------+------+  |
    |         |         |
    |  +------v------+  |
    |  | JobBootstrap|  |
    |  +------+------+  |
    |         |         |
    +---------+---------+
              |
              | init() + schedule()
              |
    +---------v---------+
    |     ZooKeeper     |
    | /my-elasticjob-app|
    |   /MySimpleJob    |
    |     /sharding     |
    |     /instances    |
    |     /servers      |
    +-------------------+

Summary
Setting up ElasticJob with ZooKeeper requires:
- Running ZooKeeper: Use Docker for local development
- Registry Center Configuration: Create ZookeeperRegistryCenter with server list and namespace
- Job Implementation: Implement the SimpleJob interface
- Job Scheduling: Use ScheduleJobBootstrap to register and schedule
The key insight is that ZooKeeper provides battle-tested distributed coordination primitives that would be extremely complex to implement correctly from scratch.
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can reach me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- ElasticJob Official Documentation
- Apache ZooKeeper Documentation
- ElasticJob GitHub Repository
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!