How to Create and Schedule Jobs in ElasticJob: A Step-by-Step Guide
Purpose
I needed to create a scheduled job that processes data across multiple nodes in a distributed system. After setting up ElasticJob with ZooKeeper, I found myself staring at the documentation wondering: “How do I actually write and schedule my first job?”
This post walks through the three-step pattern to create and schedule jobs in ElasticJob, with a focus on understanding the sharding mechanism that makes distributed execution possible.
The Problem
I had a batch processing requirement: every minute, I needed to pull records from a database and process them. The tricky part? I had three application instances running, and I didn’t want them all processing the same data simultaneously. Manual coordination via database locks seemed fragile.
ElasticJob promised automatic distribution, but I needed to understand how to implement it.
Step 1: Implement SimpleJob Interface
The first step is creating a class that implements SimpleJob. This interface has a single method: execute(ShardingContext context).
    public class MyFirstJob implements SimpleJob {

        @Override
        public void execute(ShardingContext context) {
            // This is where your job logic goes
            System.out.println("Job executed at: " + new Date());
        }
    }

At this point, I wondered: what is ShardingContext? This is the key to understanding ElasticJob’s distribution model.
Understanding ShardingContext
When ElasticJob runs, it can split your job into multiple “shards” that run in parallel across different nodes. The ShardingContext tells each instance which piece of work it’s responsible for.
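To make "which piece of work" concrete, here is a minimal sketch of how a job might turn its shard number into a range of record IDs. It assumes records are numbered 1..N and split into contiguous ranges; the class `ShardRanges` and the method `rangeFor` are hypothetical helpers for illustration and are not part of ElasticJob.

```java
/** Hypothetical helper (not an ElasticJob class): maps a shard to a contiguous ID range. */
final class ShardRanges {

    /**
     * Returns {firstId, lastId} (both inclusive) for one shard, assuming IDs run 1..totalRecords.
     * If there are fewer records than shards, a shard may get an empty range (first > last).
     */
    static long[] rangeFor(int shardingItem, int shardingTotalCount, long totalRecords) {
        if (shardingTotalCount <= 0 || shardingItem < 0 || shardingItem >= shardingTotalCount) {
            throw new IllegalArgumentException(
                "invalid shard " + shardingItem + " of " + shardingTotalCount);
        }
        long base = totalRecords / shardingTotalCount;   // records every shard gets
        long extra = totalRecords % shardingTotalCount;  // leftovers, one each to the lowest shards
        long first = shardingItem * base + Math.min(shardingItem, extra) + 1;
        long size = base + (shardingItem < extra ? 1 : 0);
        return new long[] {first, first + size - 1};
    }

    public static void main(String[] args) {
        // In a real job the first two arguments would come from
        // context.getShardingItem() and context.getShardingTotalCount().
        for (int item = 0; item < 3; item++) {
            long[] range = rangeFor(item, 3, 300);
            System.out.printf("shard %d -> records %d-%d%n", item, range[0], range[1]);
        }
    }
}
```

With 300 records and 3 shards this yields 1-100, 101-200 and 201-300. A modulo rule (`id % totalCount == shardItem`) is a common alternative when IDs are not contiguous.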
    +------------------+   +------------------+   +------------------+
    |      Node A      |   |      Node B      |   |      Node C      |
    |     Shard 0      |   |     Shard 1      |   |     Shard 2      |
    |  Process 1-100   |   | Process 101-200  |   | Process 201-300  |
    +------------------+   +------------------+   +------------------+

The ShardingContext provides:
    int shardId = context.getShardingItem();              // 0, 1, 2, etc.
    int totalCount = context.getShardingTotalCount();     // Total shards (e.g., 3)
    String jobParam = context.getJobParameter();          // Global parameter
    String shardParam = context.getShardingParameter();   // Per-shard parameter

Step 2: Build JobConfiguration
Now I needed to configure the job with its name, shard count, and schedule.
    JobConfiguration jobConfig = JobConfiguration.newBuilder("MyFirstJob", 3)
        .cron("0 * * * * ?")                                    // Run every minute
        .jobParameter("globalValue")                            // Parameter passed to all shards
        .shardingItemParameters("0=shardA,1=shardB,2=shardC")   // Per-shard params
        .build();

Let me break down each configuration option:
| Method | Purpose | Example |
|---|---|---|
| cron() | Schedule using cron expression | "0 * * * * ?" = every minute |
| jobParameter() | Global parameter for all shards | "myConfig" |
| shardingItemParameters() | Different parameter per shard | "0=db1,1=db2,2=db3" |
| overwrite() | Update job config if exists | true or false |
| failover() | Enable failover to other nodes | true or false |
| misfire() | Handle missed executions | true or false |
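The shardingItemParameters() value is a comma-separated list of item=value pairs. The sketch below only illustrates that format by parsing it into a map. It is not ElasticJob's own parser, and the class `ShardingItemParametersDemo` and the method `parse` are hypothetical names. In a real job you would not parse this yourself, since each shard receives its own value through context.getShardingParameter().

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of the "item=value,item=value" format (not ElasticJob code). */
final class ShardingItemParametersDemo {

    /** Parses a spec such as "0=db1,1=db2" into {0=db1, 1=db2}, keeping the listed order. */
    static Map<Integer, String> parse(String spec) {
        Map<Integer, String> result = new LinkedHashMap<>();
        if (spec == null || spec.trim().isEmpty()) {
            return result; // no per-shard parameters configured
        }
        for (String pair : spec.split(",")) {
            // Split on the first '=' only, so values may themselves contain '='
            String[] keyValue = pair.trim().split("=", 2);
            if (keyValue.length != 2) {
                throw new IllegalArgumentException("expected item=value but got: " + pair);
            }
            result.put(Integer.parseInt(keyValue[0].trim()), keyValue[1].trim());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> params = parse("0=db1,1=db2,2=db3");
        // Shard 1 would see "db2" as its sharding parameter
        System.out.println("shard 1 -> " + params.get(1));
    }
}
```

The keys are shard item numbers, so they should cover 0 up to the shard count minus one if every shard needs a value.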
Cron Expressions Quick Reference
ElasticJob uses Quartz-style cron expressions with 6 fields (seconds, minutes, hours, day of month, month, day of week). Unlike classic Unix cron, the day of week is numbered 1-7 starting at Sunday, and one of the two day fields is normally written as ? ("no specific value"):

    ┌───────────── second (0-59)
    │ ┌───────────── minute (0-59)
    │ │ ┌───────────── hour (0-23)
    │ │ │ ┌───────────── day of month (1-31)
    │ │ │ │ ┌───────────── month (1-12)
    │ │ │ │ │ ┌───────────── day of week (1-7, 1=Sunday, or SUN-SAT)
    │ │ │ │ │ │
    * * * * * ?

Common patterns I use:
    0 * * * * ?       Every minute
    0 0 * * * ?       Every hour
    0 0 0 * * ?       Every day at midnight
    0 30 10 * * ?     Every day at 10:30 AM
    0 0 12 ? * WED    Every Wednesday at noon

Step 3: Schedule with ScheduleJobBootstrap
The final step connects everything together: the job implementation, the configuration, and the ZooKeeper registry center.
    // Assuming you have a CoordinatorRegistryCenter already set up
    CoordinatorRegistryCenter registryCenter = createRegistryCenter();

    // Create and schedule the job
    new ScheduleJobBootstrap(registryCenter, new MyFirstJob(), jobConfig)
        .schedule();

The ScheduleJobBootstrap handles registration with ZooKeeper and starts the scheduler.
A Complete Working Example
Here’s a complete example that processes different database partitions across shards:
    public class DatabasePartitionJob implements SimpleJob {

        private final PartitionProcessor processor;

        public DatabasePartitionJob(PartitionProcessor processor) {
            this.processor = processor;
        }

        @Override
        public void execute(ShardingContext context) {
            int shardId = context.getShardingItem();
            String partitionName = context.getShardingParameter();

            System.out.printf("Node %s processing partition %s (shard %d)%n",
                localHostName(), partitionName, shardId);

            // Process only records belonging to this partition
            processor.processPartition(partitionName);
        }

        // InetAddress.getLocalHost() throws the checked UnknownHostException,
        // which execute() cannot declare, so it is handled here instead.
        private static String localHostName() {
            try {
                return InetAddress.getLocalHost().getHostName();
            } catch (UnknownHostException e) {
                return "unknown-host";
            }
        }
    }

Configuration and scheduling:
    public class Application {

        public static void main(String[] args) {
            // Step 1: Set up ZooKeeper registry center
            CoordinatorRegistryCenter registryCenter = new ZookeeperRegistryCenter(
                new ZookeeperConfiguration("localhost:2181", "elastic-job-demo")
            );
            registryCenter.init();

            // Step 2: Build configuration
            JobConfiguration config = JobConfiguration.newBuilder("DatabasePartitionJob", 3)
                .cron("0 */5 * * * ?")    // Every 5 minutes
                .shardingItemParameters("0=partition_a,1=partition_b,2=partition_c")
                .failover(true)           // Enable failover
                .overwrite(true)          // Allow config updates
                .build();

            // Step 3: Schedule the job
            new ScheduleJobBootstrap(
                registryCenter,
                new DatabasePartitionJob(new PartitionProcessor()),
                config
            ).schedule();

            System.out.println("Job scheduled successfully!");
        }
    }

How Sharding Actually Works
I was confused about how ElasticJob decides which node handles which shard. Here’s what happens:
- Registration: When jobs start, each node registers with ZooKeeper
- Leader Election: One node becomes the “leader”
- Shard Assignment: The leader assigns shards to available nodes
- Execution: Each node only executes its assigned shard(s)
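The shard assignment step can be sketched in a few lines. This is a simplified illustration of an average-allocation rule (split evenly, then hand leftover shards to the earliest nodes one at a time), which is how I understand ElasticJob's default strategy to behave. It is not the framework's code, and the class `AverageAllocationSketch` and the method `assign` are hypothetical names. The exact item numbers a node receives depend on the configured sharding strategy and on how instances are ordered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of average allocation (not ElasticJob's implementation). */
final class AverageAllocationSketch {

    /** Assigns shard items 0..shardingTotalCount-1 to the given nodes, in node order. */
    static Map<String, List<Integer>> assign(List<String> nodes, int shardingTotalCount) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        if (nodes.isEmpty()) {
            return result; // nobody to assign to
        }
        int perNode = shardingTotalCount / nodes.size();
        int nextItem = 0;
        // First pass: every node gets the same number of consecutive items
        for (String node : nodes) {
            List<Integer> items = new ArrayList<>();
            for (int i = 0; i < perNode; i++) {
                items.add(nextItem++);
            }
            result.put(node, items);
        }
        // Second pass: leftover items go to the earliest nodes, one each
        for (String node : nodes) {
            if (nextItem >= shardingTotalCount) {
                break;
            }
            result.get(node).add(nextItem++);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(assign(Arrays.asList("A"), 3));           // {A=[0, 1, 2]}
        System.out.println(assign(Arrays.asList("A", "B"), 3));      // {A=[0, 2], B=[1]}
        System.out.println(assign(Arrays.asList("A", "B", "C"), 3)); // {A=[0], B=[1], C=[2]}
    }
}
```

The practical takeaway is that a node may own more than one shard when there are fewer nodes than shards, so execute() can be invoked for several shard items on the same instance.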
    Time: T0 (First node starts)
    ┌────────────────────────────────────────┐
    │ Node A starts                          │
    │ Leader election: Node A wins           │
    │ Shard assignment: Node A gets [0,1,2]  │
    └────────────────────────────────────────┘

    Time: T1 (Second node starts)
    ┌────────────────────────────────────────┐
    │ Node B starts                          │
    │ Rebalance triggered                    │
    │ Shard assignment:                      │
    │   Node A gets [0, 2]                   │
    │   Node B gets [1]                      │
    └────────────────────────────────────────┘

    Time: T2 (Third node starts)
    ┌────────────────────────────────────────┐
    │ Node C starts                          │
    │ Rebalance triggered                    │
    │ Shard assignment:                      │
    │   Node A gets [0]                      │
    │   Node B gets [1]                      │
    │   Node C gets [2]                      │
    └────────────────────────────────────────┘

Common Pitfalls I Encountered
Pitfall 1: Same Job Name, Different Configurations
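If multiple instances register the same job name with different configurations, they conflict, because the configuration is stored once per job name in the registry center. Give each job type its own unique name, and set overwrite(true) when you want the local configuration to replace the one already stored.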
Pitfall 2: Forgetting to Initialize Registry Center
I spent an hour debugging why my job wouldn’t start. The error? I forgot to call registryCenter.init():
    // WRONG - registry center not initialized
    CoordinatorRegistryCenter registryCenter = new ZookeeperRegistryCenter(
        new ZookeeperConfiguration("localhost:2181", "my-app"));
    // Missing: registryCenter.init();

    // RIGHT
    CoordinatorRegistryCenter registryCenter = new ZookeeperRegistryCenter(
        new ZookeeperConfiguration("localhost:2181", "my-app"));
    registryCenter.init(); // Don't forget this!

Pitfall 3: Cron Expression Seconds Field
ElasticJob uses 6-field cron (including seconds), while some tools generate 5-field cron. Always include the seconds field:
    5-field cron: * * * * *      (minute, hour, dom, month, dow)
    6-field cron: 0 * * * * ?    (second, minute, hour, dom, month, dow)

Using JobParameter vs ShardingItemParameters
I initially confused these two. Here’s the distinction:
- jobParameter: Same value passed to ALL shards. Use for configuration that applies globally.

      .jobParameter("batchSize=100")
      // Every shard gets "batchSize=100"

- shardingItemParameters: Different value per shard. Use for partitioning data.

      .shardingItemParameters("0=region_us,1=region_eu,2=region_asia")
      // Shard 0 gets "region_us", Shard 1 gets "region_eu", etc.

Summary
Creating scheduled jobs in ElasticJob follows a three-step pattern:
- Implement SimpleJob with your execute() logic
- Configure using JobConfiguration.newBuilder() with job name, shard count, and cron
- Schedule via ScheduleJobBootstrap.schedule()
The key insight is understanding how sharding works: each node handles a subset of work based on its assigned shard ID. The ShardingContext provides all the information needed to process only your assigned data.
The framework handles coordination and shard rebalancing automatically, and it can also take over failed shards once failover(true) is set.
Final Words + More Resources
My intention with this article was to share my knowledge and experience to help others. If you want to contact me, you can reach me by email.