Outline
1. What is Oozie?
2. Do you need Oozie?
3. How to use Oozie
4. Use case sharing
What Is Oozie?
- Originally designed at Yahoo!
- Apache Incubator project since 2011
- A web service that launches your jobs based on:
  - Time dependency
  - Data dependency
- Ability to rerun from the last point of failure
- Monitoring
Do You Need Oozie?
Q1: Do you have multiple jobs with dependencies?
Q2: Do you need to run jobs regularly?
Q3: Do you need to check data availability?
Q4: Do you need monitoring and operational support?
If any of your answers is YES,
then you should consider Oozie!
How To Use Oozie
1. Deploy your workflow on HDFS. This includes:
  - Oozie job definitions (workflow.xml)
  - your code: MR/Pig/streaming/Java etc.
  - libraries (.so & .jar)
2. Submit your job
$ oozie job -run -config job.properties
Workflow ID: 0123-123456-oozie-wrkf-W
3. Check job status
$ oozie job -info 0123-123456-oozie-wrkf-W
$ oozie job -log 0123-123456-oozie-wrkf-W
(submit a coordinator the same way)
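The three steps above can be sketched as one small script. This is a minimal sketch under stated assumptions, not the setup from the talk: the application name demo-wf, the HDFS path, the NameNode host, the workflow schema version, and the no-op workflow body are all placeholders. It also assumes the oozie CLI can find the server (via the OOZIE_URL environment variable or the -oozie option) and that `oozie job -info` prints a line of the form `Status        : RUNNING`.

```shell
#!/bin/sh
# Sketch only: every name, host, and path below is a hypothetical placeholder.
set -eu

APP_DIR="${APP_DIR:-./demo-wf}"                                   # local staging dir (assumed)
HDFS_APP_PATH="${HDFS_APP_PATH:-/user/${USER:-me}/apps/demo-wf}"  # HDFS target (assumed)

# Step 1a: stage the workflow definition and job properties locally.
stage_app() {
  mkdir -p "$APP_DIR/lib"   # .jar / .so dependencies would go here
  # A no-op workflow (start goes straight to end), used only as a placeholder.
  cat > "$APP_DIR/workflow.xml" <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.5" name="demo-wf">
  <start to="end"/>
  <end name="end"/>
</workflow-app>
EOF
  # job.properties stays on the local machine; it points Oozie at the HDFS copy.
  cat > "$APP_DIR/job.properties" <<EOF
nameNode=hdfs://namenode.example.com:8020
oozie.wf.application.path=\${nameNode}${HDFS_APP_PATH}
EOF
}

# Steps 1b and 2: copy the workflow to HDFS, then submit and start the job.
deploy_and_submit() {
  hdfs dfs -mkdir -p "$HDFS_APP_PATH"
  hdfs dfs -put -f "$APP_DIR/workflow.xml" "$HDFS_APP_PATH/"
  oozie job -run -config "$APP_DIR/job.properties"
}

# Step 3 helper: read `oozie job -info <id>` output on stdin and print the
# job status. Assumes the first line starting with "Status" is "Status : <STATE>".
extract_status() {
  awk -F': *' '/^Status/ { print $2; exit }'
}

stage_app
if command -v hdfs >/dev/null 2>&1 && command -v oozie >/dev/null 2>&1; then
  deploy_and_submit
else
  echo "hdfs/oozie CLI not found; files staged in $APP_DIR only"
fi
```

With a job ID in hand, the helper can be used as `oozie job -info <job-id> | extract_status`, for example inside a loop that sleeps until the status is no longer RUNNING.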
Use Case Sharing
- Was using crontab + Python scripts
- After porting to Oozie:
  - Reduced code size (4906 -> 1708 lines)
  - Smoother processing (1 week delay -> 3 days)
  - More stable