State Machines
State machines allow you to design and automate business processes and data pipelines by composing workloads (functions, batch-jobs) or other AWS services into workflows. State machines manage failures, retries, parallelization, service integrations, and observability so developers can focus on higher-value business logic.
When to use
Extract, Transform, and Load (ETL) processes - state machines ensure that multiple long-running ETL jobs execute in order and complete successfully, so you do not have to orchestrate those jobs manually or maintain a separate application.
Orchestrate microservices - use state machines to combine multiple functions into responsive serverless applications and microservices.
Define states
State machine definitions are written in the Amazon States Language.
The Amazon States Language syntax lets you specify any workflow, from the simplest to the most complex.
The following example shows an order-payment flow made up of Lambda functions:
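As a minimal sketch of the structure the Amazon States Language expects, the fragment below defines a single-state workflow. `StartAt` names the first state, `States` maps state names to their definitions, and a state finishes the workflow either with `End: true` or by being a terminal type such as `Succeed` or `Fail`. The resource name `helloStateMachine` and the state name `sayHello` are illustrative assumptions.

```yaml
resources:
  # Hypothetical resource name, used only for illustration.
  helloStateMachine:
    type: 'state-machine'
    properties:
      definition:
        # Name of the state the execution begins with.
        StartAt: 'sayHello'
        States:
          sayHello:
            # A Pass state does no work; here it emits a fixed Result.
            Type: Pass
            Result:
              message: 'hello'
            # Marks this state as the last one in the workflow.
            End: true
```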
resources:
  checkAndHoldProduct:
    type: function
    properties:
      packageConfig:
        filePath: 'check-and-hold-product.ts'
  billCustomer:
    type: function
    properties:
      packageConfig:
        filePath: 'bill-customer.ts'
  shipmentNotification:
    type: function
    properties:
      packageConfig:
        filePath: 'shipment-notification.ts'
  buyProcessStateMachine:
    type: 'state-machine'
    properties:
      definition:
        StartAt: 'checkAndHold'
        States:
          checkAndHold:
            Type: Task
            Resource: $Param('checkAndHoldProduct', 'LambdaFunction::Arn')
            Next: bill
          bill:
            Type: Task
            Resource: $Param('billCustomer', 'LambdaFunction::Arn')
            Next: notify
          notify:
            Type: Task
            Resource: $Param('shipmentNotification', 'LambdaFunction::Arn')
            Next: succeed
          succeed:
            Type: Succeed
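Workflows do not have to be linear. The sketch below shows how the `States` section of the order-payment flow might branch on the billing result using a `Choice` state. It assumes, hypothetically, that the `billCustomer` function returns an object with a boolean `paymentSucceeded` field; the `checkPayment` and `paymentFailed` states are likewise illustrative and not part of the original example.

```yaml
States:
  bill:
    Type: Task
    Resource: $Param('billCustomer', 'LambdaFunction::Arn')
    Next: checkPayment
  checkPayment:
    # Routes the execution based on the output of the previous state.
    Type: Choice
    Choices:
      # Assumed output field; adjust to whatever the function really returns.
      - Variable: '$.paymentSucceeded'
        BooleanEquals: true
        Next: notify
    # Taken when no rule in Choices matches.
    Default: paymentFailed
  paymentFailed:
    # Terminal state that marks the execution as failed.
    Type: Fail
    Error: 'PaymentFailed'
    Cause: 'Customer could not be billed'
```

The remaining states (`checkAndHold`, `notify`, `succeed`) would stay as they are in the full definition.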
Retry example
The following example shows:
- generateReport - a batch job that generates a report.
- uploadReport - a function that uploads the generated report.
- reportStateMachine - a state machine that ties the above workloads together.
State machine definitions provide great flexibility. In this case, reportStateMachine retries only the upload step of the workflow, since regenerating the report after an upload failure would be costly and redundant.
resources:
  uploadReport:
    type: function
    properties:
      packageConfig:
        filePath: 'upload-report.ts'
  generateReport:
    type: 'batch-job'
    properties:
      container:
        imageConfig:
          filePath: generate-report.ts
      resources:
        cpu: 2
        memory: 7800
  reportStateMachine:
    type: 'state-machine'
    properties:
      definition:
        StartAt: 'generate'
        States:
          generate:
            Type: Task
            # Submits the batch job and waits for it to finish (.sync)
            Resource: 'arn:aws:states:::batch:submitJob.sync'
            Parameters:
              JobDefinition: $Param('generateReport', 'JobDefinition::Arn')
              JobName: report-job
              JobQueue: $Param('SHARED_GLOBAL', 'BatchOnDemandJobQueue::Arn')
            Next: upload
          upload:
            Type: Task
            Resource: $Param('uploadReport', 'LambdaFunction::Arn')
            Next: succeed
            # Only the upload step is retried; the report is not regenerated
            Retry:
              - ErrorEquals:
                  # 'States.ALL' is the predefined wildcard error name
                  - 'States.ALL'
                IntervalSeconds: 10
          succeed:
            Type: Succeed
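Retry behavior can be tuned further. The sketch below shows only the `upload` state with the standard Amazon States Language retrier fields `MaxAttempts` and `BackoffRate` set explicitly, plus a `Catch` fallback that runs once retries are exhausted. The specific values and the `uploadFailed` state are illustrative assumptions, not part of the original example.

```yaml
upload:
  Type: Task
  Resource: $Param('uploadReport', 'LambdaFunction::Arn')
  Retry:
    - ErrorEquals:
        - 'States.ALL'
      # Wait 10 seconds before the first retry.
      IntervalSeconds: 10
      # Assumed value: give up after three retries.
      MaxAttempts: 3
      # Multiply the wait by 2 each time: 10s, 20s, 40s.
      BackoffRate: 2
  Catch:
    # Runs only after the retries above are exhausted.
    - ErrorEquals:
        - 'States.ALL'
      Next: uploadFailed
  Next: succeed
uploadFailed:
  # Hypothetical terminal state that marks the execution as failed.
  Type: Fail
  Error: 'UploadFailed'
  Cause: 'Report upload failed after all retries'
```

Keeping the `Retry` block on `upload` alone preserves the original intent: a failed upload never triggers another run of the costly `generate` step.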