This document explains how we tackle the different parts of the Great Expectations Lifecycle.
## The Lifecycle
The lifecycle covers every step from deciding that a dataset needs recurring validation to decommissioning that validation. It has three main stages:

- The setup: deciding what is expected of the data, designing the checkpoint, testing it out and putting it into production.
- The operation and iteration: operating your data validation and updating it over time as business and data environments change.
- The teardown: decommissioning a data validation that is no longer needed.
## Setup
Setting up data validation on an ETL or other automated job means creating a new Great Expectations checkpoint. Datasources and expectation suites must be configured and ready to use before the checkpoint can be created.
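
The ordering constraint above can be sketched in plain Python. This is only an illustration of the dependency, not the Great Expectations API: `ProjectConfig`, `add_checkpoint`, and the names `warehouse`, `orders_suite` and `orders_checkpoint` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectConfig:
    """Hypothetical in-memory record of what has been configured so far."""

    datasources: set = field(default_factory=set)
    expectation_suites: set = field(default_factory=set)
    checkpoints: dict = field(default_factory=dict)

    def add_checkpoint(self, name: str, datasource: str, suite: str) -> None:
        """Register a checkpoint only if both of its prerequisites exist."""
        missing = []
        if datasource not in self.datasources:
            missing.append(f"datasource '{datasource}'")
        if suite not in self.expectation_suites:
            missing.append(f"expectation suite '{suite}'")
        if missing:
            raise LookupError("configure first: " + ", ".join(missing))
        self.checkpoints[name] = {"datasource": datasource, "suite": suite}


project = ProjectConfig()

# Creating the checkpoint before its prerequisites exist is rejected.
try:
    project.add_checkpoint("orders_checkpoint", "warehouse", "orders_suite")
    created_too_early = True
except LookupError:
    created_too_early = False

# Once the datasource and the suite are in place, the checkpoint can be created.
project.datasources.add("warehouse")
project.expectation_suites.add("orders_suite")
project.add_checkpoint("orders_checkpoint", "warehouse", "orders_suite")
```

In a real project the same order applies: datasource first, then expectation suite, then the checkpoint that references both.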
### Creating a Datasource
### Creating an Expectation Suite
### Creating a Checkpoint
## Operation and Iteration
### Integrating the checkpoint in a Prefect Flow
- Staging and quarantine strategy
- Using transactions to rollback
- Slack alerts
- Checking validation docs after failure
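
The rollback and quarantine items above could be combined roughly as follows. This is a minimal sketch under stated assumptions: it uses the standard-library `sqlite3` module in place of the real warehouse, the tables `orders` and `orders_quarantine` are hypothetical, and `batch_is_valid` is a hypothetical stand-in for running the checkpoint, not a Great Expectations or Prefect call.

```python
import sqlite3


def batch_is_valid(conn: sqlite3.Connection) -> bool:
    """Hypothetical stand-in for a checkpoint run: amounts must be non-null and non-negative."""
    bad = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE amount IS NULL OR amount < 0"
    ).fetchone()[0]
    return bad == 0


def load_batch(conn: sqlite3.Connection, rows: list) -> bool:
    """Load rows, validate before committing, and quarantine the batch on failure."""
    try:
        # The INSERT implicitly opens a transaction; nothing is visible to
        # other connections until commit().
        conn.executemany("INSERT INTO orders (id, amount) VALUES (?, ?)", rows)
        if batch_is_valid(conn):
            conn.commit()
            return True
        # Validation failed: undo the uncommitted inserts...
        conn.rollback()
        # ...and keep the rejected batch aside for later inspection.
        conn.executemany(
            "INSERT INTO orders_quarantine (id, amount) VALUES (?, ?)", rows
        )
        conn.commit()
        return False
    except Exception:
        # Any unexpected error also leaves the target table untouched.
        conn.rollback()
        raise


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("CREATE TABLE orders_quarantine (id INTEGER, amount REAL)")
conn.commit()

good_loaded = load_batch(conn, [(1, 10.0), (2, 25.5)])
bad_loaded = load_batch(conn, [(3, 7.0), (4, -1.0)])

orders_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
quarantine_count = conn.execute(
    "SELECT COUNT(*) FROM orders_quarantine"
).fetchone()[0]
```

Validating inside the open transaction means a failed batch never becomes visible in the target table, while the quarantine table keeps the rejected rows available for debugging.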
## Glossary
| Term       | Meaning                                                                                              |
| ---------- | ---------------------------------------------------------------------------------------------------- |
| Checkpoint | A recipe that ties together datasources, expectation suites and actions to execute after validating. |