lolamarket-notes/notes/Great Expectations Lifecycle.md

2023-11-17 15:13:27 +01:00
This document explains how we tackle the different parts of the Great Expectations Lifecycle.
## The Lifecycle
The lifecycle covers every step from deciding that some data needs recurring validation to decommissioning that validation. It has three main stages:
- The setup: deciding what is expected of the data, designing the checkpoint, testing it out and putting it into production.
- The operation and iteration: running your data validation and updating it over time as business and data environments change.
- The teardown: decommissioning a data validation that is no longer needed.
## Setup
Setting up data validation on an ETL or other automated job means creating a new Great Expectations checkpoint. A checkpoint can only be created once the datasources and expectation suites it relies on are configured and ready to use.
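To make the dependency between the three pieces concrete, here is a minimal conceptual sketch in plain Python. It is not the Great Expectations API: the `Checkpoint` class, its fields and the example expectations are all hypothetical and only model the idea that a checkpoint needs a datasource and an expectation suite before it can run, and that it triggers actions afterwards.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]
Expectation = Callable[[List[Row]], bool]


@dataclass
class Checkpoint:
    """Hypothetical model of a checkpoint, not the Great Expectations class."""

    name: str
    datasource: Callable[[], List[Row]]  # returns the batch to validate
    suite: Dict[str, Expectation]        # expectation name -> check
    actions: List[Callable[[Dict[str, bool]], None]] = field(default_factory=list)

    def run(self) -> bool:
        batch = self.datasource()
        results = {name: check(batch) for name, check in self.suite.items()}
        # Actions run after validation, whether it passed or not.
        for action in self.actions:
            action(results)
        return all(results.values())


log: List[Dict[str, bool]] = []
checkpoint = Checkpoint(
    name="orders_checkpoint",
    datasource=lambda: [{"id": 1, "amount": 10.0}, {"id": 2, "amount": None}],
    suite={
        "amount_not_null": lambda rows: all(r["amount"] is not None for r in rows),
        "id_unique": lambda rows: len({r["id"] for r in rows}) == len(rows),
    },
    actions=[log.append],
)
success = checkpoint.run()
```

In this toy run the batch contains a null `amount`, so the checkpoint reports failure and the single action records the per-expectation results.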
### Creating a Datasource
### Creating an Expectation Suite
### Creating a Checkpoint
## Operation and Iteration
### Integrating the checkpoint in a Prefect Flow
- Staging and quarantine strategy
- Using transactions to rollback
- Slack alerts
- Checking validation docs after failure
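The "using transactions to rollback" item in the list above can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard-library `sqlite3` module in place of the real warehouse, the `orders` table is made up, and `batch_is_valid` is a hypothetical stand-in for running a checkpoint. It does not use Prefect or Great Expectations.

```python
import sqlite3


def batch_is_valid(conn: sqlite3.Connection) -> bool:
    """Hypothetical stand-in for a checkpoint run: no negative amounts allowed."""
    (bad_rows,) = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE amount < 0"
    ).fetchone()
    return bad_rows == 0


def load_orders(conn: sqlite3.Connection, rows) -> bool:
    """Write rows, validate inside the open transaction, then commit or roll back."""
    conn.executemany("INSERT INTO orders (id, amount) VALUES (?, ?)", rows)
    if batch_is_valid(conn):
        conn.commit()
        return True
    conn.rollback()  # discard the whole batch, including its valid rows
    return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")

good_loaded = load_orders(conn, [(1, 10.0), (2, 5.5)])
bad_loaded = load_orders(conn, [(3, -1.0), (4, 2.0)])
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The design choice here is to validate after writing but before committing, so a failed validation leaves the target table exactly as it was. The second batch is rejected as a whole, and only the two rows from the first batch remain.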
## Glossary
| Term | Meaning |
| ---------- | ---------------------------------------------------------------------------------------------------- |
| Checkpoint | A recipe that ties together datasources, expectation suites and actions to execute after validating. |