lolamarket-notes/notes/Great Expectations Lifecycle.md
2023-11-17 15:13:27 +01:00

This document explains how we tackle the different parts of the Great Expectations Lifecycle.

The Lifecycle

The lifecycle we refer to covers every step from deciding that some data needs to be validated on a recurring basis to decommissioning that validation. It has three main stages:

  • The setup: deciding what is expected of the data, designing the checkpoint, testing it out and putting it into production.
  • The operation and iteration: operating your data validation and updating it over time as business and data environments change.
  • The teardown: decommissioning a data validation that is no longer needed.

Setup

Setting up data validation on an ETL or other automated job means creating a new Great Expectations checkpoint. Before the checkpoint can be created, the datasources and expectation suites it relies on must be configured and ready to use.
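To make the dependency concrete, here is a minimal sketch of a checkpoint definition written as a plain Python dictionary. It assumes the block-style checkpoint configuration used by Great Expectations 0.x releases; the exact keys depend on the installed version. Every name in it (`orders_checkpoint`, `warehouse`, `orders`, `orders_suite`) and the helper `referenced_prerequisites` are hypothetical and only serve the illustration.

```python
# Sketch only: key layout assumes the block-style checkpoint configuration
# of Great Expectations 0.x; all names below are hypothetical.
checkpoint_config = {
    "name": "orders_checkpoint",
    "config_version": 1.0,
    "class_name": "Checkpoint",
    "validations": [
        {
            # Which data to validate: points at an existing datasource.
            "batch_request": {
                "datasource_name": "warehouse",
                "data_connector_name": "default_inferred_data_connector_name",
                "data_asset_name": "orders",
            },
            # What to validate it against: an existing expectation suite.
            "expectation_suite_name": "orders_suite",
        }
    ],
    # What to do once validation has run.
    "action_list": [
        {
            "name": "store_validation_result",
            "action": {"class_name": "StoreValidationResultAction"},
        },
        {
            "name": "update_data_docs",
            "action": {"class_name": "UpdateDataDocsAction"},
        },
    ],
}


def referenced_prerequisites(config):
    """Return the datasource and suite names a checkpoint config depends on."""
    datasources = set()
    suites = set()
    for validation in config["validations"]:
        datasources.add(validation["batch_request"]["datasource_name"])
        suites.add(validation["expectation_suite_name"])
    return datasources, suites


if __name__ == "__main__":
    datasources, suites = referenced_prerequisites(checkpoint_config)
    print("Datasources that must exist first:", sorted(datasources))
    print("Expectation suites that must exist first:", sorted(suites))
```

Listing the referenced names this way is a quick check that each prerequisite has been created before the checkpoint is registered.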

Creating a Datasource

Creating an Expectation Suite

Creating a Checkpoint

Operation and Iteration

Integrating the checkpoint in a Prefect Flow

  • Staging and quarantine strategy
  • Using transactions to rollback
  • Slack alerts
  • Checking validation docs after failure
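As an illustration of the transaction-based rollback idea, the sketch below loads rows and validates them inside a single database transaction, committing only when validation passes. It is not the setup used in our flows: it uses an in-memory SQLite database from the standard library, a hypothetical `orders` table, and a hypothetical `validate_rows` function that stands in for running the checkpoint. In a Prefect flow, `load_with_validation` would be the body of a task.

```python
import sqlite3


def validate_rows(conn):
    """Hypothetical stand-in for a checkpoint run.

    Passes only if no row has a NULL or negative amount.
    """
    bad = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE amount IS NULL OR amount < 0"
    ).fetchone()[0]
    return bad == 0


def load_with_validation(conn, rows):
    """Insert rows and validate them inside one transaction.

    Commits and returns True when validation passes; rolls back and
    returns False when it fails, leaving the table unchanged.
    """
    conn.execute("BEGIN")
    try:
        conn.executemany("INSERT INTO orders (id, amount) VALUES (?, ?)", rows)
        if validate_rows(conn):
            conn.execute("COMMIT")
            return True
        conn.execute("ROLLBACK")
        return False
    except Exception:
        # Any unexpected error also undoes the partial load.
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise


if __name__ == "__main__":
    # isolation_level=None puts the connection in autocommit mode, so the
    # explicit BEGIN / COMMIT / ROLLBACK statements control the transaction.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")

    print(load_with_validation(conn, [(1, 10.0), (2, 5.5)]))
    print(load_with_validation(conn, [(3, -1.0), (4, 2.0)]))
    print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])
```

Because validation runs before the commit, a failed batch never becomes visible to downstream readers. The trade-off is that the transaction stays open for the duration of the validation.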

Glossary

| Term | Meaning |
| --- | --- |
| Checkpoint | A recipe that ties together datasources, expectation suites and actions to execute after validating. |