readme
This commit is contained in:
parent
d8dc1c917a
commit
5083cbd35b
1 changed file with 3 additions and 3 deletions
@@ -35,7 +35,7 @@ Syncs are incremental in nature. `anaxi` keeps track of what's the most up to da
 Different streams are independent of each other: a run of one stream won't affect the others in any way.
-To set up a stream, `anaxi` expects to find a file called `streams.yml` at `~/.anaxi/streams.yml`. See the example file in this repo, `example-streams.yml`, to understand how to build it. Each entry in the file represents one stream. The `cosmos_database_id` and `postgres_database` fields in each stream entry should be filled in with values that you have defined in the `cosmos-db.yml` and `postgres.yml` files.
+To set up a stream, `anaxi` expects to find a file called `streams.yml` at `~/.anaxi/streams.yml`. See the example file in this repo, `example-streams.yml`, to understand how to build it. Each entry in the file represents one stream. The `cosmos_database_id` and `postgres_database` fields in each stream entry should be filled in with values that you have defined in the `cosmos-db.yml` and `postgres.yml` files. The `cutoff_timestamp` field lets you specify an ISO 8601 timestamp to use as the starting point for reading data when no checkpoint is available. Leave it empty to read all records from the start of the container history.
 You will also need to create a folder named `checkpoints` at `~/.anaxi/checkpoints`. The checkpoint state of each stream is kept there, in a separate file per stream.
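The precedence described above (an existing checkpoint first, then `cutoff_timestamp`, then the full container history) can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not `anaxi`'s actual code: the function name `resolve_sync_start` and its arguments are hypothetical, and treating timestamps without an offset as UTC is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def resolve_sync_start(
    checkpoint: Optional[str],
    cutoff_timestamp: Optional[str],
) -> Optional[datetime]:
    """Return the UTC timestamp a sync should start from, or None for full history.

    Hypothetical helper: the precedence mirrors the README text
    (checkpoint, then cutoff_timestamp, then everything).
    """
    for candidate in (checkpoint, cutoff_timestamp):
        if candidate:  # skip missing or empty values
            parsed = datetime.fromisoformat(candidate)
            if parsed.tzinfo is None:
                # Assumption: a timestamp without an offset is taken as UTC.
                parsed = parsed.replace(tzinfo=timezone.utc)
            return parsed.astimezone(timezone.utc)
    # Neither value is set: read from the start of the container history.
    return None


if __name__ == "__main__":
    # A stored checkpoint wins over the configured cutoff.
    print(resolve_sync_start("2024-08-16T09:02:23+00:00", "2024-01-01T00:00:00+00:00"))
    # No checkpoint yet: fall back to the cutoff.
    print(resolve_sync_start(None, "2024-01-01T00:00:00+00:00"))
    # Nothing configured: full history.
    print(resolve_sync_start(None, None))
```

Returning `None` stands in for "read from the start of the container history"; a real implementation might represent that case differently.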
@@ -76,8 +76,8 @@ anaxi sync-stream --stream-id <your-stream-name>
 - Create a cron entry with `crontab -e` that runs the script. For example: `0 2 * * * /bin/bash /home/azureuser/run_anaxi.sh` to run syncs every day at 2AM.
 - If you want to run syncs at different frequencies, you can make different copies of `run_anaxi.sh` and schedule them independently.
 - Backfilling and first runs
-- When running the first sync for a stream, `anaxi` will by default start reading records from the start of the source Cosmos DB container. In some cases this is what you want, and you don't need to take any special action.
+- When running the first sync for a stream, `anaxi` will by default start reading records from the specified `cutoff_timestamp`.
-- On the other hand, if you only want to sync from a specific point in time, you can achieve this by creating a file in the checkpoints folder. The file should be named `<name-of-your-stream>.yml` and contain a single key named `highest_synced_timestamp`, whose value is the timestamp from which you want the sync to begin (in UTC!). For example, if I wanted to start at the time I'm writing this, this would be the content of the file:
+- If no `cutoff_timestamp` is provided, `anaxi` will read the container's full history from the very beginning.
 ```yml
 highest_synced_timestamp: '2024-08-16T09:02:23+00:00'