Merged PR 5675: Challenge for DE
parent 16841e5c5d
commit b7f95fa0be
10 changed files with 4660 additions and 0 deletions
@@ -0,0 +1,44 @@
# Truvi Data Engineering Challenge

Welcome to Truvi's Data Engineering Challenge. This challenge is a small, toy sample of the kind of work we do on a daily basis. Please read through the challenge in detail. Reach out to the recruiting manager if anything is unclear.

## Context

Truvi has recently signed a contract with House Group Holiday Rentals (HG), a Property Management Company. HG operates holiday properties for multiple Owner Companies. This means the properties are owned by the Owner Companies, but HG runs the rental operation for them. As part of its services, HG has hired Truvi to protect the bookings that happen in the properties of the Owner Companies. Our specific agreement is that:

- All bookings that happen in their properties will be protected by Truvi (that is, if the guest damages the property, we will cover it).
- In exchange, Truvi will invoice 10GBP/14USD/12EUR (depending on the country of the Owner Company) per booking on the check-out date.
- Invoicing will be done in monthly cycles. That means that each Owner Company will get one invoice each month, covering the bookings that checked out within that month.
- In addition, there is a minimum fee per Owner Company and month of 100GBP/140USD/120EUR. If there are no bookings, or the fees for the bookings in a month don't add up to those amounts, the minimum fee will still be charged to the Owner Company.

To implement the invoicing of this deal, we will need to integrate with HG's systems, fetch their booking data and process it.

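To make the fee rules above concrete, here is a minimal Python sketch of the amount invoiced to one Owner Company for one month. The function name `monthly_invoice` and the country-to-currency mapping are assumptions made for illustration (the mapping only covers the countries that appear in the sample API responses); the agreement itself only states the three fee levels.

```python
from decimal import Decimal

# Per-booking fee and monthly minimum fee, as stated in the agreement.
FEES = {
    "GBP": (Decimal("10"), Decimal("100")),
    "USD": (Decimal("14"), Decimal("140")),
    "EUR": (Decimal("12"), Decimal("120")),
}

# ASSUMPTION: the challenge text does not define this mapping.
COUNTRY_CURRENCY = {"UK": "GBP", "USA": "USD", "France": "EUR"}


def monthly_invoice(country: str, bookings_checked_out: int) -> tuple:
    """Return (amount, currency) for one Owner Company and one month."""
    currency = COUNTRY_CURRENCY[country]
    per_booking, minimum = FEES[currency]
    # The minimum fee applies whenever the booking fees fall short of it.
    amount = max(per_booking * bookings_checked_out, minimum)
    return amount, currency


print(monthly_invoice("UK", 3))      # 3 * 10 = 30 is below the minimum of 100
print(monthly_invoice("USA", 12))    # 12 * 14 = 168 is above the minimum of 140
print(monthly_invoice("France", 0))  # no bookings, so the minimum of 120 applies
```

Note that a month with no bookings at all still produces an invoice, which matters later when the data is aggregated: such months will not show up in the booking data on their own.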
## What we need

The goal of the challenge is to build a toy architecture that integrates with our customer's system, ingests their data and processes it to deliver a clean summary table for our Data Analysts and finance team colleagues. The final table must show, for each Owner Company and month, the revenue both in the original currency and converted into GBP.

## Challenge and constraints

We would like you to set up a small system to ingest and process the data described above. The goal is to end up with a running SQL database that holds the table described in the previous section (we will simply refer to it as "the final table").

You have been given a folder named `fake_api`. Check the `README.md` in that folder to run a toy HTTP API that mocks the customer's system. We expect you to use it. You are not expected to modify, iterate on or improve this API in any way.

You should also have received a CSV file named `currency_rates.csv` with currency exchange rates.

Your solution should:

- Create a SQL database. Feel free to use whatever SQL database you feel comfortable with.
- Build some way to ingest the data held in the fake API into the database. Feel free to build it as close as you can to what you would deliver in a production system. (NOTE: we understand this is a hiring test and you have limited time and energy. We DON'T expect a PERFECT, PRODUCTION-grade delivery, but the more quality you pack in, the better we will appreciate your skills. It's also OK to consciously leave some bits imperfect and then proactively discuss in the interview how you would build those parts in a real environment, so we can learn about your ideas.)
- Load the data in `currency_rates.csv` into the database to use it as part of your transformations. You can do this in any way you want, even a dirty and unsustainable one. We won't judge this bit beyond checking that the data is loaded. For this exercise, we will just assume that the real context would provide you with timely currency rate data.
- Within the database, do some transformations to deliver the final table.
- Include a way to easily print part of the final table contents.

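The steps in the list above can be sketched end to end in a few lines. The following is only an illustration under several assumptions: it uses an in-memory SQLite database as a stand-in for whatever SQL database you choose, the table and column names (`fee_terms`, `currency_rates`, `monthly_revenue`, `rate_to_gbp`) are hypothetical, the layout of `currency_rates.csv` is not known here, and the inserted rows and rates are made up. It also does not handle months in which an Owner Company has no bookings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the SQL database of your choice
conn.executescript("""
CREATE TABLE bookings (
    booking_id TEXT PRIMARY KEY,
    check_out_date TEXT,
    owner_company TEXT,
    owner_company_country TEXT
);
-- ASSUMPTION: fee terms stored as reference data, keyed by country.
CREATE TABLE fee_terms (country TEXT PRIMARY KEY, currency TEXT, fee REAL, min_fee REAL);
-- ASSUMPTION: one rate per month and currency; the real CSV may differ.
CREATE TABLE currency_rates (month TEXT, currency TEXT, rate_to_gbp REAL);
""")
conn.executemany("INSERT INTO fee_terms VALUES (?, ?, ?, ?)", [
    ("UK", "GBP", 10, 100), ("USA", "USD", 14, 140), ("France", "EUR", 12, 120),
])
# Made-up rates and bookings, for illustration only.
conn.executemany("INSERT INTO currency_rates VALUES (?, ?, ?)", [
    ("2024-12", "GBP", 1.0), ("2024-12", "EUR", 0.8),
])
conn.executemany("INSERT INTO bookings VALUES (?, ?, ?, ?)", [
    ("a1", "2024-12-05 01:16:23", "Campos PLC", "France"),
    ("a2", "2024-12-31 22:39:18", "Campos PLC", "France"),
    ("a3", "2024-12-31 21:11:07", "Faulkner-Howard", "UK"),
])

# Two-argument MAX is SQLite's scalar maximum; other databases
# typically call this GREATEST.
conn.execute("""
CREATE TABLE monthly_revenue AS
WITH monthly AS (
    SELECT owner_company,
           owner_company_country AS country,
           strftime('%Y-%m', check_out_date) AS invoice_month,
           COUNT(*) AS n_bookings
    FROM bookings
    GROUP BY owner_company, owner_company_country,
             strftime('%Y-%m', check_out_date)
)
SELECT m.owner_company,
       m.invoice_month,
       t.currency,
       MAX(m.n_bookings * t.fee, t.min_fee) AS revenue_original,
       MAX(m.n_bookings * t.fee, t.min_fee) * r.rate_to_gbp AS revenue_gbp
FROM monthly m
JOIN fee_terms t ON t.country = m.country
JOIN currency_rates r
  ON r.currency = t.currency AND r.month = m.invoice_month
""")

# A simple way to print part of the final table.
rows = conn.execute(
    "SELECT * FROM monthly_revenue ORDER BY owner_company LIMIT 10"
).fetchall()
for row in rows:
    print(row)
```

Grouping by the month of `check_out_date` follows the rule that bookings are invoiced in the month they check out. Covering months without bookings would need an extra step, for example joining against a generated list of months per Owner Company.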
Some guidelines:

- Please deliver your solution via a GitHub repo that contains all relevant code, files, docs and other artifacts.
- We usually work on Linux systems, so we would appreciate it if your solution is runnable on Linux, Mac or WSL.
- We would appreciate it if your example is clear and documented well enough that someone can run it just by reading your submission. You can assume this person knows their way around whatever tooling you're using. You can also assume that we're happy running this on our laptops; there's no need to bother with any sophisticated infra.
- Feel free to deploy the SQL database in whatever way you want, but we feel a Docker container is the simplest approach for our context. Again, you can do it differently; just keep in mind that we would like to be able to execute your solution.
- Feel free to ignore anything related to authentication and security. Although we would care about such topics in production, we won't for this challenge.
- Even though orchestration and monitoring are important topics that we will surely discuss with you in the interview, we don't expect your solution to address them. It's fine if your solution only runs once and produces little more than some terminal output.
- Feel free to add any additional documentation, explainers or human-readable bits you find relevant. For example, if you need to make assumptions when building your solution, we encourage you to list them.

File diff suppressed because it is too large
@@ -0,0 +1,111 @@
# Fake API

A tiny script + CSV to fake the API of House Group Holiday Rentals.

## How to run

- Create a dedicated `venv` and install the packages listed in `requirements.txt`.
- Move your terminal to this directory and run `python3 fake_api.py`.
- This will start the API and serve it on port `5000` until you stop it with Ctrl+C.

## How to request

The API has a single endpoint, `/api/bookings`. This endpoint implements pagination, sorting and some basic filtering. Below you can find a few `curl` calls and their responses to get an idea of how the API works. Once the API is running, you can open a new terminal and use them to check that it works.

- Bookings for a certain check-in date:

```bash
curl "http://localhost:5000/api/bookings?check_in_date=2024-10-01"

{
  "page": 1,
  "per_page": 2,
  "results": [
    {
      "booking_id": "b3c810d6-8ab9-41f5-9810-ed809d1d1c64",
      "check_in_date": "2024-06-20 22:12:04",
      "check_out_date": "2024-06-24 22:12:04",
      "owner_company": "Garcia, Hamilton and Carr",
      "owner_company_country": "USA"
    },
    {
      "booking_id": "e019d228-3fff-46dd-b54e-a98364a5399a",
      "check_in_date": "2024-12-02 01:16:23",
      "check_out_date": "2024-12-05 01:16:23",
      "owner_company": "Campos PLC",
      "owner_company_country": "France"
    }
  ],
  "total": 1000
}
```

- Bookings for a country, with pagination being used:

```bash
curl "http://localhost:5000/api/bookings?owner_company_country=France&page=1&per_page=3"

{
  "page": 1,
  "per_page": 3,
  "results": [
    {
      "booking_id": "e019d228-3fff-46dd-b54e-a98364a5399a",
      "check_in_date": "2024-12-02 01:16:23",
      "check_out_date": "2024-12-05 01:16:23",
      "owner_company": "Campos PLC",
      "owner_company_country": "France"
    },
    {
      "booking_id": "b0cd14f7-5bdd-4cc1-900c-4f193b26c0ae",
      "check_in_date": "2024-04-11 03:57:51",
      "check_out_date": "2024-04-21 03:57:51",
      "owner_company": "Campos PLC",
      "owner_company_country": "France"
    },
    {
      "booking_id": "ca678537-d032-4f4e-9135-9fbba287b00d",
      "check_in_date": "2024-11-08 15:24:11",
      "check_out_date": "2024-11-11 15:24:11",
      "owner_company": "Campos PLC",
      "owner_company_country": "France"
    }
  ],
  "total": 97
}
```

- Sorted and paginated:

```bash
curl "http://localhost:5000/api/bookings?sort_by=check_out_date&sort_order=desc&page=1&per_page=3"

{
  "page": 1,
  "per_page": 3,
  "results": [
    {
      "booking_id": "007d0909-1dc2-4a0d-bb3f-a925321bd09b",
      "check_in_date": "2024-12-29 22:39:18",
      "check_out_date": "2024-12-31 22:39:18",
      "owner_company": "Campos PLC",
      "owner_company_country": "France"
    },
    {
      "booking_id": "bffccb7b-f3a8-40bc-9142-4f6964a3e44a",
      "check_in_date": "2024-12-29 21:11:07",
      "check_out_date": "2024-12-31 21:11:07",
      "owner_company": "Faulkner-Howard",
      "owner_company_country": "UK"
    },
    {
      "booking_id": "f71fefb6-38ef-4b9a-b5a6-08014417b91f",
      "check_in_date": "2024-12-28 20:12:36",
      "check_out_date": "2024-12-31 20:12:36",
      "owner_company": "Jones, Jefferson and Rivera",
      "owner_company_country": "USA"
    }
  ],
  "total": 1000
}
```
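
The same requests can be made from Python. The sketch below walks through all pages of `/api/bookings` using only the standard library. The names `fetch_page` and `iter_bookings` are hypothetical, and the `fetch` parameter is there only so the pagination logic can be exercised without a running server; the stop condition relies on the `total`, `page` and `per_page` fields shown in the responses above.

```python
import json
from typing import Callable, Iterator
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:5000/api/bookings"


def fetch_page(params: dict) -> dict:
    """Request one page from the fake API and decode the JSON body."""
    with urlopen(f"{BASE_URL}?{urlencode(params)}") as response:
        return json.load(response)


def iter_bookings(per_page: int = 100,
                  fetch: Callable[[dict], dict] = fetch_page,
                  **filters: str) -> Iterator[dict]:
    """Yield every booking matching `filters`, one page at a time."""
    page = 1
    while True:
        # Sorting on a fixed key keeps the page boundaries deterministic.
        body = fetch({**filters, "sort_by": "booking_id",
                      "page": page, "per_page": per_page})
        yield from body["results"]
        # Stop once the pages seen so far cover `total`, or a page is empty.
        if page * per_page >= body["total"] or not body["results"]:
            break
        page += 1
```

With the fake API running, `list(iter_bookings(owner_company_country="France"))` would collect all bookings for that country across pages.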
@@ -0,0 +1,60 @@
from flask import Flask, jsonify, request
import csv

app = Flask(__name__)
CSV_FILE = 'fake_bookings.csv'

ALLOWED_FIELDS = [
    "booking_id",
    "check_in_date",
    "check_out_date",
    "owner_company",
    "owner_company_country"
]

def load_data():
    with open(CSV_FILE, newline='') as csvfile:
        return list(csv.DictReader(csvfile))

# Load data once at startup
bookings = load_data()

@app.route('/api/bookings')
def get_bookings():
    filters = request.args
    filtered = bookings.copy()

    # --- Filtering ---
    for key, value in filters.items():
        if key in ALLOWED_FIELDS:
            filtered = [item for item in filtered if value.lower() in item.get(key, '').lower()]

    # --- Sorting ---
    sort_by = filters.get("sort_by")
    sort_order = filters.get("sort_order", "asc").lower()
    if sort_by in ALLOWED_FIELDS:
        filtered.sort(
            key=lambda x: x.get(sort_by, "").lower(),
            reverse=(sort_order == "desc")
        )

    # --- Pagination ---
    try:
        page = int(filters.get("page", 1))
        per_page = int(filters.get("per_page", 10))
    except ValueError:
        return jsonify({"error": "Invalid pagination values"}), 400

    start = (page - 1) * per_page
    end = start + per_page
    paginated = filtered[start:end]

    return jsonify({
        "total": len(filtered),
        "page": page,
        "per_page": per_page,
        "results": paginated
    })

if __name__ == '__main__':
    app.run(debug=True)
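
Reading the filtering loop in `get_bookings()`, a query parameter matches a booking when its value is a case-insensitive substring of the corresponding field, not only when it is equal to it. The small sketch below mirrors that expression in a standalone helper (the name `matches` and the sample record are illustrative) to show what this means for date and name filters.

```python
def matches(item: dict, key: str, value: str) -> bool:
    """Mirror the filter expression used in get_bookings()."""
    return value.lower() in item.get(key, "").lower()


booking = {
    "check_out_date": "2024-12-05 01:16:23",
    "owner_company": "Campos PLC",
}

print(matches(booking, "check_out_date", "2024-12-05"))  # date part of the timestamp
print(matches(booking, "check_out_date", "2024-12"))     # any day in that month
print(matches(booking, "owner_company", "campos"))       # partial, case-insensitive name
print(matches(booking, "check_out_date", "2024-11"))     # a different month
```

The first three calls return `True` and the last one returns `False`, so a request such as `check_out_date=2024-12` selects every booking that checks out in that month.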
File diff suppressed because it is too large
@@ -0,0 +1 @@
flask