# Takehome assessment guidelines
Fundamental, basic stuff:
- Did the candidate deliver their work as a complete, self-contained git repo?
- Is their solution sufficiently documented to understand what is going on?
- Can you run it? Is it a smooth process, or do you have to figure out a lot of things along the way? Note that some degree of figuring things out and tweaking on our side is expected, given that the candidate doesn't know what platform their solution will be running on.

Regarding the business context:
- Did they understand all the requirements properly, or is there some obvious confusion?
- Regarding the deliverable:
  - Did they attribute bookings to the check-out date?
  - Did they use the right currency for each country?
  - Did they implement the minimum fee correctly?
  - Did they handle combinations of Owner Company and month that have 0 bookings in some way? (See the SQL sketch after this list.)
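
For these delivery points, one possible shape of the final query is sketched below. It is purely illustrative: the schema, column names and fee formula are hypothetical, and the candidate's actual solution will look different.

```sql
-- Hypothetical sketch only: table/column names and the fee formula are made up.
WITH months AS (
    -- assumed reporting window
    SELECT generate_series(date '2024-01-01', date '2024-12-01', interval '1 month') AS month
),
combos AS (
    -- every (owner company, month) pair, so months with 0 bookings are not silently dropped
    SELECT oc.owner_company_id, oc.currency, m.month
    FROM owner_companies oc
    CROSS JOIN months m
),
monthly_fees AS (
    SELECT
        b.owner_company_id,
        date_trunc('month', b.checkout_date) AS month,   -- bookings attributed to the check-out date
        SUM(GREATEST(b.fee, b.minimum_fee)) AS fee        -- minimum fee enforced per booking
    FROM bookings b
    GROUP BY 1, 2
)
SELECT
    c.owner_company_id,
    c.month,
    COALESCE(f.fee, 0) AS fee,   -- 0 for Owner Company / month combinations with no bookings
    c.currency                   -- currency driven by the owner company's country (modelled here as a column)
FROM combos c
LEFT JOIN monthly_fees f
  ON f.owner_company_id = c.owner_company_id
 AND f.month = c.month;
```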

Regarding the usage of git:
- Did they make small, purposeful commits, or did they blast everything into just a few commits or one monster commit?
- Note that I wouldn't discard someone just for making a single monster commit, but displaying step-by-step, small-increment behaviour is a good sign.

Regarding the deployment:
- Is it complete and exhaustive? Do they clearly list dependencies?
- Do they provide you with easy tools and utilities to run things?
- Did they include any way to test their deployment?

Regarding the Extract and Load from the fake API to the database:
- Is it clean and readable?
- Did they put any measures in place for logging? (One database-side pattern is sketched after this list.)
- Does their solution load data in one single batch, or in smaller batches?
- Does he do
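
The extract-and-load code itself will be in whatever language the candidate picked, so there is no single right answer here. One database-side pattern that makes batching and logging easy to verify is a small load-audit table that the loader writes to once per batch; the sketch below is purely hypothetical (names are made up).

```sql
-- Hypothetical sketch of a database-side load-audit table; the candidate's loader
-- would insert one row per batch it ingests from the fake API.
CREATE TABLE IF NOT EXISTS load_audit (
    batch_id    bigserial   PRIMARY KEY,
    source      text        NOT NULL,                 -- e.g. 'fake_api/bookings'
    row_count   integer     NOT NULL,
    started_at  timestamptz NOT NULL,
    finished_at timestamptz NOT NULL DEFAULT now(),
    status      text        NOT NULL DEFAULT 'ok'     -- 'ok' / 'failed'
);

-- After loading a batch of, say, 1000 rows, the loader would record it:
-- INSERT INTO load_audit (source, row_count, started_at)
-- VALUES ('fake_api/bookings', 1000, '2024-01-01T00:00:00Z');
```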

Regarding SQL:
- Is it clean and readable?
- Did they bother with things like indexes, constraints, primary keys, etc.?
- Did they build intermediate, modular tables/views to get to the final table, or did they just roll one huge query?
- Did they use any fancier features of their particular database choice that display mastery of it? For example, using a PG materialized view + a trigger to update the final table any time a new record is ingested from the fake API (a sketch follows below).
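
To make that last example concrete, the pattern referred to looks roughly like the sketch below (Postgres; table, column and view names are hypothetical).

```sql
-- Hypothetical illustration of a PG materialized view kept fresh by a trigger.
CREATE MATERIALIZED VIEW monthly_fees_mv AS
SELECT
    owner_company_id,
    date_trunc('month', checkout_date) AS month,
    SUM(fee) AS total_fee
FROM bookings
GROUP BY 1, 2;

-- Trigger function that refreshes the view whenever new bookings are ingested.
CREATE OR REPLACE FUNCTION refresh_monthly_fees_mv()
RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
    REFRESH MATERIALIZED VIEW monthly_fees_mv;
    RETURN NULL;  -- return value is ignored for statement-level AFTER triggers
END;
$$;

-- Statement-level trigger: one refresh per ingested batch, not one per row.
CREATE TRIGGER bookings_refresh_monthly_fees_mv
AFTER INSERT ON bookings
FOR EACH STATEMENT
EXECUTE FUNCTION refresh_monthly_fees_mv();  -- PG 11+; use EXECUTE PROCEDURE on older versions
```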