Pablo Martin a256b48b01 pages
2025-07-11 16:15:17 +02:00

20240718-01 - Xe.com data not retrieved

Xexe did not retrieve the data from xe.com

Managed by: Uri

Summary

  • Components involved: data-xexe
  • Started at: 2024-07-18 07:00 (local ES time)
  • Detected at: 2024-07-18 08:42
  • Mitigated at: 2024-07-18 16:50

The xe.com subscription was suspended because of a lack of payment on Superhog's side, which made the daily execution fail. Once the payment was made and the xe.com team confirmed that the account was reinstated, a manual execution of the process succeeded.

Impact

Currency conversion rates for 17th July were not retrieved. Any reporting that contains revenue with currency conversion is therefore not fully accurate: it uses the conversion rates from the previous available day (16th July). This only affects reports that read from the DWH and use backend conversion; Xero reporting is not affected. Specifically:

  • Currency Exchange report
  • Guest Payments report (Business Overview)
  • Main Business KPIs (Business Overview) - only Guest Payments related metrics
  • Check-in Hero Overview
  • Guest Satisfaction (Guest Insights) - not really affected since there's no payment-related metric

Impact at the moment is relatively small, since only one day of currency conversion is missing, but failure to fix it soon could increase the impact.
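The "previous available day" behaviour described above can be illustrated with a minimal sketch. This is not the DWH implementation, which is not shown in this report; the function name `rate_as_of` and the rate values are hypothetical and only serve to show why the reports kept working with 16th July rates.

```python
from bisect import bisect_right
from datetime import date


def rate_as_of(rates_by_date: dict[date, float], target: date) -> tuple[date, float]:
    """Return (rate_date, rate) for the latest date on or before `target`."""
    days = sorted(rates_by_date)
    # Index of the first day strictly after `target`.
    idx = bisect_right(days, target)
    if idx == 0:
        raise LookupError(f"no rate available on or before {target}")
    used = days[idx - 1]
    return used, rates_by_date[used]


# Illustrative, made-up rates: 17th July is missing, as in this incident.
rates = {date(2024, 7, 15): 1.19, date(2024, 7, 16): 1.1912}
used_date, rate = rate_as_of(rates, date(2024, 7, 17))
print(used_date, rate)  # falls back to the 16th July entry
```

The longer the gap lasts, the further the fallback date drifts from the reporting date, which is why the impact grows if the incident is not fixed soon.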

Timeline

Timezone: CEST

Time Event
2024-07-18 07:00:06 Xexe starts to run on version 0.1.0
2024-07-18 07:00:09 Error is raised by processes.py stating that “Didn't find the fields of a good response” while running the healthcheck against xe.com API.
2024-07-18 07:00:13 Xexe attempts to fetch the rates and fails because the response seems to be empty, raising a Python error, KeyError: from
2024-07-18 07:00:13 Alert is sent to #data-alerts channel
2024-07-18 08:42 Alert is spotted by the Data Team
2024-07-18 08:48 After checking the logs, the cause is not straightforward at first glance. It's clear that we do not have currency conversion data for yesterday, 17th of July 2024
2024-07-18 08:54 A message has been sent to the channel #data to inform that there's an incident ongoing around currency conversion
2024-07-18 09:18 At this stage it seems clear that the healthcheck performed against xe.com is the main issue. Maybe the API has been temporarily down, for whatever reason. I'm not able to find any API availability status on xe.com, so I can't confirm this is the reason. At this stage, I'll opt for a single re-run and see what happens.
2024-07-18 09:20 A re-run is launched, but fails again. The alert is correctly sent to #data-alerts channel. Same error is displayed.
2024-07-18 09:33 After discussing with Ben R, it seems the problem comes from the billing. According to Ben, a couple of emails on this subject have already been shared with Pablo. Ben is going to take a look at it. At this stage, there is nothing else I (Uri) can do but wait.
2024-07-18 09:56 Gus forwarded me the email loop from Xe.com; it is indeed clearly linked to the billing.
2024-07-18 10:30 Ben R confirms that the invoice has been settled now. We try a re-run.
2024-07-18 10:35 Re-run fails with the same error. Maybe the re-activation of our account needs to be done manually on xe.com's side.
2024-07-18 11:11 A follow up communication to #data channel has been sent with the details on the root cause and more detailed impacts
2024-07-18 11:13 A follow up e-mail is sent by Ben R to the original email loop from xe.com, asking for re-activation now that it has been paid
2024-07-18 16:17 We receive e-mail confirmation from xe.com that the account has been reinstated
2024-07-18 16:43 A new re-run of the xexe process is launched; this time it finishes successfully
2024-07-18 16:46 Re-run of DWH to update all tables and reports
2024-07-18 16:50 A couple of checks are done to ensure data has been updated accordingly. All good, we can consider the incident as mitigated
2024-07-18 16:54 A final communication has been sent to the #data channel announcing the mitigation of the incident
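The KeyError at 07:00:13 suggests that the fetch step read fields from a response that did not contain them. As a minimal sketch of how a payload could be validated before its fields are read, so that the alert carries the reason given by the API instead of a bare KeyError: the field names `to`, `timestamp` and `message` are assumptions (only `from` appears in the error above), and `validate_rates_payload` is a hypothetical helper, not code from xexe.

```python
class RatesResponseError(Exception):
    """Raised when a rates payload lacks the fields of a good response."""


# Assumed field names of a good response; only "from" is confirmed by the error log.
REQUIRED_FIELDS = ("from", "to", "timestamp")


def validate_rates_payload(payload: dict) -> dict:
    """Return the payload unchanged if it looks complete, else raise a descriptive error."""
    if not payload:
        raise RatesResponseError("empty response body")
    # Assumption: error payloads carry a human-readable "message" field.
    if "message" in payload and "from" not in payload:
        raise RatesResponseError(f"API returned an error: {payload['message']}")
    missing = [field for field in REQUIRED_FIELDS if field not in payload]
    if missing:
        raise RatesResponseError(f"response is missing fields: {missing}")
    return payload


try:
    validate_rates_payload({"message": "account suspended"})
except RatesResponseError as err:
    print(err)
```

With a check of this kind, a suspended account would show up in #data-alerts with the provider's own message, which could shorten the time spent ruling out a temporary API outage.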

Root Cause(s)

The service was suspended because of a lack of payment on our side. The email loop shows that Xe.com first communicated on this subject on 26th June, sent a reminder on 8th July and a final communication on 15th July. These emails were sent to tech@guardhog.com and went unnoticed by the Data team (at least by Uri and Joaquín). The email was also forwarded to Pablo, but this went unnoticed because Pablo was on holiday.

Resolution and recovery

Billing was settled on the same day the incident was raised. Once we got confirmation from xe.com that the account had been reinstated, re-running the daily process manually worked.
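The checks performed after the re-run are not detailed in this report. One possible completeness check, sketched here under the assumption that the set of dates with loaded rates can be read from the DWH, is to list every date in a range that has no rates. The function name `missing_rate_dates` and the sample dates are hypothetical.

```python
from datetime import date, timedelta


def missing_rate_dates(loaded: set[date], start: date, end: date) -> list[date]:
    """Return every date in [start, end] (inclusive) that has no loaded rates."""
    n_days = (end - start).days + 1
    all_days = (start + timedelta(days=i) for i in range(n_days))
    return [day for day in all_days if day not in loaded]


# Illustrative input: rates loaded for every day except 17th July.
loaded_dates = {date(2024, 7, 15), date(2024, 7, 16), date(2024, 7, 18)}
gaps = missing_rate_dates(loaded_dates, date(2024, 7, 15), date(2024, 7, 18))
print(gaps)
```

An empty result after the manual re-run would confirm that no day of currency conversion is left missing.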

Lessons Learned

To be filled later on

Action Items

To be filled later on