SUMMARY
- Date of incident: February 11th, 2023
- Customer Impact: No recommendations returned, plus a high volume of errors on the search service.
Thursday 16th February
Please find the RCA report attached to this article.
The root cause has also been added to the section below.
Monday 13th February
The internal incident report has been delivered, and we are working on an official RCA, which will be added to this article.
Sunday 12th February 02:25
The Recommendation service was restored at 2023-02-11 22:55. The Attraqt team continued working to restore the XO Search service as well.
All services were back to normal on Feb 12th at 02:25.
Saturday 11th February 19:59
Attraqt received an internal alert for performance degradation on the Recommendations API and started an investigation to identify the root cause of the issue. The investigation found that the Search API was also affected, and mitigation actions started once the cause was clarified.
ROOT CAUSE ANALYSIS REPORT
The root cause of the issue was identified as a communication failure between the XO micro-services, brought on by the expiration of an SSL certificate on one of the internal components that enables these communications. Due to an alerting failure, the certificate was not renewed ahead of its expiration.
To correct this, the expired certificate was regenerated, and all the affected services reloaded.
Post-incident, all other SSL Certificates were reviewed to ensure no others were due to expire at this time.
The following improvement actions took place on the Attraqt side:
- SSL certificates have been regenerated with a ten-year validity to prevent the issue from recurring.
- Automation of certificate generation is being investigated.
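As an illustration of the kind of check that an expiry alert relies on, the following is a minimal Python sketch and not a description of Attraqt's actual tooling. The function names, the 30-day threshold, and the idea of polling a host and port are assumptions made for the example; only the standard-library calls (`ssl.cert_time_to_seconds`, `ssl.create_default_context`, `getpeercert`) are real.

```python
import socket
import ssl
from datetime import datetime, timezone

WARN_DAYS = 30  # hypothetical alert threshold, not taken from the report


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Return the number of days between `now` and a certificate's notAfter.

    `not_after` uses the format returned by getpeercert(),
    for example "Jan  5 09:34:43 2018 GMT". `now` should be timezone-aware.
    """
    expires_ts = ssl.cert_time_to_seconds(not_after)
    return (expires_ts - now.timestamp()) / 86400


def needs_renewal(not_after: str, now: datetime, warn_days: int = WARN_DAYS) -> bool:
    """True when the certificate expires within `warn_days` (or already has)."""
    return days_until_expiry(not_after, now) <= warn_days


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read the notAfter field of the certificate a server presents.

    With default verification the TLS handshake fails once the certificate
    has already expired, so this is meant to run ahead of expiry.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call `needs_renewal(fetch_not_after(host), datetime.now(timezone.utc))` for each internal endpoint and raise an alert when it returns `True`. Keeping the date arithmetic separate from the network call makes the threshold logic easy to test without a live server.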
FOR MORE INFORMATION
All currently available information is included in this article. We will continue to provide updates on the issue here as we work to resolve the incident.
If you have logged a ticket with us, we will provide the same information there as soon as possible.
The report of our root cause analysis investigation is usually posted here a few days after the incident has been resolved. If you have additional questions about this incident, please log a ticket with us.