Degraded performance on the BLiP platform
Incident Report for Take Blip
Postmortem

Hi, blipper! On Wednesday, 15/09/2021, Blip experienced an outage that affected the operation of many Smart Contacts.

To be transparent with you, Blip user, we are writing to tell you what happened.

What happened

We identified an automatic, unscheduled date change to the year 2022 on one of our Take Blip servers, which interrupted the exchange of messages for our Smart Contacts.

How this issue impacted you

Because of this failure, Blip Portal features were difficult to access, and Blip Desk had problems opening and distributing tickets.

On behalf of Take Blip, I apologize for any problems caused to you, your company, and your customers.

What we did to solve it

As soon as the problem was identified at 10:50 am (UTC-3), our team acted quickly to keep it from affecting more of our community. We isolated the divergent server and intervened on the other servers to synchronize their date information, which normalized the operation of our platform at around 12:30 pm (UTC-3).

Later, at around 2:20 pm (UTC-3), as a consequence of the scenario described above, we observed an impact on ticket distribution. This happened because the agents' statuses were still dated 2022. Once again, our team acted immediately, intervening in the services responsible for distributing tickets and correcting the agent status records dated 2022. This reestablished the service flow and normalized operation at around 3:20 pm (UTC-3).

Where we are now

Blip Portal and Blip Desk are working again, and our technical team is following up to map the root cause. In addition, we have internal actions under way to prevent similar events, such as:

  • Investigation to identify how the date was automatically changed on that server;
  • Adjustments to the database records that were registered with a 2022 date.
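
For readers who want to run a similar check on their own data, the idea behind the second action can be sketched as follows. This is only an illustration and not the tooling our team used: the record layout, the function names, and the five-minute tolerance are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumed tolerance: how far ahead of the reference clock a record may be
# before it is treated as future-dated. The value is illustrative only.
MAX_FUTURE_SKEW = timedelta(minutes=5)


def is_future_dated(record_time, reference_time, tolerance=MAX_FUTURE_SKEW):
    """Return True when record_time is ahead of reference_time by more than tolerance."""
    return record_time - reference_time > tolerance


def find_future_dated(records, reference_time, tolerance=MAX_FUTURE_SKEW):
    """Return the ids of records whose timestamp lies too far in the future.

    `records` is an iterable of (record_id, timestamp) pairs (hypothetical layout).
    """
    return [rid for rid, ts in records if is_future_dated(ts, reference_time, tolerance)]


# Example with made-up data: one status is current, one is dated 2022.
utc_minus_3 = timezone(timedelta(hours=-3))
now = datetime(2021, 9, 15, 14, 20, tzinfo=utc_minus_3)
statuses = [
    ("agent-1", now - timedelta(minutes=2)),
    ("agent-2", now.replace(year=2022)),
]
print(find_future_dated(statuses, now))  # ['agent-2']
```

The reference time should come from a clock that is known to be correct, since a check that trusts the drifted server's own clock would not flag anything.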

You can check this history and the status of all other Blip features on our Status Page.

We also want to thank you for your patience and remind you that we are always here to help. Just open a request with our Support or create a new topic on Blip Forum, the space dedicated to our whole user community.

Sincerely,

Posted Sep 21, 2021 - 14:34 GMT-03:00

Resolved
Status Update:

Fault identified:
Unfortunately, even after the emergency actions carried out last Saturday (11/09/2021), we again recorded customer impact caused by disconnection from our Cache service.

Temporary fix:
We carried out interventions in the services to normalize the scenario, and we continue to monitor the environment.

Start date/time: 11:12 AM
End date/time: 11:40 AM

Actions in progress:
Our engineering team remains in the crisis room and is already defining new corrective actions.
Posted Sep 13, 2021 - 12:20 GMT-03:00
Identified
We are experiencing degraded performance on the BLiP platform. Our technical team is already working on the case.
Posted Sep 13, 2021 - 11:22 GMT-03:00
This incident affected: Take Blip Platform (CRM, Core, Analytics, Artificial Intelligence, Portal, Cloud Infrastructure).