Degraded performance on the BLiP platform

Incident Report for Blip

Postmortem

Hi blipper! On Saturday, September 11, 2021, Blip experienced a period of unavailability that affected the operation of many Smart Contacts.

In the interest of transparency with you, our Blip users, we are writing to explain what happened.

What happened

We identified an increased number of database connections in our cache service, which is used to store contact information. This interrupted the exchange of messages for our Smart Contacts.

How this issue impacted you

Because of this failure, the Blip CRM, our customer base management functionality, had problems exchanging messages.

On behalf of Take Blip, I apologize for any problems this caused you, your company, and your customers.

What we did to solve it

As soon as we identified the problem at 11:14 am (UTC-3), our team came together and acted quickly to begin remediation. The fix was applied immediately, and the service returned to normal at 3:16 pm (UTC-3).
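
For readers who want the length of the incident window, the two times stated above can be subtracted directly. This is only a minimal sketch: the variable names are hypothetical, and it assumes both times are on September 11, 2021 in UTC-3, as the report states.

```python
from datetime import datetime, timedelta, timezone

# Assumption: both timestamps are in UTC-3, as given in the report.
UTC_MINUS_3 = timezone(timedelta(hours=-3))

# Hypothetical names; the clock times themselves come from the report.
identified_at = datetime(2021, 9, 11, 11, 14, tzinfo=UTC_MINUS_3)
normalized_at = datetime(2021, 9, 11, 15, 16, tzinfo=UTC_MINUS_3)

# Length of the incident window.
duration = normalized_at - identified_at
print(duration)  # 4:02:00

# The same instants expressed in UTC, for readers in other time zones.
print(identified_at.astimezone(timezone.utc).isoformat())  # 2021-09-11T14:14:00+00:00
print(normalized_at.astimezone(timezone.utc).isoformat())  # 2021-09-11T18:16:00+00:00
```

Under these assumptions, the window from identification to normalization is 4 hours and 2 minutes.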

Where we are now

The Blip CRM is working again, and our technical team is following up with the cloud provider to identify the root cause of the failure. In addition, we are taking internal actions to prevent similar events from happening again.

You can check the history of this incident and the status of all other Blip features on our Status Page.

We also want to thank you for your patience and remind you that we are always here to help. Just open a request with our Support team or create a new topic on the Blip Forum, the space dedicated to the entire user community.

Sincerely,

Posted Sep 21, 2021 - 14:29 GMT-03:00

Resolved

Fault identified:

During monitoring, we identified a high volume of connections in our cache service, which caused failures in our storage service.

Impact:

Slow message exchange in smart contacts

Solution:

We had to intervene in the services responsible for caching messages, after which the environment returned to normal. We also created an action plan with improvements to prevent this scenario from happening again, and we continue to monitor the environment.

Start time: 11:14 am
End time: 3:16 pm
Posted Sep 11, 2021 - 16:30 GMT-03:00

Identified

The issue has been identified and a fix is being implemented.
Posted Sep 11, 2021 - 13:42 GMT-03:00

Update

We are continuing to investigate this issue.
Posted Sep 11, 2021 - 11:52 GMT-03:00

Investigating

We are experiencing degraded performance on the BLiP platform. Our technical team is already working on the case.
Posted Sep 11, 2021 - 11:50 GMT-03:00
This incident affected: Hosting Business (Bot Builder, Bot Router), Blip Platform (CRM, Core, Analytics, Artificial Intelligence, Portal, Cloud Infrastructure), Desk, and Hosting Enterprise (Bot Builder, Bot Router).