Disaster Recovery Plan
Below is an outline of the Disaster Recovery Process followed by Delib. This process is in place 24/7, 365 days of the year.
Here's a summary of the stages, and the target timescales. Each stage is detailed later in this article.
| Stage | When? |
| --- | --- |
| 1. Detection & definition | When notified by our monitoring systems or a customer |
| 2. On-call team alerted | As soon as possible after detection of a critical issue |
| 3. Initial investigation & assessment | Within 2 hours of detection |
| 4. Customer notification | Without undue delay via email, phone and/or in-platform notification |
| 5. Resolution | Depends on the complexity of the problem. Our target resolution times for product and infrastructure issues are set out in our Service Level Agreement |
| 6. Report & document | Within 1 working day of resolution |
| 7. Review & retrospective | Within 3 working days of resolution |
1. Detection & definition
Detection: we'll be made aware of a critical issue by one of the following methods:
- (a) Automated alert from our monitoring systems
- (b) Internal detection (e.g. from investigations arising from third-party security announcements)
- (c) Customer or end-user report
Definition: we use the following questions to decide whether an issue is critical:
- Has a customer site been unavailable to the general public for more than 10 minutes?
- Is there a reproducible issue which prevents a user from entering or submitting data?
- Is there a reproducible issue which causes unavoidable or unexpected data loss?
- Is there a bug or security vulnerability that constitutes a realistic threat to privacy?
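As an illustration only, the four questions above could be encoded as a simple triage check. This is a minimal sketch, not Delib's actual tooling: the `IssueReport` class, its field names and the `is_critical` function are hypothetical, and it assumes that a "yes" to any one question is enough to make an issue critical.

```python
from dataclasses import dataclass

# Threshold from the first question above ("more than 10 minutes").
UNAVAILABLE_THRESHOLD_MINUTES = 10


@dataclass
class IssueReport:
    """Hypothetical record of what is known about an issue."""
    minutes_unavailable: float = 0.0  # public downtime of a customer site
    blocks_data_entry: bool = False   # reproducible; stops users entering or submitting data
    causes_data_loss: bool = False    # reproducible; unavoidable or unexpected data loss
    threatens_privacy: bool = False   # bug or vulnerability posing a realistic privacy threat


def is_critical(report: IssueReport) -> bool:
    """Return True if any of the four questions is answered 'yes'.

    Assumption: a single 'yes' is sufficient to make the issue critical.
    """
    return (
        report.minutes_unavailable > UNAVAILABLE_THRESHOLD_MINUTES
        or report.blocks_data_entry
        or report.causes_data_loss
        or report.threatens_privacy
    )
```

For example, `is_critical(IssueReport(minutes_unavailable=25))` evaluates to `True`, while an empty `IssueReport()` does not count as critical.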
2. On-call team alerted
If one of our monitoring systems picks up a critical error, the on-call team is alerted. Unavailability lasting 10 minutes or longer is reported automatically.
When? As soon as possible after detection of a critical issue
3. Initial investigation and assessment
The team will aim to establish the cause of the issue as well as assess the severity and likely duration of the service interruption.
Ideally, this will include:
- Identification of the root cause
- An assessment of the severity and scale of the problem, including which customers are affected
- An estimated time to resolution
When? Within 2 hours of detection
4. Customer notification
Affected customer(s) will be contacted without undue delay to inform them of any critical issue affecting their site.
This communication will most likely be by email, but it depends on the severity of the incident. Any wider-reaching issues may also be posted as an in-product notification to make all end users aware.
5. Resolution
Once the team have assessed the problem, they will report back. What happens next depends on the complexity of the issue:
- If the problem can be easily solved, it will be fixed. The team will document the problem and solution.
- If the issue is more complex, a resolution plan is put in place to address the service outage. This may require more team members to be contacted. An interim report, summarising the expected cause and the steps to resolution, will be produced.
In both cases, any information we have will be communicated to affected customer(s). We will keep all affected customers updated on progress until the issue is resolved.
When? This depends on the complexity of the problem, but our target resolution times are set out in our Service Level Agreement.
6. Reporting, documentation and tidying up
Once the problem has been resolved, we will provide a written account for affected customers and for Delib's future reference. This will include:
- How the problem was detected
- The scope of the problem and how it may have affected end-user interaction
- The root cause
- Steps to resolution, including any measures put in place to mitigate the risk of repeat occurrence
- Total downtime
- Any service credit or other compensation offered by Delib, should the error have caused us to miss our Service Level Agreement targets
When? All of this should happen within 1 working day of the resolution of the issue.
7. Review and retrospective
Once the error has been resolved, Delib will hold a retrospective to identify any long-term countermeasures which can be put in place to prevent a recurrence of the issue.
This disaster recovery process is also reviewed to identify any improvements that can be made.
When? Within 3 working days of resolution
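The target timescales for stages 3, 6 and 7 can be worked out from the detection and resolution times. The sketch below is illustrative only: the function names are hypothetical, and it assumes that a working day is Monday to Friday and ignores public holidays, which this article does not define.

```python
from datetime import datetime, timedelta


def add_working_days(start: datetime, days: int) -> datetime:
    """Advance `start` by `days` working days, keeping the time of day.

    Assumption: working days are Monday to Friday; public holidays are ignored.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def target_deadlines(detected: datetime, resolved: datetime) -> dict[str, datetime]:
    """Return the target deadline for each timed stage of the process."""
    return {
        "assessment": detected + timedelta(hours=2),     # stage 3
        "report": add_working_days(resolved, 1),         # stage 6
        "retrospective": add_working_days(resolved, 3),  # stage 7
    }
```

Under these assumptions, an issue resolved on a Friday afternoon has its report due at the same time on the following Monday and its retrospective due on the following Wednesday.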
Other information
Would we ever take sites offline?
Taking sites offline would be the informed decision of Delib's Managing Director, who will be given a full brief of the situation by the team. We will ask ourselves some specific questions to determine whether it may be necessary:
- If the site stays online, could users submit data that gets lost without them knowing?
- If the site stays online, could any existing data loss or corruption be made worse?
- If the site stays online, is there a possibility of the loss or exposure of any personal information?
- Conversely, if the site is taken offline, could any existing data loss be exacerbated?
Taking sites offline is a last resort for us, and we would never do so unless leaving them online would pose more of a risk to the customer(s) or their respondents.
Reviewed May 2026