Stu Lees | Mar 3, 2024 3:50:31 PM | 5 min read

Recover Automation Rules After Latest Cloud Incident

Background

Call it bad luck or Murphy's Law, but this leap year will not be a fond memory for those responsible for managing the Atlassian Cloud platform. According to Atlassian's support post (found here), there was a major cloud incident that, at the time of publishing this article, has not been fully resolved.

 

The Incident :  Lost Automation Rules in Configuration

For affected users (and the scope so far seems very wide), the simplest way to explain the incident is that automation rules inside their Jira sites simply disappeared.

The image below is from the Atlassian Service Status Page as of February 29, 18:00 PST. Jira admins should watch this page and subscribe for updates to stay fully aware of Atlassian's resolution progress and plans: https://jira-service-management.status.atlassian.com/

[Image: Atlassian service status page]

 

The impact of this for customers

Automation Rules allow you to automate actions within your system based on criteria that you set. Automation rules are made up of three parts: triggers that kick off the rule, conditions that refine the rule, and actions that perform tasks in your site.
https://support.atlassian.com/cloud-automation/docs/create-and-edit-jira-automation-rules/
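
To make the three-part structure concrete, here is a minimal Python sketch of a rule modelled as data. The class and field names (`Rule`, `trigger`, `conditions`, `actions`), the event names and the example ticket are illustrative assumptions only; this is not Atlassian's rule schema or API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Issue = Dict[str, str]  # hypothetical stand-in for a Jira issue


@dataclass
class Rule:
    name: str
    trigger: str  # event name that kicks off the rule
    conditions: List[Callable[[Issue], bool]] = field(default_factory=list)
    actions: List[Callable[[Issue], None]] = field(default_factory=list)

    def handle(self, event: str, issue: Issue) -> bool:
        """Run every action if the event matches and all conditions pass."""
        if event != self.trigger:
            return False
        if not all(condition(issue) for condition in self.conditions):
            return False
        for action in self.actions:
            action(issue)
        return True


def escalate(issue: Issue) -> None:
    # Action: hand the ticket to a (made-up) second-line team.
    issue["assignee"] = "tier-2"


rule = Rule(
    name="Escalate high-priority tickets",
    trigger="issue_created",
    conditions=[lambda issue: issue.get("priority") == "High"],
    actions=[escalate],
)

ticket = {"key": "HELP-1", "priority": "High", "assignee": "tier-1"}
fired = rule.handle("issue_created", ticket)
print(fired, ticket["assignee"])
```

If a rule like this disappears, nothing raises an error; the event simply arrives and no action runs, which is why the loss can go unnoticed until a business process stalls.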

 

Because Atlassian customers use automation rules creatively for just about anything, their loss can have a catastrophic impact on customers who rely on them for core business processes in their Jira/JSM sites.

Examples include:

  • Escalation of tickets between people stopping
  • Tickets moving between projects no longer happening
  • Service desk teams turning up to work to an empty ticket queue
  • End-customers not getting notifications and alerts   
  • Service KPIs around response, comms and resolution times not being met
  • Adherence to ITIL practices not being met 

 

How to Recover from this Incident

As experts in system and data recovery for Jira Cloud, we have been inundated with requests from a variety of Jira sites to help them make a recovery plan and get their automations recovered and working again.

On March 29th, we invited the expert crew from The Jira Life team (Rodney Nissen, Alex Ortiz and Robert Wen) to join us on a livestream to help the Atlassian Community recover from this incident.

The recording is below. If you scroll down further, we have also added a section of actionable steps for Jira admins to take in order to recover as safely as possible and to minimize further business impact.

 

How to Recover : Livestream with The Jira Life Crew 

 

 

Advice on How to Recover

Disclaimer: the information in this article is a set of guidelines and ideas and is not based on any inside knowledge of this evolving Atlassian Cloud incident. The advice below should be taken in general terms. Each administrator should take extra care to follow Atlassian's recommendations, adhere to careful change management processes and, if unsure, defer to qualified solution partner experts and/or the Atlassian support team.

 

Talk to Your End Users

This incident has already lasted over one business day, so it is clearly significant. Some sites may be affected for a long time, so get in touch with your end users and management and:

  • Inform them of the outage and of the latest updates from Atlassian
  • Ask them specifically for feedback on areas in your operation that are not working. This may help you identify impacts of the issue that you are not yet aware of. You may also glean from this feedback ways to work around the issue and get critical business processes moving again.
  • Ask them to inform you if they notice anything changing
  • Give them a communications plan for how often you'll give status updates so that they don't keep chasing you for news
  • Review your automations dashboard to see what the blast radius is, i.e. which rules were used most and what the impact of losing them is. This information will also drive the prioritization of what to restore first.
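
The prioritization in the last point can be sketched in a few lines of Python. The rule names, run counts and the `business_critical` flag below are made-up example data that you would gather by hand from your own records and from end-user feedback; nothing here calls an Atlassian API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LostRule:
    name: str
    runs_last_30_days: int   # hypothetical usage figure from your own records
    business_critical: bool  # hypothetical flag based on stakeholder feedback


def restore_order(rules: List[LostRule]) -> List[LostRule]:
    """Business-critical rules first, then the most heavily used."""
    return sorted(
        rules,
        key=lambda r: (not r.business_critical, -r.runs_last_30_days),
    )


lost = [
    LostRule("Notify customer on status change", 1200, True),
    LostRule("Auto-close stale tickets", 3000, False),
    LostRule("Escalate breached SLA", 400, True),
    LostRule("Move ticket to billing project", 150, False),
]

for position, rule in enumerate(restore_order(lost), start=1):
    print(position, rule.name)
```

Sorting on criticality before usage is a design choice in this sketch: a rarely triggered escalation rule can matter more than a high-volume housekeeping rule.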

If you can restore from backup, do it with care

  • Get clear on when the incident started impacting your site - when things are working again, you may have to manually go back and fix data that has 'missed' the workflow.
  • If you have backups of your automations, then make a careful plan on whether or not you decide to restore from them and the scope of your restoration. 
  • Remember, it's possible that Atlassian may restore over what you restore, so keep watch in case that happens
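
Once you know your incident window, one way to find the tickets that 'missed' an automation is to search for issues created during that window. The sketch below only builds a JQL string that you could paste into Jira's issue search; the project key `HELP` and both timestamps are placeholders, not the real incident times, and you should check which time zone your search applies to date values.

```python
from datetime import datetime

JQL_TIME_FORMAT = "%Y-%m-%d %H:%M"  # date-time format accepted by JQL


def missed_issues_jql(project_key: str, start: datetime, end: datetime) -> str:
    """Build a JQL query for issues created inside the incident window."""
    return (
        f'project = "{project_key}" '
        f'AND created >= "{start.strftime(JQL_TIME_FORMAT)}" '
        f'AND created <= "{end.strftime(JQL_TIME_FORMAT)}" '
        "ORDER BY created ASC"
    )


# Placeholder window: replace with the times you observed on your own site.
query = missed_issues_jql(
    "HELP",
    datetime(2024, 2, 29, 9, 0),
    datetime(2024, 3, 1, 17, 0),
)
print(query)
```

Swapping `created` for `updated` would widen the search to tickets that were changed during the window, which may matter for rules triggered by transitions or field updates.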

When you restore or rebuild workflows - switch them on one at a time

Turn on each workflow that you restore or rebuild individually, and take care to check dependencies and other factors as you do so. Test each one carefully before switching on another so that you can limit the impact of any unexpected behaviour.
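
As a rough illustration of the one-at-a-time approach, the Python sketch below enables items in order and stops at the first failed check. The `enable`, `disable` and `smoke_test` callables are hypothetical hooks you would supply yourself (for example, a manual step followed by a confirmation); they are not Jira API calls, and the in-memory `state` dictionary only simulates a site.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple


def staged_enable(
    names: Iterable[str],
    enable: Callable[[str], None],
    smoke_test: Callable[[str], bool],
    disable: Callable[[str], None],
) -> Tuple[List[str], Optional[str]]:
    """Enable items one by one; roll back and stop at the first failure."""
    enabled: List[str] = []
    for name in names:
        enable(name)
        if not smoke_test(name):
            disable(name)          # undo only the item that just failed
            return enabled, name   # report what is live and what broke
        enabled.append(name)
    return enabled, None


# Simulated site state and hooks, for demonstration only.
state: Dict[str, bool] = {}


def enable(name: str) -> None:
    state[name] = True


def disable(name: str) -> None:
    state[name] = False


def smoke_test(name: str) -> bool:
    # Pretend the second item misbehaves when tested.
    return name != "Move ticket to billing project"


enabled, failed = staged_enable(
    [
        "Escalate breached SLA",
        "Move ticket to billing project",
        "Auto-close stale tickets",
    ],
    enable,
    smoke_test,
    disable,
)
print(enabled, failed)
```

Stopping at the first failure keeps everything after it switched off, so any unexpected behaviour can be traced to a single change.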

Make a backup of what you've got - even if it's not working

Even if it's not working, it is good practice to make a backup of both your data and your configuration in Jira Cloud before performing any rectification of automations, batch updates to tickets or anything else that you are doing to resolve the situation.

Emergency repair jobs sometimes don't go to plan, and it's great to have a definitive restore point to go back to if data is damaged.

Announcement for Affected Jira Admins

We have decided to provide free access to our backup and clone tool for Jira Cloud configuration, with an included, extended trial of 60 days for any affected Jira sites. This will enable you to safely and securely back up and protect your Jira Automations immediately.
Click on this link to install the app, and your standard trial will be automatically extended after installation.

Revyz Configuration Manager for Jira Cloud 2024

Install Fully Functional App on the Marketplace

More Updates Coming

We will be updating this page with more resources, advice and instructions aimed at helping you, so bookmark it and come back to it in future.

 

 

More Resources

In a somewhat serendipitous move, Vish Reddy and Stu Lees from Revyz recorded this demo video only two weeks ago.

 

Stu Lees

Stu is the VP of Marketing and Partnerships at Revyz Inc. With a career spanning over 25 years in IT, Stu runs the global marketing and partnership teams for the Revyz business and is based in Auckland, New Zealand. Stu has worked in a wide variety of arenas that include 15 years in senior leadership roles in enterprise IT as well as running his own technology integration company from 2006-2013. Stu is involved as a leader in the Auckland Atlassian ACE group and speaks frequently in webinars on both cyber security and marketing.

Other Jira Backup Resources

If you found this article on Jira Cloud Backups useful, then take a look at our new Resource Center for Jira Cloud

Looking for a Jira Backup Solution?

Review the features of this award-winning Jira Backup solution

Revyz Data Manager for Jira Cloud 2024