Workflow Manager Disaster Recovery – Preparations

Introduction

This is the first “real” posts in the Workflow Manager Disaster Recovery series. In this post I will show you what you need to do to prepare for Disaster Recovery (DR) situations when working with Workflow Manager and Service Bus.

The obvious

Let’s start with the obvious pieces. You should run your Workflow Manager farm on three (3) servers (for more on this discussion see the SPC356 session). Running on three servers are important not just for high-availability it might save you from going into DR mode. DR should be considered as the last resort. You should also consider one or more SQL Server high-availability options, more on this later.

Data tier preparations

WFM/SB default databases The Data tier in the Workflow Manager (WFM) /Service Bus (SB) is the actual SQL Server databases deployed when setting it up. It’s a minimum of six (6) databases – three for WFM and three for SB. These databases should of course be backed up – all DR options for Workflow Manager requires that you have a database backup/replica to restore from. Configure a “good” backup schema with full, differential and transaction log backups according to your SLA. You should also keep these backups in “sync” – that is don’t do SB backups on odd weeks and WFM backups even weeks for instance. They are normally not that large databases and the backup process should be pretty fast.

Secondly you must think of the high-availability (HA) options for these databases. As I said previously, this can save you from going into DR mode. Most SQL HA options can be used with Workflow Manager and Service Bus, for instance:

Mirroring
Always On Availability Groups with sync commit

If you need to shorten your DR time/RTO (Recovery Time Objective) then you should also not only rely on SQL backups but implement some replication technique with SQL to replicate data to a secondary location. The following techniques works great:

Log Shipping
Always On Availability Groups with async commit

Compute tier preparations

Once you are in control of the databases, backups and optionally replicas you could also consider weather you would like to have a cold or warm standby. A warm standby will shorten your RTO compared to a cold standby. A hot stand by Workflow Manager farm is NOT supported (a hot standby is a secondary running instance).

Cold standby – when a disaster occurs you create a brand new Workflow Manager/Service Bus farm using scripts and restore data from backups or replicated data.
Warm standby – a secondary farm configured but all the nodes (WFM and SB) are turned off. Use scripts to resume the nodes. Once the nodes are resumed you need to run the consistency verification cmdlet (Invoke-WFConsistencyVerifier, more on this later in the series).

The Symmetric Key is the key

In order to be successful or succeed at all you must keep track of the Symmetric key of the Service Bus. Without the Symmetric Key you cannot restore the Service Bus and you will loose all of your data. Keep it in a safe place!

You can find the Symmetric Key by running the following PowerShell command:

Get-SBNamespace -Name WorkflowDefaultNamespace

Summary

You’ve just read the first post of this series and you’ve learnt the basics in how to prepare for, and possibly avoid, a Disaster Recovery situation for Workflow Manager and Service Bus. In the following posts we will discuss what do do if and when the disaster happens…