Backup and archive strategies for Microsoft Office 365
Backing up data and archiving it can look like the same thing, since both are fundamentally about saving your data. However, the purpose of each, and how administrators use the data in each scenario, are different.
In this article, we’ll cover the differences between backups and archives, and then discuss some options for Microsoft 365.
What is a backup?
Techopedia defines a data backup as:
Backup refers to the process of making copies of data or data files to use in the event the original data or data files are lost or destroyed.
Features most valued in backup software are typically the speed of copying and recovery, and data integrity (i.e. there are no errors in the copied data).
What is an archive?
Techopedia defines data archiving as:
Data archiving is the process of retaining data for long term storage. The data might not be in use, however, it can be brought into use and can be stored for future purposes.
Features that are most valued in a data archive solution are searchability and data authenticity (i.e. it is what it claims to be).
Why you need backups and data archiving with Microsoft 365
Most organizations need a backup (and restore) application to protect data in case of ransomware or malware attacks, accidental data deletion due to human error, deletion due to malicious behaviour, offline sync errors, and more.
Organizations that need to comply with regulations or retention rules often need an archive solution to meet requirements for eDiscovery, legal holds in case of litigation, etc. Typically, an archive solution does not restore data in a way that end-users can continue doing their work, so you would still need a backup and restore option.
Also keep in mind that most backups are retained for shorter periods, for instance days or weeks, since new backup images supersede the previous versions.
In summary:
“Remember that archival storage should not be confused with active data and backup operations. The requirements of an archive call for a strategy that enables regulatory compliance, data authenticity, media longevity, quick random access and low TCO [total cost of ownership].”
— Backup vs. archiving: It pays to know the difference | Computerworld
What’s included with Microsoft 365?
For backups:
Microsoft includes basic backup protection with Microsoft 365, and it does a decent job at short-term backups. For example, Microsoft runs an automatic backup of SharePoint and OneDrive every 12 hours and retains documents for 14 days. While this might be OK, 14 days might not be long enough to notice that something was accidentally deleted.
The biggest issue with Microsoft 365 backups is complexity. The different apps (e.g. Exchange, SharePoint, OneDrive, Teams) all do backup and restore differently. There isn’t a ‘single admin screen’ that you can go to, in order to restore everything from one file or email in a mailbox to an entire service. Instead, Microsoft 365 includes various features to help with backups and restore.
For example, you can restore deleted documents in SharePoint via Versioning or the Recycle Bin. Entire SharePoint sites can be restored for up to 93 days from the Admin Center. For email, there’s the Recover deleted items function in Outlook on the web, or the Recoverable Items folder for administrators to access.
Note: some things cannot be restored with the tools in Microsoft 365. For example, it’s not possible to restore a mailbox after 30 days, or to restore a single attachment from an email.
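To make these windows concrete, here is a minimal sketch in Python that checks whether a deletion is still inside one of the windows quoted above. The function name, the dictionary keys and the dates are hypothetical and exist only for illustration, and the day counts are the ones from this article, so confirm them against current Microsoft documentation before relying on them.

```python
from datetime import date, timedelta

# Windows quoted in this article (assumptions to verify, not an official list).
RETENTION_DAYS = {
    "sharepoint_onedrive_backup": 14,
    "sharepoint_site_restore": 93,
    "mailbox_restore": 30,
}


def still_recoverable(kind: str, deleted_on: date, today: date) -> bool:
    """Return True if the deletion is still inside the quoted window."""
    window = timedelta(days=RETENTION_DAYS[kind])
    return today - deleted_on <= window


deleted = date(2024, 1, 1)
noticed = date(2024, 1, 20)  # the deletion is noticed 19 days later
for kind in RETENTION_DAYS:
    print(kind, still_recoverable(kind, deleted, noticed))
```

In this example the 14-day backup window has already passed by the time anyone notices, while the 30-day and 93-day windows are still open. That gap is the practical risk of relying only on short-term backups.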
The other issue is recovery time. Microsoft 365 is not the fastest at migrating or restoring data in the cloud. This likely isn’t an issue if you’re restoring a few documents, a couple of SharePoint sites or a few emails. For larger restores, however, the M365 migration guide shows that times depend on whether you have lots of large or small files. Depending on how your content is structured, you get a throughput of 250 GB/day to 10 TB/day (very optimistic!). For organizations with 20 or 30 TB of Microsoft 365 data, waiting days or weeks for restoration will be unacceptable. And then there is throttling, which I’m not going to cover in this article.
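As a rough back-of-the-envelope check, the sketch below turns the throughput range quoted above into restore times. It is simple arithmetic, not a prediction: the function name is hypothetical, it assumes 1 TB = 1000 GB, and it ignores throttling entirely.

```python
def restore_days(data_tb: float, rate_gb_per_day: float) -> float:
    """Days needed to restore data_tb terabytes at a daily rate (1 TB = 1000 GB)."""
    return data_tb * 1000 / rate_gb_per_day


for size_tb in (20, 30):
    slow = restore_days(size_tb, 250)     # pessimistic end: 250 GB/day
    fast = restore_days(size_tb, 10_000)  # optimistic end: 10 TB/day
    print(f"{size_tb} TB: {fast:.0f} to {slow:.0f} days")
```

At the slow end of the range, 20 TB works out to 80 days and 30 TB to 120 days, which is why restore speed deserves as much attention as backup frequency.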
For archiving:
Microsoft 365 provides retention (i.e. archiving) functions through retention policies and labels. This helps you archive content in case of legal holds or the need to do discovery down the road.
Similar to backups, the main issue with Microsoft 365 archiving is complexity. Administrators configure these options for Exchange Online, SharePoint, OneDrive, Teams and Yammer in the Compliance Center, but it’s not quite point and click. Learn more about the steps involved in setting up Microsoft 365 retention from our past blog post.
To do any event-based retention and automate the archiving process, you will need to educate yourself on Power Automate and the Keyword Query Language (KQL).
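If you haven’t seen KQL before, a query is essentially a set of property:value restrictions combined with operators such as AND. The sketch below is a small Python helper that composes such a string. The helper names are hypothetical, the property names (filetype, author) are assumptions you should check against Microsoft’s list of searchable properties, and the helper does not escape quotation marks inside values.

```python
def kql_term(prop: str, value: str) -> str:
    """Format one property restriction, quoting values that contain spaces."""
    if " " in value:
        value = f'"{value}"'
    return f"{prop}:{value}"


def kql_and(restrictions: dict) -> str:
    """Join property restrictions so that all of them must match."""
    return " AND ".join(kql_term(p, v) for p, v in restrictions.items())


query = kql_and({"filetype": "docx", "author": "Jane Doe"})
print(query)
```

The resulting string is the kind of text you would paste into a search box or pass to an automation step. Building it in code mainly helps keep the quoting consistent.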
Preservation Locks are an added layer to stop rogue administrators from changing the retention policy or label.
Once content is retained properly in Microsoft 365, the natural next step is to search the archive. Content search is available for all Microsoft 365 licenses in the Compliance Center, where you can save and export searches.
Content search is helpful if you want to monitor content related to a topic or keyword, or a specific type of content such as an Employee File. It runs in the background and can take a few minutes the first time.
Another common need for archives is to put content on hold or do a discovery (such as a Freedom of Information request).
Additional licenses are needed for Microsoft 365 eDiscovery, which lets you place holds on mailboxes, sites or content. There is a basic version, called “Core eDiscovery”, and an “Advanced” version for E5 or premium (expensive) licenses. The Advanced version supports more settings, workflow options and useful Optical Character Recognition (OCR) searches. Typically, as your content sets grow to include different content types, images, conversation threads, attachments, etc., you will need these advanced search functions for easier discovery across archives.
What about third-party tools?
Often our clients consider third-party tools for backups and archives for a couple of reasons:
Avoid vendor lock-in. Some organizations don’t want to rely on Microsoft for Microsoft 365 backups or archives because all their eggs end up in one basket, so to speak. To diversify data investments, you may want to consider a third-party tool, both to avoid lock-in and to cover the very slim chance that Microsoft folds.
Need for features that Microsoft 365 doesn’t support (or that are too expensive). Requirements such as controlling backup frequency, or supporting sophisticated eDiscovery or search, drive organizations to look at third-party tools.
Backup tools
The software market for backups and data recovery is large. Gartner lists and reviews dozens of backup and data recovery products, for example. Tools can either be in the cloud or on-premises. In the cloud, you’re typically paying by user or by TB.
Some organizations might find it useful to have a third-party tool that provides a ‘single admin panel’ for backups and archives. IT admins are often stretched across multiple apps and systems, so it’s nice to have one place to go to manage backups.
Keep in mind that third-party tools are all limited by Microsoft 365 APIs. There isn’t a tool that fully restores everything in Microsoft 365 such as all types of Teams chats, Yammer content, Forms data, Planner boards, etc. Microsoft hasn’t built all the APIs needed for that, and the platform is constantly changing.
We don’t have a strong recommendation for backups with third-party tools as the requirements vary for organizations, and the decision will come down to the details of what you need to restore and how fast.
Archive and eDiscovery tools
Similarly, the market for archiving tools is broad.
We recommend Collabspace ARCHIVE or DISCOVERY as a third-party archiving solution for Microsoft 365.
We are a Collabspace partner and have focused on these products because they deliver user-friendly features for compliance. Collabspace targets archivists and IT teams who need to ensure data is secure and follows regulatory compliance rules.
Collabspace ARCHIVE streams data into an encrypted data lake to store data in WORM-compliant storage. This gives you added protection for content stored outside of the Microsoft 365 ecosystem.
WORM stands for ‘write once, read many’, and ensures data authenticity by saving information in a form that no one can tamper with or accidentally delete. New versions are saved with every change or update to content.
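To illustrate the idea only, here is a toy write-once store in Python. It is not how Collabspace ARCHIVE is implemented; the class and method names are hypothetical, and everything lives in memory. Real WORM guarantees are enforced by the storage layer rather than by application code. The sketch shows two properties: every write adds a new version instead of overwriting, and a stored checksum lets you confirm a version has not changed.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)  # frozen: a stored version cannot be modified in place
class Version:
    number: int
    content: bytes
    sha256: str


class WormStore:
    """Append-only store: it offers write, read and verify, but no update or delete."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[Version]] = {}

    def write(self, key: str, content: bytes) -> Version:
        """Save content as a new version; earlier versions are kept untouched."""
        history = self._versions.setdefault(key, [])
        digest = hashlib.sha256(content).hexdigest()
        version = Version(len(history) + 1, content, digest)
        history.append(version)
        return version

    def read(self, key: str, number: Optional[int] = None) -> Version:
        """Return the latest version, or a specific one by its 1-based number."""
        history = self._versions[key]
        return history[-1] if number is None else history[number - 1]

    def verify(self, key: str) -> bool:
        """Recompute each checksum and compare it with the one stored at write time."""
        return all(
            hashlib.sha256(v.content).hexdigest() == v.sha256
            for v in self._versions[key]
        )


store = WormStore()
store.write("policy.docx", b"draft")
store.write("policy.docx", b"final")
print(store.read("policy.docx", 1).content, store.read("policy.docx").content)
print(store.verify("policy.docx"))
```

Because the first version is still readable after the second write, a search across all versions can show what a document said at any point in time.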
Search is easy to use, with the ability to compose queries with multiple filters or metadata. Results are presented in an interface that supports easy sorting and filtering, along with the option to save queries and search all versions.
With Collabspace DISCOVERY, legal holds and more precise searching are possible. Optical character recognition (OCR) automatically extracts and indexes text content from non-text-searchable documents such as PDFs, TIFFs and other images. And if you’re dealing with unstructured data, entity extraction identifies pre-defined labels such as a person, organization, time, event, location, etc.
There are more features in Collabspace ARCHIVE and DISCOVERY than we can cover in this post, so make sure you review the full list of specifications. We see these products as useful complements to Microsoft 365 security and compliance features for regulated organizations, especially when using non-E5 or non-premium Microsoft licenses.
If you’re looking for records management with archives and eDiscovery, check out Collabspace CONTINUUM.
In summary
We hope this helps you understand the difference between having a backup and having an archive. They are similar, but the features that administrators require from each are different.
With so many options for backups and data archiving, it can be difficult to figure out what best fits your organization. Reach out to us and we’d be happy to discuss your specific situation.