
Andy Henson

[LFR] Letters from a Roaman - Letter XLIX

Published over 1 year ago • 12 min read

Happy Tuesday friends,

The final part in my data ownership series clocks in at a little over 2,500 words; hopefully, I'll deliver on my promise of shorter emails next time!

Thinking Out Loud

In this series of essays, I aim to explore the true cost of full control of your notes so you can better evaluate what’s most important to you and ensure that you choose the tool best suited for your use-cases and which also matches your stance on control.

As with all my essays, they are my way of learning in public, so if you have thoughts or feedback, I’d love to hear them. Just reply to this email.

If you missed part one, you can read it here, and part two can be read here.

Data Availability

In this final part of my series around data ownership, we’ll explore the facets of data availability and some of the things you should consider to protect your valuable data.

Ask yourself how inconvenient or untenable it would really be if you completely lost access to all your data. When your notes reside only on your own computing devices, you are fully responsible and must take precautions against data loss. This means you need a plan for recovering the information when the inevitable happens, so you can quickly get back to where you were.

Your notes aren’t helpful if you can’t use them when and where you need to. In today’s world, that means you want them available on various devices. If you aim to take complete control of your information, you have to exert a lot of effort to ensure that the data is securely transported from device to device and kept in sync. For example, it’s no good having different copies of what should be the same note on different devices.

Disaster Recovery and Business Continuity

These are the terms used in the business world to refer to the processes and procedures that should be in place to ensure that corporate knowledge and data are protected. When it comes to your own personal notes, you should take some time to make similar plans to ensure you don’t lose your important electronic information.

While we’re primarily discussing notes here, this applies to your digital files like photos and other documents. First, think through what information you hold electronically and its importance to you. Then think about what could go wrong and what you’re prepared and able to do to minimise the chances of those events, or if they do occur, what actions you’ll take.

There is one final, essential step that even many businesses skip: once you have made plans, you must test them to make sure they work. It’s no good waiting for an event to occur, only to find out that the plan doesn’t work and you now have missing data.

Data syncing and redundancy

Having data in only one location is a risk. However, most people now want to have their notes available on several devices: phones, tablets and desktop computers. If you don’t want to trust and pay a third party to host your data to enable all your devices to sync, then you will have to set up and manage your own infrastructure, and you must take the steps I outline in the following sections. This almost immediately means that unless you are technologically savvy and understand network security, it is unlikely that you can do this and be sure it is secure and remains so. If you’re interested, read this article by Martyna, which demonstrates the steps you’d need to go through to set up your own infrastructure to sync Logseq graphs.

When I was in my 20s and “knew everything”, I was keen on setting up infrastructure like this. Twenty years later, I know a lot less; however, I have been running a company which specialises in developing and maintaining bespoke applications for businesses. So I know what it takes to securely run and maintain them. It takes work, knowledge and ongoing vigilance to do it well. But, from a personal perspective, I simply don’t have the time or inclination. I would rather pay others who spend their days doing this to do it well–and it’s generally more cost-effective.

So one way or another, for most people, if you want to share notes across devices, you will want to rely on a third party to manage this server infrastructure. In the case of Roam, you have almost no choice in the matter. Local graphs are designed to be on one computing device. Hosted graphs, by definition and design, provide syncing between multiple devices and let users collaborate in real-time. The Roam team handles all the heavy lifting to provide the infrastructure. With thousands of paying customers, they are far more incentivised than an individual to ensure the platform remains available, secure and performant.

Only the parties with the password can access the data if you set up an encrypted graph.

Exporting data

I have mentioned before that data exports are a critical feature of a tool for thought. It is an instant no for any tool with no export option.

However, exports are not backups and are not for (seamless) data sync. Exports are typically lossy. This means they won’t contain every byte of information necessary to restore the complete state. Exports are therefore best thought of as transformations of data.

In the case of Roam, you can export your data in JSON and markdown formats, each with varying levels of data loss.

For example, a block in Roam has a unique reference ID (the block reference). This information is present in the JSON export format, but it’s not included when a block is transformed and exported in Markdown. This is because Markdown was designed for publishing, where a block ID has no meaning.

However, tools such as Obsidian have implemented a form of block referencing into their version of Markdown. Therefore, it is feasible that the Roam team, or an extension author, could develop an Obsidian markdown export function where Roam block references are transformed into a format suitable for Obsidian and included in the output.
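To make the "lossy export" point concrete, here is a minimal sketch in Python. It is not Roam's exporter or any existing extension. The sample data is a hypothetical, trimmed-down structure shaped like a Roam JSON export (pages with a "title", blocks with "string", "uid" and "children"), the function names are my own, and the "^id" suffix follows Obsidian's block identifier convention as I understand it.

```python
import json

# Hypothetical, trimmed-down sample shaped like a Roam JSON export.
# The UIDs are made up for illustration.
SAMPLE_EXPORT = json.loads("""
[
  {
    "title": "Backups",
    "children": [
      {
        "string": "Two is one, one is none",
        "uid": "aB3dE6gH9",
        "children": [
          {"string": "Test your restores", "uid": "Zx8kLm2Qp"}
        ]
      }
    ]
  }
]
""")


def block_to_markdown(block, depth=0, keep_ids=False):
    """Render one block and its children as an indented Markdown list."""
    text = block["string"]
    if keep_ids:
        # Obsidian-style block identifier: a caret plus an ID at the end of the line.
        # Assumption: the UID is already a valid identifier for the target tool.
        text = f"{text} ^{block['uid']}"
    lines = [f"{'    ' * depth}- {text}"]
    for child in block.get("children", []):
        lines.extend(block_to_markdown(child, depth + 1, keep_ids))
    return lines


def page_to_markdown(page, keep_ids=False):
    """Render every top-level block of a page as Markdown text."""
    lines = []
    for block in page.get("children", []):
        lines.extend(block_to_markdown(block, 0, keep_ids))
    return "\n".join(lines)


# Plain Markdown drops the UIDs; the second variant carries them along.
plain = page_to_markdown(SAMPLE_EXPORT[0])
with_ids = page_to_markdown(SAMPLE_EXPORT[0], keep_ids=True)
```

The plain output cannot be turned back into the original graph because the IDs are gone, which is exactly why an export is a transformation and not a backup. A real converter would also have to rewrite the references that point at those blocks and sanitise any UID characters the target tool does not accept.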

Returning to syncing for a moment, here is why syncing and interoperability are hard, even if several tools purport to work together on these plain text files. They each implement things slightly differently, so to be fully interoperable, they would have to be able to continuously transform and export notes. Since they are file-based, you would maintain two repositories of files tailored for each application.

Recoverability (aka backups)

Regular exports of data in common formats like Markdown can provide a level of redundancy in the case of complete loss of access to your tool. For example, as unlikely as it might be, Roam could shut down tomorrow with no notice. Regular exports should absolutely form part of your disaster recovery planning.

You will also need a robust backup strategy to handle more day-to-day concerns and likely disaster scenarios to minimise loss and hassle. I’ve talked before in previous letters and my Roam course about backups. Having learned the hard way, the military adage “Two Is One, One Is None”, is something I try to stick to. Admittedly, it adds to the overhead to maintain the availability of your data, but I can tell you it’s worth it to avoid the crushing feeling of losing your precious data.

Backups protect against a few main risks: hardware failure, hardware theft, software error and, most common of all, user error. Backups can and should occur at several levels: at a general systemwide level and at application-specific levels.

From a business continuity planning perspective, it is essential in organisational settings that your backups are stored offsite in some way. For example, if the building with all the computing equipment and the backups burns down, those backups are pointless.

Transporting those backups should be done securely, too. The data should be encrypted so that should they be lost, they cannot be restored without the key.

And finally, they should be stored in a secure location. For physical media, they ought to be in a safe capable of withstanding fire for several hours to still be usable afterwards.

This level of protection isn’t particularly feasible for most people at a personal level; however, it’s worth doing as much as possible to reduce the risk.

System backups

At a system level, for resilience, your backup should at least be separate from your computer. That means that, at a minimum, the backup must be to an external hard drive, tape drive, or a cloud backup service.

Local media

Using tapes or external hard drives is a simple, easy and relatively inexpensive way of regularly backing up your whole computer. It’s also pretty fast and convenient for restoring information in the event of a computer failure or where you inadvertently delete something you didn’t intend to.

Part of what makes them convenient also makes them easy targets if your property is broken into. If it were data I was after and I had a choice between taking a computer or an external backup drive, I’d go for the drive. It is easier to handle, likely to hold multiple versions of the data, and less likely to be encrypted, because most users don’t think about these externalities.

I’m not going to go into detail on the types of backup strategies you can run. The documentation of your backup solution should detail the options and pros and cons of each.

The essential thing is to ensure you also encrypt your backups and keep the passwords and keys secure.

Consider scenarios such as fire or other natural disasters. In a corporate setting, it’s common for backups to be physically removed and distanced from the server location and returned in a rotation.

Cloud storage backups

A more convenient and practical option than managing and transporting physical media to remote locations regularly would be to use a cloud storage provider for backups.

When you start using one of these services, it could take several days or more to complete the first backup if you don’t have a high-speed internet connection. Also, be aware that restoring will take a long time if it’s your only backup and you have a system failure. Some providers, like Backblaze, can ship you a physical disk with a copy of the backup for a fee, which could be a faster way to get back in business.

As always, you should make sure the backups or the service itself are encrypted. The same rules apply as I discussed in part two. The best solutions are the TNO (Trust No One) style, where you hold the key and backups are encrypted with it on your computer before the encrypted data is shipped to the cloud.

Backup recommendations

As I said earlier, Two is One, and One is None. So here’s how I apply it on my Mac. I run both local, encrypted backups to external hard drives using the built-in Time Machine feature, and for good measure, I’ve also been using Backblaze’s unlimited backup space for years.

Plus (as we do for our commercial clients), I also use Tarsnap to store select de-duplicated data in Amazon’s S3 storage service at a low price.

If you want to keep things simple, I also recommend ArqBackup (which is available for both Windows and Mac). It can handle all the options I’ve discussed in a single package.

Taking regular backups of your entire system is essential; the frequency of those backups will be dictated through a combination of storage capacity, pricing and convenience.

Back up your passwords

In this series, we have discussed the importance of encryption and keeping your passwords and encryption keys secure. I recommend using a solid and trusted password manager like 1Password to securely store them all. You then must ensure that you have a backup method for the password manager.

1Password generates a recovery sheet that you can print and store securely. I recommend this paper copy is kept physically separate from your usual locations. Hand it to one or more trusted family members or friends at the very least (because two is one and one is none). I’m fortunate to have a fire-rated data safe at my office, where I store mine away from my house. A trusted family member holds the spare key and a copy of the recovery sheet in their home safe.

Application level backups

Suppose you’re using your tools-for-thought application/s regularly throughout the day. In that case, it is likely that the system-level backups are not always frequent enough to be sure of minimal data loss–you can do a lot of work you don’t want to risk losing in 15-30 minutes!

Of course, there are the basic table-stakes of having a robust undo feature so you can quickly revert accidental deletions or other recent changes.

One other related aspect of this is versioning. Sometimes you may need to keep several versions. The Heath Robinson (or Rube Goldberg) system is to simply export copies of those versions and save them; this is often most people’s “backup” strategy for documents. You can spot those people by the filenames they use: “My_Report_v2_final_final_a_.docx”!

However, one lesser-known feature in Roam is that you can easily version at the block level using Ctrl-, or right-click the bullet and choose “Add version”. You can create alternative versions of the block and switch between them at will.

For the more file-based PKM apps like Obsidian and Logseq, aside from relying on the system-level file backups, I understand there are plugins which make it easier to use services like GitHub or file storage providers like Google Drive, Dropbox and iCloud to provide more granular versioning of your changes.

Roam itself has an automatic backup feature you can switch on, oddly hidden in the “Export All” menu item. Here you can specify having Roam make a full backup every day or every hour. Plus, you can back up on demand. I recommend setting it to every hour if hard disk space is no issue. I suggest using the manual “backup now” for the cases where you want to get a definitive snapshot before and after vital pieces of work.

Roam’s backups are in the EDN format, which is lossless. They are a valid backup; if you restore them, you will not have lost any fidelity, but they are an all-or-nothing restore. You have to restore an entire graph, so it’s not a quick solution for an inadvertent delete.

Note that neither these backups nor the Markdown and JSON exports are encrypted, even if you’re using an encrypted graph.

Returning to the security considerations discussed in part two, just as with managing the encrypted state of file-based PKMs, the same logic must be applied to the Roam backup files. Roam simply stores them in folders named for the graph with date and time stamped filenames. In my system, these backup files get included in my various encrypted backup sets to Backblaze and Time Machine. Additionally, I use a tool called Hazel to monitor the backup locations. When new files appear, it runs a set of rules to back them up using Tarsnap and periodically cleans up and securely deletes older backups. Otherwise, the folders just continue to grow and grow in size.
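If you don’t use Hazel, the clean-up half of that routine can be approximated with a short script. The Python below is only a rough sketch under assumptions of my own: it is not how Hazel or Roam work internally, the function names are hypothetical, and the "*.edn" pattern assumes the backup files carry that extension. It only works out which files fall outside the newest few; shipping them to Tarsnap and securely deleting them are deliberately left out.

```python
from pathlib import Path


def select_stale(entries, keep):
    """Given (name, modified_time) pairs, return the names that fall outside
    the `keep` most recent ones, ordered oldest first."""
    if keep < 0:
        raise ValueError("keep must be zero or greater")
    newest_first = sorted(entries, key=lambda entry: entry[1], reverse=True)
    return [name for name, _ in reversed(newest_first[keep:])]


def stale_backups(folder, keep=24, pattern="*.edn"):
    """List backup files in `folder` that are candidates for clean-up.

    Assumptions: backups match `pattern`, and file modification time is a
    good enough stand-in for the time the backup was taken.
    """
    entries = [
        (path, path.stat().st_mtime)
        for path in Path(folder).glob(pattern)
        if path.is_file()
    ]
    return select_stale(entries, keep)
```

Calling stale_backups on a backup folder with keep=24 would, with hourly backups, leave roughly the last day untouched and return everything older. Review that list before deleting anything, and only delete once the files are safely inside your other encrypted backup sets.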

Trust but verify

Remember what I said at the outset? One thing that most businesses forget to do is to actually test and verify their disaster recovery plans. Almost every company I’ve consulted with had not been testing their plans (if they even had a plan). They also did not verify that their backups were successful. I learned the hard way early in my IT career that just because you get a successful backup notification, it does not mean the backup was actually successful. The only way you can be sure is to restore the data and verify for yourself that, in broad strokes, it is usable and complete.

This is relatively easy to do with Roam. All you have to do is restore an EDN file. I suggest creating a new graph expressly for testing your backups. Restore the EDN into that graph and visit a few pages containing your most important information to verify the data is intact. It’s infeasible to check everything, so rely on spot checks of recent data and confirm there are no egregious mismatches in file size or graph statistics.
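Two of these checks are easy to automate alongside the manual restore. The Python sketch below is a hedged illustration, not a replacement for actually restoring the graph: the function names and the 20% shrink threshold are my own assumptions. One function fingerprints a file so you can confirm an offsite copy matches the original byte for byte, and the other flags a backup that is empty or suspiciously smaller than the previous one.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large backups don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def size_looks_wrong(previous_size, current_size, max_shrink=0.2):
    """Flag a backup that is empty, or that has shrunk by more than
    `max_shrink` (a fraction) compared with the previous backup."""
    if current_size <= 0:
        return True
    if previous_size <= 0:
        # Nothing to compare against, so there is no basis for a warning.
        return False
    return current_size < previous_size * (1 - max_shrink)
```

If sha256_of returns the same value for the local file and the copy you pulled back from your backup destination, the copy is intact. A True from size_looks_wrong is only a prompt to look closer, since a graph can legitimately shrink after a big clean-up.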

Wrapping up

It’s easy to assume that disaster won’t strike you. Still, with some foresight, planning and diligence, you can remain in total control of your data. Let others handle much of the heavy lifting so you can get on with your important work with peace of mind.


Thanks for reading. And don't forget you can give me your feedback by replying to this email. I read and appreciate them all, even if I cannot respond to everything.

Until next time,

Andy

P.S. I enjoy writing these newsletters, but they take a lot of time to curate and write. I don't seek to monetise them, but the software does cost me real money to send them. If you enjoy my work and find value in the ideas I share, please consider contributing to my running costs. I accept donations via Buy Me a Coffee, where you can now also become a member to support me regularly and get a few perks into the bargain.

A huge thanks to inaugural members Pierre and RJ. I really appreciate your generosity 🙏

Alternatively, if you'd like some help or guidance for making the most of Roam in your note-taking practice, I offer a few private 1-1 Roam coaching sessions.

Andy Henson

I write Letters from a Roaman, curating community news and resources primarily around Roam Research, though I also include other information applicable to other tools for thought and the area in general. I also share my thoughts on a wide variety of tools for thought topics.
