
Broken Ghost Install? Post And Database Recovery For Bricked Instances


Published: at 03:42 AM

— By: Daniel Rosehill


If You Self-Host Ghost, Watch Out For Backups...

Ghost CMS is a nice piece of software and, for many, it’s the gateway drug between the safe but dull pastures of WordPress land and the chaotic sprawling mass of Jamstack and Hugo and … my brain is too much of a mush to remember all the variants.

However, on more than one occasion I’ve discovered that it’s a very fickle creature. My track record with backing up and restoring WordPress installs is pretty much clean, but my meanderings in the world of Ghost have sunk a few hard-written blogs into digital oblivion.

If you’re relatively new to the travails of managing VPSes (ahem, yes, I’m looking at me) you may also at some point learn one of the most infuriating wise-sayings in tech.

Here’s my version of it:

Unless you test your backups, you may learn the very hard and bitter way that they weren’t worth jack sh*t. Yes, even if you were paying a company money to manage them!

By default, Ghost is a conventional CMS, like WordPress.

I’ve never gotten around to figuring out what the opposite of “headless CMS” is, but let’s just say that it’s a headed CMS. Ghost is wonderful and (like WordPress) can be set up in a static or headless configuration, but out of the box the backend and frontend live together on the same server.

One vulnerability inherent in the conventional CMS model is that your production environment may be your only workable copy of your data (in this case, a blog).

It only occurred to me today that this flies in the face of the most elementary practices in backup. The fundamental principle of backup is the 3-2-1 rule (three copies of your data, on two different types of media, with one copy offsite; these days it is often extended into more elaborate variants).

No responsible WordPress webmaster would ever intentionally put themselves in this situation. But it’s somewhat concerning that, unless you proactively ensure that you have a backup copy (ideally two, with one in an offsite location away from production), you’re potentially setting yourself up for data loss.

(Above: one of several tales of woe from the Ghost forums)

Why do I level the charge that Ghost has a backup problem but not WordPress?

WordPress is a vastly more popular script (by orders of magnitude).

There are more WordPress backup plugins than you can shake a stick at and, in my experience at least, most of them do what they promise (namely: avert catastrophe).

Ghost CLI is (sorry to be blunt to whoever has been working on this project) pretty useless as a data recovery and export tool. And the Ghost backups offered by a certain cloud provider, I’ve discovered, can’t be trusted to restore.

Everything of course can be scripted. But a word of warning to those adventuring in the world of alternative CMSes: it’s deceptively easy to wind up with a bricked Ghost instance and no way to get your content out.

Unless…


And If The Sh*t Does Hit The Fan, This Might Save Your Blogs…

Here’s the unfortunate situation I found myself in last week after I decided to move my previous tech blog between VPSes (its precious posts now lost to memory!).

And here are the things I did to attempt to fix the situation that didn’t work:

If you fear that your precious blog content is trapped in your useless server forever, there are a couple of last-ditch measures to try before giving up hope.

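You can recover the content that you created in Ghost if you can put the following two things together: the images you uploaded (which live on the server’s filesystem) and the posts themselves (which live in the database).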

You can and may wish to get more ‘things’ out of Ghost (like a theme that you modified). But let’s assume that you’ve sworn off the tool and you’ll be happy enough if you can get these things out.

Let’s assume that you used a MySQL database when setting up the CMS (it’s the only database recommended for production installs).


Emergency Ghost CMS Data Recovery

Things you’ll hopefully never have to worry about

Step 1: Rescue The Filesystem

(Actually this is step 2. Step 1 is deep breathing!)

Assuming that you have a viable SSH connection to the server, step 1 is to SSH into the server (sorry for the dummy sample IP address!).

A quick poke around the filesystem will reveal where your images were uploaded to.

At the time I’m writing this, the default directory is /content/assets/images (relative to the root path of the Ghost installation on your infrastructure).

Assuming you didn’t change the default configuration, your images will be nested under this folder, organised automatically into a date-based hierarchy, like so:

If it’s a brand new installation, you can save a bit of folder-making by just grabbing them out of the month’s archive. But assuming it’s not, you’ll want to copy everything under /content/assets/images, subfolders included, off to your local filesystem.

I created a folder called blogputbacktogether as I attempted to recover my blogs (humptydumpty would have worked too):
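The exact command isn’t shown here, so what follows is only a minimal sketch of how the copy might be done with scp. The server address, the Ghost install path (/var/www/ghost) and the function name rescue_images are all placeholders I’ve made up for illustration; swap in your own.

```shell
# Sketch only: every argument below is a placeholder, not a value from my server.
rescue_images() {
  img_remote="$1"   # e.g. user@203.0.113.10 (a documentation-only address)
  img_root="$2"     # root of the Ghost install, e.g. /var/www/ghost (assumed)
  img_dest="$3"     # local rescue folder, e.g. ~/blogputbacktogether

  mkdir -p "$img_dest"
  # -r pulls down the whole date-based tree (year/month subfolders) in one go
  scp -r "$img_remote:$img_root/content/assets/images" "$img_dest/"
}

# Example call (run from your local machine):
#   rescue_images user@203.0.113.10 /var/www/ghost ~/blogputbacktogether
```

If the server is still reachable over SSH, this leaves you with an images folder inside the rescue folder that mirrors the layout on the server.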


Step 2: Firewall down, database out: MySQL rescue!

The next step in rescuing your precious content is going to be extracting the database (yes, this is kind of like a really lame digital special ops excursion!).

My Ghost data loss unfortunately turned me off the CMS for good, so as I was planning on destroying the VPS, I threw caution to the wind and pulled down the firewall:

sudo ufw disable

Assuming that you’re on a MySQL database, by default the database probably isn’t accepting direct external connections on its default port of 3306.

Now we’re going to apply some horrible cybersecurity and allow anyone with the password to access the database directly:

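-- MySQL 8.0 and later no longer accept IDENTIFIED BY inside GRANT, so create the user first
CREATE USER 'user'@'address' IDENTIFIED BY 'password';
GRANT ALL ON *.* TO 'user'@'address';
FLUSH PRIVILEGES;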
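With the door propped open like this, one way to actually get a .sql file onto your local machine is mysqldump. This is a rough sketch rather than a record of what I typed: it assumes the database is called ghost, that the MySQL client tools are installed locally, and that the host and user are whatever you used in the grant above. The function name dump_ghost_db is just for illustration.

```shell
# Sketch only: host, user and output file are placeholders.
dump_ghost_db() {
  dump_host="$1"   # address of the Ghost server
  dump_user="$2"   # the MySQL user that was granted remote access
  dump_out="$3"    # where to write the dump, e.g. ghost.sql

  # -p prompts for the password; --single-transaction takes a consistent
  # snapshot of InnoDB tables without locking them while the dump runs.
  mysqldump -h "$dump_host" -P 3306 -u "$dump_user" -p --single-transaction ghost > "$dump_out"
}

# Example call (run from your local machine):
#   dump_ghost_db 203.0.113.10 user ghost.sql
```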

If you’re planning on keeping the VPS after the data rescue mission, please remember to undo all these steps (ie, disable external access and put the firewall back up). Maintaining this configuration in production would be asking to be hacked.
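For completeness, here is a hedged sketch of what undoing the damage might look like, run on the server itself. It assumes the MySQL account was created purely for the rescue (do not drop the account Ghost itself uses), and that you can log in to MySQL as root with a password; on some distributions you would use sudo mysql instead. The function name lock_down_again is made up.

```shell
# Sketch only: assumes a throwaway rescue account and a root MySQL login.
lock_down_again() {
  lock_user="$1"   # the temporary MySQL user
  lock_host="$2"   # the host part it was created with

  # Remove the temporary account so nobody can log in remotely with it
  mysql -u root -p -e "DROP USER '$lock_user'@'$lock_host';"
  # Put the firewall back up (ufw asks for confirmation over SSH)
  sudo ufw enable
}

# Example call (run on the server):
#   lock_down_again user address
```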

I tried to recover the .sql file directly by importing it into DBeaver (this would have been a wonderful timesaver). But the only way I succeeded in actually reading the file’s contents was by … provisioning a MySQL server on my local machine.

Fortunately this doesn’t take too long.

If you don’t have a MySQL server on your local machine then you’re going to need to set one up. If you have one on another server, go ahead and create a new database there. You’re going to want to recreate the schema that Ghost was working with on its original server (in my case I went with the default option of ‘ghost’).

Next, install MySQL Workbench and connect it to your MySQL server (yes, this methodology totally sucks but … it worked!).

Then click into:

Again, I set up a schema called ‘ghost’ on the ‘rescue’ database to match the one that came off the Ghost server:

Then click on ‘Start Import’ and hope that this works:

If everything goes to plan, you’ll have a recreated copy of the database on the new server:
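If you would rather skip Workbench, roughly the same import can be done from the command line. This is a sketch under assumptions, not what I did: it assumes a local MySQL server, a root login with a password, a schema named ghost, and a dump file you pass in. The function name import_ghost_dump is hypothetical.

```shell
# Sketch only: assumes a local MySQL server and a root login.
import_ghost_dump() {
  import_file="$1"   # the dump to load, e.g. ghost.sql

  # Create the schema if it is missing, then replay the dump into it
  mysql -u root -p -e "CREATE DATABASE IF NOT EXISTS ghost;"
  mysql -u root -p ghost < "$import_file"
}

# Example call:
#   import_ghost_dump ghost.sql
```

You will be prompted for the password twice, once per command.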

Go ahead and navigate down to the posts table, which is where the content of the posts is stored.

If you just want to rescue the actual content of the posts, then you can pull the data from either html or plaintext (or you could create a whole new Ghost install and try to patch it up to the database we pulled out … my powers of perseverance were already waning!)

There is nothing quite as satisfying as assuming that you would never see blog posts you invested hours in again and … lo and behold … the data isn’t lost!

You can actually pull the post content out of the database by copying whichever field takes your fancy (I went for the rendered HTML … this is kind of a weird way of creating a static version of your old site!).

The redeeming quality of my Ghost CMS lockout was that it was a very new install.

I was in the lucky position of being able to try to recover the posts individually.

I pulled them out one by one as HTML, copying and pasting the data from the database into a file editor:
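With more than a handful of posts, copying and pasting gets old fast. Here is a rough sketch of scripting the same thing with the mysql command-line client, writing one HTML file per post. It assumes the restored schema is called ghost, that the posts table has slug and html columns, and that your login details live in ~/.my.cnf so the client doesn’t prompt on every call. The function name export_posts is made up.

```shell
# Sketch only: assumes credentials in ~/.my.cnf and a local schema named ghost.
export_posts() {
  posts_out="$1"   # folder for the recovered files, e.g. ~/blogputbacktogether
  mkdir -p "$posts_out"

  # -N drops the header row, -B gives plain batch output: one slug per line
  mysql -N -B ghost -e "SELECT slug FROM posts" </dev/null |
  while IFS= read -r slug; do
    # --raw keeps real newlines in the HTML instead of escaping them
    mysql -N -B --raw ghost \
      -e "SELECT html FROM posts WHERE slug = '$slug'" </dev/null \
      > "$posts_out/$slug.html"
  done
}

# Example call:
#   export_posts ~/blogputbacktogether
```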

We can see image paths and text so … potentially all that we need!

Taking a look at any one of the image paths, we can see that they’ve been rendered in the database like this:

img src="__GHOST_URL__/content/assets/images/2024/05/53c84682-8e30-416c-aaa3-c58fa00455c5.png

Combine Images & Data To Reconstruct Static Post Files

The quick hack?

After you recreate the posts (in HTML) dump the images into the same folder.
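If the images came down in their year and month subfolders, something like the following would flatten them into the folder holding the HTML files. It is a sketch with made-up names (flatten_images and both paths), and it assumes no two images share a filename, since a later copy would silently overwrite an earlier one.

```shell
# Sketch only: both paths are placeholders.
flatten_images() {
  flat_src="$1"    # the rescued images tree
  flat_dest="$2"   # the folder holding the recovered .html files

  # Copy every file out of the nested date folders into one flat folder
  find "$flat_src" -type f -exec cp {} "$flat_dest/" \;
}

# Example call:
#   flatten_images ~/blogputbacktogether/images ~/blogputbacktogether
```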

Then all we have to do is rewrite all the image paths.

To save time, we can import the mixed directory of HTML files and images into VS Code:

You can do a find and replace to strip out the parts of the image paths that no longer exist (the __GHOST_URL__ placeholder and the folders in front of each filename):
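The same find and replace can be done outside the editor. Below is a minimal sketch using sed, assuming the images sit in the same folder as the HTML files and that every path follows the __GHOST_URL__/content/assets/images/year/month/ pattern shown earlier. The function name rewrite_image_paths is hypothetical.

```shell
# Sketch only: assumes image paths follow the year/month pattern shown earlier.
rewrite_image_paths() {
  rewrite_dir="$1"   # folder holding the recovered .html files

  for f in "$rewrite_dir"/*.html; do
    [ -e "$f" ] || continue
    # Drop the prefix up to and including the year/month folders,
    # leaving only the bare filename in each src attribute
    sed 's#__GHOST_URL__/content/assets/images/[0-9]\{4\}/[0-9]\{2\}/##g' "$f" > "$f.tmp" &&
      mv "$f.tmp" "$f"
  done
}

# Example call:
#   rewrite_image_paths ~/blogputbacktogether
```

Writing to a temporary file and moving it back avoids relying on sed’s in-place flag, which behaves differently on Linux and macOS.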

The output isn’t beautiful and your Ghost CMS blog has lost its delightful styling.

But you should be able to open the HTML files in a web browser and read your blog posts again:



Not fun but … lesson learned!