Part of the reason why I’ve not achieved as much as I’d hoped this week is that the web server that hosts another project I look after had a hardware failure. It’s shared hosting with Dreamhost and they attempted to recover the site from their backups.
This is really frustrating for a number of reasons:
- Their backups are three days old and I’ve got daily backups so mine are much newer (they tell users not to rely on their backups… so I didn’t).
- Their backups are incomplete… sigh. When setting up their backups someone obviously decided it would be a good idea to exclude any files found in a directory called ‘tmp’. However, some web frameworks, such as CakePHP, store logs and cache files in a ‘tmp’ directory within the app. The result is that when the host recovers the files from their backups the apps won’t run as they are missing the directory structure they expect in ‘tmp’.
- In order to copy the backed-up files over to the new server and still serve the sites in the meantime, they mount the backup file system across the network. Not only do you get network latency but the restore process is pretty heavy. Since the restore process started the load on the server hasn’t been below 250 and is often up at 500! The result is a site that’s partially broken (see the previous point) and, when it can serve content, the content is out of date and takes 20-30s to be served due to the load.
- When they came to use the backup server they found it to be “degraded”, so the recovery process is taking even longer. It has now been running for four days!!! (Update: still running 12 days later.)
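The missing ‘tmp’ structure from the second point is the kind of gap that can be patched after a restore. Here is a minimal sketch in Python, assuming the directory layout that CakePHP 1.x/2.x apps typically expect under ‘tmp’; the subdirectory list and the `restore_tmp_tree` helper are illustrative assumptions, not something the host or the framework provides, so check them against your own app before relying on them.

```python
from pathlib import Path

# Assumed layout for a CakePHP 1.x/2.x app; adjust for your framework/version.
TMP_SUBDIRS = [
    "cache/models",
    "cache/persistent",
    "cache/views",
    "logs",
    "sessions",
    "tests",
]


def restore_tmp_tree(app_root, subdirs=TMP_SUBDIRS):
    """Create any missing tmp subdirectories under app_root.

    Returns the list of directories that had to be created.
    """
    created = []
    for sub in subdirs:
        path = Path(app_root) / "tmp" / sub
        if not path.is_dir():
            # parents=True also creates tmp/ and tmp/cache/ when absent.
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created
```

Running it against the app directory after a restore creates only what is missing, so it is safe to run more than once. The web server still needs write permission on these directories, which this sketch does not attempt to set.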
I understand the host’s reason for what they’re doing (they can’t rely on everyone to take backups and this is the least hassle recovery path) but they haven’t offered those customers who have good backups an opportunity to get their sites up quickly. I raised a ticket asking to be moved to another server and, although they replied to the ticket, they just referred me to the current “update” message, which apologised for the load on the server and mentioned the “degraded” backup (which I’d already seen). They even had the cheek to suggest one way out of the current predicament was for me to buy one of their Virtual Private Servers! As if, following four days of downtime and with them unable to give an ETA for the end of the recovery, I’d spend money on a VPS with them!
The whole episode has left me wondering what the point of my backups was when I couldn’t use them to recover my site with Dreamhost. So I decided to make use of them: I signed up for some new hosting and moved the site using my backups. This wasn’t a quick or easy process (we’re talking five databases, a mixture of Perl and PHP code and some 5,000+ files) and it took about seven hours’ work (although this wasn’t continuous) to get things set up and tested. There’s also the time it takes for the DNS to propagate, but Dreamhost’s recovery process is so slow that I think the seven hours I spent and the time to propagate (even if it’s 48hrs) have not been lost. There are a couple of bonuses with the new host but the main one is speed: the site zips along faster than it ever has… so, the whole process has been a bit of a pain but not necessarily a bad outcome. The move also prompted me to complete the site documentation I’d been writing so that’s good too.
I still think it’s a massive shame that I’ve had this experience with Dreamhost. They still represent good value for money and offer a lot of flexibility. It’s unfortunate that the flexibility didn’t extend to the recovery process and effectively forced me to move one of my accounts.
UPDATE – After more than two weeks the server was still struggling under high load and the messages in the customer panel were giving no new information. This meant I was right to move the site when I did although I still feel a little sad that the situation led me to have to do that in the first place.