Servers down, lessons learned

My websites and email were down from the afternoon of Thursday, August 1 to Sunday, August 4. If you emailed any turnwall.com or hifinit.com address during that time, we likely did not receive it. I’m extremely sorry for any inconvenience, but we’re back up and running and you can direct all emails to alex@turnwall.com.

The worst part about this entire fiasco was not my personal sites being down. It was the blackout of sites belonging to clients, friends and students who happened to be on a host that I recommended to them (on private accounts that I do not manage). Still, I’ve felt personally responsible, because I led people to that service, and for that, I’m truly sorry. I really hope you were on a server that was fixed quickly, and that it wasn’t too much of an inconvenience to you. Let me know if there’s anything I can do to help.

For the few business clients I’ve recommended to that host—I am incredibly sorry. I know that your businesses depend on your websites and I know that “it’s out of my hands” is not what you wanted to hear last week. If I haven’t spoken with you yet and you want to discuss migrating elsewhere, please email me. Sorry.

The culprit

For years, I’ve been recommending Bluehost as a practical choice for people with a simple website looking for cheap and generally reliable hosting. I liked them because they offered greater flexibility than comparably priced hosts back in the day, and they really did have excellent customer service: you spoke with a knowledgeable person in the U.S., usually without being on hold for more than a minute. For individuals and small businesses, they made sense.

Unfortunately, after a year of decline that culminated in their entire network of sites being down for over 12 hours this past weekend, I no longer recommend Bluehost even as a cheap hosting option. This pains me to say, knowing how many people I’ve led to them, but their customer support has gotten worse over the past year, their uptime has decreased, and the worst part is that they make up excuses for their shortcomings.

I’m not saying 96.0% uptime wasn’t acceptable for me (we should all realize that 99.9% uptime guarantees are kind of a fantasy for shared hosting). What I am saying is when I come to you with many months of average 92-98% uptime reports from multiple trusted sources—I want you to be straight with me about why that is instead of telling me the reports are bogus.
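For a rough sense of what those percentages mean in practice, here is a minimal sketch that converts an uptime figure into hours of downtime. It assumes a 30-day month, and the function name `downtime_hours` is just mine for illustration. It is not how any monitoring service actually computes its reports.

```python
def downtime_hours(uptime_percent, period_hours=30 * 24):
    """Hours of downtime implied by an uptime percentage over a period.

    Assumption: the period defaults to a 30-day month (720 hours).
    """
    return (100.0 - uptime_percent) / 100.0 * period_hours


# Compare the "guaranteed" figure with the range mentioned above.
for pct in (99.9, 98.0, 96.0, 92.0):
    hours = downtime_hours(pct)
    print(f"{pct}% uptime -> about {hours:.1f} hours down per 30-day month")
```

Under that assumption, 99.9% works out to roughly 43 minutes of downtime a month, while 96% is more than a full day and 92% is more than two.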

A not-so-short rant

I’m going to go on a little, because I actually found these types of posts helpful while looking for a new host.

Short story: I don’t recommend Bluehost anymore. I switched to A Small Orange. While I haven’t been with them long enough to make a judgement, so far, so good. They seem more like the Bluehost that I signed up with 5 years ago, and hopefully they don’t head in the same direction.

My original issue was totally unrelated to the big Bluehost blackout, which was an unfortunate coincidence that exacerbated the problem. It was related to “upgrading” to a VPS from a shared hosting account. My experience before and after the blackout makes me think that Bluehost is really not good at anything other than shared hosting at the moment. I knew I was taking a chance and I got it wrong. Yes, there was almost “instant” provisioning of my account. Yes, it’s cheaper than many other hosts. Yes, cPanel and settings transferred from my shared account mostly correctly.

(Yes, after all these years, I was still running on shared hosting too. I know, I know. Although we ran many client projects off of more powerful setups, our own websites have been simple and running on WordPress for years. With the help of caching plugins, they’ve been fast enough and economical for a small business.)

But, to make an administrative change on my account, the online instructions said to contact Bluehost to complete the process. Here’s where the trouble starts. After calling the support number and being placed on hold for about 35 minutes (mind you this was a full day before the blackout stuff), the representative informed me I had called the wrong number. Funny thing—the support number I’ve been calling for years is actually NOT the support number you call for VPS. Okay fine. I thought I’d get even faster support since I was an upgraded customer. Cool.

Not so fast. I transferred to the VPS line. Another 30+ minute wait! Finally, I spoke with somebody who informed me that the upgrade process to the VPS account hadn’t quite finished as intended, and that’s why we had to make these changes. He was very helpful and solved my problem in a few minutes. While I had him on the line I decided to change the primary domain on my account, since I had signed up for hosting with a different domain years before I got my current domain and I wanted to let the original expire. I first had to un-assign my addon domain, which I knew would result in a few minutes of downtime. I asked him how long the changes would take, and he said up to 50 minutes, but more likely 10. That was acceptable. We made the changes, I thanked him, and I thought my little bit of frustration about the wait times would subside.

An hour goes by. Then two. Then three. By this time, it’s the end of the business day and my site is down, as well as all of my email on the turnwall.com domain. So I call the VPS number back. Hold… This time I also open a “live chat” window on their website, thinking it may be faster. 20 minutes later, I’m still on hold, but now I’ve got somebody on chat, so I hang up. The person on chat informs me again that I’m a VPS customer, so he is unable to help me via chat and I need to call the number. I said I had been on hold for 20 minutes and asked if he could transfer me somewhere on chat. He said no, but was nice enough to ask if he could help himself, although, as he pointed out, it was against company policy. I’m paraphrasing here (I wish I had saved the chat):

“Yeah, the VPS guys aren’t really set up yet, I think they’re still working out the kinks. I can’t help you, but let me see what they say.”

Okay, that was nice of him, but the VPS guys are still “working out the kinks”? Huh?

“Oh yeah, they said it looks like the process was never started before. They’re going to start now.”

About 10 seconds go by and…

“Okay, they said you’re all set, but it’ll take up to 50 minutes to go through.”

Yes, that’s what the last guy said, I explained. He said it was actually working this time though, I just had to let it go. By this time, it was nearing the end of the work day and I had other business to take care of. I thanked him and headed out.

The next morning, it was still not resolved. This was Friday—the day of the blackout. I attempted to call and chat again that morning, with no response on either for over 40 minutes. Then I realized all of my sites were down, as was the bluehost.com site and all admin panels. Some Googling later, I found out this was a widespread problem and had actually been going on for hours. Their sites and phone were also down—likely why I could not get through.

This went on for hours. After reading a random blog post, I then found this blog, which they set up to tell customers about the situation. Hours after they claimed to have sites back online, mine were not up and I still could not reach any tech support. Late that night, the sites were going up and down intermittently, but all email was down. It was not until the following morning that I got in touch with support. Again, a long wait. When I finally spoke with somebody, AGAIN he said the process had not started, that he’d start it for me, and that it would take up to 50 minutes. Well, 5 hours later, FINALLY the change went through and the cPanel was stable enough that I could go in and make the necessary changes to get my email and sites back online.

By that time, I had already decided to change hosts. I had been thinking about it for about half a year, spurred by multiple support calls for client websites that left a bad taste in my mouth. This whole thing just pushed me over the edge.

Lesson #1: Don’t put all of your eggs in one basket.

My website hosting, domain registration and email were previously all handled by Bluehost. When their servers came crashing down, so did all of my services. Now I’ve offloaded my domains to one provider, VPS hosting to another, and email to another. I don’t expect perfect performance from vendors 100% of the time—that’s not realistic and, well, s*** happens. Now there’s a hedge—if one kicks the bucket, it’s easier to migrate one service at a time.

Lesson #2: You get what you pay for.

Shared hosting is great for lots of people. It’s a low barrier to entry for small businesses and freelancers. (I didn’t want to shell out for better hosting when I was a college student scraping by to start a freelance career.) But the benefits of upgrading to a VPS can be worthwhile. You get dedicated resources, a faster site, and (hopefully) professional support to go with it. Looking at it now, I really should have made the switch sooner. I felt okay with business expenses like payroll (doing it yourself is not worth it), online collaboration tools (Basecamp is pretty great), and note-taking apps (I’m kind of an Evernote junkie), but I was skimping on arguably the most important one for, you know, a web designer and developer. Really, what was I thinking?

Lesson #3: If you feel like it’s time to upgrade, don’t wait for the “perfect” fit.

I knew the support was going downhill, and I knew I should get my email moved over to a different service. I started to set up Google Apps over two years ago, and I started trials with other hosts two years ago, but I never made the switch. Something else always seemed more pressing and I didn’t want to commit to something new that wasn’t assuredly better. Boy, that was a mistake. In the end, waiting to make the switch cost me more time.

Hindsight is 20/20.