August 4, 2007

Diagnosing Server Issues, and Convincing Tech Support You’re Right.

Some of you may have noticed that dotcult.com, noslang.com and feedbutton.com have been down or very slow for the last few days. I thought it was because of the amount of bandwidth feedbutton was using (see post below this one), but it wasn’t. I kept running the numbers in my head and thought “wow, a dedicated server should be able to handle these 3 websites with no problems.” It turns out I was right.

When my sites weren’t responding I went through the usual checklist. Here’s what I did and my thought process:

  1. Is it my internet connection? A quick visit to some proxy sites shows that the site isn’t responding for them either.
  2. Is the server down? ping…pong it’s responding to that.. A quick look at the Apache logs confirms 150 current requests being processed… it’s not down
  3. Is there a rogue process? su root; ps x …. Nope, everything looks normal here.
  4. Restart Apache anyway. That didn’t seem to work.
  5. Restart the server for the hell of it… that didn’t work either
  6. Must be a network issue. tracert www.dotcult.com … hop hop hop.. interesting….
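
The checklist above is a process of elimination, and it can be sketched as a small Python function. This is only an illustration of the reasoning: the `Checks` fields, the `triage` function and the result strings are all hypothetical names invented for this sketch, and the answers still have to be gathered by hand.

```python
from dataclasses import dataclass


@dataclass
class Checks:
    # All field names are hypothetical; each one records the answer to a checklist step.
    reachable_from_outside: bool  # step 1: did a proxy site manage to load the page?
    responds_to_ping: bool        # step 2: does the server answer ping?
    rogue_process: bool           # step 3: did ps show anything eating the machine?
    fixed_by_restart: bool        # steps 4-5: did restarting Apache or the box help?


def triage(c: Checks) -> str:
    """Walk the checklist in order and return the first suspect that fits."""
    if c.reachable_from_outside:
        return "local connection"        # others can load it, so it's on my end
    if not c.responds_to_ping:
        return "server down"
    if c.rogue_process:
        return "rogue process"
    if c.fixed_by_restart:
        return "transient server fault"
    return "network path"                # everything else ruled out, go run a traceroute


# The situation described in this post: unreachable from outside, pingable,
# no rogue process, restarts didn't help.
print(triage(Checks(False, True, False, False)))
```

The order matters: each check is cheaper than the one after it, and the network is only blamed once everything under my control has been ruled out.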

After it reaches a GoDaddy IP address (ip.secureserver.net) the final 5 hops take 778ms. This tells me that the problem is somewhere within GoDaddy’s network. It also tells me that there’s nothing I can do to fix this problem. I’m stuck calling GoDaddy.
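
Spotting where the latency piles up can be automated. The Python sketch below parses Windows-style `tracert` output and reports the hop with the largest increase in average round-trip time over the previous hop. The `SAMPLE` text is made-up illustrative data using reserved documentation IP addresses, not the actual trace from this incident, and `parse_hops` and `biggest_jump` are hypothetical helper names.

```python
import re

HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")   # hop number, then the rest of the line
RTT_RE = re.compile(r"<?(\d+)\s*ms")        # matches "12 ms" and "<1 ms"

# Made-up example output (documentation IP ranges), NOT the real trace.
SAMPLE = """
  1    <1 ms    <1 ms    <1 ms  192.0.2.1
  2    12 ms    11 ms    13 ms  198.51.100.7
  3    30 ms    29 ms    31 ms  198.51.100.20
  4   240 ms   250 ms   260 ms  203.0.113.5
  5   400 ms   410 ms   420 ms  203.0.113.9
"""


def parse_hops(text):
    """Return a list of (hop_number, host, average_rtt_ms) tuples."""
    hops = []
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if not m:
            continue  # header or blank line
        rtts = [int(x) for x in RTT_RE.findall(m.group(2))]
        if not rtts:
            continue  # every probe timed out ("* * *")
        host = m.group(2).split()[-1].strip("[]")
        hops.append((int(m.group(1)), host, sum(rtts) / len(rtts)))
    return hops


def biggest_jump(hops):
    """Return (hop, delta_ms) for the largest rise in average RTT, or None."""
    best = None
    for prev, cur in zip(hops, hops[1:]):
        delta = cur[2] - prev[2]
        if best is None or delta > best[1]:
            best = (cur, delta)
    return best


hop, delta = biggest_jump(parse_hops(SAMPLE))
print(f"Largest jump: +{delta:.0f} ms at hop {hop[0]} ({hop[1]})")
```

One caveat: a single slow hop doesn’t prove much on its own, because routers often give low priority to answering traceroute probes. The jump is only meaningful if the later hops stay slow too, which is what the sample data shows.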

After about 30 seconds of hold time (impressive!) I was met by a guy on the other end. Before I could tell him what the issue was, he explained that I have an unassisted server plan, and thus he couldn’t help me. He also told me that he could access my sites fine. I tried to explain that he wouldn’t see the problem when accessing them locally from his own network, and asked him to try a proxy site or an external internet connection. He refused.

30 minutes later, I finally got a manager on the phone. He gave me the same shtick about having to upgrade to an assisted plan. After explaining what I’d already tried, and assuring him that I know my way around a Linux box, I told him it had to be a network issue and asked him to walk down to his server team and ask them to debug. I had to threaten to close my account, but he finally obliged.

Only then did he realize that they were having some hardware issues. Apparently the issues were bigger than just my site because he seemed to have an “oh shit” reaction to what he found. He told me that it would take a day or 2 to fix the issue, and thanked me for reporting it.

He was right. The next day the problems were gone and my sites are currently responding faster than ever. (I know I’m probably jinxing myself with this one.)

I’m just shocked that I had to go through all of this. It took him about 5 minutes to find the issue once he went down to his team in the server room, but it should have been found and fixed before I even called.

  1. What a wanker. Seriously, if you want fucking shit customer service you need look no further than webhosts. It seems that because they’re not physically customer facing they feel they can fob you off with any old shit.

    My webhosts (sitehq) have pretty dreadful customer service (well, to be honest, there’s only one guy who answers my tickets and I think it’s more of a personality clash, so I accept part of the responsibility if I’m being fair). The point is, I said I was going to take my business elsewhere and they didn’t care, just closed the ticket without comment – where do you go from there? As a consumer the only leverage we have is our trade, and if the company in question doesn’t care about keeping you – you’re totally screwed.

    Comment by Alexander — August 6, 2007 @ 5:34 am

