  1. #1
    Moderator
    Join Date
    April 6th, 2006
    Posts
    2,689
    Webmaster tools - googlebot error message
    I've been scratching my head all day on this one...

    I launched a new domain 3 months ago (not an affiliate site). Checking Webmaster Tools, it seems that Google stopped crawling the site 4 days ago: zero time spent, zero pages, and the crawl graph has flatlined.

    There is also an error message that indicates "robots.txt" is unreachable. And when I tried to "Fetch as googlebot", it returns another unreachable error.

    There are only 5 domains on this server, and this is the only account with a problem (i.e., it's not server-wide).

    I have seen this message before. It usually indicates Googlebot is being blocked at the firewall level, so I contacted the host. There is no Google IP on the blacklist, but they can't find a record of my "Fetch as Googlebot" attempt either.

    On a hunch (not sure why), I set up the "non-www" version of the domain in Webmaster Tools... and when I "Fetch as Googlebot", it successfully finds robots.txt and shows that a few pages have been crawled.

    So.. "www.newdomain.com" can't be reached.. but "newdomain.com" is fine..?

    And no, there is no rewrite rule in htaccess to address a canonical issue.

    Does any of this make sense.. or could it be a wacky webmaster tool issue..?
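One way to narrow this down from outside GWT is to request robots.txt on both hostnames and compare the results. The sketch below is only an illustration, not something anyone in the thread ran: the function names and the User-Agent string are made up, and "newdomain.com" is the placeholder used in the post. It also runs from your own machine, so it cannot show whether a firewall is treating Google's IP ranges differently.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def host_variants(domain):
    """Return [bare, www] hostnames for a domain given in either form."""
    bare = domain.lower().strip().rstrip(".")
    if bare.startswith("www."):
        bare = bare[len("www."):]
    return [bare, "www." + bare]


def robots_status(host, timeout=10):
    """Fetch http://<host>/robots.txt; return the HTTP status or an error string."""
    request = Request(
        "http://%s/robots.txt" % host,
        headers={"User-Agent": "robots-check-sketch"},  # hypothetical UA string
    )
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # The server answered, but with an error status (404, 403, 500, ...).
        return err.code
    except (URLError, OSError) as err:
        # DNS failure, refused connection, timeout, and similar.
        return "unreachable: %s" % err
```

Calling `robots_status(host)` for each entry in `host_variants("www.newdomain.com")` gives one result per hostname. If the two differ, the problem is probably in DNS or the virtual host setup rather than in GWT itself.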

  2. #2
    ABW Ambassador 2busy's Avatar
    Join Date
    January 17th, 2005
    Location
    Tropical Mountaintop
    Posts
    5,636
    Some time ago GWT started requiring you to verify both versions (www and non-www) and then select one or the other. Having a redirect for canonical issues didn't make any difference. Don't look at the non-selected version. Some people say to delete the unused version in GWT; I have done that for some but not all of my domains, and have seen no issues after deleting the unused versions. They pay no attention if you select a target location.
    I still go in and read that their crawler can't access 34,000 pages, but any of them that I test has no problem. They show 404 errors for directories I removed a year ago, even though I went through all their noindex and URL removal steps. They are just flaky some days. I go back in a few days and they seem to have settled down. It is hard to take their data seriously, but once in a while they find something that does need attention.

  3. #3
    Moderator
    Join Date
    April 6th, 2006
    Posts
    2,689
    Thanks for the feedback..

    Should I just keep my fingers crossed and run with the "non-www version" that it can actually find..?

    It's just weird that the crawl error indicates "www" is unreachable, while "non-www" is fine. I've already spent a couple of hours on the phone with my host and they checked every firewall possible. All accounts on the server are under my control.

    I know GWT oddities tend to fix themselves, but a flatline for 3 days is a little unsettling!

  4. #4
    ABW Ambassador 2busy's Avatar
    Join Date
    January 17th, 2005
    Location
    Tropical Mountaintop
    Posts
    5,636
    Did you verify both the www and non-www versions already? If you have verified both versions and then selected the non-www version, that's all you can do. If it is a WP site, you might want to check that their verification code is still in place (unless you chose to use a page for verification) and that your WP settings for the WP URL match up with what GWT is using. I don't think anyone can give you valid advice regarding why things sometimes seem weird with gugel; I was more commiserating than offering advice there. Sometimes they are strange, but I would be concerned about three days with no crawling too. Have you viewed the raw access logs on that site? Are other SEs getting in?

  5. #5
    Moderator
    Join Date
    April 6th, 2006
    Posts
    2,689
    Yes, both versions have now been verified.

    I was hoping someone could shed some light on the reason a bot could access "non-www", but indicate "www" is unreachable... as I'm not familiar with canonical issues. It's a real concern as the "www" version has now lost what little ranking it had earned. This is a content/ecommerce site, not an affiliate site.

    The Apache logs show some SE access, which is a good thing. Short of moving the account, I don't know what else to do.
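For anyone wanting to do the same log check, here is a rough sketch that tallies crawler hits in an Apache access log by user-agent and status code. It assumes the standard "combined" log format. The sample lines are made up for illustration, and the bot list is just a starting point. Note that the default combined format does not record the requested hostname, so telling "www" from "non-www" hits would need per-vhost logs or a custom LogFormat that includes `%{Host}i`. User-agent strings can also be spoofed.

```python
import re
from collections import Counter

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Substrings to look for in the user-agent; extend as needed.
BOTS = ("Googlebot", "bingbot", "Slurp")


def count_bot_hits(lines):
    """Count (bot, status) pairs for lines whose user-agent names a known crawler."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in combined format
        agent = match.group("agent").lower()
        for bot in BOTS:
            if bot.lower() in agent:
                hits[(bot, match.group("status"))] += 1
                break
    return hits


# Made-up sample lines (documentation IP range), for illustration only.
SAMPLE = [
    '192.0.2.10 - - [01/Jan/2012:10:00:00 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.11 - - [01/Jan/2012:10:05:00 +0000] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '192.0.2.12 - - [01/Jan/2012:10:06:00 +0000] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (Windows NT 6.1)"',
]
```

Running `count_bot_hits(open(path))` on a real log and finding no Googlebot entries at all for the affected period would fit with the requests never reaching Apache, which points at DNS or the network rather than .htaccess.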

  6. #6
    15 years and counting
    Join Date
    January 18th, 2005
    Posts
    6,121
    Webmaster Tools is far from perfect.
    For each of my sites I selected the preferred domain a long time ago (in site configuration, settings, preferred domain: Display URL as...) and all was fine.
    Last week I had a bunch of messages with:
    You have more sites.
    You're a verified owner of these sites. (Showing the configuration I had NOT selected) Would you like to add them to your account?

    Over the years I've seen the people at Webmaster Tools doing crazy things. They have a real problem!!! And they can get you in trouble by pushing you into modifications you should not make.
    I've seen them unable to read site maps, robots.txt, and unable to crawl a site...
    First, you think you have a problem. You change your site maps... NO, IT'S THEM!!! Poor programming skills or whatever.
    Give them a few weeks (or days) and it could be back to normal...

    That being said, it's better to check Webmaster Tools carefully and read their messages. You can learn something important.

  8. #7
    ABW Ambassador superCool's Avatar
    Join Date
    April 23rd, 2008
    Location
    Texas
    Posts
    1,268
    Have you tried removing or renaming your .htaccess file? Maybe it has a coding error. Also possibly try removing robots.txt and putting in a new generic one to see if it will load.

    just grasping at straws.... hope you get it sorted
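As a sketch of what "a new generic one" could look like and how to sanity-check it before uploading: the snippet below uses Python's standard `urllib.robotparser` to confirm that an allow-everything robots.txt permits Googlebot, and that a block-everything one does not. The `allows` helper and the URL are made up for illustration. This only tests the rules in the file, not whether the file is reachable on the server.

```python
from urllib.robotparser import RobotFileParser

# A generic robots.txt that allows every crawler everywhere.
GENERIC_ROBOTS = """\
User-agent: *
Disallow:
"""

# For contrast: a robots.txt that blocks every crawler everywhere.
BLOCK_ALL_ROBOTS = """\
User-agent: *
Disallow: /
"""


def allows(robots_text, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)
```

For example, `allows(GENERIC_ROBOTS, "Googlebot", "http://www.newdomain.com/page.html")` should come back true, and the same call with `BLOCK_ALL_ROBOTS` should come back false.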

  9. #8
    Moderator
    Join Date
    April 6th, 2006
    Posts
    2,689
    Just to provide an update - after a few more days of flatlined stats for the "www" site (and seeming success for the "non-www"), I decided to move the site. I have two VPS accounts with the same provider - this site had been installed on my development/backup server, so I simply moved it to the second server.

    Within a few hours of the DNS change, GWT now sees the "www" version, and "Fetch as Googlebot" returns "Success".

    I'm not sure I will ever figure out what happened here. The host got involved well beyond their usual support, and I even had a programmer pal review the Apache logs, .htaccess, etc., but no one could find any evidence of a problem.

    It may have been something installed on the development server that Google had flagged. Not quite a bad neighbourhood (I'm the only resident!), but enough to cause a hiccup. Who knows... but the move fixed the problem.

    Thanks to everyone for support/suggestions!
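Since the fix coincided with a DNS change, one more check that may be useful in a similar situation is to compare what the bare and "www" hostnames resolve to. Nobody in the thread established that DNS was the cause, so treat this as a diagnostic sketch only. The function names are made up, and the resolver is passed in as a parameter so the comparison can be tried without touching the network.

```python
import socket


def resolve(host):
    """Return the set of IP addresses a hostname resolves to (empty set on failure)."""
    try:
        infos = socket.getaddrinfo(host, 80, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def compare_hosts(bare, resolver=resolve):
    """Resolve the bare and www hostnames and report whether they share an address."""
    bare_ips = resolver(bare)
    www_ips = resolver("www." + bare)
    return {
        "bare": sorted(bare_ips),
        "www": sorted(www_ips),
        "overlap": bool(bare_ips & www_ips),
    }
```

If `compare_hosts("newdomain.com")` showed an empty "www" list or no overlap, that would be a reason to look at the DNS records before anything on the web server.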
