  1. #1
    Web Ho - Design B!tch ~Michelle's Avatar
    Join Date
    January 18th, 2005
    Location
    Michigan
    Posts
    2,040
    Google, Ask Jeeves ignoring robots.txt?
    It seems to me that Google, Ask Jeeves and others are ignoring my robots.txt file. They used to obey it, but now they are spidering and listing whatever they dang please. I can't figure out why.

    For example, I have the following in my robots.txt file.

    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /ad_images/
    Disallow: /stats/

    User-agent: Mediapartners-Google*
    Disallow: /cgi-bin/
    Disallow: /ad_images/
    Disallow:/stats/

    User-agent: Googlebot-Image
    Disallow: /


    But Google is spidering and listing my /cgi-bin/ like crazy, as is Ask Jeeves. Google is also listing images from my site in its image search.

    They are also crawling into a server-side directory called /cgi-sys/ and listing everything there. I contacted my host and they keep telling me that is impossible, but the bots go there anyway, munching away and indexing.

    My bandwidth is getting pounded because of this. This has just happened in the last couple of months. Prior to that all was fine.

    Is anyone else having this problem?
    ~Michelle
    "All I ask is a chance to prove that money can't make me happy."
    "Work to become, not to acquire." -- Confucius

  2. #2
    Not Verif-Lidated infoTim's Avatar
    Join Date
    January 18th, 2005
    Location
    Sunny Florida
    Posts
    1,021
    I would check your server logs to be sure that it's actually pulling your robots.txt file ok and there's not some file permission issue or something that's getting in the way. I've done that before. :-(
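
A minimal sketch of that log check, assuming the server writes the Apache combined log format. The sample lines, IP addresses, and "ExampleBot" are made up for illustration; on a real server you would pass the open access log instead. It tallies the status code each user agent received for /robots.txt, so a 403 or 404 there would point to a permissions or path problem.

```python
import re
from collections import Counter

# Combined log format (assumed):
# ip ident user [time] "METHOD path proto" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def robots_fetches(lines):
    """Count (user agent, status) pairs for requests to /robots.txt."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("path").split("?")[0] == "/robots.txt":
            counts[(match.group("agent"), match.group("status"))] += 1
    return counts


# Hypothetical sample lines; replace with open("access.log") on a real server.
sample = [
    '192.0.2.10 - - [01/Apr/2005:10:00:00 -0500] "GET /robots.txt HTTP/1.0" '
    '200 180 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.10 - - [01/Apr/2005:10:00:05 -0500] "GET /cgi-bin/search.cgi HTTP/1.0" '
    '200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.20 - - [01/Apr/2005:10:01:00 -0500] "GET /robots.txt HTTP/1.0" '
    '403 210 "-" "ExampleBot/1.0"',
]

fetch_counts = robots_fetches(sample)
for (agent, status), hits in sorted(fetch_counts.items()):
    print(f"{status} x{hits}  {agent}")
```

Anything other than a 200 next to a crawler's name is worth chasing before blaming the crawler.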

    - Tim
    Tim
    consultant by day, affiliate by night

  3. #3
    Crazy like a fox suzigeek's Avatar
    Join Date
    January 18th, 2005
    Posts
    1,096
    I think I read that someone else on this board was having this problem recently.

    Anyway, someone suggested it may not be Google at all but someone using Googlebot as a user agent. I think you can check the IP to see whether it is actually Googlebot.
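
One way to do that IP check is a reverse DNS lookup followed by a forward lookup that confirms the name maps back to the same address. The sketch below assumes genuine Googlebot hosts resolve to names under googlebot.com or google.com. The function name and the injectable lookup parameters are hypothetical, added only so the logic can be exercised without network access.

```python
import socket


def is_real_googlebot(ip, reverse_lookup=socket.gethostbyaddr,
                      forward_lookup=socket.gethostbyname_ex):
    """Reverse-resolve the IP, check the domain, then forward-confirm it."""
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:
        # No reverse DNS record: cannot be confirmed as Googlebot.
        return False

    # Assumed domain suffixes for genuine Googlebot hosts.
    if not hostname.endswith((".googlebot.com", ".google.com")):
        return False

    try:
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False

    # The name must map back to the IP that made the request.
    return ip in addresses
```

Calling `is_real_googlebot()` with an address copied from the access log performs live DNS lookups. A visitor that only claims to be Googlebot in its user-agent string should fail either the domain check or the forward confirmation.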

    I've never had Google index folders I've disallowed.
    Suz~~GearGirl~~

  4. #4
    Newbie
    Join Date
    March 11th, 2005
    Posts
    35
    Originally posted by suzigeek: "I've never had Google index folders I've disallowed."
    Google has changed and is not playing by the rules we used to know to be true.

  5. #5
    Lite On The Do, Heavy On The Nuts Donuts's Avatar
    Join Date
    January 18th, 2005
    Location
    Winter Park, FL
    Posts
    6,930
    Test your robots.txt file here to make sure it's not your fault somehow...

    http://www.searchengineworld.com/cgi-bin/robotcheck.cgi
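
The file can also be checked offline. This is a rough sketch using Python's standard `urllib.robotparser` on the rules quoted in post #1. The test paths are hypothetical, and Python's parser is only one interpretation of the rules, so a real crawler may read the file differently.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content as quoted in post #1.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /ad_images/
Disallow: /stats/

User-agent: Mediapartners-Google*
Disallow: /cgi-bin/
Disallow: /ad_images/
Disallow:/stats/

User-agent: Googlebot-Image
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical paths, chosen to exercise each group of rules.
checks = [
    ("Googlebot", "/cgi-bin/script.cgi"),
    ("Googlebot", "/cgi-sys/page.html"),
    ("Googlebot-Image", "/images/logo.gif"),
]

results = {(agent, path): parser.can_fetch(agent, path) for agent, path in checks}
for (agent, path), allowed in results.items():
    print(f"{agent:16} {path:24} {'allowed' if allowed else 'blocked'}")
```

Under this parser, /cgi-bin/ and the image path come back blocked, while the /cgi-sys/ path comes back allowed, because the file as posted has no Disallow line for /cgi-sys/.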

  6. #6
    Web Ho - Design B!tch ~Michelle's Avatar
    Join Date
    January 18th, 2005
    Location
    Michigan
    Posts
    2,040
    I tested the robots.txt file and it passed with flying colors. I also checked whether Google has been grabbing the robots.txt file, and it has.

    Another one that is just gobbling up my site is:

    Mozilla/4.0 compatible ZyBorg/1.0 Dead Link Checker (wn.dlc@looksmart.net; http://www.WISEnutbot.com)

    I want to ban it if it is only some Dead Link Checker. Does anyone know if that is all it is?
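
If the decision is to ban it, a dedicated group in robots.txt is the usual first step. The sketch below assumes the crawler honors robots.txt and matches on the token "ZyBorg", which is taken from its user-agent string and not confirmed here. It uses Python's `urllib.robotparser` to check that the proposed group blocks that token without affecting other crawlers.

```python
from urllib.robotparser import RobotFileParser

# Proposed addition to robots.txt; the "ZyBorg" token is an assumption
# based on the user-agent string quoted above.
PROPOSED_GROUP = """\
User-agent: ZyBorg
Disallow: /
"""

parser = RobotFileParser()
parser.parse(PROPOSED_GROUP.splitlines())

zyborg_allowed = parser.can_fetch("ZyBorg", "/index.html")
googlebot_allowed = parser.can_fetch("Googlebot", "/index.html")

print("ZyBorg allowed:", zyborg_allowed)
print("Googlebot allowed:", googlebot_allowed)
```

A crawler that ignores robots.txt would need to be blocked at the server level instead.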
    ~Michelle
    "All I ask is a chance to prove that money can't make me happy."
    "Work to become, not to acquire." -- Confucius
