  1. #1
    Moderator leeann's Avatar
    Join Date
    January 18th, 2005
    Posts
    2,955
    Why Are Spiders a Bad Thing?
    I've noticed many people here don't like to be spidered. I guess I don't understand the whole thing behind it. I thought spiders helped you get placement on search engines. But I guess not? or not always? or some? What are bad spiders and how can they hurt you? Is there a way to block out the bad and keep the good?

    Thanks in advance for anything you can offer to help me understand it all.
    leeann


    Shoppers determine what has value and they like coupons. Stop manipulating who set the cookie just because you do not like coupon and promotional sites.

  2. #2
    Member
    Join Date
    January 18th, 2005
    Location
    Hawke's Bay, New Zealand
    Posts
    91
    Google spider = good.

    legitimate spider for new search engine with bad programming that eats up a month's worth of bandwidth every two hours and won't go away = bad

    anonymous-gonna-rip-your-site-and-put-it-up-as-mine spider = bad.

  3. #3
    Moderator MichaelColey's Avatar
    Join Date
    January 18th, 2005
    Location
    Mansfield, TX
    Posts
    16,232
    There are tons of spiders/bots that don't have anything to do with search engines. There are spiders that harvest email addresses. There are spiders that steal content. There are spiders that search for vulnerabilities. I think most of us are happy about the search engine spiders, but the others (at best) are just a waste of bandwidth.
    Michael Coley
    Amazing-Bargains.com
     Affiliate Tips | Merchant Best Practices | Affiliate Friendly? | Couponing | CPA Networks? | ABW Tips | Activating Affiliates
    "Education is the most powerful weapon which you can use to change the world." Nelson Mandela

  4. #4
    More Cheesier Than Ever Cheesehead's Avatar
    Join Date
    January 18th, 2005
    Location
    Land of The NFL Champs!
    Posts
    2,942
    I would imagine they could bog down your site enough to make it inaccessible. Spiders seem to come by mostly in the wee hours of the night during low-traffic periods, perhaps for that reason.
    This World is Not My Home
    We're gonna go inside, we're gonna go outside, inside and outside. . . And then we're gonna go go go and we're not gonna stop til we get across that goalline! Quotes from the movie Rudy, 1993

  5. #5
    Member
    Join Date
    January 18th, 2005
    Location
    Hawke's Bay, New Zealand
    Posts
    91
    In terms of keeping them out, you can create a robots.txt file that will tell the polite ones that bother to obey it whether and where they are allowed to spider.

    As far as the impolite ones that don't follow robots, then you have to get cunning with things like .htaccess and spider traps to forcibly exclude them from your site.
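
    To see what "obeying robots.txt" means from the spider's side, here is a rough Python sketch using the standard library's robots.txt parser. The bot name "BadBot", the /private/ path, and the example.com URLs are made up for illustration. A rule-following crawler runs a check like this before each fetch; a rude one simply never asks.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: bot names and paths are placeholders, not a recommendation.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a crawler that honors robots.txt would fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse the rules without any network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    checks = [
        ("Googlebot", "http://example.com/index.html"),
        ("Googlebot", "http://example.com/private/page.html"),
        ("BadBot/1.0", "http://example.com/index.html"),
    ]
    for agent, url in checks:
        print(agent, url, allowed(agent, url))
```

    Nothing here enforces anything. The file is only a request, which is why the impolite spiders need the .htaccess treatment.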

  6. #6
    Crazy like a fox suzigeek's Avatar
    Join Date
    January 18th, 2005
    Posts
    1,096
    Hi Leeann,


    Search engine spiders are GREAT!! But there are "rogue" spiders out there that will scrape your site or do other nefarious things, which I don't fully understand. So ideally spiders are good, but just like everything in this world, there are those that spider your site for bad reasons and suck up your bandwidth.

    There are ways to block spiders by IP address or user agent through your .htaccess file, either sitewide or for your whole server if you have a dedicated server.
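
    As a rough illustration of the IP side, here is a small Python sketch of the check a blocklist performs. The ranges below are reserved documentation addresses, not real bad bots, and the name BLOCKED_NETWORKS is just a placeholder. In practice the same test would live in your server config rather than in a script.

```python
import ipaddress

# Hypothetical blocklist: reserved documentation ranges, NOT real offenders.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),      # a whole range
    ipaddress.ip_network("198.51.100.17/32"),  # a single address
]


def is_blocked(ip: str) -> bool:
    """Return True if `ip` falls inside any blocked network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_NETWORKS)


if __name__ == "__main__":
    for ip in ("192.0.2.55", "198.51.100.17", "203.0.113.9"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

    Blocking a range catches a bot that hops between neighboring addresses, but it also catches any legitimate visitor in that range, so it pays to keep ranges narrow.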

    I think Gordon has a thread going somewhere on the board with known bad bot/spider IP addresses that he's been updating. Sorry I don't have the link, but someone will prolly point it out soon enough...

    HTH!


    added--wow we all posted at the same time!
    Suz~~GearGirl~~

  7. #7
    Moderator leeann's Avatar
    Join Date
    January 18th, 2005
    Posts
    2,955
    Quote Originally Posted by ulteriormotif
    In terms of keeping them out, you can create a robots.txt file that will tell the polite ones that bother to obey it whether and where they are allowed to spider.

    As far as the impolite ones that don't follow robots, then you have to get cunning with things like .htaccess and spider traps to forcibly exclude them from your site.
    I was afraid of that.. so in other words I need to learn something else..lol. This can really get overwhelming! lol Is there a bad list out there somewhere? Not that it will do me much good..but at least I can lose more sleep at night when I see they've hit my site.
    leeann



  8. #8
    Moderator leeann's Avatar
    Join Date
    January 18th, 2005
    Posts
    2,955
    Quote Originally Posted by ulteriormotif
    In terms of keeping them out, you can create a robots.txt file that will tell the polite ones that bother to obey it whether and where they are allowed to spider.

    As far as the impolite ones that don't follow robots, then you have to get cunning with things like .htaccess and spider traps to forcibly exclude them from your site.
    .htaccess reminds me of my XP registry - the twilight zone I never wander into. In fact, I don't even know if I have one. But if I do, I'm afraid of doing major damage "in there"... wherever it is...
    leeann



  9. #9
    ABW Ambassador Andy's Avatar
    Join Date
    January 18th, 2005
    Posts
    4,178
    Hi LeeAnn,

    Just do a search on your favorite SE for "ban bad bots" or something to that effect. You'll see quite a few links to read. There are lists of User Agents to deny that you place in your .htaccess file. This prevents the bad bots from accessing your site, based on their User Agent information. It's not foolproof, as this info can be manipulated to show a different agent.

    You need to be careful, and make sure you understand what you're doing, because you can do more harm than good if you end up banning a good bot. Read up on it a little, you'll understand how it works.
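
    For a feel of what those User Agent deny lists look like, here is a hedged Python sketch that writes out the rules. It assumes an Apache server with mod_setenvif and the older 2.2-style Order/Allow/Deny access directives; newer Apache versions use Require instead, so check your host's documentation before pasting anything. The agent names and the bad_bot variable are placeholders, not a vetted list.

```python
import re

# Placeholder names for illustration only; this is not a vetted list of bad bots.
EXAMPLE_BAD_AGENTS = ["EmailSiphon", "SiteRipper"]

# Only plain tokens are accepted so nothing odd ends up inside the Apache pattern.
_SAFE_TOKEN = re.compile(r"[A-Za-z0-9_-]+")


def htaccess_rules(bad_agents):
    """Build Apache 2.2-style .htaccess text denying the given User-Agent substrings."""
    lines = []
    for name in bad_agents:
        if not _SAFE_TOKEN.fullmatch(name):
            raise ValueError(f"refusing unusual agent name: {name!r}")
        # Tag any request whose User-Agent contains the name (case-insensitive).
        lines.append(f'SetEnvIfNoCase User-Agent "{name}" bad_bot')
    # Allow everyone, then deny requests that were tagged above.
    lines += ["Order Allow,Deny", "Allow from all", "Deny from env=bad_bot"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(htaccess_rules(EXAMPLE_BAD_AGENTS))
```

    Two cautions from the post still apply: a bot can send any User Agent string it likes, and a careless pattern such as "bot" would also match Googlebot and lock out the spiders you want.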

  10. #10
    Moderator leeann's Avatar
    Join Date
    January 18th, 2005
    Posts
    2,955
    Thanks for the tips. Too bad the hosting companies don't offer this as a service.
    leeann



  11. #11
    MasterMike HardwareGeek's Avatar
    Join Date
    January 18th, 2005
    Posts
    3,810
    I like Google, Yahoo, Jeeves, MSN, and other SE spiders. They are welcome on my site 24/7, and there tend to be at least 10 Google spiders on my site at all times and another 1 to 3 for the rest.

    I lub all kinds of spiders except the ones that carry diseases such as steal-a-content-itis and hacker-assister-itis.

  12. #12
    Full Member
    Join Date
    February 14th, 2005
    Location
    Texas
    Posts
    220
    I like to check my log files at least once per week. In the last two months, I'm finding more and more spiders for upcoming search engines. When they are really new it's hard to know what their true intentions are, but with the ton of money that Google's owners made when it went public, there are going to be many more investors who will want to try and take a piece of that big pie.
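
    If reading raw logs by eye gets tedious, a weekly check like this can be roughed out in Python. The sketch assumes the log is in Apache's "combined" format (the one that records the User Agent); the sample lines and the bot name ExampleNewBot are invented for illustration.

```python
import re
from collections import Counter

# Matches Apache "combined" log lines; lines in other formats are skipped.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) '
    r'(?P<size>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)

# Invented sample data; point summarize() at your real access log instead.
SAMPLE = [
    '192.0.2.10 - - [10/Aug/2005:03:12:01 -0500] "GET / HTTP/1.1" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.99 - - [10/Aug/2005:03:12:05 -0500] "GET /a.html HTTP/1.1" 200 2048 "-" "ExampleNewBot/0.1"',
    '192.0.2.99 - - [10/Aug/2005:03:12:06 -0500] "GET /b.html HTTP/1.1" 304 - "-" "ExampleNewBot/0.1"',
]


def summarize(lines):
    """Return (hits, traffic): request counts and bytes sent, keyed by User Agent."""
    hits, traffic = Counter(), Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # not a combined-format line
        agent = m.group("agent") or "(blank)"
        hits[agent] += 1
        size = m.group("size")
        traffic[agent] += 0 if size == "-" else int(size)  # "-" means no body was sent
    return hits, traffic


if __name__ == "__main__":
    hits, traffic = summarize(SAMPLE)
    for agent, count in hits.most_common():
        print(f"{count:6d} requests {traffic[agent]:10d} bytes  {agent}")
```

    Sorting by bytes instead of requests is an easy change and tends to surface the bandwidth hogs faster.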

  13. #13
    Full Member Travelin Man's Avatar
    Join Date
    January 18th, 2005
    Posts
    409
    Quote Originally Posted by mhutch
    In the last two months, I'm finding more and more spiders for upcoming search engines.
    Me too, although I'm blocking those wannabe SEs until they actually become a search engine. Some of those outfits have been collecting data for years. Stop guzzling my bandwidth and launch the SE already.
    Travelin' Man

    "If you don't know where you are going, any road will lead you there." -- unknown

  14. #14
    Newbie Knothead's Avatar
    Join Date
    July 29th, 2005
    Location
    South Carolina
    Posts
    43
    What about reciprocal link checking spiders? How can you spot them and can they cause any problems?

  15. #15
    Full Member Tech Evangelist's Avatar
    Join Date
    March 16th, 2005
    Location
    Mesa, AZ
    Posts
    374
    My sites have been getting battered by the new BecomeBot spider. It ate up 5 GB of bandwidth on one site this month. I have not banned it yet because it looks like a legitimate shopping spider, but I did implement the advice found on the Become.com site to slow it down to one request every 30 seconds. How are you guys treating this spider?
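
    The post does not say what the Become.com advice actually was. One common way to ask a spider to slow down is a Crawl-delay line in robots.txt, so the Python sketch below assumes that mechanism. Crawl-delay is a non-standard extension, and whether a given bot honors it is up to the bot.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: a 30-second delay for BecomeBot, no restrictions for everyone else.
ROBOTS_TXT = """\
User-agent: BecomeBot
Crawl-delay: 30

User-agent: *
Disallow:
"""


def delay_for(user_agent: str, robots_txt: str = ROBOTS_TXT):
    """Return the requested seconds between fetches for this agent, or None if unset."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    for agent in ("BecomeBot", "Googlebot"):
        print(agent, delay_for(agent))
```

    A well-behaved crawler would sleep for that many seconds between requests, which spreads the same pages over a longer period instead of cutting the total bandwidth.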
    There's good, fast and cheap. Pick any two.
    Phoenix SEO (http://www.topranksolutions.com) :: Affiliate Marketing Tutorials (http://www.tech-evangelist.com/category/affiliate-marketing/)

  16. #16
    Plazan Merchant Neil's Avatar
    Join Date
    February 25th, 2005
    Location
    cyprus
    Posts
    1,764
    I get around 30 different bots a month,
    which all visit a number of times.
    It's the ones that say "Unknown robot (identified by 'spider')"
    that worry me. What are they?
    Find us at shareasale.com 12% commission
    Shareasale Merchant 7191
    PLAZAN SKIN CARE As seen on TV . Used by Jennifer Lopez

  17. #17
    MasterMike HardwareGeek's Avatar
    Join Date
    January 18th, 2005
    Posts
    3,810
    I'm glad I've got unlimited bandwidth. I don't mind spiders sucking it up. At least 300 GB a month goes to spiders, and half of that is Google's.
