  1. #1
    pph Expert! Gordon's Avatar
    Join Date
    January 18th, 2005
    Location
    Edmonton Canada
    Posts
    5,781
    Can this be done?
    I am fed up with the bots and spiders inflating my click-throughs. I am using a jump script (the one Chet posted), and I am wondering if there is any way I can prevent them from following the links and adding clicks to my stats that are not real clicks.

    Is there any bit of script I can add to my .htaccess (either the one in root or, if need be, one in the directory in question) that will only allow access to people who have clicked a link on my site?

    thanks in advance for any help.
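
    The "only people who clicked a link on my site" idea comes down to checking the Referer header of each request. A minimal sketch of that check in Python follows, assuming the jump script can read the header; the host names and the function name came_from_my_site are made up for illustration and are not from Chet's script. Keep in mind that browsers and proxies sometimes strip the Referer and bots can fake it, so this is a filter rather than a guarantee.

```python
from urllib.parse import urlparse

# Hypothetical host names; replace with the site's own.
MY_HOSTS = {"example.com", "www.example.com"}


def came_from_my_site(referer, my_hosts=MY_HOSTS):
    """Return True only if the Referer header points at one of my_hosts."""
    if not referer:
        # No Referer at all: many bots (and some privacy tools) send none.
        return False
    host = urlparse(referer).hostname  # None if the value is not a URL
    return host is not None and host in my_hosts
```

    A jump script could call this before logging a click and skip the logging (or refuse the redirect) when it returns False.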
    One day parasites and their ilk will be made illegal, I bet a few Lawyers will be pissed off when the day comes.
    Mr. Spitzer is fetching it nearer

    YouTrek

  2. #2
    Moderator MichaelColey's Avatar
    Join Date
    January 18th, 2005
    Location
    Mansfield, TX
    Posts
    16,232
    I'm doing something very similar to that. Basically, I have a list of user-agent match strings and a handful of specific IPs that I block. As I see others inflating my clicks, I add them to the list. Here's my current list:

    Xenu Link Sleuth
    NutchCVS
    MSIECrawler
    Wget
    PHP
    ZyBorg
    WinBatch
    Teleport
    Googlebot
    Scooter
    ScoutAbout
    Ask Jeeves
    Microsoft URL Control
    Borg
    Slurp
    Lachesis
    Library
    WebCopier
    ia_archiver
    lwp-request
    Crawl
    libwww
    robot
    lwp-
    Downloader
    Fetch
    grub
    CCK
    gatewaynet
    dev-soft
    HTTPClient
    Fluffy the spider
    AvantGo
    Dumbot
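
    For anyone wondering how a list like this might be applied, here is a rough Python sketch. It assumes the strings are matched case-insensitively as substrings of the User-Agent header, which the post does not spell out. The function name should_count_click, the shortened list, and the IP address (a reserved documentation address) are placeholders, not the actual setup described above.

```python
# A few entries from the list above; the full list would go here.
BLOCKED_AGENT_SUBSTRINGS = (
    "Wget", "Googlebot", "Slurp", "libwww", "lwp-",
    "Crawl", "robot", "Teleport",
)

# Placeholder address from the reserved documentation range.
BLOCKED_IPS = {"192.0.2.10"}


def should_count_click(user_agent, ip):
    """Return False for requests that match the block lists."""
    if ip in BLOCKED_IPS:
        return False
    ua = (user_agent or "").lower()
    # Assumption: substring match, ignoring case.
    return not any(s.lower() in ua for s in BLOCKED_AGENT_SUBSTRINGS)
```

    Note that short strings such as "robot" or "Crawl" match any user agent containing them, which is what makes the list easy to maintain but also means an unusual browser string could be caught by accident.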
    Michael Coley
    Amazing-Bargains.com
     Affiliate Tips | Merchant Best Practices | Affiliate Friendly? | Couponing | CPA Networks? | ABW Tips | Activating Affiliates
    "Education is the most powerful weapon which you can use to change the world." Nelson Mandela

  3. #3
    Not Verif-Lidated infoTim's Avatar
    Join Date
    January 18th, 2005
    Location
    Sunny Florida
    Posts
    1,021
    Good list! :-) I'll have to try that.

    I have my exit script disallowed in robots.txt, but who knows if they all respect that. It did cut down on Google and Yahoo crawling through my affiliate links. I wonder if that has any positive SEO benefit, because they don't see you outlinking from there? As far as the bot is concerned, you're linking to a private internal page.
    Tim
    consultant by day, affiliate by night

  4. #4
    Moderator MichaelColey's Avatar
    Join Date
    January 18th, 2005
    Location
    Mansfield, TX
    Posts
    16,232
    Most of the ones I listed don't obey robots.txt. I use that, too, and it does stop the well-behaved ones like Googlebot.

  5. #5
    Full Member
    Join Date
    January 18th, 2005
    Posts
    331
    Quote Originally Posted by MichaelColey
    Most of the ones I listed don't obey robots.txt. I use that, too, and it does stop the well-behaved ones like Googlebot.
    So, trying not to sound naive, but knowing I will, are you saying that some of the inflated stats can be avoided by using robots.txt? I've used it for directories and other parts of the site but never thought of using it for that.

    I thought that just using something like "Disallow: /cgi-bin/" would do the trick. Would I be better off using "Disallow: /cgi-bin/jumpscript.cgi", or is that being doubly redundant?

  6. #6
    ABW Founder Haiko de Poel, Jr.'s Avatar
    Join Date
    January 18th, 2005
    Location
    New York
    Posts
    21,609
    Here's what I set up:

    # robots.txt
    # go away
    User-agent: Alexibot
    User-agent: Aqua_Products
    User-agent: BackDoorBot
    User-agent: BackDoorBot/1.0
    User-agent: Black.Hole
    User-agent: BlackWidow
    User-agent: BlowFish
    User-agent: BlowFish/1.0
    User-agent: Bookmark search tool
    User-agent: Bot mailto:craftbot@yahoo.com
    User-agent: BotALot
    User-agent: BotRightHere
    User-agent: BuiltBotTough
    User-agent: Bullseye
    User-agent: Bullseye/1.0
    User-agent: BunnySlippers
    User-agent: Cegbfeieh
    User-agent: CheeseBot
    User-agent: CherryPicker
    User-agent: CherryPickerElite/1.0
    User-agent: CherryPickerSE/1.0
    User-agent: ChinaClaw
    User-agent: Copernic
    User-agent: CopyRightCheck
    User-agent: Crescent
    User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
    User-agent: Custo
    User-agent: DISCo
    User-agent: DISCo Pump 3.0
    User-agent: DISCo Pump 3.2
    User-agent: DISCoFinder
    User-agent: DittoSpyder
    User-agent: Download Demon
    User-agent: Download Demon/3.2.0.8
    User-agent: Download Demon/3.5.0.11
    User-agent: EirGrabber
    User-agent: EmailCollector
    User-agent: EmailSiphon
    User-agent: EmailWolf
    User-agent: EroCrawler
    User-agent: Express WebPictures
    User-agent: Express WebPictures (www.express-soft.com)
    User-agent: ExtractorPro
    User-agent: EyeNetIE
    User-agent: FairAd Client
    User-agent: Flaming AttackBot
    User-agent: FlashGet
    User-agent: FlashGet WebWasher 3.2
    User-agent: Foobot
    User-agent: FrontPage
    User-agent: Gaisbot
    User-agent: GetRight
    User-agent: GetRight/2.11
    User-agent: GetRight/3.1
    User-agent: GetRight/3.2
    User-agent: GetRight/3.3
    User-agent: GetRight/3.3.3
    User-agent: GetRight/3.3.4
    User-agent: GetRight/4.0.0
    User-agent: GetRight/4.1.0
    User-agent: GetRight/4.1.1
    User-agent: GetRight/4.1.2
    User-agent: GetRight/4.2
    User-agent: GetRight/4.2b (Portuguxeas)
    User-agent: GetRight/4.2c
    User-agent: GetRight/4.3
    User-agent: GetRight/4.5
    User-agent: GetRight/4.5a
    User-agent: GetRight/4.5b
    User-agent: GetRight/4.5b1
    User-agent: GetRight/4.5b2
    User-agent: GetRight/4.5b3
    User-agent: GetRight/4.5b6
    User-agent: GetRight/4.5b7
    User-agent: GetRight/4.5c
    User-agent: GetRight/4.5d
    User-agent: GetRight/4.5e
    User-agent: GetRight/5.0beta1
    User-agent: GetRight/5.0beta2
    User-agent: GetWeb!
    User-agent: Go!Zilla
    User-agent: Go!Zilla (www.gozilla.com)
    User-agent: Go!Zilla 3.3 (www.gozilla.com)
    User-agent: Go!Zilla 3.5 (www.gozilla.com)
    User-agent: Go-Ahead-Got-It
    User-agent: Googlebot-Image
    User-agent: GrabNet
    User-agent: Grafula
    User-agent: HMView
    User-agent: HTTrack
    User-agent: HTTrack 3.0
    User-agent: Harvest
    User-agent: Harvest/1.5
    User-agent: Image Stripper
    User-agent: Image Sucker
    User-agent: Indy Library
    User-agent: InfoNaviRobot
    User-agent: InterGET
    User-agent: Internet Ninja
    User-agent: Internet Ninja 4.0
    User-agent: Internet Ninja 5.0
    User-agent: Internet Ninja 6.0
    User-agent: Iron33/1.0.2
    User-agent: JOC Web Spider
    User-agent: JennyBot
    User-agent: JetCar
    User-agent: Kenjin Spider
    User-agent: Kenjin.Spider
    User-agent: Keyword Density/0.9
    User-agent: Keyword.Density
    User-agent: LNSpiderguy
    User-agent: LeechFTP
    User-agent: LexiBot
    User-agent: LinkScan/8.1a Unix
    User-agent: LinkScan/8.1a.Unix
    User-agent: LinkWalker
    User-agent: LinkextractorPro
    User-agent: MIDown tool
    User-agent: MIIxpc
    User-agent: MIIxpc/4.2
    User-agent: MSIECrawler
    User-agent: Mass Downloader
    User-agent: Mass Downloader/2.2
    User-agent: Mata Hari
    User-agent: Mata.Hari
    User-agent: Microsoft URL Control
    User-agent: Microsoft URL Control - 5.01.4511
    User-agent: Microsoft URL Control - 6.00.8169
    User-agent: Microsoft.URL
    User-agent: Mister PiX
    User-agent: Mister PiX version.dll
    User-agent: Mister Pix II 2.01
    User-agent: Mister Pix II 2.02a
    User-agent: Mister.PiX
    User-agent: NICErsPRO
    User-agent: NPBot
    User-agent: NPbot
    User-agent: Navroad
    User-agent: NearSite
    User-agent: Net Vampire
    User-agent: Net Vampire/3.0
    User-agent: NetAnts
    User-agent: NetAnts/1.10
    User-agent: NetAnts/1.23
    User-agent: NetAnts/1.24
    User-agent: NetAnts/1.25
    User-agent: NetMechanic
    User-agent: NetSpider
    User-agent: NetZIP
    User-agent: NetZip Downloader 1.0 Win32(Nov 12 1998)
    User-agent: NetZip-Downloader/1.0.62 (Win32; Dec 7 1998)
    User-agent: NetZippy+(http://www.innerprise.net/usp-spider.asp)
    User-agent: Octopus
    User-agent: Offline Explorer
    User-agent: Offline Explorer/1.2
    User-agent: Offline Explorer/1.4
    User-agent: Offline Explorer/1.6
    User-agent: Offline Explorer/1.7
    User-agent: Offline Explorer/1.9
    User-agent: Offline Explorer/2.0
    User-agent: Offline Explorer/2.1
    User-agent: Offline Explorer/2.3
    User-agent: Offline Explorer/2.4
    User-agent: Offline Explorer/2.5
    User-agent: Offline Navigator
    User-agent: Offline.Explorer
    User-agent: Openbot
    User-agent: Openfind
    User-agent: Openfind data gatherer
    User-agent: Oracle Ultra Search
    User-agent: PageGrabber
    User-agent: Papa Foto
    User-agent: PerMan
    User-agent: ProPowerBot/2.14
    User-agent: ProWebWalker
    User-agent: Python-urllib
    User-agent: QueryN Metasearch
    User-agent: QueryN.Metasearch
    User-agent: RMA
    User-agent: Radiation Retriever 1.1
    User-agent: ReGet
    User-agent: RealDownload
    User-agent: RealDownload/4.0.0.40
    User-agent: RealDownload/4.0.0.41
    User-agent: RealDownload/4.0.0.42
    User-agent: RepoMonkey
    User-agent: RepoMonkey Bait & Tackle/v1.01
    User-agent: SiteSnagger
    User-agent: SlySearch
    User-agent: SmartDownload
    User-agent: SmartDownload/1.2.76 (Win32; Apr 1 1999)
    User-agent: SmartDownload/1.2.77 (Win32; Aug 17 1999)
    User-agent: SmartDownload/1.2.77 (Win32; Feb 1 2000)
    User-agent: SmartDownload/1.2.77 (Win32; Jun 19 2001)
    User-agent: SpankBot
    User-agent: Sqworm/2.9.85-BETA (beta_release; 20011115-775; i686-pc-linux
    User-agent: SuperBot
    User-agent: SuperBot/3.0 (Win32)
    User-agent: SuperBot/3.1 (Win32)
    User-agent: SuperHTTP
    User-agent: SuperHTTP/1.0
    User-agent: Surfbot
    User-agent: Szukacz/1.4
    User-agent: Teleport
    User-agent: Teleport Pro
    User-agent: Teleport Pro/1.29
    User-agent: Teleport Pro/1.29.1590
    User-agent: Teleport Pro/1.29.1634
    User-agent: Teleport Pro/1.29.1718
    User-agent: Teleport Pro/1.29.1820
    User-agent: Teleport Pro/1.29.1847
    User-agent: TeleportPro
    User-agent: Telesoft
    User-agent: The Intraformant
    User-agent: The.Intraformant
    User-agent: TheNomad
    User-agent: TightTwatBot
    User-agent: Titan
    User-agent: True_Robot
    User-agent: True_Robot/1.0
    User-agent: TurnitinBot
    User-agent: TurnitinBot/1.5
    User-agent: URL Control
    User-agent: URL_Spider_Pro
    User-agent: URLy Warning
    User-agent: URLy.Warning
    User-agent: VCI
    User-agent: VCI WebViewer VCI WebViewer Win32
    User-agent: VoidEYE
    User-agent: WWW-Collector-E
    User-agent: WWWOFFLE
    User-agent: Web Image Collector
    User-agent: Web Sucker
    User-agent: Web.Image.Collector
    User-agent: WebAuto
    User-agent: WebAuto/3.40 (Win98; I)
    User-agent: WebBandit
    User-agent: WebBandit/3.50
    User-agent: WebCapture 2.0
    User-agent: WebCopier
    User-agent: WebCopier v.2.2
    User-agent: WebCopier v2.5
    User-agent: WebCopier v2.6
    User-agent: WebCopier v2.7a
    User-agent: WebCopier v2.8
    User-agent: WebCopier v3.0
    User-agent: WebCopier v3.0.1
    User-agent: WebCopier v3.2
    User-agent: WebCopier v3.2a
    User-agent: WebEMailExtrac.*
    User-agent: WebEnhancer
    User-agent: WebFetch
    User-agent: WebGo IS
    User-agent: WebLeacher
    User-agent: WebReaper
    User-agent: WebReaper [info@webreaper.net]
    User-agent: WebReaper [webreaper@otway.com]
    User-agent: WebReaper v9.1 - www.otway.com/webreaper
    User-agent: WebReaper v9.7 - www.webreaper.net
    User-agent: WebReaper v9.8 - www.webreaper.net
    User-agent: WebReaper vWebReaper v7.3 - www,otway.com/webreaper
    User-agent: WebSauger
    User-agent: WebSauger 1.20b
    User-agent: WebSauger 1.20j
    User-agent: WebSauger 1.20k
    User-agent: WebStripper
    User-agent: WebStripper/2.03
    User-agent: WebStripper/2.10
    User-agent: WebStripper/2.12
    User-agent: WebStripper/2.13
    User-agent: WebStripper/2.15
    User-agent: WebStripper/2.16
    User-agent: WebStripper/2.19
    User-agent: WebWhacker
    User-agent: WebZIP
    User-agent: WebZIP/2.75 (http://www.spidersoft.com)
    User-agent: WebZIP/3.65 (http://www.spidersoft.com)
    User-agent: WebZIP/3.80 (http://www.spidersoft.com)
    User-agent: WebZIP/4.0 (http://www.spidersoft.com)
    User-agent: WebZIP/4.1 (http://www.spidersoft.com)
    User-agent: WebZIP/4.21
    User-agent: WebZIP/4.21 (http://www.spidersoft.com)
    User-agent: WebZIP/5.0
    User-agent: WebZIP/5.0 (http://www.spidersoft.com)
    User-agent: WebZIP/5.0 PR1 (http://www.spidersoft.com)
    User-agent: WebZip
    User-agent: WebZip/4.0
    User-agent: WebmasterWorldForumBot
    User-agent: Website Quester
    User-agent: Website Quester - www.asona.org
    User-agent: Website Quester - www.esalesbiz.com/extra/
    User-agent: Website eXtractor
    User-agent: Website eXtractor (http://www.asona.org)
    User-agent: Website.Quester
    User-agent: Webster Pro
    User-agent: Webster.Pro
    User-agent: Wget
    User-agent: Wget/1.5.2
    User-agent: Wget/1.5.3
    User-agent: Wget/1.6
    User-agent: Wget/1.7
    User-agent: Wget/1.8
    User-agent: Wget/1.8.1
    User-agent: Wget/1.8.1+cvs
    User-agent: Wget/1.8.2
    User-agent: Wget/1.9-beta
    User-agent: Widow
    User-agent: Xaldon WebSpider
    User-agent: Xaldon WebSpider 2.5.b3
    User-agent: Xenu's
    User-agent: Xenu's Link Sleuth 1.1c
    User-agent: Zeus
    User-agent: Zeus 11389 Webster Pro V2.9 Win32
    User-agent: Zeus 11652 Webster Pro V2.9 Win32
    User-agent: Zeus 18018 Webster Pro V2.9 Win32
    User-agent: Zeus 26378 Webster Pro V2.9 Win32
    User-agent: Zeus 30747 Webster Pro V2.9 Win32
    User-agent: Zeus 32297 Webster Pro V2.9 Win32
    User-agent: Zeus 39206 Webster Pro V2.9 Win32
    User-agent: Zeus 41641 Webster Pro V2.9 Win32
    User-agent: Zeus 44238 Webster Pro V2.9 Win32
    User-agent: Zeus 51070 Webster Pro V2.9 Win32
    User-agent: Zeus 51674 Webster Pro V2.9 Win32
    User-agent: Zeus 51837 Webster Pro V2.9 Win32
    User-agent: Zeus 63567 Webster Pro V2.9 Win32
    User-agent: Zeus 6694 Webster Pro V2.9 Win32
    User-agent: Zeus 71129 Webster Pro V2.9 Win32
    User-agent: Zeus 82016 Webster Pro V2.9 Win32
    User-agent: Zeus 82900 Webster Pro V2.9 Win32
    User-agent: Zeus 84842 Webster Pro V2.9 Win32
    User-agent: Zeus 90872 Webster Pro V2.9 Win32
    User-agent: Zeus 94934 Webster Pro V2.9 Win32
    User-agent: Zeus 95245 Webster Pro V2.9 Win32
    User-agent: Zeus 95351 Webster Pro V2.9 Win32
    User-agent: Zeus 97371 Webster Pro V2.9 Win32
    User-agent: Zeus Link Scout
    User-agent: asterias
    User-agent: b2w/0.1
    User-agent: cosmos
    User-agent: eCatch
    User-agent: eCatch/3.0
    User-agent: hloader
    User-agent: httplib
    User-agent: humanlinks
    User-agent: ia_archiver
    User-agent: larbin
    User-agent: larbin (samualt9@bigfoot.com)
    User-agent: larbin samualt9@bigfoot.com
    User-agent: larbin_2.6.2 (kabura@sushi.com)
    User-agent: larbin_2.6.2 (larbin2.6.2@unspecified.mail)
    User-agent: larbin_2.6.2 (listonATccDOTgatechDOTedu)
    User-agent: larbin_2.6.2 (vitalbox1@hotmail.com)
    User-agent: larbin_2.6.2 kabura@sushi.com
    User-agent: larbin_2.6.2 larbin2.6.2@unspecified.mail
    User-agent: larbin_2.6.2 larbin@correa.org
    User-agent: larbin_2.6.2 listonATccDOTgatechDOTedu
    User-agent: larbin_2.6.2 vitalbox1@hotmail.com
    User-agent: libWeb/clsHTTP
    User-agent: lwp-trivial
    User-agent: lwp-trivial/1.34
    User-agent: moget
    User-agent: moget/2.1
    User-agent: pavuk
    User-agent: pcBrowser
    User-agent: psbot
    User-agent: searchpreview
    User-agent: spanner
    User-agent: suzuran
    User-agent: tAkeOut
    User-agent: toCrawl/UrlDispatcher
    User-agent: turingos
    User-agent: webfetch/2.1.0
    User-agent: wget
    Disallow: /

    # good ones can stay
    User-agent: *
    Continued Success,

    Haiko
    The secret of success is constancy of purpose ~ Disraeli

  7. #7
    ABW Founder Haiko de Poel, Jr.'s Avatar
    Join Date
    January 18th, 2005
    Location
    New York
    Posts
    21,609
    There are obviously many Disallow lines, such as

    Disallow: /folder name here

    after the "good ones can stay" line, but those differ from site to site, so I didn't post them.
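
    To see how this layout behaves, here is a small Python sketch using the standard library's urllib.robotparser. It assumes a cut-down version of the file: three of the listed agents sharing one "Disallow: /" record, then a catch-all record with a single placeholder path (/cgi-bin/ is only an example, the real paths are site-specific). It shows how one compliant parser reads the file; bots that ignore robots.txt are not affected by any of it.

```python
from urllib.robotparser import RobotFileParser

# Cut-down, illustrative robots.txt; /cgi-bin/ is a placeholder path.
ROBOTS_TXT = """\
# go away
User-agent: BlackWidow
User-agent: Wget
User-agent: WebZIP
Disallow: /

# good ones can stay
User-agent: *
Disallow: /cgi-bin/
"""


def may_fetch(agent, path, robots_txt=ROBOTS_TXT):
    """Ask Python's robots.txt parser whether agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)
```

    With this file, may_fetch("Wget/1.8.2", "/index.html") is refused because Wget falls under the shared "Disallow: /" record, while an agent that is not listed falls through to the catch-all record and is only kept out of the placeholder folder. One detail worth knowing: there must be no blank line between the stacked User-agent lines and their Disallow line, or a parser may treat the record as ended.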

  8. #8
    ABW Founder Haiko de Poel, Jr.'s Avatar
    Join Date
    January 18th, 2005
    Location
    New York
    Posts
    21,609
    Quote Originally Posted by hotspice
    or is that being doubly redundant
    Doubly redundant.
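
    A quick way to check this is with Python's standard urllib.robotparser. The sketch below is only an illustration: the helper name blocked and the bot name ExampleBot are made up, and it tests how one compliant parser treats the two variants from the question, not how any particular spider behaves.

```python
from urllib.robotparser import RobotFileParser

DIRECTORY_ONLY = """\
User-agent: *
Disallow: /cgi-bin/
"""

DIRECTORY_AND_SCRIPT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /cgi-bin/jumpscript.cgi
"""


def blocked(robots_txt, path, agent="ExampleBot"):
    """Return True if a compliant bot would be told not to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, path)
```

    Both variants give the same answer for /cgi-bin/jumpscript.cgi, because Disallow rules are matched as path prefixes and the directory rule already covers everything beneath it.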

  9. #9
    Newbie
    Join Date
    June 14th, 2006
    Posts
    1
    Demo Automation studio 5.0
    Can you help me with Automation Studio 5.0? I need this program. I also can't find the GetRight 4.1.1 registration code.

    Thanks for your help.
