  1. #1
    ABW Ambassador
    Join Date
    January 18th, 2005
    Location
    Nunya, Business
    Posts
    23,684
    Anybody Know How I Can Block These Douchebags
    Some new pages (posts) I'm putting up are ranking on page 1 but thru these guys - headlinenewsonline.com

    My title tag is showing up first, followed by theirs.

    And on that site, my AVG pops up because of some social exploit or something.

    As an example if you search on - Exclusive Interview: Sen. Cornyn Says Obama Has ‘Given Up on Governing’

    Exclusive Interview: Sen. Cornyn Says Obama Has

    they're right under the original author/site.

    I'm using Wordpress. So don't know what to put where.

    Thanks for any help.

    Also, don't know what forum this is supposed to go in. So somebody can move it if there is a better one.

    And I went ahead and reported them to Google for malware.
    Last edited by Trust; November 16th, 2011 at 11:57 PM.

  2. #2
    Affiliate Manager dculpepper's Avatar
    Join Date
    May 23rd, 2005
    Location
    Collins, MS
    Posts
    108
    I didn't open their site to see what they are doing but if you think they are scraping your feed you can get their IP address and block them from accessing your feed using your .htaccess file or through cpanel. You can even get creative with .htaccess and serve the scraper some "alternate content".

    I think there is also a WordPress plugin that serves the scraper some alternate content. I believe it is called AntiLeech.
    David Culpepper, SubscriptionAgent.com
    Affiliate Program Info: http://www.subscriptionagent.com/aff/index.html | E: david@subscriptionagent.com | Twitter: http://twitter.com/dculpepper | Facebook: http://www.facebook.com/DavidCulpepper
    30% Commission & LIFETIME Cookie!

  3. #3
    ABW Ambassador
    Join Date
    January 18th, 2005
    Location
    Nunya, Business
    Posts
    23,684
    I wouldn't even know how to find out which IP address is his. I know from checking the whois that there is only one of him in the U.S., since he has a unique name and is from Lynden, Washington 98264.
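    On finding which IP address is the scraper's: one rough way is to count who keeps requesting the feed in the web server's access log. The sketch below is only an illustration, assuming an Apache-style common/combined log format and a feed path that starts with /feed (the WordPress default). The function name and the sample log lines are made up for the example and use reserved documentation IP addresses.

```python
import re
from collections import Counter

# Matches the start of a common/combined-format access log line:
# client IP, two ignored fields, [timestamp], then "METHOD path ...".
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD) (\S+)')


def feed_requesters(log_lines, feed_prefix="/feed"):
    """Count requests per client IP for paths starting with feed_prefix.

    Returns (ip, count) pairs, most frequent first.
    """
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group(2).startswith(feed_prefix):
            hits[match.group(1)] += 1
    return hits.most_common()


# Made-up sample lines (documentation-only IP addresses), not real traffic.
sample = [
    '192.0.2.10 - - [16/Nov/2011:23:01:10 -0600] "GET /feed/ HTTP/1.1" 200 5120 "-" "-"',
    '192.0.2.10 - - [16/Nov/2011:23:31:10 -0600] "GET /feed/ HTTP/1.1" 200 5120 "-" "-"',
    '203.0.113.7 - - [16/Nov/2011:23:40:00 -0600] "GET /about/ HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
    '198.51.100.4 - - [16/Nov/2011:23:45:00 -0600] "GET /feed/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

for ip, count in feed_requesters(sample):
    print(ip, count)
```

    An address that pulls the feed on a fixed schedule with no browser user agent is a reasonable suspect, but legitimate feed readers look similar, so it is worth checking who owns the address before blocking it.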

    "but if you think they are scraping your feed"

    I guess that's what they're doing. First with the adware and then with that. I really want to get this guy.

    Maybe I'll just kill my feeds.

  4. #4
    Newbie All Blacks Fan's Avatar
    Join Date
    November 15th, 2011
    Posts
    9
    Quote Originally Posted by dculpepper View Post
    I didn't open their site to see what they are doing but if you think they are scraping your feed you can get their IP address and block them from accessing your feed using your .htaccess file or through cpanel. You can even get creative with .htaccess and serve the scraper some "alternate content".

    I think there is also a WordPress plugin that serves the scraper some alternate content. I believe it is called AntiLeech.

    dculpepper, firstly hello! I just want to ask you... is there a way to prevent what's happening with Trust at the moment? 'Cos I'm thinking stuff that!!

    I've never heard of that before "scraping your feed" and am wondering if it happens often??

    @Trust - I hope you get em too!!!


    Leina

  5. #5
    Affiliate Manager dculpepper's Avatar
    Join Date
    May 23rd, 2005
    Location
    Collins, MS
    Posts
    108
    Looks like the ip address of their server is 174.123.72.98.

    If you have access to the hosting control panel for your website you can block their ip address very easily. The various control panels should have a utility that will help you do that... in cpanel it's called IP Deny Manager... I would guess that plesk, webmin, and others would provide something similar.

    If you have access to your .htaccess file you can do the same thing that IP Deny Manager does. It also gives you the opportunity to get creative and send them any content you would like them to have. Send them content that links back to your website, or just a bunch of Lorem Ipsum text, or some interesting images, etc.
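    For anyone who has not edited .htaccess before, here is a small sketch of what those rules could look like. It is a Python helper that only builds the text to paste in; it does not touch any files. The function names and the /alt-feed.xml substitute path are hypothetical. The deny syntax differs between Apache 2.2 (Order/Deny) and Apache 2.4 (Require), so check which version your host runs, and treat this as a starting point rather than a tested configuration for your site.

```python
def deny_rules(ips, apache_24=False):
    """Return .htaccess lines that refuse requests from the given IPs."""
    if apache_24:
        # Apache 2.4 style: allow everyone except the listed addresses.
        lines = ["<RequireAll>", "    Require all granted"]
        lines += [f"    Require not ip {ip}" for ip in ips]
        lines.append("</RequireAll>")
    else:
        # Apache 2.2 style, the "deny from" form that cPanel-era hosts used.
        lines = ["Order Allow,Deny", "Allow from all"]
        lines += [f"Deny from {ip}" for ip in ips]
    return "\n".join(lines)


def alternate_feed_rules(ip, substitute="/alt-feed.xml"):
    """Return mod_rewrite lines that hand one IP a substitute feed file.

    `substitute` is a hypothetical static file you would create yourself.
    """
    pattern = ip.replace(".", r"\.")  # escape dots for the regex match
    return "\n".join([
        "RewriteEngine On",
        f"RewriteCond %{{REMOTE_ADDR}} ^{pattern}$",
        f"RewriteRule ^feed/?$ {substitute} [L]",
    ])


print(deny_rules(["174.123.72.98"]))
print()
print(alternate_feed_rules("174.123.72.98"))
```

    If you go the "alternate content" route on a WordPress site, the rewrite lines would need to sit above the WordPress rewrite block in .htaccess so they are evaluated first.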

    Killing your feeds is also an option if your site visitors won't be following your posts via the feed.

    If you need any help let me know.


  6. #6
    Affiliate Manager dculpepper's Avatar
    Join Date
    May 23rd, 2005
    Location
    Collins, MS
    Posts
    108
    Hi Leina,

    Unfortunately, content scraping happens a lot. But once you block their IP address they will no longer be able to grab your content, unless they are using multiple IP addresses. Then it can take some effort to track them all down.

    You can also report them to Google and to their web host. Sometimes the web host will actually care and shut them down.

  8. #7
    ABW Ambassador
    Join Date
    January 18th, 2005
    Location
    Nunya, Business
    Posts
    23,684
    Quote Originally Posted by dculpepper View Post
    Looks like the ip address of their server is 174.123.72.98.

    If you have access to the hosting control panel for your website you can block their ip address very easily. The various control panels should have a utility that will help you do that... in cpanel it's called IP Deny Manager... I would guess that plesk, webmin, and others would provide something similar.

    If you have access to your .htaccess file you can do the same thing that IP Deny Manager does. It also gives you the opportunity to get creative and send them any content you would like them to have. Send them content that links back to your website, or just a bunch of Lorem Ipsum text, or some interesting images, etc.

    Killing your feeds is also an option if your site visitors won't be following your posts via the feed.

    If you need any help let me know.

    Thanks for your help. I also filled out another report with Google. We'll see if they do anything.

    I just went into IP Deny Manager and entered the ip address. I see where they have it so you can enter a domain as well. Hopefully this does the trick. Or I'll friend him on Facebook and take some other action.
    Last edited by Trust; November 17th, 2011 at 03:11 AM.

  9. #8
    Moderator
    Join Date
    April 6th, 2006
    Posts
    2,689
    On the subject of feeds, I killed all of mine.. it was too painful to see my hard work filling the pages of other sites..

    I know this isn't the most popular sentiment - everything you read talks about the importance of syndication - but there is less chance of being hit with a duplicate content penalty, or of having a scraper site rank higher.

  10. #9
    Full Member
    Join Date
    November 21st, 2010
    Posts
    230
    I have this a lot. Here's the rundown.

    First, if you're using Feedburner, blocking their IP on your server isn't going to work, since they'll fetch the feed from Google and Google fetches it from you. But blocking their IP (which is likely their server's, since the scraping is probably automated) will help if they're fetching from you directly. Also, if you use something like CloudFlare (which I do now), you'll have to block the IP through them, not on your server.

    However, there are alternatives. If in fact they are using your RSS, there are ways to add a footer to your RSS articles, so it can link back to you. Also it's often suggested to use as many internal links as possible in your articles on your own site for just this occasion. Unless you're #2, might as well try to leverage their success.

    And if you want to get fancy... I use WordPress and added some code in functions.php to manipulate my RSS feeds based on who is requesting them. For example, since Feedburner (and Blogspot for that matter) choke if the feed is over 50K, when Google requests the feed I chop down lengthy articles and link back to the original to keep the size under 50K. For sites that I don't like using my feed, I've coded a 'This is unauthorized, please visit original article here: <link>' message for the article instead of the real thing.
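    The decision logic described above could be sketched roughly like this. This is not the actual functions.php code (that would be PHP inside WordPress); it is a standalone Python illustration of the same idea. The `Article` class, the `feed_body` function, the per-article `max_chars` cap (standing in for the overall 50K feed limit), and the check for "feedburner" in the user agent are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    url: str
    body: str


def feed_body(article, user_agent, remote_ip, blocked_ips, max_chars=2000):
    """Pick what to put in the feed for one article, based on the requester."""
    link = f'<a href="{article.url}">{article.title}</a>'

    # Known scraper: send a notice pointing at the original, not the article.
    if remote_ip in blocked_ips:
        return f"This is unauthorized, please visit original article here: {link}"

    # Feedburner (assumed to identify itself in the user agent): trim long
    # articles and link back, to help keep the whole feed small.
    if "feedburner" in user_agent.lower() and len(article.body) > max_chars:
        return f"{article.body[:max_chars]}... Continue reading: {link}"

    # Everyone else: full article plus a footer that links back to the source.
    return f"{article.body}<p>Originally published at {link}</p>"


post = Article("Example post", "https://example.com/example-post", "Hello " * 500)
print(feed_body(post, "Mozilla/5.0", "192.0.2.10", {"192.0.2.10"}))
```

    The footer in the last branch is the "link back to you" trick from the previous paragraph: if a scraper republishes the full item, the link to the original goes along with it.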

  11. #10
    ABW Ambassador I.M.O.G.'s Avatar
    Join Date
    February 19th, 2011
    Location
    Rootstown, OH
    Posts
    1,096
    Quote Originally Posted by Trust View Post
    Thanks for your help. I also filled out another report with Google. See if they do anything.

    I just went into IP Deny Manager and entered the ip address. I see where they have it so you can enter a domain as well. Hopefully this does the trick. Or I'll friend him on Facebook and take some other action.

    Did you file a report with Google because of the search placement, or because they host or serve his page in some way? In my experience, it's most effective to send a DMCA notice to the hosting provider: they contact the person, and typically the content disappears before long. Google has been great about removing offending content from Blogger, but I have never seen them take action on complaints about search results unless there are valid DMCA grounds. Most hosting providers seem to take action quickly on any DMCA notice, though, in my experience (resolution in less than a month).

    Our sites get scraped pretty commonly, and we routinely are playing whackamole... we don't bother with IP blocking, but it probably isn't a waste of time.

    We don't worry too much either, though, because Google has algorithms to recognize original content - they know we are the original source because our item was indexed immediately, and our PageRank is high enough for us to be placed higher than any scrapers. Mainly, it's just a concern that our community members aren't having their work stolen or passed off as someone else's, and that is why we follow it up - we aren't losing any traffic to scrapers, so on a technical level it isn't worth our time.

    EDIT: FWIW, 9 out of 10 times the scrapers are grabbing the feed content. Publishing only partial content on the feed could help in your case also possibly, rather than killing it altogether.
    Last edited by I.M.O.G.; November 17th, 2011 at 11:16 AM.
    Matt Bidinger
    Online Community Engagement

  12. #11
    Full Member
    Join Date
    November 21st, 2010
    Posts
    230
    Quote Originally Posted by I.M.O.G. View Post
    EDIT: FWIW, 9 out of 10 times the scrapers are grabbing the feed content. Publishing only partial content on the feed could help in your case also possibly, rather than killing it altogether.

    Unfortunately, that also alienates a good percentage of legitimate subscribers.

  13. #12
    Newbie
    Join Date
    November 17th, 2011
    Posts
    1
    Solution: stopping site scraping and content scraping
    I work for Distil, Inc. We have an Adaptive Mitigation Platform for blocking site scraping, data theft, and malicious bot attacks.

    Depending on the volume of traffic you have coming to your site, you might be able to just use the free service.

    Hopefully this will help someone here. Send a message if you have any questions.

    Sean
