  1. #1
    Full Member silver93350's Avatar
    Join Date
    March 8th, 2008
    Location
    Kansas City
    Posts
    323
    Site Scrapers (Duplicate Content)
    I have a massive problem with site scrapers copying the content of my hosting review site. Since the 24th I have lost a ton of traffic: I went from 700 hits a day down to 70. I checked my pages through Copyscape and was shocked that 20+ reviews threw up red flags. I have just started rewriting all the reviews in question (and adding much more content), because sending out DMCA notices for every copy would take far too long.


    I am a full-time affiliate marketer, and my bills are paid by the sites I run. I'm getting a little frustrated and was wondering whether you guys/girls think the scraping could be the culprit behind the loss of Google organic traffic to my website?

    Feedback would really really be appreciated. I have been messing with this site for a month now and not sure what to do.
    [URL=http://www.the-best-web-hosting-service.com/]Best Web Hosting Service[/URL] [COLOR=Blue]- Quality Reviews![/COLOR]

  2. #2
    Moderator
    Join Date
    April 6th, 2006
    Posts
    2,689
    First of all, you need to figure out how they are scraping your site, and close up any leaks. I doubt they are using copy/paste, it's probably something automated.

    Do you have an RSS feed... sitemap xml..? CMS like Wordpress or Joomla..?

    Let's start there!
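One way to start answering the "how are they scraping" question is to look at the server access log and see which clients request the most pages, and whether they go straight for the sitemap or feed. Below is a rough sketch only: it assumes an Apache-style common/combined log format, and the sample lines, IPs (documentation ranges), dates, and page names are all made up for illustration.

```python
import re
from collections import Counter

# Captures the client IP and the requested path from a common/combined log line.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|POST|HEAD) (\S+)')


def top_clients(log_lines, limit=5):
    """Count requests per client IP and return the busiest clients first."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match:
            hits[match.group(1)] += 1
    return hits.most_common(limit)


def clients_requesting(log_lines, path):
    """Return the set of client IPs that requested the given path."""
    clients = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group(2) == path:
            clients.add(match.group(1))
    return clients


# Made-up sample data; a real run would read lines from the server's access log.
SAMPLE_LOG = [
    '203.0.113.7 - - [24/Apr/2012:10:00:01 +0000] "GET /sitemap.xml HTTP/1.1" 200 512',
    '203.0.113.7 - - [24/Apr/2012:10:00:02 +0000] "GET /review-a.html HTTP/1.1" 200 9000',
    '203.0.113.7 - - [24/Apr/2012:10:00:03 +0000] "GET /review-b.html HTTP/1.1" 200 9100',
    '198.51.100.20 - - [24/Apr/2012:10:05:00 +0000] "GET /review-a.html HTTP/1.1" 200 9000',
]

if __name__ == "__main__":
    print(top_clients(SAMPLE_LOG))
    print(clients_requesting(SAMPLE_LOG, "/sitemap.xml"))
```

A client that fetches the sitemap and then walks every review within seconds is a likely candidate for an automated scraper, though a search engine crawler can look similar, so check the IP before blocking it.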

  3. #3
    Full Member silver93350's Avatar
    Join Date
    March 8th, 2008
    Location
    Kansas City
    Posts
    323
    Yeah, I was thinking they are using some automated tool. Maybe there is a way to block those programs with a script. The root site is built without a CMS, and the blog uses WordPress. I have sitemaps for both the root site and the blog; I ran my sitemap through Copyscape, and that's how I found my content everywhere.

  4. #4
    Moderator
    Join Date
    April 6th, 2006
    Posts
    2,689
    As a temporary measure, you should rename your sitemap to something obscure. That doesn't mean they can't find it under another name, but you shouldn't use something as generic as "sitemap.xml".

    You may have already searched Google, but I found this terrific article. See in particular what it says about using rel="author" to indicate ownership:

    A Definitive Guide to Blog Content Scraping & How to STOP It! | HyperArts

    There's no way of knowing whether the scraping has cost you in natural rankings, but it has to be addressed anyway. Do you have an account in Webmaster Tools to check for warnings and crawl stats?

  5. #5
    ...and a Pirate's heart. Convergence's Avatar
    Join Date
    June 24th, 2005
    Posts
    6,918
    If you have a dedicated server, start controlling who accesses it. It's easier on a dedicated server, but I'm sure there are ways to do it on shared hosting as well.

    We block ALL countries we are not targeting, on a site-by-site basis, using geoIP dat files. No more adding thousands of IPs/CIDRs to your .htaccess. Just put in which countries you ALLOW. If there is an OPM/AM in another country, we get their IPs and allow them access to the site we want to promote. Redirecting blocked visitors to specific URLs like the FBI's CyberCrime page is fun as well. MaxMind has some free geoIP dat files. We use another service of theirs for our ad server and other geo targeting. Highly recommended.
    MaxMind - GeoIP Apache API
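
The allow-list logic described above can be sketched roughly as follows. This is an illustration of the decision only, not the Apache/MaxMind setup itself: the tiny `GEO_TABLE` is a hypothetical stand-in for a real geoIP database, and the countries, networks (documentation ranges), and partner IP are all made-up assumptions.

```python
import ipaddress

# Countries the site actually markets to (assumed values for illustration).
ALLOWED_COUNTRIES = {"US", "CA"}

# Hypothetical stand-in for a geoIP database: network -> country code.
GEO_TABLE = [
    (ipaddress.ip_network("203.0.113.0/24"), "US"),
    (ipaddress.ip_network("198.51.100.0/24"), "XX"),  # "XX": a country not targeted
]

# Individual IPs let through regardless of country, e.g. an OPM/AM abroad.
PARTNER_IPS = {ipaddress.ip_address("198.51.100.42")}


def country_for(ip):
    """Look up the country code for an IP, or None if it is not in the table."""
    addr = ipaddress.ip_address(ip)
    for network, country in GEO_TABLE:
        if addr in network:
            return country
    return None


def is_allowed(ip):
    """Allow partner IPs first, then anything from an allowed country."""
    if ipaddress.ip_address(ip) in PARTNER_IPS:
        return True
    return country_for(ip) in ALLOWED_COUNTRIES


if __name__ == "__main__":
    for candidate in ("203.0.113.5", "198.51.100.1", "198.51.100.42", "192.0.2.1"):
        print(candidate, "allowed" if is_allowed(candidate) else "blocked")
```

Note the design choice: an IP whose country cannot be determined is blocked, which matches the "only list what you ALLOW" approach but is stricter than some sites will want.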

    Next, we block nearly ALL of Amazon's cloud hosting. If a merchant uses them, we allow those IPs. The scum of the internet are using Amazon's cloud hosting to steal your content, bandwidth, and business. Amazon had been posting on their website whenever they released new public IP ranges, but I have not seen a new post for a while.
    https://forums.aws.amazon.com/index.jspa
    How to Block the Amazon AWS EC2 - Forum Posters Union
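
For anyone who wants to script the Amazon block, here is a rough sketch. It assumes the ranges are available as JSON in the shape AWS uses for its published ip-ranges.json file (a "prefixes" list with "ip_prefix", "region", and "service" fields). The sample data, the merchant IP, and the choice to filter on the "EC2" service are illustrative assumptions, not real Amazon ranges.

```python
import ipaddress
import json

# Made-up sample in the same shape as AWS's ip-ranges.json; not real ranges.
SAMPLE_RANGES = json.loads("""
{
  "prefixes": [
    {"ip_prefix": "203.0.113.0/24", "region": "us-east-1", "service": "EC2"},
    {"ip_prefix": "198.51.100.0/24", "region": "us-west-2", "service": "S3"}
  ]
}
""")

# Hypothetical merchant IP hosted on the cloud that should still get through.
MERCHANT_IPS = {ipaddress.ip_address("203.0.113.10")}


def ec2_networks(ranges):
    """Collect the networks tagged as EC2 from the parsed JSON."""
    return [
        ipaddress.ip_network(prefix["ip_prefix"])
        for prefix in ranges["prefixes"]
        if prefix["service"] == "EC2"
    ]


def is_blocked(ip, networks):
    """Block any IP inside a listed network unless it is an allowed merchant."""
    addr = ipaddress.ip_address(ip)
    if addr in MERCHANT_IPS:
        return False
    return any(addr in network for network in networks)


if __name__ == "__main__":
    networks = ec2_networks(SAMPLE_RANGES)
    for candidate in ("203.0.113.99", "203.0.113.10", "198.51.100.5"):
        print(candidate, "blocked" if is_blocked(candidate, networks) else "allowed")
```

The resulting network list could then be written out as deny rules for whatever the server uses; that step depends on the setup and is left out here.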

    Another group of bandits out there is leasing IPs from Peak/web/hosting/com; the IP info returned says "United States Northfield We License Ips". They are as bad as the Amazon IP users.

    Since we implemented this on our servers in February and March, we have seen no ill effect on traffic. Spam from contact forms no longer exists, and we haven't had a single MySQL injection attempt. In fact, we believe it has helped our site speeds and increased our traffic. No more 'bounces' from countries where we don't market.

    There are those that will say "but you don't know if someone from lower B.F.E. is buying something for their cousin in Denver" or "What if someone is on vacation out of the country and wants to order something?" So what. Not worth the hassle or the costs associated with fighting the scum.

    But hey, could just be me...
    Salty kisses, Sandy toes, and a Pirate's heart...

