Page 1 of 2
Results 1 to 25 of 41
  1. #1
    Newbie DataFeedFile.com's Avatar
    Join Date
    May 26th, 2006
    Posts
    329
    What does Google consider duplicate content?
    I created this posting in reply to several questions from another forum member, but that original post was deleted after I had spent 25 minutes creating it. I thought that, if Haiko allows it, I would post it here in the Google subforum, where it is appropriate and may help someone...

    I would like to state that these are my opinions and beliefs alone. If you read this thread and agree with my opinion, I am happy; but if you do not agree, you are free to offer your own opinion here as well. Sorry, I cannot be responsible for your implementing my opinion or any damage it may cause. (Standard disclaimer - in case I get attacked.)

    I believe what Google means by duplicate content is web sites that host the same content under many different domain names. Another example is a webmaster who duplicates content and publishes it under different URLs, changing only the URL and page titles.

    Of course no one knows for sure if that is true, unless a Google editor cares to read and comment on this posting. Reading Google's site submission guidelines alone does not explain in enough detail what duplicate content exactly means.

    I will list a few examples to prove that duplicate content is being indexed by Google -but- not penalized, because it is not considered duplicate once it is mixed in with a site's other content. I believe that as long as the duplicated content exists only within another page and covers at most about 60 - 70% of that page's content, Google's robots will not be able to distinguish it as duplicate.
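    Since no one outside Google knows how their detection actually works, here is only an illustrative sketch of how near-duplicate text can be measured in principle, using word "shingles" and Jaccard similarity. The function names and the 4-word shingle size are my own choices for illustration, not anything Google has published:

```python
def shingles(text, n=4):
    """Return the set of overlapping n-word shingles in `text`."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity(page_a, page_b, n=4):
    """Jaccard similarity of two pages' shingle sets (0.0 to 1.0)."""
    a, b = shingles(page_a, n), shingles(page_b, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

# A copied passage with only a short unique intro still scores very high:
original = "the quick brown fox jumps over the lazy dog near the river bank"
copy_with_intro = "breaking news today " + original
print(round(similarity(original, copy_with_intro), 2))  # about 0.77
```

    On this view, a page where the copied block is only a minority of a larger original page would score much lower, which is consistent with the 60 - 70% intuition above without confirming it.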

    Scenario #1 : Wikipedia encyclopedia content
    Not many webmasters realize this, and I hesitate to bring it up for fear it may be abused... You can download the content of the Wikipedia database and create a Wikipedia within your own site. (WOW) Talk about duplicate content! Yet if you search for the keyword "wikipedia" and go 10+ pages deep into the results, you will notice that a lot of the indexed pages have the same content as you would find on Wikipedia.org.

    Scenario # 2 : Open Source Software Documentations
    Since I have read and followed many Open Source docs, I know they are GPL licensed, and many sites have actually downloaded the docs and republished them. Again... many times I have searched for something I needed and found results from manuals/documentation on another site rather than the actual official site.

    Scenario # 3 : RSS Feeds
    RSS feeds were introduced perhaps 5 years ago and became popular within the last 2 years, and since then millions of web sites have used or republished RSS content over and over again. I mean, the whole idea of RSS is the syndication concept itself. Again, I believe it is about how much data you duplicate.

    These are just some of the instances I have personally experienced of Google's treatment of duplicate content. Many of us could argue on and on about this topic... Unfortunately I do not have the time; I have DataFeedFile.com to run and support. I hope this posting can help someone if it makes sense.

  2. #2
    ABW Ambassador
    Join Date
    January 18th, 2005
    Posts
    4,423
    I would say you show a pretty slanted view of duplicate content, and one that is not generally accepted. Your examples are broken: don't search for "wikipedia", search for the content contained, for example "Theodore Kaczynski". Wikipedia is #1; there are 3 other versions in the top 5 pages using some version of wiki's content, with only #1 being duplicate content, but that source, reference.com, is itself an authority. If you pick 50 normal affiliate sites and check, they will be duplicate content and get penalized. Exceptions are not the rules... they are the exception.

    While you claim you do not have time to discuss it, I guess you do have time to plant this misinformation to help your business. That is pretty scummy.

    Chet

  3. #3
    Newbie DataFeedFile.com's Avatar
    Join Date
    May 26th, 2006
    Posts
    329
    Okay... I never accused anyone of being "scummy"... I am not sure why ABW members are so quick to make accusations... As I said, this is not useful and it is wasting my time...

    I will use your example to prove that many instances of the same "wikipedia" content are being indexed in Google.

    Instead of searching for "Theodore Kaczynski", which will not reveal most duplicate occurrences of the same wikipedia content, try this search instead:

    Search for "Theodore Kaczynski wikipedia" on Google, here is the link:

    http://www.google.com/search?hl=en&l...nski+wikipedia

    We will only focus on the English language versions...

    Copies of the same Wikipedia content appear in the Google search results at the 1st page 3rd position, 1st page 6th position, 2nd page 1st position, 2nd page 5th position, 2nd page 6th position, 5th page 5th position, and 6th page 6th position.

    By the way, Wikipedia is the source of this article, not the other way around. The quote below is taken from the subtitle at Reference.com:

    Wikipedia, the free encyclopedia - Cite This Source
    Checking Reference.com reveals that thousands of their content pages have exactly the same content *from* Wikipedia, and I am sure lots of these pages are indexed by Google.

    While researching this I also accidentally found that Answers.com has practically the same content about the Unabomber at two separate links, both indexed by Google on the 1st and 2nd page (not bad rankings):

    http://www.answers.com/topic/unabomber
    http://www.answers.com/theodore%20kaczynski

    I will try to leave again by saying that this is not a good use of my time. The first posting in this thread is my opinion; you are welcome to comment, but PLEASE do not accuse me of being SCUMMY!
    Last edited by DataFeedFile.com; September 19th, 2006 at 07:49 PM.

  4. #4
    Newbie DataFeedFile.com's Avatar
    Join Date
    May 26th, 2006
    Posts
    329
    Quote Originally Posted by chetf
    While you claim you do not have time to discuss it, i guess you do have time to plant this misinformation to help your business. That is pretty scummy.
    Chet
    I did not need to spend 25 minutes creating the original post, let alone another 20 minutes replying to your accusation. We are a sponsor of ABW and have our own sub-forum here; I post daily and often. It is faster and easier for me to post something relevant to our business than to go round and round on this issue.

  5. #5
    Full Member
    Join Date
    January 16th, 2006
    Posts
    447
    Would much of this be a moot point if someone used the JavaScript one-liner to create the store?

  6. #6
    Newbie DataFeedFile.com's Avatar
    Join Date
    May 26th, 2006
    Posts
    329
    No. JavaScript is completely ignored by robots.

    That is because JavaScript must be executed on the client side by a browser like IE, Mozilla, etc...

  7. #7
    Full Member
    Join Date
    January 16th, 2006
    Posts
    447
    Right, so the whole duplicate content point means nothing in this context. There will never be any content to index, because it's all behind a JavaScript link.

  8. #8
    Newbie DataFeedFile.com's Avatar
    Join Date
    May 26th, 2006
    Posts
    329
    Right, duplicate content does not apply to content delivered from a remote server using JavaScript.

    For example, Google Ads and Yahoo Overture ads are both JavaScript and will not show up on any cached or indexed pages.

    A good way to find out whether a certain page's content is JavaScript or not is to use "View Source" (viewing the page's source in HTML format). If you do not see the content you are looking for in the HTML source code, but instead see <script type="text/javascript">.... blah blah ....</script>, then you know it is JavaScript.
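    The "View Source" check above can be automated. This is a rough sketch of the same idea: a crawler that ignores JavaScript effectively sees the raw HTML with the script blocks blanked out. The function name and the sample HTML are made up for illustration:

```python
import re

def visible_to_crawler(raw_html, phrase):
    """True if `phrase` appears in the raw HTML outside <script> blocks."""
    # Blank out script blocks: a robot that does not execute JavaScript
    # never sees text that only document.write() would produce.
    without_scripts = re.sub(r"<script\b.*?</script>", "", raw_html,
                             flags=re.DOTALL | re.IGNORECASE)
    return phrase.lower() in without_scripts.lower()

html = """<html><body>
<p>Hand-written product review</p>
<script type="text/javascript">document.write("Merchant datafeed text");</script>
</body></html>"""

print(visible_to_crawler(html, "Hand-written product review"))  # True
print(visible_to_crawler(html, "Merchant datafeed text"))       # False
```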

  9. #9
    ABW Ambassador simcat's Avatar
    Join Date
    January 18th, 2005
    Location
    Denver
    Posts
    1,786
    Duplicate content is all about different shades of gray, IMO.

    Think about all the major news sites that republish AP stories, etc.

    Most sites have a basic navigation template throughout the site. (Can Google always separate that from the other content?)

    A lot of common blog software and CMSes generate the same content at different URLs within the site.

    I think that you can bypass a lot of the duplicate content issues if you have:
    a lot of incoming links
    a high 'trust' ranking (whatever that is) in the eyes of the SEs
    a lot of original content to go along with the non-original

  10. #10
    Newbie DataFeedFile.com's Avatar
    Join Date
    May 26th, 2006
    Posts
    329
    Simcat,

    Thanks for reiterating that this topic of Google duplicate content is somewhat of a gray area of internet marketing.

    For anyone reading this thread: I put warnings in the first post itself, but I would like to be absolutely clear.

    Please note! The information about Google Duplicate Content is strictly my own opinion. Read it and use it if you wish *at your own risk*.

  11. #11
    ABW Ambassador
    Join Date
    January 18th, 2005
    Posts
    4,423
    I said my piece, and stand by it - but just to clarify one point.

    I said the act was scummy, not the person. I have no idea of their dealings or workings; I don't use them and don't really read up on them, etc. So don't take that as an indicator of their worth as a company. I am sure there is plenty of information out there that would be a better indicator, one way or another, of their company.

    Chet

  12. #12
    ABW Ambassador
    Join Date
    January 18th, 2005
    Location
    Los Angeles
    Posts
    4,053
    Exceptions are not the rules... they are the exception.
    Very good point, and it also applies to a lot of posts out there about SEO and subdomains. IMHO it's very misleading when people give sites such as Search Engine Watch, Yahoo!, and About.com as examples of subdomains being just fine. The same goes for using notable sites like that in cross-linking discussions.

    An important thing to note is that the small mom 'n pop site does not have the authority and trust status that the big guys have, and is highly unlikely to have their huge volume of original content and independent inbound links. The same principle applies to duplicate content. You and I are *not* going to get away with what the New York Times will when news stories are run.

    It really isn't fair to hold up sites with hundreds of thousands (possibly millions) of quality inbound links and a PR8 or so as examples for the average webmaster.

    I believe what Google mean by duplicate contents are web sites which has same content but uses many domain names to host these same contents. Or another example is a webmaster who duplicate contents and use different URL to publish those contents changing only the URL and Page titles.
    That simply isn't so. Those are mirror sites, but duplicate content detection and filtering go much further than that.

    Of course no one knows for sure if that is true, unless a Google editor cares to read and comment on this posting. Reading Google's site submission guideline alone does not explain enough in detail of what Duplicate Contents exactly means.
    Also not so, not 100%. We can't know exactly what IS being done (in general; though we can identify filtering in individual cases), but we can definitely know what CAN be done and looked for, and what IS considered duplication of content, without search engineers revealing specifics, which they'll never do.

    Added: Just my opinions, FWIW.
    Last edited by webworker; September 20th, 2006 at 04:41 AM.

  13. #13
    Action Jackson - King of the World
    Join Date
    January 18th, 2005
    Posts
    2,201
    I for one feel that dup content is overly exaggerated. Does it enter into the algorithm? Of course it does. However, I think webmasters are way too quick to jump to the conclusion that it's a dup content penalty when often it is just a lack of incoming links or another associated website problem, like redirects.

  14. #14
    Newbie DataFeedFile.com's Avatar
    Join Date
    May 26th, 2006
    Posts
    329
    webworker has some good comments...

    Another good question along these lines:

    Why does Google even allow / index mirrored sites?

  15. #15
    ABW Ambassador Akiva's Avatar
    Join Date
    January 18th, 2005
    Location
    New Jersey
    Posts
    3,266
    The dup content issue is complex. Even if you try to figure out what Google is doing, Google's engineers are so far ahead that whatever you do, if you don't provide useful content (and by that I mean what Google deems useful), you will be caught in the filter and sent to page oblivion. Using javascript is a sure-fire way of getting Google to look at your ACTUAL content rather than the merchant's product info (which is what it is at the end of the day). That's not to say that using spiderable product info will penalize you, but you are taking the risk of Google labeling you a "thin affiliate site" if you are publishing duplicate data.

    The bottom line is that what worked yesterday may not work today in terms of Google ranking you well. Javascript product info is the only way to make sure that Google will look at your own content. The days of ranking well due to product info are long gone. If you don't have a "Buy" or "Add to Cart" button on your site, chances are you will not be anywhere near the top 10 results. BigDaddy took care of that.

    So the bottom bottom line is that what other people say on forums means s*it, as Google is way ahead of the curve. Google hates "thin affiliates" - duplicate content is one of many signs of a thin affiliate.
    Akiva Bergstrom | akiva@affsolutions.com | 718-871-8286

    Affiliate Marketing Solutions by affSolutions - Creator of the Product Showcase Creator™

    Managed Programs: EssentialApparel.com (Join) | SportsFanfare.com (Join)


    Affiliates: Product Showcase Creator Directory | Merchants: License the Product Showcase Creator™!

  16. #16
    15 years and counting
    Join Date
    January 18th, 2005
    Posts
    6,121
    That's my thinking, Akiva.

  17. #17
    ABW Ambassador netnow22's Avatar
    Join Date
    January 18th, 2005
    Location
    Columbia, SC
    Posts
    748
    One thing you should note:

    Just because you see duplicate content in the SEs doesn't mean the search engine algorithms are giving credit to that content or its backlinks. Also, you don't get penalized for having duplicate content; the search engines just ignore it (and the links associated with it).

  18. #18
    15 years and counting
    Join Date
    January 18th, 2005
    Posts
    6,121
    Also, you don't get penalized for having duplicate content; the search engines just ignore it (and the links associated with it)
    Wrong

  19. #19
    Full Member
    Join Date
    September 10th, 2005
    Posts
    369
    If it's not printed in black and white in detail by the SE's it's open to interpretation, opinions and theories.

  20. #20
    15 years and counting
    Join Date
    January 18th, 2005
    Posts
    6,121
    It's not printed in black and white, but you can experiment.
    Take one of your sites with established traffic - I mean one not already banned - with enough one- or two-word keywords in the top 10 positions, and add more than 20% duplicate content. Add just one datafeed with the merchant descriptions; if you already have 1,000 pages in Google, that's only 200 products. Then watch the result. Google is faster and faster at catching that. And don't come crying here that CJ is not tracking.
    If, like many people here, you don't have traffic and you add 1K or 100K products, you're not going to see the same results. Yes, you can pull a few sales from time to time, but you will NEVER rank well enough to make money. These people are happy as hell to make a sale, and what, does that mean Google doesn't penalize duplicate content? No, they just don't know it.
    I'm not doctus cum libro. I did my homework and my experiments months, even years ago, and I still check the results.

    By the way, I had a quick look at many major affiliates; most of them lost around half of their traffic from last year to now. You can also look at CJ and LinkShare: they have more affiliates and more merchants, so their traffic should be expanding, no? It's down by half too. Also, check the number of active affiliates at ABW. Why are they gone? Yep, it's related to Google's duplicate content penalty.
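    The arithmetic behind the experiment described above is simple; note that the 20% figure is the poster's own threshold, not a published Google number, and the function name here is made up for illustration:

```python
def duplicate_pages_for_share(existing_pages, share):
    """Datafeed pages equal to `share` of the pages already indexed."""
    return round(existing_pages * share)

# 1,000 pages already in Google, 20% duplicate content added:
print(duplicate_pages_for_share(1000, 0.20))  # 200 products
```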

  21. #21
    The "other" left wing davidh's Avatar
    Join Date
    January 18th, 2005
    Location
    Boston
    Posts
    3,492
    The example you show, "theodore kaczynski wikipedia", would presumably return results that contain the terms "theodore kaczynski" and "wikipedia". Those pages may very well get indexed, but they may not rank well, which is why a search for "theodore kaczynski" alone....

    ....will not reveal most duplicate content occurrences using the same wikipedia content
    CUSTOM BANNERS by GRAPHICS CANDY ~ Banner Sets and Website Graphics ~ Professional design, reasonable rates
    DESIGNER DOG CHECKS ~ We double-dog dare ya to write one!

  22. #22
    ABW Ambassador netnow22's Avatar
    Join Date
    January 18th, 2005
    Location
    Columbia, SC
    Posts
    748
    Most commonly, if two or more pages contain the same content, rather than ranking them both for the same query (and cluttering their search results with identical content), search engines choose one of the pages to show, and push the other pages down in their search results (sometimes placing them in their supplemental index of "backup" pages to show if they can't find any relevant results in their main index).

    For example, this happens occasionally when the same article is syndicated to many sites. Google will just choose one version of the article to display, and avoid showing the other versions. The version Google selects is typically chosen based on a variety of factors, including which version they indexed first, the trust they place on the domain on which the article is hosted, and the PageRank of the page the article is on.

    Basically, all this is just a complicated way of saying that Google won't "penalize" a page for duplicate content, it just won't show it if there's already another page with the same content in their index.
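    The selection process described above can be sketched as picking one canonical page per group of duplicates. The signals (indexing order, domain trust, PageRank) are the ones named in this post, but the field names, scales, and priority order here are hypothetical, since the real weighting is unpublished:

```python
def pick_canonical(duplicates):
    """Choose the one page to show from a group of duplicate pages.

    Each page is a dict with 'url', 'first_indexed' (smaller = earlier),
    'domain_trust' and 'pagerank' (larger = better).
    """
    # Prefer trusted domains, then higher PageRank, then the earliest copy.
    return min(duplicates, key=lambda p: (-p["domain_trust"],
                                          -p["pagerank"],
                                          p["first_indexed"]))

pages = [
    {"url": "scraper.example/a",       "first_indexed": 5, "domain_trust": 1, "pagerank": 2},
    {"url": "en.wikipedia.org/wiki/X", "first_indexed": 1, "domain_trust": 9, "pagerank": 8},
    {"url": "mirror.example/x",        "first_indexed": 3, "domain_trust": 2, "pagerank": 3},
]
print(pick_canonical(pages)["url"])  # en.wikipedia.org/wiki/X
```

    The rest of the group would be demoted or pushed into the supplemental index rather than removed, consistent with the description above.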

  23. #23
    Action Jackson - King of the World
    Join Date
    January 18th, 2005
    Posts
    2,201
    Take coupon sites, MFA sites, and directory sites. None of these have original content, yet they rank near the top. To me this dispels the dup content myth.

  24. #24
    ABW Ambassador simcat's Avatar
    Join Date
    January 18th, 2005
    Location
    Denver
    Posts
    1,786
    Quote Originally Posted by jackson992
    Take coupon sites, MFA sites, Directory sites. None of these have original content yet they rank near the top.
    Some do, but they often combine the content on the page in ways that make it not truly duplicate.

    I think that having some dup content is almost inevitable on a big site, but if people think they can just stick up a site of thousands of pages of unchanged datafeed and get good rankings, they will usually be disappointed.

  25. #25
    ABW Ambassador
    Join Date
    January 18th, 2005
    Posts
    2,420
    >>>1000's of pages of unchanged datafeed and get good rankings, they will usually be disappointed.

    Anyone differ?


