  1. #1
    Outsourced Program Manager John Jupp's Avatar
    Join Date
    January 23rd, 2005
    Location
    England
    Posts
    1,502
    The Ethics Vs Site Scraping
    Many suppliers provide poor product details to retailers and generally advise of price/product changes via email. Many don't supply XML or CSV files giving the product name, product number, price and specifications, or if they do, it's supplied as a PDF or in some other format which isn't instantly usable by a retailer. Now multiply that by a number of suppliers to a retail website with a few thousand products, and keeping everything up to date becomes very time-consuming.

    The same could conversely be said about retailers with affiliate schemes who provide poor datafeeds but I'll put that as a separate question in a moment.

    One way for a retailer to get up-to-date information is to scrape the wholesaler/supplier website and extract information such as category, product name and product number into a CSV file, which can be loaded into a database and then integrated into their ecommerce website.

    The beauty of this approach is that a weekly scrape of the relevant categories keeps the information on current products up to date. It does not use horrendous amounts of bandwidth. It's also quite quick. It isn't "copying" a website; it's getting details on the products that a wholesaler/supplier provides to a retailer, so that the retailer has an up-to-date inventory which they then price-adjust manually.
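
    To make the extraction step concrete, here is a minimal sketch using only the Python standard library. The markup it expects (a `div` with class `product` and a `data-sku` attribute, containing `span` elements with classes `name` and `price`) is purely an assumption for illustration; a real supplier page will be laid out differently and the parser would need adapting. It also assumes there are no nested `div` elements inside a product block.

```python
import csv
import io
from html.parser import HTMLParser


class ProductParser(HTMLParser):
    """Collects one dict per <div class="product"> block (hypothetical markup)."""

    FIELDS = ("name", "price")

    def __init__(self):
        super().__init__()
        self.products = []
        self._current = None  # product currently being filled in
        self._field = None    # field whose text we are inside, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css = attrs.get("class", "")
        if tag == "div" and css == "product":
            self._current = {"sku": attrs.get("data-sku", ""), "name": "", "price": ""}
        elif tag == "span" and self._current is not None and css in self.FIELDS:
            self._field = css

    def handle_data(self, data):
        if self._current is not None and self._field:
            self._current[self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "div" and self._current is not None:
            # Assumes no nested <div> inside a product block.
            self.products.append(self._current)
            self._current = None


def products_to_csv(html):
    """Return CSV text (sku, name, price) for every product found in the HTML."""
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["sku", "name", "price"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(parser.products)
    return out.getvalue()
```

    The resulting CSV text can then be imported into whatever database sits behind the shop, with prices adjusted manually afterwards as described above.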

    Question: What are the ethics and how do you view this as an activity to maintain an ecommerce website?

    From the affiliate perspective now: you subscribe to a program and the datafeed is rubbish. What ethics are involved in doing exactly the same thing with a retailer's website to get a CSV file with product name, product number, specification, price and image URL? Is there also an ethical conflict in obtaining it from a merchant who is too lousy at supplying decent feeds, who can't be bothered, or who lacks the skill?

    Discuss.
    Flambi Media Limited - USA/UK/EU Affiliate Management Expertise

  2. #2
    notary sojac Herb ԿԬ's Avatar
    Join Date
    January 18th, 2005
    Location
    Central/Western NY State
    Posts
    7,741
    If the retailer is an authorized agent of the supplier, I see no ethical problem for the retailer. But you mentioned an affiliate, who should have some kind of understanding or permission to hit the originator's site. After all, that is "two steps back," and going around the merchant or network.

  3. #3
    Beachy Bill's Avatar
    Join Date
    November 20th, 2005
    Posts
    8,266
    There should be no ethical problem for the affiliate if the scraping is done with the permission of the merchant.

    Bill / Marketing Blog @ 12PM - Current project: Resurrecting my "baby" at South Baltimore..
    Cute Personal Checks and Business Checks
    If you are too busy to laugh you are too busy.

  4. #4
    ABW Founder Haiko de Poel, Jr.'s Avatar
    Join Date
    January 18th, 2005
    Location
    New York
    Posts
    21,609
    Agreed. Permission with full disclosure. You could in essence be DDoSing the merchant if their hosting isn't up to par, or blow out their bandwidth. You'd be amazed at the borderline / poorly allocated hosting plans that some merchants have.
    Continued Success,

    Haiko
    The secret of success is constancy of purpose ~ Disraeli

  5. #5
    Visual Artist & ABW Ambassador lostdeviant's Avatar
    Join Date
    September 7th, 2007
    Location
    Cuautitlán, Edo. de México
    Posts
    1,725
    I agree. If you have permission you can pass go and collect $200...if you don't you can go to Jail :-)

    *Monopoly Speak*

    Any Public Domain or Collective commons site with copy permission can also be used for source material, but you need to give credit according to the CC license.

  6. #6
    Outsourced Program Manager John Jupp's Avatar
    Join Date
    January 23rd, 2005
    Location
    England
    Posts
    1,502
    Many thanks for the informative responses so far.

    The terms and conditions of a retailer's contract with a supplier usually state that they may use material from the supplier's website. It is merely the manner in which it is obtained that I wished to clarify. Many thanks for that.

    From an affiliate's perspective I agree: an affiliate should obtain permission first. I have designed my own software to do that very task. I always seek consent from the merchant if producing an affiliate datafeed, and conversely, as a retailer, it is always important to check the T&Cs.

    I agree too that it is surprising just how many merchants do not invest in bandwidth and more importantly, server resources. Shared hosting can be a problem for affiliates and merchants alike when use of system resources is high and can result in being politely asked to relocate or upgrade to dedicated use.

  7. #7
    Member
    Join Date
    February 21st, 2007
    Location
    Seattle, WA
    Posts
    75
    It is always best to inform the merchant of your intentions before you undertake the project and to seek permission from them.

    However, there are certain circumstances where you are on solid legal/ethical footing even if you don't explicitly ask for permission. This depends on how you use the information that you scrape.

    If you're complying with their robots.txt, their copyrights and their Affiliate terms I think your activities are legit.
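
    As a rough sketch of what the robots.txt part of that can look like in practice, Python's standard `urllib.robotparser` module can decide which URLs a crawler may request. The user agent name `AffiliateFeedBot` and the `merchant.example` URLs in the usage note are made up for illustration, and this covers robots.txt only, not copyright or the affiliate terms.

```python
from urllib.robotparser import RobotFileParser


def allowed_urls(robots_txt, user_agent, urls):
    """Return only the URLs that the given robots.txt text permits for user_agent."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return [url for url in urls if rules.can_fetch(user_agent, url)]


def requested_delay(robots_txt, user_agent):
    """Return the Crawl-delay value for user_agent, or None if none is given."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.crawl_delay(user_agent)
```

    For example, with a robots.txt that disallows `/checkout/`, calling `allowed_urls(text, "AffiliateFeedBot", [...])` would drop `https://merchant.example/checkout/cart` and keep `https://merchant.example/products/1`. Against a live site, `RobotFileParser.set_url()` followed by `read()` fetches the file instead of parsing a string.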

    But, it is always easy to shoot an email off to the merchant and I'd encourage you to do that first.

  8. #8
    Classic Rocker Mack's Avatar
    Join Date
    January 27th, 2007
    Location
    Lower Left Coast
    Posts
    1,167
    Contact the merchant; you'd be surprised how cooperative the good ones are. I asked one if I could grab all the images in a feed and serve them up myself. She said no problem, she'd make it easier. She had a tech overnight me a DVD with all the images. In it was a note from the tech saying that if I gave him FTP write access to a directory, he'd automatically update them for me.

    If you ask, they may even find a better way to help you.

  9. #9
    Outsourced Program Manager John Jupp's Avatar
    Join Date
    January 23rd, 2005
    Location
    England
    Posts
    1,502
    Quote Originally Posted by Mack
    Contact the merchant, you'd be surprised how cooperative the good ones are. I asked one if i could grab all the images in a feed and serve them up myself. She said no problem, she'd make it easier. She had a tech overnight me a DVD with all the images. In it was a note from the tech saying that if I gave him ftp write to a directory, he'd automatically update them for me.

    If you ask, they may even find a better way to help you.
    I have a similar arrangement with one merchant from the affiliate side. I requested a weekly update for a particular category and they ftp it weekly.

    I've just completed an extraction of product data from a very difficult site. It would be better if that merchant could supply a regular, correctly formatted update, as they've gone the duplicate categorisation route with a JavaScript rerouter that would normally nullify a site scraper. I say normally, because we found a way round it in about two minutes. It just means that we have to take page 1 of each sub category as a block before going on to page 2. Not a problem.
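
    The ordering described there (every sub category's page 1 first, then every page 2, and so on) can be sketched as below. This only illustrates the visiting order; the sub category names are placeholders, and how each (sub category, page) pair maps to a URL on that particular site is not shown because it is site-specific.

```python
def page_major_order(subcategories, page_count):
    """List (subcategory, page) pairs so all page-1 requests come before any page-2."""
    return [
        (subcategory, page)
        for page in range(1, page_count + 1)
        for subcategory in subcategories
    ]
```

    With two sub categories and two pages, the order is (first, 1), (second, 1), (first, 2), (second, 2), rather than walking one sub category to the end before starting the next.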

    Still, I'll have to talk to that merchant directly, even though they allow content material to be used, as they've obviously taken these steps to reduce the server load from bots, and their website is 40 GB.

  10. #10
    OPM and Moderator Chuck Hamrick's Avatar
    Join Date
    April 5th, 2005
    Location
    Park City Utah
    Posts
    16,646
    Quote Originally Posted by Beachy Bill
    There should be no ethical problem for the affiliate if the scraping is done with the permission of the merchant.

    First, the affiliate managers (i.e. the merchants) who have a lousy datafeed will do one of two things: either they will ignore your request, just as they have been doing, or they will immediately respond that you do NOT have permission to crawl their site.

    Affiliates have the same responsibility that affiliate managers have to optimize the relationship. If your first several attempts to get the merchant to optimize the datafeed fail, then put it on your follow-up list and keep trying if it's to your advantage.

    If you can get them to agree and show an increase in sales due to the optimized datafeed then you just provided the case study they need to get the resources. There was a lengthy discussion here several years ago that resulted in a datafeed centric network being built. That was due to several founders not getting the resources to build out the datafeed they needed for the program. I don't have the exact thread (please post so I can bookmark) but here are some other threads:
    http://forum.abestweb.com/showthread.php?t=5493
    http://forum.abestweb.com/showthread.php?t=5364

  11. #11
    Resident Genius and Staunch Capitalist Leader's Avatar
    Join Date
    January 18th, 2005
    Location
    Florida
    Posts
    12,817
    I'd say to contact the merchant first, too. If I saw an unidentified scraper coming through GoodBulbs, I'd probably assume it was someone up to no good and ban it. So just to avoid getting your merchants antsy, you should let them know it's you and that you're a real affiliate who just wants to sell their stuff.
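
    One way a scraper can avoid looking "unidentified" in the merchant's logs is to send a descriptive User-Agent and a contact address with every request. A minimal sketch with the standard library follows; the bot name `AffiliateFeedBot/1.0` and the header wording are invented for illustration, and this is a courtesy on top of asking the merchant, not a substitute for it.

```python
from urllib.request import Request


def identified_request(url, affiliate_id, contact_email):
    """Build a request that tells the merchant who is crawling and how to reach them."""
    user_agent = f"AffiliateFeedBot/1.0 (affiliate {affiliate_id}; {contact_email})"
    return Request(url, headers={"User-Agent": user_agent, "From": contact_email})
```

    The returned object can be passed to `urllib.request.urlopen()` to make the actual request.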
    There is no knowledge that is not power. ~Hemingway

  12. #12
    ABW Ambassador MoneyBusiness's Avatar
    Join Date
    March 14th, 2006
    Posts
    2,051
    Quote Originally Posted by John Jupp
    From the affiliate perspective now. You subscribe to a program and the datafeed is rubbish. What ethics are involved here by doing exactly the same thing with a retailers website to get a csv file with Product name, product number, specification, price and image url?

    I had to come up with a couple of scripts to do just that. However, in an effort to not blow up their servers, I only grab data in small spurts over a long period of time. This at least gets me the initial load of data that I need to incorporate into my site.
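
    A minimal sketch of that "small spurts over a long period" idea: fetch a few pages, pause, and repeat. The batch size and pause length here are arbitrary example values, not the ones used in the scripts mentioned above, and `fetch` is a placeholder for whatever function actually downloads a page.

```python
import time


def fetch_in_spurts(urls, fetch, batch_size=5, pause_seconds=60.0, sleep=time.sleep):
    """Fetch urls in small batches, sleeping between batches to spare the server.

    fetch is any callable taking a URL and returning its content.
    sleep is injectable so the pacing can be checked without really waiting.
    """
    results = {}
    for start in range(0, len(urls), batch_size):
        if start:  # pause between batches, but not before the first one
            sleep(pause_seconds)
        for url in urls[start:start + batch_size]:
            results[url] = fetch(url)
    return results
```

    Spreading requests out like this makes the initial load take longer, but it keeps the crawl well away from anything that looks like a DDoS on a merchant with a marginal hosting plan.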

    I also did something similar to download the images and save them to my server, so that they load more quickly and reliably.
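
    For the image side, a rough sketch of a local cache: each image URL maps to a stable file name, and the image is downloaded only if it is not already on disk. The hashing scheme and the `.img` fallback extension are illustrative choices, and `fetch` is a placeholder for a function that returns the image bytes.

```python
import hashlib
from pathlib import Path
from urllib.parse import urlparse


def local_image_path(image_url, cache_dir):
    """Map an image URL to a stable local file path, keeping the file extension."""
    suffix = Path(urlparse(image_url).path).suffix or ".img"
    digest = hashlib.sha256(image_url.encode("utf-8")).hexdigest()[:16]
    return Path(cache_dir) / f"{digest}{suffix}"


def cache_image(image_url, cache_dir, fetch):
    """Save the image locally once; later calls reuse the file without refetching."""
    target = local_image_path(image_url, cache_dir)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(fetch(image_url))
    return target
```

    Because an existing file is never refetched, repeat runs put no extra load on the merchant, though an image that changes at the same URL would need the cached copy deleted first.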
