  1. #1
    Newbie
    Join Date
    September 30th, 2006
    Posts
    12
    Help with 8 gig XML Datafeed
    Good evening,

    I need help trimming an 8 GB XML datafeed. The feed contains every product from the automotive section of a retailer, but I only need the motorcycle products, which are a subcategory within that section. I'm new to XML, having mostly worked with CSV feeds, and I need help extracting these products now and each time I download the file to update my site. Any suggestions or ideas? In other posts, some people suggested that a 200 MB feed probably contained duplicate products because of its size. I can tell you with 100% certainty that this is not the case with this retailer. Thank you in advance for any direction.

    Regards,
    Jeremy

  2. #2
    ABW Ambassador PatrickAllmond
    Join Date
    September 20th, 2005
    Location
    OKC
    Posts
    1,219
    I'd suggest doing the obvious: find a way to extract the motorcycle products and put them into a format that is usable. If you are used to working with CSV, I'd convert them to that, provided that any field values containing commas are quoted properly. However, if you feel like a good challenge, I'd just stick with the XML. Once you get used to working with it, it is the cleanest of the formats out there.
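
    On the comma concern: a minimal sketch, in Python purely for illustration, of how a standard CSV writer quotes values that contain commas. The column names `sku` and `name` and the helper `rows_to_csv` are made up for the example and are not taken from the actual feed.

```python
import csv
import io


def rows_to_csv(rows, columns):
    """Serialize a list of dicts to CSV text; values containing commas get quoted."""
    buf = io.StringIO(newline="")
    # extrasaction="ignore" drops any keys that are not listed in `columns`.
    writer = csv.DictWriter(buf, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

    Reading the text back with `csv.reader` returns the original value with its comma intact, so commas in product names need not block a CSV conversion.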

    1. Which technology are you familiar with: Java, PHP or ASP?
    2. What do you want to do with the data once you get it extracted?

    Most technologies nowadays have built in functions for dealing with XML.

    Here is a quickie I was able to google from PHP and XML:
    http://www.sitepoint.com/article/php...sing-rss-1-0/2
    ---
    This response was masterly crafted via the fingers of Patrick Allmond who believe you should StopDoingNothing starting today.
    ---
    Focus Consulting is where I roll | Follow @patrickallmond on Twitter
    Search Engine Marketing | Search Engine Optimization | Social Media | Online Video

  3. #3
    Merchant & ABW Ambassador
    Join Date
    May 31st, 2006
    Location
    Houston TX
    Posts
    4,731
    lowparts, I have seen Heidi from Moto Sports, who came on ABW and posted something about a datafeed via merchandizer. As far as I know, they are new. Check out the announcement or exclusive commission section.

    As for your duplicate content issue, to verify it yourself, why not sort the feed by a unique field such as part #? That way, you will be able to confirm whether there are duplicates.
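
    A rough sketch of that check, in Python for illustration. It counts occurrences of the unique field instead of sorting the file, which answers the same question in a single streaming pass. The tag names `product` and `sku` are assumptions, not taken from the actual feed, and `duplicate_keys` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_keys(src, record_tag="product", key_field="sku"):
    """Return {key: count} for every key that appears more than once in the feed."""
    counts = Counter()
    context = ET.iterparse(src, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the single parent tag
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            key = (elem.findtext(key_field) or "").strip()
            counts[key] += 1
            root.clear()  # discard finished records so memory use stays flat
    return {key: n for key, n in counts.items() if n > 1}
```

    Only the keys are held in memory, not the records themselves. For a feed with many millions of products even the key table can get large, so treat this as a starting point.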

  4. #4
    Newbie
    Join Date
    September 30th, 2006
    Posts
    12
    Thank you to everyone who has replied. Let me further explain what I'm trying to accomplish.

    I have the datafeed, which is in XML. It contains all products from a retail partner, but approximately 70% of them are not applicable to my site and need to be filtered out. Besides needing to split the file up so that I don't have to upload something that large, I'm also looking for someone who may have an idea of how to strip out the products I do not need. Whether it's software, a method or a service... I'm all ears.

    Thanks again everyone!

  5. #5
    Newbie
    Join Date
    September 30th, 2006
    Posts
    12
    Quote Originally Posted by FairFieldGetaway-EricEwe
    lowparts, I have seen Heidi from Moto Sports, who came on ABW and posted something about a datafeed via merchandizer. As far as I know, they are new. Check out the announcement or exclusive commission section.

    As for your duplicate content issue, to verify it yourself, why not sort the feed by a unique field such as part #? That way, you will be able to confirm whether there are duplicates.

    Eric...thank you for the reply. I've actually been working with Heidi and she has been fantastic. Thanks for your feedback!

  6. #6
    ABW Ambassador PatrickAllmond
    Join Date
    September 20th, 2005
    Location
    OKC
    Posts
    1,219
    Sure.

    If you were to go through it with your own eyes, how would you know what to keep and what not to keep?

    Let's figure that out and then we can figure out how to code it. I am sure there is a way that you know a record is one you want to keep. However, with XML you may not have that information until the 2nd or 3rd element/tag into the record. So I am pretty sure what we need to do is:

    Open your input XML.
    Open your empty output destination file.
    Write out the XML header and the opening parent tag. You can have only one parent tag.
    Start a loop reading the XML tag by tag.
    As we hit a new product, keep it in memory.
    Someplace in the loop, set a flag indicating whether this record is a keeper.
    At the end of each product, if it is a keeper, write it out to the output file.
    End your loop: go back to the top.
    Write the final closing tag that matches your parent tag.

    If you write this correctly you could generate your output in XML or CSV or whatever you wanted.
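
    The steps above can be sketched roughly as follows, in Python purely for illustration since no language has been settled on in this thread. Everything specific here is an assumption: the record tag `product`, the field `category`, the keyword `motorcycle`, and a feed without XML namespaces. `filter_feed` is a hypothetical helper, not anything the retailer provides.

```python
import xml.etree.ElementTree as ET


def filter_feed(src, dst, record_tag="product", field="category", wanted="motorcycle"):
    """Stream records from src and copy only the keepers to dst.

    src: file name or binary file object holding the input XML.
    dst: text file object for the output XML.
    Returns (kept, seen).
    """
    kept = seen = 0
    context = ET.iterparse(src, events=("start", "end"))
    _, root = next(context)  # first event: start of the single parent tag

    # Write the XML header and the opening parent tag (assumes no namespaces).
    dst.write('<?xml version="1.0" encoding="UTF-8"?>\n')
    dst.write("<%s>\n" % root.tag)

    for event, elem in context:
        # An "end" event means the whole product is now in memory.
        if event == "end" and elem.tag == record_tag:
            seen += 1
            value = elem.findtext(field) or ""
            if wanted in value.lower():  # the "keeper" flag
                elem.tail = "\n"
                dst.write(ET.tostring(elem, encoding="unicode"))
                kept += 1
            root.clear()  # forget finished records so memory use stays flat

    # Write the closing tag that matches the parent tag.
    dst.write("</%s>\n" % root.tag)
    return kept, seen
```

    Only one product is held in memory at a time, so the size of the input should not matter much. Swapping the `dst.write(ET.tostring(...))` line for a CSV writer would produce CSV instead of XML.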

    Keep asking questions. I eat this stuff up.

  7. #7
    Newbie
    Join Date
    January 4th, 2006
    Location
    Berlin
    Posts
    41
    I think whoever suggested using PHP to process an 8 GB XML file needs their head screwing on in the other direction.

    I would guess if you did some super super super programming you might just be able to get PHP to do nothing on it.

    You need to ask them which identifier in the feed marks the records you want, and then write some kind of Perl parser. Alternatively, you can pay me a monthly fee and I will deliver just the one category you need every day, every week, or whenever.

    That is unless you have already sorted the problem by now.

  8. #8
    Newbie
    Join Date
    September 30th, 2006
    Posts
    12
    Quote Originally Posted by pricethat
    I think whoever suggested using PHP to process an 8 GB XML file needs their head screwing on in the other direction.

    I would guess if you did some super super super programming you might just be able to get PHP to do nothing on it.

    You need to ask them which identifier in the feed marks the records you want, and then write some kind of Perl parser. Alternatively, you can pay me a monthly fee and I will deliver just the one category you need every day, every week, or whenever.

    That is unless you have already sorted the problem by now.

    Thanks for the information. I would be interested in learning about what you charge for your services. Could you email me at webmaster@lowparts.com? Thanks!

  9. #9
    Newbie
    Join Date
    June 28th, 2006
    Location
    Berlin
    Posts
    5
    Well, first I have to check that my software can even parse the data, but I am sure it won't have too many problems. As for cost, I guess about the same as a hosting account with 8 GB of data transfer per month, multiplied by however many times you want the feed processed per month.

  10. #10
    Affiliate Manager MotoMerchant
    Join Date
    August 9th, 2006
    Location
    Oregon
    Posts
    45
    Thanks Eric for the plug & Jeremy for the kind words!

  11. #11
    Newbie
    Join Date
    January 4th, 2006
    Location
    Berlin
    Posts
    41
    Sent you an email ;-) You might want to take your email address off this public post before every bot adds you to its spam list, though. They frequent this place quite often, and soon they will be wanting you to sign up to every penis extending affiliate program in the world....

  12. #12
    Member
    Join Date
    September 5th, 2005
    Location
    Mansfield, TX
    Posts
    161
    Sounds like you got your help, but if not, I would suggest that you use a SAX parser and not build a DOM object in memory (you'll run out of it). My language of choice would be Java. Apache's Xerces parser is very fast.

    I've done files that large in my 9-5 job. Let me know if I can be of assistance.
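
    To show the SAX idea in miniature, here is a sketch using Python's built-in `xml.sax` module rather than Java and Xerces, only to keep the example short. The tag names `product` and `category` and the keyword `motorcycle` are assumptions about the feed, and `CategoryCounter` and `count_matches` are hypothetical names.

```python
import xml.sax


class CategoryCounter(xml.sax.ContentHandler):
    """Counts records whose category text contains a keyword; no tree is built."""

    def __init__(self, record_tag="product", field="category", wanted="motorcycle"):
        super().__init__()
        self.record_tag = record_tag
        self.field = field
        self.wanted = wanted
        self.in_field = False
        self.chunks = []
        self.matches = False
        self.kept = 0
        self.seen = 0

    def startElement(self, name, attrs):
        if name == self.record_tag:
            self.matches = False  # reset the flag for each new record
        elif name == self.field:
            self.in_field = True
            self.chunks = []

    def characters(self, content):
        # Text may arrive in several pieces, so collect them until the tag closes.
        if self.in_field:
            self.chunks.append(content)

    def endElement(self, name):
        if name == self.field:
            self.in_field = False
            if self.wanted in "".join(self.chunks).lower():
                self.matches = True
        elif name == self.record_tag:
            self.seen += 1
            if self.matches:
                self.kept += 1


def count_matches(src):
    """Parse src (file name or binary stream) and return (kept, seen)."""
    handler = CategoryCounter()
    xml.sax.parse(src, handler)
    return handler.kept, handler.seen
```

    The handler only ever holds the text of the current category field, which is why this style of parser copes with files far larger than available memory.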
