  1. #1
    ABW Ambassador Paul_Ward
    Join Date
    January 18th, 2005
    Location
    Cambridgeshire, England
    Posts
    1,573
    Handling a huge feed
    I've recently acquired WebMerge and am just getting to grips with it.

    One datafeed I have from a merchant is about 73 MB when unzipped. So far I've prepared my files in Excel before WebMerge builds the pages, but this file is so big that it hits Excel's limit on the number of lines, and Excel opens only about the first 65K lines. I don't want to use the whole feed, just certain sections, some of which sit further down the file than Excel's line limit reaches.

    Does anyone have any tips for how to wrestle this monster?

  2. #2
    ABW Ambassador AddHandler
    Join Date
    January 19th, 2005
    Posts
    1,270
    FoxPro and about six months to a year of learning how to use it...!

    Merchants who do not know what they are doing when it comes to their own datafeed... really SUCK..

  3. #3
    ABW Ambassador buy_online
    Join Date
    January 18th, 2005
    Location
    Richmond, VA
    Posts
    3,234
    Access 2000 or above, if you're used to using MS apps.

    Fred

  4. #4
    Affiliate Manager
    Join Date
    January 18th, 2005
    Location
    Los Angeles, California
    Posts
    1,913
    Is that feed file available somewhere? I'd like to use it for testing a table viewer I'm building for v3.0...
    Richard Gaskin
    Developer of WebMerge: Publish any data feed on any site
    http://www.fourthworld.com

  5. #5
    Member infoscott
    Join Date
    March 17th, 2005
    Posts
    126
    If you could send me the first 10 or so lines of it, I could program Excel to span it across multiple worksheets.
    Scott (My Vanity Page: http://www.scotthamilton.net)

  6. #6
    Internet Cowboy
    Join Date
    January 18th, 2005
    Posts
    4,662
    I use Access for these as well and it handles them very well. For a simple search and replace, I often use WordPad.


  7. #7
    Affiliate Marketer Rogi
    Join Date
    January 18th, 2005
    Location
    Melbourne
    Posts
    415
    73 MB is just too big. Opening something like that will take a huge chunk out of your processing power and RAM.

    Is it really that hard for merchants to divide their feeds into the same categories/departments they divide their site into?
    Overstock does this well with their multiple feeds.

  8. #8
    ABW Ambassador Paul_Ward
    Join Date
    January 18th, 2005
    Location
    Cambridgeshire, England
    Posts
    1,573
    Thanks for the replies guys. I hadn't thought about using Access - duh! Another thing to learn.

    FourthWorld and infoscott, I'll PM you to see if you can help.

  9. #9
    ABW Ambassador buy_online
    Join Date
    January 18th, 2005
    Location
    Richmond, VA
    Posts
    3,234
    Access is very close to Excel, but doing search and replace in Access is a little tougher to learn.

    Fred

  10. #10
    Affiliate Manager
    Join Date
    January 18th, 2005
    Location
    Los Angeles, California
    Posts
    1,913
    All the more reason for me to hurry with v3.0 to put that table viewer in.

    Fred, what's the largest data feed you use, in terms of number of records?
    Richard Gaskin
    Developer of WebMerge: Publish any data feed on any site
    http://www.fourthworld.com

  11. #11
    Full Member heisje
    Join Date
    January 18th, 2005
    Posts
    314
    Some datafeeds may exceed 120,000 records.

    heisje

  12. #12
    Action Jackson - King of the World
    Join Date
    January 18th, 2005
    Posts
    2,201
    I use a file splitter for those big feeds.

    Walmart, Target and J & R Computers spring to mind offhand.

  13. #13
    Member ripe
    Join Date
    January 18th, 2005
    Posts
    141
    Or use Cygwin, a Linux-like environment for Windows (cygwin.com), and the standard Unix command 'split' to split the file.
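
    For anyone without Cygwin or a dedicated splitter, the same idea can be sketched in a few lines of Python. This is only a rough sketch under some assumptions: the feed is a text file with one record per line, the first line is a header row worth repeating in every chunk, and the function name, output file names, and the 65,000-row default are made up for illustration.

```python
import os


def split_feed(path, lines_per_chunk=65000, out_dir="."):
    """Split a text feed into numbered chunks, repeating the header in each.

    Assumes one record per line and a header on the first line.
    Returns the list of chunk file paths that were written.
    """
    base = os.path.splitext(os.path.basename(path))[0]
    chunk_paths = []
    out = None
    rows_in_chunk = 0
    # newline="" keeps the original line endings untouched.
    with open(path, "r", encoding="utf-8", errors="replace", newline="") as src:
        header = src.readline()
        for line in src:
            # Open a new chunk when none is open yet or the current one is full.
            if out is None or rows_in_chunk >= lines_per_chunk:
                if out is not None:
                    out.close()
                name = f"{base}_part{len(chunk_paths) + 1:03d}.txt"
                chunk_path = os.path.join(out_dir, name)
                out = open(chunk_path, "w", encoding="utf-8", newline="")
                out.write(header)
                chunk_paths.append(chunk_path)
                rows_in_chunk = 0
            out.write(line)
            rows_in_chunk += 1
    if out is not None:
        out.close()
    return chunk_paths
```

    The file is read one line at a time, so the whole feed never has to fit in memory. Each chunk holds the header plus at most lines_per_chunk records, which with the default stays under the Excel line limit mentioned in the first post.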

  14. #14
    ABW Ambassador Paul_Ward
    Join Date
    January 18th, 2005
    Location
    Cambridgeshire, England
    Posts
    1,573
    I tried Access 2000, but it wouldn't open any of it because the file was too big.

    I've downloaded HJSplit, a file splitter, and it has chopped the feed into more manageable chunks that I can at least work on. It's far from ideal, but it will have to do for now.

  15. #15
    ABW Ambassador buy_online
    Join Date
    January 18th, 2005
    Location
    Richmond, VA
    Posts
    3,234
    Quote Originally Posted by FourthWorld
    All the more reason for me to hurry with v3.0 to put that table viewer in.

    Fred, what's the largest data feed you use, in terms of number of records?
    I think the largest I've ever used was 135,000 records, which, by the way, I opened and "fixed up" in Access 2000. It took a while to load, and it wanted to create its own file that was five times larger than the original data feed file, but it did open.

    One has to remember that many of us like to see the information laid out on our desktop just as it is in Excel (columns and rows), so that we can do search and replace on certain fields only. For example, I might want to remove or add spaces, commas, or something else in the records of the "keywords" field only. That's pretty tough to do in a text editor. Simply put, many of us have secrets about how we "massage" a data feed file, and it's done by working on certain records only. That's why a GUI like that is essential.
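
    For those comfortable with a little scripting, the "one field only" kind of search and replace can also be sketched in Python. This is a hedged illustration, not anyone's actual workflow: it assumes a tab-delimited feed whose first row holds the field names, and the function name and the example field "keywords" are stand-ins.

```python
import csv


def replace_in_field(src_path, dst_path, field, old, new, delimiter="\t"):
    """Copy a delimited feed, replacing text in one named column only.

    Assumes the first row holds the field names. Rows too short to
    contain the target column are copied through unchanged.
    """
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter=delimiter)
        writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
        header = next(reader)
        col = header.index(field)  # raises ValueError if the field is missing
        writer.writerow(header)
        for row in reader:
            if col < len(row):
                row[col] = row[col].replace(old, new)
            writer.writerow(row)
```

    A call such as replace_in_field("feed.txt", "feed_clean.txt", "keywords", ",", ", ") would add a space after each comma in the keywords column and leave every other column alone. It works row by row, so file size is not a concern, but it does not give you the columns-and-rows view that a spreadsheet does.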

    Having said all that, if I know what the data looks like and I can do a search and replace on the entire file, I'll use TextPad (there are many others), which can open very large files easily. I've even been known to take an entire field (column) out of Excel, run it through TextPad's very strong filter capability, put it back into Excel, save it as tab-delimited, and run it.

    A lot of people will say that's ridiculous, but it results in very unique content for my sites (based on the data feed file), content that someone with a PHP script could never achieve. It just depends on how much trouble you want to take. And no, that kind of feed doesn't get updated very often.

    How's that for more information than anyone really needed?

    Fred

  16. #16
    ABW Ambassador Paul_Ward
    Join Date
    January 18th, 2005
    Location
    Cambridgeshire, England
    Posts
    1,573
    Quote Originally Posted by buy_online
    How's that for more information than anyone really needed?

    Fred
    It was very useful to me, as I've started doing similar things (I think?) to customize my generated pages in as unique a way as I can while still keeping them easy to generate in quantity, which is what makes WebMerge worthwhile in the first place. It seems I'm on the right track if veterans are doing similar things.

    I'd much prefer to use Excel or a similar format to tinker with the feed, as I've discovered a lot of the work involves looking for similar but different text or punctuation in many places at once.

  17. #17
    Member
    Join Date
    January 18th, 2005
    Location
    Australia
    Posts
    118
    Fred, how much RAM does that PC have? I am about to get a new PC but the old one with 256 MB and Access 2003 won't import any XLS file more than a few hundred KB.

  18. #18
    ABW Ambassador buy_online
    Join Date
    January 18th, 2005
    Location
    Richmond, VA
    Posts
    3,234
    crm911, sorry for the delay - I missed your post.

    I had 512MB which worked just fair. I now have 1GB, and it works much better (but still time consuming). I am also using Access 2k, Win2k, and a 2k Intel Processor.

    Hope this helps.

    Fred

  19. #19
    Newbie
    Join Date
    August 2nd, 2006
    Posts
    6
    For splitting large files for Excel, we use TextPipe Standard from datamystic.com. It handles unlimited file sizes; you just tell it how many lines to split at. For Excel, use 65,535 or fewer, since a worksheet holds at most 65,536 rows.

  20. #20
    Member tsmgroup2
    Join Date
    January 18th, 2005
    Location
    New Providence, PA USA
    Posts
    155
    Here's another biggie
    Remember Tower Records???
    It's another Big one! Yikes.
    Haven't had time to use the file splitter yet, but I am sure it will work fine.

    Still having some comps with my current pc. Need to re-equip it soon.
    Thank God for dual cores, and quad-core units are coming out next year, gang!
    Look for 'em.
    They're hot, large, and very sweet in speed!

    bye!
    Mark (Satchel)
    Webmaster / Sales Manager
    www.tsmgroup2.biz

  21. #21
    SEO: A Specialty - Web Design: Slow or outsourced andbeyond
    Join Date
    June 18th, 2006
    Location
    The Call is coming from Inside the House!
    Posts
    1,332
    Dual Core CPUs just dropped a bunch in price in the last week or two. Not sure if they are available in a store near you...

    Price War between Intel and AMD. Also look at the Pentium 805 that overclocks like mad:

    http://www.tomshardware.com/2006/06/...720/index.html

    And it cost less than $200 for the chip...

    Good times if you must renew your computer. Personally I can wait.

  22. #22
    Member tsmgroup2
    Join Date
    January 18th, 2005
    Location
    New Providence, PA USA
    Posts
    155
    Hey Richard.
    Do you know if either of the upcoming revisions of WM will handle the larger feeds better? Process them? What about processing the other formats more automatically? Can it be programmed to autodetect the format? Could it have a built-in self-cleaning feature, something like the feed cleaner already available to us or like MS Excel?
    Just a thought...
    Oh, by the way, I still love WM... even the way it is... Just wish WM had some marketing techniques, engine submission, or dynamic page creation built into it to aid in keeping the larger sites up to date easier....
    Too many thoughts, eh? sorry... just trying to look to the future...
    Thank you anyway, Rock on! Rich.
    Mark (Satchel)
    Webmaster / Sales Manager
    www.tsmgroup2.biz

  23. #23
    ABW Ambassador best123
    Join Date
    July 5th, 2006
    Posts
    571
    Quote Originally Posted by Paul Ward
    I've recently acquired WebMerge and am just getting to grips with it.

    One datafeed I have from a merchant is about 73 MB when unzipped. So far I've prepared my files in Excel before WebMerge builds the pages, but this file is so big that it hits Excel's limit on the number of lines, and Excel opens only about the first 65K lines. I don't want to use the whole feed, just certain sections, some of which sit further down the file than Excel's line limit reaches.

    Does anyone have any tips for how to wrestle this monster?
    I didn't read other people's posts here, but I think the best solution is to ask the feed provider for the option to get only chunks of the feed, usually by sub-category.

    If the feed provider can do this for all their feeds it will make our lives easier.
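
    Until a merchant offers that, pulling out just the wanted sections yourself can be sketched in Python. This is a minimal sketch, assuming a tab-delimited feed with a header row and a column that names the section; the function name and the "category" column used in the example call are hypothetical.

```python
import csv


def extract_sections(src_path, dst_path, field, wanted, delimiter="\t"):
    """Stream a feed and keep only rows whose `field` value is in `wanted`.

    Assumes the first row holds the field names. Returns the number of
    records written (not counting the header).
    """
    wanted = set(wanted)
    kept = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter=delimiter)
        writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
        header = next(reader)
        col = header.index(field)  # raises ValueError if the field is missing
        writer.writerow(header)
        for row in reader:
            # Rows too short to have the column are skipped.
            if col < len(row) and row[col] in wanted:
                writer.writerow(row)
                kept += 1
    return kept
```

    For example, extract_sections("feed.txt", "cameras.txt", "category", ["Cameras", "Lenses"]) would write a much smaller file containing only those two sections, wherever they sit in the original feed.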

  24. #24
    Affiliate Manager
    Join Date
    January 18th, 2005
    Location
    Los Angeles, California
    Posts
    1,913
    Best123 has a good suggestion, esp. given that Excel is very popular for pre-processing feeds and has much more severe limits than WebMerge.

    WebMerge's internal logic can handle feeds up to 4GB -- most folks will run out of memory long before WebMerge runs out of its ability to address it.

    The only other limit in WebMerge which can affect common usage is the built-in Sort feature in the Source tab. If you use that feature you'll want to make sure no record exceeds 64k in length. If any of your records do, you'll want to run your sort in MS Access or another database first.

    But beyond the Sort feature, WebMerge seems to hold up well with really large feeds -- provided of course you have the RAM to deal with it (assume the size of the feed X 2 plus a little overhead for the WebMerge engine).
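
    Both rules of thumb above can be checked before loading a feed. The Python sketch below is a rough illustration only: it is not part of WebMerge, it assumes one record per line, it reads "64k" as 65,536 bytes, and the RAM figure is just the feed size times two with no allowance for the engine overhead.

```python
import os


def check_feed(path, max_record_bytes=64 * 1024):
    """Return (estimated_ram_bytes, oversized_line_numbers) for a feed file.

    estimated_ram_bytes is the file size x 2 (engine overhead not included).
    oversized_line_numbers lists 1-based lines longer than max_record_bytes,
    not counting the line ending.
    """
    oversized = []
    # Binary mode so lengths are measured in bytes, one line at a time.
    with open(path, "rb") as src:
        for line_no, line in enumerate(src, start=1):
            if len(line.rstrip(b"\r\n")) > max_record_bytes:
                oversized.append(line_no)
    estimated_ram = os.path.getsize(path) * 2
    return estimated_ram, oversized
```

    For the 73 MB feed from the first post, the estimate works out to roughly 146 MB before overhead. An empty list of oversized lines suggests the built-in Sort should be safe under the assumptions above.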
    Richard Gaskin
    Developer of WebMerge: Publish any data feed on any site
    http://www.fourthworld.com

  25. #25
    Member tsmgroup2
    Join Date
    January 18th, 2005
    Location
    New Providence, PA USA
    Posts
    155
    Additional ways to wrestle with large feeds
    Quote Originally Posted by Paul Ward
    I've recently acquired WebMerge and am just getting to grips with it.

    One datafeed I have from a merchant is about 73 MB when unzipped. So far I've prepared my files in Excel before WebMerge builds the pages, but this file is so big that it hits Excel's limit on the number of lines, and Excel opens only about the first 65K lines. I don't want to use the whole feed, just certain sections, some of which sit further down the file than Excel's line limit reaches.

    Does anyone have any tips for how to wrestle this monster?
    Hello, everyone.

    Why, I found something unique that I think you will all agree works very well.
    Using WinZip or one of three other zipping programs lets you bypass Excel altogether, unless you find you really need feed cleaning of some kind. There is a feed cleaner on this forum, which I am sure someone will be able to point you to.

    I personally had complete success using WinZip to unzip the Tower Records feed into .txt format, and WM had no problem working with it at all. Richard is correct about memory, though: invest a small fortune in getting your systems up to 2 GB of onboard RAM as quickly as you can. It's worth the investment to keep your system running smoothly and capable of managing large feeds like this with ease.

    The newer dual- and quad-core motherboards, which handle even larger amounts of onboard RAM, will make it even easier. I wonder if someone manufactures an 8-core or 16-core motherboard yet?

    Hope this helps.

    Say, on another note, during page creation or product output, does anyone know what script to use or how to lay out WebMerge's PHP to get a side-by-side product layout on a page? For example, two or more columns on the same page? Thank you.
    Mark (Satchel)
    Webmaster / Sales Manager
    www.tsmgroup2.biz
