  1. #1
    Lone Ranger muddyboots's Avatar
    Join Date
    March 11th, 2005
    Location
    Asheville, NC
    Posts
    219
    Any known file size limitations?
    I've been using Webmerge for a year. I've created millions of pages. When I've tried larger source files (over 100,000 records -- but I don't know where the breaking point is between working and not working) I get ...

    MetaCard engine for Win32 has encountered a problem and needs to close. We are sorry for the inconvenience.

    It's happened on two different computers, both running Windows XP Professional. Any known issues, problems, solutions?
    Dennis Duffy
    Slavin' over a hot keyboard for nickels & dimes ... and nobody understands what I do.

  2. #2
    Resident Genius and Staunch Capitalist Leader's Avatar
    Join Date
    January 18th, 2005
    Location
    Florida
    Posts
    12,817
    Hopefully this will help while you're waiting for a response from Richard.

    The only time I had anything like that, it was something wrong with one of the records in my consolidated feed.

    I never did find out what exactly was wrong with the feed, but by watching WM's progress closely (that little box where it says the number of records processed), I was able to get the approximate record number that it was crashing on.

    Then I just used a huge-file editor to delete the suspect group of records from the feed. After a couple of "surgeries" it went through.

    Richard might be able to pinpoint the exact issue if he saw the feed, but understandably, he doesn't want his email blown up with a mega feed file (I think he asks for attachments to be less than 2 MB...not enough for a 100,000+ product feed!). Best to get in touch with him as to how to get it to him.
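The "surgery" described above can also be done without a huge-file editor. Here is a minimal sketch in Python, assuming the feed has one record per line and a header row on line 1. The function name `drop_records` and its parameters are made up for illustration; this is not a WebMerge feature.

```python
def drop_records(src_path, dst_path, first_bad, last_bad, has_header=True):
    """Copy src_path to dst_path, skipping records first_bad..last_bad.

    Record numbers are 1-based and inclusive, and do not count the header.
    The file is streamed in binary mode, so it never has to fit in memory
    and odd bytes in the feed are copied through untouched.
    Returns the number of records dropped.
    """
    dropped = 0
    record_no = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line_no, line in enumerate(src):
            if has_header and line_no == 0:
                dst.write(line)          # keep the header row
                continue
            record_no += 1
            if first_bad <= record_no <= last_bad:
                dropped += 1             # skip the suspect record
                continue
            dst.write(line)
    return dropped
```

For example, `drop_records("feed.txt", "feed_fixed.txt", 48210, 48230)` would write a copy of a hypothetical `feed.txt` without records 48210 through 48230.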
    There is no knowledge that is not power. ~Hemingway

  3. #3
    Lone Ranger muddyboots's Avatar
    Join Date
    March 11th, 2005
    Location
    Asheville, NC
    Posts
    219
    Leader ...

    Thanks. It's not a show stopper. I've broken my file into smaller pieces and run each separately. That's the same thing I did when I first encountered the problem about 6 weeks ago. The first time it happened to me I just worked around it and figured, "well let's see if that happens again."
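Breaking a feed into smaller pieces can be scripted. The following is only a sketch, assuming one record per line and an optional header row; `split_feed` and the `.partNNN` naming are hypothetical choices, not anything WebMerge provides.

```python
def split_feed(src_path, records_per_file, has_header=True):
    """Split a line-per-record feed into numbered part files.

    Each part repeats the header row (if any) so it can be processed
    on its own. Returns the list of part file paths that were written.
    """
    parts = []
    out = None
    header = b""
    count = 0
    with open(src_path, "rb") as src:
        if has_header:
            header = src.readline()
        for line in src:
            # Start a new part at the beginning and whenever one fills up.
            if out is None or count == records_per_file:
                if out is not None:
                    out.close()
                part_path = f"{src_path}.part{len(parts) + 1:03d}"
                out = open(part_path, "wb")
                out.write(header)
                parts.append(part_path)
                count = 0
            out.write(line)
            count += 1
    if out is not None:
        out.close()
    return parts
```

Calling `split_feed("feed.txt", 50000)` on a hypothetical `feed.txt` would produce `feed.txt.part001`, `feed.txt.part002`, and so on, each with at most 50,000 records.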

    I have had another similar situation to what you describe. WM gets to a certain record and believes that it's the end of file when it's not. I've edited and removed that record number, the one before and the one after. So I'm familiar with the situation you've described. This one's a bit different.

    But I'll follow up with Richard and send the settings file, the template and a sample of the data file, along with the dimensions of the file that failed.

    Thanks again.
    Dennis Duffy
    Slavin' over a hot keyboard for nickels & dimes ... and nobody understands what I do.

  4. #4
    ABW Veteran Mr. Sal's Avatar
    Join Date
    January 18th, 2005
    Posts
    6,795
    I've been using Webmerge for a year. I've created millions of pages. When I've tried larger source files (over 100,000 records -- but I don't know where the breaking point is between working and not working) I get ...
    over 100,000 records?

    I'm sorry, but I can't even imagine myself ever doing 10K pages for any merchant, so when I saw that you had created millions of pages and that you have tried larger source files (over 100,000 records), I am no longer feeling that good about the datafeeds.

    Every time that I upload a new feed of less than 3k pages and every time that I re-index a site with less than 10k pages, I waste a lot of time, so I can't even think of a 10k datafeed for one merchant, much less updating that feed on a weekly schedule.

    You have me worried about that # (over 100,000 records)

    Now, I don't know if I should just , or simply myself.


  5. #5
    Super Sh!t Stirrer SSanf's Avatar
    Join Date
    January 18th, 2005
    Posts
    9,944
    Uh....
    Ran into this problem lately. What do you do when you come to the end of the Excel spreadsheet? I have figured that I will have to do part 2, part 3 and so forth. But that creates some problems with sorting and the like.
    Comments are opinion unless otherwise noted. Remember, pillage first. Then burn. Half of all people in the world have IQs under 100. You best learn to trust ol' SSanf!

  6. #6
    Full Member heisje's Avatar
    Join Date
    January 18th, 2005
    Posts
    314
    .

    by the way, have not seen richard for a while here - is he ok?


    heisje

    .

  7. #7
    Affiliate Manager
    Join Date
    January 18th, 2005
    Location
    Los Angeles, California
    Posts
    1,913
    Muddyboots -- how much RAM is installed on that machine? It may be just a memory error.

    Heisje -- I'm fine. Just got back from a programming conference in Monterey where I gave two talks and participated in two panels. It was a very busy few days, I had a good time, enjoyed the train ride back, and am happy to be home.
    Richard Gaskin
    Developer of WebMerge: Publish any data feed on any site
    http://www.fourthworld.com

  8. #8
    Affiliate Manager
    Join Date
    January 18th, 2005
    Location
    Los Angeles, California
    Posts
    1,913
    PS: to answer the question which is the title of this thread, the only known limitations for file sizes are:

    - A file must be smaller than 4GB
    - If using the Sort option no line in the file can exceed 65,535 characters.

    I'm not aware of any other limitations.
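The two limits listed above can be checked before a run. This is a rough sketch, not part of WebMerge: `check_feed_limits` is a hypothetical helper, "4GB" is read here as 4 GiB, and line length is measured in bytes as a stand-in for characters (exact only for single-byte encodings).

```python
import os

MAX_FILE_BYTES = 4 * 1024**3    # "smaller than 4GB", assumed to mean 4 GiB
MAX_SORT_LINE_CHARS = 65_535    # per-line limit when the Sort option is used


def check_feed_limits(path):
    """Return a list of problems found; an empty list means none."""
    problems = []
    if os.path.getsize(path) >= MAX_FILE_BYTES:
        problems.append("file is 4 GB or larger")
    with open(path, "rb") as f:
        for line_no, line in enumerate(f, start=1):
            # Do not count the line ending itself.
            length = len(line.rstrip(b"\r\n"))
            if length > MAX_SORT_LINE_CHARS:
                problems.append(
                    f"line {line_no} has {length} characters "
                    f"(Sort limit is {MAX_SORT_LINE_CHARS})"
                )
    return problems
```

The line-length check only matters if the Sort option is switched on.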
    Richard Gaskin
    Developer of WebMerge: Publish any data feed on any site
    http://www.fourthworld.com

  9. #9
    Lone Ranger muddyboots's Avatar
    Join Date
    March 11th, 2005
    Location
    Asheville, NC
    Posts
    219
    Quote Originally Posted by FourthWorld
    Muddyboots -- how much RAM is installed on that machine? It may be just a memory error.
    It had 1GB of memory.

    I'll send machine specs, the template, the settings file and a small sample of the data file. It's a relatively small record length too.

    Thanks.
    Dennis Duffy
    Slavin' over a hot keyboard for nickels & dimes ... and nobody understands what I do.

  10. #10
    Resident Genius and Staunch Capitalist Leader's Avatar
    Join Date
    January 18th, 2005
    Location
    Florida
    Posts
    12,817
    What do you do when you come to the end of the Excel spread sheet?
    All file lengths bow to MySQL.

    "Just" change what you want to w/MySQL, save the data back to a regular file, and run that file through WM. ("Just" is in quotes because that does require roting in enough MySQL commands to make it do what you want. How easy that is, is in the eye of the beholder...)

    Once WM is w*rking on its part, make some cookies or watch a couple of TV shows, then it should be ready for "zippin' and shippin'" to your site!
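As a rough illustration of the load, change, and save-back idea, here is a sketch that sorts a feed through a database. It uses Python's built-in SQLite module as a stand-in for MySQL, purely so the example is self-contained. It assumes a tab-delimited feed with a header row and the same number of fields on every line; `sort_feed` is a hypothetical name.

```python
import sqlite3


def sort_feed(src_path, dst_path, sort_column):
    """Sort a tab-delimited feed by one column and write it back out.

    Values are stored and compared as text, so "10" sorts before "9".
    A line with a different number of fields than the header raises an error.
    """
    db = sqlite3.connect(":memory:")   # pass a file path instead for very large feeds
    with open(src_path, encoding="utf-8", newline="") as src:
        header = src.readline().rstrip("\r\n").split("\t")
        # Generic column names c0, c1, ... avoid problems with odd header text.
        cols = ", ".join(f"c{i}" for i in range(len(header)))
        marks = ", ".join("?" * len(header))
        db.execute(f"CREATE TABLE feed ({cols})")
        rows = (line.rstrip("\r\n").split("\t") for line in src if line.strip())
        db.executemany(f"INSERT INTO feed VALUES ({marks})", rows)
    key = f"c{header.index(sort_column)}"
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write("\t".join(header) + "\n")
        for row in db.execute(f"SELECT * FROM feed ORDER BY {key}"):
            dst.write("\t".join(row) + "\n")
    db.close()
```

The sorted file can then be run through WM like any other feed.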
    There is no knowledge that is not power. ~Hemingway

  11. #11
    Full Member heisje's Avatar
    Join Date
    January 18th, 2005
    Posts
    314
    .

    watch a couple of TV shows
    more than a couple . . .


    heisje


    .

  12. #12
    Full Member heisje's Avatar
    Join Date
    January 18th, 2005
    Posts
    314
    .


    It was a very busy few days, I had a good time,

    what are you hiding from us, richard gaskin???


    heisje


    .

  13. #13
    Affiliate Manager
    Join Date
    January 18th, 2005
    Location
    Los Angeles, California
    Posts
    1,913
    No, not hiding at all. As I mentioned I was travelling, first to a conference then for a quick retreat camping in the desert. Refreshed now, and back in action.

    As for the processing time, I'd be interested in learning if it took more than one sitcom's length of time to process those, given that most pages are processed in less than a millisecond or two each.

    And for the premature "end of file", I would double-check the raw data in Word or other tool that will open large files. I suspect there may be a bad character in the feed, or the delimiter settings in the Sources tab aren't set appropriately for the data's format.
    Richard Gaskin
    Developer of WebMerge: Publish any data feed on any site
    http://www.fourthworld.com

  14. #14
    Lone Ranger muddyboots's Avatar
    Join Date
    March 11th, 2005
    Location
    Asheville, NC
    Posts
    219
    I had originally posted this problem ...

    Quote Originally Posted by muddyboots

    MetaCard engine for Win32 has encountered a problem and needs to close. We are sorry for the inconvenience.
    And then in a subsequent exchange in this thread I referred to this one ...

    Quote Originally Posted by muddyboots
    I have had another similar situation to what you describe. WM gets to a certain record and believes that it's the end of file when it's not. I've edited and removed that record number, the one before and the one after.
    Thanks again.
    And earlier this morning I figured out the problem (and it's my mistake). I had set "Text values are in quotes" when they were not. I have done this by habit because sometimes I am processing data with text values in quotes. I ran the process without the text values in quotes box checked and everything worked fine. This appears to be the source of both of these problems for me.
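To show how a quote setting that doesn't match the data can make records seem to vanish, here is a small sketch using Python's `csv` module. This is only an analogy: it is not WebMerge's parser, and the sample feed is invented.

```python
import csv
import io


def count_records(text, values_in_quotes):
    """Count records in tab-delimited text, with or without quote handling."""
    quoting = csv.QUOTE_MINIMAL if values_in_quotes else csv.QUOTE_NONE
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="\t", quoting=quoting)
    return sum(1 for _ in reader)


# Hypothetical unquoted feed in which one value happens to start with a quote mark.
feed = 'id\tname\n1\t"Deluxe widget\n2\tplain widget\n3\tother widget\n'

# With quote handling off, every line is its own record.
print(count_records(feed, values_in_quotes=False))

# With quote handling on, the stray quote opens a quoted value that never
# closes, so the lines after it are absorbed into a single record.
print(count_records(feed, values_in_quotes=True))
```

In this parser, the second count comes out lower than the first, which looks a lot like a premature end of file.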

    Quote Originally Posted by FourthWorld

    And for the premature "end of file", I would double-check the raw data in Word or other tool that will open large files. I suspect there may be a bad character in the feed, or the delimiter settings in the Sources tab aren't set appropriately for the data's format.
    You're right -- I just coincidentally found the solution about an hour before your post. Thanks for the help.

    Quote Originally Posted by FourthWorld
    As for the processing time, I'd be interested in learning if it took more than one sitcom's length of time to process those, given that most pages are processed in less than a millisecond or two each.
    I processed about 350,000 records in one run and it took right around 30 minutes. I'm running a Pentium 4 3 GHz processor with 2 GB of ram and about 175 GB of free disk space.
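For anyone who wants those figures as a rate, the arithmetic is below. It uses only the two numbers quoted above; both were approximate.

```python
records = 350_000
minutes = 30

seconds = minutes * 60
records_per_second = records / seconds
ms_per_record = seconds * 1000 / records

print(f"{records_per_second:.0f} records/second")  # 194 records/second
print(f"{ms_per_record:.1f} ms per record")        # 5.1 ms per record
```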

    Cheers. Glad to have figured that out (user error).
    Dennis Duffy
    Slavin' over a hot keyboard for nickels & dimes ... and nobody understands what I do.

  15. #15
    Affiliate Manager
    Join Date
    January 18th, 2005
    Location
    Los Angeles, California
    Posts
    1,913
    Quote Originally Posted by muddyboots
    I figured out the problem (and it's my mistake). I had set "Text values are in quotes" when they were not. I have done this by habit because sometimes I am processing data with text values in quotes. I ran the process without the text values in quotes box checked and everything worked fine. This appears to be the source of both of these problems for me.
    Yes, earlier versions of WebMerge were somewhat imprecise in applying the quote-parsing option; v2.4 requires you to be more specific but does a much more reliable job with complex feeds and is also about 50% faster in that parsing.

    Quote Originally Posted by muddyboots
    I processed about 350,000 records in one run and it took right around 30 minutes.
    So cool -- thanks for posting that. I love a good performance story -- any chance you'd be interested in posting that with a link to your site on our Gallery page?:
    http://www.fourthworld.com/products/...e/gallery.html
    Richard Gaskin
    Developer of WebMerge: Publish any data feed on any site
    http://www.fourthworld.com

  16. #16
    Lone Ranger muddyboots's Avatar
    Join Date
    March 11th, 2005
    Location
    Asheville, NC
    Posts
    219
    Richard ...

    Sure. I'd be delighted to. Do you want to send me a PM with your e-mail address? I'll write a blurb using your existing gallery as a guideline.
    Dennis Duffy
    Slavin' over a hot keyboard for nickels & dimes ... and nobody understands what I do.

  17. #17
    Affiliate Manager
    Join Date
    January 18th, 2005
    Location
    Los Angeles, California
    Posts
    1,913
    No need -- we have a handy form for Gallery submissions:
    http://www.fourthworld.com/products/...lery_form.html

    Thanks!
    Richard Gaskin
    Developer of WebMerge: Publish any data feed on any site
    http://www.fourthworld.com

  18. #18
    Full Member heisje's Avatar
    Join Date
    January 18th, 2005
    Posts
    314
    .


    I processed about 350,000 records in one run and it took right around 30 minutes.

    apparently not the "average user experience".
    and yes, possible: if your template workload is featherlight . . .

    heisje

    .

  19. #19
    Affiliate Manager
    Join Date
    January 18th, 2005
    Location
    Los Angeles, California
    Posts
    1,913
    Quote Originally Posted by heisje
    .yes, possible: if your template workload is featherlight . . .
    Even complex templates may not take significantly longer. In general WebMerge scales well, which is to say that the processing time for smaller data sets can often be used to extrapolate performance for larger ones (as long as there's sufficient physical memory available, of course).

    For example, on my modest computer it takes WM 380 milliseconds to process the 30 records in the Adam's Diary example, and 6.2 seconds to process the 530 records in the Congress Contact example. That gives us a size ratio of 0.056 and a performance ratio of 0.06 -- pretty close match, even though the Congress Contact templates contain many more tags.
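The ratios above can be reproduced as follows, along with the kind of linear extrapolation being described. The record counts and timings come from the paragraph above; `extrapolate_seconds` is just an illustrative helper that assumes time grows in direct proportion to record count.

```python
small_records, small_seconds = 30, 0.380    # Adam's Diary example
large_records, large_seconds = 530, 6.2     # Congress Contact example

size_ratio = small_records / large_records
time_ratio = small_seconds / large_seconds

print(f"size ratio: {size_ratio:.3f}")   # size ratio: 0.057
print(f"time ratio: {time_ratio:.3f}")   # time ratio: 0.061


def extrapolate_seconds(sample_records, sample_seconds, target_records):
    """Estimate run time for target_records, assuming linear scaling."""
    return sample_seconds * target_records / sample_records


# Predicting the larger run from the smaller one gives about 6.7 seconds,
# against the 6.2 seconds actually measured.
print(f"{extrapolate_seconds(small_records, small_seconds, large_records):.1f} s")
```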

    As long as there's at least twice as much physical RAM available as the file size you should see a similarly linear scaling of performance; it may be slightly slower with really large files, but not radically so.

    The key to doing performance benchmarking is to make sure you're comparing files of the same format, since CSV files will take longer in the "preprocessing" phase than tab-delimited (thanks to the notorious inefficiencies of the CSV format).

    And of course the templates themselves will affect outcomes, so test with the same ones. That said, I've yet to see a template complex enough to require more than a second to process, and most take a very small fraction of that.

    If you run across any templates that take longer than half a second please pass 'em along and I'll see what I can do to further optimize things.
    Richard Gaskin
    Developer of WebMerge: Publish any data feed on any site
    http://www.fourthworld.com

  20. #20
    Lone Ranger muddyboots's Avatar
    Join Date
    March 11th, 2005
    Location
    Asheville, NC
    Posts
    219
    If one is using SSI the template size can be managed down, thus streamlining the processing.
    Dennis Duffy
    Slavin' over a hot keyboard for nickels & dimes ... and nobody understands what I do.

  21. #21
    Full Member heisje's Avatar
    Join Date
    January 18th, 2005
    Posts
    314
    .


    if a demanding template instructs 30 processes, for example, as compared to 5 for a less demanding one, how can the time requirement be similar?

    my own experience indicates that the time requirement is not similar or even linear, it seems to me rather exponential.

    but then, it can only be me . . .

    heisje

    .

  22. #22
    Affiliate Manager
    Join Date
    January 18th, 2005
    Location
    Los Angeles, California
    Posts
    1,913
    No, there's definitely an increase with template complexity, and some tags (like multi-value IFs) take more time than others.

    The linear increase I was referring to was for the number of records in the feed.
    Richard Gaskin
    Developer of WebMerge: Publish any data feed on any site
    http://www.fourthworld.com
