  1. #1
    Member
    Join Date
    November 3rd, 2009
    Posts
    50
    Dealing with large datafeed files in PHP
    I have used the file() function to load the contents of a file into an array many times, but always with rather small files.

    Now I'm building a new datafeed system, and large datafeeds (I'm testing with a 120 MB datafeed file) cause errors when opened with that function, even with the PHP memory limit set to 256 MB. The PHP docs page says the file() function shouldn't be used with files bigger than 10 MB.

    I have tried fgets() to read one line at a time, but it's extremely slow (moreover, you can't start from a specific line, so you have to loop through the entire file every time, assuming you're not importing the whole feed at once).

    Do you know of any faster alternative?

    Another solution would be to split the big file into smaller ones by reading it once with fgets(); later I could process each smaller file with file('filename.txt'), which would be significantly faster. I'm just wondering if there's something smarter/more efficient that I'm not aware of.
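    A minimal sketch of the line-by-line approach (the feed path and the resume logic are just assumptions for illustration): fgets() streams the feed so the whole file never sits in memory, and saving the byte offset with ftell() lets a later run jump back with fseek() instead of re-reading from line one.
    Code:
    <?php
    // Sketch only: stream a large feed line by line instead of file().
    // The feed path and the saved offset are hypothetical.
    $feed = 'datafeed.txt';
    $resumeFrom = 0;                       // byte offset saved from a previous run, if any

    $fh = fopen($feed, 'r');
    if ($fh === false) {
        die("Cannot open $feed");
    }
    if ($resumeFrom > 0) {
        fseek($fh, $resumeFrom);           // jump straight to where the last run stopped
    }
    while (($line = fgets($fh)) !== false) {
        // process one record here without loading the whole file into memory
        $resumeFrom = ftell($fh);          // position to resume from next time
    }
    fclose($fh);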

  2. #2
    Full Member iolaire's Avatar
    Join Date
    October 3rd, 2006
    Location
    Arlington, VA
    Posts
    229
    What type of file is it? If it's not XML, are you using some sort of CSV parser?

    If you use a parsing function, it should handle the sequential read automatically. Take a look at the sample code for fgetcsv(), which is built into PHP: http://php.net/manual/en/function.fgetcsv.php. Example #1 shows it looping through a comma-separated file.
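    Roughly what that manual example does, trimmed down (the file name is a placeholder): fgetcsv() reads one row per call, so memory use stays flat no matter how big the feed is.
    Code:
    <?php
    // Sketch in the spirit of example #1 on the fgetcsv manual page.
    $row = 0;
    if (($handle = fopen('datafeed.csv', 'r')) !== false) {
        while (($data = fgetcsv($handle, 0, ',')) !== false) {
            $row++;
            // $data is an array of the fields in this row
            echo count($data) . " fields in row $row\n";
        }
        fclose($handle);
    }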

    Another option is to skip dealing with the file in PHP entirely and just have PHP tell MySQL to run a LOAD DATA INFILE command, i.e.
    Code:
    "LOAD DATA INFILE '\volume\available\to\mysql\php\user\on\server\filename' INTO TABLE `Data_Temp` FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\n' IGNORE 1 LINES ;"

  3. #3
    Member
    Join Date
    November 3rd, 2009
    Posts
    50
    I was wrong: fgets() is fast enough and handles 3 GB datafeeds with no problems.
    The file is a CSV; I'm using fgets() + explode(). I didn't know about the fgetcsv() function.

    The slowness was caused by something else, but unfortunately I don't remember what (and I forgot to update this discussion until today).
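    For what it's worth, the main place a plain explode() breaks is a quoted field that contains the delimiter; fgetcsv()/str_getcsv() handle that case. A made-up row to show the difference:
    Code:
    <?php
    // Hypothetical row: the quoted field contains a comma.
    $line = '123,"Widget, large",19.99';

    print_r(explode(',', $line));
    // splits inside the quotes: 123 | "Widget |  large" | 19.99

    print_r(str_getcsv($line));
    // same parsing rules as fgetcsv: 123 | Widget, large | 19.99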

  4. #4
    Affiliate Manager ParadigmWilliam's Avatar
    Join Date
    September 23rd, 2007
    Posts
    364
    Any chance of breaking the file up? I had this problem too; I ended up splitting the file into sections and processing it one chunk at a time.
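    One way to do that split, if it's ever needed (file names and chunk size are just examples): read the feed once with fgets() and start a new output file every N lines.
    Code:
    <?php
    // Sketch: break a big feed into numbered chunk files of 50,000 lines each.
    $linesPerChunk = 50000;
    $in    = fopen('datafeed.txt', 'r');
    $out   = null;
    $num   = 0;
    $count = 0;

    while (($line = fgets($in)) !== false) {
        if ($count % $linesPerChunk === 0) {
            if ($out) {
                fclose($out);
            }
            $num++;
            $out = fopen("datafeed_part_$num.txt", 'w');
        }
        fwrite($out, $line);
        $count++;
    }
    if ($out) {
        fclose($out);
    }
    fclose($in);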
    [URL="http://www.manageaffiliatelinks.com/"][COLOR="Red"][B]Manage Affiliate Links[/B][/COLOR][/URL] - Redirect Dead, Expired, or Broken Links

    [URL="http://www.wpcoupon.com/"][COLOR="Blue"][B]WP Coupon[/B][/COLOR][/URL] - Turn Wordpress into a Coupon Site!

  5. #5
    Member
    Join Date
    November 3rd, 2009
    Posts
    50
    As I wrote, fgets() handles large files with no problems (I was wrong in the first post), so there seems to be no need to split the file into smaller ones.

    Moreover, I coded part of a script which needs each datafeed to be in a single file, so I hope I won't run into datafeeds so large that they need to be split....


