Splitting a parsing job

Submitted by formmailer on Tue, 2010-08-31 21:21

Hi David,

I need to import a huge feed (128691 records) into my MySQL database. This operation takes quite some time, and I have made sure that PHP isn't timing out, but I would prefer multiple runs that each process, let's say, 2000 records at a time.
The feed is in CSV format.

Would it be possible for Magic Parser to split the job into multiple runs?

Or would it be better to download the feed to my server and split the file before running Magic Parser? If so, do you have any suggestions on how to accomplish this?

Your help is much appreciated!

Thanks in advance,
Jasper

Submitted by support on Wed, 2010-09-01 07:46

Hello Jasper,

No problem - there's an optional 4th parameter in the CSV format string that you can use to skip a number of lines before parsing begins, for example:

csv|44|1|0|2000

...would begin parsing at the 2001st line of the file (the parameter is the number of lines to skip).

Of course, there are various ways you could use this to split the parsing job. The simplest, I think, is an input parameter to your script in which you pass the skip value, with the script coded to abort after 2000 records. You would then call it using, for example:

/yourscript.php?start=0
/yourscript.php?start=2000
/yourscript.php?start=4000
etc.
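The receiving end of those calls can be sketched roughly as follows. The feed URL and the "start" parameter name are just placeholders for this example; MagicParser_parse() is the library's normal entry point, and the call is guarded so the sketch doesn't break where the library isn't loaded:

```php
<?php
// Build the CSV format string for a given batch: 44 is the comma
// separator code, and the 4th parameter is the number of lines to
// skip before parsing begins.
function buildFormat($start)
{
  return "csv|44|1|0|" . (int)$start;
}

// Read the skip value from the query string, defaulting to 0
$start = isset($_GET["start"]) ? (int)$_GET["start"] : 0;

// Hand the batch to Magic Parser (URL is a placeholder)
if (function_exists("MagicParser_parse"))
{
  MagicParser_parse("http://www.example.com/feed.csv",
                    "myRecordHandler",
                    buildFormat($start));
}
?>
```

So a call with start=2000 would parse with the format string "csv|44|1|0|2000", exactly as above.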

Then in your myRecordHandler() function, return TRUE after processing the number of records you wish to process in each batch, which can be done using a counter variable - for example:

  $count = 0;
  function myRecordHandler($record)
  {
    global $count;
    // process record as normal here
    // ..
    // abort by returning TRUE once the batch limit is reached
    $count++;
    if ($count >= 2000) return TRUE;
    return FALSE;
  }
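If you wanted to automate the whole import rather than call each batch by hand, a small driver script could request each slice in turn. This is only a sketch - the batchStarts() helper is a name I've made up for illustration, the URL is a placeholder, and the actual requests are switched off by default:

```php
<?php
// Compute the skip value for each batch: 0, 2000, 4000, ... up to
// but not beyond the 128691-record total from the thread.
function batchStarts($total, $batch)
{
  $starts = array();
  for ($start = 0; $start < $total; $start += $batch)
  {
    $starts[] = $start;
  }
  return $starts;
}

// Set to true to actually fire the HTTP requests
$runBatches = false;

if ($runBatches)
{
  foreach (batchStarts(128691, 2000) as $start)
  {
    // replace example.com with wherever yourscript.php is hosted;
    // @ suppresses the warning if a batch URL is unreachable
    @file_get_contents("http://www.example.com/yourscript.php?start=" . $start);
  }
}
?>
```

For a 128691-record feed at 2000 records per batch, that works out to 65 runs in total.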

Hope this helps!
Cheers,
David.

Submitted by formmailer on Wed, 2010-09-01 08:54

Once again, a big thank you for your excellent support!
I think the above will solve my problem/concerns...

//Jasper