Hello,
I'm trying to parse XML files of around 200 MB, and for each XML $record I want to run an SQL query.
At the moment I'm getting a maximum execution time error (30 seconds), and for now I can't move to dedicated hosting, where I could raise the limit in php.ini.
I thought about splitting the XML file into smaller chunks, but my script can only split a small file into smaller files; on a large file it runs out of time.
I assume this must be fairly common? Any thoughts?
Could Magic Parser work on a per-$record basis?
I used something like this for an earlier XML parse. I don't want to keep using it, but it worked for me on the large files.
<?php
$file = 'house.xml';
$reader = new XMLReader();
$reader->open($file);
while ($reader->read())
{
    // are we in a house?
    if ($reader->nodeType == XMLReader::ELEMENT &&
        strtolower($reader->localName) == 'house')
    {
        $node = $reader->expand(); // expand the node into a DOMNode
        // Convert to SimpleXML via DOM, messy but SimpleXML is so much nicer.
        $dom = new DomDocument();
        $dom->appendChild( $dom->importNode($node, TRUE) );
        $sxl = simplexml_import_dom($dom);
        // then do what we want to do.
        processProduct($sxl);
        unset($node, $dom, $sxl);
    }
}
$reader->close();
unset($reader, $file);
?>
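One way to stay under the 30-second limit with the same XMLReader approach might be to stop shortly before the limit and resume on the next request. The following is only a rough sketch: processHouses(), $skip and $budgetSeconds are made-up names, and it assumes that skipping already-handled records is much cheaper than running the SQL for them.

```php
<?php
// Sketch: handle <house> records until a time budget runs out, and report
// how far we got so that a later request can skip the finished records.
// processHouses(), $skip and $budgetSeconds are hypothetical names.
function processHouses(XMLReader $reader, callable $handler, int $skip, float $budgetSeconds): array
{
    $start = microtime(true);
    $seen = 0;          // <house> elements skipped or handled so far
    $finished = true;   // set to false if we stop early

    while ($reader->read()) {
        if ($reader->nodeType !== XMLReader::ELEMENT
            || strtolower($reader->localName) !== 'house') {
            continue;
        }
        if ($seen < $skip) {
            $seen++;    // already handled in an earlier run
            continue;
        }
        if (microtime(true) - $start >= $budgetSeconds) {
            $finished = false;  // out of time, resume from $seen next run
            break;
        }
        // Same DOM -> SimpleXML conversion as in the code above.
        $dom = new DOMDocument();
        $dom->appendChild($dom->importNode($reader->expand(), true));
        $handler(simplexml_import_dom($dom));
        $seen++;
    }

    return ['done' => $seen, 'finished' => $finished];
}
```

The idea would be to call this with a budget comfortably below 30 seconds, store the returned 'done' value somewhere persistent (a session or a small table, for instance), and pass it back as $skip on the next request until 'finished' comes back true.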
Hi,
If the above code can complete within the 30 seconds, Magic Parser should too, provided that you pass a Format String as the 3rd (optional) parameter to MagicParser_parse(). Otherwise Magic Parser has to read the entire file twice: once to work out the format, and a second time to actually parse the records and hand each one to your myRecordHandler() function.
Are you currently using a Format String?
This is the string that describes which level of the XML you are interested in, for example:
xml|HOUSES/HOUSE/
...if your XML looked something like:
<houses>
<house>
... house info ...
</house>
<house>
... house info ...
</house>
<house>
... house info ...
</house>
</houses>
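For XML shaped like that, passing the Format String might look roughly like the sketch below. The file name house.xml is taken from the earlier code, while the location of MagicParser.php and the saveHouse() helper are assumptions for illustration only; the guards just let the script load even where the library is not installed.

```php
<?php
// Assumption: MagicParser.php sits in the same directory as this script.
if (is_file(__DIR__ . '/MagicParser.php')) {
    require __DIR__ . '/MagicParser.php';
}

// Hypothetical placeholder for the per-record SQL work.
function saveHouse(array $record): bool
{
    // e.g. bind values from $record to a prepared INSERT and execute it
    return count($record) > 0;
}

// Called once for each record that matches the Format String.
function myRecordHandler($record)
{
    saveHouse($record);
}

if (function_exists('MagicParser_parse')) {
    // The 3rd parameter is the Format String, so the format does not
    // have to be worked out by reading the file an extra time.
    MagicParser_parse('house.xml', 'myRecordHandler', 'xml|HOUSES/HOUSE/');
}
```

A quick print_r($record) inside myRecordHandler() on a small sample file is an easy way to see which keys each record actually contains before writing the SQL.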
If you're not sure, could you email me a link to your XML? I'll work out the Format String for you.
Cheers,
David.