Best way to import 2 or more XML feeds into a MySQL database?
Hi there
Does anyone know how best to go about importing two or more XML feeds into the same MySQL database? I'll be needing to be able to do this every day via cron.
The challenges I see are as follows:
- Do I need to trash the database before every import or does Magic Parser recognise changes to existing items in the database and update only those (probably not I suspect as it only parses the feeds, nothing else)?
- How do I allocate the same primary index ID to each item so that even if I've trashed the database, each item keeps the same ID?
Is this something that anyone out there has done before? If so, is there any chance you'd care to share your code?
Kind regards
Jim
Hi David
Thanks for the speedy reply!
Okay - so I don't need to trash existing entries - I can update them. Nice. I guess I'll need to identify a field in the feed that is unique for every item and then check that against the items in my database (you used $record["NAME"] in your example).
I'll have a go and get back to you if I have any problems!
Cheers
Jim
Hi again David
I've got it working really well for my first feed and am very pleased.
Working on the second feed now which doesn't have a unique ID in it as an item, but there is one contained in the url of an image in each record.
See below for an example:
http://my.fictional-feed.com/PictureResizer.ASP?PropertyID=1639586&PhotoID=1&AgentID=782&BranchID=1318&width=200&rotation=0Although there are around 5 photos for each record in the feed, I thought of using the following to pull the PropertyID out of the urls:
print "PropertyID: ".substr($record["PHOTOS/PHOTO/URL"],58,7)."<br />";(this is just for displaying currently rather than inserting into mysql)
This gives me "1639586" for the first record, but subsequent records give me "42&Phot" and similar (always the same position). Any idea how I'd correct this?
Cheers for now
Jim
Hi Jim,
It's probably because the URL length and also length of the ID will be changing. Have a go at basing on the position of the surrounding strings, something like this:
$posA = strpos($record["PHOTOS/PHOTO/URL"],"PropertyID=")+11;
$posB = strpos($record["PHOTOS/PHOTO/URL"],"&",$posA);
$PropertyID = substr($record["PHOTOS/PHOTO/URL"],$posA,($posB-$posA));
print "PropertyID: ".$PropertyID."<br />";\Hope this helps!
Cheers,
David.
David - you're a LEGEND! It works a treat.
Many thanks
Jim :-)
Hello Jim,
As there's no link between Magic Parser and the database it's all down to your application as to how this is handled.
In order to maintain an ID to item association, your application will need to check to see if the item is already in the database and then UPDATE if so, otherwise INSERT. An example of how this might look in your record handler function would be as follows:
<?phpfunction myRecordHandler($record)
{
$sql = "SELECT id FROM items WHERE name='".mysql_escape_string($record["NAME"])."'";
if (mysql_num_rows(mysql_query($sql))
{
$sql = "UPDATE items SET
foo='".mysql_escape_string($record["FOO"])."',
bar='".mysql_escape_string($record["BAR"])."'
WHERE
name='".mysql_escape_string($record["NAME"])."'
";
}
else
{
$sql = "INSERT INTO items SET
name='".mysql_escape_string($record["NAME"])."',
foo='".mysql_escape_string($record["FOO"])."',
bar='".mysql_escape_string($record["BAR"])."'
";
}
mysql_query($sql);
}
?>
Hope this helps!
Cheers,
David.