You are here:  » Best way to import 2 or more XML feeds into a MySQL database?


Best way to import 2 or more XML feeds into a MySQL database?

Submitted by jimpannell on Wed, 2009-11-18 08:42 in

Hi there

Does anyone know how best to go about importing two or more XML feeds into the same MySQL database? I'll be needing to be able to do this every day via cron.

The challenges I see are as follows:

  • Do I need to trash the database before every import or does Magic Parser recognise changes to existing items in the database and update only those (probably not I suspect as it only parses the feeds, nothing else)?
  • How do I allocate the same primary index ID to each item so that even if I've trashed the database, each item keeps the same ID?

Is this something that anyone out there has done before? If so, is there any chance you'd care to share your code?

Kind regards

Jim

Submitted by support on Wed, 2009-11-18 10:26

Hello Jim,

As there's no link between Magic Parser and the database it's all down to your application as to how this is handled.

In order to maintain an ID to item association, your application will need to check to see if the item is already in the database and then UPDATE if so, otherwise INSERT. An example of how this might look in your record handler function would be as follows:

<?php
  
function myRecordHandler($record)
  {
    
$sql "SELECT id FROM items WHERE name='".mysql_escape_string($record["NAME"])."'";
    if (
mysql_num_rows(mysql_query($sql))
    {
      
$sql "UPDATE items SET
                foo='"
.mysql_escape_string($record["FOO"])."',
                bar='"
.mysql_escape_string($record["BAR"])."'
                WHERE
                name='"
.mysql_escape_string($record["NAME"])."'
                "
;
    }
    else
    {
      
$sql "INSERT INTO items SET
                name='"
.mysql_escape_string($record["NAME"])."',
                foo='"
.mysql_escape_string($record["FOO"])."',
                bar='"
.mysql_escape_string($record["BAR"])."'
                "
;
    }
    
mysql_query($sql);
  }
?>

Hope this helps!
Cheers,
David.

Submitted by jimpannell on Wed, 2009-11-18 11:09

Hi David

Thanks for the speedy reply!

Okay - so I don't need to trash existing entries - I can update them. Nice. I guess I'll need to identify a field in the feed that is unique for every item and then check that against the items in my database (you used $record["NAME"] in your example).

I'll have a go and get back to you if I have any problems!

Cheers

Jim

Submitted by jimpannell on Wed, 2009-11-18 20:20

Hi again David

I've got it working really well for my first feed and am very pleased.

Working on the second feed now which doesn't have a unique ID in it as an item, but there is one contained in the url of an image in each record.

See below for an example:

http://my.fictional-feed.com/PictureResizer.ASP?PropertyID=1639586&amp;PhotoID=1&amp;AgentID=782&amp;BranchID=1318&amp;width=200&amp;rotation=0

Although there are around 5 photos for each record in the feed, I thought of using the following to pull the PropertyID out of the urls:

print "PropertyID: ".substr($record["PHOTOS/PHOTO/URL"],58,7)."<br />";

(this is just for displaying currently rather than inserting into mysql)

This gives me "1639586" for the first record, but subsequent records give me "42&Phot" and similar (always the same position). Any idea how I'd correct this?

Cheers for now

Jim

Submitted by support on Wed, 2009-11-18 20:30

Hi Jim,

It's probably because the URL length and also length of the ID will be changing. Have a go at basing on the position of the surrounding strings, something like this:

$posA = strpos($record["PHOTOS/PHOTO/URL"],"PropertyID=")+11;
$posB = strpos($record["PHOTOS/PHOTO/URL"],"&amp;",$posA);
$PropertyID = substr($record["PHOTOS/PHOTO/URL"],$posA,($posB-$posA));
print "PropertyID: ".$PropertyID."<br />";\

Hope this helps!
Cheers,
David.

Submitted by jimpannell on Wed, 2009-11-18 21:00

David - you're a LEGEND! It works a treat.

Many thanks

Jim :-)