I have searched as best that I could for an answer on these forums and could not find this question answered.
I am trying to parse multiple rss feeds and insert them into a MySQL database, I can do this with all of the feeds individually, but i would like to create a page that will insert everything at once instead of multiple pages for the different feeds. You mention there is a way to do this but can you give an example of how?
<?php
require("MagicParser.php");
function myRecordHandler($record)
{
$query = "INSERT INTO deals_raw set
deal_title = '".$record["TITLE"]."',
deal_desc = '".$record["DESCRIPTION"]."',
deal_link = '".$record["LINK"]."'
";
$result = mysql_query($query) or die('This deal did not post... ');
}
MagicParser_parse("<<my link>>","myRecordHandler","xml|RSS/CHANNEL/ITEM/");
?>
One other unrelated question:
Is there a way to take the following code and split the img from the text?
<description>
<![CDATA[
<img src="http://images.tigerdirect.com/SKUimages/small/S222-5518-main02-am.jpg"/><br />
]]>
SAVE $1100 - FREE Shipping - Samsung 55" LED HDTV UN55B6000 w/ Samsung BD-P1600 Blu-Ray Player Bundle $1899.99 - after COUPON CODE: VCN853
</description>
Hi Mokono,
The basic approach would simply be to have one record handler for each different type of feed, and then multiple calls to MagicParser_parse() for each one (feeds with the same format can share the same record handler function). For example:
<?php
require("MagicParser.php");
function myRSSRecordHandler($record)
{
$query = "INSERT INTO deals_raw set
deal_title = '".$record["TITLE"]."',
deal_desc = '".$record["DESCRIPTION"]."',
deal_link = '".$record["LINK"]."'
";
$result = mysql_query($query) or die('This deal did not post... ');
}
function myAtomRecordHandler($record)
{
$query = "INSERT INTO deals_raw set
deal_title = '".$record["TITLE"]."',
deal_desc = '".$record["CONTENT"]."',
deal_link = '".$record["LINK-HREF"]."'
";
$result = mysql_query($query) or die('This deal did not post... ');
}
MagicParser_parse("<<my link1>>","myRSSRecordHandler","xml|RSS/CHANNEL/ITEM/");
MagicParser_parse("<<my link2>>","myRSSRecordHandler","xml|RSS/CHANNEL/ITEM/");
MagicParser_parse("<<my link3>>","myAtomRecordHandler","xml|FEED/ENTRY/");
?>
However, it might be better in practice to only have one section of code where you actually perform the database work, limiting the amount of repeated code. This way; you would load records from the different formats into your own global array with a fixed format (representing your database fields perhaps) and then load the database from the global array after parsing all feeds. For example:
<?php
require("MagicParser.php");
$allRecords = array();
function myRSSRecordHandler($record)
{
global $allRecords;
$temp = array();
$temp["deal_title"] = $record["TITLE"];
$temp["deal_desc"] = $record["DESCRIPTION"];
$temp["deal_link"] = $record["LINK"];
$allRecords[] = $temp;
}
function myAtomRecordHandler($record)
{
global $allRecords;
$temp = array();
$temp["deal_title"] = $record["TITLE"];
$temp["deal_desc"] = $record["CONTENT"];
$temp["deal_link"] = $record["LINK-HREF"];
$allRecords[] = $temp;
}
MagicParser_parse("<<my link1>>","myRSSRecordHandler","xml|RSS/CHANNEL/ITEM/");
MagicParser_parse("<<my link2>>","myRSSRecordHandler","xml|RSS/CHANNEL/ITEM/");
MagicParser_parse("<<my link3>>","myAtomRecordHandler","xml|FEED/ENTRY/");
foreach($allRecords as $record)
{
$query = "INSERT INTO deals_raw set
deal_title = '".mysql_escape_string($record["deal_title"])."',
deal_desc = '".mysql_escape_string($record["deal_desc"])."',
deal_link = '".mysql_escape_string($record["deal_link"])."'
";
$result = mysql_query($query) or die('This deal did not post... ');
}
?>
In the second example, I have added mysql_escape_string to the construction of your $query which is important otherwise content in the feeds could break the structure of the SQL.
Regarding extracting the image src URL from your example data; which when using Magic Parser would be within the $record["DESCRIPTION"] field; the best approach is to use regular expressions. Regular expressions themselves are a huge subject; but for example, to extract the URL between quotes from $record["DESCRIPTION"] you could use this;
preg_match('/(?<=")(.*?)(?=")/',$record["DESCRIPTION"],$matches);
$src = $matches[0];
The image URL will be in $src after the above code.
Hope this helps!
Cheers,
David.