Hi,
When I parse a feed with MagicParser, part of it gets lost.
This is the feed that I am trying to parse:
<QueryResponse xmlns="">
<categoryResponse matched="2" included="1">
<category id="499" parent_id="4">
<name>MP3 &amp; Media Players</name>
<relevance>0.999999463558197</relevance>
<URL>http://something.com</URL>
</category>
<category id="46015" parent_id="4">
<name>MP3 Player Accessories</name>
<relevance>5.40855637609639e-07</relevance>
<URL>http://something.com</URL>
</category>
</categoryResponse>
<productResponse requested="20" matched="2000" included="20" start="1">
<product category_id="499" id="626957925">
<name>Apple iPod Classic 80 GB - Black</name>
<relevance>5289607168000</relevance>
<URL>http://something.com/ipod.html</URL>
<imageURL_small>http://image.something.com/resize?sq=60&amp;uid=626957925</imageURL_small>
<imageURL_med>http://image.something.com/resize?sq=100&amp;uid=626957925</imageURL_med>
<imageURL_medlarge>http://image.something.com/resize?sq=160&amp;uid=626957925</imageURL_medlarge>
<imageURL_large>http://image.something.com/resize?sq=400&amp;uid=626957925</imageURL_large>
<maxRawImageSize/>
<desc_short>Holds Up to 20,000 Songs, 25,000 Photos, or 100 hrs of Video - 2.5 in Display - Battery Life: Up to 30 hrs of Audio/7 hrs of Video</desc_short>
<desc_long>Holds Up to 20,000 Songs, 25,000 Photos, or 100 hrs of Video - 2.5 in Display - Battery Life: Up to 30 hrs of Audio/7 hrs of Video</desc_long>
<prodRating URLref="prodRate-4">4</prodRating>
<prodScore>4.00</prodScore>
<numReviews>18</numReviews>
<reviewsURL>http://www.something.com/mp3_mediaplayers/apple-ipod-classic-80-gb-black--pid626957925/reviews__af_assettype_id--10__af_creative_id--6__af_id--1001__af_placement_id--1__keyword--ipods__rf--af1.html</reviewsURL>
<minPrice>175.99</minPrice>
<maxPrice>249.99</maxPrice>
<numMerchants>6</numMerchants>
</product>
<product category_id="499" id="626986930">
<name>Apple iPod Nano 4 GB - Silver</name>
<relevance>4019571654656</relevance>
<URL>http://www.something.com/mp3_mediaplayers/apple-ipod-nano-4-gb-silver--pid626986930/compareprices__af_assettype_id--10__af_creative_id--6__af_id--1001__af_placement_id--1__keyword--ipods__rf--af1.html</URL>
<imageURL_small>http://image.something.com/resize?sq=60&amp;uid=626986930</imageURL_small>
<imageURL_med>http://image.something.com/resize?sq=100&amp;uid=626986930</imageURL_med>
<imageURL_medlarge>http://image.something.com/resize?sq=160&amp;uid=626986930</imageURL_medlarge>
<imageURL_large>http://image.something.com/resize?sq=400&amp;uid=626986930</imageURL_large>
<maxRawImageSize/>
<desc_short>Holds Up to 1,000 Songs, 3,500 Photos, or 4 hrs of Video - 2 in. Display - Battery Life: Up to 24 hrs of Audio/5 hrs of Video</desc_short>
<desc_long>Holds Up to 1,000 Songs, 3,500 Photos, or 4 hrs of Video - 2 in. Display - Battery Life: Up to 24 hrs of Audio/5 hrs of Video</desc_long>
<prodRating URLref="prodRate-4.5">4.5</prodRating>
<prodScore>0.00</prodScore>
<numReviews>12</numReviews>
<reviewsURL>http://www.something.com/mp3_mediaplayers/apple-ipod-nano-4-gb-silver--pid626986930/reviews__af_assettype_id--10__af_creative_id--6__af_id--1001__af_placement_id--1__keyword--ipods__rf--af1.html</reviewsURL>
<minPrice>104.99</minPrice>
<maxPrice>149.99</maxPrice>
<numMerchants>4</numMerchants>
</product>
</productResponse>
<otherResponse>
<totalProducts>22899973</totalProducts>
<totalStores>109024</totalStores>
<imageURL>
<URL id="prodRate-3.5">http://image.something.com/site/rating_3_and_half_star_80x13.gif</URL>
<URL id="prodRate-4">http://image.something.com/site/rating_4_star_80x13.gif</URL>
<URL id="prodRate-4.5">http://image.something.com/site/rating_4_and_half_star_80x13.gif</URL>
<URL id="prodRate-5">http://image.something.com/site/rating_5_star_80x13.gif</URL>
</imageURL>
<otherURL>
<URL id="search">http://www.something.com/search__keyword--ipods__rf--af1.html</URL>
</otherURL>
<trackingPixel>http://adserve.something.com/img/publisherID-1001/assetID-6/placementID-1/</trackingPixel>
</otherResponse>
</QueryResponse>
This is the code which is generated by http://www.magicparser.com/demo for php:
<?php
require("MagicParser.php");
function myRecordHandler($record)
{
print $record["PRODUCT"];
print $record["PRODUCT-CATEGORY_ID"];
print $record["PRODUCT-ID"];
print $record["NAME"];
print $record["RELEVANCE"];
print $record["URL"];
print $record["IMAGEURL_SMALL"];
print $record["IMAGEURL_MED"];
print $record["IMAGEURL_MEDLARGE"];
print $record["IMAGEURL_LARGE"];
print $record["MAXRAWIMAGESIZE"];
print $record["DESC_SHORT"];
print $record["DESC_LONG"];
print $record["PRODRATING"];
print $record["PRODRATING-URLREF"];
print $record["PRODSCORE"];
print $record["NUMREVIEWS"];
print $record["REVIEWSURL"];
print $record["MINPRICE"];
print $record["MAXPRICE"];
print $record["NUMMERCHANTS"];
}
MagicParser_parse("string://".$xml,"myRecordHandler","xml|QUERYRESPONSE/PRODUCTRESPONSE/PRODUCT/"); // $xml holds the feed shown above
?>
As you can see, only productResponse has been parsed.
categoryResponse and otherResponse are simply ignored.
Could you help me find a solution to this issue, please?
Thank you!
Hi David,
Thank you very much for the quick and useful response. It worked perfectly!
Hi David,
I have another slight problem.
I need to get two attributes from the productResponse element: the number of products that "matched" and the number that were "included".
<productResponse requested="20" matched="2000" included="20" start="1">
Thanks in advance :)
Hi,
Whilst you can access this data using Magic Parser, it is not ideal because the script is
designed for accessing repeating records, not specific elements or attributes of a large
XML document.
What you need to do is parse the document at the top level element, using (in this case)
the format string:
xml|QUERYRESPONSE/
As the entire document will then be passed to your record handler function, you can access
the number of products matched and included through the following variables:
$record["PRODUCTRESPONSE-MATCHED"]
$record["PRODUCTRESPONSE-INCLUDED"]
For example (based on the example above):
<?php
require("MagicParser.php");
function myTopLevelRecordHandler($record)
{
// you probably just want to copy these into global variables
print $record["PRODUCTRESPONSE-MATCHED"];
print $record["PRODUCTRESPONSE-INCLUDED"];
}
$xml = "_YOUR_XML_HERE_"; // string variable containing the data to parse
MagicParser_parse("string://".$xml,"myTopLevelRecordHandler","xml|QUERYRESPONSE/");
?>
Hope this helps!
Cheers,
David.
Hi David,
Thanks for the reply. It did work, but it doesn't work consistently. :(
Maybe I did something wrong.
function top($record)
{
echo $record["PRODUCTRESPONSE-INCLUDED"];
}
INCLUDED is the total number of products.
For example, one time it displays 139 products, and after a refresh it displays a total of 217 products included. The same thing happens when I go to another page. It happens about 50% of the time.
Please help.
Hi,
Could you perhaps look at the XML a few times to see if it is actually the value in the feed that is different? As there is a number being displayed, it is almost certain that it is exactly the value contained in the XML (otherwise it would be empty), so we really need to eliminate the XML source first...
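One rough way to run that check, as a sketch only: read the attributes directly from the raw XML with PHP's built-in SimpleXML (not Magic Parser) and compare them with what your record handler prints for the same $xml string. The function name productResponseCounts and the cut-down sample string are made up for illustration.

```php
<?php
// Hypothetical helper: read the productResponse counters straight from the raw XML.
function productResponseCounts($xml)
{
    libxml_use_internal_errors(true); // collect parse errors instead of printing warnings
    $doc = simplexml_load_string($xml);
    if ($doc === false || !isset($doc->productResponse)) {
        return null; // not parseable, or no productResponse element
    }
    return array(
        "matched"  => (int) $doc->productResponse["matched"],
        "included" => (int) $doc->productResponse["included"],
    );
}

// Cut-down sample in the shape of the feed at the top of the thread
$sample = '<QueryResponse xmlns="">'
        . '<productResponse requested="20" matched="2000" included="20" start="1"></productResponse>'
        . '</QueryResponse>';

$counts = productResponseCounts($sample);
print $counts["matched"] . " / " . $counts["included"] . "\n"; // prints "2000 / 20"
?>
```

If the numbers from this check change between refreshes as well, the variation is in the feed itself rather than in the parsing.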
Cheers,
David.
David,
Another small issue.
This is part of the XML file:
<attrResponse requested="0" included="1">
<attr id="259818">
<name>Price Range</name>
<URL>
</URL>
<values requested="" included="7">
<attrValue id="030630507324">
<name>&lt; 450</name>
<numProductsMatched>791</numProductsMatched>
<URL></URL>
</attrValue>
<attrValue id="030594261506502705">
<name>450 - 620</name>
<numProductsMatched>785</numProductsMatched>
<URL></URL>
</attrValue>
......
This is PHP code that I use to pull the data:
$fname = array();
function filter_name($record)
{
global $fname;
$fname[] = $record;
}
$brandID = array();
function filterID($record)
{
global $brandID;
$brandID[] = $record;
}
$brand_filter = array();
function filter($record)
{
global $brand_filter;
$brand_filter[] = $record;
}
........
........
........
MagicParser_parse("string://".$xml,"filter_name","xml|QUERYRESPONSE/ATTRRESPONSE/ATTR/");
MagicParser_parse("string://".$xml,"filterID","xml|QUERYRESPONSE/ATTRRESPONSE/ATTR/VALUES/");
MagicParser_parse("string://".$xml,"filter","xml|QUERYRESPONSE/ATTRRESPONSE/ATTR/VALUES/ATTRVALUE/");
Here are a few issues that I have.
1) For some reason the filterID array is not fully populated. Only the first record is shown. When I try to pull other parameters, it shows that there is no other data in the array.
2) Is there another way to write this code? It gets kind of messy when I need to create an array for each parameter.
Hi,
For the format string you are using for the filterID record handler...
xml|QUERYRESPONSE/ATTRRESPONSE/ATTR/VALUES/
...there will only be one record, and it will contain all the ATTRVALUE records
using the @1,@2... notation for differentiating duplicate fields.
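To illustrate the idea, here is a minimal sketch that walks the numbered duplicates in such a record. The $record array is a hand-built stand-in, and the exact key spelling ("ATTRVALUE@1/NAME", "ATTRVALUE@1-ID" and so on) is an assumption; print_r($record) in your handler will show the real keys. The helper name collectAttrValues is hypothetical.

```php
<?php
// Hand-built stand-in for the single record returned by the .../VALUES/ format string.
// ASSUMPTION: the key spelling below is a guess; confirm it with print_r($record).
$record = array(
    "ATTRVALUE-ID"                   => "030630507324",
    "ATTRVALUE/NAME"                 => "< 450",
    "ATTRVALUE/NUMPRODUCTSMATCHED"   => "791",
    "ATTRVALUE@1-ID"                 => "030594261506502705",
    "ATTRVALUE@1/NAME"               => "450 - 620",
    "ATTRVALUE@1/NUMPRODUCTSMATCHED" => "785",
);

// Hypothetical helper: walk ATTRVALUE, ATTRVALUE@1, ATTRVALUE@2, ... until a key is missing.
function collectAttrValues($record)
{
    $values = array();
    for ($i = 0; ; $i++) {
        $prefix = "ATTRVALUE" . ($i ? "@" . $i : "");
        if (!isset($record[$prefix . "-ID"])) {
            break; // no more duplicates
        }
        $values[] = array(
            "id"      => $record[$prefix . "-ID"],
            "name"    => $record[$prefix . "/NAME"],
            "matched" => $record[$prefix . "/NUMPRODUCTSMATCHED"],
        );
    }
    return $values;
}

print_r(collectAttrValues($record));
?>
```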
What values are you trying to extract into the filterID array? In other words,
how do you want to use it later on in the code? That will help me see what code
you need to use to populate it.
With regards to the coding style, as this is not record-based XML this is really
the only suitable way to handle it using Magic Parser; ordinarily you would use
a DOM parser with XML of this style (which is much more complicated!).
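For comparison, here is a rough sketch of the tree-parser route using PHP's built-in SimpleXML rather than the DOM extension or Magic Parser. It builds one nested array keyed by attribute name instead of a separate array per parameter. The sample string is a cut-down, hand-made version of the attrResponse XML quoted earlier in the thread, and the $filters layout is just one possible choice.

```php
<?php
// Cut-down sample in the shape of the attrResponse XML quoted earlier.
$xml = '<QueryResponse xmlns=""><attrResponse requested="0" included="1">'
     . '<attr id="259818"><name>Price Range</name><values requested="" included="7">'
     . '<attrValue id="030630507324"><name>&lt; 450</name>'
     . '<numProductsMatched>791</numProductsMatched></attrValue>'
     . '<attrValue id="030594261506502705"><name>450 - 620</name>'
     . '<numProductsMatched>785</numProductsMatched></attrValue>'
     . '</values></attr></attrResponse></QueryResponse>';

$filters = array(); // attribute name => list of its values
$doc = simplexml_load_string($xml);
foreach ($doc->attrResponse->attr as $attr) {
    $attrName = (string) $attr->name;
    foreach ($attr->values->attrValue as $value) {
        $filters[$attrName][] = array(
            "id"      => (string) $value["id"],
            "name"    => (string) $value->name,
            "matched" => (int) $value->numProductsMatched,
        );
    }
}

print_r($filters);
?>
```

The trade-off is that the whole document is loaded into memory as a tree, which is fine for a small API response like this one but not for very large feeds.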
Cheers,
David.
Hi,
This will have happened because Magic Parser has auto-detected the product records, and is returning them
to your myRecordHandler function.
In order to access the category records, you would need to parse using a different format string, and its
own record handler function. Here's what you need to do:
<?php
header("Content-Type: text/plain");
require("MagicParser.php");
function myCategoryRecordHandler($record)
{
print_r($record);
}
function myProductRecordHandler($record)
{
print_r($record);
}
function myOtherRecordHandler($record)
{
print_r($record);
}
$xml = "_YOUR_XML_HERE_"; // string variable containing the data to parse
MagicParser_parse("string://".$xml,"myProductRecordHandler","xml|QUERYRESPONSE/PRODUCTRESPONSE/PRODUCT/");
MagicParser_parse("string://".$xml,"myCategoryRecordHandler","xml|QUERYRESPONSE/CATEGORYRESPONSE/CATEGORY/");
MagicParser_parse("string://".$xml,"myOtherRecordHandler","xml|QUERYRESPONSE/OTHERRESPONSE/");
?>
Note that I have added header("Content-Type: text/plain"); at the top of this demo script - that is so that
you can easily see the values in each of the $record arrays. Simply remove this once you start to make the
script generate the HTML output that you require (if that is what you are doing).
If you are parsing a URL, it's best not to pass the URL to every call to MagicParser_parse, as each one will
result in a request being made to the remote server. Fetch the XML once into a string instead. So instead of:
$xml = "_YOUR_XML_HERE_"; // string variable containing the data to parse
Use something like:
<?php
$xml = "";
$url = "http://www.example.com/path/to/xml";
if ($fp = fopen($url,"r"))
{
while(!feof($fp)) $xml .= fread($fp,1024);
fclose($fp);
}
?>
Hope this helps!
Cheers,
David.