limit number of items to parse flag?

Submitted by gateway on Thu, 2007-06-07 06:40

Is there a way to pass a variable to MagicParser_parse($url,"myRecordHandler",$format); to grab, say, the latest X items in the XML file?

maybe something like this

MagicParser_parse($url,"myRecordHandler",$format,3);

with 3 being the number of items?

Steve

Submitted by support on Thu, 2007-06-07 07:35

Hello Steve,

Because XML is a serial format there is no easy way (at the parser level) to obtain just the first or last X records. It is, however, reasonably straightforward to do this at the application level.

Firstly, limiting the number of items processed to the first X is the more common request. If you return TRUE from your record handler function, Magic Parser will stop, and you can see how to use that in the following thread:

http://www.magicparser.com/node/202

Obtaining the LAST X items is not so straightforward because, as mentioned above, XML is a serial format and when parsing you have no idea where you are in the feed. Therefore, what you need to do is first read all the items into an array, and then obtain the last X items from that array.

If you like the way Magic Parser works (by calling a function with each record as an array), you can simulate this easily, taking advantage of PHP's array_pop() function. Here's an example:

<?php
  require("MagicParser.php");

  // global array to hold items
  $items = array();

  // record handler: called by Magic Parser once for each record
  function myRecordHandler($record)
  {
    global $items;

    // add item to the end of the array
    $items[] = $record;
  }

  // parse the file and read items into global $items array
  MagicParser_parse("file.xml","myRecordHandler","xml|FORMAT/STRING/");

  // now simulate the above using myItemHandler with the last 3 items
  function myItemHandler($item)
  {
    print_r($item);
  }

  // extract the last 3 items and call myItemHandler
  // (array_pop() removes from the end, so the very last item comes first)
  for($i=1;$i<=3;$i++)
  {
    $item = array_pop($items);

    // array_pop() returns NULL once the array is empty (fewer than 3 items)
    if (is_null($item)) break;

    myItemHandler($item);

    // alternatively, you could just process $item here, of course.
  }
?>
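
If you would rather have the last X items in the same order they appear in the feed (array_pop() hands them back last-first), PHP's array_slice() with a negative offset is one alternative. This is only a minimal sketch: the hard-coded $items array stands in for what myRecordHandler would have collected, and the TITLE field name is made up for illustration.

```php
<?php
// Stand-in for the $items array that myRecordHandler would fill.
// The records and the TITLE key are hypothetical.
$items = array(
  array("TITLE" => "one"),
  array("TITLE" => "two"),
  array("TITLE" => "three"),
  array("TITLE" => "four"),
  array("TITLE" => "five"),
);

// A negative offset counts from the end of the array, so this keeps
// the last 3 records in their original feed order.
$lastItems = array_slice($items, -3);

foreach ($lastItems as $item) {
  print $item["TITLE"] . "\n";
}
```

If the feed holds fewer than 3 items, array_slice() simply returns what is there, so no extra check is needed.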

Hope this helps!
Cheers,
David.

Submitted by gateway on Tue, 2007-06-12 00:31

Thanks, I'll give it a shot. I was thinking more along the lines of the number of items to grab from the top of the list, which I found in the post you linked to.

thanks

Submitted by gateway on Tue, 2007-06-12 05:47

Hmm, I'm running into a bit of an issue with what I'm trying to do.

I pull RSS URLs and a count from a database, then run each one through the record handler until the count equals the number of items to get (stored in the db) for that RSS feed.

So currently I run a SQL query against anything that's marked as active in my db and return the results as $name, $url, $count. Then I need to run those through the handler, parse the results, and put them into another db field (for the data I'm getting).

Any thoughts on this approach? Maybe I'm missing something simple; it's late and I've been coding all night.

Submitted by support on Tue, 2007-06-12 07:59

Hi,

One thing to check is that inside your record handler function you have declared as global any variables that you have set outside of the record handler function - in particular the $count variable. The counter also has to be global so that you are actually incrementing the counter for each record - otherwise it might be starting from zero each time...
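
To illustrate the point about globals, here is a small stand-alone sketch. The two handler names are made up for the demonstration and nothing here calls Magic Parser itself; it only shows how PHP variable scope behaves inside a function.

```php
<?php
$counter = 0; // set outside the handler

// No "global" declaration: the outer $counter is not visible in here,
// so any counting done on it would start from scratch on every call.
function handlerWithoutGlobal($record)
{
  return isset($counter); // FALSE, this $counter is an unset local variable
}

// With "global", the handler reads and updates the outer $counter.
function handlerWithGlobal($record)
{
  global $counter;
  $counter++;
  return $counter;
}

$seenWithoutGlobal = handlerWithoutGlobal(array());
handlerWithGlobal(array());
handlerWithGlobal(array());

print "visible without global: " . ($seenWithoutGlobal ? "yes" : "no") . "\n";
print "counter after two calls: " . $GLOBALS["counter"] . "\n";
```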

Cheers,
David.

Submitted by gateway on Tue, 2007-06-12 18:01

David, I appreciate your replies. What's hanging me up is not being able to pass additional variables, which change throughout the loop, to:

function myRecordHandler($item)
{

i.e. something like function myRecordHandler($item,$numitems,$a,$b)
{

etc..

maybe im going about this the wrong way..

Submitted by support on Tue, 2007-06-12 18:07

Hi,

You won't be able to pass additional parameters to myRecordHandler - how it gets called is quite specific to the Magic Parser library. However, you can simulate this completely using global variables. For example, consider the following code:

<?php
  require("MagicParser.php");

  // get max items from database, here we just set $maxitems
  $maxitems = 3;

  // set $counter to keep track of how many items have been parsed so far
  $counter = 0;

  function myRecordHandler($item)
  {
    global $maxitems;
    global $counter;

    // process $item as required

    // increment record counter
    $counter++;

    // returning TRUE tells Magic Parser to stop
    return ($counter == $maxitems);
  }

  MagicParser_parse("file.xml","myRecordHandler");
?>

This is very similar to the code in the example from the other thread, but rather than hard coding the script to extract 3 items, you have obtained that "3" from somewhere else - in your case the number of items to read from the RSS feed, which comes from your database along with the feed's URL.

Then, in myRecordHandler, you compare the global $counter with your global $maxitems, and when they match the test returns TRUE which tells Magic Parser to stop...
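
For several feeds in a loop, the same idea might look like the sketch below. Please treat it as an outline under assumptions rather than finished code: fakeParse() is a made-up stand-in that only mimics the behaviour described in this thread (call the handler for each record, stop when it returns TRUE), and the $feeds rows, $feedname and $collected are hypothetical placeholders for your own database results and processing.

```php
<?php
// Hypothetical stand-in for MagicParser_parse(); NOT the real library.
// It calls $handler with each record and stops when the handler returns TRUE.
function fakeParse($records, $handler)
{
  foreach ($records as $record) {
    if (call_user_func($handler, $record)) {
      return;
    }
  }
}

// Hypothetical rows, as they might come back from the database query.
$feeds = array(
  array("name" => "feedA", "count" => 2, "records" => array("a1", "a2", "a3")),
  array("name" => "feedB", "count" => 1, "records" => array("b1", "b2")),
);

// Globals that play the role of the extra "parameters".
$maxitems  = 0;
$counter   = 0;
$feedname  = "";
$collected = array();

function myRecordHandler($item)
{
  global $maxitems, $counter, $feedname, $collected;

  // process $item as required; here we just remember it per feed
  $collected[$feedname][] = $item;

  $counter++;

  // TRUE stops the parse once the limit for this feed is reached
  return ($counter >= $maxitems);
}

foreach ($feeds as $feed) {
  // set the globals for this feed before parsing it
  $feedname = $feed["name"];
  $maxitems = $feed["count"];
  $counter  = 0; // reset, otherwise the count carries over from the last feed

  fakeParse($feed["records"], "myRecordHandler");
}

print_r($collected);
```

In a real script the fakeParse() call would be replaced by MagicParser_parse($url,"myRecordHandler") with the URL from the current database row. The important part is resetting $counter before each feed.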

Hope this helps - if you're still not sure how to go about it, feel free to email me your script so far and I'll take a quick look over it for you. Replying to your registration code or forum registration email is the easiest way to reach me...

Cheers,
David.