Question

Hello Chris,

Thank you for your interest in Magic Parser.

I think it will do what you want, with a little additional PHP to create the files and your story index (which isn't parsing-related functionality). That part is quite straightforward, so I've written some examples to help get you started. Note, of course, that there are plenty of other ways to achieve this (as I'm sure you'd appreciate, being a Unix programmer!), such as loading the stories into a database; however, here's a basic method using PHP's file handling functions to do what (I think!) you want to do...

Firstly, regarding the basic parsing of your XML file: yes, Magic Parser will work fine with this. The key thing you need to know is that the format string for use with your XML is as follows:

xml|BULLETIN/STORY/

To use this format string, together with the "looping through child objects" technique that you have already discovered, here is a very basic script to parse your XML and print out each story:

Output:
http://www.magicparser.com/examples/storyboard/storyboard.php

storyboard.php

<?php
  require("MagicParser.php");
  function myRecordHandler($record)
  {
    print "<h1>".$record["HEADLINE"]."</h1>";
    $i = 0;
    while(1) {
      // first paragraph is BODYTEXT/P, later ones are BODYTEXT/P@1, BODYTEXT/P@2, ...
      $postfix = ($i ? "@".$i : "");
      if (!isset($record["BODYTEXT/P".$postfix])) break;
      print "<p>".$record["BODYTEXT/P".$postfix]."</p>";
      $i++;
    }
  }
  MagicParser_parse("race_20070705_161301.xml","myRecordHandler","xml|BULLETIN/STORY/");
?>
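
If you need the paragraphs in more than one place, the loop in the handler above can be pulled out into a small helper. This is only a sketch: collectRepeated() is a hypothetical name of my own (it is not part of Magic Parser), the sample record is made up, and the type hints assume PHP 7 or later. It relies only on the key naming visible in the script above (the first value under BODYTEXT/P, then BODYTEXT/P@1, BODYTEXT/P@2 and so on).

```php
<?php
// Hypothetical helper (not part of Magic Parser): gather every value stored
// under $key, $key@1, $key@2, ... into one ordered array.
function collectRepeated(array $record, string $key): array
{
    $values = [];
    for ($i = 0; ; $i++) {
        // the first occurrence has no postfix, later ones are @1, @2, ...
        $name = $i ? $key."@".$i : $key;
        if (!isset($record[$name])) break;
        $values[] = $record[$name];
    }
    return $values;
}

// Made-up record, shaped like the ones the record handler receives
$record = [
    "HEADLINE"     => "Sample headline",
    "BODYTEXT/P"   => "First paragraph.",
    "BODYTEXT/P@1" => "Second paragraph.",
    "BODYTEXT/P@2" => "Third paragraph.",
];

foreach (collectRepeated($record, "BODYTEXT/P") as $paragraph) {
    print "<p>".$paragraph."</p>\n";
}
```

Inside myRecordHandler you would then call collectRepeated($record,"BODYTEXT/P") instead of writing the while loop each time.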

Now, the reason I have given this example its own directory is that within it I have created a sub-directory called "stories", and given the web server process write access to that sub-directory. This is so that we can extend the above script to extract each story and write it into an HTML file in the stories directory. In this example, I've chosen STORYNAME as the base filename, as this seems to be some kind of ID field. The script first checks whether the file already exists, and only creates the new story file if it does not.

Finally, it appends a link to the story to an index file called "storyindex.php". This is so that you can include the story index within the index page for the directory, which I'll come on to in a moment...

makefiles.php

<?php
  require("MagicParser.php");
  function myRecordHandler($record)
  {
    $filename = "stories/".$record["STORYNAME"].".html";
    // see if we have already extracted this story, abandon if so
    if (file_exists($filename)) return;
    // create and open the file for write access, abandon if that fails
    $fp = fopen($filename,"w");
    if (!$fp) return;
    // write the story file - here you could print additional header HTML
    fwrite($fp,"<h1>".$record["HEADLINE"]."</h1>");
    // loop through the paragraphs and write each to the file
    $i = 0;
    while(1) {
      // first paragraph is BODYTEXT/P, later ones are BODYTEXT/P@1, BODYTEXT/P@2, ...
      $postfix = ($i ? "@".$i : "");
      if (!isset($record["BODYTEXT/P".$postfix])) break;
      fwrite($fp,"<p>".$record["BODYTEXT/P".$postfix]."</p>");
      $i++;
    }
    // close the file, but before this you could print additional footer HTML
    fclose($fp);
    // open storyindex.php for append/write access
    $fp = fopen("storyindex.php","a");
    // write the link to this file, using the headline as the anchor text
    $link = "<p><a href='".$filename."'>".$record["HEADLINE"]."</a></p>";
    fwrite($fp,$link);
    // close the index file
    fclose($fp);
  }
  MagicParser_parse("race_20070705_161301.xml","myRecordHandler","xml|BULLETIN/STORY/");
?>

Finally, a master index is required to include your storyindex.php file and display links to the extracted stories (and what better file to use than index.php, the default index for a directory).

index.php

<?php
  print "<html>";
  print "<body>";
  print "<h1>Choose a story...</h1>";
  require("storyindex.php");
  print "</body>";
  print "</html>";
?>

Here's the link to the example directory to see the results in action:
http://www.magicparser.com/examples/storyboard/

Notice that this example doesn't use header or footer files; check the comments in makefiles.php for where you could bring in a standard header and footer for each story. As before, there are plenty of other ways to do this.
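
One possible way to bring in a header and footer is sketched here, on the assumption that you keep them in two files next to the script. The names header.html and footer.html, and the helpers readTemplate() and buildStoryPage(), are all hypothetical; none of them come from Magic Parser or from the example directory. The type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper: return the contents of an optional template file,
// or an empty string if the file does not exist.
function readTemplate(string $path): string
{
    return is_file($path) ? (string) file_get_contents($path) : "";
}

// Hypothetical helper: build the complete HTML for one story, wrapping the
// headline and paragraphs in the supplied header and footer.
function buildStoryPage(string $headline, array $paragraphs, string $header = "", string $footer = ""): string
{
    $html = $header."<h1>".$headline."</h1>";
    foreach ($paragraphs as $paragraph) {
        $html .= "<p>".$paragraph."</p>";
    }
    return $html.$footer;
}

// header.html and footer.html are assumed file names; a missing file simply
// contributes nothing to the page
$page = buildStoryPage(
    "Sample headline",
    ["First paragraph.", "Second paragraph."],
    readTemplate("header.html"),
    readTemplate("footer.html")
);
print $page."\n";
```

In makefiles.php, the string returned by buildStoryPage() could be written to the story file with a single fwrite() call in place of the separate headline and paragraph writes.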

With regard to scheduling your script to run, you would need to look at using something like cron to periodically run your makefiles.php script, so that it picks up any new XML file that has been pushed to you.
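
Because the XML filename appears to contain a timestamp, a scheduled run would probably need to find the latest file rather than use a fixed name. Here is a rough sketch of one way to do that. The race_*.xml pattern is my guess from the single filename in the examples above, newestFile() is a hypothetical helper, and the crontab line in the comment uses a placeholder path and an arbitrary 15 minute interval. It assumes PHP 7 or later.

```php
<?php
// Example crontab entry (placeholder path, arbitrary interval):
//   */15 * * * * php /path/to/makefiles.php

// Hypothetical helper: return the most recently modified file matching
// $pattern, or null if nothing matches.
function newestFile(string $pattern): ?string
{
    $files = glob($pattern);
    if (!$files) return null; // no match, or glob() failed
    // sort by modification time, newest first
    usort($files, function ($a, $b) {
        return filemtime($b) <=> filemtime($a);
    });
    return $files[0];
}

// The race_*.xml pattern is an assumption based on the example filename
$latest = newestFile(__DIR__."/race_*.xml");
if ($latest !== null) {
    print "Latest file: ".$latest."\n";
    // in makefiles.php you would then pass $latest to MagicParser_parse()
    // in place of the hard-coded filename
}
```

Since makefiles.php already skips any story whose file exists, running it repeatedly against the same XML file should be harmless.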
