Parse XML file which has a lot of sub-nodes

Submitted by lang2000 on Tue, 2008-04-15 11:29

Hi:

I have been trying to figure this out for a while, without any luck. The XML document is:

http://www.jowjow.co.uk/ART_EVENT.xml

I can list the data in this XML file as follows:

TITLE ID: 3619611902
TITLE NAME: DAGE Summer Party
PERFORMACE ID: 3619612202
PERFORMANCE: 29 Jun 2008 3:00 PM
START DATE: 29/06/2008
END DATE: 29/06/2008
FULL PRICE: 0
EVENT ID: 3619778802
EVENT START DATE:29/06/2008
EVENT END DATE: 29/06/2008
EVENT START TIME:3:00 PM

TITLE ID: 3619612302
TITLE NAME: Nodding at the Back, Launch Event
PERFORMACE ID: 3619612402
PERFORMANCE: 15 May 2008 6:30 PM
START DATE: 15/05/2008
END DATE: 15/05/2008
FULL PRICE: 0
EVENT ID: 3619778902
EVENT START DATE:15/05/2008
EVENT END DATE: 15/05/2008
EVENT START TIME:6:30 PM

TITLE ID: 3507946302
TITLE NAME: Blues With Gumbo and Friends
PERFORMACE ID: 3507946902
PERFORMANCE: Every Ist Tue Of the Month 8:00 PM
START DATE: 01/01/2008
END DATE: 31/12/2008
FULL PRICE: 0
EVENT ID: 3508478802
EVENT START DATE:06/05/2008
EVENT END DATE: 06/05/2008
EVENT START TIME:8:00 PM

TITLE ID: 3507947402
TITLE NAME: Crazy Cats 'N Dogs Club
PERFORMACE ID: 3507947602
PERFORMANCE: Every 2nd Tue Of the Month, 4th Tue Of the Month 8:00 PM
START DATE: 01/01/2008
END DATE: 31/12/2008
FULL PRICE: 0
EVENT ID: 3508479902
EVENT START DATE:08/04/2008
EVENT END DATE: 08/04/2008
EVENT START TIME:8:00 PM

TITLE ID: 3507948202
TITLE NAME: Out of the Blue - Calypso and Reggae Band
PERFORMACE ID: 3507948402
PERFORMANCE: Every 3rd Tue Of the Month 8:00 PM
START DATE: 01/01/2008
END DATE: 31/12/2008
FULL PRICE: 0
EVENT ID: 3508481902
EVENT START DATE:15/04/2008
EVENT END DATE: 15/04/2008
EVENT START TIME:8:00 PM

There are five events in this list. However, there is another node called "VENUE" in the XML; how do I assign the venue ID to each event? The first two events share one venue, and the last three events share another. For example, for the first event I would like to display:

VENUE ID: 4817
TITLE ID: 3619611902
TITLE NAME: DAGE Summer Party
PERFORMACE ID: 3619612202
PERFORMANCE: 29 Jun 2008 3:00 PM
START DATE: 29/06/2008
END DATE: 29/06/2008
FULL PRICE: 0
EVENT ID: 3619778802
EVENT START DATE:29/06/2008
EVENT END DATE: 29/06/2008
EVENT START TIME:3:00 PM

Another question: how do I display all the "event" nodes under each "title" node, i.e. list all the events and know which title each event belongs to? At the moment it only displays the first event under each "title" node.

Thanks a lot, and hope to hear from you.

Submitted by support on Tue, 2008-04-15 12:00

Hi there,

As the XML contains two levels of nesting (multiple venues, and multiple events per title), I'm afraid this particular style of XML is not best suited to Magic Parser; however, it is possible to handle it using a couple of techniques.

The basic theory is to parse your feed twice: once to build up an array of venues together with a title-to-venue mapping, and once to collect the titles, so that each title can then be mapped to its venue. A second technique is then required to extract the multiple events per title. Magic Parser will have differentiated these by adding @1, @2.. to the array element name, so you can test whether those keys exist in the array and use the values if they do.

As always, it's probably a lot easier to see it working than to try and explain it, so I've written a script to extract your XML into a usable format which should help you get started. Here is the output running on this server:

http://www.magicparser.com/examples/events.php

I've only chosen to display certain items of the records; but you can of course display or insert into a database any of the fields in the various arrays - use print_r() to see what field names are available at any particular point in the code.

Here's the script:

<?php
  require("MagicParser.php");

  // global array to hold the titles
  $titles = array();

  // record handler to build the above array from the XML
  function myTitleRecordHandler($title)
  {
    global $titles;
    $titles[] = $title;
  }

  // global array to hold venues and a mapping array to associate
  // titles with venues
  $venues = array();
  $title2venue = array();

  // record handler to build the above arrays from the XML
  function myVenueRecordHandler($venue)
  {
    // create array of venues
    global $venues;
    $venues[$venue["VENUE-VENUE_ID"]] = $venue["VENUE-VENUE_NAME"];
    // now create a mapping array to match venues with titles
    // to do this, we go through the entire record looking for
    // TITLE_ID fields (they will be differentiated with @1, @2..
    // but this can be ignored for now)
    global $title2venue;
    foreach($venue as $k => $v)
    {
      // strpos() returns false when there is no match, so compare strictly
      if (strpos($k,"TITLE_ID") !== false)
      {
        $title2venue[$v] = $venue["VENUE-VENUE_ID"];
      }
    }
  }

  // load the XML into a variable so that we don't hit the remote server twice!
  $xml = "";
  $url = "http://www.jowjow.co.uk/ART_EVENT.xml";
  $fp = fopen($url,$r);
  while(!feof($fp)) $xml .= fread($fp,1024);
  fclose($fp);

  // first parse to load all titles into the global array $titles
  MagicParser_parse("string://".$xml,"myTitleRecordHandler","xml|LISTINGS/POI/VENUE/TITLES/TITLE/");

  // second parse to generate title > venue mapping array
  MagicParser_parse("string://".$xml,"myVenueRecordHandler","xml|LISTINGS/POI/VENUE/");

  // finally we can handle the $titles array using foreach() exactly as the array would have been
  // handled within myRecordHandler, using $title to access the XML elements.
  // the code below shows how to extract the multiple events by using a counter
  // and looking for the way Magic Parser has resolved the duplicate names using @1, @2, etc..
  foreach($titles as $title)
  {
    print "<h2>".$title["TITLE-TITLE_NAME"]."</h2>";
    print "<h3>Venue:".$venues[$title2venue[$title["TITLE-TITLE_ID"]]]."</h3>";
    print "<blockquote>";
    print "<h4>Performances</h4>";
    print "<ul>";
    $postfix = "";
    $i = 0;
    while(1) {
      // the first event has no postfix; later ones are @1, @2, ...
      if ($i) $postfix = "@".$i;
      // stop when there is no event with this postfix
      if (empty($title["EVENTS/EVENT".$postfix."-EVENT_ID"])) break;
      $event_id = $title["EVENTS/EVENT".$postfix."-EVENT_ID"];
      $event_start_date = $title["EVENTS/EVENT".$postfix."-EVENT_START_DATE"];
      $event_end_date = $title["EVENTS/EVENT".$postfix."-EVENT_END_DATE"];
      $event_start_time = $title["EVENTS/EVENT".$postfix."-EVENT_START_TIME"];
      print "<li>".$event_start_date." at ".$event_start_time."</li>";
      $i++;
    }
    print "</ul>";
    print "</blockquote>";
  }
?>

Hope this helps!
Cheers,
David.
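
The counter-and-postfix loop from the script can also be pulled out into a small function, which may be handy if the events need to go into a database rather than straight to the page. This is only a sketch and does not need Magic Parser to run: collectEvents() is a hypothetical helper name, the key names are copied from the script above, and the sample record is hand-built (the first event uses values from the listing in this thread, the second event is made up for illustration).

```php
<?php
// Hypothetical helper (not part of Magic Parser): gather the repeated EVENT
// fields of one flattened title record into a list of events.
function collectEvents(array $title)
{
    $events = array();
    $i = 0;
    while (true) {
        // The first occurrence has no postfix; later ones carry @1, @2, ...
        $postfix = $i ? "@" . $i : "";
        $prefix = "EVENTS/EVENT" . $postfix . "-";
        // Stop at the first index that has no EVENT_ID.
        if (empty($title[$prefix . "EVENT_ID"])) {
            break;
        }
        $events[] = array(
            "id"         => $title[$prefix . "EVENT_ID"],
            "start_date" => isset($title[$prefix . "EVENT_START_DATE"]) ? $title[$prefix . "EVENT_START_DATE"] : "",
            "start_time" => isset($title[$prefix . "EVENT_START_TIME"]) ? $title[$prefix . "EVENT_START_TIME"] : "",
        );
        $i++;
    }
    return $events;
}

// Hand-built sample record; the second event is invented for illustration.
$title = array(
    "TITLE-TITLE_NAME"                => "DAGE Summer Party",
    "EVENTS/EVENT-EVENT_ID"           => "3619778802",
    "EVENTS/EVENT-EVENT_START_DATE"   => "29/06/2008",
    "EVENTS/EVENT-EVENT_START_TIME"   => "3:00 PM",
    "EVENTS/EVENT@1-EVENT_ID"         => "9999999999",
    "EVENTS/EVENT@1-EVENT_START_DATE" => "30/06/2008",
    "EVENTS/EVENT@1-EVENT_START_TIME" => "4:00 PM",
);

$events = collectEvents($title);
foreach ($events as $event) {
    print $event["start_date"] . " at " . $event["start_time"] . "\n";
}
```

Inside the foreach($titles as $title) loop of the script, the same function could be called on each $title in place of the while(1) loop.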

Submitted by lang2000 on Tue, 2008-04-15 13:31

Thanks a lot David, the code you wrote really helped. However, I ran into another problem after I adapted your code for a larger XML file. It has the same structure as the one I showed you (http://www.jowjow.co.uk/ART_EVENT.xml), but a lot more data. It is located at:

http://www.jowjow.co.uk/ART_EVENT_20080402.xml

When I ran the code, the page just kept loading without showing anything. I hope you can help. Thanks.

Submitted by support on Tue, 2008-04-15 13:41

Hi,

I just put the new URL into the script on my server and it appears to have worked correctly...

http://www.magicparser.com/examples/events.php

The first thing I would do is to add some debug code to print the size of the XML file received just to confirm that the fopen() is working for the new, larger URL.

Where you currently have:

fclose($fp);

...add the following code on the next line:

  print "Bytes Received: ".strlen($xml);exit();

This should output:

Bytes Received: 163236

If that looks OK, the next debug step would be to see if the titles are being parsed correctly. Where you currently have:

MagicParser_parse("string://".$xml,"myTitleRecordHandler","xml|LISTINGS/POI/VENUE/TITLES/TITLE/");

add on the next line:

print_r($titles);exit();

If the code never gets here, this will tell us that it is a problem parsing the titles, so we can investigate that further. If this does work OK (you will see a dump of the $titles array), then it will be a problem parsing the venues, so we can then look at that...

Cheers,
David.

Submitted by lang2000 on Sun, 2008-04-27 17:24

Hi David:

Thanks for the script. I have tested it on the server jowjow.co.uk, and it worked:

TEST 1

SERVER: jowjow.co.uk

FILE A: php file is http://www.jowjow.co.uk/feed/events_jowjow.php

FILE B: xml file is http://www.jowjow.co.uk/feed/ART_EVENT_20080402.xml

The same test was also run on another server, where it also worked.

However, I ran into another problem, and couldn't figure out what caused it.

TEST 2

I put the scripts on the server that it is supposed to run from eventually : www.linj1601.co.uk

SERVER: linj1601.co.uk

FILE C: php file is http://www.linj1601.co.uk/feed/events_ldf.php (after clicking, the page will just keep loading without showing any result)

FILE D: xml file is http://www.linj1601.co.uk/feed/ART_EVENT_20080402.xml

When running the PHP FILE C, it doesn't parse the XML file at all; the web page just keeps loading, with no result.

However, if I change the PHP code in FILE C to parse FILE B, which is on the server jowjow.co.uk, it works, as shown below:

FILE E: php file to parse FILE B http://www.linj1601.co.uk/feed/events_ldf_jowjow.php

So the problem is: the script on the linj1601.co.uk server doesn't seem to parse XML located on the same server (I have tried placing the XML in different folders on the server, and that didn't work either), but it has no problem parsing files at other locations, i.e. on other servers.

I have tried a lot to get this to work, but I couldn't figure out why it fails. Can you help? What I want is to make the PHP script parse the XML files located on the server linj1601.co.uk.

I have provided the PHP information for each server below. Is the linj1601.co.uk server missing any package that the script needs in order to work?

PHP INFORMATION:

jowjow.co.uk: http://www.jowjow.co.uk/info.php

linj1601.co.uk http://www.linj1601.co.uk/info.php

Thanks, David.

Submitted by support on Mon, 2008-04-28 08:38

Hi,

As the script on linj1601.co.uk works fine parsing the XML remotely, the problem is likely down to this section of code:

  $xml = "";
  $url = "http://www.jowjow.co.uk/ART_EVENT.xml";
  $fp = fopen($url,$r);
  while(!feof($fp)) $xml .= fread($fp,1024);
  fclose($fp);

I'm assuming that your FILE C has been modified simply to:

  $xml = "";
  $url = "ART_EVENT_20080402.xml";
  $fp = fopen($url,$r);
  while(!feof($fp)) $xml .= fread($fp,1024);
  fclose($fp);

In other words, as you are on the same server, you don't need to open the file as a URL; just the local filename will do. If, however, you had used the actual URL of the file on the new server, that may be the cause of the problem: a firewall issue could effectively be preventing the server from connecting to itself.

However, to avoid the continuous loading; I would modify the above as follows, changing $url to $filename to avoid using confusing variable names:

  $xml = "";
  $filename = "ART_EVENT_20080402.xml";
  $fp = fopen($filename,$r);
  if ($fp)
  {
    while(!feof($fp)) $xml .= fread($fp,1024);
    fclose($fp);
  }
  else
  {
    print "Error opening ".$filename;
    exit();
  }

Cheers,
David.

Submitted by lang2000 on Mon, 2008-04-28 09:41

Hi David:

Now I have changed the code to the following as you suggested:

<?php
  $xml = "";
  $filename = "ART_EVENT_20080402.xml";
  $fp = fopen($filename,$r);
  if ($fp)
  {
    while(!feof($fp)) $xml .= fread($fp,1024);
    fclose($fp);
    print "Bytes Received: ".strlen($xml);
  }
  else
  {
    print "Error opening ".$filename;
    exit();
  }
?>

When I run www.linj1601.co.uk/feed/events_ldf.php

it says: Error opening ART_EVENT_20080402.xml

It is still having a problem reading the XML file.

Any suggestions?

Cheers
Lin

Submitted by support on Mon, 2008-04-28 10:08

Hello Lin,

Try this quickly....

<?php
  $filename = "ART_EVENT_20080402.xml";
  if (file_exists($filename))
  {
    print "OK";
  }
  else
  {
    print "File not found!";
  }
?>

This will confirm that the file is visible to PHP in the current directory...

Cheers,
David.
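
A slightly fuller version of this check could also separate a file that is missing from one that exists but cannot be read by the user PHP runs as. This is only a sketch: describeFile() is a hypothetical helper name, while file_exists(), is_readable() and filesize() are standard PHP functions.

```php
<?php
// Hypothetical helper: report whether a local file can be used, and why not.
function describeFile($filename)
{
    if (!file_exists($filename)) {
        return "missing";
    }
    if (!is_readable($filename)) {
        // The file is there, but PHP is not permitted to read it.
        return "not readable";
    }
    return "ok, " . filesize($filename) . " bytes";
}

print describeFile("ART_EVENT_20080402.xml") . "\n";
```

If this reports a sensible size but fopen() still fails, the file itself is fine and the fopen() call is the next thing to look at.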

Submitted by lang2000 on Mon, 2008-04-28 10:34

Hi David:

I changed the code, and it prints "OK" followed by "Error opening ...", as shown here:

http://www.linj1601.co.uk/feed/events_ldf.php

Thanks

Submitted by support on Mon, 2008-04-28 10:36

Hello Lin,

My apologies - I've just spotted the problem....

Instead of:

  $fp = fopen($url,$r);

...it should be:

  $fp = fopen($url,"r");

$r was never defined, so it would have been empty; evidently that is accepted as an open mode for a URL, but not for a local file. This should fix it....!

All the best,
David.
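
Putting the corrected mode and the earlier error check together, the loading step could be wrapped up roughly as follows. This is a sketch rather than the exact code from the thread: loadXml() is a hypothetical helper name, and the @ in front of fopen() is only there to hide PHP's own warning, since the function reports failure through its return value.

```php
<?php
// Hypothetical helper: read a whole file (or URL) into a string.
// Returns false if the source cannot be opened.
function loadXml($source)
{
    // "r" is a literal mode string: open for reading only
    $fp = @fopen($source, "r");
    if (!$fp) {
        return false;
    }
    $xml = "";
    while (!feof($fp)) {
        $chunk = fread($fp, 1024);
        if ($chunk === false) {
            break; // stop on a read error instead of looping
        }
        $xml .= $chunk;
    }
    fclose($fp);
    return $xml;
}

$xml = loadXml("ART_EVENT_20080402.xml");
if ($xml === false) {
    print "Error opening file\n";
} else {
    print "Bytes Received: " . strlen($xml) . "\n";
}
```

Checking the handle before the feof() loop is what prevents the endless page load: the loop is never entered when the open fails.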

Submitted by lang2000 on Mon, 2008-04-28 10:49

Yes, David, it finally worked. Thanks very much for your help; I appreciate your patience.

Regards
Lin