Help with extracting records from an XML file

Submitted by baggagepin on Wed, 2011-02-09 16:16

Hi David,

I am running an XML file with the MagicParser and the output is working fine.

The problem I am having is that I have a table that contains flight information. Each record in the XML file has the flight information plus additional information about what are called codeshare flights.

My required output in the table should be:

rec  time   Ref     HistoryID
1    20:45  JJ8085  216705470  (derived from the header section of the file)
2    20:45  BD5805  216705470  (from the first subsection, called CODESHARE)
3    20:45  NH5921  216705470  (from the second subsection, called CODESHARE@1)

Each of the above needs to be an individual record written to the table, but because there is only one field each to contain the "time", "Ref" and "HistoryID", I am having an issue.

It is unknown beforehand how many sub sections there are to any header, but realistically could be none up to twenty.

I have attached a sample of the file that contains the info as shown above... and I'm Stuck!!! Could you please help.

<?xml version="1.0" encoding="UTF-8"?>
<FlightHistoryGetRecordsResponse xmlns="http://pathfinder-xml/FlightHistoryService.xsd">
<FlightHistory DepartureAirportTimeZoneOffset="0" ArrivalAirportTimeZoneOffset="-2" ArrivalDate="2011-02-10T06:40:00.000" ArrivalTerminal="1" CreatorCode="O" DepartureDate="2011-02-09T20:45:00.000" DepartureTerminal="1" FlightHistoryId="216705470" FlightNumber="8085" PublishedArrivalDate="2011-02-10T06:40:00.000" PublishedDepartureDate="2011-02-09T20:45:00.000" ScheduledAircraftType="773" ScheduledBlockTime="715" ScheduledGateArrivalDate="2011-02-10T06:40:00.000" ScheduledGateDepartureDate="2011-02-09T20:45:00.000" Status="Scheduled" StatusCode="S"><Airline AirlineCode="JJ" IATACode="JJ" ICAOCode="TAM" Name="TAM Linhas Aereas"/><Origin AirportCode="LHR" IATACode="LHR" ICAOCode="EGLL" Name="Heathrow Airport"/><Destination AirportCode="GRU" IATACode="GRU" ICAOCode="SBGR" Name="Guarulhos International Airport"/>
<FlightHistoryCodeshare Designator="L" FlightHistoryCodeshareId="152940584" FlightHistoryId="216705470" FlightNumber="5805" PublishedArrivalDate="2011-02-10T06:40:00.000" PublishedDepartureDate="2011-02-09T20:45:00.000"><Airline AirlineCode="BD" IATACode="BD" ICAOCode="BMA" Name="bmi"/></FlightHistoryCodeshare>
<FlightHistoryCodeshare Designator="L" FlightHistoryCodeshareId="152964012" FlightHistoryId="216705470" FlightNumber="5921" PublishedArrivalDate="2011-02-10T06:40:00.000" PublishedDepartureDate="2011-02-09T20:45:00.000"><Airline AirlineCode="NH" IATACode="NH" ICAOCode="ANA" Name="ANA - All Nippon Airways"/></FlightHistoryCodeshare></FlightHistory>
</FlightHistoryGetRecordsResponse>

Submitted by support on Thu, 2011-02-10 10:08

Hi,

As there is likely to be more than one codeshare section in the same record, the best approach is to use a loop that builds the @1, @2 etc. qualifiers that Magic Parser appends to field names to resolve what would otherwise be identical field names within $record. Consider the following example:

<?php
  function myRecordHandler($record)
  {
    $HistoryID = $record["FLIGHTHISTORY-FLIGHTHISTORYID"];
    $AirlineCode = $record["AIRLINE-AIRLINECODE"];
    $FlightNumber = $record["FLIGHTHISTORY-FLIGHTNUMBER"];
    $Ref = $AirlineCode.$FlightNumber;
    // create first record here with Ref and HistoryID
    // ...
    // now loop through each FLIGHTHISTORYCODESHARE
    $i = 0;
    $p = "";
    do {
      if ($i) $p = "@".$i;
      if (!$record["FLIGHTHISTORYCODESHARE".$p."-DESIGNATOR"]) break;
      $AirlineCode = $record["FLIGHTHISTORYCODESHARE/AIRLINE".$p."-AIRLINECODE"];
      $FlightNumber = $record["FLIGHTHISTORYCODESHARE".$p."-FLIGHTNUMBER"];
      $Ref = $AirlineCode.$FlightNumber;
      // create codeshare record here with Ref from here and same HistoryID from before
      $i++;
    }
  }
  MagicParser_parse("flighthistory.xml","myRecordHandler",
    "xml|FLIGHTHISTORYGETRECORDSRESPONSE/FLIGHTHISTORY/");
?>

Notice how the $p variable contains nothing in the first iteration, and then @1, @2 etc. and is used to construct the key values for indexing $record for that iteration.
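As a rough, standalone sketch of the same suffix idea: the helper name collectRefs() and the hand-built $sample array below are hypothetical, and the key layout is simply copied from the handler above rather than verified against real Magic Parser output. It uses isset() so that the first missing key ends the loop without a notice.

```php
<?php
// Hypothetical helper (not part of Magic Parser): returns one Ref per output row.
function collectRefs(array $record)
{
    // Primary flight, taken from the record's top-level fields.
    $refs = [$record["AIRLINE-AIRLINECODE"] . $record["FLIGHTHISTORY-FLIGHTNUMBER"]];

    // Codeshares: the suffix is "" for the first one, then "@1", "@2", ...
    for ($i = 0; ; $i++) {
        $p = $i ? "@" . $i : "";
        if (!isset($record["FLIGHTHISTORYCODESHARE" . $p . "-DESIGNATOR"])) {
            break; // no more codeshare sections in this record
        }
        $refs[] = $record["FLIGHTHISTORYCODESHARE/AIRLINE" . $p . "-AIRLINECODE"]
                . $record["FLIGHTHISTORYCODESHARE" . $p . "-FLIGHTNUMBER"];
    }
    return $refs;
}

// Hand-built stand-in for $record (assumed key names, values from the sample XML).
$sample = [
    "FLIGHTHISTORY-FLIGHTHISTORYID" => "216705470",
    "FLIGHTHISTORY-FLIGHTNUMBER" => "8085",
    "AIRLINE-AIRLINECODE" => "JJ",
    "FLIGHTHISTORYCODESHARE-DESIGNATOR" => "L",
    "FLIGHTHISTORYCODESHARE-FLIGHTNUMBER" => "5805",
    "FLIGHTHISTORYCODESHARE/AIRLINE-AIRLINECODE" => "BD",
    "FLIGHTHISTORYCODESHARE@1-DESIGNATOR" => "L",
    "FLIGHTHISTORYCODESHARE@1-FLIGHTNUMBER" => "5921",
    "FLIGHTHISTORYCODESHARE/AIRLINE@1-AIRLINECODE" => "NH",
];

echo implode(", ", collectRefs($sample)), "\n"; // JJ8085, BD5805, NH5921
```

Returning the Refs as a list keeps the key-building logic separate from whatever database insert follows, which may make it easier to check against a sample record first.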

Hope this helps!
Cheers,
David.

Submitted by baggagepin on Thu, 2011-02-10 11:15

Hi David.

Many thanks for the reply and your support.

Just to get it clear in my head, do you mean the following:

<?php
  function myRecordHandler($record)
  {
    $HistoryID = $record["FLIGHTHISTORY-FLIGHTHISTORYID"];
    $AirlineCode = $record["AIRLINE-AIRLINECODE"];
    $FlightNumber = $record["FLIGHTHISTORY-FLIGHTNUMBER"];
    $Ref = $AirlineCode.$FlightNumber;
    // create first record here with Ref and HistoryID
    // ...

run the script to here and then insert the first record into the table, and then:
// now loop through each FLIGHTHISTORYCODESHARE
    $i = 0;
    $p = "";
    do {
      if ($i) $p = "@".$i;
      if (!$record["FLIGHTHISTORYCODESHARE".$p."-DESIGNATOR"]) break;
      $AirlineCode = $record["FLIGHTHISTORYCODESHARE/AIRLINE".$p."-AIRLINECODE"];
      $FlightNumber = $record["FLIGHTHISTORYCODESHARE".$p."-FLIGHTNUMBER"];
      $Ref = $AirlineCode.$FlightNumber;

...and insert each codeshare record at this point...
      $i++;
    }
 }
  MagicParser_parse("flighthistory.xml","myRecordHandler",
    "xml|FLIGHTHISTORYGETRECORDSRESPONSE/FLIGHTHISTORY/");
?>

After the first record has been inserted, run the last part of the script, which includes the loop to insert the remaining codeshare records.

Sorry for being thick.

Submitted by support on Thu, 2011-02-10 11:19

Hi,

You're spot on - I also edited your post to show exactly where to insert each codeshare record...

Cheers,
David.

Submitted by baggagepin on Thu, 2011-02-10 12:38

Hi David,

Sorry to be a complete pain. I think I am nearly there, but I am getting an error and for the life of me I can't see where it is coming from. The error I have is:

Parse error: syntax error, unexpected T_IF, expecting T_WHILE in /home/vhosts/mydomain.com/httpdocs/lhr/lhr_24_depart_dev3.php on line 253

Below is the complete script. Can you see what I have done wrong?
{code saved}

Submitted by support on Thu, 2011-02-10 13:29

Hi,

My mistake - instead of:

do {

...that should be:

while(1) {

(line 194 in the code you posted...)
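For reference, a small sketch of the two loop shapes PHP accepts here. The parse error arises because a do block must be closed by while (condition); after its closing brace. The counters and arrays are purely illustrative.

```php
<?php
// Shape 1: endless loop that is left via break (the form suggested above).
$i = 0;
$seen = [];
while (true) {
    if ($i >= 3) break;   // exit condition checked inside the body
    $seen[] = $i;
    $i++;
}

// Shape 2: do ... while, where the condition follows the closing brace.
$j = 0;
$seenDo = [];
do {
    $seenDo[] = $j;
    $j++;
} while ($j < 3);

echo implode(",", $seen), " / ", implode(",", $seenDo), "\n"; // 0,1,2 / 0,1,2
```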

Cheers,
David.

Submitted by baggagepin on Thu, 2011-02-10 14:13

Hi David,

Many thanks for all your help.

The script works, I think. The issue I have now is that I get the following error:

Fatal error: Maximum execution time of 60 seconds exceeded in /home/vhosts/mydomain.com/httpdocs/lhr/lhr_24_depart_dev3.php on line 255

Which is something I will need to look at.

Again, many thanks,
Dereck

Submitted by support on Thu, 2011-02-10 14:15

Hi Dereck,

Try adding

  set_time_limit(0);

...at the top of your script to fix that if allowed by your host...
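A minimal sketch of how that call might be guarded, assuming you want to know when the host refuses it: set_time_limit() returns false if the limit could not be changed. The variable name $lifted and the log message are illustrative only.

```php
<?php
// Ask PHP to lift the execution time limit (0 means no limit).
// The function_exists() guard covers hosts where the function is unavailable.
$lifted = function_exists('set_time_limit') ? @set_time_limit(0) : false;

if (!$lifted) {
    // Illustrative fallback: report the limit that is still in force.
    error_log("Time limit unchanged: " . ini_get('max_execution_time') . "s");
}
```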

All the best,
David.

Submitted by baggagepin on Thu, 2011-02-10 16:06

Hi David,

Just one last question before I jump out the window.

I inserted "set_time_limit(0);" and ran the script. After about 5 minutes the script was still running, but looking at the error log I see:

PHP Warning: set_time_limit() [function.set-time-limit]: Cannot set time limit in safe mode in /home/vhosts/mydomain.com/httpdocs/lhr/lhr_24_depart_dev3.php on line.

At the same time I was monitoring the server using "top" and the CPU was running between 99.9% and 100.7%; I'm not too sure how I can see 100.7%.

The XML file contains around 2500 records if I include the codeshares along with the primary records.

My question I think is; should the script run faster than it is?

To get around the issue at the moment I have written 18 scripts, each one dealing with one level of the codeshare flights: @1, @2, @3 and so on. I have the scripts run as cron jobs between 00:15 and 00:30 in sets of 5. I know that your solution is far better than the way I am currently doing it. Not too sure which way to turn.

Any ideas?

Best regards,

Dereck

Background:

Dedicated server
CPU GenuineIntel, Intel(R) Core(TM)2 Duo CPU E7500 @ 2.93GHz
Total Mem = 2Gb
Mem in use 1.0 GB free of 1.9 GB

Submitted by support on Thu, 2011-02-10 16:13

Hi Dereck,

As you have a dedicated server, I would turn safe_mode off (it has been deprecated as of PHP 5.3 anyway!).

To do this, edit your php.ini and change

safe_mode = On

to:

safe_mode = Off

(if you're not sure where php.ini resides, use "locate php.ini" at your command prompt)

Don't forget to restart your web server after making the changes if browsing to the script via HTTP.

Better still, of course, if you are not doing so already, would be to run your script from the command line (where PHP applies no execution time limit by default), e.g.

$ cd /home/vhosts/mydomain.com/httpdocs/lhr/
$ php lhr_24_depart_dev3.php

(where $ is your command prompt)

Hope this helps!
Cheers,
David.