Default format string doesn't dig deep enough

Submitted by qbuster on Wed, 2009-03-18 14:41

David

I have a large XML file, and the default format string misses some of the data I would like to extract.

I have included a section of the file below. The auto-produced format string is xml|CANAL/PART/PLACE/, which gives me access to most of the parts I am interested in, but not all of them.

So I can get access to name, x, y, id and even postcode but I also need

1. 'canal' name and id
2. 'waterway-info' gauge
3. 'part' name and id

How do I get hold of those items?

Thanks

Will Chapman

<?xml version="1.0" encoding="utf-8"?>
<canal name="Driffield Navigation" id="c_m1o4">
    <waterway-info type="whole">
        <gauge>broad</gauge>
        <towpath>none</towpath>
        <region>North East</region>
        <length>61</length>
        <width>14.5</width>
    </waterway-info>
    <part name="Main waterway" id="c_2mmk">
        <place name="Driffield Wharfs" x="502700" y="457400" id="g7n0">
            <type>town</type>
            <postcode>YO25 6PR</postcode>
        </place>
        <place name="Driffield Lock No 1" x="503100" y="456900" id="imgm">
            <type>dot</type>
            <lock dir="down" height="unknown"/>
            <postcode>YO25 6NU</postcode>
            <dist unit="flg">3</dist>
        </place>
        <place name="Whin Hill Lock No 2" x="505100" y="456800" id="cmfj">
            <type>dot</type>
            <lock dir="down" height="unknown" />
            <postcode>YO25 8JJ</postcode>
            <dist unit="flg">9</dist>
        </place>
        <place name="Wansford Lock No 3" x="506200" y="456200" id="fchj">
            <extra>Wansford Village</extra>
            <type>village</type>
            <lock dir="down" height="unknown" />
            <postcode>YO25 8NU</postcode>
            <dist unit="flg">8</dist>
        </place>
        <place name="Snakeholme Locks Nos 4 and 5" x="506700" y="455500" id="c32n">
            <type>dot</type>
            <lock dir="down" height="unknown" mult="2" />
            <postcode>YO25 8JN</postcode>
            <dist unit="flg">4</dist>
        </place>
    </part>
    <part name="West Beck" id="c_911h">
        <place name="Junction with Corps Landing Branch" id="md4g">
            <type>reference</type>
        </place>
        <place name="Corps Landing" x="506268" y="452971" id="lim0" source="gmap">
            <type>feature</type>
            <postcode>YO25 8JW</postcode>
            <dist unit="flg">14.4</dist>
        </place>
    </part>
    <part name="Frodingham Beck" id="c_e9id">
        <place name="Junction with Frodingham Beck" x="508140" y="452743" id="480b" source="gmap">
            <extra>Also known as Fisholme</extra>
            <alias>Fisholme</alias>
            <type>reference</type>
        </place>
        <place name="North Frodingham Wharf" x="508789" y="453717" id="3qti" source="gmap">
            <type>spot</type>
            <dist unit="metres">140</dist>
        </place>
        <place name="North Frodingham Road Bridge" x="508909" y="453787" id="qp7m" source="gmap">
            <type>spot</type>
            <postcode>YO25 8BP</postcode>
            <dist unit="metres">1280</dist>
        </place>
    </part>
</canal>

Submitted by support on Wed, 2009-03-18 15:40

Hello Will,

The issue here is that the extra content you need sits at a higher level within the XML, not a lower one. The auto-detected format string has correctly picked out the lowest-level, most frequently repeating element (PLACE), which is indeed what you are interested in. Because of the way Magic Parser works, accessing the higher-level information means parsing the file twice: first with the higher-level document element (xml|CANAL/), and then a second time, with a different record handler function, to read each of the PLACE elements.

Based on your XML, here's an example of the above process, reading from filename.xml, which you can replace with a URL if that is how you are obtaining your data.

<?php
  require("MagicParser.php");

  // First pass: capture the canal-level values from the document element
  function myCanalRecordHandler($record)
  {
    global $canalName;
    global $canalId;
    global $canalGauge;
    global $canalPartName;
    global $canalPartId;
    $canalName = $record["CANAL-NAME"];
    $canalId = $record["CANAL-ID"];
    $canalGauge = $record["WATERWAY-INFO/GAUGE"];
    $canalPartName = $record["PART-NAME"];
    $canalPartId = $record["PART-ID"];
  }

  // Second pass: called once for each PLACE element
  function myPlaceRecordHandler($record)
  {
    // process each PLACE $record as before, and if you need
    // any of the global variables set from the first parse
    // just global them in within this function
  }

  $xml = file_get_contents("filename.xml");
  MagicParser_parse("string://".$xml,"myCanalRecordHandler","xml|CANAL/");
  MagicParser_parse("string://".$xml,"myPlaceRecordHandler","xml|CANAL/PART/PLACE/");
?>
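
As a rough sketch of what "global them in" could look like inside the place handler: the version below combines the canal-level values with each place record. It is only an illustration under assumptions. The key names PLACE-NAME and POSTCODE are guessed from the ELEMENT-ATTRIBUTE pattern used in the canal handler, the $rows array is a hypothetical place to collect results, and the canal values and the sample record are hard-coded so the sketch runs without MagicParser.php.

```php
<?php
// Values the first pass would have stored; hard-coded here so the
// sketch runs on its own without MagicParser.php.
$canalName = "Driffield Navigation";
$canalId = "c_m1o4";
$canalGauge = "broad";

// Hypothetical collection of combined rows (not part of Magic Parser).
$rows = array();

function myPlaceRecordHandler($record)
{
  // Pull in the values captured during the first parse.
  global $canalName, $canalId, $canalGauge, $rows;

  // ASSUMPTION: key names follow the pattern seen in the canal handler.
  $rows[] = array(
    "canal"    => $canalName,
    "canal_id" => $canalId,
    "gauge"    => $canalGauge,
    "place"    => $record["PLACE-NAME"],
    // Not every place has a postcode, so fall back to an empty string.
    "postcode" => isset($record["POSTCODE"]) ? $record["POSTCODE"] : "",
  );
}

// Hand-built record standing in for what the parser would pass in.
myPlaceRecordHandler(array(
  "PLACE-NAME" => "Driffield Wharfs",
  "PLACE-X"    => "502700",
  "PLACE-Y"    => "457400",
  "PLACE-ID"   => "g7n0",
  "TYPE"       => "town",
  "POSTCODE"   => "YO25 6PR",
));

print_r($rows);
```

In real use the handler would be called by the second MagicParser_parse call rather than by hand, and it is worth printing one $record with print_r first to confirm the actual key names.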

I am out of the office for a couple of days, so apologies if I can't reply as quickly as normal to any further queries. I will be back online as normal from Friday.

Hope this helps!
Cheers,
David.

Submitted by qbuster on Wed, 2009-03-18 18:39

Perfect - just what I was trying to figure out...

Thanks

Will

PS I see we both live in Staffs - I'm in Alrewas