Any way to specify an XML Format String to have parsing stop at a given level and/or property name only?

Submitted by cogden on Wed, 2007-01-17 01:05

Is there any way to specify a format string that will STOP at a given level and not recurse down through all the children? I have a complex file that maxes out the server and takes over an hour to run because the parser tries to process all children, when all I really want is the high-level element information.

For example, given...

<?xml version="1.0" encoding="UTF-8"?>
<f:full xmlns:f="http://schema.afasfasfasf/Full" site-code="ampcme" created-date="2007-01-14T22:43:11.819-08:00">
<f:date-range start="2000-01-01T00:00:00.000-08:00" end="2007-02-01T23:59:59.999-08:00"/>
<crdcat:credit-category xmlns:crdcat="http://schema.afasfasfasf/CreditCategory" id="3" agency="XXXXXXX Category 1" category-name="SDFDSFDFD Category 1" description="blahblahbah" credit-type="CME" credit-unit="Credit" accrediting-body="ddddd"/>
<crdcat:credit-category xmlns:crdcat="http://schema.afasfasfasf/CreditCategory" id="99" agency="Not Currently Accredited" category-name="N/A"/>
<crs:course xmlns:crs="http://schema.afasfasfas/Course" id="abc_course;10046" title="ddddd" expiration-date="9999-12-31T00:00:00.000-08:00" publication-date="2000-04-01T00:00:00.000-08:00" in-production="true" is-linkable="true" is-expired="false" etc., etc.
lots and lots and lots of children, subchildren, etc. (e.g., the first record had 4,453 fields, the second 11,068, and the third 13,139)

Setting $format_string = "xml|f:full/crs:course/"; makes the parser try to process all the ugly details.

Ideally, there'd be a way to stop processing at a given level.

However, if that's not possible, if I could specify a property name, I could even do multiple passes and put them together:
$format_string = "xml|f:full/crs:course/expiration-date"; (no trailing slash, or something)

Any ideas would be muchly appreciated!

Submitted by support on Wed, 2007-01-17 08:20

Hi,

I understand the issue here, but unfortunately there's no way to limit recursion through the format string. Bear with me and I'll have a look at the code to see if this would be straightforward to support, for example with an extra parameter on the format string...
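In the meantime, one possible workaround outside Magic Parser is PHP's built-in XMLReader extension, whose next() method moves past an element without descending into its children. The sketch below is not Magic Parser code: readTopLevelAttributes() is a hypothetical helper, the sample XML and its urn: namespaces are made up, and it assumes the high-level information lives in the element's attributes, as in the crs:course example above.

```php
<?php
// Sketch only: plain PHP XMLReader, not Magic Parser.
// Collects the attributes of every matching element and skips its subtree.
function readTopLevelAttributes($xml, $localName, $namespaceUri)
{
  $reader = new XMLReader();
  $reader->XML($xml); // for a large file on disk, use $reader->open($filename)
  $rows = array();
  $more = $reader->read();
  while ($more)
  {
    if ($reader->nodeType == XMLReader::ELEMENT
        && $reader->localName == $localName
        && $reader->namespaceURI == $namespaceUri)
    {
      $row = array();
      while ($reader->moveToNextAttribute())
      {
        // leave out namespace declarations such as xmlns:crs
        if (strpos($reader->name, "xmlns") === 0) continue;
        $row[$reader->name] = $reader->value;
      }
      $reader->moveToElement();
      $rows[] = $row;
      // next() continues after this element, so its children are never visited
      $more = $reader->next();
    }
    else
    {
      $more = $reader->read();
    }
  }
  $reader->close();
  return $rows;
}

// Example usage with made-up data
$sampleXml = '<f:full xmlns:f="urn:full">'
  . '<crs:course xmlns:crs="urn:course" id="a" title="T1">'
  . '<crs:child x="1"><crs:course id="nested"/></crs:child>'
  . '</crs:course>'
  . '<crs:course xmlns:crs="urn:course" id="b" title="T2"/>'
  . '</f:full>';
$courses = readTopLevelAttributes($sampleXml, "course", "urn:course");
```

Because the reader streams the document, memory use should stay low even when each course has thousands of descendant fields, although the skipped content still has to be read from disk.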

Cheers,
David.

Submitted by cogden on Tue, 2007-01-23 15:27

That would be INCREDIBLY useful - thanks!

Submitted by fraccozzo on Sat, 2007-01-27 14:57

Hi, I have a similar problem. I would like to organize nodes into different tables, for example. With the $record syntax, I am not able to separate the upper-level nodes that contain children.

For example, if I have

booktitle
bookisdn

and then I have

cdtitle
cdprice

Using the record syntax, I get a single table containing:
1) booktitle
2) bookisdn
3) cdtitle
4) cdprice

My question is: is there a way either to run two queries, one per node type, or to run a single query and still keep the upper-level nodes separate? A further complication is that each upper-level node may contain different children.

Many thanks in advance for the help

Submitted by support on Sat, 2007-01-27 15:10

Hi,

One trick you can use that may help in this situation is to look at whether you have, for example, a BOOKTITLE field or a CDTITLE field in each record, and then handle each case as required; for example:

<?php
  function myRecordHandler($record)
  {
    // !empty() avoids an undefined-index notice when the field is absent
    if (!empty($record["BOOKTITLE"]))
    {
      // handle book record
    }
    if (!empty($record["CDTITLE"]))
    {
      // handle CD record
    }
  }
?>
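To get separate tables from a single pass, the same test can route each record into its own list. This is only a sketch: the field names BOOKTITLE and CDTITLE come from the example above, while classifyRecord(), the "tables" global, and the sample records are hypothetical names and data used for illustration.

```php
<?php
// Sketch only: sort records into one list per record type.
// "tables" and classifyRecord() are hypothetical names.
$GLOBALS["tables"] = array("book" => array(), "cd" => array(), "other" => array());

// Decide which table a record belongs to from the fields it contains.
function classifyRecord($record)
{
  if (!empty($record["BOOKTITLE"])) return "book";
  if (!empty($record["CDTITLE"])) return "cd";
  return "other";
}

// Record handler: append the record to the table for its type.
function myRecordHandler($record)
{
  $type = classifyRecord($record);
  $GLOBALS["tables"][$type][] = $record;
}

// Example usage with made-up records; in real use the parser would
// call myRecordHandler() once for each record it finds.
myRecordHandler(array("BOOKTITLE" => "Some Book", "BOOKISDN" => "123"));
myRecordHandler(array("CDTITLE" => "Some CD", "CDPRICE" => "9.99"));
```

After the run, each list holds only records of one type, so each can be written to its own database table even when the two types have different children.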

Apologies if I have misunderstood your question. It might help if you are able to post an example of your XML and the data you are trying to access, and I'll see what could be done...

Hope this helps!
Cheers,
David.

Submitted by fraccozzo on Mon, 2007-01-29 20:25

Hi, I just realized that I forgot to put the code I submitted inside the code tags.

Here is a more complicated example

[xml saved]

What I need to do is organize my results into release-list and relation-list. If I set the format to the highest node, how do I select child nodes and single values inside those child nodes?

Please note that it should be a loop, as I need to query for all those records inside a higher node.

Many thanks for your time.

Submitted by support on Mon, 2007-01-29 20:32

Hi,

I understand what you're trying to do now. Accessing multiple child nodes of the same name is possible because Magic Parser automatically distinguishes duplicate field names with @1..2..n, so you can loop through these, testing for values.
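As a rough sketch of such a loop: the helper below gathers every value stored under a repeated field name. It assumes that the first occurrence is stored under the plain name and later ones under NAME@1, NAME@2 and so on; that layout is an assumption, so dump a real $record with print_r() to confirm the actual keys first. collectRepeated() and the TRACK field are hypothetical names.

```php
<?php
// Sketch only. ASSUMPTION: the first occurrence of a repeated field is
// stored under its plain name, later ones under NAME@1, NAME@2, ...
function collectRepeated($record, $name)
{
  $values = array();
  if (isset($record[$name]))
  {
    $values[] = $record[$name];
  }
  // keep going until the next numbered key is missing
  for ($i = 1; isset($record[$name . "@" . $i]); $i++)
  {
    $values[] = $record[$name . "@" . $i];
  }
  return $values;
}

// Example usage with a made-up record
$sampleRecord = array("TITLE" => "Album", "TRACK" => "One", "TRACK@1" => "Two", "TRACK@2" => "Three");
$tracks = collectRepeated($sampleRecord, "TRACK");
```

The loop stops at the first missing number, so it relies on the suffixes being consecutive.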

The easiest way is for me to write an example script for you. Could you please send me a copy of the raw XML file that you are working with (the data that you posted contains Internet Explorer markup so is not valid XML)? If you reply to your registration code email or forum registration email, it will reach me. I'll then be able to put together an example for you...

Cheers,
David.

Submitted by cogden on Sat, 2007-02-24 00:57

David, I'm heading back from Antarctica and wanted to check in to see if you've had any luck adding a parameter?

As they say in Argentina, much gusto!
-c

Submitted by support on Sat, 2007-02-24 12:00

Hi,

I've not yet updated Magic Parser to support this, but as soon as I have updated the code I will let you know (assuming that it is possible to implement).

Apologies for the inconvenience.
Cheers,
David.