Support Forum



too many records

Submitted by nans on Sun, 2008-02-17 12:42

I want to import the following data into my MySQL database:
http://www.100ideeen.be/EzWeb/MSN/NewsML.asp

Afterwards I want to use the data to create HTML pages.

The parser, however, generates tons of records, one for each HTML tag or line.
How can I avoid that?

Submitted by support on Mon, 2008-02-18 09:15

Hello nans,

Because this particular XML has HTML content that is not escaped within CDATA tags, it is not very easy to parse in its current form.

Assuming that contacting the publisher to ask them to correct this is not an option, the only way I can think of to handle it is to load the feed into a string and then manually add CDATA tags around the contents of the BODY.CONTENT tags using str_replace.

This will at least mean that you can extract the body content from each record using:

$record["NEWSCOMPONENT/NEWSCOMPONENT/NEWSCOMPONENT/CONTENTITEM/DATACONTENT/NITF/BODY/BODY.CONTENT/"];

Here is an example, using print_r() to dump the first record and Content-Type: text/plain for clarity:

<?php
  header("Content-Type: text/plain");
  require("MagicParser.php");
  // Dump the first record and stop.
  function myRecordHandler($record)
  {
    print_r($record);
    exit();
  }
  $url = "http://www.100ideeen.be/EzWeb/MSN/NewsML.asp";
  // Load the whole feed into a string.
  $xml = "";
  $fp = fopen($url,"r");
  while(!feof($fp)) $xml .= fread($fp,1024);
  fclose($fp);
  // Wrap the body content in CDATA so the HTML inside is not parsed as XML.
  $xml = str_replace("<body.content>","<body.content><![CDATA[",$xml);
  $xml = str_replace("</body.content>","]]></body.content>",$xml);
  MagicParser_parse("string://".$xml,"myRecordHandler","xml|NEWSML/NEWSITEM/");
?>
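
To see what the CDATA wrapping does without fetching the feed, here is a minimal sketch that applies the same two str_replace calls to a small inline sample. The sample XML is made up for illustration, and SimpleXML is used only as a quick way to check the result (assuming the extension is available); it is not part of Magic Parser.

```php
<?php
// Hypothetical sample: a cut-down record with unescaped HTML inside body.content.
$xml = "<newsitem><body.content><p>First <b>bold</b> line</p><p>Second line</p></body.content></newsitem>";

// Wrap the body content in a CDATA section so the HTML is treated as plain text.
$xml = str_replace("<body.content>", "<body.content><![CDATA[", $xml);
$xml = str_replace("</body.content>", "]]></body.content>", $xml);

// Check the result with SimpleXML (an assumption for this demo only).
$item = simplexml_load_string($xml, "SimpleXMLElement", LIBXML_NOCDATA);

// The element name contains a dot, so it needs the brace syntax.
$body = (string) $item->{"body.content"};

// The HTML now comes back as a single string instead of separate child elements.
print $body . "\n";
```

Note that this approach would break if the body content itself ever contained the sequence "]]>", since that ends the CDATA section early.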

Hope this helps!
Cheers,
David.

Submitted by nans on Thu, 2008-02-21 21:16

Thanks, I will experiment with this info.