You are here:  » XML to CSV


XML to CSV

Submitted by support on Mon, 2007-08-27 15:07 in

Hi everyone,

Here is a completely generic XML to CSV conversion script, which will work with any XML source:

xml2csv.php

<?php
  
require("MagicParser.php");
  
$xml "file.xml"// local filename or URL of your XML source
  
$csv "file.csv";
  function 
myRecordHandler($record)
  {
    global 
$csv;
    
// strip commas, new-line and carriage return characters from all fields
    
foreach($record as $key => $value)
    {
      
$record[$key] = str_replace(","," ",$value);
      
$record[$key] = str_replace("\n"," ",$value);
      
$record[$key] = str_replace("\r"," ",$value);
    }
    
fwrite($csv,implode(",",$record)."\n");
  }
  
$csv fopen($csv,"w");
  if (!
$csv) { print "Could not create output file - check permissions!";exit(); }
  
MagicParser_parse($xml,"myRecordHandler");
  
fclose($csv);
?>

If you need specify a Magic Parser format string for your XML source, simply change:

  MagicParser_parse($xml,"myRecordHandler");

to:

  MagicParser_parse($xml,"myRecordHandler","xml|FORMAT/STRING/");

In order to create file.csv the folder in which the script runs must be writable by PHP. The easiest way to do this is normally with your FTP program. In the remote window, right-click on the folder containing your xml2csv.php script and look for Permissions... or maybe Properties... and then Permissions. Then give WRITE access to all users (Owner / Group / World).

Alternatively, you can simply output the CSV data directly to the browser, as follows:

xml2csv.php

<?php
  
require("MagicParser.php");
  
header("Content-Type: text/plain");
  
$xml "file.xml"// local filename or URL of your XML source
  
function myRecordHandler($record)
  {
    
// strip commas, new-line and carriage return characters from all fields
    
foreach($record as $key => $value)
    {
      
$record[$key] = str_replace(","," ",$value);
      
$record[$key] = str_replace("\n"," ",$value);
      
$record[$key] = str_replace("\r"," ",$value);
    }
    print 
implode(",",$record)."\n");
  }
  
MagicParser_parse($xml,"myRecordHandler");
?>

Cheers,
David.

Submitted by indicator on Tue, 2007-11-13 09:01

Do you have a curl example?

thanks

Submitted by support on Tue, 2007-11-13 09:05

Hi,

I don't have a specific example, but people do use CURL with Magic Parser. There is some code in the following thread which shows how to use CURL to grab a remote feed:

http://www.magicparser.com/node/61

Cheers,
David.

Submitted by toulouse on Sat, 2009-05-09 17:24

Hello,

I seek to use your function with a dynamic file xml (rss.xml.php).
But, impossible to obtain a result.

  header('Content-Type: application/rss+xml; charset=utf-8');
  $xml = file_get_contents("http://www.(...).com/rss.xml.php");
  //$xml = str_replace("iso-8859-1","utf-8",$xml);
  //$xml = utf8_encode($xml);
  //$xml = "rss.xml.php";
  $csv = "rss.csv";
  require("MagicParser.php");
  function myRecordHandler($record)
  { ...

That functions perfectly with a file static (rss.xml).

You think that it is my server which poses a problem ?

Thank you for your assistance.

Mick

Submitted by support on Sun, 2009-05-10 07:43

Hello Mick,

That would indicate that your server is not permitted to fopen() a file (which is what file_get_contents does) by URL. The following page has more information on this...

http://php.net/manual/en/filesystem.configuration.php#ini.allow-url-fopen

It is often worth asking your host if it is possible to have URL wrappers enabled on your account so that you can do this.

Alternatively, your server may have CURL installed, in which case you could use that as an alternative to file_get_contents. Here's the equivalent code:

$ch = curl_init("http://www.example.com/");
curl_setopt($ch, CURLOPT_HEADER, 0 );
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$xml = curl_exec ( $ch );

Hope this helps!
Cheers,
David.

Submitted by toulouse on Sun, 2009-05-10 15:01

Thanks, in fact there was a restriction in my .htaccess file.

But, my file CSV adds an additional column.

Here my file xml :

<?xml version="1.0" encoding="UTF-8" ?>
- <database>
- <row>
  <name>...</name>
  <description>...</description>
  <url>...</url>
  </row>
- <row>
  <name>...</name>
  <description>...</description>
  <url>...</url>
  </row>
  </database>

My file xml2csv.php :

  function myRecordHandler($record)
  {
    global $csv;
    // strip commas, new-line and carriage return characters from all fields
    foreach($record as $key => $value)
    {
      $record[$key] = str_replace(","," ", $value);
      $record[$key] = str_replace("\n"," ", $value);
      $record[$key] = str_replace("\r"," ", $value);
    }
    fwrite($csv,implode("|",$record)."\n");
  }
  $csv = fopen($csv,"a");
  if (!$csv) { print "Could not create output file - check permissions!";exit(); }
  MagicParser_parse($xml,"myRecordHandler");
  fclose($csv);

And my file csv :

      [MagicParser_xml_current_record] => Array
        (
            [ROW] =>
            [NAME] => ...
            [DESCRIPTION] => ...
            [URL] => ...
        )

1 |...|...|...
2
3 |...|...|...
4
5 |...|...|...
6

Column ROW is added and a blank line separates the recordings.

It is possible to remove ROW and the blank line ?

Like your example :
http://www.magicparser.com/examples/mlexport.php

Thank for your assistance.

Mick

Submitted by support on Mon, 2009-05-11 05:23

Hello Mick,

The ROW column is being included because Magic Parser always returns the enclosing element of each record as it may contain data in some circumstances. It's easy to remove with unset() - for example:

  function myRecordHandler($record)
  {
    global $csv;
    unset($record["ROW"]);
    // strip commas, new-line and carriage return characters from all fields
    foreach($record as $key => $value)
    {
      $record[$key] = str_replace(","," ", $value);
      $record[$key] = str_replace("\n"," ", $value);
      $record[$key] = str_replace("\r"," ", $value);
    }
    fwrite($csv,implode("|",$record)."\n");
  }

As there's only one \n being output, I'm not sure why a blank line is being inserted between each row. If that is still occurring, could you perhaps email me your xml to csv script, and sample of the CSV being generated and I'll take a look for you...

Cheers,
David.

Submitted by toulouse on Mon, 2009-05-11 06:48

Hello David,

Thank you for your answer.
Your solution corrected all the problems (1 and 2).

I added some lines of code (unlink) to avoid adding the same recordings in the file csv.

header("Content-Type: text/html;charset=utf-8");
$xml = "{link saved}";
$csv = "data.csv";
if ( is_file ($csv) ) {
unlink ( $csv) ;
    echo "The file was deleted successfully.";
} else { echo "There was an error trying to delete the file.";
}
require ...

You think that it is a good method ?

Thank you for your patience !

Mick

Submitted by support on Mon, 2009-05-11 06:54

Hi Mick,

That looks fine!

Cheers,
David.

Submitted by toulouse on Tue, 2009-07-14 19:50

Hello David,

It is possible to apply a filter before obtaining file CSV ?

I tested :

  function myRecordHandler($record)
  {
if ($MagicParser_xml_current_record["CATEGORY"]=="50000") return;
    global $csv;

...but that does not function.

There is a possible solution ?

Thank you for your assistance.

Mick

Submitted by support on Thu, 2009-07-16 18:48

Hello Mick,

You should just be able to reference the local $record variable rather than relying on Magic Parser global variables; for example:

  function myRecordHandler($record)
  {
    if ($record["CATEGORY"]=="50000") return;
    // rest of record handler function

Cheers,
David.