Parse feed to xml file

Submitted by Ian37 on Mon, 2010-07-19 13:24

Hi, this is my feed: {link saved} which works OK.
It's about 25 MB. I need to output the feed to an XML file.

Any ideas?

Thanks

Submitted by support on Mon, 2010-07-19 13:34

Hi Ian,

I checked the URL in your post, and it is already an XML format feed. Can you provide more details about what you need to do?

The feed will parse correctly with Magic Parser using the Format String xml|MERCHANT/PRODUCT/, for example:

MagicParser_parse("{link saved}","myRecordHandler","xml|MERCHANT/PRODUCT/");

(you will see {link saved} where the URL of the feed should go)

If you can describe more about what you are trying to do with the feed, I will point you in the right direction.

Cheers,
David.

Submitted by Ian37 on Mon, 2010-07-19 13:45

David, the feed goes into a job board but it won't work. The board can also work from an XML file.
If I look at the feed in a browser, then cut and paste it into a new XML document, it works.
If I save the web page as an XML document, even with the correct encoding, I get parse errors from the job board.

Ian

Submitted by support on Mon, 2010-07-19 13:52

Hi Ian,

It sounds like the cut-and-paste procedure that you are using is cleaning the data, but when you save the page as an .xml file any encoding errors remain in place.

Try accessing the feed using the following script (you'll need to run this on a web server with PHP that you have access to):

clean.php

<?php
  header("Content-Type: text/xml; charset=utf-8");
  $xml = file_get_contents("{link saved}"); // put the feed URL between the quotes
  $xml = utf8_encode($xml); // re-encode to UTF-8; treats the input as ISO-8859-1
  print $xml;
  exit();
?>

Hope this helps!
Cheers,
David.

Submitted by Ian37 on Mon, 2010-07-19 16:18

David, thanks. When I run clean.php I get this.

XML Parsing Error: junk after document element
Location: http://www.jobs2web.net/clean.php
Line Number 2, Column 1:Warning: file_get_contents({link saved) [function.file-get-contents]: failed to open stream: No such file or directory in /home/job2web/SGOEN4TT/htdocs/clean.php on line 3

Submitted by support on Mon, 2010-07-19 16:21

Hi Ian,

I still see the { } brackets in the generated error message - make sure that your URL is the only thing "between the quotes", for example:

<?php
  header("Content-Type: text/xml; charset=utf-8");
  $xml = file_get_contents("http://www.example.com/feed.xml");
  $xml = utf8_encode($xml);
  print $xml;
  exit();
?>

Cheers,
David.

Submitted by Ian37 on Mon, 2010-07-19 16:30

David, thanks. This works but gives this error.
Fatal error: Allowed memory size of 68157440 bytes exhausted (tried to allocate 106596221 bytes) in /home/job2web/SGOEN4TT/htdocs/clean.php on line 4

Submitted by Ian37 on Mon, 2010-07-19 16:36

David, It runs but now I get this
Fatal error: Allowed memory size of 68157440 bytes exhausted (tried to allocate 106596221 bytes) in /home/job2web/SGOEN4TT/htdocs/clean.php on line 4

Submitted by support on Mon, 2010-07-19 16:39

Hi Ian,

The first thing to try is to see whether you are able to override the memory limit; otherwise a different approach would be required. Have a go with:

<?php
  ini_set("memory_limit","0");
  header("Content-Type: text/xml; charset=utf-8");
  $xml = file_get_contents("http://www.example.com/feed.xml");
  $xml = utf8_encode($xml);
  print $xml;
  exit();
?>

Cheers,
David.
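
Whether a host honours a memory limit override can be checked directly rather than waiting for the fatal error. The following is a minimal sketch, not part of the original reply: the helper name try_raise_memory_limit and the "256M" value are assumptions chosen for illustration. It relies only on ini_get() and ini_set(), which returns false when the change is refused.

```php
<?php
// Hypothetical helper (not from the thread): ask for a new memory limit and
// report what PHP says before and after, so a refused override is visible.
function try_raise_memory_limit($wanted)
{
    $before = ini_get("memory_limit");           // limit currently in force
    $result = ini_set("memory_limit", $wanted);  // false if the change was refused
    $after  = ini_get("memory_limit");           // limit PHP reports afterwards
    return array(
        "before"   => $before,
        "accepted" => ($result !== false),
        "after"    => $after,
    );
}

// "256M" is an illustrative value only.
$report = try_raise_memory_limit("256M");
print "before: " . $report["before"] . "\n";
print "accepted: " . ($report["accepted"] ? "yes" : "no") . "\n";
print "after: " . $report["after"] . "\n";
?>
```

If "after" does not show the requested value, the host is capping or blocking the setting and raising the limit from within the script is not an option.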

Submitted by Ian37 on Mon, 2010-07-19 16:44

David, it appears not
Fatal error: Allowed memory size of 99614720 bytes exhausted (tried to allocate 106596221 bytes) in /home/job2web/SGOEN4TT/htdocs/clean.php on line 5

Submitted by support on Mon, 2010-07-19 16:54

Hi Ian,

It is obviously a very large file. If it contains encoding errors and you are not able to load it fully into memory in order to cleanse it, then it will have to be corrected at source.

Please note that this is really outside of any Magic Parser specific support, but I'm always willing to help customers with data retrieval and cleansing problems.

Do you have a relationship with the provider of the feed such that you can contact them and ask if the encoding validity of the file can be checked?

Cheers,
David.
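
One possible way around the memory limit, sketched here as an illustration rather than as the approach David had in mind, is to copy the feed to a local .xml file in small chunks and convert each chunk as it passes through. This is only safe because ISO-8859-1 is a single-byte encoding, so a chunk boundary can never split a character. It also assumes the feed really is ISO-8859-1, which is what utf8_encode() assumes too; a feed that is already UTF-8 would be double-encoded. The function names latin1_to_utf8 and stream_feed_to_file are hypothetical, and the demo uses a small temporary file in place of the feed.

```php
<?php
// Convert a block of ISO-8859-1 bytes to UTF-8.
function latin1_to_utf8($bytes)
{
    if (function_exists("mb_convert_encoding")) {
        return mb_convert_encoding($bytes, "UTF-8", "ISO-8859-1");
    }
    // Same conversion as used in the thread; deprecated since PHP 8.2.
    return utf8_encode($bytes);
}

// Hypothetical helper: copy $source to $destination chunk by chunk, so only
// one chunk is held in memory at a time. Returns bytes written, or false.
function stream_feed_to_file($source, $destination, $chunkSize = 8192)
{
    $in = fopen($source, "rb");
    if ($in === false) {
        return false;
    }
    $out = fopen($destination, "wb");
    if ($out === false) {
        fclose($in);
        return false;
    }
    $written = 0;
    while (!feof($in)) {
        $chunk = fread($in, $chunkSize);
        if ($chunk === false) {
            break; // read error: stop rather than loop forever
        }
        $count = fwrite($out, latin1_to_utf8($chunk));
        if ($count === false) {
            break; // write error
        }
        $written += $count;
    }
    fclose($in);
    fclose($out);
    return $written;
}

// Demo with a tiny local file containing one ISO-8859-1 byte (0xE9, e-acute).
$src = tempnam(sys_get_temp_dir(), "feed");
$dst = tempnam(sys_get_temp_dir(), "clean");
file_put_contents($src, "<job>caf\xE9</job>");
$bytes = stream_feed_to_file($src, $dst);
print $bytes . " bytes written to " . $dst . "\n";
?>
```

For a real feed the first argument would be the feed URL, which requires allow_url_fopen to be enabled on the server, and the second the path of the .xml file to hand to the job board.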

Submitted by Ian37 on Mon, 2010-07-19 17:16

Thanks David. I will do.
If I run the url through the demo on your site it works fine.
MP is a great bit of software!

Submitted by support on Mon, 2010-07-19 17:35

Hello Ian,

Intolerance of encoding errors can be specific to a particular combination of PHP version and XML library installed on the server, but I will send you the cleansing versions of Magic Parser to try. Check your email shortly...

Cheers,
David.