Affiliate Window zip datafeed

Submitted by vestas on Thu, 2008-09-04 16:08

Hello,

I have written a PHP script which uses Magic Parser to parse Affiliate Window (AWin) datafeeds in XML format.

This works fine for uncompressed feeds, but AWin have now made changes so that the larger datafeeds (more than about 40,000 products) are only available as compressed files, and my script does not work with these zip files.

My question: is it possible to parse these zipped datafeeds (specifically Play.com) using Magic Parser?

The datafeed URL I need to parse is in this format:

http://datafeeds.productserve.com/datafeed_products.php?user=xxxxx&password=xxxxxxxxxxx&mid=1418&format=XML&dtd=1.2&compression=zip

Submitted by support on Thu, 2008-09-04 16:22

Hi,

What you will need to do is retrieve the remote document to a local file, unzip, and then parse the local file.

There are various ways to do this, and which one suits you depends on what programs are available on your server to perform the "unzipping" from within PHP.

As a starter for 10, firstly create a working directory in the same folder as your script, for example "files", and make sure that it is writable by all users (for the time being). The easiest way to do this is normally through your FTP program - if you create the folder in the remote window, and then right-click, look for a "Permissions..." or "Properties..." option that leads you to the permissions settings, and check everything; allowing write access to owner / group / world.
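If you would rather check the working directory from PHP than from the FTP program, a minimal sketch follows. It assumes PHP is allowed to create the directory itself; ensureWorkingDir() is a hypothetical helper name and not part of Magic Parser.

```php
<?php
// Hypothetical helper: create the working directory if it does not exist
// yet, then confirm that PHP can write to it.
function ensureWorkingDir($dir)
{
    // mkdir() with the recursive flag also creates missing parent folders;
    // the requested mode may be reduced by the server's umask
    if (!is_dir($dir) && !mkdir($dir, 0777, true)) {
        return false; // could not create the directory
    }
    return is_writable($dir);
}
```

Calling ensureWorkingDir("files/") before the download and stopping if it returns false would show early whether a later failure is down to permissions.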

With a working directory in place, the first thing to try would be to use copy() (which relies on PHP's URL fopen wrappers being enabled) to retrieve the remote document from Affiliate Window, then shell out to the "unzip" command to decompress, and finally parse the decompressed file. Here's the basic idea:

<?php
  require("MagicParser.php");

  $workingDir = "files/"; // must be writable by PHP

  $url = "http://datafeeds.productserve.com/datafeed_products.php...."; // compressed document to fetch

  // fetch document to working directory
  $filename = $workingDir."temp.xml";
  if (!copy($url,$filename))
  {
    print "File download failed - check permissions!";
    exit(); // nothing to unzip or parse if the download failed
  }

  // shell out to unzip command; -p writes the extracted data to stdout
  $cmd = "/usr/bin/unzip -p ".$filename." > ".$filename.".unzipped";
  exec($cmd);

  // replace the zipped download with the unzipped copy
  unlink($filename);
  rename($filename.".unzipped",$filename);

  // temp.xml should now be unzipped, and can be parsed as normal, so
  // replace the remainder with your existing code - feed should be in $filename
  function myRecordHandler($record)
  {
    // process $record
  }

  MagicParser_parse($filename,"myRecordHandler","xml|PRODUCTS/PRODUCT/");
?>
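The script above does not check whether the unzip command actually succeeded. Here is a minimal sketch of one way to check, assuming an Info-ZIP style unzip that exits with status 0 on success. runCommand() and buildUnzipCommand() are hypothetical helper names, not part of Magic Parser.

```php
<?php
// Hypothetical helper: run a shell command and report whether it exited
// cleanly. exec() stores the command's exit code in its third argument.
function runCommand($cmd)
{
    $output = array();
    $status = 1; // assume failure until exec() reports otherwise
    exec($cmd, $output, $status);
    return $status === 0;
}

// Hypothetical helper: build the same unzip command as the script above,
// with both file names quoted for the shell.
function buildUnzipCommand($zipFile, $outFile)
{
    return "/usr/bin/unzip -p ".escapeshellarg($zipFile)." > ".escapeshellarg($outFile);
}
```

With these in place, the exec($cmd) line could become a test on runCommand(buildUnzipCommand($filename, $filename.".unzipped")), printing a message and stopping before the unlink() and rename() if it returns false.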

Hope this helps as a starter for 10; if no joy we'll add some debug code to find out how far it gets and look at other options...

Cheers,
David.

Submitted by vestas on Thu, 2008-09-04 17:05

Many thanks David, it works perfectly with smaller files, but with the Play.com feed it unzips 137708k worth of data and then I get this error:
"Fatal error: Allowed memory size of 8388608 bytes exhausted (tried to allocate 47 bytes) in /mypathto/MagicParser.php on line 2"

Also, isn't it a bit dangerous having the 'files/' directory permissions set to 777?

Submitted by support on Thu, 2008-09-04 17:12

Hi,

In the larger file case, that sounds like the unzip didn't actually work, and therefore MagicParser has been trying to read the entire file into memory looking for the XML header - which it won't find as it's looking at zipped data.

Can you exit() the script after the unzip and study the unzipped file and check the contents?
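One quick way to check the contents from PHP itself is to look at the first few bytes of the file. A minimal sketch, relying on the fact that a non-empty zip archive begins with the signature "PK" followed by the bytes 0x03 0x04; looksZipped() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: returns true if the file still starts with the
// ZIP local file header signature, i.e. it has not been decompressed.
function looksZipped($filename)
{
    $handle = fopen($filename, "rb");
    if (!$handle) {
        return false; // file could not be opened, so nothing to report
    }
    $signature = fread($handle, 4); // only the first four bytes are needed
    fclose($handle);
    return $signature === "PK\x03\x04";
}
```

If looksZipped($filename) returns true just before the call to MagicParser_parse(), the parser is being handed compressed data rather than XML.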

Regarding having a directory set to 777: those permissions only apply to users on the local machine, not to visitors connecting via the web server. So unless you are running something like WebDAV (an HTTP extension for remote file editing, which Apache supports through mod_dav) there is no way for anyone externally to write to that directory.

Having said that, it's good practice to keep permissions as tight as possible; so once this is working correctly you could establish what user PHP is running as (often something like "apache" or "daemon" depending on how your server is configured) and add that user to a group that has write access to the working directory...
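To find out which user PHP is running as, something along these lines may help. It is a sketch only, assuming either the POSIX extension or a "whoami" command is available on the server; phpProcessUser() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: report the user name the PHP process runs as.
function phpProcessUser()
{
    // Preferred route: ask the POSIX extension for the effective user
    if (function_exists("posix_geteuid") && function_exists("posix_getpwuid")) {
        $info = posix_getpwuid(posix_geteuid());
        if ($info !== false) {
            return $info["name"];
        }
    }
    // Fallback: shell out to whoami (an empty string means it failed)
    return trim((string) exec("whoami"));
}
```

Printing phpProcessUser() from a page served by the web server shows the account that needs write access; once that account is in the directory's group, a mode such as 770 instead of 777 should be enough.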

Cheers,
David.

Submitted by vestas on Thu, 2008-09-04 17:21

I've checked the contents of the unzipped file using cPanel's file manager and it has unzipped the complete file OK (134 MB worth of data), but I don't understand why I get the error.

Submitted by vestas on Thu, 2008-09-04 17:47

Sorry, please ignore the last post where I said that the complete file had been unzipped.

It was almost unzipped completely.
It appears that 95% of the compressed file from play.com was unzipped and then I got the error before it was completed.

Submitted by support on Thu, 2008-09-04 18:10

Hi,

Firstly, in your call to MagicParser_parse() - are you using a format string?

For Affiliate Window feeds, it is normally "xml|PRODUCTS/PRODUCT/"

Cheers,
David.

Submitted by vestas on Thu, 2008-09-04 18:50

Yes, I am using "xml|PRODUCTS/PRODUCT/"

Submitted by support on Thu, 2008-09-04 18:53

Hi,

Could you perhaps add an exit(); statement after the following line:

rename($filename.".unzipped",$filename);

This will stop the script before attempting to parse the file, so we can make sure that it is working to this point. After doing this, can you check the file and confirm whether or not it has unzipped successfully...

Cheers,
David.

Submitted by vestas on Thu, 2008-09-04 19:27

Hi David,

I've done as you suggested and this time the file has unzipped successfully.

Submitted by support on Thu, 2008-09-04 19:37

Hi,

That's interesting, in which case it should go on to parse correctly.

Could you perhaps email me the script you are using and I'll take a look; and if possible, a link to the unzipped file if it is reachable via a public URL... for example:

http://www.yoursite.com/files/temp.xml

Cheers,
David.

Submitted by vestas on Thu, 2008-09-04 20:04

Email sent.

Submitted by vestas on Fri, 2008-09-05 11:33

I have managed to get it working by running two separate scripts:
the first gets the zipped file and unzips it, then the second script parses the unzipped file created by the first script.

I just have to make sure I leave enough time between scripts so that the first one has finished. I am still puzzled as to why it won't work all in one script though.
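For anyone using the same two-script approach, one way to avoid relying on timing is a marker file that the first script creates when it has finished. This is a sketch under the assumption that both scripts share the same working directory; the function names and the ".ready" suffix are hypothetical.

```php
<?php
// First script: call this after the unzip and rename have completed.
function markFeedReady($filename)
{
    return touch($filename.".ready"); // creates an empty marker file
}

// Second script: parse only if the feed and its marker both exist.
function feedIsReady($filename)
{
    return file_exists($filename) && file_exists($filename.".ready");
}

// Second script: remove the marker after parsing so that the next run
// waits for a fresh download.
function clearFeedReady($filename)
{
    return !file_exists($filename.".ready") || unlink($filename.".ready");
}
```

The second script would then start with a check on feedIsReady($filename) and exit straight away if it returns false, leaving the next scheduled run to try again.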

I would like to add that if anybody is not sure whether to buy Magic Parser or Price Tapestry, it is worth the price for the support alone.