SkyScanner feed

Submitted by travelfrog on Sat, 2006-08-05 14:12

Help please?
I have been using Magic Parser for some time and have managed to display all the feeds that I wanted to. However, I have come across a feed from SkyScanner that I cannot get to display and I cannot see where the problem lies. When I have tried other online feed readers, the feed is read ok. Is it something simple that I am not seeing?

Here is the basic MagicParser script that I am using

<?php
  require $_SERVER['DOCUMENT_ROOT'].'/magic-parser/MagicParser.php';
  function myRecordHandler($item)
  {
    print "<h2><a href='".$item["LINK"]."'>".$item["TITLE"]."</a></h2>";
    print "<p>".$item["DESCRIPTION"]."</p>";
  }
  print "<h1>BBC News Headlines</h1>";
  $url = "http://www.skyscanner.net/gbp/flights-from/cvt/w4/weekend-breaks-from-coventry.rss";
  //THE FEEDS BELOW DISPLAY OK, BUT THE SKYSCANNER FEED ABOVE WILL NOT DISPLAY
  //http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml
  //http://www.prweb.com/rss2/daily.xml
  MagicParser_parse($url,"myRecordHandler","xml|RSS/CHANNEL/ITEM/");
?>

Submitted by support on Sat, 2006-08-05 14:29

Hi,

The feed seems to read OK (and your code looks fine), however note that it is ISO-8859-1 encoded so you need to set the appropriate HTTP header so that the pound sign in the feed is displayed correctly. Here's my test script:

SkyScanner Offers

Source:

<?php
  header("Content-Type: text/html; charset=iso-8859-1");
  require("MagicParser.php");
  function myRecordHandler($item)
  {
    print "<h2><a href='".$item["LINK"]."'>".$item["TITLE"]."</a></h2>";
    print "<p>".$item["DESCRIPTION"]."</p>";
  }
  print "<h1>SkyScanner Offers</h1>";
  $url = "http://www.skyscanner.net/gbp/flights-from/cvt/w4/weekend-breaks-from-coventry.rss";
  MagicParser_parse($url,"myRecordHandler","xml|RSS/CHANNEL/ITEM/");
?>

If the above code doesn't work on your server, add the following code at the end to see if an error message has been generated by Magic Parser:

print MagicParser_getErrorMessage();
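
As an alternative to sending the ISO-8859-1 header, a page that is otherwise served as UTF-8 could convert each field before printing it. This is only a minimal sketch: it assumes the mbstring extension is available, and latin1ToUtf8() is a hypothetical helper, not part of Magic Parser.

```php
<?php
// Hypothetical helper: convert an ISO-8859-1 string to UTF-8.
// Assumes the mbstring extension is installed.
function latin1ToUtf8($text)
{
  return mb_convert_encoding($text, "UTF-8", "ISO-8859-1");
}

// Made-up sample: the pound sign is the single byte 0xA3 in ISO-8859-1.
$title = "Flights from \xA320";

// Show the converted bytes in hex so the change is visible on any terminal.
print bin2hex(latin1ToUtf8($title))."\n";
?>
```

Inside myRecordHandler() the same call could wrap $item["TITLE"] and $item["DESCRIPTION"] before they are printed.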

Cheers,
David.

Submitted by travelfrog on Sat, 2006-08-05 15:04

Only the header SkyScanner Offers displays without the feed content. I get the following error message.

could not open http://www.skyscanner.net/gbp/flights-from/cvt/w4/weekend-breaks-from-coventry.rss

Submitted by support on Sat, 2006-08-05 16:01

Hi,

I take it the exact same code works with any other URL; so there are no problems fopen()'ing a URL on your PHP installation?

Cheers,
David.

Submitted by travelfrog on Sat, 2006-08-05 23:23

Ok, these are the feeds that I have tried and their parse result:

<?php
  header("Content-Type: text/html; charset=iso-8859-1");
  require $_SERVER['DOCUMENT_ROOT'].'/magic-parser/MagicParser.php';
  function myRecordHandler($item)
  {
    print "<h2><a href='".$item["LINK"]."'>".$item["TITLE"]."</a></h2>";
    print "<p>".$item["DESCRIPTION"]."</p>";
  }
  print "<h1>SkyScanner Offers</h1>";
  $url = "http://www.real-estate-supply.com/feed.xml";
  //http://www.real-estate-supply.com/feed.xml -- parsed ok
  //http://www.bestflights.com.au/rss/bfnews.rss -- parsed ok
  //http://www.travelblog.org/rss/travelblog.rss -- parsed ok
  //http://www.bestflights.com.au/rss/bfnews.xml -- parsed ok
  //http://www.skyscanner.com/rss.asp?city=NEWY&ccy=USD -- not parsed
  //http://xml.newsisfree.com/feeds/62/162.xml -- parsed ok
  //http://www.scripting.com/rss.xml -- parsed ok
  //http://abcinformatic.com/travel-deal/atom.xml -- not parsed
  //http://www.flightstats.com/go/rss/airportdelays.xml -- parsed ok
  //http://p.moreover.com/cgi-local/page?c=Greece%20news&o=rss002 -- not parsed
  //http://www.carhire3000.com/affiliates/tfrog/tfrog.xml -- not parsed
  //http://www.skyscanner.net/gbp/flights-from/cvt/w4/weekend-breaks-from-coventry.rss -- not parsed
  MagicParser_parse($url,"myRecordHandler","xml|RSS/CHANNEL/ITEM/");
  print MagicParser_getErrorMessage();
?>

Submitted by support on Sun, 2006-08-06 01:30

Hi,

Of the feeds above, it would seem that, with the exception of the SkyScanner feeds, the ones that you are not able to parse with the code as it stands fail because they are not in RSS format, so the format string that you are using (xml|RSS/CHANNEL/ITEM/) is not valid for them.

For example, your feed http://www.carhire3000.com/affiliates/tfrog/tfrog.xml requires the format string "xml|PRODUCTS/PRODUCT/".

I would recommend using the demo tool with any new feed to discover what format string to use, using the above feed as an example:

http://www.magicparser.com/demo?fileID=44D5465087095&record=1
(demo will be deleted shortly)

If you click Generate PHP Source Code on that page you will get a customised demo script just for that feed.

Of the other non-SkyScanner feeds that did not work:
http://abcinformatic.com/travel-deal/atom.xml
..is Atom format, and needs the format string "xml|FEED/ENTRY/".

http://p.moreover.com/cgi-local/page?c=Greece%20news&o=rss002
...works with the demo tool and is RSS format so may be related to the SkyScanner problem that you are having.
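
The three format strings mentioned in this thread could be chosen automatically by looking at the feed's root element. The following is a rough sketch only: guessFormatString() is a hypothetical helper, not a Magic Parser function, it knows only the three formats discussed here, and the demo tool remains the reliable way to find a format string.

```php
<?php
// Hypothetical helper: pick a Magic Parser format string from the
// root element of an XML document. Only the three formats mentioned
// in this thread are known; anything else returns null.
function guessFormatString($xml)
{
  $known = array(
    "RSS"      => "xml|RSS/CHANNEL/ITEM/",
    "FEED"     => "xml|FEED/ENTRY/",
    "PRODUCTS" => "xml|PRODUCTS/PRODUCT/",
  );
  // The first "<" followed by a letter is taken as the root element;
  // this skips "<?xml ...", "<!DOCTYPE ..." and "<!--" openers.
  // A comment that itself contains markup would fool this simple test.
  if (!preg_match('/<([A-Za-z_][A-Za-z0-9_.-]*)/', $xml, $m))
  {
    return null;
  }
  $root = strtoupper($m[1]);
  return isset($known[$root]) ? $known[$root] : null;
}

// Made-up miniature documents, one per format.
print guessFormatString('<?xml version="1.0"?><rss version="2.0"><channel/></rss>')."\n";
print guessFormatString('<feed><entry/></feed>')."\n";
print guessFormatString('<products><product/></products>')."\n";
?>
```

A null result means the feed is in some other format and should be checked with the demo tool.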

Regarding SkyScanner, is it possible that they are not serving their feeds to the IP address of your server because of being requested too frequently? That would be one thing to check.

Hope this helps,
Cheers,
David.

Submitted by travelfrog on Sun, 2006-08-06 15:32

David,
My apologies, I should have checked for the correct format string. I just picked the feeds at random to test the script with different feeds to see if I could repeat the error.

Do you think that the problem could be due to my server settings causing this problem?

Submitted by support on Sun, 2006-08-06 20:01

Hi,

It seems that you are able to open some URLs successfully, but not others. I would try this test script to simulate using fopen() to open URLs just as Magic Parser does; and see what happens on your server. It works fine running on this server:

fopen() URL Test

Here's the source:

<?php
  function test($url)
  {
    $file = fopen($url,"r");
    if (!$file)
    {
      print "<p>Failed to open ".$url."</p>";
      return;
    }
    $data = "";
    while(!feof($file)) $data .= fread($file,1024);
    fclose($file);
    print "<p>Successfully opened ".$url.", read ".strlen($data)." bytes.</p>";
  }
  test("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml");
  test("http://www.skyscanner.net/gbp/flights-from/cvt/w4/weekend-breaks-from-coventry.rss");
?>

What output do you get when you try to run that on your server?

Cheers,
David.

Submitted by travelfrog on Mon, 2006-08-07 10:03

Hi David,

This is the result that I get from the fopen() test

Successfully opened http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml, read 17972 bytes.

Failed to open http://www.skyscanner.net/gbp/flights-from/cvt/w4/weekend-breaks-from-coventry.rss

Submitted by travelfrog on Mon, 2006-08-07 10:09

I have run php info and allow_url_fopen is set to on.

I am running php version 4.4.2

Submitted by support on Mon, 2006-08-07 10:43

Are you able to login to your server using Telnet or SSH?

The next thing I would do is try to eliminate DNS as a possible cause of the problem. To do this, log in to your server and try to PING the two host names involved in this test by typing the commands as follows:

ping newsrss.bbc.co.uk
ping www.skyscanner.net

If you get a response from newsrss.bbc.co.uk but not from www.skyscanner.net then this indicates that you have a DNS issue on your server or the network to which it is connected. Your hosting company would then need to look into resolving it.
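
If shell access is not available, name resolution alone can also be checked from a PHP script. This is a minimal sketch, and hostResolves() is a hypothetical helper. Unlike ping, it tests only the DNS lookup and says nothing about whether the host can actually be reached.

```php
<?php
// Hypothetical helper: report whether a host name resolves to an IPv4
// address. gethostbyname() returns the name unchanged when the lookup
// fails, so an unchanged result is treated as a failure.
// (An IP address passed in is also returned unchanged, so this check
// is only meaningful for host names.)
function hostResolves($host)
{
  $ip = gethostbyname($host);
  return $ip !== $host;
}
?>
```

Calling hostResolves("newsrss.bbc.co.uk") and hostResolves("www.skyscanner.net") from a page on the server would then show whether either lookup fails.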

Next, I would try to use WGET to retrieve the files instead of PHP to see what happens then. To do this, use the following commands (you must be in a directory in which your user has write access):

wget http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml
wget http://www.skyscanner.net/gbp/flights-from/cvt/w4/weekend-breaks-from-coventry.rss

If wget does not return an error message, look at the files using the "more" command to see what has been returned:

more rss.xml
more weekend-breaks-from-coventry.rss

(keep pressing SPACE or press CTRL+C to exit from more)

Cheers,
David.

Submitted by travelfrog on Mon, 2006-08-07 10:49

I don't know how to use Telnet or SSH.

Submitted by support on Mon, 2006-08-07 10:59

If you have a username and password that was provided with your hosting account you may be able to login using Telnet, although it is unlikely. From a Windows PC you can try clicking Start > Run then typing the command as follows:

telnet www.example.com

(where www.example.com is your website). You may then receive a login prompt, in which case you can enter your username etc.; however, it is more likely to say connection refused, as Telnet is not a secure protocol.

It is more likely that you will have SSH access, and to login using SSH you need an SSH client program. Putty is a popular SSH client for Windows which you can get from here:

http://www.chiark.greenend.org.uk/~sgtatham/putty/

When running Putty, be sure to check SSH as the connection method (I think it defaults to Telnet); and enter your website address e.g. www.example.com in the host name box.

Hope this helps!
David.

Submitted by travelfrog on Mon, 2006-08-07 11:41

Ok, I checked with my host and downloaded Putty for SSH access.
I ran the tests, but ping did not work for either.
I ran the wget tests and bbcnews was returned ok.

Skyscanner did not return a result, the output is below.

Resolving www.skyscanner.net...done
connecting to www.skyscanner.net[217.77.180.210]:80... done
HTTP request sent, awaiting response... 302 object moved
Location: http://www.skyscanner.net/de/gbp/flights-from/cvt/wk4/weekend-breaks-from-coventry.rss?redirecturl=28Url [following]
http://www.skyscanner.net/de/gbp/flights-from/cvt/wk4/weekend-breaks-from-coventry.rss?redirecturl=28Url
=> 'weekend-breaks-from-coventry.rss?redirecturl=28Url
connecting to www.skyscanner.net[217.77.180.210]:80... connected
HTTP request sent, awaiting response...404 Not Found
ERROR 404: Not Found

Submitted by support on Mon, 2006-08-07 12:06

Here's the output from my server:

$ wget http://www.skyscanner.net/gbp/flights-from/cvt/w4/weekend-breaks-from-coventry.rss
--12:03:21-- http://www.skyscanner.net/gbp/flights-from/cvt/w4/weekend-breaks-from-coventry.rss
           => `weekend-breaks-from-coventry.rss'
Resolving www.skyscanner.net... 217.77.180.210
Connecting to www.skyscanner.net[217.77.180.210]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/xml]
    [ <=> ] 8,892 6.95K/s
12:03:23 (6.95 KB/s) - `weekend-breaks-from-coventry.rss' saved [8,892]

That's what you should see: the document is downloaded. On your server, however, you are getting an HTTP 404 error (file not found). The same thing is obviously happening (on your server) when PHP tries to open the file: it is getting the 404 (not found). Unfortunately, I don't really know what more to suggest; there is nothing wrong with anything on your server.

It may be worth contacting SkyScanner (showing them the output from wget) and asking for their assistance, and also letting them know the IP address of your server in case they are restricting access to their feeds somehow.
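
The same status codes can be seen from PHP rather than wget by looking at the response headers. The sketch below only picks the status lines out of a list of headers; statusLines() and finalStatusCode() are hypothetical helpers, and the sample array is modelled on the wget output in this thread (the HTTP version and the extra header are assumptions).

```php
<?php
// Keep only the HTTP status lines from a list of raw response headers,
// such as the array returned by get_headers($url).
function statusLines($headers)
{
  $lines = array();
  foreach ($headers as $header)
  {
    if (is_string($header) && strncasecmp($header, "HTTP/", 5) == 0)
    {
      $lines[] = $header;
    }
  }
  return $lines;
}

// Return the numeric code of the last status line, or 0 if there is none.
function finalStatusCode($headers)
{
  $lines = statusLines($headers);
  if (!$lines) return 0;
  if (preg_match('/^HTTP\/\S+\s+(\d{3})/i', $lines[count($lines) - 1], $m))
  {
    return (int)$m[1];
  }
  return 0;
}

// Illustrative headers only, not captured from a real request.
$sample = array(
  "HTTP/1.1 302 Object moved",
  "Location: http://www.skyscanner.net/de/gbp/flights-from/cvt/w4/weekend-breaks-from-coventry.rss?redirecturl=28Url",
  "HTTP/1.1 404 Not Found",
  "Content-Type: text/html",
);
print implode(" -> ", statusLines($sample))."\n";
print finalStatusCode($sample)."\n";
?>
```

On a live server the array would come from get_headers($url), which is available from PHP 5 and by default follows redirects and returns the headers of each response in the chain. On PHP 4 the $http_response_header variable set after fopen() on a URL may serve a similar purpose.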

Cheers,
David.

Submitted by travelfrog on Mon, 2006-08-07 13:18

I have contacted SkyScanner.net and also my host 1and1.co.uk to see what they can suggest. I will let you know what response I get.

Submitted by travelfrog on Thu, 2006-08-10 14:01

Hi David,

Latest update since I contacted SkyScanner and 1and1 two days ago.

SkyScanner.net have not bothered to reply to my email yet. (Maybe I should forget about SkyScanner and try to find another company that can provide rss feeds for flights.)

My hosts 1and1.co.uk are looking into it but no suggestions or resolutions yet.

Submitted by travelfrog on Thu, 2006-08-10 23:29

Hi David,

No solution from my hosts 1and1.co.uk. This is their reply:

"There shouldn't be any issues regarding the use of pulling up rss feeds
from other site regarding our servers. I would await an outcome from
the provider of that script."

Do you have any other ideas why I cannot access the SkyScanner feed?

Submitted by travelfrog on Mon, 2006-08-21 10:56

Hi David,

Is there anything that you can think of as to why the Magic Parser script will not parse the SkyScanner feed on my server?

1and1.co.uk have not been able to help.

SkyScanner finally got in touch, and tried to help, but were not familiar with PHP and were unable to solve the problem.

As the author of the Magic Parser script, can you think of any obvious reason why your script will not parse a SkyScanner rss feed on my server, but will parse the feed on your server? It did appear, when we last spoke, that on my server the SkyScanner feed was being redirected somehow.

Submitted by support on Mon, 2006-08-21 11:35

Hi,

I've just got access to a 1&1 box so I logged in to try it and it seems like SkyScanner are for some reason blocking access to their feeds from 1&1's IP address space:

$ wget http://www.skyscanner.net/gbp/flights-from/cvt/w4/weekend-breaks-from-coventry.rss
--12:33:35-- http://www.skyscanner.net/gbp/flights-from/cvt/w4/weekend-breaks-from-coventry.rss
           => `weekend-breaks-from-coventry.rss'
Resolving www.skyscanner.net... 217.77.180.210
Connecting to www.skyscanner.net[217.77.180.210]:80... connected.
HTTP request sent, awaiting response... 302 Object moved
Location: http://www.skyscanner.net/de/gbp/flights-from/cvt/w4/weekend-breaks-from-coventry.rss?redirecturl=28Url [following]
--12:33:35-- http://www.skyscanner.net/de/gbp/flights-from/cvt/w4/weekend-breaks-from-coventry.rss?redirecturl=28Url
           => `weekend-breaks-from-coventry.rss?redirecturl=28Url'
Connecting to www.skyscanner.net[217.77.180.210]:80... connected.
HTTP request sent, awaiting response... 404 Not Found
12:33:35 ERROR 404: Not Found.

It's definitely not a 1&1 or PHP problem - it's SkyScanner that are not returning the feed when requested from 1&1's IP address space...

I'll see if I can contact anyone at SkyScanner to explain the situation to them - they are probably not aware that it is happening...

Cheers,
David.

Submitted by travelfrog on Mon, 2006-08-21 11:48

David,

Thank you for your prompt reply. I have now sent an email to SkyScanner and I will let you know what response I get.