I want to parse a Wikipedia page. You can see this on the MagicParser demo pages:
Try inputting this URL:
http://en.wikipedia.org/wiki/Special:Export/Jennifer_Black
and using the Demo.
Works fine.
However, copy the PHP over and you get this message:
could not open "http://en.wikipedia.org/wiki/Special:Export/Jennifer_Black"
Has anybody got an idea what is going wrong?
Thanks David,
That fopen page might as well be in Martian! I have no idea where to start with that. I do have access to my php.ini file, if that is useful, but I don't know what to put into it.
I would only use the Wikipedia Special:Export function here. Maybe it would help if I knew what
http://en.wikipedia.org/wiki/Special:Export/Jennifer_Black
actually resolves to as a filename.
http://en.wikipedia.org/wiki/Special:Export/Jennifer_Black.xml
Or something. Perhaps that would help, getting round the wrapper problem?
all the best
Scott
Hi Scott,
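Since you have access to php.ini, you should have permission to set a PHP configuration directive within your script itself. So as a first experiment (it only has an effect on older PHP versions, since from PHP 4.3.4 onwards allow_url_fopen can only be changed in php.ini), try adding the following line right at the top of your script (after the opening PHP tag):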
ini_set("allow_url_fopen","1");
If that still doesn't work, then try the same setting by editing your php.ini. First search for "allow_url_fopen" (without the quotes) in the file, and if you find it, see what it is currently set to. If it is 0 or "FALSE", change the setting to "1". Otherwise, add the following line at the end:
allow_url_fopen = 1
Don't forget that you will need to restart PHP (or Apache if PHP is running as a module) before changes to php.ini take effect...
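To check which setting PHP is actually using, a small script along these lines may help. It is only a sketch: the function name urlFopenEnabled is made up for illustration, and the return type declaration assumes PHP 7 or later.

```php
<?php
// Report whether URL wrappers are enabled for this request.
// (urlFopenEnabled is a hypothetical helper name, not part of MagicParser.)
function urlFopenEnabled(): bool
{
    // ini_get() returns the value as a string ("1", "On", "", "0", ...),
    // so normalise it to a boolean before testing it.
    return filter_var(ini_get("allow_url_fopen"), FILTER_VALIDATE_BOOLEAN);
}

// Show which php.ini was loaded, in case the file you edited is not the one in use.
print "Loaded php.ini: " . (php_ini_loaded_file() ?: "(none)") . "\n";
print "allow_url_fopen is " . (urlFopenEnabled() ? "enabled" : "disabled") . "\n";
```

If the reported php.ini is not the file you edited, the change will not be picked up, whatever value you put in it.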
Cheers,
David.
allow_url_fopen = On
allow_url_include = On
session.use_only_cookies = 1
session.use_trans_sid = 0
Dear David,
The above is what php.ini had on this subject. I buy space on a remote server, and the host suggests changing php.ini to make this or that happen (with no need for rebooting etc.). Unfortunately they do not offer support on "programming issues".
So I added:
allow_url_include = 1
allow_url_fopen = 1
But that didn't change anything.
all the best
Scott
Hi Scott,
Could you try this test script to confirm for sure whether it is a URL wrappers problem? It simulates exactly what MagicParser.php does when trying to open a URL...
<?php
  $url = "http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml";
  $fp = fopen($url,"r");
  if ($fp)
  {
    print "Success";
  }
  else
  {
    print "fopen() Failed";
  }
?>
Cheers,
David.
Hi Scott,
That's interesting, as it implies that URL wrappers are working fine, so that's not the problem. OK, the next test is to try exactly the same code, but with the Wikipedia URL...
<?php
  $url = "http://en.wikipedia.org/wiki/Special:Export/Jennifer_Black";
  $fp = fopen($url,"r");
  if ($fp)
  {
    print "Success";
  }
  else
  {
    print "fopen() Failed";
  }
?>
Cheers,
David.
Warning: fopen(http://en.wikipedia.org/wiki/Special:Export/Jennifer_Black) [function.fopen]: failed to open stream: HTTP request failed! HTTP/1.0 403 Forbidden in /home/globalgu/public_html/temp2.html on line 5
Hi Scott,
As you can see from that error message, the web server refused to answer the request coming from your PHP script.
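If you would rather read the status code in the script than pick it out of the warning text, something like the following should work. It is a rough sketch only: parseStatusCode and fetchStatusCode are names invented for this example, and it relies on the "ignore_errors" option of PHP's HTTP stream context, so that a 4xx response still returns a stream.

```php
<?php
// Extract the numeric status code from a line such as "HTTP/1.0 403 Forbidden".
// (Hypothetical helper, for illustration only.)
function parseStatusCode(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\S+\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Open the URL without treating 4xx/5xx responses as failures, and return
// the status code of the final response (the last status line seen).
function fetchStatusCode(string $url): ?int
{
    $context = stream_context_create(array("http" => array("ignore_errors" => true)));
    $fp = @fopen($url, "r", false, $context);
    if (!$fp) {
        return null; // no HTTP response at all (DNS failure, wrappers disabled, ...)
    }
    $meta = stream_get_meta_data($fp);
    fclose($fp);

    $status = null;
    $headers = isset($meta["wrapper_data"]) ? $meta["wrapper_data"] : array();
    foreach ($headers as $line) {
        // Redirects add one status line per response, so keep the last one.
        $code = is_string($line) ? parseStatusCode($line) : null;
        if ($code !== null) {
            $status = $code;
        }
    }
    return $status;
}

// Example use:
// print fetchStatusCode("http://en.wikipedia.org/wiki/Special:Export/Jennifer_Black");
```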
Now, this could be because of the user agent, if the remote server is trying to prevent fetching by PHP scripts. Try the following test script, which adds a line at the top to set the user agent. This identifies who you are, which is considered polite when making automated requests such as this:
<?php
  ini_set("user_agent", "GlobalGuide/1.0");
  $url = "http://en.wikipedia.org/wiki/Special:Export/Jennifer_Black";
  $fp = fopen($url,"r");
  if ($fp)
  {
    print "Success";
  }
  else
  {
    print "fopen() Failed";
  }
?>
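Where you are making the fopen() call yourself, the user agent can also be set for a single request with a stream context instead of ini_set(). This is a minimal sketch, with userAgentContext as a made-up helper name. It does not apply to calls that MagicParser.php makes internally, where the ini_set() line above is the simpler route.

```php
<?php
// Build a stream context that carries a User-Agent header for one request.
// (userAgentContext is a hypothetical helper name.)
function userAgentContext(string $userAgent)
{
    return stream_context_create(array(
        "http" => array("user_agent" => $userAgent),
    ));
}

$context = userAgentContext("GlobalGuide/1.0");

// Pass the context as the fourth argument to fopen():
// $fp = fopen("http://en.wikipedia.org/wiki/Special:Export/Jennifer_Black", "r", false, $context);
```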
Hi David,
That worked like a dream - that was all it was!!!!!
Thanks very much
Scott
Hi,
This sounds like your server is not able to fopen() files by URL... There's some info on how to enable this (or what to ask your host) in this thread...
http://www.magicparser.com/node/189
That should be all it is - let me know if you need any more help or need to look at other ways of retrieving the remote document...
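As an illustration of one other way of retrieving the remote document, here is a rough sketch using PHP's cURL extension. It assumes the extension is installed on your host; the function name fetchWithCurl and the timeout value are arbitrary choices for this example.

```php
<?php
// Fetch a URL with cURL and return the body, or null on any failure.
// (fetchWithCurl is a hypothetical helper name; requires the cURL extension.)
function fetchWithCurl(string $url, string $userAgent): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow redirects
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent); // identify the script to the remote server
    curl_setopt($ch, CURLOPT_FAILONERROR, true);     // treat HTTP 4xx/5xx as a failure
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);    // give up connecting after 10 seconds
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Example use:
// $xml = fetchWithCurl("http://en.wikipedia.org/wiki/Special:Export/Jennifer_Black", "GlobalGuide/1.0");
```

The result could then be saved to a local file for parsing, which avoids URL wrappers altogether.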
Cheers,
David.