
Support Forum



Creating subpages for each xml item

Submitted by function on Fri, 2008-01-11 11:14

Hello.

Is it possible to create subpages for each item in an XML feed, without using MySQL?

For example, storing each item in mydomain.com/details/item1.php, ../details/item2.php etc., where itemX is an increasing number or is based on, for example, the item title.
And then having links on the main PHP page to each of the generated detail pages.
Both the detail pages and the main page would be able to use XML elements from the specific item.

Or maybe there's another script that could do this in cooperation with magicparser?

What I'm basically trying to do is create an archive page for each item, OR archive the main PHP page when it has reached X entries.

The XML feeds I'm using produce so many entries that I'd like to save them somehow, without ending up with a never-ending list of entries.

I really hope you understand what I'm trying to explain - it's not easy I admit :)

Submitted by support on Fri, 2008-01-11 12:28

Hi,

If you don't want to use a database, then your only other option is to write files directly to disk, using an appropriate file-naming convention. To do this, you need to be able to create the directory "/details/" with WRITE access for PHP, so that your script can create files in it.

Do you know how / if you are able to do this? It might be possible via your FTP programme. If you create a sub-directory called "details", then right-click and look for "Permissions..." or "Properties..." and then "Permissions...", you should be able to enable world write access to the folder. This should enable a PHP script to create a file in that directory, which you can test like this:

<?php
  $dir = "details/";
  $filename = $dir."test.php";
  $fp = fopen($filename,"w");
  if ($fp)
  {
    fwrite($fp,"<html><body>Test</body></html>");
    fclose($fp);
  }
  else
  {
    print "Could not open ".$filename." for WRITE access!";
  }
?>
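
If your FTP program doesn't let you change permissions, it's sometimes possible to create and open up the directory from PHP itself - a rough sketch, assuming your host allows the web server account to create directories under your web root:

<?php
  // create the details directory if it does not already exist
  if (!is_dir("details")) mkdir("details",0777);
  // open it up for write access (some hosts restrict this)
  chmod("details",0777);
?>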

Cheers,
David.

Submitted by function on Fri, 2008-01-11 13:10

Thanks a lot David.

Could you help me with generating a number for the filename?

So $filename would be an incrementing number for every new item.
Instead of test.php I would get 1.php for the first item, 2.php for the second item, etc.

Much appreciated,
Tro

Submitted by support on Fri, 2008-01-11 13:20

Hi Tro,

Have you thought about how to handle duplicates? You mention that you want to parse items from a feed and write those items to x.php...

If these are RSS feeds that you are reading, are they likely to contain some items that you have already archived, and you only want to write to a file / archive the new items?

If this is the case, it might be better to create a filename that is a function of each item rather than an incremental number; and this also means that you don't have to try and work out which is the next number to use (although there are various ways to do this)...
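
For example, something along these lines - just a sketch, and the "TITLE" and "LINK" field names are only examples; you'd use whatever fields your feed actually provides:

<?php
  // $record stands for one parsed item - for example:
  $record = array("TITLE" => "Example item", "LINK" => "http://www.example.com/item");

  // derive the filename from the item itself, so re-running the script
  // simply skips anything that has already been archived
  $filename = "details/".md5($record["TITLE"].$record["LINK"]).".php";
  if (!file_exists($filename))
  {
    $fp = fopen($filename,"w");
    if ($fp)
    {
      fwrite($fp,"<html><body>".$record["TITLE"]."</body></html>");
      fclose($fp);
    }
  }
?>

Hashing the title together with the link means two items with the same title but different links still end up in different files.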

Does that make sense?

Cheers,
David.

Submitted by function on Fri, 2008-01-11 13:27

Hi David.

No, I don't use feeds that have duplicates or re-use other sites' feeds.
All items should be unique.

I thought about using an MD5 hash of the title like you show in your XML cache post.
But that will run into a duplicate filename problem, as some items are bound to have the same title.

I haven't set my mind to anything specific. I just thought the increasing number solution would be the easiest.

Tro

Submitted by function on Fri, 2008-01-11 13:32

Also, I actually already have a script that does what I mention. But the parser is very limited, and I can only get title, description, url and date out of the feeds (which is why I bought magicparser).

The code for writing is here:

<?php
set_time_limit(86400);
error_reporting(E_ERROR);
ignore_user_abort(true);
include("parser.php");
include("config.inc");

global $titles;
// load the list of titles we have already archived
$titles = explode("\n", implode("", file("history.txt")));
if (count($feeds) == 0) { echo "Error: Feed addresses not found"; exit; }
flush();

while ($f = array_shift($feeds))
{
  $parser = new myParser($f);
  $out = $parser->getRawOutput();
  echo "<b>Feed</b>: ".$f.", <b>Structure</b>: $structure<br><br><br>New titles:<br>";
  flush();
  ob_flush();
  if ($structure == "atom") {
    $entries = $out["FEED"]["ENTRY"];
  }
  if ($structure == "rss") {
    $tentries = $out["RSS"]["CHANNEL"];
    $source["link"] = $tentries[0]["LINK"];
    $source["title"] = $tentries[0]["TITLE"];
    $entries = $tentries[0]["ITEM"];
  }
  $linkstr = " ";
  array_walk($entries, 'process_headlines');
}

function process_headlines($entry) {
  global $structure, $titles, $pagefilenamemask, $pagesfolder, $indexfilename, $source;
  $title = trim($entry['TITLE']);
  $title = ucfirst($title);
  //$title = strip_tags($title);
  $title = preg_replace("/\W+/", " ", $title);
  $title = preg_replace("/\W$/", "", $title);
  $avai = available($title);
  if (($avai != "y") and ($title)) {
    // record the new title in history.txt
    $titles[] = $title;
    $content = implode("\n", $titles);
    if (is_writable("history.txt")) {
      if (!$handle = fopen("history.txt", 'w')) {
        echo "Cannot open file for titles";
        exit;
      }
      if (fwrite($handle, $content) === FALSE) {
        echo "Cannot write to file for titles";
        exit;
      }
      fclose($handle);
    } else {
      echo "The titles file is not writable";
    }
    echo $title."<br>";
    if ($structure == "rss") {
      $desc = $entry['DESCRIPTION'];
      $rss_link = $entry['LINK'];
      $rss_date = $entry['PUBDATE'];
    } else {
      $desc = $entry['CONTENT'];
    }
    $desc = stripslashes($desc);
    // build the detail page filename from the title and write the page out
    $linkp = preg_replace("/\W+/", "-", strtolower($title));
    $link = preg_replace("/{title}/", $linkp, $pagefilenamemask);
    $filename = $pagesfolder."/".$link;
    $somecontent = parse($title, $desc, $rss_link, $rss_date);
    if (!$handle = fopen($filename, 'w')) {
      echo "Cannot open file ($filename)";
      exit;
    }
    if (fwrite($handle, $somecontent) === FALSE) {
      echo "Cannot write to file ($filename)";
      exit;
    }
    fclose($handle);
  }
}

// rebuild the index page from the full list of titles
$titles = array_reverse($titles);
$filename = $indexfilename;
$somecontent = parse_index($titles);
if (!$handle = fopen($filename, 'w')) {
  echo "Cannot open file ($filename)";
  exit;
}
if (fwrite($handle, $somecontent) === FALSE) {
  echo "Cannot write to file ($filename)";
  exit;
}
fclose($handle);

function available($title) {
  global $titles;
  // compare against FALSE so a match at index 0 is not missed
  if (array_search($title, $titles) !== FALSE) return "y";
  /*
  while ($lt = array_shift($t_t)) {
    if (trim($lt) == $title) {
      $yeah = 1;
    }
  }
  echo count($t_t);
  if ($yeah) return "y";
  */
}

function parse($title, $desc, $rss_link, $rss_date) {
  global $template, $source, $linkstr;
  // fill the page template placeholders with this item's values
  $html = implode("", file($template));
  $html = preg_replace("/{title}/", $title, $html);
  $html = preg_replace("/{content}/", $desc, $html);
  $html = preg_replace("/{source-link}/", $source["link"], $html);
  $html = preg_replace("/{source-title}/", $source["title"], $html);
  $html = preg_replace("/{item-link}/", $rss_link, $html);
  $html = preg_replace("/{item-date}/", $rss_date, $html);
  return $html;
}

function parse_index($titles) {
  global $index_template, $lines, $pagefilenamemask, $pagesfolder, $linkstr;
  // build the list of links to the most recent detail pages
  $titles = array_splice($titles, 0, $lines);
  $cntnt = "";
  while ($line = array_shift($titles)) {
    $linkp = preg_replace("/\W+/", "-", strtolower($line));
    $link = preg_replace("/{title}/", $linkp, $pagefilenamemask);
    $filename = $pagesfolder."/".$link;
    $cntnt .= "<a href='$filename'>$line</a><br>";
  }
  $html = implode("", file($index_template));
  $html = preg_replace("/{content}/", "{content}<br><br>".$linkstr, $html);
  $html = preg_replace("/{content}/", $cntnt, $html);
  return $html;
}
?>

Submitted by support on Fri, 2008-01-11 13:38

Ok, no probs....

The safest way to do this without having to worry about integrity (overwriting an existing file) is to scan the directory and work out the highest number that way. Keep an eye on performance, however, if the directory grows to several hundred files. Here's a way to do that:

<?php
  $dir = "details/";
  $max = 0;
  $dh = opendir($dir);
  while (($file = readdir($dh)) !== false)
  {
    $n = intval(substr($file,strpos($file,".")));
    if ($n > $max) $max = $n;
  }
  closedir($dh);
  // now we have worked out max, start with the next number
  $next = $max + 1;
  $filename = $dir.$next.".php";
  $fp = fopen($filename,"w");
  if ($fp)
  {
    fwrite($fp,"<html><body>Test</body></html>");
    fclose($fp);
  }
  else
  {
    print "Could not open ".$filename." for WRITE access!";
  }
?>

Assuming that works correctly for a few test runs (you should get 1.php, 2.php etc.), then all you need to do in your script is increment $next for each item and construct $filename as above...

Hope this helps!
Cheers,
David.

Edit: I'd replied before seeing your code above - let me know if you still need help incorporating the above into your myRecordHandler() function when using Magic Parser. Don't forget to global in variables like $next...

Submitted by function on Fri, 2008-01-11 13:50

Hi David.

Again, thanks for helping me out.
The above code just creates 1.php and nothing more after that (after refreshes, of course).

If you can somehow implement magicparser into the above code, that would really be helpful.
I only included the update part of the script - I didn't think you needed the parser, as that would be magicparser's job I guess.
Let me know if you'd like that code though.

Thanks,
Tro

Submitted by support on Fri, 2008-01-11 13:59

Sorry - there was a typo (0 needed in the call to substr) - here's the correct version:

<?php
  $dir = "details/";
  $max = 0;
  $dh = opendir($dir);
  while (($file = readdir($dh)) !== false)
  {
    $n = intval(substr($file,0,strpos($file,".")));
    if ($n > $max) $max = $n;
  }
  closedir($dh);
  // now we have worked out max, start with the next number
  $next = $max + 1;
  $filename = $dir.$next.".php";
  $fp = fopen($filename,"w");
  if ($fp)
  {
    fwrite($fp,"<html><body>Test</body></html>");
    fclose($fp);
  }
  else
  {
    print "Could not open ".$filename." for WRITE access!";
  }
?>
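
And once that's creating 1.php, 2.php etc. correctly, here's a rough sketch of how the same idea could sit inside a Magic Parser record handler. The feed URL, format string and field names below are only examples - use whatever the Magic Parser demo tool reports for your actual feed:

<?php
  require("MagicParser.php");

  $dir = "details/";

  // work out the next free number by scanning the directory, exactly as above
  $max = 0;
  $dh = opendir($dir);
  while (($file = readdir($dh)) !== false)
  {
    $n = intval(substr($file,0,strpos($file,".")));
    if ($n > $max) $max = $n;
  }
  closedir($dh);
  $next = $max + 1;

  function myRecordHandler($record)
  {
    global $dir, $next;
    // write one detail page per item, then move on to the next number
    $filename = $dir.$next.".php";
    $fp = fopen($filename,"w");
    if ($fp)
    {
      fwrite($fp,"<html><body>");
      fwrite($fp,"<h2>".$record["TITLE"]."</h2>");
      fwrite($fp,"<p>".$record["DESCRIPTION"]."</p>");
      fwrite($fp,"</body></html>");
      fclose($fp);
    }
    $next++;
  }

  // example format string for a standard RSS 2.0 feed
  MagicParser_parse("http://www.example.com/rss.xml","myRecordHandler","xml|RSS/CHANNEL/ITEM/");
?>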

Submitted by support on Fri, 2008-01-11 14:02

Which part do you still need help with?

Do you have Magic Parser reading your files correctly based on the BBC News example?

Presumably you need to auto-detect RSS or Atom format, then dump the item contents to the file 1.php, 2.php etc...

Could you also post the format that you want to dump the items into (an example in HTML) - that would help create the PHP required to generate that markup...

Cheers,
David.

Submitted by function on Fri, 2008-01-11 20:43

I see what you mean now David.

I implemented the last script, but it just creates a new X.php in details/ every time I refresh the main PHP file.

What I wanted it to do was to pass variables from each new item into its own detail page.

Kind of like I have on http://www.admproject.com/
It reads the RSS feed, pushes the variables I want into a detail page (from a template file), and creates a link to the detail page from the main page.

But that script can only parse the normal main variables such as title, description, url and date.
I want to be able to grab more variables such as image and other special variables embedded in most feeds.

That's why I somehow wanted to merge the script seen on http://www.admproject.com/ (and listed above) with magicparser so that I could play with more variables.

Thanks,
Tro

Submitted by support on Sun, 2008-01-13 12:40

Hello Tro,

I think I understand what you want to do, but it's very hard to just take the code you posted earlier and try to make it do what you want. Could you look at the two questions I posted above? That will help me understand your requirements so I can try to help you better...

Cheers,
David.