Repeating unknown times

Submitted by olov on Thu, 2006-04-06 08:02

Hello

This is my XML (there are thousands of these records in the real XML file):

<list date="2006-04-05 05:19:20" language="en" num_hotels="11873">
<p id="2"> <name>The Westin Palace</name> <address>Piazza della Repubblica 20</address> <zip>20124</zip> <type>Hotel</type> <rating>5L</rating> <rooms>228</rooms> <availPolicy>Both</availPolicy> <activationDate>1997/06/18</activationDate> <usersRating>0</usersRating> <mapURL><![CDATA[http://www.venere.com/img/mappe/it/milano06/palestro.gif]]></mapURL> <locationURL><![CDATA[http://www.venere.com/maps/show_position.php?lg=en&geoid=303&hotel_id=2&map_id=8355&view=map&ref=0]]></locationURL> <venereRanking>448245</venereRanking> <templateType>New</templateType> <translations>
<language lg="en"/>
<language lg="it"/>
<language lg="de"/>
<language lg="fr"/>
<language lg="es"/>
</translations> <doublePriceMin>236</doublePriceMin> <doublePriceMax>522</doublePriceMax> <currency>EUR</currency> <geoID>303</geoID> <lat>45.478985343259200</lat> <lon>9.199338467600990</lon> <macroregion>Surroundings of Milan</macroregion> <country>Italy</country> <state/> <region>Lumbardy</region> <province>Milan</province> <city>Milan</city> <cityZone>Palestro</cityZone> <propertyURL>http://en.venere.com/hotels_milan/palestro/hotel_the_westin_palace.html?ref=0</propertyURL> <amenities>
<amenity>Entire property is air conditioned</amenity>
<amenity>Lift/elevator</amenity>
<amenity>Access for disabled</amenity>
<amenity>Pets accepted</amenity>
<amenity>Garage</amenity>
<amenity>Shuttle service from and/or to the airport</amenity>
<amenity>Baby sitter</amenity>
<amenity>Internet/Email services</amenity>
<amenity>Lounge bar</amenity>
<amenity>Restaurant</amenity>
<amenity>Laundry service</amenity>
<amenity>Room service - 24 hour</amenity>
<amenity>Front desk - 24 hour</amenity>
<amenity>Front desk - fax service</amenity>
<amenity>Meeting room</amenity>
<amenity>Business center</amenity>
<amenity>Banqueting service</amenity>
<amenity>Fitness center</amenity>
<amenity>Turkish bath</amenity>
<amenity>Massages</amenity>
<amenity>Wellness centre</amenity>
<amenity>Health center</amenity>
<amenity>Air conditioning</amenity>
<amenity>Heating</amenity>
<amenity>Hairdryer in room</amenity>
<amenity>Safe box</amenity>
<amenity>Mini bar</amenity>
<amenity>Direct dial phone</amenity>
<amenity>Internet plug</amenity>
<amenity>Satellite TV</amenity>
<amenity>Pay TV</amenity>
<amenity>Facilities for disabled people</amenity>
<amenity>Congress facilities</amenity>
<amenity>Wi-Fi Internet connection</amenity>
</amenities> <description lg="en"><![CDATA[The Westin Palace hotel is a luxury hotel located in the vibrant heart of Milan's fashion and finance district, offering international and elegant ambience and impeccable service. The property features a lounge bar, a restaurant, an Internet point, a spacious fitness centre provided with ultra-modern equipment, a comfortable relax area, 2 turkish baths and 3 exclusive treatment rooms. Furthermore, The Westin Palace offers 12 modular meeting rooms, hosting up to 400 people and representing the perfect solution for any type of event, from the most reserved meetings to the most prestigious gala dinners.]]></description> <photoURL><![CDATA[http://www.venere.com/img/hotel/2/0/0/0/2/2.jpg]]></photoURL> <images>
<image num="1">
<imageURL>http://www.venere.com/img/hotel/2/0/0/0/2/image_hotel_exterior_entrance_1.jpg</imageURL>
<thumbURL>http://www.venere.com/img/hotel/2/0/0/0/2/image_hotel_exterior_entrance_small_1.jpg</thumbURL>
<title>Exterior - entrance</title>
</image>
<image num="2">
<imageURL>http://www.venere.com/img/hotel/2/0/0/0/2/image_hotel_exterior_night_1.jpg</imageURL>
<thumbURL>http://www.venere.com/img/hotel/2/0/0/0/2/image_hotel_exterior_night_small_1.jpg</thumbURL>
<title>Exterior - night</title>
</image>
<image num="3">
<imageURL>http://www.venere.com/img/hotel/2/0/0/0/2/image_room_double_classic_1.jpg</imageURL>
<thumbURL>http://www.venere.com/img/hotel/2/0/0/0/2/image_room_double_classic_small_1.jpg</thumbURL>
<title>Double room classic</title>
</image>
<image num="4">
<imageURL>http://www.venere.com/img/hotel/2/0/0/0/2/image_room_double_deluxe_2.jpg</imageURL>
<thumbURL>http://www.venere.com/img/hotel/2/0/0/0/2/image_room_double_deluxe_small_2.jpg</thumbURL>
<title>Double room de luxe</title>
</image>
<image num="5">
<imageURL>http://www.venere.com/img/hotel/2/0/0/0/2/image_room_double_deluxe_1.jpg</imageURL>
<thumbURL>http://www.venere.com/img/hotel/2/0/0/0/2/image_room_double_deluxe_small_1.jpg</thumbURL>
<title>Double room de luxe</title>
</image>
</images> </p>
</list>

I am creating a PHP script that will parse the XML and insert it into a database.

This is my code so far:

  function myRecordHandler($item)
  {
    echo "Namn:" .$item["NAME"]."<br>";
    echo "Adress:" . $item["ADDRESS"]."<br>";
    echo "Postnummer:" . $item["ZIP"]."<br>";
    echo "Område:" . $item["CITYZONE"]."<br>";
    echo "Stad:" . $item["CITY"]."<br>";
    echo "Region:" . $item["REGION"]."<br>";
    echo "Provins:" . $item["PROVINCE"]."<br>";
    echo "Land:" . $item["COUNTRY"]."<br>";
    echo "Karta:" . $item["MAPURL"]."<br>";
    echo "locationURL:" . $item["LOCATIONURL"]."<br>";
    echo "propertyURL:" . $item["PROPERTYURL"]."<br>";
    echo "Bild:" . $item["PHOTOURL"]."<br>";
    echo "Longitud:" . $item["LON"]."<br>";
    echo "Latitud:" . $item["LAT"]."<br>";
    echo "Amenity:" . $item["AMENITIES/AMENITY"]."<br>";
    echo "Beskrivning" . $item["DESCRIPTION"]."<br><br>";
  }
  print "<h1>Hotell</h1>";
  $url = "catalog_tst.xml";
  MagicParser_parse($url,"myRecordHandler","xml|LIST/P/");

How do I access the unknown number of amenities?
Is it possible to loop through them within myRecordHandler?

Regards,
Olov

Submitted by support on Thu, 2006-04-06 08:15

Hello Olov,

Using the latest version of Magic Parser, duplicate key names are resolved by appending the "@" character, and a number, so you can access them all within your record handler function.

Firstly, here is your sample uploaded to the demo script:

Hotel XML Demo

You will notice the list of amenities with key names like "AMENITIES/AMENITY@7". The easiest way to access this list within your record handler function is to iterate over every key in $record with foreach() and check whether the key name begins with "AMENITIES/AMENITY". Here's an example that should work with your source file:

<?php
  require("MagicParser.php");

  function myRecordHandler($record)
  {
    echo "Namn:" . $record["NAME"] . "<br>";
    echo "List Of Amenities:<br>";
    echo "<ul>";
    // Repeated elements get keys such as "AMENITIES/AMENITY@7",
    // so match on the first 17 characters of each key.
    foreach($record as $key => $value)
    {
      if (substr($key,0,17)=="AMENITIES/AMENITY")
      {
        echo("<li>".$value."</li>");
      }
    }
    echo "</ul>";
  }

  $url = "catalog_tst.xml";

  MagicParser_parse($url,"myRecordHandler","xml|LIST/P/");
?>
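
If you need the same list in more than one place (amenities, image URLs and so on), the prefix check can be wrapped in a small helper. This is only a sketch: the function name collectRepeated and the hand-built $sample array are hypothetical, and the sample assumes the first occurrence has no suffix while later ones get "@1", "@2", and so on. The helper accepts either form, and it runs without Magic Parser so you can try it on its own (PHP 5.4 or later for the short array syntax).

```php
<?php
// Hypothetical helper: return every value whose key is $base itself
// or $base followed by "@<number>" (the duplicate-key suffix).
function collectRepeated($record, $base)
{
    $values = array();
    foreach ($record as $key => $value) {
        $key = (string) $key;
        if ($key === $base || strpos($key, $base . "@") === 0) {
            $values[] = $value;
        }
    }
    return $values;
}

// Hand-built stand-in for a parsed record; the exact key numbering
// is an assumption made for this illustration.
$sample = [
    "NAME"                  => "The Westin Palace",
    "AMENITIES/AMENITY"     => "Lift/elevator",
    "AMENITIES/AMENITY@1"   => "Garage",
    "AMENITIES/AMENITY@2"   => "Restaurant",
    "IMAGES/IMAGE/IMAGEURL" => "http://www.venere.com/img/hotel/2/0/0/0/2/image_hotel_exterior_entrance_1.jpg",
];

echo "<ul>\n";
foreach (collectRepeated($sample, "AMENITIES/AMENITY") as $amenity) {
    echo "<li>" . htmlspecialchars($amenity) . "</li>\n";
}
echo "</ul>\n";
```

Matching on the exact key or on the key plus "@" is slightly stricter than comparing the first 17 characters, so a different element whose name merely starts with the same text would not be picked up by accident.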

Hope this helps!

Submitted by olov on Thu, 2006-04-20 11:50

Perfect, thank you.
Sorry for the late reply.

This solved my first problem, but now I have a second problem.
The xml file I am processing is huge, about 40 MB.

When I parse the xml with my script it just stops after a couple of thousand hotels without error messages.
I have checked memory limits and timeouts, but can't find any problems there.

This is my current script (a bit stripped down at the moment to help me find the error):

<?php
set_time_limit(0);
include ('/var/www/virtual/www.XXXXX/dbklass.php');
require("MagicParser.php");
function myRecordHandler($item)
{
    $Hotellid = mysql_escape($item["P-ID"]);
    /*
    $Namn = mysql_escape($item["NAME"]);
    $Adress = mysql_escape($item["ADDRESS"]);
    $Postnummer = mysql_escape($item["ZIP"]);
    $Omrade = mysql_escape($item["CITYZONE"]);
    $Stad = mysql_escape($item["CITY"]);
    $Region = mysql_escape($item["REGION"]);
    $Provins = mysql_escape($item["PROVINCE"]);
    $Land = mysql_escape($item["COUNTRY"]);
    $Karta = mysql_escape($item["MAPURL"]);
    $locationURL = mysql_escape($item["LOCATIONURL"]);
    $propertyURL = mysql_escape($item["PROPERTYURL"]);
    $Bild = mysql_escape($item["PHOTOURL"]);
    $Longitud = mysql_escape($item["LON"]);
    $Latitud = mysql_escape($item["LAT"]);
    $Prisfran = mysql_escape($item["DOUBLEPRICEMIN"]);
    $Pristill = mysql_escape($item["DOUBLEPRICEMAX"]);
    $Beskrivning = mysql_escape($item["DESCRIPTION"]);
    */
    $main_sql = "insert into hotell (Hotellid, namn, adress, postnummer, omrade, stad, region, provins, land, karta, locationURL, propertyURL, Bild, Longitud, Latitud, Prisfran, Pristill, Beskrivning ) VALUES ('$Hotellid','$Namn','$Adress','$Postnummer','$Omrade','$Stad','$Region','$Provins','$Land','$Karta','$locationURL','$propertyURL' ,'$Bild','$Longitud','$Latitud','$Prisfran','$Pristill','$Beskrivning')";
    //print $Hotellid."<br>";
    print MagicParser_getErrorMessage();
    //execute the sql
    $objDb = new DB;
    $objDb->execSql($main_sql);
    /*
    foreach($item as $key => $value)
    {
        if (substr($key,0,17)=="AMENITIES/AMENITY")
        {
            $Amenity = mysql_escape($value);
            $am_sql = "insert into amenity (Hotellid, Amenity) VALUES ('$Hotellid', '$Amenity')";
            //$objDb->execSql($am_sql);
        }
    }
    foreach($item as $key => $value)
    {
        if (substr($key,0,21)=="IMAGES/IMAGE/IMAGEURL")
        {
            $Bilder = mysql_escape($value);
            $bild_sql = "insert into bilder (Hotellid, Bild) VALUES ('$Hotellid', '$Bilder')";
            //$objDb->execSql($bild_sql);
        }
    }
    */
}

$url = "catalog_en.xml";
MagicParser_parse($url,"myRecordHandler","xml|LIST/P/");
print "<h1>Done!</h1>";

function mysql_escape($string)
{
    $string = str_replace ("\0",'',$string);
    $string = str_replace ("\n",'',$string);
    $string = str_replace ("'",'',$string);
    $string = stripslashes($string);
    return mysql_escape_string($string);
}
?>

It's possibly an error in the XML, but I can't find it.
Any ideas?

Thanks
Olov