You are here:  » Duplicate Records


Duplicate Records

Submitted by Bangkok Bob on Wed, 2006-01-04 04:03 in

Hi,

I have used the MagicParser with MySQL to get data from a XML feed and in another case to extract data from a CSV file but each time I get duplicate records in the database and have to run a DISTINCT query to clean out the dupes.

Please could you give me a clue what I am doing wrong?

I can post the code if needed but I have used the affiliate example with INSERT INTO mytable SET and DELETE FROM mytable.

By the way, I added the following to show any basic errors while parsing:

if (@mysql_query($sql)) {
 echo '<p>Your data has been added.</p>';
 } else {
 echo '<p>Error adding submitted data: ' . mysql_error() . '</p>';
   }

Later,

Bangkok Bob

Submitted by support on Wed, 2006-01-04 09:48

Hi,

If your feed contains duplicate records you will have to make a SELECT query to the database before each insert to make sure that you do not already have that record.

Alternatively; if you know there are going to be dupes you could just run the DISTINCT query at the end just as you are doing now.

If you think it is the code that is creating the duplicates then yes, please post your code (remove any usernames / passwords of course) as this will help see what's going on....

Submitted by Bangkok Bob on Wed, 2006-01-04 11:38

Hi David,

Thanks for your prompt response.

I think it is the code creating the duplicates as it has happened with different data sources. Here are the codes:

<?php
  set_time_limit(0);
  require("MagicParser.php");
  mysql_connect("localhost","username","password");
  mysql_select_db("database");
  function myRecordHandler($flexrez)
  {
    $sql = "
      INSERT INTO flexrez SET
        f_hotel_id = '".mysql_real_escape_string( $flexrez["PCLNHOTELID"] )."',
        f_odd_chain = '".mysql_real_escape_string( $flexrez["ODDCHAINCODE"] )."',
        f_odd_property = '".mysql_real_escape_string( $flexrez["ODDPROPERTYCODE"] )."',
        f_usw_chain = '".mysql_real_escape_string( $flexrez["USW_CHAIN_CODE"] )."',
        f_name = '".mysql_real_escape_string( $flexrez["NAME"] )."',
        f_address_1 = '".mysql_real_escape_string( $flexrez["ADDRESSLINE1"] )."',
        f_address_2 = '".mysql_real_escape_string( $flexrez["ADDRESSLINE2"] )."',
        f_address_3 = '".mysql_real_escape_string( $flexrez["ADDRESSLINE3"] )."',
        f_city = '".mysql_real_escape_string( $flexrez["CITY"] )."',
        f_state = '".mysql_real_escape_string( $flexrez["STATEPROVINCECODE"] )."',
        f_country = '".mysql_real_escape_string( $flexrez["COUNTRYCODE"] )."',
        f_postal = '".mysql_real_escape_string( $flexrez["POSTALCODE"] )."',
        f_phone = '".mysql_real_escape_string( $flexrez["PHONENUMBER"] )."',
        f_star = '".mysql_real_escape_string( $flexrez["STARRATING"] )."'
      ";
    mysql_query($sql);
if (@mysql_query($sql)) {
     echo '<p>Your data has been added.</p>';
   } else {
     echo '<p>Error adding submitted data: ' . mysql_error() . '</p>';
   }
  }
  mysql_query("DELETE FROM flexrez");
  MagicParser_parse("AllHotels.txt","myRecordHandler","csv|124|1|0");
?>

<?php
  require("MagicParser.php");
  mysql_connect("localhost","username","password");
  mysql_select_db("database");
  function myRecordHandler($hotel)
  {
    $sql = "
      INSERT INTO hotels SET
        h_name = '".mysql_real_escape_string( $hotel["HNAME"] )."',
        h_url = '".mysql_real_escape_string( $hotel["HLINK"] )."',
        h_location = '".mysql_real_escape_string( $hotel["HLOCATION"] )."',
        h_rate = '".mysql_real_escape_string( $hotel["HRATE"] )."',
        h_inclabf = '".mysql_real_escape_string( $hotel["HINCLABF"] )."',
        h_inclsvc = '".mysql_real_escape_string( $hotel["HINCLSVC"] )."',
        h_inclvat = '".mysql_real_escape_string( $hotel["HINCLVAT"] )."',
        h_promo = '".mysql_real_escape_string( $hotel["HPROMO"] )."',
        h_thumb = '".mysql_real_escape_string( $hotel["HTHUMB"] )."'
      ";
    mysql_query($sql);
if (@mysql_query($sql)) {
     echo '<p>Your data has been added.</p>';
   } else {
     echo '<p>Error adding submitted data: ' . mysql_error() . '</p>';
   }
  }
  mysql_query("DELETE FROM hotels");
  $url = "http://R24.org/R24xmlhotel.php?p=chiangmai-post.com&r=chiangmai&cur=USD&sort=L&promo=0";
  MagicParser_parse($url,"myRecordHandler","xml|CHIANGMAI/HNO/");
?>

<?php
  require("MagicParser.php");
  mysql_connect("localhost","username","password");
  mysql_select_db("database");
  function myRecordHandler($tour)
  {
    $sql = "
      INSERT INTO tours SET
        t_name = '".mysql_real_escape_string( $tour["PRODUCTITEMDETAIL/ITEMTITLE"] )."',
        t_url = '".mysql_real_escape_string( $tour["PRODUCTPAGEURL"] )."',
        t_description = '".mysql_real_escape_string( $tour["PRODUCTITEMDETAIL/PRODUCTTEXT"] )."',
        t_rate = '".mysql_real_escape_string( $tour["PRODUCTITEMDETAIL/PRICEDESCIPTION"] )."',
        t_duration = '".mysql_real_escape_string( $tour["PRODUCTITEMDETAIL/DURATION"] )."',
        t_thumb = '".mysql_real_escape_string( $tour["PRODUCTITEMDETAIL/URLLIST/URL/URLLINK"] )."'
      ";
    mysql_query($sql);
if (@mysql_query($sql)) {
     echo '<p>Your data has been added.</p>';
   } else {
     echo '<p>Error adding submitted data: ' . mysql_error() . '</p>';
   }
  }
  mysql_query("DELETE FROM tours");
  $url = "http://www.affiliate.viator.com/xml/affiliateXMLData.jsp?IATA=CNX&Currency=USD";
  MagicParser_parse($url,"myRecordHandler","xml|VIATORAFFILIATEXMLDATA/IATA/PRODUCTITEM/");
?>

Later,

Bangkok Bob