You are here:  » Special character encoding - XML mysql insert


Special character encoding - XML mysql insert

Submitted by keyesque2 on Sat, 2009-05-16 23:10 in

Data does insert but special characters:

hyphens, insert to table as: â€
bullets, insert to table as: •

Have tested with both:

require("xml_parser.php");
$filename = "http://testsite.com/a.xml";
  function myRecordHandler($record)
  {
  $sql = "
      INSERT INTO Books SET
Title = '".mysql_real_escape_string( $record["TITLE"] )."',
Description = '".mysql_real_escape_string( $record["DESCRIPTION"] )."',
loadDate = '".date('Y-m-d')."'
      ";
  mysql_query($sql) or die(mysql_error());
  echo "Data Inserted";
  }
  MagicParser_parse($filename,"myRecordHandler","xml|source/genre/");

------- FOLLOWED BY SAME EXCEPT; setting encoding on variable before insert -----

require("xml_parser.php");
$filename = "http://testsite.com/a.xml";
  function myRecordHandler($record)
  {
  $DescriptVar = utf8_encode($record["DESCRIPTION"]);
  $sql = "
      INSERT INTO Books SET
Title = '".mysql_real_escape_string( $record["TITLE"] )."',
Description = '".$DescriptVar."',
loadDate = '".date('Y-m-d')."'
      ";
  mysql_query($sql) or die(mysql_error());
  echo "Data Inserted";
  }
  MagicParser_parse($filename,"myRecordHandler","xml|source/genre/");

Neither of the above seems to have any effect.
Due to the nature of the feeds I'll be getting there may be other problem characters, while too getting html formatting, etc. So far problem characters have been experienced in Description field but would like to format other fields in a fell swoop if possible.

What other way may work for a function or otherwise here?

Submitted by support on Sun, 2009-05-17 07:45

Hi there,

Have you made sure that the output of your script (from which you are viewing the characters that do not appear correctly) is in the same character set as the data?

This is normally done with the header() function to set the content type of the document; for example (this must come before any output is generated)

  header("Content-Type: text/html; charset=utf-8");

Alternatively, the second most likely character set is iso-8859-1, e.g.:

  header("Content-Type: text/html; charset=iso-8859-1");

Hope this helps!
Cheers,
David.

Submitted by keyesque2 on Sun, 2009-05-17 21:34

My post above contained this (not in tagged code section) but which did not show live on this forum:

Have tested with both:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

I'll check with the party handling the output but which is better for special characters so I can tell them?

Submitted by support on Mon, 2009-05-18 07:14

It may be worth trying it without the utf8_encode() call, and instead simply ensure continuity of the character set between the XML source, your database collation (the character set of the database tables) and the output being generated...

Cheers,
David.

Submitted by keyesque2 on Fri, 2009-05-29 13:50

Okay, have tried both with and without the utf8_encode() call.

Have ensured continuity of the character set between the XML source, the database collation (the character set of the database tables) and the output being generated which are all set at utf8

Now the bullets and single and double hyphens are being interpreted as '?' question marks on insert.

Submitted by support on Fri, 2009-05-29 14:08

Hi,

Would it be possible for you to email me a link to the XML source so that I can download the file and take a look for you...

Also any information about where / which fields in the XML contain the problem characters that would help...

Thanks!
David.

Submitted by keyesque2 on Tue, 2009-06-30 13:31

David,

Many thanks for the assist in making this insert go as needed. I know what we encountered was somewhat out of the ordinary.

Admittedly, my host could have given a little better support and so I apologize on their behalf for what delay this has caused.

Given the level of service you have provided, I am much satisfied with our choice to go with MagicParser.