Data does insert but special characters:
hyphens, insert to table as: â€
bullets, insert to table as: •
Have tested with both:
require("xml_parser.php");
$filename = "http://testsite.com/a.xml";
function myRecordHandler($record)
{
$sql = "
INSERT INTO Books SET
Title = '".mysql_real_escape_string( $record["TITLE"] )."',
Description = '".mysql_real_escape_string( $record["DESCRIPTION"] )."',
loadDate = '".date('Y-m-d')."'
";
mysql_query($sql) or die(mysql_error());
echo "Data Inserted";
}
MagicParser_parse($filename,"myRecordHandler","xml|source/genre/");
require("xml_parser.php");
$filename = "http://testsite.com/a.xml";
function myRecordHandler($record)
{
$DescriptVar = utf8_encode($record["DESCRIPTION"]);
$sql = "
INSERT INTO Books SET
Title = '".mysql_real_escape_string( $record["TITLE"] )."',
Description = '".$DescriptVar."',
loadDate = '".date('Y-m-d')."'
";
mysql_query($sql) or die(mysql_error());
echo "Data Inserted";
}
MagicParser_parse($filename,"myRecordHandler","xml|source/genre/");
Neither of the above seems to have any effect.
Due to the nature of the feeds I'll be getting there may be other problem characters, while too getting html formatting, etc. So far problem characters have been experienced in Description field but would like to format other fields in a fell swoop if possible.
What other way may work for a function or otherwise here?
My post above contained this (not in tagged code section) but which did not show live on this forum:
Have tested with both:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
I'll check with the party handling the output but which is better for special characters so I can tell them?
It may be worth trying it without the utf8_encode() call, and instead simply ensure continuity of the character set between the XML source, your database collation (the character set of the database tables) and the output being generated...
Cheers,
David.
Okay, have tried both with and without the utf8_encode() call.
Have ensured continuity of the character set between the XML source, the database collation (the character set of the database tables) and the output being generated which are all set at utf8
Now the bullets and single and double hyphens are being interpreted as '?' question marks on insert.
Hi,
Would it be possible for you to email me a link to the XML source so that I can download the file and take a look for you...
Also any information about where / which fields in the XML contain the problem characters that would help...
Thanks!
David.
David,
Many thanks for the assist in making this insert go as needed. I know what we encountered was somewhat out of the ordinary.
Admittedly, my host could have given a little better support and so I apologize on their behalf for what delay this has caused.
Given the level of service you have provided, I am much satisfied with our choice to go with MagicParser.
Hi there,
Have you made sure that the output of your script (from which you are viewing the characters that do not appear correctly) is in the same character set as the data?
This is normally done with the header() function to set the content type of the document; for example (this must come before any output is generated)
header("Content-Type: text/html; charset=utf-8");
Alternatively, the second most likely character set is iso-8859-1, e.g.:
header("Content-Type: text/html; charset=iso-8859-1");
Hope this helps!
Cheers,
David.