I too seem to be having an encoding problem I can't get around. Maybe I'm overlooking something obvious but can't seem to find the answer.
1. My product feed is encoded in ISO-8859-1
2. Magic Parser seems to be parsing the file successfully. A simple print_r gives the following:
Array ( [PRODUCT] => [PRODUCT_CODE] => 04671558 [PRODUCT_NAME] => Florentyna Shower Crème [BRAND_NAME] => Other [LEVEL1] => Beauty [LEVEL2] => [LEVEL3] => [LEVEL4] => [LEVEL5] => [MAPPED_CAT_LEVEL1] => Health & Beauty [MAPPED_CAT_LEVEL2] => Miscellaneous [MAPPED_CAT_ID] => 205 [DESCRIPTION] => Florentyna shower crme with moisturisers For use in bath or shower .... etc
3. My encoding is set to utf-8 (via php header and also meta)
So I think that basically Magic Parser is handling the encoding correctly (parsing iso-8859-1 and spitting out utf-8).
4. The problem comes when I try to insert into my SQL database .... Florentyna Shower Crème becomes Florentyna Shower Crème and try as I might I can't fix it. I've tried wrapping the data to be inserted in utf8_encode() and checked that my collation in SQL is correct (it's set to utf8 general), all to no avail.
If anyone has any bright ideas I'd appreciate it!
Thanks,
Karl
Hi Karl,
In general, the collation defined on your database is only relevant to search queries as it defines how two strings will "match". Regardless of how it is set, what you put in is what you will get out. Therefore, this indicates that the part of your application that is reading from MySQL and then displaying the data is not sending the correct Content-Type header.
I'm slightly confused by the fact that you said that your data is iso-8859-1 encoded yet your script is sending the utf-8 header. I would have thought that would cause the opposite problem, in that your print_r() output would not appear correctly.
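One way to settle which encoding a string is really in is to look at its raw bytes rather than at how it is displayed. The following is only a minimal sketch and not part of Magic Parser: describeEncoding() is a hypothetical helper name, and it assumes the only two candidates are ISO-8859-1 and UTF-8. It relies on the fact that "è" is the single byte E8 in ISO-8859-1 but the two bytes C3 A8 in UTF-8.

```php
<?php
// Hypothetical helper (not part of Magic Parser): return the raw bytes of a
// string as hex, plus whether those bytes form valid UTF-8.
function describeEncoding($s)
{
    // With the /u modifier, preg_match() fails on subjects that are not valid UTF-8.
    $isUtf8 = (preg_match('//u', $s) === 1);
    return bin2hex($s) . ($isUtf8 ? ' (valid UTF-8)' : ' (not valid UTF-8)');
}

// "Crème" written as explicit bytes, so the example does not depend on how
// this file itself is saved.
$latin1 = "Cr\xE8me";      // ISO-8859-1: è is the single byte E8
$utf8   = "Cr\xC3\xA8me";  // UTF-8: è is the two bytes C3 A8

echo describeEncoding($latin1), "\n";  // 4372e86d65 (not valid UTF-8)
echo describeEncoding($utf8), "\n";    // 4372c3a86d65 (valid UTF-8)

// Converting text that is already UTF-8 as if it were ISO-8859-1
// "double-encodes" it: each byte of è becomes its own two-byte character.
$double = iconv('ISO-8859-1', 'UTF-8', $utf8);
echo describeEncoding($double), "\n";  // 4372c383c2a86d65 (valid UTF-8)
```

The last case is worth knowing about when experimenting with utf8_encode() or similar conversions: the double-encoded bytes are still valid UTF-8, but a browser in UTF-8 mode displays them as "CrÃ¨me". Running the same check on the value just before it is inserted and again after it is read back shows at which stage the bytes change.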
The basic steps you should always take in this situation are:
i) When viewing the incorrect output, use the "View > Character Encoding" menu of your web browser to manually select what you think is the correct encoding, and then confirm that the characters appear correctly when it is selected.
ii) Taking that character set, make sure that you are sending the correct Content-Type header, for example:
<?php
header("Content-Type: text/html;charset=iso-8859-1");
?>
It is essential that this code comes BEFORE any output is generated, otherwise it does not have any effect, and depending upon your PHP warning / error settings you may not see a message to that effect...! Therefore...
iii) Double-check that the correct header is being sent by looking at the headers using a browser plugin such as the "Webmaster" plugin for Firefox. With this, you can right-click on a page, and then select Information > View Response Headers.
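Steps ii) and iii) can also be sketched in code. This is a rough illustration under stated assumptions, not anything provided by Magic Parser: sendCharsetHeader() and charsetFromContentType() are hypothetical helper names. The first uses PHP's headers_sent() to refuse to call header() once output has started, so the failure is logged instead of passing silently; the second pulls the charset out of a Content-Type value, such as one copied from the response headers shown by the browser.

```php
<?php
// Hypothetical helper: send the Content-Type header, or log where output
// already started so the problem does not go unnoticed.
function sendCharsetHeader($charset)
{
    $value = 'Content-Type: text/html;charset=' . $charset;
    if (headers_sent($file, $line)) {
        // header() would have no effect now; report where the output began.
        error_log("Cannot send '$value': output started at $file:$line");
        return false;
    }
    header($value);
    return true;
}

// Hypothetical helper: extract the charset from a Content-Type header value,
// lower-cased, or return null if no charset is declared.
function charsetFromContentType($contentType)
{
    if (preg_match('/charset\s*=\s*"?([^";\s]+)/i', $contentType, $m)) {
        return strtolower($m[1]);
    }
    return null;
}

// Call this at the very top of the script, before any output is generated.
sendCharsetHeader('iso-8859-1');

// Compare the value seen in the response headers with what was intended.
echo charsetFromContentType('text/html;charset=iso-8859-1'), "\n";
```

If the charset reported by the second helper differs from the encoding that made the characters look right in step i), the header is the part to fix.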
Hope this helps!
Cheers,
David.