Problem displaying Japanese text

Submitted by ptom98 on Thu, 2007-07-26 09:17

I have just purchased Magic Parser and find it an extremely useful program. I am currently having one problem: the results are all displayed as either nonsense text or question marks.

I am accessing Amazon.co.jp and retrieving the bestselling 'Photobooks'. The page uses utf-8 as the charset, and I have tried using utf8_decode and utf8_encode on the returned text. The display portion of the code can be seen below:

function myRecordHandler($item)
{
  print "<h3><a href='".$item["DETAILPAGEURL"]."'>".utf8_decode($item["ITEMATTRIBUTES/TITLE"])."</a></h3>";
  print "<img src='".$item["MEDIUMIMAGE/URL"]."' />";
  print "<br /><br />";
}

A link to the current test page is http://www.helloblog.co.uk/test.php

I would appreciate any help you can offer,
Thanks.

Submitted by support on Thu, 2007-07-26 09:20

Hi,

This is almost certainly down to the character encoding of the output being generated by your script. Try the following code on the first line of your script:

  header("Content-Type: text/html;charset=utf-8");

That tells the browser to interpret the page as UTF-8.
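
As a minimal sketch of how that might sit at the top of a script: header() only takes effect if nothing has been output yet (not even a blank line before the opening <?php tag), so the example guards the call with headers_sent(). The contentTypeHeader() helper is hypothetical and is there purely for illustration:

```php
<?php
// Hypothetical helper (not part of PHP or Magic Parser):
// builds the Content-Type header line for an HTML page.
function contentTypeHeader($charset = "utf-8")
{
    return "Content-Type: text/html; charset=" . $charset;
}

// header() has to run before any output is sent, so check first.
if (!headers_sent()) {
    header(contentTypeHeader());
}
```

If headers_sent() returns true at this point, something earlier in the script (or in an included file) has already produced output.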

Hope this helps!
Cheers,
David.

Submitted by ptom98 on Thu, 2007-07-26 11:14

It doesn't seem to have any effect. I've included the test.php file in its entirety below; are there any glaringly obvious mistakes that I've overlooked? I've added the 'header("Content-Type: text/html;charset=utf-8");' line, and I've also put some Japanese text into the page itself to make sure it can display it. That text displays with no problems, but the titles of the books still do not.

<?php header("Content-Type: text/html;charset=utf-8"); ?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
オリコンによる音楽
<?php
  require("MagicParser.php");
  function myRecordHandler($item)
  {
    print "<h3><a href='".$item["DETAILPAGEURL"]."'>".utf8_decode($item["ITEMATTRIBUTES/TITLE"])."</a></h3>";
    print "<img src='".$item["MEDIUMIMAGE/URL"]."' />";
    print "<br /><br />";
  }
    // construct Amazon Web Services REST Query URL
$url = "http://xml.amazon.co.jp/onca/xml?Service=AWSECommerceService";
$url .= "&Operation=ItemSearch";
$url .= "&AWSAccessKeyId=[key]";
$url .= "&SearchIndex=Books";
$url .= "&BrowseNode=500592";
$url .= "&Condition=All";
$url .= "&ResponseGroup=Images,Medium";
$url .= "&ItemPage=1";
    // fetch the response and parse the results
//echo file_get_contents($url);
    MagicParser_parse($url,"myRecordHandler","xml|ITEMSEARCHRESPONSE/ITEMS/ITEM/");
?>
</body>
</html>

Submitted by support on Thu, 2007-07-26 11:22

Hi,

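You will need to remove the calls to utf8_decode(). That function converts UTF-8 text to ISO-8859-1, which has no Japanese characters, so each one comes out as a question mark. Now that the page is correctly being output as UTF-8, the titles should be printable exactly as they are returned.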
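
As a rough sketch, assuming the parser is handing back UTF-8 strings, the handler could look something like the following. The renderItem() helper and the htmlspecialchars() calls are illustrative additions rather than anything Magic Parser requires; they simply keep the markup safe if a title or URL contains characters such as & or a quote:

```php
<?php
// Hypothetical helper: builds the markup for one item so it can be
// inspected as a string instead of being printed straight away.
function renderItem($item)
{
    // Escape for HTML, declaring the input as UTF-8 so that
    // multi-byte (e.g. Japanese) characters pass through unchanged.
    $url   = htmlspecialchars($item["DETAILPAGEURL"], ENT_QUOTES, "UTF-8");
    $title = htmlspecialchars($item["ITEMATTRIBUTES/TITLE"], ENT_QUOTES, "UTF-8");
    $image = htmlspecialchars($item["MEDIUMIMAGE/URL"], ENT_QUOTES, "UTF-8");

    return "<h3><a href='" . $url . "'>" . $title . "</a></h3>"
         . "<img src='" . $image . "' />"
         . "<br /><br />";
}

// Same handler name as before, but with no utf8_decode() call.
function myRecordHandler($item)
{
    print renderItem($item);
}
```

The handler is still passed to MagicParser_parse() by name, exactly as in your existing script.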

That should do the trick!

Cheers,
David.