
Support Forum



Problem with yahoo parsing

Submitted by mcncg2m on Tue, 2007-12-11 08:05

Sorry for my English :)

I'm parsing this URL:

http://xml.es.overture.com/d/search/p/standard/eu/xml/rlb/?mkt=es&maxCount=10&Partner=kepoo_xml_es_searchbox_informatica&Keywords=Creditos&affilData=ip%3D213.96.116.76%26ua%3DMozilla%252F4.0%2B%2528compatible%253B%2BMSIE%2B7.0%253B%2BWindows%2BNT%2B5.1%253B%2B.NET%2BCLR%2B1.1.4322%253B%2B.NET%2BCLR%2B2.0.50727%253B%2B.NET%2BCLR%2B3.0.04506.30%2529&serveUrl=www.hispaoferta.com%2Fprueba_hispaoferta%2Fcredito.php&type=

XML results:

  <?xml version="1.0" encoding="iso-8859-1" ?>
  <Results xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://dtd.overture.com/schema/affiliate/2.8.3/OvertureSearchResults.xsd">
  <ResultSet id="searchResults" numResults="10" adultRating="G">
  <Listing rank="1" title="Encuentra Tu Crédito de hasta 6.000EUR con Quierocredito.com" description="Hasta 6.000EUR para lo que necesites y sin tener que cambiar de banco." siteHost="www.quierocredito.com" biddedListing="true" adultRating="G">
  <ClickUrl type="body">http://rc23.overture.com/d/sr/?xargs=15KPjg14tStZamwravcLjMSuGCxl4axca588lpCJlwH9Zf5iMxXOF%2DZ6LInc0YfdxY%2DF%2DNwPqc9KUeKvD4n%5F%2DDEQyPQVKPGuXpjdLJzII5Pq2nwedw2L4twe7qmoNOO8s%2DUnLxdNO8nu7acZf0Pn9O%5FplwlV6U%5FPRqxsq3xeQWEayJiV8qpAidN8pW4NQt9c3HWO5%5FTLRRd9CclVKAIc9Ny4xFoMelPyZhICin4m0FqEzfKTh4v6zJdJEIuLzvm4uQf7KpkYocPhDKsKt1uUubuuas%2DMl1Zz%2Da4YNphAgJTL2tNzeG3Utw98GX0a60fJ8M%5FWX%2DC5%2DBRS6%5FDhslKhozD%5FKtZQaPMzB47xaRIs1LcN%5F%5FoGOIOvKLCyQxMxEM78XlNccn0z0%2E&yargs=www.quierocredito.com</ClickUrl>
  </Listing>

The encoding is correctly declared as "iso-8859-1".

Why are the results displayed on my page different from the XML?

Encuentra Tu Crédito de hasta 6.000EUR con Quierocredito.com
Hasta 6.000EUR para lo que necesites y sin tener que cambiar de banco.
www.quierocredito.com

The XML result itself is correct, but on the page "Crédito" appears instead of "Crédito".

The problem affects Spanish words with accented characters.

Thanks.

Submitted by support on Tue, 2007-12-11 09:30

Hello,

This is probably because the page your script generates does not tell the web browser that it is iso-8859-1.

To control this, add the following code at the top of your script:

<?php
  header("Content-Type: text/html; charset=iso-8859-1");
?>

This must come right at the top, before any output is generated by your script...

Hope this helps,
Cheers,
David.

Submitted by mcncg2m on Tue, 2007-12-11 11:45

Hi again,

That doesn't solve it, because the HTML of the page already declares the encoding:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<link rel="stylesheet" href="estilo/hispaofertav2.css" type="text/css" />
<title>Bancos</title>
<meta content="no-cache" http-equiv="pragma"/>

Submitted by support on Tue, 2007-12-11 13:23

Hi,

Could you email me a link to the page showing the incorrect characters, and I will take a look for you. You can reach me by replying to your reg code or forum registration email.

Cheers,
David.

Submitted by mcncg2m on Tue, 2007-12-11 16:38

Hi! I solved the problem!

I needed to change one function call in magicparser.php.

I created a new file, magicparser_iso_8859-1.php.

In it I replaced @xml_parser_create(); with @xml_parser_create("ISO-8859-1"); and now the results are OK.

Thanks a lot!
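
For anyone applying the same change, here is a minimal sketch of what the argument to xml_parser_create() does in current PHP versions, where the parser detects the input encoding by itself and the argument only selects the encoding of the strings passed to the handlers. The helper name listing_titles and the shortened sample XML are made up for illustration; they are not part of magicparser.php.

```php
<?php
// Hypothetical helper for illustration only; not part of magicparser.php.
// Returns the title attribute of every <Listing> element, encoded in $targetEncoding.
function listing_titles($xml, $targetEncoding)
{
    // The argument selects the encoding of the strings handed to the handlers.
    $parser = xml_parser_create($targetEncoding);
    // Keep element and attribute names in their original case.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    $titles = array();
    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$titles) {
            if ($name === "Listing" && isset($attrs["title"])) {
                $titles[] = $attrs["title"];
            }
        },
        function ($p, $name) {
            // Nothing to do when an element closes.
        }
    );
    xml_parse($parser, $xml, true);
    return $titles;
}

// Shortened sample feed; "\xE9" is the single ISO-8859-1 byte for "é".
$xml = "<?xml version=\"1.0\" encoding=\"iso-8859-1\" ?>"
     . "<Results><Listing title=\"Cr\xE9dito\" /></Results>";

$latin1 = listing_titles($xml, "ISO-8859-1");
$utf8   = listing_titles($xml, "UTF-8");

echo bin2hex($latin1[0]), "\n"; // 4372e96469746f   (one byte for "é")
echo bin2hex($utf8[0]), "\n";   // 4372c3a96469746f (two bytes for "é")
```

With "ISO-8859-1" the strings match a page served as iso-8859-1; with "UTF-8" the same page would show two characters in place of each accented letter.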

Submitted by support on Tue, 2007-12-11 16:45

Hi,

Well done! I would have worked towards that fix in the end, but it is a rare one: one particular version of PHP (in conjunction with one particular library version) causes PHP's built-in XML parser to incorrectly identify the character encoding of the file!
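
As a rough illustration of the symptom only (not of the PHP bug itself), the garbled "Crédito" is what UTF-8 bytes look like when a browser reads them as iso-8859-1. The sketch below assumes the iconv extension is available; the variable names are made up for the example.

```php
<?php
// "Crédito" as ISO-8859-1 bytes: "\xE9" is the single byte for "é".
$latin1 = "Cr\xE9dito";

// The same word re-encoded as UTF-8: "é" becomes the two bytes C3 A9.
$utf8 = iconv("ISO-8859-1", "UTF-8", $latin1);

echo bin2hex($latin1), "\n"; // 4372e96469746f
echo bin2hex($utf8), "\n";   // 4372c3a96469746f

// A page declared as ISO-8859-1 renders bytes C3 and A9 as "Ã" and "©",
// so converting back before output restores the expected text.
$forLatin1Page = iconv("UTF-8", "ISO-8859-1", $utf8);
echo bin2hex($forLatin1Page), "\n"; // 4372e96469746f
```

Whichever encoding is used, the parser output and the charset declared by the page have to agree.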

Cheers,
David.