Invalid byte 1 of 1-byte UTF-8-sequence

Submitted by buyowner on Thu, 2008-11-13 00:09

The XML file I am pulling does not throw a MagicParser error; it just gives me nothing.

I ran it through an XML validator and got this error:

Invalid byte 1 of 1-byte UTF-8-sequence

The error area is : ... family room areas are huge. There?s also a nice patio area very nicely d

The ? is a variation of the ' character.

How can I detect this character, or replace it with the correct one, without opening, editing, and saving the file before reading it into the parser?

Dan

Submitted by support on Thu, 2008-11-13 09:02

Hello Dan,

Without some kind of corrective action, a parser cannot proceed with a file once the character encoding has gone out of sync, because it can no longer be sure where the next markup-significant character lies.
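To see where a feed goes out of sync, the first offending byte can be located before parsing. The following is only a minimal Python sketch, not part of MagicParser; the function name `find_invalid_utf8` is hypothetical, and the sample assumes the stray apostrophe is the single byte 0x92 (the Windows-1252 right single quotation mark), which the thread does not confirm.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, byte value) of the first byte that breaks UTF-8
    decoding, or None if the whole input is valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that could not be decoded.
        return err.start, data[err.start]
    return None


# Assumed sample: the apostrophe stored as the single byte 0x92.
sample = b"<desc>There\x92s also a nice patio area</desc>"
print(find_invalid_utf8(sample))  # (11, 146), i.e. byte 0x92 at offset 11
```

A result of `None` means the bytes are valid UTF-8 and the problem lies elsewhere.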

One solution that has worked before is to use a modified version of MagicParser that cleanses the data into valid UTF-8 before passing it to the lower-level parser. I will email you this version of MagicParser, which should do the trick!
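The general idea of cleansing data into valid UTF-8 can be sketched as below. This is not the modified MagicParser code, only a rough Python illustration under stated assumptions: the function name `cleanse_to_utf8` is hypothetical, and stray bytes are assumed to come from Windows-1252, which is a guess about the feed rather than something the thread states.

```python
def cleanse_to_utf8(data: bytes, fallback: str = "cp1252") -> bytes:
    """Return data re-encoded as valid UTF-8.

    Runs that are already valid UTF-8 are kept as they are. Each stray
    byte is reinterpreted in the fallback encoding (an assumption).
    """
    out = []
    pos = 0
    while pos < len(data):
        try:
            out.append(data[pos:].decode("utf-8"))
            break  # the rest of the input decoded cleanly
        except UnicodeDecodeError as err:
            bad_at = pos + err.start
            # Keep the valid run that precedes the bad byte.
            out.append(data[pos:bad_at].decode("utf-8"))
            # Reinterpret the single bad byte; bytes undefined in the
            # fallback encoding become U+FFFD (the replacement character).
            out.append(data[bad_at:bad_at + 1].decode(fallback, errors="replace"))
            pos = bad_at + 1
    return "".join(out).encode("utf-8")


raw = b"There\x92s also a nice patio area"
clean = cleanse_to_utf8(raw)
print(clean.decode("utf-8"))
```

The cleansed bytes can then be handed to the parser in place of the original feed. Input that is already valid UTF-8 passes through unchanged, so the step is safe to apply to every feed.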

Cheers,
David.

Submitted by buyowner on Thu, 2008-11-13 16:05

Thanks, that worked.