Replacing Special Characters in Magic Parser Content

Submitted by marcysutton on Thu, 2009-09-17 22:56

Hi There!

I am having trouble replacing smart quotes in Magic Parser content. I want the person updating an XML file to be able to include special characters without having to know the HTML entities. I am putting site-wide copy into a single XML document and pulling that content into the separate pages using Magic Parser (so I need the HTML tags).

My function works fine on content held in a plain string, and it works fine on an ampersand in the Magic Parser content; only the chr() character values fail to convert. I hand-picked the $search values from the ASCII table. Am I doing something wrong? Do I have to process the content outside of the Magic Parser function to control these values?

Both my XML document and PHP page are in UTF-8.

Here is my PHP function that replaces special characters:

<?php
function convert_smart_quotes($string) {
    $search = array(
                    '&', // works fine
                    '"',
                    chr(133), // this and everything below will not convert
                    chr(145),
                    chr(146),
                    chr(147),
                    chr(148),
                    chr(149),
                    chr(150),
                    chr(151),
                    chr(153),
                    chr(169),
                    chr(171),
                    chr(174),
                    chr(176),
                    chr(187),
                    chr(232),
                    chr(233)
                    );
    $replace = array(
                    '&amp;',
                    '&quot;',
                    '&hellip;',
                    '&lsquo;',
                    '&rsquo;',
                    '&ldquo;',
                    '&rdquo;',
                    '&bull;',
                    '&ndash;',
                    '&mdash;',
                    '&trade;',
                    '&copy;',
                    '&laquo;',
                    '&reg;',
                    '&deg;',
                    '&raquo;',
                    '&egrave;',
                    '&eacute;'
                    );
    return str_replace($search, $replace, $string);
}
?>

Here is my Magic Parser function that pulls in content:

<?php
function copyHandler($record) {
    $copy = $record["PAGE"];
    echo convert_smart_quotes($copy);
}
MagicParser_parse("copy.xml", "copyHandler", "xml|THESTATION/HOME/PAGE/");
?>

Here is the XML I am parsing:

<thestation>
    <home>
        <page name="homepage">
            <![CDATA[<h1>Connect to everywhere, <br />every seven minutes.</h1>
            <p>The Station at Othello Park offers sustainable, contemporary Northwest apartments with instant light-rail links to wherever you want to go – from downtown restaurants to Columbia City cafes. From Pioneer Square shops to SoDo stadiums, & more. Live in Seattle’s most authentic, global neighborhood that’s as vibrant and artistic as the people who call it home. The Station at Othello Park. Now arriving.</p>]]>
        </page>
    </home>
</thestation>

Thanks!

Submitted by support on Fri, 2009-09-18 09:18

Hello Marcy,

Your translations would need to be done at the point at which your XML is generated rather than when it is being parsed. The process is simply:

i) Your content editors submit text

ii) Process their input with your convert_smart_quotes() function

iii) Generate XML as normal.

Then when it comes to parsing the XML there shouldn't be any further processing required...
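
The three steps above might be sketched roughly as follows. This is only an illustration under stated assumptions: editor_text_to_entities() and build_page_xml() are hypothetical names (not part of Magic Parser), the sample text is made up, and the editor's input is assumed to arrive as UTF-8, so each special character is matched by its multi-byte UTF-8 sequence rather than by a single chr() value.

```php
<?php
// Hypothetical helper (not part of Magic Parser): step (ii).
// Converts a few UTF-8 punctuation characters to HTML entities.
// Assumes $text is UTF-8 encoded.
function editor_text_to_entities($text) {
    $map = array(
        "\xE2\x80\x98" => '&lsquo;', // U+2018 left single quote
        "\xE2\x80\x99" => '&rsquo;', // U+2019 right single quote
        "\xE2\x80\x9C" => '&ldquo;', // U+201C left double quote
        "\xE2\x80\x9D" => '&rdquo;', // U+201D right double quote
        "\xE2\x80\x93" => '&ndash;', // U+2013 en dash
        "\xE2\x80\x94" => '&mdash;', // U+2014 em dash
    );
    return strtr($text, $map);
}

// Hypothetical helper: step (iii).
// Wraps the converted HTML in a CDATA section inside a <page> element.
function build_page_xml($name, $html) {
    $body = editor_text_to_entities($html);
    // A literal "]]>" would end the CDATA section early, so split it.
    $body = str_replace(']]>', ']]]]><![CDATA[>', $body);
    return '<page name="' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . '">'
         . '<![CDATA[' . $body . ']]></page>';
}

// Step (i): text as submitted by a content editor (sample input).
$submitted = "<p>Seattle\xE2\x80\x99s most authentic neighborhood \xE2\x80\x93 now arriving.</p>";
echo build_page_xml('homepage', $submitted), "\n";
```

With the entities already in place when the XML is written, the parsing handler can echo the PAGE value without further conversion.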

Hope this helps!
Cheers,
David.

Submitted by marcysutton on Fri, 2009-09-18 16:36

David,

Thanks so much for getting back to me! That's pretty much what I thought, but unfortunately my content editor will be making changes to the XML file directly. To process the content first would require adding an interface for them to paste into -- and I'm not sure how that would work with my detailed XML structure (the entire website's copy & image filenames will be in a single document) and no login/CMS system. But if you have any suggestions on how to process the content before inserting it into XML, I am all ears.

My current plan of attack is to just instruct the person doing the editing not to include any special characters, and/or provide her with a list of HTML entities to use.

Thanks again for a great product I use a LOT!

Marcy
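
A note on why the chr() values in the original function likely never match: chr(133) and chr(145) through chr(153) are single-byte Windows-1252 codes, and chr(169) through chr(233) are single-byte Latin-1 codes, whereas in a UTF-8 file the same characters are stored as two- or three-byte sequences. If the XML stays hand-edited, one possible alternative is to search for those UTF-8 sequences at parse time. The sketch below assumes the record text reaches the handler as unmodified UTF-8 bytes (not verified against Magic Parser itself); convert_utf8_specials() is a hypothetical name, and the sample record is a stand-in for what MagicParser_parse() would pass to the handler.

```php
<?php
// Hypothetical UTF-8-aware variant of convert_smart_quotes().
// Keys are the UTF-8 byte sequences of the characters to replace.
function convert_utf8_specials($string) {
    static $map = array(
        "\xE2\x80\xA6" => '&hellip;', // U+2026
        "\xE2\x80\x98" => '&lsquo;',  // U+2018
        "\xE2\x80\x99" => '&rsquo;',  // U+2019
        "\xE2\x80\x9C" => '&ldquo;',  // U+201C
        "\xE2\x80\x9D" => '&rdquo;',  // U+201D
        "\xE2\x80\xA2" => '&bull;',   // U+2022
        "\xE2\x80\x93" => '&ndash;',  // U+2013
        "\xE2\x80\x94" => '&mdash;',  // U+2014
        "\xE2\x84\xA2" => '&trade;',  // U+2122
        "\xC2\xA9"     => '&copy;',   // U+00A9
        "\xC2\xAB"     => '&laquo;',  // U+00AB
        "\xC2\xAE"     => '&reg;',    // U+00AE
        "\xC2\xB0"     => '&deg;',    // U+00B0
        "\xC2\xBB"     => '&raquo;',  // U+00BB
        "\xC3\xA8"     => '&egrave;', // U+00E8
        "\xC3\xA9"     => '&eacute;', // U+00E9
    );
    // strtr() replaces every key in a single pass over the string.
    return strtr($string, $map);
}

function copyHandler($record) {
    echo convert_utf8_specials($record["PAGE"]);
}

// Stand-in record (assumption) so the handler can be tried without the parser.
copyHandler(array("PAGE" => "<p>Seattle\xE2\x80\x99s caf\xC3\xA9s \xE2\x80\x93 \xC2\xA9 2009</p>"));
```

The ampersand and double-quote replacements from the original function are left out here for brevity. Running bin2hex() on a short sample of the parsed text is a quick way to check which byte sequences are actually present.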