You are here:  » CSV Field with a Comma and Double Quote


CSV Field with a Comma and Double Quote

Submitted by jwp on Thu, 2010-03-11 05:38 in

Hi David,

I have a CSV file with a field that includes both commas and double quotes. Magic Parser doesn't seem to be parsing the file correctly.

In my CSV file the fields that include commas are surrounded by double quotes. In a field that includes double quotes, the quotes have an extra set of double quotes around them.

The problem seems to occur when I have a comma and a double quote in the same field - e.g.

Favorite Movies: "Crash", "Gladiator", and "Titanic".

The CSV format puts double quotes in front of fields with commas, and an extra set of double quotes before and after quotes to differentiate them from end of text field delimiter. Here is what the CSV data looks like for this field:

"Favorite Movies: ""Crash"", ""Gladiator"", and ""Titanic""."

Let me know if that makes sense.

Best regards,
Jeremy

Submitted by support on Thu, 2010-03-11 10:02

Hi Jeremy,

The double-quoted format is actually very difficult to process serially, so what I normally suggest is reading your CSV into a string (assuming that memory permits), and then pre-processing to replace double-double quotes with an alternative character - single quote for example; and then parsing as normal. For example:

<?php
  
require("MagicParser.php");
  function 
myRecordHandler($record)
  {
    
print_r($record);
  }
  
$csv file_get_contents("filename.csv"); // could be URL of course
  
$csv str_replace("\"\"","'",$csv);
  
MagicParser_parse("string://".$csv,"myRecordHandler","csv|44|0|34");
?>

Hope this helps,
Cheers,
David.

Submitted by jwp on Fri, 2010-03-12 01:37

Great advice David! Thanks I'll give that a try.

Submitted by makenoiz on Wed, 2013-08-21 08:45

Hi David, I have the same problem.

My xls text is
54" HIGH 3 BOW UPS-ABLE Widget

It Exports to csv as
"54"" HIGH 3 BOW UPS-ABLE Widget"

and nothing I do in Magic parser or PHP will allow this to enter the DB as 54" HIGH 3 BOW UPS-ABLE Widget

I tried MagicParser_parse($filename,"myRecordHandler","csv|44|1|34"); I tried htmlspecialchars, mysql_real_escape_string etc...etc... nothing.

And I tried to implement your solution to david as well.

anyother advice

Submitted by support on Wed, 2013-08-21 09:07

Hi makenoiz,

As above i'm afraid there is no standard or programatically easy way to handle quotes in a quoted serial string. The ideal of course is, if it is an option, to use a separator that never occurs in the stream, e.g. pipe "|" or TAB.

Did you try the read into a string and pre-process / post-process as described above? One option would be to convert the double quote into the HTML entity " for example, assuming

<?php
  
function myRecordHandler($record)
  {
    foreach(
$record as $k => $v)
    {
      
$record[$k] = str_replace("&quot;","\"",$v);
    }
    
// rest of record processing code
  
}
  
$csv file_get_contents($filename);
  
$csv str_replace("\"\"","&quot;",$csv);
  
MagicParser_parse("string://".$csv,"myRecordHandler","csv|44|1|34");
?>

Hope this helps!
Cheers,
David
--
MagicParser.com