Reading a CSV that has multiple characters as a delimiter

Submitted by orbitals on Thu, 2018-04-19 11:21

I'm trying to read in a file that uses multiple characters to delimit the data.

For example, rather than a conventional comma (,), the file uses a pipe, a plus symbol, then another pipe (|+|).

Can magicparser be adapted to cater for this? From reading the manual I understand that the ASCII code needs to be used, but can it accept more than one ASCII code, and if so how should it be formatted?

Many thanks in advance.

Submitted by support on Wed, 2018-05-02 08:32

Hi,

The parser only supports single-character separators. However, if the file is small enough to be read into memory, you can replace the delimiter using str_replace() and then parse the result using the string:// prefix. For example, to change to NUL separation (ASCII 0x00):

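  // Read the whole file into memory
  $data = file_get_contents("filename.dat");
  // Swap the multi-character delimiter for a single NUL character (ASCII 0)
  $data = str_replace("|+|",chr(0),$data);
  // Parse the in-memory string: separator 0, header row, no text delimiter
  MagicParser_parse("string://".$data,"myRecordHandler","csv|0|1|0");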

Note the format string in the third parameter to MagicParser_parse(): in csv|0|1|0, the first 0 is the ASCII code of the field separator, 1 indicates a header row, and the final 0 indicates no text delimiter.
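
One caveat: this approach assumes the replacement character never occurs in the data itself, otherwise fields would be split in the wrong place. A minimal sketch of a guard for that is below. The helper replaceDelimiter() is hypothetical (it is not part of MagicParser), and the sample string simply stands in for the contents of filename.dat:

```php
<?php
// Hypothetical helper (not part of MagicParser): replace a multi-character
// delimiter with a single character, refusing to do so if that character
// already appears somewhere in the data.
function replaceDelimiter(string $data, string $delimiter, int $asciiCode): string
{
    $replacement = chr($asciiCode);
    if (strpos($data, $replacement) !== false) {
        throw new InvalidArgumentException(
            "ASCII code $asciiCode already occurs in the data; choose another"
        );
    }
    return str_replace($delimiter, $replacement, $data);
}

// Sample input standing in for file_get_contents("filename.dat")
$data = "id|+|name\n1|+|Widget\n";
$converted = replaceDelimiter($data, "|+|", 0);

// Sanity check without the parser: split the header line on the new separator
$headerLine = explode("\n", $converted)[0];
$headerFields = explode(chr(0), $headerLine);
```

If the guard passes, $converted can be handed to MagicParser_parse() with the string:// prefix and the csv|0|1|0 format string exactly as in the snippet above. If it throws, pick a different ASCII code and use the same code in the format string.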

Hope this helps!
Cheers,
David
--
MagicParser.com