csv: League\Csv\Writer - Invalid CSV

Issue summary

A string with a doubled backslash at the end, leads to an invalid CSV file. It looks like that in the CSV File:

"a text string \\"", "..."

System informations

Information Description
League\Csv version 9.1.4
PHP/HHVM version PHP 7.2
OS Platform Debian GNU/Linux 9

Standalone code, or other way to reproduce the problem

Code not tested, just written to make it more easy to understand the problem:

$csvWriter = Writer::createFromPath('/tmp/file.csv', 'a+');
$csvWriter->setNewline("\r\n"); //RFC4180 Line feed
RFC4180Field::addTo($csvWriter);
$line=['a text string \\','...'];
$csvWriter->insertOne($line);

Expected result

"a text string \\", "..."

Actual result

"a text string \\"", "..."

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Reactions: 1
  • Comments: 20 (12 by maintainers)

Commits related to this issue

Most upvoted comments

There is now a pull request to fix this in PHP: https://github.com/php/php-src/pull/3515.

Are there any other important differences in the RFC4180 writing mode other than ignoring the escape character?

yes there’s one fputcsv adds enclosures in presence of the white spaces in a record field, RFC4180 does not.

But to me that’s a minor issue since having more enclosed field gives less error prone conversion than the other way around (not having enclosure at all anytime around record fields).

@theodorejb the following PR #309 contains the fix TL;DR it falls back to using fwrite instead of fputcsv.

It’s still possible that the issue could be fixed in PHP by allowing a blank string to be set as the escape character. In the meantime before PHP allows this, perhaps this library could special-case calling setEscape(‘’), which would switch to a writing mode that doesn’t use fputcsv.

Indeed if the upstream code in fputcsv was patch I would no longer need the optional parameter. But since I can not 100% guarantee that this is how the patch will land (using an empty string) I’d rather not rely on a fix which has not yet land, what if the patch uses null or false instead of the empty string ?

I’m not entirely convinced that adding a new writing mode parameter is the best solution, however.

The issue is not with how the patch works but how it is triggered. Should it be triggered by a special input for the escape character OR should it be triggered by a optional argument on the insertXXX methods.

IMO using an optional argument is the best approach for the following reason If PHP was to fix the escape parameter bug

  1. v9.2.0 introduced the Writer::MODE_RFC4180 relying on fwrite (see #309);
  2. PHP patchs fputcsv;
  3. a patch release for League\Csv is released which switch internally to fputcsv on all modes but still uses fwrite for legacy PHP version;
  4. the next major release drops the Writer::MODE_RFC4180 constant and just relies on fputcsv behavior;

This is clean simple and reliable. On the other hand if we use the “escape” parameter switch and the PHP patch is done differently (let’s hope not 😄 ) then we will have to directly release a v10 because of potential bugs. Not to mention that nothing is said for fgetcsv will it be patch too to ignore the escape parameter using the same empty string 🤔

I’ve seen your post on the PHP internal mailing list 👍 . Let’s hope this boost the release of a patch quickly!!

Also, how does the new writing mode perform compared to the native PHP function?

I suppose that benchmark using fwrite decrease insertion speed, but it should depend on the record content length 🤔 You can benchmark the PR to see the difference.

@nyamsprod It was just a additional thought from me - i think your solution is valid in terms of a coding view.