kakoune: Pipe handles newlines improperly

Steps

Write some string into the buffer. Select the entire string, but do not include the newline character at the end of it. Use | in normal mode to pipe the selection into an external program; base64 works well here.

Issue 1: Copy that entire string into some separate base64 decoder like CyberChef and decode it.

Issue 2: Back in Kakoune, select your string again, (make sure to not include a new line). Pipe your selection through base64 --decode.

Outcome

With Issue 1, it is clear to see that there is an extra new line included. This is not expected.

With Issue 2, while it is clear that the encoded string contains a newline (as per Issue 1), no new line is included in the decoded string.

Expected

With Issue 1, I would expect that no newline character would be appended to my selection string, and it should string should be exactly passed to the shell command’s input stream.

With Issue 2, I would expect that any newline characters included at the end of the output stream to be accounted for and recorded in the buffer.


To add some color to this, I was using Kakoune to edit some k8s resources, some of which were secrets. In k8s, the string data of a secret is encoded with base64. For example:

apiVersion: v1
kind: Secret
metadata:
  name: MySecret
data:
  THE_SECRET: bXkgc25lYWt5IHNlY3JldA==

Withing Kakoune, I decoded a string to edit it, and then reencoded it; this resulted in a sneaky and unexpected newline character being included with my secret’s value. I spent a good deal of time running around trying to figure out why the secret was not what it was showing up to be in my editor, and eventually found this issue.

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 15 (10 by maintainers)

Commits related to this issue

Most upvoted comments

It occurs to me that if Kakoune always piped data exactly as-is, it would be easy to write wrapper scripts to automatically add/strip a trailing newline where needed. However, since Kakoune does the fix-ups itself, it’s not possible for wrapper scripts to restore the original content.

The issue is that we already have <a-|> for pipe-to (pipe without replacing buffer text with the output), so we would end-up with combinatorial explosion if we were to add alternate behaviour.

I am leaning towards not adding the trailing newline and break GNU bc, this can be addressed by piping into something else, say { cat; echo; } | bc. Its still not very satisfying, but it seems the most “correct” solution, its unfortunate POSIX bc does not mandate it to handle that case.

I agree this needs to be fixed somehow, the general issue is that many commands, like base64, append an EOL to their output, but we dont want that newline to get inserted when we pipe some text that does not end-up with a newline. The current logic does the following: if input text does not end with EOL, append an EOL then remove it from the pipe output text.

Maybe the solution is to simplify the existing logic not-to add a newline when its missing, but still remove a newline from the output if the original text did not end with one ?

I wonder what we would break if we stopped doing that, here is a diff that does that:

diff --git a/src/normal.cc b/src/normal.cc
index beb880f9..33256c0b 100644
--- a/src/normal.cc
+++ b/src/normal.cc
@@ -579,9 +579,7 @@ void pipe(Context& context, NormalParams)
                     const auto end = changes_tracker.get_new_coord_tolerant(sel.max());
 
                     String in = buffer.string(beg, buffer.char_next(end));
-                    const bool insert_eol = in.back() != '\n';
-                    if (insert_eol)
-                        in += '\n';
+                    const bool ends_with_eol = in.back() == '\n';
 
                     // Needed in case we read selections inside the cmdline
                     context.selections_write_only().set({keep_direction(Selection{beg, end}, sel)}, 0);
@@ -590,12 +588,9 @@ void pipe(Context& context, NormalParams)
                         cmdline, context, in,
                         ShellManager::Flags::WaitForStdout).first;
 
-                    if (insert_eol)
-                    {
-                        in.resize(in.length()-1, 0);
-                        if (not out.empty() and out.back() == '\n')
-                            out.resize(out.length()-1, 0);
-                    }
+                    if (not ends_with_eol and not out.empty() and out.back() == '\n')
+                        out.resize(out.length()-1, 0);
+
                     auto new_end = apply_diff(buffer, beg, in, out);
                     if (new_end != beg)
                         new_sels.push_back(keep_direction({beg, buffer.char_prev(new_end), std::move(sel.captures())}, sel));

I’ll run with that for a while see if I can spot any issue.

According to my GNU/bc manpage:

In bc statements are executed “as soon as possible.” Execution happens when a newline in encountered and there is one or more complete statements. Due to this immediate execution, newlines are very important in bc. In fact, both a semicolon and a newline are used as statement separators. An improperly placed newline will cause a syntax error.

I guess users who know about that could pipe 1+1;.

But to play the advocate of the conservative side here, why does the editor have to pick one way or the other?

I understand we’re trying to make the default behaviour as convenient as possible to everybody, but since there are cases where a trailing newline is important to the shell program, why make the editor take the decision whether to include/strip it arbitrarily?

Could we not let the | primitive pipe the text to a tool, as-is, and document a snippet/script in the wiki that handles fancy side-effects that modify the input? After all, users who don’t want the input to be modified will have to do that to correct the editor’s behaviour.