ansi-to-html: ansi-to-html incorrectly interprets tput output as (B characters

Hi, thanks a lot for the great job on ansi-to-html!

I’m a Drone CI user, and Drone CI uses ansi-to-html to render console output. It works well for the most part, but tput output is not rendered correctly. I first reported the bug downstream and then realized that ansi-to-html is the right upstream: https://github.com/drone/drone/issues/1491

To reproduce:

$ tput sgr0 | ansi2html

Expected result: No text in <body></body>

Actual result: There is an unexpected (B:

<body>
<pre>(B</pre>
</body>

I rebuilt the latest ansi-to-html and confirmed that the bug reproduces at commit 7d34444cb45eb53253b2e119a36c95ccf4410684

Could you have a look? Thanks a lot!

About this issue

  • State: open
  • Created 8 years ago
  • Comments: 15 (4 by maintainers)

Most upvoted comments

Actually, I’m realizing this wasn’t a different context after all… The extraneous (B was showing up in my CI/CD output on GitLab, which appears to be processed through ansi2html, whenever there was a tput sgr0 in the CI/CD script. My .gitlab-ci.yml file included the following code:

variables:
  TERM: xterm

Once I changed this to

variables:
  TERM: ansi

all the (B’s disappeared.

So after having the exact same problem as @earthman1 (thank you for your detailed explanation and solution!), I think the best thing to do might be to explicitly document which values for TERM are supported by this library, and then closing this issue should be safe. I think this is just a documentation bug and not an implementation bug.

(Also, it’s frustrating that GitLab doesn’t set TERM to an appropriate value by default or at least document acceptable values somewhere, because it took me an incredibly long time to find this thread and the solution herein, but that’s GitLab’s problem and not yours.)
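
Where TERM cannot be changed, another possible workaround (not something proposed in this thread, only a sketch) is to strip the ESC ( B sequence before the output reaches the converter. The helper name strip_g0_ascii is hypothetical, and the demo input is a hand-written stand-in for a log line that ends with xterm's sgr0 bytes:

```shell
# Hypothetical helper: remove every "ESC ( B" sequence from stdin.
# ESC ( B designates US-ASCII as the G0 character set; it carries no
# color or style information, so dropping it should not change rendering.
strip_g0_ascii() {
  # printf '\033' yields a literal ESC byte; in a basic regex "(" is literal.
  sed "s/$(printf '\033')(B//g"
}

# Demo: "done" followed by ESC ( B ESC [ m and a newline.
printf 'done\033(B\033[m\n' | strip_g0_ascii | od -An -tx1
# bytes printed: 64 6f 6e 65 1b 5b 6d 0a
```

In a CI script this could be used as tput sgr0 | strip_g0_ascii | ansi2html, assuming the ansi2html command from the original report is on the PATH. Setting TERM as described above remains the simpler fix when it is available.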

I found this issue while searching for an explanation for this behavior, which I have also observed in a different context. I believe that I may have discovered the cause of my own issue, and so I post my discoveries here for your benefit as well, and for the benefit of anyone else who may stumble across it as I did.

sgr0 has different codes depending on which terminal type is selected. If TERM is set to vt100, the output is as you expected. If TERM is set to xterm, the output is ^[(B^[[m: ^[(B selects US-ASCII as the G0 character set, and ^[[m resets the text attributes. You can demonstrate this behavior with tput, using -T to pick the terminal type:

tput -T xterm sgr0 | hexdump

which will output 0000000 281b 1b42 6d5b, as @fracting observed above. Contrast that with the output of

tput -T vt100 sgr0 | hexdump

which outputs 0000000 5b1b 0f6d, which exactly matches the output @rburns observed. This doesn’t appear to be a bug in any particular software, but, at least in my case, resulted from a misconfigured TERM environment variable.
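
By default hexdump prints 16-bit words in host byte order, so on a little-endian machine each pair of bytes in the dumps above appears swapped. A small sketch that prints the same bytes in stream order follows. The printf calls are stand-ins for the tput output (byte values taken from the dumps above), so the snippet does not depend on the installed terminfo database, and the function names are hypothetical:

```shell
# Stand-in for "tput -T xterm sgr0": ESC ( B  ESC [ m
xterm_sgr0() { printf '\033(B\033[m'; }

# Stand-in for "tput -T vt100 sgr0": ESC [ m  SI (0x0f)
vt100_sgr0() { printf '\033[m\017'; }

# od -An -tx1 prints one byte at a time, in the order they were written.
xterm_sgr0 | od -An -tx1   # bytes: 1b 28 42 1b 5b 6d
vt100_sgr0 | od -An -tx1   # bytes: 1b 5b 6d 0f
```

Read this way, the xterm sequence starts with 1b 28 42, which is ESC ( B, and that is where the stray (B in the rendered HTML comes from.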

This solved my problem, and I hope it clears it up for you as well!