pdfminer.six: /usr/bin/env: ‘python\r’: No such file or directory

pdf2txt.py fails to run with:

/usr/bin/env: ‘python\r’: No such file or directory

This appears to be due to a DOS carriage return in the shebang line. Running dos2unix pdf2txt.py appears to fix the issue.
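To check whether an installed script is affected before patching it, a small Python sketch like the one below can inspect the first line. The function name shebang_has_carriage_return is made up for illustration and is not part of pdfminer.six.

```python
def shebang_has_carriage_return(path):
    """Return True if the file starts with a shebang line that ends in CRLF."""
    # Open in binary mode so Python does not translate the line endings.
    with open(path, "rb") as handle:
        first_line = handle.readline()
    return first_line.startswith(b"#!") and first_line.endswith(b"\r\n")
```

Calling it with the path printed by `which pdf2txt.py` should return True for an affected install and False once the file has been converted.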

Test environment:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04 LTS
Release: 16.04
Codename: xenial

pdfminer.six (20160614) - from PyPI via pip

Running in a virtualenv

$ virtualenv --version
15.0.1

$ python --version
Python 3.5.1+

About this issue

  • State: closed
  • Created 8 years ago
  • Reactions: 1
  • Comments: 34 (13 by maintainers)

Most upvoted comments

The problem still persists: I just installed it using pip on Ubuntu 16.04.3 LTS and had to manually fix pdf2txt.py to get it to work.

I ran sed $'s/\r//' -i $(which pdf2txt.py) to fix the problem.
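For systems without sed or dos2unix, roughly the same repair can be sketched in Python. This is an illustration only: convert_crlf_to_lf is a hypothetical helper name, and it converts every CRLF pair to LF (what dos2unix does), which differs slightly from the sed command above, which removes the first carriage return on each line.

```python
import shutil


def convert_crlf_to_lf(path):
    """Rewrite the file at `path` in place, turning CRLF line endings into LF.

    Returns True if the file was changed, False if it was already clean.
    """
    with open(path, "rb") as handle:
        data = handle.read()
    fixed = data.replace(b"\r\n", b"\n")
    if fixed == data:
        return False
    # Reopening the existing file for writing keeps its executable bit.
    with open(path, "wb") as handle:
        handle.write(fixed)
    return True


# Example usage (assumes pdf2txt.py is on PATH and writable by the current user):
# script = shutil.which("pdf2txt.py")
# if script is not None:
#     convert_crlf_to_lf(script)
```

If the script lives in a system-wide location, the write will need the same elevated permissions that the sed or dos2unix commands would.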

Fixed the problem with:

vim `which pdf2txt.py`

:set ff=unix
:wq

To clarify, this was how I patched the installed script:

  1. vim `which pdf2txt.py`
  2. In Vim: :set ff=unix
  3. Close the file: :wq

This is just a reformatted version of @tiru1930’s solution

Some good news: I was able to install directly from the current version using

pip install https://github.com/goulu/pdfminer/zipball/e6ad15af79a26c31f4e384d8427b375c93b03533#egg=pdfminer.six

I’ve confirmed this as tests are now passing for textract (hallelujah!). When a new version of pdfminer.six is released, I believe you can close this out @goulu.

Thanks for managing pdfminer and helping to merge many great ideas into the package!

In the latest version, 20170720, the problem still persists if the package is installed from PyPI. If you install it directly from GitHub there is no problem, because the file in git is OK.

@goulu On what OS did you prepare the package for PyPI?

The problem is that on Unix-like systems (Linux, BSD, macOS, etc.) the first line of an executable text file must not contain a \r character. The carriage return becomes part of the interpreter name, so the lookup for the interpreter fails when the file is executed (hence the error about ‘python\r’).

The code in the repository is fine, so the problem must be introduced somewhere in the process of creating the package that is uploaded to PyPI.
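One way to confirm that the packaging step is at fault would be to inspect the downloaded source archive itself. The sketch below assumes the sdist is a gzip-compressed tarball (the thread does not say which archive format was uploaded), and crlf_scripts_in_sdist is a hypothetical helper name.

```python
import tarfile


def crlf_scripts_in_sdist(archive_path):
    """List archive members whose shebang line ends in CRLF.

    Assumes `archive_path` points at a .tar.gz source distribution.
    """
    affected = []
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories, links, and other special entries
            first_line = archive.extractfile(member).readline()
            if first_line.startswith(b"#!") and first_line.endswith(b"\r\n"):
                affected.append(member.name)
    return affected
```

An empty list for the archive on PyPI would point away from the packaging step; entries such as the pdf2txt.py and dumppdf.py scripts would point towards it.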

Both pdf2txt.py and dumppdf.py have this issue. I fixed it with :set ff=unix in Vim.

pdfminer.six installed using pip on Ubuntu zesty.

Thanks for the quick reply, @bittner! This is the output of pip show pdfminer.six, which seems to be the latest version if I’m not mistaken.

Metadata-Version: 2.0
Name: pdfminer.six
Version: 20170720
Summary: PDF parser and analyzer
Home-page: http://github.com/pdfminer/pdfminer
Author: Yusuke Shinyama + Philippe Guglielmetti
Author-email: pdfminer@goulu.net
License: MIT/X
Location: /home/bastian/anaconda/envs/py35/lib/python3.5/site-packages
Requires: six, chardet, pycryptodome

I changed the shebang line to use python3 (which works on Ubuntu for Python 3). That changed the error from:

/usr/bin/env: ‘python\r’

to:

/usr/bin/env: ‘python3\r’

But this still didn’t solve the issue. I had to convert the file from DOS format to Unix format:

  1. Install dos2unix: sudo apt-get install dos2unix
  2. Find the file: whereis pdf2txt.py
  3. Convert the file to Unix format: sudo dos2unix pathto(pdf2txt.py)

This is quite a nasty bug. The easy fix is dos2unix pdf2txt.py, which shouldn’t be necessary once this bug gets fixed.

Strange. According to the commit log the \r issue should have been fixed for that version.

@goulu How do you sdist upload new releases? With python setup.py or manually? Are you on a Windows machine?

@gedankenstuecke How did you install the module, and which version is it?

(Run pip show pdfminer.six, maybe?)

Just a heads up, I’ve run dos2unix over all the following files in PR #58

  • *.py
  • *.md
  • *.rst
  • *.yml
  • .gitignore
  • Makefile
  • MANIFEST.in

… and the result is that only .travis.yml was still affected.

Oh. Sorry. I got confused. It is working for me in Ubuntu. I don’t know if there is still a problem with this PR in other operating systems.

Well, I don’t fully understand … is this problem specific to OSX?

I only tried this with OSX, but my previous points should apply equally to any Unix environment with python2 and python3 installed side by side.

If you want to migrate to using entry points, the basic idea is that you refactor each of the tools into a function in a module, and then map the name of the tool to that function. So setup.py would contain something like:

entry_points = {'console_scripts': [
    'pdf2txt=pdfminer.pdf2txt:main',
    ...
]},

In this configuration, pdf2txt.py is simply part of the pdfminer package. The pdf2txt executable is generated on install and calls pdfminer.pdf2txt.main().

I’m sure there are other ways to do it, but I like this method because it plays well with pip developer mode (pip3 install -e .). In developer mode, changes to the source code are immediately visible to the installed scripts.
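To make the refactoring concrete, here is a minimal sketch of what a tool module exposing main() could look like. It is hypothetical: the argument handling and the placeholder body are assumptions for illustration and do not reproduce the real pdf2txt options or conversion code.

```python
import argparse


def main(argv=None):
    """Hypothetical entry point for a pdf2txt-style tool.

    `argv` defaults to None so that argparse falls back to sys.argv[1:],
    which is what happens when the generated console script calls main().
    """
    parser = argparse.ArgumentParser(prog="pdf2txt")
    parser.add_argument("files", nargs="+", help="PDF files to convert")
    args = parser.parse_args(argv)
    for name in args.files:
        # Placeholder: the actual conversion logic would go here.
        print("would convert", name)
    return 0  # becomes the process exit status
```

Because the launcher is generated at install time for the target platform, no hand-written shebang line is shipped for these tools, so line endings in the source archive can no longer break it.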

This is a problem with installing pdfminer.six in a virtualenv