stringtie: Error when running prepDE.py

I get the following error when running prepDE.py on stringtie estimate output:

Traceback (most recent call last):
  File "./prepDE.py", line 255, in <module>
    geneDict.setdefault(geneIDs[i],{}) #gene_id
KeyError: 'STRG.337.1'

I only get this for one sample out of fifteen. I ran the HISAT2 > StringTie workflow as described in Pertea et al. (2016), and I want to feed the output into edgeR. Is there another option in case I cannot resolve this problem?

Additional question: What is the difference between MSTRG and STRG?

About this issue

  • State: open
  • Created 5 years ago
  • Reactions: 1
  • Comments: 22 (9 by maintainers)

Most upvoted comments

@MarinaMann, thank you for the nice words, but that problem is actually unrelated to the stringtie -e issue in this thread. In fact, the order of the command line parameters should never matter for StringTie (due to the way the argument parser is designed). If it did make a difference, as seemed to happen in your case, that would have been an ugly bug, as we never intended it! But this was something else, a bit more insidious… see below.

Not sure if it’s visible in your browser, but do you notice any difference between these two lines below?

–e –B
-e -B

(in my browser it looks like the dashes in the first line are a bit longer and closer to the letter that follows them, compared to those on the second line).
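If the difference is hard to see by eye, the characters can be inspected directly. The following is a minimal Python sketch, not part of StringTie or prepDE.py; the helper name `describe_non_ascii` is made up for illustration. It lists every non-ASCII character in a string together with its Unicode code point and name.

```python
import unicodedata


def describe_non_ascii(text):
    """Return (character, code point, Unicode name) for each non-ASCII character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if ord(ch) > 127
    ]


# The first string uses U+2013 in place of the ASCII hyphen-minus,
# written as an escape so the difference is visible in the source.
pasted = "\u2013e \u2013B"
typed = "-e -B"

print(describe_non_ascii(pasted))  # one entry per typographical dash
print(describe_non_ascii(typed))   # empty list: only plain ASCII characters
```

Any entry reported for a pasted command line points to a character that a shell or argument parser will not treat as an option prefix.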

There seems to have been an editing/typographical (?) error that affected our protocol paper when it was submitted for publication: in some places the plain ‘dash’ or ‘minus’ character (‘-’) in our command lines was unexpectedly replaced with a longer typographical dash (‘–’, strictly an en dash as it appears here). I guess some word processing/publishing software like Microsoft Word even makes this kind of substitution automatically in some contexts. The issue was previously reported when users copied gffcompare commands from the protocol paper (see gpertea/gffcompare#3), but now I see it affected other command lines too… Sorry about the confusion. Unfortunately this means that one should not simply copy and paste from the paper (or, if one does, one should edit the line before running it and replace every such dash with the regular ‘-’ character).
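For anyone who prefers not to fix pasted commands by hand, the replacement can be scripted. This is only a sketch under stated assumptions: the function `normalize_dashes` and the file names in the example are hypothetical, and the set of dash-like characters is a guess at what publishing software might substitute, not a list taken from the paper.

```python
# Dash-like characters that may replace the ASCII hyphen-minus in typeset text
# (assumed set: hyphen, non-breaking hyphen, figure dash, en dash, em dash,
# horizontal bar, minus sign).
DASH_LIKE = {
    "\u2010", "\u2011", "\u2012", "\u2013", "\u2014", "\u2015", "\u2212",
}


def normalize_dashes(command):
    """Replace every dash-like Unicode character with a plain ASCII '-'."""
    return "".join("-" if ch in DASH_LIKE else ch for ch in command)


# Hypothetical command line as it might look after copying from a PDF;
# the file names are placeholders.
pasted = "stringtie \u2013e \u2013B -G ref.gtf -o out.gtf sample.bam"
print(normalize_dashes(pasted))
```

One caveat: typesetting software sometimes turns a double hyphen into a single long dash, so a cleaned-up line is still worth checking against the tool's own usage message before running it.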