scirpy: Cannot convert output from Scirpy to dandelion

Description of the bug

I am trying to convert the AnnData object, after clonal assignment with Scirpy, into dandelion format. The conversion fails with “field productive has invalid bool T + T”. I want to convert it so that I can update the germline sequence of each BCR sequence with dandelion, because I could not find this function in Scirpy. If there is another way to do this, please let me know.

Minimal reproducible example

import scirpy as ir

ABC_irdata_exclude_orphan_dandelion = ir.io.to_dandelion(ABC_irdata_exclude_orphan)
ABC_irdata_exclude_orphan_dandelion

The error message produced by the code above

~/.conda/envs/dandelion/lib/python3.8/site-packages/airr/schema.py in validate_row(self, row)
    276                 if spec == 'number':  self.to_float(row[f], validate=True)
    277             except ValidationError as e:
--> 278                 raise ValidationError('field %s has %s' %(f, e))
    279 
    280         return True

ValidationError: field productive has invalid bool T + T
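
For context, the message suggests that productive is validated as a boolean field, and a combined string such as “T + T” is not a single boolean token. The sketch below is a simplified stand-in for that kind of check, not the airr library's actual code; the function name parse_bool_field and the accepted token sets are assumptions made for illustration.

```python
# Simplified stand-in for a boolean field validator (hypothetical; not airr's code).
# The accepted tokens below are assumptions chosen for illustration.
TRUE_TOKENS = {"T", "t", "True", "true", "TRUE", "1"}
FALSE_TOKENS = {"F", "f", "False", "false", "FALSE", "0"}


def parse_bool_field(value):
    """Return True/False for a recognised token, None for empty, else raise."""
    if isinstance(value, bool):
        return value
    if value is None or value == "":
        return None
    if value in TRUE_TOKENS:
        return True
    if value in FALSE_TOKENS:
        return False
    raise ValueError("invalid bool %s" % value)


# A single per-chain value parses; a per-cell summary string does not.
print(parse_bool_field("T"))
try:
    parse_bool_field("T + T")
except ValueError as err:
    print("field productive has", err)
```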

Version information

versions

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 19 (6 by maintainers)

Most upvoted comments

Hi @grst,

I’ve just released sc-dandelion==0.2.3 (it’s actually 0.2.2, but I thought my upload had gone wrong). I need to wait for https://github.com/pypa/warehouse/issues/11696 to be fixed before making any changes here, though… =(

OK, the ‘issue’ is with line 273: https://github.com/scverse/scirpy/blob/2c5b99e7c5205adc506a10d1aca97250051fab81/scirpy/io/_datastructures.py#L260-L273

Dandelion’s productive column in the metadata overwrites the productive key that scirpy creates in to_airr_cells, because the metadata is applied later: https://github.com/scverse/scirpy/blob/2c5b99e7c5205adc506a10d1aca97250051fab81/scirpy/io/_io.py#L726-L739
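
The overwrite described here can be illustrated with plain dictionaries. This is only a sketch of the general mechanism (a later update replacing an earlier key of the same name), not scirpy's actual code; the variable names and field values are hypothetical.

```python
# Hypothetical stand-ins; this is not scirpy's implementation.
# Chain-level fields, where 'productive' is a proper boolean.
chain_fields = {"locus": "IGH", "productive": True}

# Cell-level metadata (e.g. columns from adata.obs), where dandelion
# stores a per-cell summary string under the same name.
obs_metadata = {"productive": "T + T", "isotype": "IgM"}

row = {}
row.update(chain_fields)
row.update(obs_metadata)  # applied later, so its 'productive' value wins

print(row["productive"])
```

Because the string replaces the boolean, a later boolean validation of the productive field would reject the row.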

I can confirm that renaming the column to something other than productive on dandelion’s side resolves this.

@sbenjamaporn if you just rename the current productive column to productive_status:

adata.obs.rename(columns={'productive':'productive_status'}, inplace=True)

you should be able to do the transfer.
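
As a minimal sketch of that workaround, the snippet below applies the rename to a toy pandas DataFrame standing in for adata.obs. The column values and index labels are made up for illustration; only the rename itself comes from the suggestion above.

```python
import pandas as pd

# Toy stand-in for adata.obs; the values are made up for illustration.
obs = pd.DataFrame(
    {"productive": ["T + T", "T"], "clone_id": ["1", "2"]},
    index=["cell_1", "cell_2"],
)

# Rename only if the clashing column is present, so the step is safe to re-run.
if "productive" in obs.columns:
    obs = obs.rename(columns={"productive": "productive_status"})

print(list(obs.columns))
```

With a real AnnData object, the same rename would be applied to adata.obs before calling ir.io.to_dandelion.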

I will make this change on dandelion’s side, renaming productive to productive_status.

@grst, the productive column shows “T + T”. I preprocessed my scBCR sequences with Dandelion and then used “ddl.to_scirpy”. After that, I defined clones with Scirpy. Now I would like to convert the data back so that I can update the germline sequences for mutational analysis in Dandelion.