fastai: Potential error in TextBlock_from_df?

Hi. I am not sure why, but I am having trouble running TextBlock_from_df(). I have a data.table that looks like the following:

Id | text_article | label

d0fa7568-7d8e-4db9-870f-f9c6f668c17b | What is this study about? This study used data from the National Education Longitudinal Study (NELS:88) to examine the effects of dual enrollment programs for high school students on college degree attainment | national education longitudinal study

I am trying to run:

data_block = DataBlock(
  blocks = list(TextBlock_from_df(text_cols = "text_article")),  # list() is required on the R side
  get_x = ColReader("text_article"),
  get_y = ColReader("label"))

This follows the examples at https://docs.fast.ai/text.data.html#TextBlock.from_df and https://github.com/EagerAI/fastai#text-data, as well as the fastai books.

I also tried the following directly:

data_block = DataBlock(blocks = list(TextBlock_from_df(text_cols="text_article"))) #from fastai doc

Both gave me the following error:

Error: $ operator is invalid for atomic vectors

Traceback:

  1. DataBlock(blocks = list(TextBlock_from_df(text_cols = "text_article")))
  2. TextBlock_from_df(text_cols = "text_article")
  3. do.call(text()$TextBlock$from_df, args)

The part that raised my suspicion is the text()$TextBlock$from_df call in step 3 of the traceback. I am not really at home here: I am doing NLP for the first time, both in R and with fastai. I am passing the name of the column directly as a string, "text_article", as I am supposed to. Based on the examples, the data frame containing the text is supposed to be supplied afterwards. From what I understand, the fastai wrapper is not happy about text()$TextBlock$from_df.
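For context on the message itself, here is a minimal base-R sketch of when R raises "$ operator is invalid for atomic vectors". It does not use fastai and does not diagnose what text() returns inside the wrapper. The names fake_module and list_like are hypothetical stand-ins, used only to show that `$` fails on an atomic vector (such as a plain string) and works on a list.

```r
# Base R only. `fake_module` and `list_like` are hypothetical stand-ins,
# not objects from the fastai R wrapper.

# An atomic character vector: `$` cannot be applied to it.
fake_module <- "not a module"

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  fake_module$TextBlock,
  error = function(e) conditionMessage(e)
)
print(msg)

# A nested list supports chained `$` access, like the call in step 3.
list_like <- list(
  TextBlock = list(from_df = function(text_cols) text_cols)
)
result <- do.call(list_like$TextBlock$from_df, list(text_cols = "text_article"))
print(result)
```

If the left-hand side of `$` in step 3 evaluates to an atomic vector rather than a module-like object, the same error would appear. Checking class(text()) in the failing session would be one way to confirm or rule this out.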

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 27 (27 by maintainers)

Most upvoted comments

I have just added a test and it worked. I cannot find your notebook. Could you share the link again, please?