nagisa: Heroku deployment of NLP model Nagisa Tokenizer showing error
Hi,
I deployed my Flask app (an NLP model) on Heroku. It was basically a price prediction model: some columns were in Japanese, to which I applied NLP with the Nagisa library for tokenization, and some columns were numerical data. I pickled the vectorizers and the model and finally added them to my Flask API. But after deployment, when I entered the values in the frontend and clicked the Predict button, the result was not displayed. This is the exact error I am facing.
The exact code of tokenize_jp is:

def tokenize_jp(doc):
    doc = nagisa.tagging(doc)
    return doc.words
I am not able to figure out how to fix this. Does Nagisa work on Heroku deployments? PS: I am not really sure whether the problem is with Heroku or with Nagisa, so please help me with this.
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 22 (11 by maintainers)
Hi @Pranjal-bisht. OK. I will try to come up with a solution using this site as a reference.
Please write tokenize_jp to a file (e.g., utils_tokenizer.py), then import it both in the Python script where the pickle is saved and in the API script.
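The reason this matters: pickle stores custom functions by their import path, so tokenize_jp must live in a module that both the training script and the API script can import; if it is defined inside one script, unpickling in the other fails. A minimal sketch, assuming the file name utils_tokenizer.py and using a whitespace split as a stand-in so the snippet runs without nagisa installed:

```python
# utils_tokenizer.py  (hypothetical shared module name)

def tokenize_jp(doc):
    # In the real module: import nagisa; return nagisa.tagging(doc).words.
    # A whitespace split stands in here so this sketch runs anywhere.
    return doc.split()

# --- training side (e.g. train.py) ---
# from utils_tokenizer import tokenize_jp
# vectorizer = TfidfVectorizer(tokenizer=tokenize_jp)
# with open("vectorizer.pkl", "wb") as f:
#     pickle.dump(vectorizer, f)

# --- API side (e.g. app.py) ---
# from utils_tokenizer import tokenize_jp   # must be importable here too,
# with open("vectorizer.pkl", "rb") as f:   # or pickle.load fails with
#     vectorizer = pickle.load(f)           # "Can't get attribute 'tokenize_jp'"
```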
Hi @Pranjal-bisht. I understand your situation. Once again, I think that if you adjust the memory management in Heroku properly, the program will work without any problems. I hope it works well. If you encounter any other issues, please let me know. I think I can help you. Thanks!
Hi @Pranjal-bisht. Thank you for the Python libraries’ information. I used the following code to check memory usage. As a result, the Python libraries in your configuration use at least 345.8 MiB of memory. In addition, loading xgb models and pickle files will use additional memory.
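The measurement code itself is not included in the thread. One way to estimate per-library import cost, sketched here with the stdlib resource module (Linux/macOS only) and json standing in for a heavy library such as nagisa, might look like:

```python
import importlib
import resource

def rss_mib():
    # Peak resident set size of this process. ru_maxrss is reported
    # in kilobytes on Linux (on macOS it is in bytes, so divide by
    # 1024 ** 2 there instead).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

before = rss_mib()
importlib.import_module("json")  # stand-in for a heavy library like nagisa
after = rss_mib()
print(f"peak RSS before import: {before:.1f} MiB, after: {after:.1f} MiB")
```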
I think the 512 MB of memory in Heroku's free plan will not be enough. Your local environment has enough memory, so it works fine there.
This is not a problem with the nagisa library itself; it is a problem of how memory is used on Heroku. As a workaround, the Heroku free plan allows you to run two processes, so how about separating the tokenizing API and the xgb model API to split the memory usage?
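A minimal sketch of what the split-off tokenizer service might look like, assuming a hypothetical /tokenize endpoint and a whitespace-split stand-in for nagisa.tagging so the snippet runs without nagisa installed:

```python
from flask import Flask, jsonify, request

tokenizer_app = Flask(__name__)

def tokenize_jp(doc):
    # Stand-in for nagisa.tagging(doc).words, so this sketch is
    # runnable without nagisa.
    return doc.split()

@tokenizer_app.route("/tokenize", methods=["POST"])
def tokenize():
    text = request.get_json()["text"]
    return jsonify({"words": tokenize_jp(text)})

# In deployment this app would run as its own process/dyno
# (e.g. a Procfile line like `web: gunicorn tokenizer_service:tokenizer_app`,
# module name hypothetical). The prediction service, which loads the xgb
# model and pickled vectorizers, calls /tokenize over HTTP, so the two
# memory footprints never have to fit in one 512 MB dyno together.
```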
Hi, @Pranjal-bisht. Thank you for using nagisa! First of all, I have confirmed that nagisa works on Heroku.
As far as the error is concerned, you are getting a Heroku memory-quota (overflow) error. Nagisa uses about 270 MiB of memory, and the free Heroku plan only provides 512 MB. So, to avoid the error, you need to conserve the memory used by the other libraries.
Do you use any libraries other than nagisa on Heroku? Let me check your situation first. Thanks!