ChatterBot: UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 48: invalid start byte

I just copy-pasted the following example:

# -*- coding: utf-8 -*-
from chatterbot import ChatBot


# Create a new instance of a ChatBot
bot = ChatBot(
    'Default Response Example Bot',
    storage_adapter='chatterbot.storage.JsonFileStorageAdapter',
    logic_adapters=[
        {
            'import_path': 'chatterbot.logic.BestMatch'
        },
        {
            'import_path': 'chatterbot.logic.LowConfidenceAdapter',
            'threshold': 0.65,
            'default_response': 'I am sorry, but I do not understand.'
        }
    ],
    trainer='chatterbot.trainers.ListTrainer'
)

# Train the chat bot with a few responses
bot.train([
    'How can I help you?',
    'I want to create a chat bot',
    'Have you read the documentation?',
    'No, I have not'
    # 'This should help get you started: http://chatterbot.rtfd.org/en/latest/quickstart.html'
])

# Get a response for some unexpected input
response = bot.get_response('How do I make an omelette?')
print(response)

I get the following error:


Traceback (most recent call last):
  File "C:/Users/ccx/work/pscripts/conversationalspeech/t2.py", line 27, in <module>
    'No, I have not'
  File "C:\Users\ccx\Anaconda2\lib\site-packages\chatterbot\trainers.py", line 82, in train
    statement = self.get_or_create(text)
  File "C:\Users\ccx\Anaconda2\lib\site-packages\chatterbot\trainers.py", line 25, in get_or_create
    statement = self.storage.find(statement_text)
  File "C:\Users\ccx\Anaconda2\lib\site-packages\chatterbot\storage\jsonfile.py", line 46, in find
    values = self.database.data(key=statement_text)
  File "C:\Users\ccx\Anaconda2\lib\site-packages\jsondb\db.py", line 98, in data
    return self._get_content(key)
  File "C:\Users\ccx\Anaconda2\lib\site-packages\jsondb\db.py", line 52, in _get_content
    obj = self.read_data(self.path)
  File "C:\Users\ccx\Anaconda2\lib\site-packages\jsondb\file_writer.py", line 15, in read_data
    obj = decode(content)
  File "C:\Users\ccx\Anaconda2\lib\site-packages\jsondb\compat.py", line 32, in decode
    return json_decode(value, encoding='utf-8', object_hook=json_util.object_hook)
  File "C:\Users\ccx\Anaconda2\lib\json\__init__.py", line 352, in loads
    return cls(encoding=encoding, **kw).decode(s)
  File "C:\Users\ccx\Anaconda2\lib\json\decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\ccx\Anaconda2\lib\json\decoder.py", line 380, in raw_decode
    obj, end = self.scan_once(s, idx)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 48: invalid start byte
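
For context on the error message itself: byte 0x92 can never start a UTF-8 sequence, but in Windows-1252 it encodes the right single quotation mark (’). The sketch below uses a made-up byte string, not the actual contents of the database file, to show the same failure and how to find the offending byte. That the stored file contains Windows-1252 text is an assumption; the traceback does not confirm it.

```python
# Hypothetical sample: a "smart" apostrophe saved as Windows-1252 (byte 0x92).
raw = b"I don\x92t understand"

try:
    raw.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc
    # exc.start is the offset of the first byte that could not be decoded.
    print("invalid byte %r at position %d" % (raw[exc.start:exc.start + 1], exc.start))

# Decoding with the assumed source encoding recovers the text.
text = raw.decode("cp1252")
print(text == u"I don\u2019t understand")
```

If this is what happened, re-saving the data as UTF-8 (or removing the stale file) would avoid the error on the next read.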

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 16 (3 by maintainers)

Most upvoted comments

@vkosuri @gunthercox The problem is that I have to delete database.db each time I run a new script, because there are multiple bot scripts in the same directory.

Moreover, the same error occurs if we pass Unicode training statements to bot.train.

Hence this issue should not be closed.
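
For the case of several bot scripts sharing one directory, one possible workaround is to give each script its own database file so they stop reading each other's data. This is only a sketch: `database_path_for` is a hypothetical helper, the script path is made up, and the `database` keyword shown in the trailing comment is an assumption about the storage adapter that should be checked against the installed ChatterBot version.

```python
import os


def database_path_for(script_path):
    """Return a database file path named after the given script."""
    directory, filename = os.path.split(os.path.abspath(script_path))
    stem, _ = os.path.splitext(filename)
    return os.path.join(directory, stem + ".db")


# Example with a made-up script location.
db_path = database_path_for(os.path.join("bots", "t2.py"))
print(os.path.basename(db_path))

# Assumption: the storage adapter accepts a `database` keyword, e.g.
# bot = ChatBot('Default Response Example Bot',
#               storage_adapter='chatterbot.storage.JsonFileStorageAdapter',
#               database=db_path)
```

Inside a real script, passing `__file__` as `script_path` would tie the database name to that script.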

@vkosuri that worked!