twarc: twarc2 search without configure on Windows throws JSON parse error
I ran the request below: twarc2 search ‘#ENDSARS-is:retweet’ --start-time 2017-12-01 --end-time 2020-11-30 --flatten --archive C:\Users\USER\Desktop\MyTwarcResults.json
and I got this error message below:
Traceback (most recent call last):
File "C:\Users\USER\PycharmProjects\workspace\venv\Scripts\twarc2-script.py", line 33, in <module>
sys.exit(load_entry_point('twarc==2.0.6', 'console_scripts', 'twarc2')())
File "c:\users\user\pycharmprojects\workspace\venv\lib\site-packages\click\core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "c:\users\user\pycharmprojects\workspace\venv\lib\site-packages\click\core.py", line 782, in main
rv = self.invoke(ctx)
File "c:\users\user\pycharmprojects\workspace\venv\lib\site-packages\click\core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "c:\users\user\pycharmprojects\workspace\venv\lib\site-packages\click\core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "c:\users\user\pycharmprojects\workspace\venv\lib\site-packages\click\core.py", line 610, in invoke
return callback(*args, **kwargs)
File "c:\users\user\pycharmprojects\workspace\venv\lib\site-packages\click\decorators.py", line 33, in new_func
return f(get_current_context().obj, *args, **kwargs)
File "c:\users\user\pycharmprojects\workspace\venv\lib\site-packages\twarc\decorators.py", line 172, in __call__
result = e.response.json()
File "c:\users\user\pycharmprojects\workspace\venv\lib\site-packages\requests\models.py", line 900, in json
return complexjson.loads(self.text, **kwargs)
File "C:\Users\USER\AppData\Local\Programs\Python\Python36\lib\json\__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "C:\Users\USER\AppData\Local\Programs\Python\Python36\lib\json\decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "C:\Users\USER\AppData\Local\Programs\Python\Python36\lib\json\decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
What exactly is the cause of this error, and how can I get help?
About this issue
- State: closed
- Created 3 years ago
- Comments: 63 (37 by maintainers)
Commits related to this issue
- Log error message parse failures If errors from the Twitter API are not JSON they can cause strange errors. Instead we should catch these and log what was received from Twitter instead of JSON. Refs... — committed to DocNow/twarc by edsu 3 years ago
- Improved config logging This commit adds twarc.config.ConfigProvider which is based on click_config_file.configobj_provider and stores the file path for the config file that was used. This is useful ... — committed to DocNow/twarc by edsu 3 years ago
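The first commit addresses the failure shown in the traceback: the error handler called .json() on a response whose body was not JSON. A minimal sketch of that defensive pattern follows. The helper name describe_error is hypothetical and this is not twarc's actual code; it only illustrates catching the parse failure and reporting the raw body instead.

```python
import json


def describe_error(status_code, body):
    """Return a readable message for an HTTP error body.

    Hypothetical helper for illustration; not part of twarc.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # The body was not JSON (for example plain text or an empty string),
        # so report it verbatim instead of crashing.
        return f"Unable to parse {status_code} error as JSON: {body}"
    return f"{status_code} error: {payload}"


print(describe_error(400, "Bad Request"))
print(describe_error(400, '{"title": "Invalid Request"}'))
```

A plain-text body such as Bad Request takes the except branch, which is the kind of message reported later in this thread.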
This is a special appreciation to you both for your constant support and perseverance. I am so glad to inform you that my twarc2 now works and generates data. I can’t thank you enough, sirs, as all your responses were useful. I sincerely appreciate your patience. For the record, I think the improper configuration of twarc2 (in addition to the twarc configuration) was one of the reasons I was getting those errors.
I will begin to work on twarc2 for my archival data collection now. I will be so glad if you would come to my rescue if problems arise.
Thank you once again for the support.
– Kingsley Oladayo Ogunne
Department of Corporate Services Obafemi Awolowo University Teaching Hospitals Complex P.M.B. 5538 Ile-Ife, Nigeria
Telephone: +2348088444325, +2349050054242
With @AbirRes’ help we were able to figure out that the bearer token was not persisted to the configuration file correctly. It was a ctrl-v character, which seemed to really confuse the Twitter API. I think the ctrl-v ended up in the configuration file because we were previously hiding the input of the token (for screen recording). It could be that some Windows terminals aren’t set up to do ctrl-v properly, and users could not see that it wasn’t working since it was hidden. Tokens should now appear in the console to help catch this in the future.
So if you have this problem, please make sure you are using twarc v2.1.5 or higher (for example, by running pip install --upgrade twarc) and then reconfigure twarc2 by running twarc2 configure.
Hopefully that will allow you to use twarc2 subcommands going forwards. Thanks for everyone’s patience on this!
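To check a pasted token for the kind of invisible character described above, a small sketch like the following may help. The function name find_control_chars and the sample values are made up for illustration; a literal ctrl-v typically shows up as the control character U+0016.

```python
import unicodedata


def find_control_chars(token):
    """Return (index, codepoint) pairs for control characters in token."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(token)
        if unicodedata.category(ch) == "Cc"
    ]


# Hypothetical values: a clean token, and one that is only a literal ctrl-v.
print(find_control_chars("AAAAexampletoken"))
print(find_control_chars("\x16"))
```

An empty list means no control characters were found; any other result suggests the token should be entered again.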
I get the message: Unable to parse 400 error as JSON: Bad Request.
I am sorry, I can’t post a snapshot as I am not in front of my system right now.
@edsu I am using Windows 10 and python 3.9.5. I downloaded it from their official website. I also tried it after downloading Anaconda, where then I used the Anaconda prompt to run the commands. Furthermore, I followed the usual/suggested install methods and did not do anything custom to change the path, etc.
Hi @edsu, not sure if this thread is still running. I am facing a similar issue as @osemele, “unable to parse 400 error as json: Bad request” with twarc2. I have been able to successfully configure twarc2 as well as twarc, so the above-suggested fix does not work for me. twarc runs perfectly for me, but twarc2, unfortunately, does not. When I run the command: twarc2 stream blm > tweets.json1, it creates a file “tweets” but without any data. I have tried installing, uninstalling Anaconda, Python, etc., but unfortunately, nothing has worked so far. I also tried on a computer where the username does not have any space in it to avoid the pip breaking down, but that did not seem to be the problem as well. I am sorry for the long post, but I can’t seem to find the fix while twarc2 seems to do exactly what I need which is why I really want it to work. I would really appreciate any suggestions that you could kindly provide.
This also has me wondering if the input should actually display the keys on the console. It seems to be causing some confusion.
Did you find my suggestion about configuring twarc2 separately to avoid this error useful?
Yeah, that would be nice if it wasn’t too tricky. Do the old stand-alone apps have access to the Twitter v2 API? I guess it is confusing for someone who might use twarc and twarc2 concurrently. I wanted to update twarc2 to allow for “profiles” like twarc.
That’s very helpful, thanks @osemele. We will test running twarc2 search without having run twarc2 configure first on Windows.
Yes, I think all this while I never knew I hadn’t configured my twarc2 properly. First, I thought that since I had configured twarc, that would suffice for twarc2. Second, the error messages I was getting from twarc2 did not point towards missing or poor configuration. They never directly mentioned authorization as a problem, so my attention never went to configuration. Also, whenever I tried to configure twarc2, it never displayed the bearer token, API secret and token secret on my screen while I was pasting them. So in most cases I abruptly discontinued the process, until I read somewhere that not displaying such secret keys and tokens was the normal behaviour when configuring twarc2.
I think those getting similar errors to mine, especially when the python environment has properly been created should also look into their twarc2 configuration specifically.
Thank you once again.
Building on that a bit more: to test your Python environment, you can run this little program after replacing CHANGEME with your Bearer Token.
https://gist.github.com/edsu/a1a86ff8398edaef3010e3453665e6d6
If that works then it must be something in twarc.
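The gist itself is not reproduced here. As a rough idea of what such an environment check could look like, here is a minimal standard-library sketch. The endpoint URL, the function names build_request and check_token, and the default query are all assumptions for illustration and may not match the gist or the current API.

```python
import urllib.error
import urllib.parse
import urllib.request

# Assumed endpoint: the v2 recent search URL; adjust if the API has changed.
SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"


def build_request(bearer_token, query="twarc"):
    """Build (but do not send) an authenticated search request."""
    url = SEARCH_URL + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {bearer_token}"}
    )


def check_token(bearer_token):
    """Send the request and return (status, body) without raising on HTTP errors."""
    try:
        with urllib.request.urlopen(build_request(bearer_token)) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", "replace")


# Usage (makes a network call): print(check_token("CHANGEME"))
```

If a check like this succeeds outside twarc, the Python environment and the token are probably fine, which narrows the problem down to twarc or its configuration file.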
'#ENDSARS-is:retweet'
I think this query is missing a space; it should be
"#ENDSARS -is:retweet"
Another issue may be the ' vs " quotes, so the full command that might work is:
twarc2 search --start-time "2017-12-01" --end-time "2020-11-30" --flatten --archive "#ENDSARS -is:retweet" "C:\Users\USER\Desktop\MyTwarcResults.json"
Does that give the same error?
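One way to sidestep shell quoting differences is to launch the command from Python with an argument list. The sketch below only builds and prints the command line rather than running twarc2. It uses subprocess.list2cmdline from the standard library to show how the list would be rendered for Windows; the shortened argument list is just an example.

```python
import subprocess

# Each list element is one argument, so the space inside the query cannot
# split it into two arguments.
args = ["twarc2", "search", "#ENDSARS -is:retweet"]

# list2cmdline shows how the list would be rendered as a Windows command line.
print(subprocess.list2cmdline(args))
```

The argument containing a space is rendered in double quotes, which matches the suggestion above to use " rather than ' on Windows.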