hosts: CANNOT use someonewhocares.org because of an ENCODING or DOWNLOAD issue
When I add the someonewhocares.org directory to the data dir, I cannot update the hosts file because of an error:
Python2 - cannot finish downloading the hosts file from someonewhocares.org
Python3 - finishes downloading the file, but then fails with an ENCODING error:
[fademind@manjaro hsts.test]$ python3 updateHostsFile.py -a -c
Updating source data/add.Risk from https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Risk/hosts
Traceback (most recent call last):
File "updateHostsFile.py", line 1416, in <module>
main()
File "updateHostsFile.py", line 167, in main
update_all_sources(source_data_filename, settings["hostfilename"])
File "updateHostsFile.py", line 603, in update_all_sources
update_data = json.load(update_file)
File "/usr/lib/python3.6/json/__init__.py", line 296, in load
return loads(fp.read(),
File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 27: ordinal not in range(128)
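The traceback above ends in `encodings/ascii.py`, which means the file handed to `json.load` was being decoded with an ASCII default encoding, and byte 0xe2 is the lead byte of many three-byte UTF-8 sequences (typographic dashes and quotes, for example). A minimal sketch of one way to avoid this, assuming the JSON file really is UTF-8: open it with an explicit encoding. The helper name `load_update_json` and the sample data are hypothetical and are not taken from updateHostsFile.py.

```python
import json
import os
import tempfile


def load_update_json(path):
    """Load a JSON file as UTF-8 regardless of the locale's default encoding."""
    # Without encoding=..., open() falls back to the locale encoding, which can
    # be ASCII (e.g. under LANG=C) and then fails on any byte >= 0x80.
    with open(path, "r", encoding="utf-8") as update_file:
        return json.load(update_file)


def demo():
    # Hypothetical metadata containing an en dash (UTF-8 bytes e2 80 93).
    raw = '{"description": "hosts \u2013 sample"}'.encode("utf-8")
    with tempfile.NamedTemporaryFile(suffix=".json", delete=False) as tmp:
        tmp.write(raw)
    try:
        return load_update_json(tmp.name)
    finally:
        os.remove(tmp.name)


if __name__ == "__main__":
    print(demo())
```

With the explicit encoding, the result no longer depends on the shell's locale settings.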
When I delete the someonewhocares.org dir from the data dir, generating the hosts file works FINE:
[fademind@manjaro ~]$ uhf
Updating source data/add.Risk from https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Risk/hosts
Updating source data/CoinBlockerLists from https://raw.githubusercontent.com/ZeroDot1/CoinBlockerLists/master/hosts_browser
Updating source data/adaway.org from https://raw.githubusercontent.com/AdAway/adaway.github.io/master/hosts.txt
Updating source data/tyzbit from https://raw.githubusercontent.com/tyzbit/hosts/master/data/tyzbit/hosts
Updating source data/StevenBlack from https://raw.githubusercontent.com/StevenBlack/hosts/master/data/StevenBlack/hosts
Updating source data/add.2o7Net from https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.2o7Net/hosts
Updating source data/UncheckyAds from https://raw.githubusercontent.com/FadeMind/hosts.extras/master/UncheckyAds/hosts
Updating source data/mvps.org from http://winhelp2002.mvps.org/hosts.txt
Updating source data/hpHosts-ATS from https://hosts-file.net/ad_servers.txt
Updating source data/add.Dead from https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Dead/hosts
Updating source data/hpHosts-EMD from https://hosts-file.net/emd.txt
Updating source data/yoyo.org from https://pgl.yoyo.org/adservers/serverlist.php?hostformat=hosts&mimetype=plaintext&useip=0.0.0.0
Updating source data/hpHosts-MMT from https://hosts-file.net/mmt.txt
Updating source data/KADhosts from https://raw.githubusercontent.com/azet12/KADhosts/master/KADhosts.txt
Updating source data/Badd-Boyz-Hosts from https://raw.githubusercontent.com/mitchellkrogza/Badd-Boyz-Hosts/master/hosts
Updating source data/malwaredomainlist.com from http://www.malwaredomainlist.com/hostslist/hosts.txt
Updating source data/Spotify-Ad-free from https://raw.githubusercontent.com/CHEF-KOCH/Spotify-Ad-free/master/Spotifynulled.txt
Updating source data/add.Spam from https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Spam/hosts
Updating source extensions/fakenews from https://raw.githubusercontent.com/marktron/fakenews/master/fakenews
Updating source extensions/social from https://raw.githubusercontent.com/Sinfonietta/hostfiles/master/social-hosts
Updating source extensions/porn/clefspeare13 from https://raw.githubusercontent.com/Clefspeare13/pornhosts/master/0.0.0.0/hosts
Updating source extensions/porn/sinfonietta-snuff from https://raw.githubusercontent.com/Sinfonietta/hostfiles/master/snuff-hosts
Updating source extensions/porn/sinfonietta from https://raw.githubusercontent.com/Sinfonietta/hostfiles/master/pornography-hosts
Updating source extensions/gambling from https://raw.githubusercontent.com/Sinfonietta/hostfiles/master/gambling-hosts
==>0.0.0.0 12.170.116.68<==
==>0.0.0.0 164.109.51.67<==
==>0.0.0.0 5.100.249.215<==
==>0.0.0.0 74.112.173.77<==
==>0.0.0.0 91.205.157.38<==
==>0.0.0.0 91.212.132.230<==
Success! The hosts file has been saved in folder
It contains 246,768 unique entries.
Moving the file requires administrative privileges. You might need to enter your password.
Flushing the DNS cache to utilize new hosts file...
Flushing the DNS cache requires administrative privileges. You might need to enter your password.
Flushing the DNS cache by restarting NetworkManager.service succeeded
Flushing the DNS cache by restarting dnsmasq.service succeeded
Flushing the DNS cache by restarting NetworkManager.service succeeded
Flushing the DNS cache by restarting dnsmasq.service succeeded
About this issue
- State: closed
- Created 6 years ago
- Comments: 21 (16 by maintainers)
Commits related to this issue
- Review of get_file_by_url() Please note that this patch also introduce which is in charge of converting a domain in a line into IDNA and/or UTF-8 format. Also note the introduction of BeautifulSoup(... — committed to funilrys/hosts by funilrys 6 years ago
- Review of get_file_by_url() Please note that this patch also introduce which is in charge of converting a domain in a line into IDNA and/or UTF-8 format. Also note the introduction of BeautifulSoup(... — committed to evilfactorylabs/gatekeeper by funilrys 6 years ago
I can confirm that the PR from @funilrys https://github.com/StevenBlack/hosts/pull/520 RESOLVED the issue. CC @StevenBlack
@StevenBlack
I have this issue on macOS, but not if I use the default configuration (of the hosts repo).
This is how I can reproduce it:
git clone https://github.com/StevenBlack/hosts.git
python3 updateHostsFile.py --auto --backup --replace --flush-dns-cache
This finishes successfully
hosts/extensions/hpHosts/psh
update.json:
python3 updateHostsFile.py --auto --backup --replace --flush-dns-cache --extensions hpHosts
hosts/extensions/hpHosts/psh
in a terminal and run this to manually download the file:
python3 updateHostsFile.py --auto --backup --replace --flush-dns-cache --extensions hpHosts
We have been battling this one a long time. Two other issues were https://github.com/StevenBlack/hosts/issues/465 and https://github.com/StevenBlack/hosts/issues/440. I think @funilrys may have just closed half our tickets.
We can now close this @StevenBlack
@StevenBlack as those libraries are not built into Python, should I include a requirement.txt?
Okay Steven @StevenBlack, I'm going to try one of those libraries to see if it fixes @notDavid's problem then.
I think (from my last comment) that we can put an explanation on that issue… I don't know if it has already been done, but:
We (and Steven @StevenBlack) assume that the format is always utf-8, which is far from true, as we can see every time someone gets Error in updating source: xxx
Actually, this is one of the world's all-time issues, as we cannot reliably detect the format/encoding of a file or of raw bytes. I also think that using a library like BeautifulSoup or chardet may help, but would it come without performance issues, Steven @StevenBlack?
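To make the trade-off concrete, here is a minimal standard-library sketch of one alternative to a detection library: try UTF-8 strictly first, then a short list of fallbacks. The function name `decode_bytes` and the candidate list are assumptions for illustration only, and this is not what the project or the linked PR does. A detector such as chardet could replace the fallback step if guessing from content is preferred.

```python
def decode_bytes(data, encodings=("utf-8", "cp1252")):
    """Return (text, encoding_used) for downloaded bytes.

    Each candidate is tried strictly, in order. The candidate list is an
    assumption, not a claim about which encodings the sources really use.
    """
    for name in encodings:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte value to a code point, so this never raises;
    # the text may be wrong, but the update does not crash.
    return data.decode("latin-1"), "latin-1"


if __name__ == "__main__":
    for sample in ("café".encode("utf-8"), b"caf\xe9", b"\x81"):
        print(decode_bytes(sample))
```

Because valid UTF-8 input succeeds on the first attempt, the fallbacks only cost anything for the sources that actually are not UTF-8.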