geopandas: wrong country code ISO_3 for France in naturalearth_lowres dataset

Not sure this is the right place to mention it, but the iso_a3 code for France should be ‘FRA’, not ‘-99’.

    import geopandas
    path = geopandas.datasets.get_path('naturalearth_lowres')
    world = geopandas.read_file(path)
    world[world['name']=='France'].iso_a3

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 7
  • Comments: 19 (12 by maintainers)

Most upvoted comments

I have the same issue:

The following iso_a3 codes are incorrect.

    import geopandas as gpd
    world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
    world = world[(world.pop_est > 0) & (world.name != "Antarctica")]
    print(world[world['iso_a3'] == '-99']['name'])

    21         Norway
    43         France
    160     N. Cyprus
    167    Somaliland
    174        Kosovo
    Name: name, dtype: object

That does not have these issues? I wouldn’t include both.

It’s all in the data definition… but to your point, the simplest approach for general use is to override the values in the way I have indicated.

I’d be personally fine with manual renaming to resolve the issues pointed out in this issue and document that there is this small deviation from the original data.

I’ll go ahead with this and generate a PR

Still a problem in pandas 1.3.1+, even though https://github.com/nvkelso/natural-earth-vector/issues/284 was fixed in August 2021. Workaround remains:

    # fix for bug https://github.com/geopandas/geopandas/issues/1041
    world.loc[world['name'] == 'France', 'iso_a3'] = 'FRA'
    world.loc[world['name'] == 'Norway', 'iso_a3'] = 'NOR'
    world.loc[world['name'] == 'N. Cyprus', 'iso_a3'] = 'CYP'
    world.loc[world['name'] == 'Somaliland', 'iso_a3'] = 'SOM'
    world.loc[world['name'] == 'Kosovo', 'iso_a3'] = 'RKS'
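
For anyone who prefers not to repeat one `.loc` line per country, here is a minimal sketch of the same workaround driven by a dictionary. The helper name `fix_iso_a3` and the constant `ISO_A3_OVERRIDES` are made up for illustration, and the sample frame is a stand-in for the dataset, not real data. It only touches rows whose code is still '-99', so it should become a no-op once the data is fixed upstream.

```python
import pandas as pd

# Codes taken from the workaround above; the names are hypothetical.
ISO_A3_OVERRIDES = {
    "France": "FRA",
    "Norway": "NOR",
    "N. Cyprus": "CYP",
    "Somaliland": "SOM",
    "Kosovo": "RKS",
}


def fix_iso_a3(df, overrides=ISO_A3_OVERRIDES):
    """Return a copy of df where '-99' codes are replaced for known names."""
    fixed = df.copy()
    replacement = fixed["name"].map(overrides)  # NaN for names without an override
    needs_fix = (fixed["iso_a3"] == "-99") & replacement.notna()
    fixed["iso_a3"] = fixed["iso_a3"].mask(needs_fix, replacement)
    return fixed


# Stand-in for the 'name' and 'iso_a3' columns of the dataset (illustrative rows).
sample = pd.DataFrame({
    "name": ["France", "Germany", "Elsewhere"],
    "iso_a3": ["-99", "DEU", "-99"],
})
fixed = fix_iso_a3(sample)
print(fixed)
```

Rows with '-99' that have no entry in the dictionary are left alone, so they still show up if you filter for '-99' afterwards.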

It’s 2021 and I’m still stumbling over this bug. It is really annoying, because to plot a choropleth world map with the naturalearth_lowres dataset you have to join your own data with this dataset on the iso_a3 column. That join fails for several countries because of this old bug from 2019:

    import geopandas as gpd
    world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
    print(world[world['iso_a3'] == '-99']['name'])

    21         Norway
    43         France
    160     N. Cyprus
    167    Somaliland
    174        Kosovo
    Name: name, dtype: object

The workaround at this point in time is:

    # fix for bug https://github.com/geopandas/geopandas/issues/1041
    world.loc[world['name'] == 'France', 'iso_a3'] = 'FRA'
    world.loc[world['name'] == 'Norway', 'iso_a3'] = 'NOR'
    world.loc[world['name'] == 'N. Cyprus', 'iso_a3'] = 'CYP'
    world.loc[world['name'] == 'Somaliland', 'iso_a3'] = 'SOM'
    world.loc[world['name'] == 'Kosovo', 'iso_a3'] = 'RKS'

I understand that the problem is not caused by geopandas. But it would be very nice if you could fix this issue anyway, so users of geopandas don’t have to wait for another two years until https://github.com/nvkelso/natural-earth-vector/issues/284 is resolved.
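
The join failure described above is easy to miss, because a left join silently leaves the affected countries without data. A rough sketch of how to spot it, using plain pandas and made-up stand-in frames (`world` here imitates only the two relevant columns, and `stats` is a hypothetical user table):

```python
import pandas as pd

# Stand-in for the 'name' and 'iso_a3' columns of the dataset (illustrative rows).
world = pd.DataFrame({
    "name": ["France", "Norway", "Germany"],
    "iso_a3": ["-99", "-99", "DEU"],
})

# Hypothetical user data keyed by ISO alpha-3 code.
stats = pd.DataFrame({
    "iso_a3": ["FRA", "NOR", "DEU"],
    "value": [1.0, 2.0, 3.0],
})

# indicator=True adds a '_merge' column saying where each row found a match.
merged = world.merge(stats, on="iso_a3", how="left", indicator=True)

# Rows marked 'left_only' are countries that found no partner in the user data.
unmatched = merged.loc[merged["_merge"] == "left_only", "name"].tolist()
print(unmatched)
```

Running a check like this before plotting shows which countries would end up blank on the choropleth.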

I don’t think we want to deviate from the Natural Earth source. So if you think something is wrong, it’s best to open an issue there (https://github.com/nvkelso/natural-earth-vector/issues)

@martinfleis I checked and it’s a problem with the data source:

    import os

    import geopandas as gpd
    import requests

    # Only the last assignment takes effect; swap the two lines to test
    # the sovereignty file instead of the countries file.
    url = 'https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_sovereignty.zip'
    url = 'https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_countries.zip'

    # Download the archive into the current directory.
    path = os.path.basename(url)
    with open(path, 'wb') as f:
        f.write(requests.get(url).content)

    # Read the shapefile straight from the zip and keep the lowres columns.
    world_raw = gpd.read_file("zip://./{}".format(path))
    world_df = world_raw[
        ["POP_EST", "CONTINENT", "NAME", "ISO_A3", "GDP_MD_EST", "geometry"]
    ]
    world_df.columns = world_df.columns.str.lower()

    # List every entry whose ISO code is the '-99' placeholder.
    print(world_df[world_df['iso_a3'] == '-99']['name'])

    6                     Dhekelia
    21                  Somaliland
    22                      France
    53                      Norway
    65                      Kosovo
    133        USNB Guantanamo Bay
    167                  N. Cyprus
    168    Cyprus U.N. Buffer Zone
    169            Siachen Glacier
    170                   Baikonur
    171                   Akrotiri
    225          Indian Ocean Ter.
    247              Coral Sea Is.
    248                Spratly Is.
    249              Clipperton I.
    251    Ashmore and Cartier Is.
    252            Bajo Nuevo Bank
    253            Serranilla Bank
    254           Scarborough Reef

I noticed this page: http://www.naturalearthdata.com/downloads/110m-cultural-vectors/110m-admin-0-countries/

Countries distinguish between metropolitan (homeland) and independent and semi-independent portions of sovereign states. If you want to see the dependent overseas regions broken out (like in ISO codes, see France for example), use map units instead.

But France has an ISO alpha3 code equal to the string -99 in both data sets. I’ve opted to manually fix it for now.

    world.loc[world['name'] == 'France', 'iso_a3'] = 'FRA'
    world.loc[world['name'] == 'Norway', 'iso_a3'] = 'NOR'
    world.loc[world['name'] == 'Somaliland', 'iso_a3'] = 'SOM'
    world.loc[world['name'] == 'Kosovo', 'iso_a3'] = 'RKS'
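
If you read the raw Natural Earth file yourself, as in the download snippet earlier in this comment, another option might be to fall back to a different code column instead of hard-coding values. The sketch below assumes the raw file carries an `ADM0_A3` column next to `ISO_A3`; that column is not part of naturalearth_lowres, and its codes are Natural Earth's own, so they are not guaranteed to be ISO codes for every territory. The frame is a hand-made stand-in, not downloaded data.

```python
import pandas as pd

# Stand-in for three rows of the raw file; the ADM0_A3 column is an assumption.
raw = pd.DataFrame({
    "NAME": ["France", "Norway", "Germany"],
    "ISO_A3": ["-99", "-99", "DEU"],
    "ADM0_A3": ["FRA", "NOR", "DEU"],
})

# Keep ISO_A3 where it is set, otherwise take the fallback column.
raw["iso_a3_filled"] = raw["ISO_A3"].where(raw["ISO_A3"] != "-99", raw["ADM0_A3"])
print(raw[["NAME", "iso_a3_filled"]])
```

Check the filled values against your own data before relying on them, especially for disputed territories.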