cartopy: Major slowdown of app after upgrading to cartopy 0.20 that uses pyproj

I am still investigating, but I found that somewhere in my app, projection is taking 97% of the load time:

 97.00%  97.00%   322.7s    322.8s   __init__ (pyproj/crs/crs.py:313)
  1.00%   1.00%   14.25s    14.25s   equals (pyproj/crs/crs.py:933)

Will update this as I figure out where it’s coming from.

Some context: I am using geoviews and datashader to plot millions of points. I am not sure whether geoviews / datashader is using an outdated cartopy method to transform the points; will test.

About this issue

  • State: open
  • Created 3 years ago
  • Comments: 32 (16 by maintainers)

Most upvoted comments

I realized that the ocean is not a necessary feature to draw. Just make the axes background the ocean color and then draw the land, which is much less expensive to do. @akrherz

I’ve also had numerous problems with Cartopy 0.20.1/Shapely 1.8 suddenly being unbearably slow after I recently upgraded to them. Plotting the terrain field for a large WRF domain of 1062 x 512 points (dx = 1 km) suddenly took nearly 11m 30s. That’s unacceptably slow. With my prior version of Cartopy (0.16, I think??), it took “only” 7 min, which was still far too slow to be useful. (I’m running this on NCAR’s Cheyenne, on a head node.)

I set PYPROJ_GLOBAL_CONTEXT=ON, but that had little effect on the speed for this plot. (I noticed more of a speed-up for other plots I tried of different data, though.)

I then used the transform_first argument to plt.contourf (https://scitools.org.uk/cartopy/docs/latest/gallery/scalar_data/contour_transforms.html), and that resulted in a noticeable speed-up, to 10m 45s.

However, commenting out the call to draw the oceans (ax.add_feature(oceans)), but instead assigning my ocean color using ax.set_facecolor, resulted in a SIGNIFICANT speed-up, all the way down to 12 s. If I eliminated the transform_first argument again, it slowed down to just over 1 min, so transform_first is still quite helpful.

So for a temporary workaround to anyone experiencing similar major slowdowns with Cartopy 0.20+, hopefully implementing these steps will help:

  1. Don’t draw the oceans unless they’re actually necessary (MAJOR speed-up).
  2. Use the transform_first keyword in the contourf call (moderate speed-up).
  3. Also try PYPROJ_GLOBAL_CONTEXT=ON for a potential speed-up (in some situations, but not all).
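For step 3, one way to set the variable from inside a script rather than in the shell is sketched below. It assumes that pyproj reads PYPROJ_GLOBAL_CONTEXT when it is first imported, so the assignment has to come before any cartopy or pyproj import; that ordering is an assumption here, not something confirmed in this thread.

```python
import os

# Assumption: pyproj reads this variable when it is first imported, so it
# must be set before cartopy or pyproj is imported anywhere in the process.
os.environ["PYPROJ_GLOBAL_CONTEXT"] = "ON"

# Import cartopy only after the line above, for example:
# import cartopy.crs as ccrs
```

Exporting the variable in the shell before starting Python has the same effect and avoids any dependence on import order.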

I also very much look forward to the Cartopy 0.20.2 release to eliminate the ShapelyDeprecationWarnings (#1936) that also started appearing constantly once I upgraded.

Are you setting PYPROJ_GLOBAL_CONTEXT=ON? See the note on the new docs about that speed-up: https://scitools.org.uk/cartopy/docs/latest/index.html

However, this looks like a lot of pyproj CRS objects are being created rather than the same object being reused, and that would be what causes the slowdown. I wonder if we should look at putting a cache on the CRS constructors in addition to the Transformers?

# Many calls like this
for i in range(100):
    ax.scatter(..., transform=ccrs.PlateCarree())

# rather than storing the object
pc = ccrs.PlateCarree()
for i in range(100):
    ax.plot(..., transform=pc)
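The idea of a cache on the constructors can be illustrated with a small, self-contained sketch. SlowCRS and cached_crs are hypothetical names standing in for an expensive constructor; this is not how cartopy or pyproj implement caching, only a picture of the effect it would have.

```python
from functools import lru_cache


class SlowCRS:
    """Hypothetical stand-in for a CRS whose constructor is expensive."""

    constructions = 0  # counts how many times __init__ has run

    def __init__(self, name):
        SlowCRS.constructions += 1
        self.name = name


@lru_cache(maxsize=None)
def cached_crs(name):
    """Return one shared SlowCRS per name, building it only on first use."""
    return SlowCRS(name)


# 100 lookups, but the constructor runs only once.
for _ in range(100):
    crs = cached_crs("PlateCarree")

print(SlowCRS.constructions)
```

With a cache like this, the first loop above would cost about the same as the second one, because repeated requests hand back the same object instead of rebuilding it.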

Hi All

this seems like a pretty major performance hit on the 0.19 to 0.20 transition

does it seem like this is an edge case? is this a performance hit that could be seen (oom) for many cases?

Has any consideration been given to providing access to the proj transforms for users, enabling them to opt out of the pyproj transforms if they desire?

I’m a bit worried that the documentation note:

“The v0.20 release uses pyproj for transformations, which could be slower in some situations.”

doesn’t provide sufficient information given the magnitude of the performance hit seen here

I feel that there is more that should be considered here than only PYPROJ_GLOBAL_CONTEXT=ON

many thanks marqh

D’oh! Thanks for the better programmatic access.

While this is definitely faster, it still gets slow as I zoom in further, even without the ocean. Here’s my current example:

#!/usr/bin/env python

import matplotlib.pyplot as pl
import cartopy

ax = pl.subplot(1, 1, 1, projection=cartopy.crs.Robinson())
ax.set_global()
ax.set_facecolor(cartopy.feature.COLORS['water'])
ax.add_feature(cartopy.feature.LAND, zorder=0, edgecolor='black')
pl.show()

It took about a minute to zoom in on a region roughly corresponding to Scotland (including the Hebrides and a bit of Northern Ireland). Here’s the start of the cProfile output:

         78692232 function calls (78587675 primitive calls) in 62.467 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    5.294    5.294   60.703   60.703 {built-in method exec}
    27312    4.976    0.000    6.410    0.000 coords.py:164(xy)
  1703573    4.340    0.000    7.629    0.000 {method '_transform' of 'pyproj._transformer._Transformer' objects}
  3407146    4.220    0.000    8.479    0.000 utils.py:88(_copytobuffer)
    35554    3.796    0.000   12.785    0.000 polygon.py:438(geos_linearring_from_py)
  1703573    3.309    0.000   21.418    0.000 transformer.py:649(transform)
   493202    3.052    0.000    9.624    0.000 coords.py:76(__getitem__)
     6830    3.008    0.000   24.696    0.004 {cartopy.trace.project_linear}
  3406556    2.308    0.000    2.308    0.000 utils.py:66(_copytobuffer_return_scalar)
  1068583    1.767    0.000    5.798    0.000 coords.py:43(_update)
  1197644    1.728    0.000    2.754    0.000 predicates.py:23(__call__)
 12855265    1.471    0.000    1.475    0.000 {built-in method builtins.isinstance}
  1197596    1.288    0.000    4.454    0.000 base.py:709(is_empty)
  3407146    1.259    0.000    1.259    0.000 utils.py:138(_convertback)
   413541    1.108    0.000    1.212    0.000 coords.py:61(__iter__)
   548018    1.041    0.000    4.053    0.000 coords.py:51(__len__)
  1729426    0.918    0.000    1.252    0.000 enum.py:359(__call__)
  1703576    0.883    0.000    2.107    0.000 enums.py:13(create)
  6429967    0.866    0.000    0.866    0.000 {built-in method _ctypes.byref}
  6856486    0.813    0.000    1.395    0.000 {built-in method builtins.hasattr}
  1703575    0.742    0.000    0.742    0.000 transformer.py:313(_transformer)
...

I’m not sure if I should expect Cartopy to take that long but it surprises me.
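For anyone who wants to collect a comparable listing, a table sorted by internal time like the one above can be produced with the standard-library cProfile and pstats modules. This is a minimal sketch; workload is a hypothetical placeholder for whatever plotting or zooming code is being measured.

```python
import cProfile
import io
import pstats


def workload():
    """Hypothetical placeholder for the plotting code being profiled."""
    return sum(i * i for i in range(100_000))


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Sort by time spent inside each function itself ("internal time")
# and keep only the 20 most expensive entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("tottime").print_stats(20)
report = buffer.getvalue()
print(report)
```

Sorting by "cumtime" instead shows which high-level calls account for the most time including everything they call.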

Thanks for posting these results! I think this points to the need for some documentation around potential speedups that people can try all in one place rather than spread throughout the GitHub Issues.

From my reading of everything, it seems like the major issue is the oceans feature and not land or coastlines? This may mean we should look into optimizing polygon edge closing with the boundaries.

You should be able to get the colors programmatically from that location too: cartopy.feature.COLORS["water"] https://scitools.org.uk/cartopy/docs/latest/reference/generated/cartopy.feature.COLORS.html#cartopy.feature.COLORS

“However, commenting out the call to draw the oceans (ax.add_feature(oceans)), but instead assigning my ocean color using ax.set_facecolor, resulted in a SIGNIFICANT speed-up, all the way down to 12 s.”

Thanks @jaredalee! I found this very helpful, reducing the time to zoom in on my point map from ~30s to about ~3s.

If anyone else is curious, Cartopy’s water colour is np.array((152, 183, 226)) / 256.:

https://github.com/SciTools/cartopy/blob/591fb5450e11b42b6de1cebe4f240112f915bd52/lib/cartopy/feature/__init__.py#L22-L24
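As a small worked example, the same colour can be written out as a plain tuple of floats without importing cartopy or numpy. The name WATER_RGB is made up for this sketch, and the commented usage line assumes an existing axes object called ax.

```python
# RGB components quoted above, scaled the same way (divided by 256).
WATER_RGB = tuple(channel / 256 for channel in (152, 183, 226))
print(WATER_RGB)

# Hypothetical usage on an existing axes object:
# ax.set_facecolor(WATER_RGB)
```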

@akrherz, can you see if PR #1918 helps your use-case at all?

Sorry for the delay, sadly not much.

import cartopy
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import matplotlib.pyplot as plt

fig = plt.figure()

ax = fig.add_subplot(111, projection=ccrs.Mercator())
feature = cfeature.NaturalEarthFeature(
    name="ocean",
    category="physical",
    scale="10m",
    edgecolor="#000000",
    facecolor="#AAAAAA",
)
ax.add_feature(feature)
fig.savefig('test.png')

The run time went from ~13.8 seconds to ~13.4 seconds.

@ktyle Yeah, I realized that the ocean is not a necessary feature to draw. Just make the axes background the ocean color and then draw the land, which is much less expensive to do.

So I think I just reproduced what @greglucas stated above: the segment-by-segment reprojection in trace.pyx is very slow, and that is what makes cfeature.LAND so slow, given the large number of segments it has.

@akrherz I find an even larger performance penalty by adding cfeature.OCEAN (esp. with 10m res and transforming to another CRS), but this pre-dates 0.20.