python-skyfield: Skyfield far slower than ephem

I wrote a small script to match sightings with satellites via their TLE.

The script using skyfield is about 80x slower than the equivalent script using ephem.

My question is: can this be improved? Am I using skyfield wrong here?

#!/usr/bin/python3
import sys
import datetime
from skyfield.api import load, utc, Topos

satlist = load.tle_file(sys.argv[1])
ts = load.timescale(builtin=True)


tim=float(sys.argv[2])/1000.
lat=float(sys.argv[3])
lon=float(sys.argv[4])
alt=int(sys.argv[5])*1000

time_t = datetime.datetime.utcfromtimestamp(tim)
time_t = time_t.replace(tzinfo=utc)
t = ts.utc(time_t)
mysat=Topos(latitude_degrees=lat, longitude_degrees=lon, elevation_m=alt)
my=mysat.at(t)

minsep=9e9
sname=None
for sat in satlist:
    pos=sat.at(t)

    sep=pos.separation_from(my).degrees
    if sep<minsep:
        sname=sat.name
        minsep=sep

print("Minimum: %4.2f: %s"%(minsep,sname))
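One modest trim to the loop above is to do the separation math once, vectorized in NumPy, instead of calling separation_from() per satellite (this assumes the angle wanted is the geocentric angle between the two position vectors, which is what separation_from() returns for these geocentric positions). The .at() calls, which dominate the runtime, are unaffected. A self-contained sketch of the angle math, with toy vectors standing in for the arrays you would build from Skyfield's .position.km:

```python
import numpy as np

def angular_separations(obs_vec, sat_vecs):
    """Angle in degrees between one observer vector and N satellite
    vectors, all geocentric Cartesian (e.g. Skyfield's .position.km)."""
    obs_unit = obs_vec / np.linalg.norm(obs_vec)
    units = sat_vecs / np.linalg.norm(sat_vecs, axis=1, keepdims=True)
    cos = np.clip(units @ obs_unit, -1.0, 1.0)  # guard arccos domain
    return np.degrees(np.arccos(cos))

# Toy vectors: one satellite along the observer's direction,
# one at a right angle to it.
obs = np.array([7000.0, 0.0, 0.0])
sats = np.array([[42164.0, 0.0, 0.0],
                 [0.0, 42164.0, 0.0]])
print(angular_separations(obs, sats))  # prints [ 0. 90.]
```

In the script, `obs_vec` would be `my.position.km` and `sat_vecs` an `(N, 3)` array stacked from `sat.at(t).position.km`, after which `np.argmin` finds the closest satellite in one step.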

vs. similar code with ephem:

#!/usr/bin/python3
import sys
import math
import datetime
import ephem

degrees_per_radian = 180.0 / math.pi

def loadTLE(filename):
    # Helper omitted from the original post; a minimal version that
    # reads TLEs in groups of three lines (name, line 1, line 2).
    with open(filename) as f:
        lines = f.readlines()
    return [ephem.readtle(lines[i], lines[i+1], lines[i+2])
            for i in range(0, len(lines) - 2, 3)]

satlist = loadTLE(sys.argv[1])

tim=float(sys.argv[2])/1000.
lat=float(sys.argv[3])
lon=float(sys.argv[4])

t = datetime.datetime.utcfromtimestamp(tim)

minsep=9e9
sname=None
for sat in satlist:
    sat.compute(t)

    sep=ephem.separation((sat.sublong,sat.sublat),(lon/degrees_per_radian,lat/degrees_per_radian))*degrees_per_radian
    if sep<minsep:
        sname=sat.name
        minsep=sep

print("Minimum: %4.2f: %s"%(minsep,sname))

As a datapoint: using a TLE file with 75 satellites, and running in a 1000x loop, I get runtimes of 0.17s for ephem vs. 14s for skyfield.

Is there a way to optimize or speed up what I’m doing with skyfield?

Thanks, Sec

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 17 (12 by maintainers)

Most upvoted comments

theta_GMST1982() will be called repeatedly …

Yes, that’s indeed true, and we could avoid that call by stashing the value somewhere, maybe on the Time object. I even did a quick experiment to do so during that recent performance work, but:

  1. It made little difference to the overall time; it was a surprisingly small contribution to the cost of a small program computing positions.
  2. The cost will go away if instead Skyfield adds full-fledged support for arrays of satellites, because the time would only get computed once for a whole array; so the extra code and complexity would use up RAM for most users but without necessarily having a benefit.
  3. I have just removed a few cached values of time that were taking up memory, and wanted to think for a few weeks before adding a new one back.
  4. I want to centralize Skyfield’s half-hearted support for polar motion, which will affect the TEME_to_ITRF() routine. In particular, it will raise the question of what we cache: the theta angle itself, which would make us compute the matrices over and over again? Or would we cache the matrices, at 9× the RAM cost?
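For concreteness on the trade-off in point 4: caching the theta angle costs one float per time, while caching the resulting rotation matrix costs nine. A generic sketch of a z-axis (R3) rotation of the kind a TEME-to-ITRF-style transform uses (not Skyfield's actual routine):

```python
import numpy as np

def rotation_z(theta):
    """3x3 rotation about the z axis by angle theta in radians
    (the standard R3 rotation; an illustrative sketch only)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s, 0.0],
                     [-s, c, 0.0],
                     [0.0, 0.0, 1.0]])

theta = 1.2345
R = rotation_z(theta)
# One cached float versus nine: 8 bytes against 72.
print(np.array(theta).nbytes, R.nbytes)  # prints 8 72
```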

All of which has led to my holding off on caching that quantity for now. But it’s not impossible to imagine that in the near future some kind of caching could take place in case a Time is used with several satellites; as it will be a small gain, though, I would rather take our time with the design.

@brandon-rhodes Thank you for the pointer to timelib.py showing the use of @reify. I have a new level of appreciation for the work that goes into converting coordinates!

That’s an interesting idea! Skyfield does include some values that are computed on-demand, particularly on the Time object:

https://github.com/skyfielders/python-skyfield/blob/c10e7401598342c891d7befca00fc209392114e1/skyfield/timelib.py#L583
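For readers following along, the @reify pattern referenced here can be sketched in a few lines: a non-data descriptor that computes the value on first access and writes the result into the instance’s __dict__, which then shadows the descriptor on later accesses. (An illustrative reimplementation with made-up names and a placeholder value, not Skyfield’s actual code.)

```python
class reify:
    """Compute an attribute on first access, then cache it by writing
    the result into the instance dict, shadowing this descriptor."""
    def __init__(self, method):
        self.method = method
        self.__doc__ = method.__doc__

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = self.method(instance)
        instance.__dict__[self.name] = value  # cached; no further calls
        return value

class Time:
    calls = 0

    @reify
    def gmst(self):
        Time.calls += 1
        return 18.697374558  # placeholder value, not a real computation

t = Time()
t.gmst; t.gmst  # second access hits the instance dict, not the method
print(Time.calls)  # prints 1
```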

I’ll ponder whether velocity could become an on-demand property. One problem is that in some cases the intermediate results that would be useful to its computation might be gone by the time it is accessed, whereas if it’s computed at the same time as position, the same matrices can be applied immediately to both.

It can also be disconcerting for users if they think of, say, .at() or .apparent() as the “expensive step” where a calculation takes place, when really it’s the .velocity access at the end of the method chain that suddenly launches the intermediate steps needed to compute it.

Also, it means that the final object with the .velocity property needs to keep a reference to every previous object (even if normally they would have disappeared and had their memory freed) until it’s asked for the property value, so memory usage and object lifetime can be mysteriously higher.

My guess is that a more sustainable approach will be a skip_velocity option on .at() methods, which would then free methods like .apparent() from even needing to transform velocity, because it would be a None object.