osxphotos: Exception in Export
Describe the bug
I’m receiving this error when using export:
Exception ignored in: <function ExportDBInMemory.__del__ at 0x105633040>
Traceback (most recent call last):
File "/Users/eecue/mambaforge/lib/python3.9/site-packages/osxphotos/export_db.py", line 901, in __del__
self.close()
File "/Users/eecue/mambaforge/lib/python3.9/site-packages/tenacity/__init__.py", line 326, in wrapped_f
return self(f, *args, **kw)
File "/Users/eecue/mambaforge/lib/python3.9/site-packages/tenacity/__init__.py", line 406, in __call__
do = self.iter(retry_state=retry_state)
File "/Users/eecue/mambaforge/lib/python3.9/site-packages/tenacity/__init__.py", line 363, in iter
raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x48a68b790 state=finished raised ProgrammingError>]
Note that the exports actually seem to work. Here’s the code:
photo = photosdb.photos(uuid=[uuid])[0]
pe = osxphotos.PhotoExporter(photo)
e = osxphotos.ExportOptions()
e.convert_to_jpeg = True
e.download_missing = True
e.overwrite = True
# e.use_photokit = True
pe.export("/tmp", f"photo_{hash}.jpeg", options=e)
Desktop (please complete the following information):
- OS: Ventura 13.2.1
- osxphotos version 0.57.3
About this issue
- State: open
- Created a year ago
- Comments: 48 (34 by maintainers)
Commits related to this issue
- Fix error on closing export db, #999 — committed to RhetTbull/osxphotos by RhetTbull a year ago
- Fix error on closing export db, #999 (#1002) — committed to RhetTbull/osxphotos by RhetTbull a year ago
- Working on making export threadsafe, #999 — committed to RhetTbull/osxphotos by RhetTbull a year ago
- Working on making export threadsafe, #999 — committed to RhetTbull/osxphotos by RhetTbull a year ago
- refactor for concurrent export, #999 — committed to RhetTbull/osxphotos by RhetTbull a year ago
- Concurrency refactor 999 (#1029) * Working on making export threadsafe, #999 * Working on making export threadsafe, #999 * refactor for concurrent export, #999 * Fixed race condition in Expo... — committed to RhetTbull/osxphotos by RhetTbull a year ago
- Fix non-incrementing photo count, #999 — committed to RhetTbull/osxphotos by RhetTbull a year ago
- Fix for json() failing on photos in projects, #999 — committed to RhetTbull/osxphotos by RhetTbull a year ago
- Fix for json() failing on photos in projects, #999 (#1039) — committed to RhetTbull/osxphotos by RhetTbull a year ago
- Added test for #999 (project_info) that I missed on last branch — committed to RhetTbull/osxphotos by RhetTbull a year ago
- Feature help no selection 1036 (#1042) * Added validation for --selected * Added test for #999 (project_info) that I missed on last branch — committed to RhetTbull/osxphotos by RhetTbull a year ago
@oPromessa There were bugs in 0.59.2, notably a memory leak, that would have caused it to slow down. There were also some unnecessary calls to the AlbumInfo code, which was pretty slow, so I’ve refactored those out. Give 0.59.3 a try. In my testing with my 125GB library, exporting to an external Thunderbolt 4 drive, it’s slightly faster than 0.58.2.
--breakpoint option 😉
Thanks for the run time, that’s helpful! I’ve got a shallow json method implemented that is sufficient for export purposes and will add that in, which will bring export db size back down.
Actually I do have a project in the test suite but wasn’t testing the asdict()/json() output for this. Added a test (currently failing), and now I can work on a fix.
Another option for the json issue is to offer both a “shallow” and a “deep” json method. The deep method (0.59.1) is needed for multithreading / multiprocessing. The shallow method (0.58.1) should be sufficient for export.
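To make the shallow/deep distinction concrete, here is a minimal sketch. It is not the osxphotos implementation: AlbumRecord, PhotoRecord, and the shallow parameter are hypothetical names invented for illustration. The assumption is that a shallow serialization refers to related objects by name only, while a deep one embeds them in full, which is what makes the deep output larger.

```python
import json
from dataclasses import dataclass, field


@dataclass
class AlbumRecord:
    # Hypothetical stand-in for album metadata; not an osxphotos class.
    uuid: str
    title: str
    photo_uuids: list = field(default_factory=list)


@dataclass
class PhotoRecord:
    # Hypothetical stand-in for a photo; not an osxphotos class.
    uuid: str
    filename: str
    albums: list = field(default_factory=list)

    def asdict(self, shallow=True):
        data = {"uuid": self.uuid, "filename": self.filename}
        if shallow:
            # Shallow: refer to related objects by title only.
            data["albums"] = [a.title for a in self.albums]
        else:
            # Deep: embed the full related records.
            data["albums"] = [
                {"uuid": a.uuid, "title": a.title, "photo_uuids": list(a.photo_uuids)}
                for a in self.albums
            ]
        return data

    def json(self, shallow=True):
        return json.dumps(self.asdict(shallow=shallow), sort_keys=True)


album = AlbumRecord("A1", "Vacation", ["P1", "P2", "P3"])
photo = PhotoRecord("P1", "IMG_0001.heic", [album])
shallow_json = photo.json(shallow=True)
deep_json = photo.json(shallow=False)
print(len(shallow_json), len(deep_json))
```

In this toy case the deep form carries everything another thread or process would need without going back to the source object, at the cost of a larger record per photo.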
@oPromessa Give v0.59.1 a try. I believe it should restore performance to what you’d experienced under 0.57.0. Will still need to work on the multi-threaded code / lock files but putting that on the back burner again as I’ve got several other projects demanding attention.
@oPromessa the lock file code is contained in two utility functions, lock_filename() and unlock_filename(). I can easily add an option for --no-lock or just disable these two for now (make them no-op but retain the code for when it’s eventually needed). Will take a look at what’s the best option. Interestingly, in my profiling, these take basically no time. But I’m running on a fast APFS-formatted SSD, not a NAS. I really wish there was a way I could “mock” the NAS (slow network, flaky copies, etc.) for testing to catch these types of things.
While profiling, I found some other inefficiencies where I can pull code out of a loop to further speed up export. Will implement these.
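For readers wondering what "make them no-op" could look like, here is a rough sketch. Only the two function names come from the comment above. The bodies, the ".lock" sidecar file, and the use_lock parameter are assumptions for illustration, not the actual osxphotos code.

```python
import os
import tempfile


def lock_filename(filepath, use_lock=True):
    """Try to create '<filepath>.lock'; return True if the lock was acquired."""
    if not use_lock:
        return True  # no-op mode: behave as if the lock was acquired
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(str(filepath) + ".lock", os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def unlock_filename(filepath, use_lock=True):
    """Remove '<filepath>.lock' if present."""
    if not use_lock:
        return  # no-op mode: nothing to clean up
    try:
        os.remove(str(filepath) + ".lock")
    except FileNotFoundError:
        pass


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "photo.jpeg")
    first = lock_filename(target)                   # acquires the lock
    second = lock_filename(target)                  # fails, lock already held
    noop = lock_filename(target, use_lock=False)    # always succeeds, touches nothing
    unlock_filename(target)
    third = lock_filename(target)                   # succeeds again after unlock
    unlock_filename(target)

print(first, second, noop, third)
```

With locking on, each call costs at least one file create or delete on the destination volume. That is negligible on a local SSD but could plausibly add up on a slow network share, which is the scenario raised above.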
CLI Performance
Thanks so much. Will run tonight the CLI export with profile on versions 0.58.1 so as to have a baseline then
API Workers
In the API testing I’ve done, I see an improvement of about 36% for 251 pics. Of course, as you pointed out, the API does not have all the other (nice) options: update, delete, exiftool template, etc.
Summary results
Workers: 1
Workers: 40 (default)
I did some benchmarking. On my M1 Mac Mini (8 cores), the sweet spot appears to be 8 thread workers. The python default is number of processors * 5 but this appears to offer no benefit. This makes sense to me as a single thread per core allows the thread to saturate the core while reducing the number of thread joins. Using more threads doesn’t get more out of the CPU but increases the thread joins (which, from profiling, appear to consume much of the time). All this testing was done to a fast external SSD connected directly to the Mac Mini using Thunderbolt 4. I did this to include the file transfer time in the test, as doing an export on the same SSD uses copy-on-write and is very fast regardless of how many threads are used.
I believe this is now fixed in v0.59.0. I’ve also added an example script that performs concurrent export. In some quick testing, it does appear that on my M1 mini, there is a performance boost using concurrent export. Note that this only works on Python 3.11+ and only applies to the API; the osxphotos export CLI does not yet support concurrent export.
So I’ve gotten a very rough alpha implementation of a thread-safe export and it appears to actually be slower than single threaded. The test case was exporting 615 photos from my library to my internal SSD (M1 Mac Mini, 8 cores) with 8 max workers (concurrent.futures.ThreadPoolExecutor). The concurrent test took about twice as long as the single threaded case. Looking at profile data, it looks like there’s a lot of time spent managing threads.
To get this fully working I’ll also need to parallelize the exiftool code (which currently runs as a singleton process).
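For anyone who wants to repeat this kind of worker-count comparison, a minimal timing harness might look like the following. It uses only the standard library. fake_export is a hypothetical stand-in that just sleeps, so it behaves like pure I/O wait and will flatter the threaded case far more than a real export would. It shows how to measure, not what the result will be for osxphotos.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_export(index):
    # Hypothetical stand-in for exporting one photo; sleeps to imitate I/O wait.
    time.sleep(0.01)
    return index


def timed_run(max_workers, count=40):
    """Run `count` fake exports on a pool of `max_workers` threads and time it."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fake_export, range(count)))
    return time.perf_counter() - start, results


timings = {}
for workers in (1, 8):
    elapsed, results = timed_run(workers)
    timings[workers] = elapsed
    print(f"{workers} workers: {elapsed:.3f}s for {len(results)} items")
```

Swapping fake_export for a real per-photo export call and varying the worker counts would give numbers comparable to the ones discussed in this thread.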
I tried ProcessPoolExecutor() but it doesn’t work because the ExportDB object cannot be pickled, so I’d have to refactor that too.
Confirmed that I no longer see the memory db warning.
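On the ProcessPoolExecutor() pickling problem mentioned above: the thread doesn’t say why ExportDB can’t be pickled. One plausible cause, assumed here, is that the object holds an open SQLite connection, and sqlite3 connections cannot be pickled. DBHolder below is a hypothetical stand-in, not the real ExportDB.

```python
import pickle
import sqlite3


class DBHolder:
    # Hypothetical stand-in for an object that wraps a SQLite connection.
    def __init__(self):
        self.connection = sqlite3.connect(":memory:")


holder = DBHolder()
error_message = ""
try:
    # Sending an object to a worker process requires pickling it first.
    pickle.dumps(holder)
    pickled_ok = True
except TypeError as exc:
    pickled_ok = False
    error_message = str(exc)
finally:
    holder.connection.close()

print(pickled_ok, error_message)
```

A common workaround is to pass workers something picklable, such as a database path, and have each process open its own connection. Whether that fits ExportDB is a question for the maintainer.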