ponyc: Extremely poor performance and high memory usage when iterating a FileLines instance
Given the program:
use "files"

actor Main
  new create(env: Env) =>
    try
      var count: U32 = 0
      var path = FilePath(env.root as AmbientAuth, "./googlebooks-eng-all-1gram-20120701-0")?
      var file = OpenFile(path) as File
      for line in FileLines(file) do
        count = count + 1
      end
    end
Run it on the data file https://storage.googleapis.com/books/ngrams/books/googlebooks-eng-all-1gram-20120701-0.gz (uncompress it first).
The program takes an extremely long time to iterate over the file. Memory usage is also far higher than I expected: ~5 GB for a ~180 MB input file.
For comparison, the equivalent in Python:
$ cat test.py
count = 0
for l in open("googlebooks-eng-all-1gram-20120701-0"):
    count += 1
$ time python test.py
real 0m2.092s
user 0m1.310s
sys 0m0.108s
$ time ./test-pony
<<Manually kill>>
real 0m18.638s
user 0m2.584s
sys 0m13.456s
I could reproduce this with Pony 0.18 on Windows 10 and Linux (Xubuntu).
About this issue
- State: closed
- Created 7 years ago
- Comments: 26 (14 by maintainers)
Commits related to this issue
- reimplement FileLines using buffered.Reader Making it maintain its own cursor through the file for not disturbing other operations on the same file. Removed File.line as agreed upon in https://githu... — committed to ponylang/ponyc by deleted user 6 years ago
- Reimplement files.FileLines (#2707) * Add UnitTest.set_up similar to UnitTest.tear_down for use-cases where the Env from the TestHelper is needed to initialize state in a test, which is very hard ... — committed to ponylang/ponyc by mfelsche 6 years ago
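The commit messages above describe the fix only briefly: read through a buffer, and have the line iterator keep its own cursor so it does not disturb other operations on the same file. As a rough illustration of that idea (in Python, matching the comparison script above, and not the actual Pony implementation), a minimal sketch might look like the following. The names `BufferedLines`, `count_lines`, and `chunk_size` are made up for this example.

```python
class BufferedLines:
    """Iterate over the lines of an open binary file in fixed-size chunks.

    Hypothetical sketch: the iterator tracks its own offset (cursor) and
    restores the file's position after every read, so other users of the
    same handle are not affected.
    """

    def __init__(self, file, chunk_size=64 * 1024):
        self._file = file
        self._chunk_size = chunk_size
        self._cursor = 0            # private read offset, independent of file.tell()
        self._buffer = bytearray()  # bytes read but not yet returned as lines
        self._eof = False

    def __iter__(self):
        return self

    def _fill(self):
        """Read one chunk at the private cursor, then restore the file position."""
        saved = self._file.tell()
        self._file.seek(self._cursor)
        chunk = self._file.read(self._chunk_size)
        self._file.seek(saved)
        self._cursor += len(chunk)
        if chunk:
            self._buffer += chunk
        else:
            self._eof = True

    def __next__(self):
        while True:
            newline = self._buffer.find(b"\n")
            if newline >= 0:
                # Return the line without its trailing newline.
                line = bytes(self._buffer[:newline])
                del self._buffer[:newline + 1]
                return line
            if self._eof:
                if self._buffer:
                    # Final line with no trailing newline.
                    line = bytes(self._buffer)
                    self._buffer.clear()
                    return line
                raise StopIteration
            self._fill()


def count_lines(path, chunk_size=64 * 1024):
    """Count lines the same way the test programs above do."""
    with open(path, "rb") as f:
        return sum(1 for _ in BufferedLines(f, chunk_size))
```

With this shape, memory use is bounded by roughly one chunk plus the longest line, rather than growing with the size of the file.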
We have decided to go with the following approach: