textwrap: Integer size and overflow when calculating penalty

When testing the optimal-fit algorithm, I encountered a panic during the cost calculations:

thread 'main' panicked at 'attempt to multiply with overflow', .../textwrap-0.13.0/src/core.rs:710:21

I fixed this by multiplying my mm widths with the factor 100 instead of 1000, which should still be a sufficient precision. But looking at the cost calculations, I have two follow-up questions:

Why are the costs stored as a i32 instead of a usize? They are generated by adding positive usize values, positive integer constants and other costs, so usize should work too.
How should overflows be handled when calculating the penalty? I changed the i32 type to usize, causing an integer overflow at the same location for the max_width test case that sets the line width to usize::MAX. This is currently working, probably because the huge value target_width - line_width is cropped when casting to i32 before updating the cost. Shouldn’t the cost calculation use saturating additions and multiplications if large line widths are supported?

About this issue

Original URL
State: closed
Created 4 years ago
Comments: 17 (13 by maintainers)

Commits related to this issue

Introduce fuzz tests for wrap_optimal_fit and wrap_first_fit These fuzz tests immediately found the problem reported in #247 and gave a short way of reproducing it: wrap("x y z", usize::max_valu... — committed to mgeisler/textwrap by mgeisler 4 years ago
Introduce fuzz tests for wrap_optimal_fit and wrap_first_fit These fuzz tests immediately found the problem reported in #247 and gave a short way of reproducing it: wrap("x y", 515566821223) wi... — committed to mgeisler/textwrap by mgeisler 4 years ago
Handle overflows in wrap_optimal_fit by divide-and-conquer The `wrap_optimal_fit algorithm` computes the penalty for a gap as `gap * gap`. If a fragment has a size near `usize::max_value()` and if th... — committed to mgeisler/textwrap by mgeisler 4 years ago
Handle overflows in wrap_optimal_fit by divide-and-conquer The `wrap_optimal_fit algorithm` computes the penalty for a gap as `gap * gap`. If a fragment has a size near `usize::max_value()` and if th... — committed to mgeisler/textwrap by mgeisler 4 years ago
Handle overflows in wrap_optimal_fit by divide-and-conquer The `wrap_optimal_fit algorithm` computes the penalty for a gap as `gap * gap`. If a fragment has a size near `usize::max_value()` and if th... — committed to mgeisler/textwrap by mgeisler 4 years ago
Handle overflows in wrap_optimal_fit by divide-and-conquer The `wrap_optimal_fit‘ algorithm computes the penalty for a gap as `gap * gap`. If a fragment has a size near `usize::max_value()` and if th... — committed to mgeisler/textwrap by mgeisler 4 years ago
Add fuzz tests with fully arbitrary fragments The new `wrap_first_fit.rs` and `wrap_optimal_fit.rs` fuzz tests generate completely random fragments with arbitrary widths. They then check that the fra... — committed to mgeisler/textwrap by mgeisler 2 years ago
Use `f64` instead of `usize` for fragment widths This changes the type used for internal width computations in the wrap algorithms. Before, we used `usize` to represent the fragment widths and for th... — committed to mgeisler/textwrap by mgeisler 3 years ago
Change penalties to non-negative numbers The optimization problem solved by the optimal-fit algorithm is fundamentally a minimization problem. It is therefore not sensible to allow negative penalties... — committed to mgeisler/textwrap by mgeisler 2 years ago
Change penalties to non-negative numbers The optimization problem solved by the optimal-fit algorithm is fundamentally a minimization problem. It is therefore not sensible to allow negative penalties... — committed to mgeisler/textwrap by mgeisler 2 years ago
Change penalties to non-negative numbers The optimization problem solved by the optimal-fit algorithm is fundamentally a minimization problem. It is therefore not sensible to allow negative penalties... — committed to mgeisler/textwrap by mgeisler 2 years ago
Change penalties to non-negative numbers The optimization problem solved by the optimal-fit algorithm is fundamentally a minimization problem. It is therefore not sensible to allow negative penalties... — committed to mgeisler/textwrap by mgeisler 2 years ago
Use `f64` instead of `usize` for fragment widths This changes the type used for internal width computations in the wrap algorithms. Before, we used `usize` to represent the fragment widths and for th... — committed to mgeisler/textwrap by mgeisler 3 years ago
Use `f64` instead of `usize` for fragment widths This changes the type used for internal width computations in the wrap algorithms. Before, we used `usize` to represent the fragment widths and for th... — committed to mgeisler/textwrap by mgeisler 3 years ago

Most upvoted comments

What about #289? Wouldn’t it mostly mitigate the problem for realistic use cases? And it should still be possible to detect the overflow case by checking for infinity.

I think it’s even better: there are no infinities because the penalties are scaled up to some maximum value: https://github.com/mgeisler/textwrap/pull/289/commits/93e51143adb061d1b2d7c43a98fbec1c6b8b6131

This approach also has a funny side effect of making the penalties dependent on the line length. That is, a gap of 10 is seen as catastrophic if the line length is 20, but seen as negligible if the line length is 1000. That should be a good feature, and if you know the line length, I believe you can compute suitable penalties (now that they’re adjustable: #389) to get the effect you want.

However, since this is different that the current approach, I recall spending a lot of time with a Jupyter Notebook trying to figure out if I can scale the penalties in a nice way to make them independent of the line width — but now I think I might have tried to solve a non-problem.

mgeisler on Jun 29, 2021