pulldown-cmark: The lifetimes on BrokenLinkCallback are wrong
Here is a simple callback which marks every link as working:
fn callback<'a>(link: BrokenLink<'a>) -> Option<(CowStr<'a>, CowStr<'a>)> {
Some(("#".into(), link.reference.into()))
}
On its own, it typechecks fine. Unfortunately, this doesn’t work with Parser::with_broken_link_callback:
fn f(txt: &str) {
for _ in Parser::new_with_broken_link_callback(txt, Options::empty(), Some(&mut callback)) {
}
}
error: implementation of `FnOnce` is not general enough
--> src/lib.rs:8:80
|
8 | for _ in Parser::new_with_broken_link_callback(txt, Options::empty(), Some(&mut callback)) {
| ^^^^^^^^^^^^^ implementation of `FnOnce` is not general enough
|
::: /home/joshua/.local/lib/rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:219:1
|
219 | / pub trait FnOnce<Args> {
220 | | /// The returned type after the call operator is used.
221 | | #[lang = "fn_once_output"]
222 | | #[stable(feature = "fn_once_output", since = "1.12.0")]
... |
227 | | extern "rust-call" fn call_once(self, args: Args) -> Self::Output;
228 | | }
| |_- trait `FnOnce` defined here
|
= note: `for<'a> fn(pulldown_cmark::BrokenLink<'a>) -> Option<(pulldown_cmark::CowStr<'a>, pulldown_cmark::CowStr<'a>)> {callback}` must implement `FnOnce<(pulldown_cmark::BrokenLink<'_>,)>`
= note: ...but `FnOnce<(pulldown_cmark::BrokenLink<'_>,)>` is actually implemented for the type `for<'a> fn(pulldown_cmark::BrokenLink<'a>) -> Option<(pulldown_cmark::CowStr<'a>, pulldown_cmark::CowStr<'a>)> {callback}`
error: aborting due to previous error
The issue is that BrokenLinkCallback is typed as having the same lifetime as its outputs: https://github.com/raphlinus/pulldown-cmark/blob/e97974b8d76195c953f0d427e8725ef9ad1a0c17/src/parse.rs#L1270
That means that the callback can’t e.g. be passed to two different parsers, because the first call will fix a set lifetime:
fn f(txt: &str) {
let mut callback = |link: BrokenLink<'_>| -> Option<(CowStr<'_>, CowStr<'_>)> {
Some(("#".into(), link.reference.to_owned().into()))
};
for _ in Parser::new_with_broken_link_callback(txt, Options::empty(), Some(&mut callback)) {
}
for _ in Parser::new_with_broken_link_callback(txt, Options::empty(), Some(&mut callback)) {
}
}
error[E0499]: cannot borrow `callback` as mutable more than once at a time
--> src/lib.rs:11:80
|
8 | for _ in Parser::new_with_broken_link_callback(txt, Options::empty(), Some(&mut callback)) {
| ------------- first mutable borrow occurs here
...
11 | for _ in Parser::new_with_broken_link_callback(txt, Options::empty(), Some(&mut callback)) {
| ^^^^^^^^^^^^^
| |
| second mutable borrow occurs here
| first borrow later used here
The fix I was thinking of was something like this:
diff --git a/src/parse.rs b/src/parse.rs
index d6388b1..bccd68c 100644
--- a/src/parse.rs
+++ b/src/parse.rs
@@ -129,12 +129,12 @@ pub struct BrokenLink<'a> {
}
/// Markdown event iterator.
-pub struct Parser<'a> {
- text: &'a str,
+pub struct Parser<'input, 'callback: 'input> {
+ text: &'input str,
options: Options,
tree: Tree<Item>,
- allocs: Allocations<'a>,
- broken_link_callback: BrokenLinkCallback<'a>,
+ allocs: Allocations<'input>,
+ broken_link_callback: BrokenLinkCallback<'callback>,
html_scan_guard: HtmlScanGuard,
// used by inline passes. store them here for reuse
@@ -1266,8 +1267,8 @@ pub(crate) struct HtmlScanGuard {
pub declaration: usize,
}
-pub type BrokenLinkCallback<'a> =
- Option<&'a mut dyn FnMut(BrokenLink) -> Option<(CowStr<'a>, CowStr<'a>)>>;
+pub type BrokenLinkCallback<'b> =
+ Option<&'b mut dyn for<'a> FnMut(BrokenLink<'a>) -> Option<(CowStr<'a>, CowStr<'a>)>>;
/// Markdown event and source range iterator.
///
This does two things:
- Separates the lifetime of the link from the lifetime of the callback (by changing
&'a FnMut() -> &'a strto&'b for<'a> FnMut() -> &'a str). - Separates the lifetime of the link from the lifetime of the link (by adding a new
'callbacklifetime).
Unfortunately, this uncovers that the change can’t work:
error[E0597]: `link_label` does not live long enough
--> src/parse.rs:457:64
|
145 | impl<'a, 'b> Parser<'a, 'b> {
| -- lifetime `'a` defined here
...
450 | .or_else(|| {
| -- value captured here
...
457 | reference: link_label.as_ref(),
| ^^^^^^^^^^ borrowed value does not live long enough
...
460 | / callback(broken_link).map(|(url, title)| {
461 | | (link_type.to_unknown(), url, title)
462 | | })
| |__________________________________________________- returning this value requires that `link_label` is borrowed for `'a`
...
503 | }
| - `link_label` dropped here while still borrowed
The issue is that link_label is only alive for the duration for the length of a single loop iteration, not the lifetime of the input. Even though it’s parameterized by a lifetime 'input, it has a Box variant, so if it gets dropped the compiler has to conservatively assume the entire link is invalid.
I don’t have a solution for this, I think it will require a redesign of the parser. But people are running into this in the real world: https://github.com/rust-lang/rust/pull/79781/files#r537769663
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 19 (8 by maintainers)
Hey @Veykril,
Apologies for the delay in getting back to you. The master branch should be ready for use and it could be published as new point release. The reason that hasn’t happened yet is that there was still an option to be explored where the callback function would be a type parameter (especially around code bloat and embedability). I hope to sort that out soon.
Edit: to clarify: we could release now and do another point release later if we chose to parameterize the callback function, but that would require people tracking the latest releases to update their callback code twice. That’s the exact reason we haven’t published yet.
Interesting. That makes for a good argument to use generics. I’ll have to investigate how large the binary size cost is whenever different callbacks are used.
Okay, looking forward to that. I will try to understand more deeply in the meantime.
edit:
I understand this now. Thank you for this apt and concise explanation.