obsidian-linter: Bug: `No Bare URLs` includes the period and parenthesis at the end of the sentence.
- I have verified that I am on the latest version of the Linter
Describe the Bug
Please see the example
How to Reproduce
Input
- https://theintercept.com/2023/05/23/henry-kissinger-cambodia-bombing-survivors/.
- https://www.gettyimages.com/detail/news-photo/617942032.
Current output
- <https://theintercept.com/2023/05/23/henry-kissinger-cambodia-bombing-survivors/.>
- <https://www.gettyimages.com/detail/news-photo/617942032.>
Expected Behavior
`<>` should wrap only the URL, not the period at the end of the sentence, as shown below:
- <https://theintercept.com/2023/05/23/henry-kissinger-cambodia-bombing-survivors/>.
- <https://www.gettyimages.com/detail/news-photo/617942032>.
Device
- Desktop Windows 11. Obsidian 1.3.5.
Additional info
A trailing closing parenthesis `)` is wrapped inside the angle brackets in the same way.
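The expected behavior can be sketched roughly as follows. This is only an illustration and not the Linter's actual implementation: `BARE_URL`, `splitTrailingPunctuation`, and `wrapBareUrls` are hypothetical names, the regex is deliberately simplistic (it only handles `http`/`https` and does not skip Markdown links or code blocks), and treating an unbalanced `)` as sentence punctuation is an assumption.

```typescript
// Hypothetical sketch, not the Linter's code.
// Matches an http(s) URL that is not already preceded by "<" or a word character.
const BARE_URL = /(^|[^<\w])(https?:\/\/[^\s<>]+)/g;

// Splits a raw match into [url, trailingPunctuation].
function splitTrailingPunctuation(raw: string): [string, string] {
  let end = raw.length;
  while (end > 0) {
    const ch = raw[end - 1];
    if ('.,;:!?'.includes(ch)) {
      // Sentence punctuation at the very end is assumed not to be part of the URL.
      end--;
    } else if (ch === ')') {
      // Keep ")" only when it closes a "(" inside the URL itself.
      const url = raw.slice(0, end);
      const opens = (url.match(/\(/g) || []).length;
      const closes = (url.match(/\)/g) || []).length;
      if (closes > opens) {
        end--;
      } else {
        break;
      }
    } else {
      break;
    }
  }
  return [raw.slice(0, end), raw.slice(end)];
}

// Wraps each bare URL in angle brackets, leaving trailing punctuation outside.
function wrapBareUrls(text: string): string {
  return text.replace(BARE_URL, (_match: string, prefix: string, raw: string) => {
    const [url, trailing] = splitTrailingPunctuation(raw);
    return `${prefix}<${url}>${trailing}`;
  });
}
```

With this sketch, `wrapBareUrls('- https://www.gettyimages.com/detail/news-photo/617942032.')` leaves the period outside the brackets, and a URL such as `https://en.wikipedia.org/wiki/Foo_(bar)` keeps its balanced closing parenthesis.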
About this issue
- State: closed
- Created a year ago
- Comments: 16
I am still looking into this, @nhan000. It may be that a period at the end of a URL is not valid. In the meantime, I think we should keep this open until I can validate whether a trailing period is considered a valid part of a URL.
I believe I ironed out the kinks I was seeing. The changes should be on master and go out in the next release. Please let us know if there is an issue with master or in the next release.
Well. It may be that I cannot actually use that package, but I can likely use the regex it generates instead.
Edit: the bundling fails due to a dependency that cannot be bundled.
I seem to be hitting a bit of an issue with removing multiple angle brackets at the start or end of a URL, so I will need to go see what is going on there. Other than that, I need to add an option to enable no bare URIs as part of no bare URLs. Once those two things are knocked out, I will feel comfortable merging the change.
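For illustration only, collapsing repeated angle brackets around a URL might look something like the sketch below. The function name `collapseAngleBrackets` is hypothetical, the pattern only covers `http`/`https`, and this is a guess at the kind of cleanup being described rather than the change that was actually merged.

```typescript
// Hypothetical sketch: reduce runs of "<" and ">" around a URL to a single pair.
function collapseAngleBrackets(text: string): string {
  // One or more "<", then the URL, then one or more ">" become exactly "<url>".
  return text.replace(/<+(https?:\/\/[^\s<>]+)>+/g, '<$1>');
}
```

A URL that is already wrapped in a single pair of brackets passes through unchanged.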
Looks like it works for the scenarios in question. I am now checking that it identifies general URIs (this would be a nice plus) and that it does not stall out when certain regexes are matched against.
Looks like the regex would handle these scenarios. I do need to test something to make sure it plays well with the two scenarios mentioned here.
It might just ignore trailing parens, but I need to do some testing to see if it works and is efficient. It may also help with the URI match as well.
I may, however, be able to use https://github.com/spamscanner/url-regex-safe for the purposes of grabbing the URLs. It does look like it has the ability to ignore trailing periods. I am not sure about a trailing parenthesis, though.
Looks like a URL parser does remove any trailing periods from the URL. However, trailing parentheses do not seem to be removed.
Looks like I ran out of time to look at this. I will have to see about getting around to this.
I will have to take a closer look at this this evening.
@nhan000, I am not sure this is a bug. Can URLs end in a period? If so, this is expected behavior.