Open-XML-SDK: Malformed mailto Hyperlink causes Exception on .NET 4.5+
Hi,
This issue is similar to #7 but for recent versions of the .NET framework.
If you create a blank document and set some text to the following field code (note the extra space):
{ HYPERLINK "mailto:email@address%20.com" }
Running the following on .NET 4.0 will work fine:
var wpdoc = WordprocessingDocument.Open(@"TestDoc.docx", false);
Running the same line on .NET 4.5+ will throw the following exception:
DocumentFormat.OpenXml.Packaging.OpenXmlPackageException: Invalid Hyperlink: Malformed URI is embedded as a hyperlink in the document.
at DocumentFormat.OpenXml.Packaging.OpenXmlPackage.Load() in OpenXmlPackage.cs: line 490
at DocumentFormat.OpenXml.Packaging.OpenXmlPackage.OpenCore(String path, Boolean readWriteMode) in OpenXmlPackage.cs: line 402
at DocumentFormat.OpenXml.Packaging.WordprocessingDocument.Open(String path, Boolean isEditable, OpenSettings openSettings) in PackageDocument.cs: line 297
at DocumentFormat.OpenXml.Packaging.WordprocessingDocument.Open(String path, Boolean isEditable) in PackageDocument.cs: line 256
The inner exception is:
at System.Uri.CreateThis(String uri, Boolean dontEscape, UriKind uriKind)
at System.Uri..ctor(String uriString, UriKind uriKind)
at MS.Internal.IO.Packaging.InternalRelationshipCollection.ProcessRelationshipAttributes(XmlCompatibilityReader reader)
at MS.Internal.IO.Packaging.InternalRelationshipCollection.ParseRelationshipPart(PackagePart part)
at MS.Internal.IO.Packaging.InternalRelationshipCollection..ctor(Package package, PackagePart part)
at System.IO.Packaging.PackagePart.EnsureRelationships()
at System.IO.Packaging.PackagePart.GetRelationshipsHelper(String filterString)
at System.IO.Packaging.PackagePart.GetRelationships()
at DocumentFormat.OpenXml.Packaging.PackagePartRelationshipPropertyCollection..ctor(PackagePart packagePart)
at DocumentFormat.OpenXml.Packaging.OpenXmlPart.Load(OpenXmlPackage openXmlPackage, OpenXmlPart parent, Uri uriTarget, String id, Dictionary`2 loadedParts)
at DocumentFormat.OpenXml.Packaging.OpenXmlPartContainer.LoadReferencedPartsAndRelationships(OpenXmlPackage openXmlPackage, OpenXmlPart sourcePart, RelationshipCollection relationshipCollection, Dictionary`2 loadedParts)
at DocumentFormat.OpenXml.Packaging.OpenXmlPackage.Load()
I assume this is due to the following change introduced in .NET 4.5 (https://msdn.microsoft.com/en-us/library/hh367887(v=vs.110).aspx): an invalid mailto: URL throws an exception in the Uri class constructor.
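To check how a particular runtime treats the target from the field code above, it can be probed with Uri.TryCreate, which reports failure instead of throwing. This is only a minimal sketch; the class name is made up for illustration, and the result depends on the framework version the program runs on, so no output is claimed here.

```csharp
using System;

// Hypothetical probe class, for illustration only.
static class MailtoProbe
{
    static void Main()
    {
        // Target taken from the field code in this report.
        const string target = "mailto:email@address%20.com";

        // TryCreate returns false in the cases where the Uri constructor
        // would throw UriFormatException.
        Uri uri;
        bool parsed = Uri.TryCreate(target, UriKind.RelativeOrAbsolute, out uri);

        Console.WriteLine(parsed
            ? "Parsed as: " + uri
            : "Rejected by System.Uri on this runtime");
    }
}
```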
Word seems to be perfectly happy with the malformed mailto hyperlink so I assume this would count as a valid document. I am therefore not sure how this could be fixed elegantly.
Thanks,
About this issue
- State: closed
- Created 9 years ago
- Comments: 24 (14 by maintainers)
@igitur Thank you. But sorry, I do not think it is a good idea. On the one hand, this code adds a traversal, which will reduce performance. On the other hand, this code seems to modify the original document.
@vogla This was fixed in https://github.com/OfficeDev/Open-XML-SDK/pull/793 . Even if it weren’t, the cool thing with open source software is that you can fix it yourself. And if you’re really nice, you can contribute it back to the original project.
I just ran into this issue with an application that processes Excel documents. How can such a fundamental and important issue be ignored for years? Invalid hyperlinks occur in Excel documents all the time and everywhere. The argument that this is not a concern for an Open XML document parsing library is ridiculous.
@twsouthwick, I run into this issue very frequently and have implemented a workaround based on Eric White’s article. However, rather than having everybody implement that workaround themselves, wouldn’t it be better to offer it as part of the Open XML SDK? For example, we could either provide a separate utility class or add a static utility method to OpenXmlPackage.
Just to be clear: the main message thing was just an overstatement. I know that there are a lot of functions inside Word that are not easily resolvable via pure Open XML (e.g. everything that renders the actual content), but it is quite bad that a typical user can create a somewhat invalid Open XML file (at least from the point of view of the Open XML SDK) via a pretty basic function.
So change your main message. Clearly it’s not true.
For those still struggling with this issue, please see @EricWhiteDev's article on how to handle this: http://ericwhite.com/blog/handling-invalid-hyperlinks-openxmlpackageexception-in-the-open-xml-sdk/
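For readers who cannot reach the article, here is a rough sketch of the general idea. It is not Eric White's code. It assumes the workaround is to rewrite malformed external Target values in the package's .rels parts before opening the file with the SDK. The class name, method name, and replacement callback are all hypothetical and are not part of the Open XML SDK.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the Open XML SDK.
static class RelationshipUriFixer
{
    static readonly XNamespace RelNs =
        "http://schemas.openxmlformats.org/package/2006/relationships";

    // Replaces external relationship targets that System.Uri rejects.
    // Returns the number of targets replaced. The file is modified in place,
    // so run this on a copy if the original must stay untouched.
    public static int FixInvalidUris(string path, Func<string, string> replacement)
    {
        int fixedCount = 0;
        using (ZipArchive zip = ZipFile.Open(path, ZipArchiveMode.Update))
        {
            // Snapshot the entries, since some are deleted and re-created below.
            foreach (ZipArchiveEntry entry in zip.Entries.ToList())
            {
                if (!entry.FullName.EndsWith(".rels", StringComparison.OrdinalIgnoreCase))
                    continue;

                XDocument doc;
                using (Stream input = entry.Open())
                    doc = XDocument.Load(input);

                bool changed = false;
                foreach (XElement rel in doc.Descendants(RelNs + "Relationship"))
                {
                    // Only external targets (such as hyperlinks) are checked.
                    if ((string)rel.Attribute("TargetMode") != "External")
                        continue;

                    string target = (string)rel.Attribute("Target");
                    Uri ignored;
                    if (target == null ||
                        Uri.TryCreate(target, UriKind.RelativeOrAbsolute, out ignored))
                        continue;

                    rel.SetAttributeValue("Target", replacement(target));
                    changed = true;
                    fixedCount++;
                }

                if (!changed)
                    continue;

                // Replace the .rels entry with the corrected XML.
                string name = entry.FullName;
                entry.Delete();
                using (Stream output = zip.CreateEntry(name).Open())
                    doc.Save(output);
            }
        }
        return fixedCount;
    }
}
```

One way to use it would be to catch the OpenXmlPackageException thrown by WordprocessingDocument.Open, call RelationshipUriFixer.FixInvalidUris on a copy of the file with a callback such as `bad => "http://broken-link/"` (a placeholder value), and then open the copy. This changes the hyperlink targets in that copy, which is the trade-off raised earlier in this thread.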
The link Eric posted above seems obsolete.
@EricWhiteDev: Is there absolutely no way that this workaround will be implemented going forward?
Closing this suggestion. It is most likely not going to be implemented, so the issue should be closed to reflect that.