autorest.csharp: AutoRest fails to produce the expected XML documentation in .NET libraries

ℹ️ Updated: 02 January 2023

Problem

When AutoRest generates C# XML documentation comments in .NET libraries from OpenAPI description files whose description fields contain HTML markup tags, the tags are merely escaped: the tag delimiters are transformed into their HTML character entity equivalents. As a result, downstream tools like IntelliSense, DocFx, and other documentation processors render unexpected (incorrect) output that doesn’t reflect the OpenAPI description file author’s intended formatting.

For example, the angle brackets in HTML tags like <b>, <ul>, and <li> that appear in the description field of OpenAPI descriptions (aka “Swagger files”) are transformed into their character entity equivalents &lt; and &gt; in the XML documentation comments generated by AutoRest.
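To make the symptom concrete, here is a minimal sketch that reproduces the same kind of output. It is not AutoRest's code: it assumes only that the generator applies a plain entity-escaping step, and uses Python's standard `html.escape` as a stand-in. The `description` string is a shortened excerpt of the adminUsername description quoted later in this issue.

```python
import html

# Shortened excerpt of the adminUsername description quoted in this issue.
description = 'Cannot end in "." <br><br> **Max-length (Windows):** 20 characters.'

# Stand-in for the generator's escaping step (an assumption, not AutoRest code).
# html.escape replaces &, < and >, and by default also the quote characters.
escaped = html.escape(description)

# The tags survive only as literal text, so doc tools display "<br>" verbatim.
print(f"/// <summary> {escaped} </summary>")
```

The escaped result is well-formed XML, but the author's line breaks are lost as markup and show up as literal text instead.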

Duplicate issues:

Expected

  • AutoRest generates well-formed XML in C# doc comments per the C# language specification.
  • AutoRest generates doc comments that reflect the OpenAPI description author’s intent.
  • AutoRest has markup support parity with OpenAPI 2.0 and 3.0 specifications.
    • Specifically for this issue is parity support for handling the markup that appears in the description field of applicable entity types defined in an OpenAPI description file.
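The well-formedness expectation is also why the tags cannot simply be passed through unescaped. The sketch below uses Python's standard `xml.etree.ElementTree` parser to compare three candidate `<summary>` bodies; the helper name `is_well_formed` and the sample strings are illustrative only.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_fragment: str) -> bool:
    """Return True if the fragment parses as a single XML element."""
    try:
        ET.fromstring(xml_fragment)
        return True
    except ET.ParseError:
        return False

# HTML-style <br> passed through as-is: the tags are never closed.
raw = "<summary> text 1 <br><br> text 2 </summary>"
# Current behavior: tags escaped to entities.
escaped = "<summary> text 1 &lt;br&gt;&lt;br&gt; text 2 </summary>"
# Tags translated into XML doc comment elements.
paras = "<summary><para>text 1</para><para>text 2</para></summary>"

print(is_well_formed(raw))      # False
print(is_well_formed(escaped))  # True
print(is_well_formed(paras))    # True
```

Only the third form is both well-formed and faithful to the intended formatting.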

Expected behavior: two options

There are a couple of options I can think of that might be considered as a fix:

Encountered

OpenAPI description (Swagger file) - description field of adminUsername from Microsoft.Compute/stable/2022-03-01/ComputeRP/computeRPCommon.json#L1360

"description": "Specifies the name of the administrator account. <br><br> This property cannot be updated after the VM is created. <br><br> **Windows-only restriction:** Cannot end in \".\" <br><br> **Disallowed values:** \"administrator\", \"admin\", \"user\", \"user1\", \"test\", \"user2\", \"test1\", \"user3\", \"admin1\", \"1\", \"123\", \"a\", \"actuser\", \"adm\", \"admin2\", \"aspnet\", \"backup\", \"console\", \"david\", \"guest\", \"john\", \"owner\", \"root\", \"server\", \"sql\", \"support\", \"support_388945a0\", \"sys\", \"test2\", \"test3\", \"user4\", \"user5\". <br><br> **Minimum-length (Linux):** 1  character <br><br> **Max-length (Linux):** 64 characters <br><br> **Max-length (Windows):** 20 characters."

AutoRest-generated source code - OSProfile.cs

/// <summary> Specifies the name of the administrator account. &lt;br&gt;&lt;br&gt; This property cannot be updated after the VM is created. &lt;br&gt;&lt;br&gt; **Windows-only restriction:** Cannot end in &quot;.&quot; &lt;br&gt;&lt;br&gt; **Disallowed values:** &quot;administrator&quot;, &quot;admin&quot;, &quot;user&quot;, &quot;user1&quot;, &quot;test&quot;, &quot;user2&quot;, &quot;test1&quot;, &quot;user3&quot;, &quot;admin1&quot;, &quot;1&quot;, &quot;123&quot;, &quot;a&quot;, &quot;actuser&quot;, &quot;adm&quot;, &quot;admin2&quot;, &quot;aspnet&quot;, &quot;backup&quot;, &quot;console&quot;, &quot;david&quot;, &quot;guest&quot;, &quot;john&quot;, &quot;owner&quot;, &quot;root&quot;, &quot;server&quot;, &quot;sql&quot;, &quot;support&quot;, &quot;support_388945a0&quot;, &quot;sys&quot;, &quot;test2&quot;, &quot;test3&quot;, &quot;user4&quot;, &quot;user5&quot;. &lt;br&gt;&lt;br&gt; **Minimum-length (Linux):** 1  character &lt;br&gt;&lt;br&gt; **Max-length (Linux):** 64 characters &lt;br&gt;&lt;br&gt; **Max-length (Windows):** 20 characters. </summary>

How it’s rendered in IntelliSense:

image

Rendered on docs.microsoft.com - OSProfile.AdminPassword:

image

About this issue

  • Original URL
  • State: open
  • Created 5 years ago
  • Comments: 21 (11 by maintainers)

Most upvoted comments

@danieljurek only option 1 is sustainable to me.

  • Swagger descriptions are written in MD.
  • C# docstrings are a form of XML doc.
  • Therefore, AutoRest needs to be able to transform MD into the right XML doc. AutoRest is just a smart SDK writer; it should write actual good C#, including docs.

For instance with Python:

  • Swagger descriptions are MD
  • Python docstrings are RST
  • Therefore we plugged in an “m2r” (MD to RST) plugin that we apply to all description strings

I feel C# needs something like that.

CC @AlexanderSher

Sorry @danieljurek, no, I have not manually tested the proposed changes.

Apologies for the delay in response, as well. I moved on from Microsoft at the end of October and got caught up in the switch-to-a-new-company-and-new-job thing, and your question got lost in the fray.

On that note, there’ll likely be no change in testing status from me on this issue, but I’ll certainly continue following it. 🙂

Cc: @m-nash

ping… any update on this?

Plan B would be for the doc team to process “summary” as a big block of Markdown/HTML instead of the expected summary XML format. That could be a pragmatic way to solve it, but we should keep in mind that it would tie a particular MS-specific output to a particular MS-specific doc input.

And I throw myself under the bus for Python as well; this is the only reason the Python docs work: I write Markdown in my docstrings (where I should write only RST), and it works just because the doc team happens to parse it by chance. So again, I’m really not saying this transformation is trivial.

OK, I thought “summary” was a placeholder for text (CDATA-like), but it’s actually a proper XML language.

From what I see, there needs to be a translation from Markdown to this XML format.

Example:

/// used in this field, see [Selecting User Names for Linux on
/// Azure](https://docs.microsoft.com/azure/virtual-machines/virtual-machines-linux-usernames?toc=%2fazure%2fvirtual-machines%2flinux%2ftoc.json)

should be (ref)

/// used in this field, see <see href="https://docs.microsoft.com/azure/virtual-machines/virtual-machines-linux-usernames?">
/// Selecting User Names for Linux on Azure</see>

or this:

text 1 <br/> text 2

should be (ref from ScottHa)

<para>text 1</para>
<para>text 2</para>

Again, I’m not a C# dev but this is my understanding of the XML spec.

And yes, it’s not trivial, I never said that…
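As a rough illustration of the two translations above, here is a minimal sketch in Python. It is not AutoRest's implementation and covers only two cases: `<br>` runs become `<para>` elements, and Markdown links become `<see href="...">`. The function names (`convert_inline`, `to_xml_doc`) and the sample URL are hypothetical; a real converter would need a proper Markdown parser and would also have to handle bold text, lists, code spans, and so on.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# [label](url) Markdown links; URLs containing spaces or ")" are not handled.
LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")
# One or more consecutive <br>, <br/> or <br /> tags, with surrounding spaces.
BREAKS = re.compile(r"(?:\s*<br\s*/?>\s*)+", re.IGNORECASE)

def convert_inline(text: str) -> str:
    """Escape plain text and turn [label](url) into <see href="url">label</see>."""
    out = []
    pos = 0
    for m in LINK.finditer(text):
        out.append(escape(text[pos:m.start()]))   # text before the link
        label, url = m.group(1), m.group(2)
        out.append(f"<see href={quoteattr(url)}>{escape(label)}</see>")
        pos = m.end()
    out.append(escape(text[pos:]))                # text after the last link
    return "".join(out)

def to_xml_doc(description: str) -> str:
    """Split on <br> runs and wrap each non-empty piece in <para>."""
    parts = [p.strip() for p in BREAKS.split(description)]
    return "\n".join(f"<para>{convert_inline(p)}</para>" for p in parts if p)

print(to_xml_doc("text 1 <br/> text 2"))
print(to_xml_doc("see [Selecting User Names](https://example.com/a?b=1&c=2)"))
```

Escaping happens after the structure is recognized, so remaining special characters (including `&` inside URLs) still end up as entities and the output stays well-formed.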