runtime: Problems with thousands separators when parsing strings to numbers

For some months now, I have been aware of the problems that the parsing methods (at least Parse and TryParse) of various numeric types (at least Decimal and Double) have when dealing with thousands separators (at least in .NET 4.5, in both C# and VB.NET). The following C# sample code illustrates the point:

using System.Globalization;

string stringVal = "1,1.1";
decimal decVal = 0m;
if (decimal.TryParse(stringVal, NumberStyles.Any, CultureInfo.InvariantCulture, out decVal))
{
    // This branch is reached, because decimal.TryParse accepts "1,1.1" as a valid number
}

Such misbehaviour can be replicated under many different conditions. The basic idea is that the parsing methods don’t understand what the thousands separators really imply (i.e., groups of three digits in the aforementioned example).

Today, I focused my analysis of the code on the Decimal type and found ParseNumber (mscorlib\src\System\Number.cs). The way in which this method treats the thousands separators (in all their forms) explains the observed behaviour perfectly: it only checks whether each character is valid and doesn’t take the size of the group into account at any point.

My proposal is to make the corresponding methods fail when the thousands separators aren’t in the right positions. I intend to focus first on the Decimal type (although the aforementioned method is most likely used by other types too). Once a good enough correction for Decimal is ready, I will analyse the remaining types in detail and make additional changes if required.

The correction will consist of analysing the input string: if it includes thousands separators, their positions will be checked. If this check fails, no conversion to a number will be performed. In principle, I am planning to take advantage of the current implementation (more specifically, a loop that analyses all the characters) and rely on “low-level” approaches, although I still have to think carefully about the exact algorithm.
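As a rough illustration of the kind of check described above, here is a minimal sketch written as a wrapper around decimal.TryParse rather than as a change to ParseNumber itself. The names StrictDecimalParser, HasValidGrouping and TryParseStrict are hypothetical and not part of .NET. The sketch assumes the invariant culture, a fixed group size of three, and unsigned input without currency symbols or exponents; a real fix would presumably have to consult NumberFormatInfo.NumberGroupSizes instead.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of .NET: rejects misplaced group separators
// before delegating the actual conversion to decimal.TryParse.
static class StrictDecimalParser
{
    // Returns true when the text has no group separators, or when the first
    // group has 1 to 3 characters and every later group has exactly 3.
    // Assumption: unsigned input and a fixed group size of 3.
    public static bool HasValidGrouping(string text, NumberFormatInfo format)
    {
        string groupSep = format.NumberGroupSeparator;
        string decimalSep = format.NumberDecimalSeparator;

        int decimalPos = text.IndexOf(decimalSep, StringComparison.Ordinal);
        string integerPart = decimalPos >= 0 ? text.Substring(0, decimalPos) : text;
        string fractionPart = decimalPos >= 0
            ? text.Substring(decimalPos + decimalSep.Length)
            : "";

        // Group separators are not allowed after the decimal separator.
        if (fractionPart.Contains(groupSep)) return false;

        string[] groups = integerPart.Split(new[] { groupSep }, StringSplitOptions.None);
        if (groups.Length == 1) return true; // no separators at all

        if (groups[0].Length < 1 || groups[0].Length > 3) return false;
        for (int i = 1; i < groups.Length; i++)
        {
            if (groups[i].Length != 3) return false;
        }
        return true;
    }

    // Parses only when the grouping check passes; character validity is
    // still left to decimal.TryParse.
    public static bool TryParseStrict(string text, out decimal value)
    {
        value = 0m;
        NumberFormatInfo format = CultureInfo.InvariantCulture.NumberFormat;
        return HasValidGrouping(text, format)
            && decimal.TryParse(text, NumberStyles.Number, CultureInfo.InvariantCulture, out value);
    }
}
```

With this sketch, TryParseStrict rejects "1,1.1", "12,2,3" and "1," while still accepting "1,000.5" and "1234". Doing the check in a separate pass keeps the example short; folding it into the existing character loop, as proposed, would avoid the extra allocations.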

About this issue

State: closed
Created 9 years ago
Reactions: 3
Comments: 114 (61 by maintainers)

Most upvoted comments

Should we also submit a proposal to ECMAScript, given that JavaScript does the same:

parseInt("1.2,2")
> 1

parseFloat("1,2,3.4")
> 1

Is this a bug in JavaScript? Python and Ruby also have similar leniency in numeric parsing.

wasting time

has been said an enormous number of times in this thread by the OP, which is pretty ridiculous.

language / culture (ironically) barrier.

Given the OP’s attitude towards others, this is a moot point. Every human on the planet should have some self-awareness and some tolerance for others’ opinions. Failing to possess this basic human quality is your fault. Don’t blame your culture for it or use it as an excuse. You know exactly what you are saying when you repeat it so many times.

ghost on Dec 11, 2015

I wonder, can this be rephrased into a documentation enhancement request?

Should the function docs describe more closely what the de facto implementation does, so that people who rely on the observed behaviour (eliding group separators) have a guarantee that it will continue to work that way?

We’ve already established that the function contract cannot be changed due to demonstrated reliance on the behaviour. Should it be changed to explicitly state this instead of relying on happenstance?

zao on Dec 11, 2015

@mikedn (This is precisely the kind of conversation which I don’t want to have, because it is a pure waste of time. A more-sensible-than-me person would plainly ignore such a pointless argument, but I am too damn respectful and don’t like ignoring people. You clearly come from a completely different background and I don’t have any interest in convincing you of anything, just in showing what I consider relevant to people who think like me. I am not sure why you are so interested in continuing a conversation with a person who sees things from a completely different perspective than you do. This will be my last reply to comments along these lines.)

Nothing you are saying makes any sense! You are talking about applications. You are not facing the problem as a programmer, but as a user! If you click on a button, you don’t know what the program is doing internally! These internals are precisely what we are discussing!

Do you seriously think that if Decimal.Parse fails in a certain situation, all the applications relying on it would fail too?! Only the applications built by incompetent programmers would fail. Competent programmers create applications which will always work independently of virtually anything else (e.g., the errors of the given programming language, how clueless the given user might be, etc.)!

The applications created with .NET 2.0 (and before .NET existed) didn’t include lots of the fancy features which the latest version includes, but basic functionalities like inputting “12,2,3” and not getting any error were still working anyway. With loops, conditions and a couple of other things you can create almost anything; but basic functionality is not the goal of modern programming languages. Their goal is to be as programmer-friendly as possible (e.g., to maximise productivity), not to mention the internal consistency of the given environment (perhaps irrelevant to you, but not to everyone).

Have you read all the questions in my previous comment? What do you think these people did after realising that the decimal (or double or float) parsing method wasn’t working as expected?! Leave it like this? Show a warning saying “Sorry, the functionality which you have requested is not yet supported by .NET”?! If you have to multiply 5 times 5 and the multiplication sign of your calculator doesn’t work, you can do 5+5+5+5+5; would that mean that the multiplication sign doesn’t need to be fixed? No. You need the multiplication sign to do things easily and quickly. And even worse: if you are selling calculators and one of your clients comes to you complaining about the multiplication sign not working, would you tell him “use the addition sign”? What would that client think of your professionalism and the reliability of your products?

Please, stop turning what seemed a dream come true (being able to perform relevant modifications in CoreCLR & CoreFX) into meaningless conversations which I don’t want to have.

varocarbas on Dec 10, 2015

@mikedn I am answering mostly to highlight that you misunderstood my comment (“see this baseless accusation”). My bothering-you comment wasn’t exactly addressed to you and was meant in a quite wide sense, along the lines of “subjective views when dealing with objective facts”. I have been answering comments from quite a few people for a while, which was tiring, because I don’t like it (I wasn’t expecting it here and don’t even think that things here should be that way).

I am not trying to criticise anyone’s approach, just clearly stating what I am willing to do. I am more than happy to explain what is required from a strictly technical perspective, to convert my “I think that” into verifiable facts. If required, I am also very happy to follow a more or less complex process (even involving some random chatting), but only when aiming at a clearly defined goal (not the case so far).

What if I participate exclusively in conversations dealing with the implementation aspects? You can discuss as much as you wish, come to any conclusion, and let me know only if you think that my approach is worthwhile. Would this be acceptable?

varocarbas on Dec 10, 2015

I am seriously not interested in convincing anyone or tolerating not-completely-straightforward/honest attitudes.

I’ll pretend that I didn’t see this baseless accusation.

I have explained my point clearly, would love performing these modifications (and quite a few other ones)

That’s all fine and appreciated but again, this isn’t about changing the code.

If you (the decision-makers) consider that I might go ahead

Decision makers (Microsoft people) have yet to comment on this matter. Neither I nor anyone else who has commented in this thread so far works for Microsoft. But the decision makers have clearly stated (and documented) in the past that breaking changes need to be discussed. That’s exactly what we are doing now, but it appears you’re not willing to have a discussion.

if you have solid enough expertise related to what is being discussed here (e.g., efficient algorithm building, deep .NET knowledge mainly regarding this specific implementation, you are a local decision-maker, etc.)

Again, we’re not discussing any implementation details, we’re discussing the behavior of the API. I don’t know how to make this more clear, you’re approaching this from the wrong angle.

mikedn on Dec 10, 2015

Thanks for your contributions, but seriously this is not what I want.

I have started some threads which I consider completely clear. If I am wrong and this community doesn’t think like me, I would not implement anything (and, most likely, wouldn’t contribute further).

I like objectivity-driven communities where all the members have similar knowledge and expectations and where subjectivity is not favoured. The open .NET seems a perfect opportunity for objective-correctness-focused discussions where everyone would win (Microsoft by getting a beyond-excellent product and the contributors by working on such worthy resources; seriously, after taking a quick look at CoreFX & CoreCLR I am speechless); this situation is certainly very appealing to me. Participating in random chats with random people (no offense) about random issues is not what I want.

Please, if you have solid enough expertise related to what is being discussed here (e.g., efficient algorithm building, deep .NET knowledge mainly regarding this specific implementation, you are a local decision-maker, etc.) and you want to share anything with me (question, suggestion, request, etc.) from a completely technical and objective perspective, feel free to contact me. Otherwise, I will not answer you. As said, no disrespect intended, just trying to avoid wasting everyone’s time.

varocarbas on Dec 10, 2015

@mikedn I am seriously not interested in convincing anyone or tolerating not-completely-straightforward/honest attitudes. I have explained my point clearly and would love to perform these modifications (and quite a few other ones), but if I cannot, I will accept it (I didn’t have anything until yesterday).

The situation is very clear: you have “1,” being accepted as a valid number. Do you want that? Excellent; but let’s not waste everyone’s time by trying to prove that this should actually be the case (you don’t have any way to convince me of such a thing).

I have made my proposals and spent a considerable amount of time explaining different issues. If you (the decision-makers) consider that I might go ahead (or even that I should provide further information), please let me know.

varocarbas on Dec 10, 2015

but triggers an error with “1…”. Does it make sense? No.

The decimal point has a very specific meaning and its presence and position affects the result. IMO the fact that 1… triggers an error makes sense.

mikedn on Dec 10, 2015

@Joe4evr It was a rhetorical question. The current implementation of Decimal.Parse (or similar) accepts “1,” as valid, but triggers an error with “1…”. Does it make sense? No. That’s why I am suggesting changing it (which @mikedn thinks is not a good idea).

PS: if you think that numbers are confusing, why are you commenting in a thread aiming to change the source code dealing with these issues?! Note that the culture is always taken into account when parsing decimal (when using an overload that doesn’t take this argument, the code assumes a default one). PS2: I don’t want to offend anyone, but I was expecting a bit more “action” (i.e., actually implementing changes or discussing with people who are actually knowledgeable about the given implementation). All I see here is general talking.

varocarbas on Dec 10, 2015

This thread has turned into more of a chat room on compatibility. Definitely a good topic, but this isn’t really the best forum for it. Feel free to use the .NET Foundation Forums for this discussion or I see that Reddit has a thread going on this topic. I’m going to close this issue as a result.

I think that the community has largely spoken on this topic. A brief summary, plus some small additions:

We’re not going to make breaking changes to existing APIs, except in the rare case for security. Where we want to change the behavior of an existing API, we can always add a new one and that’s a fine conversation. We need to balance evolution and stability.

On the particular topic of formats and culture, please do refer as much as possible to industry specs to justify and provide weight to proposals. Discussions on these types of topics w/o authoritative sources quickly devolve as this one has. Internally, we have people with ECMA 334/335, Unicode, HTTP and various chip specs on their desks. They often ‘argue’ in terms of those specs. It’s a good approach to follow.

Pointers on the contribution guidelines: Overall Guidelines and Managed Code Compatibility.

Have a great weekend!

richlander on Dec 12, 2015

@canton7 The problem here was that I didn’t want to participate in a certain discussion, which some people didn’t like. Better: some people didn’t like me being very clear on this and other fronts (and got offended when this was never my intention). I wanted to plainly propose something (to eventually do some tests and prove some points, but only from a technical point of view) and to leave the analysis of further issues and the final decision to others (i.e., Microsoft and/or the community not including me).

I should have plainly written the code, done a simple fork/pull request and nothing else. I made a mistake. Lesson learned (= from now on more forking/pulling and less talking).