runtime: Problems with thousands separators when parsing strings to numbers

For some months, I have been aware of the problems that the parsing methods (at least Parse and TryParse) of various numeric types (at least Decimal and Double) have when dealing with thousands separators (at least in .NET 4.5, in both C# and VB.NET). The following C# sample code illustrates the point well:

using System.Globalization;

string stringVal = "1,1.1";
decimal decVal = 0m;
if (decimal.TryParse(stringVal, NumberStyles.Any, CultureInfo.InvariantCulture, out decVal))
{
    // This part will be reached, because decimal.TryParse treats "1,1.1" as a valid number:
    // the group separator is simply dropped and decVal becomes 11.1
}

Such misbehaviour can be replicated under many different conditions. The basic problem is that the parsing methods don’t understand what the thousands separators really imply (i.e., groups of 3 digits in the aforementioned example).
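A few more inputs can be checked with a small helper. This is only a sketch: the class and method names (LenientParsingCheck, Accepts) are made up for illustration, and the method simply wraps the same decimal.TryParse call used in the sample above.

```csharp
using System.Globalization;

public static class LenientParsingCheck
{
    // Hypothetical helper: reports whether decimal.TryParse accepts the text
    // with NumberStyles.Any under the invariant culture.
    public static bool Accepts(string text)
    {
        decimal ignored;
        return decimal.TryParse(text, NumberStyles.Any, CultureInfo.InvariantCulture, out ignored);
    }
}
```

With this helper, Accepts("1,000.0") and Accepts("1,1.1") both return true, while Accepts("1.1.1") returns false, because a second decimal separator is rejected.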

Today, I focused my analysis of the code on the Decimal type and found ParseNumber (mscorlib\src\System\Number.cs). The way in which this method treats the thousands separators (in all their forms) perfectly explains the observed behaviour: it focuses on valid/invalid characters and doesn’t take the size of the group into account at any point.

My proposal consists of forcing the corresponding methods to trigger an error when the thousands separators aren’t in the right positions. My intention is to focus first on the Decimal type (although the aforementioned method is most likely used by other types too). Once a good enough correction for Decimal is ready, I will analyse the remaining types in detail and perform additional changes if required.

The correction will consist of analysing the input string; if it includes thousands separators, their locations will be checked. If this analysis fails, no conversion to a number will be performed. In principle, I am planning to take advantage of the current implementation (more specifically, a loop analysing all the characters) and rely on “low level” approaches, although I still have to think carefully about the exact algorithm.
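The kind of position check described above could look roughly like the following. It is a sketch under stated assumptions, not the actual change to ParseNumber: GroupSeparatorCheck and HasValidGrouping are hypothetical names, only the first entry of NumberGroupSizes is used, and the check splits the string instead of hooking into the existing character loop.

```csharp
using System;
using System.Globalization;

public static class GroupSeparatorCheck
{
    // Hypothetical helper, not part of .NET: returns true when every group separator
    // in 'text' sits at a valid position for 'format'.
    // Simplifying assumptions: only NumberGroupSizes[0] is used; signs, currency symbols,
    // exponents and whitespace are not handled; whether the remaining characters are
    // digits is left to the regular parser.
    public static bool HasValidGrouping(string text, NumberFormatInfo format)
    {
        string groupSep = format.NumberGroupSeparator;
        string decimalSep = format.NumberDecimalSeparator;
        if (string.IsNullOrEmpty(groupSep)) return true; // nothing to validate

        int groupSize = format.NumberGroupSizes.Length > 0 ? format.NumberGroupSizes[0] : 0;

        int decimalPos = text.IndexOf(decimalSep, StringComparison.Ordinal);
        string integerPart = decimalPos >= 0 ? text.Substring(0, decimalPos) : text;
        string fractionPart = decimalPos >= 0 ? text.Substring(decimalPos + decimalSep.Length) : "";

        // Group separators are not expected after the decimal separator.
        if (fractionPart.IndexOf(groupSep, StringComparison.Ordinal) >= 0) return false;

        // No separators at all: nothing else to check.
        if (integerPart.IndexOf(groupSep, StringComparison.Ordinal) < 0) return true;

        // Separators present although the format does not group digits.
        if (groupSize <= 0) return false;

        string[] groups = integerPart.Split(new[] { groupSep }, StringSplitOptions.None);

        // Leading group: 1..groupSize characters; every later group: exactly groupSize.
        if (groups[0].Length < 1 || groups[0].Length > groupSize) return false;
        for (int i = 1; i < groups.Length; i++)
        {
            if (groups[i].Length != groupSize) return false;
        }
        return true;
    }
}
```

With CultureInfo.InvariantCulture.NumberFormat, this returns true for "1,000.0" and "1234.5" and false for "1,1.1", "1," and "12,2,3".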

About this issue

  • State: closed
  • Created 9 years ago
  • Reactions: 3
  • Comments: 114 (61 by maintainers)

Most upvoted comments

Should we also submit a proposal to ECMAScript as JavaScript does:

parseInt("1.2,2")
> 1

parseFloat("1,2,3.4")
> 1

Is this a bug in JavaScript? Python and Ruby also possess similar relaxation in numeric parsing.

wasting time

has been said an enormous number of times in this thread by the OP, which is pretty ridiculous.

language / culture (ironically) barrier.

Given the OP’s attitude towards others, this is a moot point. Every human on the planet should have some self-awareness and some kind of tolerance for others’ opinions. Failing to possess this basic human quality is your fault. Don’t blame your culture for it or use it as an excuse. You know exactly what you are saying when you repeat it so many times.

I wonder, can this be rephrased into a documentation enhancement request?

Should the function docs describe more closely what the de facto implementation does, so that people who rely on the observed behaviour (eliding group separators) have a guarantee that it will continue to work that way?

We’ve already established that the function contract cannot be changed due to demonstrated reliance on the behaviour. Should it be changed to explicitly state this instead of relying on happenstance?

@mikedn (This is precisely the kind of conversation which I don’t want to have, because it is a pure waste of time. A more-sensible-than-me person would plainly ignore such a pointless argument, but I am too damn respectful and don’t like ignoring people. You clearly come from a completely different background and I don’t have any interest in convincing you of anything, just in showing what I consider relevant to people who think like me. I am not sure why you are so interested in continuing a conversation with a person who sees things from a completely different perspective than you do. This will be my last reply to comments along these lines.)

Nothing you are saying makes any sense! You are talking about applications. You are not facing the problem as a programmer, but as a user! If you click on a button, you don’t know what the program is doing internally! We are discussing precisely these internals!

You seriously think that if Decimal.Parse fails in a certain situation, all the applications relying on it would fail too?! Only the applications built by incompetent programmers would fail. Competent programmers create applications which will always work independently of virtually anything else (e.g., the errors of the given programming language, how clueless the given user might be, etc.)!

The applications created with .NET 2.0 (and before .NET existed) didn’t include lots of fancy features which the latest version includes, but basic functionalities like inputting “12,2,3” and not getting any error were still working anyway. With loops, conditions and a couple of other things you can create almost anything; however, this (i.e., basic functionality) is not the goal of modern programming languages, which aim to be as programmer-friendly as possible (e.g., to maximise productivity), not to mention the internal consistency of the given environment (perhaps irrelevant to you, but not to everyone).

Have you read all the questions in my previous comment? What do you think these people did after realising that the decimal (or double or float) parsing method wasn’t working as expected?! Leave it like this? Create a warning saying “Sorry, the functionality which you have requested is not yet supported by .NET”?! If you have to multiply 5 times 5 and the multiplication sign of your calculator doesn’t work, you can do 5+5+5+5+5; would that mean that the multiplication sign doesn’t need to be fixed? No. You need the multiplication sign to do things easily and quickly. And even worse: if you are selling calculators and one of your clients comes to you complaining about the multiplication sign not working, would you tell him “use the addition sign”? What would that client think of your professionalism and the reliability of your products?

Please, stop converting what seemed a true dream (being able to perform relevant modifications in CoreCLR & CoreFX) into meaningless conversations which I don’t want to have.

@mikedn I am answering mostly to highlight that you misunderstood my comment (“see this baseless accusation”). My bothering-you comment wasn’t exactly addressed to you and was meant in a quite wide sense, along the lines of “subjective views when dealing with objective facts”. I have been answering comments from quite a few people for a while, which was tiring, because I don’t like it (I wasn’t expecting it here and don’t even think that things here should be that way).

I am not trying to criticise anyone’s approach, just clearly stating what I am willing to do. I am more than happy to explain what is required from a strictly technical perspective, to convert my “I think that” into verifiable facts. Also, if required, I am very happy to follow a more or less complex process (even involving some random chatting), but only when aiming at a clearly defined goal (not the case so far).

What if I participate exclusively in conversations dealing with the implementation aspects? You can discuss as much as you wish, come to any conclusion and let me know only if you think that my approach is worthwhile. Would this be acceptable?

I am seriously not interested in convincing anyone or tolerating not-completely-straightforward/honest attitudes.

I’ll pretend that I didn’t see this baseless accusation.

I have explained my point clearly, would love performing these modifications (and quite a few other ones)

That’s all fine and appreciated but again, this isn’t about changing the code.

If you (the decision-makers) consider that I might go ahead

Decision makers (Microsoft people) have yet to comment on this matter. Neither I nor anyone else who has commented in this thread up to now works for Microsoft. But the decision makers have clearly stated (and documented) in the past that breaking changes need to be discussed. That’s exactly what we are doing now but it appears you’re not willing to have a discussion.

if you have solid enough expertise related to what is being discussed here (e.g., efficient algorithm building, deep .NET knowledge mainly regarding this specific implementation, you are a local decision-maker, etc.)

Again, we’re not discussing any implementation details, we’re discussing the behavior of the API. I don’t know how to make this more clear, you’re approaching this from the wrong angle.

Thanks for your contributions, but seriously this is not what I want.

I have started some threads which I consider completely clear. If I am wrong and this community doesn’t think like me, I would not implement anything (and, most likely, wouldn’t contribute further).

I like objectivity-driven communities where all their members have similar knowledge and expectations and where subjectivity is not favoured. The open .NET seems a perfect excuse for objective-correctness-focused discussions where everyone would win (Microsoft by getting a beyond-excellent product and the contributors by working on so worthy resources; seriously, after taking a quick look at CoreFX & CoreCLR I am speechless); this situation is certainly very appealing to me. Participating in random chats with random people (no offense) about random issues is not what I want.

Please, if you have solid enough expertise related to what is being discussed here (e.g., efficient algorithm building, deep .NET knowledge mainly regarding this specific implementation, you are a local decision-maker, etc.) and you want to share anything with me (question, suggestion, request, etc.) from a completely technical and objective perspective, please feel free to contact me. Otherwise, I will not answer you. As said, no disrespect intended, just trying to avoid everyone wasting their time.

@mikedn I am seriously not interested in convincing anyone or tolerating not-completely-straightforward/honest attitudes. I have explained my point clearly, would love performing these modifications (and quite a few other ones) but if I cannot I would accept it (haven’t had anything until yesterday).

The situation is very clear: you have “1,” being accepted as a valid number. Do you want that? Excellent; but let’s not waste everyone’s time by trying to prove that this should actually be the case (you don’t have any way to convince me of such a thing).

I have made my proposals and spent a considerable amount of time explaining different issues. If you (the decision-makers) consider that I might go ahead (or even that I should provide further information), please let me know.

but triggers an error with “1…”. Does it make sense? No.

The decimal point has a very specific meaning and its presence and position affects the result. IMO the fact that 1… triggers an error makes sense.

@Joe4evr It was a rhetorical question. The current implementation of Decimal.Parse (or similar) accepts “1,” as valid, but triggers an error with “1…”. Does it make sense? No. That’s why I am suggesting changing it (which @mikedn thinks is not a good idea).

PS: if you think that numbers are confusing, why are you commenting in a thread aiming to change the source code dealing with these issues?! Note that the culture is always taken into account when parsing decimal (when using an overload not taking this argument, the code assumes a default one). PS2: I don’t want to offend anyone, but I was expecting a bit more “action” (i.e., actually implementing changes or discussing with people who are actually knowledgeable about the given implementation). All I see here is general talking.

This thread has turned into more of a chat room on compatibility. Definitely a good topic, but this isn’t really the best forum for it. Feel free to use the .NET Foundation Forums for this discussion or I see that Reddit has a thread going on this topic. I’m going to close this issue as a result.

I think that the community has largely spoken on this topic. A brief summary, plus some small additions:

We’re not going to make breaking changes to existing APIs, except in the rare case for security. Where we want to change the behavior of an existing API, we can always add a new one and that’s a fine conversation. We need to balance evolution and stability.

On the particular topic of formats and culture, please do refer as much as possible to industry specs to justify and provide weight to proposals. Discussions on these types of topics w/o authoritative sources quickly devolve as this one has. Internally, we have people with ECMA 334/335, Unicode, HTTP and various chip specs on their desks. They often ‘argue’ in terms of those specs. It’s a good approach to follow.

Pointers on the contribution guidelines: Overall Guidelines and Managed Code Compatibility.

Have a great weekend!

@canton7 The problem here was that I didn’t want to participate in a certain discussion, which some people didn’t like. Better: some people didn’t like me being very clear on this and other fronts (and got offended when this was never my intention). I wanted to plainly propose something (to eventually do some tests and prove some points but only from a technical point of view) and to leave the analysis of further issues and the final decision to others (i.e., Microsoft and/or the community not including me).

I should have plainly written the code, done a simple fork/pull request and nothing else. I made a mistake. Lesson learned (= from now on more forking/pulling and less talking).

@varocarbas let me put this cleanly, so there is no chance of misunderstanding.

The issue is not with people failing to understand you. It is:

  1. You assert that “1,1.1” is “obviously” not a valid decimal, and fail to respond to the assertions to the contrary.
  2. You assert that breaking backwards compatibility here is not important (when it is) and ignore assertions to the contrary.

People on the internet get wound up when they see other people behaving like this: that’s just what happens.

The fact that this happened in a coreclr / .NET thread is irrelevant: it would have happened anywhere. This just happens to be a repository which lots of people watch.

@Dave3of5 Ey, stupid, I can laugh at you (at your ignorance and at your lack of capability to understand even the simplest idea) and let you think that you have everything under control. But, piece of shit, don’t ever dare to insult me when I am present.

You can now go back with your friends (the idiots/cowards who take you seriously) to talk about me behind my back; and then, eventually, get a bit of courage to say something ambiguous enough to me, and perhaps I will tolerate it, but this is it. Idiot.

Go back to Reddit, to deal with people saying stupid things in low voice and behind my back, where you belong.

@MartinJohns Just to make one point completely clear: I never wanted to ignore anything. But do you have any idea how much time I have been wasting here between yesterday and today (in the 3 threads I have been participating in)? How many comments repeating basically the same things I have got? I was nice to everyone during the first 2-3 hours, then I got tired.

Understand that I do respect everyone’s approach, but I am free to prefer certain discussions; why should I participate in what I don’t want to participate? When you force me to talk about something I don’t want, you are the one not respecting me.

As said: thanks to everyone and sorry if I have offended some people (it wasn’t my intention).

I guess that such a behaviour will be rejected almost for sure.

Creating a fork and submitting a PR? That’s exactly what they/we/everyone want(s). It’s up to the owners of the project to accept or reject the PR.

That’s why it is called bug (= against the most logical behaviour)

A bug is when something says (documented or implied) it does one thing and does something else. It has nothing to do with logical behavior. If a method AddNumbers(a, b) returns -1 for arguments 2 and 3 then it probably is a bug if it is documented as “Adds two numbers together” or because the method name quite clearly and unambiguously states what it’s supposed to do. The (Try)Parse(...) method (in its name) doesn’t clearly specify what it does with grouping digits, and the documentation is at the very least ambiguous about the cases which it accepts and/or rejects as parseable and what the results will be. And that’s when people start relying on observed behavior (simply try it and see what it does). They probably shouldn’t do that but it happens. And that’s why you can’t change such elementary things in products / libraries / frameworks (with such a large userbase) easily on a whim. It has nothing to do with “logical behaviour”; what is logical to me or you may not be logical to others. It’s what’s currently in the field built upon the existing behavior that matters.

As I’ve said before: you can still have it your way by proposing something like (Try)ParseExact or (Try)ParseStrict or something; this way you don’t break existing code and, as an added bonus, the method name (just a little bit) more clearly states its intention of being stricter than the existing (Try)Parse methods; and since documentation for this method doesn’t exist yet, it can be written accordingly. Another option would be to add a NumberStyles.StrictThousandsSeparator or something, which would allow you to keep using the existing (Try)Parse methods and also make it possible to use the strict(er) behaviour. The key is that neither option changes current behavior or breaks backwards compatibility.
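To make the first option a bit more concrete, here is one way such an opt-in method could be sketched as a wrapper that leaves the existing (Try)Parse untouched. Everything named here is hypothetical: .NET has no TryParseStrict, the StrictDecimal class is made up for illustration, and the sketch assumes a "-" sign, ASCII digits and a single group size taken from NumberGroupSizes[0].

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

public static class StrictDecimal
{
    // Hypothetical opt-in API: first checks the shape of the input (group separators
    // only between full groups of digits), then delegates to the existing TryParse.
    public static bool TryParseStrict(string text, CultureInfo culture, out decimal result)
    {
        result = 0m;
        if (text == null) return false;

        NumberFormatInfo format = culture.NumberFormat;
        string g = Regex.Escape(format.NumberGroupSeparator);
        string d = Regex.Escape(format.NumberDecimalSeparator);
        int size = format.NumberGroupSizes.Length > 0 ? format.NumberGroupSizes[0] : 0;

        // Alternative that allows grouped digits, e.g. 1,000 or 12,345,678 for size 3.
        string grouped = (size > 0 && g.Length > 0)
            ? "|[0-9]{1," + size + "}(?:" + g + "[0-9]{" + size + "})+"
            : "";

        // Optional sign, then plain or correctly grouped digits, then an optional fraction.
        string pattern = @"\A-?(?:[0-9]+" + grouped + ")(?:" + d + @"[0-9]+)?\z";
        if (!Regex.IsMatch(text, pattern)) return false;

        return decimal.TryParse(text, NumberStyles.Number, culture, out result);
    }
}
```

Under the invariant culture this accepts "1,000.0" and "1234.5" but rejects "1,1.1" and "12,2,3", while callers of decimal.TryParse keep the lenient behaviour they rely on today.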

@varocarbas You are repeatedly ignoring very valid concerns and responses, simply because they do not match what you want to hear. This is not how open source communities work.

@varocarbas Mate if you’ve had enough on this conversation then just stop … You seem like a troll to me please respect MY opinion. You are making yourself look a bit silly here. Do you know this is front page on /r/programming ?

Random trolling, seriously? (These last comments represent the absolute bottom; but there have been quite a few comments before on similar lines). Is this what I have to pass through to propose an improvement in the .NET Framework? Seriously?

I hope that someone in Microsoft will take a look at some of the comments in this thread (and in other ones) and think carefully whether this is the best way to allow an open-source community to exist at all. Are you expecting good-will contributors to pass through this process every time? Good luck with that (+ don’t count on me).

@Dave3of5 I think this is serious. Not sure however.

Is this a troll or is this serious ?

@AyrA (This is the last time I try to address a clueless question, not to mention what I think about the apparent good-faith/proper-understanding of the commentator)

For all you people extremely keen on abstractly discussing random misinterpretations of my words, and with presumably quite limited knowledge about how parsing of strings to decimal numbers actually happens in .NET (and well… in most other modern languages), here you have some basics:

Invariant Culture/en-US/one used by default in the parsing method mentioned in my first comment:

“1,1.1” -> wrong (wrongly parsed as right).

“1,000.0” -> right (rightly parsed as right).

On the group/thousands separators front, this specific culture is completely defined by the following 2 variables: “,” as group separator and 3 as the size of the group. That is: any string including a thousands separator which doesn’t follow these rules is a wrong number in this culture (i.e., shouldn’t be parsed as a right number). The parsing method does have all the required information (i.e., it knows that it is parsing this specific culture and knows what variables define it). There is no underlying and comprehensive wisdom in the code being criticised; its creators plainly decided not to account properly for a somewhat complex reality (for whatever reason).
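The two variables mentioned above are exposed through NumberFormatInfo, so they can be inspected directly. A minimal sketch, where the CultureGrouping class and its Describe method are names made up for illustration:

```csharp
using System;
using System.Globalization;

public static class CultureGrouping
{
    // Hypothetical helper: summarises the separators and group sizes a culture exposes.
    // NumberGroupSizes is an array because a culture may define more than one group size.
    public static string Describe(CultureInfo culture)
    {
        NumberFormatInfo format = culture.NumberFormat;
        string sizes = string.Join(",",
            Array.ConvertAll(format.NumberGroupSizes, s => s.ToString(CultureInfo.InvariantCulture)));
        return string.Format(
            "group separator \"{0}\", group sizes [{1}], decimal separator \"{2}\"",
            format.NumberGroupSeparator, sizes, format.NumberDecimalSeparator);
    }
}
```

For CultureInfo.InvariantCulture this yields: group separator ",", group sizes [3], decimal separator ".".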

If you want to use a different culture, the rules might be different. In some cases, as pointed out in previous comments, these rules might even become much more complex. Even more complex than what the CultureInfo class (the one in charge of dealing with all this information) can deal with (which strictly speaking would also represent a bug in such a situation). If you had taken a look at the code I am talking about (because it might sound incredible to some of the commentators here, but not knowing anything about the code being discussed isn’t precisely the ideal scenario to share your opinions on this matter), you would know that it even includes certain hardcoded rules for specific cultures! (which in my opinion should also be improved, but this is not the point now).

In summary, nothing you are saying about thousands separators in general terms has anything to do with what I have been talking about (i.e., plainly referring to a specific/default culture; or shall I enumerate every single culture every time?!). And what is more important: no matter how complex a given situation is, that should never be a limitation for its proper implementation (logically, I am talking exclusively from a technical perspective).

There are sensible reasons to oppose what I am proposing (e.g., “we don’t want to make such a radical change because it would affect old code”) and nonsensical ones (e.g., “this behaviour is fine” when you don’t even understand the behaviour, or the reasons for this behaviour to exist). I respect those in the first group, but don’t want to get involved in the associated discussions anyway. As for those in the second group, I kindly ask their authors to please stop bothering me.

@Davio “well, I’ll stop wasting my time talking to a wall.” -> thanks for giving me a break and allowing me to stop answering what I consider irrelevant concerns from a most-likely unreasonable person (yes, you are the wall in my book).

Someone proposed the TryParseExact alternative before and I thought the same as now: it seems much more complicated.

I fail to see how. It is a pre-existing pattern that expresses an intent to follow strict formatting rules, rather than lenient ones (the default). It seems to be exactly what you are after.

Additionally, to highlight the impact: from my perspective (as a library author targeting multiple platforms), anything other than this approach can basically be rephrased as “you can’t use Parse or TryParse any more, and need to remove it from any existing libraries”. In the real world, you can’t fundamentally alter what a method means. What you propose is not a bug fix, it is not a feature request (additional support): it is a major breaking “this method now means something completely different”. And simply: that is never going to fly.

@Davio The problem is that accepting “1,1” and rejecting “1…1” is ambiguous and arbitrary. And this behaviour stems from the mere fact that the algorithm taking care of string parsing didn’t properly account for the more complex group/thousands separators reality (which, IMHO, represents a bug), despite having all the required information available.

Anyway, thanks for your inputs and please understand that I don’t want to get involved in certain discussions. I will accept whatever decision the community (I mean, Microsoft) will make.

@jakesays I think that you are the one not getting the idea. I haven’t doubted for a single second my limited expertise regarding such big implementations. That’s why I said: “discuss for as long as you wish and come to your conclusions”. I want to continue focusing my activity on mere implementation aspects (i.e., “you have the go ahead; do that”) and not have to worry about all these issues.

When I wrote this proposal, my intention was to trigger the process to perform the correction, but to be involved exclusively in the correction itself (not in all the required discussions leading to the final decision). As I have been saying for a while: no disrespect intended; I just don’t want to participate in certain discussions, because I don’t like them, do recognise my limited expertise on certain aspects and do want to continue having my less-bigger-picture ideas.

Thanks again for all your contributions. I insist on the fact that there are many comments, most of them focused on certain ideas which I don’t want to discuss (i.e., if you think in this way, I will accept your decision; trying to convince you is not one of my priorities).

Some of the last comments are highlighting issues with a slight technical essence which might need some clarification (well… not ideally); this is the main reason for writing this new comment.

Note that I use expressions like “.”, “,”, “thousands separators with certain number of digits in between” just to simplify ideas. I really mean: “decimal separator within the given culture”, “group separator within the given culture” and “size of the group in the given culture”. The original version of the parsing method (i.e., the one which I am referring to in my first comment) logically accounts for the given culture (not for “.” or “,”, but for the decimal/group separator in the given culture); that is: the string is already being parsed according to the expected format. Curiously, this information is not being fully used, and that’s why the error I am referring to happens: this method does know the size of the group/thousands separator (even whether groups are acceptable or not in the given culture), but it doesn’t use it at all. It does, however, use the given group separator character; that’s why the referred faulty reality exists: “1…” being wrong and “1,” being right (logically, I mean the English/Invariant culture, the one which this method takes by default when the user doesn’t expressly input a different one).

I hope that this point is now clear. I hope that you will continue enjoying your discussion. Again, thanks to everyone for sharing your thoughts. Again, no disrespect intended (but why do I have to clarify things which should be evident to the kind of people I want to discuss with, that is: knowledgeable enough on this specific issue?). And again, make the decision you wish and contact me only for issues strictly related to the actual implementation.

To me, the fundamental problem here is that you are asserting a particular definition of “correct”, but that definition is highly questionable. You say string stringVal = "1,1.1"; is not a valid decimal. Well… yes, yes it is. It might not be in the most commonly expressed form for that culture, but: it is perfectly well-formed, unambiguous, etc. There is no reason not to accept this (contrast "1.1.1", since . in the stated culture has a very different impact), unless we’re using TryParseExact. Now; decimal doesn’t have a TryParseExact, but some other types do.

If the conversation were restructured to

Proposal: add TryParseExact to numeric types, enforcing expected formatting rules including group separator positions, etc

then I could totally get behind that proposal. But: the API at the moment is perfectly valid - it just isn’t what you expect. That’s fine; it doesn’t have to be.

I am new here and I am not sure how all this works.

@varocarbas Fairly certain the way you’re approaching this right now won’t get you very far. Accusing people at random while praising your own achievements/skills, instead of providing concrete, real-world cases of why a change that would break backward-compatibility on an extremely widely used API that has been in the framework with its current behaviour for over a decade would be required.

There are countless applications running business critical logic that rely on this API for them to work. If that API were to suddenly change when the target platform for those applications changes, without any notice, then that could be an absolute disaster. The fact that you don’t seem to grasp that is simply beyond me.

Barging in here shouting that you’re not interested in discussion and your proposed change is the be-all-end-all solution to a huge (but non-existent) problem, is not in the spirit of this community. You’re not Torvalds, and this isn’t the Linux repository. Get a grip.

@mikedn Wake me up? From what? Do you think that I don’t have experience in dealing with cluelessness? That I haven’t seen lots of problems provoked for no reason? That a win-win situation can easily be converted into a lose-lose one? I am perfectly aware of this sad reality (much more than the attitude I show might indicate; in fact, I stopped trusting people’s proper understanding and practicality some time ago).

FYI, the dream already came true (I have full access to impressive code; thanks again Microsoft, you made me really happy 😃). It would be quite nice to allow others to also benefit from everything I will be doing with that code; but to be honest, I don’t have many hopes on this front (as said, I have long experience dealing with pure nonsense).

Please, stop converting what seemed a true dream

I’m not converting any dream, I’m just trying to wake you up.

None of those posts seem to answer the “why?”. People simply say “I want X” but nobody explains why.

I have 2 online banking accounts and I was curious to see what happens if you enter something like 12,34,3. One simply ignores all the commas and the other doesn’t allow you to type any commas (which IMO is the best thing to do). If banks have no problems with digit grouping then why is this a significant problem for a WinForms app?
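For what it is worth, the “no commas at all” behaviour is already available in the existing API by passing a NumberStyles value that leaves out AllowThousands. A minimal sketch, where the NoGroupSeparators class and TryParseWithoutGroups method are names made up for illustration:

```csharp
using System.Globalization;

public static class NoGroupSeparators
{
    // Hypothetical helper: NumberStyles.Number without AllowThousands, so any group
    // separator in the input makes the parse fail instead of being silently dropped.
    public static bool TryParseWithoutGroups(string text, out decimal value)
    {
        NumberStyles styles = NumberStyles.Number & ~NumberStyles.AllowThousands;
        return decimal.TryParse(text, styles, CultureInfo.InvariantCulture, out value);
    }
}
```

With this helper, "12,34,3" is rejected while "1234.5" still parses to 1234.5.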

If you try to copy/paste 12,3,4 in the Windows calculator you get 1234. Excel doesn’t treat 12,3,4 as number but then Excel is a rather special case because it tries to distinguish text from numbers.

If this fix/change was for .NET 1.0 Beta then it would have been perfectly fine but it’s .NET 4.6.x we’re talking about and introducing a breaking change simply because someone wants it but offers no justification won’t be easy.

@shahid-pk what you are saying doesn’t make any (practical) sense. Please, take a look at my following comment.

The problem here was that I didn’t want to participate in a certain discussion, which some people didn’t like.

Again: You do not choose what direction the discussion goes. The discussion goes towards all aspects that the proposed change would affect.

And sentences like this are really unneeded:

and mainly coming from people whose knowledge was clearly limited on this front

Many of the people posted here have a much broader knowledge than you.

Ok I guess I deserved that I’ll leave now bye 😃

I was starting to wonder about the reason for so many comments whose motivation wasn’t too clear (and mainly coming from people whose knowledge was clearly limited on this front); also about the curious rise in visits to my website. By looking at the origin of the visitors, I discovered this: https://www.reddit.com/r/programming/comments/3w9tc8/a_perfect_example_of_how_not_to_propose_a_change/

I have to recognise that I am kind of honoured (why did nobody tell me about it?!), this is the first time that anyone has created a whole thread just for me! I mean… I don’t use Reddit, but it is kind of relevant (and there are lots of comments). Thanks fans. Sorry about that, but your hero doesn’t care too much about you (no, I will not even read what you have written, because after seeing the comments here I don’t expect too much there) 😉

Microsoft, I do recognise that I am a magnet for this kind of reactions (although seriously that this is the first time when I have provoked such a thing!), but I hope that everyone will learn from this “incident”. I personally will focus my activity on mere fork/pull-ing (and eventually post very small answers/questions).

It looks to me that the issue really is a language / culture (ironically) barrier. He failed to express his intentions / others failed to understand him, in that he is not after changing a known behavior and thus breaking every piece of code ever written using this method. And he’s not after any politics, nor is he accustomed to the processes this community follows to start a discussion on something, or to propose a change or addition.

What he’s after is a discussion on how to implement something like ParseExact, with all the possible options available to configure the parser, etc. Also see @atheken’s comment: https://github.com/dotnet/coreclr/issues/2285#issuecomment-163937612

Eh, miscommunication happens. 😃

@varocarbas The free market of ideas can sometimes be chaotic, but mostly produces the best outcomes. This is the spirit of open source. It is not just a technical movement, but cultural as well. Welcome to the community.

@varocarbas Well looks like you are coming to your senses, maybe you aren’t a troll after all. Seriously though I would just stop replying to this issue if I was you just ignore everything here. Try to learn from this.

You don’t need to participate in what you don’t want to participate in. But by responding you chose to participate, you chose to respond and to react. Like I said in the other issue already: You may have raised this issue, but that doesn’t mean you own it. It doesn’t mean the discussion goes only according to your preference.

And frankly, I don’t see any point in submitting a PR before the discussion was resolved here. If you just submit a PR now, then the discussion will just repeat in your pull request. What’s the point of that? You said you don’t want to waste your time, yet you do it.

Indeed. In the reality we could not see it here. It is a pity that you chose to ignore valid points as “not technical” or “meta”. 😦

Anyway, go ahead and send the pull request. The discussion will definitely repeat there.

@MartinJohns I was (kind of) sarcastic too. With “kind of” I meant that I got your idea (= how this should work), but the reality (= what I have seen here, mainly today) doesn’t seem to prove it (= lots of people wasting time, getting offended and trying to offend; just a few trying to understand the underlying problem and trying to help). In any case, thanks for your contribution (I love Dr. Who).

PS: I haven’t ever seen a Dr. Who episode.

@varocarbas The correct tool for the problem you wish to solve: “validating the format of this string that might represent a number” is a Regex.
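As a rough illustration of the Regex route (a sketch, not something proposed in this thread): the names `StrictGroupingCheck` and `TryParseWellGrouped` below are hypothetical, and the pattern assumes invariant-culture conventions only, i.e. `,` as group separator, `.` as decimal separator and groups of exactly three digits. Other cultures would need a different pattern.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper: validate separator placement first, then reuse TryParse.
public static class StrictGroupingCheck
{
    // Assumption: invariant-culture layout ("," groups of three, "." decimal point).
    // Either 1-3 digits followed by one or more ",ddd" groups, or plain digits;
    // then an optional fractional part. \z anchors at the very end of the input.
    private static readonly Regex WellGrouped = new Regex(
        @"^[+-]?(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)(?:\.[0-9]+)?\z",
        RegexOptions.CultureInvariant);

    public static bool TryParseWellGrouped(string text, out decimal value)
    {
        value = 0m;
        if (text == null || !WellGrouped.IsMatch(text))
        {
            return false; // separators missing, misplaced or trailing
        }

        // The existing parser does the actual conversion; its behaviour is untouched.
        return decimal.TryParse(text, NumberStyles.Number, CultureInfo.InvariantCulture, out value);
    }
}
```

With this in place, `"1,234.5"` and `"1234.5"` are accepted, while `"1,1.1"` and `"1,"` are rejected before `decimal.TryParse` is ever called, so callers that rely on the lenient behaviour are not affected.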

Let’s try to formulate a hypothetical situation to illustrate the problem with changing framework features like Parse.

You write an app that uses a REST API. You’ve got strict JSON parsing on the response payload from one of their endpoints (you throw an exception if there’s unexpected properties in the response payload). The REST API adds new properties to this endpoint. Your app, which has been working in production for months is now crashing.

When you wrote your app, did you have a reasonable expectation that the endpoint’s behavior would not change? Did you write tests that validated the internal behavior of the API?

Of the .net applications that are out in the world, how many of them do you suppose have tests on TryParse to verify the behavior handles a case like this? My guess is very few. Writing tests to verify framework behavior at this level would be madness.

It’s not even about whether the current behavior is “correct”; it’s about breaking existing applications that aren’t checking for (and don’t care about) positional commas. And for what? Because we saved a few people who had a specific formatting requirement (one that is not even universally accepted) a Regex check?

No. Just no.

I’m sorry you feel you’ve had a hostile reception, and the conversation indeed isn’t a model of civility. I hope you don’t let it influence you to be less involved. But: I think there are some learning points here that you should take on board:

  • you have dismissed some valid points raised by multiple people in the community
  • you have refused to take on board repeated correction on the same themes (. vs , etc)
  • you have tried to insist that only some people are entitled to have an opinion: that is a match on a petrol can right there

There are good reasons not to accept the idea as proposed, and they aren’t because the idea is stupid or whatever - so don’t take it personally. There is some good stuff here that could be used in a different API.

@varocarbas I was being sarcastic, thus the irony punctuation mark. The spirit of Open Source is open discussion. So kinda the opposite of what you do.

And why am I here wasting so much time unnecessarily?!

That’s the true spirit of Open Source! (⸮)

@varocarbas The position of the chars is ignored for the same reason that white space or the currency symbol is ignored: it does not change the numerical value.

You are unable to learn this, you have established that. You can try to convince us as long as you want, it is not going to happen. If the change will ever be implemented, then only in the next major release to not break existing code.

I am a quite experience programmer, what I seriously doubt that is your case

The fact that you do not seem to understand how stupid this whole situation is says otherwise. Also, I have certification from the Swiss government for programming. If your website is not lying, then I have also been programming a lot longer and in many more languages than you, so stop making assumptions about other people.

@Dave3of5 My blog? What are you talking about?! I don’t have any blog.

I know that trolls shouldn’t be fed, but I am not used to dealing with this level of crazy nonsense; that’s why reading such pointless stupidities and not saying even a word is kind of difficult for me (will try harder next time).

Apparently, I was wrong: the bottom still hadn’t been reached in my previous comment (not sure how low some people can still go. Let’s see… absolute ignorance + obsessive behaviours can perform really well on this front).

Okay, I’m finally convinced, you’ve pleaded your case well, @varocarbas, please vote for and support this change and add it to the main line in the next release.

Aaah, just kidding of course. Do have to point this out though: https://en.wikipedia.org/wiki/Dunning–Kruger_effect

Hi, I’m not sure if this is the right place; but I also have a suggestion to Console.Write. I believe that printing out grammatically incorrect sentences is wrong. It should only print out a sentence that meets the language grammar that is defined in the current CultureInfo.

These are my first attempts here and I am still getting used to this format (even to how to deal with the .NET team).

This is a public format. You are not only dealing with the .NET team, but with the whole community instead. This is nothing you can avoid if you want to participate in this format. You can’t just ignore what other people say and “focus on the implementation only” - there are a lot of concerns that have been brought up and that need to be evaluated. This is a public format and you don’t decide how the conversation goes. And saying things like “with no interest in adapting myself to what I don’t consider appealing enough.” is the absolute worst way to go.

No it isn’t. 1,1 isn’t ambiguous, it can only be 11. 1…1 just isn’t a number (would it be 1.1? or 1.001? 1.0.0.1, that’s not even a number. something else? this is why a number may only have 1 decimal separator)

You seem to have a problem with the group size, especially when it’s 0, but the group sizes can be really awkward (for instance in India you can have something like 1,00,00,000), now do you choose group size 2 or 3 here? Both won’t work. The option which works best (is able to parse most numbers) is to just disregard the group size and group separator altogether and that’s what’s currently happening and there’s no need to change that. If you can’t be convinced, well, I’ll stop wasting my time talking to a wall.
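To make the group-size point concrete, here is a small sketch (not taken from the thread). It builds a culture-independent `NumberFormatInfo` by hand and sets `NumberGroupSizes` to `{ 3, 2 }`, which is an assumption standing in for the Indian-style grouping mentioned above; the class and method names are hypothetical.

```csharp
using System;
using System.Globalization;

// Hypothetical demo: formatting honours NumberGroupSizes, parsing ignores them.
public static class GroupSizeDemo
{
    // Assumed stand-in for Indian-style grouping: rightmost group of 3, then groups of 2.
    public static NumberFormatInfo IndianStyle()
    {
        return new NumberFormatInfo { NumberGroupSizes = new[] { 3, 2 } };
    }

    // Formats with group separators and no decimals ("N0").
    public static string Format(decimal value, NumberFormatInfo format)
    {
        return value.ToString("N0", format);
    }

    // Parses with the lenient, currently shipped behaviour.
    public static decimal Parse(string text, NumberFormatInfo format)
    {
        return decimal.Parse(text, NumberStyles.Number, format);
    }
}
```

`GroupSizeDemo.Format(10000000m, GroupSizeDemo.IndianStyle())` yields `1,00,00,000`, while `Parse` returns the same value for both `"1,00,00,000"` and `"10,000,000"`: the group sizes shape the output but are not enforced on input, and a hard-coded "groups of three" rule would reject the first form.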

The problem is that accepting “1,1” and rejecting “1…1” is ambiguous and arbitrary.

It was mentioned multiple times already that this is not ambiguous or arbitrary in any way. The one is a padding character ignored by the parsing algorithm, the other is a separation character which identifies when the floating-point part begins.

Still I insist you try to see the difference between “something that can be parsed” (can be turned into a decimal number with no ambiguity) and “something which is strictly valid given a certain culture”. You seem to think these are one and the same while they’re in fact not.

And who even decides what’s strictly valid? Certainly not you, is there an official RFC for number representation or an ISO standard? I can’t seem to find one. So all we have to go on are some real world examples and in that case it’s very helpful if the parser accepts a wide range of possible values (as long as there’s no ambiguity about the number it represents) instead of trying to be limited. 1,1 is just as straightforward for the parser as 11, there’s really no harm in it being able to parse it.

Ok then you have communicated your position on the implementation aspects of your idea. However you cannot propose such an idea and not involve yourself in the larger picture because of its importance. I would suggest you attempt to learn from the comments of others so that you may be able to fully participate in the process.

There is no valid reason explaining why “1…1” is wrong and “1,1” is right.

Yes, there is. You are just ignoring what people are saying about it. Group separators are essentially only padding. 12345678 is identical to 12,345,678 is identical to 12,34,56,78 (clarification: in a culture where groups are comma-separated). You probably wouldn’t choose the last form (although various group lengths exist in different cultures, including groups of different length depending on position), but it isn’t invalid. The decimal separator is completely different; it can only appear once, and if it appears more than once there is no valid way of interpreting the data. Trying to emphasize “1…1” vs “1,1” actually weakens your argument.

I have proposed a modification which I consider worthy.

And people are pointing out the many ways in which this modification would be undesirable and unhelpful - and that is before you get to the problems of multi-targeting (it would be very bad if decimal.Parse did completely different things on dnxcore50 vs dnx451). And hence also why the “perhaps add a new TryParseExact etc” could be a viable alternative that achieves what you want, without hitting all the bumps in the road.

@jakesays Please, don’t force me to be involved in a conversation which I don’t want to be part of. I have been crystal clear on this front in my previous-to-last message (i.e., with “,” I mean group separator in the given culture). Try to make some effort to understand properly, rather than arbitrarily criticising every single word (and/or defending something as pointless as “1,1” being a number and “1…1” not being a number).

@mgravell is talking to me!!! Ah!!! I am kidding, but you are kind of a legend, you know? 😃

Sorry to disagree with you, but “1,1.1” is plainly wrong not subjectively wrong. There is no valid reason explaining why “1…1” is wrong and “1,1” is right.

On the other hand and as explained many times before, I do understand that there are many more issues involved when deciding to perform certain modification. I am not trying to convince anyone about the main priorities here (i.e., keeping backwards compatibility no matter what vs. being somehow adaptable on this front). I am not even saying that am in a position to know what is best (you, for example, have certainly much more experience than me on big developments and on coordinating different expectations). I have proposed a modification which I consider worthy. Doesn’t the .NET community think like me? I would accept any decision (but not necessarily agree with it).

There is a difference between being parseable (which is always a best-effort kind of thing, trying not to break on things it can deal with) and being strictly valid in a certain context (in this case in a certain default culture), they’re not the same; valid is always going to be a subset of parseable. A number like " 1.2 " (note the whitespace) isn’t strictly valid, but I’m glad it can be parsed.

@varocarbas You are completely missing the valid point @mikedn is trying to make. HOW the code should work in your opinion is irrelevant. What is relevant is the fact that the current behavior is what people expect, and has been what they expect for 15 years now. You cannot make a breaking change like this, not even to make the code ‘correct.’ If you do you run the risk of potentially breaking thousands of applications.

I really do not understand why you are having such difficulty with this concept.

I would like to quote some comments from Reddit:

For example, if you accepted a formatted value 1,234,567, with the user deleting the last two characters it could become an input of 1,234,5 and currently the method would accept this value. If the user deleted everything but the first digit it would be 1, and still be valid. The number of commas is entirely arbitrary at this point. - (Losobie)

Even worse: suppose I entered 1,000,000,000 and realise I’ve entered one too many zeroes I have to now not only delete only the last zero, I also need to re-arrange all the thousands separators and, effectively, re-enter the entire number.

The correct answer to why he’s wrong is that thousands separators in numbers are semantically insignificant whitespace and numbers can only have one radix, since the radix signifies the ones place. Thus, a radix is significant, and a thousands separator is not. We could say that the first radix in a string is the canonical one for that string, but that would be a decision of the specification maker. What would make much less sense is treating semantically insignificant whitespace as semantically significant. (Godd2)

Good point IMHO.

Then there’s the “detail” that a thousands-separator is not always used to group digits in threes (so I’m not quite sure how ‘simple’ a ‘fix’ would be (if you’re gonna fix it, do it right, right? 😛 ) , might be harder than you think):

The Indian numbering system is somewhat more complex: it groups the rightmost three digits in a similar manner to European languages but then groups every two digits thereafter: 1.5 million would accordingly be written 15,00,000 and […]. (Wikipedia)

The International Bureau of Weights and Measures states that “when there are only four digits before or after the decimal marker, it is customary not to use a space to isolate a single digit” (Wikipedia)

The above quotes are, to me, reason enough to treat thousands separators as ‘whitespace’ or ‘noise’ if you will; they’re only for (human) readability and change nothing to the meaning of the number. So why bother with them at all when parsing a number?

Yes, 1,,,,,, doesn’t make sense and yes, maybe a ParseStrict() method, for example, should make its way into .NET, but changing the behavior of existing methods that people (knowingly or unknowingly) rely on is a dangerous thing and should not be taken lightly.

And I’d like to leave with a quote from myself:

Such changes require […] shims, maybe even a (new overload with an) extra argument to the […] methods that allow you to specify the desired behavior or maybe even a whole new method needs to be created […].
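One way such a separate, opt-in method could look is sketched below. This is purely illustrative: `StrictDecimal.TryParseStrict` is a hypothetical name, not an existing or planned .NET API, and the sketch makes simplifying assumptions (leading sign only, no whitespace, no currency symbols). It checks the separator positions against `NumberFormatInfo.NumberGroupSizes` and then defers to the existing `decimal.TryParse`, whose behaviour stays unchanged.

```csharp
using System;
using System.Globalization;

// Hypothetical opt-in strict parser; existing Parse/TryParse are left untouched.
public static class StrictDecimal
{
    public static bool TryParseStrict(string text, NumberFormatInfo format, out decimal value)
    {
        value = 0m;
        if (string.IsNullOrEmpty(text)) return false;

        // Only the integer part (before the decimal separator) may carry group separators.
        int decimalIndex = text.IndexOf(format.NumberDecimalSeparator, StringComparison.Ordinal);
        string integerPart = decimalIndex >= 0 ? text.Substring(0, decimalIndex) : text;

        // Drop one leading sign so it is not counted as part of the leftmost group.
        foreach (string sign in new[] { format.NegativeSign, format.PositiveSign })
        {
            if (sign.Length > 0 && integerPart.StartsWith(sign, StringComparison.Ordinal))
            {
                integerPart = integerPart.Substring(sign.Length);
                break;
            }
        }

        string[] groups = integerPart.Split(new[] { format.NumberGroupSeparator }, StringSplitOptions.None);
        if (groups.Length > 1)
        {
            int[] sizes = format.NumberGroupSizes;
            if (sizes.Length == 0) return false; // this format defines no grouping at all

            // Walk the groups from right to left; the last size repeats for remaining groups.
            for (int i = 0; i < groups.Length; i++)
            {
                string group = groups[groups.Length - 1 - i];
                bool leftmost = i == groups.Length - 1;
                int expected = sizes[Math.Min(i, sizes.Length - 1)];

                if (expected == 0)
                {
                    // A size of 0 means "the rest is ungrouped": only the leftmost group may sit here.
                    if (!leftmost || group.Length == 0) return false;
                }
                else if (leftmost ? (group.Length < 1 || group.Length > expected)
                                  : group.Length != expected)
                {
                    return false;
                }
            }
        }

        // Character-level validation and the conversion itself are delegated to the existing parser.
        const NumberStyles styles =
            NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint | NumberStyles.AllowThousands;
        return decimal.TryParse(text, styles, format, out value);
    }
}
```

Under these assumptions, `"1,234,567.89"` parses with the invariant format while `"1,1.1"`, `"12,34,56.78"` and `"1,"` are rejected. With a format whose group sizes are `{ 3, 2 }`, `"1,00,00,000"` is accepted and `"10,000,000"` is not. Code that calls `decimal.TryParse` directly keeps its current, lenient behaviour.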

Nobody is saying that the current behavior is absolutely perfect and nobody is saying there’s nothing wrong with the current behavior. They are, however, saying that such changes can’t be made on a whim or happen overnight. They need careful consideration and deliberation.

@varocarbas Unfortunately, this has nothing to do with your culture or locality, but rather with the standards of the language, the library and its culture. You say:

Clear bug (e.g., accepting “1,” as a valid number; but triggering an error with “1…”).

But it is not a clear bug; in fact, it’s not a bug at all. In the American standard, the thousands separator is indeed optional, and while writing numbers like this: “12,34,56.3” is frowned upon, it is not incorrect. On the other hand, there can be at most one decimal point, otherwise the string of digits cannot be interpreted as a valid number. The reason I am talking about the American standard here is because the underlying logic of this library, and the .NET framework, is built upon this standard. If the .NET framework were forced to also accommodate other standards, then I might as well request that the “for” keyword be changed to “para” for all the same reasons you want to change the behavior of this function. You want this function to behave in a non-standard way, which is by definition a code-breaking change. As a matter of fact, the behavior you are proposing would be considered a bug, not the other way around.

@varocarbas Your passion for technical details is admirable, but unfortunately, software development - and especially on the scale of .NET - revolves around expectations, promises, and assumptions. Merely by saying “It is very unlikely (to not say virtually impossible) that any (serious-enough) code has ever relied on a so unexpected behaviour”, you are already creating assumptions of your own.

Not all people, and indeed not all programmers, are created equal, and although most of us prefer the vacuum of technical implementation details, when it comes to proposing changes to a long-established code base, discussion such as in this issue is a required part of the process. Your proposal may indeed have merit, and there is no one turning it down saying that it doesn’t straight off the bat - they are merely proposing opposing views. Whether that is a process you would like to participate in or not is entirely your choice.

@varocarbas No one is questioning your skills or ability here; all that we are questioning is your idea, and we are giving our views on it. Even the developers who are professionally developing .NET get questions like these. You have to understand that this is not targeted only at you or only at your idea: when you create an issue or send a mail on any open source project that brings a new idea, other developers give their views on that idea; this is how open source works. This is certainly not to discourage you but to encourage you to reason about your change, taking into account every concern that is raised. Coming to this issue: if you look at it head on, yes, this is a behavioral bug which in a perfect world should be fixed. But we don’t live in a perfect world; we live in a world in which MS has shipped this behavior for ten years or so and people like you and me are working around this bug, so a change in the behavior will break those workarounds, and no one can be certain that this change won’t break many, many applications. Neither can we say to users of .NET: “your application was working on .NET 4.6.1 but now it is broken on .NET 4.6.2 because you are a bad programmer and you used our buggy TryParse, which we have just now fixed.” Furthermore, we can only talk about the technicalities of the code that you would write if all the stakeholders agree to bring in this breaking change. That is the easiest part because, as I said earlier, no one is questioning your capability or ability.

@varocarbas When you work on something like .NET that has millions of developers targeting it, you cannot afford to fix some things in a way that would break a substantial number of existing applications, even though the fact that they relied on the broken behavior can mean that their authors were not great programmers. If you do that, you hit the users of those applications and they will blame you, not the application developer. From their point of view, they had an app that was working fine, but when they upgraded to the latest .NET framework runtime, the app stopped working. Of course bugs should get fixed in general, but one needs to be extremely careful about the way to fix them so that the fix doesn’t break others.

@varocarbas Let’s say I am a user of this TryParse and I am depending on 1, being valid. Why? Because this behavior has been there for almost 10 years, and I don’t know how many of the apps I have developed depend on it. If you change this, you will break all my apps. Do you have a solution that fixes this and does not break all my ten or so apps that are using the old behaviour?

if you think that numbers are confusing why are you commenting in a thread aiming to change the source code dealing with these issues

The real problem here is that you think that this is about changing a bit of code somewhere. It is not, it’s about changing the behavior of a commonly used framework API.

So, you think that a parsing method which accepts “1,” as valid number is fine because “,” are the thousands separators in the given culture? Then why not accepting “1…” too? What makes the thousands separators so special?

Nothing, really. Numeric separators aren’t by the thousands everywhere on Earth. And they aren’t commas everywhere on Earth, either. Numbers are really confusing once you bring culture into the mix.

So, you think that a parsing method which accepts “1,” as valid number is fine because “,” are the thousands separators in the given culture?

What I’m trying to say first and foremost is that this is a breaking change and as such it needs to be treated carefully. How this change affects existing code is more important than the fact that it can be fixed, whether the fix is efficient, or whatever comment there may be in the source code.

@mikedn So, you think that a parsing method which accepts “1,” as valid number is fine because “,” are the thousands separators in the given culture? Then why not accepting “1…” too? What makes the thousands separators so special?

Regarding linking to the SO questions, I can easily locate one (because I answered it), but I am not so sure about the other questions. Although if this is a basic requirement to do what I consider required, I might make some effort.

Regarding why I think that it would have no impact? Exactly for the same reason why I consider that I can fix it: because I have quite a long experience on (efficient) algorithm building and know what I am talking about 😃

BTW, do you know what the expected next steps are? Should I just wait here until someone in the .NET team considers that my suggestion is relevant?

What is the difference between these examples and “111#12”?

It seems obvious to me: # is an invalid character in this context while , is not.

I knew about this “issue” some months ago while answering a SO question of someone being very interested in triggering an error for these cases.

It may be useful to include a link to that question or at least an excerpt from it if you want to convince people that a breaking change is necessary.

precisely includes a comment explaining the supported situations, where properly accounting for the locations of thousands separators is not being mentioned (as said, the method doesn’t even deal with the required information).

I don’t see any such comment but anyway, what does that have to do with what the documentation says?

Lastly, I want to highlight that the required correction would have a really low impact.

How do you know that? There may be code out there which parses text that contains such numbers, and all of a sudden that code will stop working.

Such misbehaviour can be replicated under many different conditions. The basic idea is that the parsing approaches don’t understand what the thousands separators really imply (i.e., groups of 3 numbers in the aforementioned example).

Why exactly is this a bug? There doesn’t appear to be anything in the documentation of decimal.TryParse that requires group sizing to be enforced. Also, group sizing doesn’t affect the numeric result in any way - 12,34,56.78 is exactly the same value as 123,456.78. Changing this behavior now seems like a gratuitous breaking change.