gitea: Valid email addresses are rejected with `The email address is invalid`, entirely preventing the use of gitea

Description

Valid email addresses are rejected, which prevents creation of the admin user or any other user. The same valid addresses are also rejected when a user registers with a temporary address and later tries to replace it with the valid one.

Example address that is not accepted, despite being perfectly valid under (among other standards):

Without quotes, local-parts may consist of any combination of
   alphabetic characters, digits, or any of the special characters

      ! # $ % & ' * + - / = ?  ^ _ ` . { | } ~

   period (".") may also appear, but may not be used to start or end the
   local part, nor may two or more consecutive periods appear.  Stated
   differently, any ASCII graphic (printing) character other than the
   at-sign ("@"), backslash, double quote, comma, or square brackets may
   appear without quoting.  If any of that list of excluded characters
   are to appear, they must be quoted.  Forms such as

      user+mailbox@example.com

      customer/department=shipping@example.com

      $A12345@example.com

      !def!xyz%abc@example.com

      _somename@example.com

   are valid and are seen fairly regularly, but any of the characters
   listed above are permitted.  In the context of local parts,
   apostrophe ("'") and acute accent ("`") are ordinary characters, not
   quoting characters.  Some of the characters listed above are used in
   conventions about routing or other types of special handling by some
   receiving hosts.  But, since there is no way to know whether the
   remote host is using those conventions or just treating these
   characters as normal text, sending programs (and programs evaluating
   address validity) must simply accept the strings and pass them on.

Example:

_@anydomain.com
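
For illustration, the unquoted local-part rule quoted above can be sketched in Go roughly as follows. This is a minimal sketch, not Gitea's validation code: the helper name `validUnquotedLocalPart` is hypothetical, and rejecting an empty local part is an assumption the quoted text does not state explicitly.

```go
package main

import (
	"fmt"
	"strings"
)

// specials lists the non-alphanumeric characters the quoted text allows
// in an unquoted local-part (the period is handled separately below).
const specials = "!#$%&'*+-/=?^_`{|}~"

// validUnquotedLocalPart is a hypothetical helper that applies the quoted
// rule: letters, digits, the listed specials, and periods that neither
// start nor end the local part nor appear twice in a row.
func validUnquotedLocalPart(local string) bool {
	if local == "" {
		return false // assumption: an empty local part is not acceptable
	}
	if strings.HasPrefix(local, ".") || strings.HasSuffix(local, ".") || strings.Contains(local, "..") {
		return false
	}
	for _, r := range local {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9':
			// alphabetic characters and digits are always allowed
		case r == '.':
			// position of periods was already checked above
		case strings.ContainsRune(specials, r):
			// one of the listed special characters
		default:
			return false // "@", backslash, quotes, commas, brackets, spaces, ...
		}
	}
	return true
}

func main() {
	samples := []string{"_", "user+mailbox", "customer/department=shipping", "$A12345", "!def!xyz%abc", ".user", "a..b", "a@b"}
	for _, local := range samples {
		fmt.Printf("%-32q %v\n", local, validUnquotedLocalPart(local))
	}
}
```

Under this sketch the local part `_` from the example address passes, as do the local parts of all the forms listed in the quoted text.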

Gitea Version

v1.16.8

Can you reproduce the bug on the Gitea demo site?

Yes

Log Gist

No response

Screenshots

No response

Git Version

No response

Operating System

No response

How are you running Gitea?

not relevant

Database

MySQL

About this issue

  • State: open
  • Created 2 years ago
  • Reactions: 9
  • Comments: 20 (15 by maintainers)

Most upvoted comments

How about user@some-domain.com? Dashes in domain names are very common.

This was my worry when the email character restrictions were merged. There is really only one thing we should not be allowing outside of RFC5322:

  • An initial -

but even that is really not our responsibility - it’s only a problem with those running sendmail commands and the way they’ve configured the command. It should probably be an optional setting.

I think that UTF-8 should be allowed - at least optionally - we’re no longer in the 80s, and whilst the majority of email addresses still use only basic ASCII, it’s not really right to keep restricting them in this way. If there are issues with ambiguous characters (and there would be) we can fix the display of those (and in fact #19990 would provide the mechanism for doing this.)

This issue has been raised before and closed as completed, but clearly it is not:

https://github.com/go-gitea/gitea/issues/17029#event-6424074969

https://github.com/go-gitea/gitea/issues/17029#issue-994354749

I don't think it makes sense for gitea to validate emails at this point. AFAIK these emails serve two purposes:

    Send email notifications
    Associate the user with the emails in git histories

https://davidcel.is/2012/09/06/stop-validating-email.html

https://medium.com/hackernoon/the-100-correct-way-to-validate-email-addresses-7c4818f24643

The 100% correct way

Send your users an activation email. (That’s a bold full-stop for effect.)

If you are not going to send a validation email, then do not try to validate the email address; don’t reject valid email addresses you haven’t validated, because you cannot know they are invalid if you haven’t tested their validity by verifying they are valid…

The myth about the no leading - restriction is, so far as I can determine, from people that aren’t aware of how to interact with unix command line utilities.

Under both Linux and BSD, sendmail, like pretty much all other CLI utilities, should be invoked with a -- argument after the flags you want sendmail executed with. All arguments supplied after -- are parsed as literal payload values and are not considered flags/switches to sendmail.

This should also be the case with all the other popular sendmail alternative MTAs like postfix and the rest.

Sure, it’s possible that some ancient version of a popular MTA didn’t support -- and this workaround was required (though basic unix utilities have had to support this from forever ago in order for you to create or delete a file name - so it’s not some really advanced wizardry) but an MTA that old should probably not be connected to the internet as it surely has bigger problems than not supporting the venerable -- syntax (cough security vulnerabilities cough).
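
As a rough illustration of the -- convention described above, a Go caller could build the sendmail argument list like this. It is only a sketch under assumptions: `sendmailArgs` is a hypothetical helper, the binary path `/usr/sbin/sendmail` is an assumed location, and the installed MTA is assumed to honor -- as described in this comment. The example builds the command but does not run it.

```go
package main

import (
	"fmt"
	"os/exec"
)

// sendmailArgs is a hypothetical helper. It places "--" between the flags
// and the recipients, so a recipient such as "-oops@example.com" is passed
// as an operand rather than being parsed as a flag.
func sendmailArgs(recipients []string) []string {
	args := []string{"-i", "--"} // -i: do not treat a lone "." line as end of input
	return append(args, recipients...)
}

func main() {
	// exec.Command passes arguments directly, without a shell, so only the
	// MTA's own flag parsing matters here. The path is an assumption.
	cmd := exec.Command("/usr/sbin/sendmail", sendmailArgs([]string{"-oops@example.com"})...)
	fmt.Println(cmd.Args) // the command is constructed but never started
}
```

The point of the design is that the separator is added in one place, by the code that builds the command, rather than relying on every address being "safe".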

I think we should be careful here to note that git will allow these extremely weird email addresses and Gitea will just use them.

So by having this super-restrictive pattern we’re not preventing weird and ambiguous email addresses from appearing in Gitea - just preventing the user from saying that one belongs to them.

Further, the potential problem of an email address being ambiguous or confusable with another user’s is not much of an issue, as Gitea will not show the email address and will show the user it maps to instead. Thus Unicode ambiguity of email addresses should only affect the user to whom the ambiguous email address belongs.

Next we should consider if there are potential sec issues by allowing arbitrary email addresses.

  1. Sendmail - preventing an initial - seems to be all that is needed.
  2. We don’t appear to display the email in any email templates - so I don’t think there’s a problem there.
  3. Otherwise, the only places the email is shown are the user’s own settings page and the login form when they try to log in with it.

As far as I can see the only person affected by Gitea allowing users to register their own weird email address is the user itself. Thus apart from blocking the initial - I can’t see a good reason for further restriction beyond RFC5322.

I might be missing something though - does anyone have any other ideas?
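
To make the proposal above concrete, here is a minimal sketch of "accept what the RFC grammar accepts, but block an initial -". It uses Go's standard `net/mail` parser as a stand-in for an RFC 5322 check. The function name `checkEmail` is hypothetical, and this is not the code in Gitea's `email_address.go`.

```go
package main

import (
	"errors"
	"fmt"
	"net/mail"
	"strings"
)

// checkEmail is a hypothetical validator: it defers syntax to net/mail and
// adds only the single extra restriction discussed in this thread.
func checkEmail(addr string) error {
	parsed, err := mail.ParseAddress(addr)
	if err != nil {
		return fmt.Errorf("not a parsable address: %w", err)
	}
	// ParseAddress also accepts forms like "Name <a@b.c>"; require a bare address.
	if parsed.Address != addr {
		return errors.New("expected a bare address without a display name")
	}
	// The one restriction beyond the parser: no leading "-", because the
	// address may end up as a command-line argument.
	if strings.HasPrefix(addr, "-") {
		return errors.New("address must not start with '-'")
	}
	return nil
}

func main() {
	for _, addr := range []string{"_@anydomain.com", "user@some-domain.com", "-flag@example.com", "not an email"} {
		fmt.Printf("%-24s %v\n", addr, checkEmail(addr))
	}
}
```

Whether the leading - check should be unconditional or an optional setting, as suggested earlier in the thread, is left open here.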

I think yes, recent versions restrict email addresses more tightly than any RFC does. For your example, both user+mailbox@example.com and customer/department=shipping@example.com are valid email addresses.

That is exactly the problem: if we allow a leading -, there will always be someone who forgets to prepend -- to their sendmail command. While Gitea should handle all its own sendmail code correctly, that is not necessarily the case for downstream tools we have no influence over, for example ones that query the email from the API.

Non-argument. You can say that about any option & any command. If you don’t know what you are doing, you can do everything the wrong way.

Exhibit A

The find command.

If you don’t know that this antiquated tool from the prehistoric stone age uses single dashes for long options, you are just as lost. Yet this tool is (sadly) used all over the world and people deal with it. So if they can deal with this prehistoric piece made by a caveman, they can deal with a simple double dash in between.


I don’t even know why you would want an email address that starts with -. It might be allowed but the only benefits it grants are “trolling” insecure applications and annoying senders who want to send you an email.

Non-argument. Even more so, than the last one.

Exhibit A

“I don’t even know why you would want a nickname like ‘delvh’, like why? Why not use your real name or a normal English word?” - someone could ask.

If something is not forbidden, why make it “undesired” by convention? Such actions are usually much more confusing than an unusual case within the frame of what is allowed, because then you have millions of conventions, which are basically unwritten “rules” one “must” (not really) follow.

Therefore, if it is allowed & according to the spec, it should work. Simple as that.

If someone does not use a double dash or quotes or whatever options there are, and then blames the software design for it, that is pretty much like saying “well, someone could type ls --l instead of ls -l, so we must make it work with a double dash, as well” or something along those lines.

This feels like something that is going to go round and round in circles.

The restriction is in:

https://github.com/go-gitea/gitea/blob/0e46499258a20a8d701cdfc489c55a4246b4901e/models/user/email_address.go#L145-L168

If you want to change/relax this you will need to make a PR that changes the code here.

You will also need to consider:

  1. Sendmail - email addresses are sent as arguments to the command line - AFAIU disallowing an initial - should be all that is needed.
  2. Any other mail system - could arbitrary unicode characters break things?
  3. Do we need to add ambiguity checking to the display of email addresses?

Code speaks louder than words. Make the PR.

Plus since 2012 you can use international characters above U+007F, encoded as UTF-8.

https://stackoverflow.com/questions/3844431/are-email-addresses-allowed-to-contain-non-alphanumeric-characters

(just FYI, not sure whether Gitea should support it. there could be some comments or documents about supporting or not).