gitea: Valid email addresses are rejected with `The email address is invalid`, entirely preventing the use of gitea
Description
Valid email addresses are rejected, which prevents creating the admin user or any other user. The same valid addresses are also rejected when an account is created with a temporary address and the user later tries to replace it with the valid one.
Example address that is not accepted, despite being perfectly valid under (among other standards):
- RFC 5322 https://datatracker.ietf.org/doc/html/rfc5322
- RFC 3696 https://datatracker.ietf.org/doc/html/rfc3696
Without quotes, local-parts may consist of any combination of
alphabetic characters, digits, or any of the special characters
! # $ % & ' * + - / = ? ^ _ ` . { | } ~
period (".") may also appear, but may not be used to start or end the
local part, nor may two or more consecutive periods appear. Stated
differently, any ASCII graphic (printing) character other than the
at-sign ("@"), backslash, double quote, comma, or square brackets may
appear without quoting. If any of that list of excluded characters
are to appear, they must be quoted. Forms such as
user+mailbox@example.com
customer/department=shipping@example.com
$A12345@example.com
!def!xyz%abc@example.com
_somename@example.com
are valid and are seen fairly regularly, but any of the characters
listed above are permitted. In the context of local parts,
apostrophe ("'") and acute accent ("`") are ordinary characters, not
quoting characters. Some of the characters listed above are used in
conventions about routing or other types of special handling by some
receiving hosts. But, since there is no way to know whether the
remote host is using those conventions or just treating these
characters as normal text, sending programs (and programs evaluating
address validity) must simply accept the strings and pass them on.
Example:
_@anydomain.com
Gitea Version
v1.16.8
Can you reproduce the bug on the Gitea demo site?
Yes
Log Gist
No response
Screenshots
No response
Git Version
No response
Operating System
No response
How are you running Gitea?
not relevant
Database
MySQL
About this issue
- Original URL
- State: open
- Created 2 years ago
- Reactions: 9
- Comments: 20 (15 by maintainers)
How about `user@some-domain.com`? Dashes in domain names are very common.
This was my worry when the email character restrictions were merged. There is really only one thing we should not be allowing outside of RFC5322:
`-` but even that is really not our responsibility - it’s only a problem with those running sendmail commands and the way they’ve configured the command. It should probably be an optional setting.
I think that UTF-8 should be allowed - at least optionally - we’re no longer in the 80s and whilst the majority of email addresses are still using only basic ASCII it’s not really right to still be restricting in this matter. If there are issues with ambiguous characters (and there would be) we can fix the display of those (and in fact #19990 would provide the mechanism for doing this.)
This issue has been raised before, and closed as `completed`, but clearly, it is not
https://github.com/go-gitea/gitea/issues/17029#event-6424074969
https://github.com/go-gitea/gitea/issues/17029#issue-994354749
https://davidcel.is/2012/09/06/stop-validating-email.html
https://medium.com/hackernoon/the-100-correct-way-to-validate-email-addresses-7c4818f24643
If you are not going to send a validation email, then do not try to validate the email address; don’t reject valid email addresses you haven’t validated, because you cannot know they are invalid if you haven’t tested their validity by verifying they are valid…
The myth about the no leading `-` restriction is, so far as I can determine, from people that aren’t aware of how to interact with unix command line utilities.
Under both Linux and BSD, `sendmail`, like pretty much all other CLI utilities, should be invoked with a `--` argument after the flags you want `sendmail` executed with. All arguments supplied after `--` are parsed as literal payload values and are not considered flags/switches to `sendmail`.
This should also be the case with all the other popular sendmail alternative MTAs like postfix and the rest.
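As a rough illustration of the `--` convention described above, here is a minimal Go sketch that builds an argument list for a sendmail-compatible binary. The helper name `sendmailArgs`, the sender address and the recipient are made up for this example. It assumes the installed binary honours `--` as the end of options, and it is not taken from Gitea's mailer code.

```go
package main

import (
	"fmt"
	"os/exec"
)

// sendmailArgs builds the argument list for a sendmail-compatible binary.
// Everything after "--" is meant to be treated as an operand (a recipient),
// so an address that begins with "-" is not parsed as an option.
// The helper name is hypothetical.
func sendmailArgs(from string, recipients []string) []string {
	args := []string{"-i", "-f", from, "--"}
	return append(args, recipients...)
}

func main() {
	// Hypothetical recipient chosen to look like an option.
	rcpt := []string{"-oQ/tmp@example.com"}
	cmd := exec.Command("sendmail", sendmailArgs("gitea@example.com", rcpt)...)
	// Print the argv instead of running it; no MTA is assumed to be installed.
	fmt.Println(cmd.Args)
}
```

Passing the arguments as a slice to `exec.Command` also avoids shell quoting, so the only remaining ambiguity is option parsing inside the binary itself, which is what `--` addresses.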
Sure, it’s possible that some ancient version of a popular MTA didn’t support `--` and this workaround was required (though basic unix utilities have had to support this from forever ago in order for you to create or delete a file name `-` so it’s not some really advanced wizardry) but an MTA that old should probably not be connected to the internet as it surely has bigger problems than not supporting the venerable `--` syntax (cough security vulnerabilities cough).
I think we should be careful here to note that git will allow these extremely weird email addresses and Gitea will just use them.
So by having this super-restrictive pattern we’re not preventing weird and ambiguous email addresses from appearing in Gitea - just preventing the user from saying that one belongs to them.
Further, the potential problem of email addresses being ambiguous/confusable with another user’s isn’t a much worse issue, as Gitea will not show the email address and will show the user it maps to instead. Thus Unicode ambiguity of email addresses should only affect the user to whom the ambiguous email address belongs.
Next we should consider if there are potential security issues by allowing arbitrary email addresses.
`-` seems to be all that is needed.
As far as I can see the only person affected by Gitea allowing users to register their own weird email address is the user itself. Thus apart from blocking the initial `-` I can’t see a good reason for further restriction beyond RFC5322.
I might be missing something though - does anyone have any other ideas?
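To make the proposal concrete, here is a minimal Go sketch of a check that accepts whatever the standard library's `net/mail` parser accepts as a bare address, with an optional rejection of a leading `-`. The function name `acceptEmail` and the `rejectLeadingDash` switch are hypothetical. `net/mail` follows the RFC 5322 address syntax only approximately, and this is not the validation code currently in Gitea.

```go
package main

import (
	"fmt"
	"net/mail"
	"strings"
)

// acceptEmail reports whether s is a bare address that net/mail can parse.
// rejectLeadingDash is a hypothetical switch for the optional "-" check
// discussed in this thread.
func acceptEmail(s string, rejectLeadingDash bool) bool {
	if rejectLeadingDash && strings.HasPrefix(s, "-") {
		return false
	}
	addr, err := mail.ParseAddress(s)
	if err != nil {
		return false
	}
	// ParseAddress also accepts "Name <addr>" forms; require the bare address.
	return addr.Address == s
}

func main() {
	for _, s := range []string{
		"_@anydomain.com",
		"user+mailbox@example.com",
		"customer/department=shipping@example.com",
		"user@some-domain.com",
		"-dash@example.com",
	} {
		fmt.Printf("%-45s %v\n", s, acceptEmail(s, true))
	}
}
```

Keeping the leading `-` check behind a flag mirrors the suggestion that it should be an optional setting rather than a hard rule.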
I think yes, the recent version applies a stricter limitation to email addresses than any RFC. For your example, both `user+mailbox@example.com` and `customer/department=shipping@example.com` are valid email addresses.
Non-argument. You can say that about any option & any command. If you don’t know what you are doing, you can do everything the wrong way.
Exhibit A
The `find` command.
If you don’t know that this antiquated tool from the prehistoric stone ages uses single dashes for long options, you are just as lost. Yet this tool is (sadly) used all over the world & people deal with it. So, if they can deal with this prehistoric piece made by a caveman, they can deal with a simple in-between double dash.
Non-argument. Even more so, than the last one.
Exhibit A
“I don’t even know why you would want a nickname like ‘delvh’, like why? Why not use your real name or a normal English word?” - someone could ask.
If something is not forbidden, why make it “undesired” by convention? Such actions are usually much more confusing than an unusual case within the frame of what is allowed, because then you have millions of conventions, which are basically unwritten “rules” one “must” (not really) follow.
Therefore, if it is allowed & according to the spec, it should work. Simple as that.
If someone does not use a double dash or quotes or whatever options there are, and then blames the software design for it, that is pretty much like saying “well, someone could type `ls --l` instead of `ls -l`, so we must make it work with a double dash, as well” or something along those lines.
This feels like something that is going to go round and round in circles.
The restriction is in:
https://github.com/go-gitea/gitea/blob/0e46499258a20a8d701cdfc489c55a4246b4901e/models/user/email_address.go#L145-L168
If you want to change/relax this you will need to make a PR that changes the code here.
You will also need to consider:
`-` should be all that needs.
Code speaks louder than words. Make the PR.
Plus since 2012 you can use international characters above U+007F, encoded as UTF-8.
https://stackoverflow.com/questions/3844431/are-email-addresses-allowed-to-contain-non-alphanumeric-characters
(just FYI, not sure whether Gitea should support it. there could be some comments or documents about supporting or not).