[Mailer][Mime] Support unicode email addresses #58361
Before this commit, Envelope would throw InvalidArgumentException when a unicode sender address was used. Now, that error is thrown slightly later, is thrown for recipient addresses as well, but is not thrown if the next-hop server supports SMTPUTF8. As a side effect, transports that use JSON APIs to ESPs can also use unicode addresses if the ESP supports that (many do, many don't).
Hey! I see that this is your first PR. That is great! Welcome! Symfony has a contribution guide which I suggest you read. In short:
Review the GitHub status checks of your pull request and try to solve the reported issues. If some tests are failing, try to see if they are failing because of this change. When two Symfony core team members approve this change, it will be merged and you will become an official Symfony contributor! I am going to sit back now and wait for the reviews. Cheers! Carsonbot
Also fix one mysteriously broken unit test.
I pushed a new commit resolving all received comments. Thanks!
Please fix the coding standards to follow the Symfony coding standards (see the fabbot.io check).
* The SMTPUTF8 extension is strictly required if any address
* contains a non-ASCII character in its localpart. If non-ASCII
* is only used in domains (e.g. horst@freiherr-von-mühlhausen.de)
* then it is possible to to send the message using IDN encoding
- * then it is possible to to send the message using IDN encoding
+ * then it is possible to send the message using IDN encoding
@@ -44,10 +44,6 @@ public static function create(RawMessage $message): self

    public function setSender(Address $sender): void
    {
        // to ensure deliverability of bounce emails independent of UTF-8 capabilities of SMTP servers
Would bounces still work fine if your SMTP server supports SMTPUTF8 but the target server does not?
Yes, you get a 5xx error when the supporting SMTP client sends to the unsupporting SMTP server.
The Symfony code is written to minimize the chance of running into this at all, though. If you send to e.g. info@grå.org, Symfony will be able to send that even to an unsupporting server. That's why the code tests for non-ASCII in the localpart (not the entire address). This is the same approach as e.g. Exchange.
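To make the localpart test above concrete, here is a minimal Python sketch (not the PR's PHP code). The function names `requires_smtputf8` and `wire_form` are hypothetical, and Python's built-in `idna` codec stands in for whatever IDN encoder the real implementation uses.

```python
def split_address(address):
    """Split at the last '@' into (localpart, domain)."""
    localpart, _, domain = address.rpartition("@")
    return localpart, domain


def requires_smtputf8(address):
    """Hypothetical helper: only a non-ASCII localpart forces SMTPUTF8."""
    localpart, _ = split_address(address)
    return not localpart.isascii()


def wire_form(address):
    """Hypothetical helper: return the address as it could be sent.

    If the localpart is ASCII, the domain is IDN-encoded so the address
    is acceptable to servers without SMTPUTF8. Otherwise the address is
    left as UTF-8 and the SMTPUTF8 extension is needed.
    """
    localpart, domain = split_address(address)
    if requires_smtputf8(address):
        return address
    return localpart + "@" + domain.encode("idna").decode("ascii")


for addr in ("info@grå.org", "grå@grå.org", "noreply@example.com"):
    print(addr, requires_smtputf8(addr), wire_form(addr))
```

Only the second address in the loop has a non-ASCII localpart, so it is the only one that would need the extension under this sketch.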
I should elaborate. The 5xx error means that the server that generates the DSN is one that supports SMTPUTF8. It will generate a bounce that does not require SMTPUTF8 in order to be delivered, and which contains UTF8 in the body text.
It works quite well.
The Sender will become the recipient of the bounce. That's why I'm wondering how this would behave.
If the sender uses an ASCII address (highly advisable at this time), then the bounce does not require SMTPUTF8. Delivering the DSN is simple.
If the sender uses a non-ASCII address, then Symfony's upstream MTA will generate a DSN that uses SMTPUTF8, but in this case we know that the sender has support for that.
Bounces work reliably. My job (at ICANN) is one where I get to hear about this kind of problem ;)
If the recipient's server receives the message at all, then it supports SMTPUTF8. An extended server never forwards an SMTPUTF8 message to an unextended server. (This is a nontrivial design decision, and was made only after a large testbed experiment.)
This means that if any server along the path needs (or chooses) to bounce the message, then it has SMTPUTF8 support.
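The forwarding rule and its consequence for bounces can be sketched as follows. This is an illustrative Python model of the behaviour described in this comment, not code from any real MTA; `next_hop_action` and `dsn_needs_smtputf8` are hypothetical names.

```python
def next_hop_action(message_needs_smtputf8, next_hop_extensions):
    """Model of the rule: an extended server never forwards an
    SMTPUTF8 message to a server that does not advertise SMTPUTF8."""
    if message_needs_smtputf8 and "SMTPUTF8" not in next_hop_extensions:
        return "bounce"   # the bouncing server itself supports SMTPUTF8
    return "forward"


def dsn_needs_smtputf8(original_sender):
    """The DSN is addressed to the original sender, so it needs
    SMTPUTF8 only if that sender's localpart is non-ASCII."""
    localpart = original_sender.rpartition("@")[0]
    return not localpart.isascii()


# An SMTPUTF8 message meets an unextended next hop: it is bounced,
# and with an ASCII sender the DSN can be delivered anywhere.
print(next_hop_action(True, {"8BITMIME", "PIPELINING"}))
print(dsn_needs_smtputf8("noreply@example.com"))
```

In this model the only DSNs that need SMTPUTF8 are those going back to a sender who already uses a non-ASCII localpart, which matches the argument made earlier in the thread.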
I wonder whether it makes sense to enforce ASCII in the sender's localpart… let me sleep on that, please. AIUI Symfony is used mostly to send mail from servers to users? Like a web server's noreply@example.com? Is Symfony also used by scripts that people like us run on the command line or from cron? There's a five-digit number of domains that actively use unicode email addresses, maybe even six-digit.
I have no idea what people build using symfony/mailer (nothing forbids you from writing a PHP script meant to be run on the command line, and Symfony does not call home to report to us that you wrote such a script). However, my intuition is that at least 99% of usages (and probably much more than that) are about sending from a server.
> If the recipient's server receives the message at all, then it supports SMTPUTF8. An extended server never forwards an SMTPUTF8 message to an unextended server. (This is a nontrivial design decision, and was made only after a large testbed experiment.)
that's actually a good context to have to understand how this works.
I thought about forbidding UTF8 sender addresses, and…
I think the key here is that when people make the mistake of sending "here's the link to change password" or "your order has shipped" from a unicode address, they understand it really quickly. When you do that, maybe 20% of your outgoing mail bounces, and you change your configuration later on the same day.
It's not the kind of mistake that causes slow trouble over a long time, it's the kind of mistake that causes a lot of trouble immediately.
If, on the other hand, you write a script to process, sort and forward inbound mail, then you may not receive any from a unicode address very soon, and a limitation on sender address is one that shows up slowly, after a while, and seldom.
For this reason, I think the argument to forbid UTF8 sender addresses is fairly weak. But you can judge it better than I — my expertise is in unicode email and domain names, you know Symfony users and traditions.
> If, on the other hand, you write a script to process, sort and forward inbound mail, then you may not receive any from a unicode address very soon, and a limitation on sender address is one that shows up slowly, after a while, and seldom.
Wouldn't those fail DMARC checks if you forward them using the original sender?
I would vote for keeping the restriction on UTF8 senders, which will give immediate feedback to devs instead of waiting for them to run into delivery trouble once they reach production (if they attempt to use a Unicode sender, it is likely that their own email servers will support it, so they won't get delivery issues in dev/staging).
I've never seen any of those scripts do external forwarding, which would subject the message to DMARC tests. People write code to sort mail to info@… into different classes, route some to autoresponders and some to various local addresses. If the body text mentions product x, the message is forwarded to address y, etc. Some companies make sure a Key Account Representative gets a copy of mail to/from key customers.
Some mail servers can do that with rules, but if a company employs developers, it's often done in the language those developers use.
I'll add another commit that reinstates the restriction on the localpart, with a unit test or two that make sure of compatibility with all receivers. You can include or exclude that commit when you merge the PR (I do hope you'll merge). IMO both approaches are good, the choice of best is a matter of Symfony's philosophy.
This commit also adds a test that Symfony chooses IDN encoding when possible (to be compatible with all email receivers), and adjusts a couple of tests to match the name used in the main source code.
Thank you @arnt.
I'd like to question how this PR is explained to users in the docs. I'm talking about the setup which we all think is the most common ("99%"):
This PR does two things:
Right?
In case 1, if the receiving SMTP server doesn't support
From my observation, I would estimate that far from all mail servers support SMTPUTF8 (I would say around 50%); can you confirm this?
You can argue that people who have an IDN domain will probably have an SMTPUTF8-enabled mail server. (I don't have any data about this, do you?) But I'm not sure about this, since punycode is so widely used (in many other systems as well).
So instead of telling users that Symfony now fully supports non-ASCII email addresses (see https://symfony.com/doc/current/mailer.html#email-addresses), I would rather advise them to add
What do you think?
A couple of things.
I have data about SMTP. I work with this, for ICANN. In short: If you as a random internet user want to send mail to someone who uses a unicode address, the chance that their incoming server supports SMTPUTF8 is practically 100%, and the chance that your outgoing server supports it is 80% or a little higher, as a worldwide average. That's for people worldwide; the outgoing server composition for servers like Symfony is different than for humans. Humans can use Yahoo, web servers can use Sparkpost, see? The composition for diligently upgraded servers is different again. It's difficult to count that, though.
You're slightly wrong about what the PR does, BTW. The condition for adding the SMTPUTF8 keyword isn't that a server supports it, but rather that the destination address requires it.
Anyway, I spoke to a large mail company in China a few weeks ago; they just don't see the errors you mention any more, and I know a couple of implementations (one that I wrote) for which it also appears to be a no-op. Punycode does nothing for the deliverability.
I can guess why it is a nonproblem. This is guesswork, not based on data:
That small value doesn't outweigh the permanent cost of punycode in interoperability risks and showing users xn--foo-43243129.
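The condition described here (the keyword is added because an address requires it, and an error is raised only if the next hop cannot accept it) might look roughly like this in Python. This is a sketch under assumptions, not the PR's code: `mail_from_command` is a hypothetical name, and domains are assumed to have been IDN-encoded already wherever the localpart is ASCII.

```python
def needs_smtputf8(address):
    """True if the localpart (text before the last '@') is non-ASCII."""
    return not address.rpartition("@")[0].isascii()


def mail_from_command(sender, recipients, server_extensions):
    """Build the MAIL FROM line for one envelope (illustrative only).

    server_extensions is the set of keywords the next-hop server
    advertised in its EHLO response.
    """
    required = any(needs_smtputf8(a) for a in [sender, *recipients])
    if not required:
        # Plain ASCII envelope: the keyword is not added, even if
        # the server happens to advertise SMTPUTF8.
        return "MAIL FROM:<%s>" % sender
    if "SMTPUTF8" not in server_extensions:
        raise ValueError("an address requires SMTPUTF8, "
                         "but the next-hop server does not support it")
    return "MAIL FROM:<%s> SMTPUTF8" % sender


print(mail_from_command("noreply@example.com",
                        ["info@xn--gr-zia.org"], {"SMTPUTF8"}))
print(mail_from_command("noreply@example.com",
                        ["grå@grå.org"], {"SMTPUTF8"}))
```

The first call produces a plain `MAIL FROM` line despite the server advertising the extension; the second adds the keyword because a recipient localpart is non-ASCII.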
This is exactly the point that I'm questioning.
Well, if the localpart is ASCII, then SMTPUTF8 isn't really required, since there's an alternative (punycode). So, after some more thinking about it, I would say: Symfony should behave the way common MUAs (Thunderbird, Outlook, etc.) behave, because that's probably what most users expect: If it works in Thunderbird, it should work in Symfony. The only problem is: I don't know how Thunderbird handles it ;-) Do you?
Thunderbird sends UTF8. The main punycode senders are Mutt and Exchange. I think there's a third notable one, can't remember which one that is. UTF8 is the majority anyway. (I'll ask a colleague to survey how many of the relevant mail servers accept the punycode form of an address. It should be 100%, but sometimes people read documentation like this, test with gmail and don't notice that they've forgotten to add the xn-- form to the list of domains. Would be good to check how common that is.)
This allows applications to send mail to all-Chinese email addresses, or like my test address grå@grå.org. Code that uses Symfony needs no change and should experience no difference, although if the upstream MTA doesn't support it (most do by now) then an exception is thrown slightly later than before this change.