Address munging

From Infogalactic: the planetary knowledge core

Address munging is the practice of disguising an e-mail address to prevent it from being automatically collected by unsolicited bulk e-mail providers. The disguise is intended to prevent computer software from seeing the real address, or even any address at all, while still allowing a human reader to reconstruct the original and contact the author. For instance, an e-mail address such as "no-one@example.com" becomes "no-one at example dot com".

Any e-mail address posted in public is likely to be automatically collected by computer software used by bulk emailers (a process known as e-mail address harvesting). Addresses posted on webpages, Usenet or chat rooms are particularly vulnerable to this.[1] Private e-mail sent between individuals is highly unlikely to be collected, but e-mail sent to a mailing list that is archived and made available via the web, or passed on to a Usenet news server and made public, may eventually be scanned and collected.

Disadvantages

Disguising addresses makes it more difficult for people to send e-mail to each other. Many see it as an attempt to fix a symptom rather than solving the real problem of e-mail spam, at the expense of causing problems for innocent users.[2] In addition, some e-mail address harvesters have found ways to read munged addresses.

The use of address munging on Usenet is contrary to the recommendations of RFC 1036 governing the format of Usenet posts, which requires a valid e-mail address be supplied in the From: field of the post. In practice, few people follow this recommendation strictly.[3]

Disguising e-mail addresses in a systematic manner (for example, user[at]domain[dot]com) offers little protection, because a predictable pattern is as easy for software to match as the plain address. Such addresses can be revealed through a simple Google search.
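To illustrate why a systematic disguise is weak, the following is a minimal sketch of how little code is needed to undo it. The function name demunge and the exact patterns it handles are assumptions chosen for illustration; they are not taken from any particular harvesting tool.

```javascript
// Hypothetical helper (illustrative only): undo "[at]" / "(dot)" style munging.
function demunge(text) {
  return text
    // "[at]", "(at)" or " at " (with surrounding spaces) becomes "@"
    .replace(/\s*[\[(]\s*at\s*[\])]\s*|\s+at\s+/gi, '@')
    // "[dot]", "(dot)" or " dot " (with surrounding spaces) becomes "."
    .replace(/\s*[\[(]\s*dot\s*[\])]\s*|\s+dot\s+/gi, '.');
}

console.log(demunge('user[at]domain[dot]com'));      // user@domain.com
console.log(demunge('no-one at example (dot) com')); // no-one@example.com
```

Two regular expressions are enough to cover several common variants, which suggests that predictable substitutions mainly inconvenience human readers.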

Any impediment reduces a potential correspondent's willingness to take the extra trouble to e-mail the address owner. In contrast, well-maintained e-mail filtering on the recipient's end does not drive away potential correspondents. No spam filter is 100 percent immune to false positives, however, and the same correspondent who would have been deterred by address munging may instead end up wasting time on long letters that merely disappear into junk mail folders.

For commercial entities, maintaining contact forms on web pages rather than publicizing e-mail addresses may be one way to ensure that incoming messages are relatively spam-free yet do not get lost. In conjunction with CAPTCHA fields, spam submitted through such forms can be reduced to effectively zero, except that the inaccessibility of CAPTCHAs brings exactly the same deterrent problems as address munging itself.

Alternatives

As an alternative to address munging, there are several "transparent" techniques that allow people to post a valid e-mail address, but still make automated recognition and collection of the address difficult:

  • "Transparent name mangling" involves replacing characters in the address with equivalent HTML references from the list of XML and HTML character entity references, e.g. the '@' (U+0040) gets replaced by '&#64;' and the '.' (U+002E) gets replaced by '&#46;'. A browser displays these references as the original characters.[4]
  • Posting all or part of the e-mail address as an image,[5] for example "no-one@example.com" with the at sign displayed as an image, sometimes with the alternative text specified as "@" to allow copy-and-paste, while altering the address so that it remains outside the typical regular expressions of spambots.
  • Using a client-side form with the e-mail address as a CSS3 animated text logo captcha and shrinking it to normal size using inline CSS.[6]
  • Posting an e-mail address with the order of characters jumbled and restoring the order using CSS.[7]
  • Building the link by client-side scripting.[8]
  • Using server-side scripting to run a contact form.[9]

An example of munging "user@example.com" via client-side scripting would be:

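 <script type="text/javascript">
 // Assemble the address at run time so that the complete
 // string never appears literally in the page source.
 // "user" is used instead of "name" to avoid clashing with window.name.
 var user = 'user';
 var at = '@';
 var domain = 'example.com';
 document.write(user + at + domain);
 </script>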
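The character-reordering technique listed above can be sketched in a similar way. The example below uses plain reversal as the simplest form of jumbling and relies on the CSS properties unicode-bidi and direction to restore the visual order. The function name reversedAddressMarkup is hypothetical, and the sketch assumes the address contains no HTML-special characters such as "<" or "&".

```javascript
// Hypothetical helper: emit the address reversed, wrapped in CSS that flips it back.
// Assumes the address needs no HTML escaping.
function reversedAddressMarkup(address) {
  // Reverse by code point so the page source holds "moc.elpmaxe@resu"
  // rather than the real address.
  const reversed = Array.from(address).reverse().join('');
  return '<span style="unicode-bidi: bidi-override; direction: rtl;">' +
    reversed + '</span>';
}

console.log(reversedAddressMarkup('user@example.com'));
// <span style="unicode-bidi: bidi-override; direction: rtl;">moc.elpmaxe@resu</span>
```

A browser that applies the style displays the characters right to left, so the reader sees the address in its normal order. Text copied from the page typically still follows the reversed source order, so this approach trades some convenience for obscurity.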

The use of images and scripts for address obfuscation can cause problems for people using screen readers and other users with disabilities, and ignores users of text browsers like lynx and w3m. Because these techniques are transparent, however, they do not disadvantage non-English speakers, who may be unable to understand the language-specific plain text that forms part of non-transparent munged addresses or the instructions that accompany them.

According to a 2003 study by the Center for Democracy and Technology, even the simplest "transparent name mangling" of e-mail addresses can be effective.[10]
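As a rough illustration of "transparent name mangling", the sketch below replaces characters with decimal numeric character references. Both function names are hypothetical; the references '&#64;' and '&#46;' are the standard decimal forms for '@' and '.'.

```javascript
// Hypothetical helper: replace only '@' and '.' with numeric character references.
function encodeAtAndDot(address) {
  return address.replace(/@/g, '&#64;').replace(/\./g, '&#46;');
}

// Hypothetical helper: encode every character of the address.
function encodeAll(address) {
  return Array.from(address)
    .map((ch) => '&#' + ch.codePointAt(0) + ';')
    .join('');
}

console.log(encodeAtAndDot('no-one@example.com')); // no-one&#64;example&#46;com
console.log(encodeAll('a@b'));                     // &#97;&#64;&#98;
```

When the output is placed in an HTML page, a browser renders it as the ordinary address, while software that scans the raw source for an '@' between two words does not find one unless it decodes the references first.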

Examples

Common methods of disguising addresses include:

Disguised address | Recovering the original address
no-one at example (dot) com | Replace " at " with "@", and " (dot) " with "."
no-one@elpmaxe.com.invalid | Reverse domain name: elpmaxe to example; remove .invalid
moc.elpmaxe@eno-on | Reverse the entire address
no-one@exampleREMOVEME.com | Instructions in the address itself; remove REMOVEME.
no-one@exampleNOSPAM.com.invalid | Remove NOSPAM and .invalid from the address.
n o - o n e @ e x a m p l e . c o m | This is still readable, but the spaces between letters stop most automatic spambots.
no-one<i>@</i>example<i>.</i>com (as HTML) | This is still readable and can be copied directly from webpages, but stops many email harvesters.
по-опе@ехатрlе.сот | Cannot be copied directly from Webpages, must be manually copied. All letters except l are Cyrillic homoglyphs that are identical to Latin equivalents to the human eye but are perceived differently by most computers. (See also IDN homograph attack for more malicious use of this strategy.)

The reserved top-level domain .invalid is appended to ensure that a real e-mail address is not inadvertently generated. One problem is that some spammers now use filters to remove obvious munges and send spam to the cleaned-up address.[citation needed] For this reason many people recommend using a totally invalid address[vague] (especially in the From line) and perhaps a disposable e-mail address in the Reply-To line.
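Two of the methods in the table can be sketched as small round-trip functions. All function names here are hypothetical, and the sketch assumes the address has a domain containing at least one dot.

```javascript
// Reverse the entire address: "no-one@example.com" <-> "moc.elpmaxe@eno-on".
// Applying the function twice restores the original.
function reverseAddress(address) {
  return Array.from(address).reverse().join('');
}

// Insert a marker before the last dot and append the reserved ".invalid" domain.
// Assumes the address contains at least one dot.
function addMarker(address, marker = 'NOSPAM') {
  const lastDot = address.lastIndexOf('.');
  return address.slice(0, lastDot) + marker + address.slice(lastDot) + '.invalid';
}

// Undo addMarker: drop the first occurrence of the marker and the trailing ".invalid".
function removeMarker(munged, marker = 'NOSPAM') {
  return munged.replace(marker, '').replace(/\.invalid$/, '');
}

console.log(reverseAddress('no-one@example.com')); // moc.elpmaxe@eno-on
console.log(addMarker('no-one@example.com'));      // no-one@exampleNOSPAM.com.invalid
console.log(removeMarker('no-one@exampleNOSPAM.com.invalid')); // no-one@example.com
```

The brevity of removeMarker shows how the filters mentioned above could strip obvious munges with very little effort.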

References

  1. Email Address Harvesting: How Spammers Reap What You Sow, Federal Trade Commission. URL accessed on 24 April 2006.
  2. Address Munging Considered Harmful, Matt Curtin
  3. See Usenet.
  4. (Citation text missing from the source.)
  5. E-mail as an image
  6. Client-side contact form generator (the generator requires JavaScript enabled, output for displaying emails requires CSS)
  7. PHP jumbler tool
  8. JavaScript address script generator (the generator requires cookies enabled, output for displaying emails requires JavaScript enabled)
  9. PHP contact form generator
  10. "Why Am I Getting All This Spam? Unsolicited Commercial E-mail Research Six Month Report" March 2003.
