When ICANN introduced domain names that can use
non-ASCII characters, it created opportunities for phishers. Here's how that
can be overcome.
When ICANN began to allow
registration of internationalized domain names (IDNs), that is, domain names that use
non-ASCII characters, it unwittingly opened a new avenue for phishing
campaigns. Visually similar characters from different
scripts, called homoglyphs, can be used to create domain names that look
nearly identical to legitimate ones, easily fooling users into believing
that one domain is actually another.
Without
using links, consider the differences between ТесhRерubliс and TechRepublic.
One is written normally, with ASCII characters. The other substitutes
Cyrillic characters for the Latin letters T, e, c,
and p. (Which is which is revealed at the bottom of this article.) Russian
lends itself well to homoglyph attacks, as the lowercase a, o, x, and y can be
rendered identically as а, о, х, and у, with other possibilities available in non-Russian
Cyrillic characters. Other, less precise homoglyphs are possible as well. For
example, the letter i is visually similar to і (Cyrillic) and ì (Latin, with
grave).
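To make the substitution concrete, here is a minimal Python sketch that reports which scripts appear in a name. It is an illustration rather than a complete detector: Python's standard library exposes no script property, so the sketch assumes the first word of each character's Unicode name (such as LATIN or CYRILLIC) is a good enough stand-in, and the function names are hypothetical.

```python
import unicodedata


def script_of(ch: str) -> str:
    """Approximate a character's script by the first word of its Unicode name."""
    # Assumption: "CYRILLIC SMALL LETTER IE" -> "CYRILLIC", and so on.
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def scripts_in(label: str) -> set[str]:
    """Return the set of scripts used by the letters in a label."""
    return {script_of(ch) for ch in label if ch.isalpha()}


# The lookalike from the text, spelled with escapes so the Cyrillic
# letters standing in for T, e, c, and p are unambiguous in source form.
spoof = "\u0422\u0435\u0441hR\u0435\u0440ubli\u0441"
genuine = "TechRepublic"

print(sorted(scripts_in(genuine)))  # Latin letters only
print(sorted(scripts_in(spoof)))    # Latin and Cyrillic mixed together
```

A name whose letters come from more than one script is not automatically malicious, but it is the pattern a homoglyph attack on a Latin-script brand tends to produce.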
This
is, to some extent, a problem in other languages as well. Japanese
has three writing systems: Hiragana, Katakana, and Kanji. The company name
Mitsubishi would normally be written as 三菱 (three
diamonds), but the kanji for three (三) looks
similar to the katakana for mi (ミ), which
can lead to confusion. As would be expected, it is possible to mix and match
these writing systems when registering domain names. In Traditional and
Simplified Chinese, many characters are homoglyphs of each other as well.
Principally,
this becomes a problem when attackers use these homoglyphs in phishing attacks,
as it would be easy to impersonate popular websites using this type of
strategy. ICANN's policies on how to deal with this problem—or IDNs in
general—are sparse. As a result, each registry has its own rules about how to
handle IDNs.
Many
ccTLDs and new gTLDs disallow IDNs or restrict how they can be
used, though these rules are inconsistent between registries. The .com and .net
registries essentially allow anything through. Because these are the most
popular TLDs for legitimate websites, the lack of protection makes
them all the more attractive to phishers.
At
present, Google Chrome, Microsoft Edge, and Mozilla Firefox handle
mixed-character IDNs by reverting to Punycode, that is, the ASCII
representation of an IDN. Because of the complexity of changing character
encoding, IDNs were implemented in a somewhat kludge-like fashion. So, from the
above example, instead of seeing a lookalike of techrepublic.com in the address bar, you would
see xn--hrubli-2ofc3hgib.com.
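The conversion browsers fall back on can be sketched with Python's built-in punycode codec. This is a simplified illustration: it handles a single label, lowercases it, and adds the "xn--" prefix by hand, and it skips the normalization and validation steps that real IDNA processing performs. The helper names are hypothetical.

```python
def to_ace(label: str) -> str:
    """Return the ASCII-compatible ("xn--") form of a single domain label."""
    lowered = label.lower()
    if lowered.isascii():
        return lowered  # pure-ASCII labels are left as they are
    # Punycode keeps the ASCII letters up front and encodes the rest after "-".
    return "xn--" + lowered.encode("punycode").decode("ascii")


def from_ace(label: str) -> str:
    """Reverse to_ace for labels that carry the "xn--" prefix."""
    if label.startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


# The lookalike name from the text; escapes mark the Cyrillic letters.
spoof = "\u0422\u0435\u0441hR\u0435\u0440ubli\u0441"

ace = to_ace(spoof)
print(ace + ".com")           # what the address bar falls back to
print(from_ace(ace) + ".com")  # the lowercase Unicode form it stands for
```

Only the Latin letters h, r, u, b, l, and i survive in readable form, which is why the encoded name bears so little resemblance to the one the attacker wanted users to see.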
However,
this behavior breaks legitimate cases in which non-Latin
characters are expected to be mixed with standard ASCII characters. Microsoft tried to fix this
problem in IE by manually
whitelisting scripts that are allowed to mix with ASCII
without reverting to Punycode.
There is a more elegant solution to this problem,
however. For domain names that mix ASCII and non-ASCII characters, changing
individual non-ASCII characters in a domain name to red in the address bar
would sufficiently differentiate characters otherwise useful for homoglyph
attacks while preserving the intended use of IDNs. For obvious reasons,
extension engines in browsers generally do not allow this behavior to be
implemented as an extension, so it would have to be implemented as a feature of
the browser itself.
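As a rough illustration of the idea, the following Python sketch marks every non-ASCII character in a domain name. It assumes a terminal that understands ANSI color codes as a stand-in for the address bar; a real implementation would live in the browser's own UI code, and the function name is hypothetical.

```python
RED = "\033[31m"    # ANSI escape: switch the foreground color to red
RESET = "\033[0m"   # ANSI escape: restore the default color


def highlight_non_ascii(domain: str, start: str = RED, end: str = RESET) -> str:
    """Wrap each non-ASCII character in start/end markers; leave ASCII alone."""
    pieces = []
    for ch in domain:
        if ord(ch) > 127:
            pieces.append(f"{start}{ch}{end}")
        else:
            pieces.append(ch)
    return "".join(pieces)


# The lookalike from the text, lowercased; escapes mark the Cyrillic letters.
spoof = "\u0442\u0435\u0441hr\u0435\u0440ubli\u0441.com"

print(highlight_non_ascii(spoof))              # red letters in a terminal
print(highlight_non_ascii(spoof, "[", "]"))    # plain-text markers instead
```

The markers are parameters so the same function can produce plain-text output where color is unavailable.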
This solution, however, is only a band-aid for a
problem that exists because of ICANN's failure to produce a coherent and
universally applicable set of standards for registration of IDNs to prevent
this type of abuse. From a registry perspective, the best solution is probably
that of .ca, which prevents another registrant from buying an accented version
of an existing name.
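A registry-side check along those lines might look like the following Python sketch. It is not the .ca registry's actual implementation: the function names, the dictionary of registered names, and the rule that the original registrant may still claim accented variants are all assumptions made for illustration.

```python
import unicodedata


def strip_accents(label: str) -> str:
    """Reduce a label to its unaccented, lowercase base form."""
    decomposed = unicodedata.normalize("NFD", label.lower())
    # Drop the combining marks that NFD splits off from accented letters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def is_blocked(candidate: str, applicant: str, registered: dict[str, str]) -> bool:
    """True if someone else already holds a name with the same base form."""
    base = strip_accents(candidate)
    return any(
        strip_accents(name) == base and owner != applicant
        for name, owner in registered.items()
    )


# Hypothetical registry contents: name -> registrant.
registered = {"cafe": "alice"}

print(is_blocked("caf\u00e9", "bob", registered))    # variant of alice's name
print(is_blocked("caf\u00e9", "alice", registered))  # alice claiming her own variant
```

Accent stripping only catches variants that decompose into a base letter plus a combining mark. Cross-script homoglyphs such as Cyrillic а would need a separate confusables table.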
Update (June 29, 2018): EURid, the operator of the .eu registry, has issued a notice indicating
that domains using Cyrillic characters will be deleted as of June 1, 2019. The
same organization is requiring domain names with Cyrillic characters to use
the matching .ею TLD instead, which is also controlled by EURid. According to
the organization, the move is part of a requirement forcing domain name owners
to match the script of the TLD with the second-level name in order to avoid
homoglyph attacks.
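The script-matching requirement can be approximated in a few lines of Python. This sketch is a simplification and does not model EURid's actual rules: it assumes the first word of a character's Unicode name identifies its script, ignores digits and hyphens, and uses a hypothetical example domain.

```python
import unicodedata


def scripts(text: str) -> set[str]:
    """Scripts used by the letters in text, taken from Unicode character names."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in text
        if ch.isalpha()
    }


def script_matches_tld(domain: str) -> bool:
    """True if every letter left of the TLD uses a script the TLD also uses."""
    second_level, _, tld = domain.rpartition(".")
    return scripts(second_level) <= scripts(tld)


print(script_matches_tld("пример.ею"))   # Cyrillic name under the Cyrillic TLD
print(script_matches_tld("пример.eu"))   # Cyrillic name under the Latin TLD
print(script_matches_tld("example.eu"))  # Latin name under the Latin TLD
```

Under this strict version, only the second case fails.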
A report in The Register noted
that this is inconsistent, as the policy still allows the use of any letter of the
Greek alphabet, as well as accented characters from multiple European
languages, including the "German ü, the Romanian ș, and the Swedish å."
Political questions aside, this is good in terms of
minimizing phishing attacks, but still insufficient for differentiating
characters. In Greek, omicron (ο) and, in certain fonts, nu (ν) are the closest
matches to ASCII characters. Slightly more abstract matches also exist:
taken in order, εικηρτυωχγ resembles eiknptuwxy to a degree, with larger variances
depending on the fonts involved. Accented characters are too numerous to
list. While these variants do exist, the further attackers stray from the
intended character, the more likely the result is to look like a ransom note.
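The Greek lookalikes listed above can be turned into a small lookup table. The Python sketch below maps each Greek letter to the Latin letter it resembles and compares the result against a target name. The table covers only the pairs mentioned in the text, the function names are hypothetical, and the example word is made up for illustration.

```python
# Greek letters from the text, paired position by position with the
# Latin letters they resemble (omicron and nu are appended at the end).
GREEK_LOOKALIKES = str.maketrans("εικηρτυωχγον", "eiknptuwxyov")


def latin_skeleton(label: str) -> str:
    """Replace known Greek lookalikes with their Latin counterparts."""
    return label.lower().translate(GREEK_LOOKALIKES)


def could_impersonate(candidate: str, target: str) -> bool:
    """True if candidate differs from target but collapses to the same skeleton."""
    return candidate.lower() != target.lower() and latin_skeleton(candidate) == target.lower()


# "photo" with Greek rho (U+03C1) for p and Greek tau (U+03C4) for t.
candidate = "\u03c1ho\u03c4o"

print(latin_skeleton(candidate))
print(could_impersonate(candidate, "photo"))
```

A table like this only says that two names share a skeleton. Whether the substitution would actually fool anyone depends on the font, as noted above.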
As it is, the most inclusive way to preserve
the intended display of IDNs while keeping users secure is to change
the color of non-ASCII characters.
While the practice of mass deletion is generally
abnormal for a registry to engage in, the European Commission issued a notice
in March that registrants of .eu domain names within the United Kingdom will
lose their eligibility to hold .eu

