Hal Berghel*+, James Carpinter +, and Ju-Yeon Jo*
* School of Informatics and Internet Forensics Laboratory
University of Nevada, Las Vegas
+ Department of Computer Science and Software Engineering,
University of Canterbury,
Christchurch, New Zealand
Phishing attacks attempt to fraudulently solicit sensitive information from a user by masquerading as a known trustworthy agent. Spoofed emails in association with fake websites are commonly used in phishing attacks to coerce a user into revealing personal financial data. Phishing is now a serious crime that has been embraced even by organized crime. Criminals have adopted well-developed, well-known techniques and are exploiting Internet users with sophisticated phishing attacks. Phishers are estimated to have successfully attacked 1.2 million users and stolen US$929 million in the twelve months to May 2005.
This article surveys the current status of phishing techniques and the defenses against them. We first review the fundamental techniques for delivering phishing messages, such as bulk emailing combined with fake websites, and the obfuscation techniques used to avoid detection. We then survey more sophisticated methods that may deceive even knowledgeable and vigilant users. These techniques do not rely on naïve email delivery and simple websites, but use very realistic fake websites, employ generic hacking techniques such as DNS poisoning or cross site scripting, or actively exploit browser vulnerabilities. For example, a man-in-the-middle attack or DNS poisoning can easily fool even advanced users who are well aware of URL obfuscation.
Quite a few defensive methods have been developed, although many are still at an early stage of development. URL obfuscation can be detected fairly reliably using analysis algorithms. Fake websites can also be detected automatically, with a low false positive rate, by comparing them with the real websites. Clients can use anti-phishing-capable devices or software such as anti-virus, anti-spam, anti-spyware, or IDS tools. Web browsers can be armed with anti-phishing plug-ins such as Spoofstick or SpoofGuard. In recognition of the seriousness of phishing attacks, diverse efforts are now being made in user education, reporting and response, and the law.
With all of these technical and social efforts, the outlook against phishing is not entirely bleak. If organizations prepare well, remain vigilant and follow attack trends carefully, they can respond quickly and effectively with a range of available techniques to defend their customers' data. If individuals take responsibility for their own protection and adopt a defense-in-depth approach, they can defend themselves against the most sophisticated attacks. Although there is no simple solution, active and aware users and organizations have the ability to form a stranglehold on this ever-growing threat.
1.2 Current status
1.3 Phishing illustrated
2 Core Phishing Techniques
2.1 Bulk emailing combined with fake websites
2.2 Alternative delivery techniques
2.2.1 Web-based delivery
2.2.2 IRC and instant messaging (IM)
2.3 Obfuscation techniques
3. Advanced Phishing Techniques
3.2 Man-in-the-middle attacks
3.3 Website-based exploitation
3.4 Server-side exploits
3.5 Client-side vulnerabilities
3.6 Context aware attacks
3.7 Empirical results
4. Anti-Phishing Techniques
4.1 Detecting phishing attacks
4.3 Client-side security measures
4.4 Web browser enhancement
4.5 Server-side security measures
4.6 Alternative authentication
4.7 Email security
5 Comprehensive Anti-phishing Efforts
5.1 User vigilance and education
5.2 Proactive detection of phishing activities
5.3 Reporting and response
Phishing attacks attempt to fraudulently solicit sensitive information from a user by masquerading as a known trustworthy agent [ 5 , 60 ]. They most commonly use ‘spoofed' emails in association with fake websites in order to coerce a user into revealing personal financial data, such as credit card numbers, account user names and passwords, or social security numbers [ 49 ]. By hijacking brand names of banks, e-retailers and credit card companies, phishers often convince recipients to respond [ 5 ]. Phishing attacks range in sophistication, from simply fooling a user with a seemingly legitimate communication, to deliberately exploiting weaknesses in software to prevent users from determining the true nature of the attack.
The idea of obtaining user information through fraudulent means is not unique; phishing is merely a subset of two larger problems that exist in both the electronic and ‘real-world’ domains:
Phishing shares many characteristics with two similar techniques: pharming and the abuse of alternate data streams. Both require a higher level of skill to execute successfully than simple phishing schemes. Pharming is a more active form of phishing, with the user automatically directed away from the legitimate website to the fraudulent website without warning [ 9 ]. Alternate data streams can be used to secretly associate hostile executables with legitimate files; this is effectively ‘file phishing' [ 8 ] . With minimal effort, a hidden executable can be masked and its function obscured. Like phishing, the resulting environment is not entirely as it appears.
The word ‘phishing’ is a derivative of the word ‘fishing’ and describes the process of using lures to ‘fish’ for (i.e. obtain) sensitive user information [ 2 ]. Exchanging ‘f’ for ‘ph’ is a common hacker replacement; it is most likely an acknowledgement of ‘phreaking’, the original form of hacking. Phone phreaking involved sending specific tones along a phone line to manipulate phone switches, which allowed free long distance calls or the billing of services to other accounts.
The first recorded use of the term ‘phishing’ was in January 1996, in a posting to the alt.2600 newsgroup by firstname.lastname@example.org. It referred to the theft of AOL user accounts [ 13 ] by scamming passwords from unsuspecting users. The technique itself predates that posting: AOL users were already being targeted via instant messages sent by users masquerading as AOL staff members, who would request a user's account details [ 60 ]. By 1995, AOL software contained a ‘report password solicitation’ button, which gives an indication of the magnitude of the threat.
Those seeking free AOL accounts initially took advantage of poor credit card validation techniques and used algorithmically generated credit card numbers to acquire accounts that could last up to a month. They turned to phishing for legitimate accounts after AOL brought in measures in 1995 to prevent this type of behavior. Hacked accounts were referred to as ‘phish’ and by 1997, phish were being actively traded as a form of electronic currency [ 46 ]. For example, phish could be traded for hacker software or ‘warez’.
Since that time, the definition of phishing has widened to cover not only obtaining user account details, but also obtaining access to all personal and financial data. The sophistication of the field has also grown: modern schemes go far beyond simple instant messages, and typically target thousands of users using mass mailings and fake websites.
Phishing is now more than a mere annoyance: it is a common online crime that is relatively easy to perform, has a low chance of being caught, and has a potentially very high reward [ 27 ]. It is for these reasons that phishing has been embraced by organized crime, both in the United States and in Eastern Europe (particularly in Russia and the former Soviet bloc). It is also believed [ 23 ] that terrorist sympathizers, operating out of Africa and the Middle East, are using phishing to steal identities and cash.
Phishers typically send out emails in bulk in the hope that some naïve recipients will respond. Although the majority of recipients are suspicious of such emails, some are taken in by the scam. Phishers successfully attacked an estimated 1.2 million users, at an estimated cost of US$929 million, in the twelve months to May 2005 [ 38 ]. U.S. businesses lose an estimated $2 billion a year as their clients become victims [ 39 ]. The Anti-Phishing Working Group [ 3 ] received 20,109 reports of phishing scams in May 2006, primarily targeting financial institutions (92% of all reports). In the year to May 2006, the average growth rate for phishing attacks was 34% [ 3 ].
There are several steps that phishers follow. Two examples are illustrated here, for posers and mongers respectively. The posers are the bottom-feeders in the phishing community that exhibit a very low level of sophistication. The phish mongers are those who deploy these phish scams in such a way that they stand a measurable chance of success against a reasonably intelligent and enlightened end-user [ 7 ].
Effective phishing requires that the bait:
Figure 1 : Phishing email that satisfies our five effectiveness criteria
Figure 1 is modeled after some live phish captured on the net and meets all of the five criteria identified above. First, the email looks real – at least to the extent that it betrays nothing suspicious to a typical bank customer (a.k.a. target-of-opportunity). The graphic appears to be a reasonable facsimile of a familiar logo, and the salutation and letter are what we might expect in this context. Second, the target is the subset of recipients who are Bank of America customers. The fact that the majority of recipients are not customers is no deterrent, because there is no penalty for over-phishing in the Internet waters. Third, the request seems entirely reasonable and appropriate given the justification. Customers reason that if they were a bank, they might do the same thing. Fourth, the URL-link seems to be appropriate to the brand. Unwary internet users might readily trade off any lingering disbelief for the opportunity to correct what might be a simple error that could adversely affect use of a checking or credit card account. The link to “verify.bofa.com” may be assumed to take us to an equally plausible Web form that would request an account name and password or PIN.
The unwary in this case is M. Jones whose harvested Web form appears to the phisherman as in Figure 2 . This is a screenshot of an actual phishing server in our lab.
Figure 2 : Phishing from phisherman's perspective
In order to complete the scam the fifth condition must apply. In this case, after the private information is harvested, the circle is completed when the phishing server redirects the victim to the actual bank site. This has the effect of keeping the bank's server logs roughly in line in case someone makes an inquiry of the help desk. Figure 3 illustrates this activity.
Figure 3 : Phish Clean-up
Mongers employ more sophisticated schemes. Look carefully at the cursor in Figure 4 . The cursor seems to be sensing the link even though it is not particularly close to it. The fact is that it is not sensing the link at all, but rather an image map.
Figure 4 : Phish Mongering
A quick review of the source code, below, leads us to a veritable cornucopia of trickery.
<html><p><font face="Arial"><A HREF="https://signin.ebay.com/ws/eBayISAPI.dll?SignIn&sid=verify&co_partnerId=2&siteid=0"><map name="xlhjiwb"><area coords="0, 0, 646, 569" shape="rect" href="http://218.1.XXX.YYY/.../e3b/"></map>
<img SRC="cid:part1.04050500.04030901@firstname.lastname@example.org" border="0" usemap="#xlhjiwb"></A></a></font></p>
<p><font color="#FFFFF3">Barbie Harley Davidson in 1803 in 1951 AVI
Several features make it interesting. First, the image map coordinates take up nearly the whole page. Second, the image that is mapped is the actual text of the email. So what appeared to be email was just a picture of email. Thus, the redirect was not a secure connection to eBay at all, as it appeared, but an insecure connection to 218.1.XXX.YYY/.../e3b/ . While Windows users frequently see the “dots of laziness” when a path expression is too long for the path pane in some window, this is not a Windows path in a path pane. These “dots of laziness” are a directory name. It is not clear why someone would create a directory named “…” as it certainly falls short of the mnemonic requirements most of us learned in intro to programming.
On the other hand, it might blend in stealthily with the other *nix hidden files, “.” and “..”, and possibly escape an onlooker's suspicion. This suggests that the computer at the end of 218.1.XXX.YYY may not be the phisher at all, but another unsuspecting victim whose computer has been compromised (for that reason, the final two octets of the IP address have been concealed). Another sign of intrigue is the font color of almost pure white “ #FFFFF3 ” for “Barbie Harley Davidson in 1803 in 1951 AVI”. Though their names are sullied, neither Barbie nor Harley Davidson had anything to do with this scam. This white-on-white hidden text is there to throw off the Bayesian analyzers in spam filters. As the email text is actually a graphic, the Bayesian analysis likely concludes that this is about Barbie and her Harley, given that it has no other text on which to base its decision. As opposed to the posers, this phish monger is moderately clever.
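The near-white text trick described above lends itself to a simple mechanical check. The following is a minimal sketch in Python, not a description of any particular spam filter: it assumes the message is rendered on a white background, it only inspects `<font color="#RRGGBB">` attributes, and the cut-off of 240 per color channel is an arbitrary illustrative value. The names `HiddenTextFinder` and `find_hidden_text` are hypothetical.

```python
from html.parser import HTMLParser


def _parse_hex_color(value):
    """Return (r, g, b) for a '#RRGGBB' string, or None if it is not one."""
    value = value.strip().lstrip("#")
    if len(value) != 6:
        return None
    try:
        return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))
    except ValueError:
        return None


class HiddenTextFinder(HTMLParser):
    """Collect text inside <font color=...> tags whose color is close to white."""

    def __init__(self, threshold=240):
        super().__init__()
        self.threshold = threshold  # assumed cut-off, chosen for illustration
        self.stack = []             # one flag per open <font>: is it near-white?
        self.hidden = []            # text fragments found inside near-white fonts

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            rgb = _parse_hex_color(dict(attrs).get("color") or "")
            self.stack.append(rgb is not None and min(rgb) >= self.threshold)

    def handle_endtag(self, tag):
        if tag == "font" and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if any(self.stack) and data.strip():
            self.hidden.append(data.strip())


def find_hidden_text(html, threshold=240):
    """Return the list of text fragments that would be near-invisible on white."""
    finder = HiddenTextFinder(threshold)
    finder.feed(html)
    finder.close()
    return finder.hidden
```

Applied to the sample above, the decoy phrase is the only text reported, which is itself a warning sign: a message whose only machine-readable text is invisible to the reader deserves suspicion.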
So far, the most common phishing attacks have been illustrated. The rest of this article is organized as follows. Section 2 explains the fundamental techniques used in phishing, such as bulk emailing, alternative delivery techniques, and obfuscation techniques for masking fake websites. Section 3 addresses advanced phishing techniques, including malware, man-in-the-middle attacks, and website-based attacks. Section 4 discusses anti-phishing techniques. However, technical solutions are only part of the picture, and section 5 examines more comprehensive anti-phishing efforts. Section 6 concludes the article.
In order to achieve their goals, phishers typically use a mixture of two techniques: social engineering and technical subterfuge [ 5 ]. Social engineering is the primary technique used and appears to some extent in most attacks. Arguably, the use of this technique distinguishes phishing from other forms of electronic fraud. Technical subterfuge exploits software-based weaknesses in both servers and clients in order to mask the true nature of the transaction from the victim or plants crimeware onto PCs to steal credentials directly (often using Trojan keylogger spyware). Pharming crimeware misdirects users to fraudulent sites or proxy servers, typically through DNS hijacking or poisoning [ 5 ].
Both techniques are employed in the pursuit of the same goal: the victim must be convinced to perform a series of steps to reveal confidential data. However, these two techniques seek to attack from opposite ends of the spectrum: one targets human weaknesses, the other technical vulnerabilities. In this section and the next, how phishers exploit these vulnerabilities is examined.
A basic phishing scheme needs four elements: a bulk mailing tool, a standard email, a ghost (fake) website and a database of email addresses [ 20 , 49 ]. Typically, the ghost website is set up, and then the bulk email tool distributes the phishing email to all those addresses in the email database. The most successful phishing scams have genuine looking content in both their e-mail (if used) and the fake website. This includes:
The standard email sent is branded so it appears as though it was sent by a trusted and reputable party (e.g. a financial institution). The most commonly spoofed companies include Citibank, eBay and PayPal. It is likely that not all those users within the email database will have accounts with the spoofed organization; this is somewhat unavoidable and it can reveal the operation of a phishing attack. A number of techniques can be used within the email [ 20 , 34 , 46 , 49 ]:
The key objective of the email is to create a plausible premise that persuades the user to release personal information. The contents of the email must be designed to elicit an immediate user reaction, which prompts them to follow the enclosed link to the website (see Figure 5 ). For example, the email may [ 20 ]:
Ironically, many phishing emails take advantage of the user's fear of online fraud [ 20 ], using a premise that requires users to update their information due to a security system upgrade or similar (see Figure 5 ). While there are many different approaches, each must create a scenario to convince the user to provide the requested information in a timely manner (i.e. before the phishing site is shut down).
Figure 5 : an example phishing email recorded by the Anti-Phishing Working Group ( 10/05/2005 ).
Using a bulk email tool and an email database, the standard email can be sent to millions of legitimate, active email addresses within a few hours. The use of a network of trojaned machines can speed the process up considerably [ 34 , 46 ]. The email database can be acquired from a number of sources on the internet, for free or for a fee. Such vendors target their email databases at spam distributors; however, the databases they distribute are equally useful for both purposes.
A ghost, or fake, website is typically hosted by a hijacked PC, compromised by other means [ 34 , 46 , 49 ]. A mechanism will need to be in place to facilitate information retrieval (by the phisher); it is speculated that anonymous login or email may be used for this activity. Ideally, this machine would reside in a different country from that of the legitimate website's organization, as this increases the difficulties involved in shutting the phishing website down. The domain name and email URL are designed to prevent the target from noticing they are transacting with a ghost website rather than the legitimate site. Subtle character replacements can achieve this: for example, www.paypal.com could be imitated using www.paypa1.com or www.paypal.cc . More sophisticated methods will be discussed in section 2.3 (URL obfuscation).
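The character replacements mentioned above (a digit standing in for a letter, or a different top-level domain) can be caught with a simple normalization step. The sketch below is illustrative only: the substitution table `CONFUSABLES`, the protected list `LEGITIMATE` and the function names are hypothetical, and the domain splitting is deliberately naive (it assumes a single-label top-level domain such as .com or .cc).

```python
# Hypothetical substitution table: digits commonly used in place of letters.
CONFUSABLES = str.maketrans({"1": "l", "0": "o", "5": "s"})

# Hypothetical list of legitimate domains to protect.
LEGITIMATE = frozenset({"paypal.com"})


def registrable_parts(hostname):
    """Naively split 'www.paypal.com' into ('paypal', 'com')."""
    labels = hostname.lower().rstrip(".").split(".")
    if len(labels) < 2:
        return hostname.lower(), ""
    return labels[-2], labels[-1]


def imitated_domain(hostname, legitimate=LEGITIMATE):
    """Return the legitimate domain that hostname appears to imitate, or None."""
    name, tld = registrable_parts(hostname)
    if f"{name}.{tld}" in legitimate:
        return None  # this is the genuine domain
    normalized = name.translate(CONFUSABLES)
    for domain in legitimate:
        legit_name, _ = registrable_parts(domain)
        if normalized == legit_name:
            return domain  # same name after normalization, so a likely imitation
    return None
```

With these assumptions, both `imitated_domain("www.paypa1.com")` and `imitated_domain("www.paypal.cc")` return `"paypal.com"`, while the genuine `www.paypal.com` returns `None`.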
The content of the website is likely to be a near-exact copy of the legitimate site, updated to allow the phisher to record user details. The ghost website is also likely to contain introduction pages, processing pages and pages thanking the user for submitting their data, in a further attempt to increase authenticity. It may also use a legitimate server-side certificate, signed by Verisign or similar, issued to the ghost website. Alternatively, it could use an unsigned certificate under the assumption that most users will be unable to interpret the security warning (if the security warning has not already been disabled). Even invalid or fake certificates are likely to make users feel more secure. The absence of SSL/TLS security may alert some users to the true nature of the website; however, security indicators within the user's browser can potentially be forged using browser exploits (see Figure 6 ).
Figure 6 : a phishing website targeted at PayPal customers from the Anti-Phishing Working Group.
Upon submitting their details to the ghost website, the user is often redirected to the legitimate site to encourage the user to continue to believe they have revealed their personal data to a legitimate organization. Alternatively, the phisher may use a post-submission page to encourage the user not to access or use their accounts for a specific timeframe, in order to mask the phisher's exploitation of their sensitive information (e.g. use of a credit card number). It is critical that the user does not realize they have submitted their data to an illegitimate organization. If this occurs, the personal data can be quickly rendered useless (e.g. their password will be changed or accounts closed).
Communication via email remains the most common and successful form of attack; however, other electronic communication mechanisms are becoming increasingly popular, such as web pages, IRC and instant messaging [ 46 ]. In all forms of communication, the phisher must imitate a trusted source in order for the victim to release their information.
Rather than distributing the malicious URL (or similar) via email, an increasingly popular technique is to place it in website content [ 46 ]. The website itself can be hosted by the phisher, or by a third party host (which could be acquired freely, for a fee, or via a Trojan horse attack). The level of sophistication varies: a malicious URL could be disguised and placed on a popular website or comment board, or a website developed for the express purpose of luring in potential victims.
If a specialist website is employed to lure victims, the phisher may employ several techniques [ 46 ]:
In order to attract users to their website, fake banner advertising could be employed. Banner images belonging to the company the phisher is attempting to mimic can be placed on popular websites and direct users to the phisher's website, rather than the legitimate website. Standard URL obfuscation techniques can be used to hide this subtle redirection from the user. Many vendors provide online registration for banner advertising; with a stolen credit card (or similar), a phisher can easily acquire advertising while remaining concealed from law enforcement agencies.
While these techniques were popular in the early days of phishing, email has become the technique of choice for modern-day phishers. However, it is predicted [ 27 , 46 ] that these techniques will become more popular in the future, given that these technologies are popular with home users and are steadily gaining in complexity. Embedded dynamic content, such as multimedia, graphics, and URLs, can now be sent with many IM programs, allowing standard email and web-based phishing techniques to be easily mapped to this domain. Automated bots that interact unsupervised with IRC participants could also be used by phishers to coerce users into visiting their fake websites.
In addition to the techniques previously mentioned, phishers have other techniques to deliberately disguise the true nature of the message from the recipient.
URL obfuscation is an essential part of most phishing attacks. It fools the user into believing they are following a link to a legitimate website; in actual fact, they are being transported to the phisher's fake website. The simplest technique for URL obfuscation uses HTML; the legitimate website's address is displayed to the user in plain text, but the link is targeted at the phisher's website. For example:
is displayed to the user as:
but links to http://www.evilsite.com (see Figure 5 ). This technique would fool a basic user, who may not be aware that the displayed address of a hyperlink can differ from its target. Other URL obfuscation techniques include [ 20 , 46 , 49 ]:
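The mismatch between a link's displayed address and its real target, described above, is straightforward to test for. The following Python sketch is one possible approach under stated assumptions, not an account of how any existing filter works: it only considers anchors whose visible text itself looks like a web address, and the names `host_of`, `LinkAuditor` and `find_mismatched_links` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


def host_of(text):
    """Return the host if text looks like an http(s) or bare 'www.' address, else None."""
    text = text.strip()
    if text.lower().startswith("www."):
        text = "http://" + text
    parsed = urlparse(text)
    if parsed.scheme in ("http", "https") and parsed.hostname:
        return parsed.hostname  # urlparse lower-cases the host
    return None


class LinkAuditor(HTMLParser):
    """Record (displayed host, target host) pairs that disagree."""

    def __init__(self):
        super().__init__()
        self.href = None      # href of the anchor currently open, if any
        self.text = []        # visible text collected inside that anchor
        self.mismatches = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href")
            self.text = []

    def handle_data(self, data):
        if self.href is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            shown = host_of("".join(self.text))
            target = host_of(self.href)
            if shown and target and shown != target:
                self.mismatches.append((shown, target))
            self.href = None


def find_mismatched_links(html):
    auditor = LinkAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.mismatches
```

A link whose text is ordinary wording (for example "Click here") is ignored by this sketch, since there is no displayed address to compare against; other heuristics would be needed for that case.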
Given the bulk nature of these emails, and the threat they pose to users, most organizations opt to treat them as spam, and filter them before they reach users. Several techniques can be used by phishers to make the filtering task more difficult, and therefore reach more potential victims [ 46 ]:
Figure 7 : an example from the Anti-Phishing Working Group that illustrates the use of hidden text in order to avoid detection by spam filters.
Malware is a term used to describe any form of malicious software, including viruses, worms, trojans and others. For phishers, this software represents a new route to defraud their victims that may complement or even replace the social engineering techniques that phishing often relies upon. The potential for fraud here is greater: rather than asking the victim for their information, they simply take it. The complete replacement of social engineering in a phishing attack with malware arguably represents an entirely different class of attack. However, much malware still relies on the targeted user to approve its installation and/or execution; it is for this reason social engineering is likely to remain a core skill relied upon by phishers.
Known malware worms used for phishing include [ 42 ]:
Interestingly, Brazilian banks appear to be over-represented in malware-based phishing schemes. In the six months to March 2004, twenty different malware applications were identified that targeted Brazilian banks [ 42 ].
Malicious users have long used software designed to log keystrokes and record screen captures to obtain sensitive data. These utilities are being employed more frequently in phishing attacks. These utilities can remain on a user's computer for an indefinite amount of time, and can record a far greater amount of information than any one basic phishing attack. Depending on the extent to which the attacker is willing to analyze the recorded logs, account information from a variety of sites can be harvested (rather than a single account typically recorded by a standard phishing scheme). Given the volume of data these techniques can potentially generate, the attacker has three options to retrieve recorded information:
Key loggers record all keystrokes entered by a user. With the use of appropriate filtering techniques, the attacker can isolate credentials used to access various online services. They vary in sophistication: some will record all key presses, while others will only record key presses entered in the web browser. The Anti-Phishing Working Group [ 3 ] recorded 215 phishing attacks in May 2006 (out of a total of 20,109) that used key logging malware.
Screen capture utilities were, in part, a response to advanced anti-key logging techniques used by some organizations; they record the other primary form of user input. These utilities record a screen image, or part of one (i.e. the relevant observational area, such as the authentication area of a particular website), on a regular basis. Partial screen captures minimize the size of the required upload to the attacker. Such techniques are successful against organizations such as Barclays Bank, which requires users to select several randomly chosen characters of their ‘memorable word’ from drop-down lists (e.g. the second and fourth letters) as part of its second-tier login process. This selection scheme is an example of an anti-key logging technique.
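The ‘memorable word’ scheme described above can be sketched as a challenge of randomly chosen character positions. The code below is a simplified illustration, not the bank's implementation: it assumes the server can read the stored word (real systems would need to protect it), and the function names and the sample word are hypothetical.

```python
import hmac
import secrets


def make_challenge(word_length, count=2):
    """Pick `count` distinct 1-based character positions, in ascending order."""
    positions = secrets.SystemRandom().sample(range(1, word_length + 1), count)
    return sorted(positions)


def check_response(memorable_word, positions, selected):
    """Compare the characters chosen from the drop-down lists with the stored word."""
    expected = "".join(memorable_word[p - 1] for p in positions)
    supplied = "".join(selected)
    # Constant-time comparison, so timing does not reveal partial matches.
    return hmac.compare_digest(expected.encode(), supplied.encode())
```

Because the characters are picked from drop-down lists and the positions change on each login, a key logger captures little of use, which is why attackers turned to screen captures instead.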
Phishers and spammers typically share some commonalities: both typically want to distribute substantial amounts of email quickly and without being detected. There is some evidence [ 49 ] that these groups are exchanging techniques. Some techniques applicable to phishing are [ 42 ]:
The principles behind man-in-the-middle attacks are simple: the attacker acts as an intermediary between the victim and the legitimate site and records the information exchanged between the two parties [ 46 ]. The attacker achieves an ideal vantage point on the transaction, and can potentially remain unnoticed by both parties. The idea behind this attack is not unique to this domain: it is used throughout network security (e.g. TCP hijacking).
The intermediary machine utilized by the attacker is referred to as the proxy. Ideally, it is transparent: it does not affect the communication between the legitimate parties, and is not easily detected. Such proxies can be located on the same network segment as the target, or en route to the legitimate website. To ensure the client routes traffic through the proxy, browser proxy settings can be overridden (either by a software exploit, or through the use of social engineering); however, this change is visible to the client. Proxy configuration is generally performed before the phishing email message is sent: this ensures the transmitted data is recorded if the user follows the enclosed link.
This form of attack can be successful for both HTTP and HTTPS (i.e. SSL/TLS) connections [ 46 ]. SSL/TLS provides application-level security between the client and the legitimate website; standard proxies between these two parties can only record the cipher text. However, if the phishing email can ensure the user connects to the proxy, rather than legitimate website, their data can be recorded. URL obfuscation techniques are useful in achieving this. The proxy passes all of the user's requests to the legitimate website, and responses from the legitimate website are passed back to the user. In the case of a SSL/TLS connection, a secure connection is established between the proxy and the legitimate website. A secure connection can also be maintained between user and the proxy via the methods described previously.
Using the legitimate website to process information submitted by the victim also aids the phisher as it allows invalid data to be discarded. It not only makes the phisher's job of identifying valid accounts easier but it also makes the site appear more authentic to the user [ 20 ].
DNS cache poisoning [ 46 ] attempts to corrupt the local cache maintained by a specific DNS server. When a user requests the IP address of a domain name, the request is forwarded to the DNS server. If the DNS server does not have the IP address of the domain in its cache, it will query an authoritative domain name server for the information. The BIND attack, an example of DNS cache poisoning, requires the attacker to spoof the reply from the authoritative name server; in the reply, the attacker can set the IP address of the queried domain to any desired machine. By exploiting DNS vulnerabilities, the phisher could potentially redirect traffic directed at a site such as www.citibank.com to their fake website. DNS cache poisoning can be particularly effective, as most ISPs operate one DNS server for all of their subscribers. If the network's DNS server is poisoned, all of the ISP's customers will be redirected to the fake website.
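Why a single spoofed reply can affect every subscriber is easier to see in a toy model of a caching resolver. The sketch below is a deliberate simplification and not a real DNS implementation: it assumes replies are matched to queries by a transaction ID alone, and the class name, the transaction IDs and the IP addresses (taken from documentation ranges) are all hypothetical.

```python
class ToyCachingResolver:
    """A toy model of a caching resolver shared by many users."""

    def __init__(self):
        self.cache = {}    # domain -> IP address
        self.pending = {}  # domain -> transaction ID of the outstanding query

    def lookup(self, domain, txid):
        """Return the cached address, or note an outstanding upstream query."""
        if domain in self.cache:
            return self.cache[domain]
        self.pending[domain] = txid
        return None

    def receive_reply(self, domain, txid, address):
        """Cache the first reply whose transaction ID matches the pending query."""
        if self.pending.get(domain) != txid:
            return False
        self.cache[domain] = address
        del self.pending[domain]
        return True


resolver = ToyCachingResolver()
resolver.lookup("www.citibank.com", txid=4242)                     # cache miss
resolver.receive_reply("www.citibank.com", 1, "203.0.113.66")     # wrong ID: rejected
resolver.receive_reply("www.citibank.com", 4242, "203.0.113.66")  # spoofed reply arrives first
resolver.receive_reply("www.citibank.com", 4242, "192.0.2.10")    # genuine reply: too late
```

After this sequence, every later `lookup` for the domain is answered from the cache with the spoofed address, regardless of which user asks; this is the sense in which one poisoned server redirects all of an ISP's customers.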
HTML frames can be used to obscure attack content. They enjoy wide browser support and are simple to use, and therefore are ideal for phishing websites. For example:
<frameset rows="100%,*" framespacing="0">
<frame name="real" src="http://www.citibank.com" scrolling="auto">
<frame name="hidden" src="http://fakesite.com" scrolling="auto">
</frameset>
The legitimate Citibank site is all that is viewable within the browser window; however, this code snippet also loads HTML from fakesite.com. The additional code could [ 46 ] :
Hidden frames can also hide the address of the phisher's content server. Only the URL of the document containing the frameset will be accessible from the browser interface (e.g. from the location bar or the page properties dialog).
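One way to surface hidden frames of this kind is to list the host behind every frame in a page and flag pages that draw frames from more than one host. The Python sketch below is a rough heuristic under that assumption (legitimate pages also embed third-party frames, so a hit is a prompt for inspection rather than proof); `FrameLister`, `frame_hosts` and `loads_multiple_hosts` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class FrameLister(HTMLParser):
    """Collect (frame name, source host) for every <frame> and <iframe>."""

    def __init__(self):
        super().__init__()
        self.frames = []

    def handle_starttag(self, tag, attrs):
        if tag in ("frame", "iframe"):
            attributes = dict(attrs)
            host = urlparse(attributes.get("src") or "").hostname
            self.frames.append((attributes.get("name"), host))


def frame_hosts(html):
    lister = FrameLister()
    lister.feed(html)
    lister.close()
    return lister.frames


def loads_multiple_hosts(html):
    """True if the page's frames are served from more than one host."""
    hosts = {host for _, host in frame_hosts(html) if host}
    return len(hosts) > 1
```

For the frameset shown earlier, `frame_hosts` reports the `real` frame on www.citibank.com and the `hidden` frame on fakesite.com, so `loads_multiple_hosts` returns `True`.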
var d = document;
// Write an absolutely positioned <DIV> that is stacked above the page content (z-index:2)
d.write('<DIV id="fake" style="position:absolute; left:200; top:200; z-index:2">' +
        '<TABLE width=500 height=1000 cellspacing=0 cellpadding=14><TR>');
d.write('<TD colspan=2 bgcolor=#FFFFFF valign=top height=125>');
Users are increasingly aware of the visual clues that mark a secure and legitimate site [ 46 ] . For example: the https identifier at the beginning of the URL, the URL itself, the zone of the page source (e.g. My Computer, Trusted, Internet etc), and the padlock icon somewhere in the browser (indicating secure SSL/TLS communication). These visual clues can sometimes be difficult to mimic using traditional techniques; however, specially created graphics can be loaded and positioned over specific areas of the browser ‘chrome' (the window frame, menus, toolbars, scroll bars and other widgets that comprise the browser user interface) using scripting languages.
For graphical substitution to be successful, the graphics must be consistent with the browser. It is trivial to detect the browser the user is using; from this information, the correct graphics can be overlaid. Areas of interest for graphical overlays include [ 46 ]:
Figure 8 : this illustrates the use of a pop-up window over the legitimate site in the hopes of increasing the scheme's credibility. Obtained from the Anti-Phishing Working Group.
Figure 9 : This shows another pop-up window over a legitimate webpage. Using scripts, it opens up the real webpage and then opens a bare window popup asking for information
Unlike Internet Explorer and other browsers, Mozilla and Firefox do not compile their graphical user interface into the browser itself. Instead, it is stored as XUL: XML User Interface Language. The XUL data for these browsers is readily available, and can be rendered inside the browser's content area. This could potentially allow a phisher to perfectly mimic the appearance of the browser, but allow them to arbitrarily set the location bar text or SSL/TLS padlock [ 43 ].
Any discussion of the exploitation of server-side vulnerabilities to assist in a phishing attack quickly transcends phishing and enters the realm of general hacking and cracking; this would be somewhat beyond the scope of this document (see [ 34 ] for some additional details). Suffice to say there are numerous techniques for exploiting operating systems, applications and network protocols that a phisher could use if they were determined to compromise a legitimate website in order to conduct a phishing attack. However, two 'non-invasive' techniques of relevance to phishers will be discussed: cross site scripting and preset sessions [ 46 ].
Cross site scripting (CSS or XSS) seeks to inject custom URLs or code into a web-based application data field, and takes advantage of poorly developed systems [ 27 ]. Three techniques are typically used [ 46 ]:
In this example, the standard legitimate website content is rendered, but the web application uses a parameter to identify where to load specific page content (for example the login box); in this case, that content is fetched from fakesite.com (whose URL could be obfuscated using previously described techniques).
In this example, a script to be executed is passed to the web application.
In this example, the script is placed in the URL and executed by the web application.
Preset sessions use session identifiers. Session identifiers are typically used in HTTP and HTTPS transactions as a mechanism for tracking users through the website and to manage access to restricted resources (i.e. manage state). Session IDs can be implemented in a variety of ways; for example, cookies, hidden HTML fields or URL parameters. Most web applications allow the client to define the session ID. This allows the phisher to embed a session ID within the URL (that refers to the legitimate server) sent as part of the initial email [ 46 ]. For example,
Once the email is sent, the phisher polls the legitimate server with the predefined session ID; once the user authenticates against the given session ID, the phisher will have access to all restricted content.
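One common server-side countermeasure to a preset session is to ignore any identifier proposed by the client and to issue a fresh identifier at the moment of authentication. The following is a minimal sketch of that idea, not a description of any particular web framework; the `SessionStore` class and its method names are hypothetical and exist only for illustration.

```python
import secrets


class SessionStore:
    """Hypothetical in-memory session table standing in for a real session layer."""

    def __init__(self):
        self._sessions = {}

    def open(self, client_supplied_id=None):
        # Never adopt an identifier proposed by the client (e.g. one embedded
        # in an emailed URL); always mint a fresh random one instead.
        sid = secrets.token_urlsafe(32)
        self._sessions[sid] = {"user": None}
        return sid

    def login(self, sid, user):
        # Rotate the identifier when the user authenticates, so an ID that was
        # known before login never becomes an authenticated session.
        if sid not in self._sessions:
            raise KeyError("unknown session")
        del self._sessions[sid]
        new_sid = secrets.token_urlsafe(32)
        self._sessions[new_sid] = {"user": user}
        return new_sid

    def user_for(self, sid):
        session = self._sessions.get(sid)
        return session["user"] if session else None
```

With this design, a phisher who polls the server with the identifier planted in the email finds no authenticated session, because that identifier is either never accepted or is discarded at login.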
Any discussion of client-side vulnerabilities is similar to that of its server-side counterpart: there are a multitude of vulnerabilities that a smart phisher could take advantage of in order to execute arbitrary code or to manipulate the browser. Given their exposure to the internet, it is not surprising browsers suffer from a significant number of security vulnerabilities. Most browsers also support a number of plug-ins, each of which carries its own security risks. While patches are typically available in a timely manner, home users are notoriously poor at applying them quickly; therefore, phishers have ample time to exploit most security vulnerabilities, if they choose to do so.
Some past exploits used by phishers include [ 42 , 46 ]:
The real URL : http://email@example.com/phisher.html
What the User sees: http://www.citibank.com
Where the browser goes: http://fakesite.com/fakepage.html
By inserting a %01 string in the username portion of the URL, the location bar displays http://www.citibank.com, while redirecting the user to fakesite.com. Earthlink, Citibank and PayPal were all targeted using this particular flaw.
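The username portion of a URL can be separated from the real host mechanically, which makes this kind of link easy to flag. The sketch below uses Python's standard `urllib.parse`; the `audit_url` helper and the sample URL are illustrative assumptions built from the host names used in this section, not the exact string used in any recorded attack.

```python
from urllib.parse import urlsplit


def audit_url(url):
    """Split a URL into its userinfo and host parts and flag embedded userinfo."""
    parts = urlsplit(url)
    return {
        "userinfo": parts.username,   # text before '@', None if absent
        "host": parts.hostname,       # where the browser actually connects
        "suspicious": parts.username is not None,
    }


# Hypothetical link: the trusted name sits in the userinfo field,
# while the real host follows the '@'.
report = audit_url("http://www.citibank.com%01@fakesite.com/fakepage.html")
print(report["host"], report["suspicious"])
```

A filter could treat any web link that carries userinfo as suspect, since legitimate sites rarely need it in links sent to customers.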
While malware can often be eliminated with a regularly updated antivirus utility, browser (or any client-side) exploits cannot be defended against until a patch is available and applied.
Context aware attacks [ 37 ] manipulate the victim into readily accepting the authenticity of any phishing emails they may receive. The first phase, which may involve interaction with the victim, will be innocuous and not request any sensitive information. Rather, the goal here is to ensure the victim will expect the message sent in the second phase. The second phase marks the dispatch of the actual phishing email; however, the email is expected by the victim, and therefore more likely to be considered authentic. The actions suggested in the phishing email would often arouse suspicion in the victim if viewed in isolation, but the preset context allows this to be avoided. Jakobsson [ 37 ] presented a context aware phishing scenario to 25 users, and recorded a 46% success ratio.
A simple example of a context aware attack involves targeting an eBay seller [ 37 ] (also see Figure 10 ). Firstly, a seller is located who has an active auction and accepts payments via PayPal (but preferably not by credit card). At the end of the auction, a spoofed message is sent by the phisher from PayPal, indicating the successful buyer has paid for the goods won at the auction, but using a credit card (which the seller does not support). The email gives the seller two choices: either reject the payment, or upgrade their account to support credit card transactions; both these options require the seller to log into their account. By embedding an obfuscated URL to a fake website within the email, the phisher can easily record the seller's credentials. In this situation, the seller was expecting an email confirming payment; therefore, the spoofed email is expected, and is therefore viewed with less skepticism.
Figure 10 : This email is particularly well done, and illustrates a context-aware attack. On arriving at the site, the user is presented with a pop-up over the legitimate site, which gives the user the option of changing account details. It is not coercive and therefore not suspicious. They accept its legitimacy, as they require the ability to change their details. Obtained from the Anti-Phishing Working Group.
Dhamija & Tygar [ 17 ] characterize the most common successful techniques employed by phishers. They reviewed the phishing attacks archived by the Anti-Phishing Working Group [ 4 ] over a period from September 2003 to mid 2005. Their findings were consistent with what is known about phishing: these attacks exploit the human tendency to trust certain brands and logos, and many phishing schemes prey on the widespread sense that the internet is unsafe and that users must take the steps suggested by the attacker to ensure the security of their data. Furthermore, they concluded that the effectiveness of phishing schemes is raised when users cannot reliably verify security indicators. Unfortunately, this is often the case, as browsers have generally not been designed with security usability in mind. More specifically, they identified the following phishing techniques as particularly serious:
In related work, Friedman et al. [ 25 ] established that users found it difficult to determine whether a connection was secure under normal conditions. Intentional phishing and spoofing attempts will only make this task more difficult.
The realm of phishing techniques is large and constantly expanding [ 16 ]; however, anti-phishing systems are not commonplace. Dhamija & Tygar [ 17 ] identify five basic principles that illustrate why designing secure interfaces is difficult:
Figure 11 : Security warning pop-up message
The authors argue that any complete phishing solution should fulfill all of these goals. In the following sections, we discuss available technical solutions for thwarting phishing attacks.
Wenyin et al. [ 58 ] propose a system for detecting phishing websites based on visual similarity. By examining the similarities between text, images, overall layout and overall style, an overall measure of similarity is produced. Experimental results indicate a low level of false positives based on a collection of 328 suspicious web pages. They intend the algorithm to be applied in a commercial setting by a monitoring company.
Automatic response to phishing email can be used to test the authenticity of the linked web site [ 10 ]. The system retrieves the embedded links in the email, visits the linked web site, provides phantom user information, and analyzes the response from the web site. If the visited website reacts differently from the expected behavior of a legitimate web site, the system concludes that the site is a phishing site.
Some consideration should also be given to the structure of URLs over the entire website; simple URLs can be readily identified by users, which makes the identification of obfuscated URLs somewhat easier. Such updates to custom web applications can be done without interruption to users; however, secure application development requires skilled developers and thorough testing. The number of attack vectors available to the phisher can be substantially reduced through the use of these techniques, which are relatively cost effective for the organization (when compared with the cost of an attack exploiting their web application).
Unicode attacks exploit the visual similarities between many Unicode characters (section 2.3 ). Such attacks can be detected by character-character similarity and word-word similarity [ 26 ]. It has been demonstrated that this attack can be accomplished using the English alphabet, Chinese characters, or the Japanese alphabet.
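The character-level comparison can be illustrated with a small sketch: each character is mapped to a canonical look-alike, and two names are treated as confusable when their mapped forms match. This is a simplified stand-in for the similarity measures in [ 26 ], not a reproduction of them; the `CONFUSABLES` table is a tiny hand-picked assumption, and a real system would need a far larger mapping.

```python
# Hypothetical, deliberately tiny table of look-alike characters.
CONFUSABLES = {
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
    "\u0441": "c",  # Cyrillic small es
    "0": "o",       # digit zero
    "1": "l",       # digit one
}


def skeleton(name):
    """Replace every character with its canonical look-alike, if one is listed."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in name.lower())


def is_lookalike(candidate, protected):
    """True when candidate differs from protected but is visually confusable with it."""
    return candidate != protected and skeleton(candidate) == skeleton(protected)


# The second letter of the candidate is Cyrillic, not Latin.
print(is_lookalike("p\u0430ypal.com", "paypal.com"))
```

An organization could run such a check over newly observed domain names against the short list of names it wants to protect.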
Several anti-phishing companies offer retaliatory services [ 27 ]. They respond by sending phishing sites so much fake financial information that the sites can't accept information from would-be victims. Most phishing sites run off of Web servers installed on hijacked home computers and can't handle much traffic. However, retaliatory services generally don't shut down phishing sites by overwhelming them with traffic, as occurs in a denial-of-service attack. They just send the sites as much traffic as they can handle and dilute their database with largely false information, a process known as poisoning.
A similar technique is proposed in [ 10 ]. Phantom user information is submitted to the web site linked from the phishing email. By repeating this step rapidly, the system can poison the phishing database.
The installation of generic security software on a user's local machine can circumvent a number of phishing attacks, in addition to protecting against a number of other security risks. Four key pieces of software should be installed on each user's machine:
Most consumers already recognize the value of anti-virus systems; it would be reasonable to assume they would be similarly interested in the internet equivalents. While the purchase price for all four components can be substantial, well-regarded freely available products are also available in each of the four categories. The combination of these services on a local machine can create some false positives; however, the net defense-in-depth effect gained positively impacts on a user's or an organization's security posture. Similar systems should also be deployed at the local network level [ 36 ] .
Sophisticated email clients are widely used; however, even advanced corporate users rarely need most of the functionality provided. The unnecessary functionality exposes the user to additional exploitable vulnerabilities that can be used by phishers [ 46 ]. The success of many phishing attacks can be attributed to the use of HTML email, which is especially effective at obfuscating hyperlinks. By disabling HTML email in all email client applications, standard obfuscation and spoofing techniques can be rendered ineffective; however, this makes legitimate HTML emails difficult to read. The email client should also prevent the user from quickly executing dangerous content. At minimum, the user should be forced to save the attachment before opening it. This gives anti-virus software a better opportunity to consider the file, as well as preventing malicious code from compromising the rendering application (i.e. the email application). The use of simple clients, plain text email and automated attachment blocking can eliminate potential attack vectors for a phisher.
Web browsers, when properly patched and configured, can be used as a defense mechanism against phishing attacks. In some respects they are similar to email clients: most browsers contain more functionality than the user will typically require [ 46 ]. The more functionality provided, the more security flaws are generally exposed. For example, in typical web browsing, a user will only use 5% of Microsoft Internet Explorer's functionality. Therefore, a browser that is appropriate to the user is important: simple web browsers are sufficient and more secure for most users who simply seek to browse the web.
Web browsers should also be properly configured to protect against phishing attacks. Popup windows should be disabled, along with native Java support, ActiveX support, and multimedia auto-playback and auto-execute extensions. In addition, non-secure cookies should not be stored, and new downloads should not be executable from inside the browser before being copied to a local directory.
The plug-in architecture provided by most browsers is being used to support an increasing number of anti-phishing systems. Security toolbars are widely in use, for example, Spoofstick [ 52 ], which displays a website's real domain name; Netcraft Toolbar [ 45 ], which displays information about the site; Trustbar [ 33 ], which displays the logos and certificate authority of the web site; eBay Account Guard [ 21 ], which indicates a true eBay site; SpoofGuard [ 11 ], which calculates a spoof score; and Web Wallet [ 63 ]. Typically these plug-ins are added to the browser toolbar, and confirm that the current URL is not part of a known phishing attack by contacting a centralized server. Ultimately, their effectiveness is dependent on the reporting mechanisms used by the system. Users often ignore toolbar messages, and toolbars also make mistakes, so they must be used with caution [ 62 ]. The embedded links in phishing mail often point to a different address from the one shown in the link text. For example,
<a href=http://188.8.131.52> http://www.goodsite.com</a>
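A mismatch like the one above, where the visible text names one host and the href points to another, can be checked mechanically. The following is a minimal sketch using Python's standard `html.parser` and `urllib.parse`; it is not the implementation of any particular browser extension, and the `LinkAuditor` class and `mismatched_links` function are hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkAuditor(HTMLParser):
    """Collects (href, visible text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def host_of(url):
    return (urlsplit(url).hostname or "").lower()


def mismatched_links(html):
    """Return links whose visible text looks like a URL for a different host."""
    auditor = LinkAuditor()
    auditor.feed(html)
    auditor.close()
    flagged = []
    for href, text in auditor.links:
        looks_like_url = text.lower().startswith(("http://", "https://"))
        if looks_like_url and host_of(text) != host_of(href):
            flagged.append((href, text))
    return flagged
```

The check only fires when the link text itself looks like a URL, so ordinary links labeled with words such as "Click here" are left alone.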
The antiphish browser extension [ 40 ] detects such discrepancy and warns the user. Other approaches to identifying phishing websites include [ 51 ]:
Dhamija & Tygar [ 17 ] propose the use of trusted security windows for the display and submission of credentials. The user would assign a unique security image as the background of the security window. The image would be stored locally, and could not be spoofed by a remote user. Therefore, the user would be aware when they were looking at legitimate security information or entering their username and password into an authentic form. The use of browser-generated random images or server-generated random images (also known as dynamic security skins) can also provide the user with a prominent visual indication of a secure connection. Z. Ye et al. [ 64 ] proposed trusted paths from the browser to the human user that might work under browser spoofing.
Organizations should take a role in preparing users for an eventual phishing attack [ 24 ] . Most major online vendors, such as major banks, PayPal or eBay, already practice this to some extent (and to some effect) [ 46 ]. Communications from the organization should remind users not to release credentials to any other party, with an emphasis on prompting the user to consider the legitimacy of the motivation (e.g. email hyperlink) that drove them to the page. General phishing resources should also be made available to customers, detailing methods that can be used to ensure the validity of a site and how a customer can report a phishing scheme. Reported attacks should be responded to quickly, and users appropriately notified. Finally, all outgoing communications should be standardized; this reduces the likelihood legitimate communications could be confused with a phishing attack. All of these suggestions have a low cost to the organization, but must be delivered in a consistent manner where the customer is not overloaded with information.
Organizations can take a number of steps to validate their email communications with their customers, in order to make phishing attacks more obvious [ 46 ]. Emails can be personalized with some personal information known only to trusted organizations, such as greeting the customer by name, or including the last few digits of their credit card. A trail of trust can be established if subsequent emails precisely reference previous communications. Digital signatures can also be used to securely sign emails [ 1 , 55 ]; however, this relies on the user to validate the signature. Specialist web applications can also be made available to users to check that the email was in fact sent from the organization. In order for these techniques to deter a phishing attack, the user must be aware of their existence and actively look for them.
Poor development techniques can expose custom web applications to some phishing techniques, such as cross-site scripting or the inline embedding of custom content (as discussed previously). Some of the key security requirements for a custom web application include [ 46 ]:
Two-factor authentication (e.g. username/password and a secure token) has been suggested [ 46 ] as a possible solution to phishing attacks. By making the password time-dependent (i.e. it can only be used once), the phisher is limited in their ability to subsequently connect to the server. This system combats simple eavesdropping and password guessing; however, it is not a complete solution to phishing attacks [ 50 , 54 ] . Attack techniques such as man-in-the-middle or the use of Trojan horses will not be stopped: man-in-the-middle will still grant the phisher access, and Trojan horses will allow the phisher access to subsequent sessions from that machine. Two-channel authentication 4 is similarly vulnerable to active phishing attacks, but would eliminate some phishing attack vectors.
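The one-time property of a token-generated password can be illustrated with HOTP (RFC 4226), a counter-based scheme used by many hardware and software tokens. The source does not say which token scheme is meant, so the choice of HOTP here is an assumption; the `OneTimeVerifier` class is a hypothetical, simplified server side that accepts each code once.

```python
import hashlib
import hmac
import struct


def hotp(secret, counter, digits=6):
    """Compute an RFC 4226 HOTP code for a shared secret and counter value."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


class OneTimeVerifier:
    """Hypothetical server-side check: each code is accepted at most once."""

    def __init__(self, secret):
        self._secret = secret
        self._counter = 0

    def verify(self, submitted):
        expected = hotp(self._secret, self._counter)
        if hmac.compare_digest(expected, submitted):
            self._counter += 1  # advance so the same code cannot be replayed
            return True
        return False
```

A phisher who records a code after it has been used gains nothing from replaying it. As the paragraph above notes, this does not help against a man-in-the-middle who relays the code to the real server before the victim's own submission arrives.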
Delayed password disclosure [ 51 ] requires the server to continuously authenticate itself with the user. After a user enters each character of their password, a predefined image selected by the user is displayed. The pattern of images would be difficult for a phishing website to mimic. Mutual authentication is also achieved by using server and client side certificates. However, this requires users to have their certificate with them in order to connect to their bank; this inconvenience will limit the use of this technology [ 13 ] .
Sophisticated browser password management can also be used to circumvent phishing attacks [ 51 ] . If the user allows the browser to manage all passwords, and a domain name is associated with each password, a user's credentials will only be automatically entered at legitimate web sites [ 31 ] .
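The domain binding described above can be sketched as a credential store keyed by host name, which releases a password only when the host of the current page matches the host it was saved for. This is an illustrative sketch rather than the behavior of any specific browser; `PasswordVault` and its methods are hypothetical names.

```python
from urllib.parse import urlsplit


class PasswordVault:
    """Hypothetical credential store that binds each password to one host."""

    def __init__(self):
        self._by_host = {}

    @staticmethod
    def _host(url):
        return (urlsplit(url).hostname or "").lower()

    def save(self, url, username, password):
        host = self._host(url)
        if not host:
            raise ValueError("URL has no host")
        self._by_host[host] = (username, password)

    def autofill(self, url):
        # Credentials are released only on an exact host match; a look-alike
        # or obfuscated URL resolves to a different host and gets nothing.
        host = self._host(url)
        return self._by_host.get(host) if host else None
```

Because the comparison is made on the parsed host rather than on what the user sees, a link that merely displays the bank's name does not trigger automatic entry of the credentials.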
Bellovin [ 6 ] believes that new authentication mechanisms will fail until prior relationships can be adequately captured. The use of certificates, both in email and on websites, merely guarantees the sender/website owns that particular domain name. It does not guarantee that this is the same party that the user gave money or sensitive data to. He proposes a simple solution to illustrate this point: if users were provided with the bank's certificate when opening an account, the certificate could be used to authenticate bank email and websites. The certificate is bound to a previous legitimate transaction, rather than simply being bound to a name.
By modifying existing spam email filtering approaches, phishing emails can be detected and filtered by analyzing their content. According to [ 35 ], 54 out of 3,370 spam emails intercepted were phishing emails. Phishing emails typically contained text related to banks and auction sites. By checking the text and other email characteristics such as sender, domain, and links, they formulated a scoring system to identify and block phishing mails.
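A scoring approach of this general kind can be sketched as a weighted sum over text and link features. The features, weights and threshold below are hypothetical placeholders chosen for illustration; they are not the values used in [ 35 ].

```python
import re
from urllib.parse import urlsplit

# Hypothetical phrase weights; a real filter would learn these from data.
KEYWORD_WEIGHTS = {
    "verify your account": 2.0,
    "suspended": 1.5,
    "password": 1.0,
    "bank": 0.5,
}
IP_HOST = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def phishing_score(sender, body, links):
    """Add up weights for suspicious phrases and for links that stray from the sender."""
    score = 0.0
    text = body.lower()
    for phrase, weight in KEYWORD_WEIGHTS.items():
        if phrase in text:
            score += weight
    sender_domain = sender.rsplit("@", 1)[-1].lower()
    for url in links:
        host = (urlsplit(url).hostname or "").lower()
        if IP_HOST.match(host):
            score += 2.0  # raw IP address instead of a domain name
        elif host and host != sender_domain and not host.endswith("." + sender_domain):
            score += 1.0  # link leaves the sender's own domain
    return score


def is_phishing(sender, body, links, threshold=3.0):
    return phishing_score(sender, body, links) >= threshold
```

In practice the threshold trades false positives against missed attacks, so it would need tuning against a labeled mail corpus.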
Digital signatures can be used to make it easier to check the identity of the sender and the integrity of the message. However, it is still possible for a phisher to send a message using an anonymous public/private key pair. There are two popular standards for digitally signed email, S/MIME and PGP, which are supported by most Internet mail clients.
Van der Merwe et al. [ 44 ] identify five key counter-attack categories for users and organizations to consider:
In other words, phishing cannot be prevented just by technical means alone; rather, a comprehensive response is necessary.
The behavior of users targeted by phishing attacks has been studied extensively in [ 18 , 19 , 48 ]. The study in [ 18 ] observed the responses of 22 participants and analyzed the results by sex, age, education level, hours using the computer, etc. It did not find that any of these factors made a significant difference in the susceptibility of the user to the attack. Somewhat shockingly, even in the best case scenario, when users expected spoofs to be present and were motivated to discover them, many users could not distinguish a legitimate website from a spoofed website. In fact, the best phishing site was able to fool more than 90% of participants. In [ 19 ], 57 participants were tested, and the study found that people use various strategies to distinguish phishing web sites; however, these strategies were not necessarily effective. In [ 48 ], a user education course was offered, and user awareness was found to improve greatly. Individual users are the most essential piece in an anti-phishing effort and they must take an active role to avoid becoming a victim of a phishing attack. Users can take several simple steps to protect themselves and their privacy:
Various companies offer monitoring services, which are aimed at the early detection and elimination of phishing attacks. For example [ 27 ]:
Early reporting of phishing schemes allows them to be shut down as soon as possible and also allows users to be provided with some warning (e.g. by the organization involved or through anti-phishing software) [ 49 ] . Major banks and e-commerce businesses generally have reporting forms as part of their website; the US Bank provides an email address to forward suspect emails to, while Citibank also lists recent scams with a link to each one. Independent groups, such as the Anti-Phishing Working Group, also maintain information regarding known phishing attacks. Digital Phishnet is an organization formed to fight phishing attacks. It combines the forces of nine of the top ten U.S. banks and financial services providers, four of the top five ISPs and five digital commerce and technology companies. They cooperate with the FBI, Federal Trade Commission (FTC), U.S. Secret Service and the U.S. Postal Inspection Service, under the aegis of the FBI's Internet Crime Complaint Center (IC3) [ 41 ].
Once an attack is reported, law enforcement officials are responsible for shutting the website down, tracing the source of the emails, tracking stolen funds and prosecuting those responsible. In Australia, the Australian High Tech Crime Centre and the Australian Computer Emergency Response Team are responsible for pursuing reported phishing attacks [ 49 ]. The URL contained within the phishing email will be used in a DNS search to find the ISP responsible for hosting the attack. This information usually allows the website to be quickly shut down; however, this may not be the case if the ISP is overseas, or in an unfriendly country. A G8 taskforce, consisting of 37 member countries, has recently been established to combat computer crime, including phishing. Of the phishing attacks recorded in May 2006 [ 3 ], 34.1% were conducted from inside the US, 15% from China, 8.17% from Korea, 3.94% from France, and 3.38% from Germany.
Through effective reporting, historical conceptions about the spread of phishing attacks are changing [ 29 ] . Rather than spreading in a disorganized wildfire pattern, researchers now believe phishing attacks originate from specific IP blocks. CipherTrust [ 12 ] believes most phishing attacks are likely to originate from fewer than 5,000 networks. Messages sent from sources that do not typically send legitimate email are candidates for subsequent analysis. The IP addresses contained in such emails can then be followed to check for phishing attacks. More research is likely to allow researchers to better characterize phishing attacks.
In the United States, Democratic Senator Patrick Leahy introduced the Anti-Phishing Act of 2005 on February 28, 2005 [ 30 ]. It allows prison time of up to five years and fines of up to US $250,000 for people who design fake Web sites for the purposes of stealing money or credit card numbers. California passed an anti-phishing law in September 2005, permitting victims to seek recovery of actual damages or up to $500,000 for each violation, whichever is greater [ 32 ]. Other US states, including Texas, New Mexico and Arizona, have also passed anti-phishing laws.
Although not common, some phishers get arrested. A 45-year-old California man, Jeffrey Brett Goodin, was arrested in January 2006 and charged with operating an online phishing scheme that targeted America Online customers [ 47 ]. He was charged with wire fraud and unauthorized use of a credit card. Goodin is alleged to have sent e-mail messages to thousands of AOL users to entice them to visit fraudulent Web sites he set up to collect personal information. Another phisher was arrested in August 2005 in Iowa [ 57 ]. Jayson Harris was charged with 75 counts of wire fraud for allegedly stealing credit card numbers and personal information in a phishing scheme targeting Microsoft's MSN customers. Other countries have followed the lead of the U.S. by tracing and arresting phishers.
Companies are taking proactive approaches to cracking down on phishers. On March 31, 2005, Microsoft filed 117 federal lawsuits in the U.S. District Court for the Western District of Washington. The lawsuits accuse phishers of using various methods to obtain passwords and confidential information. AOL reinforced its efforts against phishing in early 2006 with three lawsuits seeking a total of $18 million USD under the 2005 amendments to the Virginia Computer Crimes Act.
Much of the Internet's malicious user population 6 has historically been motivated by challenge, curiosity, rebellion, vandalism, and the desire for respect and power. Modern trends in phishing reveal a very different situation: criminals have adopted the well-developed and well-known techniques of malicious users and are exploiting internet users with sophisticated phishing attacks. The concept of phishing has mutated significantly since its creation almost ten years ago. Modern phishers are financially motivated and likely to pursue their attacks more aggressively than the average cracker [ 53 ]. The influence of organized crime further supports the changing nature of crime on the internet. Phishing is also being used to target individual users in an attempt to gain access to specific resources [ 27 ].
However, the outlook is not entirely bleak: anti-virus, anti-spyware and anti-spam systems are continuing to evolve, as are internet browsers. If organizations prepare well, remain vigilant and follow attack trends carefully, they can respond quickly and effectively with a range of techniques to defend their customers' data. If individuals take responsibility for their protection and adopt a defense-in-depth approach, comprising a comprehensive and complementary toolkit of software and education, they can defend themselves against the most sophisticated of attacks. There is no simple solution, but active and aware users and organizations have the ability to form a strangle-hold on this ever-growing threat. Consider yourself warned!
Fragments of this article have been taken from Berghel, “Phishing Mongers and Posers”  with the permission of the publisher.