Threats to IM and Security Mechanism Analysis of XMPP
Zhenxing Cui*, Zhihua Gu
Department of Computer Science and Technology, Wuhan University of Technology, Wuhan
(430063)
Abstract
Security has become a major obstacle to the development of instant messaging (IM). This paper focuses
on security issues related to instant messaging. It first introduces IM and XMPP, then discusses the
main security threats to IM systems and the security mechanisms of XMPP, and finally raises several
security considerations for implementing XMPP.
Keywords: IM, XMPP, SASL, TLS, authentication, encryption, security
1. Introduction
Instant Messaging (IM) [1] is a type of communications service over the Internet that enables
individuals to exchange messages and track the availability of a list of users in real time. Because it
is convenient, efficient, inexpensive, and immediate, IM is now a very popular means of communication
over the Internet, especially IM for consumers (the public), which is also called CIM. However, IM
faces two main obstacles [9]: security, and interoperability between different IM systems. This paper
discusses IM security and the security mechanisms of XMPP; interoperability is beyond its scope.
The Extensible Messaging and Presence Protocol (XMPP) [2] is an open, XML-based protocol for
near-real-time messaging, presence, and request-response services, which evolved through open
development within the Jabber open-source community. XMPP, an IM standard of the Internet
Engineering Task Force (IETF), can be used to stream virtually any XML data between individuals or
applications, which makes it well suited to applications such as IM. While XMPP provides a
generalized, extensible framework for exchanging XML data, it is used mainly for building instant
messaging and presence applications that meet the requirements of RFC 2779.
Two fundamental concepts make possible the rapid, asynchronous exchange of structured information
between presence-aware entities: XML streams and XML stanzas [2]. An XML stream is a container for
the exchange of XML elements between any two entities over a network, like an envelope for all
XML stanzas sent during a session. An XML stanza is a discrete semantic unit of structured
information sent from one entity to another over an XML stream. Stream authentication uses SASL and
the dialback protocol to authenticate the communicating parties, and is an important component of
the XMPP security mechanism.
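The relationship between a stream and its stanzas can be illustrated with a minimal sketch. The stream below is a hypothetical, already-closed example (the addresses and message text are invented for illustration); a real XMPP stream stays open for the whole session and is parsed incrementally rather than as one complete document.

```python
import xml.etree.ElementTree as ET

# A hypothetical, complete stream: the <stream:stream> element is the
# "envelope", and each direct child is one stanza.
STREAM = """<stream:stream xmlns='jabber:client'
    xmlns:stream='http://etherx.jabber.org/streams'
    to='example.com' version='1.0'>
  <message from='alice@example.com/desk' to='bob@example.com'>
    <body>Hello, Bob</body>
  </message>
  <presence from='alice@example.com/desk'/>
</stream:stream>"""


def list_stanzas(stream_xml):
    """Return the local names of the stanzas carried by the stream."""
    root = ET.fromstring(stream_xml)
    # ElementTree reports tags as '{namespace}name'; keep only the name.
    return [child.tag.split("}")[-1] for child in root]
```

Here `list_stanzas(STREAM)` yields the two stanza kinds contained in the envelope, a message and a presence update.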
2. Model for IM
Although XMPP is not wedded to any specific network architecture, it is usually implemented via a
client-server architecture wherein a client utilizing XMPP accesses a server, and servers also
communicate with each other, in both cases over TCP connections [1][2]. Figure 1 shows this
architecture.
Figure 1. The XMPP architecture
A server acts as an intelligent abstraction layer for XMPP communications. Its primary responsibilities
are: (1) to manage connections from or sessions for other entities, in the form of XML streams to and
from authorized clients, servers, and other entities; (2) to route appropriately addressed XML stanzas
among such entities over XML streams; and (3) to store data that is used by clients (e.g., contact
lists for users of XMPP-based instant messaging and presence applications). Most clients connect
directly to a server over a TCP connection and use XMPP to take full advantage of the functionality
provided by a server and any associated services. A gateway is a special-purpose server-side service
whose primary function is to translate XMPP into the protocol used by a foreign (non-XMPP) messaging
system, as well as to translate the return data back into XMPP.
Most communications in IM systems are client-server based, and messages among users are typically
relayed through the server. However, purely peer-to-peer communications also occur in some situations
(e.g., audio/video chat, file transfer). If user A wants to communicate instantly with user B, both
must log into the same IM service. Messages from A to B will be delivered by the server depending on
B’s privacy settings. For direct communications between A and B, the server provides the necessary
information (e.g., network address) to each party.
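The relay model described above can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and B's privacy settings are reduced to a simple block list, which is an assumption rather than how any particular IM service works.

```python
class RelayServer:
    """Toy IM server that relays messages between logged-in users."""

    def __init__(self):
        self.inboxes = {}   # user -> delivered (sender, body) pairs
        self.blocked = {}   # user -> set of senders the user has blocked

    def login(self, user):
        self.inboxes.setdefault(user, [])

    def block(self, user, sender):
        self.blocked.setdefault(user, set()).add(sender)

    def send(self, sender, recipient, body):
        """Deliver a message if both users are logged in and the
        recipient's (assumed) privacy settings allow it."""
        if sender not in self.inboxes or recipient not in self.inboxes:
            return False
        if sender in self.blocked.get(recipient, set()):
            return False
        self.inboxes[recipient].append((sender, body))
        return True
```

Every message passes through `send`, which is why the server is both the natural enforcement point for privacy settings and an attractive target for eavesdropping.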
3. Security threats to IM
The evolution of IM systems suggests that security and privacy issues have received little consideration
from the major IM vendors. This section lists the most significant threats to IM systems, including
viruses and worms, Trojan horses, identity theft, impersonation, eavesdropping, data loss, and
denial-of-service attacks [4][6].
1) Insecure connections. Perhaps the greatest threat to current popular instant messaging
networks lies in their open, insecure connections. Most services use a client-server model for
communication among users, with a few exceptions such as file transfer, voice, and video services,
where a peer-to-peer connection is used. Connections are susceptible to being taken over during
client-to-server, server-to-client, client-to-client and intra-server communications. After
authentication at login, these connections use only weak security measures (such as a sequence number
or transaction identifier, which can be easily spoofed) or none at all. Hence almost all popular IM
connections lack authentication (except in the login message), confidentiality and integrity. This
opens the door to many other security vulnerabilities, including impersonation, denial of service,
man-in-the-middle attacks, and replay.
2) Propagating Malware. CIM client applications typically support the ability to transfer files,
and not just text, between users. The file-sharing function has encouraged the development of instant
messaging viruses, worms, and Trojan horses that use CIM networks as a channel for spreading
malicious code. Generally, users are unsuspecting when receiving a file from a known contact. Worms
exploit this behavior by impersonating the sender. This is becoming a serious problem, as common
anti-virus tools do not generally monitor IM traffic. Moreover, anti-virus tools generally scan only
a small subset of all possible file types. For example, a media file (e.g., an MPEG file) may contain
a specially crafted data sequence that may crash a user’s media player or do something more harmful.
In fact, for Real Media and JPEG files, these threats are already a reality. As most anti-virus tools
are not generally used to scan data files (e.g., media or image files), widespread use of software
such as Windows Media Player may become a potential source of attacks that use malware in data files.
Also, IM file transfers carrying malware penetrate firewalls more easily than email attachments.
Trojan horses are another type of malware that can spread via the file sharing function. Trojan
horses are malicious programs disguised as benign programs so that an unsuspecting victim will accept
them on his or her own computer. Once accepted, the Trojan horse works silently to change settings on
or steal information from the infected computer.
3) Identity Theft and Impersonation. Because CIM accounts are not linked to verified real-world
identities, identity theft is difficult to detect when it occurs. If an identity thief is able to
crack a user’s CIM username and password and log in under the stolen name, the victim’s CIM buddies
may not be able to tell the difference. For example, assume that Alice has regular secure instant
messaging communications with Bob. Alice identifies Bob’s presence on the MSN CIM network based on
Bob’s screen name “BobSmith.” A third-party identity thief (Steve) steals Bob’s username and password
and logs into the CIM network as “BobSmith.” Alice will assume that “BobSmith” is Bob himself unless
Steve reveals otherwise by engaging in messaging that is uncharacteristic of Bob. While logged in as
Bob, Steve can leverage the trust Alice has in Bob to gain access to sensitive information that
compromises either Alice’s or Bob’s security. In this example, if Bob is Alice’s boss, the information
disclosure that occurs can easily include sensitive corporate information. Alternatively, Steve can
take advantage of his assumed trusted position to initiate a Trojan horse attack on Alice via file
sharing.
4) Eavesdropping. CIM services typically transmit messages in clear-text over the public
Internet. An eavesdropper can intercept messages at various points in the communication pipeline. For
example, packet sniffer software can intercept the contents of many messages near the CIM centralized
message routing server. Alternatively, a specific individual’s message content can be intercepted at the
local network level for a targeted eavesdropping attack. Suppose Alice has regular secure instant
messaging communications with Bob using a CIM client at work. Assuming an eavesdropper (Steve) is
able to install a network sniffer on Alice’s company network, he will be able to specifically monitor or
log the communication session between Alice and Bob. Alternatively, Steve can eavesdrop on all CIM
conversations taking place within the company until the sniffer is identified. This can be particularly
damaging in a workplace setting if CIM is used by employees to communicate sensitive corporate
information.
5) Denial of Service (DoS). DoS attacks can be launched in many different ways. Some may
simply crash the messaging client repeatedly. Attackers may force the client to perform CPU- and/or
memory-intensive work that leads to an unresponsive or crashed system. Flooding with unwanted
messages is particularly easy when users choose to receive messages from everyone. In this case,
attackers may also send spam messages such as advertisements. All the common IM clients support user
blocking, so a victim can easily block the attacker’s account ID; however, attackers may get through
this barrier by using many compromised accounts simultaneously. Attackers may also change the
passwords of compromised accounts using automated scripts. The victims then lose access to accounts
whose names they have already distributed to many contacts.
6) Other Threats. A core value of IM services is the presence awareness function. Presence
awareness (knowing when someone is online and available to chat) is a requirement for the
initiation of real-time communication. Unfortunately, presence information can be used to compile a
profile of a user’s online behavior (e.g., when a user is online or offline). The profiler can then
combine online presence information with other basic information, such as work schedules, to track the
victim’s physical location.
4. XMPP security mechanism
The security of a network service mainly involves four aspects: authentication, authorization, data
protection, and recognition. Authentication ensures that the various entities on the network have
access to the appropriate services. Authorization determines whether a requester may use the content
it requests. Data protection ensures that data is not compromised in transit, and covers data
confidentiality and integrity. Recognition ensures that the stated sender of a piece of information
is in fact its creator. XMPP uses authentication and encryption to cover these four elements of a
security framework [2][7][8]: authentication makes use of SASL (Simple Authentication and Security
Layer), and encryption uses the TLS (Transport Layer Security) [3] protocol.
Authentication
The authentication protocol allows clients to prove to the server that they are who they claim to be.
Authentication is the first line of XMPP security and provides sufficient access control for most IM
tasks. It supports three different algorithms for client authentication: plain, digest, and
zero-knowledge [7]. Plain authentication is the simplest and least secure, and zero-knowledge is the
most complex and provides the highest level of security.
1) Plain authentication
Plain is the most basic authentication method that provides some level of security. Its primary
advantage is the extreme simplicity of implementing it. Plain authentication works by sending a plain
text copy of the user’s password to the server in the authentication set query. The server directly
compares the password to the one stored in the user’s account. If they match, the server sends the
client an empty result query packet indicating that the client has been authenticated with the server.
If they do not match, the server sends a standard error IQ packet.
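A minimal sketch of the plain exchange is given below. It assumes the legacy jabber:iq:auth query layout (as documented in XEP-0078); the function names, the id value, and the in-memory account table are hypothetical.

```python
import hmac
import xml.etree.ElementTree as ET

AUTH_NS = "jabber:iq:auth"  # legacy non-SASL authentication namespace


def build_plain_auth(username, password, resource, iq_id="auth1"):
    """Client side: build the authentication set query as XML text."""
    iq = ET.Element("iq", {"type": "set", "id": iq_id})
    query = ET.SubElement(iq, "query", {"xmlns": AUTH_NS})
    ET.SubElement(query, "username").text = username
    ET.SubElement(query, "password").text = password  # sent in the clear
    ET.SubElement(query, "resource").text = resource
    return ET.tostring(iq, encoding="unicode")


def check_plain_auth(stanza, accounts):
    """Server side: compare the received password with the stored one."""
    query = ET.fromstring(stanza).find("{%s}query" % AUTH_NS)
    if query is None:
        return False
    username = query.findtext("{%s}username" % AUTH_NS) or ""
    password = query.findtext("{%s}password" % AUTH_NS) or ""
    stored = accounts.get(username)
    if stored is None:
        return False
    # compare_digest avoids leaking match position through timing
    return hmac.compare_digest(stored.encode("utf-8"), password.encode("utf-8"))
```

The password appears verbatim in the serialized stanza, which is exactly what an eavesdropper on an unencrypted stream would capture.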
The primary problem with plain authentication is that the password is sent in the open to the server. It
is easy for eavesdroppers to watch the data on the network going to the XMPP server and steal users’
passwords as they’re being sent. For this reason, it is highly recommended that clients avoid using plain
authentication if at all possible.
2) Digest authentication
To avoid sending passwords as plain text, digest authentication adds an extra step to the process
(figure 3). The server starts its stream with a <stream:stream> packet containing a random session ID
string in the packet’s id attribute.
To generate a digest authentication credential, the client takes the session ID from the server’s
initial <stream:stream> tag and concatenates it with the user’s password. The resulting string is then
hashed using the SHA-1 message digest algorithm. The lowercase hexadecimal text (UTF-8/ASCII)
representation of the resulting hash is then sent in the <digest> field of the authentication set
query.
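The digest computation described above can be sketched in a few lines. The function names and the sample session ID are hypothetical; the server-side check assumes the server holds the password in plain text, as the digest method requires.

```python
import hashlib
import hmac


def digest_credential(session_id, password):
    """SHA-1 over session ID + password, as lowercase hexadecimal text."""
    data = (session_id + password).encode("utf-8")
    return hashlib.sha1(data).hexdigest()


def verify_digest(session_id, stored_password, received_digest):
    """Server side: recompute the digest from the stored password."""
    expected = digest_credential(session_id, stored_password)
    return hmac.compare_digest(expected, received_digest)
```

Because the session ID changes with every stream, a digest captured from one session does not verify against the session ID of another.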
The drawback to digest authentication is that the user’s password must be sent to the server during the
register protocol as plain text. In addition, the server must store the user’s password as plain text. A
compromise of the server’s security could compromise all of its users’ passwords. The zero-knowledge
authentication method was developed to eliminate these problems.
3) Zero-knowledge authentication
The most secure and most complex method supported by the XMPP protocols is zero-knowledge
authentication. Because of this complexity, its adoption in servers and clients has been slow.
Zero-knowledge authentication removes the requirement for servers to store the user’s password. In
fact, the authentication information the server stores is a throwaway credential that can be used to
authenticate the user only once. Each successful zero-knowledge authentication generates a new,
one-time-use authentication credential. The technique uses four pieces
of information:
User’s password -- Used by the client along with a token to generate valid zero-knowledge keys. The
password is stored on the client (or entered by the user) and never sent to the server. A
zero-knowledge key set is defined by the combination of password and token.
Token -- A randomly generated piece of information used to create a set of zero-knowledge keys. The
token is stored on the server. Splitting the password and token between client and server respectively
makes the key set created from them unique to the client/server pair.
Sequence -- A constantly decrementing number indicating which key in the key set is being used.
Hash -- A particular key in the key set, identified by sequence number.
Initially the client must generate all of these pieces of information for use in the register protocol.
To do this, the client:
1) Creates an SHA-1 message digest of the user’s password to create hashA. The digest (a series of
bytes) is then converted to its lower-case hexadecimal text (UTF-8/ASCII) representation, which
we call hashA_asciihex.
2) Generates a random token string.
3) Creates a digest of the concatenation of hashA_asciihex and the token string to create hash(0). The
hash(0) digest is converted to its lower-case hexadecimal text representation hash(0)_asciihex.
4) Selects an arbitrary sequence number M (e.g., 500).
5) Repeatedly digests hash(n)_asciihex to create hash(n+1) and converts it to its hexadecimal text
representation hash(n+1)_asciihex, until it generates hash(M)_asciihex, where M is the sequence
number from the previous step.
The client sends the token, the sequence number (M), and the hash (hash(M)_asciihex) to the server in
the register protocol if the server supports zero-knowledge authentication.
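The registration steps above amount to building a hash chain, which can be sketched as follows. The function names and the token length are assumptions made for illustration; only the order of hashing operations is taken from the steps in the text.

```python
import hashlib
import secrets


def sha1_hex(text):
    """SHA-1 digest of text, as lower-case hexadecimal text."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def zk_hash(password, token, sequence):
    """Return the hex text of the chain element at position `sequence`."""
    hash_a = sha1_hex(password)          # step 1: digest of the password
    current = sha1_hex(hash_a + token)   # step 3: chain element 0
    for _ in range(sequence):            # step 5: hash `sequence` more times
        current = sha1_hex(current)
    return current


def register_credentials(password, sequence=500):
    """Produce the (token, sequence, hash) triple sent at registration."""
    token = secrets.token_hex(8)         # step 2: random token (length assumed)
    return token, sequence, zk_hash(password, token, sequence)
```

Note that hashing element M-1 once more gives element M, which is the property the server relies on when it later verifies a login.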
To authenticate, the client follows a two-step authentication process shown in Figure 2. In the first
step, the client sends an authentication probe query and the server returns the token and the sequence
number minus one (M-1). The server’s reply tells the client, “Take this token, and this sequence
number, and generate a new hash.” The client follows the same procedure as described previously,
except that it uses the given token and sequence number to generate hash(M-1)_asciihex. It sends this
value to the server in the authentication set query.
Figure 2. Zero-knowledge authentication
The server takes hash(M-1)_asciihex and generates hash(M)_asciihex from it by simply hashing it once
using the SHA-1 message digest. It compares this new hash(M)_asciihex to the one the client sent during
the register protocol. If they do not match, the client has failed to authenticate and the server sends
a standard IQ error response. If they do match, the client is authenticated. The server then decrements
the user account’s sequence number to M-1 and stores hash(M-1)_asciihex. The next time a client
authenticates, the server will send the token and M-2 to the client. The process can continue until the
sequence number reaches zero. The client must use the register protocol before the sequence number
reaches zero to reset the zero-knowledge credentials using a new token and sequence number. Notice that
the server cannot predict hash(n-1)_asciihex from hash(n)_asciihex. It truly is a one-time-use key. If
an eavesdropper steals a copy of hash(n)_asciihex and sees that it was successfully used (so it knows
it has a valid hash(n)_asciihex), the credential it has just stolen has instantly become obsolete and
useless.
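The server-side bookkeeping can be sketched as below, together with a client helper that recomputes a chain element. The class and method names are hypothetical, and error handling is reduced to a boolean result.

```python
import hashlib
import hmac


def sha1_hex(text):
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def client_hash(password, token, sequence):
    """Client side: rebuild the chain element for the requested sequence."""
    current = sha1_hex(sha1_hex(password) + token)
    for _ in range(sequence):
        current = sha1_hex(current)
    return current


class ZeroKnowledgeAccount:
    """Server-side record: token, current sequence, and current hash."""

    def __init__(self, token, sequence, hash_hex):
        self.token = token
        self.sequence = sequence
        self.hash_hex = hash_hex

    def probe(self):
        """Answer the probe query with the token and sequence minus one."""
        return self.token, self.sequence - 1

    def authenticate(self, candidate_hex):
        """Hash the candidate once and compare it with the stored hash."""
        if self.sequence <= 0:
            return False  # chain exhausted: the client must re-register
        if not hmac.compare_digest(sha1_hex(candidate_hex), self.hash_hex):
            return False
        # Success: the candidate becomes the new stored credential.
        self.hash_hex = candidate_hex
        self.sequence -= 1
        return True
```

Replaying a captured reply fails in this sketch because, after one successful login, the stored value is the reply itself, and hashing the reply again no longer matches it.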
The zero-knowledge authentication technique has several advantages:
• Passwords are never transferred over the network.
• Passwords are never stored on the server.
• Credentials stolen during authentication packet exchanges become useless as soon as they are used.
• The majority of the processing load is transferred to the client, aiding Jabber server scalability.
Server dialback
The Jabber protocols from which XMPP was adapted include a "server dialback" method for protecting
against domain spoofing, thus making it more difficult to spoof XML stanzas. Server dialback is not a
security mechanism, and results in weak verification of server identities only. The server dialback
method is made possible by the existence of the Domain Name System (DNS), since one server can
(normally) discover the authoritative server for a given domain[2][8].
Server dialback is uni-directional, and results in (weak) verification of identities for one stream in one
direction. Because server dialback is not an authentication mechanism, mutual authentication is not
possible via dialback. Therefore, server dialback MUST be completed in each direction in order to
enable bi-directional communications between two domains.
There are three entities involved in the process of server dialback:
Originating Server -- the server that is attempting to establish a connection between two domains.
Receiving Server -- the server that is trying to authenticate that the Originating Server represents
the domain which it claims to be.
Authoritative Server -- the server that answers to the DNS hostname asserted by the Originating Server;
for basic environments this will be the Originating Server, but it could be a separate machine in the
Originating Server's network.
The following is a brief summary of the order of events in dialback:
1. The Originating Server establishes a connection to the Receiving Server.
2. The Originating Server sends a 'key' value over the connection to the Receiving Server.
3. The Receiving Server establishes a connection to the Authoritative Server.
4. The Receiving Server sends the same 'key' value to the Authoritative Server.
5. The Authoritative Server replies that key is valid or invalid.
6. The Receiving Server informs the Originating Server whether it is authenticated or not.
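The six steps above can be simulated in a short sketch. Several parts are assumptions not stated in the text: the key is derived with HMAC-SHA256 from a secret held by the authoritative server, a dictionary stands in for DNS, and all class, function, and domain names are hypothetical.

```python
import hashlib
import hmac


class AuthoritativeServer:
    """Answers for a domain and can validate keys issued for it."""

    def __init__(self, domain, secret):
        self.domain = domain
        self.secret = secret  # bytes known only to the domain's own servers

    def make_key(self, receiving_domain, stream_id):
        # Assumed derivation: the text does not say how the key is built.
        msg = f"{receiving_domain} {self.domain} {stream_id}".encode("utf-8")
        return hmac.new(self.secret, msg, hashlib.sha256).hexdigest()

    def verify_key(self, receiving_domain, stream_id, key):
        expected = self.make_key(receiving_domain, stream_id)
        return hmac.compare_digest(expected, key)      # step 5


def receiving_server_check(claimed_domain, key, receiving_domain, stream_id, dns):
    """Steps 3 to 6: look up the authoritative server and ask it about the key."""
    authoritative = dns.get(claimed_domain)            # step 3 (DNS stand-in)
    if authoritative is None:
        return False
    return authoritative.verify_key(receiving_domain, stream_id, key)  # step 4
```

A server that merely claims to be a domain cannot produce a key that the domain's authoritative server will confirm, which is the (weak) identity check dialback provides.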
Encryption
XMPP includes a method for securing the stream from tampering and eavesdropping [2][5][8]. This
channel encryption method makes use of the Transport Layer Security (TLS) protocol (shown in the
figure below), along with a "STARTTLS" [5] extension, providing data confidentiality and integrity.
Encryption helps reduce the threat of eavesdropping.
[Figure: layered diagram, from top to bottom]
Other high-level protocols
TLS Handshake Protocol: authentication, secret negotiation, reliable negotiation
TLS Record Protocol: private connection, reliable connection
Transport protocols (e.g., TCP)
Figure. TLS Protocol
The primary goal of the TLS Protocol is to provide privacy and data integrity between two
communicating applications. The protocol is composed of two layers: the TLS Record Protocol and the
TLS Handshake Protocol. At the lowest level, layered on top of some reliable transport protocol (e.g.,
TCP), is the TLS Record Protocol. The TLS Record Protocol provides connection security that has two
basic properties:
1) The connection is private. Symmetric cryptography is used for data encryption (e.g., DES,
RC4, etc.). The keys for this symmetric encryption are generated uniquely for each connection and are
based on a secret negotiated by another protocol (such as the TLS Handshake Protocol). The Record
Protocol can also be used without encryption.
2) The connection is reliable. Message transport includes a message integrity check using a
keyed MAC. Secure hash functions (e.g., SHA, MD5, etc.) are used for MAC computations.
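The integrity property can be illustrated with a keyed MAC in isolation. This sketch shows only the general idea; it does not reproduce the TLS record format, and the choice of HMAC with SHA-256 and the function names are assumptions.

```python
import hashlib
import hmac


def protect(mac_key, payload):
    """Sender: attach a keyed MAC to the payload."""
    tag = hmac.new(mac_key, payload, hashlib.sha256).digest()
    return payload, tag


def verify(mac_key, payload, tag):
    """Receiver: recompute the MAC and compare in constant time."""
    expected = hmac.new(mac_key, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)
```

An attacker who alters the payload in transit cannot produce a matching tag without the MAC key, so the modification is detected on receipt.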
The TLS Record Protocol is used for encapsulation of various higher level protocols. One such
encapsulated protocol, the TLS Handshake Protocol, allows the server and client to authenticate each
other and to negotiate an encryption algorithm and cryptographic keys before the application protocol
transmits or receives its first byte of data. The TLS Handshake Protocol provides connection security
that has three basic properties:
1) The peer's identity can be authenticated using asymmetric, or public key, cryptography (e.g.,
RSA, DSS, etc.). This authentication can be made optional, but is generally required for at least one
of the peers.
2) The negotiation of a shared secret is secure: the negotiated secret is unavailable to
eavesdroppers, and for any authenticated connection the secret cannot be obtained, even by an attacker
who can place himself in the middle of the connection.
3) The negotiation is reliable: no attacker can modify the negotiation communication without
being detected by the parties to the communication.
SASL provides advanced authentication mechanisms to ensure that normal communications cannot be
carried out until both sides have obtained specific authorization. However, some application
protocols, such as IMAP, POP3, and ACAP, require a login process in which the login information
generally includes the user name, password, and other sensitive data. Unfortunately, most of these
exchanges are transmitted in plain text and are therefore very easy to eavesdrop on. To solve this
problem, the “STARTTLS” extension based on TLS can be used; it has been successfully applied to the
IMAP, POP3, and ACAP protocols.
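For an XMPP stream, the STARTTLS upgrade can be sketched roughly as follows. This is a simplified illustration under several assumptions: the stream is already open, the server has advertised STARTTLS in its stream features, the reply arrives in a single read, and the function names are hypothetical. After a successful upgrade the client would still have to restart the XML stream over the encrypted socket.

```python
import ssl

STARTTLS_REQUEST = "<starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>"
PROCEED_TAG = "<proceed"


def make_tls_context():
    """Context that verifies the server certificate and host name."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed policy
    return context


def upgrade_to_tls(sock, domain, context=None):
    """Ask the server to start TLS, then wrap the existing TCP socket."""
    sock.sendall(STARTTLS_REQUEST.encode("utf-8"))
    reply = sock.recv(4096).decode("utf-8", errors="replace")
    if PROCEED_TAG not in reply:
        raise ConnectionError("server refused STARTTLS")
    tls_context = context or make_tls_context()
    return tls_context.wrap_socket(sock, server_hostname=domain)
```

Credentials should only be sent after `upgrade_to_tls` returns, so that the login exchange is no longer visible to an eavesdropper.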
5. Security Considerations in implementation
In a specific application of XMPP, the following aspects of security system design should be
considered, according to the security mechanisms XMPP provides.
1) In client-to-server communications, a compliant client implementation MUST support both TLS and
SASL for connections to a server. The TLS protocol for encrypting XML streams provides a
reliable mechanism for helping to ensure the confidentiality and integrity of data exchanged
between two entities. The SASL protocol for authenticating XML streams provides a reliable
mechanism for validating that a client connecting to a server is who it claims to be.
2) In server-to-server communications, a compliant server implementation MUST support both TLS
and SASL for inter-domain communications. Because service provisioning is a matter of policy, it
is optional for any given domain to communicate with other domains, and server-to-server
communications may be disabled by the administrator of any given deployment.
3) Mandatory-to-implement technologies: the SASL DIGEST-MD5 mechanism for authentication;
TLS using the TLS_RSA_WITH_3DES_EDE_CBC_SHA cipher for confidentiality; and TLS plus
SASL EXTERNAL (using the TLS_RSA_WITH_3DES_EDE_CBC_SHA cipher supporting
client-side certificates) for both authentication and confidentiality.
4) Firewalls: Communications using XMPP normally occur over TCP connections on port 5222
(client-to-server) or port 5269 (server-to-server). Use of these well-known ports allows
administrators to easily enable or disable XMPP activity through existing and
commonly-deployed firewalls.
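One way a client might turn items 1) to 3) into a selection policy is sketched below. The preference order, the refusal to use PLAIN without TLS, and the function name are assumptions made for illustration, not requirements stated in the text.

```python
def choose_sasl_mechanism(offered, tls_active, has_client_cert):
    """Pick a SASL mechanism from those the server offers, or None."""
    # TLS plus a client-side certificate: authenticate via SASL EXTERNAL.
    if tls_active and has_client_cert and "EXTERNAL" in offered:
        return "EXTERNAL"
    # The mandatory-to-implement mechanism named in item 3).
    if "DIGEST-MD5" in offered:
        return "DIGEST-MD5"
    # Assumed policy: PLAIN is acceptable only on an encrypted stream.
    if tls_active and "PLAIN" in offered:
        return "PLAIN"
    return None
```

Returning `None` lets the caller abort the connection rather than fall back to sending a password over an unencrypted stream.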
6. Conclusion
Security is one of the main obstacles that IM faces and has been attracting increasing attention.
Ensuring communication security is very important while enjoying the fast and convenient service of
instant messaging. As a standard IM protocol, XMPP provides a comprehensive security mechanism for
instant messaging services. In a specific application, an efficient and safe instant messaging system
can be achieved by combining this security mechanism with the application's own security
requirements.
References
[1] M. Day, J. Rosenberg and H. Sugano, “A Model for Presence and Instant Messaging” , February 2000, RFC
2778, available at:
[2] P. Saint-Andre, Ed., “Extensible Messaging and Presence Protocol (XMPP): Core” , October 2004, RFC
3920, available at:
[3] T. Dierks, C. Allen , “The TLS Protocol Version ”, January 1999, RFC 2246, available at:
[4] Joon S. Park and Tito Sierra, “Security Analyses for Enterprise Instant Messaging(EIM) Systems”,
Information System security, 2005. 3
[5] C. Newman, “Using TLS with IMAP, POP3 and ACAP”, June 1999, RFC 2595, available at:
[6] . van Oorschot, “Secure Public Instant Messaging: A Survey”, , available at:
[7] IAIN SHIGEOKA, “Instant Messaging in Java”, ISBN: 1-930110-46-4, Manning Publications Co. 2002.
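[8] Miao Kai (苗凯), “XMPP的安全机制分析” [Analysis of the Security Mechanism of XMPP], Communications
Technology (通信技术), 2003, No. 8
[9] CNNIC, “2006中国即时通信市场调查报告” [2006 China Instant Messaging Market Survey Report], , available at: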
Author Brief Introduction:
Zhenxing Cui (1983-), male, born in Linyi, Shandong Province; master's degree. Main research
interests: algorithm design and analysis, computer networks.