Entity Expansion Attack

In my last article, I covered the basic attacks that can be tried against an XML file. In today’s article, I will describe in detail an attack called ‘Entity Expansion’. It is also known as the ‘billion laughs’ attack.

Consider the below piece of XML code.

<!DOCTYPE foo [
<!ENTITY a "1234567890" >
<!ENTITY b "&a;&a;&a;&a;&a;&a;&a;&a;" >
<!ENTITY c "&b;&b;&b;&b;&b;&b;&b;&b;" >
<!ENTITY d "&c;&c;&c;&c;&c;&c;&c;&c;" >
<!ENTITY e "&d;&d;&d;&d;&d;&d;&d;&d;" >
<!ENTITY f "&e;&e;&e;&e;&e;&e;&e;&e;" >
<!ENTITY g "&f;&f;&f;&f;&f;&f;&f;&f;" >
<!ENTITY h "&g;&g;&g;&g;&g;&g;&g;&g;" >
<!ENTITY i "&h;&h;&h;&h;&h;&h;&h;&h;" >
<!ENTITY j "&i;&i;&i;&i;&i;&i;&i;&i;" >
<!ENTITY k "&j;&j;&j;&j;&j;&j;&j;&j;" >
<!ENTITY l "&k;&k;&k;&k;&k;&k;&k;&k;" >
<!ENTITY m "&l;&l;&l;&l;&l;&l;&l;&l;" >
]>
<foo>&m;</foo>



The above may look like garbage, but when this data is parsed by your XML parser, it has the potential to use up all your CPU and memory and bring your XML service down.

Does this get your attention? OK, now let us see what is so scary about this innocent-looking code.


People who are familiar with DOCTYPE, DTD and Entities can move on to the next passage. For others, I will try to give a little background on this.

An XML document is made up of building blocks called elements. Each element can have zero-to-many attributes and zero-to-many child elements. Elements also carry data. While XML is all about elements, their data and their attributes, the definition of these elements is done in a Document Type Definition (DTD). There is one other building block in XML called entities. Entities are something like macros or aliases. If you want to repeat the message ‘hi’ 1000 times in your XML, you can define this string as an entity once and refer to it wherever it is needed. While parsing, the XML parser takes care of replacing every reference to the entity with ‘hi’.
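To see entity replacement in action, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The document, the element name 'note' and the entity name 'greeting' are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny document that declares one internal entity and refers to it three times.
# The names "note" and "greeting" are illustrative only.
DOCUMENT = """<!DOCTYPE note [
<!ENTITY greeting "hi">
]>
<note>&greeting; &greeting; &greeting;</note>"""

root = ET.fromstring(DOCUMENT)

# The parser has already replaced each &greeting; reference with its value.
print(root.text)
```

The application code never sees '&greeting;'. By the time the text reaches it, the parser has substituted the value, which is exactly the behaviour the attack abuses.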

Code Explanation:

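In the above code, when the XML parser expands the entities, the entity ‘&m;’ blows up to 687,194,767,360 characters: ‘a’ holds 10 characters and each of the 12 entities after it is eight copies of the previous one, so the total is 10 × 8^12. Expanding this entity would tie up the CPU and exhaust memory long before the parser finishes, and the service would go down. And so, we have brought down a system with a humble piece of code.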
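The figure can be checked without expanding anything. The sketch below rebuilds the thirteen declarations and works out the expanded length of each entity from the lengths of the entities it refers to. The function name expanded_sizes is my own, and the regular expressions assume the simple declaration style used in this article, with every entity declared before it is referenced.

```python
import re

# Rebuild the declarations from the article: "a" is 10 characters,
# and each later entity is eight references to the previous one.
names = "abcdefghijklm"
declarations = ['<!ENTITY a "1234567890" >']
for prev, cur in zip(names, names[1:]):
    declarations.append('<!ENTITY %s "%s" >' % (cur, ("&%s;" % prev) * 8))
DTD = "\n".join(declarations)

def expanded_sizes(dtd_text):
    """Map each entity name to the length of its fully expanded value."""
    sizes = {}
    for name, value in re.findall(r'<!ENTITY\s+(\w+)\s+"([^"]*)"', dtd_text):
        references = re.findall(r"&(\w+);", value)
        literal = re.sub(r"&\w+;", "", value)
        # Literal characters plus the expanded size of every referenced entity.
        sizes[name] = len(literal) + sum(sizes[ref] for ref in references)
    return sizes

sizes = expanded_sizes(DTD)
print(sizes["m"])        # 687194767360
print(sizes["m"] / 2**30)  # size in GiB if each character takes one byte
```

At one byte per character that is 640 GiB of text produced from a document well under a kilobyte in size.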


A SOAP message should use an XSD schema and not a DTD. Even if a DTD is used, the XML parser shouldn’t encourage the use of entities. But there might be instances when entities are desired. In that case, the parser should limit the size of the data it expands, or set a timeout after which it stops parsing, to halt this denial of service attack.
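For the strictest option, refusing entities altogether, here is a minimal sketch built on Python's standard xml.parsers.expat module. The function parse_without_entities and the exception EntitiesForbidden are names I made up for this example. A real service would build its element tree here instead of just collecting text.

```python
from xml.parsers import expat

class EntitiesForbidden(ValueError):
    """Raised when a document tries to declare an entity (hypothetical name)."""

def parse_without_entities(xml_text):
    """Return the document's character data, refusing any entity declaration."""
    chunks = []
    parser = expat.ParserCreate()

    def reject_entity(name, is_parameter_entity, value, base,
                      system_id, public_id, notation_name):
        # Called by expat for every <!ENTITY ...> declaration it meets.
        # Raising here aborts the parse before anything is expanded.
        raise EntitiesForbidden("entity declaration not allowed: %s" % name)

    parser.EntityDeclHandler = reject_entity
    parser.CharacterDataHandler = chunks.append
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return "".join(chunks)

print(parse_without_entities("<foo>plain data</foo>"))

bomb = '<!DOCTYPE foo [<!ENTITY a "1234567890"><!ENTITY b "&a;&a;">]><foo>&b;</foo>'
try:
    parse_without_entities(bomb)
except EntitiesForbidden as exc:
    print("rejected:", exc)
```

The document is rejected at the first declaration, so the cost of refusing it does not grow with the size of the expansion the attacker was hoping for.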

But, in reality, how many parsers take care of this attack?


XML Security – Part 1

I have been doing some research on XML security and the attack vectors related to it. The more I dig into the possible attacks, the more I am convinced that, given the right kind of attack, even a sophisticated XML parser would succumb to the exploit. While this might seem like a bold statement with no proof attached, I am afraid that it is indeed true.

If you are a developer working on XML, you should know how to protect your application from XML-based attacks. If you are not working on XML, it’s never too late to learn 🙂

Before we dive into XML security, I will give a brief introduction to what XML is.


XML stands for eXtensible Markup Language. It is specified by the W3C and is a de facto standard for transporting, storing and carrying data.


XML is used extensively to transport data between applications and web services, and it is one of the components in Web 2.0 Ajax-based frameworks. In this age, at least a third of the websites available on the internet would use XML in one form or another. These applications would not just use XML but rely on it for their usability, availability and accuracy.

Since XML has become more important to applications, attackers have also become more interested in exploiting XML data. While there are numerous examples on the internet of how to launch network-based and application-based attacks, published exploits against XML payloads (data) are far fewer in number.

In this series, we will see what kinds of attacks are possible and how we can protect an XML payload against them. Today, I will talk about one particular attack called ‘Parameter Tampering’.

Parameter Tampering:

This is not a new term to an application security professional. Ever since appsec consultants were born, they have been tampering with whatever data comes to hand. So, XML-based tampering is no surprise.

So, what kind of acts are possible in this category?

1) Tweaking the XML elements, attributes or text content to inject a cross-site scripting attack.

2) SQL injection attack by tweaking the text content in XML.

3) Adding non-existent attributes or elements to an XML document and checking whether it causes DoS or information leakage.

4) Adding parameters that make the XML malformed and checking for exceptional conditions.

5) Inserting malicious special characters to check for malformed XML.

6) Using long attribute names or element names.

7) Jumbo payload (unclosed tags) and checking whether it causes DoS (denial of service).

The above seven points are pretty self-explanatory and I hope I needn’t explain them step by step. Now that I have detailed these notorious acts, what do you think can protect your application from them?


The application should validate its XML data, checking each element for the correct length, type, position and format. Seems fair enough, doesn’t it? In my next article, I will talk about another attack called ‘Entity Expansion’.
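As a rough sketch of such checks, the Python below validates a made-up 'order' message. The element names, length limits and patterns in RULES are assumptions for illustration only. A real application would derive them from its own schema, and entity declarations in the input would still need to be dealt with separately.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rules for an <order> message: child name -> (max length, format).
# The order of the keys doubles as the required order of the child elements.
RULES = {
    "username": (32, re.compile(r"[A-Za-z0-9_]+")),
    "amount": (12, re.compile(r"\d+(\.\d{1,2})?")),
}

def validate_order(xml_text):
    """Return a list of problems found; an empty list means the payload passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return ["malformed XML: %s" % exc]

    errors = []
    if root.tag != "order":
        errors.append("unexpected root element: %s" % root.tag)
    if root.attrib:
        errors.append("unexpected attributes on root element")
    # Position check: exactly the expected children, in the expected order.
    if [child.tag for child in root] != list(RULES):
        errors.append("missing, extra or misplaced child elements")

    for child in root:
        if child.tag not in RULES:
            continue  # already reported by the position check
        max_len, pattern = RULES[child.tag]
        text = child.text or ""
        if child.attrib or len(child):
            errors.append("%s: unexpected attributes or nested elements" % child.tag)
        if len(text) > max_len:
            errors.append("%s: longer than %d characters" % (child.tag, max_len))
        elif not pattern.fullmatch(text):
            errors.append("%s: value does not match the expected format" % child.tag)
    return errors

good = "<order><username>alice_01</username><amount>19.99</amount></order>"
tampered = "<order><username>' OR '1'='1</username><amount>19.99</amount></order>"
print(validate_order(good))
print(validate_order(tampered))
```

The rules form an allow-list: anything not explicitly expected, whether an extra attribute, an unknown element or an odd character, is reported rather than passed along to the rest of the application.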