In my last article, I covered the basic attacks that could be tried with the XML file. In today’s article, I will describe in detail about an attack called ‘Entity Expansion’. This is also called as the million laugh attack.
Consider the below piece of XML code.
<!DOCTYPE foo [
<!ENTITY a "1234567890" >
<!ENTITY b "&a;&a;&a;&a;&a;&a;&a;&a;" >
<!ENTITY c "&b;&b;&b;&b;&b;&b;&b;&b;" >
<!ENTITY d "&c;&c;&c;&c;&c;&c;&c;&c;" >
<!ENTITY e "&d;&d;&d;&d;&d;&d;&d;&d;" >
<!ENTITY f "&e;&e;&e;&e;&e;&e;&e;&e;" >
<!ENTITY g "&f;&f;&f;&f;&f;&f;&f;&f;" >
<!ENTITY h "&g;&g;&g;&g;&g;&g;&g;&g;" >
<!ENTITY i "&h;&h;&h;&h;&h;&h;&h;&h;" >
<!ENTITY j "&i;&i;&i;&i;&i;&i;&i;&i;" >
<!ENTITY k "&j;&j;&j;&j;&j;&j;&j;&j;" >
<!ENTITY l "&k;&k;&k;&k;&k;&k;&k;&k;" >
<!ENTITY m "&l;&l;&l;&l;&l;&l;&l;&l;" >
The above does look like some garbage but when this data is parsed by your XML parser, it has the potential to use up all your CPU and get your XML service down.
Does this get your attention? Ok, now let us what is so scary about this innocent looking code.
People who are familiar with DOCTYPE, DTD and Entities can move on to the next passage. For others, I will try to give a little background on this.
A XML document is made up building blocks called Elements. Each element can have one to many attributes and zero –to-many child elements. The elements will also carry data. While XML is all about elements, data and its attributes, the definition of these elements is done in Document Type Definition (DTD). There is one other building block in XML called Entities. Entities are something like macros or alias. If you want to repeat the message ‘hi’ 1000 times in your XML, you can just define this string as an entity and specify it in your XML. While parsing, XML parser will take care of replacing the entity with ‘hi’ thousand times.
In the above code, while XML parses the entities, the entity ‘&m;’ will blow out to 687,194,767,360 in size. Expanding this entity would be a time consuming job for the CPU and it will go down. And so, we successfully brought down a system with a humble piece of code.
A soap message should actually make use of XSD schema and not DTD. Even if DTD is used, the XML parser shouldn’t encourage the use of entities. But there might be instances when entities are desired. In that case, the parser should limit the size of data it expands. Or set an Auto Timeout after which it will stop parsing to halt this denial of service attack.
But, in reality, how many parsers take care of this attack?