What is CRLF?
A CRLF (carriage return, line feed) injection is a type of attack carried out by submitting characters that have special meanings in HTML or HTTP responses. If an application fails to sanitize injected special characters, or sanitizes them improperly, attackers can use this failure to change the application's behavior for malicious purposes.
Two of these special characters are carriage return and line feed (\r and \n respectively). Both servers and browsers use these characters to identify sections of an HTTP message, such as the headers and the body. The URL encoding for \r is %0D, and for \n it is %0A.
If an attacker successfully exploits a CRLF vulnerability, they can use it for HTTP response splitting or HTTP request smuggling attacks.
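As a quick check of the two encodings, here is a minimal Python sketch using the standard library's urllib.parse. The variable names are only illustrative.

```python
from urllib.parse import quote, unquote

CR = "\r"  # carriage return, ASCII 13 (0x0D)
LF = "\n"  # line feed, ASCII 10 (0x0A)

# Percent-encode each character the way it would appear in a URL.
encoded_cr = quote(CR)
encoded_lf = quote(LF)

# Decoding is case-insensitive, so %0d%0a and %0D%0A are equivalent.
decoded = unquote("%0d%0a")

print(encoded_cr, encoded_lf, repr(decoded))  # %0D %0A '\r\n'
```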
How are HTTP response splitting attacks carried out?
As mentioned above, browsers and servers use special characters to identify sections of requests and responses. A CRLF sequence repeated twice (CRLFCRLF) signals that the headers have ended and the body starts. Applications may use custom headers for a variety of purposes, such as rate limiting or custom logic on the server. Let’s say a web application sets the custom header X-Section: value, where the value is taken from the section parameter of the request. If that parameter isn’t sanitized and validated before it is written into the header, an attacker could insert an XSS payload by simply adding the CRLF encoding twice after the value, as in this example: ?section=account%0d%0a%0d%0a<script>alert('payload')</script>. The attacker ends the headers early by inserting the CRLF encoding twice, so everything after it is treated as the body, and the payload is delivered through the vulnerable backend. Sometimes only a limited number of characters can be added after CRLFCRLF; in such instances an attacker can instead inject a Location header field to combine the vulnerability with a redirect for phishing attacks.
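The splitting effect can be reproduced with a small Python sketch. It assumes a hypothetical backend function, build_response, that copies the section parameter into the X-Section header without any checks. The function names and the response layout are illustrative assumptions, not taken from a real application.

```python
from urllib.parse import unquote

def build_response(section: str) -> str:
    """Hypothetical vulnerable backend: copies the parameter into a header as-is."""
    return (
        "HTTP/1.1 200 OK\r\n"
        f"X-Section: {section}\r\n"
        "Content-Type: text/html\r\n"
        "\r\n"
        "<p>Welcome</p>"
    )

def is_safe_header_value(value: str) -> bool:
    """Reject any value containing a raw CR or LF character."""
    return "\r" not in value and "\n" not in value

# The query-string value from the example, decoded as a server would decode it.
section = unquote("account%0d%0a%0d%0a<script>alert('payload')</script>")
response = build_response(section)

# A client treats the first CRLFCRLF as the end of the headers.
head, _, body = response.partition("\r\n\r\n")
print(repr(head))
print(repr(body))
```

The headers end right after X-Section: account, and the body now begins with the injected script. A check like is_safe_header_value, applied before the header is written, would have rejected the value.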
How are HTTP request smuggling attacks carried out?
These attacks take advantage of disparities between how servers handle the Content-Length and Transfer-Encoding headers. The important difference between the two is that Content-Length sends the entire request body at once and specifies the size of the body in bytes as a decimal number, whereas Transfer-Encoding: chunked sends it in pieces (chunks) and specifies the size of each chunk in hexadecimal. Here is an example request with Transfer-Encoding. Notice how the final line is 0; this indicates that there are no more chunks:
POST / HTTP/1.1
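To make the chunked framing concrete, here is a minimal Python sketch that encodes a body as chunks and decodes it again. The helpers encode_chunked and decode_chunked are hypothetical names, the sample body is made up, and the sketch assumes non-empty chunks with no chunk extensions or trailers.

```python
from typing import List, Tuple

def encode_chunked(parts: List[bytes]) -> bytes:
    """Frame each non-empty part as: hex size, CRLF, data, CRLF."""
    out = b""
    for part in parts:
        out += f"{len(part):x}".encode() + b"\r\n" + part + b"\r\n"
    # A chunk of size 0 marks the end of the body.
    return out + b"0\r\n\r\n"

def decode_chunked(data: bytes) -> Tuple[bytes, bytes]:
    """Return (body, leftover bytes after the terminating 0 chunk)."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size = int(data[pos:line_end], 16)  # chunk sizes are hexadecimal
        pos = line_end + 2
        if size == 0:
            # Skip the final CRLF; trailers are not handled in this sketch.
            return body, data[pos + 2:]
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF

parts = [b"field=value", b"&other=1234567"]
encoded = encode_chunked(parts)
body, leftover = decode_chunked(encoded)

print(encoded)    # sizes appear as b (11) and e (14)
print(len(body))  # 25, the decimal value Content-Length would carry instead
```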
Thus there is no reason to have both fields specified. However, if both fields are present in a request, then Content-Length must be ignored. The problem is that today’s applications may use more than one server to handle requests. When more than one server is involved, there is a higher chance that the servers will disagree about which field to honor. Consider the following request:
POST / HTTP/1.1
Assume that the front-end server uses Content-Length, so everything that follows the headers, up to the declared length, is treated as the body. This in itself would not be a problem, but what if the next server that handles the request uses Transfer-Encoding and ignores Content-Length? Remember that a chunk size of 0 means the end of the body in this instance. So now we have field=value attached to the start of the next request (which we can assume would be valid, coming from a normal user), which renders it invalid. Of course, an attacker would not just add random bits that render the next request invalid. The smuggled portion would more likely be a request that steals user data or redirects the user to a phishing site.
So far in this section we have described the CLTE smuggling attack. The other main types of request smuggling attacks are TECL and CLCL. The main idea is the same in each of them. The difference in the case of CLCL, for example, is that rather than including a Transfer-Encoding field, the attacker puts two Content-Length fields in the malicious request, hoping that the front end and the back end will each evaluate a different one. I am leaving an example here without explanation as an exercise. I will just note that GET requests normally should not have a body, but that validation is often overlooked:
GET / HTTP/1.1
GET /executethis HTTP/1.1
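Returning to the CLTE case described earlier in this section, the disagreement between the two servers can be simulated in a few lines of Python. This is only a sketch under stated assumptions: the request bytes, the Host value example.test, and the functions front_end_body and back_end_body are all hypothetical stand-ins for a front end that trusts Content-Length and a back end that trusts Transfer-Encoding.

```python
def split_head(raw: bytes):
    """Split a raw request into a lowercase header dict and the bytes after it."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    headers = {}
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()
    return headers, rest

def front_end_body(raw: bytes):
    """Hypothetical front end: frames the body using Content-Length only."""
    headers, rest = split_head(raw)
    length = int(headers[b"content-length"])
    return rest[:length], rest[length:]

def back_end_body(raw: bytes):
    """Hypothetical back end: frames the body using chunked encoding only."""
    _, rest = split_head(raw)
    body, pos = b"", 0
    while True:
        line_end = rest.index(b"\r\n", pos)
        size = int(rest[pos:line_end], 16)
        pos = line_end + 2
        if size == 0:
            return body, rest[pos + 2:]  # bytes after the terminating chunk
        body += rest[pos:pos + size]
        pos += size + 2

# Illustrative malicious request: a 0 chunk followed by the smuggled bytes.
payload = b"0\r\n\r\n" + b"field=value"
request = (
    b"POST / HTTP/1.1\r\n"
    b"Host: example.test\r\n"
    b"Content-Length: " + str(len(payload)).encode() + b"\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n" + payload
)

print(front_end_body(request))  # whole payload is the body, nothing left over
print(back_end_body(request))   # empty body, field=value left on the connection
```

The front end forwards all 16 bytes as one request body, while the back end stops at the 0 chunk and treats field=value as the beginning of the next request on the same connection.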
How to Prevent CRLF Injections?
By removing the root cause. As mentioned above, there is no reason for a request to have both Content-Length and Transfer-Encoding fields, or multiple fields of the same type (CLCL or TETE). If we simply reject such requests, the problem is solved. Other solutions are more error-prone and likely unnecessary, so I will not cover them here.
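The rule above can be sketched as a small check that runs before a request is processed. This is a minimal Python illustration, assuming the headers have already been parsed into a list of (name, value) pairs. The function name is_ambiguous is hypothetical, and a real server would hook this in wherever its framework exposes the raw headers.

```python
from typing import List, Tuple

def is_ambiguous(headers: List[Tuple[str, str]]) -> bool:
    """Return True if the request should be rejected under the rule above."""
    names = [name.strip().lower() for name, _ in headers]
    content_length_count = names.count("content-length")
    transfer_encoding_count = names.count("transfer-encoding")
    return (
        content_length_count > 1                 # CLCL
        or transfer_encoding_count > 1           # TETE
        or (content_length_count >= 1 and transfer_encoding_count >= 1)  # CLTE / TECL
    )

ok = [("Host", "example.test"), ("Content-Length", "11")]
clte = ok + [("Transfer-Encoding", "chunked")]
clcl = ok + [("content-length", "5")]

print(is_ambiguous(ok), is_ambiguous(clte), is_ambiguous(clcl))  # False True True
```

Header names are compared case-insensitively, since an attacker may vary the casing to slip past a naive check.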