Understanding MIME types and content sniffing

29 December, 2025

Any web or web-adjacent developer worth their salt has probably come across the Content-Type HTTP header at some point. I have always had a rather vague notion that this header contains a MIME type, which the browser (or “a client” to keep things abstract) will interpret in order to decide what to do with the returned data.

Recently I had to deal with a bug caused by this mundane header, and was forced to read up on what MIME types actually are, and how browsers deal with them.

History of MIME types

MIME (Multipurpose Internet Mail Extensions) types originated in the early 90s as a way to let emails carry data other than ASCII text, such as images, attachments, different character sets, and so on.

As HTTP came around, it reused these types in the form of the Content-Type header. I was just a baby back then, but from what I’ve read, this system did not work very well in the early days of the web: servers regularly answered with types that were either wrong or completely made up, leading to browsers having to fall back on sniffing and other heuristics to figure out the intent behind whatever data the servers spat out.

Over time this guesswork grew to be quite a burden, but more importantly it became a security risk. For example, MIME confusion attacks could be used to trick browsers into running JavaScript disguised as JSON, parsing HTML disguised as an image, or otherwise enabling cross-site scripting via mislabeled content.
As time went on the web matured. Standards tightened, and browsers became stricter. MIME types transformed from “possibly helpful hints” to actual contracts, and legacy or non-standard types that once may have worked by accident now fail by design.

Or at least they do on secure, hardened systems.

What is a MIME type?

Before diving into how browsers deal with MIME types, it’s worth briefly looking at what a MIME type actually is.

I won’t rewrite the official spec here, but in a nutshell a MIME type is a label that describes the format of some data using the form type/subtype, for example text/html, image/webp, or application/json.
The top-level type indicates the broad category (text, image, application, etc.), while the subtype specifies the exact format.
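
To make the type/subtype structure concrete, here is a minimal Python sketch that splits a Content-Type value into its two parts. The names MimeType and parse_mime_type are my own, and the handling of optional parameters (such as charset) is deliberately simplified; it is not a full implementation of the spec.

```python
from typing import NamedTuple


class MimeType(NamedTuple):
    top_level: str  # broad category, e.g. "image"
    subtype: str    # exact format, e.g. "webp"


def parse_mime_type(value: str) -> MimeType:
    """Split a value such as 'text/html; charset=utf-8' into type and subtype."""
    # Drop any parameters after the first ';' and lowercase the rest,
    # since type and subtype are case-insensitive.
    essence = value.split(";", 1)[0].strip().lower()
    top_level, sep, subtype = essence.partition("/")
    if not sep or not top_level or not subtype:
        raise ValueError(f"not a type/subtype pair: {value!r}")
    return MimeType(top_level, subtype)


print(parse_mime_type("text/html; charset=utf-8"))
print(parse_mime_type("image/webp"))
```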

The list of official MIME types is maintained by IANA, but many legacy and vendor-specific types still exist.

How browsers deal with MIME types

As stated above, MIME types are there to tell the browser (and other clients) how to interpret the response body, more specifically which parser or decoder to invoke in order to process the bytes: maybe it’s an image or HTML that should be rendered inline, maybe it’s JavaScript that should be executed as a script, or perhaps the data should just be treated as a download and written to disk.

Content-Type is consulted first and used as the primary signal. Since only a limited set of MIME types map to renderable or executable content, unknown or unsupported types are handled conservatively, in most cases simply downloaded directly. Modern browsers increasingly treat MIME types as authoritative, which means sending garbage types tends to break things.
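
As a rough mental model, the decision described above can be sketched as a lookup. The two sets below are a tiny, hand-picked sample that I chose for illustration, and choose_action is a hypothetical name; real browsers support far more types and also take into account where the resource is being used.

```python
# Illustrative sample only, not a list of what any real browser supports.
RENDERABLE = {"text/html", "text/plain", "image/png", "image/webp"}
EXECUTABLE = {"text/javascript"}


def choose_action(content_type: str) -> str:
    """Return 'execute', 'render' or 'download' for a Content-Type value."""
    # Compare only the type/subtype part, ignoring parameters and case.
    essence = content_type.split(";", 1)[0].strip().lower()
    if essence in EXECUTABLE:
        return "execute"
    if essence in RENDERABLE:
        return "render"
    # Unknown or unsupported types are handled conservatively.
    return "download"


for header in ("text/html; charset=utf-8", "text/javascript", "application/x-made-up"):
    print(header, "->", choose_action(header))
```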

Content sniffing

Content sniffing is a fallback mechanism that browsers use when dealing with incorrect or missing MIME types. Instead of trusting the Content-Type header blindly, the browser inspects the first few “magic bytes” of the response, and compares them to a very small, hardcoded set of known signatures.

As stated in the history section, in the early days of the web this sniffing behaviour was aggressive and very permissive, as the servers were often misconfigured, and such “best effort” tactics were necessary to make anything work at all.

Sniffing is still supported today, but it exists as a tightly constrained compatibility layer that applies to a narrow set of legacy formats, follows standardized rules, and is often disabled entirely via security headers like X-Content-Type-Options: nosniff.
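
The following Python sketch shows the general idea of signature-based sniffing and how the nosniff header switches it off. It is a simplified model under my own assumptions: the signature table is a small sample, the function names are hypothetical, header names are matched with exact casing, and the real standardized rules are considerably more detailed.

```python
from typing import Optional

# A few well-known file signatures; a small sample for illustration.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
]
RECOGNIZED = {mime for _, mime in SIGNATURES}


def sniff(body: bytes) -> Optional[str]:
    """Return a type if the body starts with a known signature, else None."""
    for magic, mime in SIGNATURES:
        if body.startswith(magic):
            return mime
    return None


def effective_type(headers: dict, body: bytes) -> str:
    """Return the type a client would act on in this simplified model."""
    declared = headers.get("Content-Type", "").split(";", 1)[0].strip().lower()
    nosniff = headers.get("X-Content-Type-Options", "").strip().lower() == "nosniff"
    # A recognized declared type is trusted; nosniff forbids guessing at all.
    if nosniff or declared in RECOGNIZED:
        return declared
    # Otherwise fall back to the magic bytes, keeping the declared type
    # when no signature matches.
    return sniff(body) or declared


png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
print(effective_type({"Content-Type": "image/x-made-up"}, png))
print(effective_type({"Content-Type": "image/x-made-up",
                      "X-Content-Type-Options": "nosniff"}, png))
```

In this model the first call recovers image/png from the bytes, while the second is stuck with the made-up type because guessing is not allowed.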

The bug

Let’s circle back to the bug that prompted the need to learn about MIME types.

In this case, certain (valid) ICO files were uploaded, and subsequently served with an incorrect MIME type of image/ico. This type is still widely used on the internet, and browsers are perfectly capable of sniffing out the magic bytes 00 00 01 00 to identify the data as an ICO file.
The root cause (unsurprisingly) was that sniffing had been disabled with the aforementioned nosniff header, removing the escape hatch: the browser was forced to take the MIME type at face value, and since image/ico did not map to an officially sanctioned type, it fell back to downloading the file instead of rendering it.

The fix of course was trivial to implement once understood: serve the file with the registered and unambiguous MIME type image/vnd.microsoft.icon. Strange name, but apparently Microsoft originally came up with the ICO format, so they got to name the MIME type.
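
One way such a fix could look is a small normalization step applied before the file is served. This is only a sketch of the idea, not the actual code from the project: normalize_content_type and LEGACY_ICO_ALIASES are hypothetical names, and the alias set contains only the one legacy label mentioned above.

```python
ICO_MIME = "image/vnd.microsoft.icon"

# Legacy labels to rewrite; only the one from this bug is listed,
# other aliases could be added if they show up in practice.
LEGACY_ICO_ALIASES = {"image/ico"}


def normalize_content_type(content_type: str) -> str:
    """Map legacy ICO labels to the registered type, leave others untouched."""
    essence = content_type.split(";", 1)[0].strip().lower()
    if essence in LEGACY_ICO_ALIASES:
        return ICO_MIME
    return content_type


print(normalize_content_type("image/ico"))
print(normalize_content_type("image/png"))
```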

Final thoughts

This issue was not a browser bug or some overzealous security feature, but rather the intended behaviour of MIME types. Disabling sniffing removes an entire class of MIME confusion attacks, but also exposes any existing misconfigurations and latent bugs that have gone unnoticed purely because sniffing papered over them.

I think the broader lesson is that today, MIME types are not just cosmetic metadata or mere suggestions. In modern, security-conscious systems they are a binding contract between the server and the client. Relying on content sniffing may work today and exist out of necessity, but it’s fundamentally brittle and likely on its way out completely.