Curious how MP3, one of the most popular audio coding and file formats, came to be? So were we, so we decided to do a deep dive. For one, you should know that MP3 became popular largely due to the size limits of CDs, the bandwidth and storage constraints of file-sharing websites, and the limited capacity of people’s MP3 players and home hard drives. The need to store a lot of audio within such limits brought people to MP3, an audio encoding and decoding format that could shrink an audio file to as little as one twelfth of its original size. And it did so without significantly degrading the listening experience. With that clue out of the way, let’s explain what MP3 is.
MP3: Origins and definition
MP3 is both a coding format and a file format. What became MP3, the coding format, was first published in 1993 as MPEG-1 Audio Layer 3. It was updated in 1994 and published as MPEG-2 Audio Layer 3 in 1995. At the time, it used a .bit file extension. On July 14, 1995, a German audio engineer named Karlheinz Brandenburg, who made vital contributions to both coding formats, announced that the format would be moving to the .mp3 file extension.
MP3, the file format, simply represents an elementary stream of audio data encoded according to the MPEG-1 or MPEG-2 standards. The first MP3 encoder, named l3enc, was released on July 7, 1994, and the first MP3 player, titled WinPlay3, was released on September 9, 1995.
How does MP3 compression work?
To reduce file size through compression, the encoder needs certain values, most notably the sampling rate, bit rate, and bit depth. CDs were the prevailing digital audio medium while the format was being developed, so Karlheinz Brandenburg used CD audio parameters as a reference: 44.1 kHz (a sampling rate of 44,100 samples per second) and 2 channels with 16 bits per sample (a bit depth of 2 bytes). The particular CD he used to fine-tune the compression contained the song Tom’s Diner by Suzanne Vega.
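To see why these parameters matter, here is a minimal sketch that works out the bit rate of uncompressed CD audio from the figures above and compares it with an MP3 stream. The 128 kbps target and the function name are assumptions picked for illustration, not values taken from the encoder's history.

```python
# Uncompressed CD audio parameters, as given in the text above.
SAMPLE_RATE_HZ = 44_100   # samples per second
BIT_DEPTH = 16            # bits per sample
CHANNELS = 2              # stereo


def pcm_bit_rate_kbps(sample_rate_hz, bit_depth, channels):
    """Bit rate of uncompressed PCM audio in kbps (1 kbit = 1000 bits)."""
    return sample_rate_hz * bit_depth * channels / 1000


cd_kbps = pcm_bit_rate_kbps(SAMPLE_RATE_HZ, BIT_DEPTH, CHANNELS)
print(cd_kbps)  # 1411.2

# Compression ratio against an assumed 128 kbps MP3 stream.
ratio = cd_kbps / 128
print(round(ratio, 1))  # 11.0
```

The result, roughly 11 to 1, lines up with the "up to twelve times" reduction this article mentions.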
MP3 compression is a form of lossy compression. It is based on two primary limitations of human hearing:
- Humans cannot hear a tone if it is masked by another tone, for example a louder one of lower frequency, which is called auditory masking.
- Humans can only hear sound at frequencies between roughly 20 Hz and 20 kHz.
The algorithm takes these limitations, among many other factors, into account when it reduces the file size by discarding parts of the uncompressed audio that listeners are unlikely to perceive.
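The two hearing limits above can be illustrated with a toy filter. This is only a sketch of the idea, not the psychoacoustic model a real MP3 encoder uses: the function name, the "nearby" frequency ratio, and the 20 dB masking margin are all hypothetical values chosen for illustration.

```python
AUDIBLE_MIN_HZ = 20
AUDIBLE_MAX_HZ = 20_000

# Hypothetical thresholds, chosen only to make the example readable.
MASK_RATIO = 1.2      # tones are "nearby" if their frequencies differ by at most 20%
MASK_MARGIN_DB = 20   # a tone is masked if a nearby tone is at least 20 dB louder


def audible_components(components):
    """Return the (frequency_hz, level_db) pairs that survive both toy rules."""
    # Rule 1: drop everything outside the audible frequency range.
    in_range = [c for c in components if AUDIBLE_MIN_HZ <= c[0] <= AUDIBLE_MAX_HZ]

    kept = []
    for freq, level in in_range:
        # Rule 2: drop a tone if a much louder tone sits close to it in frequency.
        masked = any(
            other_level - level >= MASK_MARGIN_DB
            and max(freq, other_freq) / min(freq, other_freq) <= MASK_RATIO
            for other_freq, other_level in in_range
        )
        if not masked:
            kept.append((freq, level))
    return kept


tones = [(15, 60), (1000, 80), (1100, 50), (5000, 55), (22000, 70)]
print(audible_components(tones))  # [(1000, 80), (5000, 55)]
```

Here the 15 Hz and 22 kHz tones fall outside the audible range, and the quiet 1100 Hz tone is hidden by the much louder 1000 Hz tone next to it, so only two components would need to be stored.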
When did MP3 become popular?
The ability to reduce file size by 10 to 12 times without a significant loss in sound fidelity caused a boom in the second half of the 1990s. Around 1996, a hacker obtained the source code for MPEG implementations, altered it to improve its quality, and released it on the Internet. This started the trend of ripping songs from CDs. Instead of a single 12-track album taking up an entire 700 MB CD, people could burn the equivalent of 10 to 12 such albums onto the same disc. The fact that a typical hard drive held between 500 MB and 1 GB at the time also helps explain the format's prominence.
Additionally, websites for sharing compressed audio files started gaining traction around 1997, the same year Winamp, a well-known MP3 player, was released. In November of that year, the website mp3.com went online, allowing people to download compressed music from independent creators. Napster.com went online in 1999 and allowed people to download entire ripped albums and songs for free. It was later shut down for music piracy.
Design of MP3
An MP3 file is made of thousands of MP3 frames, each consisting of a header and a data block. These intertwine irregularly and form sequences, which are called elementary streams. Later versions of MP3 added a third part, ID3 metadata, which is used for detecting errors after a transfer and
The bit rate affects the perception of sound. If it is too low, the file will be very small, but the audio fidelity will be low as well, and compression artifacts may appear. These are sounds that weren’t present in the original audio file. Although perception depends on the listener (their audio equipment, background noise, attention, music training), the following bit rates are common:
- 48 kbps – Lowest acceptable bit rate.
- 128 kbps – Considered the lowest enjoyable audio bit rate. Small file size and acceptable sound fidelity.
- 160 to 192 kbps – The range from the lowest to the most common bit rate used by free radio station websites.
- 256 kbps – Used by stations that provide free radio or paid, higher-quality online streams.
- 320 kbps – Gold standard in MP3 files for albums, hard drive storage, or online music streaming. It is the highest standard MP3 bit rate, so it offers the best fidelity the format can deliver while still being far smaller than uncompressed audio.
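To make the size trade-off between these bit rates concrete, the sketch below estimates the audio payload of a track at each of them. The 4-minute duration and the function name are assumptions for illustration, and the figure ignores metadata and container overhead.

```python
def mp3_size_mb(bit_rate_kbps, duration_s):
    """Approximate size in megabytes (1 MB = 1,000,000 bytes) of a constant bit rate stream."""
    total_bits = bit_rate_kbps * 1000 * duration_s
    return total_bits / 8 / 1_000_000


DURATION_S = 240  # assumed 4-minute track

for kbps in (48, 128, 192, 256, 320):
    print(f"{kbps} kbps: {mp3_size_mb(kbps, DURATION_S):.2f} MB")
```

For comparison, the same four minutes of uncompressed CD audio at 1411.2 kbps would take a little over 42 MB.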
Although CBR (Constant Bit Rate) is used very often, a lot of MP3 encoders use VBR (Variable Bit Rate) algorithms to recognize and change the compression rate based on the type of sound throughout the audio file. This aims for the best quality at the smallest size.
The sampling rate has changed little over time: from the original 44,100 Hz (44.1 kHz), it has for the most part increased only slightly, to 48,000 Hz (48 kHz) nowadays.
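The header of each frame, mentioned at the start of this section, is where the decoder finds the bit rate and sampling rate discussed above. The following is a minimal sketch of reading those fields from a 4-byte header, following the commonly documented MPEG-1 Layer 3 header layout. It is not a full parser: the function name is hypothetical, the sample header bytes are made up for the example, and MPEG-2 frames, free-format bit rates, and CRC data are deliberately not handled.

```python
# Lookup tables for MPEG-1 Layer 3 only; None marks free-format or reserved values.
MPEG1_L3_BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                          128, 160, 192, 224, 256, 320, None]
MPEG1_SAMPLE_RATES_HZ = [44_100, 48_000, 32_000, None]


def parse_mpeg1_layer3_header(header: bytes) -> dict:
    """Decode bit rate, sampling rate, and frame length from a 4-byte frame header."""
    if len(header) != 4:
        raise ValueError("a frame header is exactly 4 bytes")
    word = int.from_bytes(header, "big")

    if word >> 21 != 0x7FF:                      # 11 sync bits, all set
        raise ValueError("sync word not found")
    if (word >> 19) & 0b11 != 0b11:              # version bits: 11 = MPEG-1
        raise ValueError("not an MPEG-1 frame")
    if (word >> 17) & 0b11 != 0b01:              # layer bits: 01 = Layer 3
        raise ValueError("not a Layer 3 frame")

    bit_rate_kbps = MPEG1_L3_BITRATES_KBPS[(word >> 12) & 0xF]
    sample_rate_hz = MPEG1_SAMPLE_RATES_HZ[(word >> 10) & 0b11]
    padding = (word >> 9) & 1
    if bit_rate_kbps is None or sample_rate_hz is None:
        raise ValueError("free-format or reserved value, not handled in this sketch")

    # Frame length in bytes, header included, for MPEG-1 Layer 3.
    frame_length = 144 * bit_rate_kbps * 1000 // sample_rate_hz + padding
    return {
        "bit_rate_kbps": bit_rate_kbps,
        "sample_rate_hz": sample_rate_hz,
        "padding": padding,
        "frame_length_bytes": frame_length,
    }


# Example header bytes, made up for this illustration.
info = parse_mpeg1_layer3_header(bytes([0xFF, 0xFB, 0x90, 0x64]))
print(info)
```

For this header the sketch reports 128 kbps at 44,100 Hz and a frame length of 417 bytes. Because every frame carries its own header, a VBR file can simply use a different bit rate from one frame to the next.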
Data and metadata
An MP3 file can have so-called ancillary data stored in it. Granted, it’s optional and usually encoded to give the decoder guidance on how to improve audio quality upon decoding. Metadata is also optional and has no defined size or standard format. It is used for adding information about the file (artist, song name, track number, album publisher, genre, year of release, etc.). The most widespread tag formats are ID3v1, ID3v2, and APEv2.
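As an illustration of how simple such a tag can be, here is a minimal sketch that decodes an ID3v1 tag, which is a fixed 128-byte block at the very end of a file. The function name is hypothetical, the Latin-1 text encoding is an assumption (ID3v1 does not specify one), and the sample tag is built in memory rather than read from a real file.

```python
import struct

# ID3v1 layout: "TAG", title, artist, album (30 bytes each), year (4),
# comment (30), and a one-byte genre index. 128 bytes in total.
ID3V1_FORMAT = "<3s30s30s30s4s30sB"


def parse_id3v1(tag: bytes):
    """Return the tag fields as a dict, or None if the block is not an ID3v1 tag."""
    if len(tag) != 128 or tag[:3] != b"TAG":
        return None
    _, title, artist, album, year, comment, genre = struct.unpack(ID3V1_FORMAT, tag)

    def clean(raw: bytes) -> str:
        # Fields are padded with NUL bytes or spaces; Latin-1 is assumed here.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": clean(title),
        "artist": clean(artist),
        "album": clean(album),
        "year": clean(year),
        "comment": clean(comment),
        "genre_index": genre,
    }


# Build a sample tag in memory; with a real file, these would be its last 128 bytes.
sample = (
    b"TAG"
    + b"Tom's Diner".ljust(30, b"\x00")
    + b"Suzanne Vega".ljust(30, b"\x00")
    + b"Solitude Standing".ljust(30, b"\x00")
    + b"1987"
    + b"".ljust(30, b"\x00")
    + bytes([255])  # 255 is commonly used to mean "no genre set"
)
print(parse_id3v1(sample))
```

ID3v2 and APEv2 use variable-length structures and need a more involved parser, so they are left out of this sketch.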
MP3 Surround is backward compatible with MP3 and is used to deliver 5.1 surround sound (five speakers and one subwoofer) to the listener.