From ZIP to Zstandard, including 7z, RAR and ARJ: how file compression has transformed digital storage from floppy disks to the modern cloud
Introduction
In the age of terabytes and instant data transfers, file compression may seem less relevant. But in reality, it remains crucial for reducing storage costs, speeding up downloads, automating backups, and optimizing cloud infrastructure. This article explores the evolution of file compression formats, from legacy tools like ARC, ARJ and ZIP, to modern innovations such as Zstandard and LZ4. We examine strengths, limitations, and real-world use cases across platforms.
A Brief History: From Floppies to Cloud Storage
The earliest compression formats emerged in the 1980s to save space on floppy disks and facilitate software distribution via modem. Among them:
- ARC (1985): developed by System Enhancement Associates, it was one of the first all-in-one archiver + compression tools.
- ZIP (1989): created by Phil Katz as a faster and more open alternative to ARC. It became a de facto standard and was natively supported in Windows 98.
- ARJ (1990): developed by Robert Jung, popular in MS-DOS environments thanks to advanced features like file spanning.
- LZH / LHA (1988–1990): originating in Japan, it was widely used on NEC PC-98 systems and retro game files.
As internet use and multimedia files exploded, so did the need for better compression. The result: new tools such as RAR (1993), 7z (1999), and Zstandard (2016) offering higher ratios, encryption, multithreaded compression, and better error recovery.
Legacy Formats (No Longer Commonly Used)
Format | Extension | Release Year | End of Common Use | Notes |
---|---|---|---|---|
ARC | .arc | 1985 | ~1995 | Replaced by ZIP, obsolete. |
ARJ | .arj | 1990 | ~2005 | Advanced DOS-era tool, still readable. |
LZH / LHA | .lzh | 1988 | ~2008 | Widely used in Japan, supported in Windows until Vista. |
Q (Quantum) | .q | ~1989 | ~1996 | Fast MS-DOS compressor, vanished. |
CAB | .cab | 1994 | ~2010 | Used by Microsoft for drivers and setups. |
Modern Compression Formats (2025)
Format | Extension | Compression Ratio | Key Benefits | Drawbacks | Algorithm Base |
---|---|---|---|---|---|
ZIP | .zip | Medium | Widely compatible, fast | Lower compression ratio | DEFLATE |
7z (7-Zip) | .7z | Very High | Open source, solid compression, AES-256 | Slower compression times | LZMA, LZMA2 |
RAR / RAR5 | .rar | High | Recovery features, encryption, solid archives | Proprietary, limited support | Proprietary RAR |
TAR.GZ | .tar.gz | High (gzip) | Unix-friendly, fast | Not browsable without extracting | gzip + TAR |
TAR.XZ | .tar.xz | Very High | Compact, ideal for backups | Slow, CPU intensive | LZMA2 |
Zstandard (zst) | .zst | High | Fast, efficient, Facebook-backed | Less common in Windows | Zstandard |
LZ4 | .lz4 | Low–Medium | Extremely fast, real-time capable | Lower compression ratio | LZ4 |
BZIP2 | .bz2 | High | Good lossless compression | Very slow, outdated | Burrows-Wheeler + Huffman |
Cross-Platform Compatibility
Platform | ZIP | RAR | 7z | TAR.GZ | XZ | Zstd | BZIP2 |
---|---|---|---|---|---|---|---|
Windows | Yes | Yes | Yes | Partial (tools) | Partial | Partial | Partial |
macOS | Yes | Yes | Yes | Yes | Yes | Partial | Yes |
Linux | Yes | Yes* | Yes | Yes | Yes | Yes | Yes |
Android | Yes | Yes | Yes | Partial | Partial | Partial | Partial |
iOS | Yes | Partial | Partial | Partial | Partial | No | Partial |
*Linux supports RAR extraction via unrar
but cannot create .rar
files without proprietary software.
Performance Benchmark (ENWIK8 dataset, 100 MB plain text)
Format | Compressed Size | Compression Speed | Decompression Speed |
---|---|---|---|
ZIP | ~30 MB | Very Fast | Very Fast |
7z (Ultra) | ~18 MB | Slow | Fast |
RAR5 | ~20 MB | Medium | Medium |
TAR.GZ | ~23 MB | Fast | Fast |
TAR.XZ | ~17 MB | Very Slow | Slow |
Zstandard | ~19 MB | Very Fast | Very Fast |
BZIP2 | ~21 MB | Very Slow | Slow |
LZ4 | ~30 MB | Extremely Fast | Extremely Fast |
Sources: lzbench
, 7-Zip, zstd CLI tests on ENWIK8
Compression in the Cloud Era
Compression plays a vital role in modern cloud operations:
- AWS Lambda, CloudFront, S3: Support Gzip, Brotli, Zstd for file storage and HTTP responses.
- Docker: Uses
tar.gz
ortar.zst
for image layers. - Backup tools: BorgBackup, Restic, Duplicacy rely on Zstandard or LZMA2 for efficiency.
- CI/CD Pipelines: Artifacts are packaged in ZIP or TAR formats.
Extreme Compression: The Record Breakers
Experimental formats such as PAQ, CMIX, and Krak push compression limits:
- PAQ8PX has compressed ENWIK8 to as low as 12.6 MB, outperforming even 7z.
- These formats require huge CPU time and memory, making them unsuitable for practical use but valuable for academic benchmarking.
Best Format per Use Case
Scenario | Recommended Format |
---|---|
Maximum Compatibility | ZIP |
Archival with Maximum Compression | 7z, TAR.XZ |
Fastest Compression & Decompression | Zstandard, LZ4 |
Secure Encryption | 7z (AES-256), RAR |
Linux/Unix Backups | TAR.ZST, TAR.XZ |
Mobile Devices | ZIP, RAR |
Retro Systems or Emulators | LHA, ARJ, ZIP |
Conclusion
Far from obsolete, file compression continues to evolve. Even with cheap SSDs and cloud storage, time, bandwidth, and energy still matter. That’s why formats like Zstandard are gaining momentum in modern software, while 7-Zip remains unmatched in free, open-source high-ratio archiving.
From floppy disks to Docker containers, compression is both a science and an art — one that remains essential in the age of big data and distributed computing.