From ZIP to Zstandard, including 7z, RAR and ARJ: how file compression has transformed digital storage from floppy disks to the modern cloud

Introduction

In the age of terabytes and instant data transfers, file compression may seem less relevant. But in reality, it remains crucial for reducing storage costs, speeding up downloads, automating backups, and optimizing cloud infrastructure. This article explores the evolution of file compression formats, from legacy tools like ARC, ARJ and ZIP, to modern innovations such as Zstandard and LZ4. We examine strengths, limitations, and real-world use cases across platforms.


A Brief History: From Floppies to Cloud Storage

The earliest compression formats emerged in the 1980s to save space on floppy disks and facilitate software distribution via modem. Among them:

  • ARC (1985): developed by System Enhancement Associates, it was one of the first all-in-one archiver + compression tools.
  • ZIP (1989): created by Phil Katz as a faster and more open alternative to ARC. It became a de facto standard and was natively supported in Windows 98.
  • ARJ (1990): developed by Robert Jung, popular in MS-DOS environments thanks to advanced features like file spanning.
  • LZH / LHA (1988–1990): originating in Japan, it was widely used on NEC PC-98 systems and retro game files.

As internet use and multimedia files exploded, so did the need for better compression. The result: new tools such as RAR (1993), 7z (1999), and Zstandard (2016) offering higher ratios, encryption, multithreaded compression, and better error recovery.


Legacy Formats (No Longer Commonly Used)

FormatExtensionRelease YearEnd of Common UseNotes
ARC.arc1985~1995Replaced by ZIP, obsolete.
ARJ.arj1990~2005Advanced DOS-era tool, still readable.
LZH / LHA.lzh1988~2008Widely used in Japan, supported in Windows until Vista.
Q (Quantum).q~1989~1996Fast MS-DOS compressor, vanished.
CAB.cab1994~2010Used by Microsoft for drivers and setups.

Modern Compression Formats (2025)

FormatExtensionCompression RatioKey BenefitsDrawbacksAlgorithm Base
ZIP.zipMediumWidely compatible, fastLower compression ratioDEFLATE
7z (7-Zip).7zVery HighOpen source, solid compression, AES-256Slower compression timesLZMA, LZMA2
RAR / RAR5.rarHighRecovery features, encryption, solid archivesProprietary, limited supportProprietary RAR
TAR.GZ.tar.gzHigh (gzip)Unix-friendly, fastNot browsable without extractinggzip + TAR
TAR.XZ.tar.xzVery HighCompact, ideal for backupsSlow, CPU intensiveLZMA2
Zstandard (zst).zstHighFast, efficient, Facebook-backedLess common in WindowsZstandard
LZ4.lz4Low–MediumExtremely fast, real-time capableLower compression ratioLZ4
BZIP2.bz2HighGood lossless compressionVery slow, outdatedBurrows-Wheeler + Huffman

Cross-Platform Compatibility

PlatformZIPRAR7zTAR.GZXZZstdBZIP2
WindowsYesYesYesPartial (tools)PartialPartialPartial
macOSYesYesYesYesYesPartialYes
LinuxYesYes*YesYesYesYesYes
AndroidYesYesYesPartialPartialPartialPartial
iOSYesPartialPartialPartialPartialNoPartial

*Linux supports RAR extraction via unrar but cannot create .rar files without proprietary software.


Performance Benchmark (ENWIK8 dataset, 100 MB plain text)

FormatCompressed SizeCompression SpeedDecompression Speed
ZIP~30 MBVery FastVery Fast
7z (Ultra)~18 MBSlowFast
RAR5~20 MBMediumMedium
TAR.GZ~23 MBFastFast
TAR.XZ~17 MBVery SlowSlow
Zstandard~19 MBVery FastVery Fast
BZIP2~21 MBVery SlowSlow
LZ4~30 MBExtremely FastExtremely Fast

Sources: lzbench, 7-Zip, zstd CLI tests on ENWIK8


Compression in the Cloud Era

Compression plays a vital role in modern cloud operations:

  • AWS Lambda, CloudFront, S3: Support Gzip, Brotli, Zstd for file storage and HTTP responses.
  • Docker: Uses tar.gz or tar.zst for image layers.
  • Backup tools: BorgBackup, Restic, Duplicacy rely on Zstandard or LZMA2 for efficiency.
  • CI/CD Pipelines: Artifacts are packaged in ZIP or TAR formats.

Extreme Compression: The Record Breakers

Experimental formats such as PAQ, CMIX, and Krak push compression limits:

  • PAQ8PX has compressed ENWIK8 to as low as 12.6 MB, outperforming even 7z.
  • These formats require huge CPU time and memory, making them unsuitable for practical use but valuable for academic benchmarking.

Best Format per Use Case

ScenarioRecommended Format
Maximum CompatibilityZIP
Archival with Maximum Compression7z, TAR.XZ
Fastest Compression & DecompressionZstandard, LZ4
Secure Encryption7z (AES-256), RAR
Linux/Unix BackupsTAR.ZST, TAR.XZ
Mobile DevicesZIP, RAR
Retro Systems or EmulatorsLHA, ARJ, ZIP

Conclusion

Far from obsolete, file compression continues to evolve. Even with cheap SSDs and cloud storage, time, bandwidth, and energy still matter. That’s why formats like Zstandard are gaining momentum in modern software, while 7-Zip remains unmatched in free, open-source high-ratio archiving.

From floppy disks to Docker containers, compression is both a science and an art — one that remains essential in the age of big data and distributed computing.


References & Tools

Scroll to Top