joriszwart.nl


Minimal ZIP file – part I

This article is part of a series.

  1. Creating a valid ZIP file 👈 here we are
  2. Use .NET’s DeflateStream
  3. No dependencies (coding the DEFLATE algorithm)

Introduction

How difficult is it to create a ZIP file[1] with a bare minimum[2], but still readable, amount of code?

To keep things manageable, we do this in a few steps, each in its own article. The first step is to create a valid but far from optimal ZIP file: its contents are stored uncompressed.

File format

For a little background, these are the parts of a ZIP file:

File header 1     <------------+
File data 1                    |
File header 2     <--------+   |
File data 2                |   |
.... .... .                |   |
File header n     <----+   |   |
File data n            |   |   |
Central directory <----|---|---|---+
File entry 1  ---------+   |   |   |
File entry 2  -------------+   |   |
.... ..... .                   |   |
File entry n  -----------------+   |   
End of Central Directory ----------+

The individual parts contain things like file names, sizes, and CRCs.
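
To make those parts a bit more concrete, here is a rough sketch of how a single uncompressed ("stored") entry could be laid out byte by byte. It is not the Zip1.cs from this article: the class name `MinimalZipSketch`, the method names, the fixed 1980-01-01 timestamp, and the ASCII-only file name are all assumptions made for illustration. The field order follows the ZIP format as commonly documented.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper (not the article's Zip1.cs): builds, in memory, a ZIP
// archive that holds exactly one uncompressed ("stored") entry.
public static class MinimalZipSketch
{
    // Bitwise CRC-32 with the reflected polynomial 0xEDB88320, as used by ZIP.
    public static uint Crc32(byte[] data)
    {
        uint crc = 0xFFFFFFFF;
        foreach (byte b in data)
        {
            crc ^= b;
            for (int bit = 0; bit < 8; bit++)
                crc = (crc & 1) != 0 ? (crc >> 1) ^ 0xEDB88320 : crc >> 1;
        }
        return ~crc;
    }

    public static byte[] Build(string fileName, byte[] data)
    {
        byte[] name = Encoding.ASCII.GetBytes(fileName); // assumption: ASCII names only
        uint crc = Crc32(data);
        const ushort dosTime = 0;            // 00:00:00
        const ushort dosDate = (1 << 5) | 1; // 1980-01-01 (assumed fixed timestamp)

        using var stream = new MemoryStream();
        using var w = new BinaryWriter(stream); // BinaryWriter writes little-endian

        // File header 1 (local file header), followed by the file data.
        uint localHeaderOffset = (uint)w.BaseStream.Position;
        w.Write(0x04034b50u);          // signature "PK\3\4"
        w.Write((ushort)10);           // version needed to extract (1.0)
        w.Write((ushort)0);            // general purpose flags
        w.Write((ushort)0);            // compression method: 0 = stored
        w.Write(dosTime);
        w.Write(dosDate);
        w.Write(crc);
        w.Write((uint)data.Length);    // compressed size (same as uncompressed)
        w.Write((uint)data.Length);    // uncompressed size
        w.Write((ushort)name.Length);
        w.Write((ushort)0);            // extra field length
        w.Write(name);
        w.Write(data);

        // Central directory: one file entry pointing back at the local header.
        uint centralDirOffset = (uint)w.BaseStream.Position;
        w.Write(0x02014b50u);          // signature "PK\1\2"
        w.Write((ushort)10);           // version made by
        w.Write((ushort)10);           // version needed to extract
        w.Write((ushort)0);            // general purpose flags
        w.Write((ushort)0);            // compression method: 0 = stored
        w.Write(dosTime);
        w.Write(dosDate);
        w.Write(crc);
        w.Write((uint)data.Length);    // compressed size
        w.Write((uint)data.Length);    // uncompressed size
        w.Write((ushort)name.Length);
        w.Write((ushort)0);            // extra field length
        w.Write((ushort)0);            // file comment length
        w.Write((ushort)0);            // disk number where the file starts
        w.Write((ushort)0);            // internal attributes
        w.Write(0u);                   // external attributes
        w.Write(localHeaderOffset);
        w.Write(name);
        uint centralDirSize = (uint)w.BaseStream.Position - centralDirOffset;

        // End of Central Directory: points back at the central directory.
        w.Write(0x06054b50u);          // signature "PK\5\6"
        w.Write((ushort)0);            // number of this disk
        w.Write((ushort)0);            // disk where the central directory starts
        w.Write((ushort)1);            // entries on this disk
        w.Write((ushort)1);            // entries in total
        w.Write(centralDirSize);
        w.Write(centralDirOffset);
        w.Write((ushort)0);            // archive comment length
        w.Flush();
        return stream.ToArray();
    }
}
```

Calling `File.WriteAllBytes("minimal.zip", MinimalZipSketch.Build("hello.txt", Encoding.ASCII.GetBytes("Hello")))` should then produce an archive that ordinary unzip tools can read. Note how the three arrows in the diagram correspond to the two stored offsets: the file entry records where its local header starts, and the End of Central Directory records where the central directory starts.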

Source code

The source code is 100 lines long and follows the file layout above. Readability was the main goal.

Zip1.cs

The proof is in the pudding

PKZIP 2.04g (Jan ’93) confirms that it is a valid ZIP file.

pkunzip 2.04g doing its work

Info-ZIP’s zip agrees:

dotnet run
zip -T minimal.zip
test of minimal.zip OK

Info-ZIP’s zipinfo is also a great tool for inspecting ZIP files.

zipinfo -v minimal.zip

In the next part we’ll cheat a little and use .NET’s DeflateStream[3] to create a more useful ZIP file.


  1. https://en.wikipedia.org/wiki/ZIP_(file_format)

  2. The title of this article refers to the code, not to the size of the created ZIP file. Good find, Bertrik Sikken.

  3. Using ZipArchive would be the real cheat!
