How Can I Efficiently Serialize Go Structs to Disk with Minimal File Size?

Barbara Streisand
Release: 2025-01-01 13:05:10

Efficient Go Serialization of Struct to Disk: Achieving Minimal Bloat

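Gob output looks bloated for a single value because the encoder writes a description of the type before the first value of that type. A closer analysis shows that each subsequent entry of the same type adds only 12 bytes, which is close to the minimum needed to encode two strings of 4 bytes each (including their length prefixes). The demonstration below uses longer strings, so its uncompressed cost is about 16 bytes per entry.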
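The amortization described above can be checked directly. The following is a minimal sketch, not taken from the original analysis: the helper name encodedSizes and the sample keys are assumptions made for illustration. It encodes several values through one shared Encoder and records how many bytes each Encode call appends to the stream.

```go
package main

import (
    "bytes"
    "encoding/gob"
    "fmt"
)

// Entry mirrors the struct used in this article.
type Entry struct {
    Key string
    Val string
}

// encodedSizes (a hypothetical helper) encodes every entry with one shared
// Encoder and returns the number of bytes each Encode call added.
func encodedSizes(entries []Entry) ([]int, error) {
    var buf bytes.Buffer
    enc := gob.NewEncoder(&buf)
    sizes := make([]int, 0, len(entries))
    for _, e := range entries {
        before := buf.Len()
        if err := enc.Encode(e); err != nil {
            return nil, err
        }
        sizes = append(sizes, buf.Len()-before)
    }
    return sizes, nil
}

func main() {
    sizes, err := encodedSizes([]Entry{
        {Key: "k1", Val: "v1"},
        {Key: "k2", Val: "v2"},
        {Key: "k3", Val: "v3"},
    })
    if err != nil {
        panic(err)
    }
    // The first value carries the type description; later ones do not.
    for i, n := range sizes {
        fmt.Printf("value %d: %d bytes\n", i, n)
    }
}
```

The first value should come out noticeably larger than the rest, and values of equal length after it should all cost the same number of bytes.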

To reduce the overall file size, consider the following strategies:

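  • Reuse a Single Encoder: gob compiles a custom codec for each type and sends the type description once per stream, so encoding many values through one Encoder amortizes that cost. Creating a new Encoder for every value repeats the first-entry overhead each time.
  • Compress the Output: Wrapping the writer in a compressor such as compress/flate, compress/zlib, compress/gzip, or a bzip2 writer reduces the file size further, with bzip2 giving the smallest result in the test below (1.98 bytes/Entry). Note that the standard library's compress/bzip2 only decompresses, so writing bzip2 requires a third-party package.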

Code Demonstration:

The following Go code demonstrates the various approaches discussed:

package main

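import (
    "bytes"
    "compress/flate"
    "compress/gzip"
    "compress/zlib"
    "encoding/gob"
    "fmt"
    "io"

    // The standard library's compress/bzip2 only decompresses;
    // a writer needs a third-party package such as this one.
    "github.com/dsnet/compress/bzip2"
)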

type Entry struct {
    Key string
    Val string
}

func main() {
    // Create test data
    entries := make([]Entry, 1000)
    for i := 0; i < 1000; i++ {
        entries[i].Key = fmt.Sprintf("k%03d", i)
        entries[i].Val = fmt.Sprintf("v%03d", i)
    }

    // Test different encoding/compression techniques
    for _, name := range []string{"Naked", "flate", "zlib", "gzip", "bzip2"} {
        buf := &bytes.Buffer{}

        var out io.Writer
        switch name {
        case "Naked":
            out = buf
        case "flate":
            out, _ = flate.NewWriter(buf, flate.DefaultCompression)
        case "zlib":
            out, _ = zlib.NewWriterLevel(buf, zlib.DefaultCompression)
        case "gzip":
            out = gzip.NewWriter(buf)
        case "bzip2":
            out, _ = bzip2.NewWriter(buf, nil)
        }

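        // One Encoder per stream, so the type description is sent only once.
        enc := gob.NewEncoder(out)
        for _, e := range entries {
            if err := enc.Encode(e); err != nil {
                panic(err)
            }
        }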

        // Close the compressor (if any) so buffered data is flushed to buf.
        if c, ok := out.(io.Closer); ok {
            c.Close()
        }
        fmt.Printf("[%5s] Length: %5d, average: %5.2f / Entry\n",
            name, buf.Len(), float64(buf.Len())/1000)
    }
}

Output:

[Naked] Length: 16053, average: 16.05 / Entry
[flate] Length:  3988, average:  3.99 / Entry
[ zlib] Length:  3994, average:  3.99 / Entry
[ gzip] Length:  4006, average:  4.01 / Entry
[bzip2] Length:  1977, average:  1.98 / Entry

As the output shows, compression reduces the file size substantially: from 16.05 bytes/Entry uncompressed to about 4 bytes/Entry with flate, zlib, or gzip, and 1.98 bytes/Entry with bzip2.
