Golang vs Python zlib: Dissecting the Output Differences
In the provided code snippets, you're attempting to compress a string using both Python's zlib and Go's flate package. However, your Python implementation yields a different output than the Go counterpart. Why is this the case?
To assist in debugging, let's analyze the relevant code fragments:
Go Implementation (compress.go)
<code class="go">package main

import (
	"bytes"
	"compress/flate"
	"fmt"
)

func compress(source string) []byte {
	buf := new(bytes.Buffer)
	// Level 7; NewWriter returns an error only for an invalid level.
	w, _ := flate.NewWriter(buf, 7)
	w.Write([]byte(source))
	// Close flushes pending data and terminates the DEFLATE stream.
	w.Close()
	return buf.Bytes()
}

func main() {
	example := "foo"
	compressed := compress(example)
	fmt.Println(compressed)
}</code>
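The key step in the Go code is closing the Writer: Close flushes any pending compressed data and terminates the DEFLATE stream with a final block. No checksum is written, because compress/flate produces raw DEFLATE rather than the zlib format.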
Python Implementation (compress.py)
<code class="python">from __future__ import print_function

import zlib


def compress(source):
    # Go's compress/flate emits raw DEFLATE with no zlib header or checksum;
    # a negative wbits value (-15) makes Python's zlib do the same.
    compressor = zlib.compressobj(7, zlib.DEFLATED, -15)
    # Keep whatever compress() returns instead of discarding it.
    data = compressor.compress(source)
    # flush() defaults to Z_FINISH, but
    # https://golang.org/pkg/compress/flate/#Writer.Flush
    # says "Flush is equivalent to Z_SYNC_FLUSH"
    return data + compressor.flush(zlib.Z_SYNC_FLUSH)


def main():
    example = b"foo"  # zlib operates on bytes, not text
    compressed = compress(example)
    print(list(bytearray(compressed)))


if __name__ == "__main__":
    main()</code>
Here, you've explicitly flushed the compressor by calling compressor.flush(zlib.Z_SYNC_FLUSH). This emits all pending output but, unlike the default Z_FINISH, does not terminate the stream.
Dissecting the Output
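The two outputs differ only in the fifth byte: Python produces 0, whereas Go produces 4. The Python output is not a complete stream. Z_SYNC_FLUSH emits the pending data followed by an empty stored block that is not marked as final. Go's Close() instead ends the stream with an empty stored block whose final-block bit is set, and that single bit is what turns the fifth byte into 4. Neither output carries a zlib header or checksum, since both sides produce raw DEFLATE.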
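The difference can be checked at the bit level. The following is a minimal sketch in Python, not a definitive analysis: the byte values in GO_CLOSE_OUTPUT are an assumption (what the Go program is expected to print for "foo", consistent with a fifth byte of 4), and the helper names python_sync_flush and describe are made up for this illustration.

```python
import zlib

# Assumed output of the Go program using w.Close() for the input "foo".
GO_CLOSE_OUTPUT = bytes([74, 203, 207, 7, 4, 0, 0, 255, 255])


def python_sync_flush(source):
    """Raw DEFLATE at level 7, flushed with Z_SYNC_FLUSH (stream left open)."""
    c = zlib.compressobj(7, zlib.DEFLATED, -15)
    return c.compress(source) + c.flush(zlib.Z_SYNC_FLUSH)


def describe(data):
    """Return (decompressed bytes, whether the decoder reached end of stream)."""
    d = zlib.decompressobj(-15)
    return d.decompress(data), d.eof


py_out = python_sync_flush(b"foo")

# Positions where the two outputs differ, with the XOR of the differing bytes.
diff = [(i, a ^ b) for i, (a, b) in enumerate(zip(py_out, GO_CLOSE_OUTPUT)) if a != b]

print(list(py_out))
print(diff)
print(describe(py_out))           # data recovered, stream still open
print(describe(GO_CLOSE_OUTPUT))  # data recovered, stream finished
```

Under these assumptions, the only differing bit is the final-block flag of the trailing empty stored block. Both inputs decompress to the same data, but only the Go bytes cause the decoder to report end of stream.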
Bridging the Output Gap
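To obtain matching output from both implementations, make the Go side flush instead of close:
Use Flush() in Go: Replace w.Close() with w.Flush() in your Go code. Flush emits the pending compressed data followed by a sync marker, just like Z_SYNC_FLUSH, and does not write the final block. Note that the result is then an unterminated stream on both sides.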
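<code class="go">func compress(source string) []byte {
	buf := new(bytes.Buffer)
	w, _ := flate.NewWriter(buf, 7)
	w.Write([]byte(source))
	// Flush is equivalent to Z_SYNC_FLUSH; the stream is left open.
	w.Flush()
	return buf.Bytes()
}</code>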
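Going the other way is also possible: instead of leaving the Go stream open, the Python side can terminate its own stream. The following is a minimal sketch, assuming the same level and raw DEFLATE settings as in the question; the function name compress_finished is hypothetical. The result is a complete stream that a conforming reader such as Go's flate.NewReader should accept, but its bytes are not expected to be identical to what w.Close() writes.

```python
import zlib


def compress_finished(source, level=7):
    """Compress to a complete raw DEFLATE stream (no zlib header or checksum)."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    # flush() with no argument uses Z_FINISH, which marks the last block final.
    return compressor.compress(source) + compressor.flush()


finished = compress_finished(b"foo")
print(list(bytearray(finished)))

# A one-shot raw decompress only succeeds on a complete stream.
restored = zlib.decompress(finished, -15)
print(restored)
```

This trades byte-for-byte equality for a properly terminated stream, which is usually what matters when the data is going to be stored or sent to another program.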
Conclusion
While you might be able to tweak parameters to force a byte-for-byte match between the two implementations, doing so is neither necessary nor particularly desirable. Compression libraries guarantee only that their output is compatible, meaning that any conforming decompressor can read it; they do not guarantee that the bytes are identical.
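The compatibility guarantee can be illustrated from the Python side alone. The sketch below compresses one input at several levels and checks that every result decompresses to the original, whether or not the compressed bytes happen to match. The sample text and the helper name raw_deflate are made up for the example.

```python
import zlib


def raw_deflate(source, level):
    """Complete raw DEFLATE stream at the given compression level."""
    c = zlib.compressobj(level, zlib.DEFLATED, -15)
    return c.compress(source) + c.flush()


sample = b"the quick brown fox jumps over the lazy dog " * 50

outputs = {level: raw_deflate(sample, level) for level in (0, 1, 7, 9)}

for level, data in outputs.items():
    # Every variant must round-trip, regardless of how its bytes are laid out.
    assert zlib.decompress(data, -15) == sample
    print(level, len(data))

distinct = len(set(outputs.values()))
print("distinct encodings:", distinct)
```

Level 0 stores the data uncompressed, so at least two different encodings of the same input appear here, and all of them decode correctly. The same reasoning applies across libraries: test that the data round-trips, not that the compressed bytes are equal.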