Usage examples of pickle, Python's module for persistent storage of data objects

Jun 10, 2016 pm 03:05 PM
pickle python

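In Python, the pickle module can turn an object into a file saved on disk, and read it back and restore it when needed. The usage is as follows:
pickle is the serialization tool commonly used from the Python standard library. It can export an in-memory object as a string, in text or binary format, or write it to a file, and later restore the in-memory object from that string or file. Python 2 also provides a C reimplementation called cPickle, which performs better (in Python 3 the C implementation is used automatically by pickle itself). The Python 2 code below demonstrates the common interfaces of the pickle library; it is quite simple: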

import cPickle as pickle

# dumps and loads
# Dump an in-memory object to a string, or load a string back into an in-memory object
def test_dumps_and_loads():
  t = {'name': ['v1', 'v2']}
  print t

  o = pickle.dumps(t)
  print o
  print 'len o: ', len(o)

  p = pickle.loads(o)
  print p

 

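# About the HIGHEST_PROTOCOL parameter: in Python 2, pickle supports three protocols, 0, 1 and 2:
# http://stackoverflow.com/questions/23582489/python-pickle-protocol-choice
# 0: ASCII protocol, compatible with older versions of Python
# 1: binary format, compatible with older versions of Python
# 2: binary format, introduced in Python 2.3, with better support for new-style classes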
def test_dumps_and_loads_HIGHEST_PROTOCOL():
  print 'HIGHEST_PROTOCOL: ', pickle.HIGHEST_PROTOCOL

  t = {'name': ['v1', 'v2']}
  print t

  o = pickle.dumps(t, pickle.HIGHEST_PROTOCOL)
  print 'len o: ', len(o)

  p = pickle.loads(o)
  print p


# new-style class
def test_new_style_class():
  class TT(object):
    def __init__(self, arg, **kwargs):
      super(TT, self).__init__()
      self.arg = arg
      self.kwargs = kwargs

    def test(self):
      print self.arg
      print self.kwargs

  # ASCII protocol
  t = TT('test', a=1, b=2)
  o1 = pickle.dumps(t)
  print o1
  print 'o1 len: ', len(o1)
  p = pickle.loads(o1)
  p.test()

  # HIGHEST_PROTOCOL supports new-style classes better and performs better
  o2 = pickle.dumps(t, pickle.HIGHEST_PROTOCOL)
  print 'o2 len: ', len(o2)
  p = pickle.loads(o2)
  p.test()


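# dump and load
# Serialize an in-memory object and dump it directly to a file or a file-like object
# For dump, the object must provide write(), which takes one string argument, e.g. StringIO
# For load, the object must provide read(), which takes an int argument, and readline(), which takes no arguments, e.g. StringIO

# Using a file, ASCII encoding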
def test_dump_and_load_with_file():
  t = {'name': ['v1', 'v2']}

  # ASCII format
  with open('test.txt', 'w') as fp:
    pickle.dump(t, fp)

  with open('test.txt', 'r') as fp:
    p = pickle.load(fp)
    print p


# Using a file, binary encoding
def test_dump_and_load_with_file_HIGHEST_PROTOCOL():
  t = {'name': ['v1', 'v2']}
  with open('test.bin', 'wb') as fp:
    pickle.dump(t, fp, pickle.HIGHEST_PROTOCOL)

  with open('test.bin', 'rb') as fp:
    p = pickle.load(fp)
    print p


# Using StringIO, binary encoding
def test_dump_and_load_with_StringIO():
  import StringIO

  t = {'name': ['v1', 'v2']}

  fp = StringIO.StringIO()
  pickle.dump(t, fp, pickle.HIGHEST_PROTOCOL)

  fp.seek(0)
  p = pickle.load(fp)
  print p

  fp.close()


# Using a custom class
# This shows that any user-defined class implementing write, read and readline
# can be passed as the file argument of dump and load
def test_dump_and_load_with_user_def_class():
  import StringIO

  class FF(object):
    def __init__(self):
      self.buf = StringIO.StringIO()

    def write(self, s):
      self.buf.write(s)
      print 'len: ', len(s)

    def read(self, n):
      return self.buf.read(n)

    def readline(self):
      return self.buf.readline()

    def seek(self, pos, mod=0):
      return self.buf.seek(pos, mod)

    def close(self):
      self.buf.close()

  fp = FF()
  t = {'name': ['v1', 'v2']}
  pickle.dump(t, fp, pickle.HIGHEST_PROTOCOL)

  fp.seek(0)
  p = pickle.load(fp)
  print p

  fp.close()


# Pickler/Unpickler
# Pickler(file, protocol).dump(obj) is equivalent to pickle.dump(obj, file[, protocol])
# Unpickler(file).load() is equivalent to pickle.load(file)
# Pickler/Unpickler offer better encapsulation and make it easy to swap the file
def test_pickler_unpickler():
  t = {'name': ['v1', 'v2']}

  # open() with a with-block closes the file even if pickling raises
  with open('test.bin', 'wb') as f:
    pick = pickle.Pickler(f, pickle.HIGHEST_PROTOCOL)
    pick.dump(t)

  with open('test.bin', 'rb') as f:
    unpick = pickle.Unpickler(f)
    p = unpick.load()
    print p
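
As a rough illustration of why Pickler/Unpickler can be convenient, the sketch below writes several objects back to back into one stream and reads them out again. It assumes Python 3 (where pickles are bytes and io.BytesIO stands in for StringIO); the helper names write_many and read_many are hypothetical and not part of the pickle module.

```python
import io
import pickle


def write_many(objects, protocol=pickle.HIGHEST_PROTOCOL):
    """Pickle several objects back to back into one in-memory stream."""
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf, protocol)
    for obj in objects:
        pickler.dump(obj)
    return buf.getvalue()


def read_many(data):
    """Load pickled objects from `data` until the stream is exhausted."""
    unpickler = pickle.Unpickler(io.BytesIO(data))
    result = []
    while True:
        try:
            result.append(unpickler.load())
        except EOFError:
            # load() raises EOFError once no more pickled data is left
            break
    return result


objects = [{'name': ['v1', 'v2']}, [1, 2, 3], 'done']
restored = read_many(write_many(objects))
print(restored)
```

Because the stream is just another file-like object, the same two helpers would work with a real file opened in binary mode.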


pickle.dump(obj, file[, protocol])
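This is the method that persists an object. Its parameters are:

  • obj: the object to persist;
  • file: an object with a write() method that accepts a single string argument. It can be a file object opened for writing, a StringIO object, or any custom object that meets this requirement.
  • protocol: optional, defaults to 0 (the ASCII format). Passing 1 or True (True equals 1) selects the older binary format, which is more compact; 2 selects the newer binary format, and a negative value or pickle.HIGHEST_PROTOCOL selects the highest protocol available.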
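
The effect of the protocol argument can be checked directly. The following is a minimal sketch, assuming Python 3 (where dumps returns bytes and more protocols exist than the three listed for Python 2); the variable names are only for illustration.

```python
import pickle

obj = {"name": ["v1", "v2"]}

# True is the integer 1, so it selects protocol 1 (the older binary format).
same_as_one = pickle.dumps(obj, True) == pickle.dumps(obj, 1)

# A negative value is shorthand for the highest protocol available.
same_as_highest = (
    pickle.dumps(obj, -1) == pickle.dumps(obj, pickle.HIGHEST_PROTOCOL)
)

# For this object, protocol 0 produces ASCII bytes only.
is_ascii = all(byte < 128 for byte in pickle.dumps(obj, 0))

# Every protocol restores an equal object.
round_trips = all(
    pickle.loads(pickle.dumps(obj, p)) == obj
    for p in range(pickle.HIGHEST_PROTOCOL + 1)
)

print(same_as_one, same_as_highest, is_ascii, round_trips)
```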

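How is an object restored after it has been persisted? The pickle module provides a matching method:

pickle.load(file)
It takes a single parameter, file, which corresponds to the file parameter of dump above. This file must have a read() method that accepts one integer argument and a readline() method that takes no arguments, and both methods must return strings. It can be a file object opened for reading, a StringIO object, or any other object that meets these requirements.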
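
A small sketch of the dump/load pair on a real file follows. It assumes Python 3, where the file has to be opened in binary mode for every protocol because pickles are bytes; the helper name save_and_restore and the temporary .pkl file are hypothetical choices made for the example.

```python
import os
import pickle
import tempfile


def save_and_restore(obj, protocol=0):
    """Write `obj` to a temporary file, read it back, and return the copy."""
    fd, path = tempfile.mkstemp(suffix=".pkl")
    os.close(fd)  # only the path is needed; the file is reopened below
    try:
        # Binary mode is required in Python 3, even for the ASCII protocol 0.
        with open(path, "wb") as fp:
            pickle.dump(obj, fp, protocol)
        with open(path, "rb") as fp:
            return pickle.load(fp)
    finally:
        os.remove(path)


restored_obj = save_and_restore({"a": 1, "b": 2, "c": 3})
print(restored_obj)
```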

Here is a basic example:

# -*- coding: utf-8 -*-

import pickle
# Alternatively:
# import cPickle as pickle

obj = {"a": 1, "b": 2, "c": 3}

# Persist obj to the file tmp.txt (the with-block makes sure the file is closed)
with open("tmp.txt", "w") as fp:
    pickle.dump(obj, fp)

# do something else ...

# Read tmp.txt and restore the obj object
with open("tmp.txt", "r") as fp:
    obj2 = pickle.load(fp)

print obj2


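In practice, though, we may want a few improvements. One is to use cPickle in place of pickle: the former is a C implementation of the latter and is faster. Another is to set the third argument of dump to True, which selects binary protocol 1 and produces a more compact file. Consider the following example: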

# -*- coding: utf-8 -*-

import cPickle as pickle
import random
import os

import time

LENGTH = 1024 * 10240

def main():
    d = {}
    a = []
    for i in range(LENGTH):
        a.append(random.randint(0, 255))

    d["a"] = a

    print "dumping..."

    # Binary format: True equals 1, so this selects protocol 1
    t1 = time.time()
    with open("tmp1.dat", "wb") as fp:
        pickle.dump(d, fp, True)
    print "dump1: %.3fs" % (time.time() - t1)

    # Default ASCII format (protocol 0)
    t1 = time.time()
    with open("tmp2.dat", "w") as fp:
        pickle.dump(d, fp)
    print "dump2: %.3fs" % (time.time() - t1)

    # Compare the sizes of the two files
    s1 = os.stat("tmp1.dat").st_size
    s2 = os.stat("tmp2.dat").st_size

    print "%d, %d, %.2f%%" % (s1, s2, 100.0 * s1 / s2)

    print "loading..."

    t1 = time.time()
    with open("tmp1.dat", "rb") as fp:
        obj1 = pickle.load(fp)
    print "load1: %.3fs" % (time.time() - t1)

    t1 = time.time()
    with open("tmp2.dat", "r") as fp:
        obj2 = pickle.load(fp)
    print "load2: %.3fs" % (time.time() - t1)


if __name__ == "__main__":
    main()


On my machine the output is:

dumping...
dump1: 1.297s
dump2: 4.750s
20992503, 68894198, 30.47%
loading...
load1: 2.797s
load2: 10.125s

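As you can see, when protocol is set to True at dump time, the resulting binary file is only about 30% the size of the ASCII file, and both dump and load take less time. In general, then, it is advisable to set this value to True. (True is simply protocol 1; passing pickle.HIGHEST_PROTOCOL states the intent more explicitly.)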
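
The size difference can be reproduced on a smaller scale without writing files. This is only a sketch, assuming Python 3 and a much shorter list than the benchmark above, so the exact numbers will differ from the figures quoted; pickled_sizes is a hypothetical helper name.

```python
import pickle
import random


def pickled_sizes(obj):
    """Map each available protocol number to the pickled size in bytes."""
    return {
        p: len(pickle.dumps(obj, p))
        for p in range(pickle.HIGHEST_PROTOCOL + 1)
    }


rng = random.Random(0)  # fixed seed so repeated runs pickle the same data
sample = {"a": [rng.randint(0, 255) for _ in range(10000)]}
sizes = pickled_sizes(sample)

for protocol, size in sorted(sizes.items()):
    print("protocol %d: %d bytes" % (protocol, size))
```

Protocol 0 writes each integer as decimal text followed by a newline, while the binary protocols store values in the 0 to 255 range in two bytes each, which is where most of the saving comes from.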

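In addition, the pickle module provides two more methods, dumps and loads. They are used like dump and load above, except that no file argument is needed: the input and output are string objects. In some scenarios these two methods are more convenient.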
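
One scenario where dumps and loads are handy is making an independent copy of an object entirely in memory. The sketch below assumes Python 3, where dumps returns bytes rather than a str; clone_via_pickle is a hypothetical helper name.

```python
import pickle


def clone_via_pickle(obj):
    """Return an independent copy of `obj` by serializing and deserializing it."""
    data = pickle.dumps(obj, pickle.HIGHEST_PROTOCOL)  # bytes in Python 3
    return pickle.loads(data)


original = {"name": ["v1", "v2"]}
clone = clone_via_pickle(original)
clone["name"].append("v3")  # changing the clone leaves the original untouched

print(original)
print(clone)
```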
