This article shares an example of reading data from a text file in Python and converting it into a DataFrame. I hope it is a useful reference for anyone who needs it.
I came across this question in a technical Q&A forum. It seems to be fairly common, so I am writing it up as a separate article.
Read data from the plain text format file "file_in" in the following format:
The data needs to be written out as "file_out" in the following format:
Each line of the original data has the form "Category: Content", and blank lines ("\n") separate the entries. After conversion, each entry occupies one line, with its contents written out in category order.
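The conversion described above can be sketched in plain Python before bringing in pandas. This is only an illustration: the sample text, the field names, the comma separator, and the helper names `parse_entries` and `to_lines` are all assumptions, not part of the original data.

```python
# Hypothetical sample in the "Category: Content" layout; the field names
# and values are made up for illustration.
SAMPLE = """name:lili
age:20

name:jim
age:31
"""


def parse_entries(text):
    """Split text into entries at blank lines; return one dict per entry."""
    entries = []
    current = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current entry, if there is one.
            if current:
                entries.append(current)
                current = {}
            continue
        # Split on the first colon only, so a value may itself contain ":".
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        entries.append(current)
    return entries


def to_lines(entries):
    """Render each entry as one comma-separated line, in category order."""
    # Category order is the order in which each category first appears.
    categories = []
    for entry in entries:
        for key in entry:
            if key not in categories:
                categories.append(key)
    return [",".join(entry.get(c, "") for c in categories) for entry in entries]


entries = parse_entries(SAMPLE)
print(to_lines(entries))  # ['lili,20', 'jim,31']
```

Each blank-line-delimited group becomes one dictionary, and each dictionary becomes one output line, which is the same reshaping the pandas version performs.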
After reading the data, I recommend using pandas to load it into a table called a DataFrame, which makes later processing easier. Because the original layout is not the usual tabular format, some simple preprocessing has to be done first.
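To illustrate why a DataFrame makes later processing easier, here is a small sketch that assumes the data has already been collected into a dictionary of equal-length lists. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical columns: each key becomes a column, each list its values.
dict_data = {"name": ["lili", "jim"], "age": ["20", "31"]}

frame = pd.DataFrame(dict_data, columns=list(dict_data))

# Values read from text are strings, so convert before numeric work.
frame["age"] = frame["age"].astype(int)

# Filtering rows is then a one-line operation.
older = frame[frame["age"] >= 21]

print(frame)
print(older["name"].tolist())
```

Note that every list must have the same length for this constructor call to succeed, which is why the preprocessing step matters.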
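# coding: utf8
# pandas is a popular data-analysis package;
# a DataFrame usually holds a two-dimensional table
from pandas import DataFrame

# Build a dictionary whose keys and values are both read from the file.
# The keys are nam, age, ...; the values are lili, jim, ...
dict_data = {}

# Open the file
with open('file_in.txt', 'r') as df:
    # Read each line
    for line in df:
        # Skip blank lines (lines that contain only whitespace)
        if not line.strip():
            continue
        # Strip surrounding whitespace, then split on the first ":" only,
        # so a value that itself contains ":" is kept whole
        kv = line.strip().split(':', 1)
        # Append the value to the list stored under its key
        dict_data.setdefault(kv[0], []).append(kv[1])

# print(dict_data) to check the result

# Collect the keys into a list
columnsname = list(dict_data.keys())

# Create a DataFrame whose column names are the keys, i.e. nam, age, ...
frame = DataFrame(dict_data, columns=columnsname)

# Write the DataFrame to a file without the row index or column names
frame.to_csv('file_out0.txt', index=False, header=False)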
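One caveat with collecting values column by column: if an entry is missing a category, the lists end up with different lengths and the DataFrame constructor raises an error. A possible variation, sketched here with made-up input and not taken from the original article, builds one dictionary per entry instead, so missing fields simply become empty cells. The output is written to an in-memory buffer rather than a file to keep the example self-contained.

```python
import io
import re

import pandas as pd

# Hypothetical input: the second entry has no "age" line.
text = "name:lili\nage:20\n\nname:jim\n\nname:tom\nage:25\n"

records = []
# Entries are separated by one or more blank lines.
for block in re.split(r"\n\s*\n", text.strip()):
    record = {}
    for line in block.splitlines():
        # Split on the first colon only.
        key, _, value = line.partition(":")
        record[key.strip()] = value.strip()
    records.append(record)

# A list of dicts yields one row per entry; absent keys become NaN.
frame = pd.DataFrame(records)

buffer = io.StringIO()
# Same options as the script above: no row index, no header row.
frame.to_csv(buffer, index=False, header=False)
csv_text = buffer.getvalue()
print(csv_text)
```

With the default `to_csv` settings, a missing value is written as an empty field, so each entry still occupies exactly one line.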