Detailed explanation of how Python uses Pandas for data analysis

WBOY
Release: 2022-09-06 19:54:41

Pandas is one of the most popular Python libraries for data analysis. It is highly optimized for performance, with backend source code written in Python, Cython, and C.

Data in pandas is analyzed through two core data structures:

  • 1. Series

  • 2. DataFrame

Series

A Series is a one-dimensional (1-D) labeled array defined in pandas that can store any data type.

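Code #1

Create Series

# Program to create a Series

# Import the pandas library
import pandas as pd

# Example values for illustration; replace with your own data and labels
Data = [10, 20, 30]
Index = ['x', 'y', 'z']

# Create a Series from the data and the index
a = pd.Series(Data, index=Index)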

Here, Data can be:

  • A scalar value, such as an integer or a string
  • A Python dictionary of key-value pairs
  • An ndarray
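
As a rough illustration of the three kinds of input listed above, the following sketch builds one Series from each. The variable names and values are made up for this example.

```python
import numpy as np
import pandas as pd

# Scalar: the value is repeated once for every label in the index
from_scalar = pd.Series(7, index=['a', 'b', 'c'])

# Dictionary: the keys become the index, the values become the data
from_dict = pd.Series({'x': 1, 'y': 2})

# 1-D NumPy ndarray: elements are stored in order with the default index
from_array = pd.Series(np.array([1.5, 2.5, 3.5]))

print(from_scalar.tolist())    # [7, 7, 7]
print(list(from_dict.index))   # ['x', 'y']
print(from_array.dtype)        # float64
```

Note that a scalar needs an explicit index to be repeated; without one, the result is a Series of length 1.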

Note: By default, the index is 0, 1, 2, ..., n-1, where n is the length of the data.
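
A small sketch of the default index, using illustrative values with n = 4:

```python
import pandas as pd

values = [10, 20, 30, 40]      # illustrative data, n = 4
s_default = pd.Series(values)

print(list(s_default.index))   # [0, 1, 2, 3]
print(s_default.loc[3])        # 40, stored under the last label n-1
```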

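Code #2

When Data is a list of scalar values

# Program to create a Series from scalar values

# Import the pandas library
import pandas as pd

# Numeric data
Data = [1, 3, 4, 5, 6, 2, 9]

# Create a Series with the default index
s = pd.Series(Data)

# Predefined index labels
Index = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# Create a Series with the predefined index labels
si = pd.Series(Data, index=Index)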

Output:

Scalar data with default index

Scalar data with index
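
The two Series from Code #2 hold the same values and differ only in their labels. A minimal sketch of how the same element might be looked up in each, by label with .loc and by position with .iloc:

```python
import pandas as pd

Data = [1, 3, 4, 5, 6, 2, 9]
Index = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

s = pd.Series(Data)                  # default index 0..6
si = pd.Series(Data, index=Index)    # labels 'a'..'g'

print(s.loc[2])      # 4, label 2 in the default index
print(si.loc['c'])   # 4, label 'c' in the custom index
print(si.iloc[2])    # 4, third position regardless of labels
```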

Code #3

When Data is a dictionary

# Program to create a Series from a dictionary
import pandas as pd

dictionary = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}

# Create the Series; the dictionary keys become the index
sd = pd.Series(dictionary)

Output:

Dictionary type data
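
For a dictionary, the keys act as the index labels. The sketch below reuses the dictionary from Code #3 and also passes an explicit index, which is an addition for illustration: it selects and orders keys, and a label missing from the dictionary ('z' here) gets NaN.

```python
import pandas as pd

dictionary = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
sd = pd.Series(dictionary)

print(list(sd.index))   # ['a', 'b', 'c', 'd', 'e']
print(sd.loc['d'])      # 4

# An explicit index picks keys in the given order; 'z' has no value
subset = pd.Series(dictionary, index=['c', 'a', 'z'])
print(subset.loc['c'])          # 3.0 (the NaN forces a float dtype)
print(subset.isna().tolist())   # [False, False, True]
```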

Code #4

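When Data contains an ndarray

# Program to create a Series from a 2-D array
import pandas as pd

# Define a 2-D array (written here as a nested list)
Data = [[2, 3, 4], [5, 6, 7]]

# Create the Series; each inner list becomes one element
snd = pd.Series(Data)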

Output:

Data as Ndarray
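
Because a Series is one-dimensional, a nested list and a true 2-D NumPy array behave differently. The following sketch, with illustrative names, shows the difference as it appears in recent pandas versions: the nested list is stored as two list objects, while a 2-D ndarray is rejected and has to be flattened first.

```python
import numpy as np
import pandas as pd

# Nested list: the Series gets one element per inner list
snd = pd.Series([[2, 3, 4], [5, 6, 7]])
print(len(snd))       # 2
print(snd.dtype)      # object
print(snd.iloc[0])    # [2, 3, 4]

# A real 2-D ndarray is not accepted as Series data
arr = np.array([[2, 3, 4], [5, 6, 7]])
try:
    pd.Series(arr)
    rejected = False
except Exception:
    rejected = True
print(rejected)       # True

# Flattening first gives an ordinary 1-D Series
flat = pd.Series(arr.ravel())
print(flat.tolist())  # [2, 3, 4, 5, 6, 7]
```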

DataFrame

A DataFrame is a two-dimensional (2-D) data structure defined in pandas, consisting of rows and columns.
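
To make the row-and-column structure concrete, here is a small sketch with made-up data; the column names and values are assumptions for illustration only.

```python
import pandas as pd

# Two columns ('name', 'age') and three rows (default index 0, 1, 2)
table = pd.DataFrame({'name': ['Ann', 'Bob', 'Cy'], 'age': [31, 25, 40]})

print(table.shape)           # (3, 2): 3 rows, 2 columns
print(list(table.columns))   # ['name', 'age']
print(table.loc[1, 'age'])   # 25: row label 1, column 'age'
```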

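Code #1

Create DataFrame

# Program to create a DataFrame

# Import the pandas library
import pandas as pd

# Example values for illustration; replace with your own data
Data = {'x': [1, 2, 3], 'y': [4, 5, 6]}

# Create a DataFrame from the data
a = pd.DataFrame(Data)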

Here, Data can be:

  • One or more dictionaries
  • One or more Series
  • A 2-D NumPy ndarray
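
One way to pass several dictionaries, shown here as a sketch with made-up values, is as a list in which each dictionary becomes one row and its keys become the column names.

```python
import pandas as pd

# Each dictionary is one row; the keys 'a' and 'b' become the columns
records = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
df_rows = pd.DataFrame(records)

print(df_rows.shape)           # (2, 2)
print(list(df_rows.columns))   # ['a', 'b']
print(df_rows.loc[1, 'b'])     # 4
```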

Code #2

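When Data is a dictionary

# Program to create a DataFrame from two dictionaries
import pandas as pd

# Define dictionary 1
dict1 = {'a': 1, 'b': 2, 'c': 3, 'd': 4}

# Define dictionary 2
dict2 = {'a': 5, 'b': 6, 'c': 7, 'd': 8, 'e': 9}

# Define Data from dict1 and dict2; the outer keys become column names
Data = {'first': dict1, 'second': dict2}

# Create the DataFrame
df = pd.DataFrame(Data)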

Output:

DataFrame with two dictionaries
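
In Code #2, dict1 has no key 'e' while dict2 does. The sketch below rebuilds the same DataFrame and checks how pandas handles the gap: the row labels are the union of the keys, and the missing cell is filled with NaN.

```python
import pandas as pd

dict1 = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
dict2 = {'a': 5, 'b': 6, 'c': 7, 'd': 8, 'e': 9}
df = pd.DataFrame({'first': dict1, 'second': dict2})

print(df.shape)                       # (5, 2)
print(list(df.index))                 # ['a', 'b', 'c', 'd', 'e']
print(pd.isna(df.loc['e', 'first']))  # True: dict1 has no 'e'
print(df.loc['e', 'second'])          # 9
```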

Code #3

When Data is a Series

# Program to create a DataFrame from three Series
import pandas as pd

# Define Series 1
s1 = pd.Series([1, 3, 4, 5, 6, 2, 9])

# Define Series 2
s2 = pd.Series([1.1, 3.5, 4.7, 5.8, 2.9, 9.3])

# Define Series 3
s3 = pd.Series(['a', 'b', 'c', 'd', 'e'])

# Define Data; the keys become column names
Data = {'first': s1, 'second': s2, 'third': s3}

# Create the DataFrame
dfseries = pd.DataFrame(Data)

Output:

DataFrame of three Series
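
The three Series in Code #3 have different lengths (7, 6, and 5). As a sketch of what that implies, the code below rebuilds the DataFrame and counts the missing values: pandas aligns the Series on their index labels and pads the shorter ones with NaN.

```python
import pandas as pd

s1 = pd.Series([1, 3, 4, 5, 6, 2, 9])               # 7 values
s2 = pd.Series([1.1, 3.5, 4.7, 5.8, 2.9, 9.3])      # 6 values
s3 = pd.Series(['a', 'b', 'c', 'd', 'e'])           # 5 values
dfseries = pd.DataFrame({'first': s1, 'second': s2, 'third': s3})

print(dfseries.shape)                  # (7, 3)
print(dfseries.isna().sum().tolist())  # [0, 1, 2] missing values per column
```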

Code #4

When Data is a 2-D NumPy ndarray

Note: One constraint must be maintained when creating a DataFrame from 2-D arrays: the 2-D arrays must have the same dimensions.

# Program to create a DataFrame from 2-D arrays

# Import the pandas library
import pandas as pd

# Define 2-D array 1 (written here as a nested list)
d1 = [[2, 3, 4], [5, 6, 7]]

# Define 2-D array 2 (written here as a nested list)
d2 = [[2, 4, 8], [1, 3, 9]]

# Define Data; the keys become column names
Data = {'first': d1, 'second': d2}

# Create the DataFrame; each cell holds one inner list
df2d = pd.DataFrame(Data)

Output:

DataFrame with 2d ndarray
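
Code #4 places each nested list in a column, so every cell holds a whole inner list. As a contrasting sketch, a 2-D NumPy array can also be passed to pd.DataFrame directly, in which case each array element gets its own cell. The column names 'x', 'y', 'z' are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Dictionary of nested lists, as in Code #4: cells contain lists
df2d = pd.DataFrame({'first': [[2, 3, 4], [5, 6, 7]],
                     'second': [[2, 4, 8], [1, 3, 9]]})
print(df2d.shape)              # (2, 2)
print(df2d.loc[0, 'first'])    # [2, 3, 4]

# 2-D ndarray passed directly: one cell per array element
arr = np.array([[2, 3, 4], [5, 6, 7]])
df_arr = pd.DataFrame(arr, columns=['x', 'y', 'z'])
print(df_arr.shape)            # (2, 3)
print(df_arr.loc[1, 'y'])      # 6
```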

source:jb51.net