What is the difference between data frames and matrices in Python Pandas?

WBOY
Release: 2023-09-14 19:53:02

In this article, we show the difference between a data frame and a matrix in Python pandas.

Data frames and matrices are both two-dimensional data structures. Generally speaking, a data frame can hold several types of data at once (numbers, strings, categories, etc.), with each column having its own type, while a matrix stores only one type of data.
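The type difference can be seen in a minimal sketch like the one below. It assumes a NumPy array stands in for the matrix, which the article does not state explicitly; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# A DataFrame keeps a separate dtype for each column.
df = pd.DataFrame({"name": ["Virat", "Rohit"], "age": [25, 30]})
print(df.dtypes)

# A NumPy array has a single dtype, so mixed input is coerced to one type
# (here the integers become strings).
arr = np.array([["Virat", 25], ["Rohit", 30]])
print(arr.dtype)
print(arr[0, 1] == "25")
```

The age column stays numeric in the data frame, whereas in the array the ages are stored as text alongside the names.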

Data Frame in Python

In pandas, a DataFrame is a two-dimensional, tabular, mutable data structure whose columns can hold different data types. Both axes are labeled: rows by an index and columns by column names. DataFrames are useful in data preprocessing because they provide many data processing methods. A DataFrame can also be used to create pivot tables and to plot data using Matplotlib.
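As a rough illustration of the pivot table feature mentioned above, the sketch below uses pandas' pivot_table() method on a small, made-up sales table. The column names and values are hypothetical and not part of the original article.

```python
import pandas as pd

# Hypothetical sales records, used only for illustration.
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South"],
    "product": ["A", "B", "A", "B"],
    "units": [10, 5, 7, 12],
})

# One row per region, one column per product, summing the units sold.
pivot = sales.pivot_table(index="region", columns="product",
                          values="units", aggfunc="sum")
print(pivot)
```

The row and column labels of the result come from the values of the region and product columns, which is possible because a DataFrame carries labeled axes.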

Application of Dataframe

  • Data frames support a variety of tasks, such as preparing data for fitting statistical models.

  • Data processing (a matrix usually has to be converted to a data frame first to use these methods)

  • Convert rows to columns and vice versa, which is very useful in data science.
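The last point, swapping rows and columns, can be sketched with the DataFrame's T attribute (transpose). The measurements below are invented for illustration.

```python
import pandas as pd

# Illustrative data: two people (rows) and two measurements (columns).
df = pd.DataFrame({"height": [170, 182], "weight": [65, 80]},
                  index=["Meera", "Nick"])

# .T returns the transposed frame: rows become columns and vice versa.
transposed = df.T
print(df)
print(transposed)
```

After transposing, the measurements label the rows and the people label the columns.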

Create a sample data frame

Algorithm (steps)

The following are the algorithms/steps that need to be followed to perform the required task -

  • Use the import keyword to import the pandas and numpy modules with aliases.

  • Use the DataFrame() function of the pandas module to create a data frame.

  • Print the input data frame.

Example

The following program uses the DataFrame() function to return a data frame -

# importing pandas, numpy modules with alias names
import pandas as pd
import numpy as np

# creating a dataframe
inputDataframe = pd.DataFrame({'Name': ['Virat', 'Rohit', 'Meera', 'Nick', 'Sana'], 'Jobrole': ['Developer', 'Analyst', 'Help Desk', 'Database Developer', 'Finance accountant'], 'Age': [25, 30, 28, 25, 40]})

# displaying the dataframe
print(inputDataframe)

Output

When executed, the above program will generate the following output -

    Name             Jobrole  Age
0  Virat           Developer   25
1  Rohit             Analyst   30
2  Meera           Help Desk   28
3   Nick  Database Developer   25
4   Sana  Finance accountant   40

Matrix in Python

A matrix is a collection of homogeneous data organized in a two-dimensional rectangular grid. It is an m*n array whose elements all have the same data type, with a fixed number of rows and columns. In Python, a matrix is typically represented as a nested list or a NumPy array, and arithmetic operations such as addition, subtraction, multiplication, and division can be performed on it.
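A minimal sketch of these arithmetic operations follows, assuming the matrices are held as NumPy arrays (the article does not name a specific library for this). Note that * multiplies element by element, while @ computes the matrix product.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

print(a + b)   # element-wise addition
print(a - b)   # element-wise subtraction
print(a * b)   # element-wise multiplication (not the matrix product)
print(a / b)   # element-wise division, returns floats
print(a @ b)   # matrix product
```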

Application of Matrix

  • It is useful in economics for calculating statistics such as GDP (gross domestic product) or per capita income.

  • It is also useful for studying electrical and electronic circuits.

  • Matrices are used in research, for example for plotting graphs.

  • They are also useful in probability and statistics.

Matrix multiplication by converting matrix to data frame

Algorithm (steps)

The following are the algorithms/steps that need to be followed to perform the required task -

  • Use the import keyword to import the pandas module with an alias.

  • Create two variables to store the two input matrices respectively.

  • Use the pandas module's DataFrame() function to create a data frame for each of the two matrices and store them in separate variables.

  • Print the data frame of input matrix 1.

  • Print the dimensions (shape) of input matrix 1 by applying the shape attribute.

  • Print the data frame of input matrix 2.

  • Print the dimensions (shape) of input matrix 2 by applying the shape attribute.

  • Use the dot() function to multiply the matrices inputMatrix_1 and inputMatrix_2 and create a variable to store it.

  • Print the result matrix of the multiplication of inputMatrix_1 and inputMatrix_2 matrices.

  • Print the dimensions (shape) of the resulting matrix by applying the shape attribute.

Example

The following program converts two matrices into data frames and multiplies them using the dot() function -

# importing pandas module
import pandas as pd

# input matrix 1
inputMatrix_1 = [[1, 2, 2],
   [1,  2, 0],
   [1,  0, 2]]

# input matrix 2
inputMatrix_2 = [[1, 0, 1],
   [2, 1, 1],
   [2, 1, 2]]

# creating a dataframe of first matrix
# (here the data is loaded into a pandas DataFrame)
df_1 = pd.DataFrame(data=inputMatrix_1)

# creating a dataframe of second matrix
df_2 = pd.DataFrame(data=inputMatrix_2)

# printing the dataframe of input matrix 1
print("inputMatrix_1:")
print(df_1)

# printing the dimensions(shape) of input matrix 1
print("The dimensions(shape) of input matrix 1:")
print(df_1.shape)
print()

# printing the dataframe of input matrix 2
print("inputMatrix_2:")
print(df_2)

# printing the dimensions(shape) of input matrix 2
print("The dimensions(shape) of input matrix 2:")
print(df_2.shape)
print()

# multiplying both the matrices inputMatrix_1 and inputMatrix_2
result_mult = df_1.dot(df_2)

# Printing the resultant of matrix multiplication of inputMatrix_1 and inputMatrix_2
print("Resultant Matrix after Matrix multiplication:")
print(result_mult)

# printing the dimensions(shape) of resultant Matrix
print("The dimensions(shape) of Resultant Matrix:")
print(result_mult.shape)

Output

inputMatrix_1:
   0  1  2
0  1  2  2
1  1  2  0
2  1  0  2
The dimensions(shape) of input matrix 1:
(3, 3)

inputMatrix_2:
   0  1  2
0  1  0  1
1  2  1  1
2  2  1  2
The dimensions(shape) of input matrix 2:
(3, 3)

Resultant Matrix after Matrix multiplication:
   0  1  2
0  9  4  7
1  5  2  3
2  5  2  5
The dimensions(shape) of Resultant Matrix:
(3, 3)
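As a cross-check, the same product can be computed directly on NumPy arrays and compared with the pandas result. This is only a sketch: the use of NumPy's @ operator and the to_numpy() conversion is an addition here, not part of the original program.

```python
import numpy as np
import pandas as pd

inputMatrix_1 = [[1, 2, 2], [1, 2, 0], [1, 0, 2]]
inputMatrix_2 = [[1, 0, 1], [2, 1, 1], [2, 1, 2]]

# Matrix product computed directly on NumPy arrays.
product = np.array(inputMatrix_1) @ np.array(inputMatrix_2)

# The same product through pandas, converted back to a NumPy array.
via_pandas = pd.DataFrame(inputMatrix_1).dot(pd.DataFrame(inputMatrix_2)).to_numpy()

print(product)
print(np.array_equal(product, via_pandas))
```

One design point to keep in mind: DataFrame.dot() aligns the column labels of the first frame with the index labels of the second. Both frames here use the default labels 0, 1, 2, so the alignment matches ordinary matrix multiplication.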

The following table summarizes the differences between a matrix and a data frame.

Matrix and data frame

Matrix | Data frame
A collection of values arranged in a two-dimensional rectangular layout. | Stores tabular data whose columns (called fields) can hold different data types.
An m*n array whose elements all have the same data type. | A collection of columns of the same length; a generalized form of a matrix.
A matrix has a fixed number of rows and columns. | The number of rows and columns of a data frame is variable.
Homogeneous | Heterogeneous
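The fixed-versus-variable row of the table can be sketched as follows, again assuming a NumPy array plays the role of the matrix. The values are illustrative.

```python
import numpy as np
import pandas as pd

# A DataFrame can gain a column in place, so its shape changes.
df = pd.DataFrame({"Name": ["Virat", "Rohit"], "Age": [25, 30]})
df["Jobrole"] = ["Developer", "Analyst"]
print(df.shape)

# A NumPy array keeps its shape; "adding" a column builds a new array.
matrix = np.array([[1, 2], [3, 4]])
wider = np.column_stack([matrix, [5, 6]])
print(matrix.shape, wider.shape)
```

The original array is left unchanged, while the data frame itself grows from two columns to three.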

Conclusion

In this article, we learned about the difference between matrices and data frames in Python. We also learned how to create a data frame and how to convert a matrix into a data frame.


source:tutorialspoint.com