Pandas is a popular and powerful Python library for data analysis and manipulation. Its core data structures are Series and DataFrame, which are used for working with tabular and time series data (the older Panel structure has been removed from recent versions).
A Pandas DataFrame is a two-dimensional tabular data structure, and each of its columns can hold a different data type. We often need to know the data type of a column, for example before doing arithmetic on it or converting it. In this article, we'll cover several ways to determine the data type of a column in Pandas.
Before proceeding, let us create a sample DataFrame whose column data types we will inspect:
import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamborghini'],
                   'price': [5000000, 600000, 7000000]})
print(df)

This Python script prints the DataFrame we created:

  Vehicle name    price
0        Supra  5000000
1        Honda   600000
2  Lamborghini  7000000
You can complete this task in the following ways:
Use the dtypes attribute
Use the select_dtypes() method
Use the info() method
Use the describe() method
Now let us discuss each method and how to use them to get the data type of a column in Pandas.
We can use the dtypes attribute to get the data type of every column in the DataFrame. The attribute returns a Series indexed by column name, with each column's data type as the value. The syntax is:
Syntax
df.dtypes
Return type
A Series containing the data type of each column in the DataFrame.
The steps are as follows:
Import the Pandas library.
Use the pd.DataFrame() function to create a DataFrame, passing the sample data as a dictionary.
Use the dtypes attribute to get the data type of each column in the DataFrame.
Print the result to check the data type of each column.
# import the Pandas library
import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamborghini'],
                   'price': [5000000, 600000, 7000000]})

# print the dataframe
print("DataFrame:")
print(df)

# get the data types of each column
print("\nData types of each column:")
print(df.dtypes)

Output:

DataFrame:
  Vehicle name    price
0        Supra  5000000
1        Honda   600000
2  Lamborghini  7000000

Data types of each column:
Vehicle name    object
price            int64
dtype: object
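Because df.dtypes is an ordinary Series indexed by column name, it can be handled like any other Series. The following is a minimal sketch of that idea; the name dtype_map is illustrative and not part of Pandas.

```python
import pandas as pd

df = pd.DataFrame({
    "Vehicle name": ["Supra", "Honda", "Lamborghini"],
    "price": [5000000, 600000, 7000000],
})

# df.dtypes is a Series whose index holds the column names
dtypes = df.dtypes
print(type(dtypes).__name__)   # Series
print(list(dtypes.index))      # ['Vehicle name', 'price']

# Build a plain dict of column name -> dtype name for quick lookups
# (dtype_map is an illustrative name)
dtype_map = {col: str(dt) for col, dt in dtypes.items()}
print(dtype_map["price"])      # int64
```

A dictionary like this can be convenient when the type information has to be logged or compared against an expected schema.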
In this example, we get the data type of a single column of the DataFrame by indexing the dtypes result with the column name:
# import the Pandas library
import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamborghini'],
                   'price': [5000000, 600000, 7000000]})

# print the dataframe
print("DataFrame:")
print(df)

# get the data type of the column named price
print("\nData type of column named price:")
print(df.dtypes['price'])

Output:

DataFrame:
  Vehicle name    price
0        Supra  5000000
1        Honda   600000
2  Lamborghini  7000000

Data type of column named price:
int64
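A single column's type can also be read from the column itself through its dtype attribute, and the helpers in pandas.api.types can test the kind of type without comparing strings. The sketch below assumes the same sample data; price_dtype is an illustrative variable name.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype, is_numeric_dtype

df = pd.DataFrame({
    "Vehicle name": ["Supra", "Honda", "Lamborghini"],
    "price": [5000000, 600000, 7000000],
})

# Series.dtype gives the same answer as df.dtypes['price']
price_dtype = df["price"].dtype
print(price_dtype)                           # int64
print(price_dtype == df.dtypes["price"])     # True

# Type checks that do not depend on the exact dtype name
print(is_integer_dtype(df["price"]))         # True
print(is_numeric_dtype(df["Vehicle name"]))  # False
```

The helper functions are often the safer choice when the code should accept any integer or numeric width rather than one specific dtype.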
We can use the select_dtypes() method to keep only the columns of the data types we are interested in. It returns a subset of the DataFrame's columns based on the data types passed to it, so we can first select the columns of a given type and then check the data type of each one.
The steps are as follows:
Import the Pandas library.
Use the pd.DataFrame() function to create a DataFrame, passing the sample data as a dictionary.
Print the DataFrame to check the created data.
Use the select_dtypes() method to select all numeric columns from the DataFrame, passing the list of data types to select through the include parameter.
Loop over the selected columns and print the data type of each one.
# import the Pandas library
import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamborghini'],
                   'price': [5000000, 600000, 7000000]})

# print the dataframe
print("DataFrame:")
print(df)

# select the numeric columns
numeric_cols = df.select_dtypes(include=['float64', 'int64']).columns

# get the data type of each numeric column
for col in numeric_cols:
    print("Data Type of column", col, "is", df[col].dtype)

Output:

DataFrame:
  Vehicle name    price
0        Supra  5000000
1        Honda   600000
2  Lamborghini  7000000
Data Type of column price is int64
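Instead of listing individual numeric types, select_dtypes() also accepts the generic selector "number", and its exclude parameter returns the complementary set of columns. The following sketch uses the same sample data; numeric_df and non_numeric_df are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "Vehicle name": ["Supra", "Honda", "Lamborghini"],
    "price": [5000000, 600000, 7000000],
})

# "number" matches every numeric dtype (int, float, ...)
numeric_df = df.select_dtypes(include="number")

# exclude returns the columns that are NOT of the given types
non_numeric_df = df.select_dtypes(exclude="number")

print(list(numeric_df.columns))      # ['price']
print(list(non_numeric_df.columns))  # ['Vehicle name']
```

Using "number" avoids missing columns stored with a different width, such as int32 or float32.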
We can also use the info() method. It prints a concise summary of the DataFrame, including the data type and non-null count of each column. The syntax is:
Syntax
DataFrame.info(verbose=None, buf=None, max_cols=None, memory_usage=None, show_counts=None)
Return value
None. The summary is printed to standard output, or written to buf if one is given. Older Pandas releases named the last parameter null_counts; it has since been replaced by show_counts.
The steps are as follows:
Import the Pandas library.
Use the pd.DataFrame() function to create a DataFrame, passing the sample data as a dictionary.
Print the DataFrame to check the created data.
Call the info() method to get information about the DataFrame.
The summary is printed automatically. Because info() returns None, wrapping the call in print() would only add an extra "None" line.
# import the Pandas library
import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamborghini'],
                   'price': [5000000, 600000, 7000000]})

# print the dataframe
print("DataFrame:")
print(df)

# info() prints its summary itself, including the data type of each column
df.info()

Output:

DataFrame:
  Vehicle name    price
0        Supra  5000000
1        Honda   600000
2  Lamborghini  7000000
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   Vehicle name  3 non-null      object
 1   price         3 non-null      int64
dtypes: int64(1), object(1)
memory usage: 176.0+ bytes
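Since info() prints its summary rather than returning it, the buf parameter is one way to capture the text for further processing. The sketch below writes the summary into an in-memory buffer; buffer, summary, and price_line are illustrative names.

```python
import io

import pandas as pd

df = pd.DataFrame({
    "Vehicle name": ["Supra", "Honda", "Lamborghini"],
    "price": [5000000, 600000, 7000000],
})

# Write the summary into a StringIO buffer instead of standard output
buffer = io.StringIO()
result = df.info(buf=buffer)   # result is None
summary = buffer.getvalue()

# Pick out the summary line that describes the price column
price_line = next(line for line in summary.splitlines() if "price" in line)
print(price_line)
```

For programmatic checks, reading df.dtypes is usually simpler than parsing this text; capturing the summary is mainly useful for logging.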
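The describe() method generates descriptive statistics for a DataFrame. It does not report the original column data types directly: the dtypes attribute of its result describes the summary table, not the source data. In the example below, price is int64 in the original DataFrame but float64 in the summary, because the summary stores values such as the mean and uses NaN for statistics that do not apply to a column. Treat this method as a rough indicator of which columns are numeric rather than as an exact data type check.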
The steps are as follows:
Import the Pandas library.
Use the pd.DataFrame() function to create a DataFrame, passing the sample data as a dictionary.
Print the DataFrame to check the created data.
Use the describe() method to obtain the descriptive statistics of the DataFrame.
Set the include parameter of the describe() method to 'all' so that every column appears in the descriptive statistics.
Use the dtypes attribute to get the data type of each column in the resulting summary.
Print the data type of each column.
# import the Pandas library
import pandas as pd

# create a sample dataframe
df = pd.DataFrame({'Vehicle name': ['Supra', 'Honda', 'Lamborghini'],
                   'price': [5000000, 600000, 7000000]})

# print the dataframe
print("DataFrame:")
print(df)

# use the describe() method to get the descriptive statistics of the dataframe
desc_stats = df.describe(include='all')

# get the data type of each column of the summary table
dtypes = desc_stats.dtypes

# print the data type of each column
print("Data type of each column in the descriptive statistics:")
print(dtypes)

Output:

DataFrame:
  Vehicle name    price
0        Supra  5000000
1        Honda   600000
2  Lamborghini  7000000
Data type of each column in the descriptive statistics:
Vehicle name     object
price           float64
dtype: object
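To see how the summary's data types can differ from those of the source data, the two can be placed side by side. This is a small sketch under the same sample data; comparison is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    "Vehicle name": ["Supra", "Honda", "Lamborghini"],
    "price": [5000000, 600000, 7000000],
})

summary = df.describe(include="all")

# Put the original dtypes next to the dtypes of the summary table
comparison = pd.DataFrame({
    "original": df.dtypes,
    "summary": summary.dtypes,
})
print(comparison)

print(df.dtypes["price"])       # int64
print(summary.dtypes["price"])  # float64
```

The price column changes from int64 to float64 in the summary, so df.dtypes remains the reliable source for the actual column types.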
Knowing how to obtain the data type of each column helps us carry out data operations and analysis reliably. The dtypes attribute is the most direct option, select_dtypes() is useful for filtering columns by type, info() prints an overview together with non-null counts, and describe() only hints at the types through its summary statistics. Choose the one that matches the level of detail you need and your coding preferences.