Pandas is an open-source Python library for data manipulation and analysis, and it has become an essential tool for data scientists and analysts. It manages structured data efficiently through two core structures, Series and DataFrame.
In artificial intelligence work, Pandas is often used in the preprocessing steps of machine learning and deep learning pipelines. Through data cleaning, reshaping, merging, and aggregation, Pandas can transform raw data sets into structured, ready-to-use two-dimensional tables that can be fed into artificial intelligence algorithms.
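The cleaning, merging, and aggregation steps mentioned above can be sketched in a few lines of plain Pandas. The column names and values below are made up purely for illustration; a real pipeline would load its own data.

```python
import pandas as pd

# Hypothetical raw data: one missing score, plus a separate lookup table.
raw = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "score": [0.5, None, 0.9, 0.7],
})
groups = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "group": ["a", "a", "b", "b"],
})

# Cleaning: drop rows where the score is missing.
clean = raw.dropna(subset=["score"])

# Merging: attach the group label to each remaining row.
merged = clean.merge(groups, on="user_id", how="left")

# Aggregation: mean score per group, as a flat two-dimensional table.
features = merged.groupby("group", as_index=False)["score"].mean()
print(features)
```

The resulting `features` table has one row per group, which is the kind of tidy input a downstream model would typically expect.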
PandasAI turns Pandas into a conversational tool. You ask questions about your data in natural language, and it answers, typically in the form of a Pandas DataFrame.
For example, we can ask PandasAI to return all rows of a DataFrame in which a given column's value is greater than 5, and it will return a DataFrame containing only those rows.
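For comparison, this is roughly what such a request looks like when written by hand in plain Pandas. The DataFrame and its `value` column are hypothetical and serve only as an illustration.

```python
import pandas as pd

# Hypothetical data with a single numeric column to filter on.
df = pd.DataFrame({"name": ["a", "b", "c", "d"], "value": [3, 8, 5, 12]})

# Boolean mask: True where the value is strictly greater than 5.
result = df[df["value"] > 5]
print(result)
```

A conversational tool saves you from writing the mask yourself, but the result it returns should match this kind of filter.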
<code>import pandas as pd
from pandasai import PandasAI

# Sample DataFrame
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [21400000, 2940000, 2830000, 3870000, 2160000, 1350000, 1780000, 1320000, 516000, 14000000],
    "happiness_index": [7.3, 7.2, 6.5, 7.0, 6.0, 6.3, 7.3, 7.3, 5.9, 5.0]
})

# Instantiate a LLM
from pandasai.llm.openai import OpenAI
llm = OpenAI()

pandas_ai = PandasAI(llm)
pandas_ai.run(df, prompt='Which are the 5 happiest countries?')</code>
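Because the answer comes from a language model, it can be worth cross-checking it against a hand-written query. The sketch below computes the same "5 happiest countries" in plain Pandas from the sample data above. It is one possible way to verify the result, not what PandasAI runs internally.

```python
import pandas as pd

# Same country and happiness_index columns as the sample DataFrame.
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "happiness_index": [7.3, 7.2, 6.5, 7.0, 6.0, 6.3, 7.3, 7.3, 5.9, 5.0],
})

# nlargest returns the n rows with the highest values in the given column.
top5 = df.nlargest(5, "happiness_index")
print(top5)
```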
In addition to returning results, you can also generate charts:
<code>pandas_ai.run(
    df,
    "Plot the histogram of countries showing for each the gdp, using different colors for each bar",
)</code>
To use it, first install it with pip:
<code>pip install pandasai</code>
PandasAI needs an OpenAI API key so that it can call OpenAI's language model. Import the library, then pass the key when instantiating the LLM:
<code># Import pandas and pandas-ai
import pandas as pd
from pandasai import PandasAI

# Instantiating my llm using OpenAI API key.
from pandasai.llm.openai import OpenAI

# OpenAI
llm = OpenAI(api_token="YOUR_OPENAI_API_KEY")</code>
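Hardcoding the key in a script makes it easy to leak. A common alternative is to read it from an environment variable. The helper below is a minimal sketch: `load_api_key` is a hypothetical name introduced here, and it assumes the key was exported as `OPENAI_API_KEY` before the script runs.

```python
import os


def load_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key


# Usage (assumes the variable was exported in the shell beforehand):
# llm = OpenAI(api_token=load_api_key())
```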
Because PandasAI works on ordinary Pandas DataFrames, you are not limited to CSV files. You can also load data from a relational database such as PostgreSQL:
<code># Creating the URI and connecting to the database
# (passing a URI string requires SQLAlchemy and a PostgreSQL driver to be installed)
pg_conn = "postgresql://YOUR URI HERE"

# Query the SQL database
query = """
SELECT *
FROM table_name
"""

# Create a dataframe named df
df = pd.read_sql(query, pg_conn)</code>
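To try the same `pd.read_sql` pattern without a running PostgreSQL server, an in-memory SQLite database can stand in for it. This is only a local sketch: the `countries` table is hypothetical, and SQLite replaces PostgreSQL here purely for convenience.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for a real PostgreSQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (country TEXT, gdp INTEGER)")
conn.executemany(
    "INSERT INTO countries VALUES (?, ?)",
    [("France", 2830000), ("Germany", 3870000)],
)
conn.commit()

# read_sql runs the query and returns the rows as a DataFrame.
df = pd.read_sql("SELECT * FROM countries ORDER BY gdp DESC", conn)
conn.close()
print(df)
```

Once the data is in a DataFrame, it no longer matters which database it came from.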
Then, as in the code above, we can query the DataFrame directly:
<code># Using pandas-ai!
pandas_ai = PandasAI(llm)
pandas_ai.run(df, prompt='Place your prompt here')</code>
ChatGPT and Pandas are powerful tools that, when combined, can completely change the way we interact with and analyze data. With its advanced natural language processing capabilities, ChatGPT enables more intuitive human-like interaction with data. PandasAI can enhance the Pandas data analysis experience. By converting complex data manipulation tasks into simple natural language queries, PandasAI makes it easier for users to extract valuable insights from data without writing extensive code.
This is a new approach to programming for those who are not yet familiar with Python or pandas operations/conversions. Instead of programming the task you want to perform, you just talk to the AI agent and tell it exactly what result you want, and the agent converts this message into computer-interpretable code and returns the result.
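The idea of turning a message into code and running it can be illustrated with a toy sketch. This is not PandasAI's actual implementation. `fake_llm` is a hypothetical stub that returns a canned Pandas expression, where a real agent would call a language model and would need to sandbox whatever code comes back.

```python
import pandas as pd


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a language model: returns pandas code as text."""
    canned = {
        "Which rows have a value above 5?": "df[df['value'] > 5]",
    }
    return canned[prompt]


def ask(df: pd.DataFrame, prompt: str):
    """Turn a question into code via the stub, then evaluate it against df."""
    code = fake_llm(prompt)
    # eval is tolerable here only because the code comes from our own canned
    # table; running model-generated code for real requires sandboxing.
    return eval(code, {"__builtins__": {}}, {"df": df})


df = pd.DataFrame({"value": [3, 8, 12]})
answer = ask(df, "Which rows have a value above 5?")
print(answer)
```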