Tutorial to Create a Data Science Agent: A Code Implementation using gemini-2.0-flash-lite model through Google API, google.generativeai, Pandas and IPython.display for Interactive Data Analysis
www.marktechpost.com
By Asif Razzaq, CEO of Marktechpost Media Inc.

In this tutorial, we demonstrate how to combine Python's data manipulation library Pandas with Google's generative AI models through the google.generativeai package and the gemini-2.0-flash-lite model. After installing the necessary libraries, configuring the Google API key, and importing the IPython display utilities, the code walks step by step through building a data science agent that analyzes a sample sales dataset. The example shows how to convert a DataFrame into markdown format and then use natural language queries to generate insights about the data, highlighting the potential of combining traditional data analysis tools with modern AI-driven methods.

```python
!pip install pandas google-generativeai tabulate --quiet
```

First, we quietly install the Pandas and google-generativeai libraries, setting up the environment for data manipulation and AI-powered analysis. The tabulate package is included because DataFrame.to_markdown, used later in the tutorial, depends on it.

```python
import pandas as pd
import google.generativeai as genai
from IPython.display import Markdown
```

We import Pandas for data manipulation, google.generativeai for accessing Google's generative AI capabilities, and Markdown from IPython.display to render markdown-formatted outputs.

```python
GOOGLE_API_KEY = "Use Your API Key Here"
genai.configure(api_key=GOOGLE_API_KEY)
model = genai.GenerativeModel('gemini-2.0-flash-lite')
```

We assign a placeholder API key, configure the google.generativeai client with it, and initialize the gemini-2.0-flash-lite GenerativeModel for generating content.

```python
data = {'Product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam', 'Headphones'],
        'Category': ['Electronics', 'Electronics', 'Electronics', 'Electronics', 'Electronics', 'Electronics'],
        'Region': ['North', 'South', 'East', 'West', 'North', 'South'],
        'Units Sold': [150, 200, 180, 120, 90, 250],
        'Price': [1200, 25, 75, 300, 50, 100]}

sales_df = pd.DataFrame(data)

print("Sample Sales Data:")
print(sales_df)
print("-" * 30)
```

Here, we create a Pandas DataFrame named sales_df containing sample sales data for various products, and then print the DataFrame followed by a separator line to visually distinguish the output.

````python
def ask_gemini_about_data(dataframe, query):
    """
    Asks the Gemini model a question about the given Pandas DataFrame.

    Args:
        dataframe: The Pandas DataFrame to analyze.
        query: The natural language question about the DataFrame.

    Returns:
        The response from the Gemini model as a string.
    """
    # Embed the DataFrame as a markdown table so the model sees the full data.
    prompt = f"""You are a data analysis agent. Analyze the following pandas DataFrame and answer the question.

    DataFrame:
    ```
    {dataframe.to_markdown(index=False)}
    ```

    Question: {query}

    Answer:
    """
    response = model.generate_content(prompt)
    return response.text
````

Here, we construct a markdown-formatted prompt from a Pandas DataFrame and a natural language query, then use the gemini-2.0-flash-lite model to generate and return an analytical response.

```python
# Query 1: What is the total number of units sold across all products?
query1 = "What is the total number of units sold across all products?"
response1 = ask_gemini_about_data(sales_df, query1)
print(f"Question 1: {query1}")
print(f"Answer 1:\n{response1}")
print("-" * 30)
```

[Figure: Query 1 Output]

```python
# Query 2: Which product had the highest number of units sold?
query2 = "Which product had the highest number of units sold?"
response2 = ask_gemini_about_data(sales_df, query2)
print(f"Question 2: {query2}")
print(f"Answer 2:\n{response2}")
print("-" * 30)
```

[Figure: Query 2 Output]

```python
# Query 3: What is the average price of the products?
query3 = "What is the average price of the products?"
response3 = ask_gemini_about_data(sales_df, query3)
print(f"Question 3: {query3}")
print(f"Answer 3:\n{response3}")
print("-" * 30)
```

[Figure: Query 3 Output]

```python
# Query 4: Show me the products sold in the 'North' region.
query4 = "Show me the products sold in the 'North' region."
response4 = ask_gemini_about_data(sales_df, query4)
print(f"Question 4: {query4}")
print(f"Answer 4:\n{response4}")
print("-" * 30)
```

[Figure: Query 4 Output]

```python
# Query 5. More complex query: Calculate the total revenue for each product.
query5 = "Calculate the total revenue (Units Sold * Price) for each product and present it in a table."
response5 = ask_gemini_about_data(sales_df, query5)
print(f"Question 5: {query5}")
print(f"Answer 5:\n{response5}")
print("-" * 30)
```

[Figure: Query 5 Output]

In conclusion, this tutorial shows how Pandas, the google.generativeai package, and the gemini-2.0-flash-lite model can be combined to make data analysis more interactive. The approach simplifies querying and interpreting data and opens up avenues for further use cases such as data cleaning, feature engineering, and exploratory data analysis. By using these tools within the familiar Python ecosystem, data scientists can derive meaningful insights from their datasets with less manual effort.

The full code is available in the accompanying Colab Notebook.
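Because a language model can make arithmetic slips, it may be worth cross-checking answers to the five queries directly in Pandas. The following is a minimal sketch, not part of the original notebook, that recomputes each quantity from the same sample data without calling the API. The variable names (total_units, top_product, average_price, north_products, revenue_df) are illustrative assumptions.

```python
import pandas as pd

# Same sample data as the tutorial's sales_df.
data = {'Product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam', 'Headphones'],
        'Category': ['Electronics'] * 6,
        'Region': ['North', 'South', 'East', 'West', 'North', 'South'],
        'Units Sold': [150, 200, 180, 120, 90, 250],
        'Price': [1200, 25, 75, 300, 50, 100]}
sales_df = pd.DataFrame(data)

# Query 1: total number of units sold across all products.
total_units = int(sales_df['Units Sold'].sum())

# Query 2: product with the highest number of units sold.
top_product = sales_df.loc[sales_df['Units Sold'].idxmax(), 'Product']

# Query 3: average price of the products.
average_price = float(sales_df['Price'].mean())

# Query 4: products sold in the 'North' region.
north_products = sales_df.loc[sales_df['Region'] == 'North', 'Product'].tolist()

# Query 5: total revenue (Units Sold * Price) for each product.
revenue_df = sales_df.assign(
    Revenue=sales_df['Units Sold'] * sales_df['Price']
)[['Product', 'Revenue']]

print(f"Total units sold: {total_units}")        # 990
print(f"Top product by units: {top_product}")    # Headphones
print(f"Average price: {average_price:.2f}")     # 291.67
print(f"Products in North: {north_products}")    # ['Laptop', 'Webcam']
print(revenue_df.to_string(index=False))
```

These values can be compared against response1 through response5. For larger DataFrames, where the markdown table may no longer fit comfortably in a prompt, one possible design is to compute the numbers in Pandas and ask the model only to explain or summarize them.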