Building an Interactive Weather Data Scraper in Google Colab: A Code Guide to Extract, Display, and Download Live Forecast Data Using Python, BeautifulSoup, Requests, Pandas, and Ipywidgets
www.marktechpost.com
In this tutorial, we build an interactive web scraping project in Google Colab. This guide walks you through extracting live weather forecast data from the U.S. National Weather Service. You'll learn to set up your environment, write a Python script using BeautifulSoup and requests, and integrate an interactive UI with ipywidgets. The tutorial takes a step-by-step approach to collecting, displaying, and saving weather data, all within a single, self-contained Colab notebook.

```python
!pip install beautifulsoup4 ipywidgets pandas
```

First, we install three essential libraries: BeautifulSoup4 for parsing HTML content, ipywidgets for creating interactive elements, and pandas for data manipulation and analysis. Running this cell in your Colab notebook ensures your environment is fully prepared for the web scraping project.

```python
import requests
from bs4 import BeautifulSoup
import csv
from google.colab import files
import ipywidgets as widgets
from IPython.display import display, clear_output, FileLink
import pandas as pd
```

We import all the necessary libraries for the project: requests for handling HTTP requests, BeautifulSoup from bs4 for parsing HTML, and csv for managing CSV file operations. We also bring in files from google.colab for file downloads, ipywidgets and IPython's display tools for creating the interactive UI, and pandas for data manipulation and display.

```python
def scrape_weather():
    """
    Scrapes weather forecast data for San Francisco from the National Weather Service.
    Returns a list of dictionaries containing the period, short description, and temperature.
    """
    url = 'https://forecast.weather.gov/MapClick.php?lat=37.7772&lon=-122.4168'
    print("Scraping weather data from:", url)
    response = requests.get(url)
    if response.status_code != 200:
        print("Error fetching page:", url)
        return None
    soup = BeautifulSoup(response.text, 'html.parser')
    seven_day = soup.find(id="seven-day-forecast")
    if seven_day is None:
        # Guard against page layout changes; without this check, find_all below
        # would raise an AttributeError if the forecast section were missing.
        print("Could not find the seven-day forecast section.")
        return None
    forecast_items = seven_day.find_all(class_="tombstone-container")
    weather_data = []
    for forecast in forecast_items:
        period = forecast.find(class_="period-name").get_text() if forecast.find(class_="period-name") else ''
        short_desc = forecast.find(class_="short-desc").get_text() if forecast.find(class_="short-desc") else ''
        temp = forecast.find(class_="temp").get_text() if forecast.find(class_="temp") else ''
        weather_data.append({
            "period": period,
            "short_desc": short_desc,
            "temp": temp
        })
    print(f"Scraped {len(weather_data)} forecast entries.")
    return weather_data
```

With the above function, we retrieve the weather forecast for San Francisco from the National Weather Service. It makes an HTTP request to the forecast page, parses the HTML with BeautifulSoup, and extracts the forecast period, short description, and temperature from each entry in the seven-day forecast. The collected data is stored as a list of dictionaries and returned.

```python
def save_to_csv(data, filename="weather.csv"):
    """
    Saves the provided data (a list of dictionaries) to a CSV file.
    """
    with open(filename, "w", newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=["period", "short_desc", "temp"])
        writer.writeheader()
        writer.writerows(data)
    print(f"Data saved to {filename}")
    return filename
```

This function takes the scraped weather data, a list of dictionaries, and writes it to a CSV file using Python's csv module. It opens the file in write mode with UTF-8 encoding, initializes a DictWriter with the predefined fieldnames (period, short_desc, and temp), writes the header row, and then writes all the rows of data.
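Before wiring up the UI, it can help to sanity-check the two functions directly in a cell. This short usage sketch is not in the original post; it assumes the cells above have already been run, and the output will vary with the live forecast:

```python
# Run the scraper once and inspect the result before adding interactivity.
data = scrape_weather()
if data:
    save_to_csv(data)                   # writes weather.csv to the Colab filesystem
    display(pd.DataFrame(data).head())  # quick look at the first few rows
```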
```python
out = widgets.Output()

def on_button_click(b):
    """
    Callback function that gets executed when the "Scrape Weather Data" button is clicked.
    It scrapes the weather data, saves it to CSV, displays the data in a table,
    and shows a download link for the CSV file.
    """
    with out:
        clear_output()
        print("Starting weather data scrape...")
        data = scrape_weather()
        if data is None:
            print("Failed to scrape weather data.")
            return
        csv_filename = save_to_csv(data)
        df = pd.DataFrame(data)
        print("\nWeather Forecast Data:")
        display(df)
        print("\nDownload CSV file:")
        display(FileLink(csv_filename))

button = widgets.Button(description="Scrape Weather Data", button_style='success')
button.on_click(on_button_click)
display(button, out)
```

Finally, this snippet sets up an interactive UI in Colab using ipywidgets that, when triggered, scrapes the weather data, saves it to CSV, displays it in a table, and provides a download link for the file. It combines web scraping and user interaction in a compact notebook setup.

Output Sample

In this tutorial, we demonstrated how to combine web scraping with an interactive UI in a Google Colab environment. We built a complete project that fetches real-time weather data, processes it with BeautifulSoup, and displays the results in an interactive table while offering a CSV download option. Here is the Colab Notebook for the above project.
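As an optional extension beyond the article, the same approach generalizes to any location served by the NWS point forecast pages, since the MapClick URL is parameterized by latitude and longitude. The following is a minimal self-contained sketch; the function name scrape_weather_at and the Seattle coordinates are illustrative, not from the original tutorial:

```python
import requests
from bs4 import BeautifulSoup

def scrape_weather_at(lat, lon):
    """Hypothetical variant of scrape_weather for arbitrary coordinates."""
    url = f"https://forecast.weather.gov/MapClick.php?lat={lat}&lon={lon}"
    response = requests.get(url)
    if response.status_code != 200:
        return None
    soup = BeautifulSoup(response.text, "html.parser")
    seven_day = soup.find(id="seven-day-forecast")
    if seven_day is None:
        return None
    # Same per-entry extraction as in the tutorial's scrape_weather().
    return [
        {
            "period": item.find(class_="period-name").get_text() if item.find(class_="period-name") else "",
            "short_desc": item.find(class_="short-desc").get_text() if item.find(class_="short-desc") else "",
            "temp": item.find(class_="temp").get_text() if item.find(class_="temp") else "",
        }
        for item in seven_day.find_all(class_="tombstone-container")
    ]

# Example: Seattle (coordinates illustrative).
print(scrape_weather_at(47.6062, -122.3321))
```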