Python Web Scraping Example: Creating a Line Chart

Web scraping, the process of extracting data from websites, has become an invaluable tool for data analysis and visualization. Python, with its robust libraries like BeautifulSoup and Selenium, offers a straightforward approach to scraping web data. In this article, we will walk through an example of scraping web data using Python and then creating a line chart to visualize the extracted information.

Step 1: Setting Up the Environment

Before diving into scraping, ensure you have Python installed on your machine. You’ll also need to install some external libraries, which you can do using pip:

bashCopy Code
pip install requests beautifulsoup4 matplotlib pandas
  • requests for making HTTP requests.
  • beautifulsoup4 for parsing HTML and XML documents.
  • matplotlib for plotting graphs.
  • pandas for data manipulation and analysis.

Step 2: Scraping Web Data

As an example, let’s scrape historical stock prices from Yahoo Finance. Note that web scraping can violate the terms of service of some websites. Always ensure you have permission to scrape data and comply with robots.txt and legal requirements.

pythonCopy Code
import requests from bs4 import BeautifulSoup import pandas as pd url = 'https://finance.yahoo.com/quote/AAPL/history?period1=1609459200&period2=1640995200&interval=1d&filter=history&frequency=1d' response = requests.get(url) soup = BeautifulSoup(response.text, 'html.parser') table = soup.find('table', {'class': 'W(100%) M(0) Bdcl(c)'}) rows = table.find_all('tr') data = [] for row in rows[1:]: # Skip the header row cols = row.find_all('td') data.append([ele.text.strip() for ele in cols]) # Convert the data into a DataFrame df = pd.DataFrame(data, columns=['Date', 'Open', 'High', 'Low', 'Close*', 'Adj Close**', 'Volume']) df['Date'] = pd.to_datetime(df['Date']) df.set_index('Date', inplace=True)

Step 3: Creating a Line Chart

With the data extracted and stored in a pandas DataFrame, we can now use matplotlib to create a line chart. Let’s plot the closing prices over time.

pythonCopy Code
import matplotlib.pyplot as plt plt.figure(figsize=(10, 5)) plt.plot(df['Close*'], label='Close Price') plt.title('AAPL Stock Prices Over Time') plt.xlabel('Date') plt.ylabel('Close Price') plt.legend() plt.grid(True) plt.show()

This code will generate a line chart showing the closing prices of AAPL stock over the specified period.

Conclusion

Python, with its powerful libraries, makes web scraping and data visualization accessible to everyone. This example demonstrates how to scrape historical stock prices from Yahoo Finance and create a line chart to visualize the data. Remember to always respect the website’s terms of service and use scraping responsibly.

[tags]
Python, Web Scraping, BeautifulSoup, Matplotlib, Data Visualization, Pandas, Stock Prices, Line Chart

Python official website: https://www.python.org/