Web Scraping

python@millstreamsolutions.com

Turn hours of copying and pasting website data into one simple click. We develop custom web scraping scripts that let businesses and individuals pull data from websites instantly and get straight to analysis.

Recent Project

State Production Records

Data pulled from web and graphs generated automatically

For this project, our client was looking to purchase a number of oil & gas wells and wanted to analyze publicly available production data. We wrote a Python script that automatically entered each API number (the unique identifier assigned to each well) into the state website, initiated the search, found the production data table in the page's HTML, and pulled the data into an Excel spreadsheet. What would have taken hours of search → copy → paste → search, with the potential for human error, took only about 100 seconds, with complete confidence in the accuracy of the data.
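To illustrate the "find the table in the HTML" step, here is a minimal sketch using only Python's standard library. It is not the script delivered to the client: the table id `production`, the column names, and the sample page fragment are all assumptions made up for the example, and a real state website would need its own selectors.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of the <table> that has a given id."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.in_table = False
        self.in_cell = False
        self.rows = []    # finished rows, each a list of cell strings
        self._row = []    # cells of the row currently being read
        self._cell = []   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table" and dict(attrs).get("id") == self.table_id:
            self.in_table = True
        elif self.in_table and tag == "tr":
            self._row = []
        elif self.in_table and tag in ("td", "th"):
            self.in_cell = True
            self._cell = []

    def handle_data(self, data):
        if self.in_cell:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if not self.in_table:
            return
        if tag in ("td", "th"):
            self.in_cell = False
            self._row.append("".join(self._cell).strip())
        elif tag == "tr":
            if self._row:
                self.rows.append(self._row)
        elif tag == "table":
            self.in_table = False


def production_rows(html, table_id="production"):
    """Return the table as a list of dicts keyed by the header row."""
    parser = TableExtractor(table_id)
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Made-up page fragment standing in for one well's search result.
SAMPLE_HTML = """
<html><body>
<table id="production">
  <tr><th>Month</th><th>Oil (bbl)</th><th>Gas (mcf)</th></tr>
  <tr><td>2020-01</td><td>1200</td><td>3400</td></tr>
  <tr><td>2020-02</td><td>1150</td><td>3300</td></tr>
</table>
</body></html>
"""

rows = production_rows(SAMPLE_HTML)
print(rows[0])
```

From here, one way to get the rows into a spreadsheet is `pandas.DataFrame(rows).to_excel("production.xlsx", index=False)`, which requires an Excel writer such as openpyxl to be installed.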

In addition to quickly pulling the data, we used the Python library Matplotlib to automatically generate a graph of historical production data for each well of interest. So, not only did the client receive production data in tabular format, but we also kick-started their analysis by providing visual aids of the data for each well.
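As a rough sketch of how one chart per well can be generated with Matplotlib: the well names and production figures below are invented for illustration, and the function name `plot_well_history` is hypothetical rather than taken from the project.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render straight to files; no display window needed
import matplotlib.pyplot as plt


def plot_well_history(well_id, months, oil_bbl, out_dir):
    """Save a line chart of monthly oil production for one well; return its path."""
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.plot(months, oil_bbl, marker="o")
    ax.set_title(f"Well {well_id}: monthly oil production")
    ax.set_xlabel("Month")
    ax.set_ylabel("Oil (bbl)")
    ax.tick_params(axis="x", labelrotation=45)
    fig.tight_layout()
    path = os.path.join(out_dir, f"{well_id}.png")
    fig.savefig(path)
    plt.close(fig)  # release the figure so long loops do not pile up memory
    return path


# Made-up production figures for two hypothetical wells.
months = ["2020-01", "2020-02", "2020-03", "2020-04"]
wells = {
    "WELL-A": [1200, 1150, 1100, 1040],
    "WELL-B": [800, 790, 760, 700],
}

out_dir = tempfile.mkdtemp()
chart_paths = [
    plot_well_history(well_id, months, oil, out_dir)
    for well_id, oil in wells.items()
]
print(chart_paths)
```

Closing each figure after saving keeps memory use flat when the loop covers a few hundred wells.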

Time Saved by Millstream Solution: 8 minutes/well x 200 wells = 1,600 minutes, or about 3.33 eight-hour workdays
Key Words: Python, Pandas, Matplotlib, Oil & Gas, Excel

Recent Project

State Permit Records

Automated weekly permit report email

For this project, our client wanted a weekly report of which oil & gas permits were submitted to the state for jobs that would require cementing work (Plug & Abandoned and New Drill). We wrote a Python script that:

  • Automatically initiates a browser session
  • Inputs the past week's dates
  • Clicks search
  • Downloads the Excel file with the permit data
  • Pulls the data into Python and removes unwanted rows and columns
  • Replaces the old Excel file with the new, revised Excel file
  • Emails the Excel file to the client and associates
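
The date, cleanup, and email steps above can be sketched as follows. This is an illustration under stated assumptions, not the client's script: the column names, permit types, email addresses, and function names are hypothetical. The browser steps are left out because they depend on the state site's page layout, and the message is built but not sent (sending would typically go through `smtplib`).

```python
import datetime as dt
from email.message import EmailMessage

import pandas as pd


def past_week(today):
    """Return (start, end) dates covering the seven days ending on `today`."""
    return today - dt.timedelta(days=6), today


def clean_permits(df, keep_types, keep_columns):
    """Keep only the permit types and columns of interest."""
    mask = df["Permit Type"].isin(keep_types)
    return df.loc[mask, keep_columns].reset_index(drop=True)


def build_report_email(sender, recipients, start, end, attachment_bytes):
    """Build (but do not send) the weekly email with the Excel file attached."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = f"Permit report {start:%Y-%m-%d} to {end:%Y-%m-%d}"
    msg.set_content("The weekly permit report is attached.")
    msg.add_attachment(
        attachment_bytes,
        maintype="application",
        subtype="vnd.openxmlformats-officedocument.spreadsheetml.sheet",
        filename="permits.xlsx",
    )
    return msg


# Made-up rows standing in for the downloaded permit file.
raw = pd.DataFrame({
    "Permit Type": ["New Drill", "Rework", "Plug & Abandoned"],
    "Operator": ["Operator A", "Operator B", "Operator C"],
    "County": ["County 1", "County 2", "County 3"],
    "Internal Code": [11, 22, 33],
})

start, end = past_week(dt.date(2024, 3, 8))  # a Friday
report = clean_permits(
    raw,
    keep_types=["New Drill", "Plug & Abandoned"],
    keep_columns=["Permit Type", "Operator", "County"],
)
# Placeholder bytes; the real workflow would attach the revised Excel file.
msg = build_report_email(
    "reports@example.com", ["client@example.com"], start, end, b"placeholder"
)
print(msg["Subject"], len(report))
```

Keeping each step in its own small function makes it easy to rerun just the cleanup or the email if the website changes and only the download step needs repair.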

This Python script is scheduled with Windows Task Scheduler, so every Friday, once the user's computer is turned on, the script runs and the emails are sent out.
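
For readers who prefer to register such a task from code rather than through the Task Scheduler window, here is a hedged sketch that builds the equivalent `schtasks` command. The task name, file paths, and 08:00 start time are assumptions chosen for the example. The function only builds the command; it does not run it.

```python
def weekly_task_command(task_name, python_exe, script_path, start_time="08:00"):
    """Build a schtasks command that runs a script every Friday."""
    return [
        "schtasks", "/Create",
        "/TN", task_name,                          # name shown in Task Scheduler
        "/TR", f'"{python_exe}" "{script_path}"',  # program to run
        "/SC", "WEEKLY",                           # schedule type
        "/D", "FRI",                               # day of the week
        "/ST", start_time,                         # start time, 24-hour HH:mm
    ]


# Hypothetical task name and paths, for illustration only.
cmd = weekly_task_command(
    "WeeklyPermitReport",
    r"C:\Python\python.exe",
    r"C:\scripts\permit_report.py",
)
print(" ".join(cmd))
```

On a Windows machine the list could be passed to `subprocess.run(cmd, check=True)`. To have the task run after the computer is switched on when the scheduled time was missed, the task's Settings tab in Task Scheduler offers "Run task as soon as possible after a scheduled start is missed".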

Time Saved by Millstream Solution: 10 minutes/week
Key Words: Python, Pandas, Selenium, Oil & Gas, Excel, Email