How Can I Make my Own Price Tracker with Python and Selenium?
Welcome to my blog. My name is Jonathan and I am a software developer who has worked in IT for the last 15 years. In today’s blog post, we’re going to discuss how to build our own price tracker. This price tracker has three main components: Selenium, Python, and the machine running the script.
How do all these technologies work together? In this deployment, Selenium is used as a Python library. There are other deployments, for example for PHP and Java, but for the purposes of this example we’re using Python. In my experience, Python is just more versatile and has more options if you want to scale this up.
The Chrome Driver
The Chrome driver is basically how Selenium runs. Selenium uses a driver (referenced in your code) to control the browser and run your script.
PATH = "/usr/local/bin/chromedriver"
driver = webdriver.Chrome(PATH, options=chromeOptions)
In the above example, PATH is the location of the driver on your machine.
How Do I Download The Chrome Driver for Selenium On Mac?
There are two important parts you need to understand. The first is the download location, which is here: https://sites.google.com/a/chromium.org/chromedriver/downloads
The second is that you need to match the driver version with your current version of Chrome. To get your current version of Chrome, click the Chrome options dropdown, then “Help” and then “About Google Chrome.” Look at the screenshot below: you need to match the Chrome driver download with the version you have installed. Please be aware that if you update to a new version of Chrome but forget to swap out the driver, your script will break.
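If you want a quick sanity check on that version match, here is a minimal sketch. It assumes that what matters is the first number (the major version) of each version string, and the version strings in it are hypothetical values typed in by hand, not something read from Chrome or the driver.

```python
def major_version(version: str) -> int:
    """Return the leading number of a dotted version string, e.g. 114 for '114.0.5735.90'."""
    return int(version.strip().split(".")[0])


def versions_match(chrome_version: str, driver_version: str) -> bool:
    """True when Chrome and the driver share the same major version."""
    return major_version(chrome_version) == major_version(driver_version)


# Hypothetical version strings, copied by hand from "About Google Chrome"
# and from the driver download page.
print(versions_match("114.0.5735.198", "114.0.5735.90"))  # True
print(versions_match("115.0.5790.102", "114.0.5735.90"))  # False
```

If the check prints False, that is your cue to download a new driver before running the tracker.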
How to Run Chrome Driver Headless?
To run the Chrome driver headless, you need to specify in your options that it should be instantiated headless. To do that, use the code below. You can see there are two ways to create the driver; the second one is commented out. First, create your chromeOptions. Second, set headless to True (or False if you want to see the browser window). Alternatively, you can leave the options out and it will default to not headless.
#Set some selenium chrome options
chromeOptions = Options()
chromeOptions.headless = False
driver = webdriver.Chrome(PATH, options=chromeOptions)
# driver = webdriver.Chrome(PATH)
What is Chrome Driver Headless Mode?
When it’s headless, no window will pop up while the Chrome driver runs. When it is not headless, you will see a new instance of Chrome spawn on your screen, and it will click the links, input the keys, etc. You will literally watch it work.
How to Install Google Chrome Driver on Mac for Selenium?
Python Deployment
Now I want to talk a little bit about my specific deployment. I am using the free version of PyCharm. The code below shows my imported libraries:
import pickle
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.options import Options
Picking Your Target Page
So if we’re making a price tracker, we need to choose the location of that price. For this example, let’s use this page: https://corbah.com/collections/jackets/products/corbah-cxm-light-weight-fall-jacket
In the code, choosing and opening that page will look like this:
def initialize_browser():
    driver.get("https://corbah.com/collections/jackets/products/corbah-cxm-light-weight-fall-jacket")
    print("starting_Driver")

#run browser init
initialize_browser()
With our target page selected, we need to find the price. For this particular item, the price is $199.95. Let’s grab that price. We start by right-clicking the price and clicking “Inspect.” With the inspector, we can find some properties of the HTML being used to display the price. The HTML div for the price is below:
<div class="selection-wrapper price product-single__price-product-template">
<span class="money" id="ProductPrice-product-template" itemprop="price" content="199.95">$199.95</span>
<p id="ComparePrice-product-template" style="display:none;">
Compare at <span class="money"></span>
</p>
</div>
So given this HTML, how do we pull out the price? Within the driver object, there are a handful of methods for finding elements. I can look at this code and see that there is a CSS class being used by the div that wraps the price text. The name of that CSS class is product-single__price-product-template, and I will use that as a flag to grab the price. That code looks like this:
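# Newer Selenium 4 releases dropped find_elements_by_class_name in favor of find_elements(By.CLASS_NAME, ...)
from selenium.webdriver.common.by import By
price = driver.find_elements(By.CLASS_NAME, 'product-single__price-product-template')
print(price[0].text)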
And that is it. That is all the code you need to pull the price of a product off a website, in this example a Shopify website. I think that’s about 20 lines of code total, so you can see it is quite straightforward. The remainder of the code runs the comparison that alerts you to a new price.
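One thing worth knowing before comparing prices: the text Selenium gives you is a string like "$199.95", and strings compare character by character, so "$99.99" would sort above "$199.95". Here is a small sketch of turning that text into a number first. The parse_price helper is my own hypothetical name, and it assumes the price is the first number in the text and uses a dot for decimals.

```python
from decimal import Decimal
import re


def parse_price(text: str) -> Decimal:
    """Pull the first number out of a price string such as '$199.95' or '$1,299.00'."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    # Drop thousands separators before converting.
    return Decimal(match.group().replace(",", ""))


print(parse_price("$199.95"))    # 199.95
print(parse_price("$1,299.00"))  # 1299.00
```

With the tracker, you would pass price[0].text into this helper and compare the results instead of the raw text.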
Python Pickling and Comparing New vs Old
Pickling is basically how Python serializes objects. It’s the most basic type of file storage built into Python, and I decided to use it because the scope of this tutorial is limited. The logic below is basically all we need to test whether the price is unchanged, lower, higher, or being loaded for the first time.
import pickle

filename = 'prices_file'
first_load = False
current_price = price[0].text

try:
    infile = open(filename, 'rb')
except FileNotFoundError:
    # First run: no saved price yet, so save the current one.
    with open(filename, 'wb') as outfile:
        pickle.dump({'price': current_price}, outfile)
    first_load = True
    infile = open(filename, 'rb')

new_dict = pickle.load(infile)
infile.close()
old_price = new_dict['price']

# Compare as numbers, not strings (as text, "$99.99" sorts above "$199.95").
# This assumes the price text looks like "$199.95".
old_value = float(old_price.replace('$', '').replace(',', ''))
new_value = float(current_price.replace('$', '').replace(',', ''))

if first_load:
    print('nothing to compare, first load')
elif old_value == new_value:
    print("unchanged price")
elif old_value < new_value:
    print("The Price has risen")
else:
    print("The Price has dropped")

# Save the current price so the next run compares against it.
with open(filename, 'wb') as outfile:
    pickle.dump({'price': current_price}, outfile)
And that is it. That is all the logic you need to test whether your price is unchanged, risen, or dropped. The only thing left to do is build in your specific needs for
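If you would rather have the comparison as a single reusable function, here is one way it could be packaged. This is only a sketch: check_price is a hypothetical name of mine, it expects the price already converted to a number, and the demo uses made-up prices in a temporary folder instead of a real page.

```python
import os
import pickle
import tempfile


def check_price(filename: str, current_price: float) -> str:
    """Compare current_price with the pickled one, then save current_price."""
    try:
        with open(filename, "rb") as infile:
            old_price = pickle.load(infile)["price"]
    except FileNotFoundError:
        # No saved price yet, so there is nothing to compare against.
        status = "first load"
    else:
        if current_price == old_price:
            status = "unchanged"
        elif current_price > old_price:
            status = "risen"
        else:
            status = "dropped"
    # Store the latest price so the next run compares against it.
    with open(filename, "wb") as outfile:
        pickle.dump({"price": current_price}, outfile)
    return status


# Demo in a throwaway directory with made-up prices.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "prices_file")
    print(check_price(path, 199.95))  # first load
    print(check_price(path, 199.95))  # unchanged
    print(check_price(path, 149.95))  # dropped
    print(check_price(path, 179.95))  # risen
```

Returning a status string instead of printing makes it easier to hook in whatever you want to happen when the price moves.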
How do you write in a text box in Selenium?
- Find the class name of the text box. In the case of Google.com, the class name of the text box is “gLFyf”
- Select the text box by its class name: content = driver.find_element(By.CLASS_NAME, 'gLFyf')
- Click the text box using Selenium: content.click()
- Send the keys to the text box: content.send_keys("the text you wanna send")
How to Run The Most Basic Selenium Script
#Requirement number 1, install necessary package and imports.
from selenium import webdriver
#required for controlling browser By and Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
#REQUIRED FOR HEADLESS PART 1
from selenium.webdriver.chrome.options import Options
#New paradigm for path - required
from selenium.webdriver.chrome.service import Service
#Requirement #2 - Set your installed chrome driver.
# There is a video on this in the description
s = Service('/usr/local/bin/chromedriver')
#Set some selenium chrome options - Optional
#These options set your driver to run headless or not.
chromeOptions = Options()
chromeOptions.headless = False
#Initialize your Chrome Driver with services and options (optional)
#For more info on running headless, there's a link below in description
driver = webdriver.Chrome(service=s, options=chromeOptions)
# Run the driver here. How you trigger this will depend on your setup.
def initialize_browser():
    driver.get("https://Google.com")
    print("starting_Driver")
    #content = driver.find_element_by_class_name('gLFyf')
    content = driver.find_element(By.CLASS_NAME, "gLFyf")
    content.click()
    content.send_keys("The Text")
    content.send_keys(Keys.RETURN)
    result_stats = driver.find_element(By.ID, "result-stats")
    print(result_stats.text)

#run browser init
initialize_browser()