Data Acquisition 10 | Selenium

Series: Data Acquisition

1. Selenium Introduction

Selenium is an open-source tool for automating web browsers. It is most often used for testing web applications, and it can be driven from Python. We have already used Python programs to scrape data from websites, but some scenarios make scraping more challenging. For example, if a page requires us to log in or click a button, BeautifulSoup alone cannot do it, because it only parses HTML and cannot interact with the page.

2. Installation

To use Selenium on our computer, we first have to install it.

Step 1. Download and install the Chrome browser.

Step 2. Install the Python packages we need.

$ pip install -U selenium
$ pip install chromedriver

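Step 3. Install the ChromeDriver binary executable (non-Python code). On macOS with Homebrew (recent Homebrew versions use the --cask flag instead of the older brew cask subcommand):

$ brew install --cask chromedriver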

3. Starter Code for Selenium

We can use the following code to test Selenium,
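A minimal sketch of such a starter script, assuming Selenium 4 with Chrome and a matching driver available on the system. The function names open_browser and run_starter and the start URL are illustrative choices rather than part of any official example.

```python
def open_browser():
    """Start a new Chrome instance (assumes Chrome and its driver are installed)."""
    # Imported here so the module loads even before Selenium is installed.
    from selenium import webdriver

    return webdriver.Chrome()


def run_starter(driver, url="https://www.google.com", wait_for_enter=input):
    """Load `url`, wait until the user presses Enter, then close the browser."""
    try:
        driver.get(url)  # the URL is an illustrative default
        wait_for_enter("Press Enter to quit")
    finally:
        driver.quit()  # closes the window and stops the driver process
```

Calling run_starter(open_browser()) at the end of the script starts the session. The try/finally makes sure the browser is closed even if something fails before Enter is pressed.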

This program opens a new instance of the Chrome browser. When we then press the Enter key in the terminal, the instance is closed.

4. Browser Searching Simulation

If we want Selenium to search for something, we can write a script to do so. To find the search box element, we can inspect the HTML of the Google home page, where we find,

<input class="gLFyf gsfi" maxlength="2048" name="q" type="text" jsaction="paste:puy29d" aria-autocomplete="both" aria-haspopup="false" autocapitalize="off" autocomplete="off" autocorrect="off" autofocus="" role="combobox" spellcheck="false" title="Search" value="" aria-label="Search" data-ved="0ahUKEwi4yJ-667btAhUhMn0KHeLHBvUQ39UDCAY">

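This tells us that the name of the search box element is “q”. To grab this element, we can write the following (the find_element_by_name helper was removed in Selenium 4.3, so we pass By.NAME to find_element instead),

from selenium.webdriver.common.by import By
search_box = driver.find_element(By.NAME, 'q')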

Then we can use the following statement to type the search words into this box,

search_box.send_keys(query)

Finally, the .submit method submits the form that contains the box, which starts the search,

search_box.submit()

In summary, we can test the following code,
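The three statements above can be combined into one helper. This is a sketch under the assumption that the page at the given URL still exposes a search box named “q”; the function name search and its url parameter are illustrative, and a consent or cookie page shown before the search box would need extra handling that is not covered here.

```python
def search(driver, query, url="https://www.google.com"):
    """Open `url`, type `query` into the box named "q", and submit it."""
    # Imported here so the module loads even before Selenium is installed.
    from selenium.webdriver.common.by import By

    driver.get(url)
    search_box = driver.find_element(By.NAME, "q")  # locate the search box
    search_box.send_keys(query)                     # type the search words
    search_box.submit()                             # submit the enclosing form
```

With a driver created by webdriver.Chrome(), a call such as search(driver, "selenium python") should leave the browser on the results page until driver.quit() is called.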

5. Button Clicking Simulation

To click a certain button, we can call the following method on the button element,

btn.click()

An interesting example is to simulate a click on the webpage willyoupressthebutton.com. There are two buttons on this page and we can choose one to click. The program is,

The result looks like this (the numbers depend on the question shown and the option chosen):
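A sketch of the clicking step, assuming the two buttons carry the element ids "yesbtn" and "nobtn". These ids are an assumption and should be confirmed in the page source; the helper names normalise_answer and press_button are illustrative. Reading the vote counts afterwards depends on the markup of the results page and is left out.

```python
# Assumed element ids for the two buttons; confirm them in the page source.
BUTTON_IDS = {"yes": "yesbtn", "no": "nobtn"}


def normalise_answer(text):
    """Map user input such as ' Yes ' to 'yes' or 'no'; reject anything else."""
    answer = text.strip().lower()
    if answer not in BUTTON_IDS:
        raise ValueError(f"expected 'yes' or 'no', got {text!r}")
    return answer


def press_button(driver, answer):
    """Click the button matching `answer` on the page the driver has loaded."""
    # Imported here so the module loads even before Selenium is installed.
    from selenium.webdriver.common.by import By

    btn = driver.find_element(By.ID, BUTTON_IDS[normalise_answer(answer)])
    btn.click()
```

After driver.get("https://willyoupressthebutton.com"), the helper could be called as press_button(driver, input("WILL YOU PRESS THE BUTTON? (yes/no):")).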

WILL YOU PRESS THE BUTTON? (yes/no):yes
Yes: 41290 (56%)
No: 31976 (44%)
Press Enter to quit

6. Logging in Simulation

We can also write a program to simulate logging in to Twitter (quite similar to what we have done before),
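A generic sketch of the login steps. Login pages change often, so the field names "username" and "password" and the login_url argument are placeholders rather than Twitter's real markup, and the function name log_in is illustrative. A real page may also need explicit waits or several steps, which this sketch does not handle.

```python
def log_in(driver, username, password, login_url,
           user_field="username", password_field="password"):
    """Fill in a login form and submit it.

    `user_field` and `password_field` are hypothetical `name` attributes;
    replace them with the values found in the page source.
    """
    # Imported here so the module loads even before Selenium is installed.
    from selenium.webdriver.common.by import By

    driver.get(login_url)
    driver.find_element(By.NAME, user_field).send_keys(username)
    password_box = driver.find_element(By.NAME, password_field)
    password_box.send_keys(password)
    password_box.submit()  # submit the form that contains the password box
```

Reading the password with getpass.getpass() from the standard library, instead of hard-coding it, keeps it out of the script and off the screen.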

To make the program more user-friendly, a GUI can be used to create a window in which users type their username and password,

After running this code, a GUI window is created, which makes it clear to our users what they need to enter.
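One way to build such a window is with tkinter from the standard library. This is a sketch, not necessarily the toolkit or layout used originally; the names ask_credentials and has_credentials are illustrative.

```python
def has_credentials(username, password):
    """Return True only if both a non-blank username and a password were given."""
    return bool(username and username.strip()) and bool(password)


def ask_credentials():
    """Show a small login window and return (username, password).

    Returns (None, None) if the window is closed without pressing the button.
    """
    import tkinter as tk  # imported here because it needs a display to be useful

    result = {}
    root = tk.Tk()
    root.title("Login")

    tk.Label(root, text="Username").grid(row=0, column=0)
    user_entry = tk.Entry(root)
    user_entry.grid(row=0, column=1)

    tk.Label(root, text="Password").grid(row=1, column=0)
    pass_entry = tk.Entry(root, show="*")  # mask the typed characters
    pass_entry.grid(row=1, column=1)

    def submit():
        # Read the entries before the window (and its widgets) is destroyed.
        result["username"] = user_entry.get()
        result["password"] = pass_entry.get()
        root.destroy()

    tk.Button(root, text="Log in", command=submit).grid(row=2, column=1)
    root.mainloop()  # blocks until the window is closed
    return result.get("username"), result.get("password")
```

The returned pair can be checked with has_credentials before it is passed on to the login step.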

7. Scrolling Simulation

Sometimes the information we want is not all present when the page first loads; more of it is loaded each time we scroll down in the browser. The statement that we can use to scroll down is,

driver.execute_script("window.scrollTo(0, 10000);")

Then we can test this on Reddit. Remember to pause for 2 seconds after each scroll so that the new content has time to load.
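The scroll-and-pause loop can be sketched as follows. Instead of the fixed offset 10000, this version scrolls to document.body.scrollHeight, so each pass reaches the current bottom of the page even as it grows; that swap, the function name scroll_down, and the default of five passes are choices made for this sketch.

```python
import time


def scroll_down(driver, times=5, pause=2.0, sleep=time.sleep):
    """Scroll to the bottom of the page `times` times, pausing after each scroll."""
    for _ in range(times):
        # Jump to the current bottom; the page may grow after new content loads.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        sleep(pause)  # give the newly requested content time to load
    return times
```

For example, scroll_down(driver, times=10) after driver.get("https://www.reddit.com") scrolls ten times with a 2-second pause each. The sleep parameter exists only so the pause can be replaced when trying the function without waiting.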