
Python Specialization Course 4 Module 3

Uploaded by

zaman.iu21.ali1
Copyright
© All Rights Reserved

DevOps and Build Automation with Python

4th Course in Python Scripting for DevOps Specialization


Browser Automation
In this module, we look at how to utilize pyppeteer in Python
to automate browser interaction

2 © LearnQuest 2021
Learning Objectives
Browser Automation

Upon completion of this module, learners will be able to:


• Describe the use case for headless browsing
• Develop scripts that utilize headless browsing to
visit a webpage
• Develop scripts that utilize headless browsing to extract
elements from a web page

Lesson 1
Headless Browsing
In this lesson we extend our Python toolset and look at how to use Python
to script web browser automation

Headless Browsing
• Headless browsers provide automated
control of a web page in an environment
similar to popular web browsers

• Puppeteer is a very popular JavaScript
headless browsing framework -
https://github.com/puppeteer/puppeteer

• Puppeteer automates a headless
Chrome browser

• Pyppeteer is an unofficial port of Puppeteer to Python

Headless Browsing Use Cases
• Test automation in modern web applications

• Taking screenshots of web pages

• Running automated tests for JavaScript libraries

• Scraping web sites for data

• Automating interaction of web pages

Malicious Headless Browsing Use Cases
• Perform DDoS attacks on web sites

• Increase advertisement impressions

• Automate web sites in unintended ways (credential stuffing)

Installing Pyppeteer

pip install pyppeteer
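
After running the pip command, one quick way to confirm the package is visible to the interpreter is sketched below. The helper name is_installed is hypothetical and not part of pyppeteer; it relies only on the standard library.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the import system can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # "pyppeteer" is the import name of the package installed by pip above.
    print("pyppeteer installed:", is_installed("pyppeteer"))
```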

Lesson 1
Review

• Headless browsing can be used
to test web applications
for performance

• Headless browsing can be used to
test web applications for errors

• Headless browsing can be used to
take screenshots of webpages

Lesson 2
Writing Scripts to Visit a Web Page
In this lesson we look at how to
develop scripts that utilize headless
browsing to visit a webpage

Coroutines
• Coroutines are computer program
components that generalize
subroutines for non-preemptive
multitasking, by allowing multiple
entry points for suspending and
resuming execution at certain
locations

• The asyncio module uses Python's
async/await syntax to run
asynchronous coroutines
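
The suspend-and-resume behavior described above can be seen without a browser. The following is a minimal sketch using only asyncio; the names worker and run_workers are made up for illustration.

```python
import asyncio


async def worker(name: str, steps: int, log: list) -> None:
    """Record each step, yielding to the event loop between steps."""
    for step in range(steps):
        log.append(f"{name}:{step}")
        # Suspension point: this coroutine pauses here and the event loop
        # may resume another coroutine before coming back.
        await asyncio.sleep(0)


async def run_workers() -> list:
    """Run two workers concurrently and return the order in which steps ran."""
    log = []
    await asyncio.gather(worker("a", 2, log), worker("b", 2, log))
    return log


if __name__ == "__main__":
    print(asyncio.run(run_workers()))
```

The returned log shows the steps of the two workers interleaved rather than one worker finishing before the other starts, which is the non-preemptive multitasking the slide refers to.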

Async & Await
Async
• The syntax async def introduces either a native coroutine or an
asynchronous generator

Await
• The keyword await passes function control back to the event loop.
(It suspends the execution of the surrounding coroutine.)
• await f() suspends the surrounding coroutine until f() finishes,
then resumes with its result

Calling a coroutine function only creates a coroutine object. To run it and get its results, you must await it.
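
A small sketch of that last point, using only the standard library (fetch_answer and caller are hypothetical names):

```python
import asyncio
import inspect


async def fetch_answer() -> int:
    """A trivial coroutine function that yields once, then returns a value."""
    await asyncio.sleep(0)
    return 42


async def caller() -> int:
    # Calling the coroutine function does not run its body;
    # it only creates a coroutine object.
    pending = fetch_answer()
    assert inspect.iscoroutine(pending)
    # Awaiting the coroutine object runs it and produces its return value.
    result = await pending
    return result


if __name__ == "__main__":
    print(asyncio.run(caller()))
```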

Example Headless call

import asyncio
from pyppeteer import launch

async def main():
    # Start a headless Chrome/Chromium instance
    browser = await launch()
    # Open a new tab
    page = await browser.newPage()
    # Load the page, then save a screenshot to example.png
    await page.goto('https://example.com')
    await page.screenshot({'path': 'example.png'})
    # Shut the browser down
    await browser.close()

# Run the coroutine to completion on an event loop
asyncio.run(main())

Lesson 2
Review

• Async def introduces
a native coroutine

• Await passes function control
back to the event loop

• To call a coroutine function,
you must await it

Lesson 3
Extracting HTML Elements
In this lesson we look at how we
can develop scripts that utilize
headless browsing to extract
elements from a web page

Page Class
• This class provides methods to interact with a single tab of Chrome

• goto() loads a URL

• screenshot() saves the page as an image

• content() returns the page's HTML contents as a string

• metrics() returns a dictionary of page metrics as key-value pairs

• emulate() sets the user agent and other browser characteristics
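
The methods listed above might be combined as in the following sketch. The helper summarize_page and the entry point run_demo are hypothetical names, the URL is only a placeholder, and pyppeteer is imported inside main so that the helper can be exercised with any object exposing the same coroutine methods.

```python
import asyncio


async def summarize_page(page, url: str) -> dict:
    """Load a URL in the given tab and report basic facts about the page."""
    await page.goto(url)
    html = await page.content()      # full HTML of the page as a string
    metrics = await page.metrics()   # dictionary of metric name -> value
    return {"html_length": len(html), "metric_names": sorted(metrics)}


async def main() -> None:
    # Imported here (an assumption of this sketch) so summarize_page can be
    # reused without pyppeteer being importable.
    from pyppeteer import launch

    browser = await launch()
    try:
        page = await browser.newPage()
        print(await summarize_page(page, "https://example.com"))
    finally:
        # Close the browser even if navigation fails
        await browser.close()


def run_demo() -> None:
    """Synchronous entry point; needs pyppeteer and a Chromium download."""
    asyncio.run(main())
```

Calling run_demo() starts a real headless browser, so it is left for the reader to invoke.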

Selecting DOM Elements
Getting an element's inner text

• Page.querySelector()

• Page.querySelectorAll()

• Page.xpath()

Example:
element = await page.querySelector('h1')
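
The selector methods return element handles rather than text, so reading the text takes one more step. The sketch below passes the handle to page.evaluate together with a small arrow function; heading_text, link_texts and run_demo are hypothetical helper names, and the selectors and URL are placeholders.

```python
import asyncio


async def heading_text(page, selector: str = "h1"):
    """Return the text of the first element matching selector, or None."""
    element = await page.querySelector(selector)
    if element is None:
        # querySelector yields None when nothing matches
        return None
    return await page.evaluate("(el) => el.textContent", element)


async def link_texts(page) -> list:
    """Return the text of every <a> element on the page."""
    links = await page.querySelectorAll("a")
    return [await page.evaluate("(el) => el.textContent", link) for link in links]


async def main() -> None:
    # Imported here (an assumption of this sketch) so the helpers above
    # do not require pyppeteer to be importable.
    from pyppeteer import launch

    browser = await launch()
    try:
        page = await browser.newPage()
        await page.goto("https://example.com")
        print(await heading_text(page))
        print(await link_texts(page))
    finally:
        await browser.close()


def run_demo() -> None:
    """Synchronous entry point; needs pyppeteer and a Chromium download."""
    asyncio.run(main())
```

Page.xpath() can be used the same way as querySelectorAll(): it takes an XPath expression and returns a list of element handles.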

Running JavaScript
• A JavaScript string passed to evaluate() can be a function or an expression.

• Pyppeteer tries to automatically detect whether the string is a function or an expression

• The force_expr=True option forces pyppeteer to treat the string as an expression

Example:
dimensions = await page.evaluate('''() => {
    return {
        width: document.documentElement.clientWidth,
        height: document.documentElement.clientHeight,
        deviceScaleFactor: window.devicePixelRatio,
    }
}''')
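
To contrast the two string forms, here is a minimal sketch with one expression and one function that receives arguments. The helpers page_title, add_in_page and run_demo are hypothetical names, and the URL is a placeholder.

```python
import asyncio


async def page_title(page) -> str:
    # A bare expression; force_expr=True skips pyppeteer's auto-detection
    # and evaluates the string as an expression.
    return await page.evaluate("document.title", force_expr=True)


async def add_in_page(page, a: int, b: int) -> int:
    # A function string; extra positional arguments are passed to the
    # JavaScript function when it runs in the page.
    return await page.evaluate("(a, b) => a + b", a, b)


async def main() -> None:
    # Imported here (an assumption of this sketch) so the helpers above
    # do not require pyppeteer to be importable.
    from pyppeteer import launch

    browser = await launch()
    try:
        page = await browser.newPage()
        await page.goto("https://example.com")
        print(await page_title(page))
        print(await add_in_page(page, 2, 3))
    finally:
        await browser.close()


def run_demo() -> None:
    """Synchronous entry point; needs pyppeteer and a Chromium download."""
    asyncio.run(main())
```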

Lesson 3
Review

• XPath can be used to navigate
through elements in a page

• Evaluate can execute JavaScript
in the context of the page

• Arrow functions can be used
to express the JavaScript passed to evaluate
