*Selenium is mainly used for automated testing and supports multiple browsers. In crawlers it is mainly used to solve JavaScript rendering problems:
it drives a real browser to load the page when requests and urllib cannot obtain the page content normally.*
1. Declare the browser object
Note: do not name your own Python file or package selenium. A local module with that name shadows the library, and the import below will fail.
from selenium import webdriver
browser = webdriver.Chrome()
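The naming pitfall mentioned above can be checked before running a script. The following is a minimal sketch, not part of Selenium: the helper name shadows_package is hypothetical, and it only looks in one directory for a selenium.py file or a selenium folder containing __init__.py.

```python
import tempfile
from pathlib import Path


def shadows_package(directory, package="selenium"):
    """Return True if `directory` holds a module or package named `package`.

    Hypothetical helper for illustration; it checks a single directory only.
    """
    base = Path(directory)
    module_file = base / f"{package}.py"
    package_init = base / package / "__init__.py"
    return module_file.is_file() or package_init.is_file()


# Demonstrate with a throwaway directory instead of a real project.
with tempfile.TemporaryDirectory() as tmp:
    clean = shadows_package(tmp)  # nothing in the directory yet
    (Path(tmp) / "selenium.py").write_text("# a script with an unlucky name\n")
    shadowed = shadows_package(tmp)  # the local file would now win the import

print(clean, shadowed)
```

If the check reports a clash in your script's directory, renaming the file is usually enough to make the import work again.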
2. Visit the page and get the webpage html
from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
print(browser.page_source)  # browser.page_source returns the full HTML of the page
browser.close()
3. Find the element
Single element
from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
input_first = browser.find_element_by_id('q')
input_second = browser.find_element_by_css_selector('#q')
input_third = browser.find_element_by_xpath('//*[@id="q"]')
print(input_first, input_second, input_third)
browser.close()
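Commonly used search methods (Selenium 3 style; these find_element_by_* helpers were removed in Selenium 4.3):
find_element_by_name
find_element_by_xpath
find_element_by_link_text
find_element_by_partial_link_text
find_element_by_tag_name
find_element_by_class_name
find_element_by_css_selector
You can also use the general method, find_element, which takes a By locator and a value. On Selenium 4.3 and later this is the only available form: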
from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
# The first argument is the locator strategy, the second is the value to match
input_first = browser.find_element(By.ID, 'q')
print(input_first)
browser.close()
Multiple elements: the plural find_elements_* methods return a list of every match
input_first = browser.find_elements_by_id('q')
4. Element interaction: type keywords into the search box and search automatically
from selenium import webdriver
import time

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
search_box = browser.find_element_by_id('q')  # find the search box
search_box.send_keys('iPhone')  # type a keyword
time.sleep(5)
search_box.clear()  # clear the search box
search_box.send_keys('男士内裤')  # type a second keyword
button = browser.find_element_by_class_name('btn-search')  # find the search button
button.click()
More operations, such as reading attributes and taking screenshots: http://selenium-python.readthedocs.io/api.html#module-selenium.webdriver.remote.webelement
5. Interactive actions: drive the browser to perform actions such as drag and drop. The actions are attached to an action chain and executed serially.
from selenium import webdriver
from selenium.webdriver import ActionChains  # import the action chain

browser = webdriver.Chrome()
url = 'http://www.runoob.com/try/try.php?filename=jqueryui-api-droppable'
browser.get(url)
browser.switch_to.frame('iframeResult')  # switch into the iframeResult frame
source = browser.find_element_by_css_selector('#draggable')  # the element to drag
target = browser.find_element_by_css_selector('#droppable')  # the drop target
actions = ActionChains(browser)  # create the action chain
actions.drag_and_drop(source, target)
actions.perform()  # execute the queued actions
More operations: http://selenium-python.readthedocs.io/api.html#module-selenium.webdriver.common.action_chains
6. Execute JavaScript
Some actions have no dedicated API, such as scrolling the page down. In those cases we can execute JavaScript directly from code:
from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.zhihu.com/explore')
browser.execute_script('window.scrollTo(0, document.body.scrollHeight)')
browser.execute_script('alert("To Bottom")')
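On pages that load more content as you scroll, a single scrollTo call may not reach the real bottom. The following is a rough sketch of one way to repeat the scroll until the page height stops changing. The helper scroll_to_bottom and the FakeDriver class are hypothetical: FakeDriver only stands in for a real WebDriver so the loop can run without a browser, and the only driver method relied on is execute_script.

```python
import time


def scroll_to_bottom(driver, pause=0.5, max_rounds=20):
    """Scroll until the page height stops growing; return the number of scrolls."""
    last_height = driver.execute_script("return document.body.scrollHeight")
    for rounds in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
        time.sleep(pause)  # give lazily loaded content time to arrive
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return rounds  # nothing new was loaded, so this is the bottom
        last_height = new_height
    return max_rounds


class FakeDriver:
    """Stand-in for a WebDriver: the page grows on each scroll up to a limit."""

    def __init__(self, heights):
        self._heights = heights
        self._index = 0

    def execute_script(self, script):
        if script.startswith("window.scrollTo"):
            self._index = min(self._index + 1, len(self._heights) - 1)
            return None
        return self._heights[self._index]


scrolls = scroll_to_bottom(FakeDriver([1000, 2000, 3000]), pause=0)
print(scrolls)
```

With a real browser you would pass the webdriver.Chrome() instance instead of FakeDriver and keep a non-zero pause.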
7. Get element information
Get attributes
from selenium import webdriver

browser = webdriver.Chrome()
url = 'https://www.zhihu.com/explore'
browser.get(url)
logo = browser.find_element_by_id('zh-top-link-logo')  # get the site logo
print(logo)
print(logo.get_attribute('class'))
browser.close()
Get text value
from selenium import webdriver

browser = webdriver.Chrome()
url = 'https://www.zhihu.com/explore'
browser.get(url)
input = browser.find_element_by_class_name('zu-top-add-question')
print(input.text)  # input.text is the element's text
browser.close()
Get the id, location, tag name, and size
from selenium import webdriver

browser = webdriver.Chrome()
url = 'https://www.zhihu.com/explore'
browser.get(url)
input = browser.find_element_by_class_name('zu-top-add-question')
print(input.id)  # Selenium's internal element id (not the HTML id attribute)
print(input.location)  # position on the page
print(input.tag_name)  # tag name
print(input.size)  # width and height
browser.close()
8. Frame operation
A frame is equivalent to an independent web page. To find an element inside a child frame from the parent page, you must first switch into the child frame. Likewise, to find an element of the parent page while inside the child frame, you must first switch back.
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException

browser = webdriver.Chrome()
url = 'http://www.runoob.com/try/try.php?filename=jqueryui-api-droppable'
browser.get(url)
browser.switch_to.frame('iframeResult')
source = browser.find_element_by_css_selector('#draggable')
print(source)
try:
    # The logo belongs to the parent page, so it cannot be found from inside the frame
    logo = browser.find_element_by_class_name('logo')
except NoSuchElementException:
    print('NO LOGO')
browser.switch_to.parent_frame()
logo = browser.find_element_by_class_name('logo')
print(logo)
print(logo.text)
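Because it is easy to forget to switch back, the switch can be wrapped so that the driver always returns to the parent frame. The following is a sketch under stated assumptions: text_in_frame is a hypothetical helper, and the Fake* classes merely imitate the three driver calls it relies on (switch_to.frame, find_element, switch_to.parent_frame) so the example runs without a browser. The strings 'css selector' and 'class name' are the values of By.CSS_SELECTOR and By.CLASS_NAME.

```python
def text_in_frame(driver, frame_name, by, value):
    """Switch into a frame, read one element's text, and always switch back."""
    driver.switch_to.frame(frame_name)
    try:
        return driver.find_element(by, value).text
    finally:
        driver.switch_to.parent_frame()  # runs even if the lookup raised


class FakeElement:
    def __init__(self, text):
        self.text = text


class FakeSwitchTo:
    def __init__(self, driver):
        self._driver = driver

    def frame(self, name):
        self._driver.context.append(name)

    def parent_frame(self):
        if len(self._driver.context) > 1:
            self._driver.context.pop()


class FakeDriver:
    """Stand-in for a WebDriver: each frame only sees its own elements."""

    def __init__(self, pages):
        self.pages = pages  # {frame name: {(by, value): text}}
        self.context = ["top"]
        self.switch_to = FakeSwitchTo(self)

    def find_element(self, by, value):
        elements = self.pages[self.context[-1]]
        if (by, value) not in elements:
            raise LookupError(f"no element {value!r} in {self.context[-1]}")
        return FakeElement(elements[(by, value)])


driver = FakeDriver({
    "top": {("class name", "logo"): "RUNOOB"},
    "iframeResult": {("css selector", "#draggable"): "Drag me"},
})
text = text_in_frame(driver, "iframeResult", "css selector", "#draggable")
print(text, driver.context)
```

The helper returns the text rather than the element itself, since an element reference obtained inside a frame is generally not usable after switching out of it.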
9. Waiting
Implicit waiting
With implicit waiting, if WebDriver does not find the element in the DOM, it keeps waiting. Once the configured time is exceeded, an element-not-found exception is thrown.
In other words, when an element does not appear immediately, implicit waiting keeps looking in the DOM for a period of time before giving up. The default time is 0.
from selenium import webdriver

browser = webdriver.Chrome()
# Wait up to 10 seconds for an element; return normally if it appears in time, otherwise raise
browser.implicitly_wait(10)
browser.get('https://www.zhihu.com/explore')
input = browser.find_element_by_class_name('zu-top-add-question')
print(input)
Explicit waiting
Specify a waiting condition and a maximum waiting time. The program repeatedly checks whether the condition is met within that time. If it is met, the result is returned; if not, the program keeps waiting. If the time is exceeded, an exception is thrown.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

browser = webdriver.Chrome()
browser.get('https://www.taobao.com/')
wait = WebDriverWait(browser, 10)
input = wait.until(EC.presence_of_element_located((By.ID, 'q')))
button = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '.btn-search')))
print(input, button)
Available conditions:
title_is: the title is exactly the given text
title_contains: the title contains the given text
presence_of_element_located: the element is present in the DOM; pass a locator tuple such as (By.ID, 'p')
visibility_of_element_located: the element is visible; pass a locator tuple
visibility_of: the element is visible; pass an element object
presence_of_all_elements_located: all matching elements are present
text_to_be_present_in_element: the element's text contains the given text
text_to_be_present_in_element_value: the element's value contains the given text
frame_to_be_available_and_switch_to_it: the frame is loaded, and the driver switches to it
invisibility_of_element_located: the element is not visible
element_to_be_clickable: the element is clickable
staleness_of: the element is no longer attached to the DOM; useful for detecting that the page has refreshed
element_to_be_selected: the element is selected; pass an element object
element_located_to_be_selected: the element is selected; pass a locator tuple
element_selection_state_to_be: pass an element object and a state; returns True if they match, otherwise False
element_located_selection_state_to_be: pass a locator tuple and a state; returns True if they match, otherwise False
alert_is_present: an alert is present
Details: http://selenium-python.readthedocs.io/api.html#module-selenium.webdriver.support.expected_conditions
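The check-and-retry behaviour described for explicit waits can be illustrated in plain Python. The following is only a sketch of the idea and not Selenium's implementation: wait_until and ready_on_third_try are hypothetical names, and the built-in TimeoutError is used here where Selenium raises its own TimeoutException.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call `condition` repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # condition met: hand back whatever it produced
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)  # not met yet: wait a little and check again


attempts = []


def ready_on_third_try():
    """Pretend the element only shows up on the third check."""
    attempts.append(1)
    return "element" if len(attempts) >= 3 else None


found = wait_until(ready_on_third_try, timeout=2, poll=0.01)

try:
    wait_until(lambda: False, timeout=0.05, poll=0.01)
    timed_out = False
except TimeoutError:
    timed_out = True

print(found, len(attempts), timed_out)
```

The first call returns as soon as the condition succeeds, and the second shows the timeout path for a condition that never becomes true.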
11. Forward and back: move backward and forward through the browser history to revisit pages
import time
from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.baidu.com/')
browser.get('https://www.taobao.com/')
browser.get('https://www.python.org/')
browser.back()
time.sleep(1)
browser.forward()
browser.close()
12. Cookies
from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.zhihu.com/explore')
print(browser.get_cookies())
browser.add_cookie({'name': 'name', 'domain': 'www.zhihu.com', 'value': 'germey'})
print(browser.get_cookies())
browser.delete_all_cookies()
print(browser.get_cookies())
Tab management: opening and switching between browser windows
import time
from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.baidu.com')
browser.execute_script('window.open()')  # open a new tab
print(browser.window_handles)
browser.switch_to.window(browser.window_handles[1])  # switch to the new tab
browser.get('https://www.taobao.com')
time.sleep(1)
browser.switch_to.window(browser.window_handles[0])  # switch back to the first tab
browser.get('http://www.fishc.com')
13. Exception Handling
# Unhandled: looking up a missing element raises NoSuchElementException
from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.baidu.com')
browser.find_element_by_id('hello')

# Handled: catch the exceptions with try/except
from selenium import webdriver
from selenium.common.exceptions import TimeoutException, NoSuchElementException

browser = webdriver.Chrome()
try:
    browser.get('https://www.baidu.com')
except TimeoutException:
    print('Time Out')
try:
    browser.find_element_by_id('hello')
except NoSuchElementException:
    print('No Element')
finally:
    browser.close()