Setting Up a Python Selenium Environment on Linux

2024-11-26 16:58 huorong linux

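1. Introduction

1.1 Introduction

Fu Ge (福哥) needs an automated test script that exercises every feature of a website. Many people's first thought is to request pages from the server with something like curl. Simulating a browser in code is just sending GET/POST requests, isn't it?

Fu Ge thought so too at first, until he learned Python's selenium and saw how much "headless browser automation" can do. Today, follow along with Fu Ge and learn how to use selenium to simulate a user operating a browser.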

2. Installation

2.1 Installing Chrome

Run this command:

yum install https://dl.google.com/linux/direct/google-chrome-stable_current_x86_64.rpm

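2.2 Installing ChromeDriver

Download the archive, unzip it, move the binary to /usr/bin/, and finally make it executable. The ChromeDriver version has to match the major version of the installed Chrome; 90.0.4430.24 is the version used here, so substitute the one that matches your browser.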

wget https://chromedriver.storage.googleapis.com/90.0.4430.24/chromedriver_linux64.zip
unzip chromedriver_linux64.zip
mv chromedriver /usr/bin/
chmod +x /usr/bin/chromedriver
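
A mismatch between the Chrome and ChromeDriver major versions is a common reason the driver refuses to start a session. The sketch below is not from the original article: it compares the two version strings, assuming the binaries are named google-chrome and chromedriver and are on PATH. The helper names major_version, read_version and versions_match are made up for illustration.

```python
import re
import subprocess


def major_version(version_output):
    """Return the leading number of the first dotted version in the text, or None."""
    match = re.search(r"(\d+)(?:\.\d+)+", version_output)
    return int(match.group(1)) if match else None


def read_version(binary):
    """Run `<binary> --version` and return its stdout (assumes the binary is on PATH)."""
    result = subprocess.run(
        [binary, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout


def versions_match(chrome_binary="google-chrome", driver_binary="chromedriver"):
    """True if both binaries report the same major version."""
    chrome = major_version(read_version(chrome_binary))
    driver = major_version(read_version(driver_binary))
    return chrome is not None and chrome == driver
```

Calling versions_match() after the installation steps gives a quick yes/no answer before any Selenium code runs.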

2.3 Installing Selenium

Install it with pip:

pip3 install selenium
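
To confirm that the package landed in the interpreter you intend to use, and to see which Selenium release pip picked, something like the following may help. It is a small sketch using only the standard library; installed_version is a hypothetical helper name. The version matters because Selenium 4.3 and later no longer provide the find_element_by_* helpers.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("selenium:", installed_version("selenium") or "not installed")
```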

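3. Usage

3.1 Example 1

3.1.1 Task

Open the Baidu home page
Enter the keyword "site:tongfu.net"
Click the search button
Check whether the search results contain the title "同福主页 - 首页 - 同福网 - TONGFU.net"

3.1.2 Code

Import webdriver from selenium

Create a Chrome instance through webdriver

Open the Baidu home page with the Chrome instance

Select the input box on the Baidu home page by id and type the keyword "site:tongfu.net"

Select the submit button on the Baidu home page by id and submit the query

Wait 3 seconds

Search the page source for the expected title; if Baidu is working normally, the query result will match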

from selenium import webdriver
from selenium.webdriver.common.by import By
import time
import re

# build the Chrome options
opts = webdriver.ChromeOptions()
opts.add_argument("--headless")
opts.add_argument("--disable-gpu")
opts.add_argument("--no-sandbox")

# init Chrome
chrome = webdriver.Chrome(options=opts)

# open baidu.com
chrome.get("http://www.baidu.com")

# type the keyword and press submit
# (find_element(By.ID, ...) replaces find_element_by_id, which Selenium 4.3 removed)
keyword_input = chrome.find_element(By.ID, "kw")
keyword_input.send_keys("site:tongfu.net")
button = chrome.find_element(By.ID, "su")
button.click()

# wait a moment for the results to load
time.sleep(3)

# check the page source for the expected title
# (re.escape keeps the "." in "TONGFU.net" literal)
source = chrome.page_source
regexp = re.compile(re.escape("同福主页 - 首页 - 同福网 - TONGFU.net"), re.M)
mats = regexp.search(source)
if mats:
    print(mats.group())
else:
    print("no match")

# quit Chrome and end the chromedriver process
chrome.quit()
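
A fixed three-second sleep either wastes time or is too short on a slow network. Selenium provides WebDriverWait for polling until a condition holds. The following is a minimal sketch of that approach, not the article's own code; page_contains and wait_for_text are hypothetical helper names, and the 10-second timeout is an arbitrary assumption.

```python
def page_contains(driver, text):
    """True if `text` occurs in the driver's current page source."""
    return text in driver.page_source


def wait_for_text(driver, text, timeout=10):
    """Poll until `text` appears in the page source.

    Raises selenium's TimeoutException if it does not appear within `timeout` seconds.
    """
    # Imported here so page_contains stays usable without selenium installed.
    from selenium.webdriver.support.ui import WebDriverWait

    return WebDriverWait(driver, timeout).until(lambda d: page_contains(d, text))
```

In the script above, the time.sleep(3) line could be swapped for wait_for_text(chrome, "同福主页 - 首页 - 同福网 - TONGFU.net"), which returns as soon as the title shows up.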

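4. Locating Elements

4.1 Eight methods

The eight locator methods below give flexible access to elements on the page. These are the Selenium 3 names; Selenium 4.3 and later removed them in favor of find_element(By.<STRATEGY>, value).

The last entry, execute_script, fetches page elements by running a JavaScript command

Fu Ge finds that execute_script plus jQuery covers just about everything (provided the target page loads the jQuery framework)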

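4.1.1 find_element_by_id

Locate by the element's id

4.1.2 find_element_by_name

Locate by the element's name attribute

4.1.3 find_element_by_class_name

Locate by the element's class

4.1.4 find_element_by_xpath

Locate by XPath, a query syntax that plays a role similar to CSS selectors

4.1.5 find_element_by_css_selector

Locate by CSS selector

4.1.6 find_element_by_tag_name

Locate by the element's tag name

4.1.7 find_element_by_link_text

Locate an a tag by exact match on its link text

4.1.8 find_element_by_partial_link_text

Locate an a tag by partial match on its link text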
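
For readers on a newer Selenium, the eight helper names above correspond one-to-one to locator strategies passed to find_element. The table below is a sketch of that correspondence. It uses the plain strategy strings, which are the values that selenium.webdriver.common.by.By exposes as By.ID, By.NAME and so on; LEGACY_TO_STRATEGY and find_legacy are names invented for this example.

```python
# Locator strategy strings; By.ID, By.NAME, By.CLASS_NAME, ... hold these same values.
LEGACY_TO_STRATEGY = {
    "find_element_by_id": "id",
    "find_element_by_name": "name",
    "find_element_by_class_name": "class name",
    "find_element_by_xpath": "xpath",
    "find_element_by_css_selector": "css selector",
    "find_element_by_tag_name": "tag name",
    "find_element_by_link_text": "link text",
    "find_element_by_partial_link_text": "partial link text",
}


def find_legacy(driver, legacy_name, value):
    """Call driver.find_element with the strategy matching a legacy helper name."""
    return driver.find_element(LEGACY_TO_STRATEGY[legacy_name], value)
```

For example, find_legacy(chrome, "find_element_by_id", "kw") does what chrome.find_element_by_id("kw") did. In new code it is more usual to write chrome.find_element(By.ID, "kw") directly.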

4.1.9 execute_script

Locate by running a JavaScript command in the page
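
As an illustration of the JavaScript route, the sketch below asks the page itself for an element. It relies on execute_script passing extra Python arguments to the script as arguments[0], arguments[1], and so on; query_selector is a hypothetical helper name, and the example uses the browser's built-in document.querySelector rather than jQuery so that it does not depend on what the page loads.

```python
def query_selector(driver, css):
    """Return the first element matching `css`, located by JavaScript in the page.

    With a real WebDriver the result is a WebElement, or None if nothing matches.
    """
    # arguments[0] is how execute_script exposes the extra Python argument to JS.
    return driver.execute_script("return document.querySelector(arguments[0]);", css)
```

On a page that does load jQuery, the script string could instead be "return jQuery(arguments[0]).get(0);", which is the combination Fu Ge describes.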

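5. Options

ChromeOptions accepts a large number of switches; the commonly used ones are listed here. Some of them date from older Chrome releases and may be ignored by current versions, so check them against the Chrome you run.

5.1 Common options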

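--user-data-dir="[PATH]"
# Path of the user data directory

--disk-cache-dir="[PATH]"
# Path of the disk cache directory

--disk-cache-size=
# Cache size limit, in bytes

--first-run
# Behave as on the first run

--incognito
# Incognito mode

--disable-javascript
# Disable JavaScript

--omnibox-popup-count
# Number of suggestions shown in the address bar popup

--user-agent=
# Override the User-Agent string in HTTP request headers

--disable-plugins
# Do not load any plugins, which can speed things up

--disable-java
# Disable Java

--start-maximized
# Start the browser maximized

--no-sandbox
# Disable the sandbox

--single-process
# Run in a single process

--process-per-tab
# Use a separate process for each tab

--process-per-site
# Use a separate process for each site

--in-process-plugins
# Run plugins inside the browser process instead of separate processes

--disable-popup-blocking
# Disable the popup blocker
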
--disable-images
# Disable images

--enable-udd-profiles
# Enable the account-switching menu

--proxy-pac-url
# Use a PAC proxy

--lang=zh-CN
# Set the language to Simplified Chinese

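--media-cache-size
# Maximum size of the media cache, in bytes

--bookmark-menu
# Bookmark menu in the toolbar

--enable-sync
# Enable bookmark sync

--disable-gpu
# Disable GPU acceleration (mentioned in Google's documentation)

--hide-scrollbars
# Hide scrollbars

--headless
# Run the browser without a visible window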

--window-size=[WIDTH],[HEIGHT]
# Browser window size, in pixels

--blink-settings=imagesEnabled=[true|false]
# Whether images are loaded
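
Each switch in the list is passed to Chrome as one string through ChromeOptions.add_argument, with any value joined by "=". The sketch below assembles a few of them from ordinary Python parameters. build_arguments and make_options are hypothetical helpers, and the defaults (headless, a 1920x1080 window) are assumptions for the example rather than recommendations from the article.

```python
def build_arguments(headless=True, window_size=(1920, 1080), lang=None, user_agent=None):
    """Return a list of Chrome command-line switches for the given settings."""
    args = []
    if headless:
        args += ["--headless", "--disable-gpu"]
    if window_size:
        width, height = window_size
        args.append(f"--window-size={width},{height}")
    if lang:
        args.append(f"--lang={lang}")
    if user_agent:
        args.append(f"--user-agent={user_agent}")
    return args


def make_options(arguments):
    """Build a ChromeOptions object carrying the given switches."""
    # Imported here so build_arguments can be used without selenium installed.
    from selenium import webdriver

    opts = webdriver.ChromeOptions()
    for argument in arguments:
        opts.add_argument(argument)
    return opts
```

A driver would then be created with webdriver.Chrome(options=make_options(build_arguments(lang="zh-CN"))). Keeping the switch list as plain data makes it easy to print or log what Chrome was started with.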

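6. Summary

Use selenium to start a headless browser process

Then drive that browser process from code, simulating user actions with whatever business logic the test requires

Finally, judge the result by inspecting page elements or by checking the page source

These steps are all it takes to automate the testing of a website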

https://m.tongfu.net/home/35/blog/512631.html
