Web crawler - Error when using selenium + PhantomJS with Python on Ubuntu
PHP中文网
PHP中文网 2017-04-17 14:33:09

All replies (4)
迷茫

I ran into this recently too. My guess is that the page's dynamic JavaScript had not finished running when the element was looked up, so the element was not yet in the page source. That would explain the NoSuchElementException.
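
If the cause really is that the script has not finished yet, retrying the lookup for a few seconds is usually enough. Below is a minimal sketch of such a polling helper in plain Python. The name wait_for and its parameters are made up for illustration and are not part of Selenium; Selenium ships its own WebDriverWait class for the same job.

```python
import time


def wait_for(find, timeout=10.0, poll=0.5, retry_on=(Exception,)):
    """Call find() repeatedly until it returns without raising.

    Re-raises the last exception once `timeout` seconds have passed.
    `find`, `timeout`, `poll` and `retry_on` are illustrative names only.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return find()
        except retry_on:
            if time.monotonic() >= deadline:
                raise
            time.sleep(poll)  # give the page's JavaScript time to run


# Hypothetical usage with a Selenium driver (the element id "content" is assumed):
#   from selenium.common.exceptions import NoSuchElementException
#   element = wait_for(lambda: browser.find_element("id", "content"),
#                      retry_on=(NoSuchElementException,))
```

Passing the lookup as a callable keeps the helper independent of any particular driver, so the same loop works with PhantomJS or Firefox.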

洪涛

There is another possibility. PhantomJS is a headless browser with no window, so unless a window size is set, the elements may never be rendered. Any element lookup will then raise NoSuchElementException.
You can try the following:

from selenium import webdriver

browser = webdriver.PhantomJS()
browser.set_window_size(800, 600)  # give the headless browser a viewport
browser.get("http://example.com")  # load the page

Reference: https://github.com/ariya/phantomjs/issues/11637

刘奇

Answering my own question.
I found a workaround on Stack Overflow:
block CSS, images and JavaScript to speed up page loading.
PhantomJS still does not work for me, but Firefox is noticeably faster this way, which is all I needed.

from selenium import webdriver

firefox_profile = webdriver.FirefoxProfile()
firefox_profile.set_preference("browser.download.folderList", 2)
firefox_profile.set_preference("permissions.default.stylesheet", 2)  # block CSS
firefox_profile.set_preference("permissions.default.image", 2)  # block images
firefox_profile.set_preference("javascript.enabled", False)  # disable JavaScript

# Selenium 3-style call; newer Selenium releases take the profile via an Options object.
browser = webdriver.Firefox(firefox_profile=firefox_profile)

http://stackoverflow.com/questions/20892768/how-to-speed-up-browsing-in-selenium-firefox
http://stackoverflow.com/questions/17462884/is-selenium-slow-or-is-my-code-wrong

阿神

But with JavaScript disabled, the page's scripts will not run in Firefox either, will they? In that case, why not use a faster tool instead of a browser?
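
The reply does not name a tool. One way to read it: if JavaScript is switched off anyway, a plain HTTP request plus an HTML parser returns the same static markup without starting a browser. Here is a rough sketch using only the Python standard library. LinkCollector, extract_links and fetch_links are made-up names, and collecting links is just an example task.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags in static HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all link targets found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def fetch_links(url, timeout=10):
    """Download a page without a browser; no JavaScript is executed."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_links(html)
```

This only helps when the data is already in the server's HTML. Content that the page builds with JavaScript still needs a real browser or a headless one.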
