Python crawler - Scraping Douban Movies with Python returns no content
阿神 2017-04-17 17:05:11

Code:

#!/usr/bin/python
# -*- coding: utf-8 -*-
__author__ = 'eyu Fanne'

import requests,re
from bs4 import BeautifulSoup

move_url = 'https://movie.douban.com/'

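def Robot():
    # Download the home page and report the HTTP status
    res_url = requests.get(move_url)
    print(res_url.status_code)
    soup = BeautifulSoup(res_url.text,'lxml')
    print(soup.title)
    # Collect every <a class="item"> tag found in the returned HTML
    soup_a = soup.find_all("a",class_="item")
    for i in soup_a:
        print(i)
    print(soup_a)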



if __name__=='__main__':
    Robot()

Result:
200
<title>

    豆瓣电影

</title>
[]

I want to scrape the values inside the

<a class='item' ....>

tags, but the result comes back empty. Why is that?


All replies (2)
大家讲道理

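Check the page source: it contains no movie information. The movie list is rendered by JavaScript in the browser after the page loads, so the HTML that requests receives has no <a class="item"> tags for find_all to match.
The data comes from a separate request, which you can call directly: https://movie.douban.com/j/search_subjects?type=movie&tag=%E7%83%AD%E9%97%A8&sort=recommend&page_limit=20&page_start=0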
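
A minimal sketch of calling that endpoint from Python, in case it helps. It rebuilds the query string from the link above and decodes the response as JSON. Several things here are assumptions rather than facts from this thread: the function names (build_url, fetch, extract_titles) are made up for illustration, the response is assumed to be a JSON object with a "subjects" list whose entries carry a "title" key, and a browser-like User-Agent header is sent on the guess that the site may reject the default one. It uses only the standard library; requests.get with a params dict should work just as well.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Path taken from the link in the reply above
SEARCH_URL = "https://movie.douban.com/j/search_subjects"


def build_url(tag="热门", page_limit=20, page_start=0):
    """Rebuild the query string from the link; urlencode percent-encodes the tag."""
    params = {
        "type": "movie",
        "tag": tag,
        "sort": "recommend",
        "page_limit": page_limit,
        "page_start": page_start,
    }
    return SEARCH_URL + "?" + urlencode(params)


def fetch(url):
    """Request the endpoint and decode the body as JSON."""
    # Assumption: a browser-like User-Agent avoids being rejected by the site.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_titles(payload):
    """Return the movie titles from the decoded JSON.

    Assumption: payload looks like {"subjects": [{"title": ...}, ...]}.
    """
    return [item.get("title") for item in payload.get("subjects", [])]
```

Calling extract_titles(fetch(build_url())) would then give a list of titles if the response has the assumed shape. Raising page_start in steps of page_limit should page through further results, assuming the parameters mean what their names suggest.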

Peter_Zhu

Douban Movies has a public API. Why crawl the page?
http://developers.douban.com/wiki/?title=movie_v2
