How to crawl dynamic data with Node.js
大家讲道理 2017-04-17 15:38:59

How can I crawl dynamic data, that is, data loaded by an ajax request?
For example, take this page source:

<html>

<head>
<title>开课课程信息</title>
<meta name="GENERATOR" content="Microsoft FrontPage 3.0">
</head>

<frameset border="false" frameborder="0" rows="30,*">
  <frame name="header" scrolling="no" noresize target="frmCourMain" src="akcjj.asp" marginwidth="0"
  marginheight="0">
  <frame name="frmCourMain" src="akechengdw.asp" scrolling="auto" target="frmCourMain">
  <noframes>
  <body>
  <p>This page uses frames, but your browser doesn't support them.</p>
  </body>
  </noframes>
</frameset>
</html>

The source shows that the data comes from the frame akechengdw.asp, but how do I crawl data like this?


All replies (5)
巴扎黑

If the data is loaded via ajax, there are generally two approaches.

1. Simulate a browser to load the page. Searching Google for keywords such as "simulated browser crawler" will turn up the specifics, but you still have to try it out yourself.

2. Find the underlying interface and crawl it directly, taking care to send the right request headers.
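
For the frameset in the question, the second approach could look roughly like the sketch below: pull the `src` of each `<frame>` out of the outer page, resolve it against the outer page's URL, then request that URL directly. The question gives no site address, so any URL passed in is a placeholder; the helper names are made up for illustration, and the global `fetch` assumes Node 18 or later.

```javascript
// Sketch only: helper names are hypothetical, not from any library.

// Collect the src attribute of every <frame> tag and resolve it
// against the URL of the page that contains the frameset.
function extractFrameUrls(html, baseUrl) {
  const urls = [];
  // \b keeps <frameset> from matching; [^>]* may span line breaks.
  const frameTag = /<frame\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']/gi;
  let match;
  while ((match = frameTag.exec(html)) !== null) {
    urls.push(new URL(match[1], baseUrl).href);
  }
  return urls;
}

// Download one page as text. res.text() decodes as UTF-8; an older
// ASP site may use another encoding, which this sketch ignores.
async function fetchText(url, headers = {}) {
  const res = await fetch(url, { headers });
  if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
  return res.text();
}

// Fetch the outer page, then fetch each frame it points to.
// Returns an object mapping frame URL to that frame's HTML.
async function crawlFrames(pageUrl, headers = {}) {
  const outer = await fetchText(pageUrl, headers);
  const result = {};
  for (const frameUrl of extractFrameUrls(outer, pageUrl)) {
    result[frameUrl] = await fetchText(frameUrl, headers);
  }
  return result;
}
```

With the page from the question, `extractFrameUrls` would yield the resolved addresses of akcjj.asp and akechengdw.asp, and the second of those is the one that carries the course data.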

巴扎黑

Press F12 and look at the ajax request, then disguise your own request to match it: User-Agent, Referer and so on.
If login is required, add the cookie that identifies the user. You can try these one at a time.
If there is a CSRF defense, find the hidden CSRF token and send it along with the request.
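
A minimal sketch of those three steps, assuming the token sits in a hidden `<input>`. The field name `csrf_token`, the header name `X-CSRF-Token` and both function names are assumptions made for illustration; the real names have to be read off the request shown in the browser's Network tab.

```javascript
// Sketch only: "csrf_token" and "X-CSRF-Token" are assumed names.

// Find the value of a hidden <input> with the given name.
// Returns null when no such input is present.
function extractCsrfToken(html, fieldName = 'csrf_token') {
  const inputs = html.match(/<input\b[^>]*>/gi) || [];
  for (const tag of inputs) {
    const name = /\bname\s*=\s*["']([^"']*)["']/i.exec(tag);
    const value = /\bvalue\s*=\s*["']([^"']*)["']/i.exec(tag);
    if (name && name[1] === fieldName && value) return value[1];
  }
  return null;
}

// Build request headers that mimic what the browser sent.
// Cookie and CSRF token are only added when they are provided.
function buildHeaders({ userAgent, referer, cookie, csrfToken }) {
  const headers = { 'User-Agent': userAgent, Referer: referer };
  if (cookie) headers.Cookie = cookie;
  if (csrfToken) headers['X-CSRF-Token'] = csrfToken;
  return headers;
}
```

The object returned by `buildHeaders` can be passed as the `headers` option of whichever HTTP client is in use. Some sites expect the token in the form body rather than in a header, so this is worth checking against the captured request.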

阿神

To add to the two approaches above:

To simulate a browser, you can generally use a headless browser. For Node there are several packages, such as https://github.com/amir20/pha...
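
As a rough illustration of the headless-browser route, the sketch below assumes the `phantom` npm package (a Node wrapper around PhantomJS) is installed. The function name `renderPage` is made up here, and the module is passed in as an argument so the function does not depend on a top-level `require`.

```javascript
// Sketch only: assumes `npm install phantom`; renderPage is a
// hypothetical helper name.

// Open a URL in PhantomJS and return the page's HTML as it stands
// once loading has finished. Data that arrives later through ajax
// may need an extra wait before reading the content.
async function renderPage(phantom, url) {
  const instance = await phantom.create();
  try {
    const page = await instance.createPage();
    const status = await page.open(url); // 'success' or 'fail'
    if (status !== 'success') throw new Error(`Failed to open ${url}`);
    return await page.property('content');
  } finally {
    await instance.exit(); // always shut the PhantomJS process down
  }
}

// Usage, with a placeholder URL:
// renderPage(require('phantom'), 'http://example.com/').then(console.log);
```

A headless browser is slower than requesting the interface directly, so it tends to be worth the cost mainly when the data cannot be reached with a plain HTTP request.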

Ty80

At least post a URL. I suggest you search Baidu for "The Art of Questioning" first. A long description without specifics doesn't help; when you ask a question, you have to make it understandable to others.

大家讲道理

PhantomJS
