c++ - libcurl: cannot fetch the HTML source of a page (cookies must be sent)
大家讲道理 2017-04-17 11:13:42

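Visiting Downloads > My Resources on CSDN requires sending cookies. Against a PHP environment set up locally, the code below sends the cookie information successfully and gets the page source.
When it requests the CSDN page, however, no data comes back even though the response code is 200. I don't know what is going on and hope someone who does can take a look.
Note: I have spent a whole day on this and searched Baidu and Google for a long time, and still cannot solve it!
Following the hints from @依云 and 暗夜, I captured the packets with HTTP Analyzer and compared them with the ones captured in Firefox: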

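Packets captured when visiting the page in the browser:

Packets captured when my program makes the request (no response data is returned; the code is posted below):

Note: the comparison shows that the request sent to the server should be correct. Is something wrong in the code? I still haven't been able to find it.

The code is as follows: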

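#include <cstdio>
#include <string>
#include <tchar.h>
#include <curl/curl.h>

using std::string;

// Write callback: libcurl calls this for each chunk of received data.
size_t writer(char *data, size_t size, size_t nmemb, void *userdata)
{
    size_t sizes = size * nmemb;
    static_cast<string *>(userdata)->append(data, sizes);
    return sizes;
}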

int _tmain(int argc, _TCHAR* argv[])
{
    CURL *curl;
    CURLcode res;
    curl_global_init(CURL_GLOBAL_ALL);
    curl = curl_easy_init();
    if (curl) {
        curl_easy_setopt(curl,CURLOPT_VERBOSE,1L);      // print request and response headers
        curl_easy_setopt(curl,CURLOPT_HEADER,1L);       // also pass response headers to the write callback
        curl_easy_setopt(curl,CURLOPT_URL, "http://download.csdn.net/my/downloads/1");

        //curl_easy_setopt(curl,CURLOPT_ACCEPT_ENCODING, "gzip");// use gzip compression

        // HTTP request headers
        struct curl_slist *headers = NULL;
        headers = curl_slist_append(headers,"Host: download.csdn.net");
        headers = curl_slist_append(headers,"User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0");
        headers = curl_slist_append(headers,"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        headers = curl_slist_append(headers,"Accept-Language: zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3");
        //headers = curl_slist_append(headers,"Accept-Encoding: gzip, deflate");
        headers = curl_slist_append(headers,"Referer: http://www.csdn.net/");
        headers = curl_slist_append(headers,"Cookie: UserName=<username>; UserInfo=<value copied from the browser>;");
        headers = curl_slist_append(headers,"Connection: keep-alive");
        headers = curl_slist_append(headers,"Cache-Control: max-age=0");

        curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);

        string content;
        // writer: the write callback
        res = curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writer);
        if (res != CURLE_OK)
            printf("Failed to set writer [%s]\n", curl_easy_strerror(res));

        // content: the argument passed to the callback
        res = curl_easy_setopt(curl, CURLOPT_WRITEDATA, &content);
        if (res != CURLE_OK)
            printf("Failed to set write data [%s]\n", curl_easy_strerror(res));

        res = curl_easy_perform(curl);
        if (res != CURLE_OK) {
            fprintf(stderr, "Curl perform failed: %s\n", curl_easy_strerror(res));
            return 1;
        }
        double length = 0;
        res = curl_easy_getinfo(curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD , &length);

        FILE * file = fopen("test.html","wb");
        fseek(file,0,SEEK_SET);
        fwrite(content.c_str(),1,length,file);
        fclose(file);

        // release resources
        curl_slist_free_all(headers);
        curl_easy_cleanup(curl);
    }
    curl_global_cleanup();
    getchar();
    return 0;
}

All replies (2)
Peter_Zhu

Solution: change

fwrite(content.c_str(),1,length,file);

to

fwrite(content.c_str(),1,content.length(),file);
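
A likely reason this helps: CURLINFO_CONTENT_LENGTH_DOWNLOAD reports the value of the Content-Length header and yields -1 when the size is unknown (for example with chunked transfer encoding), and with CURLOPT_HEADER enabled the response headers are also delivered to the write callback, so the buffer can be longer than the body anyway. The size of the buffer itself is the safer byte count. Below is a minimal sketch of that idea without the network part; `append_chunk` and `save_buffer` are hypothetical names, not part of libcurl, though `append_chunk` uses the parameter list libcurl documents for write callbacks.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Same parameter list as libcurl's documented write callback:
//   size_t cb(char *ptr, size_t size, size_t nmemb, void *userdata)
// Assumption: userdata points at the std::string given to CURLOPT_WRITEDATA.
size_t append_chunk(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    const size_t bytes = size * nmemb;
    static_cast<std::string *>(userdata)->append(ptr, bytes);
    return bytes;  // returning a different count tells libcurl the write failed
}

// Writes exactly content.size() bytes to path; returns the number written.
size_t save_buffer(const char *path, const std::string &content)
{
    FILE *file = std::fopen(path, "wb");
    if (!file)
        return 0;
    const size_t written = std::fwrite(content.data(), 1, content.size(), file);
    std::fclose(file);
    return written;
}
```

With libcurl, `append_chunk` would be registered through CURLOPT_WRITEFUNCTION and a `std::string*` through CURLOPT_WRITEDATA; once curl_easy_perform returns, `save_buffer("test.html", content)` writes everything that was received.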
大家讲道理

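Your cookie is probably incomplete. I tried this command:

curl -H "Cookie: <the complete cookie obtained with Firebug or a similar tool>" http://download.csdn.net/my/downloads/1

It works if every part of the cookie is included unchanged. If that is too much, you can then try removing parts of it one at a time; my guess is that things like access-token are required.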
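
To make it easier to add or drop individual cookies while experimenting, the cookie string can be assembled from name/value pairs instead of being hard-coded. This is only a sketch: `build_cookie_string` is a hypothetical helper, and which cookies CSDN actually requires is not established here.

```cpp
#include <string>
#include <utility>
#include <vector>

// Joins name/value pairs into the "name=value; name2=value2" form
// used in a Cookie request header. Values are copied as-is, unescaped.
std::string build_cookie_string(
    const std::vector<std::pair<std::string, std::string>> &cookies)
{
    std::string out;
    for (const auto &cookie : cookies) {
        if (!out.empty())
            out += "; ";
        out += cookie.first + "=" + cookie.second;
    }
    return out;
}
```

The result can be prefixed with "Cookie: " and passed to curl_slist_append as in the question, or handed to libcurl directly with `curl_easy_setopt(curl, CURLOPT_COOKIE, cookie.c_str())`.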
