Web Crawler Script
I recently needed a script to scrape some data from the web, so I wrote the usual kind of PHP script. The test code is as follows:
    #!/usr/local/bin/php -q
    <?php
    /**
     * Created by PhpStorm.
     * User: jackqqxu
     * Date: 14-9-12
     * Time: 12:34 AM
     * Parse the files in a directory, find every static resource they reference, and download it.
     */

    //echo "Enter the path of the files to extract:\n";
    //$path = fread(STDIN, 100);
    //echo "The program will read the files under $path\n";
    //echo "Enter the file type to extract:\n";
    //$type = fread(STDIN, 100);

    // Open a known directory, and proceed to read its contents
    //$path = '/Users/jackqqxu/Desktop/task/game/a_grain_of_truth_files/css/';
    $destPath = '/Users/jackqqxu/task/aliyunsvn/health/grain/views/locations/'; // where the downloaded HTML files are written
    $sourcePath = '/Users/jackqqxu/task/aliyunsvn/health/grain/js/'; // local files to scan
    //$baseUrl = 'http://www.zamolski.com/agot/resources/stylesheets/';
    $netSourceUrl = 'http://www.zamolski.com/agot/views/locations/'; // remote base URL of the location pages
    //$type = '.css';
    $type = '.js'; // the JS files reference many location pages that need fetching
    $typeLen = strlen($type);

    //if (!is_dir($destPath)) {
    //    exec('mkdir -p ' . $destPath);
    //}

    if ($dh = opendir($sourcePath)) {
        while (($file = readdir($dh)) !== false) {
            // Skip directories and anything else that is not a regular file
            if (filetype($sourcePath . $file) != 'file') {
                continue;
            }
            // Only process files whose name ends with $type
            if (substr($file, -$typeLen) == $type) {
                echo '$sourcePath . $file=' . $sourcePath . $file . "\n";
                $fileContentArr = file($sourcePath . $file);
                foreach ($fileContentArr as $fileLine) {
                    // For CSS, match the images referenced through url(...) instead:
                    // if (preg_match_all("/url\((.*?)\)/", $fileLine, $matches)) {
                    // Find the other pages referenced through gotoLocation("...")
                    if (preg_match_all("/gotoLocation\(\"(.*?)\"\)/", $fileLine, $matches)) {
                        foreach ($matches[1] as $matchUrl) {
                            $sourceUrl = $netSourceUrl . $matchUrl . '.html';
                            echo 'n=' . $sourceUrl . "\n";
                            $descFile = $destPath . $matchUrl . '.html';
                            // Do not write an empty file when the download fails
                            $content = file_get_contents($sourceUrl);
                            if ($content === false) {
                                echo "Failed to download $sourceUrl\n";
                                continue;
                            }
                            $ret = file_put_contents($descFile, $content);
                            if ($ret) {
                                echo "File $descFile written successfully\n";
                            }
                        }
                    }
                }
            }
        }
        closedir($dh);
    }
    ?>
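The extraction step in the script above can be pulled into a small function and tried without touching the network. This is only a sketch: the name `extractLocationUrls`, the de-duplication of repeated locations, and the `example.com` base URL are assumptions for illustration and are not part of the original script.

```php
<?php
/**
 * Collect the page URLs referenced by gotoLocation("...") calls.
 * Hypothetical helper: the name and the de-duplication are illustrative only.
 */
function extractLocationUrls(array $lines, string $baseUrl): array
{
    $urls = [];
    foreach ($lines as $line) {
        // Non-greedy capture of the quoted argument; one line may hold several calls.
        if (preg_match_all('/gotoLocation\("(.*?)"\)/', $line, $matches)) {
            foreach ($matches[1] as $name) {
                // Keyed by URL so a location referenced twice is listed once.
                $urls[$baseUrl . $name . '.html'] = true;
            }
        }
    }
    return array_keys($urls);
}

// Sample input standing in for the lines returned by file(); the base URL is a placeholder.
$sample = [
    'if (done) { gotoLocation("kitchen"); } else { gotoLocation("hall"); }',
    'var x = 1;',
    'gotoLocation("kitchen");',
];
print_r(extractLocationUrls($sample, 'http://example.com/views/'));
```

Separating the parsing from the download makes it possible to check the pattern against a few sample lines before running the full crawl.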

Copyright © web代码网 [Web Crawler Script], All Rights Reserved. 2014.
