
A question about looping over an array

Jun 13, 2016 pm 12:54 PM
array spider url

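A question about looping over an array
There is too much code to post all of it, but I hope someone can point me in the right direction. Thanks in advance.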

// Seed list of article URLs to process.
$_array_article=array("http://blog.csdn.net/anewczs/article/details/6617391");
//$_array_article[]="http://blog.csdn.net/tianlesoftware/article/details/6723117";

foreach($_array_article as $value){
	$spider->begin_url=$value;
	// Note: the return value of this call is discarded.
	file_get_contents($spider->begin_url);
	_spider($spider->fetch_turl($spider->begin_url));
}


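This is part of the code. It takes an array of links and processes each link in turn. The problem is that it fails as soon as the array has more than one element. My guess is that after the first pass through the loop, some values left in memory interfere with the second pass, and that causes the error. How can I let the two global arrays I need keep accumulating new elements while every other value in memory is cleared?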


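------ Solution --------------------
You can't crawl like this; it is very easy to end up in an infinite crawl loop.
Crawling is usually done like this:
#1. Create a file to store URLs.
#2. Append the URLs found while crawling to that file.
#3. Read the URLs from the file and fetch the data line by line, repeating #2 and #3.

This approach raises a few issues of its own, such as how to avoid fetching the same link twice and how to restrict crawling to a particular domain. I trust you can work out these small problems yourself.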
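
The three steps above, together with the duplicate check and the domain restriction, could be sketched roughly as follows. This is only a minimal illustration, not the original poster's spider: the names crawl(), extract_links(), $queueFile, $allowedHost and $maxPages are all hypothetical, the link extraction is a deliberately naive regular expression, and the fetch function is passed in as a callable so the loop can be tried without network access.

```php
<?php
// Hypothetical helper: pull absolute http(s) links out of an HTML string.
// A naive regex, good enough for a sketch; a real crawler would use an HTML parser.
function extract_links(string $html): array
{
    preg_match_all('~href=["\'](https?://[^"\'#\s]+)~i', $html, $m);
    return array_unique($m[1]);
}

// Hypothetical crawl loop following steps #1 to #3:
// the queue file holds the URLs, newly found URLs are appended to it,
// and the URLs are processed one by one until none are left.
function crawl(string $queueFile, string $allowedHost, callable $fetch, int $maxPages = 50): array
{
    // #1: the file is the persistent URL store; load the seed URLs from it.
    $lines   = file($queueFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) ?: [];
    $queue   = array_values(array_unique($lines));
    $seen    = array_fill_keys($queue, true); // every URL ever queued
    $visited = [];                            // URLs actually fetched

    // #3: walk the queue; it grows while we iterate.
    for ($i = 0; $i < count($queue) && count($visited) < $maxPages; $i++) {
        $url = $queue[$i];

        // Restrict crawling to a single host.
        if (parse_url($url, PHP_URL_HOST) !== $allowedHost) {
            continue;
        }

        $html = $fetch($url);
        if ($html === false) {
            continue; // fetch failed, move on
        }
        $visited[] = $url;

        // #2: append newly discovered URLs to the queue and to the file.
        foreach (extract_links($html) as $link) {
            if (isset($seen[$link])) {
                continue; // already queued once, so no crawl loop
            }
            $seen[$link] = true;
            $queue[]     = $link;
            file_put_contents($queueFile, $link . PHP_EOL, FILE_APPEND);
        }
    }

    return $visited;
}

// Possible usage with real HTTP requests (requires allow_url_fopen):
// file_put_contents('urls.txt', "http://blog.csdn.net/anewczs/article/details/6617391\n");
// $visited = crawl('urls.txt', 'blog.csdn.net', function ($u) {
//     return @file_get_contents($u);
// });
```

Because all per-page state lives in local variables inside crawl(), nothing from one URL leaks into the next; only $queue, $seen and $visited accumulate across iterations, which is close to what the question asks for. The $maxPages cap is an extra safety net added for this sketch.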