How to use dedecms collection
Taking the official DedeCMS website as an example, we will collect the PHP tutorial column under the Webmaster Academy. Open the list address http://www.dedecms.com/web-art/PHP_jiaocheng.
Log in to the backend, go to "Collection Node Management", create a new node, and select "Normal Article" as the content model.
1. Set the basic information of the node
First, fill in a node name that is easy to remember and select GB2312 as the target page encoding. The anti-hotlink mode does not need to be set: the target site has no such restrictions, so leave this item unchanged. The system default timeout is 10 seconds.
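To illustrate why the target page encoding matters, here is a minimal Python sketch (not DedeCMS code). The helper name `decode_page` is hypothetical; it decodes raw page bytes with a chosen encoding and shows what happens when the wrong one is picked.

```python
def decode_page(raw: bytes, encoding: str = "gb2312") -> str:
    """Decode raw page bytes; undecodable bytes become U+FFFD instead of raising."""
    return raw.decode(encoding, errors="replace")


# Simulate the bytes a GB2312 page would send for the article title.
raw = "正则表达式".encode("gb2312")

good = decode_page(raw, "gb2312")   # matches the page encoding
garbled = decode_page(raw, "utf-8")  # wrong encoding: replacement characters appear

print(good)
print(garbled)
```

If the collected text comes out garbled, the encoding selected for the node is the first thing to check.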
2. Set the list URL acquisition rules
In this step we set how the article list addresses are obtained. Return to the list page of the target site and observe how the address changes from page to page: only the number after "14_" increases regularly.
First page: http://www.dedecms.com/web-art/PHP_jiaocheng/list_14_1.html
Middle pages: http://www.dedecms.com/web-art/PHP_jiaocheng/list_14_(*).html
Last page: http://www.dedecms.com/web-art/PHP_jiaocheng/list_14_172.html
Copy one of the paging addresses and return to the "Add Collection Node" page. Set "Source Attribute" to "Batch Generate List URL", paste the address into "Matching URL", and replace the changing number with (*). In "Batch Generate Address Settings", enter 1 to 172 for (*). This generates every list address from the first page to the last page, 172.
Test it. The pop-up box should list 172 generated addresses, which confirms that the setting works. If a list has addresses that follow no pattern, copy those addresses into the "Manually specified list URL" text box to collect them.
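The batch generation described above can be sketched in a few lines of Python. This only illustrates the idea of expanding the (*) wildcard and is not how DedeCMS implements it; the function name `expand_list_urls` and the `manual_urls` list are hypothetical.

```python
def expand_list_urls(pattern, start, end, step=1):
    """Replace the (*) wildcard with each page number from start to end inclusive."""
    if "(*)" not in pattern:
        raise ValueError("pattern must contain the (*) wildcard")
    return [pattern.replace("(*)", str(n)) for n in range(start, end + 1, step)]


pattern = "http://www.dedecms.com/web-art/PHP_jiaocheng/list_14_(*).html"
list_urls = expand_list_urls(pattern, 1, 172)

# Irregular list pages that fit no pattern can simply be added by hand.
manual_urls = []
all_list_urls = list_urls + manual_urls

print(len(all_list_urls))
print(all_list_urls[0])
print(all_list_urls[-1])
```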
3. Set article URL matching rules
The previous step specified the source pages for article addresses. In this step, we pick out the article addresses we need from those pages. Open a list page and observe that the box in the left column contains all the addresses we need. When the area is this clearly delimited, it can be isolated with the "HTML at the beginning of the region" and "HTML at the end of the region" settings.
Another method also works. Hover over the various links and watch the full address shown in the lower-left corner of the browser. The addresses we need all contain "PHP_jiaocheng/20", so we enter that in "Must Contain".
Either method can filter out the addresses, and on complex pages the two can be combined. Together with regular expression rules, almost any address can be filtered out. Finally, confirm and go to the next step, "Web page content acquisition rules".
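The two filtering methods can be sketched together in Python. This is an illustration under assumptions, not DedeCMS's own logic: the sample HTML, the region markers (`<div class="listbox">` and `</div>`), and the function names are all made up for the example. Only the "PHP_jiaocheng/20" string and the URLs come from the walkthrough.

```python
import re
from urllib.parse import urljoin

# Simple href extractor; good enough for a sketch, not a full HTML parser.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def cut_region(html, start_marker, end_marker):
    """Return the HTML between the two markers, or the whole page if the start is missing."""
    begin = html.find(start_marker)
    if begin == -1:
        return html
    begin += len(start_marker)
    end = html.find(end_marker, begin)
    return html[begin:] if end == -1 else html[begin:end]


def article_urls(html, base_url, start_marker="", end_marker="", must_contain=""):
    """Collect absolute link URLs from the region, keeping those with must_contain."""
    region = cut_region(html, start_marker, end_marker) if start_marker else html
    result = []
    for href in HREF_RE.findall(region):
        url = urljoin(base_url, href)
        if must_contain and must_contain not in url:
            continue
        if url not in result:  # skip duplicates, keep first-seen order
            result.append(url)
    return result


# Hypothetical list-page fragment.
sample = """
<div class="nav"><a href="/web-art/">Webmaster Academy</a></div>
<div class="listbox">
  <a href="/web-art/PHP_jiaocheng/20070420/38633.html">Regular Expressions</a>
  <a href="/web-art/PHP_jiaocheng/list_14_2.html">Next page</a>
</div>
<div class="footer"><a href="/about.html">About</a></div>
"""

found = article_urls(
    sample,
    base_url="http://www.dedecms.com/web-art/PHP_jiaocheng/list_14_1.html",
    start_marker='<div class="listbox">',
    end_marker="</div>",
    must_contain="PHP_jiaocheng/20",
)
print(found)
```

The region cut removes the navigation and footer links, and the "must contain" check then drops the paging link that remains inside the region.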
4. Web page content acquisition rules
With the list settings covered, we now turn to the content acquisition rules. If collection were a meal, steps one to three would only be the appetizer leading up to the main course. This step explains how to collect the article content from the target site and is the core of the whole collection process.
Return to the PHP tutorial list on the DedeCMS site and open an article from the list. Here we take the article "Regular Expressions" as an example: http://www.dedecms.com/web-art/PHP_jiaocheng/20070420/38633.html. Copy this address into "Preview URL". Because none of the articles on the DedeCMS site are paginated, there is no need to set pagination here, and you can go directly to the "Fixed Collection Project" page.
(Note: If the collected content is paginated, you only need to set a matching rule for the pagination navigation area. Choose either "fully listed pagination list" or "previous/next page form or incomplete pagination list", depending on the content.)
Fully listed pagination list: the pagination navigation lists links to all pages.
Previous/next page form or incomplete pagination list: each page shows only the current page of content, and the navigation lists only some of the pages.
5. Fixed collection items
In this step, we start by analyzing the page source code. Collection is nothing more than analyzing the structure of the HTML page to obtain the content we need. You therefore need some understanding of HTML and must be able to find the required content by viewing the page source. It is best to open several pages for analysis and look for what they have in common.
We recommend using Dreamweaver for the analysis. Use its search function often when analyzing the page code. In particular, after finding a tag, search for it to check whether it appears more than once, which reduces analysis errors.
1) Article title: The title of this page is "Regular Expression". Copy it and press Ctrl+F in Dreamweaver to search for all occurrences; there are 30 records. Because the rule must match uniquely, we select the tag that wraps "Regular Expression" on line 105, copy it into the article title matching rule under "Fixed Collection Project", and replace the title text with the keyword "[content]".
2) Author: Continue searching with "author" as the keyword. It occurs uniquely on line 110. Copy it together with the tags before and after it into the matching rule, and replace the part to be collected with [content].
3) Source: Same as above. Find the tag on line 109, copy it, and replace the part to be collected with [content]. If the source contains hyperlink tags that you want to remove, fill in the following rules in the filter rule box to filter them out:
<a([^>]*)>
</a>
4) Release time: On line 111, copy, paste, and modify in the same way as above.
5) Article content: Search for the beginning of the article content, for example "Part One"; the target is on line 118. Clicking the tag in the status bar shows that it does not select the entire article content, so continue to the previous ...
At this point, the content filtering settings have been completed.
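As a rough illustration of how a "[content]" matching rule and a link filter work, here is a Python sketch. It is not DedeCMS's implementation. The sample page, its tags and class names, the author value, and the helper names `rule_to_regex`, `extract`, and `strip_links` are all hypothetical, since the real tags on the target page are not shown in this walkthrough.

```python
import re


def rule_to_regex(rule):
    """Turn a rule such as '<h1>[content]</h1>' into a regex capturing the [content] part."""
    before, sep, after = rule.partition("[content]")
    if not sep:
        raise ValueError("rule must contain [content]")
    # Escape the literal HTML around the placeholder; capture lazily across lines.
    return re.compile(re.escape(before) + r"(.*?)" + re.escape(after), re.DOTALL)


def extract(html, rule):
    """Return the text captured by the rule, or None if the rule does not match."""
    match = rule_to_regex(rule).search(html)
    return match.group(1).strip() if match else None


def strip_links(text):
    """Remove <a ...> and </a> tags while keeping the link text."""
    return re.sub(r"</?a(\s[^>]*)?>", "", text, flags=re.IGNORECASE)


# Hypothetical article fragment.
page = """
<h1 class="title">Regular Expressions</h1>
<span class="author">Author: admin</span>
<span class="source">Source: <a href="http://example.com/">Example Site</a></span>
"""

rules = {
    "title": '<h1 class="title">[content]</h1>',
    "author": '<span class="author">Author: [content]</span>',
    "source": '<span class="source">Source: [content]</span>',
}

fields = {name: extract(page, rule) for name, rule in rules.items()}
fields["source"] = strip_links(fields["source"])
print(fields)
```

The sketch also shows why uniqueness matters: `search` returns the first match, so a rule whose surrounding tags occur earlier in the page would capture the wrong text.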
6. Node collection
If you create the collection node in one go and the test succeeds, click the button as prompted to collect directly. If the node was written earlier, go to the "Node Management Page", check the nodes to be collected, and press the "Collect" button. If you want to collect new content from all nodes, use the monitoring collection page.
You can set how many items are collected per page. Generally, do not set this too high, otherwise the system may not be able to process them and some items will fail to be collected. It is recommended not to exceed 15.
The number of threads is how many threads collect at the same time. Increasing the number of threads can speed up the collection, but it also uses more server resources, so use it with caution. If the target site has an anti-refresh limit, set the interval here according to the target site's anti-refresh time limit. Otherwise, the default is 0 seconds.
Additional options: these three settings are self-explanatory, so choose according to your actual needs.
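The interplay of the three settings (items per page, thread count, and wait interval) can be sketched in Python as follows. This is an analogy, not how DedeCMS schedules its collection. The `collect` function, its parameter names, the `fake_fetch` stand-in, and the example.com URLs are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def collect(urls, fetch, per_page=10, threads=1, interval=0.0):
    """Fetch urls in batches of per_page, with `threads` workers per batch,
    sleeping `interval` seconds between batches."""
    results = []
    for i in range(0, len(urls), per_page):
        batch = urls[i:i + per_page]
        with ThreadPoolExecutor(max_workers=threads) as pool:
            # pool.map returns results in the same order as the input batch.
            results.extend(pool.map(fetch, batch))
        if interval and i + per_page < len(urls):
            time.sleep(interval)  # respect the target site's anti-refresh limit
    return results


def fake_fetch(url):
    """Stand-in for a real download so the sketch runs without network access."""
    return f"content of {url}"


urls = [f"http://example.com/article/{n}.html" for n in range(1, 31)]
pages = collect(urls, fake_fetch, per_page=15, threads=2, interval=0)
print(len(pages))
```

Smaller batches and fewer threads put less load on both servers, at the cost of a longer total run.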
Collection completed.