Scraping ringtone data from the official Ringtone Duoduo website with Java multithreading

Jan 05, 2017 02:16 PM
java multithreading

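I have long wanted to practice scraping data with Java multithreading.

One day I noticed that the official Ringtone Duoduo website (http://www.shoujiduoduo.com/main/) holds a large amount of data.

Looking at the AJAX request their front end uses to fetch ringtone data: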

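http://www.shoujiduoduo.com/ringweb/ringweb.php?type=getlist&listid={category ID}&page={page number}

it is easy to see that changing listid and page is all it takes to get ringtone JSON data from the server.

Parsing that JSON shows that every response carries a marker such as {"hasmore":1,"curpage":1}. The value of hasmore decides whether the next page should be fetched.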
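As a minimal sketch of that paging check, the marker can be read with a regular expression before the rest of the JSON is parsed. The class name `PageMarker` and the sample response string are made up for illustration. Only the `hasmore` and `curpage` field names come from the observed responses.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original project.
class PageMarker {
    // Matches the marker fields "hasmore":<n>,"curpage":<n> anywhere in the response.
    private static final Pattern MARKER =
            Pattern.compile("\"hasmore\":([0-9]+),\"curpage\":([0-9]+)");

    /** Returns the hasmore value, or 0 when the marker is missing. */
    static int hasMore(String response) {
        Matcher m = MARKER.matcher(response);
        return m.find() ? Integer.parseInt(m.group(1)) : 0;
    }

    /** Returns the curpage value, or 0 when the marker is missing. */
    static int curPage(String response) {
        Matcher m = MARKER.matcher(response);
        return m.find() ? Integer.parseInt(m.group(2)) : 0;
    }

    /** Removes the marker object so that only ring entries remain in the array. */
    static String stripMarker(String response) {
        return response
                .replaceAll("\\{\"hasmore\":[0-9]*,\"curpage\":[0-9]*\\},", "")
                .replaceAll(",]", "]");
    }

    public static void main(String[] args) {
        // Made-up sample in the shape described above, not a captured response.
        String sample = "[{\"hasmore\":1,\"curpage\":2},{\"id\":\"42\",\"name\":\"demo\"}]";
        System.out.println(hasMore(sample));
        System.out.println(curPage(sample));
        System.out.println(stripMarker(sample));
    }
}
```

Returning 0 when the marker is absent means a malformed or empty response simply ends the crawl instead of looping.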

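However, the JSON returned by the getlist link does not include the ringtone's download address.

It soon becomes clear what happens when you click "Download" on the page:

the request below returns the download address of a ringtone.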

http://www.shoujiduoduo.com/ringweb/ringweb.php?type=geturl&act=down&rid={ringtone ID}
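The two request URLs can be filled in with `String.format`, roughly as sketched here. The class and method names are hypothetical and the ID values are placeholders. The sketch assumes the ringtone ID is handled as a `String`, which is why its placeholder is `%s` and not `%d`. Passing a `String` to `%d` throws an `IllegalFormatConversionException`.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of the original project.
class DuoduoUrls {
    static final String LIST_URL =
            "http://www.shoujiduoduo.com/ringweb/ringweb.php?type=getlist&listid=%d&page=%d";
    static final String DOWN_URL =
            "http://www.shoujiduoduo.com/ringweb/ringweb.php?type=geturl&act=down&rid=%s";

    /** Builds the list URL for one category and page. */
    static String listUrl(int listId, int page) {
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, LIST_URL, listId, page);
    }

    /** Builds the download-address URL for one ringtone ID. */
    static String downUrl(String rid) {
        return String.format(Locale.ROOT, DOWN_URL, rid);
    }

    public static void main(String[] args) {
        // Placeholder IDs, chosen only to show the resulting URL shape.
        System.out.println(listUrl(1, 2));
        System.out.println(downUrl("123"));
    }
}
```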

So their data is very easy to take. And so I got started...

The source code is on GitHub. Have a look if you are interested.

GitHub: https://github.com/yongbo000/DuoduoAudioRobot

Here is the code:

<pre class="brush:java;">package me.yongbo.DuoduoRingRobot;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.Iterator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import com.google.gson.Gson;
import com.google.gson.JsonArray;
import com.google.gson.JsonElement;
import com.google.gson.JsonParser;

/**
 * @author yongbo_
 * @created 2013/4/16
 */
// Ring and DbHelper are project classes from the same package (not shown here).
public class DuoduoRingRobotClient implements Runnable {
    public static String GET_RINGINFO_URL = "http://www.shoujiduoduo.com/ringweb/ringweb.php?type=getlist&listid=%1$d&page=%2$d";
    // rid is passed as a String, so the placeholder must be %s rather than %d
    public static String GET_DOWN_URL = "http://www.shoujiduoduo.com/ringweb/ringweb.php?type=geturl&act=down&rid=%1$s";
    public static String ERROR_MSG = "The robot for listId %1$d hit an error and stopped automatically. Current page: %2$d";
    public static String STATUS_MSG = "Start scraping, current listId: %1$d, current page: %2$d";
    public static String FILE_DIR = "E:/RingData/";
    public static String FILE_NAME = "listId=%1$d.txt";

    private boolean errorFlag = false;
    private int listId;
    private int page;
    private int endPage = -1;
    private int hasMore = 1;
    private DbHelper dbHelper;

    /**
     * Constructor
     * @param listId    menu ID
     * @param beginPage first page to fetch
     * @param endPage   last page to fetch (-1 means no limit)
     */
    public DuoduoRingRobotClient(int listId, int beginPage, int endPage) {
        this.listId = listId;
        this.page = beginPage;
        this.endPage = endPage;
        this.dbHelper = new DbHelper();
    }

    /**
     * Constructor
     * @param listId menu ID
     * @param page   first page to fetch
     */
    public DuoduoRingRobotClient(int listId, int page) {
        this(listId, page, -1);
    }

    /**
     * Fetch one page of ringtones
     */
    public void getRings() {
        String url = String.format(GET_RINGINFO_URL, listId, page);
        String responseStr = httpGet(url);
        if (errorFlag) {
            // The request failed; an empty response cannot be parsed as a JSON array
            return;
        }
        hasMore = getHasmore(responseStr);
        // The curpage value returned by the server is used as the next page to request
        page = getNextPage(responseStr);
        // Strip the {"hasmore":..,"curpage":..} marker so only ring entries remain
        ringParse(responseStr
                .replaceAll("\\{\"hasmore\":[0-9]*,\"curpage\":[0-9]*\\},", "")
                .replaceAll(",]", "]"));
    }

    /**
     * Send an HTTP GET request
     * @param webUrl request URL
     */
    public String httpGet(String webUrl) {
        StringBuilder sb = new StringBuilder();
        String resultStr = "";
        try {
            URL url = new URL(webUrl);
            URLConnection conn = url.openConnection();
            conn.connect();
            // try-with-resources closes the stream and reader even on failure
            try (InputStream is = conn.getInputStream();
                 BufferedReader bufReader = new BufferedReader(new InputStreamReader(is))) {
                String lineText;
                while ((lineText = bufReader.readLine()) != null) {
                    sb.append(lineText);
                }
            }
            resultStr = sb.toString();
        } catch (Exception e) {
            errorFlag = true;
            // Write the error to the txt file
            writeToFile(String.format(ERROR_MSG, listId, page));
        }
        return resultStr;
    }

    /**
     * Convert the JSON string into Ring objects and store them
     * @param json JSON string
     */
    public void ringParse(String json) {
        Ring ring = null;
        JsonElement element = new JsonParser().parse(json);
        JsonArray array = element.getAsJsonArray();
        // Iterate over the array
        Iterator<JsonElement> it = array.iterator();
        Gson gson = new Gson();
        while (it.hasNext() && !errorFlag) {
            JsonElement e = it.next();
            // Convert the JsonElement into a JavaBean object
            ring = gson.fromJson(e, Ring.class);
            ring.setDownUrl(getRingDownUrl(ring.getId()));
            if (isAvailableRing(ring)) {
                System.out.println(ring.toString());
                // Write either to the database or to a text file
                // writeToFile(ring.toString());
                writeToDatabase(ring);
            }
        }
    }

    /**
     * Append a line to the txt file
     * @param data the string to write
     */
    public void writeToFile(String data) {
        String path = FILE_DIR + String.format(FILE_NAME, listId);
        File dir = new File(FILE_DIR);
        File file = new File(path);
        FileWriter fw = null;
        if (!dir.exists()) {
            dir.mkdirs();
        }
        try {
            if (!file.exists()) {
                file.createNewFile();
            }
            fw = new FileWriter(file, true);
            fw.write(data);
            fw.write("\r\n");
            fw.flush();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if (fw != null) {
                    fw.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    /**
     * Write to the database
     * @param ring a Ring instance
     */
    public void writeToDatabase(Ring ring) {
        dbHelper.execute("addRing", ring);
    }

    @Override
    public void run() {
        while (hasMore == 1 && !errorFlag) {
            if (endPage != -1 && page > endPage) {
                break;
            }
            System.out.println(String.format(STATUS_MSG, listId, page));
            getRings();
            System.out.println("Finished writing this page's data");
        }
        System.out.println("ending...");
    }

    private int getHasmore(String resultStr) {
        Pattern p = Pattern.compile("\"hasmore\":([0-9]*),\"curpage\":([0-9]*)");
        Matcher match = p.matcher(resultStr);
        if (match.find()) {
            return Integer.parseInt(match.group(1));
        }
        return 0;
    }

    private int getNextPage(String resultStr) {
        Pattern p = Pattern.compile("\"hasmore\":([0-9]*),\"curpage\":([0-9]*)");
        Matcher match = p.matcher(resultStr);
        if (match.find()) {
            return Integer.parseInt(match.group(2));
        }
        return 0;
    }

    /**
     * Check whether the Ring is usable. It is rejected when the duration is not
     * a whole number, when the name or artist is longer than 50 characters,
     * or when the download URL is empty.
     * @param ring the current Ring instance
     */
    private boolean isAvailableRing(Ring ring) {
        Pattern p = Pattern.compile("^[1-9][0-9]*$");
        Matcher match = p.matcher(ring.getDuration());
        if (!match.find()) {
            return false;
        }
        if (ring.getName().length() > 50 || ring.getArtist().length() > 50
                || ring.getDownUrl().length() == 0) {
            return false;
        }
        return true;
    }

    /**
     * Get the download address of a ringtone
     * @param rid ringtone ID
     */
    public String getRingDownUrl(String rid) {
        String url = String.format(GET_DOWN_URL, rid);
        String responseStr = httpGet(url);
        return responseStr;
    }
}
</pre>
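
The listing defines the `Runnable` but does not show how several robots are started at once. One possible way, sketched here as an assumption and not as the author's launcher, is a fixed thread pool with one task per listId. A stub task stands in for `DuoduoRingRobotClient` so the example runs on its own. The class name `RobotLauncher`, the list IDs and the pool size are all hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical launcher for illustration; not part of the original project.
class RobotLauncher {
    /** Runs one task per list ID on a fixed-size pool and returns how many tasks finished. */
    static int runAll(int[] listIds, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger finished = new AtomicInteger();
        for (int listId : listIds) {
            // In the real program this task would be: new DuoduoRingRobotClient(listId, 1)
            Runnable stub = () -> {
                System.out.println("crawling listId " + listId);
                finished.incrementAndGet();
            };
            pool.execute(stub);
        }
        pool.shutdown(); // accept no new tasks, let the queued ones complete
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return finished.get();
    }

    public static void main(String[] args) {
        // Made-up list IDs and pool size.
        System.out.println(runAll(new int[] {1, 2, 3}, 2) + " robots finished");
    }
}
```

A bounded pool keeps the number of simultaneous connections to the site fixed, whereas starting one raw `Thread` per category would scale with the number of list IDs.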