Java hangs when calling a Python Spark program: how to solve Runtime.getRuntime().exec() blocking

Apr 01, 2025, 10:42 PM
Analysis and solution: Python code hangs when called from Java

When calling Python code from Java, you can run into problems that are hard to diagnose, such as the program hanging and never continuing. This article analyzes one specific case and provides a solution.

Problem description: The developer uses Java's Runtime.getRuntime().exec() method to run a Python script, and the script uses Spark for data processing. On the Java side, the script's output is read through the Process object. However, once the Python script reaches sorted_word_count.take(20), the Java program hangs and cannot continue.

The Python script is as follows:

 import json
import sys

from pyspark.sql import Row, SparkSession

spark = SparkSession.builder.appName("read from java backend").master("local[*]").getOrCreate()

# Get the passed parameter
comment = sys.argv[1]

# Convert the JSON string to a Python object
comment = json.loads(comment)

# Convert the comment list to an RDD
comment_rdd = spark.sparkContext.parallelize(comment)

# Convert the RDD to a DataFrame
df = spark.createDataFrame(comment_rdd.map(lambda x: Row(**x)))

# Load the stop word list
stop_words = spark.sparkContext.textFile("c:/users/10421/downloads/baidu_stopwords.txt").collect()

# ... (Some code is omitted here) ...

# Count the occurrences of each word
word_count = df.rdd.map(lambda x: (x.word, 1)).reduceByKey(lambda x, y: x + y)
sorted_word_count = word_count.sortBy(lambda x: x[1], ascending=False)
top_20_words = sorted_word_count.take(20)
column = 0
for row in top_20_words:
    print(row[column])

The Java code snippet is as follows:

 import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;

Process process = Runtime.getRuntime().exec(args1);

// Get the program's output
InputStream inputStream = process.getInputStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream, "gb2312"));
// ... (Some code is omitted here) ...

Problem analysis: Testing showed that the Java program hangs when the Python script executes sorted_word_count.take(20). take(20) is a Spark action, so it blocks until Spark has finished the computation and returned the result. Reads from the stream returned by process.getInputStream() are blocking as well: if the Python program does not write to standard output in time, the Java program keeps waiting for it and appears to hang.
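The blocking described above can be illustrated with a minimal sketch. It assumes a hypothetical helper class, PipeDrainer, which is not part of the original code: stderr is consumed on a background thread while stdout is read on the calling thread, and waitFor() is called only after both pipes have been read to the end. Spark's console logging goes to stderr by default, so an unread error stream can fill its pipe buffer and stall the child process.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical helper (not from the original code): drains both pipes of a child process. */
final class PipeDrainer {

    /** Reads a stream to its end and returns its lines; blocks until the writer closes the pipe. */
    static List<String> readLines(InputStream in, Charset charset) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    /** Starts the command, drains stderr in the background and stdout on the calling thread. */
    static List<String> run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).start();
        // Consume stderr concurrently so the child never stalls on a full pipe.
        CompletableFuture<List<String>> stderr = CompletableFuture.supplyAsync(
                () -> readLines(process.getErrorStream(), StandardCharsets.UTF_8));
        List<String> stdout = readLines(process.getInputStream(), StandardCharsets.UTF_8);
        stderr.join();     // wait until stderr has been fully consumed
        process.waitFor(); // only now wait for the child to exit
        return stdout;
    }
}
```

A call such as PipeDrainer.run(Arrays.asList(args1)) would return the lines the script printed. UTF-8 is an assumption here and has to match the encoding the Python script actually writes.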

Solution: The problem most likely lies in character encoding. The original code reads the Python output as GB2312, which may not match the encoding the Python script actually uses for its output, and this mismatch can cause the read to block. Changing the Java code to read the Python output as UTF-8 can solve the problem.

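Modified Java code:

 // Read both standard output and standard error of the Python process as UTF-8
InputStream errorStream = process.getErrorStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream, "UTF-8"));
BufferedReader reader2 = new BufferedReader(new InputStreamReader(errorStream, "UTF-8"));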

Changing the encoding used to read both the input stream and the error stream in the Java code to UTF-8 resolves the hang. Note that the modified code also reads the error stream: output written to stderr has to be consumed as well, because a child process can block once its output pipe buffer is full. The Python script must in turn make sure its own output is encoded as UTF-8. If the problem persists, check how long the Spark job takes to run and whether the Python script contains other blocking operations.
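One way to make sure the Python output really is UTF-8 is to set it from the Java side when the process is launched. The sketch below is an assumption, not the original author's code: the PythonLauncher class and the script name job.py are hypothetical. It relies on two environment variables documented by CPython: PYTHONIOENCODING, which overrides the encoding of stdin, stdout, and stderr, and PYTHONUNBUFFERED, which makes stdout and stderr unbuffered so that output reaches the pipe promptly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not from the original code): configures how the Python child is launched. */
final class PythonLauncher {

    /** Builds a ProcessBuilder for "<pythonExe> <script> <args...>" with UTF-8, unbuffered output. */
    static ProcessBuilder configure(String pythonExe, String script, List<String> args) {
        List<String> command = new ArrayList<>();
        command.add(pythonExe);
        command.add(script);
        command.addAll(args);

        ProcessBuilder builder = new ProcessBuilder(command);
        Map<String, String> env = builder.environment();
        env.put("PYTHONIOENCODING", "utf-8"); // Python encodes stdout/stderr as UTF-8
        env.put("PYTHONUNBUFFERED", "1");     // Python writes output without buffering
        return builder;
    }
}
```

Calling PythonLauncher.configure("python", "job.py", args).start() then yields a Process whose streams can be read with UTF-8 readers like the ones in the modified Java code.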
