A CSV file stores a large volume of order data.
Use Java to process this file: find orders whose amounts are at least 3,000 and less than 5,000, group them by customer, and compute the total order amount and the order count for each customer.
Write the following SPL statement:
=file("d:/OrdersBig.csv").cursor@mtc(;8).select(Amount>=3000 && Amount<5000).groups(Client;sum(Amount):amt,count(1):cnt)
The cursor() function parses a large file that cannot fit into memory; by default, it computes serially. The @m option enables multithreaded data retrieval, and 8 is the number of parallel threads; the @t option imports the first line as column titles; and the @c option uses a comma as the separator. select() then keeps the orders whose Amount is at least 3,000 and less than 5,000, and groups() groups them by Client, computing sum(Amount) as amt and count(1) as cnt.
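For comparison, the same filter, group, and aggregate logic can be sketched in plain Java. This is only an illustrative sketch, not the StackOverflow solution and not how SPL works internally. The class name OrderSummary and the method summarize are hypothetical. The column titles Client and Amount are taken from the SPL statement, and the file is assumed to be simple comma-separated text with no quoted fields. Note that the file is still read serially here; only the per-line work is handed to a parallel stream, so it does not reproduce the multithreaded retrieval of the @m option.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper class; not part of SPL or of the original article.
final class OrderSummary {

    // One parsed CSV row, reduced to the two columns the query needs.
    private static final class Order {
        final String client;
        final double amount;

        Order(String client, double amount) {
            this.client = client;
            this.amount = amount;
        }
    }

    // Returns client -> statistics, where getSum() plays the role of amt
    // and getCount() plays the role of cnt in the SPL statement.
    static Map<String, DoubleSummaryStatistics> summarize(Path csv) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(csv)) {
            String headerLine = reader.readLine();
            if (headerLine == null) {
                return new TreeMap<>(); // empty file, nothing to aggregate
            }
            // Locate the columns by title (assumes the first line is a header).
            List<String> header = Arrays.asList(headerLine.split(",", -1));
            int clientIdx = header.indexOf("Client");
            int amountIdx = header.indexOf("Amount");
            if (clientIdx < 0 || amountIdx < 0) {
                throw new IllegalArgumentException("Client and Amount columns are required");
            }
            // Lines are streamed rather than loaded at once, so the whole
            // file never has to fit into memory.
            return reader.lines()
                    .parallel()
                    .filter(line -> !line.isEmpty())
                    .map(line -> line.split(",", -1)) // naive split: no quoted fields
                    .map(f -> new Order(f[clientIdx], Double.parseDouble(f[amountIdx])))
                    .filter(o -> o.amount >= 3000 && o.amount < 5000)
                    .collect(Collectors.groupingBy(
                            o -> o.client,
                            TreeMap::new,
                            Collectors.summarizingDouble(o -> o.amount)));
        }
    }
}
```

Calling OrderSummary.summarize(Paths.get("d:/OrdersBig.csv")) would return one entry per client. Even this simplified version needs explicit header handling, parsing, and a collector, which the single SPL statement covers with options and two function calls.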
See "How to Call a SPL Script in Java" to learn how to integrate SPL into a Java application.
This problem comes from a question on StackOverflow. The conventional solution there is quite complicated, whereas the SPL approach is simple and efficient.