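The computing dilemma in applications
Development or framework: which takes priority?
Java is the most commonly used programming language in application development. But writing data processing code in Java is not simple. For example, below is the Java code for grouping by two fields and aggregating: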
Map<Integer, Map<String, Double>> summary = new HashMap<>();
for (Order order : orders) {
    int year = order.orderDate.getYear();
    String sellerId = order.sellerId;
    double amount = order.amount;
    Map<String, Double> salesMap = summary.get(year);
    if (salesMap == null) {
        salesMap = new HashMap<>();
        summary.put(year, salesMap);
    }
    Double totalAmount = salesMap.get(sellerId);
    if (totalAmount == null) {
        totalAmount = 0.0;
    }
    salesMap.put(sellerId, totalAmount + amount);
}
for (Map.Entry<Integer, Map<String, Double>> entry : summary.entrySet()) {
    int year = entry.getKey();
    Map<String, Double> salesMap = entry.getValue();
    System.out.println("Year: " + year);
    for (Map.Entry<String, Double> salesEntry : salesMap.entrySet()) {
        String sellerId = salesEntry.getKey();
        double totalAmount = salesEntry.getValue();
        System.out.println(" Seller ID: " + sellerId + ", Total Amount: " + totalAmount);
    }
}
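By contrast, the SQL counterpart is much simpler. One GROUP BY clause is enough to complete the computation.
SELECT year(orderdate), sellerid, sum(amount) FROM orders GROUP BY year(orderdate), sellerid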
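In fact, early applications worked through the collaboration of Java and SQL. The business process was implemented in Java on the application side, and data was processed in SQL in the backend database. Because of database limitations, this framework is hard to scale and migrate, which is very unfriendly to contemporary applications. Moreover, SQL is often unavailable, for example when there is no database or when cross-database computations are involved.
In view of this, many applications later adopted a fully Java-based framework, especially after microservices emerged. The database performs only simple read and write operations, while both the business process and the data processing are implemented in Java on the application side. In this way the application is decoupled from the database and gains good scalability and portability. The framework advantages come at a price, though: the Java development complexity mentioned earlier.
It seems that we can only focus on one aspect, development or framework. To enjoy the advantages of the Java framework, we must endure the difficulty of development; to use SQL, we need to tolerate the framework's shortcomings. This creates a dilemma.
So what can we do?
What about enhancing Java's data processing capabilities? That would avoid the SQL problems and also overcome Java's shortcomings.
Actually, Java Stream, Kotlin and Scala are all trying to do this.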
Stream
Stream, introduced in Java 8, adds many data processing methods. Below is the Stream code for the above computation:
Map<Integer, Map<String, Double>> summary = orders.stream()
    .collect(Collectors.groupingBy(
        order -> order.orderDate.getYear(),
        Collectors.groupingBy(
            order -> order.sellerId,
            Collectors.summingDouble(order -> order.amount)
        )
    ));
summary.forEach((year, salesMap) -> {
    System.out.println("Year: " + year);
    salesMap.forEach((sellerId, totalAmount) -> {
        System.out.println(" Seller ID: " + sellerId + ", Total Amount: " + totalAmount);
    });
});
Stream does simplify the code to some extent. But overall it is still cumbersome, and far less concise than SQL.
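Part of the verbosity above comes from the nested map. As a minimal sketch, not taken from the original article, the same Stream approach can group by a composite key so that the result is flat like the rows a SQL GROUP BY returns. The `Order` and `YearSeller` records and the `FlatGrouping` class are hypothetical names introduced only for illustration, and records assume Java 16 or later.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class FlatGrouping {
    // Hypothetical order type; the field names follow the snippets above.
    record Order(LocalDate orderDate, String sellerId, double amount) {}

    // Hypothetical composite key playing the role of SQL's (year, sellerid) pair.
    record YearSeller(int year, String sellerId) {}

    // Sum the amounts per (year, seller) pair in a single flat map.
    static Map<YearSeller, Double> totals(List<Order> orders) {
        return orders.stream().collect(Collectors.groupingBy(
                o -> new YearSeller(o.orderDate().getYear(), o.sellerId()),
                Collectors.summingDouble(Order::amount)));
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
                new Order(LocalDate.of(2023, 1, 5), "S1", 100.0),
                new Order(LocalDate.of(2023, 3, 9), "S1", 50.0),
                new Order(LocalDate.of(2024, 2, 1), "S2", 70.0));
        // Iteration order of the resulting map is unspecified.
        totals(orders).forEach((key, total) ->
                System.out.println(key.year() + " " + key.sellerId() + " " + total));
    }
}
```

Even in this flatter form, two extra type declarations are needed before the one-line grouping can be written, which is in line with the point that Stream remains less concise than SQL.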
Kotlin
Kotlin, which claims to be more powerful, goes a step further:
val summary = orders
    .groupBy { it.orderDate.year }
    .mapValues { yearGroup ->
        yearGroup.value
            .groupBy { it.sellerId }
            .mapValues { sellerGroup -> sellerGroup.value.sumOf { it.amount } }
    }
summary.forEach { (year, salesMap) ->
    println("Year: $year")
    salesMap.forEach { (sellerId, totalAmount) ->
        println(" Seller ID: $sellerId, Total Amount: $totalAmount")
    }
}
The Kotlin code is simpler, but the improvement is limited. There is still a big gap compared with SQL.
Scala
Then there is Scala:
val summary = orders
  .groupBy(order => order.orderDate.getYear)
  .mapValues(yearGroup =>
    yearGroup
      .groupBy(_.sellerId)
      .mapValues(sellerGroup => sellerGroup.map(_.amount).sum)
  )
summary.foreach { case (year, salesMap) =>
  println(s"Year: $year")
  salesMap.foreach { case (sellerId, totalAmount) =>
    println(s" Seller ID: $sellerId, Total Amount: $totalAmount")
  }
}
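Scala is a little simpler than Kotlin, but it still cannot compare with SQL. In addition, Scala is heavyweight and inconvenient to use.
In fact, although these technologies are not perfect, they are on the right path.
Compiled languages cannot be hot-swapped
Moreover, Java, as a compiled language, lacks support for hot swapping. Modifying code requires recompilation and redeployment, and usually a service restart. This results in a poor experience when requirements change frequently. SQL, by contrast, has no problem in this respect.
Java development is complex, and the framework has flaws too. SQL can hardly meet the framework requirements. The dilemma is hard to resolve. Is there any other way?
The ultimate solution: esProc SPL
esProc SPL is a data processing language developed purely in Java. It is simple to develop in and offers a flexible framework.
Concise syntax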
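Recall the Java implementation of the grouping and aggregation operation shown earlier. Compared with that Java code, the SPL code is far more concise: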
Orders.groups(year(orderdate),sellerid;sum(amount))
It is as simple as the SQL implementation:
SELECT year(orderdate),sellerid,sum(amount) FROM orders GROUP BY year(orderDate),sellerid
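In fact, SPL code is often simpler than SQL code. With support for order-based and procedural computation, SPL handles complex computations better. Consider this example: compute the maximum number of consecutive days a stock rises. SQL needs the following three-level nested statement, which is hard to understand, let alone write.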
select max(continuousDays)-1
from (select count(*) continuousDays
      from (select sum(changeSign) over(order by tradeDate) unRiseDays
            from (select tradeDate,
                         case when closePrice>lag(closePrice) over(order by tradeDate)
                              then 0 else 1 end changeSign
                  from stock) )
      group by unRiseDays)
SPL implements the computation in just one line of code. That is much simpler than the SQL code, not to mention the Java code.
stock.sort(tradeDate).group@i(closePrice<closePrice[-1]).max(~.len())
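For comparison, here is a minimal Java sketch of the same task. It is not taken from the original article; the `RisingStreak` class and `maxRisingDays` method are hypothetical names. It assumes the closing prices are already sorted by trade date and, like the SQL statement above, counts the days on which the close is higher than the previous close.

```java
public class RisingStreak {
    // Longest run of consecutive rising days.
    // Assumes closePrices is already ordered by trade date.
    static int maxRisingDays(double[] closePrices) {
        int best = 0;     // longest run seen so far
        int current = 0;  // length of the run ending at the current day
        for (int i = 1; i < closePrices.length; i++) {
            if (closePrices[i] > closePrices[i - 1]) {
                current++;
                best = Math.max(best, current);
            } else {
                current = 0; // a flat or falling day breaks the run
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] prices = {10, 11, 12, 11, 12, 13, 14, 9};
        System.out.println(maxRisingDays(prices));
    }
}
```

The loop is easy to follow because it walks the data in order, which is the kind of order-based, procedural logic the article says is awkward to express in SQL.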
Comprehensive, independent computing capability
SPL has the table sequence, a specialized structured data object, and offers a rich computing class library built on it to handle a variety of computations, including the common filtering, grouping, sorting, distinct and join operations, as shown below:
Orders.sort(Amount)                                          // Sorting
Orders.select(Amount*Quantity>3000 && like(Client,"*S*"))    // Filtering
Orders.groups(Client; sum(Amount))                           // Grouping
Orders.id(Client)                                            // Distinct
join(Orders:o,SellerId ; Employees:e,EId)                    // Join
……
More importantly, SPL's computing capability is independent of databases and works even when no database is present, unlike ORM technologies, which must translate operations into SQL for execution.
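For readers who think in Java, the sketch below shows rough Stream counterparts of three of the SPL operations above. It is an illustration only: the `Order` record and the `StreamEquivalents` class are hypothetical, and the mapping assumes that `like(Client,"*S*")` means "Client contains S" and that `Orders.id(Client)` yields the distinct client values.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class StreamEquivalents {
    // Hypothetical order type; the field names mirror the SPL snippet above.
    record Order(String client, double amount, int quantity) {}

    // Roughly Orders.sort(Amount)
    static List<Order> sortByAmount(List<Order> orders) {
        return orders.stream()
                .sorted(Comparator.comparingDouble(Order::amount))
                .collect(Collectors.toList());
    }

    // Roughly Orders.select(Amount*Quantity>3000 && like(Client,"*S*"))
    static List<Order> bigOrdersWithS(List<Order> orders) {
        return orders.stream()
                .filter(o -> o.amount() * o.quantity() > 3000 && o.client().contains("S"))
                .collect(Collectors.toList());
    }

    // Roughly Orders.id(Client); sorted here only to give a stable order
    static List<String> distinctClients(List<Order> orders) {
        return orders.stream()
                .map(Order::client)
                .distinct()
                .sorted()
                .collect(Collectors.toList());
    }
}
```

Each operation needs the stream, collect and type boilerplate around the one expression that carries the actual logic.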
Efficient and easy-to-use IDE
Besides concise syntax, SPL has a comprehensive development environment with debugging functions such as “Step over” and “Set breakpoint”, plus a debugging-friendly WYSIWYG result panel that lets users check the result of each step in real time.
Support for large-scale data computing
SPL can process large-scale data whether or not it fits into memory.
In-memory computation:
External memory computation:
As we can see, the SPL code for an external memory computation is basically the same as that for an in-memory computation, without extra computational load.
It is easy to implement parallelism in SPL: we just add the @m option to the serial code. This is far simpler than the corresponding Java approach.
Seamless integration into Java applications
SPL is developed in Java, so it can be embedded in a Java application as JARs, and the application executes or invokes SPL scripts through the standard JDBC interface. This makes SPL very lightweight; it can even run on Android.
Call SPL code through JDBC:
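// Requires the esProc JDBC driver JARs on the classpath and the java.sql.* imports
Class.forName("com.esproc.jdbc.InternalDriver");
Connection con = DriverManager.getConnection("jdbc:esproc:local://");
// Call the SPL script named SplScript with one parameter
CallableStatement st = con.prepareCall("call SplScript(?)");
st.setObject(1, "A");
st.execute();
ResultSet rs = st.getResultSet();
ResultSetMetaData rsmd = rs.getMetaData();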
Being lightweight and integration-friendly, SPL can be seamlessly integrated into mainstream Java frameworks, and is especially suitable as a computing engine within microservice architectures.
Highly open framework
SPL’s great openness enables it to directly connect to various types of data sources and perform real-time mixed computations, making it easy to handle computing scenarios where databases are unavailable or multiple/diverse databases are involved.
As long as a data source is accessible, SPL can read data from it and perform mixed computations: database with database, RESTful with file, JSON with database, any combination works.
Databases:
RESTful and file:
JSON and database:
Interpreted execution and hot-swapping
SPL is an interpreted language and inherently supports hot swapping: modified code takes effect immediately, without a service restart. This makes SPL well suited to frequently changing data processing requirements.
This hot-swapping capability allows computing modules to be managed, maintained and operated independently, which makes them more flexible and convenient to use.
SPL can significantly increase Java programmers’ development efficiency while retaining the framework advantages. It combines the merits of both Java and SQL, and further simplifies code and improves performance.
SPL open source address