Splitting Strings with Delimiters Preserved
When working with a multiline string that is delimited by several different delimiters, you often need both the text between the delimiters and the delimiters themselves. However, String.split() by default discards whatever the regex matched and returns only the text in between.
Problem:
Consider the following string:
(Text1)(DelimiterA)(Text2)(DelimiterC)(Text3)(DelimiterB)(Text4)
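Splitting this string using String.split() yields only the text parts:
(Text1)(Text2)(Text3)(Text4)
Desired Output:
(Text1)(DelimiterA)(Text2)(DelimiterC)(Text3)(DelimiterB)(Text4)
To get this result, we need an approach that splits at each delimiter but keeps the delimiter as an element of the result.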
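To make the problem concrete, here is a minimal sketch of the default behavior. The input string and the three single-character delimiters (;, , and |) are assumptions chosen for illustration, as are the class and method names; they stand in for the placeholders used above.

```java
import java.util.Arrays;

public class PlainSplitDemo {
    // Hypothetical helper: splits on any one of the assumed delimiters ; , |
    // Inside a character class, | is a literal character, not alternation.
    static String[] plainSplit(String input) {
        return input.split("[;,|]");
    }

    public static void main(String[] args) {
        // The delimiters are consumed by the split and do not appear in the result.
        System.out.println(Arrays.toString(plainSplit("Text1;Text2,Text3|Text4")));
        // prints [Text1, Text2, Text3, Text4]
    }
}
```

Once the split has run, there is no way to tell from the result which delimiter separated which pair of parts.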
Solution:
The JDK provides a way to achieve this using lookahead and lookbehind Regular Expression (regex) features. Here's how it works:
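<code class="java">// Requires: import java.util.Arrays;
// Lookbehind: split after each ";", so the delimiter stays with the preceding text
System.out.println(Arrays.toString("a;b;c;d".split("(?<=;)")));
// Lookahead: split before each ";", so the delimiter stays with the following text
System.out.println(Arrays.toString("a;b;c;d".split("(?=;)")));
// Both: split before and after each ";", so the delimiter becomes its own element
System.out.println(Arrays.toString("a;b;c;d".split("((?<=;)|(?=;))")));</code>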
This results in the following output:
[a;, b;, c;, d]
[a, ;b, ;c, ;d]
[a, ;, b, ;, c, ;, d]
The last output aligns with the desired format, where each delimiter is retained and the string is split into separate parts.
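Regex Explanation:
(?<=;) is a positive lookbehind: it matches the empty position immediately after a ";".
(?=;) is a positive lookahead: it matches the empty position immediately before a ";".
Both are zero-width assertions, so the split consumes no characters. By combining these patterns with |, we split the string on both sides of every delimiter, which preserves each delimiter as its own element of the output.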
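The same lookaround idea extends to several delimiters at once. The following is a minimal sketch, assuming three hypothetical single-character delimiters (;, , and |) and a made-up input string; the class and method names are illustrative only.

```java
import java.util.Arrays;

public class KeepDelimitersDemo {
    // Zero-width split points: right after, or right before, any of ; , |
    private static final String AROUND_DELIMITERS = "(?<=[;,|])|(?=[;,|])";

    // Hypothetical helper: returns text parts and delimiters as separate elements.
    static String[] splitKeepingDelimiters(String input) {
        return input.split(AROUND_DELIMITERS);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitKeepingDelimiters("Text1;Text2,Text3|Text4")));
        // prints [Text1, ;, Text2, ,, Text3, |, Text4]
    }
}
```

Because the character class matches exactly one character, the lookbehind has a fixed length, which Java's regex engine can handle without trouble.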
Readability Enhancements:
For improved readability, consider storing the pattern in a constant whose name says what the regex does, and filling in the delimiter with String.format:
<code class="java">public static final String WITH_DELIMITER = "((?<=%1$s)|(?=%1$s))";

public void someMethod() {
    final String[] aEach = "a;b;c;d".split(String.format(WITH_DELIMITER, ";"));
    ...
}</code>
This makes the regular expression more self-explanatory and easier to maintain.
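One caveat with the template: the delimiter is inserted into the regex as-is, so a delimiter such as "." or "|" would be interpreted as a regex metacharacter. A possible refinement, sketched here under the assumption that the delimiter should always be treated as literal text, is to pass it through Pattern.quote first. The class name and the splitKeeping helper are hypothetical and not part of the original example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class DelimiterSplitter {
    // Same template as WITH_DELIMITER in the example above.
    static final String WITH_DELIMITER = "((?<=%1$s)|(?=%1$s))";

    // Hypothetical helper: quotes the delimiter so it is matched literally.
    static String[] splitKeeping(String input, String literalDelimiter) {
        String quoted = Pattern.quote(literalDelimiter);
        return input.split(String.format(WITH_DELIMITER, quoted));
    }

    public static void main(String[] args) {
        // Without quoting, "." would match any character.
        System.out.println(Arrays.toString(splitKeeping("a.b.c", ".")));
        // prints [a, ., b, ., c]
    }
}
```

Callers who want to pass a real regex as the delimiter would skip the quoting step and use the template directly, as in the original snippet.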