
Detailed explanation of java regular knowledge

Nov 29, 2019 pm 01:11 PM
java

Expression meaning:

1. Characters

x The character x. For example, a matches the character a.

\\ The backslash character. In Java source code, write it as \\\\: the Java compiler first reduces the string literal \\\\ to the two characters \\, and the regex engine then reads \\ as one literal backslash. For the same reason, every construct in this article that begins with a backslash must have that backslash doubled inside a Java string literal.

\0n The character with octal value 0n (0 <= n <= 7)

\0nn The character with octal value 0nn (0 <= n <= 7)

\0mnn The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7)

\xhh The character with hexadecimal value 0xhh

\uhhhh The character with hexadecimal value 0xhhhh

\t Tab character ('\u0009')

\n New line (line feed) character ('\u000A')

\r Carriage return character ('\u000D')

\f The form-feed character ('\u000C')

\a The alert (bell) character ('\u0007')

\e The escape character ('\u001B')

\cx The control character corresponding to x
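The double-backslash rule is easiest to see in running code. The following is a minimal sketch; the class name EscapeDemo and its helper methods are hypothetical names chosen for illustration.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // The regex for one literal backslash is two backslashes,
    // which takes four backslashes inside a Java string literal.
    static boolean matchesBackslash(String input) {
        return Pattern.matches("\\\\", input);
    }

    // The regex escape for a tab, with its backslash doubled for Java.
    static boolean matchesTab(String input) {
        return Pattern.matches("\\t", input);
    }

    // The regex hexadecimal escape for 0x41, the letter A.
    static boolean matchesHexA(String input) {
        return Pattern.matches("\\x41", input);
    }

    public static void main(String[] args) {
        System.out.println(matchesBackslash("\\")); // true
        System.out.println(matchesTab("\t"));       // true
        System.out.println(matchesHexA("A"));       // true
    }
}
```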

2. Character classes

[abc] a, b, or c (simple class). For example, [egd] matches any one of the characters e, g, or d.

[^abc] Any character except a, b, or c (negation). For example, [^egd] matches any character other than e, g, or d.

[a-zA-Z] a to z or A to Z, inclusive (range)

[a-d[m-p]] a to d, or m to p: [a-dm-p] (union)

[a-z&&[def]] d, e, or f (intersection)

[a-z&&[^bc]] a to z, except b and c: [ad-z] (subtraction)

[a-z&&[^m-p]] a to z, but not m to p: [a-lq-z] (subtraction)
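The union, intersection, and subtraction forms can be checked directly against single characters. This is a small sketch; CharClassDemo and its helper is(...) are hypothetical names.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // True when the whole input matches the given character class.
    static boolean is(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(is("[^abc]", "a"));        // false: a is excluded
        System.out.println(is("[a-d[m-p]]", "n"));    // true: n is in m-p
        System.out.println(is("[a-z&&[def]]", "e"));  // true: e is in both sets
        System.out.println(is("[a-z&&[^bc]]", "b"));  // false: b is subtracted
        System.out.println(is("[a-z&&[^m-p]]", "q")); // true: q is outside m-p
    }
}
```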

3. Predefined character classes (note that the backslash must be written twice, for example, \d is written as \\d)

. Any character (may or may not match line terminators)

\d A digit: [0-9]

\D A non-digit: [^0-9]

\s A whitespace character: [ \t\n\x0B\f\r]

\S A non-whitespace character: [^\s]

\w Word characters: [a-zA-Z_0-9]

\W Non-word characters: [^\w]
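The predefined classes are often used through the String convenience methods. The sketch below assumes nothing beyond the standard library; PredefinedDemo and its method names are hypothetical.

```java
import java.util.Arrays;

class PredefinedDemo {
    // Removes every non-digit character.
    static String digitsOnly(String s) {
        return s.replaceAll("\\D", "");
    }

    // Splits on runs of whitespace after trimming the ends.
    static String[] splitOnWhitespace(String s) {
        return s.trim().split("\\s+");
    }

    // True when the input consists only of word characters.
    static boolean isWord(String s) {
        return s.matches("\\w+");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("tel: 555-0199"));                     // 5550199
        System.out.println(Arrays.toString(splitOnWhitespace(" a\tb  c\n"))); // [a, b, c]
        System.out.println(isWord("snake_case1"));                           // true
        System.out.println(isWord("two words"));                             // false
    }
}
```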

4. POSIX character classes (US-ASCII only) (note that the backslash must be written twice, for example, \p{Lower} is written as \\p{Lower})

\p{Lower} Lowercase alphabetic characters: [a-z]

\p{Upper} Uppercase alphabetic characters: [A-Z]

\p{ASCII} All ASCII: [\x00-\x7F]

\p{Alpha} Alphabetic characters: [\p{Lower}\p{Upper}]

\p{Digit} Decimal digits: [0-9]

\p{Alnum} Alphanumeric characters: [\p{Alpha}\p{Digit}]

\p{Punct} Punctuation: one of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

\p{Graph} Visible characters: [\p{Alnum}\p{Punct}]

\p{Print} Printable Characters: [\p{Graph}\x20]

\p{Blank} Space or tab: [ \t]

\p{Cntrl} Control characters: [\x00-\x1F\x7F]

\p{XDigit} Hexadecimal digits: [0-9a-fA-F]

\p{Space} Whitespace characters: [ \t\n\x0B\f\r]
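The "US-ASCII only" restriction can be observed with a non-ASCII letter. This is a minimal sketch with default flags; PosixDemo and its method names are hypothetical.

```java
import java.util.regex.Pattern;

class PosixDemo {
    // True when every character is an ASCII letter or digit.
    static boolean isAlnum(String s) {
        return Pattern.matches("\\p{Alnum}+", s);
    }

    // Removes every ASCII punctuation character.
    static String stripPunct(String s) {
        return s.replaceAll("\\p{Punct}", "");
    }

    public static void main(String[] args) {
        System.out.println(isAlnum("abc123"));    // true
        System.out.println(isAlnum("abc-123"));   // false: - is punctuation
        System.out.println(isAlnum("\u00e9"));    // false: e-acute is not US-ASCII
        System.out.println(stripPunct("a+b=c!")); // abc
    }
}
```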

5. java.lang.Character classes (simple Java character types)

\p{javaLowerCase} Equivalent to java.lang.Character.isLowerCase()

\p{javaUpperCase} Equivalent to java.lang.Character.isUpperCase()

\p{javaWhitespace} Equivalent to java.lang.Character.isWhitespace()

\p{javaMirrored} Equivalent to java.lang.Character.isMirrored()

6. Classes for Unicode blocks and categories

\p{InGreek} A character in the Greek block (simple block)

\p{Lu} An uppercase letter (simple category)

\p{Sc} A currency symbol

\P{InGreek} Any character except one in the Greek block (negation)

[\p{L}&&[^\p{Lu}]] Any letter except an uppercase letter (subtraction)
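Unlike the POSIX classes, the java.lang.Character classes and the Unicode block and category classes recognize non-ASCII characters. The sketch below illustrates this with default flags; UnicodeDemo and its helper is(...) are hypothetical names, and the non-ASCII test characters are written as Unicode escapes.

```java
import java.util.regex.Pattern;

class UnicodeDemo {
    // True when the whole input matches the given class expression.
    static boolean is(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9";          // lowercase e with acute accent
        String alphaBeta = "\u03b1\u03b2"; // Greek alpha and beta

        System.out.println(is("\\p{Lower}", eAcute));           // false: ASCII only
        System.out.println(is("\\p{javaLowerCase}", eAcute));   // true
        System.out.println(is("\\p{InGreek}+", alphaBeta));     // true
        System.out.println(is("\\P{InGreek}", "a"));            // true
        System.out.println(is("\\p{Lu}", "\u00c9"));            // true: uppercase E-acute
        System.out.println(is("\\p{Sc}", "$"));                 // true
        System.out.println(is("[\\p{L}&&[^\\p{Lu}]]", "A"));    // false
    }
}
```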

7. Boundary matchers

^ The beginning of a line. Place ^ at the start of the regular expression; for example, ^(abc) matches text that starts with abc. By default ^ matches only at the beginning of the entire input. To make it match at the beginning of every line, set the MULTILINE flag when compiling, such as Pattern p = Pattern.compile(regex, Pattern.MULTILINE);

$ The end of a line. Place $ at the end of the regular expression; for example, (^bca).*(abc$) matches a line that starts with bca and ends with abc.

\b A word boundary. For example, \b(abc) matches abc only where it begins a word: it is found in abcjj but not in jjabc. Likewise, (abc)\b matches abc only where it ends a word.

\B A non-word boundary. For example, \B(abc) matches abc only where it does not begin a word: it is found in jjabcjj and jjabc but not in abcjj. To require abc strictly inside a word, use \B(abc)\B.

\A The beginning of the input

\G The end of the previous match (in the author's personal opinion, this one is rarely useful). For example, \\Gdog matches dog only where it starts exactly at the end of the previous match. On the first attempt, \G matches at the beginning of the input, so if the input does not begin with dog, nothing matches.

\Z The end of the input but for the final line terminator, if any

A line terminator is a sequence of one or two characters that marks the end of a line of the input character sequence.

The following are recognized as line terminators:

- A newline (line feed) character ('\n'),

- A carriage-return character followed immediately by a newline character ("\r\n"),

- A standalone carriage-return character ('\r'),

- A next-line character ('\u0085'),

- A line-separator character ('\u2028'), or

- A paragraph-separator character ('\u2029').

\z End of input
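The boundary matchers are easier to compare side by side. The following sketch exercises \b, \B, ^ with and without MULTILINE, \Z versus \z, and \G; BoundaryDemo and its helpers found(...) and count(...) are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryDemo {
    // True when the regex is found anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // Counts successive find() hits for a compiled pattern.
    static int count(Pattern p, String input) {
        Matcher m = p.matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(found("\\babc", "abcjj")); // true: abc begins a word
        System.out.println(found("\\babc", "jjabc")); // false
        System.out.println(found("\\Babc", "jjabc")); // true: abc does not begin a word
        System.out.println(found("\\Babc", "abcjj")); // false

        // Without MULTILINE, ^ matches only at the start of the input.
        System.out.println(count(Pattern.compile("^abc"), "abc\nabc"));                    // 1
        System.out.println(count(Pattern.compile("^abc", Pattern.MULTILINE), "abc\nabc")); // 2

        // \Z tolerates one final line terminator; \z does not.
        System.out.println(found("abc\\Z", "abc\n")); // true
        System.out.println(found("abc\\z", "abc\n")); // false

        // \G chains matches: the third dog is not adjacent to the previous match.
        System.out.println(count(Pattern.compile("\\Gdog"), "dogdog cat dog")); // 2
    }
}
```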

When compiling a pattern, one or more flags can be set, for example:

Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

The following six flags are among those supported:

- CASE_INSENSITIVE: Matching ignores case. By default, this flag only considers US-ASCII characters.

- UNICODE_CASE: When combined with CASE_INSENSITIVE, case-insensitive matching uses Unicode case rules.

- MULTILINE: ^ and $ match the beginning and end of each line, rather than of the entire input.

- UNIX_LINES: Only '\n' is treated as a line terminator when matching ., ^, and $.

- DOTALL: The . symbol matches any character, including line terminators.

- CANON_EQ: Takes the canonical equivalence of Unicode characters into account.
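The effect of combining flags can be sketched as follows. FlagsDemo and its helper matches(...) are hypothetical names; the accented letters are written as Unicode escapes.

```java
import java.util.regex.Pattern;

class FlagsDemo {
    // True when the whole input matches the regex compiled with the given flags.
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        int ci = Pattern.CASE_INSENSITIVE;
        int ciUnicode = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;

        System.out.println(matches("abc", ci, "ABC"));                  // true
        // Lowercase e-acute against uppercase E-acute:
        System.out.println(matches("\u00e9", ci, "\u00c9"));            // false: ASCII only
        System.out.println(matches("\u00e9", ciUnicode, "\u00c9"));     // true

        System.out.println(matches("a.b", 0, "a\nb"));                  // false
        System.out.println(matches("a.b", Pattern.DOTALL, "a\nb"));     // true
    }
}
```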

8. Greedy quantifiers

X? X, once or not at all

X* X, zero or more times

X+ X, one or more times

X{n} X, exactly n times

X{n,} X, at least n times

X{n,m} X, at least n times, but not more than m times
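A few whole-input checks show how each quantifier counts repetitions. This is a small sketch; QuantifierCountDemo and its helper matches(...) are hypothetical names.

```java
import java.util.regex.Pattern;

class QuantifierCountDemo {
    // True when the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("colou?r", "color")); // true: u is optional
        System.out.println(matches("ab*", "a"));         // true: zero b's allowed
        System.out.println(matches("ab+", "a"));         // false: at least one b needed
        System.out.println(matches("a{2}", "aa"));       // true
        System.out.println(matches("a{2,}", "aaaaa"));   // true
        System.out.println(matches("a{2,3}", "aaaa"));   // false: more than three
    }
}
```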

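9. Reluctant quantifiers

X?? X, once or not at all

X*? X, zero or more times

X+? X, one or more times

X{n}? X, exactly n times

X{n,}? X, at least n times

X{n,m}? X, at least n times, but not more than m times

10. Possessive quantifiers

X?+ X, once or not at all

X*+ X, zero or more times

X++ X, one or more times

X{n}+ X, exactly n times

X{n,}+ X, at least n times

X{n,m}+ X, at least n times, but not more than m times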

The difference between greedy, reluctant, and possessive quantifiers is as follows (it only matters when the quantified part can match a variable amount of text):

A greedy quantifier is called "greedy" because it first makes the quantified part consume as much of the input as it can. If the rest of the pattern then fails to match, the matcher backs off one character from the end of what was consumed and tries again, repeating this until a match is found or there is nothing left to give back. Depending on the quantifier used in the expression, the last thing it tries to match is 1 or 0 characters.

Reluctant quantifiers take the opposite approach: they start at the beginning of the string being matched, consume as little as possible, and then take in one more character at a time while searching for a match. The last thing they try to match is the entire input string.

Finally, a possessive quantifier always consumes as much as it can and tries for a match once (and only once). Unlike a greedy quantifier, a possessive quantifier never backs off.
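The three behaviours can be compared on one input. The sketch below uses an illustrative input string, xfooxxxxxxfoo, which is an assumption and not from the text above; GreedyDemo and findAll(...) are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Collects every successive match of the regex in the input.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy: .* takes everything, then backs off until foo fits.
        System.out.println(findAll(".*foo", input));  // [xfooxxxxxxfoo]
        // Reluctant: .*? grows one character at a time.
        System.out.println(findAll(".*?foo", input)); // [xfoo, xxxxxxfoo]
        // Possessive: .*+ takes everything and never gives foo back.
        System.out.println(findAll(".*+foo", input)); // []
    }
}
```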

11. Logical operators

XY X followed by Y

X|Y X or Y

(X) X, as a capturing group. For example, (abc) captures abc as a whole.
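Alternation and capturing groups usually appear together. In this minimal sketch, GroupDemo and firstGroup(...) are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Returns what group 1 captured, or null when the whole input does not match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstGroup("(cat|dog)s", "dogs"));  // dog
        System.out.println(firstGroup("(cat|dog)s", "birds")); // null
        System.out.println(firstGroup("(abc)+", "abcabc"));    // abc
    }
}
```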

12. Back references

\n Whatever the nth capturing group matched

Capturing groups are numbered by counting their opening parentheses from left to right. For example, in the expression ((A)(B(C))), there are four such groups:

1 ((A)(B(C)))

2 (A)

3 (B(C))

4 (C)

A group can be referenced by \n later in the same expression. For example, (ab)34\1 matches ab34ab, and (ab)34(cd)\1\2 matches ab34cdabcd.
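Both the group numbering and the back-reference examples can be verified with a short program. This is a sketch only; BackrefDemo and its helper methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BackrefDemo {
    // True when the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Returns the text of groups 1..n, or an empty array when there is no match.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            out[i - 1] = m.group(i);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(groups("((A)(B(C)))", "ABC"))); // [ABC, A, BC, C]
        System.out.println(matches("(ab)34\\1", "ab34ab"));                // true
        System.out.println(matches("(ab)34\\1", "ab34cd"));                // false
        System.out.println(matches("(ab)34(cd)\\1\\2", "ab34cdabcd"));     // true
    }
}
```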

13. Quotation

\ Nothing, but quotes the following character

\Q Nothing, but quotes all characters until \E. The text between \Q and \E is taken literally, with no regex metacharacters interpreted. For example, the Java string literal ab\\Q{|}\\\\E (that is, the regex ab\Q{|}\\E) matches the text ab{|}\

\E Nothing, but ends the quoting started by \Q
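Quoting matters whenever literal text contains metacharacters. The sketch below uses \Q...\E directly and also Pattern.quote, a standard-library method that produces a literal pattern string for the given text; QuoteDemo is a hypothetical class name.

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // True when the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Between \Q and \E, the braces and the bar are literal characters.
        System.out.println(matches("\\Q{|}\\E", "{|}"));              // true

        // Unquoted, + is a quantifier, so the literal text does not match itself.
        System.out.println(matches("1+1=2", "1+1=2"));                // false
        System.out.println(matches("1+1=2", "11=2"));                 // true

        // Pattern.quote makes the whole string literal.
        System.out.println(matches(Pattern.quote("1+1=2"), "1+1=2")); // true
    }
}
```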

14. Special constructs (non-capturing)

(?:X) X, as a non-capturing group

(?idmsux-idmsux) Nothing, but turns the given match flags on or off. For example, in the expression (?i)abc(?-i)def, (?i) turns case-insensitive matching on, so abc matches regardless of case; (?-i) then turns it off again, so def must match exactly.

The idmsux flags are as follows:

-i CASE_INSENSITIVE: Case-insensitive matching for the US-ASCII character set. (?i)

-d UNIX_LINES: UNIX lines mode; only \n is recognized as a line terminator. (?d)

-m MULTILINE: Multiline mode. (?m)

A UNIX line break is \n; a Windows line break is \r\n.

-s DOTALL: The . symbol matches any character, including line terminators. (?s)

-u UNICODE_CASE: Unicode-aware case-insensitive matching. (?u)

-x COMMENTS: Permits comments in the pattern: whitespace is ignored, and everything from # to the end of the line is a comment. (?x) For example, (?x)abc#asfsdadsa matches the string abc.

(?idmsux-idmsux:X) X as a non-capturing group with the given flags on - off. Similar to the above, the above expression can be rewritten as: (?i:abc)def, or (?i)abc(?-i:def)
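The embedded flag expressions above can be tried without passing any flags to Pattern.compile. This is a minimal sketch; InlineFlagDemo and its helper matches(...) are hypothetical names.

```java
import java.util.regex.Pattern;

class InlineFlagDemo {
    // True when the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Case-insensitivity applies to abc only.
        System.out.println(matches("(?i)abc(?-i)def", "ABCdef")); // true
        System.out.println(matches("(?i)abc(?-i)def", "abcDEF")); // false
        System.out.println(matches("(?i:abc)def", "AbCdef"));     // true

        // COMMENTS mode: the text after # is ignored.
        System.out.println(matches("(?x)abc#asfsdadsa", "abc"));  // true

        // DOTALL mode: the dot also matches a line terminator.
        System.out.println(matches("(?s)a.b", "a\nb"));           // true
    }
}
```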

(?=X) X, via zero-width positive lookahead. Matching continues only if subexpression X matches to the right of this position. For example, \w+(?=\d) matches word characters that are followed by a digit, but the digit is not part of the match.

(?!X) X, via zero-width negative lookahead. Matching continues only if subexpression X does not match to the right of this position. For example, \w+(?!\d) matches word characters that are not followed by a digit.

(?<=X) X, via zero-width positive lookbehind

(?<!X) X, via zero-width negative lookbehind

(?>X) X, as an independent, non-capturing group (no backtracking)

The difference between (?:X) and (?>X) is that (?>X) does not backtrack. For example, against the input abcm, a(?:b|bc)m matches but a(?>b|bc)m does not: once the independent group has matched b, the matcher leaves the group and never goes back to try bc. This can speed up matching.
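Lookaround and independent groups can be sketched as follows. The sample inputs are illustrative assumptions, and LookaroundDemo and firstMatch(...) are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // Returns the first match of the regex in the input, or null when there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Lookahead: the digit is required but not included in the match.
        System.out.println(firstMatch("\\w+(?=\\d)", "abc1"));       // abc
        // Lookbehind: the dollar sign is required but not included.
        System.out.println(firstMatch("(?<=\\$)\\d+", "cost $42"));  // 42
        // Negative lookahead and negative lookbehind.
        System.out.println(firstMatch("foo(?!bar)", "foobar"));      // null
        System.out.println(firstMatch("foo(?!bar)", "foobaz"));      // foo
        System.out.println(firstMatch("(?<!un)happy", "unhappy"));   // null

        // An independent group keeps b and never retries bc.
        System.out.println(firstMatch("a(?:b|bc)m", "abcm"));        // abcm
        System.out.println(firstMatch("a(?>b|bc)m", "abcm"));        // null
    }
}
```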
