The split operator breaks a string apart according to a given pattern. For strings that use tabs, colons, whitespace, or any other symbol to separate fields, it is a very convenient way to decompose the string and extract the fields. As long as you can write the delimiter as a pattern (usually a very simple regular expression), you can use split to break up the data. Its usage is as follows:
my @fields = split /separator/, $string;
Here the split operator scans the given string for the pattern and returns a list of fields (that is, substrings). Wherever the pattern matches, that spot marks the end of the current field and the beginning of the next one, so the text matched by the pattern never appears in the returned fields. The following is a typical split with a colon as the delimiter:
my @fields = split /:/, "abc:def:g:h"; #Get ("abc", "def", "g", "h")
If two delimiters are adjacent, an empty field is produced:
my @fields = split /:/, "abc:def::g:h"; #Get ("abc", "def", "", "g", "h")
Here's a rule that may seem odd at first but rarely causes problems: split keeps empty fields at the beginning but discards empty fields at the end. For example:
my @fields = split /:/, ":::a:b:c:::"; #Get ("", "", "", "a", "b", "c")
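If the trailing empty fields do matter, split accepts an optional third argument, LIMIT, and a negative value keeps them. The original text does not cover this, so treat the following as a minimal sketch; the variable names are only illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $record = ":::a:b:c:::";

# Default behaviour: leading empty fields are kept, trailing ones are dropped.
my @default = split /:/, $record;

# A negative LIMIT (third argument) keeps the trailing empty fields as well.
my @all = split /:/, $record, -1;

print scalar(@default), " fields by default\n";
print scalar(@all), " fields with a negative limit\n";
```

This is useful when the number of fields per record must stay constant, for example when the last columns of a record may be empty.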
It is also common to split on whitespace using the pattern /\s+/. This pattern treats any run of consecutive whitespace as a single separator and splits the data accordingly:
my $some_input = "This is a \t test.\n";
my @args = split /\s+/, $some_input; #Get ("This", "is", "a", "test.")
By default, split breaks up the string in $_ on whitespace:
my @fields = split; #Almost equivalent to split /\s+/, $_;
This is almost equivalent to using /\s+/ as the pattern, except that a leading empty field is omitted. So even if the line starts with whitespace, you won't see an empty field at the beginning of the returned list. If you want to split some other whitespace-separated string in the same way, use a single space as the pattern: split ' ', $other_string. Using a single space as the pattern is a special case of split.
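The difference between the two forms only shows up when the string starts with whitespace. A small sketch, using a made-up input string:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative input that starts with two spaces.
my $line = "  alpha beta\tgamma\n";

# The regex form keeps an empty field in front of "alpha".
my @with_regex = split /\s+/, $line;

# The single-space special case skips the leading whitespace.
my @with_space = split ' ', $line;

print scalar(@with_regex), " fields with /\\s+/\n";
print scalar(@with_space), " fields with ' '\n";
```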
Generally speaking, the patterns used with split are as simple as the ones shown above. If you do use a more complex pattern, avoid capturing parentheses in it, because they activate the so-called "delimiter preserving mode", in which the captured text is returned in the list as well (see the perlfunc documentation for details). If you need grouping in the pattern, use non-capturing parentheses (?:) to avoid surprises.
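To make the effect of capturing parentheses concrete, here is a short sketch that splits the same made-up string with a capturing group and with a non-capturing group:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative input with two different delimiters.
my $text = "a, b;c";

# Capturing parentheses: the matched delimiters are returned too.
my @captured = split /\s*(,|;)\s*/, $text;

# Non-capturing parentheses: only the fields are returned.
my @plain = split /\s*(?:,|;)\s*/, $text;

print join("|", @captured), "\n";
print join("|", @plain), "\n";
```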
To get a better feel for how convenient split is for decomposing records and extracting fields, here is a piece of code from my actual work that extracts the fields without the split operator (the version that uses split is given later), so that the two can be compared:
Task: extract the user name and user home directory from the passwd file.
Let's first take a look at the record format in the passwd file (Figure 1-1, partial excerpt):
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/bin/sh
……
As you can see, the fields are separated by colons (:). Taking the first record as an example and reading from left to right, we need to extract the root before the first colon (the user name) and the /root before the sixth colon (the user home directory).
#!/usr/bin/perl -w
# Code 1.1 Extracting the fields without the split operator
use strict;
open (FH, '<', '/etc/passwd') or die "Can't open file: $!";
while (<FH>){
    my ($Pos,$endPos,$length,$Name,$Dir);
    #############
    # Get user name
    #############
    # The first colon marks the end of the user name
    $length = index ($_, ":");
    $Name = substr ($_, 0, $length);
    #####################
    # Get the user HOME directory location
    #####################
    # Position of the last colon, which ends the home directory field
    $endPos = rindex ($_, ":");
    # $endPos-1 skips the current position (colon)
    $Pos = rindex ($_, ":", $endPos - 1);
    # $Pos+1 skips the current position (colon):
    # the field itself starts one character to the right
    $Pos += 1;
    $length = $endPos - $Pos;
    $Dir = substr ($_, $Pos, $length);
    print "$Name\t$Dir\n";
}
close (FH);
The output after running the program is as follows (Figure 1-2):
root /root
bin /bin
……
Now let's analyze the algorithm of this code. Extracting the user name is very simple: you only need to find the position of the first colon, and the substring returned by substr($_,0,$length) is the required user name. The more complicated part is extracting the user's home directory. From Figure 1-1 we can see that the passwd file has a fixed format: counting from the end of the record (from right to left), the /root after the penultimate colon is the home directory information.
Algorithm idea for extracting user home directory:
1. Skip the last field of the record;
2. Find the starting position of the penultimate field;
3. Subtract the starting position of the penultimate field (the / character) from the position of the last colon; the result is the length of the user home directory field;
4. substr($_,$Pos,$length); returns user home directory information;
5. Completed.
(Figure 1-3 User directory extraction algorithm)
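The steps above can be sketched as a small function applied to a sample record instead of the real file. The function name penultimate_field is hypothetical and the guard clauses for malformed records are an addition, so treat this as an illustration of the idea rather than the code used in the task.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the penultimate colon-separated field,
# or undef when the record has fewer than two colons.
sub penultimate_field {
    my ($record) = @_;

    # Step 1: the last colon ends the field we want (skips the last field).
    my $end_pos = rindex($record, ":");
    return unless $end_pos > 0;

    # Step 2: search leftwards from just before that colon.
    my $colon = rindex($record, ":", $end_pos - 1);
    return if $colon < 0;

    # Step 3: the field starts right after the penultimate colon.
    my $start  = $colon + 1;
    my $length = $end_pos - $start;

    # Step 4: cut the field out of the record.
    return substr($record, $start, $length);
}

my $home = penultimate_field("root:x:0:0:root:/root:/bin/bash");
print "$home\n";
```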
In summary, we can complete the task by locating and extracting the field information with Perl's string-processing functions. It is easy to foresee, however, that when we want to extract several non-adjacent fields, the steps become more cumbersome, the code grows longer, and it becomes more error-prone. And if the position of a field within the record changes, you will have to redesign your algorithm.
Now let's look at an example that uses the split operator to decompose the record and extract the fields:
#!/usr/bin/perl -w
# Code 1.2 Extracting the fields with the split operator
use strict;
open (FH, '<', '/etc/passwd') or die "Can't open file: $!";
while (<FH>){
    ###########
    # Get user information
    ###########
    # Fields 0 and 5 are the user name and the home directory
    my($Name,$Dir) = (split /:/,$_)[0,5];
    print "$Name\t$Dir\n";
}
close (FH);
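Extracting several non-adjacent fields only requires changing the list slice. The sketch below pulls the user name, UID and login shell (fields 0, 2 and 6) from the two sample records in Figure 1-1; the choice of fields and the variable names are illustrative assumptions, not part of the original task.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample records taken from Figure 1-1 instead of reading /etc/passwd.
my @records = (
    "root:x:0:0:root:/root:/bin/bash",
    "bin:x:1:1:bin:/bin:/bin/sh",
);

my @rows;
for my $record (@records) {
    # Fields 0, 2 and 6 are the user name, UID and login shell.
    my ($name, $uid, $shell) = (split /:/, $record)[0, 2, 6];
    push @rows, "$name\t$uid\t$shell";
}

print "$_\n" for @rows;
```

When the records come from a file, the last field still carries the trailing newline, so a chomp before the split is advisable whenever the final field is extracted.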