A string is an object that contains a sequence of characters. Python has no separate character data type: a single character is simply a string of length 1. This differs from languages such as C, Kotlin, and Java, which do have a dedicated character type.
We can declare Python strings with single quotes, double quotes, triple quotes, or the str() function. The following code snippet shows how to declare a string in Python:
# A single-quoted string
single_quote = 'a'  # This is a character in other programming languages. It is a string in Python

# Another single-quoted string
another_single_quote = 'Programming teaches you patience.'

# A double-quoted string
double_quote = "aa"

# Another double-quoted string
another_double_quote = "It is impossible until it is done!"

# A triple-quoted string
triple_quote = '''aaa'''

# Also a triple-quoted string
another_triple_quote = """Welcome to the Python programming language. Ready, 1, 2, 3, Go!"""

# Using the str() function
string_function = str(123.45)  # str() converts a float to a string

# Another str() call
another_string_function = str(True)  # str() converts a boolean to a string

# An empty string
empty_string = ''

# Also an empty string
second_empty_string = ""

# We are not done yet
third_empty_string = """"""  # This is also an empty string: ''''''
Another way to get a string in Python is the input() function, which reads a value typed at the keyboard into the program. The value is always read as a string, but we can convert it to other data types:
# Inputs into a Python program
input_float = input()    # Type in: 3.142
input_boolean = input()  # Type in: True

# Convert inputs into other data types
convert_float = float(input_float)     # converts the string to a float
convert_boolean = bool(input_boolean)  # converts the string to a bool (True for any non-empty string)
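One caveat about the bool() conversion is worth spelling out: bool() does not read the text of a string, it only checks whether the string is empty. The sketch below illustrates this and shows one possible way to interpret typed text as a boolean. The helper name parse_bool is hypothetical and not part of Python; it is only an illustration.

```python
# bool() on a string only checks whether the string is empty.
print(bool("True"))   # True
print(bool("False"))  # True, because the string is non-empty
print(bool(""))       # False


def parse_bool(text):
    """Hypothetical helper: map the text 'true' or 'false' to a bool."""
    cleaned = text.strip().lower()
    if cleaned == "true":
        return True
    if cleaned == "false":
        return False
    raise ValueError(f"not a boolean literal: {text!r}")


print(parse_bool("True"))     # True
print(parse_bool(" false "))  # False
```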
We use the type() function to determine the data type of an object in Python; it returns the class of the object. When the object is a string, it returns the str class. Likewise, when the object is a dictionary, integer, float, tuple, or boolean, it returns the dict, int, float, tuple, and bool classes respectively. Now let us use the type() function to determine the data types of the variables declared in the previous code snippets:
# Data types/classes with type()
print(type(single_quote))
print(type(another_triple_quote))
print(type(empty_string))
print(type(input_float))
print(type(input_boolean))
print(type(convert_float))
print(type(convert_boolean))
The American Standard Code for Information Interchange (ASCII) maps characters to numbers, because numbers are easier to store in computer memory than text. ASCII encodes 128 characters, primarily for English, and is used for processing information in computers and programming. The English characters it encodes include lowercase letters (a-z), uppercase letters (A-Z), digits (0-9), and punctuation marks and other symbols.
The ord() function converts a Python string of length 1 (one character) to its decimal representation on the ASCII table, and the chr() function converts the decimal representation back to a string. For example:
import string

# Convert uppercase characters to their ASCII decimal numbers
ascii_upper_case = string.ascii_uppercase  # Output: ABCDEFGHIJKLMNOPQRSTUVWXYZ
for one_letter in ascii_upper_case[:5]:  # Loop through ABCDE
    print(ord(one_letter))
Output:
65
66
67
68
69
# Convert digit characters to their ASCII decimal numbers
ascii_digits = string.digits  # Output: 0123456789
for one_digit in ascii_digits[:5]:  # Loop through 01234
    print(ord(one_digit))
Output:
48
49
50
51
52
In the above code snippet, we iterate through the strings ABCDE and 01234 and convert each character to its decimal representation in the ASCII table. We can also use the chr() function to perform the reverse operation, converting decimal numbers on the ASCII table to their Python string characters. For example:
decimal_rep_ascii = [37, 44, 63, 82, 100]
for one_decimal in decimal_rep_ascii:
    print(chr(one_decimal))
Output:
%
,
?
R
d
In the ASCII table, the characters in the output of the above program are mapped to their respective decimal numbers.
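As a quick check that chr() and ord() undo each other, the following sketch round-trips every character in string.printable. It also shows that ord() is not limited to ASCII: for other characters it returns the Unicode code point, of which ASCII covers the first 128 values. The variable names are illustrative only.

```python
import string

# chr() undoes ord() for every printable ASCII character.
round_trip_ok = all(chr(ord(ch)) == ch for ch in string.printable)
print(round_trip_ok)  # True

# Every ASCII code is below 128; the largest printable one is '~'.
max_ascii_code = max(ord(ch) for ch in string.printable)
print(max_ascii_code)  # 126

# ord() also accepts characters outside ASCII and returns their Unicode code point.
print(ord("\u00e9"))  # 233 (the letter e with an acute accent)
print(chr(233) == "\u00e9")  # True
```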
Zero index: The first element in the string has an index of zero, while the last element has an index of len(string) - 1. For example:
immutable_string = "Accountability"
print(len(immutable_string))
print(immutable_string.index('A'))
print(immutable_string.index('y'))
Output:
14
0
13
Immutability: This means we cannot update the characters in a string. For example, we cannot remove an element from a string or assign a new element at any of its index positions. If we try to update the string, Python raises a TypeError:
immutable_string = "Accountability"

# Assign a new element at index 0
immutable_string[0] = 'B'
Output:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_11336/2351953155.py in <module>
      2 
      3 # Assign a new element at index 0
----> 4 immutable_string[0] = 'B'

TypeError: 'str' object does not support item assignment
We can, however, assign a new string to the immutable_string variable. Note that the old and new values are not the same string, because they do not point to the same object in memory. Python does not update the old string object; it creates a new one, as we can see from the ids:
immutable_string = "Accountability"
print(id(immutable_string))

immutable_string = "Bccountability"
print(id(immutable_string))

test_immutable = immutable_string
print(id(test_immutable))
Output:
2693751670576
2693751671024
2693751671024
The first two ids are different (the exact values vary between runs and machines), which means that the two immutable_string assignments point to different addresses in memory. We then assign the last immutable_string variable to the test_immutable variable. You can see that test_immutable and the last immutable_string point to the same address.
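Since a string cannot be changed in place, the usual way to "edit" one is to build a new string from pieces of the old one. The following is a minimal sketch of two common approaches, slicing plus concatenation and str.replace(); the variable names are illustrative.

```python
immutable_string = "Accountability"

# Build a new string from a replacement character plus a slice of the old one.
changed_by_slicing = "B" + immutable_string[1:]
print(changed_by_slicing)  # Bccountability

# str.replace() also returns a new string; the third argument limits it to one replacement.
changed_by_replace = immutable_string.replace("A", "B", 1)
print(changed_by_replace)  # Bccountability

# The original string is untouched in both cases.
print(immutable_string)  # Accountability
```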
Concatenation: We can join two or more strings together with the + symbol to get a new string. For example:
first_string = "Zhou"
second_string = "luobo"
third_string = "Learn Python"

fourth_string = first_string + second_string
print(fourth_string)

fifth_string = fourth_string + " " + third_string
print(fifth_string)
Output:
Zhouluobo
Zhouluobo Learn Python
Repeat: Strings can be repeated using the * symbol. For example:
print("Ha" * 3)
Output:
HaHaHa
Indexing and slicing: We have established that strings are indexed from zero and we can access any element in the string using its index value. We can also get a subset of a string by slicing between two index values. For example:
main_string = "I learned English and Python with ZHouluobo. You can do it too!"

# Index 0
print(main_string[0])

# Index 1
print(main_string[1])

# Check if index 1 is whitespace
print(main_string[1].isspace())

# Slicing 1: the first 17 characters
print(main_string[0:17])

# Slicing 2: the last 18 characters
print(main_string[-18:])

# Slicing and concatenation
print(main_string[0:17] + ". " + main_string[-18:])
Output:
I
 
True
I learned English
You can do it too!
I learned English. You can do it too!
str.split(sep=None, maxsplit=-1): The string splitting method takes two parameters, sep and maxsplit. When this method is called with its default values, it splits the string wherever there is whitespace. This method returns a list of strings:
string = "Apple, Banana, Orange, Blueberry"
print(string.split())
Output:
['Apple,', 'Banana,', 'Orange,', 'Blueberry']
We can see that the string is not split well, because the resulting strings still contain the comma. We can use sep=',' to split wherever there is a comma:
print(string.split(sep=','))
Output:
['Apple', ' Banana', ' Orange', ' Blueberry']
This is better than the previous split, but some of the resulting strings now start with a space. We can remove it by using sep=', ':
# Notice the whitespace after the comma
print(string.split(sep=', '))
Output:
['Apple', 'Banana', 'Orange', 'Blueberry']
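Splitting on ', ' only works when every comma is followed by exactly one space. As a sketch of an alternative for less regular input, we can split on the bare comma and strip each piece; the input string and the name fruits here are made up for illustration.

```python
# Spacing around the commas is deliberately inconsistent.
string = "Apple, Banana,Orange ,  Blueberry"

# Split on the comma alone, then strip whitespace from each piece.
fruits = [piece.strip() for piece in string.split(",")]
print(fruits)  # ['Apple', 'Banana', 'Orange', 'Blueberry']
```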
Now the string is nicely split. Sometimes we don't want to split the maximum number of times; in that case we can use the maxsplit parameter to specify how many splits to perform:
print(string.split(sep=', ', maxsplit=1))
print(string.split(sep=', ', maxsplit=2))
Output:
['Apple', 'Banana, Orange, Blueberry']
['Apple', 'Banana', 'Orange, Blueberry']
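str.splitlines(keepends=False): Sometimes we want to process a corpus that has different line-break characters ('\n', '\n\n', '\r', '\r\n') at its line boundaries, and we want to split it into sentences rather than individual words. We can use the splitlines method to do this. When keepends=True, the line breaks are included in the resulting strings; otherwise they are excluded:
import nltk  # You may have to `pip install nltk` to use this library.
# The corpus data may also need to be downloaded once, e.g. with nltk.download('gutenberg').

macbeth = nltk.corpus.gutenberg.raw('shakespeare-macbeth.txt')
print(macbeth.splitlines(keepends=True)[:5])
Output:
['[The Tragedie of Macbeth by William Shakespeare 1603]\n', '\n', '\n', 'Actus Primus. Scoena Prima.\n', '\n']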
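For readers who do not have nltk installed, the behaviour can also be seen on a small hand-made string. This is only a sketch; mixed_text is an invented sample that mixes the '\n', '\r\n', and '\r' line endings.

```python
# One string containing three different line endings.
mixed_text = "first line\nsecond line\r\nthird line\rfourth line"

# Default: the line breaks are removed.
print(mixed_text.splitlines())
# ['first line', 'second line', 'third line', 'fourth line']

# keepends=True: each line keeps its original line break.
print(mixed_text.splitlines(keepends=True))
# ['first line\n', 'second line\r\n', 'third line\r', 'fourth line']
```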
str.strip([chars]): We use the strip method to remove leading and trailing whitespace or characters from both sides of a string. For example:
string = "Apple Apple Apple no apple in the box apple apple "
stripped_string = string.strip()
print(stripped_string)

# lstrip('Apple') removes any leading run of the characters A, p, l, e
left_stripped_string = (
    stripped_string
    .lstrip('Apple')
    .lstrip()
    .lstrip('Apple')
    .lstrip()
    .lstrip('Apple')
    .lstrip()
)
print(left_stripped_string)

capitalized_string = left_stripped_string.capitalize()
print(capitalized_string)

right_stripped_string = (
    capitalized_string
    .rstrip('apple')
    .rstrip()
    .rstrip('apple')
    .rstrip()
)
print(right_stripped_string)
Output:
Apple Apple Apple no apple in the box apple apple
no apple in the box apple apple
No apple in the box apple apple
No apple in the box
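In the code snippet above, we used the lstrip and rstrip methods, which remove whitespace or characters from the left and right sides of a string respectively. We also used the capitalize method, which converts a string to sentence case.
str.zfill(width): The zfill method pads a string with leading 0 characters to reach the specified width. For example:
example = "0.8"  # len(example) is 3
example_zfill = example.zfill(5)  # len(example_zfill) is 5
print(example_zfill)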
Output:
000.8
str.isalpha(): This method returns True if all characters in the string are letters; otherwise it returns False:
# Alphabet string
alphabet_one = "Learning"
print(alphabet_one.isalpha())

# Contains whitespace
alphabet_two = "Learning Python"
print(alphabet_two.isalpha())

# Contains a comma
alphabet_three = "Learning,"
print(alphabet_three.isalpha())
Output:
True
False
False
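str.isalnum() returns True if the characters in the string are alphanumeric; str.isdecimal() returns True if they are decimal characters; str.isdigit() returns True if they are digits; and str.isnumeric() returns True if they are numeric characters.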
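The differences between these four methods are easiest to see on a few sample strings. The sketch below uses illustrative samples chosen for this purpose: plain digits, letters mixed with digits, the superscript two (U+00B2), the fraction one half (U+00BD), and a number with a decimal point.

```python
samples = ["123", "abc123", "\u00b2", "\u00bd", "12.5"]

# Map each sample to (isalnum, isdecimal, isdigit, isnumeric).
results = {
    sample: (sample.isalnum(), sample.isdecimal(), sample.isdigit(), sample.isnumeric())
    for sample in samples
}

for sample, flags in results.items():
    print(repr(sample), flags)

# '123'    -> (True, True, True, True)
# 'abc123' -> (True, False, False, False)
# U+00B2   -> (True, False, True, True)    superscript two is a digit but not a decimal
# U+00BD   -> (True, False, False, True)   one half is numeric only
# '12.5'   -> (False, False, False, False) the '.' fails every check
```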
str.islower() returns True if all characters in the string are lowercase; str.isupper() returns True if all characters in the string are uppercase; and str.istitle() returns True if the first letter of every word is capitalized:
# islower() example
string_one = "Artificial Neural Network"
print(string_one.islower())

string_two = string_one.lower()  # converts string to lowercase
print(string_two.islower())

# isupper() example
string_three = string_one.upper()  # converts string to uppercase
print(string_three.isupper())

# istitle() example
print(string_one.istitle())
Output:
False
True
True
True
str.endswith(suffix) returns True if the string ends with the specified suffix. str.startswith(prefix) returns True if the string starts with the specified prefix:
sentences = ['Time to master data science', 'I love statistical computing', 'Eat, sleep, code']

# endswith() example
for one_sentence in sentences:
    print(one_sentence.endswith(('science', 'computing', 'Code')))
Output:
True
True
False
# startswith() example
for one_sentence in sentences:
    print(one_sentence.startswith(('Time', 'I ', 'Ea')))
Output:
True
True
True
str.find(substring) returns the lowest index at which the substring occurs in the string; if the substring is not present, it returns -1. str.rfind(substring) returns the highest index. str.index(substring) and str.rindex(substring) also return the lowest and highest index of the substring respectively, if it is found. If the substring is not in the string, they raise a ValueError:
string = "programming"

# find() and rfind() examples
print(string.find('m'))
print(string.find('pro'))
print(string.rfind('m'))
print(string.rfind('game'))

# index() and rindex() examples
print(string.index('m'))
print(string.index('pro'))
print(string.rindex('m'))
print(string.rindex('game'))
Output:
6
0
7
-1
6
0
7
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_11336/3954098241.py in <module>
     11 print(string.index('pro'))  # Output: 0
     12 print(string.rindex('m'))  # Output: 7
---> 13 print(string.rindex('game'))  # Output: ValueError: substring not found

ValueError: substring not found
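Because the two families of methods report a missing substring differently, the calling code has to handle them differently. A minimal sketch, with illustrative variable names: check the result of find() against -1, and wrap index() in try/except.

```python
text = "programming"

# find() signals "not found" with -1, so check the result before using it as an index.
position = text.find("game")
if position == -1:
    print("find: 'game' is not in the string")

# index() signals "not found" with an exception, so wrap it in try/except.
try:
    position = text.index("game")
except ValueError as error:
    message = str(error)
    print("index:", message)  # index: substring not found
```

Checking for -1 matters because -1 is itself a valid index (the last character), so using the result of find() unchecked can silently read the wrong position.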
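str.maketrans(dict_map) creates a translation table from a dictionary mapping, and str.translate(table) uses that table to replace the characters in the string with their new values. For example:
example = "abcde"
mapped = {'a': '1', 'b': '2', 'c': '3', 'd': '4', 'e': '5'}
print(example.translate(example.maketrans(mapped)))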
Output:
12345
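str.maketrans() also accepts two strings of equal length instead of a dictionary, plus an optional third string of characters to delete. The following sketch shows both forms; the variable names are illustrative.

```python
example = "abcde"

# Two equal-length strings: each character in the first maps to the one at the same position in the second.
table = str.maketrans("abcde", "12345")
translated = example.translate(table)
print(translated)  # 12345

# A third argument lists characters to delete during translation.
remove_vowels = str.maketrans("", "", "aeiou")
without_vowels = "programming".translate(remove_vowels)
print(without_vowels)  # prgrmmng
```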
Strings are iterable, so they support looping with for loops and enumerate:
# For-loop example
word = "bank"
for letter in word:
    print(letter)
Output:
b
a
n
k
# Enumerate example
for idx, value in enumerate(word):
    print(idx, value)
Output:
0 b
1 a
2 n
3 k
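When two strings are compared with relational operators (>, <, ==, and so on), their elements are compared index by index using their ASCII decimal numbers. For example:
print('a' > 'b')
print('abc' > 'b')
Output:
False
False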
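In both cases the output is False. The relational operator first compares the ASCII decimal numbers of the elements at index 0 of the two strings. Since b is greater than a, it returns False; in this case, the ASCII decimal numbers of the other elements and the lengths of the strings do not matter.
When the strings have the same length and begin with the same characters, the operator compares the ASCII decimal numbers of each element starting from index 0 until it finds a pair of elements with different ASCII decimal numbers. For example: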
print('abd' > 'abc')
Output:
True
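Two further consequences of this rule are worth a quick look: uppercase letters sort before lowercase ones because their decimal numbers are smaller, and when one string is a prefix of the other, the shorter string is the smaller one. The function is_greater below is a hypothetical helper written only to sketch the comparison rule; it is not how Python implements it internally.

```python
# Uppercase letters have smaller decimal numbers than lowercase ones.
print(ord('Z'), ord('a'))  # 90 97
print('Z' < 'a')           # True

# When one string is a prefix of the other, the shorter string is smaller.
print('abc' < 'abcd')      # True


def is_greater(left, right):
    """Hypothetical sketch of '>' for strings: compare pairwise, then by length."""
    for l_char, r_char in zip(left, right):
        if l_char != r_char:
            return ord(l_char) > ord(r_char)
    return len(left) > len(right)


print(is_greater('abd', 'abc') == ('abd' > 'abc'))  # True
```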
The in operator checks whether a substring is a member of a string:
print('data' in 'dataquest')
print('gram' in 'programming')
Output:
True
True
Another way to check string membership, replace substrings, or match patterns is to use regular expressions:
import re

substring = 'gram'
string = 'programming'
replacement = '1234'

# Check membership
print(re.search(substring, string))

# Replace string
print(re.sub(substring, replacement, string))
Output:
<re.Match object; span=(3, 7), match='gram'>
pro1234ming
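Regular expressions become more useful than the in operator once we search for a pattern rather than a fixed substring. A short sketch, using an invented sample sentence, that finds and replaces runs of digits:

```python
import re

text = "Order 66 shipped on day 12"

# \d+ matches one or more consecutive digits.
numbers = re.findall(r"\d+", text)
print(numbers)  # ['66', '12']

# Replace every run of digits with a placeholder.
masked = re.sub(r"\d+", "#", text)
print(masked)  # Order # shipped on day #

# ^ anchors the pattern to the start of the string.
starts_with_order = re.search(r"^Order", text) is not None
print(starts_with_order)  # True
```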
f-strings and the str.format() method are used to format strings. Both use curly-brace {} placeholders. For example:
monday, tuesday, wednesday = "Monday", "Tuesday", "Wednesday"

format_string_one = "{} {} {}".format(monday, tuesday, wednesday)
print(format_string_one)

format_string_two = "{2} {1} {0}".format(monday, tuesday, wednesday)
print(format_string_two)

format_string_three = "{one} {two} {three}".format(one=tuesday, two=wednesday, three=monday)
print(format_string_three)

format_string_four = f"{monday} {tuesday} {wednesday}"
print(format_string_four)
Output:
Monday Tuesday Wednesday
Wednesday Tuesday Monday
Tuesday Wednesday Monday
Monday Tuesday Wednesday
f-strings are more readable, and they run faster than the str.format() method. For these reasons, f-strings are the preferred way to format strings.
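Both approaches also accept the same format specifiers after a colon inside the braces, which makes it straightforward to move from one to the other. A small sketch with made-up values, right-aligning a label in 8 characters and printing a number with a thousands separator and two decimals:

```python
price = 1234.5678
name = "total"

# str.format(): the specifiers go inside the positional placeholders.
with_format = "{:>8}: {:,.2f}".format(name, price)

# f-string: the same specifiers follow the expression.
with_fstring = f"{name:>8}: {price:,.2f}"

print(with_format)                  #    total: 1,234.57
print(with_format == with_fstring)  # True
```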
An apostrophe (') marks the start or end of a string in Python. To let Python know that an apostrophe is part of the text rather than a string delimiter, we must use the Python escape character (\). So an apostrophe is written as \' in Python. Unlike apostrophes, quotation marks can be handled in several ways in Python. They include the following:
# 1. Represent the string with single quotes ('') and the quoted statement with double quotes ("")
quotes_one = '"Friends don\'t let friends use minibatches larger than 32" - Yann LeCun'
print(quotes_one)

# 2. Represent the string with double quotes ("") and the quoted statement with escaped double quotes (\"statement\")
quotes_two = "\"Friends don't let friends use minibatches larger than 32\" - Yann LeCun"
print(quotes_two)

# 3. Represent the string with triple quotes ("""""") and the quoted statement with double quotes ("")
quote_three = """"Friends don't let friends use minibatches larger than 32" - Yann LeCun"""
print(quote_three)
Output:
"Friends don't let friends use minibatches larger than 32" - Yann LeCun
"Friends don't let friends use minibatches larger than 32" - Yann LeCun
"Friends don't let friends use minibatches larger than 32" - Yann LeCun
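Strings are among the most common data types in any programming language, so a solid and flexible command of their attributes and methods is really important. Be sure to review them regularly and pay attention to them wherever they come up!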