Best Way to Strip Punctuation from a String
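When removing punctuation from a string in Python, one might use the following approach (written for Python 2, where string.maketrans exists and str.translate accepts a second argument listing the characters to delete):

    import string

    s = "string. With. Punctuation?"  # Sample string
    out = s.translate(string.maketrans("", ""), string.punctuation)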
However, this method may appear overly complex. Are there any simpler solutions?
Efficiency Perspective
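For raw speed, it is hard to beat translate. In Python 2:

    s.translate(None, string.punctuation)

In Python 3, where str.translate takes a single table argument, the equivalent is:

    s.translate(str.maketrans('', '', string.punctuation))

translate performs the string operations in C using a lookup table, which is why pure-Python approaches struggle to match it.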
Alternative Approaches
If speed is not a primary concern, consider the following alternative:
    exclude = set(string.punctuation)
    s = ''.join(ch for ch in s if ch not in exclude)
This option is faster than using s.replace for each character but is still outperformed by non-pure Python approaches such as string.translate.
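One plausible reason for that ordering is the amount of work each version does: the set-based filter walks the string once, while the replace loop rescans the whole string once per punctuation character. The sketch below places the two side by side and checks that they agree; the function names are hypothetical and used only for illustration.

```python
import string

EXCLUDE = set(string.punctuation)


def strip_with_set(text):
    # Single pass: keep each character unless it is in the exclusion set.
    return "".join(ch for ch in text if ch not in EXCLUDE)


def strip_with_replace(text):
    # One full pass over the string for every punctuation character.
    for c in string.punctuation:
        text = text.replace(c, "")
    return text


sample = "string. With. Punctuation?"
print(strip_with_set(sample) == strip_with_replace(sample))  # True
```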
Timing Analysis
To compare the performance of these methods, the following timing code can be used. It is shown here in Python 3 syntax: print is a function, and the translation table is built with str.maketrans rather than Python 2's string.maketrans.

    import re
    import string
    import timeit

    s = "string. With. Punctuation"
    exclude = set(string.punctuation)
    # Deletion table: every punctuation character maps to None.
    table = str.maketrans("", "", string.punctuation)
    regex = re.compile('[%s]' % re.escape(string.punctuation))

    def test_set(s):
        return ''.join(ch for ch in s if ch not in exclude)

    def test_re(s):
        return regex.sub('', s)

    def test_trans(s):
        return s.translate(table)

    def test_repl(s):
        for c in string.punctuation:
            s = s.replace(c, "")
        return s

    print("sets      :", timeit.Timer('f(s)', 'from __main__ import s,test_set as f').timeit(1000000))
    print("regex     :", timeit.Timer('f(s)', 'from __main__ import s,test_re as f').timeit(1000000))
    print("translate :", timeit.Timer('f(s)', 'from __main__ import s,test_trans as f').timeit(1000000))
    print("replace   :", timeit.Timer('f(s)', 'from __main__ import s,test_repl as f').timeit(1000000))
The results indicate that translate is the fastest of these methods. Therefore, for efficient punctuation removal, use s.translate(None, string.punctuation) in Python 2 or s.translate(str.maketrans('', '', string.punctuation)) in Python 3.
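Timings are only comparable if every candidate returns the same result, so it can be worth checking that first. The following is a minimal Python 3 sketch under a few assumptions: the candidates dictionary and its labels are hypothetical names, and the iteration count is reduced to 10,000 so the script finishes quickly. Absolute numbers will vary by machine and Python version, so none are quoted here.

```python
import re
import string
import timeit

sample = "string. With. Punctuation"
exclude = set(string.punctuation)
table = str.maketrans("", "", string.punctuation)
regex = re.compile("[%s]" % re.escape(string.punctuation))


def with_replace(s):
    for c in string.punctuation:
        s = s.replace(c, "")
    return s


# Hypothetical labels mapped to the four approaches being compared.
candidates = {
    "sets": lambda s: "".join(ch for ch in s if ch not in exclude),
    "regex": lambda s: regex.sub("", s),
    "translate": lambda s: s.translate(table),
    "replace": with_replace,
}

# All candidates should agree before their timings are compared.
results = {name: f(sample) for name, f in candidates.items()}
assert len(set(results.values())) == 1

for name, f in candidates.items():
    elapsed = timeit.timeit(lambda: f(sample), number=10_000)
    print(f"{name:10s}: {elapsed:.4f} s")
```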