python - Read a txt file and find the words that match a jumbled word

Question

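I have a dictionary-like txt file. After reading it, the program takes a scrambled string of letters as input and looks for the matching words (Word Jumble).
It should also print how many of the words in the file match.
The output should look something like: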

ringa_lee · Answer

First, str.split() splits a string into a list, and list.append() adds its entire argument to the list as a single element. So in your code,

wordlist.append(line.split(','))

makes wordlist a list of lists: each element of wordlist is itself a list, not a single word as you expect. Use this instead:

wordlist.extend(line.split(','))

or

wordlist += line.split(',')
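
To make the difference concrete, here is a minimal sketch comparing append() and extend(). The sample lines are made up for illustration and are not taken from your file:

```python
# Hypothetical sample lines standing in for the dictionary file contents.
lines = ["abbe,abed", "bead"]

nested = []
flat = []
for line in lines:
    nested.append(line.split(','))  # adds the whole list as one element
    flat.extend(line.split(','))    # adds each word individually

print(nested)  # [['abbe', 'abed'], ['bead']]
print(flat)    # ['abbe', 'abed', 'bead']
```

With the nested version, comparing the jumble against each element compares a string with a list, so nothing can ever match.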

Second, the strings returned by readlines() include the newline character at the end of each line, as the following code shows:

>>> 'abbe\n'.split()
['abbe']
>>> 'abbe\n'.split(',')
['abbe\n']

So line.split(',') should be changed to line.split().
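
The leftover newline matters because it takes part in the letter comparison. A small sketch, assuming the match is done by comparing sorted letters (the jumble "ebba" is a made-up input):

```python
line = "abbe\n"   # what readlines() yields for a line containing "abbe"
jumble = "ebba"   # hypothetical jumbled input

with_newline = line.split(',')[0]   # 'abbe\n', the newline is still attached
clean = line.split()[0]             # 'abbe', whitespace removed

# The extra '\n' character makes the sorted letter lists differ.
newline_match = sorted(jumble) == sorted(with_newline)
clean_match = sorted(jumble) == sorted(clean)

print(newline_match)  # False
print(clean_match)    # True
```

If the file has one word per line, this alone is enough to produce "No matches" for every input.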

The reference code is as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import argparse

def jumbler(jumble, dict_file_name):
    """
    Print every word in the file dict_file_name that is an anagram of
    jumble, followed by a summary of how many words matched.
    """

    # first you must open the file

    # second you must read each word from the file and perform an
    # appropriate comparison of each with 'jumble'; you need to count the
    # number of lines read from the file

    # if a word matches 'jumble', you are to print the word on a line by itself

    # after you have read each word from the file and compared, you need to
    # close the file

    # assume that there were MATCHES words that matched, and NLINES in the file
    # if there was a single match, you need to print
    # "1 match in NLINES words", where NLINES is replaced by the value of NLINES
    # if there were two or more matches, you need to print
    # "MATCHES matches in NLINES words"
    # if there were no matches, you need to print
    # "No matches"
    line_count = 0
    match_count = 0
    target = sorted(jumble)
    # 'with' closes the file even if an error occurs while reading
    with open(dict_file_name, "r") as dictionary:
        for line in dictionary:
            line_count += 1
            # split() with no argument also drops the trailing newline
            for word in line.split():
                if sorted(word) == target:
                    match_count += 1
                    print(word)
    if match_count == 0:
        print("No matches")
    elif match_count == 1:
        print("1 match in %d words" % line_count)
    else:
        print("%d matches in %d words" % (match_count, line_count))



def main():
    """
    collect command arguments and invoke jumbler()
    inputs:
        none, fetches arguments using argparse
    effects:
        calls jumbler()
    """
    parser = argparse.ArgumentParser(description="Solve a jumble (anagram)")
    parser.add_argument("jumble", type=str, help="Jumbled word (anagram)")
    parser.add_argument('wordlist', type=str,
                        help="A text file containing dictionary words, one word per line.")
    args = parser.parse_args()  # gets arguments from command line
    jumble = args.jumble
    wordlist = args.wordlist
    jumbler(jumble, wordlist)

if __name__ == "__main__":
    main()

天蓬老师 · Answer

The reason is simple, the problem lies here:

for line in dictornary.readlines():
    wordlist.append(line.split(','))    # line.split() returns a list, so wordlist ends up as a two-dimensional list

A small change to the code should fix it:

wordlist.extend(line.split(','))

However, storing every word in a list uses a lot of memory, especially if the dictionary file is very large.
I suggest storing only the matching words in the list, so the non-matching words never need to be kept.
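
A rough sketch of that idea: read the file line by line and keep only the matches. The function name find_matches and the sample words are hypothetical, and io.StringIO stands in for a real file opened with open():

```python
import io

def find_matches(jumble, lines):
    """Return the words in `lines` that are anagrams of `jumble`."""
    target = sorted(jumble)
    matches = []
    for line in lines:              # a file object yields one line at a time
        for word in line.split():   # split() also drops the trailing newline
            if sorted(word) == target:
                matches.append(word)  # only matching words are stored
    return matches

# io.StringIO behaves like an open text file for this purpose.
sample = io.StringIO("abbe\nbabe\nbead\n")
print(find_matches("ebba", sample))  # ['abbe', 'babe']
```

Because the loop iterates over the file object directly instead of calling readlines(), the whole file is never held in memory at once.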

Following both comments I changed the code to use list.extend, but the output is still nothing but "No matches". I don't quite understand why.