Designing a Code Statistics Tool in Python - Python Tutorial - PHP中文網

Designing a Code Statistics Tool in Python

不言

Apr 04, 2018 pm 04:57 PM

python, code

This article walks through designing a code statistics tool in Python that reports the number of files, lines of code, comment lines, and blank lines.

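Problem

Design a program that counts the lines of code in a project, reporting the number of files, code lines, comment lines, and blank lines. Keep the design reasonably flexible, so that different parameters can be passed to count projects written in different languages, for example:

# --type specifies the file type
python counter.py --type python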


Output:

files: 10
code_lines: 200
comments: 100
blanks: 20

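Analysis

This is a design problem that looks simple but is somewhat involved in practice. We can break it down: once the lines of a single file can be counted correctly, counting a whole directory is not a problem. The hardest part is multi-line comments. Taking Python as an example, comment lines fall into the following cases: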

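1. Single-line comments that start with a hash sign

# single-line comment

2. Multi-line comment delimiters that open and close on the same line

"""This is a multi-line comment"""
'''This is also a multi-line comment'''

3. Multi-line comment delimiters that span several lines

"""
All 3 of these lines are comment lines
"""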

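Our approach is to parse the file line by line. Multi-line comments need an extra flag, in_multi_comment, that records whether the current line is inside a multi-line comment. It defaults to False, is set to True when a multi-line comment starts, and is set back to False at the next multi-line comment delimiter. Every line from the opening delimiter through the closing delimiter counts as a comment line.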

Key points

How to read a file correctly, and the common string methods to use once the file content is handled as strings.

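Simplified version

We iterate step by step. First, implement a simplified program that only handles a single Python file and ignores multi-line comments; anyone who has just started learning Python can write this. The key step is to call strip() on each line after reading it, which removes the whitespace (spaces, carriage returns, newlines) at both ends of the string.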

# -*- coding: utf-8 -*-
"""
Counts lines in a .py file; only single-line comments are recognized.
"""


def parse(path):
    comments = 0
    blanks = 0
    codes = 0
    with open(path, encoding='utf-8') as f:
        for line in f.readlines():
            # Drop leading and trailing whitespace, including the line break
            line = line.strip()
            if line == "":
                blanks += 1
            elif line.startswith("#"):
                comments += 1
            else:
                codes += 1
    return {"comments": comments, "blanks": blanks, "codes": codes}


if __name__ == '__main__':
    print(parse("xxx.py"))
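To make the effect of strip() concrete, here is a minimal sketch that applies the same three-way rule to individual strings instead of a file. The helper name classify and the sample lines are assumptions made for illustration; they are not part of the original program.

```python
def classify(line):
    """Classify one raw line as 'blank', 'comment', or 'code' (single-line rules only)."""
    # strip() removes leading and trailing spaces, tabs, '\r' and '\n'
    stripped = line.strip()
    if stripped == "":
        return "blank"
    if stripped.startswith("#"):
        return "comment"
    return "code"


# Hypothetical sample lines, including indentation and Windows-style line endings
samples = ["    # indented comment\r\n", "   \t\n", "x = 1  # trailing comment\n"]
for raw in samples:
    print(repr(raw), "->", classify(raw))
```

Note that a line with code followed by a trailing comment is counted as code, because only the start of the stripped line is inspected.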


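Multi-line comment version

A counter that only handles single-line comments is of limited use. It has to count multi-line comments as well before it can be called a real code statistics tool.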

# -*- coding: utf-8 -*-
"""
Counts lines in a .py file, including multi-line comments.
"""


def parse(path):
    in_multi_comment = False  # True while inside a multi-line comment
    comments = 0
    blanks = 0
    codes = 0
    with open(path, encoding="utf-8") as f:
        for line in f.readlines():
            line = line.strip()
            # Blank lines inside a multi-line comment are counted as comments
            if line == "" and not in_multi_comment:
                blanks += 1
            # There are 4 kinds of comment lines:
            # 1. single-line comments starting with #
            # 2. multi-line delimiters that open and close on the same line
            # 3. lines between multi-line delimiters
            elif line.startswith("#") or \
                    (line.startswith('"""') and line.endswith('"""') and len(line) > 3) or \
                    (line.startswith("'''") and line.endswith("'''") and len(line) > 3) or \
                    (in_multi_comment and not (line.startswith('"""') or line.startswith("'''"))):
                comments += 1
            # 4. the opening line and closing line of a multi-line comment
            elif line.startswith('"""') or line.startswith("'''"):
                in_multi_comment = not in_multi_comment
                comments += 1
            else:
                codes += 1
    return {"comments": comments, "blanks": blanks, "codes": codes}


if __name__ == '__main__':
    print(parse("xxx.py"))


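In case 4 above, the key step when a multi-line comment delimiter is encountered is to invert the in_multi_comment flag rather than simply set it to False or True. The first """ turns it to True; the second """ closes the comment, so inverting gives False; the third """ opens a new comment, so the flag becomes True again, and so on.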
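The toggling can be traced on a short list of lines. The sketch below is only an illustration: the helper flag_states and the sample input are hypothetical, and it records the value of the flag after each line is processed.

```python
def flag_states(lines, delimiter='"""'):
    """Return the value of in_multi_comment after each line is processed."""
    in_multi_comment = False
    states = []
    for line in lines:
        stripped = line.strip()
        # A comment that opens and closes on the same line leaves the flag unchanged
        same_line = (stripped.startswith(delimiter)
                     and stripped.endswith(delimiter)
                     and len(stripped) > len(delimiter))
        if stripped.startswith(delimiter) and not same_line:
            in_multi_comment = not in_multi_comment  # invert, do not assign a constant
        states.append(in_multi_comment)
    return states


lines = ['"""', 'first docstring', '"""', 'x = 1', '"""', 'second docstring', '"""']
print(flag_states(lines))
# → [True, True, False, False, True, True, False]
```

Assigning a constant instead of inverting would need separate logic to tell an opening delimiter from a closing one, which Python's identical opening and closing symbols do not allow.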

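So does every additional language need its own parse function? If you look closely, the four comment cases can be abstracted into four conditions, because most languages have both single-line and multi-line comments; only the symbols differ.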

CONF = {"py": {"start_comment": ['"""', "'''"], "end_comment": ['"""', "'''"], "single": "#"},
        "java": {"start_comment": ["/*"], "end_comment": ["*/"], "single": "//"}}
# extension is the extension of the file being parsed, e.g. "py" or "java"
start_comment = CONF.get(extension).get("start_comment")
end_comment = CONF.get(extension).get("end_comment")
# Everything below runs for each stripped line inside the parsing loop
cond2 = False
cond3 = False
cond4 = False
# cond2: a multi-line comment that opens and closes on the same line
for index, item in enumerate(start_comment):
    cond2 = line.startswith(item) and line.endswith(end_comment[index]) and len(line) > len(item)
    if cond2:
        break
# cond3: the line starts with a closing delimiter
for item in end_comment:
    if line.startswith(item):
        cond3 = True
        break
# cond4: the line starts with an opening or closing delimiter
for item in start_comment + end_comment:
    if line.startswith(item):
        cond4 = True
        break
if line == "" and not in_multi_comment:
    blanks += 1
# There are 4 kinds of comment lines:
# 1. single-line comments starting with the single-line symbol
# 2. multi-line delimiters that open and close on the same line
# 3. lines between multi-line delimiters
elif line.startswith(CONF.get(extension).get("single")) or cond2 or \
        (in_multi_comment and not cond3):
    comments += 1
# 4. the opening line and closing line when the delimiters are on separate lines
elif cond4:
    in_multi_comment = not in_multi_comment
    comments += 1
else:
    codes += 1
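The fragment above is only the body of the per-line loop. As a minimal sketch of how it might be assembled into something runnable, the function below takes an iterable of lines and an extension key. The name count_lines, the explicit cond1 variable, and the Java sample are assumptions for illustration; the CONF table is copied from the fragment.

```python
CONF = {
    "py": {"start_comment": ['"""', "'''"], "end_comment": ['"""', "'''"], "single": "#"},
    "java": {"start_comment": ["/*"], "end_comment": ["*/"], "single": "//"},
}


def count_lines(lines, extension):
    """Count comment, blank, and code lines using the symbols configured for `extension`."""
    conf = CONF[extension]
    starts, ends, single = conf["start_comment"], conf["end_comment"], conf["single"]
    in_multi_comment = False
    comments = blanks = codes = 0
    for raw in lines:
        line = raw.strip()
        # cond1: single-line comment
        cond1 = line.startswith(single)
        # cond2: multi-line comment opened and closed on the same line
        cond2 = any(line.startswith(s) and line.endswith(e) and len(line) > len(s)
                    for s, e in zip(starts, ends))
        # cond3: line starts with a closing delimiter
        cond3 = any(line.startswith(e) for e in ends)
        # cond4: line starts with an opening or closing delimiter
        cond4 = any(line.startswith(d) for d in starts + ends)
        if line == "" and not in_multi_comment:
            blanks += 1
        elif cond1 or cond2 or (in_multi_comment and not cond3):
            comments += 1
        elif cond4:
            in_multi_comment = not in_multi_comment
            comments += 1
        else:
            codes += 1
    return {"comments": comments, "blanks": blanks, "codes": codes}


# Hypothetical Java source used only to exercise the function
java_source = ["/*", " * Block comment", " */", "", "// single-line comment",
               "int x = 1;", "/* one-line block */"]
print(count_lines(java_source, "java"))
# → {'comments': 5, 'blanks': 1, 'codes': 1}
```

Like the original, this sketch only recognizes delimiters at the start of a stripped line, so a comment that begins or ends after code on the same line is not detected.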


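All that is needed is one configuration constant that lists the single-line and multi-line comment symbols of every language, so that each language maps onto conditions cond1 through cond4 (cond1 being the single-line comment test). The remaining task is to parse multiple files, which can be done with os.walk.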

import os


def counter(path):
    """
    Count a whole directory or a single file.
    :param path: path of a directory or a file
    :return: dict with the comment, blank, and code line totals
    """
    if os.path.isdir(path):
        comments, blanks, codes = 0, 0, 0
        # os.walk yields every directory under path, including subdirectories
        list_dirs = os.walk(path)
        for root, dirs, files in list_dirs:
            for f in files:
                file_path = os.path.join(root, f)
                stats = parse(file_path)
                comments += stats.get("comments")
                blanks += stats.get("blanks")
                codes += stats.get("codes")
        return {"comments": comments, "blanks": blanks, "codes": codes}
    else:
        return parse(path)


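Of course, there is still a lot of work left before this program is complete, including command-line parsing and restricting the parse to one language according to the given parameter.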
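As one possible starting point for the command-line part, the sketch below uses the standard argparse module to accept the --type option from the problem statement. The TYPE_TO_EXTENSION mapping, the optional path argument, and the helper names are assumptions for illustration, not the author's design.

```python
import argparse

# Hypothetical mapping from the --type value to a file extension
TYPE_TO_EXTENSION = {"python": "py", "java": "java"}


def build_parser():
    """Build a parser for: counter.py [path] --type LANGUAGE"""
    parser = argparse.ArgumentParser(description="Count code, comment, and blank lines.")
    parser.add_argument("path", nargs="?", default=".",
                        help="file or directory to scan (default: current directory)")
    parser.add_argument("--type", dest="language", default="python",
                        choices=sorted(TYPE_TO_EXTENSION),
                        help="language of the files to count")
    return parser


def wanted(filename, language):
    """Return True if filename has the extension configured for language."""
    return filename.endswith("." + TYPE_TO_EXTENSION[language])


# An explicit argument list is parsed here so the example runs without a real command line
args = build_parser().parse_args(["--type", "python"])
print(args.language, args.path)
print(wanted("counter.py", args.language))
```

In a real script, parse_args() would be called without arguments so that it reads sys.argv, and wanted() would filter the file names produced by os.walk before they are parsed.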

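Supplement:

A line-counting tool implemented in Python

We often want to count the lines of code in a project, but a reasonably complete counter is not entirely trivial. Here is how to implement a line-counting tool in Python.

Approach:

First collect all the files, then count the lines in each file, and finally add the counts together.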

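Features:

Count the lines in each file;
Count the total number of lines;
Measure the running time;
Support specifying the file types to count, so that unwanted file types are excluded;
Recursively count the files in a folder, including the files in its subfolders;
Exclude blank lines.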

# coding=utf-8
import os
import time

basedir = '/root/script'
filelists = []
# File types to count
whitelist = ['php', 'py']


# Collect the matching files under basedir
def getFile(basedir):
    global filelists
    # os.walk already descends into subfolders, so no explicit recursion is needed
    for parent, dirnames, filenames in os.walk(basedir):
        for filename in filenames:
            ext = filename.split('.')[-1]
            # Only count the listed file types; skip log and cache files
            if ext in whitelist:
                filelists.append(os.path.join(parent, filename))


# Count the non-blank lines in one file
def countLine(fname):
    count = 0
    # Assumes UTF-8 source files; bytes that cannot be decoded are ignored
    with open(fname, encoding='utf-8', errors='ignore') as f:
        for file_line in f:
            if file_line != '' and file_line != '\n':  # skip blank lines
                count += 1
    print(fname + '----', count)
    return count


if __name__ == '__main__':
    startTime = time.perf_counter()
    getFile(basedir)
    totalline = 0
    for filelist in filelists:
        totalline = totalline + countLine(filelist)
    print('total lines:', totalline)
    print('Done! Cost Time: %0.2f second' % (time.perf_counter() - startTime))


Result:

[root@pythontab script]# python countCodeLine.py
/root/script/test /gametest.php---- 16
/root/script/smtp.php---- 284
/root/script/gametest.php---- 16
/root/script/countCodeLine.py---- 33
/root/script/sendmail.php---- 17
/root/script/test/gametest.php---- 16
total lines: 382
Done! Cost Time: 0.00 second
[root@pythontab script]#

Only PHP and Python files are counted, which is very convenient.
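The feature list mentions excluding unwanted file types, while the script only filters with a whitelist. A minimal sketch of how both filters might be combined is shown below; the function collect_files, its blacklist parameter, and the temporary files are assumptions added for illustration.

```python
import os
import tempfile


def collect_files(basedir, whitelist=None, blacklist=()):
    """Return the sorted paths under basedir whose extension passes both filters."""
    matched = []
    for parent, _dirnames, filenames in os.walk(basedir):
        for filename in filenames:
            # splitext keeps the leading dot, e.g. '.py'; strip it and normalize case
            ext = os.path.splitext(filename)[1].lstrip(".").lower()
            if ext in blacklist:
                continue  # explicitly excluded type
            if whitelist is not None and ext not in whitelist:
                continue  # a whitelist is given and this type is not on it
            matched.append(os.path.join(parent, filename))
    return sorted(matched)


# Build a small throwaway tree to demonstrate the filters
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for name in ("a.py", "b.php", "c.log", os.path.join("sub", "d.py")):
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as f:
            f.write("x\n")
    found = collect_files(tmp, whitelist={"php", "py"})
    print([os.path.relpath(p, tmp) for p in found])
```

Unlike filename.split('.')[-1], os.path.splitext returns an empty extension for a file name without a dot, so a file named py would not be mistaken for a Python source file.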
