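One of the biggest challenges in running an email marketing campaign is making sure your messages land in the inbox rather than the spam folder.
In this article, we will build a tool that checks whether your email is likely to be flagged as spam, and why.
The tool will be packaged as an API and deployed online so that it can be integrated into your workflow.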
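Apache SpamAssassin is an open-source spam detection platform maintained by the Apache Software Foundation. Many email clients and email filtering tools use it to classify messages as spam.
It combines a large set of rules, Bayesian filtering, and network tests to assign a spam "score" to a given email. In general, an email that scores 5 or higher is at high risk of being flagged as spam.
Because Apache SpamAssassin is spam detection software, it can also be used to tell whether your own email would be flagged as spam.
SpamAssassin's scoring is transparent and well documented, so you can use it to pinpoint which aspects of an email drive up its spam score and improve your writing accordingly.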
SpamAssassin is designed to run on Linux systems. To install and run it, you need a Linux operating system or a Docker container.
On Debian or Ubuntu systems, install SpamAssassin with the following command:
apt-get update && apt-get install -y spamassassin && sa-update
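The sa-update command makes sure SpamAssassin's rules are up to date.
Once installed, you can pipe an email message into SpamAssassin's command-line tool. The output is an annotated version of the email that includes the spam score and explains which rules were triggered.
Typical usage might look like this: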
spamassassin -t < input_email.txt > results.txt
results.txt will contain the processed email along with SpamAssassin's headers and score, like this:
X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-13) on 254.254.254.254
X-Spam-Level:
X-Spam-Status: No, score=0.2 required=5.0 tests=HTML_MESSAGE,
    MIME_HTML_ONLY,MISSING_MID,NO_RECEIVED,
    NO_RELAYS autolearn=no autolearn_force=no version=4.0.0

// ...

Content analysis details:   (0.2 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.1 MISSING_MID            Missing Message-Id: header
-0.0 NO_RECEIVED            Informational: message has no Received headers
-0.0 NO_RELAYS              Informational: message was not relayed via SMTP
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.1 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
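If you only need the overall verdict, the X-Spam-Status header carries it. The following is a minimal sketch, assuming the header has the shape shown in the sample report above; `parse_spam_status` is a hypothetical helper name, not part of SpamAssassin.

```python
import re


def parse_spam_status(report: str) -> dict:
    """Extract verdict, score, and threshold from an X-Spam-Status header.

    Assumes the header looks like the sample above:
    "X-Spam-Status: No, score=0.2 required=5.0 tests=..."
    """
    match = re.search(
        r"X-Spam-Status:\s*(Yes|No),\s*"
        r"score=(-?\d+(?:\.\d+)?)\s+required=(-?\d+(?:\.\d+)?)",
        report,
    )
    if match is None:
        raise ValueError("no X-Spam-Status header found")
    verdict, score, required = match.groups()
    return {
        "is_spam": verdict == "Yes",
        "score": float(score),
        "required": float(required),
    }


# A shortened version of the sample headers shown above.
sample = (
    "X-Spam-Level:\n"
    "X-Spam-Status: No, score=0.2 required=5.0 tests=HTML_MESSAGE,\n"
    "\tMIME_HTML_ONLY,MISSING_MID autolearn=no version=4.0.0\n"
)
print(parse_spam_status(sample))
```

For the sample report, this yields a non-spam verdict with a score of 0.2 against the threshold of 5.0.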
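SpamAssassin reaches its full potential only when it is wrapped as an API, because that form is more flexible and can be integrated into all kinds of workflows.
Imagine this: before you hit "Send" on an email, the content first goes to the SpamAssassin API. The email is allowed to go out only if it is judged not to meet the spam criteria.
Let's create a simple API that accepts these email fields: subject, html_body, and text_body. It passes the fields to SpamAssassin and returns the validation result.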
from fastapi import FastAPI
from datetime import datetime, timezone
from email.utils import format_datetime
from pydantic import BaseModel
import subprocess


def extract_analysis_details(text):
    lines = text.splitlines()

    # Locate the table header line ("pts rule name  description").
    start_index = None
    for i, line in enumerate(lines):
        if line.strip().startswith("pts rule"):
            start_index = i
            break

    if start_index is None:
        print("No content analysis details found.")
        return []

    # Skip the header and the dashed separator; read rows until a blank line.
    data_lines = lines[start_index+2:]
    parsed_lines = []
    for line in data_lines:
        if line.strip() == "":
            break
        parsed_lines.append(line)

    results = []
    current_entry = None

    # Derive the column boundaries from the widths of the dashed separator.
    split_line = lines[start_index+1]
    pts_split, rule_split, *rest = split_line.strip().split(" ")

    pts_start = 0
    pts_end = pts_start + len(pts_split)

    rule_start = pts_end + 1
    rule_end = rule_start + len(rule_split)

    desc_start = rule_end + 1

    for line in parsed_lines:
        pts_str = line[pts_start:pts_end].strip()
        rule_name_str = line[rule_start:rule_end].strip()
        description_str = line[desc_start:].strip()

        if pts_str == "" and rule_name_str == "" and description_str:
            # Continuation line: append to the previous entry's description.
            if current_entry:
                current_entry["description"] += " " + description_str
        else:
            current_entry = {
                "pts": pts_str,
                "rule_name": rule_name_str,
                "description": description_str
            }
            results.append(current_entry)

    return results


app = FastAPI()


class Email(BaseModel):
    subject: str
    html_body: str
    text_body: str


@app.post("/spam_check")
def spam_check(email: Email):
    # assemble the full email
    message = f"""From: example@example.com
To: recipient@example.com
Subject: {email.subject}
Date: {format_datetime(datetime.now(timezone.utc))}
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="__SPAM_ASSASSIN_BOUNDARY__"

--__SPAM_ASSASSIN_BOUNDARY__
Content-Type: text/plain; charset="utf-8"

{email.text_body}

--__SPAM_ASSASSIN_BOUNDARY__
Content-Type: text/html; charset="utf-8"

{email.html_body}

--__SPAM_ASSASSIN_BOUNDARY__--"""

    # Run SpamAssassin and capture the output directly
    output = subprocess.run(["spamassassin", "-t"],
                            input=message.encode('utf-8'),
                            capture_output=True)

    output_str = output.stdout.decode('utf-8', errors='replace')
    details = extract_analysis_details(output_str)
    return {"result": details}
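The handler above assembles the message by string formatting, so a subject that contains a line break could slip extra headers into the message. As an alternative sketch (not the approach used in this article), the same multipart/alternative message can be built with Python's standard `email` package. `build_message` is a hypothetical helper name, and the addresses are the same placeholders as above.

```python
from datetime import datetime, timezone
from email import message_from_bytes, policy
from email.message import EmailMessage
from email.utils import format_datetime


def build_message(subject: str, text_body: str, html_body: str) -> bytes:
    """Build a multipart/alternative email with a plain-text and an HTML part."""
    msg = EmailMessage()
    msg["From"] = "example@example.com"      # placeholder sender
    msg["To"] = "recipient@example.com"      # placeholder recipient
    msg["Subject"] = subject                 # raises ValueError on embedded line breaks
    msg["Date"] = format_datetime(datetime.now(timezone.utc))
    msg.set_content(text_body)                       # text/plain part
    msg.add_alternative(html_body, subtype="html")   # text/html part
    return msg.as_bytes()


# Parse the result back to confirm the structure.
raw = build_message("Hello", "plain text", "<p>html</p>")
parsed = message_from_bytes(raw, policy=policy.default)
print(parsed.get_content_type())
print([part.get_content_type() for part in parsed.iter_parts()])
```

The returned bytes can be passed as the `input` argument of `subprocess.run` in place of `message.encode('utf-8')`.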
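In the code above, we define a helper function, extract_analysis_details, that extracts only the scoring reasons from the full report. You can improve this function further, for example by filtering certain rules out of the result.
The response contains the analysis details from SpamAssassin's result.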
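As one possible way to filter rules, the sketch below drops entries whose rule name is in an ignore set. It assumes the list-of-dicts shape returned by extract_analysis_details. `filter_rules` is a hypothetical helper, and the default ignore set is only an illustration: NO_RECEIVED and NO_RELAYS appear in the sample report as informational rules, and they fire here because the test message never passed through an SMTP server.

```python
def filter_rules(details, ignored_rules=frozenset({"NO_RECEIVED", "NO_RELAYS"})):
    """Return only the entries whose rule name is not in ignored_rules."""
    return [entry for entry in details if entry["rule_name"] not in ignored_rules]


# Entries in the shape produced by extract_analysis_details.
details = [
    {"pts": "0.1", "rule_name": "MISSING_MID",
     "description": "Missing Message-Id: header"},
    {"pts": "-0.0", "rule_name": "NO_RECEIVED",
     "description": "Informational: message has no Received headers"},
    {"pts": "-0.0", "rule_name": "NO_RELAYS",
     "description": "Informational: message was not relayed via SMTP"},
]

for entry in filter_rules(details):
    print(entry["pts"], entry["rule_name"], entry["description"])
```

Calling `filter_rules(details)` just before the `return` in the handler would keep the response focused on rules you can act on by editing the email.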
As an example, take a request whose subject, html_body, and text_body fields include the phrase "Dear winner".
The response lists each rule the message triggered, along with its points and description.
See? "Dear winner" is detected because it is commonly used in spam.
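Running SpamAssassin requires a Linux environment with the software installed. Traditionally, you might deploy it on an EC2 instance or a DigitalOcean Droplet, which can be costly and tedious, especially when your usage is low.
Most serverless platforms, on the other hand, simply do not let you install system packages such as SpamAssassin.
Leapcell is a good fit for this job.
With Leapcell, you can deploy any system package, such as SpamAssassin, while keeping the service serverless: you pay only for invocations, which is usually cheaper.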
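Deploying the API on Leapcell is straightforward. You don't have to set up an environment yourself. Just deploy a Python image and fill in the "Build Command" field correctly.
Once deployed, you have an API for spam validation. Each time the API is called, it runs SpamAssassin, scores the email, and returns the analysis details.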
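To tie this back to the "check before you send" idea, here is a minimal client sketch using only the standard library. The URL is a placeholder for wherever your deployment lives, and `check_email`, `total_score`, and `is_safe_to_send` are hypothetical helper names. Note that the API above returns per-rule points rather than the overall score, so the total here is a sum of the rounded values shown in the report and may differ slightly from SpamAssassin's own figure.

```python
import json
import urllib.request

API_URL = "https://example.invalid/spam_check"  # placeholder, replace with your deployment


def check_email(subject, html_body, text_body, url=API_URL):
    """POST the email fields to the spam-check API and return the rule list."""
    payload = json.dumps(
        {"subject": subject, "html_body": html_body, "text_body": text_body}
    ).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["result"]


def total_score(details):
    """Sum the per-rule points (approximate, since the report rounds them)."""
    return sum(float(entry["pts"]) for entry in details)


def is_safe_to_send(details, threshold=5.0):
    """Allow sending only when the summed points stay below the threshold."""
    return total_score(details) < threshold


# Offline illustration with rule entries like those in the sample report.
sample_details = [
    {"pts": "0.1", "rule_name": "MISSING_MID", "description": "Missing Message-Id: header"},
    {"pts": "0.1", "rule_name": "MIME_HTML_ONLY", "description": "BODY: Message only has text/html MIME parts"},
]
print(is_safe_to_send(sample_details))
```

In a real workflow you would call `check_email(...)` first and pass its result to `is_safe_to_send`. The threshold of 5.0 mirrors the `required=5.0` value in the sample report.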