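Recently, an old friend of mine called me for help. He has worked as a journalist for many years and had just obtained the rights to republish many of his early columns. He wanted to post his work on the Web, but his columns were all saved as plain text files, and he had neither the time nor the desire to learn HTML just to convert them into Web pages. Since I was the only computer-savvy person in his phone book, he called to see whether I could help.
"Leave it to me," I said. "Call me back in an hour." Sure enough, when he called back a few hours later, I had a solution ready for him. It took a little PHP, and it earned me his endless thanks and a case of red wine.
So what did I do in that hour? That is the subject of this article. I'll show you how to use PHP to quickly convert plain ASCII text into readable HTML markup.
First, let's look at an example of the plain text files my friend wanted to convert: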
Green for Mars!
John R. Doe
The idea of little green men from Mars, long a staple of science fiction, may soon turn out to be less fantasy and more fact.
Recent samples sent by the latest Mars exploration team indicate a high presence of chlorophyll in the atmosphere. Chlorophyll, you will recall, is what makes plants green. It's quite likely, therefore, that organisms on Mars will have, through continued exposure to the green stuff, developed a greenish tinge on their outer exoskeleton.
An interview with Dr. Rushel Bunter, the head of ASDA's Mars Colonization Project blah blah...
What does this mean for you? Well, it means blah blahblah...
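Track follow-ups to this story online at
Here is the PHP script that does the conversion:
<?php
$source = "article.txt"; // placeholder: source file name and path
// read raw text as an array, one element per line
$raw = file($source) or die("Cannot read file");
// retrieve first and second lines (title and author)
$slug = htmlspecialchars(trim(array_shift($raw)));
$byline = htmlspecialchars(trim(array_shift($raw)));
// join remaining lines into a string
$data = join('', $raw);
// replace special characters with HTML entities, line breaks with <br />
$html = nl2br(htmlspecialchars($data));
// replace multiple spaces with a single space
$html = preg_replace('/ {2,}/', ' ', $html);
// wrap URLs in <a> elements (this pattern is an assumption)
$html = preg_replace('/(\w+:\/\/[^\s<]+)/', '<a href="$1">$1</a>', $html);
// start building output page: add page header
$output = <<<HEADER
<html><head><style>
.slug { font-size: 15pt; font-weight: bold } .byline { font-style: italic }
</style></head><body>
HEADER;
// add page content
$output .= "<div class='slug'>$slug</div>";
$output .= "<div class='byline'>By $byline</div>";
$output .= "<div>$html</div>";
// add page footer
$output .= <<<FOOTER
</body></html>
FOOTER;
// display in browser
echo $output;
// AND/OR
// write output to a new .html file
file_put_contents(basename($source, substr($source, strpos($source, '.'))) . ".html", $output)
    or die("Cannot write file");
?>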
The first step is to read the plain ASCII file into a PHP array. This is easily done with the file() function, which turns each line of the file into an element of a numerically indexed array.
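As a minimal sketch of this first step, the snippet below writes a small sample file to a temporary location and reads it back with file(). The temporary file and its contents are placeholders so the example runs on its own; they are not the author's data. It also checks for false explicitly, because an empty file yields an empty array, which "or die()" would treat as a failure.

```php
<?php
// Placeholder sample file, created only so this sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'col');
file_put_contents($path, "Green for Mars!\nJohn R. Doe\nFirst paragraph.\n");

// file() returns one array element per line; each element keeps its newline.
$lines = file($path);
if ($lines === false) {
    die("Cannot read file");
}

echo count($lines), " lines read\n";

// Remove the temporary sample file again.
unlink($path);
```

If you would rather not carry the trailing newlines around, file() also accepts the FILE_IGNORE_NEW_LINES flag, at the cost of having to re-insert line breaks when the lines are joined later.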
Then the title and author lines (I assume these are the first two lines of the file) are removed from the array with the array_shift() function and placed in separate variables. The remaining elements of the array are concatenated into a single string, which now contains the entire text of the article.
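The splitting step can be sketched as a small function. The name splitArticle() is hypothetical, and the sketch assumes, as the article does, that the title and author occupy the first two lines.

```php
<?php
// Hypothetical helper: splits the line array into title, author and body.
function splitArticle(array $lines): array
{
    // array_shift() removes and returns the first element each time.
    $title  = trim((string) array_shift($lines));
    $author = trim((string) array_shift($lines));

    // Whatever is left is the article body; the lines keep their newlines.
    $body = implode('', $lines);

    return [$title, $author, $body];
}

[$slug, $byline, $data] = splitArticle([
    "Green for Mars!\n",
    "John R. Doe\n",
    "First paragraph.\n",
    "Second paragraph.\n",
]);

echo $slug, " / ", $byline, "\n";
```

Trimming the two header lines drops the newline that file() leaves on each of them, which would otherwise end up inside the title and byline markup.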
Special characters such as "<", ">" and "&" in the article body are converted into the corresponding HTML entities by the htmlspecialchars() function. To preserve the original layout of the article, line breaks, and with them the paragraph breaks, are converted into HTML <br /> elements by the nl2br() function. Runs of multiple spaces within the article are collapsed into a single space by a simple string replacement.
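The order of these three operations matters: escaping has to happen before nl2br(), otherwise the inserted <br /> tags would be escaped too. A minimal sketch, with textToHtml() as a hypothetical name and UTF-8 assumed as the input encoding:

```php
<?php
// Hypothetical helper: escapes text, keeps line breaks, collapses spaces.
function textToHtml(string $text): string
{
    // 1. Escape &, <, > and quotes so the text cannot be read as markup.
    $html = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // 2. Insert <br /> before every newline (the newline itself is kept).
    $html = nl2br($html);

    // 3. Collapse runs of two or more spaces into one space.
    //    preg_replace() returns null on error; fall back to the input then.
    return preg_replace('/ {2,}/', ' ', $html) ?? $html;
}

echo textToHtml("Fish & chips  <cheap>\nNext line");
```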
URLs in the article body are detected with a regular expression and wrapped in <a> elements, so that they appear as clickable hyperlinks when the page is displayed in a Web browser.
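One way to sketch the URL step is shown below. The function name linkUrls() and the pattern are assumptions, not the article's exact expression: this version only matches http and https URLs and stops at whitespace or "<", so that a <br /> tag added by nl2br() is not swallowed into the link. It expects text that has already been passed through htmlspecialchars().

```php
<?php
// Hypothetical helper: wraps http(s) URLs in <a> elements.
function linkUrls(string $html): string
{
    $linked = preg_replace(
        '~\b(https?://[^\s<]+)~i',   // URL runs until whitespace or "<"
        '<a href="$1">$1</a>',       // $1 is the matched URL
        $html
    );

    // preg_replace() returns null on error; keep the input in that case.
    return $linked ?? $html;
}

echo linkUrls("See http://example.com/mars for more.");
```

A pattern this simple will also pull trailing punctuation, such as a full stop directly after the URL, into the link, so it is worth checking the converted pages by eye.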
The output HTML page is then built using standard HTML rules. The title, author and body of the article are each formatted with CSS style rules. Although this script doesn't do so, this is the place to customize the look of the final page: you can add graphic elements, colors, or other eye-catching content to the template.
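To make the template easier to customize, the page skeleton can be kept in one place. The sketch below is only an illustration: buildPage(), the class names and the style rules are assumptions, and the three arguments are expected to be escaped already.

```php
<?php
// Hypothetical helper: wraps already-escaped fragments in a minimal page.
function buildPage(string $slug, string $byline, string $html): string
{
    // Heredoc template; edit the <style> block to change the look.
    return <<<PAGE
<html>
<head>
<title>$slug</title>
<style>
.slug { font-size: 15pt; font-weight: bold }
.byline { font-style: italic }
</style>
</head>
<body>
<div class="slug">$slug</div>
<div class="byline">By $byline</div>
<div>$html</div>
</body>
</html>
PAGE;
}

echo buildPage("Green for Mars!", "John R. Doe", "First paragraph.<br />");
```

Adapting the output to an existing site is then mostly a matter of replacing the markup inside the heredoc with that site's own header and footer.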
Once the HTML page is built, it can be sent to the browser or saved as a static file using file_put_contents(). Note that when saving, the extension is stripped from the original file name and a new file name (filename.html) is created for the new Web page. You can then publish the page to a Web server, save it to a CD, or edit it further.
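The listing derives the new name by cutting the source name at its first dot, which can give surprising results when the path contains other dots. As an alternative sketch, pathinfo() with PATHINFO_FILENAME strips only the final extension; htmlNameFor() is a hypothetical name.

```php
<?php
// Hypothetical helper: "columns/mars.txt" becomes "mars.html".
function htmlNameFor(string $source): string
{
    // PATHINFO_FILENAME is the base name without directory or last extension.
    return pathinfo($source, PATHINFO_FILENAME) . '.html';
}

echo htmlNameFor('columns/mars.txt'), "\n";
echo htmlNameFor('mars.v2.txt'), "\n";
```

Like basename() in the listing, this drops the directory part, so the .html file is written to the script's current working directory unless you prepend a path yourself.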
Note: When using this script to create and save HTML files to disk, you must ensure that the script has write permissions for the directory where the file is saved.
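A permission problem is easier to diagnose if the script checks the target directory before writing. The sketch below is one possible approach; saveHtml() and the example path are assumptions made for illustration.

```php
<?php
// Hypothetical helper: writes the page only if the directory is writable.
function saveHtml(string $path, string $html): bool
{
    $dir = dirname($path);

    // Fail early if the directory is missing or not writable by this script.
    if (!is_dir($dir) || !is_writable($dir)) {
        return false;
    }

    // file_put_contents() returns false on failure, a byte count otherwise.
    return file_put_contents($path, $html) !== false;
}

// Example: write into the system temp directory, then clean up.
$target = sys_get_temp_dir() . '/example-column.html';
var_dump(saveHtml($target, '<html></html>'));
if (is_file($target)) {
    unlink($target);
}
```

Comparing the result with false, rather than relying on "or die()", also avoids treating a successful write of zero bytes as an error.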
As you can see, if your plain ASCII text files follow a consistent format, you can convert them into usable Web pages fairly quickly with PHP. If you already have a Web site and plan to add the new pages to it, it is fairly easy to adjust the template used by the page builder to match the look of the existing site. Try it yourself!