PHP Master | Monitoring File Integrity

Christopher Nolan
Release: 2025-03-03 08:26:13

Key Points

  • Monitoring file integrity is essential for website management: it helps detect when files are added, modified, deleted, or corrupted, whether by accident or maliciously. Hashing the contents of a file is a reliable way to detect such changes.
  • PHP's hash_file() function can be used to build a profile of the file structure for monitoring. The hash of each file can be stored for later comparison to detect any changes.
  • A database table can store the hashes: file_path holds the path of the file on the server, and file_hash holds the hash of its contents.
  • PHP's RecursiveDirectoryIterator class can be used to traverse the file tree and collect hashes for comparison. The integrity_hashes table can then be updated with these hashes. PHP's array_diff_assoc() function can be used to check for differences, which helps identify files that have been added, deleted, or changed.

Common situations in website management

Ask yourself how you would deal with the following situations when managing a website:

  • Files are accidentally added, modified, or deleted
  • Files are maliciously added, modified, or deleted
  • Files become corrupted

More importantly, would you even know if one of these had happened? If your answer is no, please keep reading. In this guide, I will demonstrate how to create a profile of your file structure that can be used to monitor file integrity.

The best way to determine whether a file has changed is to hash its contents. PHP provides several hashing functions, but for this project I settled on hash_file(). It supports a wide range of hashing algorithms, which makes my code easy to modify if I decide to switch later.

Hashing is used in a wide variety of applications, from password protection to DNA sequencing. A hashing algorithm works by converting data into a fixed-size, repeatable string (a one-way digest, not encryption). Algorithms are designed so that even a slight modification to the data produces a very different result. When two or more different inputs produce the same string, it is called a "collision". The strength of a hashing algorithm can be gauged by its speed and its probability of collisions.

In my examples I will use the SHA-1 algorithm because it is fast, has a low probability of accidental collisions, and has been widely used and well tested. Bear in mind that deliberately constructed SHA-1 collisions have been demonstrated, so if resisting a determined attacker matters to you, a SHA-2 algorithm such as sha256 is the safer choice. Of course, you are welcome to research other algorithms and use whichever you like.

Once a file's hash has been obtained, it can be stored for later comparison. If hashing the file later does not return the same string as before, we know the file has been changed.
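To make the idea concrete, here is a minimal sketch of hashing a file before and after a one-character edit. The temporary file and its contents are assumptions made up for the demonstration; any readable file would behave the same way.

```php
<?php
// Hypothetical demo file created just for this sketch.
$path = tempnam(sys_get_temp_dir(), "integrity_");

file_put_contents($path, "Hello, world");
$before = hash_file("sha1", $path);

// Hashing unchanged contents again returns the same string.
$again = hash_file("sha1", $path);

// Change a single character and hash once more.
file_put_contents($path, "Hello, World");
$after = hash_file("sha1", $path);

echo "before: $before\n";
echo "again:  $again\n";
echo "after:  $after\n";
echo ($before === $after) ? "file is unchanged\n" : "file has changed\n";

// Clean up the demo file.
unlink($path);
```

The first two lines printed should be identical, while the third should differ completely even though only one letter changed.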

Database

First, we need to lay out a basic table to store the file hashes. I will use the following schema:

CREATE TABLE integrity_hashes (
    file_path VARCHAR(200) NOT NULL,
    file_hash CHAR(40) NOT NULL,
    PRIMARY KEY (file_path)
);

file_path stores the location of the file on the server. Because the value is always unique (two files cannot occupy the same location in the file system), it serves as our primary key. I set its maximum length to 200 characters, which should leave room for some fairly long file paths. file_hash stores the hash of the file, which for SHA-1 is a 40-character hexadecimal string.
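If you do swap SHA-1 for another algorithm, the width of file_hash has to follow the digest length. As a rough sketch (the list of algorithms here is just a sample I picked, not a recommendation), the required width can be checked by hashing an empty string and measuring the result:

```php
<?php
// Length, in hexadecimal characters, of the digest each algorithm produces.
// The algorithm names are a hypothetical shortlist for illustration.
$lengths = array();
foreach (array("md5", "sha1", "sha256", "sha512") as $algo) {
    $lengths[$algo] = strlen(hash($algo, ""));
}

// md5 => 32, sha1 => 40, sha256 => 64, sha512 => 128
print_r($lengths);
```

With sha256, for instance, the column would need to be CHAR(64) rather than CHAR(40).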

Collect files

The next step is to build a profile of the file structure. We define the path at which to start collecting files and recursively iterate through each directory until we have covered the entire branch of the file system, optionally excluding certain directories or file extensions. We collect the hashes we need as we traverse the file tree, to be stored in the database or used for comparison. PHP provides several ways to traverse the file tree; for simplicity, I will use the RecursiveDirectoryIterator class.

<?php
define("PATH", "/var/www/");
$files = array();

// Extensions to fetch; an empty array returns all extensions
$ext = array("php");

// Directories to ignore; an empty array checks all directories
$skip = array("logs", "logs/traffic");

// Build the profile
$dir = new RecursiveDirectoryIterator(PATH);
$iter = new RecursiveIteratorIterator($dir);
while ($iter->valid()) {
    // Skip unwanted directories
    if (!$iter->isDot() && !in_array($iter->getSubPath(), $skip)) {
        // Fetch specific file extensions only
        if (!empty($ext)) {
            // Where getExtension() is available: if (in_array($iter->getExtension(), $ext)) {
            if (in_array(pathinfo($iter->key(), PATHINFO_EXTENSION), $ext)) {
                $files[$iter->key()] = hash_file("sha1", $iter->key());
            }
        } else {
            // Ignore file extensions
            $files[$iter->key()] = hash_file("sha1", $iter->key());
        }
    }
    $iter->next();
}

Note that I referenced the same folder twice in the $skip array. Just because I chose to ignore a specific directory doesn't mean that the iterator also ignores all of its subdirectories, which can be useful or annoying depending on your needs. The RecursiveDirectoryIterator class gives us access to several methods:

  • valid() checks whether we are working with a valid file
  • isDot() determines whether the directory is "." or ".."
  • getSubPath() returns the folder, relative to the starting path, in which the file pointer is currently located
  • key() returns the full path and file name
  • next() moves the loop on to the next entry

There are many more methods available, but most of the time those listed above are all we need, although the getExtension() method, which returns the file extension, was added in PHP 5.3.6. If your PHP version supports it, you can use it to filter out unwanted entries instead of what I did with pathinfo(). After execution, the code should populate the $files array with results similar to:
<code>Array
(
    [/var/www/test.php] => b6b7c28e513dac784925665b54088045cf9cbcd3
    [/var/www/sub/hello.php] => a5d5b61aa8a61b7d9d765e1daf971a9a578f1cfa
    [/var/www/sub/world.php] => da39a3ee5e6b4b0d3255bfef95601890afd80709
)</code>
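If listing every subdirectory in $skip turns out to be the annoying case, one alternative is to prune whole subtrees. The following is only a sketch of that variation, not the approach used above: it swaps in RecursiveCallbackFilterIterator (available from PHP 5.4), and collect_hashes() is a hypothetical helper name of my own.

```php
<?php
// Hypothetical helper: collect SHA-1 hashes under $root, pruning any
// directory whose path relative to $root is listed in $skip. Unlike the
// loop above, everything beneath a skipped directory is ignored too.
function collect_hashes($root, array $ext = array(), array $skip = array())
{
    $dir = new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS);

    $filter = new RecursiveCallbackFilterIterator(
        $dir,
        function ($current, $key, $iterator) use ($ext, $skip) {
            if ($iterator->hasChildren()) {
                // Rejecting a directory here stops the descent into it.
                return !in_array($iterator->getSubPathname(), $skip);
            }
            // An empty $ext accepts every file.
            return empty($ext) || in_array($current->getExtension(), $ext);
        }
    );

    $files = array();
    foreach (new RecursiveIteratorIterator($filter) as $path => $info) {
        $files[$path] = hash_file("sha1", $path);
    }
    return $files;
}
```

Calling collect_hashes("/var/www", array("php"), array("logs")) would then leave out logs and logs/traffic alike, so $skip only needs the top-level entry.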
After building the profile, updating the database is straightforward.

<?php
$db = new PDO("mysql:host=" . DB_HOST . ";dbname=" . DB_NAME,
    DB_USER, DB_PASSWORD);

// Clear the old records
$db->query("TRUNCATE integrity_hashes");

// Insert the updated records
$sql = "INSERT INTO integrity_hashes (file_path, file_hash) VALUES (:path, :hash)";
$sth = $db->prepare($sql);
$sth->bindParam(":path", $path);
$sth->bindParam(":hash", $hash);
foreach ($files as $path => $hash) {
    $sth->execute();
}

Check the difference

You now know how to build a fresh profile of the directory structure and how to update the records in the database. The next step is to combine the two into some kind of real application, such as a cron job with email notifications, an admin interface, or anything else you like. If you just want to gather a list of files that have changed without caring how they changed, the simplest approach is to pull the data from the database into an array similar to $files and use PHP's array_diff_assoc() function to weed out everything that matches.
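A minimal sketch of that comparison might look like the following. To keep it runnable without a database, the rows are hard-coded in the shape PDO would return them, and sha1() of made-up strings stands in for real hash_file() results; the paths and the $stored variable name are assumptions for illustration.

```php
<?php
// Current profile (path => hash), as the collection step would build it.
$files = array(
    "/var/www/test.php"      => sha1("edited contents"),
    "/var/www/sub/hello.php" => sha1("hello"),
    "/var/www/sub/world.php" => sha1(""),
);

// Rows as they would come back from integrity_hashes. With a live connection:
// $result = $db->query("SELECT file_path, file_hash FROM integrity_hashes")->fetchAll();
$result = array(
    array("file_path" => "/var/www/test.php",      "file_hash" => sha1("original contents")),
    array("file_path" => "/var/www/sub/hello.php", "file_hash" => sha1("hello")),
    array("file_path" => "/var/www/sub/empty.php", "file_hash" => sha1("")),
);

// Reshape the rows into the same path => hash layout as $files.
$stored = array();
foreach ($result as $row) {
    $stored[$row["file_path"]] = $row["file_hash"];
}

// Entries of $files whose path/hash pair has no match in $stored.
$diffs = array_diff_assoc($files, $stored);
print_r(array_keys($diffs));
```

Here $diffs ends up holding test.php (its hash changed) and world.php (a new path), even though world.php shares its hash with the stored empty.php. Note that in this direction a path that exists only in the database, such as empty.php, does not show up; swapping the two arguments would list those instead.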


With this approach, $diffs (the result of array_diff_assoc()) will be populated with any discrepancies found, or will be an empty array if the file structure is intact. Unlike array_diff(), array_diff_assoc() uses the keys in the comparison, which matters in the event of a collision, such as two empty files that share the same hash. If you want to go a step further, you can add some simple logic to determine exactly how each file was affected: whether it was deleted, changed, or added.
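One way such logic could be written is sketched below. It is an illustration under my own assumptions rather than a definitive listing: classify_diffs() is a hypothetical helper, the status keys "deleted", "changed" and "added" are names chosen for this sketch, and $tmp here tracks only the paths found in both places so that an added file is still caught when another file was deleted in the same run.

```php
<?php
// Hypothetical helper: compare stored hashes with the current profile.
// Both arguments are path => hash arrays.
function classify_diffs(array $stored, array $files)
{
    $diffs = array();
    $tmp = array(); // current paths that also exist in the database

    foreach ($stored as $path => $hash) {
        if (!array_key_exists($path, $files)) {
            // In the database but no longer on disk.
            $diffs["deleted"][$path] = $hash;
        } else {
            if ($files[$path] !== $hash) {
                // Present in both places, but the contents differ.
                $diffs["changed"][$path] = $files[$path];
            }
            $tmp[$path] = $files[$path];
        }
    }

    // Anything in $files that was never checked above must be new.
    if (count($files) > count($tmp)) {
        $diffs["added"] = array_diff_key($files, $tmp);
    }
    return $diffs;
}

// Sample data: sha1() of made-up strings stands in for real file hashes.
$stored = array(
    "/var/www/test.php"      => sha1("original contents"),
    "/var/www/sub/hello.php" => sha1("hello"),
    "/var/www/sub/old.php"   => sha1("old"),
);
$files = array(
    "/var/www/test.php"      => sha1("edited contents"),
    "/var/www/sub/hello.php" => sha1("hello"),
    "/var/www/sub/world.php" => sha1(""),
);
print_r(classify_diffs($stored, $files));
```

With this sample data, test.php is reported as changed, old.php as deleted, and world.php as added, while hello.php is left out because nothing about it differs.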


As we loop through the results from the database, we make several checks. First, array_key_exists() checks whether the file path stored in the database is present in $files; if it is not, the file must have been deleted. Second, if the file exists but the hashes do not match, the file must have been altered; otherwise it is unchanged. We record each check in a temporary array named $tmp and, finally, if $files contains more entries than the database does, we know that the remaining unchecked files have been added. When finished, $diffs is either an empty array or a multidimensional array containing the discrepancies found, grouped by how each file was affected.

To display the results in a more user-friendly format (in an admin interface, for example), you can loop through the results and output them as bulleted lists.
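As a rough sketch of that output step, the hypothetical render_diffs() helper below turns a $diffs array of the form status => (path => hash) into nested HTML lists. The status keys and paths in the sample call are illustrative assumptions; the function simply prints whatever keys it is given.

```php
<?php
// Hypothetical helper: render $diffs (status => array of path => hash)
// as nested HTML lists, or a short message when nothing was found.
function render_diffs(array $diffs)
{
    if (empty($diffs)) {
        return "<p>File structure is intact.</p>";
    }

    $html = "<p>The following discrepancies were found:</p>\n<ul>\n";
    foreach ($diffs as $status => $affected) {
        if (!is_array($affected) || empty($affected)) {
            continue;
        }
        $html .= "<li>" . htmlspecialchars($status) . "\n<ul>\n";
        foreach (array_keys($affected) as $path) {
            // Escape paths: file names may contain characters such as < or &.
            $html .= "<li>" . htmlspecialchars($path) . "</li>\n";
        }
        $html .= "</ul>\n</li>\n";
    }
    return $html . "</ul>";
}

// Sample input with illustrative status keys and paths.
echo render_diffs(array(
    "changed" => array("/var/www/test.php" => sha1("edited contents")),
    "added"   => array("/var/www/sub/world.php" => sha1("")),
));
```

Returning the markup as a string, rather than echoing inside the function, keeps it easy to drop into an email body as well as an admin page.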


At this point, you can either provide a link that triggers an update of the database with the new file structure (in which case you might choose to store $files in a session variable), or, if you do not approve of the discrepancies, deal with them as you see fit.

Summary

I hope this guide has given you a better understanding of file integrity monitoring. Having something like this in place on your website is a valuable security measure, and you can take comfort in knowing that your files remain exactly as you intended. Of course, don't forget to back up regularly, just in case.
