This article introduces a wrapper class for quickly reading large CSV files line by line in PHP. The class is also suitable for other large text files.
Reading large CSV files was covered in an earlier article (code example of PHP reading and processing large CSV files line by line), but some problems remain in how to work with large files quickly and completely.
1. How to quickly get the total number of lines of a large CSV file?
Method 1: Read the entire file content and split it on newline characters to get the total number of lines. This works for small files, but is not feasible for large files because the whole file has to fit in memory;
Method 2: Use fgets to traverse the file line by line and count the lines. This is better than method 1 because memory use stays constant, but a large file can still run into the script execution time limit;
Method 3: Use the SplFileObject class, seek to a line number beyond the end of the file, and obtain the line number the pointer stopped on through the SplFileObject::key method. This is simple and memory-efficient. Note that key() is zero-based, so the result can differ by one from the actual line count depending on the PHP version and on whether the file ends with a newline; it is worth checking against a small file first.
The implementation of method 3 is as follows:
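$csv_file = 'path/bigfile.csv';
$spl_object = new SplFileObject($csv_file, 'rb');
// The size in bytes is never smaller than the number of lines, so it is a safe "past the end" target.
$spl_object->seek(filesize($csv_file));
// key() returns the zero-based number of the line the pointer stopped on.
echo $spl_object->key();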
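Because the value returned by key() can vary by one, it may help to cross-check it on a small sample with the plain fgets loop from method 2. The following is a minimal sketch under stated assumptions: the helper name count_lines_fgets and the sample data are hypothetical and not part of the original article.

```php
<?php
// Hypothetical helper: counts lines by reading one line at a time (method 2).
function count_lines_fgets(string $path): int
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $count = 0;
    while (fgets($handle) !== false) {
        $count++;
    }
    fclose($handle);
    return $count;
}

// Small temporary sample file (assumed data: three lines, trailing newline).
$sample = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($sample, "id,name\n1,a\n2,b\n");

$fgets_count = count_lines_fgets($sample);

// Method 3 on the same file, for comparison.
$spl_object = new SplFileObject($sample, 'rb');
$spl_object->seek(filesize($sample));
$key_count = $spl_object->key();
$spl_object = null; // release the file before deleting it

echo "fgets: $fgets_count, key(): $key_count", PHP_EOL;
unlink($sample);
```

If the two numbers differ on the sample, the same offset can be applied to the key() result for the large file.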
2. How to quickly obtain data from large CSV files?
This again uses PHP's SplFileObject class; its seek method moves the pointer to a given zero-based line number.
The code is as follows:
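$csv_file = 'path/bigfile.csv';
$start = 100000; // Zero-based number of the first line to read
$num = 100; // Read 100 lines
$data = array();
$spl_object = new SplFileObject($csv_file, 'rb');
// Let current() return each line as a parsed CSV row. Calling fgetcsv() right
// after seek() can be off by one line on some older PHP versions.
$spl_object->setFlags(SplFileObject::READ_CSV);
$spl_object->seek($start);
while ($num-- > 0 && $spl_object->valid()) {
    $data[] = $spl_object->current();
    $spl_object->next();
}
print_r($data);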
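The read above collects every row into an array. When the slice itself is large, a generator can hand rows to the caller one at a time instead. The following is a minimal sketch, not the article's own method: the helper name read_csv_rows and the sample data are hypothetical, and it relies on the SplFileObject::READ_CSV flag together with current() rather than fgetcsv().

```php
<?php
// Hypothetical helper: lazily yields up to $num parsed rows, starting at the
// zero-based line $start, so the caller never holds the whole slice at once.
function read_csv_rows(string $path, int $start, int $num): Generator
{
    $file = new SplFileObject($path, 'rb');
    $file->setFlags(SplFileObject::READ_CSV); // current() returns a parsed row
    $file->seek($start);
    while ($num-- > 0 && $file->valid()) {
        $row = $file->current();
        $file->next();
        // A blank line (for example after a trailing newline) parses as [null]; skip it.
        if (is_array($row) && $row !== [null]) {
            yield $row;
        }
    }
}

// Small temporary sample file (assumed data).
$sample = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($sample, "id,name\n1,a\n2,b\n3,c\n");

// Read two rows starting at line index 1, i.e. skipping the header line.
$rows = iterator_to_array(read_csv_rows($sample, 1, 2), false);
print_r($rows);
unlink($sample);
```

In this sketch $num counts lines read, not rows yielded, so skipped blank lines reduce the number of rows returned.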
3. Based on the two points above, the code can be organized into a class for reading CSV files:
class CsvReader {
    private $csv_file;
    private $spl_object = null;
    private $error;

    public function __construct($csv_file = '') {
        if ($csv_file && file_exists($csv_file)) {
            $this->csv_file = $csv_file;
        }
    }

    public function set_csv_file($csv_file) {
        if (!$csv_file || !file_exists($csv_file)) {
            $this->error = 'File invalid';
            return false;
        }
        $this->csv_file = $csv_file;
        $this->spl_object = null; // force a reopen on the next read
        return true;
    }

    public function get_csv_file() {
        return $this->csv_file;
    }

    private function _file_valid($file = '') {
        $file = $file ? $file : $this->csv_file;
        return $file && file_exists($file) && is_readable($file);
    }

    private function _open_file() {
        if (!$this->_file_valid()) {
            $this->error = 'File invalid';
            return false;
        }
        if ($this->spl_object === null) {
            $this->spl_object = new SplFileObject($this->csv_file, 'rb');
        }
        return true;
    }

    // $length: number of lines to read (0 means all lines).
    // $start: 1-based number of the first line to read (0 or 1 means the first line).
    public function get_data($length = 0, $start = 0) {
        if (!$this->_open_file()) {
            return false;
        }
        $length = $length ? $length : $this->get_lines();
        $start = max(0, $start - 1); // convert to the zero-based index seek() expects
        $data = array();
        // Let current() return parsed CSV rows. Calling fgetcsv() right after
        // seek() can be off by one line on some older PHP versions.
        $this->spl_object->setFlags(SplFileObject::READ_CSV);
        $this->spl_object->seek($start);
        while ($length-- > 0 && $this->spl_object->valid()) {
            $data[] = $this->spl_object->current();
            $this->spl_object->next();
        }
        return $data;
    }

    public function get_lines() {
        if (!$this->_open_file()) {
            return false;
        }
        $this->spl_object->setFlags(0); // count raw lines without CSV parsing
        // The size in bytes is never smaller than the line count, so this seeks past the last line.
        $this->spl_object->seek(filesize($this->csv_file));
        return $this->spl_object->key();
    }

    public function get_error() {
        return $this->error;
    }
}
The class is called as follows:
include('CsvReader.class.php');
$csv_file = 'path/bigfile.csv';
$csvreader = new CsvReader($csv_file);
$line_number = $csvreader->get_lines(); // total number of lines
$data = $csvreader->get_data(10); // first 10 lines
echo $line_number, chr(10);
print_r($data);
In fact, the CsvReader class above is not limited to large CSV files. It can also be used for large or very large text files of other types, provided that each line is returned as a raw string through current() instead of being parsed as CSV.
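The adaptation for plain text files could be sketched as below. This is a minimal standalone example, not a change to the class itself: the helper name read_text_lines and the sample data are hypothetical, and the SplFileObject::DROP_NEW_LINE flag is an optional choice made here to strip line breaks.

```php
<?php
// Hypothetical helper: returns up to $num raw lines, starting at the
// zero-based line $start, without any CSV parsing.
function read_text_lines(string $path, int $start, int $num): array
{
    $file = new SplFileObject($path, 'rb');
    $file->setFlags(SplFileObject::DROP_NEW_LINE); // strip the trailing line break
    $file->seek($start);
    $lines = [];
    while ($num-- > 0 && $file->valid()) {
        $lines[] = $file->current(); // raw string for the current line
        $file->next();
    }
    return $lines;
}

// Small temporary sample file (assumed data).
$sample = tempnam(sys_get_temp_dir(), 'txt');
file_put_contents($sample, "alpha\nbeta\ngamma\n");

$lines = read_text_lines($sample, 1, 2);
print_r($lines);
unlink($sample);
```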