Getting the Last 10 Lines of a Massive Text File (Over 10GB): An Efficient C# Approach
When a text file exceeds 10GB, reading it from the beginning just to reach the last few lines is slow and wasteful. A more efficient approach in C# is to seek to the end of the file and scan backwards:
Code Implementation:
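This generalized method returns the trailing part of the file at path (path), decoded with the given encoding (encoding), starting after the numberOfTokens-th occurrence of the token separator (tokenSeparator) counted from the end. To read lines, pass the file's line ending (for example "\n") as tokenSeparator: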
    // Requires: using System; using System.IO; using System.Text;
    public static string ReadEndTokens(string path, Int64 numberOfTokens, Encoding encoding, string tokenSeparator)
    {
        // Bytes per character; this only holds for fixed-width encodings.
        int sizeOfChar = encoding.GetByteCount("\n");
        byte[] buffer = encoding.GetBytes(tokenSeparator);

        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            Int64 tokenCount = 0;

            // Walk backwards from the end of the file, one character at a time.
            // Starting at buffer.Length lets every read fill the whole buffer.
            for (Int64 position = buffer.Length; position <= fs.Length; position += sizeOfChar)
            {
                fs.Seek(-position, SeekOrigin.End);
                fs.Read(buffer, 0, buffer.Length);

                if (encoding.GetString(buffer) == tokenSeparator)
                {
                    tokenCount++;
                    if (tokenCount == numberOfTokens)
                    {
                        // The stream now sits just past the separator; return the rest of the file.
                        byte[] returnBuffer = new byte[fs.Length - fs.Position];
                        fs.Read(returnBuffer, 0, returnBuffer.Length);
                        return encoding.GetString(returnBuffer);
                    }
                }
            }

            // Fewer than numberOfTokens separators were found: return the whole file.
            // This loads the entire file into memory, so it is only practical for small files.
            fs.Seek(0, SeekOrigin.Begin);
            buffer = new byte[fs.Length];
            fs.Read(buffer, 0, buffer.Length);
            return encoding.GetString(buffer);
        }
    }
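ReadEndTokens derives its step size from encoding.GetByteCount("\n"), which implicitly assumes that every character in the file occupies the same number of bytes. The short program below is only an illustration of that assumption (the class name CharSizeDemo is made up for this sketch); it prints the byte count of a newline in several common encodings and shows that UTF-8 is not fixed-width once non-ASCII characters appear.

```csharp
using System;
using System.Text;

public static class CharSizeDemo
{
    public static void Main()
    {
        // Byte count of "\n" in each encoding: this is the value ReadEndTokens uses as its step.
        Console.WriteLine(Encoding.ASCII.GetByteCount("\n"));    // 1
        Console.WriteLine(Encoding.UTF8.GetByteCount("\n"));     // 1
        Console.WriteLine(Encoding.Unicode.GetByteCount("\n"));  // 2 (UTF-16)
        Console.WriteLine(Encoding.UTF32.GetByteCount("\n"));    // 4

        // UTF-8 is variable-width: U+00E9 takes 2 bytes although "\n" takes 1.
        Console.WriteLine(Encoding.UTF8.GetByteCount("\u00E9")); // 2
    }
}
```

For UTF-8 the one-byte step still finds "\n" reliably, because the byte 0x0A never occurs inside a multi-byte UTF-8 sequence.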
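How It Works:
1. The method computes the size in bytes of one character in the given encoding (sizeOfChar) and encodes the separator into a byte buffer of the same length.
2. It opens the file and seeks backwards from the end, moving one character per step.
3. At each position it reads as many bytes as the separator occupies and compares the decoded result with tokenSeparator.
4. Every match increments tokenCount. When tokenCount reaches numberOfTokens, everything between the current stream position and the end of the file is read, decoded, and returned.
5. If the file contains fewer separators than requested, the whole file is returned.
Note that a separator at the very end of the file counts as a match, so for a file that ends with a newline you may need to request one more token than the number of lines you want.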
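One possible variation, not taken from the original answer, is to read the file backwards in fixed-size blocks instead of seeking once per character, which reduces the number of seek and read calls. The sketch below is a hypothetical helper (the names TailSketch, ReadLastLines, FillBuffer, and blockSize are assumptions made for illustration). It assumes "\n" line endings and an encoding in which the byte 0x0A only ever represents a newline, such as ASCII or UTF-8; it is not suitable for UTF-16 or UTF-32.

```csharp
using System;
using System.IO;
using System.Text;

public static class TailSketch
{
    // Hypothetical helper: returns the last lineCount lines of the file at path.
    // Assumes '\n' line endings and an ASCII-compatible encoding such as UTF-8.
    public static string ReadLastLines(string path, int lineCount, Encoding encoding, int blockSize = 64 * 1024)
    {
        if (lineCount <= 0) return string.Empty;

        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            long position = fs.Length; // bytes at or after this offset have been scanned
            long start = 0;            // offset where the returned text begins
            int newlines = 0;
            bool found = false;
            var block = new byte[blockSize];

            while (position > 0 && !found)
            {
                int toRead = (int)Math.Min(blockSize, position);
                position -= toRead;
                fs.Seek(position, SeekOrigin.Begin);
                FillBuffer(fs, block, toRead);

                // Scan the block from its last byte to its first.
                for (int i = toRead - 1; i >= 0; i--)
                {
                    if (block[i] != (byte)'\n') continue;
                    // A newline that is the final byte only terminates the last line.
                    if (position + i == fs.Length - 1) continue;
                    if (++newlines == lineCount)
                    {
                        start = position + i + 1;
                        found = true;
                        break;
                    }
                }
            }

            // If fewer lines exist than requested, start stays 0 and the whole file is read.
            fs.Seek(start, SeekOrigin.Begin);
            var tail = new byte[fs.Length - start];
            FillBuffer(fs, tail, tail.Length);
            return encoding.GetString(tail);
        }
    }

    // Stream.Read may return fewer bytes than requested, so loop until count bytes arrive.
    private static void FillBuffer(Stream stream, byte[] buffer, int count)
    {
        int offset = 0;
        while (offset < count)
        {
            int n = stream.Read(buffer, offset, count - offset);
            if (n == 0) throw new EndOfStreamException();
            offset += n;
        }
    }
}
```

A call such as TailSketch.ReadLastLines(path, 10, Encoding.UTF8) would return the last 10 lines. Unlike ReadEndTokens, this sketch ignores a trailing newline when counting, so the requested count maps directly to lines. As with the fallback in ReadEndTokens, a file containing fewer lines than requested is read into memory in full, which is impractical for very large files.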
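Because the method seeks from the end of the file and reads only the bytes near the tail, its cost depends on the length of the requested lines rather than on the total file size, as long as the file contains at least the requested number of tokens.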