How Can I Efficiently Extract the Last 10 Lines from a 10GB Text File in C#?

Susan Sarandon

Published: 2024-12-30 06:28:11



Getting the Last 10 Lines of a Large Text File (10GB or More): An Efficient C# Approach

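With a text file of 10GB or more, extracting the last few lines by reading from the start is slow, because the whole file has to be scanned first. Here is how to do it efficiently in C# by reading backwards from the end of the file.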

Code implementation:

This generalized approach lets you specify the number of tokens to extract (numberOfTokens), the file path (path), the encoding (encoding), and the token separator (tokenSeparator):

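// Requires: using System; using System.IO; using System.Text;
public static string ReadEndTokens(string path, Int64 numberOfTokens, Encoding encoding, string tokenSeparator) {

    // Byte width of one character. This assumes a fixed-width encoding, or a UTF-8
    // file whose separator is ASCII (for example "\n").
    int sizeOfChar = encoding.GetByteCount("\n");
    byte[] buffer = encoding.GetBytes(tokenSeparator);

    // Open read-only; the two-argument constructor would request read/write access.
    using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read)) {
        Int64 tokenCount = 0;

        // Scan backwards from the end of the file. position is a byte offset from the
        // end, so it starts at the separator's length and may run up to fs.Length.
        for (Int64 position = buffer.Length; position <= fs.Length; position += sizeOfChar) {
            fs.Seek(-position, SeekOrigin.End);
            fs.Read(buffer, 0, buffer.Length);

            if (encoding.GetString(buffer) == tokenSeparator) {
                tokenCount++;
                // A separator at the very end of the file is counted too, so pass
                // numberOfTokens + 1 when the file ends with a trailing separator.
                if (tokenCount == numberOfTokens) {
                    // The stream is now positioned just after the separator.
                    byte[] returnBuffer = new byte[fs.Length - fs.Position];
                    fs.Read(returnBuffer, 0, returnBuffer.Length);
                    return encoding.GetString(returnBuffer);
                }
            }
        }

        // Fewer separators than requested: return the whole file.
        // Note: this allocates the entire file, so it only works below roughly 2GB.
        fs.Seek(0, SeekOrigin.Begin);
        buffer = new byte[fs.Length];
        fs.Read(buffer, 0, buffer.Length);
        return encoding.GetString(buffer);
    }
}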
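
The method above issues one Seek and one Read per character, which can add up when lines are long. As a rough illustration only, here is a minimal sketch of a block-wise variant that reads fixed-size blocks backwards and counts newline bytes. It is not the article's method, and it assumes a UTF-8 (or ASCII) file with "\n" line endings. The names TailDemo, ReadLastLines, ReadFully, and the blockSize parameter are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class TailDemo
{
    // Hypothetical helper: returns the last lineCount lines of a UTF-8 file.
    // Assumption: lines end with "\n"; a single trailing "\n" is not counted as a line.
    public static string ReadLastLines(string path, int lineCount, int blockSize = 64 * 1024)
    {
        if (blockSize <= 0) throw new ArgumentOutOfRangeException(nameof(blockSize));
        if (lineCount <= 0) return string.Empty;

        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            long length = fs.Length;
            if (length == 0) return string.Empty;

            var block = new byte[blockSize];
            long start = 0;         // byte offset where the requested tail begins
            int seen = 0;           // newlines counted so far
            long blockEnd = length; // exclusive end of the next block to read
            bool found = false;

            while (blockEnd > 0 && !found)
            {
                int toRead = (int)Math.Min(blockSize, blockEnd);
                long blockStart = blockEnd - toRead;
                fs.Seek(blockStart, SeekOrigin.Begin);
                ReadFully(fs, block, toRead);

                // Walk the block from its last byte to its first.
                for (int i = toRead - 1; i >= 0; i--)
                {
                    if (block[i] != (byte)'\n') continue;
                    long offset = blockStart + i;
                    if (offset == length - 1) continue; // skip the trailing newline
                    if (++seen == lineCount)
                    {
                        start = offset + 1; // the tail begins right after this newline
                        found = true;
                        break;
                    }
                }
                blockEnd = blockStart;
            }

            // If fewer lines exist than requested, start stays 0 and the whole file is returned.
            fs.Seek(start, SeekOrigin.Begin);
            var tail = new byte[length - start];
            ReadFully(fs, tail, tail.Length);
            return Encoding.UTF8.GetString(tail);
        }
    }

    // Stream.Read may return fewer bytes than requested, so loop until count bytes arrive.
    private static void ReadFully(Stream s, byte[] buffer, int count)
    {
        int total = 0;
        while (total < count)
        {
            int n = s.Read(buffer, total, count - total);
            if (n == 0) throw new EndOfStreamException();
            total += n;
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            var sb = new StringBuilder();
            for (int i = 1; i <= 25; i++) sb.Append("line ").Append(i).Append('\n');
            File.WriteAllText(path, sb.ToString());

            // A tiny block size is used here only to exercise the block boundaries.
            Console.Write(ReadLastLines(path, 10, blockSize: 16));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

With the 25-line demo file, Main should print "line 16" through "line 25". Scanning raw bytes is safe for UTF-8 because the byte 0x0A never occurs inside a multi-byte sequence. A fixed-width encoding such as UTF-16 would need the character-aligned comparison used by ReadEndTokens instead.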


How it works: