Why is downloading to memory slower than downloading to the file system from AWS S3?

王林
Release: 2024-02-09 11:57:29

Downloading an object from an AWS S3 bucket into memory can, surprisingly, be slower than downloading it to the local file system. The in-memory path avoids disk I/O, but it has costs of its own: if the destination buffer starts out empty, it has to be reallocated and its contents copied again and again as data arrives, and for large objects this overhead can exceed the cost of writing to a file. Large objects also increase memory pressure. The question and answer below walk through a concrete case with the AWS SDK for Go.

Question content

I am using the AWS SDK for Go to download objects from a bucket. Below are two implementations of the download:

  1. Download to file
func (a *AwsClient) DownloadToFile(ctx context.Context, objectKey string) (string, error) {
    params := &awsS3.GetObjectInput{
        Bucket: aws.String(a.bucket),
        Key:    aws.String(objectKey),
    }

    downloadPath := "some/valid/path"
    f, err := os.Create(downloadPath)
    if err != nil {
        return "", err
    }
    defer f.Close()

    // The file is the io.WriterAt destination for the downloaded data.
    _, err = a.downloader.Download(ctx, f, params)
    return downloadPath, err
}
  2. Download to memory
func (a *AwsClient) DownloadToMemory(ctx context.Context, objectKey string) ([]byte, error) {
    params := &awsS3.GetObjectInput{
        Bucket: aws.String(a.bucket),
        Key:    aws.String(objectKey),
    }

    // The buffer starts from an empty slice, so it has to grow as data arrives.
    buffer := manager.NewWriteAtBuffer([]byte{})
    _, err := a.downloader.Download(ctx, buffer, params)
    return buffer.Bytes(), err
}

For a 100 MB file, downloading to memory takes 30 seconds, while downloading to the file system takes only 8 seconds. I expected the in-memory download to be much faster. My system (Apple M1, macOS Ventura, 8 GB RAM) has enough free RAM, so that should not be the issue. Can anyone help me understand this behavior?

Solution

Downloading a large S3 object into a dynamically growing buffer is very inefficient. Starting from an empty slice, the buffer is reallocated many times while the 100 MB of data arrives from multiple concurrent download workers, and each reallocation copies everything written so far, which costs a lot of CPU time.
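To get a feel for the cost of repeated reallocation, here is a minimal sketch that uses only the standard library. The `growBuffer` type and the `fill` helper are hypothetical stand-ins written for this illustration, not the SDK's own code, and the 8 MiB total and 32 KiB write size are arbitrary assumptions chosen so the program finishes quickly.

```go
package main

import (
	"fmt"
	"time"
)

// growBuffer is a hypothetical, simplified write-at buffer that grows to
// exactly the size each write needs. It is not the SDK's implementation.
type growBuffer struct {
	buf    []byte
	allocs int // number of times the backing array was replaced
}

// WriteAt copies p into the buffer at offset pos, growing the buffer first
// when the write would run past its current capacity.
func (b *growBuffer) WriteAt(p []byte, pos int64) (int, error) {
	end := pos + int64(len(p))
	if int64(len(b.buf)) < end {
		if int64(cap(b.buf)) < end {
			// Allocate exactly what is needed and copy everything written so far.
			grown := make([]byte, end)
			copy(grown, b.buf)
			b.buf = grown
			b.allocs++
		} else {
			b.buf = b.buf[:end]
		}
	}
	copy(b.buf[pos:], p)
	return len(p), nil
}

// fill writes total bytes into b as sequential chunk-sized writes and
// returns how many reallocations the buffer performed.
func fill(b *growBuffer, total, chunk int) int {
	data := make([]byte, chunk)
	for off := 0; off < total; off += chunk {
		n := chunk
		if total-off < n {
			n = total - off
		}
		b.WriteAt(data[:n], int64(off))
	}
	return b.allocs
}

func main() {
	const total = 8 << 20  // 8 MiB, an arbitrary size for the demo
	const chunk = 32 << 10 // 32 KiB per write, also an assumption

	start := time.Now()
	grown := fill(&growBuffer{}, total, chunk)
	fmt.Printf("empty start:  %d reallocations, %v\n", grown, time.Since(start))

	start = time.Now()
	pre := fill(&growBuffer{buf: make([]byte, total)}, total, chunk)
	fmt.Printf("preallocated: %d reallocations, %v\n", pre, time.Since(start))
}
```

With an empty starting slice, every write in this sketch triggers a reallocation and a copy of all earlier data, so the total copying grows roughly quadratically with the object size. The preallocated buffer performs none. Exact timings depend on the machine.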

Try allocating the full 100 MB up front instead of starting from an empty byte slice.

If the object size is not known in advance, you can call S3 HeadObject to get the object's length before downloading.
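Putting the two suggestions together might look like the sketch below, which assumes the AWS SDK for Go v2 (the `manager` package used in the question). The helper name `downloadPreallocated` and the bucket and key values are hypothetical placeholders. The sketch also assumes `ContentLength` is a `*int64`, as in recent SDK releases; on older releases where it is a plain `int64`, use the value directly.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/feature/s3/manager"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// downloadPreallocated is a hypothetical helper: it looks up the object's
// size with HeadObject, allocates the whole buffer once, then downloads.
func downloadPreallocated(ctx context.Context, client *s3.Client, bucket, key string) ([]byte, error) {
	head, err := client.HeadObject(ctx, &s3.HeadObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(key),
	})
	if err != nil {
		return nil, fmt.Errorf("head object: %w", err)
	}

	// Assumption: ContentLength is *int64 in the SDK version in use.
	size := aws.ToInt64(head.ContentLength)

	// A slice that is already full length never needs to grow during the download.
	buf := manager.NewWriteAtBuffer(make([]byte, size))

	downloader := manager.NewDownloader(client)
	if _, err := downloader.Download(ctx, buf, &s3.GetObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(key),
	}); err != nil {
		return nil, fmt.Errorf("download: %w", err)
	}
	return buf.Bytes(), nil
}

func main() {
	ctx := context.Background()

	// Credentials and region come from the default configuration chain.
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}

	// "my-bucket" and "my-object-key" are placeholders, not real names.
	data, err := downloadPreallocated(ctx, s3.NewFromConfig(cfg), "my-bucket", "my-object-key")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("downloaded %d bytes\n", len(data))
}
```

The extra HeadObject call costs one additional request per download, which is usually small compared with the time saved on a large object. Whether it pays off for small objects is worth measuring.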

Source: stackoverflow.com