Table of Contents
Question content
Workaround
What if I don't know how much memory I need?
Recap
Premature optimization is the root of all evil

Release memory from large object

Feb 09, 2024 09:03 AM
Memory usage


PHP editor Yuzi introduces a technique for optimizing memory usage: releasing the memory held by large objects. During development we often create large objects, such as big arrays or large database query results, and these take up substantial memory resources. Releasing that memory promptly once we are done with such objects is a good programming habit. This article shows how to free the memory held by large objects to improve application performance and efficiency.

Question content

I encountered something that I don’t understand. Hope you all can help!

Resources:

  1. https://medium.com/@chaewonkong/solving-memory-leak-issues-in-go-http-clients-ba0b04574a83
  2. https://www.golinuxcloud.com/golang-garbage-collector/

I read in a few articles that we can make the gc's job easier by setting large slices and maps (I guess this applies to all reference types) to nil after we no longer need them. Here's one of the examples I've read:

import (
    "io/ioutil"
    "net/http"
)

func ProcessResponse(resp *http.Response) error {
    data, err := ioutil.ReadAll(resp.Body)
    if err != nil {
        return err
    }
    // Process data here (elided in the original example)
    _ = data // stand-in for the elided processing, so the example compiles

    data = nil // Release memory (this is the claim in question)
    return nil
}

From what I understand, when the function ProcessResponse completes, the data variable goes out of scope and basically ceases to exist. The gc will then verify that there are no references left to the []byte slice (the one data pointed to) and will clear the memory.

How does setting data to nil improve garbage collection?

Thanks!

Workaround

As others have already pointed out: setting data = nil right before returning changes nothing as far as the gc is concerned. The go compiler applies optimizations, and golang's garbage collector works in distinct phases. In the simplest terms (with many omissions and oversimplifications): setting data = nil, or removing all references to the underlying slice, does not trigger an atomic-style release of the memory that is no longer referenced. Once a slice is no longer referenced, it is marked as such, and the associated memory is not released until the next sweep.
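To make that timing visible, here is a minimal sketch of my own (not from the original answer) that samples runtime.MemStats around the nil assignment. The exact numbers depend on the Go version and GC pacing, but HeapAlloc typically only drops once a collection actually runs:

package main

import (
    "fmt"
    "runtime"
)

func heapAllocMiB() uint64 {
    var m runtime.MemStats
    runtime.ReadMemStats(&m)
    return m.HeapAlloc >> 20
}

func main() {
    data := make([]byte, 64<<20) // allocate 64 MiB
    data[0] = 1                  // touch it so the allocation is real

    fmt.Println("after alloc:", heapAllocMiB(), "MiB")

    data = nil // drops the reference; by itself this frees nothing
    fmt.Println("after nil:  ", heapAllocMiB(), "MiB")

    runtime.GC() // memory is reclaimed only when a collection runs
    fmt.Println("after GC:   ", heapAllocMiB(), "MiB")
}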

Garbage collection is a hard problem, largely because it is not the kind of problem that has an optimal solution producing the best results for all use cases. The go runtime has evolved a lot over the years, and significant work has gone into the runtime garbage collector. The upshot is that only in rare cases will a simple somevar = nil make even a small difference, let alone a noticeable one.

If you're looking for simple rule-of-thumb type hints that can affect the runtime overhead associated with garbage collection (or runtime memory management in general), I do know of one that this sentence from your question seems to vaguely cover:

we can make the gc's job easier by setting large slices and maps to nil

This is something that can produce quantifiable results when profiling code. Say you're reading a large amount of data that needs to be processed, or you have to perform some other kind of batch operation and return a slice; it's not uncommon to see people write something like this:

func processStuff(input []someType) []resultType {
    data := []resultType{}
    for _, in := range input {
        data = append(data, processT(in))
    }
    return data
}

This can easily be optimized by changing the code to:

func processStuff(input []someType) []resultType {
    data := make([]resultType, 0, len(input)) // set the cap up front
    for _, in := range input {
        data = append(data, processT(in))
    }
    return data
}

What happens in the first implementation is that you create a slice with a len and cap of 0. The first time you call append, you exceed the slice's current capacity, which causes the runtime to allocate memory. As explained here, the new capacity is calculated quite simply, the memory is allocated, and the data is copied over:

t := make([]byte, len(s), (cap(s)+1)*2)
copy(t, s)

Essentially, every time you call append when the slice is full (i.e. len == cap), you allocate a new slice that can hold (len + 1) * 2 elements. Knowing that data in the first example starts out with len and cap == 0, let's trace what this means (a runnable sketch follows the walkthrough):

1st iteration: append creates a slice with cap (0+1)*2 = 2; data now has len 1, cap 2
2nd iteration: append adds to data, which now has len 2, cap 2
3rd iteration: append allocates a new slice with cap (2+1)*2 = 6, copies the 2 elements from data into it and appends the third; data is now reassigned to a slice with len 3, cap 6
4th-6th iterations: data grows to len 6, cap 6
7th iteration: same as the 3rd iteration, except cap is now (6+1)*2 = 14; everything is copied over, and data is reassigned a slice with len 7, cap 14
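You can watch this happening with a tiny sketch of my own (not part of the original answer). Note that the exact growth factors are an implementation detail that varies between Go versions, so the printed capacities may not match the (len + 1) * 2 walkthrough above exactly; the repeated allocate-and-copy pattern is the point:

package main

import "fmt"

func main() {
    var data []int
    prevCap := cap(data)
    for i := 0; i < 16; i++ {
        data = append(data, i)
        if cap(data) != prevCap {
            // each capacity change means a fresh allocation plus a copy
            fmt.Printf("len=%2d: cap grew %2d -> %2d\n", len(data), prevCap, cap(data))
            prevCap = cap(data)
        }
    }
}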

If the data structures in the slice are large (i.e. many nested structs, lots of indirection, etc.), then this frequent reallocation and copying can become quite expensive. If your code contains a lot of loops like this, it will start to show up in pprof (you'll start seeing a lot of time spent in runtime.mallocgc). Additionally, if you were processing 15 input values, your data slice would end up looking like this:

dataslice {
    len: 15
    cap: 30
    data underlying_array[30]
}

This means you end up allocating memory for 30 values when you only needed 15, and you allocate that memory in 4 progressively larger chunks, copying the data along on each reallocation.

By contrast, the second implementation allocates a data slice like this before the loop starts:

data {
    len: 0
    cap: 15
    data underlying_array[15]
}

It is allocated in one go, so no reallocations or copies are needed, and the returned slice takes up half the memory. In that sense, we allocate a bigger chunk of memory up front to cut down on the number of incremental allocation and copy calls needed later, which lowers the overall runtime cost.
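A quick way to quantify the difference is a benchmark. This is a sketch of my own (the result type and process function are made-up stand-ins, not from the original answer); run it with go test -bench=. -benchmem and compare the allocs/op column between the two:

package stuff

import "testing"

type result struct{ a, b, c int64 } // hypothetical element type

func process(i int) result { return result{int64(i), 0, 0} }

func BenchmarkAppendNoCap(b *testing.B) {
    for n := 0; n < b.N; n++ {
        var data []result
        for i := 0; i < 1000; i++ {
            data = append(data, process(i)) // grows and copies repeatedly
        }
    }
}

func BenchmarkAppendWithCap(b *testing.B) {
    for n := 0; n < b.N; n++ {
        data := make([]result, 0, 1000) // single allocation up front
        for i := 0; i < 1000; i++ {
            data = append(data, process(i))
        }
    }
}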

What if I don't know how much memory I need?

That's a fair question. This example doesn't always apply. In this case we knew how many elements we'd need and could allocate memory accordingly. Sometimes the world doesn't work that way. If you don't know how much data you'll end up with, then you can:

  1. Make an educated guess: gc is hard, and unlike you, the compiler and the go runtime lack fuzzy logic; somebody has to come up with a realistic, reasonable guess. Sometimes it's as simple as: "well, I'm pulling data from that data source, and we only ever store the last n elements, so worst case I'll be handling n elements". Sometimes it's a bit fuzzier, for example: you're processing a csv containing a sku, a product name, and a stock count. You know the length of the sku, you can assume the stock count is an integer between 1 and 5 digits long, and the product name averages 2-3 words. English words average 6 characters, so you have a rough idea of how many bytes make up a csv line: say the sku == 10 characters, 80 bytes, the product description 2.5 * 6 * 8 = 120 bytes, ~4 bytes for the stock count, plus 2 commas and a line break, for an average expected line length of 207 bytes; call it 200 to err on the side of caution. Stat the input file, divide its size in bytes by 200, and you have a usable, slightly conservative estimate of the number of lines (see the sketch after this list). Add some logging at the end of that code comparing the cap to the estimate, and you can tune your prediction accordingly.
  2. Profile your code. Sometimes you find yourself working on a new feature or an entirely new project, and you don't have historical data to fall back on for a guesstimate. In that case you can simply guess, run some test scenarios, or spin up a test environment feeding your version of the code production data, and profile it. When you're actively profiling the memory usage/runtime cost of just one or two slices/maps, I must stress that this is optimization. You should only spend time on this if it's a bottleneck or a demonstrable problem (e.g. overall profiling is hampered by runtime memory allocation). In the vast majority of cases, this level of optimization falls firmly under the umbrella of micro-optimization. Stick to the 80-20 principle.
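Here is a minimal sketch of the stat-and-divide estimate described in point 1 (the file name, the 200-byte average, and the helper function are hypothetical, not from the original answer):

package main

import (
    "fmt"
    "os"
)

// estimateLines guesses the row count of a file from its size,
// given an assumed average line length in bytes.
func estimateLines(path string, avgLineBytes int64) (int64, error) {
    info, err := os.Stat(path)
    if err != nil {
        return 0, err
    }
    return info.Size()/avgLineBytes + 1, nil
}

func main() {
    // "inventory.csv" is a made-up input file; 200 is the estimate from above.
    n, err := estimateLines("inventory.csv", 200)
    if err != nil {
        fmt.Println("stat failed:", err)
        return
    }
    rows := make([][]string, 0, n) // preallocate using the estimate
    fmt.Println("estimated rows:", n, "cap:", cap(rows))
}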

Recap

No: setting a simple slice variable to nil won't make much of a difference in 99% of cases. When creating and appending to maps/slices, what's more likely to make a difference is cutting down on extraneous allocations by using make() with a sensible cap value. Another thing that can make a difference is using pointer types/receivers, although that's a more complex topic to dig into. For now, I'll just say that I've been working on a code base that has to operate on numbers far beyond the range of a typical uint64, and that unfortunately also has to handle decimals with more precision than float64 allows. We solved the uint64 problem by using something like holiman/uint256, which uses pointer receivers, and the decimal problem with shopspring/decimal, which uses value receivers and copies everything. After spending a lot of time optimizing that code, we've reached the point where the performance impact of constantly copying values when working with decimals has become a problem. Look at how these packages implement simple operations like addition, and try to work out which operation is more costly:

// original
a, b := 1, 2
a += b
// uint256 version (pointer receivers, mutates in place)
a, b := uint256.NewInt(1), uint256.NewInt(2)
a.Add(a, b)
// decimal version (value receivers, copies on every operation)
a, b := decimal.NewFromInt(1), decimal.NewFromInt(2)
a = a.Add(b)
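To illustrate why the receiver type matters, here is a contrived sketch of my own (the Big type and its methods are made up; they are not from uint256 or decimal): with a value receiver every call copies the whole struct, while a pointer receiver only passes an address.

package main

import "fmt"

type Big struct{ buf [4096]byte }

// value receiver: the entire 4 KiB struct is copied on every call
func (b Big) SumByValue() int {
    s := 0
    for _, v := range b.buf {
        s += int(v)
    }
    return s
}

// pointer receiver: only an 8-byte pointer is passed
func (b *Big) SumByPointer() int {
    s := 0
    for _, v := range b.buf {
        s += int(v)
    }
    return s
}

func main() {
    var b Big
    fmt.Println(b.SumByValue(), b.SumByPointer())
}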

These are just a couple of things I happened to spend time optimizing in recent work, but the most important takeaway from all of it is this:

Premature optimization is the root of all evil

Only once you're dealing with more complex problems/code does it become worth the significant effort of examining the allocation cycles of slices or maps as potential bottlenecks and optimizations. You can, and arguably should, take measures to avoid being too wasteful (e.g. set a cap on a slice if you know its eventual length), but you shouldn't waste too much time hand-crafting every line until the memory footprint of the code is as small as it can possibly be. The cost would be: code that's more fragile and harder to maintain and read, likely worse overall performance (seriously, you can trust the go runtime to do a decent job), a lot of blood, sweat, and tears, and a steep drop in productivity.

