This article is reprinted with the authorization of AIGC Open Community. Please contact the source for reprinting.
To learn more about AIGC, please visit: 51CTO AI.x Community
https://www.51cto.com/aigc/
Last year, the emergence of AutoGPT showed us the powerful automation capabilities of AI agents and opened up a new AI agent track. However, many problems remain to be solved in sub-task scheduling, resource allocation, and collaboration between AI agents.
So researchers at Rutgers University developed AIOS, an AI agent operating system with a large language model at its core. It can effectively mitigate resource contention as the number of AI agents grows, facilitate context switching between agents, enable concurrent agent execution, and maintain access control over agents.
Open source address: https://github.com/agiresearch/AIOS
Paper address: https://arxiv.org/abs/2403.16971
## The architecture of AIOS

The architecture of AIOS is similar to that of the PC operating systems we use every day: it is mainly divided into an application layer, a kernel layer, and a hardware layer. The key difference is that AIOS builds, inside the kernel layer, a dedicated kernel that specifically manages tasks related to large models.
The application layer mainly consists of agent applications (e.g., a travel agent, a math agent, a code agent); the kernel layer combines a traditional OS kernel with a large-model kernel. The OS kernel is mainly used for file management, while the large-model kernel handles the scheduling and management of AI agents.
The hardware layer consists of devices such as the CPU, GPU, memory, and peripherals. However, the large-model kernel cannot interact with the hardware directly; instead, it manages hardware resources indirectly through system calls provided by the kernel layer, ensuring system integrity and efficiency.
The AI agent scheduler is mainly responsible for reasonably scheduling and optimizing agent requests to the large model so as to make full use of the model's computational resources. When multiple agents send requests to the large model at the same time, the scheduler sorts the requests according to a specific scheduling algorithm, preventing a single agent from occupying the model for a long time and forcing other agents to wait.
In addition, the design of AIOS also supports more complex scheduling strategies, for example, taking dependencies between agent requests into account to achieve further optimized resource allocation.
Without a scheduler, agents must execute their tasks one by one in order, and later agents wait a long time; with the scheduling algorithm, requests from different agents can be interleaved and executed in parallel, significantly reducing overall wait time and response latency.
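The interleaving described above can be sketched with a simple round-robin policy. This is an illustrative toy, not the actual AIOS scheduler: the class and method names are my own, and abstract "work units" stand in for LLM decoding time.

```python
from collections import deque


class AgentRequest:
    """A single LLM call from an agent, measured in abstract work units."""

    def __init__(self, agent_name, work_units):
        self.agent_name = agent_name
        self.remaining = work_units


class RoundRobinScheduler:
    """Minimal sketch of an agent scheduler: every pending request gets a
    fixed time slice of the (single) large model, so no agent can
    monopolize it while the others starve."""

    def __init__(self, time_slice=1):
        self.time_slice = time_slice
        self.queue = deque()

    def submit(self, request):
        self.queue.append(request)

    def run(self):
        order = []  # which agent ran in each time slice
        while self.queue:
            req = self.queue.popleft()
            step = min(self.time_slice, req.remaining)
            req.remaining -= step
            order.append(req.agent_name)
            if req.remaining > 0:
                self.queue.append(req)  # re-queue unfinished work
        return order


scheduler = RoundRobinScheduler(time_slice=1)
scheduler.submit(AgentRequest("travel", 2))
scheduler.submit(AgentRequest("math", 2))
print(scheduler.run())  # interleaved: ['travel', 'math', 'travel', 'math']
```

With first-come-first-served instead, the output would be `['travel', 'travel', 'math', 'math']`: the math agent waits for the travel agent to fully finish, which is exactly the latency problem the scheduler avoids.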
Since large-model generation generally uses a heuristic search such as beam search, the search tree is built up gradually and different paths are evaluated before the final result is produced.
However, if the large model is interrupted by the scheduler during generation, then to avoid losing all intermediate states and wasting the computation already done, the context manager saves a snapshot of the current beam search tree state (including the probability of each path).
When the large model regains execution resources, the context manager can accurately restore the previous beam search state from the point of interruption and continue generating the remaining parts, ensuring the completeness and accuracy of the final result.
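The snapshot/restore behavior can be sketched as follows. All names here are illustrative assumptions, not the actual AIOS API; the beam is represented as a list of (token sequence, cumulative log-probability) pairs.

```python
import copy
import heapq


class BeamSearchContext:
    """Sketch of a context manager that checkpoints beam search state
    when the scheduler preempts the model and restores it on resume."""

    def __init__(self, beam_width):
        self.beam_width = beam_width
        self.beam = [([], 0.0)]  # (token sequence, cumulative log-prob)
        self._snapshot = None

    def step(self, candidates):
        """Expand every beam entry with scored candidate tokens, keep top-k."""
        expanded = [(seq + [tok], score + logp)
                    for seq, score in self.beam
                    for tok, logp in candidates]
        self.beam = heapq.nlargest(self.beam_width, expanded,
                                   key=lambda entry: entry[1])

    def snapshot(self):
        """Called when the scheduler suspends this agent's generation."""
        self._snapshot = copy.deepcopy(self.beam)

    def restore(self):
        """Called when the agent regains the model; resume mid-search."""
        self.beam = copy.deepcopy(self._snapshot)


ctx = BeamSearchContext(beam_width=2)
ctx.step([("sunny", -0.1), ("rainy", -0.9), ("foggy", -2.0)])
ctx.snapshot()          # preempted here: paths and probabilities saved
ctx.restore()           # resumed later: search continues where it left off
ctx.step([("today", -0.2)])
```

The key point is that the snapshot preserves the per-path probabilities, so resuming is mathematically identical to never having been interrupted.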
In addition, most large models have context-length limits, and the input context in real scenarios often exceeds them. To solve this problem, the context manager integrates functions such as text summarization, which can compress or chunk long contexts so that large models can efficiently understand and process long-context information.
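One common way to implement this, sketched below under my own assumptions (this is not AIOS's actual logic): when the prompt exceeds the model's limit, compress the oldest messages into a summary and keep the most recent ones verbatim. The `summarize` parameter stands in for an LLM summarization call, and token counting is naive word counting.

```python
def fit_context(messages, max_tokens, summarize):
    """Compress a message history into the model's context budget:
    recent messages are kept verbatim, older ones are summarized."""

    def tokens(text):
        return len(text.split())  # crude stand-in for a real tokenizer

    if sum(tokens(m) for m in messages) <= max_tokens:
        return messages  # already fits, nothing to do

    # Keep the newest messages that fit within half of the budget,
    # reserving the other half for the summary of everything older.
    kept, budget = [], max_tokens // 2
    for message in reversed(messages):
        if tokens(message) > budget:
            break
        kept.insert(0, message)
        budget -= tokens(message)

    older = messages[:len(messages) - len(kept)]
    return [summarize(" ".join(older))] + kept
```

The half-and-half budget split is an arbitrary illustrative choice; a production system would tune it and use the model's real tokenizer.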
The memory manager is mainly responsible for managing short-term memory resources, providing each AI agent with efficient temporary storage for interaction logs and intermediate data.
While an AI agent is waiting to execute or is running, the data it needs is kept in a memory block allocated by the memory manager. Once the agent's task finishes, the corresponding memory block is reclaimed by the system, ensuring efficient use of memory resources.
AIOS allocates independent memory to each AI agent and uses an access manager to isolate memory between different agents. In the future, AIOS will introduce more sophisticated memory-sharing mechanisms and hierarchical caching strategies to further optimize the overall performance of AI agents.
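The allocate/isolate/reclaim lifecycle described above can be sketched like this. This is a minimal illustration under my own naming, not the AIOS memory manager itself: each agent gets an isolated block, access is checked against the owning agent's ID, and the block is reclaimed when the agent exits.

```python
class MemoryManager:
    """Sketch of per-agent memory blocks with access control."""

    def __init__(self):
        self.blocks = {}  # agent_id -> dict of key/value entries

    def allocate(self, agent_id):
        """Give a newly started agent its own isolated memory block."""
        self.blocks[agent_id] = {}

    def _check(self, caller_id, owner_id):
        # Isolation: only the owning agent may touch its block.
        if caller_id != owner_id:
            raise PermissionError(
                f"agent {caller_id!r} may not access {owner_id!r}'s memory")

    def write(self, caller_id, owner_id, key, value):
        self._check(caller_id, owner_id)
        self.blocks[owner_id][key] = value

    def read(self, caller_id, owner_id, key):
        self._check(caller_id, owner_id)
        return self.blocks[owner_id][key]

    def release(self, agent_id):
        """Reclaim the block once the agent's task has finished."""
        self.blocks.pop(agent_id, None)


mm = MemoryManager()
mm.allocate("travel_agent")
mm.write("travel_agent", "travel_agent", "log", "booked flight")
print(mm.read("travel_agent", "travel_agent", "log"))  # booked flight
mm.release("travel_agent")  # memory reclaimed when the agent exits
```

A future memory-sharing mechanism, as the article anticipates, would relax the `_check` rule with explicit grants rather than removing isolation entirely.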
The above is the detailed content of "Open source large-model AI agent operating system: control AI agents like Windows". For more information, please follow other related articles on the PHP Chinese website!