CRAM: A New Chip Design That Could Reduce the Power Consumption of AI Protocols by Orders of Magnitude


WBOY

Aug 12, 2024, 09:03 PM


Artificial intelligence (AI) continues to drive the fourth industrial revolution, and its energy demands are growing along with it. Today, anyone can access advanced AI tools and integrate them into their systems to improve efficiency and reduce workload. As demand for AI applications grows, so does the energy required to run the underlying algorithms, and environmentalists are already raising sustainability concerns about the technology. A team of researchers has now demonstrated a far more energy-efficient alternative. Here's what you need to know.

Growing AI Energy Demands Creating an Energy Crisis

New AI systems are launching at an increasing pace. The International Energy Agency's global energy use forecast predicts that AI energy consumption will roughly double, from 460 terawatt-hours (TWh) in 2022 to 1,000 TWh by 2026. The applications behind this demand include recommendation systems, large language models (LLMs), image and video processing and generation, Web3 services, and more.
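To put the forecast in perspective, the short sketch below works out the growth factor and the constant yearly growth rate implied by the two figures quoted above. The helper names are illustrative, and the assumption of steady compound growth between 2022 and 2026 is a simplification, not something the forecast itself states.

```python
# Figures quoted in the article, in terawatt-hours (TWh) per year.
START_TWH = 460.0    # 2022
END_TWH = 1000.0     # 2026 (forecast)
YEARS = 2026 - 2022


def growth_factor(start, end):
    """How many times larger `end` is than `start`."""
    return end / start


def implied_annual_growth(start, end, years):
    """Constant yearly growth rate that turns `start` into `end` over `years`.

    Assumes smooth compound growth, which is a simplification.
    """
    return (end / start) ** (1.0 / years) - 1.0


factor = growth_factor(START_TWH, END_TWH)
rate = implied_annual_growth(START_TWH, END_TWH, YEARS)

print(f"growth factor: {factor:.2f}x")          # growth factor: 2.17x
print(f"implied annual growth: {rate:.1%}")     # implied annual growth: 21.4%
```

Under that assumption, consumption would have to grow by a little over a fifth every year to reach the forecast level.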

According to the researchers' study, moving data can consume as much as “200 times the energy used for computation when reading three 64-bit source operands from and writing one 64-bit destination operand to an off-chip main memory.” Reducing the energy consumed by AI computing is therefore a prime concern for developers, who will need to overcome this roadblock before the technology can mature and reach large-scale adoption.
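The quoted 200x figure implies that almost all of the energy in such an operation goes to moving data rather than computing on it. The sketch below makes that arithmetic explicit. It is an idealized model: it assumes the ratio applies uniformly to every operation and that compute energy stays the same if the transfers are removed. The function names are illustrative.

```python
# Ratio quoted in the article: off-chip data transfer costs about
# 200 times the energy of the computation itself.
TRANSFER_TO_COMPUTE_RATIO = 200.0


def data_movement_share(ratio):
    """Fraction of total energy spent on data movement.

    With compute energy normalized to 1, transfer energy is `ratio`,
    so the total is `ratio + 1`.
    """
    return ratio / (ratio + 1.0)


def ideal_saving_factor(ratio):
    """Upper bound on the energy reduction if off-chip transfers vanished.

    Idealized: assumes compute energy is unchanged and nothing else
    replaces the transfer cost.
    """
    return ratio + 1.0


share = data_movement_share(TRANSFER_TO_COMPUTE_RATIO)
bound = ideal_saving_factor(TRANSFER_TO_COMPUTE_RATIO)

print(f"energy spent moving data: {share:.1%}")   # energy spent moving data: 99.5%
print(f"ideal saving factor: {bound:.0f}x")       # ideal saving factor: 201x
```

Real designs will not reach this bound, since computing inside the memory has its own energy cost, but the calculation shows why data movement is the obvious target.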

A group of engineers from the University of Minnesota has proposed a possible solution that could reduce the power consumption of AI workloads by orders of magnitude. The researchers introduce a new chip design that departs from the von Neumann architecture found in most chips today.

Von Neumann Architecture

John von Neumann shaped modern computing in 1945 when he described a design that separates the logic unit from the memory unit. In this arrangement, computation happens in one physical location, while program instructions and data are stored in another. Keeping programs in memory made computers far easier to reprogram, which is a large part of why the design became the standard.

Today, most computers still follow the von Neumann structure: a hard drive or SSD stores your programs, while random-access memory (RAM) holds the instructions and temporary data of running programs. RAM comes in several forms, including DRAM, which stores each bit as charge on a capacitor, and SRAM, which stores each bit in a small circuit of several transistors.

This structure worked well for decades. However, constantly moving data between logic and memory consumes a lot of energy, and that cost grows as data volumes and computational load increase. The result is a bottleneck that limits both performance and efficiency as computing demands rise.
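The way data traffic grows with workload can be illustrated with a toy model. In the sketch below, a hypothetical machine keeps its memory separate from its logic and counts every value that crosses between the two. The class, its method names, and the vector-addition workload are all assumptions made for illustration; no real processor is being modeled.

```python
class ToyVonNeumannMachine:
    """Toy machine with separate memory and logic (illustrative only)."""

    def __init__(self, memory):
        self.memory = list(memory)
        self.bus_transfers = 0  # values moved between memory and logic

    def load(self, addr):
        """Fetch one value from memory into the logic unit."""
        self.bus_transfers += 1
        return self.memory[addr]

    def store(self, addr, value):
        """Write one value from the logic unit back to memory."""
        self.bus_transfers += 1
        self.memory[addr] = value

    def vector_add(self, a_base, b_base, out_base, n):
        """Add two length-n vectors element by element."""
        for i in range(n):
            x = self.load(a_base + i)         # operand 1 crosses the bus
            y = self.load(b_base + i)         # operand 2 crosses the bus
            self.store(out_base + i, x + y)   # result crosses the bus


def transfers_for(n):
    """Bus transfers needed to add two vectors of length n."""
    machine = ToyVonNeumannMachine([1] * n + [2] * n + [0] * n)
    machine.vector_add(0, n, 2 * n, n)
    return machine.bus_transfers


for n in (10, 100, 1000):
    print(n, "additions ->", transfers_for(n), "transfers")
```

In this model every addition costs three transfers, so the traffic grows in step with the amount of data processed, which is the scaling problem described above.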

Attempted Improvements on Energy Demands

Over the years, many attempts have been made to improve on the von Neumann architecture. These attempts have produced different memory designs that aim to bring logic and memory physically closer together. Currently, there are three main variations.

Near-memory Processing

This approach moves logic physically closer to memory, typically by using a 3D-stacked architecture. Shortening the distance reduces the energy needed to transfer the data that computations require, which improves efficiency.

In-memory Computing

Another way to improve on the conventional architecture is in-memory computing. There are two variations of this style of chip. The original integrates clusters of logic next to the memory on a single chip, which eliminates some of the transistors used in predecessors. However, many do not consider this method “true” in-memory computing because logic and memory still occupy separate locations, so the original data-transfer penalty remains, albeit on a smaller scale.

True In-memory

The final type of chip architecture is “true in-memory.” To qualify, the memory itself must perform computations directly. This structure improves performance and efficiency because the data used in logic operations never leaves its location. The researchers' latest implementation of true in-memory architecture is CRAM.

Computational Random-Access Memory (CRAM)

Computational random-access memory (CRAM) enables true in-memory computation because data is processed within the same array that stores it. The researchers made CRAM possible by modifying a standard 1T1M STT-MRAM architecture, in which each cell pairs one transistor with one magnetic tunnel junction (MTJ).

To each cell, the team added a second transistor, a logic line (LL), and a logic bit line (LBL). This gives finer control over individual cells and enables real-time computation within the same memory bank.
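The idea of computing inside the array can be sketched in software. The toy model below treats a memory row as a list of bits and offers a single logic primitive whose inputs and output are all cells of that same row, so no operand is copied out to a separate logic unit. It does not model transistors, logic lines, or MTJ physics. The class, the choice of NAND as the primitive, and the half-adder built from it are illustrative assumptions, not the researchers' circuit.

```python
from itertools import product


class ToyCramArray:
    """Toy bit array whose logic results are written back into the array."""

    def __init__(self, rows, cols):
        self.cells = [[0] * cols for _ in range(rows)]
        self.logic_ops = 0  # number of in-array logic operations performed

    def write(self, row, col, bit):
        self.cells[row][col] = bit & 1

    def read(self, row, col):
        return self.cells[row][col]

    def nand(self, row, in_a, in_b, out):
        """NAND two cells of a row and store the result in a third cell.

        Arguments are cell positions, not values: the operands stay in place.
        """
        a = self.cells[row][in_a]
        b = self.cells[row][in_b]
        self.cells[row][out] = 1 - (a & b)
        self.logic_ops += 1


# Hypothetical column layout for one row: inputs, scratch cells, outputs.
A, B, T1, T2, T3, SUM, CARRY = range(7)


def half_adder(array, row):
    """Half adder built only from in-array NAND operations."""
    array.nand(row, A, B, T1)        # T1 = NAND(A, B)
    array.nand(row, A, T1, T2)       # T2 = NAND(A, T1)
    array.nand(row, B, T1, T3)       # T3 = NAND(B, T1)
    array.nand(row, T2, T3, SUM)     # SUM = A XOR B
    array.nand(row, T1, T1, CARRY)   # CARRY = NOT T1 = A AND B


def add_bits(a, b):
    """Load two bits, add them in the array, and read back (sum, carry)."""
    array = ToyCramArray(rows=1, cols=7)
    array.write(0, A, a)
    array.write(0, B, b)
    half_adder(array, 0)
    return array.read(0, SUM), array.read(0, CARRY)


for a, b in product((0, 1), repeat=2):
    s, c = add_bits(a, b)
    print(f"{a} + {b} -> sum={s}, carry={c}")
```

Intermediate results live in scratch cells of the same row, which is the property that distinguishes this style of design from one that ships operands to a separate logic unit.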

History of CRAM

Today's AI systems need a new architecture that can meet their computational demands without worsening sustainability concerns. Recognizing this need, the engineers set out to demonstrate CRAM's capabilities experimentally for the first time. Their results were published in the journal npj Unconventional Computing under the title “Experimental demonstration of magnetic tunnel junction-based computational random-access memory.”

The first CRAM leveraged an MTJ device structure. These spintronic devices improve on previous storage methods by using electron spin, rather than electrical charge, to transfer and store data.
