This article is reproduced from Lei Feng.com. If you need to reprint, please go to the official website of Lei Feng.com to apply for authorization.
Chen Yiran is a professor in the Department of Electrical and Computer Engineering at Duke University, director of the National Science Foundation (NSF) AI Institute for Edge Computing Leveraging Next Generation Networks (Athena), director of the NSF Industry-University Cooperative Research Center (IUCRC) for Alternative Sustainable and Intelligent Computing (ASIC), and co-director of the Duke Center for Computational Evolutionary Intelligence (DCEI).
Chen Yiran enrolled as an undergraduate in the Department of Electronics of Tsinghua University in 1994. He received a master's degree from Tsinghua University in 2001 and a doctorate from Purdue University in 2005. His research interests include new memory and storage systems, machine learning, neuromorphic computing, and mobile computing systems. He has published more than 500 papers and one monograph, and has won several best paper awards at various conferences. His honors include the IEEE Computer Society Edward J. McCluskey Technical Achievement Award and the ACM SIGDA Service Award, and he was named an ACM Fellow for his contributions to non-volatile memory technology. He is also the chair of the ACM Special Interest Group on Design Automation (SIGDA).
Recently, Professor Chen Yiran gave an interview to ACM in which he shared his thoughts on new computing architectures, the energy efficiency of AI computing, the NSF AI institute for edge computing, electronic design automation and ACM SIGDA, and future technology trends.
AI Technology Review has compiled the original interview text without changing the original meaning.
ACM: Since you entered the field of memory and storage systems, what has surprised you most about the development of this field?
Chen Yiran: I think the most exciting thing that has happened in the field of memory and storage systems in the past 15-20 years is that the boundary between computing and storage has become blurred.
The recent revolution in the modern computing paradigm began with the need to process big data, which triggered an increasing demand for large-capacity storage devices. A bottleneck quickly emerged due to the limited bandwidth between the computing unit and the storage device (often referred to as the "von Neumann bottleneck"). Making memory and storage systems more "intelligent", for example through near-memory computing and in-memory computing, has become a popular way to reduce the system's dependence on memory bandwidth and speed up data processing.
This is a good example of how the shift in target applications (i.e., from scientific computing to data-centric computing) has changed the design philosophy of computer architecture. This change in philosophy has inspired a variety of new computing products, such as smart solid-state drives (SSDs), dynamic random access memory (DRAM) and data processing units (DPUs), as well as many emerging memory technologies such as 3D XPoint memory (Intel and Micron).
It has also led to the emergence of some new non-von Neumann architectures, such as crossbar-based dot product engines, which perform vector-matrix multiplication by mapping the computation directly onto the topology of the computing hardware.
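The crossbar idea can be illustrated with a small numerical sketch. It assumes an idealized resistive crossbar in which each cell stores a weight as a conductance, input values are applied as row voltages, and each column wire sums the resulting currents. The function name `crossbar_vmm` and the example values are hypothetical, and real devices add noise, wire resistance and limited precision that this sketch ignores.

```python
def crossbar_vmm(voltages, conductances):
    """Return the column currents of an idealized resistive crossbar.

    voltages[i]        -- input voltage applied to row i (volts)
    conductances[i][j] -- conductance of the cell at row i, column j (siemens)
    """
    rows = len(conductances)
    cols = len(conductances[0])
    if len(voltages) != rows:
        raise ValueError("need one input voltage per crossbar row")
    currents = [0.0] * cols
    for i in range(rows):
        for j in range(cols):
            # Ohm's law gives the current through each cell;
            # Kirchhoff's current law sums the cells along a column.
            currents[j] += voltages[i] * conductances[i][j]
    return currents


# Hypothetical 3x2 weight matrix stored as conductances, and a 3-element input.
G = [[1.0, 0.5],
     [0.0, 2.0],
     [3.0, 1.0]]
v = [1.0, 2.0, 3.0]
print(crossbar_vmm(v, G))  # [10.0, 7.5]
```

In hardware, the two nested loops do not run step by step: all cells conduct at once, so the whole vector-matrix product is obtained in one analog read of the column currents.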
ACM: One of your most cited papers recently is "Learning Structured Sparsity in Deep Neural Networks", which illustrates the importance of improving the efficiency of deep neural networks. Why is it important to improve the efficiency of deep neural networks? What are the promising research directions in this field?
Paper address: https://dl.acm.org/doi/pdf/10.5555/3157096.3157329
Chen Yiran: As we all know, the high (inference) accuracy of modern deep neural networks (DNNs) comes with high computational costs, caused by the increasing depth and width of the networks. However, we also know that the connection weights of a neural network do not all have the same impact on its accuracy. When connection weights are close to zero, the connections can likely be pruned (i.e., the weights are set to zero) without significantly affecting the accuracy of the network. The paper we published at NeurIPS 2016 shows that learning a sparse neural network whose non-zero weights are stored in memory in a structured way can maintain good data locality and reduce the cache miss rate. As a result, the computational efficiency of the network is greatly improved. The proposed technique, namely structured sparsity learning (often called structured pruning), and its variants have been widely used in modern efficient DNN model design and are supported by many artificial intelligence (AI) computing chips, such as Intel Nervana and NVIDIA Ampere.
Improving the efficiency of DNNs is crucial because inefficiency largely hinders the scaling of large DNN models and their deployment on systems with limited computing resources, storage resources and power budgets, such as edge and IoT devices. The latest research trend in this field is to combine algorithm-level and hardware-level innovations. For example, AI accelerators based on emerging nanodevices are being designed to accelerate new or unexplored AI models, such as Bayesian models, quantum-like models and neuro-symbolic models.
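The difference between pruning individual weights and pruning whole structures can be shown with a small sketch. It assumes that each row of a weight matrix forms one group (for example, one filter), uses a group-norm penalty of the kind used for group Lasso regularization, and removes rows whose norm falls below a threshold. The function names, the matrix `W` and the threshold are hypothetical, and this is a simplified illustration rather than the paper's implementation.

```python
import math


def row_norms(weights):
    """L2 norm of each row (one row = one group, e.g. one filter)."""
    return [math.sqrt(sum(w * w for w in row)) for row in weights]


def group_lasso_penalty(weights, strength):
    """Sum of group norms; added to the training loss, it pushes whole rows toward zero."""
    return strength * sum(row_norms(weights))


def prune_rows(weights, threshold):
    """Drop rows whose norm is below threshold; return kept indices and a compact matrix."""
    kept = [i for i, n in enumerate(row_norms(weights)) if n >= threshold]
    return kept, [weights[i] for i in kept]


# Hypothetical trained weights: rows 1 and 3 have been driven close to zero.
W = [[0.9, -1.2, 0.4],
     [0.01, -0.02, 0.0],
     [0.7, 0.3, -0.5],
     [0.0, 0.03, 0.01]]
kept, compact = prune_rows(W, threshold=0.1)
print(kept)                               # [0, 2]
print(len(compact), "x", len(compact[0])) # 2 x 3
```

Because entire rows are removed, what remains is a smaller dense matrix that can be stored contiguously, which is where the data locality benefit described above comes from. Scattered zero weights, by contrast, leave an irregular pattern that is harder for caches to exploit.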
ACM: It was recently announced that you will direct the National Science Foundation's AI Institute for Edge Computing Leveraging Next Generation Networks (Athena). The Athena project is a five-year, $20 million project that will involve several institutions including Duke University, MIT, Princeton University, Yale University, University of Michigan, University of Wisconsin and North Carolina Agricultural and Technical State University. What are the goals of the Athena project?
Chen Yiran: We are very excited about the establishment of the Athena project, a flagship artificial intelligence institute for edge computing sponsored by the National Science Foundation and the U.S. Department of Homeland Security. Athena's goal is to transform the design, operation and services of future mobile network systems by delivering unprecedented performance and supporting previously impossible services, while keeping complexity and cost under control through advanced artificial intelligence technology.
Athena’s research activities are divided into four core areas: edge computing systems, computer systems, network systems, and services and applications. The artificial intelligence technology we develop will also provide the theoretical and technical foundation for the functionality, heterogeneity, scalability and trustworthiness of future mobile networks.
As a connection point for the community, Athena will promote the ecosystem of emerging technologies and cultivate a diverse new generation of technology leaders with ethical and fair values. We expect that Athena's success will reshape the future of the mobile network industry, create new business models and entrepreneurial opportunities, and change future mobile network research and industrial applications.
ACM: What are the most exciting trends in design automation? As the chair of the ACM Special Interest Group on Design Automation (SIGDA), what role do you see this organization playing in this area?
Chen Yiran: Over the past decade, the most exciting trend in design automation has been the widespread adoption of machine learning techniques in electronic design automation (EDA) tools. Since the quality of a chip design depends largely on the experience of the chip designers, it is a natural idea to develop intelligent EDA tools that learn semiconductor chip design methods directly from existing designs, without having to go through the traditional, bulky models again. Various machine learning models have been embedded into the latest EDA flows to accelerate computational test, routing and placement, power estimation, timing analysis, parameter tuning, signal integrity, and more. Machine learning algorithms have also been implemented in the chip's hardware modules to monitor and predict the chip's runtime power consumption. One example is our APOLLO framework, which won the MICRO 2021 Best Paper Award.
Paper address: https://dl.acm.org/doi/pdf/10.1145/3466752.3480064
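The general idea of learned power estimation can be sketched in a few lines. The sketch assumes a plain linear model that maps the toggle activity of a few monitored signals to power and fits it by gradient descent on synthetic data. All names, coefficients and data below are hypothetical, and this is an illustration of the concept, not the APOLLO method itself.

```python
def predict_power(weights, bias, toggles):
    """Linear power estimate: bias plus a weighted sum of signal toggles."""
    return bias + sum(w * t for w, t in zip(weights, toggles))


def fit_power_model(samples, powers, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on mean squared error."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for toggles, power in zip(samples, powers):
            err = predict_power(weights, bias, toggles) - power
            grad_b += err
            for k in range(n_features):
                grad_w[k] += err * toggles[k]
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


# Synthetic training data: every on/off combination of three monitored signals,
# labeled with a made-up ground-truth power (arbitrary units).
true_weights, true_bias = [1.0, 0.5, 2.0], 0.2
samples = [[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)]
powers = [predict_power(true_weights, true_bias, s) for s in samples]

weights, bias = fit_power_model(samples, powers)
print([round(w, 3) for w in weights], round(bias, 3))
```

A model this small is cheap enough to evaluate every cycle, which is what makes it plausible to place such an estimator in hardware for runtime monitoring.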
As one of the largest EDA professional associations, SIGDA is committed to improving the skills and knowledge of EDA professionals and students around the world. SIGDA sponsors and organizes more than 30 international and regional conferences each year, edits and supports multiple journals and newsletters, and hosts more than a dozen educational and technical events, including workshops, tutorials, webinars, competitions, research forums, and university presentations. In collaboration with our industry partners, SIGDA also provides travel stipends to young students, teachers and professionals to support their attendance at conferences. We also present a number of awards to outstanding researchers and volunteers in the community.
ACM: What is one example of a research avenue in your field that will be particularly impactful in the coming years?
Chen Yiran: I believe that a universal and explainable AI computing hardware design process will be the next revolutionary technology in EDA and computing system research.
Over the past decade, various hardware designs have been proposed to accelerate the computation of artificial intelligence models. However, designers always struggle between design versatility and efficiency, as many hardware customizations are required to accommodate the unique structure of the ever-changing model. On the other hand, explainability has been a long-term challenge in ensuring the robustness of AI models and generalizing model design.
Future AI computing hardware may be composed of various interpretable hardware modules, each corresponding to its own algorithm, with performance guaranteed by a common design process. One possible solution is to use neuro-symbolic methods to build a composable AI model and implement hardware modules corresponding to the symbolic algorithm modules. Extended AutoML flows can then be used to automate the design of the target AI computing hardware, achieving the desired performance while guaranteeing generalizability and interpretability.