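Chip publishes a review by Qi Liu's group at Fudan University, with the University of Hong Kong and Biren Technology: The Trend of Emerging Non-Volatile TCAM for Parallel Search and AI Applications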

FUTURE远见| 2022-06-20

Compiled by Future|远见

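Recently, Qi Liu's (刘琦) group at Fudan University, together with Can Li (李灿) of the University of Hong Kong and Shan Tang (唐杉) of Biren Technology (壁仞科技), published a long review article in Chip titled "The Trend of Emerging Non-Volatile TCAM for Parallel Search and AI Applications". It gives a comprehensive account of the development trends and challenges in ternary content-addressable memory built on emerging non-volatile devices. The first author is Keji Zhou (周可基), and the corresponding authors are Chixiao Chen (陈迟晓) and Qi Liu.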

Development trends of emerging non-volatile TCAM for parallel search and AI applications

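Compared with software implementations, ternary content-addressable memory (TCAM) can classify and forward data with high parallel search efficiency and sub-nanosecond latency, and it has drawn wide attention in network routers, pattern matching, cache controllers, and various other artificial intelligence (AI) applications. To meet the growing demands of these applications, TCAM must keep improving its response speed and storage capacity. However, conventional TCAM built on static random-access memory (SRAM) suffers from large cell area and high search power, which makes it hard to adopt; its cell structure and operation are shown in Fig. 1. To address these problems, non-volatile TCAM (nvTCAM) replaces the cross-coupled inverters with emerging non-volatile devices for data storage, which effectively raises TCAM storage density. Non-volatile devices also retain data after power-off, which greatly reduces standby power. Emerging non-volatile devices have therefore been widely studied for designing nvTCAM with smaller area and higher energy efficiency.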

Fig. 1 (a) NOR-type TCAM cell, (b) NAND-type TCAM cell, (c) TCAM architecture, (d) operating waveforms.
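The ternary search that a TCAM performs can be sketched in a few lines of Python. This is only a behavioral illustration, not a circuit model from the review: the names `cell_matches` and `search` and the string encoding of stored words ('0', '1', 'X' for "don't care") are assumptions made for this example.

```python
from typing import List


def cell_matches(stored: str, key_bit: str) -> bool:
    """One TCAM cell: a stored 'X' (don't care) matches either key bit."""
    return stored == "X" or stored == key_bit


def search(table: List[str], key: str) -> List[int]:
    """Return the indices of all stored words that match the key.

    In hardware every row is compared at the same time; the loop here
    reproduces the result of that comparison, not its parallelism.
    """
    hits = []
    for row, word in enumerate(table):
        if len(word) != len(key):
            raise ValueError("key and stored word must have equal width")
        # NOR-type behavior: the row matches only if no cell mismatches.
        if all(cell_matches(s, k) for s, k in zip(word, key)):
            hits.append(row)
    return hits


table = ["10XX", "1011", "0XX1", "XXXX"]
print(search(table, "1011"))  # [0, 1, 3]
```

Rows 0 and 3 match through their don't-care cells, which is what lets a single stored word cover a whole range of keys, for example an address prefix in a router table.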

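Several emerging non-volatile devices are used to build nvTCAM cells, including FeFET, PCRAM, ReRAM, and MRAM. Compared with an SRAM-based TCAM cell, an nvTCAM cell stores data in non-volatile devices rather than cross-coupled inverters and needs only 2-6 additional transistors, which greatly improves storage density and search energy efficiency. The review summarizes the common TCAM cells built on these devices, as shown in Fig. 2. These nvTCAM cells offer high storage density, and each adapts its cell structure to the underlying non-volatile memory to achieve reliable search operation.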

Fig. 2 nvTCAM cells based on various non-volatile devices: (a) 2T2R, (b) 2D2R, (c) 2FeFET, (d) 2.5T1R, (e) 4T2R, (f) 2T2P, (g) 4T2FeFET, (h) 5T4MTJ.

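However, building nvTCAM cells from non-volatile devices brings new challenges, such as the low sense margin caused by a limited on/off ratio, along with the need to improve performance further. The review therefore surveys and analyzes cell structures, peripheral circuits, and algorithms in detail, and presents representative work that optimizes storage density, search energy efficiency, search delay, on/off current ratio, update efficiency, and other metrics.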

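In addition, because nvTCAM searches efficiently in parallel, researchers have begun to explore it for the heavy computing demands of AI applications. Depending on the application scenario, research on non-volatile TCAM for AI falls into two areas, computer-science-oriented computing and neuroscience-oriented computing, each of which requires changes and new functions tailored to the application. The review presents and analyzes a range of intelligent computing tasks implemented with nvTCAM, including Boolean functions, full adders, matrix multiplication, one-shot learning, approximate computing, and routers for spiking neural networks (SNNs).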
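One way to picture how a TCAM could serve one-shot learning or approximate computing is a best-match search: instead of requiring an exact hit, return the stored entry with the fewest mismatching cells. The sketch below is a software illustration under that assumption; the article does not say how the reviewed designs encode their data, and the names `mismatches`, `best_match`, and the toy signatures are hypothetical.

```python
from typing import List, Tuple


def mismatches(word: str, key: str) -> int:
    """Count cells that disagree with the key; 'X' cells never mismatch."""
    if len(word) != len(key):
        raise ValueError("key and stored word must have equal width")
    return sum(1 for s, k in zip(word, key) if s != "X" and s != k)


def best_match(memory: List[Tuple[str, str]], key: str) -> str:
    """Return the label of the stored signature closest to the key."""
    signature, label = min(memory, key=lambda entry: mismatches(entry[0], key))
    return label


# One stored example per class (hypothetical binary/ternary signatures).
memory = [("11110000", "cat"), ("00001111", "dog"), ("1010XX10", "bird")]
print(best_match(memory, "11100000"))  # cat (1 mismatch vs 7 and 2)
```

Each class needs only a single stored row, which is why this kind of nearest-match lookup is a natural fit for learning from one example.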

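To encourage further work in this area, the review analyzes the development directions and challenges of existing nvTCAM and summarizes them at three levels: device, circuit, and algorithm. At the device level, it discusses four optimization directions: on/off current ratio, storage density, write voltage, and endurance. At the circuit level, it analyzes the problems caused by the limited on/off current ratio, sense-amplifier offset, and frequent charging and discharging of the match line, together with possible solutions. At the algorithm level, it introduces two main research hotspots: improving update efficiency and narrowing the search range.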

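In summary, nvTCAM stores data in emerging non-volatile devices to improve the energy efficiency and storage density of search operations, and it has been widely explored to meet the heavy computing demands of AI applications. Its further development, however, still requires co-design across devices, circuits, and algorithms.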

The Trend of Emerging Non-Volatile TCAM for Parallel Search and AI Applications

Compared with software implementations, ternary content-addressable memory (TCAM) can classify and forward data with high parallel search efficiency and sub-nanosecond latency, attracting great attention in domains such as network routers, pattern matching, cache controllers, and various other AI applications. To satisfy the increasing demands of these applications, TCAM needs to improve its response speed and storage capacity.

However, due to the disadvantages of large cell areas and high power consumption, traditional TCAM based on static random-access memory (SRAM) is quickly becoming obsolete. The structure and operating waveform of SRAM-based TCAM are shown in Fig. 1. To address the aforementioned issues, non-volatile TCAM (nvTCAM) utilizes non-volatile devices to store data instead of cross-coupled inverters, which effectively improves the storage density of TCAM. In addition, non-volatile devices can retain data after power-off, which greatly reduces standby power consumption. Therefore, emerging non-volatile devices have been widely studied to design nvTCAM with smaller cell area and higher energy efficiency.

There are various emerging non-volatile devices that are utilized to construct the nvTCAM storage cell, such as ferroelectric field-effect transistor (FeFET), phase change random-access memory (PCRAM), resistive random-access memory (ReRAM), and magnetoresistive random-access memory (MRAM).

Compared with the SRAM-based TCAM cell, the nvTCAM cell stores data in emerging non-volatile devices instead of cross-coupled inverters and needs only 2-6 extra transistors, which greatly improves storage density and search energy efficiency. Common TCAM cells based on various non-volatile devices are summarized in the paper, as shown in Fig. 2. These nvTCAM cells have high storage density, and each achieves reliable search operation by adapting the cell structure to the underlying non-volatile memory.

However, using non-volatile devices to construct nvTCAM cells brings new challenges, such as the low sense margin caused by a limited on/off ratio and the need to improve performance further. To shed light on these issues, the review investigates and analyzes the design of various cell structures, peripheral circuits, and algorithms in detail, and presents representative works that optimize storage density, search energy efficiency, search delay, on/off current ratio, update efficiency, and other metrics.
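A rough way to see why a limited on/off ratio erodes the sense margin is the simplified model below. It is an assumption made for illustration, not a result taken from the review: on a NOR-type match line with `word_width` cells, each matching cell is taken to leak an off-current and each mismatching cell to draw an on-current, and the function name `ml_current_ratio` is hypothetical. The sense amplifier then has to tell a full match apart from a one-bit mismatch.

```python
def ml_current_ratio(on_off_ratio: float, word_width: int) -> float:
    """Match-line current for a 1-bit mismatch divided by that of a full match.

    Simplified model: matching cells each leak i_off, a mismatching cell
    draws i_on = on_off_ratio * i_off. A value close to 1 means the two
    cases are hard to distinguish.
    """
    if on_off_ratio <= 0 or word_width < 1:
        raise ValueError("on_off_ratio must be > 0 and word_width >= 1")
    i_off = 1.0
    i_on = on_off_ratio * i_off
    full_match = word_width * i_off
    one_bit_mismatch = i_on + (word_width - 1) * i_off
    return one_bit_mismatch / full_match


for ratio in (10, 100, 10_000):
    for width in (16, 64, 256):
        print(f"on/off={ratio:>6}  width={width:>3}  "
              f"current ratio={ml_current_ratio(ratio, width):.3f}")
```

Under this model the distinguishing ratio is (r + n - 1) / n for on/off ratio r and word width n. With r = 100 and n = 64 it is 163/64, about 2.5, and it falls toward 1 as the word gets wider, which is consistent with the article's point that devices with a small on/off ratio need help from cell and peripheral-circuit design.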

In addition, owing to its efficiency in parallel search, nvTCAM has been explored to meet the challenging computing requirements of AI applications. Depending on the application, the research trends of non-volatile TCAM for AI can be divided into two domains: computer-science-oriented computing and neuroscience-oriented computing. The review presents and analyzes a variety of intelligent computing tasks based on nvTCAM, such as Boolean functions, full adders, matrix multiplication, one-shot learning, approximate computing, and routers for spiking neural networks (SNNs).
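As a small illustration of how a ternary lookup can evaluate Boolean functions, the sketch below implements a one-bit full adder by storing the input patterns for which each output is 1. The table layout and the names `SUM_ROWS`, `CARRY_ROWS`, and `full_adder` are assumptions for this example and are not taken from the designs covered in the review.

```python
from typing import List, Tuple


def matches(word: str, key: str) -> bool:
    """Ternary compare: a stored 'X' matches either key bit."""
    return len(word) == len(key) and all(s in ("X", k) for s, k in zip(word, key))


def any_hit(rows: List[str], key: str) -> bool:
    """True if at least one stored row matches the key."""
    return any(matches(word, key) for word in rows)


# Key layout is (a, b, carry_in). Don't-care cells compress the carry
# table: carry-out is 1 whenever any two of the three inputs are 1.
CARRY_ROWS = ["11X", "1X1", "X11"]
# The sum bit is an XOR, which has no don't-care shortcut: one row per minterm.
SUM_ROWS = ["001", "010", "100", "111"]


def full_adder(a: int, b: int, carry_in: int) -> Tuple[int, int]:
    """Return (sum, carry_out) using only table lookups."""
    key = f"{a}{b}{carry_in}"
    return int(any_hit(SUM_ROWS, key)), int(any_hit(CARRY_ROWS, key))


print(full_adder(1, 1, 0))  # (0, 1)
```

The carry table needs three rows instead of four because of the don't-care state, while the XOR-shaped sum output gains nothing from it.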

To motivate further research, the review analyzes the development directions and challenges of existing nvTCAM and summarizes the field from three perspectives: device, circuit, and algorithm. At the device level, four optimization directions are discussed and analyzed: on/off current ratio, storage density, write voltage, and endurance. At the circuit level, the problems caused by the limited on/off current ratio, sense-amplifier offset, and frequent charging and discharging of the match line are analyzed, and possible solutions are proposed. At the algorithm level, two main research hotspots are introduced: improving update efficiency and narrowing the search range.

In conclusion, nvTCAM can achieve high search energy efficiency and storage density, and it has been widely explored to satisfy AI applications' enormous need for computing power. However, some important issues remain that pose challenges to the further development of nvTCAM. Resolving them requires significantly improved co-design of devices, circuits, and algorithms.

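About Chip

Chip describes itself as the only comprehensive international journal worldwide devoted to chip research. It has been selected for the High-Starting-Point New Journal Program of the China Science and Technology Journal Excellence Action Plan, which is run jointly by the China Association for Science and Technology, the Ministry of Education, the Ministry of Science and Technology, the Chinese Academy of Sciences, and other bodies, and it is one of the journals in which the Ministry of Science and Technology encourages the publication of "three types of high-quality papers".

Chip is published by Shanghai Jiao Tong University in partnership with Elsevier. It also works with a number of well-known academic organizations in China and abroad to provide a high-quality exchange platform for academic conferences.

Chip follows its founding motto, All About Chip: focused on chips yet broad and inclusive, it aims to publish cutting-edge breakthroughs from every research field related to chips and to support the development of future chip technology. To date, its editorial board has brought together 68 world-renowned experts and scholars from 13 countries, including a number of Chinese and international academicians and Fellows of well-known international societies such as IEEE, ACM, and Optica.

The second issue of Chip will be published in July 2022 on Elsevier's Chip website as Gold Open Access; readers are welcome to visit and read the articles.

Elsevier Chip website: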

https://www.journals.elsevier.com/chip

Preprint link:

https://www.sciencedirect.com/science/article/pii/S2709472322000107?v=s5

