Research

Designing Application-Specific Hardware 

The demand for computing power is higher than ever as new AI applications emerge. However, Moore's law no longer provides enough performance scaling to meet this demand. In this era, specialization may be the only way to deliver more computing power. We currently focus on developing specialized hardware for artificial intelligence using Application-Specific Integrated Circuits (ASICs) and Field-Programmable Gate Arrays (FPGAs).

AI for Designing Computer Chips

Designing computer chips remains a labor-intensive and time-consuming process to this day. Will it stay that way? We strongly believe that AI can revolutionize the way we design computer chips. However, despite astonishing progress in machine learning, applying AI to chip design still poses many challenges. We will bring AI experts and chip design experts together in a single lab and develop the best AI for designing chips.

Efficient AI Algorithms


The predictive performance of recent AI models often scales with more data and larger models, and we have witnessed what such AI is capable of. However, large models often require massive amounts of computation, memory, and storage, which makes serving them at Internet scale and deploying them on mobile devices formidable challenges. We address these challenges by developing novel algorithms alone, without relying on new hardware.

Microarchitecture-Logic Co-Design

2D spatial architectures such as systolic arrays are at the heart of modern deep learning accelerators such as NPUs and TPUs. From an architectural point of view, arithmetic logic circuits such as multipliers and adders are atomic black-box primitives. We open this box and consider them together with the microarchitecture, finding new opportunities for better designs.
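To illustrate the microarchitectural side, the following toy sketch simulates an output-stationary systolic array computing a matrix product: each processing element (PE) accumulates one output, while operands stream in skewed by one cycle per row and column. This is a minimal illustration of the general technique, not our actual design; the function name and structure are hypothetical.

```python
import numpy as np

def systolic_matmul(A, B):
    """Toy simulation of an output-stationary systolic array computing C = A @ B.
    (Hypothetical sketch for illustration only.)

    PE (i, j) owns output C[i, j]. Row i of A streams in from the left and
    column j of B from the top, each skewed so that the s-th operand pair
    meets at PE (i, j) on cycle t = i + j + s.
    """
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m), dtype=A.dtype)
    # Run enough cycles for the last skewed operands to reach the last PE.
    for t in range(n + m + k - 2):
        for i in range(n):
            for j in range(m):
                s = t - i - j  # which element of the dot product arrives now
                if 0 <= s < k:
                    C[i, j] += A[i, s] * B[s, j]
    return C
```

The cycle loop makes the dataflow explicit: the result is ready after n + m + k - 2 cycles, which is where latency/area trade-offs between the array shape and the arithmetic circuits inside each PE become visible.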




FPGA-based Systems/Neuromorphic Computing Systems

FPGAs allocate exclusive computational resources to each task, providing true concurrency with software-like flexibility. We are experts on FPGAs. We built a neuromorphic computing system on top of FPGAs back in 2015, and to the best of our knowledge it still achieves the lowest latency in neural network processing today. If your application needs low, predictable latency for real-time neural network processing, please let us know.





AI-based General Optimization Frameworks for Chip Design

AI based on reinforcement learning (RL) is surpassing humans in games, algorithm design, computational biology, and more. However, RL is expensive, especially in chip design, where simulations are far slower than in other domains. We are developing a general, cost-efficient, learning-based optimization framework for chip design as an alternative to RL.



 

Model Compression

Deep learning models are often redundant and can be compressed with several techniques. We started model compression research back in 2014 and achieved state-of-the-art results in binary and multi-bit quantization in 2021.
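As a minimal sketch of the idea behind multi-bit quantization, the snippet below greedily approximates a weight tensor as a sum of binary terms, binarizing the residual at each step with a per-step scaling factor. This is a generic greedy scheme for illustration, not the specific method behind the results above; the function name is hypothetical.

```python
import numpy as np

def multibit_quantize(W, bits):
    """Greedy multi-bit quantization: W ≈ sum_k alpha_k * sign(r_k),
    where r_k is the residual left after the first k terms.
    (Illustrative sketch, not our published method.)"""
    residual = W.astype(np.float64).copy()
    Wq = np.zeros_like(residual)
    for _ in range(bits):
        alpha = np.abs(residual).mean()  # scale minimizing L2 error for sign(residual)
        b = np.sign(residual)            # one binary tensor per bit
        Wq += alpha * b
        residual -= alpha * b
    return Wq
```

With bits=1 this reduces to classic binary weight quantization (a single scaled sign tensor); each extra bit binarizes what the previous terms failed to capture, so the reconstruction error strictly decreases as bits are added.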