The research calls for a new Cloud Processor Architecture that leverages the cloud's large number of independent, parallel, general-purpose tasks and redesigns the main Central Processing Unit (CPU) to achieve high throughput at the cost of single-threaded performance.
The concept is based on an efficient new multitasking architecture, driven to the extreme by a machine-learning (reinforcement learning) architecture, to achieve high utilization of CPU resources.
The research also proposes a new DNN architecture that leverages the inherent sparsity and resilience of DNNs to achieve a Non-Blocking Simultaneous MultiThreading (NB-SMT) technique.
The new technique enables DNN execution units to be shared among several computational flows, so that computing elements do not sit idle because of data sparsity. When a structural hazard occurs on a shared execution unit, we propose to temporarily and locally “squeeze in” the colliding operations by reducing their precision.
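The sharing idea can be illustrated with a minimal Python sketch. It is not the proposed hardware design. It assumes two threads sharing one multiplier, unsigned 8-bit activations, and a 4-bit reduced-precision mode that keeps the four most significant activation bits. The names `reduce_to_4bit`, `shared_multiply`, and `nb_smt_dot` are hypothetical and introduced only for this example.

```python
def reduce_to_4bit(x: int) -> int:
    """Round an unsigned 8-bit value to its 4 most significant bits (assumed scheme)."""
    q = min((x + 8) >> 4, 15)  # round to nearest multiple of 16, saturate at 15
    return q << 4              # scale back so the magnitude stays comparable


def shared_multiply(pairs):
    """Model one cycle of a multiplier shared by several threads.

    pairs: one (activation, weight) tuple per thread.
    Returns (products, collided), where collided is True on a structural hazard.
    """
    # A thread needs the unit only if neither operand is zero (sparsity).
    active = [i for i, (a, w) in enumerate(pairs) if a != 0 and w != 0]
    products = [0] * len(pairs)
    collided = len(active) > 1
    for i in active:
        a, w = pairs[i]
        # On a hazard, every active thread is "squeezed in" at reduced precision
        # instead of stalling; otherwise the single user gets full precision.
        products[i] = (reduce_to_4bit(a) if collided else a) * w
    return products, collided


def nb_smt_dot(acts_t0, weights_t0, acts_t1, weights_t1):
    """Run two dot products through one shared multiplier, never blocking."""
    acc = [0, 0]
    collisions = 0
    for a0, w0, a1, w1 in zip(acts_t0, weights_t0, acts_t1, weights_t1):
        products, collided = shared_multiply([(a0, w0), (a1, w1)])
        acc[0] += products[0]
        acc[1] += products[1]
        collisions += collided
    return acc, collisions


if __name__ == "__main__":
    # Thread 1 is sparse in the first cycle, so thread 0 runs at full precision;
    # both threads are active in the second cycle and share at reduced precision.
    result, hazards = nb_smt_dot([200, 200], [3, 3], [0, 100], [5, 2])
    print(result, hazards)
```

In this sketch the only accuracy loss occurs in cycles where both threads hold non-zero operands; the sparser the data, the more often the unit runs at full precision.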