Artificial intelligence and machine learning technologies have been accelerating the advancement of intelligent applications.
To cope with increasingly complex applications, semiconductor companies are continually developing processors and accelerators, including CPUs, GPUs, and TPUs.
However, with Moore’s law slowing down, CPU performance alone will not be enough to execute demanding workloads efficiently. The problem is, how can companies accelerate the performance of entire systems to support the excessive demands of AI applications?
The answer may lie in GPUs and TPUs that supplement CPUs in running deep learning models. That is why it is essential to understand the technologies behind the CPU, GPU, and TPU to keep pace with constantly evolving hardware and achieve better performance and efficiency.
What Is A CPU?
The CPU is known as the brain of every embedded system. It comprises an arithmetic logic unit (ALU), which quickly stores information and performs calculations, and a control unit (CU), which handles instruction sequencing and branching. The CPU interacts with other computer components, such as memory and input/output devices, to execute instructions.
Common CPU components:
- control unit (CU)
- arithmetic logic unit (ALU)
CPU Features Summary:
- Has several cores
- Specialized for serial processing
- Capable of executing a handful of operations at once
- Has the highest FLOPS utilization for RNNs (recurrent neural networks)
- Supports the largest models thanks to its large memory capacity
- Much more flexible and programmable for irregular computations (e.g., small batches and non-MatMul computations)
When To Use CPU:
- Prototypes that require the highest flexibility
- Training simple models that do not take long to train
- Training small models with small effective batch sizes
- Models mostly written in C++ based on custom TensorFlow operations
- Models limited by available I/O or by the host system's networking bandwidth
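The first three CPU features above (a handful of cores, serial processing, a few operations at once) can be illustrated with a minimal, stdlib-only Python sketch. The `dot` helper and the two sample batches are hypothetical, chosen only to show a few scalar, branchy tasks being fanned out across the small number of workers a CPU provides:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def dot(pair):
    # A plain serial dot product: scalar, loop-heavy work a CPU core handles well.
    xs, ys = pair
    return sum(x * y for x, y in zip(xs, ys))

# A CPU exposes only a handful of cores, so only a few tasks run at once.
workers = os.cpu_count() or 1
batches = [([1, 2, 3], [4, 5, 6]), ([2, 2, 2], [3, 3, 3])]
with ThreadPoolExecutor(max_workers=workers) as pool:
    results = list(pool.map(dot, batches))
print(results)  # [32, 18]
```

Each task here is executed serially on one core; the CPU's strength is exactly this kind of flexible, irregular computation rather than massive data parallelism.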
Leading CPU manufacturers: Intel, AMD, Qualcomm, NVIDIA, IBM, Samsung, Apple, Hewlett-Packard, VIA, Atmel, etc.
What Is A GPU?
The GPU was originally designed to render images in computer games. For highly parallel workloads it is much faster than a CPU because it emphasizes high throughput and contains many more ALUs than a CPU. It is generally integrated with other electronic equipment and can share RAM with the host system, which works well for most heavy computing tasks.
GPU Features Summary:
- Has thousands of cores
- High throughput
- Specialized for parallel processing
- Capable of executing thousands of operations at once
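The "thousands of operations at once" style in the list above is a one-instruction, many-data model. As a rough CPU-side stand-in, NumPy's vectorized operations express the same idea: one logical operation applied across a whole array, instead of a Python loop over elements. (This is only an illustration of the programming model; on a real GPU you would place the work on the device, e.g. with TensorFlow's `tf.device('/GPU:0')`.)

```python
import numpy as np

# One logical operation applied to every element of the array "at once" --
# the data-parallel pattern a GPU's thousands of cores execute in hardware.
a = np.arange(8, dtype=np.float32)
b = np.full(8, 2.0, dtype=np.float32)
c = a * b  # elementwise multiply, no explicit per-element loop
print(c)   # [ 0.  2.  4.  6.  8. 10. 12. 14.]
```

Writing models in terms of such whole-array operations is what lets frameworks map them efficiently onto a GPU's parallel hardware.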
When To Use GPU:
- Models for which source code does not exist or is too difficult to change
- Models with numerous custom TensorFlow operations that the GPU must support
- Models that are not available on Cloud TPU
- Medium or larger size models with bigger effective batch sizes
Leading GPU manufacturers: NVIDIA, AMD, Broadcom Limited, Imagination Technologies (PowerVR)
What Is A TPU?
TPU stands for Tensor Processing Unit, an application-specific integrated circuit (ASIC) designed from the ground up by Google. Google started using TPUs internally in 2015 and made them publicly available in 2018. TPUs are offered both as Cloud TPUs and as smaller edge versions of the chip. Cloud TPUs are incredibly fast at the dense vector and matrix computations that accelerate neural network machine learning with the TensorFlow software.
TPU Features Summary:
- Special hardware for matrix processing
- High latency (compared to a CPU)
- Very high throughput
- Computes with extreme parallelism
- Highly optimized for large batches and CNNs (convolutional neural networks)
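The "special hardware for matrix processing" in the list above is the TPU's matrix unit, which performs large grids of multiply-accumulate (MAC) steps in hardware. The toy function below is a pure-Python sketch of that MAC pattern, not real TPU code; the name `mxu_matmul` and the tiny 2x2 inputs are illustrative only (a real Cloud TPU matrix unit does this as a large systolic array, e.g. 128x128):

```python
def mxu_matmul(A, B):
    """Toy sketch of the multiply-accumulate (MAC) pattern a TPU's matrix
    unit performs in hardware: every output cell is a running sum of
    products accumulated one MAC step at a time."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for t in range(k):
                C[i][j] += A[i][t] * B[t][j]  # one multiply-accumulate step
    return C

print(mxu_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# [[19.0, 22.0], [43.0, 50.0]]
```

In hardware, all of these MAC steps are laid out spatially and executed in parallel, which is why TPUs achieve very high throughput on matrix-heavy workloads even though their per-operation latency is higher than a CPU's.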
When To Use TPU:
- Training models using mostly matrix computations
- Training models without custom TensorFlow operations inside the main training loop
- Training models that require weeks or months to complete
- Training huge models with very large effective batch sizes
Leading TPU makers: Google, Coral (owned by Google), HAILO