MACs and FLOPs

Why should you care about MACs and FLOPs in neural networks?

Knowing the intricacies of Multiply-Accumulate Operations (MACs) and Floating Point Operations (FLOPs) helps in developing neural network models that are not only accurate but also efficient and scalable. Building hardware-aware models ensures they are optimized for latency, power, and overall efficiency, making them suitable for real-world applications.

Floating Point Operations

FLOPs count all floating-point operations, such as addition, subtraction, multiplication, and division. In neural networks, FLOPs measure the total computational workload required for training or inference of a model.

Multiply-Accumulate Operations

A MAC operation is a multiplication of two numbers followed by an addition. It is the building block of neural network layers: each MAC operation is one small step in computing a layer's final output.
In neural networks, each MAC operation counts as two FLOPs (one multiplication and one addition).
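
As a minimal illustration, here is a dot product, the core of a fully connected layer, written as explicit MAC operations. The function is a sketch written just for this post:

```python
def dot_product_macs(x, w):
    """Dot product of two vectors, counting MAC operations explicitly."""
    acc = 0.0
    macs = 0
    for xi, wi in zip(x, w):
        acc += xi * wi  # one multiply + one add = 1 MAC = 2 FLOPs
        macs += 1
    return acc, macs

out, macs = dot_product_macs([1.0, 2.0, 3.0], [0.5, -1.0, 2.0])
print(out, macs, 2 * macs)  # 4.5, 3 MACs, 6 FLOPs
```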

MACs and FLOPs in Practice for Convolution

As an example, consider a 3×3 convolution layer with 3 input channels and 64 output channels whose output feature map is 224×224:
$$
\begin{aligned}
\text{MACs} &= \text{inp\_channels} \times \text{kernel\_w} \times \text{kernel\_h} \times \text{out\_w} \times \text{out\_h} \times \text{out\_channels} \\
&= 3 \times 3 \times 3 \times 224 \times 224 \times 64 \\
&= 86{,}704{,}128 \\[1em]
\text{FLOPs} &= 2 \times \text{inp\_channels} \times \text{kernel\_w} \times \text{kernel\_h} \times \text{out\_w} \times \text{out\_h} \times \text{out\_channels} \\
&= 2 \times 3 \times 3 \times 3 \times 224 \times 224 \times 64 \\
&= 173{,}408{,}256
\end{aligned}
$$
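
These counts are easy to script. Below is a minimal sketch (the helper `conv2d_macs` is written just for this post) that reproduces the numbers above; it ignores bias terms and simply takes the output size as given:

```python
def conv2d_macs(inp_channels, kernel_w, kernel_h, out_w, out_h, out_channels):
    """MACs of a standard 2D convolution layer (bias ignored)."""
    return inp_channels * kernel_w * kernel_h * out_w * out_h * out_channels

macs = conv2d_macs(inp_channels=3, kernel_w=3, kernel_h=3,
                   out_w=224, out_h=224, out_channels=64)
print(macs)      # 86704128 MACs
print(2 * macs)  # 173408256 FLOPs
```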

Optimizing Neural Networks

Using FLOPs and MACs, we can estimate the theoretical compute requirements of a model on particular hardware and, from those, its expected latency and efficiency.
To improve these metrics, a network can be optimized by reducing the number of operations it performs. Some common approaches are:

  • Substitute layers with ones that use fewer parameters and operations (e.g., a depthwise separable convolution, as sketched after this list)
  • Model Quantization
  • Model Pruning
  • Knowledge Distillation
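
As a sketch of the first approach (assuming PyTorch; the channel sizes are illustrative), a depthwise separable convolution factorizes a standard convolution into a per-channel depthwise convolution followed by a 1×1 pointwise convolution, which cuts both parameters and MACs:

```python
import torch.nn as nn

in_ch, out_ch = 64, 128

# Standard 3x3 convolution.
standard = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)

# Depthwise separable: per-channel 3x3 conv, then 1x1 pointwise conv.
separable = nn.Sequential(
    nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1, groups=in_ch),  # depthwise
    nn.Conv2d(in_ch, out_ch, kernel_size=1),                          # pointwise
)

def n_params(module):
    return sum(p.numel() for p in module.parameters())

print(n_params(standard))   # 73856
print(n_params(separable))  # 8960
```

Since per-position MACs scale with the weight count, the separable variant here also needs roughly 8x fewer MACs per output position.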
