Machine learning compilation (MLC): the process of transforming and optimizing machine learning execution from its development form to its deployment form.
Goals of MLC:
dependency minimization: make code smaller, automatic fusion
leverage native hardware acceleration: support heterogeneous devices
optimization in general: e.g. reduce memory usage, improve execution efficiency
Differences from normal compilation:
may not involve code generation: the deployment form can be a model description plus an engine that executes it
applies to both training and inference
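A minimal sketch of the "model description plus engine" deployment form, with no code generation involved. The op names (scale, add, relu), the MODEL list, and the run function are all hypothetical, made up for illustration; a real engine would dispatch to optimized kernels rather than Python list comprehensions.

```python
# Hypothetical deployment form: the model is plain data, not generated code.
MODEL = [
    {"op": "scale", "factor": 2.0},
    {"op": "add", "value": -3.0},
    {"op": "relu"},
]


def op_scale(x, factor):
    return [v * factor for v in x]


def op_add(x, value):
    return [v + value for v in x]


def op_relu(x):
    return [max(v, 0.0) for v in x]


# The engine ships with a fixed set of prebuilt kernels.
KERNELS = {"scale": op_scale, "add": op_add, "relu": op_relu}


def run(model, x):
    """Engine: walk the description and dispatch each step to a kernel."""
    for step in model:
        params = {k: v for k, v in step.items() if k != "op"}
        x = KERNELS[step["op"]](x, **params)
    return x


result = run(MODEL, [1.0, 2.0, 0.5])
print(result)  # [0.0, 1.0, 0.0]
```

Here the compiler's output would be the MODEL description itself (possibly rewritten or simplified), while the engine stays the same across models.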
Reasons for study:
deployment is not easy
in-depth understanding of existing frameworks
building software stacks for emerging hardware
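Tensor: the data object (a multi-dimensional array holding inputs, outputs, and intermediate results)
Tensor function: a computation from input tensors to output tensors (a single operator or several operators combined)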
Above is one possible optimization
Abstractions allow us to optimize code.
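A small sketch of how the tensor / tensor function abstraction makes an optimization possible. It assumes the optimization in question is operator fusion (the "automatic fusion" mentioned under the goals), which is only one possibility. Tensors are modeled as plain Python lists, and the names linear, relu, and linear_relu are illustrative, not taken from any framework.

```python
def linear(x, w, b):
    """Tensor function: y[j] = sum_i x[i] * w[i][j] + b[j]."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]


def relu(y):
    """Tensor function: elementwise max(y, 0)."""
    return [max(v, 0.0) for v in y]


def linear_relu(x, w, b):
    """Fused tensor function: same result as relu(linear(x, w, b)),
    but without materializing the intermediate tensor."""
    return [max(sum(x[i] * w[i][j] for i in range(len(x))) + b[j], 0.0)
            for j in range(len(b))]


x = [1.0, 2.0]
w = [[0.5, -1.0], [1.5, -2.0]]
b = [0.25, 0.5]

unfused = relu(linear(x, w, b))  # two tensor functions, one intermediate
fused = linear_relu(x, w, b)     # one tensor function, no intermediate
print(unfused, fused)            # [3.75, 0.0] [3.75, 0.0]
```

Because both versions are tensor functions with the same inputs and outputs, a compiler is free to replace the two-step form with the fused one.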
Table of Contents