TensorComprehensions


A domain-specific language for expressing machine learning workloads.

Overview

Tensor Comprehensions (TC) is a C++ library that automatically synthesizes high-performance machine learning kernels using Halide, ISL, and NVRTC or LLVM. TC also provides basic integration with Caffe2 and PyTorch. The library is designed to be portable and framework-agnostic: it requires only a basic tensor library that handles memory management, offloading, and synchronization.
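To make the idea of a "comprehension" more concrete, here is a minimal sketch of what a TC-style matrix multiplication means. The TC notation in the comment is recalled from the project's documentation rather than taken from this README, and `matmul_reference` is a hypothetical helper written for illustration only. It spells out the semantics as a plain loop nest; it is not how TC generates or runs kernels.

```python
# TC-style notation (illustrative; consult the TC documentation for exact syntax):
#
#   def matmul(float(M,N) A, float(N,K) B) -> (output) {
#     output(i, j) +=! A(i, kk) * B(kk, j)
#   }
#
# Index ranges are inferred from the tensor shapes. The index kk appears only
# on the right-hand side, so it is a reduction index. "+=!" means "initialise
# the output to zero, then accumulate".


def matmul_reference(A, B):
    """Plain-Python loop nest with the meaning of the comprehension above.

    A is an M x N list of lists, B is an N x K list of lists.
    Hypothetical helper for illustration, not part of TC.
    """
    M, N = len(A), len(A[0])
    if len(B) != N:
        raise ValueError("inner dimensions of A and B must match")
    K = len(B[0])

    # "+=!": zero-initialise before accumulating.
    output = [[0.0] * K for _ in range(M)]
    for i in range(M):
        for j in range(K):
            for kk in range(N):  # reduction over the shared dimension
                output[i][j] += A[i][kk] * B[kk][j]
    return output
```

The point of the notation is that the one-line comprehension carries the same information as the three nested loops, which leaves the choice of loop order, tiling, and hardware mapping to the compiler.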

Features

  • Automatic Kernel Synthesis: TC can automatically synthesize high-performance machine learning kernels.
  • Integration with Caffe2 and PyTorch: TC provides basic integration with both frameworks.
  • JIT-Compile Capabilities: The library can JIT-compile high-performance kernels on demand for specific sizes.
  • Autotuning: Users can autotune TC once to obtain mapping options for various problem sizes.
  • High Performance: TC aims to achieve 80%+ of peak shared memory bandwidth after autotuning.
  • Productivity Improvement: TC bridges the productivity gap between research and production needs in machine learning.
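The "JIT-compile on demand for specific sizes" feature listed above can be pictured as a cache keyed by input shapes: the first call with a given set of sizes triggers compilation, and later calls with the same sizes reuse the result. The sketch below is a toy model under that assumption. `SizeSpecializedCache` and `fake_compile` are hypothetical names, and nothing here reflects TC's actual API or internals.

```python
class SizeSpecializedCache:
    """Toy model: one compiled kernel per distinct tuple of input shapes.

    Hypothetical helper for illustration; not part of TC.
    """

    def __init__(self, compile_fn):
        self._compile_fn = compile_fn  # called once per new shape tuple
        self._kernels = {}             # maps shape tuple -> compiled kernel
        self.compile_count = 0         # how many compilations have happened

    def get(self, *shapes):
        key = tuple(shapes)
        if key not in self._kernels:
            # First time these sizes are seen: compile and remember.
            self._kernels[key] = self._compile_fn(key)
            self.compile_count += 1
        return self._kernels[key]


def fake_compile(shapes):
    """Stand-in for real compilation: returns a closure tied to the sizes."""
    def kernel():
        return "kernel specialized for {}".format(shapes)
    return kernel


cache = SizeSpecializedCache(fake_compile)
first = cache.get((3, 4), (4, 5))   # compiles
again = cache.get((3, 4), (4, 5))   # reuses the cached kernel
other = cache.get((8, 4), (4, 5))   # new sizes, compiles again
```

In this model the compilation cost is paid once per problem size, which is why specializing for sizes works best when a workload repeats the same shapes many times.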

Binaries

  • Install the TC binary using the provided conda package. Refer to the documentation for detailed instructions.

From Source

  • Follow the documentation to build TC from source with Docker, with conda packages, or in a non-conda environment.

Summary

Tensor Comprehensions is a C++ library for automatically synthesizing high-performance machine learning kernels. It combines automatic synthesis, on-demand JIT compilation for specific sizes, autotuning, and basic integration with Caffe2 and PyTorch, with the aim of narrowing the productivity gap between research and production in machine learning.