BMNNSDK (BITMAIN Neural Network SDK) is a deep learning SDK developed by BITMAIN for its self-developed AI chip. It covers the network compilation and runtime support required for neural network inference, providing an easy-to-use, high-efficiency full-stack solution for developing and deploying deep learning applications.
BMNNSDK consists of two parts: BMNet and BMRuntime. BMNet optimizes various deep neural network models (such as Caffe models), balancing EU computation against memory access time to improve operational parallelism, and finally transforms the model into a bmodel supported by the BITMAIN TPU. BMRuntime drives the TPU chip and provides a unified programming interface for upper-layer applications, enabling network inference through the bmodel without requiring users to be concerned with hardware implementation details.
BMNNSDK supports two kinds of compilation. For layers that the TPU supports, you can use BMNet to compile and deploy directly. For layers that the TPU does not yet support, you can extend the compiler through the BMNet programming interface, adding custom network layers via the BMKernel programming interface or CPU instructions; this enables you to compile a non-public network.
We provide developers with a Docker image for development, which integrates the tools and libraries required by BMNNSDK; developers can use it to build deep learning applications.
Once the compiled network is integrated with the deep learning application, it can be deployed through BMRuntime. During deployment, you program against the BMNet inference engine API.
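At deployment time the application typically follows a load-then-run pattern: load the compiled bmodel into the runtime, then issue inference calls against it. The sketch below mocks that pattern in plain Python; `Runtime`, `load_bmodel`, `infer`, and the file name are illustrative stand-ins, not the actual BMRuntime/BMNet API names.

```python
# Mock of the load-model / run-inference pattern an application
# follows at deployment time. Class and method names are
# hypothetical stand-ins for the real BMRuntime interface.

class Runtime:
    def __init__(self):
        self._model = None

    def load_bmodel(self, path):
        # Real code would map the compiled bmodel onto the TPU;
        # here we only record the path.
        self._model = path

    def infer(self, inputs):
        if self._model is None:
            raise RuntimeError("load a bmodel before inference")
        # Placeholder "inference": identity over the inputs.
        return list(inputs)

rt = Runtime()
rt.load_bmodel("net.bmodel")       # hypothetical compiled model file
outputs = rt.infer([0.1, 0.2, 0.3])
print(outputs)
```

Loading once and reusing the runtime object for many inference calls is the expected usage, since model loading is the expensive step.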
BMNNSDK supports two development modes, PCIE and SOC; you can choose the one that suits your deployment.
Based on BITMAIN's AI chip, it provides maximum inference throughput and a simple deployment environment.
It provides a runtime programming interface for using TPU computing resources, enabling users to develop in depth.
The runtime library provides concurrent processing capability and supports both multithreading and multiprocessing.
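Concurrent use of such a runtime can be sketched with standard Python threading: several worker threads issue inference calls in parallel and collect results under a lock. The `infer` stub below is a stand-in for a real, thread-safe runtime call, not a BMRuntime function.

```python
# Sketch of multithreaded inference against a thread-safe runtime
# call. `infer` is a hypothetical stub, not the real runtime API.
import threading

def infer(batch_id):
    # Stand-in for a runtime inference call; returns a fake result.
    return batch_id * 2

results = {}
lock = threading.Lock()

def worker(batch_id):
    out = infer(batch_id)
    with lock:                      # protect the shared result map
        results[batch_id] = out

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

The same structure carries over to multiprocessing when each worker needs its own runtime instance rather than a shared one.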