Memory Compression: Getting the Most out of Storage and Communication
ShapeShifter reduces the amount of data that advanced AI applications must store and move by taking advantage of their naturally occurring data distributions.
ShapeShifter works on top of other quantization schemes.
Offline and Online
ShapeShifter includes both a software profiler that automatically determines the minimum data width required by an application, and run-time components that adapt to the varying precision requirements on the fly.
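As a rough illustration of the offline half, the sketch below computes the smallest signed two's-complement width that can hold every value observed in each layer. It is a minimal sketch, not ShapeShifter's actual profiler: the function names, the dictionary layout, and the assumption that values are already integer-quantized are all hypothetical.

```python
from typing import Dict, Iterable


def bits_needed(value: int) -> int:
    """Smallest signed two's-complement width (in bits) that can hold value."""
    if value >= 0:
        return value.bit_length() + 1  # magnitude bits plus a sign bit
    return (-value - 1).bit_length() + 1


def profile_widths(layers: Dict[str, Iterable[int]]) -> Dict[str, int]:
    """Hypothetical profiler: per-layer minimum width over all observed values.

    `layers` maps an illustrative layer name to the integer values seen
    for that layer during profiling runs (an assumed input format).
    """
    return {
        name: max((bits_needed(v) for v in values), default=1)
        for name, values in layers.items()
    }


# Illustrative data only; these are not measurements from a real network.
observed = {"conv1": [3, -4, 0], "fc": [100, -7]}
widths = profile_widths(observed)
```

With the illustrative data above, `conv1` needs 3 bits per value and `fc` needs 8, so a single fixed width for the whole network would waste storage on the first layer.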
ShapeShifter works with any network type, handling the weights and activations of any kind of layer.
Sparsity Structure Agnostic
ShapeShifter provides several components that can be integrated into hardware designs offering different cost-benefit tradeoffs: either reducing only off-chip traffic and storage or compounding the benefits by enhancing on-chip storage too.
Applicable to All Accelerators
ShapeShifter does not necessitate invasive changes to the compute core of existing accelerators: it can sit in front of the memory interface and work transparently to the rest of the system.
Sources of Precision Variability
Precision variability is an inherent property of deep learning. We have found it to be present in all networks, presenting an opportunity to greatly reduce data needs.
ShapeShifter is a plug-in engine that dynamically compresses and decompresses all data passing through it, plus a set of techniques that improve the results by determining a priori an upper bound on the precision requirements of any given network.
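One way to picture such an engine is a per-group variable-width encoding: values are split into small groups, each group is stored at the narrowest width that fits its largest value, and a short width field precedes each group. The sketch below is an assumption-laden illustration, not the actual ShapeShifter format. The group size, the bit-string representation, and the use of a profiled upper bound (`max_width`) to size the width field are hypothetical choices.

```python
from typing import List

GROUP_SIZE = 4  # hypothetical group size, chosen only for illustration


def bits_needed(value: int) -> int:
    """Smallest signed two's-complement width (in bits) that can hold value."""
    if value >= 0:
        return value.bit_length() + 1
    return (-value - 1).bit_length() + 1


def width_field_bits(max_width: int) -> int:
    """Bits needed to encode a width in 1..max_width (stored as width - 1)."""
    return max(1, (max_width - 1).bit_length())


def compress(values: List[int], max_width: int) -> str:
    """Encode values as a string of '0'/'1' characters, one group at a time."""
    field = width_field_bits(max_width)
    out = []
    for start in range(0, len(values), GROUP_SIZE):
        group = values[start:start + GROUP_SIZE]
        width = max(bits_needed(v) for v in group)
        if width > max_width:
            raise ValueError("value exceeds the profiled upper bound")
        out.append(format(width - 1, f"0{field}b"))  # per-group width field
        mask = (1 << width) - 1
        for v in group:
            out.append(format(v & mask, f"0{width}b"))  # two's complement
    return "".join(out)


def decompress(bits: str, count: int, max_width: int) -> List[int]:
    """Invert compress(); count is the number of values originally encoded."""
    field = width_field_bits(max_width)
    values: List[int] = []
    pos = 0
    while len(values) < count:
        width = int(bits[pos:pos + field], 2) + 1
        pos += field
        for _ in range(min(GROUP_SIZE, count - len(values))):
            raw = int(bits[pos:pos + width], 2)
            pos += width
            if raw >= 1 << (width - 1):  # sign bit set: undo two's complement
                raw -= 1 << width
            values.append(raw)
    return values


# Illustrative data only; max_width=8 stands in for a profiled upper bound.
example = [1, -2, 0, 3, 100, -7, 12, 5]
packed = compress(example, max_width=8)
restored = decompress(packed, len(example), max_width=8)
```

For this illustrative input the first group fits in 3 bits per value and the second needs 8, so the encoded stream is 50 bits including the width fields, compared with 64 bits at a fixed 8-bit width. A tighter profiled bound also shrinks the width field itself, which is one plausible reason an a priori upper bound would improve results.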
Why is data footprint important?
Contemporary accelerators are designed around and limited by storage and bandwidth constraints. Improvements in footprint can translate directly into cost reductions or throughput improvements.
Our expert team can provide the ShapeShifter design, integration, and supporting software as a custom solution tailored to your design constraints and needs.