LightCode

Architecture

File Structure

[Figure: file structure diagram]

Optimization Pipeline

[Figure: optimization pipeline diagram]

Stacked Graph IR

Relay IR creates a computational graph. This graph is a DAG (directed acyclic graph) in which nodes represent operations (add, ReLU, matmul, …) and directed edges represent data (often a tensor) passed from the output of one node to the input of the next.
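As a rough illustration of the graph described above, here is a minimal sketch in plain Python. It does not use Relay's API; the node names and the toy expression `relu(matmul(x, w) + b)` are assumptions made up for this example.

```python
from graphlib import TopologicalSorter

# Hypothetical toy graph for y = relu(matmul(x, w) + b).
# Each key is a node; its value is the set of nodes whose outputs it consumes,
# i.e. the sources of its incoming (data) edges.
predecessors = {
    "x": set(),
    "w": set(),
    "b": set(),
    "matmul": {"x", "w"},
    "add": {"matmul", "b"},
    "relu": {"add"},
}

# A DAG always admits a topological order: every node appears after all of
# the nodes that feed it. TopologicalSorter raises CycleError otherwise.
order = list(TopologicalSorter(predecessors).static_order())
print(order)
```

Any valid order places `matmul` before `add` and `add` before `relu`, which is the order in which the operations could be executed.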

The Stacked Graph IR takes this one step further by introducing the Stack.

For example, computational graphs for machine learning often contain matrix multiplication operations (matmul). A matmul would be represented with a stack. That stack might have 3 nodes: one performing the matmul on CPU, one on GPU, and another on photonic hardware.

Each node has a cost function representing the cost of computation. Each edge has a cost function representing the cost of data transfer. The cost can be different for each node and for each edge entering or exiting a node. Because each stack has many nodes, each ‘hyper-edge’ [2] between stacks is made up of many in-edges.

Key point: during flattening of the graph, only one node is selected from each stack.
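The selection step can be sketched as follows. This is a brute-force illustration, not LightCode's actual algorithm: the stack names, device names, cost numbers, and the `transfer_cost` and `flatten` helpers are all hypothetical. It picks exactly one node per stack so that the sum of node costs and edge costs is minimal.

```python
from itertools import product

# Hypothetical stacks: each maps a candidate node (here, a device) to its
# compute cost. All numbers are invented for illustration.
stacks = {
    "matmul": {"cpu": 9.0, "gpu": 3.0, "photonic": 1.0},
    "add": {"cpu": 2.0, "gpu": 1.0},
}

# Hyper-edges between stacks: data flows from the first stack to the second.
hyper_edges = [("matmul", "add")]


def transfer_cost(src_device, dst_device):
    """Hypothetical edge cost: free on the same device, flat fee otherwise."""
    return 0.0 if src_device == dst_device else 4.0


def flatten(stacks, hyper_edges):
    """Try every one-node-per-stack assignment and keep the cheapest."""
    names = list(stacks)
    best_cost, best_choice = None, None
    for devices in product(*(stacks[name] for name in names)):
        choice = dict(zip(names, devices))
        cost = sum(stacks[name][choice[name]] for name in names)
        cost += sum(transfer_cost(choice[u], choice[v]) for u, v in hyper_edges)
        if best_cost is None or cost < best_cost:
            best_cost, best_choice = cost, choice
    return best_cost, best_choice


best_cost, best_choice = flatten(stacks, hyper_edges)
print(best_cost, best_choice)
```

With these made-up numbers the photonic matmul is the cheapest node in isolation, but the flattened graph runs both operations on the GPU, because the transfer cost on the edge outweighs the compute saving. Exhaustive search grows exponentially with the number of stacks, so it is only meant to show the objective being minimized.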

Arithmetic Hardware Simulator


The cost function mentioned in Stacked Graph IR is calculated using the Arithmetic Hardware Simulator. Computation cost is a linear regression on the number of ‘operations’, a self-defined count derived from the input tensor shapes and the operation being performed. Runtime data is collected on real hardware when available. Transfer cost is based on the number of bits sent between locations. For instance, if the result of a photonic matmul needs to be sent to a GPU add, each bit must be sent to local SRAM and then to the GPU.
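A minimal sketch of the two cost terms, under loudly stated assumptions: the operation count for matmul (multiply-accumulates), the timing samples, the per-bit hop costs, and every function name below are hypothetical and are not taken from the LightCode source.

```python
from math import prod


def matmul_ops(a_shape, b_shape):
    """Assumed 'operations' count for (m, k) @ (k, n): m * k * n multiply-accumulates."""
    m, k = a_shape
    k2, n = b_shape
    assert k == k2, "inner dimensions must match"
    return m * k * n


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented (operations, seconds) measurements standing in for real runtime data.
samples = [(1_000, 0.0021), (2_000, 0.0041), (4_000, 0.0081)]
slope, intercept = fit_line([s[0] for s in samples], [s[1] for s in samples])


def compute_cost(ops):
    """Predicted runtime for a given operation count."""
    return slope * ops + intercept


# Invented per-bit cost of each hop between locations.
HOP_COST_PER_BIT = {("photonic", "sram"): 2e-9, ("sram", "gpu"): 1e-9}


def transfer_cost(shape, bits_per_element, route):
    """Total bits in the tensor times the per-bit cost of every hop on the route."""
    bits = prod(shape) * bits_per_element
    return sum(bits * HOP_COST_PER_BIT[hop] for hop in zip(route, route[1:]))


matmul_time = compute_cost(matmul_ops((4, 16), (16, 8)))
move_time = transfer_cost((4, 8), 16, ["photonic", "sram", "gpu"])
print(matmul_time, move_time)
```

The route `["photonic", "sram", "gpu"]` mirrors the example in the text: the matmul result goes to local SRAM first and then to the GPU, so both hops contribute to the transfer cost.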

Note: This is an extremely simplified hardware model, especially where memory accesses are concerned. It was designed as a quick gauge of how running a computation on novel hardware might compare. More accurate (and more time-intensive) hardware models could be added as a separate backend at some point.


Prompt Size    LightCode    TVM        % Error
2              0.03245      0.05926    -45
3              0.05216      0.08938    -42
4              0.07188      0.07535    -5
5              0.09157      0.12632    -27
6              0.11132      0.10438    7
8              0.15077      0.11115    36
10             0.20997      0.23493    -11
12             0.24945      0.27081    -8
14             0.28895      0.30901    -6
17             0.32845      0.34200    -4
19             0.36797      0.38200    -4
22             0.44705      0.45000    -1
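The table does not state how % Error is defined. The sketch below checks one plausible reading, assumed here rather than confirmed by the source: the relative difference of the LightCode value from the TVM value, `(LightCode - TVM) / TVM * 100`. The variable names are made up for this example.

```python
# (prompt size, LightCode value, TVM value, listed % error), copied from the table.
rows = [
    (2, 0.03245, 0.05926, -45),
    (3, 0.05216, 0.08938, -42),
    (4, 0.07188, 0.07535, -5),
    (5, 0.09157, 0.12632, -27),
    (6, 0.11132, 0.10438, 7),
    (8, 0.15077, 0.11115, 36),
    (10, 0.20997, 0.23493, -11),
    (12, 0.24945, 0.27081, -8),
    (14, 0.28895, 0.30901, -6),
    (17, 0.32845, 0.34200, -4),
    (19, 0.36797, 0.38200, -4),
    (22, 0.44705, 0.45000, -1),
]

# Assumed definition: signed relative difference with TVM as the reference.
recomputed = [(light - tvm) / tvm * 100 for _, light, tvm, _ in rows]

# Largest gap between the recomputed value and the listed (integer) value.
max_gap = max(abs(r - listed) for r, (_, _, _, listed) in zip(recomputed, rows))
print(f"largest gap vs. listed column: {max_gap:.2f} percentage points")
```

Under this assumed definition every recomputed value lands within one percentage point of the listed column, which suggests the listed figures were rounded, possibly from unrounded inputs.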

  1. Node in a DAG whose removal would split the graph in two. Usually found between layers in many LLMs. Visualization

  2. This is a special case of a hypergraph with hyperedges.