TensorFlow library for adding FPGA based layers



[Figure: Assembled system]

[Figure: Components]

Usage

import tensorflow as tf
from tensorflow.keras import models
from hostLib.layers.conv2d import Conv2D as Conv2DFPGA

model = models.Sequential()
model.add(Conv2DFPGA(1))

Installation

  1. clone repository and init submodules

    git clone <this url>
    cd ./tf-fpga
    git submodule update --init
    
  2. install dependencies (on Ubuntu Linux for example)

    sudo apt update                           
    sudo apt upgrade -y
    sudo apt autoremove
    sudo apt install python3 python3-pip
    sudo python3 -m pip install --upgrade pip # update pip globally
    python3 -m pip install tensorflow
    
  3. install C++ compiler

    sudo apt install g++
    
  4. compile operator and FPGA libraries

    cd ./c++
    ./configure
    make
    
    > /usr/bin/g++ ... -o build/dummyBigOp.o src/dummyBigOp.cpp
    > ...
    > /usr/bin/g++ ... -o build/op_lib.so ...
    
  5. update config.json with your FPGA addresses defined in the VHDL design

    {"fpgas": [
      {
        "ip":   "192.168.1.33",
        "port": 1234
      },
      {
        "ip":   "192.168.1.34",
        "port": 1234
      },
      {
        "ip":   "192.168.1.35",
        "port": 1234
      }
    ]}
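
Before starting a long run it can help to check that config.json parses and that every entry has a usable address. The following is a minimal sketch, assuming only the `fpgas` / `ip` / `port` layout shown above. The function name `load_fpga_endpoints` is hypothetical and not part of this repository.

```python
import json


def load_fpga_endpoints(text):
    """Parse config.json content and return a list of (ip, port) tuples.

    Hypothetical helper: it only assumes the {"fpgas": [{"ip", "port"}]}
    layout shown in this README.
    """
    config = json.loads(text)
    endpoints = []
    for i, entry in enumerate(config.get("fpgas", [])):
        ip, port = entry.get("ip"), entry.get("port")
        if not isinstance(ip, str) or not ip:
            raise ValueError(f"fpgas[{i}]: 'ip' must be a non-empty string")
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(port, bool) or not isinstance(port, int) or not 0 < port < 65536:
            raise ValueError(f"fpgas[{i}]: 'port' must be an integer in 1..65535")
        endpoints.append((ip, port))
    if not endpoints:
        raise ValueError("config must list at least one FPGA under 'fpgas'")
    return endpoints


EXAMPLE = '{"fpgas": [{"ip": "192.168.1.33", "port": 1234}, {"ip": "192.168.1.34", "port": 1234}]}'
print(load_fpga_endpoints(EXAMPLE))
```

To check a real file, pass `open("config.json").read()` instead of `EXAMPLE`.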
    

Adding new custom layers

For more details on how to contribute to git projects see https://gist.github.com/MarcDiethelm/7303312.

  1. create a computation module in the FPGA implementation
  2. add your FPGA module to the list of modules in c++/lib/mlfpga/include/modules.hpp

    the MOD_DEF macro then creates these entries automatically:

    moduleIds[Module::myNewModule];
    moduleNames[Module::myNewModule];
    moduleSendPayloadLength[Module::myNewModule];
    moduleRecvPayloadLength[Module::myNewModule];
    
  3. create a TF kernel implementation MyNewOp that inherits from AsyncOpKernel, in these files:

    c++/src/myNewOp.cpp and c++/include/myNewOp.hpp

    define the constructor and override the ComputeAsync method:

    class MyNewOp : public AsyncOpKernel {
      public:
        explicit MyNewOp(OpKernelConstruction* context);
    
        void ComputeAsync(OpKernelContext* context, DoneCallback done) override;
    };
    

    using your FPGA module

    auto worker = connectionManager.createWorker(Module::myNewModule, count);
    
  4. register the kernel with a custom operator:

    c++/src/entrypoint.cpp

    REGISTER_OP("MyNewOp")
      .Input("input: float")
      .Output("output: float")
      .SetShapeFn([](InferenceContext* c) {
        c->set_output(0, c->input(0));
        return Status::OK();
      });

    // the last argument, MyNewOp, is the custom kernel class
    REGISTER_KERNEL_BUILDER(Name("MyNewOp").Device(DEVICE_CPU), MyNewOp);
    

    c++/include/entrypoint.hpp

    #include "myNewOp.hpp"
    

    More information on creating custom TF kernels can be found in the official TensorFlow guide "Create an op".


  5. compile everything

    cd ./c++
    make clean
    make
    
  6. add a test for your operator

    tests/op_test.py

    def testMyNewOp(self):
      with self.session():
        input = [1,2,3]
        result = load_op.op_lib.MyNewOp(input=input)
        self.assertAllEqual(result, input)
    
  7. add a custom layer that uses the operator

    hostLib/layers/myNewLayer.py

    class MyNewLayer(layers.Layer):
      ...
      def call(self, inputs):
        return load_op.op_lib.MyNewOp(input=inputs)
    
    
  8. add that layer to the python module

    hostLib/layers/__init__.py

    __all__ = ["conv2d", "myNewLayer"]
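
Step 2 relies on a single module list from which several lookup tables are generated. That idea can be sketched in Python as below. This is only an illustration of the pattern, not a translation of modules.hpp: the module names, ids and payload lengths are placeholders, and `MODULE_DEFS` and `loopback_compatible` are hypothetical names. Only the four table names are taken from the list in step 2.

```python
from enum import IntEnum

# Hypothetical module list: (name, id, send payload length, receive payload length).
# All values are placeholders, not the ones defined in modules.hpp.
MODULE_DEFS = [
    ("dummyModule", 0xF0, 4, 4),
    ("myNewModule", 0xF1, 8, 2),
]

# One enum member per list entry, numbered in list order.
Module = IntEnum(
    "Module",
    [(name, index) for index, (name, *_rest) in enumerate(MODULE_DEFS)],
)

# Every table is derived from the same list, so adding a module means
# adding exactly one line to MODULE_DEFS.
moduleIds = {Module[name]: mod_id for name, mod_id, _send, _recv in MODULE_DEFS}
moduleNames = {Module[name]: name for name, *_rest in MODULE_DEFS}
moduleSendPayloadLength = {Module[name]: send for name, _id, send, _recv in MODULE_DEFS}
moduleRecvPayloadLength = {Module[name]: recv for name, _id, _send, recv in MODULE_DEFS}


def loopback_compatible(module):
    """True if an echo server could satisfy this module (equal send/receive lengths)."""
    return moduleSendPayloadLength[module] == moduleRecvPayloadLength[module]


print(moduleNames[Module.myNewModule], hex(moduleIds[Module.myNewModule]))
print(loopback_compatible(Module.dummyModule), loopback_compatible(Module.myNewModule))
```

Keeping the definitions in one place avoids tables drifting out of sync when a module is added or its payload size changes.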
    

Tests

There are tests for each complexity level of this project.

  1. loopback test without connected FPGAs. This will only succeed for modules that have equal input and output lengths.

    compile the UDP echo server and run it in a separate terminal:

    cd ./c++
    make echo
    ./build/echo
    

    edit config.json:

    {"fpgas": [
      {
        "ip":   "localhost",
        "port": 1234
      }
    ]}
    

    then run any dummy module test:

    python3 tests/op_test.py
    


  2. FPGA communication test c++/tests/main.cpp

    cd ./c++
    make test
    ./build/test
    


  3. operator validation test, based on TF's test suite tests/op_test.py

    python3 tests/op_test.py
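
The loopback test in step 1 depends on a server that returns every datagram unchanged. The sketch below shows that behaviour in Python. It is a stand-in for illustration, not the C++ server built by `make echo`: the function names are hypothetical, and it binds to a free loopback port chosen by the OS rather than the port from config.json. Because the reply always has the same length as the request, only modules with equal send and receive payload lengths can pass against such a server.

```python
import socket
import threading


def serve_udp_echo(sock, stop_event):
    """Send every received datagram back to its sender until stop_event is set."""
    sock.settimeout(0.1)  # wake up regularly to check stop_event
    while not stop_event.is_set():
        try:
            payload, sender = sock.recvfrom(65535)
        except socket.timeout:
            continue
        sock.sendto(payload, sender)


def echo_roundtrip(payload, timeout=2.0):
    """Start a temporary echo server on a free local port and bounce payload off it."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    stop_event = threading.Event()
    thread = threading.Thread(target=serve_udp_echo, args=(server, stop_event))
    thread.start()
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
            client.settimeout(timeout)
            client.sendto(payload, server.getsockname())
            reply, _sender = client.recvfrom(65535)
            return reply
    finally:
        # Always stop the server thread and release the port.
        stop_event.set()
        thread.join()
        server.close()


if __name__ == "__main__":
    print(echo_roundtrip(b"\x01\x02\x03\x04"))
```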
    


Dependencies

C++

Python3

  • tensorflow
  • c++/build/op_lib.so

Used in examples:

  • Pillow
  • OpenCV (cv2)
  • mss
  • numpy
  • IPython