Fading Coder


Configuring a Cross-Compilation Environment for TensorFlow on ARMv7 Chips

Tech May 10 5

Most development PCs run Linux on x86 or x86-64 hardware, while the target chip is ARMv7, so binaries compiled natively on the PC cannot run on the chip. A cross-compilation environment is therefore required.

Steps to install the cross-compilation environment:

  1. Install Bazel from the Bazel apt repository:

    1. Install the JDK:
       sudo apt-get install openjdk-8-jdk
    2. Add the Bazel package source:
       sudo apt-get install curl
       echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
       curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
    3. Install Bazel (if the download from storage.googleapis.com fails, retry a few times):
       sudo apt-get update
       sudo apt-get install bazel
    4. Upgrade Bazel:
       sudo apt-get upgrade bazel
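Because the package download can fail intermittently, the "try several times" advice can be automated. The following is a minimal sketch, not part of the original setup; the helper name `retry` and its argument order are assumptions made for illustration.

```shell
# Hypothetical helper: run a command up to N times, pausing between attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    # Re-run the command until it succeeds or the attempt limit is reached.
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example (not run here): retry 5 10 sudo apt-get install -y bazel
```

The helper returns the moment the command succeeds, so a healthy connection costs no extra time.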
  2. Set up the cross-compilation toolchain. The cross-compilation SDK used here is installed in /home/dev/sysroots/x86_64-linux. Append the following lines to /etc/bash.bashrc:

export ARCH=arm
export PATH=/home/dev/sysroots/x86_64-linux/bin/:$PATH
export CROSS_COMPILE=arm-poky-linux-gnueabi-
export CC=/home/dev/sysroots/x86_64-linux/bin/arm-poky-linux-gnueabi-gcc
export CXX=/home/dev/sysroots/x86_64-linux/bin/arm-poky-linux-gnueabi-g++
export LD=/home/dev/sysroots/x86_64-linux/bin/arm-poky-linux-gnueabi-ld
export AR=/home/dev/sysroots/x86_64-linux/bin/arm-poky-linux-gnueabi-ar
export AS=/home/dev/sysroots/x86_64-linux/bin/arm-poky-linux-gnueabi-as
export RANLIB=/home/dev/sysroots/x86_64-linux/bin/arm-poky-linux-gnueabi-ranlib

Open a new terminal after editing the file so that the changes take effect.

Use the following command to confirm that the cross-compilation toolchain is set:

echo $CC

If it prints /home/dev/sysroots/x86_64-linux/bin/arm-poky-linux-gnueabi-gcc, the toolchain is configured. To switch back to native compilation, comment out the lines above.
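Checking only CC can miss a typo in one of the other variables. As a rough sketch, the check can be extended to every exported tool; the function name `check_cross_env` is hypothetical, and it assumes each variable holds a path to an executable.

```shell
# Hypothetical helper: report any toolchain variable that is unset
# or does not point to an executable file.
check_cross_env() {
    status=0
    for name in CC CXX LD AR AS RANLIB; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            status=1
        elif [ ! -x "$value" ]; then
            echo "$name is not an executable file: $value" >&2
            status=1
        fi
    done
    return "$status"
}
```

Running `check_cross_env && echo "toolchain OK"` in a fresh terminal prints the confirmation only when all six variables resolve.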

  3. Prepare the cross-compilation configuration. The following Bazel configuration files need to be created or modified:

    • ./WORKSPACE
    • ./arm_compiler/
      • BUILD
      • build_armv7.sh
      • cross_toolchain_target_armv7.BUILD
      • CROSSTOOL

    3.1 Modify WORKSPACE. Add the following to the WORKSPACE file in the TensorFlow root directory:

new_local_repository(
    name ='toolchain_target_armv7',   # cross-compiler alias
    path ='/home/dev/sysroots/x86_64-linux',  # cross-compiler path, adjustable
    build_file = 'arm_compiler/cross_toolchain_target_armv7.BUILD'   # cross-compiler description file
)
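The build_file named above has to define whatever targets the rest of the configuration pulls from @toolchain_target_armv7; the BUILD file in 3.4 refers to a target called compiler_pieces. The sketch below writes one possible description file. The helper name `write_toolchain_build` and the glob patterns are assumptions, derived only from the SDK paths used elsewhere in this article, so adjust them to the actual SDK layout.

```shell
# Hypothetical helper: write a minimal description file into directory $1.
write_toolchain_build() {
    mkdir -p "$1"
    cat > "$1/cross_toolchain_target_armv7.BUILD" <<'EOF'
package(default_visibility = ["//visibility:public"])

# Assumed layout: cross tools, GCC support files and target headers,
# all relative to the SDK root given as "path" in WORKSPACE.
filegroup(
    name = "compiler_pieces",
    srcs = glob([
        "usr/bin/arm-poky-linux-gnueabi/**",
        "usr/lib/arm-poky-linux-gnueabi/**",
        "cortexa9hf/usr/include/**",
    ]),
)
EOF
}

# Example (not run here): write_toolchain_build arm_compiler
```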

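3.2 Create the cross-compiler description file cross_toolchain_target_armv7.BUILD. It has to define the compiler_pieces target that the BUILD file in 3.4 references.

3.3 Create CROSSTOOL: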

major_version: "local"
minor_version: ""
default_target_cpu: "armv7"

default_toolchain {
  cpu: "armv7"
  toolchain_identifier: "arm-poky-linux-gnueabi"
}

default_toolchain {
  cpu: "k8"
  toolchain_identifier: "local"
}

toolchain {
  abi_version: "gcc"
  abi_libc_version: "glibc_2.23"
  builtin_sysroot: ""
  compiler: "compiler"
  host_system_name: "armv7"
  needsPic: true
  supports_gold_linker: false
  supports_incremental_linker: false
  supports_fission: false
  supports_interface_shared_objects: false
  supports_normalizing_ar: true
  supports_start_end_lib: false
  supports_thin_archives: true
  target_libc: "glibc_2.23"
  target_cpu: "armv7"
  target_system_name: "armv7"
  toolchain_identifier: "arm-poky-linux-gnueabi"

  tool_path { name: "ar" path: "/home/dev/sysroots/x86_64-linux/usr/bin/arm-poky-linux-gnueabi/arm-poky-linux-gnueabi-ar" }
  tool_path { name: "compat-ld" path: "/home/dev/sysroots/x86_64-linux/usr/bin/arm-poky-linux-gnueabi/arm-poky-linux-gnueabi-ld" }
  tool_path { name: "cpp" path: "/home/dev/sysroots/x86_64-linux/usr/bin/arm-poky-linux-gnueabi/arm-poky-linux-gnueabi-cpp" }
  tool_path { name: "dwp" path: "/home/dev/sysroots/x86_64-linux/usr/bin/arm-poky-linux-gnueabi/arm-poky-linux-gnueabi-dwp" }
  tool_path { name: "gcc" path: "/home/dev/sysroots/x86_64-linux/usr/bin/arm-poky-linux-gnueabi/arm-poky-linux-gnueabi-gcc" }
  tool_path { name: "gcov" path: "/home/dev/sysroots/x86_64-linux/usr/bin/arm-poky-linux-gnueabi/arm-poky-linux-gnueabi-gcov" }
  tool_path { name: "ld" path: "/home/dev/sysroots/x86_64-linux/usr/bin/arm-poky-linux-gnueabi/arm-poky-linux-gnueabi-ld" }
  tool_path { name: "nm" path: "/home/dev/sysroots/x86_64-linux/usr/bin/arm-poky-linux-gnueabi/arm-poky-linux-gnueabi-nm" }
  tool_path { name: "objcopy" path: "/home/dev/sysroots/x86_64-linux/usr/bin/arm-poky-linux-gnueabi/arm-poky-linux-gnueabi-objcopy" }
  objcopy_embed_flag: "-I"
  objcopy_embed_flag: "binary"
  tool_path { name: "objdump" path: "/home/dev/sysroots/x86_64-linux/usr/bin/arm-poky-linux-gnueabi/arm-poky-linux-gnueabi-objdump" }
  tool_path { name: "strip" path: "/home/dev/sysroots/x86_64-linux/usr/bin/arm-poky-linux-gnueabi/arm-poky-linux-gnueabi-strip" }

  compiler_flag: "-nostdinc"
  compiler_flag: "-isystem"
  compiler_flag: "/home/dev/sysroots/x86_64-linux/cortexa9hf/usr/include"
  compiler_flag: "-isystem"
  compiler_flag: "/home/dev/sysroots/x86_64-linux/usr/lib/arm-poky-linux-gnueabi/gcc/arm-poky-linux-gnueabi/5.3.0/include"
  compiler_flag: "-isystem"
  compiler_flag: "/home/dev/sysroots/x86_64-linux/usr/lib/arm-poky-linux-gnueabi/gcc/arm-poky-linux-gnueabi/5.3.0/include-fixed"
  compiler_flag: "-isystem"
  compiler_flag: "/home/dev/sysroots/x86_64-linux/cortexa9hf/usr/include/c++/5.3.0"
  compiler_flag: "-isystem"
  compiler_flag: "/home/dev/sysroots/x86_64-linux/cortexa9hf/usr/include/c++/5.3.0/arm-poky-linux-gnueabi"

  cxx_flag: "-isystem"
  cxx_flag: "/home/dev/sysroots/x86_64-linux/cortexa9hf/usr/include"
  cxx_flag: "-isystem"
  cxx_flag: "/home/dev/sysroots/x86_64-linux/usr/lib/arm-poky-linux-gnueabi/gcc/arm-poky-linux-gnueabi/5.3.0/include"
  cxx_flag: "-isystem"
  cxx_flag: "/home/dev/sysroots/x86_64-linux/usr/lib/arm-poky-linux-gnueabi/gcc/arm-poky-linux-gnueabi/5.3.0/include-fixed"
  cxx_flag: "-isystem"
  cxx_flag: "/home/dev/sysroots/x86_64-linux/cortexa9hf/usr/include/c++/5.3.0"
  cxx_flag: "-isystem"
  cxx_flag: "/home/dev/sysroots/x86_64-linux/cortexa9hf/usr/include/c++/5.3.0/arm-poky-linux-gnueabi"
  cxx_flag: "-std=c++11"

  cxx_builtin_include_directory: "/home/dev/sysroots/x86_64-linux/cortexa9hf/usr/include"
  cxx_builtin_include_directory: "/home/dev/sysroots/x86_64-linux/usr/lib/arm-poky-linux-gnueabi/gcc/arm-poky-linux-gnueabi/5.3.0/include"
  cxx_builtin_include_directory: "/home/dev/sysroots/x86_64-linux/usr/lib/arm-poky-linux-gnueabi/gcc/arm-poky-linux-gnueabi/5.3.0/include-fixed"
  cxx_builtin_include_directory: "/home/dev/sysroots/x86_64-linux/cortexa9hf/usr/include/c++/5.3.0"
  cxx_builtin_include_directory: "/home/dev/sysroots/x86_64-linux/cortexa9hf/usr/include/c++/5.3.0/arm-poky-linux-gnueabi"

  linker_flag: "-lstdc++"
  linker_flag: "-L/home/dev/sysroots/x86_64-linux/lib"
  linker_flag: "-L/home/dev/sysroots/x86_64-linux/usr/lib"
  linker_flag: "-L/home/dev/sysroots/x86_64-linux/cortexa9hf/lib"
  linker_flag: "-L/home/dev/sysroots/x86_64-linux/cortexa9hf/usr/lib"
  linker_flag: "-Wl,--dynamic-linker=/lib/ld-linux-armhf.so.3"

  unfiltered_cxx_flag: "-no-canonical-prefixes"
  linker_flag: "-no-canonical-prefixes"

  unfiltered_cxx_flag: "-Wno-builtin-macro-redefined"
  unfiltered_cxx_flag: "-D__DATE__=\"redacted\""
  unfiltered_cxx_flag: "-D__TIMESTAMP__=\"redacted\""
  unfiltered_cxx_flag: "-D__TIME__=\"redacted\""

  compiler_flag: "-U_FORTIFY_SOURCE"
  compiler_flag: "-fstack-protector"
  compiler_flag: "-fPIE"
  linker_flag: "-pie"
  linker_flag: "-Wl,-z,relro,-z,now"

  compiler_flag: "-fdiagnostics-color=always"

  compiler_flag: "-Wall"
  compiler_flag: "-Wunused-but-set-parameter"
  compiler_flag: "-Wno-free-nonheap-object"

  compiler_flag: "-fno-omit-frame-pointer"

  linker_flag: "-Wl,--build-id=md5"
  linker_flag: "-Wl,--hash-style=gnu"

  compilation_mode_flags {
    mode: DBG
    compiler_flag: "-g"
  }
  compilation_mode_flags {
    mode: OPT
    compiler_flag: "-g0"
    compiler_flag: "-O3"
    compiler_flag: "-DNDEBUG"
    compiler_flag: "-mcpu=cortex-a9"
    compiler_flag: "-mfpu=neon"
  }
}

toolchain {
  toolchain_identifier: "local"
  abi_libc_version: "local"
  abi_version: "local"
  builtin_sysroot: ""
  compiler: "compiler"
  compiler_flag: "-U_FORTIFY_SOURCE"
  compiler_flag: "-D_FORTIFY_SOURCE=2"
  compiler_flag: "-fstack-protector"
  compiler_flag: "-Wall"
  compiler_flag: "-Wl,-z,-relro,-z,now"
  compiler_flag: "-B/usr/bin"
  compiler_flag: "-B/usr/bin"
  compiler_flag: "-Wunused-but-set-parameter"
  compiler_flag: "-Wno-free-nonheap-object"
  compiler_flag: "-fno-omit-frame-pointer"
  compiler_flag: "-isystem"
  compiler_flag: "/usr/include"
  cxx_builtin_include_directory: "/usr/include/c++/5.4.0"
  cxx_builtin_include_directory: "/usr/include/c++/5"
  cxx_builtin_include_directory: "/usr/lib/gcc/x86_64-linux-gnu/5/include"
  cxx_builtin_include_directory: "/usr/include/x86_64-linux-gnu/c++/5.4.0"
  cxx_builtin_include_directory: "/usr/include/c++/5.4.0/backward"
  cxx_builtin_include_directory: "/usr/lib/gcc/x86_64-linux-gnu/5.4.0/include"
  cxx_builtin_include_directory: "/usr/local/include"
  cxx_builtin_include_directory: "/usr/lib/gcc/x86_64-linux-gnu/5.4.0/include-fixed"
  cxx_builtin_include_directory: "/usr/lib/gcc/x86_64-linux-gnu/5/include-fixed"
  cxx_builtin_include_directory: "/usr/include/x86_64-linux-gnu"
  cxx_builtin_include_directory: "/usr/include"
  cxx_flag: "-std=c++11"
  host_system_name: "local"
  linker_flag: "-lstdc++"
  linker_flag: "-lm"
  linker_flag: "-Wl,-no-as-needed"
  linker_flag: "-B/usr/bin"
  linker_flag: "-B/usr/bin"
  linker_flag: "-pass-exit-codes"
  needsPic: true
  objcopy_embed_flag: "-I"
  objcopy_embed_flag: "binary"
  supports_fission: false
  supports_gold_linker: false
  supports_incremental_linker: false
  supports_interface_shared_objects: false
  supports_normalizing_ar: false
  supports_start_end_lib: false
  supports_thin_archives: false
  target_cpu: "k8"
  target_libc: "local"
  target_system_name: "local"
  unfiltered_cxx_flag: "-fno-canonical-system-headers"
  unfiltered_cxx_flag: "-Wno-builtin-macro-redefined"
  unfiltered_cxx_flag: "-D__DATE__=\"redacted\""
  unfiltered_cxx_flag: "-D__TIMESTAMP__=\"redacted\""
  unfiltered_cxx_flag: "-D__TIME__=\"redacted\""
  tool_path {name: "ar" path: "/usr/bin/ar" }
  tool_path {name: "cpp" path: "/usr/bin/cpp" }
  tool_path {name: "dwp" path: "/usr/bin/dwp" }
  tool_path {name: "gcc" path: "/usr/bin/gcc" }
  tool_path {name: "gcov" path: "/usr/bin/gcov" }
  tool_path {name: "ld" path: "/usr/bin/ld" }
  tool_path {name: "nm" path: "/usr/bin/nm" }
  tool_path {name: "objcopy" path: "/usr/bin/objcopy" }
  tool_path {name: "objdump" path: "/usr/bin/objdump" }
  tool_path {name: "strip" path: "/usr/bin/strip" }

  compilation_mode_flags {
    mode: DBG
    compiler_flag: "-g"
  }
  compilation_mode_flags {
    mode: OPT
    compiler_flag: "-g0"
    compiler_flag: "-O3"
    compiler_flag: "-DNDEBUG"
    compiler_flag: "-ffunction-sections"
    compiler_flag: "-fdata-sections"
    linker_flag: "-Wl,--gc-sections"
  }
  linking_mode_flags { mode: DYNAMIC }
}
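A wrong tool_path entry usually surfaces only as a late build failure, so it can help to check the paths up front. The following sketch extracts every tool_path from a CROSSTOOL file and reports the ones that are not executable. The function name `check_tool_paths` is hypothetical, and it assumes one tool_path per line with no spaces in the path, as in the listing above.

```shell
# Hypothetical helper: verify that each tool_path in CROSSTOOL file $1 exists.
check_tool_paths() {
    missing=0
    # Pull the quoted value after "path:" from every tool_path line.
    for tool in $(sed -n 's/.*tool_path *{.*path: *"\([^"]*\)".*/\1/p' "$1"); do
        if [ ! -x "$tool" ]; then
            echo "not found or not executable: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (not run here): check_tool_paths arm_compiler/CROSSTOOL
```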

3.4 Create BUILD

package(default_visibility = ["//visibility:public"])

cc_toolchain_suite(
    name = "toolchain",
    toolchains = {
        "armv7|compiler": ":cc-compiler-armv7",
        "k8|compiler": ":cc-compiler-local",
    },
)

filegroup(
    name = "empty",
    srcs = [],
)

filegroup(
    name = "arm_linux_all_files",
    srcs = [
        "@toolchain_target_armv7//:compiler_pieces",
    ],
)

cc_toolchain(
    name = "cc-compiler-local",
    all_files = ":empty",
    compiler_files = ":empty",
    cpu = "local",
    dwp_files = ":empty",
    dynamic_runtime_libs = [":empty"],
    linker_files = ":empty",
    objcopy_files = ":empty",
    static_runtime_libs = [":empty"],
    strip_files = ":empty",
    supports_param_files = 1,
)

cc_toolchain(
    name = "cc-compiler-armv7",
    all_files = ":arm_linux_all_files",
    compiler_files = ":arm_linux_all_files",
    cpu = "armv7",
    dwp_files = ":empty",
    dynamic_runtime_libs = [":empty"],
    linker_files = ":arm_linux_all_files",
    objcopy_files = ":arm_linux_all_files",
    static_runtime_libs = [":empty"],
    strip_files = ":arm_linux_all_files",
    supports_param_files = 1,
)

3.5 Add nsync cross-compilation support. The upstream nsync build settings may not cover this cross-compiler, so nsync's BUILD file in ~/.cache/bazel/_bazel_<user>/<random_dir>/external/nsync/BUILD needs to be modified:

3.5.1 List the Bazel cache directory to find the per-user folder: ls ~/.cache/bazel/ (it is named _bazel_<user>).

3.5.2 Enter it and find the most recent randomly named folder by timestamp.

3.5.3 Inside that folder (for example 7924169126bef9c95805dc831e19e9c3), open external/nsync/BUILD and make these additions:

Add config_setting:

config_setting(
    name = "armv7",
    values = {"cpu": "armv7"},  # must match the --cpu=armv7 value passed to bazel build
)
config_setting(
    name = "armv8",
    values = {"cpu": "arm64-v8a"},
)

In NSYNC_OPTS, extend the select blocks to include ":armv7" and ":armv8". In NSYNC_SRC_PLATFORM, add:

":armv7": NSYNC_SRC_LINUX,
":armv8": NSYNC_SRC_LINUX,

Similarly update NSYNC_TEST_SRC_PLATFORM.
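Finding the right cache folder for the nsync BUILD file in 3.5 can be scripted. This is a sketch under the assumption that output folders have 32-character hexadecimal names like the example in 3.5.3; the function name `latest_output_base` is hypothetical. Running bazel info output_base inside the TensorFlow tree should print the same directory.

```shell
# Hypothetical helper: print the most recently modified Bazel output folder
# under the per-user cache root given as $1.
latest_output_base() {
    # List directories newest first, keep only 32-hex-digit names,
    # take the first one and drop the trailing slash.
    ls -td "$1"/*/ 2>/dev/null \
        | grep -E '/[0-9a-f]{32}/$' \
        | head -n 1 \
        | sed 's:/$::'
}

# Example (not run here):
#   "${EDITOR:-vi}" "$(latest_output_base ~/.cache/bazel/_bazel_$USER)/external/nsync/BUILD"
```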

  4. Run the configure script:

    cd tensorflow && ./configure

    Notes:

    • Specify the Python library location.
    • Optimization flags: the default -march=native must be changed for cross-compilation, for example to -march=armv7-a.
    • Select No for all other options.
  5. Create the shell script build_armv7.sh, which cross-compiles the main TensorFlow part:

    bazel build --copt="-fPIC" --copt="-march=armv7-a" --cxxopt="-fPIC" --cxxopt="-march=armv7-a" --verbose_failures --crosstool_top=//arm_compiler:toolchain --cpu=armv7 --config=opt tensorflow/examples/label_image/...
    

    You can pass --jobs to set the number of parallel compilation jobs; by default it uses CPU cores × 2. Various errors may occur during compilation; consult collections of TensorFlow cross-compilation errors for solutions. After a successful build, the following target files are generated:

    • bazel-bin/tensorflow/libtensorflow_framework.so
    • bazel-bin/tensorflow/examples/label_image/label_image
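Before copying the generated files to the board, it is worth confirming that they were really built for ARM and not for the host. The sketch below reads the machine field of the ELF header, where 40 is EM_ARM and 62 is EM_X86_64. The function names are hypothetical, and the code assumes a little-endian ELF file, which is the case for this target.

```shell
# Hypothetical helper: print the low byte of e_machine (offset 18 in the
# ELF header) of file $1 as a decimal number.
elf_machine() {
    od -An -tu1 -j18 -N1 "$1" | tr -d ' \n'
}

# Succeeds only when the file's machine type is EM_ARM (40).
is_arm_elf() {
    [ "$(elf_machine "$1")" = "40" ]
}

# Example (not run here):
#   is_arm_elf bazel-bin/tensorflow/examples/label_image/label_image && echo "ARM binary"
```

If the `file` utility is installed, running it on the same path gives a human-readable version of this check.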
  6. Cross-compile TensorFlow Lite. Reference: tensorflow/contrib/lite/g3doc/rpi.md. Before compiling, modify the multi-threading code (the default is 4 threads):

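const Eigen::ThreadPoolDevice& GetThreadPoolDevice() {
  // Default to a single thread; OMP_NUM_THREADS overrides it when set.
  int thread_count = 1;
  const char* val = getenv("OMP_NUM_THREADS");
  if (val != nullptr) {
    thread_count = atoi(val);
  }
  // atoi returns 0 for non-numeric input, so fall back to one thread.
  if (thread_count < 1) {
    thread_count = 1;
  }
  // The pool and device are created once, on the first call.
  static Eigen::ThreadPool* tp = new Eigen::ThreadPool(thread_count);
  static EigenThreadPoolWrapper* thread_pool_wrapper =
      new EigenThreadPoolWrapper(tp);
  static Eigen::ThreadPoolDevice* device =
      new Eigen::ThreadPoolDevice(thread_pool_wrapper, thread_count);
  return *device;
}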

Step 1: Run ./tensorflow/contrib/lite/download_dependencies.sh (it only needs to run once if it succeeds).

Step 2: Based on ./tensorflow/contrib/lite/build_rpi_lib.sh, create a script for the i.MX6, build_imx6_lib.sh:

#!/bin/bash
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR/../../.."
CC_PREFIX=arm-poky-linux-gnueabi- make -j 3 -f tensorflow/contrib/lite/Makefile TARGET=imx6 TARGET_ARCH=armv7a

Step 3: Copy the downloaded FlatBuffers into ./tensorflow/contrib/lite/schema.

On success, the following targets are generated:

  • tensorflow/contrib/lite/gen/lib/rpi_armv7/libtensorflow-lite.a
  • tensorflow/contrib/lite/gen/bin/rpi_armv7/benchmark_model
  7. Cross-compile the TensorFlow Lite label_image test tool:

    bazel build --copt="-fPIC" --copt="-march=armv7-a" --cxxopt="-fPIC" --cxxopt="-march=armv7-a" --verbose_failures --crosstool_top=//arm_compiler:toolchain --cpu=armv7 --config=opt //tensorflow/contrib/lite/examples/label_image:label_image
    
  8. Natively compile the TOCO model conversion tool. TOCO runs on x86 machines. To keep the runtime environment on the target small, TensorFlow Lite works with FlatBuffers models, so convert models with TOCO before testing. Build it with the cross-compilation options disabled:

    bazel build tensorflow/contrib/lite/toco:toco

    Generated target:

    • bazel-bin/tensorflow/contrib/lite/toco/toco (keep it in place for execution)

TensorFlow Lite cross-compilation environment setup complete.

Tags: tensorflow
