jianglin 1c85872cc8 create repository 4 years ago
..
bcinfo 1c85872cc8 create repository 4 years ago
gdb_plugin 1c85872cc8 create repository 4 years ago
include 1c85872cc8 create repository 4 years ago
lib 1c85872cc8 create repository 4 years ago
tests 1c85872cc8 create repository 4 years ago
tools 1c85872cc8 create repository 4 years ago
Android.bp 1c85872cc8 create repository 4 years ago
CleanSpec.mk 1c85872cc8 create repository 4 years ago
NOTICE 1c85872cc8 create repository 4 years ago
OWNERS 1c85872cc8 create repository 4 years ago
README.html 1c85872cc8 create repository 4 years ago
README.rst 1c85872cc8 create repository 4 years ago
libbcc-targets.mk 1c85872cc8 create repository 4 years ago

README.html












libbcc: A Versatile Bitcode Execution Engine for Mobile Devices




Introduction


libbcc is an LLVM bitcode execution engine that compiles the bitcode
to an in-memory executable. libbcc is versatile because:



  • it implements both AOT (Ahead-of-Time) and JIT (Just-in-Time)
    compilation.

  • Android devices demand fast start-up time, small size, and high
    performance at the same time. libbcc attempts to address these
    design constraints.

  • it supports on-device linking. Each device vendor can supply his or
    her own runtime bitcode library (lib*.bc) that differentiates his or
    her system. Specialization becomes ecosystem-friendly.


libbcc provides:



  • a just-in-time bitcode compiler, which translates the LLVM bitcode
    into machine code

  • a caching mechanism, which can:

    • after each compilation, serialize the in-memory executable into a
      cache file. Note that the compilation is triggered by a cache
      miss.

    • load from the cache file upon cache-hit.




Highlights of libbcc are:



  • libbcc supports bitcode from various language frontends, such as
    Renderscript, GLSL (pixelflinger2).



  • libbcc strives to balance between library size, launch time and
    steady-state performance:



    • The size of libbcc is aggressively reduced for mobile devices. We
      customize and improve upon the default Execution Engine from
      upstream. Otherwise, libbcc's execution engine can easily become
      at least 2 times bigger.



    • To reduce launch time, we support caching of
      binaries. Just-in-Time compilation are oftentimes Just-too-Late,
      if the given apps are performance-sensitive. Thus, we implemented
      AOT to get the best of both worlds: Fast launch time and high
      steady-state performance.


      AOT is also important for projects such as NDK on LLVM with
      portability enhancement. Launch time reduction after we
      implemented AOT is signficant:



      Apps libbcc without AOT libbcc with AOT
      launch time in libbcc launch time in libbcc
      App_1 1218ms 9ms
      App_2 842ms 4ms
      Wallpaper:
      MagicSmoke 182ms 3ms
      Halo 127ms 3ms
      Balls 149ms 3ms
      SceneGraph 146ms 90ms
      Model 104ms 4ms
      Fountain 57ms 3ms

      AOT also masks the launching time overhead of on-device linking
      and helps it become reality.



    • For steady-state performance, we enable VFP3 and aggressive
      optimizations.





  • Currently we disable Lazy JITting.






API


Basic:



  • bccCreateScript - Create new bcc script

  • bccRegisterSymbolCallback - Register the callback function for external
    symbol lookup

  • bccReadBC - Set the source bitcode for compilation

  • bccReadModule - Set the llvm::Module for compilation

  • bccLinkBC - Set the library bitcode for linking

  • bccPrepareExecutable - deprecated - Use bccPrepareExecutableEx instead

  • bccPrepareExecutableEx - Create the in-memory executable by either
    just-in-time compilation or cache loading

  • bccGetFuncAddr - Get the entry address of the function

  • bccDisposeScript - Destroy bcc script and release the resources

  • bccGetError - deprecated - Don't use this


Reflection:



  • bccGetExportVarCount - Get the count of exported variables

  • bccGetExportVarList - Get the addresses of exported variables

  • bccGetExportFuncCount - Get the count of exported functions

  • bccGetExportFuncList - Get the addresses of exported functions

  • bccGetPragmaCount - Get the count of pragmas

  • bccGetPragmaList - Get the pragmas


Debug:



  • bccGetFuncCount - Get the count of functions (including non-exported)

  • bccGetFuncInfoList - Get the function information (name, base, size)




Cache File Format


A cache file (denoted as *.oBCC) for libbcc consists of several sections:
header, string pool, dependencies table, relocation table, exported
variable list, exported function list, pragma list, function information
table, and bcc context. Every section should be aligned to a word size.
Here is the brief description of each sections:



  • Header (MCO_Header) - The header of a cache file. It contains the
    magic word, version, machine integer type information (the endianness,
    the size of off_t, size_t, and ptr_t), and the size
    and offset of other sections. The header section is guaranteed
    to be at the beginning of the cache file.

  • String Pool (MCO_StringPool) - A collection of serialized variable
    length strings. The strp_index in the other part of the cache file
    represents the index of such string in this string pool.

  • Dependencies Table (MCO_DependencyTable) - The dependencies table.
    This table stores the resource name (or file path), the resource
    type (rather in APK or on the file system), and the SHA1 checksum.

  • Relocation Table (MCO_RelocationTable) - not enabled

  • Exported Variable List (MCO_ExportVarList) -
    The list of the addresses of exported variables.

  • Exported Function List (MCO_ExportFuncList) -
    The list of the addresses of exported functions.

  • Pragma List (MCO_PragmaList) - The list of pragma key-value pair.

  • Function Information Table (MCO_FuncTable) - This is a table of
    function information, such as function name, function entry address,
    and function binary size. Besides, the table should be ordered by
    function name.

  • Context - The context of the in-memory executable, including
    the code and the data. The offset of context should aligned to
    a page size, so that we can mmap the context directly into memory.


For furthur information, you may read bcc_cache.h,
CacheReader.cpp, and
CacheWriter.cpp for details.




JIT'ed Code Calling Conventions



  1. Calls from Execution Environment or from/to within script:


    On ARM, the first 4 arguments will go into r0, r1, r2, and r3, in that order.
    The remaining (if any) will go through stack.


    For ext_vec_types such as float2, a set of registers will be used. In the case
    of float2, a register pair will be used. Specifically, if float2 is the first
    argument in the function prototype, float2.x will go into r0, and float2.y,
    r1.


    Note: stack will be aligned to the coarsest-grained argument. In the case of
    float2 above as an argument, parameter stack will be aligned to an 8-byte
    boundary (if the sizes of other arguments are no greater than 8.)



  2. Calls from/to a separate compilation unit: (E.g., calls to Execution
    Environment if those runtime library callees are not compiled using LLVM.)


    On ARM, we use hardfp. Note that double will be placed in a register pair.