===============================================================
libbcc: A Versatile Bitcode Execution Engine for Mobile Devices
===============================================================

Introduction
------------

libbcc is an LLVM bitcode execution engine that compiles the bitcode
to an in-memory executable. libbcc is versatile because:

* it implements both AOT (Ahead-of-Time) and JIT (Just-in-Time)
  compilation.

* it attempts to address the design constraints of Android devices,
  which demand fast start-up time, small size, and high performance
  *at the same time*.

* it supports on-device linking. Each device vendor can supply its
  own runtime bitcode library (lib*.bc) that differentiates its
  system. Specialization becomes ecosystem-friendly.

libbcc provides:

* a *just-in-time bitcode compiler*, which translates the LLVM bitcode
  into machine code

* a *caching mechanism*, which can:

  * after each compilation, serialize the in-memory executable into a
    cache file. Note that the compilation is triggered by a cache
    miss.

  * load from the cache file upon a cache hit.
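
The cache-miss and cache-hit paths above can be sketched as follows. This is
only an illustration, not libbcc's implementation: ``cache_entry``,
``executable``, ``compile_bitcode`` and ``prepare_executable`` are hypothetical
names, the cache is an in-memory slot rather than a file, and the "compiler"
merely copies bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { IMAGE_MAX = 64 };

/* Hypothetical stand-in for a cache file. */
typedef struct {
    bool          valid;             /* true once an image was serialized */
    size_t        size;
    unsigned char image[IMAGE_MAX];
} cache_entry;

/* Hypothetical stand-in for the in-memory executable. */
typedef struct {
    unsigned char code[IMAGE_MAX];
    size_t        size;
    int           compile_count;     /* how many times the compiler ran */
} executable;

/* Stand-in for the JIT compiler: it only copies the input bytes. */
static void compile_bitcode(executable *exe,
                            const unsigned char *bitcode, size_t n)
{
    if (n > IMAGE_MAX)
        n = IMAGE_MAX;
    memcpy(exe->code, bitcode, n);
    exe->size = n;
    exe->compile_count++;
}

/* Returns true on a cache hit, false when a cache miss forced a compile. */
bool prepare_executable(executable *exe, cache_entry *cache,
                        const unsigned char *bitcode, size_t n)
{
    if (cache->valid) {
        /* Cache hit: load the serialized image, skip compilation. */
        memcpy(exe->code, cache->image, cache->size);
        exe->size = cache->size;
        return true;
    }
    /* Cache miss: compile, then serialize the result for the next launch. */
    compile_bitcode(exe, bitcode, n);
    memcpy(cache->image, exe->code, exe->size);
    cache->size  = exe->size;
    cache->valid = true;
    return false;
}
```

In this sketch the first call with an empty ``cache_entry`` compiles and fills
the cache; a second call with the same entry loads the image and leaves
``compile_count`` unchanged.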

Highlights of libbcc are:

* libbcc supports bitcode from various language frontends, such as
  Renderscript and GLSL (pixelflinger2).

* libbcc strives to balance library size, launch time, and
  steady-state performance:

  * The size of libbcc is aggressively reduced for mobile devices. We
    customize and improve upon the default Execution Engine from
    upstream. Otherwise, libbcc's execution engine can easily become
    at least twice as large.

  * To reduce launch time, we support caching of
    binaries. Just-in-Time compilation is often Just-too-Late
    if the given apps are performance-sensitive. Thus, we implemented
    AOT to get the best of both worlds: fast launch time and high
    steady-state performance.

    AOT is also important for projects such as NDK on LLVM with
    portability enhancement. The launch time reduction after we
    implemented AOT is significant::

      Apps            libbcc without AOT       libbcc with AOT
                      launch time in libbcc    launch time in libbcc
      App_1           1218ms                   9ms
      App_2           842ms                    4ms
      Wallpaper:
        MagicSmoke    182ms                    3ms
        Halo          127ms                    3ms
        Balls         149ms                    3ms
        SceneGraph    146ms                    90ms
        Model         104ms                    4ms
        Fountain      57ms                     3ms

    AOT also masks the launch time overhead of on-device linking
    and helps make on-device linking practical.

  * For steady-state performance, we enable VFP3 and aggressive
    optimizations.

  * Currently we disable Lazy JITting.


API
---

**Basic:**

* **bccCreateScript** - Create new bcc script

* **bccRegisterSymbolCallback** - Register the callback function for external
  symbol lookup

* **bccReadBC** - Set the source bitcode for compilation

* **bccReadModule** - Set the llvm::Module for compilation

* **bccLinkBC** - Set the library bitcode for linking

* **bccPrepareExecutable** - *deprecated* - Use bccPrepareExecutableEx instead

* **bccPrepareExecutableEx** - Create the in-memory executable by either
  just-in-time compilation or cache loading

* **bccGetFuncAddr** - Get the entry address of the function

* **bccDisposeScript** - Destroy bcc script and release the resources

* **bccGetError** - *deprecated* - Don't use this
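
As an illustration of what an external symbol lookup callback might do, the
sketch below resolves a symbol name against a small table. The callback shape
(an opaque context pointer plus a symbol name, returning an address or NULL)
is an assumption made for this example; the authoritative prototype is the one
declared in the libbcc headers. ``symbol_entry``, ``symbol_table`` and
``lookup_symbol`` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table of symbols supplied by the execution environment. */
typedef struct {
    const char *name;
    void       *addr;
} symbol_entry;

typedef struct {
    const symbol_entry *entries;
    size_t              count;
} symbol_table;

/*
 * Assumed callback shape: (context, symbol name) -> address, or NULL when
 * the symbol is unknown. The context carries the table to search.
 */
void *lookup_symbol(void *context, const char *name)
{
    const symbol_table *table = (const symbol_table *)context;
    for (size_t i = 0; i < table->count; i++) {
        if (strcmp(table->entries[i].name, name) == 0)
            return table->entries[i].addr;
    }
    return NULL;
}
```

A caller would register such a function once, together with a pointer to its
table as the context, before preparing the executable.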

**Reflection:**

* **bccGetExportVarCount** - Get the count of exported variables

* **bccGetExportVarList** - Get the addresses of exported variables

* **bccGetExportFuncCount** - Get the count of exported functions

* **bccGetExportFuncList** - Get the addresses of exported functions

* **bccGetPragmaCount** - Get the count of pragmas

* **bccGetPragmaList** - Get the pragmas

**Debug:**

* **bccGetFuncCount** - Get the count of functions (including non-exported)

* **bccGetFuncInfoList** - Get the function information (name, base, size)


Cache File Format
-----------------

A cache file (denoted as \*.oBCC) for libbcc consists of several sections:
header, string pool, dependencies table, relocation table, exported
variable list, exported function list, pragma list, function information
table, and bcc context. Every section should be aligned to a word size.
Here is a brief description of each section:

* **Header** (MCO_Header) - The header of a cache file. It contains the
  magic word, version, machine integer type information (the endianness,
  the size of off_t, size_t, and ptr_t), and the size
  and offset of other sections. The header section is guaranteed
  to be at the beginning of the cache file.

* **String Pool** (MCO_StringPool) - A collection of serialized
  variable-length strings. A strp_index elsewhere in the cache file
  is the index of a string in this string pool.

* **Dependencies Table** (MCO_DependencyTable) - The dependencies table.
  This table stores the resource name (or file path), the resource
  type (whether in the APK or on the file system), and the SHA1 checksum.

* **Relocation Table** (MCO_RelocationTable) - *not enabled*

* **Exported Variable List** (MCO_ExportVarList) -
  The list of the addresses of exported variables.

* **Exported Function List** (MCO_ExportFuncList) -
  The list of the addresses of exported functions.

* **Pragma List** (MCO_PragmaList) - The list of pragma key-value pairs.

* **Function Information Table** (MCO_FuncTable) - This is a table of
  function information, such as function name, function entry address,
  and function binary size. The table should also be ordered by
  function name.

* **Context** - The context of the in-memory executable, including
  the code and the data. The offset of the context should be aligned to
  the page size, so that we can mmap the context directly into memory.
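
The two alignment rules above (word alignment for every section, page
alignment for the context) can be sketched as a small layout pass. This is a
minimal illustration under stated assumptions, not the logic in
CacheWriter.cpp: ``section_span``, ``align_up`` and ``layout_sections`` are
hypothetical names, the context is assumed to be the last section, and both
alignments are assumed to be powers of two.

```c
#include <stddef.h>

/* Hypothetical description of one section inside the cache file. */
typedef struct {
    size_t offset;  /* filled in by layout_sections() */
    size_t size;    /* supplied by the caller */
} section_span;

/* Round offset up to the next multiple of alignment (a power of two). */
size_t align_up(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

/*
 * Place the sections after the header. Every section starts on a word
 * boundary, except the last one (assumed to be the context), which starts
 * on a page boundary so that it could be mapped directly.
 * Returns the total file size.
 */
size_t layout_sections(section_span *sections, size_t count,
                       size_t header_size, size_t word_size,
                       size_t page_size)
{
    size_t cursor = header_size;
    for (size_t i = 0; i < count; i++) {
        size_t alignment = (i + 1 == count) ? page_size : word_size;
        cursor = align_up(cursor, alignment);
        sections[i].offset = cursor;
        cursor += sections[i].size;
    }
    return cursor;
}
```

With a 10-byte header, a 4-byte word, a 4096-byte page, and section sizes of
5, 3 and 100 bytes, this places the sections at offsets 12, 20 and 4096.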

For further information, see `bcc_cache.h <include/bcc/bcc_cache.h>`_,
`CacheReader.cpp <lib/bcc/CacheReader.cpp>`_, and
`CacheWriter.cpp <lib/bcc/CacheWriter.cpp>`_.
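
One plausible benefit of keeping the function information table ordered by
function name is that a lookup by name can use binary search. The sketch below
shows this with the standard ``bsearch``. The ``func_info`` layout is
hypothetical (it stores the name pointer directly, whereas the cache file
refers to strings through the string pool), and ``find_function`` is not a
libbcc function.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory view of one function information entry. */
typedef struct {
    const char *name;          /* function name */
    size_t      entry_offset;  /* entry address, as an offset here */
    size_t      size;          /* binary size of the function */
} func_info;

/* bsearch comparator: the key is a C string, the element a func_info. */
static int compare_name(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const func_info *)elem)->name);
}

/* Look up a function by name in a table sorted by name (strcmp order). */
const func_info *find_function(const func_info *table, size_t count,
                               const char *name)
{
    return (const func_info *)bsearch(name, table, count,
                                      sizeof table[0], compare_name);
}
```

The lookup returns NULL when the name is absent, and it is only correct when
the table really is sorted in the same order that ``strcmp`` defines.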


JIT'ed Code Calling Conventions
-------------------------------

1. Calls from Execution Environment or from/to within script:

   On ARM, the first 4 arguments will go into r0, r1, r2, and r3, in that order.
   Any remaining arguments are passed on the stack.

   For ext_vec_types such as float2, a set of registers will be used. In the case
   of float2, a register pair will be used. Specifically, if float2 is the first
   argument in the function prototype, float2.x will go into r0, and float2.y
   into r1.

   Note: the stack will be aligned to the largest alignment required by any
   argument. In the case of float2 above as an argument, the parameter stack
   will be aligned to an 8-byte boundary (if the sizes of other arguments are
   no greater than 8).

2. Calls from/to a separate compilation unit: (E.g., calls to Execution
   Environment if those runtime library callees are not compiled using LLVM.)

   On ARM, we use hardfp. Note that double will be placed in a register pair.
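
The register assignment described in item 1 can be modelled with a short
sketch. It is a simplified model, not the code generator's logic: it covers
only the four core argument registers, it assumes the AAPCS rule that an
argument with 8-byte alignment starts at an even-numbered register, and it
never splits an argument between registers and the stack. ``arg_desc`` and
``assign_arg_registers`` are hypothetical names.

```c
#include <stddef.h>

enum { ON_STACK = -1 };

/* r0-r3 carry arguments in this model. */
static const size_t core_arg_regs = 4;

/* Hypothetical argument description, measured in 4-byte words. */
typedef struct {
    size_t words;        /* 1 for int or float, 2 for float2 */
    size_t align_words;  /* 1 for 4-byte, 2 for 8-byte alignment */
} arg_desc;

/*
 * For each argument, store the index of its first register (0 means r0),
 * or ON_STACK if it does not fit into r0-r3.
 */
void assign_arg_registers(const arg_desc *args, size_t count, int *first_reg)
{
    size_t next = 0;  /* next free core register */
    for (size_t i = 0; i < count; i++) {
        /* An 8-byte aligned argument starts at an even register. */
        if (next < core_arg_regs && args[i].align_words == 2)
            next = (next + 1) & ~(size_t)1;
        if (next + args[i].words <= core_arg_regs) {
            first_reg[i] = (int)next;
            next += args[i].words;
        } else {
            /* Once one argument spills, the rest go to the stack too. */
            first_reg[i] = ON_STACK;
            next = core_arg_regs;
        }
    }
}
```

Under this model a leading float2 occupies r0 and r1, as described above. For
a prototype such as ``f(int, float2, int)``, the int takes r0, the float2
takes r2 and r3 (r1 is skipped for alignment), and the last int goes on the
stack.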