GSoC/GCI Archive
Google Summer of Code 2010

LLVM Compiler Infrastructure

Web Page:

Mailing List:


Low Level Virtual Machine (LLVM) is:

  1. A compilation strategy designed to enable effective program optimization across the entire lifetime of a program. LLVM supports effective optimization at compile time, link-time (particularly interprocedural), run-time and offline (i.e., after software is installed), while remaining transparent to developers and maintaining compatibility with existing build scripts.

  2. A virtual instruction set - LLVM is a low-level object code representation that uses simple RISC-like instructions, but provides rich, language-independent type information and dataflow (SSA) information about operands. This combination enables sophisticated transformations on object code while remaining lightweight enough to be attached to the executable, which is key to allowing link-time, run-time, and offline transformations.

  3. A compiler infrastructure - LLVM is also a collection of source code that implements the language and compilation strategy. The primary components of the LLVM infrastructure are: a GCC-based C & C++ front-end; a link-time optimization framework with a growing set of global and interprocedural analyses and transformations; static back-ends for the X86, X86-64, PowerPC 32/64, ARM, Thumb, IA-64, Alpha, SPARC, MIPS and CellSPU architectures; a back-end which emits portable C code; a Just-In-Time compiler for X86, X86-64 and PowerPC 32/64 processors; and an emitter for MSIL.

  4. LLVM does not imply things that you would expect from a high-level virtual machine. It does not require garbage collection or run-time code generation (In fact, LLVM makes a great static compiler!). Note that optional LLVM components can be used to build high-level virtual machines and other systems that need these services.
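
The SSA (static single assignment) property mentioned in item 2 can be illustrated with a toy example. The sketch below is not LLVM IR and does not use any LLVM API: the tuple-based instruction format and the helper names `is_ssa` and `evaluate` are assumptions made up for illustration. It covers only straight-line code, so it has no branches or phi nodes. It shows the core idea that every value is defined exactly once and before any use, which is what makes dataflow between operands explicit.

```python
# Toy straight-line program in SSA style (hypothetical format, not LLVM IR).
# Each instruction is (result_name, opcode, operands); an operand is either
# an integer literal or the name of an earlier result.
program = [
    ("a", "const", (2,)),
    ("b", "const", (3,)),
    ("c", "add", ("a", "b")),
    ("d", "mul", ("c", "c")),
]


def is_ssa(instrs):
    """Return True if every name is defined once and defined before use."""
    defined = set()
    for result, _opcode, operands in instrs:
        for operand in operands:
            if isinstance(operand, str) and operand not in defined:
                return False  # use of a name that has no earlier definition
        if result in defined:
            return False  # second assignment to the same name
        defined.add(result)
    return True


def evaluate(instrs):
    """Evaluate the program; each name maps to exactly one value."""
    binary_ops = {"add": lambda x, y: x + y, "mul": lambda x, y: x * y}
    values = {}
    for result, opcode, operands in instrs:
        args = [values[o] if isinstance(o, str) else o for o in operands]
        if opcode == "const":
            values[result] = args[0]
        else:
            values[result] = binary_ops[opcode](*args)
    return values
```

Because each name has a single definition, an analysis can look up where an operand comes from without tracking reassignment, which is one reason SSA-based representations are convenient for optimization.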

LLVM is a robust system, particularly well suited for developing new mid-level language-independent analyses and optimizations of all sorts, including those that require extensive interprocedural analysis. LLVM is also a great target for front-end development for conventional or research programming languages, including those which require compile-time, link-time, or run-time optimization for effective implementation, proper tail calls or garbage collection. We have an incomplete list of projects which have used LLVM for various purposes. It shows that you can get up and running quickly with LLVM, leaving time to do interesting things even if you only have a semester in a university course. We also have a list of ideas for projects in LLVM.

See for more information.







  • Adding Range Analysis to LLVM: The objective of this project is to add a range analysis to LLVM. The range analysis finds the intervals of values that integer variables used in the source program may assume. Such an analysis has many clients, such as array-bounds-check elimination, buffer overflow detection and improved register allocation. I propose to implement the polynomial-time analysis described by Zhendong Su and David Wagner in the paper "A class of polynomially solvable range constraints for interval analysis without widening".
  • Buffer Initialization and Bounds Checking: Currently the Clang compiler and static analyzer provide only very basic support for bounds and initialization checking. This project would extend that by providing new checks for functions that require an initialized argument and functions that modify part or all of an argument buffer. If possible, this functionality would be exposed via source-level annotations (__attribute__()), allowing programmers to declare intent when writing relevant functions.
  • Easily Extensible Attributes in clang: The clang C/C++/Objective-C compiler currently handles three varieties of attributes - GCC attributes, MSVC attributes, and C++0x attributes - with a different approach to each variety. This complexity is compounded by poor internal handling of attributes, which makes adding a new attribute difficult. This project intends to unify the approach taken to attributes and to make it very easy to add a new attribute and get it onto the AST, including some basic semantic checks.
  • Factor out Itanium C++ ABI support: Currently, Clang, like GCC, only supports the Itanium C++ ABI. This is only one of many C++ ABIs in existence. Others include the vendor-specific ABIs used on various UNIXes and the Microsoft Visual C++ ABI. This project proposes to introduce an interface to the IRGen library to support multiple C++ ABIs, and to refactor the Itanium C++ ABI support to use this new interface.
  • Front-end for the polyhedral optimization framework for LLVM (Polly): The goal of this project is to generate the polyhedral intermediate representation (Polly IR) from LLVM IR. It is the front-end of a larger project I am working on, the polyhedral optimization framework for LLVM (Polly). Polly will perform auto-parallelization and auto-vectorization of LLVM, which will help LLVM to generate high-quality code for parallel computing platforms.
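
As a rough illustration of what the range analysis project above computes, the sketch below models integer ranges as closed intervals and uses them to decide whether an array bounds check is redundant. It is only basic interval arithmetic, not the constraint-solving algorithm from the cited paper and not LLVM code. The `Interval` class and the `bounds_check_is_redundant` helper are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed integer interval [lo, hi] (hypothetical helper type)."""
    lo: int
    hi: int

    def __add__(self, other):
        # The sum of two ranges spans the sums of their endpoints.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def join(self, other):
        # Smallest interval covering both; used where control flow merges.
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))

    def meet(self, other):
        # Intersection; used to refine a range with a branch condition.
        lo, hi = max(self.lo, other.lo), min(self.hi, other.hi)
        return Interval(lo, hi) if lo <= hi else None

    def within(self, other):
        return other.lo <= self.lo and self.hi <= other.hi


def bounds_check_is_redundant(index, array_length):
    """True if every value in the index range is a valid array index."""
    return index.within(Interval(0, array_length - 1))


# Suppose a loop variable i is known to lie in [0, 9].
i = Interval(0, 9)
# The expression i + 1 then lies in [1, 10].
j = i + Interval(1, 1)
# At a merge point where either i or j may flow in, the range is [0, 10].
merged = i.join(j)
```

With these ranges, an access indexed by `j` into an array of 16 elements needs no run-time check, while the same access into an array of 10 elements still does, because 10 is out of bounds.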