Imagine if you could run Python code everywhere. Python is an incredibly simple and mature language, but because it requires an interpreter, it either cannot run natively across platforms, or it incurs a significant performance cost compared to languages that are “closer to hardware”.

We are on a mission to build a world where you can write simple Python code yet get the performance benefits of Rust.

Function works by lowering your Python code to native code. The benefit is that developers can think and write code in a high-level language, but still get the raw performance of a low-level language.

Symbolic Tracing

Our compiler begins by building an intermediate representation (IR) of your Python function using a combination of static analysis and symbolic tracing. We use PEP 523 to hook into a sandboxed Python interpreter before your function executes, then build a trace of every operation performed within it.

PEP 523 also forms the foundation of torch.compile in PyTorch 2.0. In fact, this is what spurred the initial development of Function. Read the paper.
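To make the idea concrete, here is a minimal sketch of proxy-based symbolic tracing. This is not Function's tracer; the Node, Graph, and Proxy classes (and the filename) are purely illustrative, showing how running a function on proxy values records each operation as a graph node:

trace_sketch.py
# Illustrative sketch of symbolic tracing; not Function's implementation.
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str         # "input", "call_function", or "output"
    name: str
    target: object
    args: tuple

@dataclass
class Graph:
    nodes: list = field(default_factory=list)

    def add (self, op, name, target, args):
        node = Node(op, name, target, args)
        self.nodes.append(node)
        return node

class Proxy:
    """Stands in for a real value and records every operation applied to it."""

    def __init__ (self, node, graph):
        self.node = node
        self.graph = graph

    def __mul__ (self, other):
        node = self.graph.add("call_function", "mul", "operator.mul", (self.node.name, other))
        return Proxy(node, self.graph)

    def __add__ (self, other):
        node = self.graph.add("call_function", "add", "operator.add", (self.node.name, other))
        return Proxy(node, self.graph)

def trace (fn):
    """Run `fn` on proxy inputs and return the recorded graph."""
    graph = Graph()
    arg_names = fn.__code__.co_varnames[:fn.__code__.co_argcount]
    inputs = [Proxy(graph.add("input", name, name, ()), graph) for name in arg_names]
    result = fn(*inputs)
    graph.add("output", "output", "output", (result.node.name,))
    return graph

# Tracing records the operations without running them on real data
def f (x):
    return x * 2 + 1

for node in trace(f).nodes:
    print(node.op, node.name, node.target, node.args)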

For example, consider the following function which classifies an image:

classifier.py
from fxn import compile, Sandbox
from PIL import Image
from torchvision.models import mobilenet_v2, MobileNet_V2_Weights
from typing import Tuple

# Create MobileNet model
weights = MobileNet_V2_Weights.DEFAULT
model = mobilenet_v2(weights=weights).eval()
preprocess = weights.transforms()
labels = weights.meta["categories"]

# Define predictor
@compile(
    tag="@vision-co/image-classifier",
    description="Classify an image using the MobileNet v2 model.",
    sandbox=Sandbox().pip_install("torch", "torchvision")
)
def predict (image: Image.Image) -> Tuple[str, float]:
    batch = preprocess(image)[None]
    logits = model(batch).squeeze(0).softmax(0)
    class_id = logits.argmax().item()
    score = logits[class_id].item()
    label = labels[class_id]
    return label, score

The symbolic tracer generates a graph that looks like the following:

type           name                 target                                         args
-------------  -------------------  ---------------------------------------------  -------------------------------------------------------------------
input          image                image                                          ()
call_function  resize               <function resize at 0x11d3ae950>               (image, [232])
call_function  center_crop          <function center_crop at 0x11d3aeb90>          (resize, [224])
call_function  pil_to_tensor        <function pil_to_tensor at 0x11d3ae5f0>        (center_crop,)
call_function  convert_image_dtype  <function convert_image_dtype at 0x11d3ae680>  (pil_to_tensor, torch.float32)
call_function  normalize            <function normalize at 0x11d3ae7a0>            (convert_image_dtype,)
call_function  getitem              <built-in function getitem>                    (normalize, None)
call_module    model                model                                          (getitem,)
call_method    squeeze              squeeze                                        (model, 0)
call_method    softmax              softmax                                        (squeeze, 0)
call_method    argmax               argmax                                         (softmax,)
call_method    item                 item                                           (argmax,)
call_function  getitem_1            <built-in function getitem>                    (softmax, item)
call_method    item_1               item                                           (getitem_1,)
call_function  list_index           <function list_index at 0x11d6d5c60>           (['tench', 'goldfish', 'plow', ..., 'toilet tissue'], item)
output         output               output                                         ((list_index, item_1),)

We inject more metadata with static analysis to build a full IR for your Python code. We then lower this IR to native code.
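As a rough illustration of the kind of metadata static analysis can contribute, the sketch below (our own example, not Function's implementation) uses Python's inspect and ast modules to collect type annotations and the names of called functions:

metadata_sketch.py
# Illustrative sketch of static analysis; not Function's implementation.
import ast
import inspect

def collect_metadata (fn):
    """Gather signature annotations and called names for a function."""
    signature = inspect.signature(fn)
    metadata = {
        "name": fn.__name__,
        "parameters": { name: param.annotation for name, param in signature.parameters.items() },
        "returns": signature.return_annotation,
        "calls": []
    }
    # Walk the AST to record the name of every function called in the body
    tree = ast.parse(inspect.getsource(fn))
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            metadata["calls"].append(ast.unparse(node.func))
    return metadata

def scale (x: float, factor: float = 2.0) -> float:
    return abs(x) * factor

print(collect_metadata(scale))
# {'name': 'scale', 'parameters': {'x': <class 'float'>, 'factor': <class 'float'>}, 'returns': <class 'float'>, 'calls': ['abs']}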

Lowering to Native Code

We lower an IR graph to several different platform-specific implementations in native code. We do so by walking the graph’s nodes and finding one or more native operators that implement each node. Take the resize node in the graph above:

type           name    target                            args
-------------  ------  --------------------------------  --------------
call_function  resize  <function resize at 0x11d3ae950>  (image, [232])

We search our library of native operators for one that performs a resize operation on an image. Here’s an example targeting Apple Silicon with Accelerate.framework:

resize.mm
@import Accelerate;

Function::Image ResizeImageAppleSilicon (
    Function::Image image,
    std::span<int32_t> size
) {
    ...
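    // Scale the source ARGB8888 pixel buffer to the requested size with vImage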
    vImageScale_ARGB8888(...);
    ...
}
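Conceptually, lowering is a dispatch problem: walk the graph’s nodes and look up a native operator for each one on the target platform. The sketch below is a hypothetical illustration of that lookup; the registry, platform names, and operator names are our own, not Function’s:

lowering_sketch.py
# Hypothetical illustration of operator dispatch; not Function's implementation.

# A few nodes from the traced graph, as (opcode, name) pairs
NODES = [
    ("input", "image"),
    ("call_function", "resize"),
    ("call_function", "center_crop")
]

# Registry mapping (node name, platform) to a native operator symbol
OPERATORS = {
    ("resize", "macos-arm64"): "ResizeImageAppleSilicon",           # Accelerate.framework
    ("resize", "android-arm64"): "ResizeImageNEON",                 # hypothetical NEON kernel
    ("center_crop", "macos-arm64"): "CenterCropImageAppleSilicon"
}

def lower (nodes, platform):
    """Pick a native operator for every `call_*` node on the given platform."""
    plan = []
    for opcode, name in nodes:
        if not opcode.startswith("call"):
            continue    # inputs and outputs need no operator
        if (name, platform) not in OPERATORS:
            raise NotImplementedError(f"No {platform} operator for `{name}`")
        plan.append((name, OPERATORS[(name, platform)]))
    return plan

print(lower(NODES, "macos-arm64"))
# [('resize', 'ResizeImageAppleSilicon'), ('center_crop', 'CenterCropImageAppleSilicon')]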

This approach gives us two main benefits: each node can be lowered to the fastest implementation available on a given platform, and supporting new hardware only requires adding operators, never changing your Python code.

If you are a hardware vendor interested in enabling developers to leverage your custom accelerators, reach out to us! You’ll bring the hardware; we’ll bring the software.

Compiling Binaries

Finally, we compile the lowered native code for each of our supported targets.
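As a simplified illustration (not Function’s actual build pipeline), cross-compiling a generated C++ source file for a few targets with Clang target triples could look like this; the filenames and triples are examples only:

compile_sketch.py
# Illustrative sketch of cross-compilation; not Function's build pipeline.
import subprocess

# Clang target triples for a few illustrative platforms
TARGETS = {
    "macos-arm64": "arm64-apple-macos13",
    "ios-arm64": "arm64-apple-ios16",
    "android-arm64": "aarch64-linux-android33"
}

def compile_for_target (source: str, platform: str) -> str:
    """Cross-compile lowered native code into a shared library for one platform."""
    output = f"predictor-{platform}.so"
    # A real build would also point Clang at each platform's SDK or sysroot
    subprocess.run(
        ["clang++", "-std=c++20", "-O3", "-shared", "-fPIC",
         "-target", TARGETS[platform], source, "-o", output],
        check=True
    )
    return output

binaries = [compile_for_target("predictor.cpp", platform) for platform in TARGETS]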

If you would like to compile Python functions for a custom platform, reach out to us.