Our Mission

MLC is dedicated to democratizing artificial intelligence by making high-performance model development and deployment accessible to everyone, everywhere, on any device.

We focus on compiler-driven and system-level innovations that unlock efficient AI across platforms. While modern LLM workloads are a key focus, our work spans a broad range of models, including vision, speech, and multimodal applications. Our goal is to empower developers, researchers, and organizations to deploy advanced AI, from laptops and phones to the cloud and edge, while maintaining exceptional performance, efficiency, and privacy.

Accessibility

Making AI accessible to developers and users across all platforms and devices.

Privacy

Enabling on-device inference to protect user privacy and data security.

Efficiency

Optimizing for minimal resource usage and maximum performance.

[Diagram: MLC at the center, deploying across Mobile, Desktop, Server, and Edge platforms]

Our Journey

2023 Q1

Project Inception

The MLC-LLM project began as a research initiative to optimize large language model inference for mobile and edge devices.

2023 Q2

First Release

Launched MLC-LLM v0.1.0 with basic inference capabilities and support for popular LLM architectures.

2023 Q3

Mobile Support

Introduced the MLC-LLM Mobile framework, enabling native iOS and Android deployment with optimized performance.

2023 Q4

Community Growth

Reached 1,000 GitHub stars and established an active Discord community with more than 500 members.

2024 Q1

Production Serving

Released MLC-LLM Serve for production-ready deployment with auto-scaling and load balancing.

2024 Q2

Major Optimization

Achieved a 50% memory reduction and a 2x inference speedup through advanced compilation techniques.

2024 Q3

Enterprise Adoption

Major tech companies began adopting MLC-LLM for production workloads, reaching 10,000+ GitHub stars.

2024 Q4

Current Milestone

Released MLC-LLM v0.3.0 with major performance improvements and expanded hardware support.

Core Team

Dr. Tianqi Chen

Core Team · Carnegie Mellon University

Creator of Apache TVM and XGBoost. Leads compiler and system innovations in MLC, enabling efficient deployment of modern AI workloads including LLMs.

Dr. Zhihao Jia

Core Team · Carnegie Mellon University

Researches systems for ML and high-performance inference/serving. Contributes to scalable execution and optimization across heterogeneous hardware.

Dr. Xupeng Miao

Core Team · Purdue University

Focuses on low-latency LLM serving and compiler-runtime co-design, including speculative decoding and token-tree verification techniques.

Join Our Team: We're always looking for talented developers, researchers, and contributors to join our mission. Explore our projects or check out our tutorials to learn more about our work.

Technology Stack

Core Technologies

Python
C++
Rust
Swift
Kotlin

ML Frameworks

TVM
PyTorch
ONNX
TensorRT

Hardware Support

NVIDIA GPU
Apple Silicon
ARM CPU
x86 CPU

Our Partners

We're proud to collaborate with leading organizations and institutions that share our vision for democratizing AI.

Carnegie Mellon University

Research collaboration on machine learning compilation and optimization techniques.

University of Washington

Joint research on mobile AI and edge computing optimization strategies.

Purdue University

Collaboration on low-latency serving systems and compiler-runtime co-design.

Apache Software Foundation

Integration with Apache TVM ecosystem for advanced compilation capabilities.

Open Source Community

Collaborative development with the broader open source AI community.

Ready to Join Our Mission?

Whether you're a developer, researcher, or organization looking to leverage AI technology, MLC provides the tools and community support you need.