
I set up Gemma on a Pixel phone and run it locally. My test device:
Pixel 9 Pro XL + Tensor G4 + 16GB LPDDR5X + Mali-G715: The Power of Tri-Synergy!
This tutorial is designed for **Rooted Android Phones + Termux + Mali GPU (e.g., Pixel, Xiaomi, Samsung, etc.)**.
---
# 📱 Complete Guide: Termux + Root + Mali GPU Accelerated llama.cpp
## ⚠️ Prerequisites
* **Phone must be rooted** (Magisk/KernelSU).
* **Install Termux** (download the latest version from F-Droid or GitHub).
* **Phone GPU must be Mali** (most MediaTek and Tensor chips use Mali; Qualcomm Snapdragon uses Adreno, whose driver files and paths differ).
---
## 🟢 Phase One: Install Basic Environment & Compile llama.cpp
Run the following commands sequentially in Termux:
```bash
# 1. Update and install dependencies
pkg update && pkg upgrade -y
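# vulkan-headers and shaderc (which provides glslc) are needed to build the Vulkan backend
pkg install git cmake clang make vulkan-loader vulkan-tools vulkan-headers shaderc -y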
# 2. Clone llama.cpp
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
# 3. Compile with the Vulkan backend (ARM NEON is detected automatically on ARM64)
mkdir build && cd build
cmake .. -DGGML_VULKAN=ON -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)
```
> 💡 **Tip**: Compilation may take 10-20 minutes; please be patient.
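
Before configuring the build, it can help to confirm that the tools are actually on `PATH`. The sketch below uses a hypothetical helper, `missing_tools` (not part of Termux or llama.cpp), that prints every command it cannot find. Including `glslc` in the check assumes your llama.cpp checkout compiles its Vulkan shaders at build time.

```bash
# Print each named command that is not available on PATH (hypothetical helper).
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example usage in Termux:
#   missing_tools git cmake clang make glslc vulkaninfo
```

An empty result means everything was found; any name printed should be installed before running `cmake`.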
---
## 🔵 Phase Two: Extract System GPU Drivers (Core Step)
**This is the most critical step!** We need to "extract" the system drivers for Termux to use.
### 1. Create Storage Directories
```bash
mkdir -p /data/data/com.termux/files/usr/lib/vulkan
mkdir -p /data/data/com.termux/files/usr/share/vulkan/icd.d
```
### 2. Locate Driver File Paths
Paths vary by device. Run the following commands to search:
```bash
# Search for Vulkan driver
su -c "find /vendor /system -name 'vulkan*.so' 2>/dev/null"
# Search for GLES driver
su -c "find /vendor /system -name 'libGLES_mali.so' 2>/dev/null"
# Search for system Vulkan loader (critical!)
su -c "find /apex /system -name 'libvulkan.so' 2>/dev/null"
```
**👇 Example paths found (for Pixel 9 / Tensor G4):**
* Vulkan ICD: `/vendor/lib64/hw/vulkan.mali.so`
* GLES Lib: `/vendor/lib64/egl/libGLES_mali.so`
* Config Lib: `/vendor/lib64/aconfig_gpu_flags_c_lib.so` (if present)
* System Loader: `/system/lib64/libvulkan.so`
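
Since the locations differ between devices, one option is to test a short list of candidate paths and keep the first that exists. This is only a sketch: `first_existing` is a hypothetical helper, and the second path in the usage example is an assumed fallback, not a location confirmed on any device.

```bash
# Print the first argument that names an existing regular file; return 1 if none do.
first_existing() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example usage (the second path is an assumed fallback):
#   VULKAN_ICD=$(first_existing /vendor/lib64/hw/vulkan.mali.so /system/lib64/hw/vulkan.mali.so)
```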
### 3. Copy Drivers to Termux
**Please modify the commands below according to your actual paths found above!**
```bash
# Gain root access and copy files (adjust paths as needed)
su << 'EOF'
# 1. Copy Mali-specific driver
cp /vendor/lib64/hw/vulkan.mali.so /data/data/com.termux/files/usr/lib/vulkan/
cp /vendor/lib64/egl/libGLES_mali.so /data/data/com.termux/files/usr/lib/vulkan/
# 2. Copy potential dependency libraries (ignore if file doesn't exist)
cp /vendor/lib64/aconfig_gpu_flags_c_lib.so /data/data/com.termux/files/usr/lib/vulkan/ 2>/dev/null
# 3. [Critical] Copy system-level Vulkan Loader (resolves missing symbol issues)
cp /system/lib64/libvulkan.so /data/data/com.termux/files/usr/lib/
# 4. Copy C++ standard library
cp /vendor/lib64/libc++.so /data/data/com.termux/files/usr/lib/vulkan/
# 5. Set permissions
chmod 755 /data/data/com.termux/files/usr/lib/vulkan/*.so
chmod 755 /data/data/com.termux/files/usr/lib/libvulkan.so
exit
EOF
```
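Because some of these files will not exist on every device, the copy step can be written so that a missing file is reported instead of silently ignored. The following is a minimal sketch; `copy_if_present` is a hypothetical helper, and on the phone it would have to run inside the root shell, just like the commands above.

```bash
# Copy SRC into DEST_DIR and make it readable/executable, or report that SRC is missing.
copy_if_present() {
    src=$1
    dest_dir=$2
    if [ -f "$src" ]; then
        cp "$src" "$dest_dir/" &&
            chmod 755 "$dest_dir/$(basename "$src")" &&
            echo "copied: $src"
    else
        echo "skipped (not found): $src"
    fi
}

# Example usage (as root):
#   copy_if_present /vendor/lib64/hw/vulkan.mali.so /data/data/com.termux/files/usr/lib/vulkan
```

The "skipped" lines give you a list of paths to look up again with `find` before moving on.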
---
## 🟠 Phase Three: Configure Environment Variables
### 1. Create ICD JSON Configuration File
Tell the Vulkan loader where to find the driver.
```bash
# The icd.d directory was created by the Termux user, so root is not needed here
# (writing the file as root would also make the chmod below fail).
cat > /data/data/com.termux/files/usr/share/vulkan/icd.d/mali_icd.json << 'JSONEOF'
{
    "file_format_version": "1.0.1",
    "ICD": {
        "library_path": "/data/data/com.termux/files/usr/lib/vulkan/vulkan.mali.so",
        "api_version": "1.4.335"
    }
}
JSONEOF
chmod 644 /data/data/com.termux/files/usr/share/vulkan/icd.d/mali_icd.json
```
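A common failure at this stage is a manifest whose `library_path` points at a file that was never copied. The sketch below checks for that. Both function names are hypothetical, and the `sed` expression assumes the key and its value sit on a single line, as in the manifest above; it is not a general JSON parser.

```bash
# Print the "library_path" value from a Vulkan ICD manifest (line-based, not a JSON parser).
icd_library_path() {
    sed -n 's/.*"library_path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Succeed only if the manifest names a library file that exists.
icd_is_valid() {
    lib=$(icd_library_path "$1")
    [ -n "$lib" ] && [ -f "$lib" ]
}

# Example usage:
#   icd_is_valid /data/data/com.termux/files/usr/share/vulkan/icd.d/mali_icd.json && echo "ICD OK"
```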
### 2. Set Environment Variables (Permanent)
```bash
# Append to ~/.bashrc
cat >> ~/.bashrc << 'RCPEOF'
# --- Vulkan GPU Acceleration for llama.cpp ---
export LD_LIBRARY_PATH=/system/lib64:/vendor/lib64:/vendor/lib64/egl:/data/data/com.termux/files/usr/lib/vulkan:/data/data/com.termux/files/usr/lib
export VK_ICD_FILENAMES=/data/data/com.termux/files/usr/share/vulkan/icd.d/mali_icd.json
# ---------------------------------------------
RCPEOF
# Apply changes immediately
source ~/.bashrc
```
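Running the append more than once leaves duplicate blocks in `~/.bashrc`. One way to avoid that is to check for the marker comment first. This is a sketch with a hypothetical helper, `append_once`, which reads the text to append from standard input.

```bash
# Append stdin to FILE unless FILE already contains the literal string MARKER.
append_once() {
    marker=$1
    file=$2
    if [ -f "$file" ] && grep -qF -- "$marker" "$file"; then
        cat >/dev/null              # drain stdin so the writing side is not cut off
        echo "already present in $file"
    else
        cat >> "$file"
        echo "appended to $file"
    fi
}

# Example usage:
#   append_once '# --- Vulkan GPU Acceleration for llama.cpp ---' ~/.bashrc << 'RCPEOF'
#   ...the export lines shown above...
#   RCPEOF
```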
---
## 🟣 Phase Four: Verification & Testing
### 1. Verify Vulkan Recognizes the GPU
```bash
vulkaninfo | grep -E "deviceName|deviceType"
```
✅ **Success indicator**: You should see `Mali-G715` or `Immortalis` etc., NOT `llvmpipe`.
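
The same check can be scripted so that it fails loudly when only the software renderer is visible. The helper below is a hypothetical sketch; it assumes `vulkaninfo` prints one `deviceName` line per device, as in the `grep` above.

```bash
# Read `vulkaninfo` output on stdin; succeed only if some deviceName is not llvmpipe.
has_hardware_gpu() {
    awk '/deviceName/ && tolower($0) !~ /llvmpipe/ { found = 1 }
         END { exit !found }'
}

# Example usage:
#   vulkaninfo 2>/dev/null | has_hardware_gpu && echo "hardware GPU visible"
```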
### 2. Download a Model (e.g., Gemma-4)
Download the GGUF model file, or transfer it from another device, and place it somewhere Termux can read, e.g., `/storage/emulated/0/Download/`.
### 3. Run Test
```bash
cd ~/llama.cpp/build/bin
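# -ngl 99 requests full GPU offload explicitly, in case your build does not offload by default
./llama-cli \
    -m /storage/emulated/0/Download/Gemma-4-E2B--Q5_K_P.gguf \
    -ngl 99 \
    --temp 1.0 -t 4 -c 4096 --no-mmap -n 20 --verbose 2>&1 | grep -E "Vulkan|GPU|device"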
```
✅ **Success indicators**:
* You see `using device Vulkan0 (Mali-...)`
* You see `offloaded 36/36 layers to GPU`
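
To check the second indicator automatically, the layer counts can be pulled out of the log and compared. This is a sketch with a hypothetical helper; it assumes the log contains a line of the form `offloaded N/M layers`, matching the message quoted above.

```bash
# Read llama-cli log output on stdin; succeed only if all layers were offloaded.
fully_offloaded() {
    counts=$(sed -n 's/.*offloaded \([0-9][0-9]*\)\/\([0-9][0-9]*\) layers.*/\1 \2/p' | tail -n 1)
    [ -n "$counts" ] || return 1    # no "offloaded" line at all
    set -- $counts                  # $1 = offloaded layers, $2 = total layers
    [ "$1" -eq "$2" ]
}

# Example usage:
#   ./llama-cli ... 2>&1 | fully_offloaded && echo "all layers on GPU"
```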
---
## 🔴 Frequently Asked Questions (FAQ)
**Q1: What if my phone uses Qualcomm Snapdragon (Adreno GPU)?**
A: The overall flow is similar, but the driver files and their paths differ, so the Mali-specific paths in this guide will not apply as written.
* Search for drivers: `su -c "find /vendor -name '*adreno*' -o -name '*kgsl*'"`
* The driver is typically named `vulkan.adreno.so`.
**Q2: Getting error `dlopen failed: library xxx not found`?**
A: This indicates missing dependency libraries.
* Use `readelf -d /path/to/vulkan.so | grep NEEDED` to identify missing libraries.
* Locate the corresponding `.so` files in `/vendor/lib64` or `/system/lib64` and copy them over.
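
The two steps above can be combined into one pass that lists only the libraries missing from the directories you care about. This is a sketch: both helper names are hypothetical, `readelf` is assumed to come from Termux's `binutils` package, and the parsing relies on the usual `(NEEDED) ... Shared library: [name]` line format.

```bash
# Read `readelf -d` output on stdin and print each NEEDED library name.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)\].*/\1/p'
}

# Print the NEEDED libraries that are absent from every directory given as an argument.
missing_libs() {
    needed_libs | while IFS= read -r lib; do
        found=no
        for dir in "$@"; do
            [ -f "$dir/$lib" ] && found=yes
        done
        [ "$found" = yes ] || printf '%s\n' "$lib"
    done
}

# Example usage:
#   readelf -d /data/data/com.termux/files/usr/lib/vulkan/vulkan.mali.so |
#       missing_libs /data/data/com.termux/files/usr/lib/vulkan /vendor/lib64 /system/lib64
```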
**Q3: No speed improvement, still running on CPU?**
A: Check the `llama-cli` output.
* If you see `assigned to device CPU`, the environment variables may be misconfigured.
* Ensure you executed `source ~/.bashrc` before running.
* Verify that `vulkaninfo` can detect your GPU.
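
On the Pixel 9 Pro XL, the log from a successful run shows:
- ✅ **GPU Model**: Mali-G715 successfully recognized
- ✅ **Device Memory**: 15456 MiB (~15 GB) reported by the Vulkan backend. Mali has no dedicated VRAM, so this is most likely the shared system RAM visible to the GPU rather than memory in use.
- ✅ **Model Footprint**: 1641 MiB of model weights held in GPU-visible memory
- ✅ **Compute Buffer**: 545 MiB of GPU computation buffers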
---
## 📊 Final Scorecard
| Metric | CPU Mode | GPU Mode | Improvement |
|--------|----------|----------|-------------|
| **Generation Speed** | 4.1 t/s | **7.3 t/s** | **+78%** 🚀 |
| **Prompt Processing Speed** | 8.4 t/s | **10.7 t/s** | **+27%** |
| **Device** | CPU | **Mali-G715** | ✅ |
| **Memory** | System RAM | **~15 GB shared RAM (reported as VRAM)** | ✅ |
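
The Improvement column is the relative change from the CPU figure to the GPU figure. A small sketch that reproduces it from the measured speeds (`pct_gain` is a hypothetical helper name):

```bash
# Print the percentage change from BASE to NEW, rounded to the nearest integer.
pct_gain() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.0f\n", (new - base) / base * 100 }'
}

pct_gain 4.1 7.3     # generation speed: prints 78
pct_gain 8.4 10.7    # prompt processing speed: prints 27
```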
---
## 🎁 Now You Can:
### 1. **Create a Quick-Launch Script**
```bash
cat > ~/run-gemma.sh << 'EOF'
#!/data/data/com.termux/files/usr/bin/bash
cd ~/llama.cpp/build/bin
./llama-cli \
    -m /storage/emulated/0/Download/Gemma-4-E2B--Q5_K_P.gguf \
    -ngl 99 \
    --simple-io --jinja \
    --temp 1.0 --top-p 0.95 --top-k 64 \
    -t 4 -c 4096 --no-mmap \
    --interactive-first
EOF
chmod +x ~/run-gemma.sh
```
From now on, simply run `~/run-gemma.sh` to launch!
### 2. **Try Larger Models**
Because the GPU can address roughly 15 GB of shared memory on this device, you can experiment with:
- **Llama-3-8B** (Q4_K_M quantized)
- **Qwen2.5-7B** (Q5_K_M quantized)
- **Mistral-7B** (Q6_K quantized)
> 💡 **Pro Tip**: Larger models and higher-bit quantizations (like Q6_K) offer better quality but need more memory. Because the Mali GPU shares system RAM with Android, watch free memory (for example with `cat /proc/meminfo` or Android's developer options) to avoid out-of-memory crashes.
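
As a rough pre-flight check before loading a larger model, you can compare its size against the memory Android currently reports as available. This is a heuristic sketch, not a guarantee: both helpers are hypothetical, and using `MemAvailable` as a proxy for what the GPU can allocate is an assumption based on Mali sharing system RAM.

```bash
# Print MemAvailable in MiB from a meminfo-style file (defaults to /proc/meminfo).
mem_available_mib() {
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Succeed if MODEL_MIB plus HEADROOM_MIB fits within AVAILABLE_MIB.
model_fits() {
    [ $(( $1 + $2 )) -le "$3" ]
}

# Example usage (1641 MiB model, 545 MiB headroom for the compute buffer):
#   model_fits 1641 545 "$(mem_available_mib)" && echo "should fit"
```

The headroom value in the example reuses the compute buffer size from the run above; larger context sizes will need more.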