Documentation

Getting Started

1. Install: Get ToolNeuron from Google Play or download the APK from GitHub Releases.

2. Pick a model: Open the drawer > Model Store. We recommend starting with Qwen3.5 0.8B (Q4_K_M, ~600 MB).

3. Chat: Select your model, wait for it to load, and start typing. Everything runs on-device, even in airplane mode.

Recommended Models

| Use case | Model | Size |
|---|---|---|
| Quick test | Qwen3.5 0.8B Q4_K_M | ~600 MB |
| General use | Qwen3.5 4B Q4_K_M | ~2.8 GB |
| Power users | Qwen3.5 9B Q4_K_M | ~5.5 GB |

Pick Q4_K_M for a good balance. Use Q6_K if your device has the RAM.
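
The sizes above follow directly from bits-per-weight: on-disk size is roughly parameter count times the quantization's average bits per weight, divided by eight. A rough sketch (the bits-per-weight averages are approximate llama.cpp K-quant figures, not something ToolNeuron exposes):

```kotlin
// Rough on-disk size estimate for a quantized GGUF model.
// Bits-per-weight values are approximate averages for llama.cpp K-quants.
enum class Quant(val bitsPerWeight: Double) {
    Q4_K_M(4.85),
    Q6_K(6.56),
}

// paramsBillions: parameter count in billions (e.g. 4.0 for a 4B model)
fun estimatedSizeGiB(paramsBillions: Double, quant: Quant): Double =
    paramsBillions * 1e9 * quant.bitsPerWeight / 8.0 / (1 shl 30).toDouble()
```

By this estimate a 4B model at Q4_K_M lands around 2.3 GiB before metadata and tokenizer data, in the same ballpark as the table; the gap between Q4_K_M and Q6_K is why the latter needs the extra RAM.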

Architecture

Modules

| Module | Purpose |
|---|---|
| app | Main Android application |
| ums | Unified Memory System — binary record storage |
| neuron-packet | Encrypted RAG packet format with access control |
| memory-vault | Legacy encrypted storage (read-only, migration) |
| system_encryptor | Native encryption primitives |
| file_ops | Native file operations |

Tech Stack

| Layer | Technology |
|---|---|
| Language | Kotlin, C++ (JNI) |
| UI | Jetpack Compose |
| Text inference | llama.cpp |
| Image inference | LocalDream (SD 1.5) |
| TTS | Supertonic (ONNX Runtime) |
| Database | Room + UMS (custom binary format) |
| Encryption | AES-256-GCM, Android KeyStore |
| DI | Dagger Hilt |
| Async | Kotlin Coroutines + Flow |
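
The AES-256-GCM layer can be sketched with the standard javax.crypto API. This is an illustrative round trip, not ToolNeuron's actual code: in the app the key would live in the Android KeyStore, whereas this sketch generates a throwaway key so it runs anywhere. The cipher parameters (256-bit key, 12-byte IV, 128-bit auth tag) are the conventional AES-GCM choices either way.

```kotlin
import java.security.SecureRandom
import javax.crypto.Cipher
import javax.crypto.KeyGenerator
import javax.crypto.SecretKey
import javax.crypto.spec.GCMParameterSpec

// Generate a 256-bit AES key (stand-in for a KeyStore-backed key).
fun generateKey(): SecretKey =
    KeyGenerator.getInstance("AES").apply { init(256) }.generateKey()

// Encrypt with a fresh random 12-byte IV; returns (iv, ciphertext).
fun encrypt(key: SecretKey, plaintext: ByteArray): Pair<ByteArray, ByteArray> {
    val iv = ByteArray(12).also { SecureRandom().nextBytes(it) }
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.ENCRYPT_MODE, key, GCMParameterSpec(128, iv))
    return iv to cipher.doFinal(plaintext)   // ciphertext includes the 16-byte GCM tag
}

fun decrypt(key: SecretKey, iv: ByteArray, ciphertext: ByteArray): ByteArray {
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.DECRYPT_MODE, key, GCMParameterSpec(128, iv))
    return cipher.doFinal(ciphertext)        // throws if the auth tag doesn't verify
}
```

GCM is an authenticated mode, so tampering with the ciphertext makes decryption throw rather than return garbage.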

Building from Source

Prerequisites:

  • Android Studio (latest stable)
  • NDK 26.x
  • JDK 17+
  • CMake 3.22+
```shell
git clone https://github.com/Siddhesh2377/ToolNeuron.git
cd ToolNeuron
./gradlew assembleDebug   # build a debug APK
./gradlew installDebug    # install on a connected device
```

Make sure NDK 26.x is installed via the SDK Manager. If you hit OOM during native builds, add `org.gradle.jvmargs=-Xmx4g` to `gradle.properties`.

FAQ

Does this really work completely offline?
Yes. Once a model is downloaded, no internet connection is needed. Everything runs on-device, even in airplane mode.

Is my data actually private?
Yes. Zero telemetry, zero analytics. All data is encrypted with AES-256-GCM and never leaves your device. Don't take our word for it: read the source code.

How much storage do I need?
At least 4 GB free. Models range from 600 MB to 6 GB+. The app itself is small; most space goes to model files.

Can I use custom models?
Yes. Any GGUF-format model works. Load it via the built-in file picker; no conversion needed.

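
A loader can cheaply reject a non-GGUF pick before handing it to llama.cpp, because every GGUF file starts with the ASCII magic "GGUF". A minimal check (illustrative only; this is not the app's actual validation code):

```kotlin
import java.io.File

// Returns true if the file begins with the 4-byte GGUF magic ("GGUF" in ASCII).
fun looksLikeGguf(file: File): Boolean {
    if (file.length() < 4) return false
    val magic = file.inputStream().use { it.readNBytes(4) }
    return magic.contentEquals(byteArrayOf(0x47, 0x47, 0x55, 0x46)) // 'G','G','U','F'
}
```

This only screens out obviously wrong files; llama.cpp still validates the full header when it loads the model.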
What about performance?
Expect 8–15 tokens/sec on flagship phones, depending on model size, quantization, and your device's hardware. Smaller models run faster.
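
To put those figures in context, a small hypothetical helper (not part of the app) converts throughput into response latency:

```kotlin
// Hypothetical helper: seconds to stream a reply of `tokens` tokens
// at a given decode throughput in tokens per second.
fun generationSeconds(tokens: Int, tokensPerSec: Double): Double =
    tokens / tokensPerSec
```

At the quoted 8–15 tokens/sec, a 200-token reply takes roughly 13–25 seconds, which is why smaller or more aggressively quantized models feel noticeably snappier.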

Contributing

Fork the repo, create a branch, make your changes, open a PR. Standard stuff.

Priority areas:

  • Bug fixes
  • Device testing and compatibility reports
  • Performance improvements
  • Documentation

Don't submit untested code. Don't add cloud dependencies. Don't break offline functionality.