Documentation
Getting Started
2. Pick a model
Open the drawer > Model Store. We recommend starting with Qwen3.5 0.8B (Q4_K_M, ~600MB).
3. Chat
Select your model, wait for it to load, start typing. Everything runs on-device, even in airplane mode.
Recommended Models
| Use case | Model | Size |
|---|---|---|
| Quick test | Qwen3.5 0.8B Q4_K_M | ~600 MB |
| General use | Qwen3.5 4B Q4_K_M | ~2.8 GB |
| Power users | Qwen3.5 9B Q4_K_M | ~5.5 GB |
Pick Q4_K_M for a good balance. Use Q6_K if your device has the RAM.
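As a rough rule of thumb, a quantized model's file size is about parameter count × bits-per-weight ÷ 8. The sketch below is a hypothetical helper (not part of the app) using the commonly cited average bits-per-weight for llama.cpp K-quants; real GGUF files run somewhat larger because of metadata and mixed-precision tensors.

```kotlin
// Approximate average bits-per-weight for llama.cpp K-quants.
val approxBitsPerWeight = mapOf(
    "Q4_K_M" to 4.85,
    "Q6_K" to 6.56,
)

// size (GB) ≈ params (billions) × bits-per-weight / 8
fun estimateSizeGb(paramsBillions: Double, quant: String): Double {
    val bpw = approxBitsPerWeight[quant] ?: error("Unknown quant: $quant")
    return paramsBillions * bpw / 8.0
}

println(estimateSizeGb(4.0, "Q4_K_M"))  // ballpark for the 4B row above
```

The same arithmetic explains why Q6_K needs the extra RAM: at ~6.56 bits per weight it is roughly a third larger than Q4_K_M for the same model.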
Architecture
Modules
| Module | Purpose |
|---|---|
| app | Main Android application |
| ums | Unified Memory System — binary record storage |
| neuron-packet | Encrypted RAG packet format with access control |
| memory-vault | Legacy encrypted storage (read-only, migration) |
| system_encryptor | Native encryption primitives |
| file_ops | Native file operations |
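The actual UMS record layout is internal to the `ums` module; as an illustration only, the sketch below shows the general shape of a length-prefixed binary record store, a common design for this kind of append-only storage. All names here are hypothetical.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.DataInputStream
import java.io.DataOutputStream

// Each record is a 4-byte big-endian length prefix followed by raw payload bytes.
fun writeRecord(out: DataOutputStream, payload: ByteArray) {
    out.writeInt(payload.size)  // length prefix
    out.write(payload)          // payload
}

fun readRecord(inp: DataInputStream): ByteArray {
    val len = inp.readInt()
    val buf = ByteArray(len)
    inp.readFully(buf)
    return buf
}

// Round-trip two records through an in-memory buffer.
val buffer = ByteArrayOutputStream()
DataOutputStream(buffer).use { out ->
    writeRecord(out, "hello".toByteArray())
    writeRecord(out, "world".toByteArray())
}
val records = DataInputStream(ByteArrayInputStream(buffer.toByteArray()))
println(String(readRecord(records)))  // hello
println(String(readRecord(records)))  // world
```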
Tech Stack
| Layer | Technology |
|---|---|
| Language | Kotlin, C++ (JNI) |
| UI | Jetpack Compose |
| Text inference | llama.cpp |
| Image inference | LocalDream (SD 1.5) |
| TTS | Supertonic (ONNX Runtime) |
| Database | Room + UMS (custom binary format) |
| Encryption | AES-256-GCM, Android KeyStore |
| DI | Dagger Hilt |
| Async | Kotlin Coroutines + Flow |
Building from Source
Prerequisites:
- Android Studio (latest stable)
- NDK 26.x
- JDK 17+
- CMake 3.22+
```shell
git clone https://github.com/Siddhesh2377/ToolNeuron.git
cd ToolNeuron
./gradlew assembleDebug
./gradlew installDebug
```

Make sure NDK 26.x is installed via the SDK Manager. If you hit an OOM during native builds, add `org.gradle.jvmargs=-Xmx4g` to `gradle.properties`.
FAQ
Does this really work completely offline?
Yes. Once a model is downloaded, no internet connection is needed. Everything runs on-device. Works in airplane mode.
Is my data actually private?
Yes. Zero telemetry, zero analytics. All data is encrypted with AES-256-GCM and never leaves your device. Don't take our word for it — read the source code.
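For a concrete picture of what AES-256-GCM encryption looks like, here is a round trip on a plain JVM. In the app the key would be a non-exportable Android KeyStore key; this sketch generates a throwaway in-memory key just to show the cipher parameters (256-bit key, 96-bit random nonce, 128-bit auth tag).

```kotlin
import java.security.SecureRandom
import javax.crypto.Cipher
import javax.crypto.KeyGenerator
import javax.crypto.spec.GCMParameterSpec

// Throwaway 256-bit AES key (the app would use Android KeyStore instead).
val key = KeyGenerator.getInstance("AES").apply { init(256) }.generateKey()

fun encrypt(plaintext: ByteArray): Pair<ByteArray, ByteArray> {
    val iv = ByteArray(12).also { SecureRandom().nextBytes(it) }  // 96-bit nonce
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.ENCRYPT_MODE, key, GCMParameterSpec(128, iv))  // 128-bit tag
    return iv to cipher.doFinal(plaintext)
}

fun decrypt(iv: ByteArray, ciphertext: ByteArray): ByteArray {
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.DECRYPT_MODE, key, GCMParameterSpec(128, iv))
    return cipher.doFinal(ciphertext)
}

val sealed = encrypt("private note".toByteArray())
println(String(decrypt(sealed.first, sealed.second)))  // private note
```

GCM also authenticates the ciphertext: decryption throws if a single byte has been tampered with, not just if the key is wrong.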
How much storage do I need?
Minimum 4GB free. Models range from 600MB to 6GB+. The app itself is small — most space goes to model files.
Can I use custom models?
Yes. Any GGUF format model works. Load it via the built-in file picker — no conversion needed.
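Every GGUF file starts with the 4-byte ASCII magic `GGUF`, so a loader can sanity-check a user-supplied file before handing it to llama.cpp. A minimal sketch (the helper name and sample file are hypothetical, not the app's actual picker code):

```kotlin
import java.io.File

// True if the file begins with the GGUF magic bytes ("GGUF" in ASCII).
fun looksLikeGguf(file: File): Boolean {
    val header = ByteArray(4)
    file.inputStream().use { if (it.read(header) != 4) return false }
    return header.contentEquals("GGUF".toByteArray(Charsets.US_ASCII))
}

// Fake file with a GGUF-style header, just for the demo.
val sample = File.createTempFile("model", ".gguf").apply {
    writeBytes("GGUF".toByteArray() + ByteArray(16))
    deleteOnExit()
}
println(looksLikeGguf(sample))  // true
```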
What about performance?
8-15 tokens/sec on flagship phones. Depends on model size, quantization, and your device's hardware. Smaller models run faster.
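If you want to benchmark your own device, the figure quoted above is just generated tokens divided by wall-clock time. A hypothetical helper:

```kotlin
// tokens/sec = tokens generated × 1000 / elapsed milliseconds
fun tokensPerSecond(tokens: Int, elapsedMs: Long): Double =
    tokens * 1000.0 / elapsedMs

// e.g. 120 tokens in 10 s lands inside the quoted 8-15 tok/s range
println(tokensPerSecond(120, 10_000))  // 12.0
```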
Contributing
Fork the repo, create a branch, make your changes, open a PR. Standard stuff.
Priority areas:
- Bug fixes
- Device testing and compatibility reports
- Performance improvements
- Documentation
Don't submit untested code. Don't add cloud dependencies. Don't break offline functionality.