Documentation

Getting Started

1. Install: Get ToolNeuron from Google Play or download the APK from GitHub Releases.

2. Pick a model: Open the drawer > Model Store. We recommend starting with Qwen3.5 0.8B (Q4_K_M, ~600 MB).

3. Chat: Select your model, wait for it to load, and start typing. Everything runs on-device, even in airplane mode.

Recommended Models

| Use case | Model | Size |
|---|---|---|
| Quick test | Qwen3.5 0.8B Q4_K_M | ~600 MB |
| General use | Qwen3.5 4B Q4_K_M | ~2.8 GB |
| Power users | Qwen3.5 9B Q4_K_M | ~5.5 GB |

Pick Q4_K_M for a good balance. Use Q6_K if your device has the RAM.
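
The sizes above follow directly from bits-per-weight: on-disk size is roughly parameter count times the quantization's average bits per weight, divided by eight. A rough sketch (the bits-per-weight averages are approximate llama.cpp K-quant figures, not something ToolNeuron exposes):

```kotlin
// Rough on-disk size estimate for a quantized GGUF model.
// Bits-per-weight values are approximate averages for llama.cpp K-quants.
enum class Quant(val bitsPerWeight: Double) {
    Q4_K_M(4.85),
    Q6_K(6.56),
}

// paramsBillions: parameter count in billions (e.g. 4.0 for a 4B model)
fun estimatedSizeGiB(paramsBillions: Double, quant: Quant): Double =
    paramsBillions * 1e9 * quant.bitsPerWeight / 8.0 / (1 shl 30).toDouble()
```

By this estimate a 4B model at Q4_K_M lands around 2.3 GiB before metadata and tokenizer data, in the same ballpark as the table; the gap between Q4_K_M and Q6_K is why the latter needs the extra RAM.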

Architecture

Modules

| Module | Purpose |
|---|---|
| app | Main Android application |
| ums | Unified Memory System — binary record storage |
| neuron-packet | Encrypted RAG packet format with access control |
| memory-vault | Legacy encrypted storage (read-only, migration) |
| system_encryptor | Native encryption primitives |
| file_ops | Native file operations |

Tech Stack

| Layer | Technology |
|---|---|
| Language | Kotlin, C++ (JNI) |
| UI | Jetpack Compose |
| Text inference | llama.cpp |
| Image inference | LocalDream (SD 1.5) |
| TTS | Supertonic (ONNX Runtime) |
| Database | Room + UMS (custom binary format) |
| Encryption | AES-256-GCM, Android KeyStore |
| DI | Dagger Hilt |
| Async | Kotlin Coroutines + Flow |
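
The AES-256-GCM layer can be sketched with the standard javax.crypto API. This is an illustrative round trip, not ToolNeuron's actual code: in the app the key would live in the Android KeyStore, whereas this sketch generates a throwaway key so it runs anywhere. The cipher parameters (256-bit key, 12-byte IV, 128-bit auth tag) are the conventional AES-GCM choices either way.

```kotlin
import java.security.SecureRandom
import javax.crypto.Cipher
import javax.crypto.KeyGenerator
import javax.crypto.SecretKey
import javax.crypto.spec.GCMParameterSpec

// Generate a 256-bit AES key (stand-in for a KeyStore-backed key).
fun generateKey(): SecretKey =
    KeyGenerator.getInstance("AES").apply { init(256) }.generateKey()

// Encrypt with a fresh random 12-byte IV; returns (iv, ciphertext).
fun encrypt(key: SecretKey, plaintext: ByteArray): Pair<ByteArray, ByteArray> {
    val iv = ByteArray(12).also { SecureRandom().nextBytes(it) }
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.ENCRYPT_MODE, key, GCMParameterSpec(128, iv))
    return iv to cipher.doFinal(plaintext)   // ciphertext includes the 16-byte GCM tag
}

fun decrypt(key: SecretKey, iv: ByteArray, ciphertext: ByteArray): ByteArray {
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.DECRYPT_MODE, key, GCMParameterSpec(128, iv))
    return cipher.doFinal(ciphertext)        // throws if the auth tag doesn't verify
}
```

GCM is an authenticated mode, so tampering with the ciphertext makes decryption throw rather than return garbage.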

Building from Source

Prerequisites:

  • Android Studio (latest stable)
  • NDK 26.x
  • JDK 17+
  • CMake 3.22+
```shell
git clone https://github.com/Siddhesh2377/ToolNeuron.git
cd ToolNeuron
./gradlew assembleDebug   # build a debug APK
./gradlew installDebug    # install on a connected device
```

Make sure NDK 26.x is installed via the SDK Manager. If you hit OOM during native builds, add `org.gradle.jvmargs=-Xmx4g` to `gradle.properties`.

FAQ

Does this really work completely offline?
Yes. Once a model is downloaded, no internet connection is needed. Everything runs on-device, even in airplane mode.

Is my data actually private?
Yes. Zero telemetry, zero analytics. All data is encrypted with AES-256-GCM and never leaves your device. Don't take our word for it: read the source code.

How much storage do I need?
At least 4 GB free. Models range from 600 MB to 6 GB+. The app itself is small; most space goes to model files.

Can I use custom models?
Yes. Any GGUF-format model works. Load it via the built-in file picker; no conversion needed.

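
A loader can cheaply reject a non-GGUF pick before handing it to llama.cpp, because every GGUF file starts with the ASCII magic "GGUF". A minimal check (illustrative only; this is not the app's actual validation code):

```kotlin
import java.io.File

// Returns true if the file begins with the 4-byte GGUF magic ("GGUF" in ASCII).
fun looksLikeGguf(file: File): Boolean {
    if (file.length() < 4) return false
    val magic = file.inputStream().use { it.readNBytes(4) }
    return magic.contentEquals(byteArrayOf(0x47, 0x47, 0x55, 0x46)) // 'G','G','U','F'
}
```

This only screens out obviously wrong files; llama.cpp still validates the full header when it loads the model.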
What about performance?
Expect 8–15 tokens/sec on flagship phones, depending on model size, quantization, and your device's hardware. Smaller models run faster.
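
To put those figures in context, a small hypothetical helper (not part of the app) converts throughput into response latency:

```kotlin
// Hypothetical helper: seconds to stream a reply of `tokens` tokens
// at a given decode throughput in tokens per second.
fun generationSeconds(tokens: Int, tokensPerSec: Double): Double =
    tokens / tokensPerSec
```

At the quoted 8–15 tokens/sec, a 200-token reply takes roughly 13–25 seconds, which is why smaller or more aggressively quantized models feel noticeably snappier.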

Contributing

Fork the repo, create a branch, make your changes, open a PR. Standard stuff.

Priority areas:

  • Bug fixes
  • Device testing and compatibility reports
  • Performance improvements
  • Documentation

Don't submit untested code. Don't add cloud dependencies. Don't break offline functionality.