ToolNeuron: Your AI Hub in Your Pocket
A privacy-focused mobile ecosystem for offline inference (GGUF) and online orchestration (OpenRouter).
Abstract
ToolNeuron is an AI-native mobile ecosystem designed for Android. It bridges the gap between privacy and performance by enabling users to run offline GGUF models locally or connect to powerful cloud models via OpenRouter. With integrated features such as premium offline Text-to-Speech (TTS), a dynamic plugin system, and a "DataHub" for context injection, ToolNeuron acts as a comprehensive operating layer for mobile AI.
1. Visual Interface
The user interface is designed for clarity, featuring syntax highlighting for code, structured tables, and seamless model switching. The design prioritizes information density and ease of access to model parameters.
2. Core Functionality
ToolNeuron distinguishes itself through a hybrid architecture that prioritizes user privacy without sacrificing capability.
2.1 Dual Mode Operation
Run offline GGUF models directly on your device via llama.cpp (no internet required), or connect to 100+ online models (GPT-4, Claude, Llama 3) through the OpenRouter API.
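The dual-mode split can be sketched as a simple router: if an OpenRouter key is configured, build an OpenAI-compatible chat payload for the cloud endpoint; otherwise fall back to the local GGUF path. The class and method names below are illustrative, not ToolNeuron's actual API.

```java
// Hypothetical sketch of hybrid routing between local llama.cpp inference
// and OpenRouter's OpenAI-compatible chat endpoint. Not ToolNeuron's real code.
public class InferenceRouter {
    private final String openRouterKey; // null => offline (GGUF) mode

    public InferenceRouter(String openRouterKey) {
        this.openRouterKey = openRouterKey;
    }

    /** True when a cloud key is configured; otherwise use the local model. */
    public boolean useCloud() {
        return openRouterKey != null && !openRouterKey.isEmpty();
    }

    /** Builds an OpenAI-style chat payload for openrouter.ai/api/v1/chat/completions. */
    public String buildPayload(String model, String userMessage) {
        return "{\"model\":\"" + model + "\","
             + "\"messages\":[{\"role\":\"user\",\"content\":\"" + userMessage + "\"}]}";
    }
}
```

Because OpenRouter mirrors the OpenAI chat-completions schema, the same payload builder works for any of the 100+ hosted models by changing only the `model` string.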
2.2 Premium Voice (TTS)
Includes 11 premium voices (American and British accents) powered by Sherpa-ONNX. Synthesis runs completely offline, with no network latency and no per-use cost.
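A host app typically exposes such a voice bundle through a small catalog keyed by accent. The two accents come from the description above; the voice IDs in this sketch are made up for illustration and are not the app's actual voice names.

```java
// Illustrative voice catalog for an offline TTS bundle. The accents match
// the feature description; the IDs ("amy", "emma", ...) are hypothetical.
import java.util.List;
import java.util.Map;

public class VoiceCatalog {
    public enum Accent { AMERICAN, BRITISH }

    // Stand-ins for the 11 bundled Sherpa-ONNX voices.
    private static final Map<String, Accent> VOICES = Map.of(
        "amy", Accent.AMERICAN,
        "joe", Accent.AMERICAN,
        "emma", Accent.BRITISH
    );

    /** Returns all voice IDs for the given accent, sorted for stable display. */
    public static List<String> byAccent(Accent accent) {
        return VOICES.entrySet().stream()
            .filter(e -> e.getValue() == accent)
            .map(Map.Entry::getKey)
            .sorted()
            .toList();
    }
}
```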
2.3 DataHub & Context
Attach dynamic datasets to supercharge AI knowledge without retraining. Switch models mid-conversation while preserving context history.
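Mechanically, context preservation means the conversation history and any attached datasets live in the session, not in the model: switching models swaps only the target while the prompt is rebuilt from the same state. A minimal sketch, with all names hypothetical:

```java
// Sketch of a chat session whose context survives a mid-conversation model
// switch, with attached DataHub snippets prepended to every prompt.
// Class and method names are illustrative, not ToolNeuron's actual API.
import java.util.ArrayList;
import java.util.List;

public class ChatSession {
    private String model;
    private final List<String> history = new ArrayList<>();
    private final List<String> datasets = new ArrayList<>();

    public ChatSession(String model) { this.model = model; }

    public void attachDataset(String snippet) { datasets.add(snippet); }
    public void addTurn(String turn) { history.add(turn); }

    /** Switching models keeps history and attachments intact. */
    public void switchModel(String newModel) { this.model = newModel; }

    /** Prompt for whichever model is active: injected context first, then chat. */
    public String buildPrompt() {
        return "CONTEXT:\n" + String.join("\n", datasets)
             + "\nCHAT:\n" + String.join("\n", history);
    }

    public String model() { return model; }
    public int turns() { return history.size(); }
}
```

Because the dataset text is injected into the prompt rather than baked into weights, the AI's knowledge changes the moment an attachment is added, with no retraining step.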
2.4 Plugin System
Extend functionality with Web Search, Web Scraper, and Document Viewers. Future support includes Code Execution and Image Processing.
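A plugin system like this usually reduces to a small contract: each plugin declares a name and a run method, and the host dispatches by name. The plugin names come from the list above; the interface itself is a hypothetical sketch, not ToolNeuron's real extension API.

```java
// Minimal plugin contract and dispatcher a host app might use.
// The "web-search" name mirrors the feature list; the API is illustrative.
import java.util.HashMap;
import java.util.Map;

public class PluginHost {
    public interface Plugin {
        String name();
        String run(String input);
    }

    private final Map<String, Plugin> registry = new HashMap<>();

    public void register(Plugin p) { registry.put(p.name(), p); }

    /** Routes a request to the named plugin, or reports it as unknown. */
    public String dispatch(String name, String input) {
        Plugin p = registry.get(name);
        return p == null ? "unknown plugin: " + name : p.run(input);
    }
}
```

New capabilities (Code Execution, Image Processing) would then ship as additional `Plugin` implementations without touching the host.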
3. Comparative Analysis
Unlike standard SaaS AI applications that rely on subscriptions and data harvesting, ToolNeuron offers a transparent, open-source alternative.
| Feature | ToolNeuron | Traditional SaaS |
|---|---|---|
| Inference | Local (GGUF) + Cloud | Cloud Only |
| Privacy | Local-First, No Logging | Server-side Logging |
| Cost | Free / BYOK | Subscription ($20+/mo) |
| TTS | Offline Neural (Sherpa) | Cloud API |
4. System Specifications
To ensure optimal performance for local inference, the following hardware specifications are recommended.
Minimum
- Android 8.0+ (API 26)
- 4GB RAM
- 2GB Free Storage
Recommended (Offline AI)
- Android 14+
- 8GB+ RAM
- Snapdragon 8 Gen 1 (or equivalent)
- 5GB+ Storage for Models
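The two tiers above translate into a straightforward pre-flight check. On Android the real numbers would come from `ActivityManager` and `StatFs`; in this sketch they are plain parameters, and the thresholds simply restate the listed specs (Android 8.0 = API 26, Android 14 = API 34).

```java
// Sketch of a device pre-flight check against the spec tiers listed above.
// Thresholds restate the document; the class itself is illustrative.
public class SpecCheck {
    /** Minimum tier: Android 8.0+ (API 26), 4 GB RAM, 2 GB free storage. */
    public static boolean meetsMinimum(int apiLevel, double ramGb, double freeGb) {
        return apiLevel >= 26 && ramGb >= 4.0 && freeGb >= 2.0;
    }

    /** Recommended for offline AI: Android 14+ (API 34), 8 GB RAM, 5 GB free. */
    public static boolean recommendedForOffline(int apiLevel, double ramGb, double freeGb) {
        return apiLevel >= 34 && ramGb >= 8.0 && freeGb >= 5.0;
    }
}
```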
5. Development Roadmap
The project follows a quarterly release cycle aimed at expanding format support and multi-modal capabilities.
- Q1 2026: Advanced TTS with multiple voices, Speech-to-Text (STT) for voice input, and Code Export functionality.
- Q2 2026: TFLite & ONNX model support, Image Generation (Stable Diffusion), and an advanced memory system.
- Q3 2026: Multi-modal models (Text + Image), Cross-device sync, and Desktop companion apps.
6. Installation & Usage
- Install: Download the latest APK from the link above. No Play Store account required.
- Setup:
  - Option A (Private): Download a GGUF model from HuggingFace and import it via Settings → Local Models.
  - Option B (Cloud): Enter an OpenRouter API key in Settings to access GPT-4, Claude, etc.
- Interact: Use the chat interface, enable TTS for voice responses, or attach plugins for web capabilities.