ToolNeuron: Your AI Hub in Your Pocket

A privacy-focused mobile ecosystem for offline inference (GGUF) and online orchestration (OpenRouter).


Abstract

ToolNeuron is an AI-native mobile ecosystem designed for Android. It resolves the usual trade-off between privacy and performance by letting users run GGUF models locally, fully offline, or connect to powerful cloud models via OpenRouter. With integrated features like premium offline Text-to-Speech (TTS), a dynamic plugin system, and a "DataHub" for context injection, ToolNeuron acts as a comprehensive operating layer for mobile AI.

1. Visual Interface

The user interface is designed for clarity, featuring syntax highlighting for code, structured tables, and seamless model switching. The design prioritizes information density and ease of access to model parameters.

2. Core Functionality

ToolNeuron distinguishes itself through a hybrid architecture that prioritizes user privacy without sacrificing capability.

2.1 Dual Mode Operation

Run offline GGUF models directly on your device using llama.cpp (no internet needed), or connect to 100+ online models (GPT-4, Claude, Llama 3) via OpenRouter API.

2.2 Premium Voice (TTS)

Includes 11 premium voices (American and British accents) powered by Sherpa-ONNX. Synthesis runs completely offline, with no network latency and no per-use cost.

2.3 DataHub & Context

Attach dynamic datasets to supercharge AI knowledge without retraining. Switch models mid-conversation while preserving context history.
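Attaching a dataset amounts to injecting its content into the prompt rather than retraining the model. A minimal sketch of that idea (helper and field names are hypothetical, not ToolNeuron's actual implementation), including why history survives a model switch:

```python
def inject_context(history: list[dict], dataset_snippets: list[str]) -> list[dict]:
    """Prepend attached dataset content as a system message so the model
    can ground its answers in it -- no fine-tuning or retraining involved."""
    context = ("Use the following attached data when answering:\n"
               + "\n---\n".join(dataset_snippets))
    return [{"role": "system", "content": context}] + history

# The conversation is stored model-agnostically: switching models
# mid-conversation just means sending the same message list to a
# different backend, so history is preserved for free.
history = [
    {"role": "user", "content": "Summarize the attached notes."},
    {"role": "assistant", "content": "They cover Q3 sales."},
]
prompt = inject_context(history, ["Q3 sales grew 12%.", "Churn fell to 3%."])
print(prompt[0]["role"])  # the injected data rides along as a system message
```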

2.4 Plugin System

Extend functionality with Web Search, Web Scraper, and Document Viewers. Future support includes Code Execution and Image Processing.
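A plugin system of this kind typically exposes a small, uniform interface that every tool implements, so new capabilities can be registered without touching the core. The sketch below is an assumption about the shape of such an interface, not ToolNeuron's actual API:

```python
from abc import ABC, abstractmethod

class Plugin(ABC):
    """Minimal plugin contract: a display name and a run() entry point."""
    name: str

    @abstractmethod
    def run(self, query: str) -> str:
        ...

class WebSearchPlugin(Plugin):
    name = "Web Search"

    def run(self, query: str) -> str:
        # Stubbed: a real implementation would call a search backend
        # and return results for the model to consume.
        return f"[search results for: {query}]"

# Plugins register themselves by name; the chat layer looks them up on demand.
registry: dict[str, Plugin] = {p.name: p for p in [WebSearchPlugin()]}
print(registry["Web Search"].run("GGUF quantization"))
```

Future plugins (Code Execution, Image Processing) would slot into the same registry as additional `Plugin` subclasses.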

3. Comparative Analysis

Unlike standard SaaS AI applications that rely on subscriptions and data harvesting, ToolNeuron offers a transparent, open-source alternative.

Feature     ToolNeuron                Traditional SaaS
Inference   Local (GGUF) + Cloud      Cloud Only
Privacy     Local-First, No Logging   Server-side Logging
Cost        Free / BYOK               Subscription ($20+/mo)
TTS         Offline Neural (Sherpa)   Cloud API

4. System Specifications

To ensure optimal performance for local inference, the following hardware specifications are recommended.

Minimum

  • Android 8.0+ (API 26)
  • 4GB RAM
  • 2GB Free Storage

Recommended (Offline AI)

  • Android 14+
  • 8GB+ RAM
  • Snapdragon 8 Gen 1 (or equivalent)
  • 5GB+ Storage for Models

5. Development Roadmap

The project follows a quarterly release cycle aimed at expanding format support and multi-modal capabilities.

6. Installation & Usage

  1. Install: Download the latest APK from the link above. No Play Store account required.
  2. Setup:
    • Option A (Private): Download a GGUF model from HuggingFace and import via Settings → Local Models.
    • Option B (Cloud): Enter an OpenRouter API key in Settings to access GPT-4, Claude, etc.
  3. Interact: Use the chat interface, enable TTS for voice responses, or attach plugins for web capabilities.