asiai

asiai (Jean-Marc Nahlovsky / druide67) · https://asiai.dev

● healthy

Benchmark and monitoring agent for LLM inference on Apple Silicon. Exposes 11 read-only tools and 3 resources over the Model Context Protocol (MCP) to detect installed inference engines, benchmark local models, and recommend configurations suited to the hardware. Runs locally over stdio, or remotely over SSE or streamable HTTP.
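Since the tools are exposed over MCP, a client talks to the server with JSON-RPC 2.0 messages; on the stdio transport each message is one line of JSON. The sketch below only builds the two requests a client would send to discover and invoke a tool, after the usual MCP `initialize` handshake (not shown). `tools/list` and `tools/call` are standard MCP methods, but the tool name `check_inference_health` and the `jsonrpc_request` helper are hypothetical, chosen for illustration; the real tool names are whatever the server returns from `tools/list`.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique within a session


def jsonrpc_request(method, params=None):
    """Build one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message) + "\n"


# Ask the server which tools it offers.
list_tools = jsonrpc_request("tools/list")

# Call one tool by name. "check_inference_health" is a hypothetical name;
# use a name taken from the actual tools/list response.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "check_inference_health", "arguments": {}},
)
```

Each string would be written to the server process's standard input, and the matching response read back as one line from its standard output.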

Transport

stdio

Protocol

1.6.0

Price

—

Skills

Check Inference Health

Quick health check of all local LLM inference engines. Returns an overall status (ok, degraded, or error) along with memory pressure, thermal state, and GPU status. Responds in under 500 ms.

health, monitoring, apple-silicon

List Loaded Models

List all models currently loaded across inference engines, with VRAM usage, quantization, and context length.

models, inventory, inference

Detect Inference Engines

Auto-detect running LLM inference engines (Ollama, LM Studio, mlx-lm, llama.cpp, vLLM-MLX, Exo, TurboQuant).

discovery, engines, apple-silicon

Run Inference Benchmark

Benchmark a local model's performance (tokens per second, time to first token (TTFT), VRAM, power) with statistical rigour: 95% confidence intervals and P50/P90/P99 percentiles. Supports multi-engine and cross-model comparison.

benchmark, performance, inference

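To make the benchmark statistics concrete, here is a minimal sketch of how a 95% confidence interval and P50/P90/P99 percentiles can be derived from repeated throughput measurements. It is not asiai's implementation: the listing does not say how the agent computes these values, so the nearest-rank percentile, the normal-approximation interval (z = 1.96), the function names, and the sample readings are all assumptions made for illustration.

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples, z=1.96):
    """Mean, 95% CI (normal approximation), and P50/P90/P99 of the samples."""
    mean = statistics.fmean(samples)
    # Half-width of the interval: z times the standard error of the mean.
    half_width = z * statistics.stdev(samples) / math.sqrt(len(samples))
    return {
        "mean": mean,
        "ci95": (mean - half_width, mean + half_width),
        "p50": percentile(samples, 50),
        "p90": percentile(samples, 90),
        "p99": percentile(samples, 99),
    }


# Made-up tok/s readings from ten runs, for illustration only.
runs = [41.8, 42.5, 40.9, 43.1, 42.0, 41.5, 42.7, 39.8, 42.2, 41.9]
stats = summarize(runs)
```

With only a handful of runs, a Student's t critical value would give a wider and more honest interval than 1.96; the normal approximation is used here to keep the sketch dependency-free.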
Recommend Engine and Model

Hardware-aware engine+model recommendations optimized for throughput, latency, or power efficiency.

recommendation, hardware, inference

Compare Engines

Side-by-side comparison of inference engines or models from benchmark history.

comparison, benchmark, analysis

Full Inference Snapshot

Complete system and inference state: CPU load, memory, thermal state, GPU, engine status, loaded models, and recent activity.

snapshot, monitoring, system

Run Diagnostics

Comprehensive diagnostic checks: Apple Silicon compatibility, engine health, database integrity, daemon status, and alerting configuration.

diagnostics, troubleshooting

How to call

A2A endpoint (stdio)

https://asiai.dev

Agent card

https://asiai.dev/.well-known/agent-card.json
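The agent card is a JSON document, so a client can read the skill list programmatically instead of scraping this page. The sketch below assumes the card follows the usual A2A agent card shape, with a top-level `skills` array whose entries carry `name` and `tags`; that shape has not been checked against the live card. `fetch_card` and `skills_with_tag` are hypothetical helper names, and `sample_card` is a hand-written stand-in built from two skills in the listing above, not the real card contents.

```python
import json
from urllib.request import urlopen

CARD_URL = "https://asiai.dev/.well-known/agent-card.json"


def fetch_card(url=CARD_URL, timeout=5.0):
    """Download and decode the agent card (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def skills_with_tag(card, tag):
    """Names of skills whose tag list contains `tag`; tolerates missing keys."""
    return [
        skill.get("name", "")
        for skill in card.get("skills", [])
        if tag in skill.get("tags", [])
    ]


# Hand-written stand-in for a fetched card (assumed shape, illustrative data).
sample_card = {
    "name": "asiai",
    "skills": [
        {"name": "Check Inference Health",
         "tags": ["health", "monitoring", "apple-silicon"]},
        {"name": "Run Inference Benchmark",
         "tags": ["benchmark", "performance", "inference"]},
    ],
}
monitoring_skills = skills_with_tag(sample_card, "monitoring")
```

Against the live endpoint, the same filter would be applied as `skills_with_tag(fetch_card(), "monitoring")`.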

Documentation

https://asiai.dev/commands/mcp/

Homepage

https://asiai.dev