Releases: A3S-Lab/Power
v0.4.2
Performance
- Release binary size reduced by ~48% (29MB → 15MB default, 5.2MB → 3.3MB picolm) via `opt-level = "z"` (optimize for size) combined with fat LTO, `codegen-units = 1`, `strip = true`, and `panic = "abort"`
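The settings above live in the Cargo release profile. A sketch of the equivalent `Cargo.toml` fragment, assembled from the values listed in these notes (the exact profile layout in the repo is assumed):

```toml
[profile.release]
opt-level = "z"     # optimize for size rather than speed
lto = "fat"         # whole-program link-time optimization
codegen-units = 1   # single codegen unit for maximum optimization
strip = true        # strip symbols from the binary
panic = "abort"     # drop unwinding machinery
```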
Bug Fixes
- Fix all clippy warnings (CI green):
  - Replace needless index loops with iterators in picolm, attention, norm
  - Fix excessive float precision in GELU constant
  - Add `Default` impl for `JsonGrammarSampler`
  - Use `?` operator in grammar sampler
- Fix `cargo fmt` formatting in main.rs
Features (from v0.4.1)
- Full CLI with `serve`, `models` (list/pull/rm/show), `chat`, `ps` subcommands
- `PowerConfig::load_from(path)` for `--config` flag support
v0.4.1
Features
- feat(cli): Full CLI with `serve`, `models` (list/pull/rm/show), `chat`, `ps` subcommands
- Backward-compatible: no subcommand = `serve` (same as before)
Performance
- perf: Release binary size reduced ~48% (29MB → 15MB default, 5.2MB → 3.3MB picolm)
  - Switch `opt-level` from `3` to `"z"` (optimize for size)
  - Combined with existing fat LTO, `codegen-units = 1`, `strip = true`, `panic = "abort"`
- perf(picolm): Pre-dequantized layer norms + gate/up dual matvec fusion (+12.4% decode speed)
Other
- docs: Updated README with picolm optimization status and v0.4.0 features
- Added `PowerConfig::load_from(path)` for `--config` flag support
- Made `reqwest` non-optional (needed by CLI HTTP client)
v0.4.0
Features
- Batch prefill — process prompt tokens in batch for faster time-to-first-token
- Grammar-constrained structured output — JSON schema enforcement during generation
- Tool/function calling — OpenAI-compatible tool_calls with auto-dispatch
- Speculative decoding — prompt-lookup draft for faster decode throughput
- AVX2 Q4_K/Q6_K kernels — SIMD-accelerated quantized dot products on x86_64
- Repeat/frequency/presence penalty — configurable repetition control
- Startup self-test — validates norm, f32_dot, q8_0_dot on load
- TEE hardening — AVX2 SIMD vec_dot kernels for secure enclaves
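The speculative-decoding entry above uses prompt-lookup drafting: instead of a separate draft model, the decoder searches the existing context for an earlier occurrence of the current token suffix and proposes the tokens that followed it. A minimal sketch of that idea (function name and signature are illustrative, not the crate's actual API):

```rust
/// Propose draft tokens by finding the most recent earlier occurrence of the
/// current `ngram`-token suffix in the context and copying what followed it.
/// Returns an empty draft when no match exists (caller falls back to normal decode).
fn prompt_lookup_draft(tokens: &[u32], ngram: usize, max_draft: usize) -> Vec<u32> {
    if tokens.len() <= ngram {
        return Vec::new();
    }
    let suffix = &tokens[tokens.len() - ngram..];
    // Scan right-to-left so the most recent match wins.
    for start in (0..tokens.len() - ngram).rev() {
        if &tokens[start..start + ngram] == suffix {
            let draft_start = start + ngram;
            let draft_end = (draft_start + max_draft).min(tokens.len());
            return tokens[draft_start..draft_end].to_vec();
        }
    }
    Vec::new()
}
```

At verification time the target model scores the whole draft in one batch and keeps the longest accepted prefix, which is where the decode-throughput win comes from.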
Performance
- NEON SIMD for attention softmax, RMSNorm, SiLU/add_residual in FFN
- Fused f16 KV attention — dot(q, k_f16) and accumulate(v_f16) without intermediate f32 buffer
- Zero-alloc sampler — pre-allocated probs/indices buffers, no heap allocation per token
- Zero-alloc hot path — pre-allocated ForwardBuffers for all decode operations
- Q4_K NEON kernel rewrite — register-based nibble extraction via NEON intrinsics
- Decode profiling instrumentation — per-stage timing breakdown (embed/attn/ffn/logit/sample)
- ~14 tok/s on Qwen 2.5 0.5B Q4_K_M (Apple Silicon, single-threaded decode)
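The zero-alloc sampler entry above boils down to a buffer-reuse pattern: allocate the probability scratch space once at construction, then overwrite it on every decode step. This is a sketch of the pattern only (type and method names are assumed, not the crate's actual API):

```rust
/// Greedy sampler that reuses a pre-allocated probability buffer, so the
/// per-token hot path performs no heap allocation after construction.
struct ScratchSampler {
    probs: Vec<f32>, // allocated once, overwritten every step
}

impl ScratchSampler {
    fn new(vocab_size: usize) -> Self {
        Self { probs: vec![0.0; vocab_size] }
    }

    /// Softmax into the scratch buffer, then argmax. No allocation per call.
    fn sample(&mut self, logits: &[f32]) -> usize {
        // Numerically stabilized softmax written into the reused buffer.
        let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
        let mut sum = 0.0f32;
        for (p, &l) in self.probs.iter_mut().zip(logits) {
            *p = (l - max).exp();
            sum += *p;
        }
        for p in self.probs.iter_mut() {
            *p /= sum;
        }
        // Greedy argmax over the normalized probabilities.
        let mut best = 0;
        for i in 1..logits.len() {
            if self.probs[i] > self.probs[best] {
                best = i;
            }
        }
        best
    }
}
```

A real sampler would also apply temperature, top-k/top-p, and the repetition penalties listed above, but the allocation discipline is the same: all scratch state lives in the struct, not on the per-token path.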
Fixes
- Resolve clippy warnings for picolm feature build
- Fix missing `#[cfg(feature = "picolm")]` on repeat penalty tests
v0.3.0
- chore: bump version to 0.3.0
- feat(picolm): production-ready pure Rust inference with true layer-streaming
- fix: correct Q4_K/Q5_K/Q6_K dequantization, add Qwen GPT-style tokenizer and attention bias
- test(picolm): add end-to-end integration tests with synthetic GGUF
- feat(picolm): implement pure-Rust LLM inference backend
- docs: add picolm layer-streaming technical deep-dive to README
- docs: rewrite README to lead with irreplaceable value proposition
- docs: add CI/CD badges and section to README, fix test race condition
v0.2.0
- fix: add #[serial] to test_power_home_default to prevent env var race
- fix: use rustls-tls for reqwest, remove OpenSSL dependency
- fix: allow deprecated Nonce::from_slice (generic-array transitive dep)
- fix: clippy warnings for hw-verify feature (imports, type complexity, inspect_err)
- fix(ci): drop --all-features (llamacpp needs C++ toolchain), use hf+hw-verify
- fix(ci): scope fmt to power crate, fix stub lib.rs
- fix(ci): setup-workspace stub, release profile optimization, fix homebrew heredoc
- fix(ci): use --lib for clippy, add cross-build matrix
- feat: v0.2.0 — picolm backend, HF pull, EPC routing, hw-verify, CI/CD
- docs: add Discord community link
- feat(server): graceful shutdown with SIGTERM support and audit log flush
- feat(verify): add client-side attestation verification SDK and CLI
- feat(backend): add embedding model support via HuggingFace format
- feat(api): add stream_options.include_usage and num_parallel passthrough
- feat(power): close gap analysis small items
- fix(api,tee,router): fix active_requests leak, lazy usage counters, rate limiter, redact all occurrences
- fix(auth,config,api): constant-time auth, config validation, lazy usage counters
- fix(api): return proper HTTP status codes on all error paths
- docs(readme): add model_signing_key to config reference, update test count to 671+
- fix: correct keep_alive=0 on cache hit, add active_requests getter, fix serial test flake
- feat(embeddings): add keep_alive to EmbeddingRequest and wire unload logic
- fix(autoload,chat,completions,llamacpp): unload after inference for keep_alive=0, wire config defaults
- feat(api): add keep_alive field and fix autoload integrity, SSE order, and reaper format lookup
- fix: audit streaming paths, real token counts, SSE order, and measurement validation
- feat: model management API, attestation health field, and privacy/TEE wiring
- feat(tee): wire in_memory_decrypt and suppress_token_metrics
- docs: update README for P2/P3 features
- feat(tee): implement P2/P3 security features
- feat: add Box + Power integration example with real model inference
- docs: use explicit remote URL in brew tap instructions
- feat(config): add HCL configuration file format support
- chore: bump a3s-updater to 0.2, add path dependency
- ci: add cargo-publish job to release workflow
- fix: clippy warnings (assertions_on_constants, missing Default, doc comment)
- style: cargo fmt cli/mod.rs
v0.1.5
Full Changelog: v0.1.4...v0.1.5
v0.1.4
Full Changelog: v0.1.2...v0.1.4
v0.1.2
What's New
- Ollama Registry Integration: Pull any model from `registry.ollama.ai` by name — primary resolution source with automatic template, system prompt, parameters, and license extraction
- 3-tier Model Resolution: Ollama Registry → built-in known_models.json → HuggingFace API fallback
- Vision Model Support: Multimodal projector auto-downloaded from Ollama registry for vision models (e.g. llava)
- 878 unit tests passing
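The 3-tier resolution above amounts to a chain of fallbacks where the first source to answer wins. A sketch of that control flow, with every function body a hypothetical stand-in (the real tiers are network and file lookups):

```rust
// Hypothetical stand-ins for the three resolution tiers described above.
fn from_ollama_registry(name: &str) -> Option<String> {
    (name == "llava").then(|| format!("registry.ollama.ai/library/{name}"))
}

fn from_known_models(name: &str) -> Option<String> {
    (name == "qwen2.5").then(|| format!("known_models.json:{name}"))
}

fn from_huggingface(_name: &str) -> Option<String> {
    None // last-resort API fallback, stubbed out here
}

/// First tier that resolves wins; later tiers are only consulted on a miss.
fn resolve_model(name: &str) -> Option<String> {
    from_ollama_registry(name)
        .or_else(|| from_known_models(name))
        .or_else(|| from_huggingface(name))
}
```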
Install
```shell
# Cargo
cargo install a3s-power

# macOS (Apple Silicon)
curl -LO https://github.com/A3S-Lab/Power/releases/download/v0.1.2/a3s-power-v0.1.2-aarch64-apple-darwin.tar.gz
tar xzf a3s-power-v0.1.2-aarch64-apple-darwin.tar.gz
sudo mv a3s-power /usr/local/bin/
```

v0.1.1 - 861 Unit Tests with 90%+ Coverage
What's Changed
Quality & Testing
- 861 unit tests (up from 558) with 90.11% region coverage
- 91.47% function coverage across 59 source files
- Comprehensive test coverage for all modules: API handlers, CLI commands, model management, backend, server
Coverage Highlights
- 14 modules at 100% coverage
- All API handlers tested (native + OpenAI)
- All CLI commands tested
- Model storage, registry, manifest fully covered
- Backend (llama.cpp, chat templates, tool parsing, JSON schema) fully covered
Publishing
- Published to crates.io
- Homebrew formula updated: `brew install a3s-lab/tap/a3s-power`
Install
```shell
# From crates.io
cargo install a3s-power

# From Homebrew (macOS)
brew tap a3s-lab/tap
brew install a3s-power
```

Full Changelog: v0.1.0...v0.1.1
a3s-power v0.1.0
Local model management and serving with OpenAI-compatible API
Key Features:
- Ollama-compatible CLI (run, pull, list, show, delete, serve, create, push, cp)
- OpenAI-compatible HTTP API
- Vision and tool calling support
- Blob management and model push
- Health endpoint and model auto-loading
- 291 tests passing