Structured Engineering Case Studies

Linux & Virtualization Engineering Portfolio

Current role: Linux & Virtualization Engineer, Deutsche Pfandbriefbank AG · Madrid
Published posts: 47 case-study-driven technical notes
Publishing flow: archive-first


Production Spotlight

● Live on Google Play & Web

📱 IntelliFlow: AI Budget Tracker

A production-grade personal finance application serving real users. Features an AI-powered financial coach, offline-first architecture, and cross-platform syncing.


Featured Projects


llamacpp-workbench

Local LLM inference workbench for RK3588 and edge devices

Python · JavaScript

IntelliFlow

AI-powered personal finance app with offline-first architecture

Flutter · Production

Ansible Playbooks

Infrastructure automation for enterprise Linux environments

Ansible · YAML

Recent Case Studies

8 min read

Git Branch Splitting: Untangling Mixed Feature Branches

A practical guide to splitting an oversized Git PR into clean, topic-focused branches using path-based checkout from a fresh branch off main.

Automation Infrastructure
Issue: Mixed branches make PRs unreviewable, increase blast radius, and risk dragging unrelated changes into production. When one branch contains role code, host variables, certificate files, and inventory updates together, reviewers cannot isolate what changed or why.
Solution: Split the oversized branch into multiple clean, topic-focused branches by checking out only the relevant paths from the mixed branch into new branches created fresh off main.
git · devops · ansible · workflow
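The path-based split can be sketched end-to-end in plain Git. The repo, branch, and path names below are illustrative stand-ins for the role/inventory layout described above, set up in a throwaway repo so the commands are runnable as-is:

```shell
# Build a throwaway repo with a "mixed" branch, then split it by path.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.email demo@example.com
git config user.name demo

mkdir -p roles inventory
echo "base role" > roles/common.yml
echo "base hosts" > inventory/hosts.ini
git add -A && git commit -qm "baseline"

# A mixed branch touching role code AND inventory together.
git switch -qc feature/mixed
echo "role change" > roles/common.yml
echo "host change" > inventory/hosts.ini
git commit -qam "mixed changes"

# The split: a fresh branch off main, pulling in ONLY the role paths
# from the mixed branch via path-based checkout.
git switch -qc split/roles main
git checkout feature/mixed -- roles/
git commit -qm "roles: extracted from mixed branch"

# split/roles now carries the role change but not the inventory change.
git log --oneline --stat -1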
10 min read

14 Models Benchmarked on RK3588: The Definitive CPU vs NPU Ranking

Benchmarked every viable local LLM (350M to 26B, CPU and NPU) through a live Discord agent pipeline on RK3588. Findings: the NPU beats the CPU at equal quality, code generation succeeds at every size, and 4B+ models are both slower and worse than 2B models on this board.

Local AI
Issue: Previous benchmarks measured raw llama.cpp throughput but not real quality through the agent pipeline. Models that looked fast synthetically failed at reasoning, refused tool calls, or got intercepted by workspace routing before reaching the model.
Solution: Built a 14-test, 6-dimension benchmark harness that tests every model through the live Discord pipeline with quality validation: reasoning, factual accuracy, code generation, instruction following, tool calling, and math. Tested 14 models (9 CPU GGUF + 3 NPU RKLLM + 2 large MoE) with BENCHMARK_MODE to isolate pure model performance.
rk3588 · radxa · rock-5b-plus · llama.cpp
6 min read

llamacpp-workbench: Remote llama.cpp Control and REAP Model Serving on RK3588

Publishing a practical local-AI control plane for llama.cpp: remote model loading, runtime tuning, streaming chat, and real REAP model serving on a Radxa ROCK 5B+.

Local AI
Issue: Most local model UIs either abstract away the runtime details that actually matter on constrained hardware or assume desktop-class GPUs. On RK3588, that makes it hard to tune context size, KV cache quantization, reasoning behavior, and model selection with confidence.
Solution: Built and published `llamacpp-workbench`, a remote llama.cpp workbench with explicit runtime controls, model presets, markdown chat rendering, streaming responses, and benchmark-backed defaults for REAP and dense GGUF models.
llama.cpp · rk3588 · radxa · rock-5b-plus
15 min read

Qwen3.5 on RK3588 with llama.cpp: Real Benchmarks from a Radxa ROCK 5B+

An advanced benchmark report for running Qwen3.5 locally on RK3588 with source-built llama.cpp: prefill speed, decode speed, stable context, tool-calling behavior, and the practical model choices that actually work on a Radxa ROCK 5B+.

Local AI
Issue: The usual local-AI advice overemphasizes parameter count and underexplains bandwidth, context budget, KV cache policy, and interactive latency. On RK3588, that leads to bad defaults: models that technically load but feel broken in real chat and tool-calling workloads.
Solution: I ran a corrected Qwen3.5 sweep on RK3588 using source-built llama.cpp, quantized KV cache, and task-pass validation. Then I compared prefill, decode, stable context, average latency, and tool-calling behavior to determine the right model for each workload.
rk3588 · radxa · rock-5b-plus · llama.cpp
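Sweeps like this are commonly driven by llama.cpp's bundled `llama-bench` tool, which reports prefill (pp) and decode (tg) throughput per model. A minimal sketch, where the model glob, thread count, and token budgets are illustrative assumptions rather than the article's exact harness:

```shell
# Run a prefill/decode throughput sweep over candidate GGUF models.
# Skips cleanly if llama-bench is not installed on this machine.
if command -v llama-bench >/dev/null 2>&1; then
  for m in models/qwen*-q4_k_m.gguf; do
    llama-bench -m "$m" -p 512 -n 128 -t 8
  done
fi
```

Raw throughput alone is not the article's final metric; the sweep above only covers the speed dimensions, with quality validated separately.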
15 min read

GPU VRAM, CPU Offload, and llama.cpp: The Real Performance Cliff

An advanced guide to local GPU inference with llama.cpp: why bandwidth matters more than model fit, how hybrid GPU+CPU offload behaves on cards like the RTX 3060 and 5070, what quantization really means mathematically, and how to run it on Linux, Windows, and WSL.

Local AI
Issue: Operators lacked a practical framework for choosing quantization, sizing VRAM budgets, deciding when CPU offload is acceptable, and understanding the difference between weight quantization and KV cache quantization. Windows-specific setup questions also created confusion around native builds versus WSL.
Solution: Documented the bandwidth-first model, explained hybrid offload behavior for 12 GB and mid-range modern GPUs, compared quantization choices such as Q4_K_M and q4_0 KV cache, and provided concrete llama.cpp launch patterns for Linux, Windows, and WSL.
local-ai · llama.cpp · cuda · vram
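The launch patterns come down to three knobs: GPU layer count, context size, and KV cache type. A hedged sketch of a hybrid-offload launch for a 12 GB card; the model path and layer split are assumptions for illustration, not measured recommendations (exact flag syntax can vary between llama.cpp versions):

```shell
# Hybrid GPU+CPU offload: offload as many layers as fit in VRAM, leave the
# rest on CPU, and quantize the KV cache to stretch context on 12 GB.
#   --n-gpu-layers : layers placed in VRAM (lower this if you hit OOM)
#   --flash-attn   : needed for quantized KV cache types
llama-server \
  -m ./models/model-q4_k_m.gguf \
  --n-gpu-layers 32 \
  --ctx-size 16384 \
  --flash-attn \
  --cache-type-k q8_0 \
  --cache-type-v q8_0
```

The same invocation works natively on Linux and Windows builds, and inside WSL; only the binary and model paths change.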
3 min read

Implementing Google's TurboQuant: Hybrid KV Cache for Edge LLM Deployment

How I implemented hybrid per-layer KV cache quantization on RK3588 using insights from Google's TurboQuant research, achieving 17% better compression with zero quality loss.

Local AI
Issue: Every time you message an AI chatbot, the model stores the conversation in temporary memory called the KV cache. On large models, this cache alone can consume 40 GB, more than the model weights themselves. On a constrained edge device, this is the difference between working and broken.
Solution: Implemented hybrid per-layer KV cache quantization inspired by Google's TurboQuant (ICLR 2026). Using 8-bit quantization for early transformer layers (where attention quality matters most) and 4-bit quantization for later layers, I achieved 17% better compression without quality loss.
local-ai · edge-ai · turboquant · kv-cache
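The memory math behind the hybrid scheme can be sketched with back-of-envelope numbers. The shapes below (32 layers, 8 KV heads, head dimension 128, 4096-token context, 8-bit for the first 8 layers) are illustrative assumptions, not the article's measured model or its 17% figure:

```shell
# Elements per layer: K and V planes, each kv_heads * head_dim per token.
layers=32; kv_heads=8; head_dim=128; ctx=4096
per_layer=$((2 * kv_heads * head_dim * ctx))

# fp16 baseline: 2 bytes per element across all layers.
fp16=$((layers * per_layer * 2))

# Hybrid: first 8 layers at 8-bit (1 byte/elem), remaining 24 at 4-bit (0.5).
hybrid=$((8 * per_layer * 1 + 24 * per_layer / 2))

echo "fp16 KV cache:   $((fp16 / 1048576)) MiB"
echo "hybrid KV cache: $((hybrid / 1048576)) MiB"
```

Keeping the early layers at 8-bit is the quality hedge: the cache shrinks far below fp16 while the attention-sensitive layers stay at higher precision than a uniform 4-bit cache would give them.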