2026

Git Branch Splitting: Untangling Mixed Feature Branches

Issue: Mixed branches make PRs unreviewable, increase blast radius, and risk dragging unrelated changes into production. When one branch contains role code, host variables, certificate files, and inventory updates together, reviewers cannot isolate what changed or why.

Solution: Split the oversized branch into multiple clean, topic-focused branches by checking out only the relevant paths from the mixed branch into new branches created fresh off main.

Automation Infrastructure
14 Models Benchmarked on RK3588: The Definitive CPU vs NPU Ranking

Issue: Previous benchmarks measured raw llama.cpp throughput but not real quality through the agent pipeline. Models that looked fast synthetically failed at reasoning, refused tool calls, or got intercepted by workspace routing before reaching the model.

Solution: Built a 14-test, 6-dimension benchmark harness that tests every model through the live Discord pipeline with quality validation: reasoning, factual accuracy, code generation, instruction following, tool calling, and math. Tested 14 models (9 CPU GGUF + 3 NPU RKLLM + 2 large MoE) with BENCHMARK_MODE to isolate pure model performance.

Local AI
llamacpp-workbench: Remote llama.cpp Control and REAP Model Serving on RK3588

Issue: Most local model UIs either abstract away the runtime details that actually matter on constrained hardware or assume desktop-class GPUs. On RK3588, that makes it harder to tune context, KV cache quantization, reasoning behavior, and model selection credibly.

Solution: Built and published `llamacpp-workbench`, a remote llama.cpp workbench with explicit runtime controls, model presets, markdown chat rendering, streaming responses, and benchmark-backed defaults for REAP and dense GGUF models.

Local AI
Qwen3.5 on RK3588 with llama.cpp: Real Benchmarks from a Radxa ROCK 5B+

Issue: The usual local-AI advice overemphasizes parameter count and underexplains bandwidth, context budget, KV cache policy, and interactive latency. On RK3588, that leads to bad defaults: models that technically load but feel broken in real chat and tool-calling workloads.

Solution: I ran a corrected Qwen3.5 sweep on RK3588 using source-built llama.cpp, quantized KV cache, and task-pass validation. Then I compared prefill, decode, stable context, average latency, and tool-calling behavior to determine the right model for each workload.

Local AI
GPU VRAM, CPU Offload, and llama.cpp: The Real Performance Cliff

Issue: Operators lacked a practical framework for choosing quantization, sizing VRAM budgets, deciding when CPU offload is acceptable, and understanding the difference between weight quantization and KV cache quantization. Windows-specific setup questions also created confusion around native builds versus WSL.

Solution: Documented the bandwidth-first model, explained hybrid offload behavior for 12 GB and mid-range modern GPUs, compared quantization choices such as Q4_K_M and q4_0 KV cache, and provided concrete llama.cpp launch patterns for Linux, Windows, and WSL.

Local AI
Implementing Google's TurboQuant: Hybrid KV Cache for Edge LLM Deployment

Issue: Every time you message an AI chatbot, the model stores your conversation in temporary memory called the KV cache. On large models, this cache alone can consume 40GB—more than the model itself. On a constrained edge device, this is the difference between working and broken.

Solution: Implemented hybrid per-layer KV cache quantization inspired by Google's TurboQuant (ICLR 2026). By using 8-bit quantization for early transformer layers (where attention quality matters most) and 4-bit quantization for later layers, we achieved 17% better compression without quality loss.

Local AI
The Architecture of Speed: Real-Time Telemetry and Generative AI in 2026 Motorsport

Issue: Processing millions of high-velocity data points per second for immediate broadcast insights and race strategy required moving beyond traditional databases to highly decoupled, event-driven streaming architectures capable of sub-millisecond HTAP and GenAI integrations.

Solution: A technical deep dive into F1's AWS 'Track Pulse' architecture utilizing Kinesis sharding and DynamoDB caching, compared alongside Formula E's GCP HTAP architecture leveraging Pub/Sub, AlloyDB's columnar engine, and Vertex AI for real-time coaching.

Cloud
RK3588 NPU Router Architecture: What Actually Runs, What Wins, and Why

Issue: Several paths technically loaded but were not practically usable. Large models timed out or delivered poor latency, CPU tuning mattered more than expected, and the product narrative needed to shift from 'many runtimes' to a benchmark-backed llama.cpp-first architecture.

Solution: Benchmarked llama.cpp and RKLLM on RK3588, identified the winning CPU configs for Qwen 3.5 4B and 9B, clarified where the NPU helps, documented KV cache and quantization choices, and reframed the architecture as llama.cpp-first with NPU used selectively.

Local AI
The Comprehensive Linux Engineer Command List

Issue: Scattered knowledge means slower response times during critical operations. Having Linux commands on one page and Ansible/Python commands on another breaks the operational flow.

Solution: Compiled every sanitized, production-tested command snippet from my daily workflow into a single, massive reference guide with a coordinated SVG poster set.

Snippets Infrastructure Automation
Ansible Vault, Python, and Molecule Snippets

Issue: One mixed list of Linux and automation commands is hard to scan during a delivery window. The commands need context, safe placeholders, and a quick explanation of the flags that matter.

Solution: Split the automation workflow into its own sanitized snippets post and grouped the commands into the same order I usually follow in a fresh repository: bootstrap, dependencies, secrets, linting, test scenarios, and quick local sharing.

Snippets Automation
Azure Provisioned Throughput: When Fixed Costs Beat Pay-Per-Token

Issue: As your application scales, Microsoft's default rate limits can throttle your service, leading to slow responses and inconsistent user experiences. You're essentially stuck in traffic during peak hours.

Solution: Think of it like a toll road. Standard use is like paying per mile, but you're stuck in traffic. Azure's Provisioned Throughput (PTU) is like renting your own dedicated express lane. We built a framework to calculate the exact financial break-even point between the two models.

AI
Daily Linux Ops Command Cheatsheet

Issue: The knowledge existed, but it was fragmented across storage work, account management, package checks, Git recovery, and automation workflows. That fragmentation increases the chance of typos and slows down repeat work.

Solution: Consolidated the most reused Linux and admin commands into a snippets-first cheatsheet, grouped them by task, added flag guidance, and replaced every real identifier with placeholders.

Snippets Infrastructure
DNS Migration Strategy for Zero-Downtime System Replacement

Issue: Direct IP replacement would cause service disruption. Applications had hardcoded references to old hostnames. Certificates were tied to specific DNS names. Testing needed to happen in parallel with production operation.

Solution: Implemented a two-phase DNS migration strategy using temporary test records, multi-SAN certificates, and coordinated DNS switchover during a planned maintenance window.

Infrastructure
Edge LLM Optimization: Memory Bandwidth and Context Management

Issue: Edge devices have hard constraints: limited RAM, no GPU VRAM, and strict latency requirements for interactive applications. The naive approach of 'make the model fit' failed repeatedly—either latency was too high or context windows would overflow during long conversations.

Solution: Developed a three-pronged approach: (1) enforce bandwidth-first model selection, (2) use KV cache quantization to reduce memory footprint, and (3) implement hierarchical context folding for long conversations.

Local AI
Enterprise Certificate Lifecycle Management with Ansible

Issue: No certificate lifecycle management, manual deployment prone to human error, security risks from unencrypted private keys, and reactive rather than proactive expiration monitoring causing service disruptions.

Solution: Implemented comprehensive certificate automation using OpenSSL for CSR generation, Ansible Vault for encryption, automated deployment roles, expiration monitoring with 90-day alerts, and standardized multi-SAN certificate templates.

Automation
Training Custom AI Models for Insurance Document Processing

Issue: Off-the-shelf OCR solutions couldn't handle the complexity of insurance documents. Different insurers used different layouts, multilingual support was limited, and extracted data needed to conform to a strict canonical schema for downstream systems.

Solution: Implemented a custom document intelligence solution using Azure AI Document Intelligence, training models on labeled examples to extract and normalize fields across multiple insurers and languages.

AI
IntelliFlow: Building a Production-Ready Finance App with AI

Issue: Personal finance apps either had weak security practices, unclear data policies, or required trusting black-box systems. I needed full visibility and control.

Solution: Built IntelliFlow from scratch with infrastructure-grade security: encrypted local storage, strict Firebase security rules, biometric auth, and AI features with privacy-preserving design and prompt injection safeguards.

Kotlin
LVM Operations: Expand, Shrink, and Migrate Volumes

Issue: Storage operations were handled inconsistently across the team. Some admins would reboot servers for partition changes, others would attempt risky online operations without proper checkpoints, and migrations often resulted in extended downtime windows.

Solution: Documented a standardized LVM playbook covering the three core operations—expansion, shrinking, and migration—with clear pre-flight checks, execution steps, and rollback procedures.

Infrastructure
IntelliAuto: AI-Powered Automotive Assistant with Secure Monetization

Issue: Existing automotive apps are passive logs. Adding AI creates risks: prompt injection through user input, data privacy concerns, API cost runaway, and potential for incorrect safety-critical advice.

Solution: Designed IntelliAuto with AutoMind AI assistant featuring backend proxy architecture, multi-layer prompt injection prevention, dynamic affiliate link generation, and strict safety disclaimers for automotive advice.

Kotlin AI
Linux-Active Directory Integration: Access Control, SSO, and Troubleshooting

Issue: AD integration was fragmented across multiple playbooks with no unified approach. Users couldn't 'su' to service accounts, SSO setup was manual and error-prone, and access control required manual sudoers edits on each server.

Solution: Implemented a unified AD integration strategy: AD group mapping for sudo access, automated Kerberos keytab deployment via Ansible, and standardized PAM configuration across all servers.

Infrastructure
Automating NFS Share Management at Scale

Issue: NFS configuration was inconsistent across servers. Some used hostnames, others used IPs. Network routing issues caused connections over slow backup networks instead of high-bandwidth production networks. Permission errors blocked user access.

Solution: Implemented automated NFS management using Ansible roles for export configuration, client mounting with proper network selection, and troubleshooting runbooks for common failure scenarios.

Infrastructure
Server Provisioning Playbook: From VM Request to Production

Issue: Server provisioning was inconsistent across team members. Some skipped steps, documentation was scattered across wikis and emails, and handoffs to application teams were incomplete—missing access groups, wrong technical user configurations, or incomplete application dependencies.

Solution: Developed a standardized provisioning checklist and Ansible playbook structure that covers the complete lifecycle from VM deployment to application-ready state.

Infrastructure
Security Layering for Edge AI APIs: Encryption, Rate Limits, Validation, and Monitoring

Issue: Without explicit controls, an AI API is vulnerable to abuse (burst traffic), unsafe inputs (command/path traversal), leaked secrets, and silent security regressions from dependencies.

Solution: Implemented five security modules: encryption at rest, enhanced rate limiting, advanced input validation, security monitoring + alerts, and vulnerability scanning with report generation.

Infrastructure
Automating Firebase Deployments: Multi-Account Routing and Discord Notifications

Issue: Manual Firebase deployments are easy to mis-target (wrong project/hosting target), hard to audit, and slow to coordinate without realtime status notifications.

Solution: Centralized deployment configuration into an `accounts.json` profile, added API endpoints for account switching, and integrated Discord webhooks for start/success/failure notifications with log snippets.

Cloud
RK3588 LLM Performance: NPU vs CPU in a Discord Agent

Issue: CPU-only inference on small models was too slow for interactive UX, and some NPU model runs initially failed for non-runtime reasons (corrupted downloads or wrong target platform conversions).

Solution: Benchmarked CPU (Ollama) vs NPU (RKLLM), applied system and inference parameter optimizations, and documented failure modes to distinguish model-file issues from NPU/runtime issues.

Local AI
Stretched Networks and Leaf-Spine Architecture

Issue: Lack of understanding about stretched networks, leaf-spine trade-offs, and how application traffic patterns would be affected.

Solution: Documented the stretched network architecture, analyzed application traffic flows, and provided clear guidance on which applications were suitable for stretched L2 vs. Layer 3 approaches.

Infrastructure
Automating AD Computer Object Deletion on Linux Decommission

Issue: Needed a repeatable way to use Ansible and adcli to safely remove a Linux server's computer object from Active Directory during decommissioning.

Solution: Implemented a practical runbook/automation pattern with clear safety checks, execution steps, and verification points.

Automation
Modernizing Android UX: High Refresh Rates & App Shortcuts

Issue: The app was locked to standard 60Hz rendering, causing sub-optimal scrolling experiences on devices capable of 90Hz or 120Hz. Additionally, users had to navigate through multiple screens to perform frequent actions.

Solution: Detected 90Hz+ display modes and configured window post-processing preferences for smoother rendering, then implemented static XML-based app shortcuts routed via deep links.

Kotlin AI
Silent Software Installations with Ansible

Issue: Manual software installations were time-consuming, inconsistent across servers, and couldn't be reproduced reliably for disaster recovery.

Solution: Developed Ansible patterns for silent installations with templated response files, pre-requisite validation, and idempotent deployment checks.

Infrastructure Automation
Building Custom Ansible Execution Environments

Issue: Ansible playbooks that worked on the control node failed on execution environments with missing dependencies, and reproducing issues was difficult without consistent environments.

Solution: Built custom Execution Environments using ansible-builder, packaging all Python dependencies, Ansible collections, and system packages into versioned container images.

Infrastructure Automation
PostgreSQL WAL Archiving with SELinux Considerations

Issue: No WAL archiving configured, SELinux contexts incorrect for archive directories, and point-in-time recovery was impossible.

Solution: Configured PostgreSQL WAL archiving with proper SELinux file contexts, tested restore procedures, and documented the end-to-end recovery process.

Infrastructure
Apache as a Reverse Proxy: Ansible Deployment Pattern

Issue: No consistent reverse proxy pattern, manual SSL certificate management, and inconsistent load balancer configurations across environments.

Solution: Developed an Ansible role for Apache reverse proxy with automated SSL deployment, health check endpoints, and standardized load balancer configurations.

Infrastructure Automation
Shipping My First Android App: IntelliFlow

Issue: Needed a repeatable way to leverage AI scaffolding to focus on infrastructure, security, and architecture while building a personal finance app.

Solution: Implemented a practical runbook/automation pattern with clear safety checks, execution steps, and verification points.

Kotlin
Building Golden Images with Packer and StackGuardian

Issue: No standardized golden images, manual image building was error-prone, and configuration drift between images caused deployment failures.

Solution: Implemented Packer with StackGuardian for automated golden image pipelines, creating standardized RHEL images with consistent configurations.

Infrastructure Automation
Securing and Scaling AI Context in an Automotive Assistant

Issue: Directly exposing LLMs to users risks massive API costs through spam or unbounded context windows. Furthermore, raw user input is vulnerable to jailbreaks (e.g., 'ignore previous instructions and execute code').

Solution: Implemented a multi-tier model routing strategy (chat vs reasoning), robust context truncation, regex-based jailbreak detection, and strict timestamp-based rate limiting.

AI Kotlin
Orchestrating Patching Waves for Enterprise Linux

Issue: Big-bang patching caused widespread outages with no rollback strategy, and identifying affected systems took hours during incidents.

Solution: Implemented wave-based patching with health gates between waves, automatic rollback triggers, and per-wave ownership documentation.

Infrastructure Automation
Testing Ansible Roles with Molecule and Docker

Issue: No automated testing for Ansible roles, production deployments were the first test, and role regressions were discovered only after incidents.

Solution: Implemented Molecule with Docker for local role testing, integrated into CI pipeline to catch issues before merge.

Infrastructure Automation
Building a Multilingual AI Backend for Part Recognition

Issue: The backend AI needed to recognize user intent and categorize vehicle parts accurately regardless of the input language, and subsequently generate both localized predictive maintenance responses and tailored affiliate search queries.

Solution: Implemented comprehensive multi-language keyword dictionaries, extracted user language context directly from client requests, and used mapping dictionaries to serve localized response templates.

AI
Slashing LLM API Costs with System Prompt Caching

Issue: Large Language Models charge per token. When you send a 1,000-token system prompt alongside a 50-token user question, you pay for 1,050 tokens every time, even though 95% of the payload never changes between requests.

Solution: Restructured the API payload to isolate static system instructions so the backend can take advantage of cached-input pricing or prompt caching features where the provider supports it.

AI
Managing Linux Users and Groups with Ansible

Issue: No centralized user management for local accounts, UID/GID inconsistencies breaking applications, and sudo access scattered across individual sudoers files.

Solution: Implemented Ansible-based user management with host_vars for server-specific accounts, standardized UID/GID ranges, and templated sudoers configurations.

Infrastructure Automation
Bash Script for User Permission Audits

Issue: No automated way to gather user permission data, manual auditing was error-prone and time-consuming, and compliance reports were always delayed.

Solution: Developed a bash script that collects user accounts, sudo access, and group memberships, outputting a standardized report that could be consolidated across all servers.

Infrastructure
Implementing the Outbox Pattern for Offline-First Sync

Issue: Direct-to-cloud write operations failed silently during poor network conditions. Historical data had hardcoded sync limits, and offline/guest modes were improperly triggering authentication flows.

Solution: Adopted the Outbox Pattern for all write operations, separated local execution from cloud sync workers, and implemented comprehensive state tracking with retry logic.

Kotlin
Tracking Required Reboots with RHEL Tracer

Issue: No visibility into which services had pending restarts, leading to either unnecessary reboots or missed restarts that caused instability.

Solution: Implemented tracer integration to identify pending restarts, combined with a tiered reboot strategy based on application criticality.

Infrastructure
Essential Red Hat Linux Administrator Commands

Issue: No single source of truth for common RHEL administration commands, leading to inconsistent practices and repeated onboarding questions.

Solution: Created a living cheatsheet covering systemd, LVM, networking, user management, and troubleshooting - the commands used daily in our environment.

Infrastructure
Safely Resolving Git Merge Conflicts

Issue: Needed a repeatable way to resolve git merge conflicts using git stash to protect your local work.

Solution: Implemented a practical runbook/automation pattern with clear safety checks, execution steps, and verification points.

Snippets
Infrastructure as Code: Structuring Ansible Repositories

Issue: No clear separation between dev and prod environments, inconsistent variable hierarchy, and accidental cross-environment changes were becoming common.

Solution: Implemented a standardized repository structure with separate inventory directories, clear group_vars/host_vars hierarchy, and environment-specific variable overrides.

Infrastructure Automation