Technical Articles
Structured knowledge from client work, explorations, and observation.
AI Systems (15)
Every approach to context generation -- from basic chunking to knowledge compilation -- is an instance of the same pattern. The quality of the build step determines what your agent can do at runtime.
A reference guide to evaluation datasets, metrics, and methodology organised by output type and professional domain.
Most organisations have a formal model of their customer relationships and a real one that differs from it. Embeddings and community detection surface the real structure. Here is how the pattern works and where it applies.
The format you use to pass data to a language model affects reliability and cost more than most practitioners expect. This is a taxonomy of the main options and the conditions under which each performs well.
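A toy sketch of one cost difference the taxonomy covers: row-oriented JSON repeats every field name per record, while CSV states the header once. The records and the character-count proxy for token cost are illustrative assumptions, not figures from the article.

```python
import csv, io, json

# The same three records serialised two ways; sizes diverge because JSON
# repeats every field name per record while CSV states them once.
records = [
    {"name": "Ada", "role": "engineer", "score": 91},
    {"name": "Grace", "role": "admiral", "score": 88},
    {"name": "Alan", "role": "logician", "score": 95},
]

as_json = json.dumps(records)

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["name", "role", "score"])
writer.writeheader()
writer.writerows(records)
as_csv = buf.getvalue()

# Character count is only a crude proxy for token count, but the direction
# holds: repeated keys make flat row-oriented JSON the more expensive format.
print(len(as_json), len(as_csv))
```

The gap widens with more rows and longer field names, which is why the choice matters most for high-volume, fixed-schema payloads.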
Foundation models changed the cost of the mechanism. They did not change the first question: where are the labels? A labelled dataset is a measurement instrument, not training fuel.
Most production AI workloads are high-volume and fixed-task. Dynamic planning adds cost and reduces auditability while offering nothing a static pipeline cannot already do.
Context bleeds between agents in multi-agent systems. The XBOW experiment suggests that for exploratory tasks, this is a feature rather than a failure mode.
Vector search in production needs metadata filters, business rules, and combined ranking signals. How index structure determines what actually executes efficiently.
LLMs are trained in stages with different objectives. How the shift from statistical training to human preference ranking produces behaviours that metrics alone cannot explain.
Imbalanced datasets cause models to ignore the minority class. The sampling strategies that actually work for fraud detection, medical diagnosis, and quality control.
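A minimal sketch of the simplest strategy in that family, naive random oversampling; the row format and class labels are hypothetical, and SMOTE-style synthesis or undersampling are the usual alternatives.

```python
import random

def oversample_minority(rows, label_key="label", seed=0):
    """Duplicate minority-class rows at random until every class is as
    frequent as the largest one. A sketch of one strategy only."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # rng.choices with k=0 is a no-op for the already-largest class.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced

# Hypothetical 2:8 fraud/ok split, balanced to 8:8 after oversampling.
data = [{"label": "fraud"}] * 2 + [{"label": "ok"}] * 8
balanced = oversample_minority(data)
```

Balancing must happen after the train/test split, never before, or duplicated rows leak into the evaluation set.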
LLM outputs are probabilistic and context-dependent. A structured approach to evaluating language models in production across multiple dimensions.
One-hot, ordinal, target, embedding -- categorical encoding choices affect model performance significantly. A practical guide with model-specific recommendations.
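As a minimal sketch of the first option, one-hot encoding in plain Python; the colour column is an illustrative example, not data from the guide.

```python
def one_hot(values):
    """One-hot encode a categorical column: one binary indicator per
    distinct category. A sorted vocabulary keeps column order stable
    between fitting and transforming."""
    vocab = sorted(set(values))
    index = {cat: i for i, cat in enumerate(vocab)}
    rows = []
    for value in values:
        row = [0] * len(vocab)
        row[index[value]] = 1
        rows.append(row)
    return vocab, rows

vocab, rows = one_hot(["red", "green", "red", "blue"])
# vocab -> ["blue", "green", "red"]; each row has exactly one 1
```

The width of each row equals the number of distinct categories, which is exactly why high-cardinality columns push you towards target or embedding encodings instead.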
Vector databases are harder than they look. A technical examination of similarity search, indexing trade-offs, and why naive implementations fail at scale.
RAG is not one technique. A breakdown of retrieval-augmented generation approaches by complexity, and how to match tooling to the technique you actually need.
Memory is what separates an AI agent from a stateless function. The types of memory available, how they work, and when each is appropriate.
Product & Strategy (13)
Eval datasets, scoring pipelines, and deployment gates for ML systems are not new concepts. Software engineers have been doing this for years under different names.
Vertically integrated AI companies collect data, train models, and sell inference. This is a business process error. The organisation that understands the task should own the data that defines it.
GEO tools claim to measure your visibility in AI-generated responses. Understanding how they actually collect data reveals why the problem is harder than SEO.
The Chicago school asks what AI allows you to stop paying for. The Austrian school asks what latent structure already exists in your accumulated data, waiting to be read.
Productivity growth has slowed across OECD economies. One explanation: the statistics measure the wrong competition. Capital and labour are not on the same curve.
Indirection -- accessing something through a reference rather than directly -- is a C programming concept. It also explains a surprising number of modern frustrations.
A model with 95% accuracy can destroy business value. How to choose evaluation metrics that reflect what the business actually needs.
Software engineers expect delivery to end at deployment. In ML, that is where the work begins. Four structural differences that change how ML projects are scoped, managed, and judged.
The traps that derail early AI products are predictable. Five failure modes from years of helping startups build and ship.
A practical guide for business and product owners who need to communicate clearly with AI engineers without speaking code.
LLMs introduced a new kind of problem into software: the model's interpretation of intent can diverge from the developer's. What alignment actually means in practice.
Choosing the right ML design pattern matters more than choosing the right model. Key patterns with applications to marketing and audience intelligence.
ML is a well-defined problem with specific costs. Using F(X)=Y to break down what you are actually paying for and who needs to do it.
Data Strategy (7)
Data engineering is one label for two distinct disciplines that require different skills and, when conflated, fail in different ways. Understanding the split is the first step to hiring and structuring a data team that actually works.
Siloed data is a list. Connected data encodes structure -- communities, gaps, relationships -- that accumulated through ordinary operation and has never been made visible.
Structured databases contain context that LLMs need to generate accurate queries. How to expose schema information in a way that preserves relationships and business logic.
Cross-validation is usually framed as a model evaluation technique. It is also the most reliable way to find which training examples are hurting your model.
Data maturity is a spectrum from ad hoc storage to strategic data infrastructure. What it is, how to assess it, and why it determines what AI is actually feasible.
Communication metadata is a structured sample of how an organisation actually works. What AI methods can surface from it.
A checklist for integrating machine learning into a product without overcommitting data infrastructure before you know what the problem is.
Architecture & Deployment (9)
From raw PyTorch to managed private API services. What each layer of the inference stack does, where the tools come from, and how they relate.
Multi-tenancy is not a single pattern. It is a spectrum from shared tables to fully separate infrastructure, and the right point on that spectrum depends on what varies between tenants, where your operational complexity budget sits, and what failure looks like.
The medallion architecture -- bronze for raw ingestion, silver for enriched and validated records, gold for aggregated outputs -- maps cleanly onto PostgreSQL. Each layer has a single concern. The transitions between them are where the interesting engineering lives.
PostgreSQL's LISTEN/NOTIFY mechanism lets you trigger embedding generation the moment a row is inserted, without polling, without a separate scheduler, and without coupling your embedding service to your database schema.
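A sketch of the shape of that pattern. The documents table, channel name, and psycopg2-style listener are illustrative assumptions, not a prescribed implementation.

```python
import select

# Database side: an AFTER INSERT trigger publishes the new row's id on a
# channel. Table and channel names here are hypothetical.
TRIGGER_SQL = """
CREATE OR REPLACE FUNCTION notify_new_row() RETURNS trigger AS $$
BEGIN
    PERFORM pg_notify('row_inserted', NEW.id::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER documents_notify
AFTER INSERT ON documents
FOR EACH ROW EXECUTE FUNCTION notify_new_row();
"""

def listen_for_inserts(conn, handle, channel="row_inserted"):
    """Block on the connection's socket until Postgres pushes a
    notification, then hand each payload (a row id) to the embedding
    service. `conn` is assumed to be a psycopg2 connection in
    autocommit mode."""
    cur = conn.cursor()
    cur.execute(f"LISTEN {channel};")
    while True:
        # select() sleeps until the socket has data: no polling loop.
        if select.select([conn], [], [], 60) == ([], [], []):
            continue  # timeout with no notifications; wait again
        conn.poll()
        while conn.notifies:
            note = conn.notifies.pop(0)
            handle(note.payload)
```

The embedding service only ever sees row ids on a channel, which is what keeps it decoupled from the table schema.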
Data egress is the constraint that kills AI projects in regulated sectors. Open-weight models deployed inside a VPC remove the objection before it reaches legal review.
Facebook's progression from a single database to a sharded architecture contains practical lessons for any system that needs to grow. The constraints that drove each decision.
AWS, Borg, chaos engineering -- successful organisations build platforms rather than optimise pipelines. The distinction determines whether engineering effort compounds.
AI-assisted development accelerates a long-running separation between writing code and building systems. Why that distinction matters and what it means for teams.
LAMP, JAMstack, microservices, serverless -- what these terms actually mean and how to navigate architecture decisions without a technical co-founder.
Infrastructure (3)
Three risk categories that make private inference the right architectural choice -- regulatory constraint, exposure risk, and the largely unacknowledged risk of your data appearing in someone else's model output.
Open-source models on serverless infrastructure cost less than incumbent AI services. A practical guide to deploying on AWS Lambda.
The Linux primitives behind Docker containers -- namespaces, cgroups, and how they make containers faster than virtual machines.
Developer Tools (6)
A drop-in replacement for OpenAI and Anthropic endpoints, running open-weight models on private AWS infrastructure in London. Your data does not leave your network. We do not train on it.
Most workflow libraries couple the pipeline definition to the execution environment. runfox separates them: the same YAML definition runs in-process, against SQLite, or distributed across SQS and DynamoDB, with no changes to the workflow code.
Standard JSON Logic resolves single values via dot notation. When rules need to match against lists -- model outputs, tag arrays, multi-value fields -- dot notation is the wrong tool. json-logic-path adds a vars operator backed by JSONPath.
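A minimal illustration of the gap, not json-logic-path's actual API: both helpers below are hypothetical, showing why a dot path can only name one fixed list position while a wildcard path can test every element.

```python
def dot_get(data, path):
    """JSON-Logic-style `var` lookup: each dot segment indexes one level.
    A list can only be addressed by a literal position, e.g. 'tags.0'."""
    for key in path.split("."):
        if isinstance(data, list):
            data = data[int(key)]
        else:
            data = data.get(key)
    return data

def path_get(data, segments):
    """Minimal JSONPath-flavoured lookup: '*' fans out over every list
    element and returns all matches, so a rule can ask 'does any tag
    equal x' instead of naming a position."""
    results = [data]
    for seg in segments:
        next_results = []
        for node in results:
            if seg == "*" and isinstance(node, list):
                next_results.extend(node)
            elif isinstance(node, dict) and seg in node:
                next_results.append(node[seg])
        results = next_results
    return results

doc = {"tags": [{"name": "urgent"}, {"name": "billing"}]}
first = dot_get(doc, "tags.0.name")              # one fixed position
matches = path_get(doc, ["tags", "*", "name"])   # every element
```

With the wildcard form, "any tag is billing" becomes a membership test over `matches`, which is the kind of rule dot notation cannot express.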
API Gateway configuration and Python route definitions live in separate files and drift apart. fastapi-aws generates both the AWS integration spec and the public API documentation from a single Python source.
DynamoDB rewards developers who define access patterns before writing schema. boto3 enforces none of that discipline. dynawrap moves the pattern onto the class and lets you swap the backend for PostgreSQL when AWS is not available.
Rate limits slow down early-stage development. APICache is an open-source library that caches API responses locally so you can iterate without hitting limits.
Security & Resilience (1)
Publicly deployed AI systems introduce specific vulnerabilities. A breakdown of the key risks and what to do about them.
Perspectives (10)
HTTP content negotiation was designed to decouple content from form. It stopped at format selection. Generative inference completes the original intention.
AI communication inherited natural language from human communication. That constraint is not a technical necessity.
The measure-optimise-adjust loop now running through AI first ran through advertising technology. The pathologies are the same. Adtech just got there a decade earlier.
Model weights are the fat layer -- general, beneath everything, available at marginal cost. Workflows are thin clients. Most AI infrastructure is built the wrong way round.
Single source of truth, idempotency, rollback, least privilege -- software engineers formalised governance problems that other institutions still handle by convention.
Generative AI removed the cost constraint that forced editorial discipline. When creation is free and unlimited, the signal collapses. Curation is the new scarcity.
The 21 million bitcoin limit is not in the whitepaper. It derives from a 32-bit integer ceiling, a type migration, and a patch written under pressure after a 184 billion BTC exploit.
Bitcoin and AI both convert electrical energy into discrete computational units via open protocols. The economic structure is closer than it first appears.
DevOps operationalised code. MLOps operationalised models. PMFOps is the emerging third layer -- treating the audience as a testable, versioned artefact.
Foundation models are not fast humans. Treating them as a non-deterministic data store opens more useful questions than the labour displacement framing does.