Machine learning is a well-defined problem with specific costs and roles. In this article, we break down the machine learning process using the equation F(X) = Y to relate each component to costs and expertise. This approach allows businesses to estimate the costs of each part of the process and ensure they have the necessary roles within their organization, increasing the likelihood of successful completion.
This article will demonstrate how understanding the costs associated with data (X), models (F), and outputs (Y) can support decision making and help define “good enough” goals.
Machine learning can be summarised in a simple equation:
F(X) = Y
Each part of the equation has specific role requirements:
Each of these has specific costs and risks depending on provider, required capacity, regulatory requirements, and so on. These will often be understood as part of a larger digital infrastructure.
To illustrate these concepts, we will use a hypothetical app called CakeyBakey.io.
Our simple app converts a text description of a cake to an image and then rates the cake – completely automatically with no human involvement! The perfect way to judge my baking…
Once the system is running, we assume each tweet generates £0.50 in revenue. Our goal is to go live as fast as possible, but we don’t know whether to use a model-as-a-service provider (like OpenAI) or to run a self-hosted solution.
This choice has important implications for support, development time, availability requirements, and so on, so understanding the per-event cost differential can help inform those decisions.
Let’s compare the costs for service-hosted and self-hosted models.
The costs are drawn from the providers' published prices: the OpenAI pricing page for the service-hosted model and the AWS instance pricing page for the self-hosted model.
In both cases, prices may vary depending on specific usage (batch vs streaming), cost agreements (pricing tiers and capacity reservations) and so on.
Each user interaction (a Twitter post) incurs a known and predictable cost. On these figures the OpenAI service offering is nearly 8x cheaper and so, barring other issues, would be the obvious choice.
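The per-event comparison can be sketched in a few lines of Python. This is only an illustration: the two cost values below are placeholders chosen for the example, not the figures from the OpenAI or AWS pricing pages, and the function names are hypothetical. Only the £0.50 revenue per tweet comes from the article.

```python
REVENUE_PER_EVENT = 0.50  # £ per tweet, as assumed in the article


def per_event_margin(revenue_per_event: float, cost_per_event: float) -> float:
    """Revenue left over after paying the model cost for one event."""
    return revenue_per_event - cost_per_event


def cost_ratio(cost_a: float, cost_b: float) -> float:
    """How many times more expensive option A is than option B."""
    return cost_a / cost_b


# Placeholder per-event costs in £ (assumptions, NOT real provider prices).
hosted_cost = 0.02
self_hosted_cost = 0.10

hosted_margin = per_event_margin(REVENUE_PER_EVENT, hosted_cost)
self_hosted_margin = per_event_margin(REVENUE_PER_EVENT, self_hosted_cost)
ratio = cost_ratio(self_hosted_cost, hosted_cost)

print(f"Service-hosted margin per event: £{hosted_margin:.2f}")
print(f"Self-hosted margin per event:    £{self_hosted_margin:.2f}")
print(f"Self-hosted is {ratio:.1f}x the per-event cost of service-hosted")
```

Swapping in real per-event costs from a provider's pricing page turns this into a quick check on whether each event still leaves a positive margin.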
Occasionally a working ML system requires updates to either maintain or improve performance.
All of these are desirable, but may not be feasible.
Estimating the cost of improvements can help us decide on feasibility. First, we should understand that all new data will require labelling, and the models will require retraining with that labelled data. So there are two fundamental questions to answer: how much will it cost to acquire new labels, and how much improvement will we see?
These are questions we are often asked, and we have derived the following formulas:
Returning to our CakeyBakey.io example, assume our labeller charges £500 a day and labels 20 items per day. For 100 items, the cost is 5 days × £500 = £2,500.
Using the improvement calculation above on the 100 new labels, relative to our 500 existing labels, we can estimate the improved accuracy as (1 + (100/500)) × 70% = 84%, an increase of 14 percentage points.
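The worked numbers above can be reproduced with a short Python sketch. It assumes the improvement scales linearly with the ratio of new labels to existing labels, which is the reading that reproduces the 84% figure. The function names, the rounding up to whole labelling days, and the cap at 100% accuracy are assumptions added for the example.

```python
import math


def labelling_cost(n_items: int, items_per_day: int, day_rate: float) -> float:
    """Cost of labelling n_items, billing whole days (rounding up is an assumption)."""
    days = math.ceil(n_items / items_per_day)
    return days * day_rate


def estimated_accuracy(current_accuracy: float, new_labels: int, existing_labels: int) -> float:
    """Linear heuristic: accuracy grows with the ratio of new to existing labels.

    Capped at 1.0 (an assumption) so the estimate never exceeds 100%.
    """
    estimate = (1 + new_labels / existing_labels) * current_accuracy
    return min(estimate, 1.0)


cost = labelling_cost(n_items=100, items_per_day=20, day_rate=500)
accuracy = estimated_accuracy(current_accuracy=0.70, new_labels=100, existing_labels=500)

print(f"Labelling cost: £{cost:,.0f}")
print(f"Estimated accuracy: {accuracy:.0%}")
```

A linear heuristic like this is only a ballpark: real accuracy gains usually flatten as more labels are added, so the estimate is best treated as an optimistic upper bound.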
Given these numbers we can now estimate the Return on Investment (ROI) from the increased model performance:
Break-Even Point:
Obviously, CakeyBakey.io must now be able to generate that level of engagement!
While choosing better performance may seem obvious, the business may simply be unable to afford the investment required for the return, or the delay in labelling extra data may cause other milestones to be missed. These are important decisions for the business to understand; framing the decision with time and cost provides better context.
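As a rough illustration of this kind of break-even reasoning, the sketch below uses the simplest possible assumption: every additional revenue-generating tweet contributes the full £0.50 towards recovering the £2,500 labelling investment. The function names are hypothetical, and a fuller calculation might also account for the accuracy gain and per-event costs.

```python
import math


def break_even_events(investment: float, revenue_per_event: float) -> int:
    """Number of additional events needed to recover the investment."""
    return math.ceil(investment / revenue_per_event)


def roi(extra_revenue: float, investment: float) -> float:
    """Standard return on investment: net gain divided by the amount invested."""
    return (extra_revenue - investment) / investment


investment = 2500.0       # £, labelling cost from the worked example
revenue_per_event = 0.50  # £ per tweet, as assumed in the article

events_needed = break_even_events(investment, revenue_per_event)
roi_at_break_even = roi(events_needed * revenue_per_event, investment)

print(f"Additional tweets needed to break even: {events_needed}")
print(f"ROI at that point: {roi_at_break_even:.0%}")
```

Under this simplification, the labelling investment only pays off if the improved model brings in additional tweets beyond this threshold.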
Understanding the costs associated with machine learning is crucial for making informed business decisions.
By breaking down the project, we can isolate components and provide ballpark figures that allow decisions to be made at all stages of the project, estimate what is “good enough” and, crucially, avoid going over budget.
Next Steps: As you consider implementing or expanding ML projects, evaluate the components of F(X) = Y in your context. Contact us from our main page to discuss further.