A few fixes.

pablo 2021-12-08 20:11:53 +01:00
parent a3ba60111d
commit 65de889359
2 changed files with 40 additions and 41 deletions


@@ -1,6 +1,6 @@
 It is widely known that most Machine Learning projects never deliver anything
 to production. Simplifying the message, they fail. There might be useful
-by-products of running the project, but if it is never delivered as a service
+by-products of running the project, but if it is never turned into a service
 that is used by people or other pieces of software, it failed to deliver.

 There are many reasons that can make Machine Learning projects fail, and
@@ -16,9 +16,8 @@ Learning algorithm), you should write a business case. It might be a more or
 less detailed one. The utility of the predictions will sometimes be naturally
 easy to translate into financial results, and sometimes you will have to get
 more creative to link the two worlds. But in any case, not doing the business
-case exercise means you are commiting into a line of work without thinking
-first if it makes any sense to pursuit it. In my humble opinion, not a wise
-choice.
+case exercise means you are commiting to a line of work without thinking first
+if it makes any sense to pursuit it. In my humble opinion, not a wise choice.

 At some point, while you find yourself drafting the business case, you will
 have to deal with an inevitable fact: the algorithm will make errors. And
@@ -29,19 +28,19 @@ accuracy affect business results.
 ## An example

 Let's discuss an example of this to make things clearer. You are drafting the
-business case for a project in a motorcycle manufacturing company (let's call
-Hamaya to keep it simpler). Repair shops affiliated with Hamaya will attend
-customers that have Hamaya motorcycles under warranty. When the shop considers
-that the needed repair is covered by the warranty, they will post a request to
-Hamaya to obtain the required materials. Hamaya must judge the request and
-accept it if all looks good or reject it if it considers the repair should not
-be covered by the warranty or the material request does not fit with the repair
-that needs to be done (the repair shope is trying to sneak some extra materials
-). Currently, all claims are reviewed manually by a team of customer care
+business case for a project in a motorcycle manufacturing company, let's call
+it Hamaya. Repair shops affiliated with Hamaya will attend customers that have
+Hamaya motorcycles under warranty. When the shop considers that the needed
+repair is covered by the warranty, they will post a request to Hamaya to obtain
+the required materials. Hamaya must judge the request and accept it if all
+looks good or reject it if it considers the repair should not be covered by the
+warranty or the material request does not fit with the repair that needs to be
+done (the repair shope is trying to sneak some extra materials in the request).
+Currently, all claims are reviewed manually by a team of customer care
 associates at Hamaya. The goal of the project is to deliver a classification
-algorithm that will automatically classify requests as accepted or "needs
-review", which will make the request enter the already in place process where
-the customer care associate will review it.
+algorithm that will automatically classify requests as either accepted or
+"needs review", which will make the request enter the already in place process
+where the customer care associate will review it.

 Again, we know that whatever algorithm we come up with, it will make mistakes.
 What do the mistakes look like in this case?
@@ -52,9 +51,9 @@ What do the mistakes look like in this case?
 | Invalid | Accepted | Hamaya has given away materials it should not have. The cost of these materials has been lost. |
 | Whatever | Needs Review | Valuable customer care time will need to be spent on the case. |

-So now we know that we can lose money in two ways: by giving away materials when
-we should, and by consuming time for the customer care team for cases that
-should have been marked accepted by the classification algorithm.
+So now we know that we can screw up in two ways: by giving away materials when
+we should not or by consuming time for the customer care team for cases that
+could have been marked accepted by the classification algorithm.

 ## The choice
@@ -90,8 +89,8 @@ In the hypothetical Hamaya case, some role like director of customer care would
 be a good fit for this description.

 Now, look for this person in your project. If you can't find it, there is
-something pretty wrong with your project, and you should probably take a few
-minutes to reflect on what the hell is going on and fix some stuff.
+something smelly about your project, and you should probably take some time to
+reflect on what the hell is going on and fix some stuff.

 Once you find it, you need to make a nice ritual to take the ownership of the
 design of the loss function and transfer it to the business person. This means:
@@ -113,7 +112,7 @@ design of the loss function and transfer it to the business person. This means:
 lot, ranging from simple but clear statements ("We want the model to never
 classify an invalid request as accepted. We don't mind sending more work to
 the customer care team.") to quantified, detailed policies. But they decided
-it.
+it, not you.

 ## Why
@@ -127,7 +126,8 @@ clearly beyond the scope of the data scientist work. Let's think about the two
 extremes: you either tune the algorithm to be very agressive in labeling
 requests as accepted, assuming the higher risk of incorrectly classifying
 invalid requests as accepted, or you tune it to be conservative and send the
-request to the customer care team unless the case is dead obvious.
+request to the customer care team unless the case should obviously be
+accepted.

 If you go for the aggressive option, at least the following will happen:
 - You will take much more work out of the customer care team. This means that
@@ -151,8 +151,8 @@ Despite the case being so simple and the comparison between both options so
 brief, we can see plenty of impacts on business in this choice: we have hiring
 and staffing (since the sizing of the customer care team will change),
 customer satisfaction (speed of requests being handled), supply chain (the
-need for materials needs to be planned and served) and financial and cost
-structure matters (what is cheaper, giving away more material or hiring more
+need for materials must be planned and served) and financial and cost structure
+matters (what is cheaper, incorrectly giving away more material or hiring more
 customer care profiles? Also, both options lead to different cost structures.
 Which one fits better with Hamaya's business and finance strategy?).
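
The "what is cheaper" question in the hunk above can be made concrete with a small calculation. The following is only a sketch: the scores, the validity labels and both cost figures are made-up numbers for the hypothetical Hamaya case, not data from the post. It compares an aggressive and a conservative acceptance threshold by total cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Costs:
    wrongly_accepted: float  # assumed cost of materials given away on an invalid request
    manual_review: float     # assumed cost of customer care time for one reviewed request


def total_cost(scores, is_valid, threshold, costs):
    """Cost of auto-accepting every request whose score reaches the threshold."""
    total = 0.0
    for score, valid in zip(scores, is_valid):
        if score >= threshold:
            # Auto-accepted: only invalid requests cost money.
            if not valid:
                total += costs.wrongly_accepted
        else:
            # Sent to "needs review": always costs customer care time.
            total += costs.manual_review
    return total


# Hypothetical model scores (estimated probability that a request is valid).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]
is_valid = [True, True, True, False, True, False, True, False]
costs = Costs(wrongly_accepted=200.0, manual_review=15.0)

aggressive = total_cost(scores, is_valid, threshold=0.5, costs=costs)
conservative = total_cost(scores, is_valid, threshold=0.85, costs=costs)
print(f"aggressive: {aggressive}, conservative: {conservative}")
```

With these invented figures the conservative threshold is cheaper, but lowering `wrongly_accepted` far enough flips the result. Which figures are right is exactly the decision the post argues belongs to the business owner.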
@@ -173,13 +173,13 @@ in the hands of data scientists. I have a few thoughts on this:
 understand the important topics and why their involvement is needed at some
 parts of the project.
 - Business has an excessively hands-off management style. They don't want to
-know anything about the project, "because it is the data scientists' job and
-we want results, not problems". This is the kind of management style that
-also ruins software projects because they are never accessible during the
-project and then get pissed off at the end because the result does not match
-some arbitrary expectation. This is a purely cultural and people issue, so I
-won't even try to provide ideas here. Either try something to change their
-mindset or just run away.
+know anything about the inner workings of the project, "because it is the
+data scientists' job and we want results, not problems". This is the kind of
+management style that also ruins software projects because they are never
+accessible during the project and then get pissed off at the end because the
+result does not match some arbitrary expectation. This is a purely cultural
+and people issue, so I won't even try to provide ideas here. Either try
+something to change their mindset or just run away.
 - Your project has no clear purpose, which probably translates to no clear
 business ownership. It might be seen as an initiative where only the data
 science team participates. Or maybe some C-level executive sent an email
@@ -192,8 +192,6 @@ in the hands of data scientists. I have a few thoughts on this:
 to stand firm and announce that neither you nor your team will move a finger
 unless the business case is built.

-
-
 ## Wrapping up

 That was long. Let's make a simple recipe:
@@ -205,8 +203,9 @@ That was long. Let's make a simple recipe:
 inevitable, but that you have a certain degree of control over _which_ errors
 happen.
 - Work together with the business team to assess how different errors link to
-business impacts, and provide them with the range of available options.
-- Get a decision from them which you is clear, defined and you can translate
-into the loss function.
+business impacts, and provide them with the range of available options in
+terms of how can error behaviour be modified.
+- Get a decision from them which is clear, defined and you can translate into
+the loss function.
 - Go ahead and deliver!
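
One way to read "translate into the loss function" in the recipe above is a cross-entropy in which each kind of error carries a weight chosen by the business owner. The post does not prescribe this form; the function name and both weight parameters below are assumptions made for illustration.

```python
import math


def weighted_log_loss(p_valid, is_valid, cost_false_accept, cost_false_review):
    """Binary cross-entropy where each kind of error carries a business-set weight."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for p, valid in zip(p_valid, is_valid):
        p = min(max(p, eps), 1 - eps)
        if valid:
            # Under-scoring a valid request pushes it towards manual review.
            total += -cost_false_review * math.log(p)
        else:
            # Over-scoring an invalid request pushes it towards wrongful acceptance.
            total += -cost_false_accept * math.log(1 - p)
    return total / len(p_valid)


# Hypothetical policy: a wrongful acceptance is ten times worse than an unneeded review.
loss_bad_accept = weighted_log_loss([0.9], [False], cost_false_accept=10.0, cost_false_review=1.0)
loss_bad_review = weighted_log_loss([0.1], [True], cost_false_accept=10.0, cost_false_review=1.0)
print(loss_bad_accept, loss_bad_review)
```

A statement like "never classify an invalid request as accepted" would correspond to making `cost_false_accept` very large relative to `cost_false_review`; a quantified policy would set both from actual cost estimates.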


@@ -1,5 +1,5 @@
 title: Loss functions should be owned by business
-date: 2021-12-7 09:00:00
+date: 2021-12-07 09:00:00
 category: stories
 tags: data science, optimization, management, analytics translation, organizational design
 slug: loss-functions-should-be-owned-by-business