A few fixes.
parent a3ba60111d
commit 65de889359
2 changed files with 40 additions and 41 deletions
@@ -1,6 +1,6 @@
 It is widely known that most Machine Learning projects never deliver anything
 to production. Simplifying the message, they fail. There might be useful
-by-products of running the project, but if it is never delivered as a service
+by-products of running the project, but if it is never turned into a service
 that is used by people or other pieces of software, it failed to deliver.
 
 There are many reasons that can make Machine Learning projects fail, and
@@ -16,9 +16,8 @@ Learning algorithm), you should write a business case. It might be a more or
 less detailed one. The utility of the predictions will sometimes be naturally
 easy to translate into financial results, and sometimes you will have to get
 more creative to link the two worlds. But in any case, not doing the business
-case exercise means you are commiting into a line of work without thinking
-first if it makes any sense to pursuit it. In my humble opinion, not a wise
-choice.
+case exercise means you are commiting to a line of work without thinking first
+if it makes any sense to pursuit it. In my humble opinion, not a wise choice.
 
 At some point, while you find yourself drafting the business case, you will
 have to deal with an inevitable fact: the algorithm will make errors. And
@@ -29,19 +28,19 @@ accuracy affect business results.
 ## An example
 
 Let's discuss an example of this to make things clearer. You are drafting the
-business case for a project in a motorcycle manufacturing company (let's call
-Hamaya to keep it simpler). Repair shops affiliated with Hamaya will attend
-customers that have Hamaya motorcycles under warranty. When the shop considers
-that the needed repair is covered by the warranty, they will post a request to
-Hamaya to obtain the required materials. Hamaya must judge the request and
-accept it if all looks good or reject it if it considers the repair should not
-be covered by the warranty or the material request does not fit with the repair
-that needs to be done (the repair shope is trying to sneak some extra materials
-). Currently, all claims are reviewed manually by a team of customer care
-associates at Hamaya. The goal of the project is to deliver a classification
-algorithm that will automatically classify requests as accepted or "needs
-review", which will make the request enter the already in place process where
-the customer care associate will review it.
+business case for a project in a motorcycle manufacturing company, let's call
+it Hamaya. Repair shops affiliated with Hamaya will attend customers that have
+Hamaya motorcycles under warranty. When the shop considers that the needed
+repair is covered by the warranty, they will post a request to Hamaya to obtain
+the required materials. Hamaya must judge the request and accept it if all
+looks good or reject it if it considers the repair should not be covered by the
+warranty or the material request does not fit with the repair that needs to be
+done (the repair shope is trying to sneak some extra materials in the request).
+Currently, all claims are reviewed manually by a team of customer care
+associates at Hamaya. The goal of the project is to deliver a classification
+algorithm that will automatically classify requests as either accepted or
+"needs review", which will make the request enter the already in place process
+where the customer care associate will review it.
 
 Again, we know that whatever algorithm we come up with, it will make mistakes.
 What do the mistakes look like in this case?
@@ -52,9 +51,9 @@ What do the mistakes look like in this case?
 | Invalid | Accepted | Hamaya has given away materials it should not have. The cost of these materials has been lost. |
 | Whatever | Needs Review | Valuable customer care time will need to be spent on the case. |
 
-So now we know that we can lose money in two ways: by giving away materials when
-we should, and by consuming time for the customer care team for cases that
-should have been marked accepted by the classification algorithm.
+So now we know that we can screw up in two ways: by giving away materials when
+we should not or by consuming time for the customer care team for cases that
+could have been marked accepted by the classification algorithm.
 
 ## The choice
 
@@ -90,8 +89,8 @@ In the hypothetical Hamaya case, some role like director of customer care would
 be a good fit for this description.
 
 Now, look for this person in your project. If you can't find it, there is
-something pretty wrong with your project, and you should probably take a few
-minutes to reflect on what the hell is going on and fix some stuff.
+something smelly about your project, and you should probably take some time to
+reflect on what the hell is going on and fix some stuff.
 
 Once you find it, you need to make a nice ritual to take the ownership of the
 design of the loss function and transfer it to the business person. This means:
@@ -113,7 +112,7 @@ design of the loss function and transfer it to the business person. This means:
 lot, ranging from simple but clear statements ("We want the model to never
 classify an invalid request as accepted. We don't mind sending more work to
 the customer care team.") to quantified, detailed policies. But they decided
-it.
+it, not you.
 
 ## Why
 
@@ -127,11 +126,12 @@ clearly beyond the scope of the data scientist work. Let's think about the two
 extremes: you either tune the algorithm to be very agressive in labeling
 requests as accepted, assuming the higher risk of incorrectly classifying
 invalid requests as accepted, or you tune it to be conservative and send the
-request to the customer care team unless the case is dead obvious.
+request to the customer care team unless the case should obviously be
+accepted.
 
 If you go for the aggressive option, at least the following will happen:
 - You will take much more work out of the customer care team. This means that
-the ratio of # requests/ # customer care associates will go down since more
+the ratio of #requests/#customer care associates will go down since more
 requests are handled automatically.
 - There will be more mistakes. Hamaya will be giving away more materials that
 should not be. This translates into money lost.
@@ -151,9 +151,9 @@ Despite the case being so simple and the comparison between both options so
 brief, we can see plenty of impacts on business in this choice: we have hiring
 and staffing (since the sizing of the customer care team will change),
 customer satisfaction (speed of requests being handled), supply chain (the
-need for materials needs to be planned and served) and financial and cost
-structure matters (what is cheaper, giving away more material or hiring more
-customer care profiles? Also, both options lead to different cost structures.
+need for materials must be planned and served) and financial and cost structure
+matters (what is cheaper, incorrectly giving away more material or hiring more
+customer care profiles? Also, both options lead to different cost structures.
 Which one fits better with Hamaya's business and finance strategy?).
 
 So, do you think a data scientist is the right person to make this call?
@@ -173,13 +173,13 @@ in the hands of data scientists. I have a few thoughts on this:
 understand the important topics and why their involvement is needed at some
 parts of the project.
 - Business has an excessively hands-off management style. They don't want to
-know anything about the project, "because it is the data scientists' job and
-we want results, not problems". This is the kind of management style that
-also ruins software projects because they are never accessible during the
-project and then get pissed off at the end because the result does not match
-some arbitrary expectation. This is a purely cultural and people issue, so I
-won't even try to provide ideas here. Either try something to change their
-mindset or just run away.
+know anything about the inner workings of the project, "because it is the
+data scientists' job and we want results, not problems". This is the kind of
+management style that also ruins software projects because they are never
+accessible during the project and then get pissed off at the end because the
+result does not match some arbitrary expectation. This is a purely cultural
+and people issue, so I won't even try to provide ideas here. Either try
+something to change their mindset or just run away.
 - Your project has no clear purpose, which probably translates to no clear
 business ownership. It might be seen as an initiative where only the data
 science team participates. Or maybe some C-level executive sent an email
@@ -192,8 +192,6 @@ in the hands of data scientists. I have a few thoughts on this:
 to stand firm and announce that neither you nor your team will move a finger
 unless the business case is built.
 
-
-
 ## Wrapping up
 
 That was long. Let's make a simple recipe:
@@ -205,8 +203,9 @@ That was long. Let's make a simple recipe:
 inevitable, but that you have a certain degree of control over _which_ errors
 happen.
 - Work together with the business team to assess how different errors link to
-business impacts, and provide them with the range of available options.
-- Get a decision from them which you is clear, defined and you can translate
-into the loss function.
+business impacts, and provide them with the range of available options in
+terms of how can error behaviour be modified.
+- Get a decision from them which is clear, defined and you can translate into
+the loss function.
 - Go ahead and deliver!
 
@@ -1,5 +1,5 @@
 title: Loss functions should be owned by business
-date: 2021-12-7 09:00:00
+date: 2021-12-07 09:00:00
 category: stories
 tags: data science, optimization, management, analytics translation, organizational design
 slug: loss-functions-should-be-owned-by-business