Initial draft, needs review.

2021-12-08 12:44:45 +01:00 · 2021-12-08 12:44:45 +01:00 · a3ba60111d
commit a3ba60111d
parent fac06b2804
2 changed files with 219 additions and 0 deletions
--- a/posts/loss-functions-should-be-owned-by-business/content.md
+++ b/posts/loss-functions-should-be-owned-by-business/content.md
@ -0,0 +1,212 @@
+It is widely known that most Machine Learning projects never deliver anything
+to production. Put simply, they fail. There might be useful by-products of
+running the project, but if it is never delivered as a service that is used by
+people or other pieces of software, it has failed to deliver.
+
+There are many reasons that can make Machine Learning projects fail, and 
+discussing all of them is not what I want to do here. Instead, I want to focus
+on a specific one which bothers me a lot: the fact that business people never
+own loss functions.
+
+## The business case
+
+Let me take this step by step. When you start a Machine Learning project
+(which, in my mind, covers designing, implementing and deploying a Machine
+Learning algorithm), you should write a business case. It might be a more or
+less detailed one. The utility of the predictions will sometimes be naturally
+easy to translate into financial results, and sometimes you will have to get
+more creative to link the two worlds. But in any case, skipping the business
+case exercise means you are committing to a line of work without first
+thinking about whether it makes any sense to pursue it. In my humble opinion,
+not a wise choice.
+
+At some point, while drafting the business case, you will have to deal with an
+inevitable fact: the algorithm will make errors. And these errors will have an
+impact on business results. This means that, as part of the business case, you
+will have to define how different levels of accuracy affect business results.
+
+## An example
+
+Let's discuss an example to make things clearer. You are drafting the business
+case for a project in a motorcycle manufacturing company (let's call it Hamaya
+to keep it simple). Repair shops affiliated with Hamaya serve customers who
+have Hamaya motorcycles under warranty. When a shop considers that the needed
+repair is covered by the warranty, it posts a request to Hamaya to obtain the
+required materials. Hamaya must judge the request and accept it if all looks
+good, or reject it if it considers that the repair should not be covered by the
+warranty or that the requested materials do not fit the repair that needs to be
+done (say, the repair shop is trying to sneak in some extra materials).
+Currently, all requests are reviewed manually by a team of customer care
+associates at Hamaya. The goal of the project is to deliver a classification
+algorithm that will automatically classify requests as accepted or "needs
+review", the latter sending the request into the existing process where a
+customer care associate reviews it.
+
+Again, we know that whatever algorithm we come up with, it will make mistakes.
+What do the mistakes look like in this case?
+
+| The request is ... | The algorithm labels it as ... | Business outcome |
+|---|---|---|
+| Valid | Accepted | Good job. Human effort saved, everyone is happy. |
+| Invalid | Accepted | Hamaya has given away materials it should not have. The cost of these materials has been lost. |
+| Valid or invalid | Needs Review | Valuable customer care time will need to be spent on the case. |
+
+So now we know that we can lose money in two ways: by giving away materials when
+we should not, and by spending customer care time on cases that the
+classification algorithm should have marked as accepted.
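+
+To make the two kinds of loss concrete, here is a minimal sketch in Python. All
+names and figures (`MATERIAL_COST`, `REVIEW_COST`, the request counts) are made
+up for illustration and are not real Hamaya numbers. Following the table above,
+I also assume that every request sent to review costs customer care time.
+
+```python
+# Assumed unit costs, invented for the example.
+MATERIAL_COST = 120.0  # materials given away on one wrongly accepted request
+REVIEW_COST = 15.0     # customer care time spent on one manual review
+
+
+def policy_cost(n_invalid_accepted: int, n_needs_review: int) -> float:
+    """Total cost of a batch: wrongly accepted requests plus manual reviews."""
+    return n_invalid_accepted * MATERIAL_COST + n_needs_review * REVIEW_COST
+
+
+# Two hypothetical ways of handling the same batch of requests.
+aggressive = policy_cost(n_invalid_accepted=40, n_needs_review=100)
+conservative = policy_cost(n_invalid_accepted=5, n_needs_review=500)
+
+print(f"aggressive: {aggressive}, conservative: {conservative}")
+```
+
+With these invented numbers the aggressive policy comes out cheaper. Make the
+materials expensive enough (above roughly 171 per request with these counts)
+and the conservative policy wins instead.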
+
+## The choice
+
+If you are a data scientist, I am sure you already knew that some degree of
+error in a Machine Learning algorithm is inevitable in almost all cases. You
+probably also know that, although errors are inevitable, you have several
+levers and knobs in your toolkit to modify _what_ errors your algorithm will
+make. Going back to the Hamaya example: you can tune how conservative the
+algorithm will be, that is, where the boundary lies beyond which a case is too
+unclear to mark as accepted and is sent to the customer care team instead.
+
+And here is where the loss function comes in. As a data scientist, you have the
+power to adjust the loss function you are training on so that some kinds of
+mistakes are punished harder than others, which in turn shapes _what_ kind of
+mistakes will take place in production.
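+
+As a rough illustration of what "punishing some mistakes harder" can look like,
+below is a sketch of a weighted log loss for the Hamaya case. The function
+name, its arguments and the weight of 10 are my own assumptions for the
+example, not something prescribed by any library.
+
+```python
+import math
+
+
+def weighted_log_loss(is_valid, p_valid, w_invalid=1.0, w_valid=1.0):
+    """Average log loss where each class of request carries its own weight.
+
+    is_valid: list of 1 (valid request) or 0 (invalid request).
+    p_valid: predicted probability that each request is valid.
+    w_invalid: weight applied to the loss on invalid requests.
+    w_valid: weight applied to the loss on valid requests.
+    """
+    total = 0.0
+    for y, p in zip(is_valid, p_valid):
+        if y == 1:
+            total -= w_valid * math.log(p)
+        else:
+            total -= w_invalid * math.log(1.0 - p)
+    return total / len(is_valid)
+
+
+# One valid request scored 0.9 and one invalid request scored 0.8,
+# i.e. the model leans towards accepting a request it should not.
+labels = [1, 0]
+scores = [0.9, 0.8]
+
+plain = weighted_log_loss(labels, scores)
+strict = weighted_log_loss(labels, scores, w_invalid=10.0)
+print(f"plain: {plain:.3f}, strict on invalid requests: {strict:.3f}")
+```
+
+The same predictions get a much larger loss under the strict weighting, so a
+model trained on it should become more careful before giving a high score to
+an invalid request, at the price of sending more valid ones to review.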
+
+So you just go and adjust that with your best judgement, right? Stop right here
+and read the title of this post again.
+
+Now that you are back here, repeat with me: _I do not own the loss function._
+
+Who does? Business. Who is business? I can't tell exactly, because it looks 
+different in every project and organization. But most probably, it is not you, 
+the data scientist. The business person who should be making the call on how
+the loss function behaves usually:
+- Is a non-technical person.
+- Either owns or reports to the owner of the P&L that will be most impacted by
+  the deployment of this algorithm.
+- Should be pretty enthusiastic about putting this algorithm in production and
+  using it.
+
+In the hypothetical Hamaya case, some role like director of customer care would
+be a good fit for this description.
+
+Now, look for this person in your project. If you can't find them, there is
+something pretty wrong with your project, and you should probably take a few
+minutes to reflect on what the hell is going on and fix some stuff.
+
+Once you find them, you need to make a nice ritual of taking the ownership of
+the design of the loss function and transferring it to the business person.
+This means:
+- You help them understand that the algorithm you will develop will have 
+  errors, that this is a normal thing in the industry and not a result of you
+  being incompetent. 
+- You help them understand the trade-offs between the different errors. They
+  might even know the business impact of the errors better than you since, 
+  after all, they are the domain experts. But help them structure things
+  clearly anyway.
+- You show them what kind of power you have over which errors are made. If you
+  have some baseline or prototype model, this is a great moment to train it
+  with some different loss functions and show how it modifies the kind of 
+  errors taking place.
+- You make it crystal clear that you need their decision on how to tune the 
+  loss function. Write things down, help them structure their ideas, but at the
+  end of the day ensure it is them, not you, deciding on this matter.
+- You receive the decision from the business. The level of detail can vary a
+  lot, ranging from simple but clear statements ("We want the model to never
+  classify an invalid request as accepted. We don't mind sending more work to
+  the customer care team.") to quantified, detailed policies. Either way, they
+  are the ones who decided it.
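+
+If retraining with different loss functions is too much for a first
+conversation, a simpler stand-in (not the same thing, but it shows the same
+trade-off) is to move the acceptance threshold on the scores of an existing
+prototype. The scores and labels below are invented for the example, and
+`summarize` is just a hypothetical helper.
+
+```python
+# Hypothetical (score, is_valid) pairs: score is the model's estimated
+# probability that the request is valid.
+scored_requests = [
+    (0.95, True), (0.90, True), (0.85, False), (0.70, True),
+    (0.60, False), (0.40, True), (0.20, False),
+]
+
+
+def summarize(threshold):
+    """Count outcomes when requests scoring at or above `threshold` are accepted."""
+    accepted = [(s, v) for s, v in scored_requests if s >= threshold]
+    invalid_accepted = sum(1 for _, valid in accepted if not valid)
+    sent_to_review = len(scored_requests) - len(accepted)
+    return {
+        "threshold": threshold,
+        "invalid_accepted": invalid_accepted,
+        "sent_to_review": sent_to_review,
+    }
+
+
+# From aggressive (low threshold) to conservative (high threshold).
+report = [summarize(t) for t in (0.5, 0.8, 0.92)]
+for row in report:
+    print(row)
+```
+
+A small table like this can be enough for the business person to see that
+fewer wrongly accepted requests means more work for customer care, and to pick
+the point they are comfortable with.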
+
+## Why
+
+Because they know better than you. Because all of this is truly a business
+decision and not a technical one, and therefore it is the business team, not
+the data science team, that should be in charge.
+
+Let's come back to the Hamaya example. The loss function and the errors made by
+the classification algorithm will have an impact on business matters that are
+clearly beyond the scope of the data scientist's work. Let's think about the
+two extremes: you either tune the algorithm to be very aggressive in labeling
+requests as accepted, taking on a higher risk of incorrectly classifying
+invalid requests as accepted, or you tune it to be conservative and send the
+request to the customer care team unless the case is dead obvious.
+
+If you go for the aggressive option, at least the following will happen:
+- You will take much more work off the customer care team. This means that
+  the number of requests each customer care associate has to review will go
+  down, since more requests are handled automatically.
+- There will be more mistakes. Hamaya will be giving away more materials that
+  it should not. This translates into money lost.
+- The latency in responding to requests will go down since more of them will be
+  handled automatically. This will make customers and repair shops happier.
+
+On the other hand, if you go for the conservative option:
+- You will need more customer care associates than in the aggressive scenario,
+  since fewer requests will be handled automatically.
+- Assuming customer care associates are flawless in judging cases, you will
+  have fewer mistakes in which you end up giving away materials when you should
+  not.
+- Requests will take longer to get processed, with customers and repair shops
+  waiting for the decision to be made to get things started.
+
+Despite the case being so simple and the comparison between both options so
+brief, we can see plenty of business impacts in this choice: hiring and
+staffing (since the sizing of the customer care team will change), customer
+satisfaction (how fast requests are handled), supply chain (the demand for
+materials has to be planned and served) and financial and cost structure
+matters (what is cheaper, giving away more material or hiring more customer
+care associates? Also, both options lead to different cost structures. Which
+one fits better with Hamaya's business and finance strategy?).
+
+So, do you think a data scientist is the right person to make this call?
+
+## Why this always goes wrong
+
+By now you might be as bothered as I am by the fact that loss functions are 
+usually owned by data scientists. I have made a commitment to myself that I 
+will prevent this from happening in my projects. If we want to succeed in doing
+so, it is probably a good idea to try to understand _why_ loss functions remain
+in the hands of data scientists. I have a few thoughts on this:
+
+- A common and simple one: the business team has no clue about how Machine
+  Learning works. To them, it is just a magic prediction black box. I don't
+  blame them. Instead, I would suggest that data scientists take the time and
+  creativity to teach the business team just enough so that they understand
+  the important topics and why their involvement is needed at some points of
+  the project.
+- Business has an excessively hands-off management style. They don't want to
+  know anything about the project, "because it is the data scientists' job and
+  we want results, not problems". This is the kind of management style that
+  also ruins software projects: managers are never available during the
+  project and then get pissed off at the end because the result does not match
+  some arbitrary expectation. This is a purely cultural and people issue, so I
+  won't even try to provide ideas here. Either try something to change their
+  mindset or just run away.
+- Your project has no clear purpose, which probably translates to no clear 
+  business ownership. It might be seen as an initiative where only the data
+  science team participates. Or maybe some C-level executive sent an email
+  three weeks ago requesting that "we leverage AI to gain business value" and 
+  people are just running around scared, trying to get something done without a
+  clear direction. In this situation, nobody owns the loss function because
+  things are just confusing as hell. I bet you don't have a business case, so
+  trying to get someone to write one might make the problem obvious enough for
+  the chaos to stop. If you have enough power, a possible course of action is
+  to stand firm and announce that neither you nor your team will lift a finger
+  until the business case is built.
+
+
+
+## Wrapping up
+
+That was long. Let's make a simple recipe:
+
+- If you are starting a Machine Learning project, build a business case. Use it
+  to describe how errors in the final algorithm link to business/financial 
+  impact.
+- Educate the business team to understand that prediction errors are 
+  inevitable, but that you have a certain degree of control over _which_ errors
+  happen.
+- Work together with the business team to assess how different errors link to
+  business impacts, and provide them with the range of available options.
+- Get a decision from them that is clear, well defined and that you can
+  translate into the loss function.
+- Go ahead and deliver!
+
--- a/posts/loss-functions-should-be-owned-by-business/meta.yaml
+++ b/posts/loss-functions-should-be-owned-by-business/meta.yaml
@ -0,0 +1,7 @@
+title: Loss functions should be owned by business
+date: 2021-12-7 09:00:00
+category: stories
+tags: data science, optimization, management, analytics translation, organizational design
+slug: loss-functions-should-be-owned-by-business
+authors: Pablo
+summary: My view on why business people should define loss functions.