From a3ba60111d00f8927a01e50e337d63bbe73011ab Mon Sep 17 00:00:00 2001
From: pablo
Date: Wed, 8 Dec 2021 12:44:45 +0100
Subject: [PATCH] Initial draft, needs review.

---
 .../content.md | 212 ++++++++++++++++++
 .../meta.yaml  |   7 +
 2 files changed, 219 insertions(+)
 create mode 100644 posts/loss-functions-should-be-owned-by-business/content.md
 create mode 100644 posts/loss-functions-should-be-owned-by-business/meta.yaml

diff --git a/posts/loss-functions-should-be-owned-by-business/content.md b/posts/loss-functions-should-be-owned-by-business/content.md
new file mode 100644
index 0000000..d3151d4
--- /dev/null
+++ b/posts/loss-functions-should-be-owned-by-business/content.md
@@ -0,0 +1,212 @@
+It is widely known that most Machine Learning projects never deliver anything
+to production. Put simply, they fail. There might be useful
+by-products of running the project, but if it is never delivered as a service
+that is used by people or other pieces of software, it failed to deliver.
+
+There are many reasons that can make Machine Learning projects fail, and
+discussing all of them is not what I want to do here. Instead, I want to focus
+on a specific one which bothers me a lot: the fact that business people never
+own loss functions.
+
+## The business case
+
+Let me take this step by step. When you start a Machine Learning project (which,
+in my mind, covers the scope of designing, implementing and deploying a Machine
+Learning algorithm), you should write a business case. It might be a more or
+less detailed one. The utility of the predictions will sometimes be naturally
+easy to translate into financial results, and sometimes you will have to get
+more creative to link the two worlds. But in any case, not doing the business
+case exercise means you are committing to a line of work without thinking
+first about whether it makes any sense to pursue it. In my humble opinion, not a wise
+choice.
+
+At some point, while you find yourself drafting the business case, you will
+have to deal with an inevitable fact: the algorithm will make errors. And
+these errors will have an impact on business results. This means that, as part
+of the business case, you will have to define how different levels of
+accuracy affect business results.
+
+## An example
+
+Let's discuss an example of this to make things clearer. You are drafting the
+business case for a project in a motorcycle manufacturing company (let's call
+it Hamaya to keep it simple). Repair shops affiliated with Hamaya serve
+customers who have Hamaya motorcycles under warranty. When the shop considers
+that the needed repair is covered by the warranty, it posts a request to
+Hamaya to obtain the required materials. Hamaya must judge the request and
+accept it if all looks good, or reject it if it considers the repair should not
+be covered by the warranty or the material request does not match the repair
+that needs to be done (the repair shop is trying to sneak in some extra
+materials). Currently, all claims are reviewed manually by a team of customer care
+associates at Hamaya. The goal of the project is to deliver a classification
+algorithm that will automatically classify requests as "accepted" or "needs
+review". The latter sends the request into the existing process, where a
+customer care associate reviews it.
+
+Again, we know that whatever algorithm we come up with, it will make mistakes.
+What do the mistakes look like in this case?
+
+| The request is ... | The algorithm labels it as ... | Business outcome |
+|---|---|---|
+| Valid | Accepted | Good job. Human effort saved, everyone is happy. |
+| Invalid | Accepted | Hamaya has given away materials it should not have. The cost of these materials has been lost. |
+| Valid or invalid | Needs Review | Valuable customer care time will need to be spent on the case. |
+
+So now we know that we can lose money in two ways: by giving away materials when
+we should not, and by spending customer care time on cases that the
+classification algorithm should have marked as accepted.
+
+## The choice
+
+If you are a data scientist, I am sure you already knew that some degree of
+error in a Machine Learning algorithm is inevitable in almost all cases. You
+probably also know that, although errors are inevitable, you have several
+levers and knobs in your toolkit to modify _what_ errors your algorithm will
+make. Going back to the Hamaya example: you can tune how conservative the
+algorithm will be, that is, where the boundary lies for deciding that a case is
+too unclear to mark as accepted and should instead go to the customer care team.
+
+And here is where the loss function comes in. As a data scientist, you have the
+power to adjust the loss function you train on to influence which kinds of
+mistakes are punished harder, and therefore to determine _what_ kind of
+mistakes will take place in production.
+
+So you just go and adjust that with your best judgement, right? Stop right here
+and read the title of this post again.
+
+Now that you are back here, repeat with me: _I do not own the loss function._
+
+Who does? Business. Who is business? I can't tell exactly, because it looks
+different in every project and organization. But most probably, it is not you,
+the data scientist. The business person who should be making the call on how
+the loss function behaves usually:
+- Is a non-technical person.
+- Either owns, or reports to the owner of, the P&L that will be most impacted by
+  the deployment of this algorithm.
+- Should be pretty enthusiastic about putting this algorithm in production and
+  using it.
+
+In the hypothetical Hamaya case, some role like director of customer care would
+be a good fit for this description.
+
+Now, look for this person in your project. If you can't find them, there is
+something pretty wrong with your project, and you should probably take a few
+minutes to reflect on what the hell is going on and fix some stuff.
+
+Once you find them, you need to make a nice ritual of taking ownership of the
+design of the loss function and transferring it to them. This means:
+- You help them understand that the algorithm you will develop will have
+  errors, that this is a normal thing in the industry and not a result of you
+  being incompetent.
+- You help them understand the trade-offs between the different errors. They
+  might even know the business impact of the errors better than you since,
+  after all, they are the domain experts. But help them structure things
+  clearly anyway.
+- You show them what kind of power you have over which errors are made. If you
+  have some baseline or prototype model, this is a great moment to train it
+  with some different loss functions and show how each one changes the kind of
+  errors taking place.
+- You make it crystal clear that you need their decision on how to tune the
+  loss function. Write things down, help them structure their ideas, but at the
+  end of the day ensure it is them, not you, deciding on this matter.
+- You receive the decision from the business. The level of detail can vary a
+  lot, ranging from simple but clear statements ("We want the model to never
+  classify an invalid request as accepted. We don't mind sending more work to
+  the customer care team.") to quantified, detailed policies. But they decided
+  it.
+
+## Why
+
+Because they know better than you. Because all of this is truly a business
+decision and not a technical one, and therefore it is the business team, not
+the data science team, that should be in charge.
+
+Let's come back to the Hamaya example. The loss function and the errors made by
+the classification algorithm will have an impact on business matters that are
+clearly beyond the scope of the data scientist's work. Let's think about the two
+extremes: you either tune the algorithm to be very aggressive in labeling
+requests as accepted, accepting the higher risk of incorrectly classifying
+invalid requests as accepted, or you tune it to be conservative and send the
+request to the customer care team unless the case is dead obvious.
+
+If you go for the aggressive option, at least the following will happen:
+- You will take much more work off the customer care team. This means that
+  the number of manually reviewed requests per customer care associate will go
+  down, since more requests are handled automatically.
+- There will be more mistakes. Hamaya will be giving away more materials that
+  it should not. This translates into money lost.
+- The latency in responding to requests will go down since more of them will be
+  handled automatically. This will make customers and repair shops happier.
+
+On the other hand, if you go for the conservative option:
+- You will need more customer care associates than in the aggressive scenario,
+  since fewer requests will be handled automatically.
+- Assuming customer care associates are flawless in judging cases, you will
+  have fewer mistakes in which you end up giving away materials when you should
+  not.
+- Requests will take longer to get processed, with customers and repair shops
+  waiting for the decision to be made to get things started.
+
+Despite the case being so simple and the comparison between both options so
+brief, we can see plenty of impacts on business in this choice: we have hiring
+and staffing (since the sizing of the customer care team will change),
+customer satisfaction (speed of requests being handled), supply chain (the
+demand for materials needs to be planned and served) and financial and cost
+structure matters (what is cheaper, giving away more material or hiring more
+customer care associates? Also, both options lead to different cost structures.
+Which one fits better with Hamaya's business and finance strategy?).
+
+So, do you think a data scientist is the right person to make this call?
+
+## Why does this always go wrong
+
+By now you might be as bothered as I am by the fact that loss functions are
+usually owned by data scientists. I have made a commitment to myself that I
+will prevent this from happening in my projects. If we want to succeed in doing
+so, it is probably a good idea to try to understand _why_ loss functions remain
+in the hands of data scientists. I have a few thoughts on this:
+
+- A common and simple one: the business team has no clue about how Machine
+  Learning works. To them, it is just a magic prediction black box. I don't
+  blame them. Instead, I would suggest that data scientists need to take their
+  time and creativity to teach just enough to the business team so that they
+  understand the important topics and why their involvement is needed at some
+  parts of the project.
+- Business has an excessively hands-off management style. They don't want to
+  know anything about the project, "because it is the data scientists' job and
+  we want results, not problems". This is the kind of management style that
+  also ruins software projects, because managers are never available during the
+  project and then get pissed off at the end because the result does not match
+  some arbitrary expectation. This is a purely cultural and people issue, so I
+  won't even try to provide ideas here. Either try something to change their
+  mindset or just run away.
+- Your project has no clear purpose, which probably translates to no clear
+  business ownership. It might be seen as an initiative where only the data
+  science team participates. Or maybe some C-level executive sent an email
+  three weeks ago requesting that "we leverage AI to gain business value" and
+  people are just running around scared, trying to get something done without a
+  clear direction. In this situation, nobody owns the loss function because
+  things are just confusing as hell. I bet you don't have a business case, so
+  trying to get someone to write one might make the problem obvious enough that
+  the chaos will stop. If you have enough power, a possible course of action is
+  to stand firm and announce that neither you nor your team will lift a finger
+  unless the business case is built.
+
+
+
+## Wrapping up
+
+That was long. Let's make a simple recipe:
+
+- If you are starting a Machine Learning project, build a business case. Use it
+  to describe how errors in the final algorithm link to business/financial
+  impact.
+- Educate the business team to understand that prediction errors are
+  inevitable, but that you have a certain degree of control over _which_ errors
+  happen.
+- Work together with the business team to assess how different errors link to
+  business impacts, and provide them with the range of available options.
+- Get a decision from them that is clear, well defined and that you can
+  translate into the loss function.
+- Go ahead and deliver!
+
diff --git a/posts/loss-functions-should-be-owned-by-business/meta.yaml b/posts/loss-functions-should-be-owned-by-business/meta.yaml
new file mode 100644
index 0000000..fe11abf
--- /dev/null
+++ b/posts/loss-functions-should-be-owned-by-business/meta.yaml
@@ -0,0 +1,7 @@
+title: Loss functions should be owned by business
+date: 2021-12-7 09:00:00
+category: stories
+tags: data science, optimization, management, analytics translation, organizational design
+slug: loss-functions-should-be-owned-by-business
+authors: Pablo
+summary: My view on why business people should define loss functions.
\ No newline at end of file