counterweight 2023-12-09 18:05:57 +01:00
parent a67287eeb7
commit ee8dd41f85
129 changed files with 91712 additions and 0 deletions

other/pds_syllabus.md Normal file

# Practical Data Science for Operations Management
## Course description
Operations Management is a field filled with opportunities to apply Data
Science techniques. This course presents a set of practical cases where
students can put their knowledge to use in realistic challenges and relate
their theoretical knowledge to its application in industry. The expected
outcome is that students who have passed this course are familiar with a
variety of modern and useful techniques that can be applied in real-life
operations contexts. With this knowledge and experience, students understand
which techniques suit which problems, the main steps and requirements to put
each technique to use, and how to judge whether it has been applied
successfully.
Many of the techniques taught in this course are usually taught to engineering
and technical profiles. This course does not aim to bring students to the same
level of technical expertise as their engineering counterparts, but rather to
provide enough background so that the students can successfully interact with
such profiles. Having said that, this course can also be a first introduction
for students who want to study the techniques discussed here in more depth,
during or after the course.
## Objectives
With the knowledge and skills obtained in this course, students become fit for
tasks such as:
- Applying Data Science techniques, including optimization, simulation and
machine learning, to simple cases at a hands-on level.
- Planning and designing optimization, simulation and machine learning
initiatives to solve operational challenges.
- Leading optimization, simulation and machine learning projects from a
managerial point of view.
- Acting as a liaison between management and technical profiles in business
contexts.
- Becoming familiar with the Python programming language and several
specialized packages within the Data Science environment.
## Prerequisites
The course assumes the student has covered economics- and management-oriented
Mathematics and Statistics courses at undergraduate level.
Additional knowledge of Operations Management, Supply Chain Management and
Manufacturing will provide a better functional context for the cases and make
the course more rewarding.
Previous programming experience will allow students to focus much more on the
operations side of the course. Students without it can still pass the course,
but should expect a non-trivial challenge ahead.
## Methodology
There will be 20 sessions throughout the course. Sessions will contain a mix of
lecturing, practical work and discussions. Students will team up to work on
several practical cases that will be handed in throughout the course and will
allow students to put their knowledge into practice. Python will be the tool of
choice for most technical work, with different specialized packages used to
tackle different parts of the course.
Students are expected to attend all the activities in the course. Beyond class
sessions, additional reading resources will be provided to students. For
students who need to level up their Python skills, self-paced materials will
be suggested.
Lectures will have the following contents:
| Week | Classes | Student work |
|------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 1 | - L1: Introduction and motivation of the course<br/> - L2: Simulation, Optimization and Machine Learning in companies | - Python prep |
| 2 | - L3: Introduction to simulation: What is it, When do we use it, Types of simulation<br/> - L4: Simulation examples in Python. Introduction to case 1. | - Python prep<br/> - View [Primer: Simulating a pandemic](https://www.youtube.com/watch?v=7OLpKqTriio) <br/>- Read [Agent-based modeling: Methods and techniques for simulating human systems](https://www.pnas.org/content/99/suppl_3/7280) <br/> - Read case 1. |
| 3 | - L5: Simulation methodology. <br/> - L6: Simulation-based optimization I. Challenges and issues with simulation. Where to go from here<br/> - S1: Workshop for case 1 | - Work on case 1 <br/> - Review [HASH model market simulation](https://hash.ai/@hash/model-market-python) <br/>- Review [HASH warehouse simulation](https://hash.ai/@hash/warehouse-logistics) |
| 4 | - L7: Introduction to optimization<br/> - L8: Modeling optimization problems<br/> - S2: Workshop for case 1 | - Work on case 1 <br/> - Read Gurobi's [Modelling Basics](https://www.gurobi.com/resource/modeling-basics/) <br/> - Read Neos [taxonomy of optimization problems](https://neos-guide.org/optimization-tree) <br/> - View this video on the [Simplex algorithm](https://www.youtube.com/watch?v=RO5477EKlXE) |
| 5 | - L9: Taxonomy of optimization techniques <br/> - L10: Simulation-based optimization II. Introduction to case 2 | - Deliver case 1 <br/> - Read case 2 <br/> - Enjoy watching [simulation-based race car training](https://www.youtube.com/watch?v=-sg-GgoFCP0) <br/> - Read how the [4th most popular database software in the world uses GAs to access data faster.](https://www.postgresql.org/docs/8.0/geqo-intro2.html) |
| 6    | - L11: Challenges in real-world usage. Simulation vs Optimization <br/> - L12: Introduction to Machine Learning <br/> - S3: Workshop for case 2                      | - Work on case 2 <br/> - Read this [review on simulation optimization techniques and software](https://arxiv.org/pdf/1706.08591.pdf)                                                                                                                                                                    |
| 7 | - L13: Supervised Machine Learning (SML): NIPS<br/> - L14: Typical SML workflow. Introduction to case 3<br/> - S4: Workshop for case 2 | - Work on case 2 <br/> - Read case 3 |
| 8 | - L15: Algorithm deep dive: Decision trees<br/> - L16: Feature Engineering and Model Evaluation<br/> - S5: Workshop for case 3 | - Deliver case 2 <br/> - View this [intro to neural networks](https://www.youtube.com/watch?v=aircAruvnKk&t=10s) and this [intro to random forests](https://www.youtube.com/watch?v=J4Wdy0Wc_xQ) |
| 9 | - L17: Deployment of Models <br/> - L18: Stories from the trenches: applying all of this in the real world<br/> - S6: Workshop for case 3 | - Work on case 3 <br/> - View this video on [why businesses fail at ML](https://www.youtube.com/watch?v=dRJGyhS6gA0) |
| 10   | - L19: Where to go from here: further learning and career advice<br/> - L20: Final Q&A, exam preparation                                                              | - Work on case 3                                                                                                                                                                                                                                                                                            |
| 11   | - Exam                                                                                                                                                                 | - Deliver case 3                                                                                                                                                                                                                                                                                            |
- Lecture 1 INTRO
    - Introduction to the course
        - Citizenship rules
            - I won't force you to come, but I advise you to.
            - I'll always try to start 5 min late, finish 5 min late, and stop
              for 5 min.
            - You can come and go, just please be respectful.
        - Calendar
        - Contents
        - Expectations
        - The teacher
        - Evaluation
        - Contact
        - Questions?
    - The relevance of math and computers in management
        - Examples: pricing, logistics, staffing.
    - The skills and profiles required
    - The tools used
- Lecture 2 INTRO
    - The techniques we will see in the course
        - Simulation
        - Optimization
        - Supervised machine learning (aka "prediction")
    - Why this stuff is important
- Lecture 3 SIM
    - A humbling example
    - What is simulation and when do we use it
    - Different types of simulations
- Lecture 4 SIM
    - Toy simulations in Python
    - How to approach simulation in practical terms
    - Tools in industry
- Lecture 5 SIM
    - Theoretical background on simulation
    - Present case 1
- Lecture 6 SIM
    - Simulation-based optimization
    - Where to go from here
- Lecture 7 OPT
    - What is optimization
    - A trivial example
- Lecture 8 OPT
    - Different optimization techniques
    - Present case 3
- Lecture 9 OPT
    - How to model optimization problems (target functions, decision variables
      and constraints)
- Lecture 10 OPT
    - Simulation-based optimization: Genetic algorithms
- Lecture 11 OPT
    - Real world challenges and optimization deployment
- Lecture 12 ML
    - Good news, you already know Machine Learning
    - Different branches of Machine Learning
    - Real world examples of applications
- Lecture 13 ML
    - How does Supervised Machine Learning work?
    - Present case 2
- Lecture 14 ML
    - The Machine Learning workflow (EDA, Feature Engineering, Model
      Evaluation, Deployment)
- Lecture 15 ML
    - Feature Engineering
- Lecture 16 ML
    - Model evaluation
- Lecture 17 ML
    - Deployment and real world challenges
- Lecture 18 Real life stories from the trenches
- Lecture 19 Real life stories from the trenches
- Lecture 20
    - Q&A pre-exam
    - Feedback on the course
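
To give a flavour of the modelling vocabulary used in the optimization
lectures (target function, decision variables and constraints), here is a
minimal sketch in plain Python. The products, profits and resource limits are
hypothetical numbers chosen for illustration, and the brute-force search stands
in for the specialized solvers used in the course.

```python
from itertools import product

# Hypothetical data: two products that share two limited resources.
PROFIT = {"chairs": 30, "tables": 50}        # profit per unit sold
MACHINE_HOURS = {"chairs": 1, "tables": 2}   # machine hours per unit
WOOD = {"chairs": 3, "tables": 2}            # wood units per unit
MAX_MACHINE_HOURS = 40
MAX_WOOD = 60


def target(chairs, tables):
    """Target function: total profit, which we want to maximize."""
    return PROFIT["chairs"] * chairs + PROFIT["tables"] * tables


def is_feasible(chairs, tables):
    """Constraints: a plan must respect both resource limits."""
    hours = MACHINE_HOURS["chairs"] * chairs + MACHINE_HOURS["tables"] * tables
    wood = WOOD["chairs"] * chairs + WOOD["tables"] * tables
    return hours <= MAX_MACHINE_HOURS and wood <= MAX_WOOD


def solve():
    """Enumerate every integer plan and keep the best feasible one."""
    # Decision variables: how many chairs (0..40) and tables (0..20) to make.
    candidates = product(range(0, 41), range(0, 21))
    feasible = [plan for plan in candidates if is_feasible(*plan)]
    return max(feasible, key=lambda plan: target(*plan))


best_plan = solve()
best_profit = target(*best_plan)
print(best_plan, best_profit)  # (10, 15) 1050
```

Enumeration only works because this toy problem has a few hundred candidate
plans. Realistic problems need the modelling and solving techniques covered in
the lectures.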
## Contents
The course focuses on two advanced topics within the context of Operations
Management: Optimization and Machine Learning. The backbone of the course is a
chain of practical cases that challenge the students to use these techniques,
applied in the Python programming language, to solve realistic problems. Hence,
the priority for students is to solve the cases, and lectures are not an end
but a means to tackle the practical side of the course.
You can find below the contents planned for each week. The final exact contents
may be adapted to the students' previous knowledge and skills to improve their
experience.
| Week | Contents |
|------|-------------------------------------------------------------------------------------------------|
| 1 | - Introduction and motivation of the course<br/> - Data Science in companies |
| 2 | - Introduction to case 1.<br/> - Real-world challenges with exact methods optimization |
| 3 | - Case 1 Workshop <br/> - Introduction to Simulation and Metaheuristics |
| 4 | - Introduction to case 2.<br/> - Simulation-based Optimization |
| 5 | - Case 2 Workshop <br/> - Metaheuristics Deep Dive: Genetic Algorithms |
| 6 | - Introduction to case 3.<br/> - Real-world challenges with simulation and heuristic approaches |
| 7 | - Case 3 Workshop <br/> - Introduction to Supervised Machine Learning |
| 8 | - Introduction to Case 4 <br/> - Supervised Machine Learning Methodology |
| 9 | - Case 4 Workshop <br/> - Algorithm Deep Dive: Decision Trees and Random Forests |
| 10   | - Where to go from here: further learning and career advice<br/> - Final Q&A, exam preparation  |
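
As a taste of the simulation-based optimization topic listed above, the sketch
below simulates daily demand for a perishable product and searches for the
order quantity with the highest average profit. Everything in it is an
assumption made for illustration: the price, the cost, the uniform demand range
and the function names. It uses a plain exhaustive search over a handful of
candidates rather than a genetic algorithm.

```python
import random

PRICE = 5.0   # hypothetical selling price per unit
COST = 3.0    # hypothetical purchase cost per unit


def simulate_profit(order_qty, n_days=2000, seed=42):
    """Average daily profit for one order quantity over simulated demand."""
    # Reusing the same seed for every candidate means all of them face the
    # same demand sequence, which makes the comparison fairer.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_days):
        demand = rng.randint(50, 150)  # assumed uniform daily demand
        sold = min(order_qty, demand)  # unsold units are lost
        total += PRICE * sold - COST * order_qty
    return total / n_days


def best_order_quantity(candidates):
    """Simulation-based optimization by exhaustive search over candidates."""
    return max(candidates, key=simulate_profit)


candidates = range(50, 151, 10)
best_qty = best_order_quantity(candidates)
print(best_qty, round(simulate_profit(best_qty), 2))
```

For these assumed numbers, the textbook newsvendor formula points to an order
quantity of roughly 90 units, so the search should land close to that value.
When the decision space is too large to enumerate, metaheuristics such as
genetic algorithms replace the exhaustive search.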
## Evaluation Criteria
The following items compose the final grade:
- Case assignments: 60% of the grade. There will be several assignments, each
with the same weight. The average grade of the assignments must be 4 or
more to pass the course.
- Final exam: 40% of the grade. There will be a final exam at the end of the
course. The exam grade must be 4 or more to pass the course.
Students who fail the final exam will get the chance to sit a retake exam.
There is no retake for the case assignments.
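
The weighting above can be sketched as a small calculation. The function name
and the example grades are hypothetical, the grading scale is assumed to be 0
to 10, and the check only covers the two minimums stated above, not any overall
pass mark set by the program.

```python
def final_grade(case_grades, exam_grade):
    """Return (weighted grade, whether both stated minimums are met)."""
    # All case assignments carry the same weight, so a plain average is used.
    case_average = sum(case_grades) / len(case_grades)
    grade = 0.6 * case_average + 0.4 * exam_grade
    # Both the case average and the exam grade must be 4 or more.
    meets_minimums = case_average >= 4 and exam_grade >= 4
    return round(grade, 2), meets_minimums


print(final_grade([7, 8, 6], 5))  # (6.2, True)
print(final_grade([9, 9, 9], 3))  # (6.6, False)
```

The second example shows that strong case grades do not compensate for an exam
grade below 4.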
Students are required to attend 80% of the classes. Failing to do so without a
justified reason will result in a zero grade in the participation/attendance
evaluation item and may lead to suspension from the program.
In case of a justified no-show to an exam, the student must inform the
corresponding faculty member and the director(s) of the program so that they
can consider rescheduling the exam (one possibility being during
the “Retake” period). In the meantime, the student will get an “incomplete”,
which will be replaced by the actual grade after the final exam is taken. The
“incomplete” will not be reflected on the student's Academic Transcript.
Plagiarism is using another's work and presenting it as one's own without
acknowledging the sources in the correct way. All essays, reports or projects
handed in by a student must be original work completed by the student. By
enrolling at any UPF BSM Master of Science and signing the “Honor Code,”
students acknowledge that they understand the school's policy on plagiarism and
certify that all course assignments will be their own work, except where
indicated by correct referencing. Failing to do so may result in automatic
expulsion from the program.
## Bibliography
All compulsory and required materials will be provided during the course. These
include lecture notes, required readings and description readings.
A good book that follows the approach of this course is "Guttag, John.
Introduction to Computation and Programming Using Python: With Application to
Understanding Data. 2nd ed. MIT Press, 2016. ISBN: 9780262529624", used in the
MIT course of the same name. It is not compulsory to use this book, but some
students might find it helpful.
Additional specific readings will be provided throughout the course. Students
will be requested to read some of these materials in advance of some sessions.
For students who want to dive deeper into the topics covered in the course, the
following books are recommended:
- On simulation: Louis G. Birta and Gilbert Arbez, Modelling and Simulation,
Springer 2019, ISBN: 978-3-030-18869-6, or Law A., Kelton D., Simulation
Modeling and Analysis, Second Edition, McGraw-Hill, ISBN: 978-0071165372
- On metaheuristics: Sean Luke, 2013, Essentials of Metaheuristics, Lulu,
second edition, available for free
at http://cs.gmu.edu/~sean/book/metaheuristics/
- On machine learning: Hastie T., Tibshirani R., Friedman J., The Elements Of
Statistical Learning: Data Mining, Inference, And Prediction, Second Edition
ISBN: 978-0387848570
## Professor Bio
Pablo Martín is an adjunct professor at UPF, where he teaches applied Data
Science with a focus on Operations and Supply Chain. He has several years of
experience in the fields of Data Science and Data Engineering in industry, in
both consulting and end-client companies. He also has experience in research.
Pablo holds an MSc degree in Data Science from the University of Amsterdam, and a BSc
in Business & Technology from the Autonomous University of Barcelona. Some of
his other interests include open-source software, Austrian Economics and
Bitcoin. You can find out more
at [LinkedIn](https://www.linkedin.com/in/pablomartincalvo/)
and [GitHub](https://github.com/pmartincalvo).