aoml/other/pds_syllabus.md
2023-12-09 18:05:57 +01:00


Practical Data Science for Operations Management

Course description

Operations Management is a field filled with opportunities to apply Data Science techniques. This course presents a set of practical cases in which students can put their knowledge to use on realistic challenges and relate theory to its application in industry. The expected outcome is that students who pass this course are familiar with a variety of modern, useful techniques that can be applied in real-life operations contexts. With this knowledge and experience, students understand which techniques suit which problems, what the main steps and requirements are for putting each technique to use, and how to judge whether it has been applied successfully.

Many of the techniques covered in this course are usually taught to engineering and technical profiles. This course does not aim to bring students to the same level of technical expertise as their engineering counterparts, but rather to provide enough background for students to interact successfully with such profiles. That said, the course can also serve as a first introduction for students who wish to study these techniques more thoroughly, during or after the course.

Objectives

With the knowledge and skills obtained in this course, students become fit for tasks such as:

  • Applying Data Science techniques, including optimization, simulation and machine learning, to simple cases at a hands-on level.
  • Planning and designing optimization, simulation and machine learning initiatives to solve operational challenges.
  • Leading optimization, simulation and machine learning projects from a managerial point of view.
  • Acting as a liaison between management and technical profiles in business contexts.
  • Becoming familiar with the Python programming language and several specialized packages within the Data Science environment.

Prerequisites

The course assumes the student has completed undergraduate-level Mathematics and Statistics courses oriented towards economics and management.

Additional knowledge of Operations Management, Supply Chain Management and Manufacturing will provide better functional context for the cases and make the course more rewarding.

Previous programming experience will allow students to focus much more on the operations side of the course. Students without it can still pass, but should expect a non-trivial challenge.

Methodology

There will be 20 sessions throughout the course. Sessions will combine lecturing, practical work and discussion. Students will team up to work on several practical cases, handed in throughout the course, that allow them to put their knowledge into practice. Python will be the tool of choice for most technical work, with different specialized packages used to tackle different parts of the course.

Students are expected to attend all course activities. Beyond class sessions, additional reading resources will be provided. For students who need to level up their Python skills, self-paced materials will be suggested.

Lectures will have the following contents:

| Week | Classes | Student work |
| --- | --- | --- |
| 1 | L1: Introduction and motivation of the course; L2: Simulation, Optimization and Machine Learning in companies | Python prep |
| 2 | L3: Introduction to simulation: What is it, When do we use it, Types of simulation; L4: Simulation examples in Python. Introduction to case 1 | Python prep; View Primer: Simulating a pandemic; Read Agent-based modeling: Methods and techniques for simulating human systems; Read case 1 |
| 3 | L5: Simulation methodology; L6: Simulation-based optimization I. Challenges and issues with simulation. Where to go from here; S1: Workshop for case 1 | Work on case 1; Review HASH model market simulation; Review HASH warehouse simulation |
| 4 | L7: Introduction to optimization; L8: Modeling optimization problems; S2: Workshop for case 1 | Work on case 1; Read Gurobi's Modelling Basics; Read Neos taxonomy of optimization problems; View this video on the Simplex algorithm |
| 5 | L9: Taxonomy of optimization techniques; L10: Simulation-based optimization II. Introduction to case 2 | Deliver case 1; Read case 2; Enjoy watching simulation-based race car training; Read how the 4th most popular database software in the world uses GAs to access data faster |
| 6 | L11: Challenges in real-world usage. Simulation vs Optimization; L12: Introduction to Machine Learning; S3: Workshop for case 2 | Work on case 2; Read this review on simulation optimization techniques and software |
| 7 | L13: Supervised Machine Learning (SML): NIPS; L14: Typical SML workflow. Introduction to case 3; S4: Workshop for case 2 | Work on case 2; Read case 3 |
| 8 | L15: Algorithm deep dive: Decision trees; L16: Feature Engineering and Model Evaluation; S5: Workshop for case 3 | Deliver case 2; View this intro to neural networks and this intro to random forests |
| 9 | L17: Deployment of Models; L18: Stories from the trenches: applying all of this in the real world; S6: Workshop for case 3 | Work on case 3; View this video on why businesses fail at ML |
| 10 | L19: Where to go from here: further learning and career advice; L20: Final Q&A, exam preparation | Work on case 3 |
| 11 | Exam | Deliver case 3 |

  • Lecture 1 INTRO
    • Introduction to the course
      • Citizenship rules
        • I won't force you to come, but I advise you to.
        • I'll always try to start 5 min late, finish 5 min late, and stop for 5 min.
        • You can come and go, just please be respectful.
      • Calendar
      • Contents
      • Expectations
      • The teacher
      • Evaluation
      • Contact
      • Questions?
      • The relevance of math and computers in management
        • Examples: pricing, logistics, staffing.
        • The skills and profiles required
        • The tools used
  • Lecture 2 INTRO
    • The techniques we will see in the course
      • Simulation
      • Optimization
      • Supervised machine learning (aka "prediction")
    • Why this stuff is important
  • Lecture 3 SIM
    • A humbling example
    • What is simulation and when do we use it
    • Different types of simulations
  • Lecture 4 SIM
    • Toy simulations in Python
    • How to approach simulation in practical terms
    • Tools in industry
  • Lecture 5 SIM
    • Theoretical background on simulation
    • Present case 1
  • Lecture 6 SIM
    • Simulation-based optimization
    • Where to go from here
  • Lecture 7 OPT
    • What is optimization
    • A trivial example
  • Lecture 8 OPT
    • Different optimization techniques
    • Present case 3
  • Lecture 9 OPT
    • How to model optimization problems (target functions, decision variables and constraints)
  • Lecture 10 OPT
    • Simulation-based optimization: Genetic algorithms
  • Lecture 11 OPT
    • Real world challenges and optimization deployment
  • Lecture 12 ML
    • Good news, you already know Machine Learning
    • Different branches of Machine Learning
    • Real world examples of applications
  • Lecture 13 ML
    • How does Supervised Machine Learning work?
    • Present case 2
  • Lecture 14 ML
    • The Machine Learning workflow (EDA, Feature Engineering, Model Evaluation, Deployment)
  • Lecture 15 ML
    • Feature Engineering
  • Lecture 16 ML
    • Model evaluation
  • Lecture 17 ML
    • Deployment and real world challenges
  • Lecture 18 Real life stories from the trenches
  • Lecture 19 Real life stories from the trenches
  • Lecture 20
    • Q&A pre-exam
    • Feedback on the course
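
The outline above mentions modelling optimization problems in terms of target functions, decision variables and constraints. As a rough illustration of those three ingredients, here is a minimal Python sketch. The product-mix problem and all of its numbers are invented for this example and are not taken from the course cases. It uses plain brute-force enumeration rather than any solver package.

```python
from itertools import product

# Hypothetical data, invented for illustration only.
PROFIT = {"chairs": 30, "tables": 50}   # profit per unit
HOURS = {"chairs": 2, "tables": 4}      # machine hours per unit
AVAILABLE_HOURS = 40                    # machine hours available
MAX_CHAIR_DEMAND = 10                   # no more chairs than this can be sold
MAX_UNITS = 20                          # search bound to keep enumeration small


def objective(plan):
    """Target function: total profit of a production plan."""
    return sum(PROFIT[name] * qty for name, qty in plan.items())


def is_feasible(plan):
    """Constraints: machine capacity and chair demand."""
    hours_used = sum(HOURS[name] * qty for name, qty in plan.items())
    return hours_used <= AVAILABLE_HOURS and plan["chairs"] <= MAX_CHAIR_DEMAND


def solve():
    """Enumerate all integer plans (the decision variables) and keep the best."""
    best_plan, best_value = None, float("-inf")
    for chairs, tables in product(range(MAX_UNITS + 1), repeat=2):
        plan = {"chairs": chairs, "tables": tables}
        if is_feasible(plan) and objective(plan) > best_value:
            best_plan, best_value = plan, objective(plan)
    return best_plan, best_value


best_plan, best_value = solve()
print(best_plan, best_value)
```

With these invented numbers the search returns 10 chairs and 5 tables, for a profit of 550. Enumeration only works for tiny problems like this one. Realistic models need a dedicated solver.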

Contents

The course focuses on two advanced topics within the context of Operations Management: Optimization and Machine Learning. The backbone of the course is a chain of practical cases that challenge the students to use these techniques, applied in the Python programming language, to solve realistic problems. Hence, the priority for students is to solve the cases, and lectures are not an end but a means to tackle the practical side of the course.

You can find below the contents planned for each week. The final contents may be adapted to the students' previous knowledge and skills to improve their experience.

| Week | Contents |
| --- | --- |
| 1 | Introduction and motivation of the course; Data Science in companies |
| 2 | Introduction to case 1; Real-world challenges with exact methods optimization |
| 3 | Case 1 Workshop; Introduction to Simulation and Metaheuristics |
| 4 | Introduction to case 2; Simulation-based Optimization |
| 5 | Case 2 Workshop; Metaheuristics Deep Dive: Genetic Algorithms |
| 6 | Introduction to case 3; Real-world challenges with simulation and heuristic approaches |
| 7 | Case 3 Workshop; Introduction to Supervised Machine Learning |
| 8 | Introduction to case 4; Supervised Machine Learning Methodology |
| 9 | Case 4 Workshop; Algorithm Deep Dive: Decision Trees and Random Forests |
| 10 | Where to go from here: further learning and career advice; Final Q&A, exam preparation |
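
To give a flavour of what simulation-based optimization means in the weekly contents above, here is a minimal Python sketch. It is not taken from the course cases. The newsvendor-style setting, the prices, the candidate order quantities and the normal demand distribution are all assumptions made up for illustration. The idea is simply to use a simulator to score each candidate decision and pick the best-scoring one.

```python
import random

# Hypothetical data, invented for illustration only.
UNIT_COST = 4.0
UNIT_PRICE = 10.0
CANDIDATE_ORDERS = range(50, 151, 10)  # order quantities to evaluate


def simulate_profit(order_qty, rng, n_days=2000):
    """Simulation: average daily profit for one order quantity."""
    total = 0.0
    for _ in range(n_days):
        # Assumed demand model: normal with mean 100 and std 20, floored at 0.
        demand = max(0, round(rng.gauss(100, 20)))
        sold = min(order_qty, demand)
        total += UNIT_PRICE * sold - UNIT_COST * order_qty
    return total / n_days


def best_order(seed=42):
    """Optimization: score every candidate with the simulator, keep the best."""
    results = {}
    for qty in CANDIDATE_ORDERS:
        # Re-seeding gives every candidate the same demand stream,
        # so differences in profit come from the decision, not from luck.
        rng = random.Random(seed)
        results[qty] = simulate_profit(qty, rng)
    best_qty = max(results, key=results.get)
    return best_qty, results


best_qty, results = best_order()
print(best_qty, round(results[best_qty], 2))
```

Trying every candidate is feasible here because there is a single decision variable with eleven options. When the decision space is too large to enumerate, a metaheuristic such as a genetic algorithm can propose candidates instead.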

Evaluation Criteria

The following items compose the final grade:

  • Case assignments: 60% of the grade. There will be several assignments, each with the same weight. The average grade of the assignments must be 4 or higher to pass the course.
  • Final exam: 40% of the grade. There will be a final exam at the end of the course. The exam grade must be 4 or higher to pass the course.

Students who fail the final exam will get the chance to sit a retake exam. There is no retake for the case assignments.

Students are required to attend 80% of the classes. Failing to do so without a justified reason will result in a zero grade in the participation/attendance evaluation item and may lead to suspension from the program.

In case of a justified no-show to an exam, the student must inform the corresponding faculty member and the director(s) of the program so that they can study the possibility of rescheduling the exam (one possibility being during the “Retake” period). In the meantime, the student will get an “incomplete”, which will be replaced by the actual grade after the final exam is taken. The “incomplete” will not be reflected on the student's Academic Transcript.

Plagiarism is using another's work and presenting it as one's own without acknowledging the sources correctly. All essays, reports or projects handed in by a student must be original work completed by the student. By enrolling at any UPF BSM Master of Science and signing the “Honor Code,” students acknowledge that they understand the school's policy on plagiarism and certify that all course assignments will be their own work, except where indicated by correct referencing. Failing to do so may result in automatic expulsion from the program.

Bibliography

All compulsory and required materials will be provided during the course. These include lecture notes, required readings and description readings.

A good book that follows the approach of this course is "Guttag, John. Introduction to Computation and Programming Using Python: With Application to Understanding Data. 2nd ed. MIT Press, 2016. ISBN: 9780262529624", used in the homonymous course at MIT. It is not compulsory to use this book, but some students might find it helpful.

Additional specific readings will be provided throughout the course. Students will be requested to read some of these materials in advance of some sessions.

For students who want to dive deeper into the topics covered in the course, the following books are recommended:

  • On simulation: Louis G. Birta and Gilbert Arbez, Modelling and Simulation, Springer, 2019, ISBN: 978-3-030-18869-6, or Law A., Kelton D., Simulation Modeling and Analysis, Second Edition, McGraw-Hill, ISBN: 978-0071165372
  • On metaheuristics: Sean Luke, 2013, Essentials of Metaheuristics, Lulu, second edition, available for free at http://cs.gmu.edu/~sean/book/metaheuristics/
  • On machine learning: Hastie T., Tibshirani R., Friedman J., The Elements Of Statistical Learning: Data Mining, Inference, And Prediction, Second Edition ISBN: 978-0387848570

Professor Bio

Pablo Martín is an adjunct professor at UPF, where he teaches applied Data Science with a focus on Operations and Supply Chain. He has several years of industry experience in Data Science and Data Engineering, both in consulting and in end-client companies. He also has experience in research. Pablo holds an MSc degree in Data Science from the University of Amsterdam, and a BSc in Business & Technology from the Autonomous University of Barcelona. Some of his other interests include open-source software, Austrian Economics and Bitcoin. You can find out more at LinkedIn and GitHub.