A lot of stuff.

This commit is contained in:
pablo 2022-07-04 11:13:11 +02:00
parent a69efa8490
commit 6ad1dab4d2
45 changed files with 2063 additions and 31 deletions

BIN
other/contrato.pdf Normal file

Binary file not shown.

BIN
other/contrato_firmado.pdf Normal file

Binary file not shown.

261
other/course_syllabus.md Normal file

@ -0,0 +1,261 @@
# Applied Optimization Techniques
## Course goals
The goal of this course is to provide an introduction to simulation,
optimization and machine learning techniques to students with a background in
social sciences, with an approach biased towards practical work. Students who
pass this course are expected to know a variety of modern and useful techniques
that can be applied in real-life business contexts. With this knowledge and
experience, students understand which techniques are right for different
problems, what the main steps and requirements for applying each technique are,
and how to judge whether it has been applied successfully.
Many of the techniques taught in this course are usually taught to engineering
and technical profiles. This course does not aim to bring students to the same
level of technical expertise as their engineering counterparts, but rather to
provide enough background so that the students can successfully interact with
such profiles. Having said that, this course can also be a first introduction
for students who want to study the techniques discussed here more thoroughly,
either during the course or after it.
With the knowledge and skills obtained in this course, students become fit for
tasks such as:
- Applying simulation, optimization and machine learning techniques to simple
cases.
- Planning and designing simulation, optimization and machine learning
initiatives.
- Leading simulation, optimization and machine learning projects from a
managerial point of view.
- Acting as a liaison between management and technical profiles in business
contexts.
## Prerequisites
The course assumes the student has covered the Mathematics I, II and III courses and
the Probability & Statistics course. Passing this course is not impossible if
that is not the case, but the student should expect a non-trivial challenge
ahead.
Knowledge of the following topics will help students better leverage this
course, but is not strictly required:
- Basic programming, especially in data-oriented languages such as Python or R.
- Operations research
## Teaching method and contents
The course will have lecture classes and practical seminars. Classes start on
April 7th.
There will be 20 lecture classes and 6 practical seminars. Lecture classes will
be used to present material to students and to discuss the
course contents. For the practical seminars, students will be divided into two
groups with independent sessions to reduce the class size. The practical
seminars will be used to dive deep into the three mandatory case assignments that
students will do throughout the course. The sessions will also be hands-on:
students will work on the case together with the professor.
Students are expected to attend all the activities in the course. Beyond
lectures and practical seminars, additional reading resources will be provided
to students. For students who need to level up their Python skills, self-paced
materials will be suggested.
Lectures will have the following contents:
| Week | Classes | Student work |
|------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 1 | - L1: Introduction and motivation of the course<br/> - L2: Simulation, Optimization and Machine Learning in companies | - Python prep |
| 2 | - L3: Introduction to simulation: What is it, When do we use it, Types of simulation<br/> - L4: Simulation examples in Python. Introduction to case 1. | - Python prep<br/> - View [Primer: Simulating a pandemic](https://www.youtube.com/watch?v=7OLpKqTriio) <br/>- Read [Agent-based modeling: Methods and techniques for simulating human systems](https://www.pnas.org/content/99/suppl_3/7280) <br/> - Read case 1. |
| 3 | - L5: Simulation methodology. <br/> - L6: Simulation-based optimization I. Challenges and issues with simulation. Where to go from here<br/> - S1: Workshop for case 1 | - Work on case 1 <br/> - Review [HASH model market simulation](https://hash.ai/@hash/model-market-python) <br/>- Review [HASH warehouse simulation](https://hash.ai/@hash/warehouse-logistics) |
| 4 | - L7: Introduction to optimization<br/> - L8: Modeling optimization problems<br/> - S2: Workshop for case 1 | - Work on case 1 <br/> - Read Gurobi's [Modelling Basics](https://www.gurobi.com/resource/modeling-basics/) <br/> - Read Neos [taxonomy of optimization problems](https://neos-guide.org/optimization-tree) <br/> - View this video on the [Simplex algorithm](https://www.youtube.com/watch?v=RO5477EKlXE) |
| 5 | - L9: Taxonomy of optimization techniques <br/> - L10: Simulation-based optimization II. Introduction to case 2 | - Deliver case 1 <br/> - Read case 2 <br/> - Enjoy watching [simulation-based race car training](https://www.youtube.com/watch?v=-sg-GgoFCP0) <br/> - Read how the [4th most popular database software in the world uses GAs to access data faster.](https://www.postgresql.org/docs/8.0/geqo-intro2.html) |
| 6 | - L11: Challenges in real-world usage. Simulation vs Optimization <br/> - L12: Introduction to Machine Learning <br/> - S3: Workshop for case 2 | - Work on case 2 <br/> - Read this [review on simulation optimization techniques and software](https://arxiv.org/pdf/1706.08591.pdf) |
| 7 | - L13: Supervised Machine Learning (SML): NIPS<br/> - L14: Typical SML workflow. Introduction to case 3<br/> - S4: Workshop for case 2 | - Work on case 2 <br/> - Read case 3 |
| 8 | - L15: Algorithm deep dive: Decision trees<br/> - L16: Feature Engineering and Model Evaluation<br/> - S5: Workshop for case 3 | - Deliver case 2 <br/> - View this [intro to neural networks](https://www.youtube.com/watch?v=aircAruvnKk&t=10s) and this [intro to random forests](https://www.youtube.com/watch?v=J4Wdy0Wc_xQ) |
| 9 | - L17: Deployment of Models <br/> - L18: Stories from the trenches: applying all of this in the real world<br/> - S6: Workshop for case 3 | - Work on case 3 <br/> - View this video on [why businesses fail at ML](https://www.youtube.com/watch?v=dRJGyhS6gA0) |
| 10 | - L19: Where to go from here: further learning and career advice<br/> - L20: Final Q&A, exam preparation | - Work on case 3 |
| 11 | - Exam | - Deliver case 3 |
- Lecture 1 INTRO
  - Introduction to the course
    - Citizenship rules
      - Won't force you to come, but I advise you to.
      - I'll always try to start 5min late, finish 5min late, and stop
        for 5min.
      - You can come and go, just please be respectful.
    - Calendar
    - Contents
    - Expectations
    - The teacher
    - Evaluation
    - Contact
    - Questions?
  - The relevance of math and computers in management
    - Examples: pricing, logistics, staffing.
    - The skills and profiles required
    - The tools used
- Lecture 2 INTRO
  - The techniques we will see in the course
    - Simulation
    - Optimization
    - Supervised machine learning (aka "prediction")
  - Why this stuff is important
- Lecture 3 SIM
  - A humbling example
  - What is simulation and when do we use it
  - Different types of simulations
- Lecture 4 SIM
  - Toy simulations in Python
  - How to approach simulation in practical terms
  - Tools in industry
- Lecture 5 SIM
  - Theoretical background on simulation
  - Present case 1
- Lecture 6 SIM
  - Simulation-based optimization
  - Where to go from here
- Lecture 7 OPT
  - What is optimization
  - A trivial example
- Lecture 8 OPT
  - Different optimization techniques
  - Present case 3
- Lecture 9 OPT
  - How to model optimization problems (target functions, decision variables
    and constraints)
- Lecture 10 OPT
  - Simulation-based optimization: Genetic algorithms
- Lecture 11 OPT
  - Real world challenges and optimization deployment
- Lecture 12 ML
  - Good news, you already know Machine Learning
  - Different branches of Machine Learning
  - Real world examples of applications
- Lecture 13 ML
  - How does Supervised Machine Learning work?
  - Present case 2
- Lecture 14 ML
  - The Machine Learning workflow (EDA, Feature Engineering, Model
    Evaluation, Deployment)
- Lecture 15 ML
  - Feature Engineering
- Lecture 16 ML
  - Model evaluation
- Lecture 17 ML
  - Deployment and real world challenges
- Lecture 18 Real life stories from the trenches
- Lecture 19 Real life stories from the trenches
- Lecture 20
  - Q&A pre-exam
  - Feedback on the course
## Case details
Case 1
- Title: Choosing stock policies under uncertainty
- Description: Students role-play their participation as consultants in a
project for Beanie Limited, a coffee beans roasting company. Elisa, the
regional manager for the Italian region, is not happy with their inventory
policies for raw beans. The students are asked to analyse the problems posed
by Elisa and apply simulation techniques, together with real data, to
recommend a stock policy for the company's warehouse in the Italian region.
Python notebooks with some helpful prepared functions are provided to the
students. The final delivery is a report with their recommendation to the
client company, along with the code used.
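As a rough illustration of the kind of analysis this case calls for, the sketch below simulates a reorder-point policy under random daily demand and compares a few candidate reorder points. Everything in it is an assumption made for illustration: the `simulate_policy` function, the Gaussian demand, the lead time and the two reported metrics are hypothetical and are not the notebooks or the data provided to students.

```python
import random

def simulate_policy(reorder_point, order_up_to, days=365, lead_time=5, seed=0):
    """Simulate a reorder-point / order-up-to stock policy.

    Hypothetical setup: daily demand is drawn from a Gaussian (mean 100,
    standard deviation 30, floored at 0) and an order arrives `lead_time`
    days after it is placed. Returns the fill rate (served / demanded)
    and the average on-hand stock.
    """
    rng = random.Random(seed)
    stock = float(order_up_to)
    pending = {}  # arrival day -> quantity on order
    demanded = served = stock_sum = 0.0
    for day in range(days):
        stock += pending.pop(day, 0.0)      # receive deliveries due today
        demand = max(0.0, rng.gauss(100, 30))
        sold = min(stock, demand)           # unmet demand is lost
        stock -= sold
        demanded += demand
        served += sold
        stock_sum += stock
        on_order = sum(pending.values())
        if stock + on_order <= reorder_point:
            # Order enough to bring stock plus open orders back to the target.
            pending[day + lead_time] = order_up_to - stock - on_order
    return served / demanded, stock_sum / days

for reorder_point in (0, 300, 900):
    fill_rate, avg_stock = simulate_policy(reorder_point, order_up_to=1500)
    print(f"reorder at {reorder_point}: fill rate {fill_rate:.1%}, "
          f"average stock {avg_stock:.0f}")
```

Running several candidate policies against the same simulated demand is what makes the trade-off visible: a higher reorder point improves service but ties up more stock.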
Case 2 candidate
- Title: ?
- Description: ?
- Sample idea: https://www.gurobi.com/resource/facility-location-problem/
Case 3
- Title: Improving last-mile scheduling with Machine Learning
- Description: Students role-play their participation as consultants in a
project for Beanie Limited, a coffee beans roasting company. Pieter, the
director of secondary transportation, has requested help from the student
consultants. One of the key activities in Pieter's team is the daily
scheduling, in which each truck is assigned the deliveries and routes it
will perform. The students are asked to develop a machine-learning
algorithm to predict the drop-time for each delivery. The drop-time is the
time a driver takes to unload the goods at a client location or, more
informally, the time that passes from the moment the driver removes the key
from the truck until the engine is started again. The goal is to provide
more advanced information to Pieter's schedulers so they can better plan the
routes of their drivers. The students will be provided with a labelled
dataset. The final delivery is the working prediction model, along with a
report explaining their methodology in building it and answering some
business questions for the client company.
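As a rough illustration of the supervised-learning workflow this case involves, the sketch below fits a one-feature linear model on a training split and compares its error on held-out data against a naive baseline. The data is synthetic and every name and number is an assumption (the `n_packages` feature, the noise level, the 80/20 split); it is not the labelled dataset given to students, and a real solution would likely use more features and a proper ML library.

```python
import random
from statistics import mean

# Synthetic stand-in for the labelled dataset (hypothetical feature and numbers):
# drop-time in minutes grows with the number of packages, plus noise.
rng = random.Random(42)
rows = []
for _ in range(200):
    n_packages = rng.randint(1, 30)
    drop_time = 4.0 + 0.8 * n_packages + rng.gauss(0, 2)
    rows.append((n_packages, drop_time))

# Hold out the last 20% of rows to evaluate on data the model has not seen.
split = int(len(rows) * 0.8)
train, test = rows[:split], rows[split:]

def fit_line(data):
    """Ordinary least squares for one feature: returns (intercept, slope)."""
    x_mean = mean(x for x, _ in data)
    y_mean = mean(y for _, y in data)
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in data)
             / sum((x - x_mean) ** 2 for x, _ in data))
    return y_mean - slope * x_mean, slope

def mean_absolute_error(data, predict):
    """Average absolute gap between actual and predicted drop-times."""
    return mean(abs(y - predict(x)) for x, y in data)

intercept, slope = fit_line(train)
baseline = mean(y for _, y in train)  # naive model: always predict the average

mae_model = mean_absolute_error(test, lambda x: intercept + slope * x)
mae_baseline = mean_absolute_error(test, lambda x: baseline)
print(f"model MAE: {mae_model:.2f} min, baseline MAE: {mae_baseline:.2f} min")
```

Comparing against a naive baseline on held-out data is a simple way to judge whether a model adds anything the schedulers could not get from a plain average.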
## Grading
The following items compose the final grade:
- Case assignments: 50% of the grade. There will be three assignments, each
with the same weight. The average grade of the assignments must be 5 or
more to pass the course.
- Final exam: 50% of the grade. There will be a final exam at the end of the
course. The grade must be 5 or more to pass the course.
Students who fail the final exam will get the chance to sit a retake exam.
A final grade is calculated as:
<!-- @formatter:off -->
```python
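from statistics import mean

def grade_course(case1_grade, case2_grade, case3_grade, final_exam_grade):
    cases_avg = mean([case1_grade, case2_grade, case3_grade])
    # Both the case average and the final exam must be 5 or more to pass.
    passed_course = cases_avg >= 5 and final_exam_grade >= 5
    final_grade = (cases_avg + final_exam_grade) / 2
    return passed_course, final_grade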
```
<!-- @formatter:on -->
## Bibliography
All compulsory and required materials will be provided during the course. These
include lecture notes, required readings and description readings.
A good book that follows the approach of this course is "Guttag, John.
Introduction to Computation and Programming Using Python: With Application to
Understanding Data. 2nd ed. MIT Press, 2016. ISBN: 9780262529624", used in the
homonymous course at MIT. It is not compulsory to use this book, but some
students might find it helpful.
Additional specific readings will be provided throughout the course. Students
will be requested to read some of these materials in advance of some sessions.
For students who want to dive deeper into the topics covered in the course, the
following books are recommended:
- On simulation: Louis G. Birta, Gilbert Arbez, Modelling and Simulation,
Springer, 2019, ISBN: 978-3-030-18869-6 or Law A., Kelton D., Simulation
Modeling and Analysis, Second Edition, McGraw-Hill, ISBN: 978-0071165372
- On machine learning: Hastie T., Tibshirani R., Friedman J., The Elements Of
Statistical Learning: Data Mining, Inference, And Prediction, Second Edition
ISBN: 978-0387848570
## Cool ideas & notes
- Hold a Kaggle competition with the students. Winners come to spend a morning
in Accenture.
- Start every lecture with a fun fact.
- Let them choose the challenge for one of the practical labs.
- Should I have office hours?
- Will the classes be recorded?
- What are the policies on:
  - Late deliveries
  - Not attending exam
  - Re-takes
- https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-0002-introduction-to-computational-thinking-and-data-science-fall-2016

34
other/ideas.md Normal file

@ -0,0 +1,34 @@
# Ideas for next year
- Using Kaggle for the competition and making it individual. This would
definitely make the logistics easier, and the individual part of it would
probably be much more stimulating. It would also help me better spot who is
really interested in the course.
- I am thinking about switching from the 3-part approach to the course
(Simulation-Optimization-ML) to a two-part one (Optimization and ML). The contents
would be roughly the same, but the simulation part would be more geared
towards the serious metaheuristics and we would spend a bit less time on the
mathematical programming bit. I just realized that, in my mind, I had
packaged simulation as something completely different from optimization. But
that is quite silly. Optimization is optimization, no matter if you are doing
it through Linear Programming or through metaheuristics. The role of
simulation is to compute the value of a target function in a heuristic
environment. That's it. Hence, the simulation case then would not be so
naive, but rather require students to build a genetic algorithm solution to
tackle a more complex case of optimization, such as a multiechelon inventory
optimization situation.
- I have also changed my mind on the course grading. I really don't like to do
the exam. The cases have so much more value when compared to the exam. Some
of the students let me know that they thought that grading was not balanced
with the exam having so much weight, and I can't help but agree. I will
probably try to change the weight, but I still need to find a way to prevent
free-loaders from going through the course unharmed. I have thought that
maybe I could make the students work in groups of 2. The chance of having
free-loaders lowers by a lot. And then I could pull the cases grade up to 75%
and the exam would only be 25%.
- For next year, I want to include a few additional ideas in the first welcome
class:
  - Publish the grades from Avaldo and show them to the new students.
  - Share lessons learnt from last year with students (what did successful
    students do, what did students that failed do)
  - Show places where students from last year are today (Iker, Ivan, Marc, etc)
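
The point made above, that simulation only computes the value of the target function while a metaheuristic does the optimizing, can be sketched as follows. This is a deliberately small, simplified genetic algorithm under assumptions of my own: the `simulated_cost` toy (a single stock level, Gaussian demand, holding cost 1 and shortage cost 5 per unit) and all parameters are hypothetical, and a multiechelon case would be considerably more involved.

```python
import random

def simulated_cost(stock_level, n_days=200, seed=1):
    """Target function evaluated by simulation (hypothetical toy).

    Each day starts with `stock_level` units. Leftover units cost 1 each and
    every unit of unmet demand costs 5. Demand is Gaussian(100, 20).
    """
    rng = random.Random(seed)  # fixed seed: every candidate sees the same demand
    total = 0.0
    for _ in range(n_days):
        demand = max(0.0, rng.gauss(100, 20))
        leftover = max(0.0, stock_level - demand)
        shortage = max(0.0, demand - stock_level)
        total += 1.0 * leftover + 5.0 * shortage
    return total / n_days

def genetic_search(generations=30, pop_size=20, seed=0):
    """Evolve candidate stock levels towards a low simulated cost."""
    rng = random.Random(seed)
    population = [rng.uniform(0, 300) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the cheaper half of the population.
        population.sort(key=simulated_cost)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2           # crossover: blend two parents
            child += rng.gauss(0, 5)      # mutation: small random change
            children.append(max(0.0, child))
        population = parents + children
    return min(population, key=simulated_cost)

best = genetic_search()
print(f"best stock level: {best:.1f}, simulated cost: {simulated_cost(best):.2f}")
```

The search never looks inside the simulation; it only asks for the cost of each candidate, which is why the same loop would work with a far more complex simulator plugged in.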

8
other/memories.md Normal file

@ -0,0 +1,8 @@
- Carla asking in the last class how I had ended up at UPF.
- Iker and the grade he still needed to be able to submit his application to Cambridge.
- Vicent and Arnau asking me if we could meet because they wanted guidance on how to get into the Data Science world.
- The little push I gave Iván to earn the honours distinction.
- The soap opera between Jennifer, Marta and Anar at the end of the course.
- The look of joy on Álvaro's face when I gave them the book as the prize for the competition.
- Marta's "yes, yes, yes" nods.
- The applause they gave me during my farewell speech in the last class.


@ -0,0 +1,60 @@
# Python prep guidelines
In lecture 1, I informed you that you will be writing Python code as part of
the 3 cases you will need to go through in this course. This means that the
more you know about Python, the less you will need to focus on *how* to do
things with Python and the *more* you will be able to focus on actually solving
the case.
This preparation is especially important if you have never done any coding. I
highly advise you to take your time to work on your Python skills before case
1 begins. It will pay off down the line.
This document contains two sections: a summary of the skills you are looking
for, and a few suggested materials that can help you. You don't need to use the
suggested materials: if you find others that you prefer, go ahead. I do
recommend that you stick to learning the skills I am suggesting, since
that will maximize the return on your effort. There are many other areas of
Python that you can learn, but they won't be relevant for this course.
## Skills to look for
- Notebooks and Colab
  - How to use your UPF Gmail account to open
    Colab (https://colab.research.google.com)
  - How to create a new notebook, write code in it and run it
  - How to upload and download CSV or Excel files in Colab notebooks
- Python basics
  - Variables
  - Functions
  - Basic data types: ints, floats, strings
  - Complex data types: lists, dicts, tuples
- Pandas
  - Understanding pandas dataframes
  - How to read and write CSV/Excel files as dataframes
  - How to select data
    - Selecting columns
    - Selecting rows with conditions
  - Grouping and aggregating dataframes
  - Obtaining statistics from columns
  - Creating new columns through operations on existing columns
- Seaborn
  - Understanding seaborn plots
  - How to plot data from a pandas dataframe with seaborn
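
To give a feel for what the pandas skills listed above look like in practice, here is a minimal sketch. The dataframe and its column names (`region`, `units`, `unit_price`) are made up for illustration; in the cases you would load a CSV or Excel file instead.

```python
import pandas as pd

# Hypothetical toy data; in the course you would load a file instead,
# e.g. with pd.read_csv("some_file.csv").
df = pd.DataFrame({
    "region": ["north", "north", "south", "south", "south"],
    "units": [10, 4, 7, 12, 3],
    "unit_price": [2.5, 2.5, 3.0, 3.0, 3.0],
})

units = df["units"]                   # select one column
big_orders = df[df["units"] > 5]      # select rows with a condition

# Create a new column through an operation on existing columns.
df["revenue"] = df["units"] * df["unit_price"]

# Group and aggregate: total revenue and average units per region.
summary = df.groupby("region").agg(
    total_revenue=("revenue", "sum"),
    avg_units=("units", "mean"),
)

print(df["revenue"].describe())       # basic statistics of a column
print(summary)
```

If you can read this snippet and predict roughly what each line produces, you are in good shape for the cases.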
## Suggested Materials
- A super brief Colab example
notebook: https://colab.research.google.com/?utm_source=scs-index
- A more detailed, high quality example Colab
notebook: https://colab.research.google.com/github/cs231n/cs231n.github.io/blob/master/python-colab.ipynb
- A brief video intro to Colab
(Spanish): https://www.youtube.com/watch?v=8VFYs3Ot_aA
- Another brief video intro to Colab
(English): https://www.youtube.com/watch?v=oCngVVBSsmA
- A free, 4-hour intro to Python from
Datacamp: https://www.datacamp.com/courses/intro-to-python-for-data-science
- A self-paced introduction to Python for STEM
applications: https://www.pythonlikeyoumeanit.com/index.html
- A huge collection of resources, in case you don't like the previous ones or
want to find more: https://www.reddit.com/r/learnpython/wiki/index