Data Engineering
Spring 2023
- Code: AI5308 / AI4005
- Schedule: Tue/Thu 1:00pm-2:30pm
- Location: GIST College Building A, Room 227 (N4)
- Instructor: Sundong Kim
- TAs: Sanha Hwang, Sungkyu Yang, Hongyiel Suh
- Contact: Students are encouraged to ask all course-related questions on Discord, where announcements are also posted. Office hours are as follows.
- Sundong: Tue 2:30pm-3:30pm, Discord or GIST AI Graduate School (S7) Room 204
- Sanha: TBD, Discord or GIST AI Graduate School (S7) Room 204
- Sungkyu: TBD, Discord or GIST AI Graduate School (S7) Room 202
- Hongyiel: TBD, Discord or GIST AI Graduate School (S7) Room 202
- Virtual Classrooms:
- Discord (Discussion and Q&A, Team Collaboration)
- Google Classroom (Probably for submitting homework)
- Class Overview, Logistics and Grading: See this page
- Final Project: See this page
Textbook & References
- Books - PDFs available on the LMS and at the GIST library, courtesy of O’Reilly
- Reference lectures
Notice
- Homework released (March 27) - Write your first critique and submit here.
- Homework released (March 17) - Make your webpage & CV and submit here.
- I have invited several speakers (from Naver, MakinaRocks, etc.) to the AI Graduate School colloquium to give talks related to our course. Please mark the sessions below on your calendar and attend them. For those who cannot attend, we will record the lectures and share them with you.
- March 30: (Multimodal representation learning at Naver Shopping, Wonyoung Shin)
- April 6: (MLOps at Naver Shopping, Byeongjo Kim and Shengzhe Li)
- May 18: (Trustworthy federated learning at IBS, Sungwon Han)
- May 25: (MLOps and use cases at MakinaRocks, Youngsub Lim)
- For convenience, consider taking AI 5001 (AI Colloquium) together with this class. The colloquium is held every Thursday, 16:00-17:30, at AI Studio (1F), AI Graduate School (S7).
Tentative Schedule
Below is the tentative schedule of the course. Overall, the course follows the DMLS book (Designing Machine Learning Systems), which is an up-to-date version of the CS329S lecture notes by Chip Huyen.
Date | Description | Materials | Homework |
---|---|---|---|
Feb 28 | Introduction | Sign-up form | |
Mar 2 | Overview of Machine Learning Systems | DMLS Ch.1 | |
Mar 7 | Introduction to Machine Learning Systems Design | DMLS Ch.2, Slides | |
Mar 9 | Class Logistics, Homework Releases | Slides | |
Mar 14 | Project Announcement | Final project, Slides | |
Mar 16 | Introduction to Machine Learning Systems Design | DMLS Ch.2, Slides | HW - Webpage (Mar 17) |
Mar 21 | Data Engineering 101 | DMLS Ch.3, Slides | Team formation (Mar 20) |
Mar 23 | Data Engineering 101, Training data | DMLS Ch.3-4, Slides | |
Mar 28 | Training data | DMLS Ch.4, Slides | HW - Critique 1 (Mar 27) |
Mar 30 | Feature engineering, Invited talk - e-CLIP model at Naver Shopping (16:00, AI Studio, S7) | Video, DMLS Ch.5 | |
Apr 4 | Feature engineering | DMLS Ch.5 | |
Apr 6 | Invited talk - MLOps at Naver Shopping (16:00, AI Studio, S7) | Video | |
Apr 11 | Model development and offline evaluation | DMLS Ch.6 | |
Apr 13 | Model development and offline evaluation | DMLS Ch.6 | Prepare MVP - Checklist (Apr 13) |
Apr 18 | Mid-term or not? (Midterm Period) | | |
Apr 20 | No Lecture (Midterm Period) | | |
Apr 25 | Deployment | DMLS Ch.7 | HW - Critique 2 (Apr 24) |
Apr 27 | Deployment | DMLS Ch.7 | |
May 2 | Project mid-review (5 min presentation) | [Show your MVP in Class (May 2)] | |
May 4 | Data distribution shifts and monitoring | DMLS Ch.8 | |
May 9 | Data distribution shifts and monitoring | DMLS Ch.8 | |
May 11 | Continual learning and test in production | DMLS Ch.9 | |
May 16 | Continual learning and test in production | DMLS Ch.9 | |
May 18 | Invited talk - Trustworthy federated learning (16:00, AI Studio, S7) | Video, DMLS Ch.11 | |
May 23 | Human side of machine learning | DMLS Ch.11 | HW - Critique 3 (May 22) |
May 25 | Invited talk - MLOps at MakinaRocks (16:00, AI Studio, S7) | Video | |
May 30 | Infrastructure and tooling for MLOps | DMLS Ch.10 | |
Jun 1 | Infrastructure and tooling for MLOps | DMLS Ch.10 | |
Jun 6 | No Lecture (National Holiday) | | |
Jun 8 | Project demo day (Poster & demo booth) | Reference presentation | |
Jun 14 | No Lecture (Finals week) | | |
Jun 16 | No Lecture (Finals week) | Team report (due: Jun 16), Reference reports Distill-style sample |