Data Engineering

Spring 2023

  • Code: AI5308 / AI4005
  • Schedule: Tue/Thu 1:00pm-2:30pm
  • Location: GIST College Building A, Room 227 (N4)
  • Instructor: Sundong Kim
  • TAs: Sanha Hwang, Sungkyu Yang, Hongyiel Suh
  • Contact: Please ask all course-related questions on Discord, where announcements are also posted. Office hours are as follows.
    • Sundong: Tue 2:30pm-3:30pm, Discord or GIST AI Graduate School (S7) Room 204
    • Sanha: TBD, Discord or GIST AI Graduate School (S7) Room 204
    • Sungkyu: TBD, Discord or GIST AI Graduate School (S7) Room 202
    • Hongyiel: TBD, Discord or GIST AI Graduate School (S7) Room 202
  • Virtual Classrooms:
  • Class Overview, Logistics and Grading: See this page
  • Final Project: See this page

Textbook & References


Notice


Tentative Schedule

Below is the tentative schedule of the course. Overall, the course will follow the DMLS book, which is an up-to-date version of the CS329S lecture notes by Chip Huyen.

| Date | Description | Materials | Homeworks |
| Feb 28 | Introduction | Sign-up form | |
| Mar 2 | Overview of Machine Learning Systems | DMLS Ch.1 | |
| Mar 7 | Introduction to Machine Learning Systems Design | DMLS Ch.2, Slides | |
| Mar 9 | Class Logistics, Homework Releases | Slides | |
| Mar 14 | Project Announcement | Final project, Slides | |
| Mar 16 | Introduction to Machine Learning Systems Design | DMLS Ch.2, Slides | HW - Webpage (Mar 17) |
| Mar 21 | Data Engineering 101 | DMLS Ch.3, Slides | Team formation (Mar 20) |
| Mar 23 | Data Engineering 101, Training data | DMLS Ch.3-4, Slides | |
| Mar 28 | Training data | DMLS Ch.4, Slides | HW - Critique 1 (Mar 27) |
| Mar 30 | Feature engineering, Invited talk - e-CLIP model at Naver Shopping (16:00, AI Studio, S7) | Video, DMLS Ch.5 | |
| Apr 4 | Feature engineering | DMLS Ch.5 | |
| Apr 6 | Invited talk - MLOps at Naver Shopping (16:00, AI Studio, S7) | Video | |
| Apr 11 | Model development and offline evaluation | DMLS Ch.6 | |
| Apr 13 | Model development and offline evaluation | DMLS Ch.6 | Prepare MVP - Checklist (Apr 13) |
| Apr 18 | Mid-term or not? (Midterm Period) | | |
| Apr 20 | No Lecture (Midterm Period) | | |
| Apr 25 | Deployment | DMLS Ch.7 | HW - Critique 2 (Apr 24) |
| Apr 27 | Deployment | DMLS Ch.7 | |
| May 2 | Project mid-review (5 min presentation) | | [Show your MVP in Class (May 2)] |
| May 4 | Data distribution shifts and monitoring | DMLS Ch.8 | |
| May 9 | Data distribution shifts and monitoring | DMLS Ch.8 | |
| May 11 | Continual learning and test in production | DMLS Ch.9 | |
| May 16 | Continual learning and test in production | DMLS Ch.9 | |
| May 18 | Invited talk - Trustworthy federated learning (16:00, AI Studio, S7) | Video, DMLS Ch.11 | |
| May 23 | Human side of machine learning | DMLS Ch.11 | HW - Critique 3 (May 22) |
| May 25 | Invited talk - MLOps at MakinaRocks (16:00, AI Studio, S7) | Video | |
| May 30 | Infrastructure and tooling for MLOps | DMLS Ch.10 | |
| Jun 1 | Infrastructure and tooling for MLOps | DMLS Ch.10 | |
| Jun 6 | No Lecture (National Holiday) | | |
| Jun 8 | Project demo day (Poster & demo booth) | Reference presentation | |
| Jun 14 | No Lecture (Finals week) | | |
| Jun 16 | No Lecture (Finals week) | Team report (due: Jun 16), Reference reports | Distill-style sample |