Final Project
The final project details are taking shape, so stay tuned!
Back to the Data Engineering Course - 2023 Spring (AI5308/AI4005)
First Step …
To get familiar with ML web application development, you can first refer to short clips such as:
- Very First Hands-on Exercise
- Build your first Streamlit app for non-python users: Streamlit, Github
- Build a Machine Learning Web App From Scratch: VS Code, Google Colab, Streamlit
- Nicholas Renotte Channel - Building a ML app in 15 minutes series
- I tried building a AutoML web app in 15 Minutes: He is using VS Code IDE for prototyping and Streamlit for serving his app.
- I tried to build a Machine Learning Python App in 15 Minutes: VS Code, scikit-Learn, mediapipe and Tkinter
- Siraj Raval Channel
- Building a trading bot with Chat-GPT: He uses Google Colab for prototyping and gets some hints from ChatGPT.
- I Built a Sports Betting Bot with ChatGPT: Colab, Chat-GPT, searching API with Github, dexsport Web3.
- Learn Machine Learning in 3 months: A 7-minute video that contains many useful resources.
Timeline
- Team formation and idea sketch (Project proposal): March 20 23:59 [Submit here]
- See this link to check your initial ideas for the final project.
- After submission, each team will receive an account for the GPU server.
- (Tentative) Project mid-review with 5 min presentation: May 2, 13:00-14:15
- Project demo day: Jun 8, 12:00-14:15 (Offline event, Pizza will be served, S7 bldg 1F)
- Final report submission: Jun 16, 23:59 [Sample reports] [Submit here]
Team formation and ideation
- Team formation and idea sketch - Ensure that each team submits only one form by March 20, 23:59.
- Team 1: 제갈윤, 이준명(20185141), 김선규, 정재홍
- Team 2: 권영후, 이성규, 이상윤, 채종욱
- Team 3: 송도윤, 권오현, 임민택, 한장혁
- Team 4: 고강빈, 김윤재, 김민준, 엄태준
- Team 5: 배성호, 전우석, 정현, 이준명(20205141), 정재익
- Team 6: 박혜빈, 전민수, 오정민, 이한별
- Team 7: 유성훈, 장석우, 최동규, 진치현
- Team 8: 김동우, 전지민, 정재우, 이선재
- Team 9: 최영인, 김준식, 류형석, Elise Souvannavoug
- Team 10: 이경로, 정시내, 김예원, 박세진
- Team 11: 김재우, 이호준, 진혜빈, 강제이
Resources to build your idea
Here are some resources to ideate your project.
- CS329S previous projects
- DEVIEW - NAVER developer’s conferences
- DEVIEW 2023
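- SNOW AI Filter : 나인듯 나같지 않은 나보다 이쁜 나 (SNOW AI Filter: Me, yet not quite me, and prettier than me) - see page 60 onwards
- 값비싼 Diffusion model을 받드는 저비용 MLOps (Low-cost MLOps for serving expensive diffusion models)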
- DEVIEW 2021
- DEVIEW 2020
- More academic material: you can also check the list of papers presented at AI/ML conferences
- Sundong’s project: you can also build upon these projects.
While implementing your idea, please think about the concepts we learned in class (e.g., better model design, sustainability, scalability, the cold-start problem, interpretability, data shift, fairness, data imbalance, etc.).
Provided GPU Server and Tools
- We have secured GPU sandboxes from GIST SCENT. After you submit your project proposal, our TA (Sanha Hwang) will share an account with each team and explain how to use the sandbox. Each team will get a virtual machine with a V100 GPU, and here is the Cookbook.
- You can also use Google Colab for prototyping your ML model. Google Colab Pro seems like a reasonable option ($9.99/month).
- PythonAnywhere is an online integrated development environment and web hosting service based on the Python programming language. Our research group also uses this service to host our web application. The Hacker or Web dev plans are reasonable options for hosting your project ($5 or $12/month).
- You can refer to these tools and libraries. These are some lists I have found, but there are many more.
- Web applications
- Streamlit lets you turn data scripts into shareable web apps in minutes, not weeks. It’s all Python, open-source, and free!
- FastAPI is a modern, fast (high-performance), web framework for building APIs with Python 3.7+ based on standard Python type hints.
- Flask is a micro web framework written in Python.
- Vue.js is an open-source model–view–viewmodel front end JavaScript framework for building user interfaces and single-page applications.
- React, Angular …
- Libraries for ML applications
- Basic tools for wrangling data: numpy, pandas, scipy
- Machine learning and deep learning tools: scikit-learn, PyTorch, TensorFlow, …
- imbalanced-learn for imbalanced data
- cleanlab for data-centric AI
- Ludwig: declarative machine learning framework
- To scale up the project further, you can refer to Ray, Kubernetes, etc.
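As a minimal sketch of how the tools above can fit together, the following keeps the model logic separate from the serving layer, so that the same function could later be called from a Streamlit script or from a FastAPI or Flask route. It uses only the Python standard library. The nearest-centroid model, the toy data, and the names `fit_centroids`, `predict`, and `handle_request` are illustrative assumptions, not part of the course material or of any of the listed libraries.

```python
import json
from statistics import mean


def fit_centroids(rows, labels):
    """Return {label: centroid}, where a centroid is the per-feature mean."""
    grouped = {}
    for row, label in zip(rows, labels):
        grouped.setdefault(label, []).append(row)
    return {
        label: [mean(column) for column in zip(*members)]
        for label, members in grouped.items()
    }


def predict(centroids, row):
    """Return the label whose centroid is closest to `row` (squared distance)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], row))


def handle_request(centroids, body):
    """Framework-agnostic handler: JSON string in, JSON string out.

    A web framework route (hypothetical here) would pass the request body
    to this function and return the result as the response.
    """
    payload = json.loads(body)
    label = predict(centroids, payload["features"])
    return json.dumps({"prediction": label})


# Toy training data (made up for illustration only).
TRAIN_X = [[1.0, 1.0], [1.2, 0.8], [4.0, 4.2], [3.8, 4.0]]
TRAIN_Y = ["low", "low", "high", "high"]
MODEL = fit_centroids(TRAIN_X, TRAIN_Y)

if __name__ == "__main__":
    print(handle_request(MODEL, '{"features": [3.9, 4.1]}'))
```

Keeping `handle_request` free of framework imports lets you prototype it in Colab and then wrap it with whichever serving tool your team chooses.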
Mid-review (May 2)
Information will be added.
Project Demo Day (Jun 8)
Information will be added.
Final Report (Jun 16)
Information will be added.
Grading
This is a rough outline of how your final project will be graded (65% of your total grade).
- External evaluation: 45% - Evaluated by staff and other teams
- Rubrics: Completeness, latency, sustainability, scalability, potential business impact, edge case and data shift handling, creativity, existence of ML algorithm, …
- Measured by running your app and by your presentation (30%)
- Demo showcase and 5-minute YouTube recording (CES style, showing your demo on your laptops. Location: AI studio @ S7 building; iMacs and a 65-inch screen are available; pizza will be served; starts at noon, Jun 8) - 30%
- Quality of your report, checking how comments are resolved - 15%
- Considering that we have 12 teams
- The top 3 teams are expected to get around 40 points on average.
- Average performers (the 6 teams in the middle) are expected to get around 25 points on average.
- The bottom 3 teams are expected to get around 10 points on average.
- Internal evaluation: 12% - Peer evaluation
- Team of size five: Share 30 points in total (e.g., 12, 9, 6, 3, 0 points)
- Team of size four: Share 24 points in total (e.g., 12, 8, 4, 0 points)
- On average, each student will receive 6 points, with a maximum of 12 points.
- Points should be determined by overall contribution, e.g., data curation, methodology, MLOps, front-end, back-end, paper writing, visualization, presentation, conceptualization, formal analysis, funding acquisition, project administration, legal review support, etc.
- Contribution to the class: 8%
- e.g., voluntarily sharing knowledge with everyone via recitation, providing templates for MLOps, helping staff build a final project website in the Distill blog style, etc.
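The point arithmetic in the grading rules above can be checked with a short sketch. It assumes that a team of n members shares 6 × n peer-evaluation points with at most 12 per member, as in the two examples given, and that the external-score tiers are averaged by team count. The function names are hypothetical, and the evenly spaced split is only one allocation that satisfies the rule, not a required one.

```python
def evenly_spaced_scores(team_size, max_points=12):
    """One possible split: scores fall in equal steps from max_points to 0."""
    if team_size < 2:
        raise ValueError("need at least two members to split points")
    step = max_points / (team_size - 1)
    return [max_points - step * i for i in range(team_size)]


def validate_peer_scores(scores, per_member_budget=6, max_points=12):
    """Check a proposed split and return the average points per member."""
    expected_total = per_member_budget * len(scores)
    if sum(scores) != expected_total:
        raise ValueError(f"scores must sum to {expected_total}, got {sum(scores)}")
    if any(s < 0 or s > max_points for s in scores):
        raise ValueError(f"each score must be between 0 and {max_points}")
    return sum(scores) / len(scores)


def expected_external_average(tiers):
    """Average external points over (number_of_teams, average_points) tiers."""
    total_teams = sum(count for count, _ in tiers)
    return sum(count * points for count, points in tiers) / total_teams


if __name__ == "__main__":
    print(evenly_spaced_scores(5))              # team of five
    print(evenly_spaced_scores(4))              # team of four
    print(validate_peer_scores([12, 8, 4, 0]))  # average per member
    print(expected_external_average([(3, 40), (6, 25), (3, 10)]))
```

Any split that passes `validate_peer_scores` is consistent with the stated totals, so teams are free to choose an uneven split that reflects actual contributions.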