Final Project | Sundong Kim

Back to the Data Engineering Course - 2023 Spring (AI5308/AI4005)

Showcasing project outcomes

Team 1: Sudadui - Math Recommender, [Demo] [Report]
Team 2: LoLhelper, [Demo] [Report]
Team 3: Data Collecting Platform for ARC Problems, [Demo] [Report]
Team 4: DrawnTalk, [Demo] [Report]
Team 5: DogMoji, [Demo] [Report]
Team 6: AI Drive Assistant, [Demo] [Report]
Team 7: Safety Distance Estimator, [Demo] [Report]
Team 8: AI Color Grader, [Demo] [Report]
Team 9: Apple Sweetness Finder, [Demo] [Report]
Team 10: Wine Picker, Wine Recommender System Based on Similarity, [Demo] [Report]
Team 11: Catch Market, [Demo] [Report]

Timeline

Team formation and idea sketch (Project proposal): March 20, 23:59 [Submit here]
First review (7 min presentation): May 1, 23:59 [Submit here]
Second review (Bring your demo, interaction with classmates): May 18
Third review (Try others’ demos): May 30
Final preparation: Jun 8, 13:00-14:15 (S7 bldg 1F)
Demo day: Jun 13, 12:00-16:00 (Offline event, Pizza will be served, S7 bldg 1F)
Final report submission: Jun 16, 23:59 [Sample reports] [Submit here]

First step

To get familiar with ML web application development, you can first refer to short clips such as:

Very First Hands-on Exercise
- Build your first Streamlit app for non-python users: Streamlit, Github
- Build a Machine Learning Web App From Scratch: VS Code, Google Colab, Streamlit
Nicholas Renotte Channel - Building a ML app in 15 minutes series
- I tried building a AutoML web app in 15 Minutes: He is using VS Code IDE for prototyping and Streamlit for serving his app.
- I tried to build a Machine Learning Python App in 15 Minutes: VS Code, scikit-Learn, mediapipe and Tkinter
Siraj Raval Channel
- Building a trading bot with Chat-GPT: He uses Google Colab for prototyping and gets some hints from Chat-GPT.
- I Built a Sports Betting Bot with ChatGPT: Colab, Chat-GPT, searching API with Github, dexsport Web3.
- Learn Machine Learning in 3 months: This 7-minute video contains lots of useful resources.

Team formation and ideation

Team formation and idea sketch - Ensure that each team submits only one form by March 20, 23:59.
- Team 1: 제갈윤, 이준명(18), 김선규, 정재홍
- Team 2: 권영후, 이성규, 이상윤, 채종욱
- Team 3: 송도윤, 권오현, 임민택, 한장혁
- Team 4: 고강빈, 김윤재, 엄태준, ~~김민준~~
- Team 5: 배성호, 전우석, 정현, 이준명(20), 정재익
- Team 6: 박혜빈, 전민수, 오정민, 이한별
- Team 7: 유성훈, 장석우, 최동규, 진치현
- Team 8: 전지민, 정재우, 이선재, ~~김동우~~
- Team 9: 최영인, 김준식, 류형석, Elise Souvannavoug
- Team 10: 이경로, 정시내, 김예원, 박세진
- Team 11: 김재우, 이호준, 진혜빈, 강제이
Check how your classmates sketch their ideas and what kinds of tools are available.

Resources to build your idea

Here are some resources to ideate your project.

CS329S previous projects
FSDL previous projects
DEVIEW - NAVER developer’s conferences
More academic material: you can also check the list of papers presented at AI/ML conferences
Sundong’s project: you can also build upon these projects.

While implementing your idea, please think about the concepts we learned in class (e.g., better model design, sustainability, scalability, the cold-start problem, interpretability, data shift, fairness, data imbalance, etc.).

Provided GPU Server and Tools

We have secured GPU sandboxes from GIST SCENT. Once you submit your project proposal, our TA (Sanha Hwang) will share an account with each team and explain how to use the sandbox. Each team will get a virtual machine with a V100 GPU; here is the Cookbook. For technical difficulties, please consult researcher 김세근 at GIST SCENT (locamania@gist.ac.kr, T. 062-715-6360).
You can also use Google Colab for prototyping your ML model. Google Colab Pro seems to be a reasonable option ($9.99/month).
PythonAnywhere is an online integrated development environment and web hosting service based on the Python programming language. Our research group also uses this service to host our web application. The Hacker and Web dev plans are reasonable options for hosting your project ($5 or $12/month).
You can refer to these tools and libraries. These are some that I have found, but there are many more.
- Web applications
  - Streamlit lets you turn data scripts into shareable web apps in minutes, not weeks. It’s all Python, open-source, and free!
  - FastAPI is a modern, fast (high-performance), web framework for building APIs with Python 3.7+ based on standard Python type hints.
  - Flask is a micro web framework written in Python.
  - Vue.js is an open-source model–view–viewmodel front end JavaScript framework for building user interfaces and single-page applications.
  - React, Angular …
- Libraries for ML applications
  - Basic tools for wrangling data: numpy, pandas, scipy
  - Machine learning and deep learning tools: scikit-learn, PyTorch, Tensorflow, …
  - imbalanced-learn for imbalanced data
  - cleanlab for data-centric AI
  - Ludwig: declarative machine learning framework
  - To scale-up the project further, you can refer to Ray, Kubernetes, etc.
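As one concrete illustration of the data imbalance concern that imbalanced-learn addresses, here is a minimal sketch of random oversampling written with the standard library only. The helper name `random_oversample` and the toy data are hypothetical; in a real project you would more likely reach for a maintained implementation such as imbalanced-learn's `RandomOverSampler`.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    is as large as the largest one. Hypothetical helper for illustration."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # All original samples that belong to this class.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Toy data (made up): six negatives and two positives.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.9], [1.0]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))  # both classes now have six samples
```

Oversampling should be applied to the training split only; duplicating samples before splitting would leak copies into the evaluation data.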

First review

Date: May 2, 2023
After preparing slides and demo, submit the form.
- Ensure that each team submits a form by May 1, 23:59.
- Things to prepare: 3 slides + Demo of an MVP
- On the review day, each team will present three slides and show their demo. You will be presenting in class, so do not prepare a video.
- Prepare your pitch in English.
Presentation 🗣️ [x minutes]: We just want 3 slides:
- Slide 1 [Pitch]: What’s the problem you’re working on and why? What are the key objectives?
- Slide 2 [System & Challenges]: A layout of the system you’re building (think system diagram). Walk us through your system, and tell us what you’ve built.
- Slide 3 [Future Work]: What more needs to be done? We want to know the key challenges you anticipate.
A demo of an MVP [y minutes; x+y < 7]: We just want to see you’ve built something that resembles what you promised. The emphasis here is on:
- convincing us that you’ve taken a first pass on all the parts of the system you’re building
- convincing us that you have a plan to take the application from MVP to final-demo-ready. Having an MVP at this stage will allow us to give you appropriate feedback.
Reference
- Checklist for preparing a Minimum Viable Product (MVP) (see pages 8-10)
- FSDL previous projects (Many videos available)
Achievements
- Team 1 — Slides, App (Hate speech detection)
- Team 2 — Slides, App (Recsys for League of Legends)
- Team 3 — Slides, App (Abstraction and reasoning)
- Team 4 — Slides, App (Chatbot for kids)
- Team 5 — Slides, App (Pet emoji)
- Team 6 — Slides, App (Monitoring driver behavior)
- Team 7 — Slides, App (Safe driving distance)
- Team 8 — Slides, App (Instagram color transfer)
- Team 9 — Slides, App (Sweet Apple finder)
- Team 10 — Slides, App (Wine picker)
- Team 11 — Slides, App (Style finder)

Second review

Date: May 18, 2023
Things to prepare: Bring your updated demo, discuss with colleagues 💬

To encourage more discussion among students and create an interactive session, I would like to divide the class into six groups, each consisting of approximately 6-8 students (4 teams of 1-2 members).

Group 1: 제갈윤, 권영후, 송도윤, 전지민, 김준식, 정시내, 이호준
Group 2: 이준명(18), 이성규, 권오현, Elise, 박세진, 배성호, 강제이, 유성훈
Group 3: 김선규, 이상윤, 임민택, 정현, 오정민, 최동규, 류형석
Group 4: 정재홍, 채종욱, 한장혁, 엄태준, 이준명(20), 이한별, 진치현
Group 5: 최영인, 이경로, 김재우, 고강빈, 박혜빈, 정재익
Group 6: 김윤재, 전우석, 전민수, 장석우, 정재우, 김예원, 진혜빈, 이선재
Time allocation: Allocate time for each of the following activities:
- Introduction and instructions (5 minutes)
- Small group discussions (70 minutes)

Guidelines for students and TAs:
- For students:
  - Prepare a brief overview of your project, highlighting the key aspects, objectives, and results.
  - Discuss your project and provide feedback to other students in your group.
  - Stay engaged and actively participate in the discussion.
- For TAs:
  - Facilitate the discussions of each small group, ensure everyone has an opportunity to speak, and manage time.
  - Take notes on the key points and interesting insights that arise during the discussions.
  - Each TA will take charge of two groups:
    - Sanha Hwang: Group 1 and 4
    - Hongyiel Suh: Group 2 and 5
    - Sungkyu Yang: Group 3 and 6
Session structure
- Introduction and instructions (5 minutes): Begin the session by explaining its purpose and structure.
- Small group discussions (70 minutes): Each small group discusses its projects within the allocated time. Ask questions, provide feedback, and share your own experiences. Each student should present their work and participate in discussions.

Third review

Date: May 30, 2023, 13:00-14:15
Things to prepare: Play with demos 🎮 and provide feedback 📝

In this class, we will spend more time playing with others’ demos and giving each other feedback. Similar to the second review session, I will divide the class into ten groups, each consisting of approximately 4-5 students.

Group 1: 채종욱, 정재익, 진치현, 진혜빈 (Team 2, 5, 7, 11)
Group 2: 권영후, 전민수, 최영인, 이호준 (2, 6, 9, 11)
Group 3: 이준명(18), 이상윤, 송도윤, 이한별, 이경로 (1, 2, 3, 6, 10)
Group 4: 한장혁, 정현, 전지민, 김재우 (3, 5, 8, 11)
Group 5: 엄태준, 오정민, 김준식, 박세진 (4, 6, 9, 10)
Group 6: 고강빈, 이준명(20), 최동규, 정시내 (4, 5, 7, 10)
Group 7: 김선규, 장석우, 이선재, 류형석, 강제이 (1, 7, 8, 9, 11)
Group 8: 이성규, 김윤재, 전우석, 유성훈, 김예원 (2, 4, 5, 7, 10)
Group 9: 제갈윤, 권오현, 박혜빈, 정재우 (1, 3, 6, 8)
Group 10: 정재홍, 임민택, 배성호, Elise (1, 3, 5, 9)
Session structure:
- Introduction and instructions (5 minutes)
- Play with others’ demos and provide feedback (70 minutes)
Guidelines for students and TAs:
- For students:
  - Let others play with your demos and receive feedback (Bring pen & paper).
  - Each team: 15 minutes
- For TAs:
  - Facilitate the discussions of each small group, ensure everyone has an opportunity to share their demo, and manage time.

Final preparation

When: June 8, 2023, 13:00-14:15
Where: AI Graduate School, 1st Floor
Things to prepare: Set up your demos ⚙️ and receive critique on your final report draft 🔍
Instructions for both students and TAs:
- Students:
  - Plan the arrangement of your demo with the TAs; consider using the TV screen and iMacs.
  - Discuss how to make your booth unique during the demo day.
  - Consult with the instructor and seek advice on your final report, bring your draft (5 minutes each, starting with Team 1)
- TAs:
  - Aid in the demo setup.
  - Determine what needs to be prepared for the demonstration day, and sequence them accordingly (Purchase).
  - Distribute emails and advocate for the demo day amongst members of the AI Graduate School and EECS undergraduates.
  - Decide how many pizzas and snacks to order based on the number of participants and prepare them.

Demo Day

Get ready for our Demo Day - a convention style, interactive booth experience!💥 Showcase your project, learn from your peers, and bring your ideas to life! 🚀

Logistics
- Date and Time: June 13, from 12:00 to 16:00
- Venue: AI Grad School (1F)
  - Equipped with projectors, three 65-inch TVs, and dozens of 27-inch iMacs.
- Lunch will be provided.
What you should prepare
- Your demo product
- (Optional) Videos you would like to stream using TV and iMacs. e.g., teaser video up to 1 minute.
What you will do
- Introduce your demo at the booth (at least one team member should stay at your booth at all times).
- Visit other teams’ booths and experience their demos.
- Attract attendees and promote your demo to raise virtual money.
- Provide suggestions and note down feedback from others.
What each member will get
- Virtual money (or stickers) to invest in other teams’ products.
- People outside of AI5308 will also receive virtual money (or stickers) to invest in other teams’ products.
Ingredients of a live demo
- Introduce yourself: Spend a few seconds telling customers who you are.
- Motivate your problem: Describe the goal of your application, describe why your team was interested in this problem. Set context and expectations.
- Describe the technology: Impress us with what you’ve accomplished. Talk about the technical depth of your system, and show us that you thought through the design.
Live demo: The live demo is done in real-time, but every single aspect of the demo is scripted ahead of time.
- You should know exactly how you’ll be walking through your application.
- Think about what you want to highlight. You should script what you’ll be saying and where you’ll be drawing the attention of viewers.
- Plan your pauses, and think about places to ask the audience for input so that they feel involved.
- Be aware of balancing two needs: 1) showing off your application in the best possible light, 2) making the audience feel involved so that the demo doesn’t feel scripted.

The top 3 teams will each deliver a 5-minute speech at the end of the day (from 15:00 to 15:20). Slides are not required.

Timetable

Time	Agenda
12:00 - 13:00	Social lunch and demo setup
13:05 - 15:00	Booth management and exploration of other team’s products
15:00 - 15:20	Top three teams present their projects
15:20 - 16:00	Event closing and wrap-up

Final Report

The last part of your grade is based on a report that summarizes your work. The report will be written in the style of a blog post (good examples: here, here). Upload your final report to your personal webpage and provide us with the link to it. It will be published on the course website after the conclusion of the course.

Due: Jun 16, 23:59 - Submit here

Deliverables
- Your report should summarize your project work and learning. It should be at most 2,500 words long.
- Please see Grading criteria below for the required components of the writeup.
- You can submit the final report by providing a link to your blog post if you write your report on your own blog.
- Or, you can submit your final report as a zip file (titled team_name.zip) that contains two components:
  - Code, data, and associated materials used for the project. You can submit this component directly or include a link to a GitHub repo where it’s hosted. Please contact us if this information is proprietary.
  - An HTML folder:
    - index.html: an HTML file containing your report
    - All associated images and support files referenced by the HTML file. You should include all supporting documents so that the HTML file, when opened, renders properly on any machine with a web browser and an Internet connection.
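The zip layout described above can be assembled with a short script. The following is a minimal sketch using only the Python standard library; the function name `package_report`, the archive folder names `code/` and `html/`, and the demo file names are assumptions for illustration, not course requirements.

```python
import tempfile
import zipfile
from pathlib import Path


def package_report(team_name, code_dir, html_dir, out_dir="."):
    """Bundle project materials and the HTML report into <team_name>.zip.

    The archive folder names "code/" and "html/" are assumptions made for
    this sketch; the guideline only asks for index.html plus its support files.
    """
    code_dir, html_dir = Path(code_dir), Path(html_dir)
    if not (html_dir / "index.html").is_file():
        raise FileNotFoundError("html_dir must contain index.html")

    zip_path = Path(out_dir) / f"{team_name}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for prefix, root in (("code", code_dir), ("html", html_dir)):
            for path in sorted(root.rglob("*")):
                if path.is_file():
                    # Store paths relative to each root, under its prefix.
                    arcname = Path(prefix) / path.relative_to(root)
                    zf.write(path, arcname.as_posix())
    return zip_path


# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "src").mkdir()
    (tmp / "src" / "app.py").write_text("print('hello')\n")
    (tmp / "report" / "images").mkdir(parents=True)
    (tmp / "report" / "index.html").write_text("<html><body>Report</body></html>\n")
    (tmp / "report" / "images" / "diagram.png").write_bytes(b"placeholder")

    archive = package_report("team_name", tmp / "src", tmp / "report", out_dir=tmp)
    with zipfile.ZipFile(archive) as zf:
        archive_name = archive.name
        names = sorted(zf.namelist())

print(archive_name, names)
```

Keeping image references in `index.html` relative (for example `images/diagram.png`) helps the report render after the archive is unpacked on another machine.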

Components

Here are the aspects you should include in your final report (with suggested lengths). See the sample reports that follow this structure. You can decide the length of each part, but make sure that the final report is no longer than 2,500 words.

Team information
- List each member’s name with a link to their website.
Problem definition (Suggested: 250 words)
- Explain the problem your team is solving, discuss related work, and propose and justify your solution.
System design (500 words)
- Detail the key components of the system, including, but not limited to, data pipelines, modeling, deployment, and UX.
- If applicable, include a diagram to illustrate the interplay between system components. Excalidraw is pretty awesome for sketches.
- Explain and justify central design decisions, including which technologies your team chose to support the system.
Machine learning component (300 words)
- For projects that require ML, explain the model that powers the application, the data it’s trained on, and the iterative development of that model.
System evaluation (500 words)
- Describe your efforts to validate and evaluate your system’s performance as well as its limitations.
- Include the results and present them in a clear and informative manner.
Application demonstration (300 words)
- Include some visuals (screenshot, embedded video link) showcasing the main feature set of the application.
- Include brief justifications of core interface decisions (e.g., why did you feel that a Web Application interface would be superior to an API interface given the context of your problem?).
- Provide instructions on how to use the application.
Reflection (400 words)
- Provide a comprehensive post-mortem on the project, including, but not limited to, answers to the following:
  - What worked? (In terms of technology, design decisions, team dynamics, etc.).
  - What didn’t work? What would you improve next time?
  - If given unlimited time and resources, what would you add to your application?
  - If you have plans to move forward with this application, what are they? (We’re excited to see how you use the tools you’ve learned in this class to pursue topics you’re excited about!)
Broader Impacts (250 words)
- Discuss the intended uses of your application, possible unintended uses, and the associated harms. Reflect on the design decisions your team made to mitigate harms associated with unintended use of the system.
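The suggested lengths above add up to exactly the 2,500-word limit, which a short script can confirm while also checking a draft. This is a rough sketch: the helper names are hypothetical, and counting whitespace-separated tokens is only one assumed definition of a "word".

```python
# Suggested lengths copied from the component list; "Team information"
# has no suggested length, so it is left out of the budget.
SUGGESTED_WORDS = {
    "Problem definition": 250,
    "System design": 500,
    "Machine learning component": 300,
    "System evaluation": 500,
    "Application demonstration": 300,
    "Reflection": 400,
    "Broader Impacts": 250,
}
WORD_LIMIT = 2500


def count_words(text):
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def check_report(sections):
    """Return (total_words, within_limit, sections_over_their_suggestion)."""
    counts = {name: count_words(text) for name, text in sections.items()}
    total = sum(counts.values())
    over = [name for name, n in counts.items()
            if n > SUGGESTED_WORDS.get(name, WORD_LIMIT)]
    return total, total <= WORD_LIMIT, over


# Placeholder draft: two sections filled with dummy words.
draft = {
    "Problem definition": "word " * 240,
    "System design": "word " * 520,
}

print(sum(SUGGESTED_WORDS.values()))  # 2500
print(check_report(draft))            # (760, True, ['System design'])
```

Because the suggested lengths are only suggestions, a section flagged as over its suggestion is fine as long as the total stays within the limit.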

Peer Evaluation

After completing the report, discuss how to distribute the given points within the team. You will be asked to submit a 1-page PDF file with your peer evaluation results.

(Must write student ID, name, with points, and the detailed reason)

Team of size five: Share 30 points in total (e.g., 8, 7, 6, 5, 4 points)
Team of size four: Share 24 points in total
Team of size three: Share 18 points in total
On average, each student will receive 6 points; no student can receive more than 12 points.
Should be determined by overall contribution: e.g., Data curation, Methodology, MLOps, Front-end, Backend, Paper writing, Visualization, Presentation, Conceptualization, Formal analysis, Funding acquisition, Project Administration, Legal review support, etc.
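The point-sharing rules above reduce to a small arithmetic check that a team can run before submitting its PDF. The sketch below is illustrative only: the function name is hypothetical, and rejecting negative points is an assumption that the rules do not state explicitly.

```python
POINTS_PER_MEMBER = 6        # 18, 24, or 30 points in total for teams of 3, 4, or 5
MAX_POINTS = 12              # cap per student
ALLOWED_TEAM_SIZES = (3, 4, 5)


def check_peer_points(points):
    """Return a list of rule violations; an empty list means the split is valid."""
    problems = []
    size = len(points)
    if size not in ALLOWED_TEAM_SIZES:
        problems.append(f"team size {size} is not covered by the rules")

    expected_total = POINTS_PER_MEMBER * size
    if sum(points) != expected_total:
        problems.append(f"total is {sum(points)}, expected {expected_total}")

    for p in points:
        if p > MAX_POINTS:
            problems.append(f"{p} exceeds the maximum of {MAX_POINTS}")
        if p < 0:  # assumption: negative points are not allowed
            problems.append(f"{p} is negative")
    return problems


print(check_peer_points([8, 7, 6, 5, 4]))  # []
print(check_peer_points([13, 5, 6, 0]))    # one violation: 13 exceeds the cap
```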

Grading

This is a rough outline of how your final project will be graded (65% of your total grade).

External evaluation: 50% - Evaluated by staff & other teams
- Rubrics: Completeness, latency, sustainability, scalability, potential business impact, edge case and data shift handling, creativity, existence of ML algorithm, participation, presentation, …
  - First review — 5% (Good: 4%, Excellent: 5%)
  - Second review — 5% (Absent: 3%, Absent with excuse: 4%, Attend: 5%)
  - Third review — 5% (Absent: 3%, Absent with excuse: 4%, Attend: 5%)
  - Showcasing demo in your booth (CES style, showing your demo with your laptops; Location: AI studio @ S7 building, iMacs and 65-inch screen available, pizza will be served, starts at noon, Jun 13) — 20%
  - Quality of your report, checking how comments are resolved — 15%
- Considering that we have 12 teams
  - Top-3 teams will be expected to get around 45 points on average.
  - Average performers (6 teams in the middle) will be expected to get around 37.5 points on average.
  - Bottom-3 teams will be expected to get around 30 points on average.
Internal evaluation: 12% - Peer evaluation
- Team of size five: Share 30 points in total (e.g., 12, 9, 6, 3, 0 points)
- Team of size four: Share 24 points in total (e.g., 12, 8, 4, 0 points)
- On average, each student will receive 6 points, with a maximum of 12 points.
- Should be determined by overall contribution: e.g., Data curation, Methodology, MLOps, Front-end, Backend, Paper writing, Visualization, Presentation, Conceptualization, Formal analysis, Funding acquisition, Project Administration, Legal review support, etc.
Contribution to the class: 3%
- e.g., voluntarily sharing knowledge with everyone via recitation, providing templates for MLOps, supporting staff in building a final project website in the Distill blog style, etc.
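The weights in this section can be checked with a few lines of arithmetic. The sketch below uses only numbers stated above; the variable names and the shortened component labels are made up for illustration.

```python
# Components of the external evaluation, in percentage points.
EXTERNAL = {
    "First review": 5,
    "Second review": 5,
    "Third review": 5,
    "Demo booth": 20,
    "Report": 15,
}

# Top-level weights of the final project grade.
WEIGHTS = {
    "External evaluation": 50,
    "Internal evaluation": 12,
    "Contribution to the class": 3,
}

# (number of teams, expected external points) for each tier, as stated above.
TIERS = [(3, 45.0), (6, 37.5), (3, 30.0)]

external_total = sum(EXTERNAL.values())
project_total = sum(WEIGHTS.values())
team_count = sum(n for n, _ in TIERS)
expected_mean = sum(n * pts for n, pts in TIERS) / team_count

print(external_total)             # 50
print(project_total)              # 65
print(team_count, expected_mean)  # 12 37.5
```

Under these tiers the expected external score averaged over all teams is 37.5 points, the same as the middle tier.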