Data Science & Business Analytics Lab

“To support data-driven decision-making for various issues faced in industrial settings, we develop data science core technologies (AI, ML, etc.).
We operate three small-scale research groups focusing on time series, natural language, and vision data.
By collaborating with researchers in other fields, we aim to utilize diverse forms of data in a systematic manner.
Through partnerships with various companies in the manufacturing, IT, and service sectors, we acquire domain knowledge.”

Lab Name
Data Science & Business Analytics Lab (데이터과학 및 비즈니스 애널리틱스 연구실)
Advisor
Prof. 강필성 (Pilsung Kang)
Lab Members
Current Students (6 Postdocs / 1 Ph.D. Student / 4 M.S. Students)
Main Research Areas
▪ Time-Series Data Analysis: Time-Series Representation Learning/Anomaly Detection/Forecasting
▪ Natural Language Data Analysis: Opinion Mining/Log Anomaly Detection/Efficient Training of Language Models
▪ Image/Vision Data Analysis: Image Anomaly Detection/Active Learning/Vision-Language Models
Representative Research or Projects
▪ Research: Jaehee Kim, Yukyung Lee, Pilsung Kang*. (2024). A Gradient Accumulation Method for Dense Retriever under Memory Constraint. NeurIPS.
▪ Research: Hyeongwon Kang, Pilsung Kang*. (2024). Transformer-based Multivariate Time Series Anomaly Detection using Inter-Variable Attention Mechanism. Knowledge-Based Systems, 290, 111507.
▪ Research: Jaehyuk Heo+, Seungwan Seo+, Pilsung Kang*. (2023). Exploring the Differences in Adversarial Robustness Between ViT- and CNN-Based Models Using Novel Metrics. Computer Vision and Image Understanding, 235, 103800. (+: Equally contributed)
▪ Project: Development of a Generative, Predictive, and Prescriptive Maintenance Architecture for Autonomous Production Systems, Basic Research Laboratory Program of the National Research Foundation of Korea (NRF).
▪ Project: Development of Explainable Multimodal Anomaly Detection Methodologies and Applications in Industrial Data, NRF’s Mid-Career Researcher Program.
▪ Project: Development of Multi-Modal Learning for Multi-modal Data, LG Innotek.
▪ Project: Development of Evaluation Methodologies for Large Language Models in the Financial Domain, KakaoBank.
▪ Project: Establishment of an Active Learning-Based Defective Data Labeling Framework for Maximizing Product Yield, Samsung Electronics.
Relevant Courses
Graduate: Data Mining Techniques (406.546 001)
Contact Information
02-880-7360
Location
Building 39, Room 411
Main Career Paths After Graduation
Experts in Industrial Data Analytics for Academia and Industry
Application Inquiries
For the new-student selection procedure, please refer to the laboratory’s website.
(http://dsba.snu.ac.kr/apply)

The Data Science & Business Analytics Lab focuses on Industrial Data Analytics, developing machine learning and artificial intelligence methodologies for data-driven decision-making and applying them to real-world industrial settings.

The lab is organized into three research groups, each dedicated to a major data type generated in industrial settings: time-series, natural language, and vision data. Multi-modal AI, which integrates these data types, is another key research focus. For time-series data, major research topics include anomaly detection, forecasting, and representation learning. For natural language data, key areas include dialogue system evaluation, log anomaly detection, and efficient information retrieval. For vision data, research focuses on image anomaly detection, active learning for efficient training, and detecting and defending against adversarial attacks.

The lab conducts numerous government-funded research projects, including the NRF Basic Research Laboratory program and the IITP SW Computing Industry Core Technology Development program. Through industry-academia collaborations with leading companies such as Samsung Electronics, LG Electronics, Hyundai Motor, and KakaoBank, the lab also gains domain knowledge and applies and validates its methodologies in practical industrial environments.