Data Science & Business Analytics Lab
“To support data-driven decision-making on the problems that arise in industrial settings, we develop core data science technologies (AI, ML, etc.).
We operate three small-scale research groups focusing on time series, natural language, and vision data.
By collaborating with researchers in other fields, we aim to utilize diverse forms of data in a systematic manner.
Through partnerships with various companies in the manufacturing, IT, and service sectors, we acquire domain knowledge.”
Natural Language Data Analysis: Opinion Mining / Log Anomaly Detection / Efficient Training of Language Models
Image/Vision Data Analysis: Image Anomaly Detection / Active Learning / Vision-Language Models
Research and Projects
Research: Hyeongwon Kang, Pilsung Kang*. (2024). Transformer-based Multivariate Time Series Anomaly Detection using Inter-Variable Attention Mechanism. Knowledge-Based Systems, 290, 111507.
Research: Jaehyuk Heo+, Seungwan Seo+, Pilsung Kang*. (2023). Exploring the Differences in Adversarial Robustness Between ViT- and CNN-Based Models Using Novel Metrics. Computer Vision and Image Understanding, 235, 103800. (+: equal contribution)
Project: Development of a Generative, Predictive, and Prescriptive Maintenance Architecture for Autonomous Production Systems, Basic Research Laboratory Program of the National Research Foundation of Korea (NRF).
Project: Development of Explainable Multimodal Anomaly Detection Methodologies and Applications in Industrial Data, NRF’s Mid-Career Researcher Program.
Project: Development of Multi-Modal Learning for Multi-modal Data, LG Innotek.
Project: Development of Evaluation Methodologies for Large Language Models in the Financial Domain, KakaoBank.
Project: Establishment of an Active Learning-Based Defective Data Labeling Framework for Maximizing Product Yield, Samsung Electronics.
(http://dsba.snu.ac.kr/apply)
The Data Science & Business Analytics Lab focuses on Industrial Data Analytics, developing machine learning and artificial intelligence methodologies for data-driven decision-making and applying them to real-world industrial settings.
The lab is organized into three research groups, each dedicated to a major data type generated in industrial fields: time-series, natural language, and vision data. Another key focus is multimodal AI, which integrates these data types. Research on time-series data covers anomaly detection, forecasting, and effective representation learning. For natural language data, key areas include dialogue system evaluation, log anomaly detection, and efficient information retrieval. For vision data, research focuses on image anomaly detection, active learning for efficient training, and detecting and defending against adversarial attacks.
The lab conducts numerous government-funded research projects, including the NRF Basic Research Laboratory program and the IITP SW Computing Industry Core Technology Development program. Through industry-academia collaborations with leading companies such as Samsung Electronics, LG Electronics, Hyundai Motor, and KakaoBank, the lab gains domain knowledge and also applies and validates its methodologies in practical industrial environments.