Guijin Son

Co-Founder @ OneLine AI

spthsrbwls123@yonsei.ac.kr

About

Hello. I am Guijin Son, co-founder at OneLine AI and lead of HAE-RAE, an open-source research group focused on Korean NLP. I am interested in AI for Science, with a focus on evaluation and reasoning with language models. My current goal is to build stronger reasoning models and the measurements that show real progress.

I am also currently interested in multimodal reasoning and agentic systems. Past projects include analyzing Korean knowledge and professional benchmarks, evaluating reward models, and exploring financial applications of LLMs.

I also teach and mentor: lectures at Fast Campus and SSAFY, curriculum work with Codeit and Code States, and mentorship at Upstage.

News
Nov 2025     Our KMMLU-Pro and Multi-LMentry papers were accepted to EMNLP 2025.
Jul 2025     Our Linguistic Generalizability (Oral) and FinKRX (Industry) were accepted to ACL 2025.
Jul 2025     Our Robustness of Reward Models was accepted to ICML 2025.
Apr 2025     Our BiGGen Bench and KMMLU were accepted to NAACL 2025.
Apr 2025     Our BiGGen Bench was selected for the Best Paper Award at NAACL 2025.
Aug 2024     Our Multitask Inference was accepted to ACL 2024.
May 2024     Our HAE-RAE Bench was accepted to LREC-COLING 2024.
Education

Yonsei University, Underwood International College (UIC) 2020 – 2025

B.S., Economics.

Publications

Preprint

Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought

Guijin Son, Donghun Yang, Hitesh Laxmichand Patel, Amit Agarwal, Hyunwoo Ko, Chanuk Lim, Srikant Panda, Minhyuk Kim, Nikunj Drolia, Dasol Choi, Kyong-Ha Lee, Youngjae Yu

Preprint Under Review

When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research

Guijin Son, Jiwoo Hong, Honglu Fan, Heejeong Nam, Hyunwoo Ko, Seungwon Lim, Jinyeop Song, Jinha Choi, Gonçalo Paulo, Youngjae Yu, Stella Biderman

Preprint Under Review

MM-Eval: A Multilingual Meta-Evaluation Benchmark for LLM-as-a-Judge and Reward Models

Guijin Son*, Dongkeun Yoon*, Juyoung Suk, Javier Aula-Blasco, Mano Aslan, Vu Trong Kim, Shayekh Bin Islam, Jaume Prats-Cristià, Lucía Tormo-Bañuelos, Seungone Kim

Preprint Under Review

2025

From KMMLU-Redux to KMMLU-Pro: A Professional Korean Benchmark Suite for LLM Evaluation

Seokhee Hong, Sunkyoung Kim, Guijin Son, Soyeon Kim, Yeonjung Hong, Jinsik Lee

EMNLP 2025

Multi-LMentry: Can Multilingual LLMs Solve Elementary Tasks Across Languages?

Luca Moroni, Javier Aula-Blasco, Simone Conia, Irene Baucells, Naiara Perez, Silvia Paniagua Suárez, Anna Sallés, Malte Ostendorff, Júlia Falcão, Guijin Son, Aitor Gonzalez-Agirre, Roberto Navigli, Marta Villegas

EMNLP 2025

On the Robustness of Reward Models for Language Model Alignment

Jiwoo Hong, Noah Lee, Eunki Kim, Guijin Son, Woojin Chung, Aman Gupta, Shao Tang, James Thorne

ICML 2025

FINKRX: Establishing Best Practices for Korean Financial NLP

Guijin Son, Hyunwoo Ko, Hanearl Jung, Chami Hwang

ACL 2025 Industry Track

Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning

Guijin Son, Jiwoo Hong, Hyunwoo Ko, James Thorne

ACL 2025 (Oral)

KMMLU: Measuring Massive Multitask Language Understanding in Korean

Guijin Son, Hanwool Lee, Sungdong Kim, Seungone Kim, Niklas Muennighoff, Taekyoon Choi, Cheonbok Park, Kang Min Yoo, Stella Biderman

NAACL 2025

The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

Seungone Kim, Juyoung Suk, Ji Yong Cho, Shayne Longpre, Chaeeun Kim, Dongkeun Yoon, Guijin Son, Yejin Cho, Sheikh Shafayat, Jinheon Baek, Sue Hyun Park, Hyeonbin Hwang, Jinkyung Jo, Hyowon Cho, Haebin Shin, Seongyun Lee, Hanseok Oh, Noah Lee, Namgyu Ho, Se June Joo, Miyoung Ko, Yoonjoo Lee, Hyungjoo Chae, Jamin Shin, Joel Jang, Seonghyeon Ye, Bill Yuchen Lin, Sean Welleck, Graham Neubig, Moontae Lee, Kyungjae Lee, Minjoon Seo

NAACL 2025 (Best Paper)

2024

Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once?

Guijin Son, Sangwon Baek, Sangdae Nam, Ilgyun Jeong, Seungone Kim

ACL 2024

HAE-RAE Bench: Evaluation of Korean Knowledge in Language Models

Guijin Son, Hanwool Lee, Suwan Kim, Huiseo Kim, Jaecheol Lee, Je Won Yeom, Jihyu Jung, Jung Woo Kim, Songseong Kim

LREC-COLING 2024

( * indicates equal contribution )

Vitæ

Full CV in PDF.

  • OneLine AI Mar 2023 - Present
    Co-founder & AI Researcher
    Built Korean benchmarks, finance-specific LLMs, and reasoning-focused LLMs
  • XFactLab, KAIST Dec 2024 - Mar 2025
    Research Intern (Mentor: Jiwoo Hong)
    Worked on multilingual mathematical reasoning
  • Qraft Technologies Jul 2022 - Feb 2023
    Researcher
    LLM applications for finance; built aspect-based sentiment analysis for stocks
  • FuturePlay Jan 2022 - Jul 2022
    Data Analyst
    Provided market intelligence for early- to mid-stage deep tech; analyzed global VC investment trends
  • Qraft Technologies Jun 2021 - Sep 2021
    Research Intern
    Reinforcement learning for order execution in stock trading
  • Yonsei University Mar 2020 - Feb 2025
    B.A. in Economics