Jun Tian
Software Engineer @01.ai
Session
07-12
15:40
10min
Evaluate LLM synthesized Julia code
Jun Tian
HumanEval and MBPP are two of the most frequently used benchmarks for evaluating LLMs' performance in code generation. However, they focus mainly on the Python programming language. In this talk, we will analyze the performance of state-of-the-art (SOTA) code LLMs in Julia. Results will be updated continuously at HumanEval.jl.
AI/ML/AD
For Loop (3.2)