Jun Tian JuliaCon 2024

Jun Tian

Software Engineer @01.ai

Session

07-12

15:40

10min

Evaluate LLM synthesized Julia code

Jun Tian

HumanEval and MBPP are two of the most frequently used benchmarks for evaluating LLMs' performance in code generation. However, they focus mainly on the Python programming language. In this talk, we will analyze the performance of SOTA code LLMs in Julia. Results will be updated continuously at HumanEval.jl.

AI/ML/AD

For Loop (3.2)
