2022-07-29, Green
BanyanDataFrames.jl is an open-source library for processing massive Parquet/CSV/Arrow datasets in your Virtual Private Cloud. One of the key goals of the project is to match the API of DataFrames.jl as closely as possible. In this talk, we will provide an overview of BanyanDataFrames.jl and discuss the challenges and successes so far in achieving massively scalable data analytics with the Julia language.
More information about BanyanDataFrames.jl can be found on GitHub:
https://github.com/banyan-team/banyan-julia
https://github.com/banyan-team/banyan-julia-examples
I'm Caleb. I'm currently studying Computer Science at the University of Washington and will be doing research at Stanford next year. My research interests are quite broad: I have published work in areas including brain-computer interfaces and wet lab automation, and I would be happy to chat about either. I'm also currently working on https://BanyanComputing.com. Outside of CS, I love composing music and playing the alto sax in an ensemble with my siblings.