JuliaCon 2022 (Times are UTC)

Statistics symposium
2022-07-22, Green

Statistics is a domain where some early-stage package development, and some early applications, have come about in Julia. This mini-symposium has two aims: (a) to report on interesting recent developments in this field, and (b) to offer a bird's-eye view to people interested in this field, helping them assess the state of maturity and decide whether Julia is appropriate for their statistics work.


  1. "Doing applied statistics research in Julia", Ajay Shah (20 minutes)
    We show the journey of two applied statistics research papers, done fully in Julia, by researchers who were previously working in R. We discuss what was convenient, what the chokepoints were, and what gains were obtained in expressivity and in performance. Based on this, we evaluate the state of maturity of Julia for doing applied statistics. We propose practical pathways for statisticians, and speak to the Julia community about what is required next. We also report on recent developments in the field of Julia and statistics.

  2. "CRRao: A unified framework for statistical models", Sourish Das (20 minutes)
    Many statistical models are available in Julia, and many more will come. CRRao is a consistent framework through which callers interact with a large suite of models. For the end-user, it reduces the cost and complexity of estimating statistical models. It offers convenient guidelines through which development of additional statistical models can take place in the future.

  3. "TSx: A time series class for Julia", Chirag Anand (20 minutes)
    DataFrames.jl is a powerful system, but expressing the standard tasks of manipulating time series (e.g. as seen in finance or macroeconomics) is often cumbersome. We draw on the work of the R community, which has built zoo and xts, to build a time series class, TSx, which delivers a simple set of operators and functions for people working with time series. It constitutes syntactic sugar on top of DataFrames.jl and thus harnesses the capabilities and efficiency of that package. We compare its capabilities and performance against zoo and xts in R.

  4. "Comparing glm in Julia, R and SAS", Mousum Datta (10 minutes)
    Generalized linear models (GLMs) are an unusually important class of statistical models. We compare the capabilities, correctness and performance of the present GLM systems in Julia, R and SAS. We report on recent improvements that have been made to GLM.jl.

  5. "Working with survey data", Ayush Patnaik (10 minutes)
    The Julia package Survey.jl builds some of the functionality required for statistical estimators with stratified random sampling. For a limited subset of the capabilities of Thomas Lumley's R package "survey", we show the correctness and the performance gains of the Julia package.

Ajay Shah studied at IIT Bombay and USC, Los Angeles. He has held positions at the Centre for Monitoring Indian Economy (CMIE), the Indira Gandhi Institute of Development Research (IGIDR), the Department of Economic Affairs at the Ministry of Finance, and the National Institute of Public Finance and Policy (NIPFP). He is now part of xKDR Forum and Jindal Global University. His research is at the intersection of economics, law and public administration. His second book, co-authored with Vijay Kelkar, "In Service of the Republic: The Art and Science of Economic Policy", featured in Bloomberg's global "2020 Best Books on Business and Leadership". His work can be accessed on his home page (http://www.mayin.org/ajayshah).