2023-07-28, 32-124
Economic research depends strongly on the economist's ability to identify efficiently the information relevant for causal inference and forecast accuracy. We address this goal in our ParallelGSReg project, a set of econometric machine-learning packages written in Julia. At JuliaCon 2023, we will present an improved version of our dimensionality-reduction package (now including non-linear algorithms) and a new "research acceleration" package with automated LaTeX code generation and AI-assisted bibliographic features.
In their recent volume Econometrics with Machine Learning, Chan & Mátyás (2022) remind us of the well-established distinction in which econometrics and machine learning are perceived as alternative methodological cultures: one focused on prediction (model selection, sampling properties, accuracy metrics) and the other on explanation (causal inference, hypothesis testing, coefficient robustness). Moving away from this false dichotomy, we introduce ParallelGSReg (https://github.com/ParallelGSReg): a Julia research project comprising several packages (GlobalSearchRegression.jl, GlobalSearchRegressionGUI.jl and ModelSelection.jl) that aims to 1) build bridges between these complementary cultures; and 2) encourage economic researchers to adopt Julia to improve computational efficiency in model-selection tasks (particularly those using dimensionality-reduction techniques with causal-inference requirements).
At JuliaCon 2018, the focus was on "efficiency": we presented the world's fastest all-subset-regression command (GlobalSearchRegression.jl, which runs up to 3,165 times faster than the original Stata code and up to 197 times faster than well-known R alternatives; see https://github.com/ParallelGSReg/JuliaCon2019/blob/master/GlobalSearchRegression.jl-paper.pdf).
In 2019, the goal was "ease of use", for which we improved our graphical user interface (GlobalSearchRegressionGUI.jl) and developed a basic package (ModelSelection.jl) to automate the Julia-to-LaTeX migration of dimensionality-reduction results (a package that also includes all GlobalSearchRegression.jl functions plus additional features such as regularization and k-fold cross-validation).
For JuliaCon 2023 the target is “scope and integration”, for which we are:
1) updating all packages (removing compatibility issues with the newest Julia versions);
2) improving ModelSelection.jl with:
2.a) new classification algorithms (logit, probit, etc.) for the regularization and all-subset-regression functions;
2.b) additional tests for causal inference (e.g., unit-root tests);
2.c) extended cross-validation capabilities (to handle the re-sampling requirements of panel-data and time-series datasets); and
2.d) higher computational efficiency, reducing the time-to-first-result (TTFR) by focusing the package on statistical functions (moving the Julia-to-LaTeX migration capabilities to a complementary package).
3) developing ResearchAccelerator.jl, a new package with:
3.a) extended Julia-to-LaTeX migration functions that work as an "automatic research assistant": using ModelSelection.jl results, it generates a LaTeX document with the relevant tables, graphics, and metrics.
3.b) AI integration for references and literature review: using user-provided keywords or phrases, ResearchAccelerator.jl will query Google Scholar to obtain a potentially relevant bibliography. A subset of the results with available abstracts, references, and keywords will then be used to build citation networks and keyword/citation statistics. Finally, a machine-learning system based on modern NLP models will use the articles' abstracts to generate a similarity network, giving users additional information for a deeper search through the related bibliography. This network will be exported to the LaTeX document as a table and a figure, and to a standard output file that can be examined with graph plotting and analysis tools such as Gephi.
4) including a JuliaCall-based Stata integration, which allows all packages in the ParallelGSReg project to be used in batch mode through Stata's gsreg.ado package. This feature is designed to give change-averse economic researchers the simplest way to verify the substantial runtime reduction they can obtain by progressively switching to Julia.
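To fix ideas, the all-subset-regression approach behind item 2 can be sketched in a few lines of plain Julia. This is a toy illustration, not the actual API of GlobalSearchRegression.jl or ModelSelection.jl: it enumerates every subset of candidate regressors via a bitmask, fits OLS on each, and keeps the model with the lowest BIC.

```julia
using LinearAlgebra, Random

# Toy all-subset OLS selection (illustrative only): try every non-empty
# subset of the k candidate regressors and keep the best model by BIC.
function all_subset_ols(y::Vector{Float64}, X::Matrix{Float64})
    n, k = size(X)
    best_bic, best_vars, best_beta = Inf, Int[], Float64[]
    for mask in 1:(2^k - 1)                       # every non-empty subset
        vars = [j for j in 1:k if ((mask >> (j - 1)) & 1) == 1]
        Xs = hcat(ones(n), X[:, vars])            # intercept + subset
        beta = Xs \ y                             # OLS via least squares
        rss = sum(abs2, y - Xs * beta)
        p = length(vars) + 1
        bic = n * log(rss / n) + p * log(n)       # Bayesian information criterion
        if bic < best_bic
            best_bic, best_vars, best_beta = bic, vars, beta
        end
    end
    return best_vars, best_beta, best_bic
end

# Toy data: y depends only on columns 1 and 3 of five candidates.
Random.seed!(1)
X = randn(200, 5)
y = 2.0 .* X[:, 1] .- 1.5 .* X[:, 3] .+ 0.1 .* randn(200)
vars, beta, bic = all_subset_ols(y, X)
println("selected regressors: ", vars)  # columns 1 and 3 should be selected
```

The brute-force loop is exponential in k, which is exactly why the parallelized and optimized implementation in the packages matters for realistic covariate counts.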
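The "automatic research assistant" of item 3.a boils down to turning estimation results into LaTeX source. A minimal sketch of that idea, using plain string building (the function name and layout here are illustrative, not ResearchAccelerator.jl's real interface):

```julia
using Printf

# Illustrative helper: render coefficient estimates and standard errors
# as a LaTeX tabular environment.
function latex_table(names::Vector{String}, coefs::Vector{Float64},
                     ses::Vector{Float64})
    rows = [@sprintf("%s & %.3f & (%.3f) \\\\", n, b, s)
            for (n, b, s) in zip(names, coefs, ses)]
    return join(["\\begin{tabular}{lcc}",
                 "\\hline",
                 "Variable & Coef. & (Std. err.) \\\\",
                 "\\hline",
                 rows...,
                 "\\hline",
                 "\\end{tabular}"], "\n")
end

tab = latex_table(["x1", "x3"], [2.01, -1.49], [0.07, 0.07])
println(tab)
```

The real package extends this idea to full documents with graphics and metrics generated from ModelSelection.jl results.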
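The similarity network of item 3.b can also be sketched in miniature. The toy below substitutes bag-of-words cosine similarity for the modern NLP embeddings the talk describes, and all names are illustrative rather than the package's API: abstracts become term-frequency vectors, and two abstracts are linked when their cosine similarity exceeds a threshold.

```julia
using LinearAlgebra

# Term-frequency matrix: one row per abstract, one column per vocabulary word.
function tf_vectors(abstracts::Vector{String})
    docs = [split(lowercase(a), r"\W+"; keepempty=false) for a in abstracts]
    vocab = sort!(unique(vcat(docs...)))
    index = Dict(w => i for (i, w) in enumerate(vocab))
    M = zeros(length(abstracts), length(vocab))
    for (d, words) in enumerate(docs), w in words
        M[d, index[w]] += 1
    end
    return M
end

cosine(u, v) = dot(u, v) / (norm(u) * norm(v))

# Boolean adjacency matrix of the similarity network: link two abstracts
# when their cosine similarity is at least `threshold`.
function similarity_network(abstracts; threshold = 0.3)
    M = tf_vectors(abstracts)
    n = length(abstracts)
    return [i != j && cosine(M[i, :], M[j, :]) >= threshold
            for i in 1:n, j in 1:n]
end

abstracts = ["model selection with all subset regression",
             "all subset regression for model selection in economics",
             "mobile ad hoc networks and routing"]
A = similarity_network(abstracts)
println(A)  # abstracts 1 and 2 are linked; abstract 3 is isolated
```

The resulting adjacency matrix is exactly the kind of object that can be exported for tools such as Gephi, as described above.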
We will introduce all these contributions (including some new benchmark figures) in the first five minutes of our lightning talk. A live hands-on example will then be developed in three minutes, leaving the last two minutes for audience questions.
I am a board member of YPF (Yacimientos Petrolíferos Fiscales) and a researcher at the National Scientific and Technical Research Council (LETIF-CONICET). I hold a PhD in Economics (EHESS, Paris, France) and teach advanced macroeconomics and development economics at three universities (UNLP, UNDAV and UNQ). I am currently working on industrial economics and HPC in econometrics.
Alexis Tcach is a full-time professor at Universidad Nacional de General Sarmiento and a head teaching assistant at DC - Universidad de Buenos Aires, Argentina. He received his degree in Computer Science from Universidad de Buenos Aires in 2012. His research interests lie in mobile ad hoc networks, computer networks, and their integration with machine learning.
Pablo Gluzmann is a senior researcher at the Center for Distributive, Labor and Social Studies (CEDLAS) of Universidad Nacional de La Plata (UNLP). He received his B.A., M.A., and Ph.D. in Economics from UNLP, and is a researcher at the National Scientific and Technical Research Council (CONICET) and an associate professor at UNLP. His research focuses on inequality, poverty, labor markets, and macroeconomics, among other topics. He has published in journals such as the Journal of Development Economics, Economics Letters, World Development, The Stata Journal, Centro Journal, Latin American Economic Review, Review of Development Economics, Journal of Income Distribution, Journal of International Financial Markets, Institutions and Money, Journal for Labour Market Research, Económica, El Trimestre Económico, and Ensayos Económicos. He has also published several book chapters and working papers at UNDP, WB, IADB, CAF, CEDLAS, IZA, etc.