Paulito Palmes
I am a research scientist at IBM Research Europe (Dublin Research Lab), working in the areas of analytics, data mining, machine learning, reinforcement learning, automated decisions, and AI.
I created and maintain the following Julia packages:
- AutoMLPipeline (Automated Machine Learning Pipeline): https://github.com/IBM/AutoMLPipeline.jl
- TSML (Time Series Machine Learning): https://github.com/IBM/TSML.jl
- Lale.jl (Julia wrapper for the Python Lale library): https://github.com/IBM/Lale.jl
- GitHub repos: https://github.com/ppalmes
Session
Unlike online RL, where the agent must interact with a real environment, offline RL works much like a typical machine learning workflow. Given a dataset, offline RL extracts the state, action, reward, and terminal columns and uses them to optimize the policy. By wrapping offline RL into the AutoMLPipeline workflow, it becomes trivial to search for the optimal preprocessing elements and their combinations, and thereby improve the learned policy, using symbolic workflow manipulation.
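To make the idea concrete, here is a minimal sketch of such a composition. The preprocessing elements (`NumFeatureSelector`, `SKPreprocessor`), the `@pipeline` macro, and `fit_transform!` are real AutoMLPipeline.jl constructs; the `OfflineRLLearner` element and the column layout of `dataset` are assumptions introduced purely for illustration.

```julia
using AutoMLPipeline
using DataFrames

# Toy offline RL dataset: one transition per row. The column layout
# (:s1/:s2 state features, :action, :reward, :terminal) is an assumption
# made for this sketch, not something AutoMLPipeline prescribes.
dataset = DataFrame(
    s1       = randn(100),       # state feature 1
    s2       = randn(100),       # state feature 2
    action   = rand(1:4, 100),   # discrete action taken
    reward   = randn(100),       # observed reward
    terminal = rand(0:1, 100)    # episode-termination flag
)

# Real AutoMLPipeline preprocessing elements
numf = NumFeatureSelector()           # keep numeric columns
rb   = SKPreprocessor("RobustScaler") # robust feature scaling
pca  = SKPreprocessor("PCA")          # dimensionality reduction

# Hypothetical pipeline element wrapping an offline RL algorithm; the
# name and constructor are illustrative, not part of AutoMLPipeline.
# It is assumed to pull out the state/action/reward/terminal columns
# and optimize the policy from them. (In practice the RL bookkeeping
# columns would bypass the scaler, e.g. via a feature union.)
offrl = OfflineRLLearner()

# Symbolic workflow manipulation: trying another preprocessing
# combination is just writing a different pipeline expression.
pipe1 = @pipeline numf |> rb |> offrl
pipe2 = @pipeline numf |> rb |> pca |> offrl

# Y is unused (empty): the learning signal lives in the dataset columns.
policy = fit_transform!(pipe1, dataset, [])
```

Because each pipeline is a plain symbolic expression, searching the preprocessing space reduces to enumerating such expressions and comparing the policies they produce.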