09:00
09:00
90min
3D image processing with scikit-image
Alexandre de Siqueira

This tutorial will introduce how to analyze three-dimensional stacked and volumetric images in Python, mainly using scikit-image.

Track 4 (Chillida)
09:00
90min
Getting Started with JupyterLab
Mike Müller

JupyterLab is used for essentially all other tutorials at EuroSciPy. This tutorial gives an overview of the basic functionality and shows how to use some of the many tools it provides to simplify your Python programming workflow.

Track 2 (Baroja)
09:00
90min
Hands-on TensorFlow 2.0
Josh Gordon

A hands-on introduction to TensorFlow 2.0 at an intermediate difficulty level, with code examples for Deep Dream, Style Transfer, and Image Colorization.

Track 3 (Oteiza)
11:00
11:00
90min
Deep Diving into GANs: From Theory to Production with TensorFlow 2.0
Michele "Ubik" De Simoni, Paolo Galeone, Federico Di Mattia, Emanuele Ghelfi

GANs are one of the hottest topics in the ML arena; however, they present a challenge for researchers and engineers alike. This workshop will guide you through both the theory and the code needed to build a GAN and put it into production.

Track 3 (Oteiza)
11:00
90min
Never get in a battle of bits without ammunition
Valerio Maggio

The numpy package takes a central role in the Python scientific ecosystem. This is mainly because numpy code has been designed with high performance in mind. This tutorial will introduce the main features of numpy in 90 minutes.

Track 2 (Baroja)
11:00
90min
Reproducible Data Science in Python
Chandrasekhar Ramakrishnan, Rok Roškar

In this tutorial, we will take a detailed look at the concept of reproducibility, survey the landscape of existing solutions, and, using one solution in particular, Renku, we will do some hands-on work.

Track 4 (Chillida)
14:00
14:00
90min
Building data pipelines in Python: Airflow vs scripts soup
Dr. Tania Allard

In this workshop, you will learn how to migrate from ‘scripts soups’ (a set of scripts that should be run in a particular order) to robust, reproducible and easy-to-schedule data pipelines in Airflow.

Track 4 (Chillida)
14:00
90min
Create CUDA kernels from Python using Numba and CuPy.
Valentin Haenel

We'll explain how to do GPU-Accelerated numerical computing from Python using the Numba Python compiler in combination with the CuPy GPU array library.

Track 3 (Oteiza)
14:00
90min
Introduction to pandas
Marc Garcia

This tutorial is an introduction to pandas for people new to it. We will cover how to open datasets, perform some analysis, apply some transformations and visualize the data.

Track 2 (Baroja)
16:00
16:00
90min
Performing Quantum Measurements in QuTiP
Simon Cross

Would you like to create (virtual) qubits and perform measurements on them using Python? Perhaps even explore entanglement and quantum teleportation? If so, this tutorial is for you!

No previous quantum mechanics experience required!

Track 4 (Chillida)
16:00
90min
Speed up your python code
Jérémie du Boisberranger

In this tutorial we will see how to profile and speed up Python code, from a pure Python implementation to optimized Cython code.

Track 3 (Oteiza)
09:00
09:00
90min
A Tour of the Data Visualization Ecosystem of Python
Giovanni De Gasperis

The tutorial will be a tour of the getting-started how-tos of the major Python data visualization libraries such as Yt-Project, Seaborn, Altair, and Plotly.

Track 2 (Baroja)
09:00
90min
Introduction to geospatial data analysis with GeoPandas and the PyData stack
Joris Van den Bossche

This tutorial is an introduction to geospatial data analysis, with a focus on tabular vector data using GeoPandas. It will show how GeoPandas and related libraries can improve your GIS workflow and fit nicely in the traditional PyData stack.

Track 4 (Chillida)
09:00
90min
Sufficiently Advanced Testing with Hypothesis
Zac Hatfield-Dodds

Testing research code can be difficult, but is essential for robust results. Using Hypothesis, a tool for property-based testing, I'll show how testing can be both easier and dramatically more powerful - even for complex "black box" codes.

Track 3 (Oteiza)
11:00
11:00
90min
Astronomical Image Processing
Samuel FARRENS

This tutorial will introduce the concept of sparsity and demonstrate how it can be used to remove noise from signals. These concepts will then be expanded to demonstrate how noise can be removed from astronomical images in particular.

Track 4 (Chillida)
11:00
90min
Effectively using matplotlib
Tim Hoffmann

It can sometimes be difficult and frustrating to know how to achieve a desired plot. Have you had this experience as well? Then this tutorial is for you. It will make you more effective and help you generate better-looking plots.

Track 3 (Oteiza)
11:00
90min
Introduction to SciPy
Gert-Ludwig Ingold

SciPy is a comprehensive library for scientific computing and one of the central components of the scientific Python ecosystem. As most of its functionality naturally involves NumPy arrays, SciPy works hand in hand with NumPy.

Track 2 (Baroja)
14:00
14:00
90min
CFFI, Ctypes, Cython, Cppyy: how to run C code from Python
Matti Picus

Python is flexible, C and C++ are fast. How to use them together? There are many ways to call C code from Python. We will learn about the major ones and find out when you would prefer one over another.

Track 3 (Oteiza)
14:00
90min
Introduction to scikit-learn: from model fitting to model interpretation
Guillaume Lemaitre, Olivier Grisel

We will present scikit-learn by focusing on the available tools used to train a machine-learning model. Then, we will focus on the challenge linked to model interpretation and the available tools to understand these models.

Track 2 (Baroja)
14:00
90min
Parallelizing Python applications with PyCOMPSs
Javier Conejero

PyCOMPSs is a task-based programming model that enables the parallel execution of Python scripts by annotating methods with task decorators. At run time, it identifies tasks' data-dependencies, schedules and executes them in distributed environments.

Track 4 (Chillida)
16:00
16:00
90min
kCSD - a Python package for reconstruction of brain activity
Marta Kowalska, Jakub M. Dzik

kCSD is a Python package for localization of sources of brain electric activity based on recorded electric potentials.

Track 3 (Oteiza)
08:25
08:25
90min
scikit-fdiff, a new tool for PDE solving
Nicolas Cellier

Scikit-fdiff (formerly Triflow) has been developed to facilitate building mathematical models. It was made to quickly build and try many asymptotic falling-film models with different phenomena couplings (energy and mass transfer).

Posters at 16:00
10:15
10:15
45min
From Galaxies to Brains! - Image processing with Python
Samuel FARRENS

From the smallest microscopic objects to the largest scales of the Universe, our ability to study the world around us is predicated on the quality of the data we have access to.

Track 1 (Mitxelena)
11:30
11:30
30min
Distributed GPU Computing with Dask
Peter Andreas Entschev

Dask has evolved over the last year to leverage multi-GPU computing alongside its existing CPU support. We present how this is possible with the use of NumPy-like libraries and how to get started writing distributed GPU software.

Track 1 (Mitxelena)
11:30
30min
How a voice assistant works
Miren Urteaga Aldalur

This talk will focus on the technologies needed to build a voice assistant. It will center on Samsung’s voice assistant Bixby, which is available in 8 languages across the world (5 EU languages) on a variety of Samsung mobile phones.

Track 2 (Baroja)
11:30
30min
Sufficiently Advanced Testing with Hypothesis
Zac Hatfield-Dodds

Testing research code can be difficult, but is essential for robust results. Using Hypothesis, a tool for property-based testing, I'll show how testing can be both easier and dramatically more powerful - even for complex "black box" codes.

Track 3 (Oteiza)
11:40
11:40
90min
PhonoLAMMPS: Phonopy with LAMMPS made easy
Abel Carreras

PhonoLAMMPS is a Phonopy interface to LAMMPS that allows calculating the interatomic force constants and other phonon properties from a usual LAMMPS input file.

Posters at 16:00
12:00
12:00
30min
Modern Data Science: A new approach to DataFrames and pipelines
Jovan Veljanoski, Maarten Breddels

We will demonstrate how to explore and analyse massive datasets (>150GB) on a laptop with the Vaex library in Python. Using computational graphs, efficient algorithms and storage (Apache Arrow / hdf5) Vaex can easily handle up to a billion rows.

Track 1 (Mitxelena)
12:00
30min
QuTiP: the quantum toolbox in Python as an ecosystem for quantum physics exploration and quantum information science
Nathan Shammah, Alexander Pitchford

In this talk you will learn how QuTiP, the quantum toolbox in Python (http://qutip.org), has grown from a library into an ecosystem. QuTiP is used in education to teach quantum physics, and in research and industry for quantum computing simulation.

Track 2 (Baroja)
12:00
30min
What about tests in Machine Learning projects?
Sarah Diot-Girard

Good practice says you must write tests! But testing Machine Learning projects can be really complicated. Writing tests often seems inefficient. Which kinds of tests should be written? How should they be written? What are the benefits?

Track 3 (Oteiza)
13:15
13:15
90min
Really reproducible behavioural paper
Jakub M. Dzik

A heavily XKCD-themed poster about writing a really reproducible behavioural paper in a Python environment.
The poster is also available online.

Posters at 16:00
14:45
14:45
30min
Apache Arrow: a cross-language development platform for in-memory data
Joris Van den Bossche

Apache Arrow, defining a columnar, in-memory data format standard and communication protocols, provides a cross-language development platform with already several applications in the PyData ecosystem.

Track 1 (Mitxelena)
14:45
30min
Constrained Data Synthesis
Nick Radcliffe

We introduce a method for creating synthetic data "to order" based on learned (or provided) constraints and data classifications. This includes "good" and "bad" data.

Track 2 (Baroja)
14:45
30min
Scientific DevOps: Designing Reproducible Data Analysis Pipelines with Containerized Workflow Managers
Nicholas Del Grosso

A review of DevOps tools as applied to data analysis pipelines, including workflow managers, software containers, testing frameworks, and online repositories for performing reproducible science that scales.

Track 3 (Oteiza)
14:50
14:50
90min
kESI - a kernel-based method for reconstruction of sources of brain electric activity in realistic brain geometries
Jakub M. Dzik, Marta Kowalska

kESI is a new Python package for kernel-based reconstruction of brain electric activity from recorded electric field potentials using realistic assumptions about brain geometry and conductivity.

Posters at 16:00
15:15
15:15
30min
Caterva: A Compressed And Multidimensional Container For Big Data
Francesc Alted

Caterva is a library on top of the Blosc2 compressor that implements a simple multidimensional container for compressed binary data. It adds the capability to store, extract, and transform data in these containers, either in-memory or on-disk.

Track 1 (Mitxelena)
15:15
30min
Dashboarding with Jupyter notebooks, voila and widgets
Maarten Breddels, Martin Renou

Turn your Jupyter notebook into a beautiful modern React or Vue based dashboard using voila and Jupyter widgets.

Track 3 (Oteiza)
15:15
30min
ToFu - an open-source python/cython library for synthetic tomography diagnostics on Tokamaks
Laura Mendoza, Didier VEZINET

We present an open-source parallelized and cythonized python library, ToFu, for modeling tomography diagnostics on nuclear fusion reactors. Its functionalities (with realistic examples), its architecture and its design will be shown.

Track 2 (Baroja)
15:45
15:45
15min
Debugging in JupyterLab
Jeremy Tuloup

Debugging Jupyter Notebooks has been one of the most requested features. In this presentation we give an overview of the current state and tools for debugging in Jupyter, and offer a glimpse of what is coming next.

Track 2 (Baroja)
15:45
15min
Make your Python code fly at transonic speeds!
Pierre Augier

Transonic is a new pure Python package to easily accelerate modern Python-Numpy code with different accelerators (like Cython, Pythran, Numba, Cupy, etc...).

Track 3 (Oteiza)
15:45
15min
Modin: Scaling the Capabilities of the Data Scientist, not the machine
Devin Petersohn

Modern data systems tend to heavily focus on optimizing for the system’s time. In this talk, we discuss the design of Modin, a DataFrame library, and how to optimize for the human system.

Track 1 (Mitxelena)
16:25
16:25
90min
From Modeler to Programmer
Mike Müller

The modeling system ueflow allows for customizable, dynamic boundary conditions.
The modeler can write Python plugins to implement the behavior of these boundary conditions.

Posters at 16:00
16:30
16:30
15min
Best Coding Practices in Jupyterlab
Alexander CS Hendorf

Jupyter notebooks are often a mess. The code produced works for one notebook, but it's hard to maintain or to re-use. In this talk I will present some best practices to make code more readable, easier to maintain and re-usable.

Track 1 (Mitxelena)
16:30
15min
Controlling a confounding effect in predictive analysis.
Darya Chyzhyk

Confounding effects are often present in observational data: the effect or association studied is observed jointly with other effects that are not desired.

Track 2 (Baroja)
16:30
15min
PyFETI - An easy and massively Dual Domain Decomposition Solver for Python
Guilherme Jenovencio

PyFETI is a Python implementation of Finite-Element-Tearing-Interconnecting methods. The library provides a massive linear solver using the Domain Decomposition method, where problems are solved locally by a direct solver and iteratively at the interface.

Track 3 (Oteiza)
16:45
16:45
15min
High Voltage Lab Common Code Basis library: a uniform user-friendly object-oriented API for a high voltage engineering research.
Mikołaj Rybiński

The library leverages Python's richness to provide a uniform user-friendly API for a zoo of industrial communication protocols used to control high voltage engineering devices, together with abstractions and implementations for such devices.

Track 3 (Oteiza)
16:45
15min
Lessons learned from comparing Numba-CUDA and C-CUDA
Lena Oden

We compared the performance of GPU applications written in C-CUDA and Numba-CUDA. By analyzing the GPU assembly code, we learned about the reasons for the differences. This helped us to optimize our codes written in Numba-CUDA and Numba itself.

Track 1 (Mitxelena)
16:45
15min
The Rapid Analytics and Model Prototyping (RAMP) framework: tools for collaborative data science challenges
Guillaume Lemaitre, Joris Van den Bossche

The RAMP (Rapid Analytics and Model Prototyping) framework provides a platform to organize reproducible and transparent data challenges. We will present the different framework bricks.

Track 2 (Baroja)
18:00
18:00
90min
MNE-Python, a toolkit for neurophysiological data
Joan Massich

A summary of the MNE-Python changes introduced during the last two releases and highlights for future directions.

Posters at 16:00
09:15
09:15
45min
HPC and Python: Intel’s work in enabling the scientific computing community
David Liu

High Performance Computing (HPC) has been a pillar of the scientific community for years, with many in the Python community contributing to its continued development. However, one of the fundamental links in performance is the relationship between h

Track 1 (Mitxelena)
10:30
10:30
30min
Inside NumPy: preparing for the next decade
Matti Picus

Over the past year, and for the first time since its creation, NumPy has been operating with dedicated funding. NumPy developers think it has invigorated the project and its community. But is that true, and how can we know?

Track 1 (Mitxelena)
10:30
30min
Visual Diagnostics at Scale
Dr. Rebecca Bilbro

Machine learning is a search for the best combination of features, model, and hyperparameters. But as data grow, so does the search space! Fortunately, visual diagnostics can focus our search and allow us to steer modeling purposefully, and at scale.

Track 2 (Baroja)
11:00
11:00
30min
Deep Learning without a PhD
Paige Bailey

In this talk, you'll learn how to transition from traditional machine learning tools, like scikit-learn, to deep learning with Keras, TensorFlow, and JAX. No prior experience with machine learning or with deep learning required, and no need to instal

Track 1 (Mitxelena)
11:00
30min
Exceeding Classical: Probabilistic Data Structures in Data Intensive Applications
Andrii Gakhov

We interact with an increasing amount of data, but classical data structures and algorithms can no longer meet our requirements. This talk presents probabilistic algorithms and data structures and describes their main areas of application.

Track 3 (Oteiza)
11:00
30min
Histogram-based Gradient Boosting in scikit-learn 0.21
Olivier Grisel

In this presentation we will present some recently introduced features of the scikit-learn Machine Learning library with a particular emphasis on the new implementation of Gradient Boosted Trees.

Track 2 (Baroja)
11:30
11:30
30min
Driving a 30m Radio Telescope with Python
Francesco Pierfederici

The IRAM 30m radio telescope is one of the best in the world. The telescope control software, monitoring, data archiving as well as some of the data processing code are written in Python. We will describe how and why Python is used at the telescope.

Track 3 (Oteiza)
11:30
30min
Recent advances in python parallel computing
Pierre Glaser

Modern hardware is multi-core. It is crucial for Python to provide efficient parallelism. This talk presents the current state of and recent advances in Python parallelism, to help practitioners and developers make better decisions on this matter.

Track 2 (Baroja)
11:30
30min
The Magic of Neural Embeddings with TensorFlow 2
Oliver Zeigermann

Neural Embeddings are a powerful tool for turning categorical values into numerical ones. Given reasonable training data, semantics present in the categories can be preserved in the numerical representation.

Track 1 (Mitxelena)
12:00
12:00
30min
Data sciences in a polyglot world with xtensor and xframe
Sylvain Corlay, Wolf Vollprecht

The main scientific computing programming languages have different models of the main data structures of data science, such as dataframes and n-d arrays. In this talk, we present our approach to reconciling the data science tooling in this polyglot world.

Track 2 (Baroja)
12:00
30min
High quality video experience using deep neural networks
Marco Bertini, Tiberio Uricchio

Video compression algorithms used to stream videos are lossy, and when compression rates increase they result in strong degradation of visual quality. We show how deep neural networks can eliminate compression artefacts and restore lost details.

Track 1 (Mitxelena)
12:00
30min
Matrix calculus with SymPy
Francesco Bonazzi

In this talk we explore a recent addition to SymPy which allows finding closed-form solutions to matrix derivatives. As a consequence, generation of efficient code for optimization problems is now much easier.

Track 3 (Oteiza)
14:00
14:00
45min
In the Shadow of the Black Hole
Sara Issaoun

I will walk through the entire Event Horizon Telescope experiment and the global effort that led to the first-ever direct image of a black hole revealed to the world on April 10th of this year.

Track 1 (Mitxelena)
14:45
14:45
30min
A practical guide towards algorithmic bias and explainability in machine learning
Alejandro Saucedo

Undesired bias in machine learning has become a worrying topic due to the numerous high profile incidents. In this talk we demystify machine learning bias through a hands-on example. We'll be tasked to automate the loan approval process for a company

Track 1 (Mitxelena)
14:45
30min
Understanding Numba
Valentin Haenel

In this talk I will take you on a whirlwind tour of Numba and you will be equipped with a mental model of how Numba works and what it is good at. At the end, you will be able to decide if Numba could be useful for you.

Track 2 (Baroja)
14:45
30min
VeloxChem: Python meets quantum chemistry and HPC
Olav Vahtras

A new and efficient Python/C++ modular library for real and complex response functions at the level of Kohn-Sham density functional theory.

Track 3 (Oteiza)
15:15
15:15
30min
PyPy meets SciPy
Ronan Lamy

PyPy, the fast and compliant alternative implementation of Python, is now compatible with the SciPy ecosystem. We'll explore how scientific programmers can use it.

Track 2 (Baroja)
15:15
30min
Tracking migration flows with geolocated Twitter data
Antònia Tugores

Detect migration flows worldwide using geolocated Twitter data: routes, settlement areas, mobility to more than one country, spatial integration in cities, etc.

Track 1 (Mitxelena)
15:15
30min
emzed: a Python based framework for analysis of mass-spectrometry data
Uwe Schmitt

This talk is about emzed, a Python library that helps biologists with little programming knowledge implement ad-hoc analyses as well as workflows for mass-spectrometry data.

Track 3 (Oteiza)
15:45
15:45
15min
Deep Learning for Understanding Human Multi-modal Behavior
Ricardo Manhães Savii

Multi-modal sources of information are the next big step for AI. In this talk, I will present the use of deep learning techniques for automated multi-modal applications and some open benchmarks.

Track 1 (Mitxelena)
15:45
15min
High performance machine learning with dislib
Javier Álvarez

This talk will present dislib, a distributed machine learning library built on top of the PyCOMPSs programming model. One of the main focuses of dislib is solving large-scale scientific problems on high performance computing clusters.

Track 2 (Baroja)
15:45
15min
vtext: fast text processing in Python using Rust
Roman Yurchak

In this talk, we present some of the benefits of writing extensions for Python in Rust. We then illustrate this approach with the vtext project, which aims to be a high-performance library for text processing.

Track 3 (Oteiza)
16:30
16:30
15min
Can we make Python fast without sacrificing readability? numba for Astrodynamics
Juan Luis Cano Rodríguez

There are several solutions to make Python faster, and choosing one is not easy: we would want it to be fast without sacrificing its readability and high-level nature. We tried to do it for an Astrodynamics library using numba. How did it turn out?

Track 2 (Baroja)
16:30
15min
How to process hyperspectral data from a prototype imager using Python
Matti Eskelinen

We present a collection of software for handling hyperspectral data acquisition and preprocessing fully in Python utilising Xarray for metadata preservation from start to finish.

Track 1 (Mitxelena)
16:30
15min
pystencils: Speeding up stencil computations on CPUs and GPUs
Martin Bauer

pystencils speeds up stencil computations on numpy arrays using a sympy-based high-level description that is compiled into optimized C code.

Track 3 (Oteiza)
16:45
16:45
15min
Enhancing & re-designing the QGIS user interface – a deep dive
Sebastian M. Ernst

How can one of the largest code bases in open source Geographical Information Science – QGIS – be enhanced and re-designed? Through the powers of Python plugins. This talk demonstrates concepts on how to make QGIS more user-friendly.

Track 1 (Mitxelena)
16:45
15min
PSYDAC: a parallel finite element solver with automatic code generation
Yaman Güçlü

PSYDAC takes input from SymPDE (a SymPy extension for partial differential equations), applies a finite-element discretization, generates MPI-parallel code, and accelerates it with Numba, Pythran, or Pyccel. We present design, usage and performance.

Track 2 (Baroja)
16:45
15min
TelApy a Python module to compute free surface flows and sediments transport in geosciences
Yoann Audouin

TelApy is a Python module to compute free surface flows and sediment transport in geosciences. We show examples of how it is used to inter-operate with other Python libraries for Uncertainty Quantification, Optimization, and Reduced Order Models.

Track 3 (Oteiza)
No sessions on Friday, Sept. 6, 2019.