This tutorial will show how to analyze three-dimensional stacked and volumetric images in Python, mainly using scikit-image.
JupyterLab is used for essentially all other tutorials at EuroSciPy. This tutorial gives an overview of its basic functionality and shows how to use some of the many tools it provides to simplify your Python programming workflow.
A hands-on introduction to TensorFlow 2.0 at an intermediate difficulty level, with code examples for Deep Dream, Style Transfer, and Image Colorization.
GANs are one of the hottest topics in the ML arena; however, they present a challenge for researchers and engineers alike. This workshop will guide you through both the theory and the code needed to build a GAN and put it into production.
The numpy package plays a central role in the Python scientific ecosystem, mainly because numpy code has been designed with high performance in mind. This tutorial will introduce the main features of numpy in 90 minutes.
In this tutorial, we will take a detailed look at the concept of reproducibility, survey the landscape of existing solutions, and, using one solution in particular, Renku, we will do some hands-on work.
In this workshop, you will learn how to migrate from ‘script soups’ (a set of scripts that should be run in a particular order) to robust, reproducible and easy-to-schedule data pipelines in Airflow.
We'll explain how to do GPU-accelerated numerical computing from Python using the Numba Python compiler in combination with the CuPy GPU array library.
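For a flavor of what the combination looks like, here is a minimal sketch (it assumes a CUDA-capable GPU with the numba and cupy packages installed; the kernel itself is illustrative) of a Numba kernel launched directly on a CuPy array:

    import cupy as cp
    from numba import cuda

    @cuda.jit
    def add_one(arr):
        i = cuda.grid(1)          # global thread index
        if i < arr.size:
            arr[i] += 1.0

    a = cp.zeros(1024, dtype=cp.float64)    # array allocated on the GPU by CuPy
    add_one[(a.size + 255) // 256, 256](a)  # Numba kernel modifies it in place
    print(a[:4])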
This tutorial is an introduction to pandas for people new to it. We will cover how to open datasets, perform some analysis, apply transformations, and visualize the data.
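A taste of that workflow (the file and column names below are made up for illustration):

    import pandas as pd

    df = pd.read_csv("measurements.csv")              # open a dataset
    df["temp_f"] = df["temp_c"] * 9 / 5 + 32          # apply a transformation
    summary = df.groupby("station")["temp_f"].mean()  # perform some analysis
    summary.plot(kind="bar")                          # visualize the data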
Would you like to create (virtual) qubits and perform measurements on them using Python? Perhaps even explore entanglement and quantum teleportation? If so, this tutorial is for you!
No previous quantum mechanics experience required!
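For a flavor of the approach, here is a minimal NumPy sketch of a single qubit put into superposition and measured (one possible way to do it; the tutorial's own implementation may differ):

    import numpy as np

    zero = np.array([1.0, 0.0])                    # the |0> basis state
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard gate
    psi = H @ zero                                 # equal superposition of |0> and |1>
    probs = np.abs(psi) ** 2                       # Born rule: outcome probabilities
    outcome = np.random.choice([0, 1], p=probs)    # simulated measurement, 50/50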
In this tutorial we will see how to profile and speed up Python code, going from a pure Python implementation to optimized Cython code.
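Profiling comes first; a minimal example using the standard library's cProfile (the function being profiled is just a placeholder):

    import cProfile
    import pstats

    def slow_sum(n):
        total = 0
        for i in range(n):
            total += i * i
        return total

    cProfile.run("slow_sum(10_000_000)", "profile.out")
    pstats.Stats("profile.out").sort_stats("cumulative").print_stats(5)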
The tutorial will be a tour of the getting-started how-tos of the major Python data visualization libraries, such as Yt-Project, Seaborn, Altair, and Plotly.
This tutorial is an introduction to geospatial data analysis, with a focus on tabular vector data using GeoPandas. It will show how GeoPandas and related libraries can improve your GIS workflow and fit nicely in the traditional PyData stack.
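As an illustration, a few lines using the small demo dataset that ships with GeoPandas:

    import geopandas as gpd

    world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
    africa = world[world["continent"] == "Africa"]   # ordinary pandas-style filtering
    africa.plot(column="pop_est", legend=True)       # choropleth of population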
Testing research code can be difficult, but is essential for robust results. Using Hypothesis, a tool for property-based testing, I'll show how testing can be both easier and dramatically more powerful - even for complex "black box" codes.
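A minimal property-based test with Hypothesis (the property itself is a toy example, not from the talk):

    from hypothesis import given, strategies as st

    @given(st.lists(st.integers()))
    def test_sorting_is_idempotent(xs):
        once = sorted(xs)
        assert sorted(once) == once  # must hold for every generated list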
This tutorial will introduce the concept of sparsity and demonstrate how it can be used to remove noise from signals. These concepts will then be expanded to demonstrate how noise can be removed from astronomical images in particular.
It can sometimes be difficult and frustrating to work out how to achieve a desired plot. Have you had this experience as well? Then this tutorial is for you. It will make you more effective and help you generate better-looking plots.
SciPy is a comprehensive library for scientific computing and one of the central components of the scientific Python ecosystem. As most of its functionality naturally involves NumPy arrays, SciPy works hand in hand with NumPy.
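A small example of that interplay (the objective function is illustrative):

    import numpy as np
    from scipy import optimize

    # SciPy routines consume and produce NumPy arrays
    result = optimize.minimize(lambda x: np.sum((x - 3.0) ** 2), x0=np.zeros(2))
    print(result.x)  # approximately [3., 3.]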
Python is flexible; C and C++ are fast. How can you use them together? There are many ways to call C code from Python; we will learn about the major ones and find out when you would prefer one over another.
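One of the approaches compared is ctypes from the standard library; a minimal sketch (the library name assumes Linux):

    import ctypes

    libm = ctypes.CDLL("libm.so.6")        # load the C math library
    libm.cos.restype = ctypes.c_double     # declare the C signature
    libm.cos.argtypes = [ctypes.c_double]
    print(libm.cos(0.0))                   # 1.0, computed by C code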
We will present scikit-learn by focusing on the available tools used to train a machine-learning model. Then, we will focus on the challenge linked to model interpretation and the available tools to understand these models.
PyCOMPSs is a task-based programming model that enables the parallel execution of Python scripts by annotating methods with task decorators. At run time, it identifies the tasks' data dependencies, then schedules and executes the tasks in distributed environments.
kCSD is a Python package for localization of sources of brain electric activity based on recorded electric potentials.
Scikit-fdiff (formerly Triflow) has been developed to facilitate the building of mathematical models. It makes it quick to build and try many asymptotic falling-film models with different coupled phenomena (energy and mass transfer).
From the smallest microscopic objects to the largest scales of the Universe, our ability to study the world around us is predicated on the quality of the data we have access to.
Dask has evolved over the last year to leverage multi-GPU computing alongside its existing CPU support. We present how this is possible with the use of NumPy-like libraries and how to get started writing distributed GPU software.
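A minimal sketch of the pattern (assuming a GPU plus the cupy and dask packages): a Dask array whose chunks are CuPy arrays, so each chunk is processed on the GPU:

    import cupy as cp
    import dask.array as da

    x = da.from_array(cp.random.random((20000, 20000)),
                      chunks=(2000, 2000), asarray=False)
    print(float((x - x.mean()).std().compute()))  # reduced chunk-by-chunk on the GPU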
This talk will focus on the technologies needed to build a voice assistant. Its center point will be Samsung’s voice assistant Bixby, which is available in 8 languages across the world (5 EU languages) on a variety of Samsung mobile phones.
PhonoLAMMPS is a Phonopy interface to LAMMPS that makes it possible to calculate interatomic force constants and other phonon properties from a usual LAMMPS input file.
We will demonstrate how to explore and analyse massive datasets (>150GB) on a laptop with the Vaex library in Python. Using computational graphs, efficient algorithms and storage (Apache Arrow / hdf5) Vaex can easily handle up to a billion rows.
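A short sketch of the style of work (the file and column names are hypothetical; Vaex memory-maps the file instead of loading it into RAM):

    import vaex

    df = vaex.open("big_table.hdf5")
    df["r"] = (df.x**2 + df.y**2)**0.5         # lazy virtual column, no copy made
    print(df.mean(df.r, binby=df.x, shape=8))  # out-of-core binned statistic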
In this talk you will learn how QuTiP, the quantum toolbox in Python (http://qutip.org), has grown from a library into an ecosystem. QuTiP is used in education to teach quantum physics, and in research and industry for quantum computing simulation.
Good practice says you must write tests! But testing Machine Learning projects can be really complicated, and test writing often seems inefficient. Which kinds of tests should be written? How do you write them? What are the benefits?
A heavily XKCD-themed poster about writing a really reproducible behavioural paper in a Python environment.
The poster is also available online.
Apache Arrow, defining a columnar, in-memory data format standard and communication protocols, provides a cross-language development platform that already has several applications in the PyData ecosystem.
We introduce a method for creating synthetic data "to order" based on learned (or provided) constraints and data classifications. This includes "good" and "bad" data.
A review of DevOps tools as applied to data analysis pipelines, including workflow managers, software containers, testing frameworks, and online repositories for performing reproducible science that scales.
kESI is a new Python package for kernel-based reconstruction of brain electric activity from recorded electric field potentials using realistic assumptions about brain geometry and conductivity.
Caterva is a library on top of the Blosc2 compressor that implements a simple multidimensional container for compressed binary data. It adds the capability to store, extract, and transform data in these containers, either in-memory or on-disk.
Turn your Jupyter notebook into a beautiful modern React- or Vue-based dashboard using voila and Jupyter widgets.
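A minimal widget cell to get a feel for it; after saving the notebook, running "voila notebook.ipynb" serves it as a standalone dashboard (the widget logic below is illustrative):

    import ipywidgets as widgets

    slider = widgets.FloatSlider(description="amplitude", min=0.0, max=2.0)
    label = widgets.Label()

    def on_change(change):
        label.value = f"amplitude = {change['new']:.2f}"

    slider.observe(on_change, names="value")
    widgets.VBox([slider, label])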
We present ToFu, an open-source parallelized and cythonized Python library for modeling tomography diagnostics on nuclear fusion reactors. Its functionality (with realistic examples), architecture, and design will be shown.
Debugging Jupyter Notebooks has been one of the most requested features. In this presentation we give an overview of the current state and tools for debugging in Jupyter, and offer a glimpse of what is coming next.
Transonic is a new pure-Python package to easily accelerate modern Python-NumPy code with different accelerators (such as Cython, Pythran, Numba, and CuPy).
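A minimal sketch of the decorator style (this assumes the transonic package with at least one backend, e.g. Pythran, installed; the function is illustrative, and without a backend it simply falls back to plain Python):

    import numpy as np
    from transonic import boost

    @boost
    def mean_of_squares(a: "float64[:]"):
        return (a * a).mean()

    print(mean_of_squares(np.arange(10.0)))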
Modern data systems tend to heavily focus on optimizing for the system’s time. In this talk, we discuss the design of Modin, a DataFrame library, and how to optimize for the human system.
The modeling system ueflow allows for customizable, dynamic boundary conditions.
The modeler can write Python plugins to implement the behavior of these boundary conditions.
Jupyter notebooks are often a mess. The code produced works for one notebook, but it is hard to maintain or to re-use. In this talk I will present some best practices to make code more readable, easier to maintain, and re-usable.
Confounding effects are often present in observational data: the effect or association studied is observed jointly with other effects that are not desired.
PyFETI is a Python implementation of Finite Element Tearing and Interconnecting (FETI) methods. The library provides a massively parallel linear solver using the domain decomposition method, where problems are solved locally by a direct solver and at the interfaces iteratively.
The library leverages Python's richness to provide a uniform, user-friendly API for a zoo of industrial communication protocols used to control high-voltage engineering devices, together with abstractions and implementations for such devices.
We compared the performance of GPU applications written in C-CUDA and Numba-CUDA. By analyzing the GPU assembly code, we learned the reasons for the differences. This helped us optimize our codes written in Numba-CUDA, and Numba itself.
The RAMP (Rapid Analytics and Model Prototyping) framework provides a platform to organize reproducible and transparent data challenges. We will present the different framework bricks.
A summary of the MNE-Python changes introduced during the last two releases and highlights of future directions.
High Performance Computing (HPC) has been a pillar of the scientific community for years, with many in the Python community contributing to its continued development. However, one of the fundamental links in performance is the relationship between hardware and software.
Over the past year, and for the first time since its creation, NumPy has been operating with dedicated funding. NumPy developers think it has invigorated the project and its community. But is that true, and how can we know?
Machine learning is a search for the best combination of features, model, and hyperparameters. But as data grow, so does the search space! Fortunately, visual diagnostics can focus our search and allow us to steer modeling purposefully, and at scale.
In this talk, you'll learn how to transition from traditional machine learning tools, like scikit-learn, to deep learning with Keras, TensorFlow, and JAX. No prior experience with machine learning or with deep learning is required, and there is no need to install anything.
We interact with an increasing amount of data, but classical data structures and algorithms can't fit our requirements anymore. This talk presents probabilistic algorithms and data structures and describes the main areas of their applications.
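A classic example is the Bloom filter, sketched here in plain Python to show the idea (illustrative, not production-grade):

    import hashlib

    class BloomFilter:
        def __init__(self, size=1024, n_hashes=3):
            self.size, self.n_hashes = size, n_hashes
            self.bits = bytearray(size)

        def _positions(self, item):
            for i in range(self.n_hashes):
                digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
                yield int.from_bytes(digest[:8], "big") % self.size

        def add(self, item):
            for pos in self._positions(item):
                self.bits[pos] = 1

        def __contains__(self, item):
            # May give false positives, never false negatives
            return all(self.bits[pos] for pos in self._positions(item))

    bf = BloomFilter()
    bf.add("spam")
    print("spam" in bf, "ham" in bf)  # True, (almost certainly) False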
In this presentation we will cover some recently introduced features of the scikit-learn machine learning library, with a particular emphasis on the new implementation of Gradient Boosted Trees.
The IRAM 30m radio telescope is one of the best in the world. The telescope control software, monitoring, data archiving as well as some of the data processing code is written in Python. We will describe how and why Python is used at the telescope.
Modern hardware is multi-core. It is crucial for Python to provide efficient parallelism. This talk presents the current state of, and recent advances in, Python parallelism, in order to help practitioners and developers make better decisions on this matter.
Neural embeddings are a powerful tool for turning categorical values into numerical ones. Given reasonable training data, the semantics present in the categories can be preserved in the numerical representation.
The main scientific computing languages model the core data structures of data science, such as dataframes and n-d arrays, differently. In this talk, we present our approach to reconciling data science tooling in this polyglot world.
Video compression algorithms used to stream videos are lossy, and when compression rates increase they result in strong degradation of visual quality. We show how deep neural networks can eliminate compression artefacts and restore lost details.
In this talk we explore a recent addition to SymPy which makes it possible to find closed-form expressions for matrix derivatives. As a consequence, generating efficient code for optimization problems is now much easier.
I will walk through the entire Event Horizon Telescope experiment and the global effort that led to the first-ever direct image of a black hole revealed to the world on April 10th of this year.
Undesired bias in machine learning has become a worrying topic due to numerous high-profile incidents. In this talk we demystify machine learning bias through a hands-on example: we'll be tasked with automating the loan approval process for a company.
In this talk I will take you on a whirlwind tour of Numba, and you will be equipped with a mental model of how Numba works and what it is good at. At the end, you will be able to decide whether Numba could be useful for you.
A new and efficient Python/C++ modular library for real and complex response functions at the level of Kohn-Sham density functional theory.
PyPy, the fast and compliant alternative implementation of Python, is now compatible with the SciPy ecosystem. We'll explore how scientific programmers can use it.
Detect migration flows worldwide using geolocated Twitter data: routes, settlement areas, mobility to more than one country, spatial integration in cities, etc.
This talk is about emzed, a Python library to support biologists with little programming knowledge to implement ad-hoc analyses as well as workflows for mass-spectrometry data.
Multi-modal sources of information are the next big step for AI. In this talk, I will present the use of deep learning techniques for automated multi-modal applications and some open benchmarks.
This talk will present dislib, a distributed machine learning library built on top of PyCOMPSs programming model. One of the main focuses of dislib is solving large-scale scientific problems on high performance computing clusters.
In this talk, we present some of the benefits of writing extensions for Python in Rust. We then illustrate this approach on the vtext project, that aims to be a high-performance library for text processing.
There are several solutions to make Python faster, and choosing one is not easy: we would want it to be fast without sacrificing its readability and high-level nature. We tried to do it for an Astrodynamics library using numba. How did it turn out?
We present a collection of software for handling hyperspectral data acquisition and preprocessing fully in Python utilising Xarray for metadata preservation from start to finish.
pystencils speeds up stencil computations on NumPy arrays using a SymPy-based high-level description that is compiled into optimized C code.
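A minimal sketch in the documented style (the Jacobi-like update is illustrative):

    import numpy as np
    import pystencils as ps

    src, dst = ps.fields("src, dst: float64[2D]")
    update = ps.Assignment(dst[0, 0],
                           (src[1, 0] + src[-1, 0] + src[0, 1] + src[0, -1]) / 4)
    kernel = ps.create_kernel(update).compile()  # generates and compiles C code

    a, b = np.zeros((32, 32)), np.zeros((32, 32))
    kernel(src=a, dst=b)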
How can one of the largest code bases in open-source Geographical Information Science – QGIS – be enhanced and re-designed? Through the power of Python plugins. This talk demonstrates concepts for making QGIS more user-friendly.
PSYDAC takes input from SymPDE (a SymPy extension for partial differential equations), applies a finite-element discretization, generates MPI-parallel code, and accelerates it with Numba, Pythran, or Pyccel. We present design, usage and performance.
We present TelApy, a Python module to compute free-surface flows and sediment transport in the geosciences, with examples of how it inter-operates with other Python libraries for Uncertainty Quantification, Optimization, and Reduced Order Modeling.