Live Stream: https://youtu.be/fMRBOdHZNds
Words from the organizers of PyCon Sweden 2021.
Live Stream: https://youtu.be/clRHB3Jq2Ag
"Bridging Productivity, Portability, and Performance with Data-Centric Python" By Tal Ben-Nun, senior researcher with the Scalable Parallel Computing Laboratory, ETH Zurich
Live stream: https://youtu.be/veMSbl2fbXE
Flask is favoured for prototyping. It is easy to set up and run. However, choosing Flask as your main 'cheval de bataille', whether as a company or as an individual, requires solid grounding. Flask lets you choose your own ingredients, which sparks the joy of coding but bites if you are not careful. This talk covers the standard techniques not to be missed as well as new, audacious ones that help manage BIG codebases.
Live broadcast: https://www.youtube.com/watch?v=OcgLuOs1Hrc
This talk presents a solution whose architecture integrates several components: computational resources, databases, our own Python applications, and other open source ones. The idea is to show the problems and challenges posed by traditional scraping and how we have built solutions that reduce them, especially when the goal is to scrape en masse and in parallel. This also means building an automated flow for post-processing and transforming the data using machine learning services such as NLP and classification.
Live Stream: https://youtu.be/9ZQxvhdOTlA
PySpark is a distributed data processing engine widely used in Data Engineering and Data Science. Another way to think of PySpark is as a library that allows processing large amounts of data on a single machine or a cluster of machines. We will go through the basic concepts and operations so that you leave the workshop ready to continue learning on your own.
Live broadcast: https://www.youtube.com/watch?v=oBPNk5qN0L4
At H&M Group, we are increasingly adopting machine learning algorithms and rapidly developing successful use cases. One application is dynamic resource allocation (memory and CPU), which uses data-driven analysis and ML to decrease the cost of infrastructure.
The objective of this talk is to show how one H&M use case adopted an ML workflow using Airflow, Kubernetes and Docker, and how to solve the provisioning problem with an ML approach.
Live stream: https://youtu.be/DK9teAs72Do
Python web frameworks, like Flask, Quart, Tornado, and Twisted, are increasingly important for writing high-performance web applications. However, even they pose some bottlenecks, due either to their synchronous nature or to their use of the Python runtime. Most of them cannot speed themselves up because of their dependence on *SGIs. This is where Robyn comes in. Robyn tries to achieve near-native Rust throughput along with the benefit of writing code in Python. In this talk, we will learn more about Robyn, from what Robyn is to how to develop with it.
Live broadcast: https://www.youtube.com/watch?v=UujU3xOo038
What are the essential software engineering skills a data scientist should have to successfully bring their own work to production? We - Sergei Beilin, Ph.D., software engineering consultant in AI/ML, and his wife Natalia Beylina, Ph.D., data scientist - will go through the most important things a modern data scientist needs to know about software engineering, from both the software engineer's and the data scientist's points of view, drawing on our own experience.
We will discuss:
* programming language(s): how much of the language should one know?
* execution models, orchestration, containerization - Kubernetes, Kubeflow, Airflow, Spark/Databricks, etc.
* storage, network protocols/APIs, file formats - from CSVs to Delta, from JSON to Avro
* modern systems architecture concepts to understand
* and how the whole system architecture and infrastructure landscape will dictate the way you deploy and run your work
* tools and devops practices
* processes: integrating data scientists' workflow into a typical agile process
* bad practices to avoid: a few examples we've seen ourselves
Live stream: https://youtu.be/kO5Es7KKUIY
This talk provides a hands-on deep-dive into the wheel file format and Python packaging. First, we will slash the tire, see what's inside, and then build new wheels from scratch.
You will learn about the inner workings of a crucial part of the Python packaging ecosystem and understand what your tools do behind the covers.
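As a rough illustration of what "building a wheel from scratch" can involve, the sketch below assembles a minimal pure-Python wheel in memory using only the standard library. The package name `demo_pkg`, its version, and the `Generator` value are hypothetical and chosen only for this example; the talk itself may take a different route.

```python
import base64
import hashlib
import io
import zipfile

# Hypothetical package name and version, used only for illustration.
NAME, VERSION = "demo_pkg", "0.1.0"
DIST_INFO = f"{NAME}-{VERSION}.dist-info"

# A wheel is a zip archive: the package files plus a .dist-info directory.
files = {
    f"{NAME}/__init__.py": "GREETING = 'hello from a hand-built wheel'\n",
    f"{DIST_INFO}/METADATA": (
        "Metadata-Version: 2.1\n"
        f"Name: {NAME}\n"
        f"Version: {VERSION}\n"
    ),
    f"{DIST_INFO}/WHEEL": (
        "Wheel-Version: 1.0\n"
        "Generator: handmade-sketch\n"
        "Root-Is-Purelib: true\n"
        "Tag: py3-none-any\n"
    ),
}


def record_line(path: str, data: bytes) -> str:
    """Build one RECORD row: path, urlsafe-base64 sha256 without padding, size."""
    digest = base64.urlsafe_b64encode(hashlib.sha256(data).digest())
    return f"{path},sha256={digest.rstrip(b'=').decode()},{len(data)}"


def build_wheel() -> bytes:
    """Return the bytes of a wheel; on disk it would be named
    demo_pkg-0.1.0-py3-none-any.whl."""
    buffer = io.BytesIO()
    record = []
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as wheel:
        for path, text in files.items():
            data = text.encode("utf-8")
            wheel.writestr(path, data)
            record.append(record_line(path, data))
        # RECORD lists itself with an empty hash and an empty size.
        record.append(f"{DIST_INFO}/RECORD,,")
        wheel.writestr(f"{DIST_INFO}/RECORD", "\n".join(record) + "\n")
    return buffer.getvalue()


wheel_bytes = build_wheel()
with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as wheel:
    wheel_names = wheel.namelist()
print(wheel_names)
```

Opening the result with `zipfile` shows the same layout you would see when unzipping any wheel downloaded from PyPI.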
Live Stream: https://youtu.be/5EWHYxnOTyQ
Not fading away: A tale about a 20-year old Python project By Érico Andrei, Python Software Foundation Fellow, Plone Foundation Vice-President
Live broadcast: https://www.youtube.com/watch?v=gwLJZVoXWlg
Apache Airflow has become one of the most popular data tools. Due to its high complexity, it can be challenging for teams and companies to adopt. For example: how to effectively construct an orchestration architecture on diverse cloud platforms, how to productively accelerate your engineering and machine learning workloads at scale, and how to smartly decouple your Python codebase for professional testing and easy maintenance.
Live Stream: https://youtu.be/qWvJSIgOcPU
With a lot of changes under the hood in Airflow 2.0, this workshop aims to give an overview of the major updates in Airflow 2.0 compared with 1.0, the major components and workings of Airflow, and a hands-on demo of implementing and managing an end-to-end Machine Learning pipeline. Without a pipeline in place, managing multiple Machine Learning stages in production can be difficult. The workshop gives an overview of a simplified process for managing Python-based ML projects using Airflow.
Live broadcast: https://www.youtube.com/watch?v=EV7SkhRxemA
How can you show what a Machine Learning model does once it's trained? In this talk, you're going to learn how to create Machine Learning apps and demos using Streamlit and Gradio, Python libraries for this purpose. Additionally, we'll see how to share them with the rest of the Open Source ecosystem. Learning to create graphical interfaces for models is extremely useful for sharing them with other people who are interested in them.
Live Stream: https://youtu.be/rka62mJ-vfo
A discussion with company recruiters from different areas about the expectations placed on Python programmers, current trends, and today's difficulties.
Live stream: https://youtu.be/tHlMw9zFgQE
Popular programming language index websites (TIOBE index) and developer surveys (Stack Overflow) place Python as one of the fastest-growing programming languages. However, this popularity also puts it in the target range of attackers. Attackers perform malicious dependency attacks and exploit misconfigured tools to reveal confidential information. In this talk, we will discuss identifying common security issues in Python code and handling malicious dependency attacks using Safety.
Live broadcast: https://www.youtube.com/watch?v=9fwOBMWRTiI
Facial recognition has been a challenging task for a long time. Nowadays, state-of-the-art models based on deep learning can reach and surpass human-level accuracy. In this talk, you are going to learn from its creator how to build highly scalable facial recognition pipelines in Python with the DeepFace library.
DeepFace is a lightweight facial recognition and facial attribute analysis (age, gender, emotion / facial expression, race / ethnicity) library for Python. It wraps many state-of-the-art face recognition models: VGG-Face, Google FaceNet, OpenFace, Facebook DeepFace, DeepID, Dlib and ArcFace. Experiments show that human beings score 97.53% on the LFW dataset, whereas VGG-Face, FaceNet, Dlib and ArcFace have already passed that level. OpenFace, DeepID and DeepFace score close to it as well. You can build and run any one of those cutting-edge models with just a few lines of code. The library has almost 2K stars on GitHub and 200K installations from PyPI / pip.
Live stream: https://youtu.be/W1qOPma747k
Configuration files used to manage Infrastructure as Code are traditionally implemented as YAML or JSON text files and are missing most of the advantages of modern programming languages. Wouldn't it be better to use the expressive power of Python to define your cloud infrastructure? The AWS Cloud Development Kit (AWS CDK) is an open-source framework from AWS that enables developers to harness the full power of modern programming languages to define reusable cloud components and provision applications built from those components using AWS CloudFormation. In this session, we'll quickly cover the basic concepts of the CDK, and then we'll spend the majority of our time building a serverless application with the CDK. We'll show you how to quickly assemble your AWS infrastructure using the Python CDK.
Video stream: https://youtu.be/0HaIYpxTzX8
Lightning talks
Live broadcast: https://www.youtube.com/watch?v=cVEkJqcmivQ
Marketing attribution is one of the trickiest problems to crack for data scientists working with marketers. To reach potential customers, one needs to measure the value of the campaigns and channels that the customers interact with. It's easier said than done. One solution to this problem is a Markov chain. We will see how we can implement a Markov chain for channel attribution.
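One common way to turn a Markov chain into channel attribution is the "removal effect": estimate the conversion probability with all channels present, then again with one channel removed, and credit each channel in proportion to the drop. The sketch below assumes that approach and a first-order chain; the journeys, channel names, and function names are hypothetical, and the talk may use a different formulation.

```python
from collections import Counter, defaultdict

# Hypothetical customer journeys: ordered channel touches and the outcome.
journeys = [
    (["search", "email"], True),
    (["social"], False),
    (["social", "search"], True),
    (["email"], False),
]


def transition_probabilities(journeys):
    """Estimate first-order transition probabilities between states."""
    counts = defaultdict(Counter)
    for channels, converted in journeys:
        states = ["start", *channels, "conversion" if converted else "null"]
        for current, following in zip(states, states[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for state, nexts in counts.items()
    }


def conversion_probability(transitions, removed=None, sweeps=200):
    """Probability of eventually converting from "start".

    Solved by repeated sweeps; traffic into a removed channel is lost.
    """
    value = defaultdict(float)
    value["conversion"] = 1.0
    for _ in range(sweeps):
        for state, nexts in transitions.items():
            if state == removed:
                continue
            value[state] = sum(
                p * value[nxt] for nxt, p in nexts.items() if nxt != removed
            )
    return value["start"]


def attribute(journeys):
    """Share of credit per channel, proportional to its removal effect."""
    transitions = transition_probabilities(journeys)
    baseline = conversion_probability(transitions)
    channels = [state for state in transitions if state != "start"]
    effects = {
        channel: 1 - conversion_probability(transitions, removed=channel) / baseline
        for channel in channels
    }
    total = sum(effects.values())
    return {channel: effect / total for channel, effect in effects.items()}


baseline = conversion_probability(transition_probabilities(journeys))
attribution = attribute(journeys)
print(baseline, attribution)
```

The sketch assumes at least one journey converts; with real data you would also want far more journeys than the four shown here.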
Live Stream: https://youtu.be/BgzIaEzXEBU
Many times we have to write Python extensions, particularly in C, to perform various system operations or to do calculations faster. But writing safe C code is always difficult, even for an experienced developer. This is why writing Python extensions in Rust is becoming more popular among developers who care about speed and security at the same time. In this workshop we will learn how to create a Python module using Rust. No previous Rust experience is required.
Video link: https://youtu.be/lWfJfviWIBU
The OWASP Top 10 is a reference document outlining the 10 most critical security concerns for web applications. In this talk, we will see how Django's underlying security protects it against the OWASP Top 10 vulnerabilities, ranging from SQL injection attacks to authentication and CSRF. It is one of the most complex yet interesting topics in Django, and one that makes it an extremely powerful web framework.
Live broadcast: https://www.youtube.com/watch?v=j90tdZyK6FA
The move to the cloud has opened a world of new possibilities in software development.
It's so easy to spin up resources in the cloud, and together with the adoption of DevOps, software developers are more empowered than ever before. Of course, this also puts more demand on software developers to take full control and to know the complete cycle, from deploying infrastructure to developing and deploying code. Luckily, this process has a lot of benefits and is less reliant on the skills of key persons: if infrastructure can be deployed as code, it can also be automated with different tools.
The end goal is to be able to deploy more code enhancements and at the same time benefit from the rapid pace of hardware and cloud improvements.
Live broadcast: https://www.youtube.com/watch?v=iw9uS8yLax8
Machine learning is not only an interesting technology to use today; it is also appreciated by management, who like to hear that the organisation is using “machine learning” to solve time series challenges, such as demand planning within supply chain management. However, this can result in time spent on complex modelling for problems that can generally be solved more quickly with much simpler models that are easier to deploy and sustain long-term.
Therefore, in this talk we'll show how simple models can not only give better results but also reduce complexity in data pre-processing, model development and final deployment. We will look at an example within supply chain management and demand planning for a product and discuss different scenarios based on multiple types of historical demand data.
The presentation will show the actual code, but a big focus will be on the strategic decisions around selecting models and how to deploy them.
Live Stream: https://youtu.be/4z0ivHhX4h0
Words from the organizers of PyCon Sweden 2021.
Live Stream: https://youtu.be/hv-jKovzeHI
Managing cloud infrastructure as code in Python By Alexey Isavnin, senior software developer at Elisa Automate, founder of Rays of Space company
Live Stream: https://youtu.be/Iv9KA2JWwVw
Join Carolina & Victoria, developers at 46elks, for a code along workshop 👩🏻💻
We will be building an answering machine with Flask. Using Python & 46elks you can set up your very own answering machine.
What you need to follow this code along:
- A 46elks account, here's a link with some credits to test your answering machine
- A computer and excitement to code some cool stuff 👩🏻💻
We will be coding together for about 60 minutes and then we'll answer any questions you might have (literally, ask us anything), or just hang out and get to know new developer friends 🥳
Live stream: https://www.youtube.com/watch?v=Ew4tKVem6F8
In this talk, we aim to find out whether polarization is induced in a neural network by feeding it newspaper articles with manufactured sentiments, chosen according to the AllSides Media Bias chart and the level of faith of people on various sides of the political spectrum. The project consists of a set of experiments on similar datasets from news agencies across the various subsets of the "media bias" chart. Perceived news media bias is common among consumers of various political affiliations. While anecdotal evidence of this exists, and annotated datasets exist that aim to capture the "spin" a news agency puts on certain events and entities, whether this is a widespread problem and whether a neural network can detect it topically or temporally still needs to be explored. The news media bias analysis is modelled as a Natural Language Processing sentiment analysis task and a fake news binary classification task, in order to deduce the level of polarization in a neural network by feeding it headlines, embedded using pre-trained sentiment models, from news publications across the political spectrum. When it came to fake news vulnerability, news from all kinds of perceived politically affiliated news media held up well against a fake news dataset, with very good accuracy: none of the accuracies dropped below 95%. This is a significant result that, to some extent, debunks the AllSides Media categorization, if taken as simplistically as it is presented. These experiments can be extended in the future to include entity-based topical studies and to educate the populace about their perceived biases.
Live stream: https://youtu.be/Sp0tEGqFfN8
Different programming languages have different functionality, different paradigms, and different styles. We have certainly seen other low-level languages like C++ adopting more “pythonic” themes in recent years, like foreach loops. But what about the opposite? Did you ever wonder how we could implement a smart pointer in Python? Whether we can add Java’s final keyword for real constants? What exactly the inspect module is useful for? How we can get private methods in classes?
We will take a deep dive into Python's fundamentals to discover how you can make things like C++-style input/output, like cout << "Hello world" << endl; or cin >> my_var;, a reality in Python, using the exact same syntax. And, of course, why you really, really shouldn't.
What exactly does pythonic mean? What makes python what it is today? Hint: It’s about more than just the walrus operator.
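To give a flavour of the C++-style output mentioned above, here is a minimal sketch that overloads the `<<` operator through `__lshift__`. The class name `OutputStream` and the optional `target` parameter are assumptions made for this example, not the speaker's implementation, and only the `cout` side is shown.

```python
import io
import sys


class OutputStream:
    """Hypothetical stand-in for C++'s std::cout."""

    def __init__(self, target=None):
        # None means "use whatever sys.stdout is at write time".
        self._target = target

    def __lshift__(self, value):
        target = sys.stdout if self._target is None else self._target
        target.write(str(value))
        return self  # returning self is what allows << to be chained


endl = "\n"
cout = OutputStream()

cout << "Hello world" << endl

# The same class writing into an in-memory buffer instead of stdout.
buffer = io.StringIO()
OutputStream(buffer) << "answer: " << 42 << endl
captured = buffer.getvalue()
```

Because `<<` is left-associative, each step returns the stream and the next value is written to it in turn. That it works is no argument that you should write Python this way.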
Live stream: https://youtu.be/Lsi1ZhmbNDc
Stuck in a deeply nested if...else when traversing the pyramid of doom, you pause for a minute to catch your breath. The program’s logic eludes you and it is getting increasingly tiresome to keep track of all the twists and turns of the various conditions and possible return values.
You start to dream a dream of a flattering flattening of all this code. A dream of refactoring this bewildering maze into an orderly space, devoid of surprising and unexpected behaviour. A space where things have their obvious place and purpose.
You decide that you just might need to set some rules.
Enter the Rules Engine.
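A minimal sketch of the idea, assuming a "first matching rule wins" design: each branch of the nested if...else becomes a named rule with a condition and an action, kept in a flat, ordered list. The `Rule` class, `run_rules` function, and the shipping example are hypothetical and are not taken from the talk.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Rule:
    name: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], Any]


def run_rules(rules, facts, default=None):
    """Return the action result of the first rule whose condition holds."""
    for rule in rules:
        if rule.condition(facts):
            return rule.action(facts)
    return default


# Hypothetical shipping-cost rules; order expresses priority.
shipping_rules = [
    Rule("free over 500", lambda order: order["total"] >= 500, lambda order: 0),
    Rule("express", lambda order: order["express"], lambda order: 99),
    Rule("standard", lambda order: True, lambda order: 49),
]

print(run_rules(shipping_rules, {"total": 120, "express": True}))
```

Adding or reordering a rule is then a one-line change to the list rather than another level of indentation.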
Live stream: https://www.youtube.com/watch?v=0a0c-aMj1Xs
Optimization libraries such as SciPy or Nevergrad are commonly used in different data science workflows, such as choosing optimal hyperparameters for a machine learning model or taking actions based on forecasts. In this presentation, we will discuss how such an optimizer can be used to build reward configurations for games (by reward configurations we mean bundles of different in-game items that players may get for completing different tasks/quests in a game). Using rewards in Candy Crush Soda as an example, I will show how the problem can be solved using the Nevergrad library from Facebook.
Live stream: https://youtu.be/YnxG8jABqaU
A brief overview of how the Python ecosystem can be used to build things that help you boost your skills and build the next major/minor version of yourself. I'll showcase a few approaches and mathematical models for motivation, and how you can build tools that lower the resistance to doing those things you think you need to be doing more often.
Live Stream: https://youtu.be/oIWBW2usic8
The best way to learn Python - for the absolute beginners and improvers by Cheuk Ting Ho, Developer Relations Lead at TerminusDB, Python Software Foundation Fellow, organizer of EuroPython
Live stream: https://youtu.be/y0PTozH9mZs
The purpose of this talk is to share my work as a professor in the Bachelor of Information Systems at the University of Minas Gerais, using the Python language to teach Object Oriented Programming II. We are going to talk about the difficulties students encounter in learning this subject, how we managed to overcome them by using a modern language with a shorter learning curve, and how this can contribute to a lower dropout rate from the course. We will cover the difficulties encountered, the pedagogical approach used, and the exercises practised with students.
Live stream: https://www.youtube.com/watch?v=Y9OPX75ax0M
Controlled experiments such as A/B tests are a gold standard for determining whether changes to a website significantly impacted user behaviour; however, they are not always possible. In this talk we walk through an IPython Notebook and describe a non-parametric method for determining whether changes to e-commerce product pages impacted conversion to basket without the use of controlled experiments.
Live Stream: https://youtu.be/gnFzZRkQZ2c
This workshop is a zero-to-hero tutorial on how to solve a classification task using deep learning. The tutorial kicks off by demonstrating a simple classification task on synthetic data, first in low and then in high dimension. Then, a harder classification task based on Fashion-MNIST, a famous dataset containing images of clothes, will be tackled. Apart from solving the classification task itself, we will show how to generate and analyze embedding vectors that can be used to solve other downstream tasks, different from the original classification problem on which the model was trained. Finally, we are going to face a more advanced type of classification problem, namely, predicting links on a graph using Graph Neural Networks. Link prediction will be demonstrated on an open source dataset that contains information about collaborations among authors of scientific papers. The aim of this workshop is to show how we can use Python to solve the aforementioned tasks, taking into account both the data science aspects and the engineering and project lifecycle ones. In particular, the Python packages that we are going to cover in the workshop are PyTorch, PyTorch-Lightning, and Deep Graph Library.
Live stream: https://youtu.be/cuNEOtLbB14
The Robot Operating System (ROS) is a set of software libraries and tools that help you build robot applications. In this talk, we will discuss how to create your own custom robot and simulate it in Gazebo along with ROS. We will also learn to add cameras and other sensors, which will enable us to move the robot and perform image processing using Python.
Live stream: https://www.youtube.com/watch?v=bZjHWgLnWs8
What can a developer teach a data analyst about data analysis?
A few lines of Python code may be enough to solve a tricky data cleaning challenge.
Functions can stop you from getting lost in many copies of very similar code.
Tips for writing larger programs without tearing your hair out.
Start writing code which is still useful in years to come, and which evolves without degrading into a big mess.
I will share examples of how I've used pure Python in my data analysis and give you simple tips on applying software development best practices to your code.
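As a small illustration of the points above, the sketch below wraps one data cleaning step in a function so it can be reused instead of copied. The input format (amounts typed with spaces and decimal commas) and the name `clean_amount` are assumptions invented for this example.

```python
def clean_amount(raw):
    """Convert text such as ' 1 200,00 ' to a float; return None if it is not a number."""
    text = raw.strip().replace(" ", "").replace(",", ".")
    try:
        return float(text)
    except ValueError:
        return None


# Hypothetical raw values as they might appear in an exported spreadsheet.
rows = [" 12,50", "7", "n/a", "1 200,00 "]
cleaned = [clean_amount(row) for row in rows]
print(cleaned)
```

Keeping the rule in one function means a later change, such as accepting a currency suffix, is made in a single place.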
Live stream: https://www.youtube.com/watch?v=rqy3OZn4y-4
The cutting efficiency of a chainsaw is related to the hardness of the wood. For example, it is affected by the existence of knots (hard structure areas) and cracks (no material areas). The current practice involves clean cuts by avoiding knots and cracks. Therefore, estimating the relative wood hardness by identifying the knots and cracks beforehand can significantly automate the process of regulating the chain properties, e.g., consumed power, force, etc., which in turn improves the chain's efficiency.
In this talk I will share how I have implemented Mask-RCNN to identify and segment defects in wood cuts and how the result can be used to understand wood hardness and improve the cutting efficiency of a chainsaw.
Live stream: https://youtu.be/ji8wYJE0c1I
Taking ideas to market faster remains key to any good DevOps strategy.
Boilerplate application code, configurations and Infrastructure-as-Code (IaC) are the key components that enable this.
Leveraging template engines to build these is an effective strategy to enhance your speed.
Aligning with minimalism and keeping things agnostic, the talk shares a simple and easy-to-use code base to generate all of these.
Most organizations effectively manage boilerplate code with Git-based services. However, that does not answer the question of "what to customize?" once you have the code cloned.
This is where customizable templates are key: they identify the customizable bits and make it easy to inject the right parameters/data into them.
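The idea of a template that both exposes its customizable bits and accepts parameters can be sketched with the standard library's `string.Template`. The configuration fields, placeholder names, and helper functions below are hypothetical; the talk's own code base may rely on a different template engine.

```python
from string import Template

# Hypothetical boilerplate for a deployment configuration.
CONFIG_TEMPLATE = Template(
    "service: $service_name\n"
    "replicas: $replicas\n"
    "image: $registry/$service_name:$tag\n"
)


def customizable_fields(template):
    """List the placeholder names, i.e. the answer to "what to customize?"."""
    names = set()
    for match in template.pattern.finditer(template.template):
        name = match.group("named") or match.group("braced")
        if name:
            names.add(name)
    return sorted(names)


def render_config(**parameters):
    """Fill in the template; substitute() raises KeyError if a value is missing."""
    return CONFIG_TEMPLATE.substitute(parameters)


fields = customizable_fields(CONFIG_TEMPLATE)
rendered = render_config(
    service_name="orders", replicas=3, registry="registry.example.com", tag="1.0.0"
)
print(fields)
print(rendered)
```

Listing the placeholders first gives anyone who clones the boilerplate an explicit checklist of values to supply.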
About the speaker:
Raza Balbale is currently a Snr. Architect/ Manager at Cognizant Technology Solutions, US - part of the Connected Products BU's Product Engineering Team. He frequently uses Python as part of his DevOps / acceleration toolkit.
Live Stream: https://youtu.be/SRqr8OSY0oU
Words from the organizers of PyCon Sweden 2021.