Python Conference APAC 2024
Workshop "Building Python Tools for Probing the Digital Footprint" ini dirancang untuk memberikan pemahaman mendalam tentang cara membangun dan menggunakan alat berbasis Python untuk menganalisis jejak digital. Peserta akan diperkenalkan pada teknik dan metodologi terbaru dalam pengumpulan, analisis, dan visualisasi data digital. Melalui serangkaian sesi praktikum dan teori, peserta akan mempelajari cara mengembangkan skrip Python yang efisien untuk mengekstraksi informasi dari berbagai sumber data online, seperti media sosial, situs web, dan basis data publik. Workshop ini juga mencakup teknik penelusuran email, username, dan informasi identitas digital lainnya. Selain itu, workshop ini akan membahas isu-isu etis dan privasi yang terkait dengan analisis data digital, serta memberikan panduan tentang praktik terbaik dalam pengelolaan dan penyimpanan data. Dengan mengikuti workshop ini, peserta diharapkan mampu membangun alat yang dapat membantu dalam penelitian, bisnis, dan aplikasi lainnya yang membutuhkan analisis mendalam terhadap jejak digital.
One of the challenges for a machine learning project is to deploy it. Fast API provides a fast and easy way to deploy a prototype with less software development expertise and yet allow it to be developed into a professional web service. We will look at how to do it.
MicroPython is a lean and efficient implementation of the Python 3 programming language that includes a small subset of the Python standard library and is optimised to run on microcontrollers and in constrained environments.
Since its first initiation in 2014, micropython has already supported many microcontrollers including the ESP32S3 which is the main brain in the Xiao ESP32S3 Sense development board.
In this page I will guide on how to use the the Xiao ESP32S3 Sense capability using the simple and easy syntax from micropython
The robots are coming... in this fun hands-on session we will use wheeled robots based on the popular BBC Micro:bit microcontroller and programmed using MicroPython. We will learn how to navigate, how to avoid obstacles and how to follow a line
Large Language models are all over the place, driving the advancement of AI in today's era. For enterprises and businesses, integrating LLM with custom data sources is crucial to provide more contextual understanding and reduce hallucinations. In my talk, I'll emphasize on building an effective RAG pipeline for production using Open Source LLMs. In simple words, Retrieval Augmented Generation involves retrieving relevant documents as context for user queries and leveraging LLMs to generate more accurate responses.
In this workshop, we will cover the very basic of using PyO3 - rust library that package rust crates into Python modules. This is the most popular tool in terms of creating Python libraries with Rust.
In this workshop, participants will embark on a comprehensive journey to build an AI agent using advanced tools and techniques such as Retrieval-Augmented Generation (RAG), Langchain, and Reasoning Engine. Over the course of 90 minutes, attendees will gain hands-on experience and valuable insights into the following key areas:
-
Preparing Documents for RAG: Learn how to prepare documents for Retrieval-Augmented Generation by embedding, chunking, and storing them in a vector database. We will utilize the pgvec extension in PostgreSQL to efficiently manage and query our document vectors.
-
Creating a Document Retriever Tool: Discover how to develop a powerful document retriever tool that performs efficient searches and retrievals from the vector database. This tool will be crucial for augmenting prompts with relevant information, enhancing the AI agent's responses.
-
Developing API Tools for Third-Party Interaction: Explore the process of creating API tools that enable the AI agent to interact seamlessly with third-party systems API. These tools will expand the agent's capabilities, allowing it to execute complex tasks and retrieve external data.
-
Building an Agent in Langchain: Dive into the creation of an intelligent agent using Langchain. Participants will learn how to manage chat histories through session stores (both in-memory and persistent storage like Redis) and leverage various tools to empower the agent to make decisions and perform actions autonomously.
-
Deploying the AI Agent to the Cloud with Reasoning Engine: Gain practical knowledge on deploying the AI agent to the cloud using the Reasoning Engine. This section will demonstrate how to transition from development to a production-ready prototype swiftly, ensuring the agent's scalability and reliability.
By the end of this workshop, participants will have a robust understanding of building and deploying an AI agent, equipped with the skills to create intelligent systems that can autonomously interact with users and third-party services. This workshop will provide a comprehensive guideline, empowering attendees to innovate and implement AI solutions effectively in their own projects.
This workshop aims to demonstrate how web scraping task can be made easy with Scrapy. Scrapy is an open source web scraping framework written in Python. It allows developers to focus on developing web crawlers without being bothered by lower-level details such as managing HTTP request scheduling and concurrency. We will use Scrapy to extract data from toscrape.com, a web scraping sandbox that can be used by anyone to learn web scraping. Participants will gradually learn how to perform web scraping, starting from simple task like extracting data from a single web page to more complex tasks such as extracting data from AJAX endpoints.
The target participants of this workshop are individuals with basic programming skill (not necessarily in Python) who understand basic concepts of HTTP and HTML document structure.
KEYNOTE 1
KEYNOTE 2
This session will showcase how to enable BI and ML on a modern two-tier data architecture for a business continuity plan, improve real-time analysis for a financial service application, create a centralized BI dashboard for organizational performance forecasting, implement an automated ETL process for cross-functional collaboration, and share experiences in creating a data-intelligent service layer for rapid development.
Harness the Raspberry Pi to create a voice-activated bot that tells you your schedule for the day, using both speech recognition and synthesis. Designed for beginners, this session will guide you through a hands-on project that brings the power of Raspberry Pi to your everyday life.
In this talk, I will explore the cutting-edge technology of Multimodal Retrieval Augmented Generation (MRAG) using Python. This innovative approach combines the power of natural language processing, computer vision, and machine learning to enhance the generation of text and images based on complex queries. We'll elaborate into how MRAG leverages large language models and retrieval systems to provide contextually rich and accurate outputs.
Ensuring that the website ranks well on search engines is crucial for visibility and success. This project will showcase a web application designed to check Google rank for specific URLs and analyze website content to extract potential keywords that can boost rankings. Utilizing Flask for the back end, Natural Language Processing (NLP) for content analysis, and Vue.js for an interactive front end, this project offers a comprehensive tool for SEO optimization.
Automatic Speech Recognition (ASR, a.k.a. speech-to-text) , also known as speech-to-text, is a valuable technology, with models like Whisper empowered by Python. Whisper is available via APIs; however, it takes a long time to process voice data. I would like to introduce several ways to speed up the Whisper model using local/remote GPUs and TPUs in Google Cloud.
With all the hype about applications that uses machine learning, I think there is one key aspect that developers tend to forget: "Performance check and monitoring".
ML and AI services have become very accessible and can be integrated into any application you can think of. But what do you do after you integrated your ML models to your application? How do you know that the output of the ML models are correct and up to standard? What are the signs that the model's performance is changing and how do you take action on such changes?
Typically, these problems are just discussed on a theoretical and research level. But how can we carry over these techniques and apply it to our application? Not just that, how can we make it so that monitoring and performance check is as simple as writing a unit test (or not).
In this session, we will learn some simple but effective ways on model performance monitoring as well as look at some python implementation and architecture consideration. We will also check some best practices and some real life scenario on how model monitoring works.
Intro to the code sprint session
cPython, FreeCAD, QGIS
Enhancing the performance of Python scripts is a critical challenge for developers striving to optimize efficiency and reduce execution time. This presentation will delve into various methodologies for improving Python script performance, including threading, multiprocessing, and application refactoring.
Python's Global Interpreter Lock (GIL) presents significant challenges in achieving true parallelism, as it permits only one thread to execute at a time, even in a multi-threaded context. We will discuss the implications of the GIL on threading and highlight scenarios where No GIL implementations or workarounds can be advantageous.
Although threading and multiprocessing enable parallel execution, they do not inherently ensure faster runtimes. Threading can be effective for I/O-bound tasks, whereas multiprocessing is more suitable for CPU-bound operations. However, both approaches introduce complexity and potential overhead that can negate performance gains.
Conversely, refactoring the application can lead to substantial performance improvements. By optimizing algorithms, reducing complexity, and leveraging efficient data structures, developers can achieve significant runtime reductions. This presentation will provide practical examples and case studies illustrating how refactoring can be a more effective strategy for performance enhancement compared to merely adding parallelism.
Additionally, this talk will outline our journey to enhance our infrastructure automation, employing several approaches and comparing each to achieve notable improvements. By implementing these strategies, we accomplished a 90% reduction in running time and met our SLA, thereby enhancing productivity. This presentation will offer practical examples and an in-depth exploration of approaches that may be applicable to other use cases as well.
We have a Code Sprint introduction today at #PyConAPAC 2024, starting at 1 PM!
Sessions led by:
- CPython session led by Anthony Shaw
- FreeCAD session led by Ajinkya Dahale
- QGIS session led by Ismail Sunni
Location: Amphitheater (2nd floor, in front of the piano).
After the session, join fellow Pythonistas for networking or dive into the code sprint at the Adaro Internet Center on the 2nd floor!
Facial recognition in low light conditions is a big challenge for traditional facial recognition systems. In this talk, the speaker will delve into how each of various dark image feature extraction algorithms (HOG, LBP, Fisherface, DeepID, etc), implemented using Python, can significantly impact facial recognition performance in low lighting condition based on speaker's research.
This talk will be useful for students, researchers, developers, and practitioners who are interested in facial recognition and computer vision.
In this talk, we will explore how to streamline full-stack development for the entire team by building OpenAPI-powered APIs with FastAPI and integrating them with Next.js using TypeScript and React Query. We’ll discuss how FastAPI’s automatic OpenAPI documentation fosters clearer communication between backend and frontend developers, reducing misalignment and speeding up collaboration. On the frontend, tools like Next.js, OpenApi-TypeScript, and React Query not only enhance the developer experience with better type safety and state management, but also create a smoother handoff between teams. By aligning backend and frontend workflows, teams can work more efficiently, minimize errors, and improve overall project cohesion.
In recent years, Python has been actively enhanced with type hints, with useful features being added with each major update.
In addition, a number of libraries and tools that use type hints have been increased and are popular in the community.
Often when people think of the use of 'typing' in programming languages, the first thing that comes to mind is static analysis to detect type inconsistencies and reduce bugs or run-time errors.
However, the Python ecosystem uses type hints to flesh out various ideas.
Concrete examples include the following
- FastAPI
- Automatic generation of API documentation
- Dependency Injection
- Pydantic
- Data validation
- SQLAlchemy 2.0
- Determining data types or constraints in databases
This session will show how these libraries implement the ideas.
I believe that type hints in Python still has a lot of potential.
Through this session, you will be able to use type hints more conveniently and flesh out new ideas by yourself.
The process of selecting a university major is a critical decision for new students, often accompanied by uncertainty and confusion. Traditional academic counseling services, while valuable, may not always be accessible or sufficient to address the diverse needs of every student. This research paper presents the development and implementation of a Python-based chatbot, enhanced with AI techniques, designed to provide personalized major recommendations to new students.
The chatbot utilizes a combination of natural language processing (NLP) and machine learning algorithms to analyze students' interests, academic strengths, and career aspirations, offering tailored suggestions for suitable majors. The deep learning model is trained on a comprehensive dataset comprising student profiles, historical academic performance, and successful major outcomes, ensuring accurate and relevant recommendations.
By integrating this intelligent chatbot into the academic counseling framework, we aim to augment the decision-making process for new students, offering them valuable references and insights into potential majors. This research underscores the potential of leveraging advanced AI technologies in educational support systems using Python, paving the way for more accessible and personalized academic advising solutions.
Which companies will potentially win in a dystopian future? Let an agentic framework imagine it and try to find out. Buy or short, it's up to you ;)
Our proposed framework consist on:
- Data ingestion: News, financial statements and OHLC information ingestion with data pipelines (Mage), embeddings and vector database (Weaviate).
- Merging previous knowledge with new data: Exploration on RAG system to feed the info to the LLM and keep the model updated.
- Create the investment crew and giving them tools LLM agents mimicking different investment roles (LangGraph) + financial toolset combination in order to make informed decisions.
We will try to distill for you our learnings while building it and keep the buzzwords apart... as much as possible.
Database replication is a concept of storing the same data in multiple databases to improve fault tolerance, data accessibility and performance. In this talk, we will discuss why and when database replication is useful, along with practical examples and best practices to use in the codebase when working with separate writer and reader databases.
For beginners, contributing to open source can seem like a daunting mystery. "Which project should I contribute to? Is contributing limited to coding, or can I design as well? If I decide to code, which part should I fix? How can I replicate bugs in my environment?" These questions can be overwhelming. But worry not, Django is a perfect starting point for familiarizing ourselves with open source. If you know Python and web development with HTML and JavaScript, you're already set! Many tutorials on how to contribute to Django are available online.
What if we prefer to ask someone who knows better? In that case, good news! Djangonaut Space is a fun place for everyone to explore open source together in a group. Participants will work in groups based on their interests. Are you interested in the Django Core codebase, or do you prefer working with third-party packages? Not only will we deepen our technical skills, but we will also learn from experienced Django leaders on how to sustainably work with open source. We will share personal experiences of navigating the Django codebase and explore how Djangonaut Space can enhance this journey.
In an era where data drives investment decisions, harnessing the power of Python to analyze historical financial data can provide valuable insights into the most profitable investment opportunities in the Asia-Pacific (APAC) region. This talk will explore how Python's versatile libraries and tools can be employed to gather, process, and analyze financial data from the past decade, highlighting the top-performing financial instruments. Whether you are an experienced investor, a data scientist, or a Python enthusiast, join us to discover actionable insights and practical techniques for making informed investment decisions based on historical performance in one of the world's most dynamic markets.
In this talk, we will explore how to leverage Python for text analytics using data from Medium articles. This session will cover various Natural Language Processing (NLP) techniques to analyze and extract valuable insights from text data. Attendees will learn about data collection, preprocessing, sentiment analysis, and topic modeling, illustrated through practical examples and visualizations. Whether you are a content creator, data scientist, or Python enthusiast, this presentation will provide you with actionable methods and tools to enhance your text analytics capabilities.
Linkedin's real-time infrastructure can process more than 4 trillion events daily with 3000 pipelines that deployed across data center to serve over 950 million users worldwide. To achieve this, they are using Apache Beam for building streaming pipeline. Good news, Apache Beam has an SDK in Python.
Discover how to harness machine learning to automatically tag personal transaction purposes, eliminating the need for manual categorization. This talk will explore the workaround by utilizing an AI architecture namely Graph Neural Networks, to model the interactions between senders and recipients, as well as other information such as the transfer amount and the transfer notes.
PyPI is full of amazing libraries. How did they do it? What does it take to keep a library relevant? What is the art of compatibility? Who are the users and where to find them?
We discover how artificial intelligence revolutionizes investment risk analysis with a focus on multi-agent systems. Learn to deploy specialized agents for Data Analysts, Trading Strategy Developers, Trade Advisors, and Risk Advisors to monitor market data, refine trading strategies, optimize trade executions, and evaluate risks.
PySpark is widely used for data analysis in distributed computing environments, offering not only the standard DataFrame API but also Python User Defined Functions (UDFs), Python Data Sources, Python UDTFs, and more. However, debugging and profiling applications in such environments can be challenging. For example, you cannot simply add a step and inspect it in your IDE.
In this presentation, I will explore techniques for debugging and profiling PySpark applications using existing tools like cProfile, a standard Python profiler. Additionally, I'll share various tips and best practices for effectively monitoring and debugging PySpark applications.
Embark on a journey into the realm of digital discovery with our tale of building an internal search engine at Sinarmas Land. Join us as we unveil how we tackled the challenge of unlocking hidden knowledge buried within PDFs, Images documents and Relational tables. Using a blend of modern technologies and ingenuity, we crafted a solution that revolutionizes how employees access critical information. This session will delve into the technical intricacies, lessons learned, and the impact of our quest for hidden knowledge.
AI Agentic workflows will drive massive AI progress this year. This is what Professor Andrew Ng said about the rise of agents. With the growing popularity of large language models, Agents are what everyone is talking about. In simple terms, Agents can be defined as LLMs with the ability to self-reason and plan, just like humans. In my talk, I will focus on how to build an Autonomous Agentic workflow and the components required. Additionally, I will cover the concepts of planning and reasoning Agentic prompting such as REACT, LATS and so on to motivate the audience to stay updated with the Agentic world.
FastAPI has taken the Python web development world by storm, offering a powerful and developer friendly approach to building APIs. This talk dives deep into the inner workings of FastAPI, deconstructing its architecture and exploring the key components that make it tick. As an early adopter, I'll share my experience and insights, providing a clear understanding of how FastAPI leverages the ASGI
standard for high-performance web applications. This session will not only benefit those new to FastAPI, but also experienced developers seeking a deeper understanding of its inner workings and the advantages of the ASGI
paradigm.
Polars is popular these days. The code looks similar to pandas however they are different libraries. To know how to efficiently use Polars, we need to dive deeper into how these libraries are different. In this talk, we will do that and provide tips to migrate from pandas to Polars.
The Python programming language in recent years has gained advancements not only in terms of speed but also with its coverage of platforms and architectures that it supports, such as the upcoming iOS (PEP 730) and Android (PEP 738) ABIs, which will be a boon for its popularity and help expand it to a large variety of users. Similarly, the advent of WebAssembly (through the Emscripten toolchain) in the last decade has served a purpose rarely met by any other implementation of low-level code: being able to provide a way to run programs written in multiple languages at near-native speed by reworking higher-level instructions and their intermediate representations to near-assembly ones, that can run inside conventional browsers such as Google Chrome (and other Chromium-based browsers), Safari, and Firefox; across devices.
Here, the web browser has 1. access to the CPU of the host machine and therefore does not hit any major speed or connectivity constraints, and 2. the inherent security models for web browsers and sophisticated sandboxing means that security constraints are seldom a problem.
This talk shall describe Python in WebAssembly through its most popular implementation (Pyodide), and about how I am working with the Scientific Python and PyData ecosystems for scientific computing, artificial intelligence, data science, and more to improve interoperability with Pyodide – and lastly, it will discuss through an interactive yet informational exchange coupled with pragmatic considerations for this Python distribution in specific in terms of interactive documentation and packaging insights, and provide a precursor on how the Scientific Python ecosystem is expected to change in the coming years with further advancements in this area.
From Big Data to Big Learning: A Python-Powered Journey in EdTech explores a transformative data-centric approach to educational technology development. Drawing from a large-scale pandemic relief data project, this presentation from UNU Yogyakarta introduces "Data-First Development," a methodology that prioritizes continuous data analysis and insights over traditional fixed-feature development approaches.
The presentation demonstrates this approach through the UNUY App Ecosystem transformation and an LLM-powered Developer Assistant while addressing technical challenges in legacy system integration and user experience. It examines emerging EdTech trends including AI/ML, VR/AR, and IoT implementations, providing a framework for data-driven educational technology that balances innovation with ethical considerations.
As a Software tester who love Python, I've often struggled with verifying visual elements during testing, such as design and components. This challenge led me to develop a solution for automated visual testing using Selenium for web and Appium for Android. I created a custom pytest plugin that performs visual comparisons, helping testers and developers detect discrepancies between actual and expected visuals. This plugin generates comparison images highlighting differences, making visual testing more efficient and reliable.
It is possible to use functionalities (modules, classes, functions) implemented with languages other than Python as Python libraries.
Well-known examples, Numpy and Pandas are primarily implemented in C/C++ for poweful performance.
Recently, the use of the Rust language has gained attention in addition to C/C++.
In this talk, I will explain the advantages and procedures for developing Python libraries using Rust.
I will also introduce examples of libraries where Rust is being used.
In this talk, we will explore how Python empowers data analytics to solve real-world problems effectively. We will delve into specific use cases, demonstrating how Python-based solutions can provide actionable insights. This session will offer unique perspectives based on real-life experiences, aiming to deepen the audience's understanding of Python's capabilities in the data analytics domain.
Currently, there are several options for developing GraphQL servers with Python. In this session, we will focus on the combination of "FastAPI + Strawberry," discussing implementation methods and important considerations for operations.
As a country located between the world's three main tectonic plates, Indonesia is the country with the most active volcanoes in the world (127 volcanoes) However, out of 127 volcanoes, only 69 volcanoes are monitored by the Center for Volcanology and Geological Hazard Mitigation (PVMBG/CVGHM). In conducting volcano monitoring, seismic methods are used as the main method to determine whether volcanic activity is increasing or decreasing. The large number of monitoring instruments, resulting in the increase of monitoring data, so technological assistance is needed to speed up the data processing process and help in providing quick and accurate analysis. Python is used as one of the tools in the processing of volcanic seismic data. This process includes data quality control, normalization, validation, processing, databases, information dissemination, and an early warning system for the community and volcano stakeholders. Python in volcanoes is also not only used as a tool to monitor volcanic activity, but also used for research and development of volcano monitoring methods, including the use of machine learning for volcanic disaster mitigation. Ptyhon can help save people from volcanic eruption disasters.
Kafka is a widely used open-source event streaming platform that aggregates data events and works as a pub-sub. To make sure each application can only have access to messages they are allowed to consume and produce Kafka Admins can implement Kafka ACL rules to determine which principals are allowed to produce and consume messages to certain topics. Unfortunately reading and changing ACLs through Kafka CLI can be unintuitive and prone to mistakes for users who are not familiar with how Kafka ACLs work. A simple error in changing Kafka ACLs can cause system failures on multiple services that rely on Kafka, especially any changes that include wildcard. To prevent this, a quick Python script leveraging the confluent_kafka library can safeguard these changes by reviewing their changes before applying them, preventing catastrophic incidents from happening.
COVID-19 has impacted PyCon MY greatly in terms of the size of the community and funding. The effect has prompted the core members to look into a more sustainable way for the survival of the community.
Here are the 4 pillars for sustainability:
1. Number of community members
2. Fundings
3. Leaderships
4. Collaborations with other communities, government agencies and industries.
In the talk, I will share the reasons we look into the above pillars, what plan we have, and the progress.
Efficient and effective system maintenance and troubleshooting in manufacturing industries are crucial for ensuring uninterrupted production processes. This proposal proposes leveraging IoT (Internet of Things) technology to enhance the monitoring and management of critical system temperatures, particularly in control rooms where temperature fluctuations can impact operational stability. The proposed solution involves the integration of micro controller (Arduino) and microcomputer (Raspberry Pi) technologies combined with software engineering to create an integrated and centralized smart critical room temperature monitoring system.
In this system, nodes consisting of micro controllers and sensors are strategically deployed throughout the room to monitor temperature and humidity variations. These nodes transmit data to a central master unit (Raspberry Pi) which collects, processes, and publishes the aggregated data to a message broker. A central server then consumes the data from the message broker, processes it further, and provides real-time monitoring capabilities to users across various locations. This approach eliminates the need for manual temperature checks at individual points, enabling proactive maintenance and swift troubleshooting actions based on real-time data insights.
By implementing this IoT-based solution, manufacturing industries can significantly improve system maintenance efficiency and effectiveness, thereby enhancing operational reliability and minimizing downtime due to temperature-related issues.
This talk will explore the development and implementation of multi-modal data fusion processing using Python, with a focus on an agricultural data case study. The session will introduce a framework and techniques for effectively processing and utilizing diverse data types to enhance decision-making in agriculture. Attendees will learn about the application of transfer learning to improve data sharing and interoperability, considering factors such as suitability, affordability, openness, and acceptability.
Ever wondered what goes on behind the scenes to keep your Python apps running smoothly? Join us for a deep dive into the world of garbage collection and resource allocation! We’ll uncover the secrets of Python's memory management, from reference counting to tackling those pesky cyclic references. Uncover how Python automatically identifies and cleans up unused objects, ensuring optimal performance using gc
module. You'll also get a scoop of tools like memory_profiler
and tracemalloc
. Whether you're a Python pro or just curious, this talk will help you keep your code lean, clean and efficient. Don't miss it!
How many stars do you know so far? Maybe you are familiar with Canopus, Capela, Vega from Sherina's Adventure Film. There are many stars outside our solar system. Or maybe you are familiar with the zodiac signs? Zodiac signs in astronomy are a collection of several stars called constellations. There is a lot of knowledge about the stars we see in the sky. By using python, we can see information about the stars in the sky using better visualization. You can also use this application to view star coordinates for observations.
It is essential to prioritize security in web app development. A robust and secure authentication service is necessary to protect clients' data and privacy. However, in the development of web app, we often face several situations that demand us to deliver it to the client quickly. To address these challenges, we need a system that is not only able to enhance security but also does not take a long time to set up. Logto.io is a modern authentication solution that offers a variety of features to enhance web app security like MFA (Multi-Factor), Social Auth (Google, Github, etc), User Management, Customizable user sign-in experience, and much more. The best part is that Logto.io is open source. Therefore, it allows you to deploy it with Docker, as a standalone system, or even in the cloud using their provided service.
KEYNOTE 3
The "Data Slices" series showcases aesthetically pleasing data visualizations created using Python libraries such as Matplotlib, Seaborn, and other advanced visualization libraries. Each visualization is meticulously curated to reveal hidden beauty and intriguing insights within seemingly ordinary data. The series covers various topics, including a popular episode where I tracked my daily mood throughout 2023 and transformed it into an engaging visualization, complete with a report card summarizing mood trends. "Data Slices" has been featured on platforms like Exsight Analytics and HaloTech Academy's YouTube channel, and continues to explore diverse data topics across multiple seasons.
This talk dives into the exciting realm of enriching your Large Language Model (LLM) interactions with structured data extraction. We'll explore how LangChain, in conjunction with Pydantic, empowers you to retrieve not just plain text from LLMs but also reusable Python objects like lists, dictionaries, and even pandas DataFrames.
This talk offers a thorough and balanced review of using Graph Databases (GraphDB) to enhance the knowledge bases of Large Language Models (LLMs). Drawing from practical experiences and real-world applications, we will present both the advantages and challenges of integrating GraphDB with LLMs.
We will start by exploring the capabilities and limitations of generative AI and LLMs, emphasizing common issues such as hallucination, where models generate misleading or baseless content. The core of the presentation will delve into how GraphDB can provide a structured and reliable knowledge base that improves the contextual accuracy of LLM outputs.
Attendees will gain insights into the practical implementation of GraphDB, supported by hands-on examples and case studies. We will discuss the strengths of GraphDB, such as its ability to create a robust and interconnected knowledge graph, and also address the potential drawbacks and challenges encountered during implementation.
By the end of the session, participants will have a clear understanding of the real-world impact of using GraphDB with LLMs, equipping them with the knowledge to make informed decisions about their AI projects. This talk is designed to be both informative and practical, offering deep insights into the intersection of GraphDB and LLM technologies.
As the Python gaining more popularity in development and automation, developers are now becoming the target of attack. The supply-chain attacks has emerged as a significant threat targeting developers who are not aware that their libraries or infrastructure might be infected. This talks delves into how attacker abuse, infiltrates, and compromise developer environment to achieve their goals.
Attendees will gain insight into the mechanics of these attacks, understanding how seemingly benign environment can harbor malicious code. The session will further explore real-world case studies, from attacker perspective.
Georgi will host an Ask Me Anything session for PyLadies
Sesi PBNU
We will delve into the significance of the Python Software Foundation (PSF) and how you can engage with the global Python community through it. Additionally, be the first to hear about the launch of the Python Asia Organization (PAO), an exciting new initiative aimed at empowering the regional Python community across East and South East Asia (for now)
Aksara Lampung merupakan salah satu manuskrip kuno yang ada di Nusantara. Aksara ini merupakan turunan dari warisan Dewdatt Deva Nagari (Devnagari). Aksara ini lebih dikenal juga dengan nama Had Lampung bagi masyarakat suku Lampung. Pada pembahasan topik ini akan mengulas bagaimana tulisan aksara Lampung bisa dikenali dan melakukan identifikasi dengan algoritma Computer Vision berbasis Optical Character Recognition (OCR) dengan metode Machine Learning atau Deep Learning agar masyarakat Indonesia dapat mengenal maupun mempelajari aksara Lampung yang mulai pudar bahkan belum ada digitalisasi lebih lanjut mengenai aksara Lampung.
In the era of AI everywhere, we need to run AI applications on the edge for a long period of time. Newly introduced Neural Processing Unit (NPU) is targeted for low power sustain workload like real-time streaming applications. This talk will demo OpenVINO notebooks that can run selective deep learning models on NPU as well as introduce Intel NPU Acceleration Library to go deeper on how to run Python on NPU.
In the era of cloud computing, it is essential to have logs that are structured using JSON and include the context of where the logs came from. This talk will show how to use structlog with Django, Celery, and Sentry in a real web application development scenario.
Discover powerful alternatives to A/B testing for measuring experiment impact using causal inference techniques. This talk will demonstrate how we can leverage the matching process methods, such as propensity score matching to reduce bias when comparing groups, providing reliable metrics even when traditional A/B testing isn't feasible. Learn practical applications, and new approaches, and gain actionable insights to broaden your data analysis toolkit and effectively measure the impact of your features or products.
Jublia Sponsor Talk
GraphQL is a more flexible alternative to REST for building web APIs and thus is becoming a strong foundation for any modern web stack. This is especially true where static HTML templates are not enough or a sophisticated single-page interface is needed. In this talk, we will look at various solutions for building GraphQL APIs in Python to power modern, dynamic web apps.
These last few years, we’ve seen a significant increase in the number of Generative AI-powered applications being developed by organizations and professionals globally. Due to the number of possible variations available when building these types of systems, companies often struggle to identify and implement the most effective approach. After a few weeks of running their Gen AI application, they start to encounter issues and challenges related to performance, scalability, and cost management.
In this session, we will discuss various best practices and strategies when building, scaling, and optimizing Generative AI systems. We'll tackle various challenges that organizations experience after they have deployed their first version of the Gen AI application and we'll discuss multiple optimization strategies to reduce costs and improve the application's performance.
Building a product and maintaining it are two different subset of challenges. When the development has just started, your team is small and your code is still using the latest, state-of-the-art dependencies. After a while, maintainability issues become trivial as team size grows and the codebase is getting older. This talk will explore various tools that you can adapt to your codebase for progressive improvements.
Autoscaling is crucial for organizations that need to adjust their computing resources based on user demand, especially during peak times. While many cloud services offer this feature, it can be challenging to implement autoscaling in environments where it is not directly supported.
In this talk, we will explore how to create an autoscaling system that is compatible with any cloud service, drawing inspiration from Kubernetes, involves developing a cloud-agnostic autoscaling system. This approach allows organizations to utilize autoscaling capabilities without being restricted to a specific cloud provider.
Ensuring the integrity and correctness of data is crucial, including configuration files like YAML. YAML files are used for configuration in various applications, kubernetes manifests and CI/CD pipelines. In our cases we used YAML files to creates Airflow DAGs in human readable format that will be processed by generator to generate numerous of DAGs compared to creating from scratch by using code. Everyone can make an Airflow DAG from YAML files.
However it is prone to human error due to their strict syntax and structure. These mistakes can lead to significant issues when deploying to production, causing downtime and unplanned maintenance.
In this talk, we will explore how to leverage custom pre-commit hooks to automate the validation of YAML files before they are committed, pushed, and deployed. By integrating these checks into your development workflow, you can catch errors early, maintain high standards of code quality, and empower your team to deploy with confidence.
The advancement of generative AI technologies has opened new frontiers in creativity and automated creation. However, there are several issues and concerns that need to be mitigated.
Agricultural commodities such as shallots, red chili, and cayenne pepper are highly produced, serving not only the local community but also neighboring districts and cities. However, farmers often face income losses as their selling prices are dictated by collectors, deviating from the actual market prices. This discrepancy negatively impacts farmers' productivity. Our proposed research aims to address this issue by developing a predictive system for commodity selling prices using data mining techniques, specifically multiple linear regression. This method will analyze factors such as rainfall and total production to forecast prices, enabling us to provide farmers with accurate selling price recommendations for specific periods. By aligning farmers' selling prices with market trends, this system aims to minimize their losses and enhance productivity. This proposal outlines the development and implementation of this predictive system, highlighting its potential benefits for the agricultural sector.
In the digital age, the wave of information is vast and relentless. With the rise of social media and ‘short-like’ platforms such as TikTok, the demand for succinct, accurate, and engaging news content is ever-increasing. However, the current information landscape often overwhelms users with a deluge of content, making it challenging to sift through and identify relevant news.
Artificial Intelligence (AI) itself has been advancing ever since, integrating to various sectors including news content generation. AI’s ability to analyze large volumes of data, identify patterns, and generate content has opened up new possibilities in the field of journalism.
To address these issues, this python project will leverages AI to automate the generation of one-minute accurate news content, specifically designed for ‘short-like’ platforms. This project aims to provide users with concise, relevant, and engaging news, tailored to the unique consumption patterns of today’s digital audience.
We all love python. But not all of us is fortunate enough to land a job as a python programmer or developer. Maybe some of us currently doing some office jobs that is not using python and doesn't need programming language at all. But it doesn't make our python skill useless. As an open-source programming language python has a lot of library that can make our life easier. In this talk we will explore some of python library and try to use it to do our office job faster and give more time for us to rest.
Javascript developer create Typescript to make javascript had typesafety. This typesafety make project easier to read and has less bug. Can typesafety like typescript be achieve in python?
An unstructured conversation about contributing to Python and open source, hosted by Mariatta Wijaya
In this talk, we will explore how Python and Large Language Models (LLMs) can be utilized to unlock new creative potentials in 3D printing. We will delve into practical applications and real-world examples where these technologies come together to design, prototype, and produce unique 3D-printed objects. Attendees will learn about the integration of Python scripts and LLM-driven design processes, showcasing how these tools can simplify and enhance the creative workflow.
Graph Neural Networks (GNNs) are revolutionizing how we handle complex data structures in various domains, from social networks to molecular biology. In this talk, we will explore how Python can harness the power of GNNs to solve real-world problems. Attendees will learn the fundamentals of GNNs, practical applications, and how to implement these networks using Python libraries. This session aims to empower developers to leverage GNNs in their projects, enhancing their ability to analyze and interpret intricate data patterns.
Locust is a popular load testing tools written in Python and scripted in Python too. It's easy to get started with Locust even me my self as a non Python developer can write the script easily as long as know some basic rules in Python.
We want to show the audience that how easy it is to play with load testing tools like Locust from installing it to write our first script. It also has a beautiful interface that already built-in and no more additional setup.
Serverless applications have transformed software development and deployment. Building serverless applications with Python on AWS presents unique challenges, especially for developers accustomed to frameworks like Django, Flask, and FastAPI. This talk explores strategies to maintain the developer experience of these frameworks while gaining the benefits of serverless. We will cover key patterns and tools, such as Lambdalith, AWS Chalice, AWS Lambda Powertools, single-purpose function patterns, event-driven architecture, and Infrastructure as Code (IaC) tools.
Despite the significant increase in the number of Generative AI-powered applications being developed by companies and professionals globally, many organizations are unable to secure their deployed applications properly. One of the practical techniques to secure Gen AI systems along with self-hosted LLMs involves building a vulnerability scanner that checks for vulnerabilities such as prompt injection. In this session, we will discuss how to build a custom scanner to help teams identify security issues specific to their self-hosted LLMs.
This paper presents a solution for detecting actively exploited WordPress vulnerabilities in a shared hosting environment. Recent reports indicate a significant increase in reported vulnerabilities, highlighting growing risk. Analysis from Patchstack shows a 24% rise in vulnerabilities from 2022 to 2023 [1], while WPScan reports substantial increase in reports from 2014, 2022, 2023, until 2024 especially among free plugins and themes [2]. Given these findings, detecting these vulnerabilities is crucial, particularly in shared hosting where users may lack awareness. Leveraging CloudLinux’s Imunify360 WAF rules, which includes WordPress vulnerability signatures, this study integrates incident logs from Imunify360’s SQLite database, WP-CLI Vulnerability Scanner, and Python for detection. By correlating WAF-triggered attacks, Static Analysis of version from WP-CLI Vulnerability, modifying date of plugin, theme, and core WordPress, the approach enhances the identification of actively exploited vulnerabilities.
In this session, I will demonstrate how Python can be leveraged to analyze the impact of public policies. While it might sound complex, Python simplifies the process significantly. Attendees will see visualizations that can tell a 'story' using geemap and Google Earth Engine. I will showcase the power of Python through simple yet powerful commands supported by these tools. By incorporating some statistical analysis, we can determine the actual impact of the development.
As technologists or enthusiasts, we can contribute to society by presenting data-driven insights, even if we cannot directly implement policies ourselves. This session aims to show that with basic programming skills, anyone can analyze and inform public discourse. If we cannot change policies, we can at least tell !.
These last few years, we’ve seen a significant increase in the number of serverless generative AI-powered applications being developed by organizations and professionals globally. Unfortunately, companies are unable to keep up with the security considerations and requirements and often end up unprepared for various types of attacks. There are different ways to attack serverless Generative AI-powered systems and most organizations are not equipped with the skills to secure these systems. In this talk, we will talk about the different ways these systems can be attacked and then we will share relevant strategies to protect these systems.
Japanese is reportedly one of the most difficult languages for English speakers to learn.
(FSI language difficulty: https://www.fsi-language-courses.org/blog/fsi-language-difficulty/)
There are many reasons for this, including the fact that there are three types of characters: hiragana, katakana, and kanji, and that words are not separated by spaces.
In this talk, I will first introduce what makes Japanese different from many European languages.
Then I will show how Python and natural language processing libraries can be used to support Japanese language learning.
Lightning talks are a maximum of 5 minutes long, on any topic of interest to other Python people.
It doesn't have to be about something that you wrote, it can be something that you learned, or a technique you think other people will be interested in.
- You know that thing at work that everyone comes to you for help with? Talk about that!
- You know that thing you just learned that helped you out? Talk about that!
- You know that thing you always wish you understood, but haven't figured out yet? Talk about that!
Slides are encouraged but not required!
Submit your talk here: https://bit.ly/py24lt