A picture of Pedro and his dog.

Pedro Holanda PhD

PhD in Database Architectures

C++ Programmer

Core Developer of DuckDB

  • GitHub
  • Twitter
  • LinkedIn
  • Discord

About Me.

I hold a Ph.D. in Database Architectures (CWI) on the topic of Progressive Indexes for Interactive Data Analysis. I am also one of the core developers of DuckDB: I rank among the project's top three contributors and have been committing to it since its inception.

I have worked on various aspects of DuckDB, including:

  • DBMS Core (e.g., zonemaps, indexes, joins, optimizer, scans, and more)

  • Integration projects (e.g., Arrow and ADBC)

  • Plan Serialization (i.e., Substrait)

  • Client APIs (i.e., Python)

All of these contributions have public pull requests available.

On the soft-skills side, I have authored numerous technical blog posts and scientific papers, some of which are linked on this page.

Additionally, I have experience as a public speaker, with many videos published on YouTube, some of which can be accessed through this page.

Experiences

2023-Now

DuckDB Labs
Software Engineer

  • Maintain and develop new functionalities in DuckDB.

  • Manage two client projects for DuckDB Labs.

  • Write technical blog posts.

  • Give public presentations about DuckDB to developer communities.

2022-2023

DuckDB Labs
Chief of Operations

  • Managed the day-to-day operations of DuckDB Labs, including overseeing project timelines, resource allocation, and team coordination.

  • Actively participated in the organization of DuckDB Events.

  • Contributed to coding efforts within DuckDB Core.

2021-2022

DuckDB Labs

Software Engineer

  • Maintained and developed new functionalities in DuckDB.

  • Managed one client project for DuckDB Labs.

  • Wrote technical blog posts.

  • Gave public presentations about DuckDB to developer communities.

2019

Microsoft Research

Data Management PhD Intern

Conducted research related to execution engines within SQL Server.

Education

2021-2023

Post-Doc

Database Architectures

CWI

  • Supervised an MSc student on the topic of self-organizing storage in DuckDB.

  • Served as second reader on a thesis on floating-point compression in DuckDB.

  • Peer-reviewed submissions for the VLDB 2021 Demo Session.

2017-2021

PhD Candidate

Database Architectures

CWI

Developed new algorithms to perform progressive index creation for interactive data analysis.

2015 - 2017

MSc Student

UFPR

Developed an index structure that caches frequently accessed data, reducing cache misses in the context of adaptive indexing.

2010 - 2014

BSc in Computer Science

UFC

Performed research in the areas of predictive index creation and database benchmarking.

Podcasts

Video Presentations

DuckDB: Bringing analytical SQL directly to your Python shell (EuroPython 2023)
41:27

DuckCon 2023 - DuckDB Extensions - Pedro Holanda & Sam Ansmink
17:32

FOSDEM 2023 - DuckDB In The Python Land - Pedro Holanda
24:43

Interview with Pedro Holanda at J On The Beach 2023
02:59

Technical Blog Posts

23-08-2023

DuckDB ADBC - Zero-Copy data transfer via Arrow Database Connectivity

27-07-2022

Persistent Storage of Adaptive Radix Trees in DuckDB

07-07-2023

From Waddle to Flying: Quickly expanding DuckDB's functionality with Scalar Python UDFs

03-12-2021

DuckDB quacks Arrow: A zero-copy data integration between Apache Arrow and DuckDB

16-08-2022

Scrooge McDuck: A DuckDB Extension for Financial Data Analysis (Demo)

26-11-2021

DuckDB Enums: The Fellowship of the Categorical and Factors

Scientific Papers

EDBT - 2021

Progressive Mergesort: Merging Batches of Appends into Progressive Indexes

SBBD - 2019

Dissecting DuckDB: The internals of the “SQLite for Analytics”

DBTest - 2018

Fair Benchmarking Considered Difficult: Common Pitfalls In Database Performance Testing

ICDE - 2021

Multidimensional Adaptive & Progressive Indexes

DAMON - 2019

Relational Queries with a Tensor Processing Unit

EDBT - 2018

Deep Integration of Machine Learning Into Column Stores

VLDB - 2020

Progressive Indexes: Indexing for Interactive Data Analysis

EDBT - 2019

devUDF: Increasing UDF development efficiency through IDE Integration. It works like a PyCharm!

Talk To Me

I'm always happy to connect with people. Whether you want to discuss projects, talks, workshops, or just connect, feel free to let me know!
