Become A Better Data Engineer On A Shoestring (More Free Resources)

A bit more than a year ago I compiled an annotated list of the best free courses and learning resources that could help anyone become a data engineer on a shoestring. We received an overwhelming amount of positive feedback on it, so after a full year of running the bootcamp I sat down again and collected another batch of resources we came across during the cohorts.

Experienced data scientists, ambitious data analysts, data-obsessed product managers and future-oriented computer scientists have joined Pipeline Academy in the last year with the goal of learning real data craftsmanship, aka data engineering. Even if you don’t identify as any of the above but are planning to become a more well-rounded data professional, you’ve come to the right place.

Expect well-designed learning experiences that are mostly very accessible in terms of pricing, and learning outcomes that are very close to what the market demands.

Some guidance for this list:

  • If you haven’t read the first edition of our compiled list of learning materials, make sure to check it out first.

  • Various formats are included here (video-based courses, books, podcasts, story-based interactive coding tutorials etc.); try to identify which ones suit your learning style best.

  • Don’t skip the fundamentals: Python, SQL and the command line are essential for data engineers.

  • Integrate learning into your weekly schedule: try sticking to what you’ve started.

  • Share your experience and recommendations with others; it’s really difficult to find the right courses in the forest of mediocre Medium posts and useless certifications.

General

The Data Engineering Podcast and The Python Podcast.init
Outstanding podcasts by Tobias Macey on these two topics.

Data Engineering, UC Berkeley, Spring 2021
A university course run by experienced industry practitioners, although keep in mind that data engineering does not happen in notebooks.

Technically
Technically sends out engaging, simple explanations of technical concepts that are useful for your day-to-day job and fun to read. Start with the SQL and AWS explanations.

Explain the Cloud Like I'm 10
Beginners will find the cloud explained from the basics. Little prior knowledge is assumed. You will find lots of pictures, lots of examples, and many somewhat questionable analogies in this book.

Alex Xu: System Design Interview - An Insider's Guide
The book and the course will help you to think about and understand complex integrations.

Python

Data Structures and Information Retrieval in Python
The new Downey book introduces data structures and algorithms using a web search engine as a motivating example.

Real Python
The best Python tutorials; even the free ones are outstanding.

SQL

Update: our advisor, Dr. Martin Loetzsch, has just shared for free the material he also teaches at Pipeline Data Engineering Academy.

Database Design
2nd Edition from The BC Open Textbook Project.

oleg-agapov/data-engineering-book
A good, if partial, start for SQL by Oleg Agapov, found in the Beginner path.

CMU Database Group
Video series on contemporary databases and data warehouses.

CLI

The command line can be intimidating if you're starting out with it. The resources below are friendly to novices, especially those who don't come from a computer science background.

Programming Historian
Example: Reshaping JSON with jq

Software Carpentry
Example: Automation and Make

veltman/clmystery
The Command Line Murders - an interactive learnbook.

slackermedia / bashcrawl
Learn Linux commands by playing a simple text adventure.

DevOps

A Cloud Guru
Good choice for paid courses on the cloud and DevOps.

Gently Down the Stream
A gentle introduction to Apache Kafka - pairing whimsical imagery with lucid explanations of stream processing concepts, this book will captivate beginners of all ages.

The Illustrated Children's Guide to Kubernetes
Follow the adventures of Phippy the Giraffe, Captain Kube, and Goldie the Gopher as they discover Kubernetes pods, replication controllers, services, and volumes. Get silly and serious at the same time with this lighthearted introduction to core Kubernetes concepts.

Thanks to all graduates and expert guests of our cohorts in 2021 for their valuable feedback.

Happy learning!