Personal Projects | Automated podcast pipeline – COMPLETED
Podcast Automation Pipeline: Building a Self-Publishing Infrastructure
Overview
I built a podcast automation pipeline that changed how I manage episode publishing and the infrastructure behind it. What started as a way to cut manual work grew into a system that handles episode ingestion, publishing, monitoring, and backup without manual intervention.
www.bazancast.com
The Challenge
Publishing a podcast involves multiple steps: downloading new episodes, generating transcripts, writing show notes, publishing to WordPress, and maintaining backups. Doing this manually for every episode is time-consuming and error-prone. I wanted to build a system that would handle all of this automatically while running on a reliable, self-managed infrastructure.
The Solution
The Automation Pipeline
I built a Python-based pipeline that orchestrates the entire publishing workflow:
- Episode Detection: The system monitors the RSS feed for new episodes and downloads them automatically
- Transcription: Audio files are processed to generate accurate transcripts using Claude AI
- Content Analysis: The pipeline analyzes episode content to extract key information and generate summaries
- Publishing: Formatted content automatically publishes to WordPress with proper metadata, show notes, and episode information
- Notifications: Email confirmations are sent upon successful publication
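As an illustration of the episode detection step, here is a minimal sketch using only the Python standard library. It assumes the feed is standard RSS 2.0 where each episode is an `item` with a `guid` and an audio `enclosure`. The function name `find_new_episodes`, the `seen_guids` set, and the sample feed are hypothetical and not taken from the actual pipeline.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, standing in for the real podcast RSS feed.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example Podcast</title>
  <item>
    <title>Episode 1</title>
    <guid>ep-1</guid>
    <enclosure url="https://example.com/audio/ep1.mp3" type="audio/mpeg" length="1"/>
  </item>
  <item>
    <title>Episode 2</title>
    <guid>ep-2</guid>
    <enclosure url="https://example.com/audio/ep2.mp3" type="audio/mpeg" length="1"/>
  </item>
</channel></rss>"""


def find_new_episodes(feed_xml: str, seen_guids: set[str]) -> list[dict]:
    """Return feed items whose GUID is not in seen_guids."""
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        # Fall back to the link when a feed omits the guid element.
        guid = item.findtext("guid") or item.findtext("link")
        enclosure = item.find("enclosure")
        if guid is None or enclosure is None:
            continue  # no identifier or no audio file: nothing to download
        guid = guid.strip()
        if guid in seen_guids:
            continue  # already processed in an earlier run
        episodes.append({
            "guid": guid,
            "title": (item.findtext("title") or "").strip(),
            "audio_url": enclosure.get("url"),
        })
    return episodes


if __name__ == "__main__":
    # Assumption: GUIDs of processed episodes are persisted between runs.
    for episode in find_new_episodes(SAMPLE_FEED, seen_guids={"ep-1"}):
        print(episode["title"], episode["audio_url"])
```

Tracking processed GUIDs, rather than comparing publication dates, keeps the check idempotent if the job runs twice or the feed is re-ordered.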
The process runs end to end: once a new episode appears in the feed, every later step runs without manual intervention.
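The publishing step could look roughly like the sketch below. It assumes the "WordPress API" is the WordPress REST API (`/wp-json/wp/v2/posts`) and that authentication uses an application password over HTTP Basic auth; the pipeline may do this differently. The function name, site URL, and credentials are placeholders.

```python
import base64
import json
import urllib.request


def build_post_request(site_url: str, user: str, app_password: str,
                       title: str, show_notes_html: str,
                       status: str = "draft") -> urllib.request.Request:
    """Build (but do not send) a request that creates a WordPress post."""
    payload = {
        "title": title,
        "content": show_notes_html,
        # "publish" makes the post live; "draft" is a safer default for a sketch.
        "status": status,
    }
    token = base64.b64encode(f"{user}:{app_password}".encode()).decode()
    return urllib.request.Request(
        url=f"{site_url.rstrip('/')}/wp-json/wp/v2/posts",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values: replace with a real site and credentials.
    request = build_post_request(
        "https://example.com", "editor", "app-password-here",
        "Episode 2", "<p>Show notes go here.</p>",
    )
    print(request.get_method(), request.full_url)
    # Sending is a single call once the values above are real:
    # with urllib.request.urlopen(request, timeout=30) as response:
    #     created = json.load(response)
```

Separating request construction from sending makes the step easy to test without touching the live site.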
The Infrastructure
To support this automation, I built a small homelab: a 10-inch mini rack holding multiple servers. This infrastructure provides:
- Redundancy: Multiple machines working together ensure reliability
- Remote Management: SSH access allows me to manage everything from anywhere
- Scalability: The infrastructure can handle growing storage and processing needs
System Management & Updates
I automated system administration with Ansible playbooks. Instead of manually SSH’ing into each machine to apply updates or configuration changes, I run a playbook that applies them to every machine in one pass:
- Automated security updates across all systems
- Consistent configuration management
- Visibility into what changed on each system
- Zero-downtime deployments
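To illustrate the "visibility into what changed" point, the sketch below parses the `PLAY RECAP` summary that `ansible-playbook` prints at the end of a run and lists the hosts that reported changes. It assumes Ansible's default output format captured as plain text; the host names and the helper names `parse_play_recap` and `hosts_with_changes` are hypothetical.

```python
import re

# One recap line looks like: "node1 : ok=5 changed=2 unreachable=0 failed=0 ..."
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")

# Hypothetical captured output from an ansible-playbook run.
SAMPLE_OUTPUT = """\
TASK [Upgrade packages] ********************************************************
changed: [node1]
ok: [node2]

PLAY RECAP *********************************************************************
node1                      : ok=5    changed=2    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0
node2                      : ok=5    changed=0    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0
"""


def parse_play_recap(output: str) -> dict[str, dict[str, int]]:
    """Map each host in the PLAY RECAP section to its counters."""
    summary: dict[str, dict[str, int]] = {}
    in_recap = False
    for line in output.splitlines():
        if line.startswith("PLAY RECAP"):
            in_recap = True
            continue
        if not in_recap:
            continue  # ignore task output before the recap
        match = RECAP_LINE.match(line.strip())
        if match:
            pairs = (pair.split("=") for pair in match.group("counters").split())
            summary[match.group("host")] = {key: int(value) for key, value in pairs}
    return summary


def hosts_with_changes(summary: dict[str, dict[str, int]]) -> list[str]:
    """Return the hosts where at least one task reported a change."""
    return sorted(host for host, c in summary.items() if c.get("changed", 0) > 0)


if __name__ == "__main__":
    recap = parse_play_recap(SAMPLE_OUTPUT)
    print("Changed hosts:", hosts_with_changes(recap))
```

In practice the output would come from running `ansible-playbook` through `subprocess.run(..., capture_output=True, text=True)`; adding the `--check --diff` flags gives a dry run that shows what would change before anything is applied.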
Monitoring & Observability
Reliability requires visibility. I set up monitoring with Prometheus and Grafana, with dashboards that track:
- CPU and Memory Usage: Real-time performance metrics
- Disk Space: Storage utilization across systems
- Network Traffic: Bandwidth consumption and patterns
- System Health: Uptime, process status, and alerts
If something goes wrong, the dashboards and alerts surface it right away.
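The same metrics that feed the dashboards can also be checked from a script. The sketch below queries the Prometheus HTTP API (`/api/v1/query`) for disk usage and flags filesystems above a threshold. It assumes the hosts export metrics through node_exporter, which the write-up does not state; the Prometheus URL, the threshold, and the function names are placeholders.

```python
import json
import urllib.parse
import urllib.request

# Assumption: node_exporter metric names; adjust if another exporter is used.
DISK_USED_PERCENT_QUERY = (
    '100 * (1 - node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"}'
    ' / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"})'
)


def query_prometheus(base_url: str, promql: str) -> dict:
    """Run an instant query against the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": promql})
    url = f"{base_url.rstrip('/')}/api/v1/query?{params}"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def over_threshold(response: dict, threshold: float) -> list[tuple[str, str, float]]:
    """Return (instance, mountpoint, value) for samples above the threshold."""
    if response.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {response}")
    flagged = []
    for sample in response["data"]["result"]:
        _timestamp, raw_value = sample["value"]  # the value arrives as a string
        value = float(raw_value)
        if value > threshold:
            labels = sample["metric"]
            flagged.append((
                labels.get("instance", "unknown"),
                labels.get("mountpoint", "?"),
                round(value, 1),
            ))
    return flagged


if __name__ == "__main__":
    # Placeholder address for the Prometheus server in the homelab.
    result = query_prometheus("http://prometheus.local:9090", DISK_USED_PERCENT_QUERY)
    for instance, mountpoint, used in over_threshold(result, threshold=85.0):
        print(f"{instance} {mountpoint} is {used}% full")
```

For ongoing alerting, Grafana or Prometheus alert rules are the more usual home for a threshold like this; a script is mainly useful for ad hoc checks or as a pre-flight step before the pipeline writes large audio files.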
Data Protection
All critical data (transcripts, metadata, and configuration files) is backed up automatically to a TrueNAS server, so even in a worst-case scenario no content or configuration is lost.
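One way such a backup job might work is sketched below: copy files into a directory on the TrueNAS share and verify each copy with a SHA-256 checksum. This assumes the share is mounted on the pipeline host as a regular path (for example over NFS or SMB), which the write-up does not specify; the paths and function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_tree(source: Path, destination: Path) -> list[Path]:
    """Copy new or changed files from source to destination, verifying each copy.

    Returns the destination paths that were written in this run.
    """
    copied = []
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        dst = destination / src.relative_to(source)
        if dst.exists() and sha256_of(dst) == sha256_of(src):
            continue  # identical copy already on the backup target
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves timestamps
        if sha256_of(dst) != sha256_of(src):
            raise IOError(f"checksum mismatch after copying {src}")
        copied.append(dst)
    return copied


if __name__ == "__main__":
    # Hypothetical paths: pipeline output and a mounted TrueNAS dataset.
    written = backup_tree(Path("/srv/podcast/data"), Path("/mnt/truenas/podcast-backup"))
    print(f"Backed up {len(written)} file(s)")
```

Skipping files whose checksum already matches keeps repeated runs cheap, and the post-copy verification catches silent corruption on the way to the share.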
The Results
The benefits are significant:
- Time Saved: What used to take 30+ minutes per episode now requires zero manual work
- Consistency: Every episode follows the same publishing process with identical formatting
- Reliability: Automated backups protect against data loss, and monitoring shows system status at a glance
- Scalability: The system can handle growth without additional manual overhead
- Focus: I can concentrate on creating content while the infrastructure runs itself
Technical Stack
- Language: Python
- Publishing: WordPress API
- Transcription: Claude AI API
- Infrastructure: Custom homelab with mini servers
- Automation: Ansible
- Monitoring: Grafana + Prometheus
- Backup: TrueNAS
- Remote Access: SSH
Key Takeaways
This project demonstrates several important principles:
- Automation scales: Systems that work for you while you sleep multiply your effectiveness
- Infrastructure matters: Proper infrastructure—monitoring, backups, redundancy—enables reliability
- Know your systems: Monitoring and visibility are non-negotiable for any production system
- Invest in tooling: The time spent building automation pays dividends immediately and compounds over time
If you’re managing repetitive tasks or running any kind of production system, the principles here apply: automate what can be automated, monitor what matters, back up what’s important, and build infrastructure that works for you, not against you.