A scheduled ETL workflow built with Apache Airflow, Python, PostgreSQL, and Docker that fetches Bitcoin data from the CoinGecko API, processes it, and stores it in PostgreSQL.
Tag: Data Engineering
This ETL pipeline:
- Queries the CoinGecko API for Bitcoin data.
- Computes derived metrics: unavailable supply (max supply minus circulating supply) and issuance progress (the share of the max supply already in circulation).
- Stores the processed data in a PostgreSQL database.
- Runs on a scheduled basis using Apache Airflow.
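The extract and transform steps above can be sketched roughly as follows. This is a minimal sketch, not the repository's actual code: the field names follow the CoinGecko `/coins/{id}` response (`market_data.max_supply`, `market_data.circulating_supply`), while the `btc_data` table name and column layout are hypothetical stand-ins.

```python
from urllib.parse import urlencode

API_BASE = "https://api.coingecko.com/api/v3"


def build_coin_url(coin_id: str = "bitcoin") -> str:
    """Build the CoinGecko /coins/{id} URL, requesting market data only."""
    params = {"localization": "false", "tickers": "false", "market_data": "true"}
    return f"{API_BASE}/coins/{coin_id}?{urlencode(params)}"


def transform(market_data: dict) -> dict:
    """Derive unavailable supply and issuance progress from raw supply figures."""
    max_supply = market_data["max_supply"]          # 21,000,000 for Bitcoin
    circulating = market_data["circulating_supply"]
    return {
        "circulating_supply": circulating,
        "max_supply": max_supply,
        "unavailable_supply": max_supply - circulating,
        "issuance_progress": circulating / max_supply,  # fraction already issued
    }


def build_insert(row: dict) -> tuple:
    """Build a parameterized INSERT for a hypothetical btc_data table."""
    cols = ", ".join(row)
    placeholders = ", ".join(["%s"] * len(row))
    sql = f"INSERT INTO btc_data ({cols}) VALUES ({placeholders})"
    return sql, tuple(row.values())
```

In the real pipeline the payload would come from the API response's `market_data` section, and the INSERT would be executed through a PostgreSQL client such as psycopg2, which uses the `%s` placeholder style shown here.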
Tech stack:
- Apache Airflow – task scheduling & orchestration
- Python – Data processing
- PostgreSQL – Data storage
- Docker – Containerization
Clone the repository:
git clone https://github.com/PPjamies/coingecko-etl.git
cd coingecko-etl
Start the ETL pipeline using Docker Compose:
docker-compose build && docker-compose up -d
Access the Airflow UI:
http://localhost:8080
Trigger the DAG to run the ETL process.
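For orientation, the scheduled DAG could look something like the sketch below. This is a hedged illustration, not the repository's actual DAG: the `coingecko_etl` DAG id, the task names, and the `@daily` schedule are assumptions, and the task bodies are left as stubs. It requires a running Airflow installation (as provided by the Docker Compose setup above).

```python
# dags/coingecko_etl.py – hypothetical layout of the scheduled ETL DAG
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(): ...    # query the CoinGecko API for Bitcoin data
def transform(): ...  # compute unavailable supply and issuance progress
def load(): ...       # insert the processed row into PostgreSQL


with DAG(
    dag_id="coingecko_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> transform_task >> load_task
```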
Planned improvements:
- Support for multiple cryptocurrencies.
- Data validation & error handling.