Data Engineering Pipeline

Crypto Market Analyzer

An automated Python pipeline utilizing the CoinMarketCap API to track, analyze, and visualize real-time trends of top cryptocurrencies.

CoinMarketCap API, Pandas, Seaborn, Data Visualization


Project Overview

This project builds a pipeline for financial data analysis. It connects to the CoinMarketCap API, pulls live market data, flattens the nested response into a Pandas DataFrame, and appends each pull to a local file for historical tracking. The final step plots percentage changes over several timeframes (1h, 24h, 7d, 30d).

API Integration

Handles secure API requests with headers and parameters to fetch live crypto assets.

Data Persistence

Appends new data to a local CSV file to build a historical dataset over time.

Trend Analysis

Calculates mean percentage changes per currency across multiple time intervals (the excerpt shown under Core Implementation uses 1h, 24h, and 7d).
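
The Data Persistence step is not shown in the code excerpts on this page. As a minimal sketch of how appending to a CSV could work, assuming a hypothetical helper named append_snapshot and a hypothetical pulled_at column (neither comes from the project), the header is written only when the file is new so repeated runs keep one header row:

```python
from pathlib import Path

import pandas as pd


def append_snapshot(df: pd.DataFrame, csv_path: str) -> None:
    """Append one API pull to csv_path, writing the header only once.

    Hypothetical helper for illustration; not taken from the project.
    """
    path = Path(csv_path)
    snapshot = df.copy()
    # Tag every row with the pull time so snapshots can be told apart later.
    snapshot["pulled_at"] = pd.Timestamp.now(tz="UTC")
    # A missing or empty file still needs its header row.
    write_header = not path.exists() or path.stat().st_size == 0
    snapshot.to_csv(path, mode="a", header=write_header, index=False)
```

Calling append_snapshot(df, 'history.csv') after each pull (the file name is only an example) would grow the file by one block of rows per run.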

Market Trend Visualization

The chart generated by the project plots the percentage change of the top assets across the tracked intervals, giving a quick read on short-term volatility.

Core Implementation

1. Fetching Data from API

Using a requests Session, we set the request headers (including the API key) and define parameters to fetch the top 15 currencies, with prices quoted in USD.

crypto_pipeline.py

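from requests import Session

url = 'https://pro-api.coinmarketcap.com/v1/cryptocurrency/listings/latest'

# Request the top 15 listings, with prices quoted in USD
parameters = {
  'start': '1',
  'limit': '15',
  'convert': 'USD'
}
headers = {
  'Accept': 'application/json',  # standard HTTP header name
  'X-CMC_PRO_API_KEY': 'YOUR-API-KEY-HERE'  # placeholder; keep the real key out of source control
}

session = Session()
session.headers.update(headers)
response = session.get(url, params=parameters, timeout=10)
response.raise_for_status()  # fail fast on HTTP errors such as a rejected key
data = response.json()       # parse the JSON body into a dict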
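
Before normalizing, it can help to confirm that the payload actually contains listings. The sketch below assumes the response wraps its results in a status object with error_code and error_message fields next to the data list; that envelope is an assumption about the API rather than something shown on this page, and extract_listings and CoinMarketCapError are hypothetical names:

```python
from __future__ import annotations

from typing import Any


class CoinMarketCapError(RuntimeError):
    """Raised when the payload reports a non-zero error code (hypothetical)."""


def extract_listings(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the 'data' list, or raise if the payload reports an error.

    Assumes a {'status': {...}, 'data': [...]} envelope; verify the field
    names against the API documentation before relying on them.
    """
    status = payload.get("status") or {}
    error_code = status.get("error_code", 0)
    if error_code:
        raise CoinMarketCapError(f"{error_code}: {status.get('error_message')}")
    return payload.get("data", [])
```

With this in place, the DataFrame could be built from extract_listings(data) so that an error payload stops the run early instead of producing an empty or malformed table.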

2. Normalizing & Analyzing Data

The nested JSON response is normalized into a Pandas DataFrame. We then group by currency name to calculate average percentage changes.

crypto_pipeline.py

import pandas as pd

# Normalize JSON to DataFrame
df = pd.json_normalize(data['data'])

# Calculate mean changes for visualization
df_viz = df.groupby('name', sort=False)[[
    'quote.USD.percent_change_1h',
    'quote.USD.percent_change_24h',
    'quote.USD.percent_change_7d'
]].mean()
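
To make the flattening step concrete, here is a small offline sketch. The payload is hand-written with made-up values and only mimics the nested quote structure. It includes two rows for Bitcoin, as would happen once several pulls have been combined, to show what the grouped mean does:

```python
import pandas as pd

# Hypothetical payload: values are invented, only the nesting matters.
sample_payload = {
    "data": [
        {"name": "Bitcoin", "symbol": "BTC",
         "quote": {"USD": {"percent_change_1h": 0.5, "percent_change_24h": 2.0}}},
        {"name": "Bitcoin", "symbol": "BTC",
         "quote": {"USD": {"percent_change_1h": 1.5, "percent_change_24h": 4.0}}},
        {"name": "Ethereum", "symbol": "ETH",
         "quote": {"USD": {"percent_change_1h": -2.0, "percent_change_24h": 1.0}}},
    ]
}

# Nested keys are joined with dots, e.g. 'quote.USD.percent_change_1h'.
df_sample = pd.json_normalize(sample_payload["data"])

# Select every percent-change column instead of listing them by hand.
change_cols = [c for c in df_sample.columns
               if c.startswith("quote.USD.percent_change_")]

# One row per currency; repeated pulls of the same currency are averaged.
means = df_sample.groupby("name", sort=False)[change_cols].mean()
print(means)
```

With a single pull there is one row per currency, so the mean simply returns that row's values; the averaging only has an effect once historical rows are included.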

3. Visualization

Finally, the data is reshaped into long format (one row per currency and interval) so that Seaborn can plot the time intervals on the X-axis and the percentage change on the Y-axis.

crypto_pipeline.py

import seaborn as sns
import matplotlib.pyplot as plt

# Reshape data for plotting
df_melted = df_viz.stack().to_frame().reset_index()
df_melted = df_melted.rename(columns={0: 'values', 'level_1': 'interval'})

sns.pointplot(x='interval', y='values', hue='name', data=df_melted)
plt.show()
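
The same long format can also be produced with melt, which names the output columns directly. This is an alternative sketch, not the project's code; df_viz_sample is a hypothetical stand-in for df_viz with made-up values, and the label shortening is an optional extra:

```python
import pandas as pd

# Hypothetical stand-in for df_viz: one row per currency, one column per interval.
df_viz_sample = pd.DataFrame(
    {
        "quote.USD.percent_change_1h": [0.5, -0.2],
        "quote.USD.percent_change_24h": [1.0, 2.0],
    },
    index=pd.Index(["Bitcoin", "Ethereum"], name="name"),
)

# melt needs 'name' as a regular column, hence reset_index().
df_long = df_viz_sample.reset_index().melt(
    id_vars="name", var_name="interval", value_name="values"
)

# Shorten the labels ('1h', '24h') so the x-axis ticks stay readable.
df_long["interval"] = df_long["interval"].str.replace(
    "quote.USD.percent_change_", "", regex=False
)
print(df_long)
```

The result can be passed to the same call, sns.pointplot(x='interval', y='values', hue='name', data=df_long). Row order differs from the stack-based version (melt groups by interval first), which does not affect the plot.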

Key Outcomes & Conclusion

This project demonstrates a complete data engineering workflow. Key takeaways include:

API Proficiency: Successfully authenticated and retrieved complex nested data from a third-party API.
Data Cleaning: Used Pandas json_normalize and stack to transform raw nested JSON into a long format suitable for visualization.
Visualization: Used Seaborn's point plots to compare percentage changes across time intervals for multiple assets in a single chart.
