Ayman Ibrahim

Data Engineer at Upwork - Freelancing

Engineering in Alexandria

Egypt


Hi, I'm Ayman Ibrahim!

Data Engineer at Upwork - Freelancing

A professional data engineer who analyzes and visualizes data, extracts key insights, shares findings, and provides data-driven recommendations to inform business decisions. Looking for opportunities in data analytics to develop data models and build data warehouses, data lakes, and data pipelines that solve business challenges.

Social links

No social links added

    Experience

    Education

    Certifications and Badges

    No certifications or badges added

    Projects

    Data Lake with Spark                        

    https://github.com/aymanibrahim/data-lake-spark

    Data Engineer

    • Set up and launched an Amazon EMR cluster.
    • Loaded data from S3 and processed it into analytics tables to build a data lake with Spark.
    • The raw datasets reside as JSON files on Amazon Simple Storage Service (S3).
    • The Extract, Transform, and Load (ETL) pipeline extracts the JSON data files from S3, processes them with Apache Spark on Amazon EMR, and loads the data back into a data lake hosted on S3 as partitioned Parquet files of dimensional tables (see the sketch after this list).
    • The Spark application is written with PySpark, the Python interface for Apache Spark.
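
    A minimal sketch of this kind of PySpark ETL step, assuming hypothetical S3 bucket paths and a hypothetical dimension-table schema; the actual implementation lives in the repository linked above:

        # Read raw JSON from S3, shape it into a dimension table, and write it
        # back to the data lake as partitioned Parquet. Paths and column names
        # are illustrative placeholders, not the project's real schema.
        from pyspark.sql import SparkSession

        spark = SparkSession.builder.appName("data-lake-etl").getOrCreate()

        # Extract: raw JSON files stored on S3
        songs_raw = spark.read.json("s3a://example-raw-bucket/song_data/*/*/*/*.json")

        # Transform: keep the columns that make up the dimension table, deduplicated
        songs_table = (
            songs_raw
            .select("song_id", "title", "artist_id", "year", "duration")
            .dropDuplicates(["song_id"])
        )

        # Load: write back to S3 as Parquet, partitioned to allow partition pruning
        (
            songs_table.write
            .mode("overwrite")
            .partitionBy("year", "artist_id")
            .parquet("s3a://example-lake-bucket/songs/")
        )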

    Data Engineer

    • Set up and launched an Amazon Redshift cluster.
    • Transformed the data into a set of dimensional tables in Redshift.
    • The raw datasets reside as JSON files on Amazon Simple Storage Service (S3), and the Amazon Redshift data warehouse uses a star schema with fact and dimension tables.
    • The Extract, Transform, and Load (ETL) pipeline extracts the data from S3 and stages it in Redshift (see the sketch after this list).
    • Database queries are written in Structured Query Language (SQL) for the PostgreSQL database engine in the sql_queries.py file.
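
    A minimal sketch of the staging-and-insert pattern described above, assuming a hypothetical bucket, IAM role, connection details, and table layout; the project's real statements live in sql_queries.py:

        # Stage raw JSON from S3 into Redshift with COPY, then populate a
        # star-schema fact table with INSERT ... SELECT. Connection details,
        # table names, bucket, and IAM role are hypothetical placeholders.
        import psycopg2

        STAGING_COPY = """
            COPY staging_events
            FROM 's3://example-raw-bucket/log_data'
            IAM_ROLE 'arn:aws:iam::123456789012:role/example-redshift-role'
            FORMAT AS JSON 'auto'
            REGION 'us-west-2';
        """

        INSERT_FACT = """
            INSERT INTO songplays (start_time, user_id, song_id, artist_id, level)
            SELECT DISTINCT e.start_time, e.user_id, s.song_id, s.artist_id, e.level
            FROM staging_events e
            JOIN staging_songs s
              ON e.song_title = s.title AND e.artist_name = s.artist_name;
        """

        def run_etl(host, dbname, user, password, port=5439):
            """Run the staging COPY and the fact-table insert on one connection."""
            conn = psycopg2.connect(
                host=host, dbname=dbname, user=user, password=password, port=port
            )
            try:
                with conn.cursor() as cur:
                    cur.execute(STAGING_COPY)
                    cur.execute(INSERT_FACT)
                conn.commit()
            finally:
                conn.close()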

    Data Engineer

    • Created and automated data pipelines with Airflow.
    • Configured and monitored the data pipelines.
    • The raw datasets reside as JSON files on Amazon Simple Storage Service (S3).
    • The Extract, Transform, and Load (ETL) pipeline extracts the JSON data files from S3, processes them, and loads the data into the Amazon Redshift data warehouse using Apache Airflow.
    • Data quality tests run after the ETL steps have been executed to catch any discrepancies in the datasets (see the sketch after this list).
    • Airflow Directed Acyclic Graphs (DAGs) and plugins are written in Python.
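
    A minimal sketch of an Airflow DAG with a data-quality check after the load step, assuming Airflow 2.x; the connection id, table names, and placeholder staging/load tasks are hypothetical, and the real project uses its own custom operators and plugins:

        # DAG sketch: staging and load steps (placeholders here) followed by a
        # row-count data-quality check. Connection id and table names are
        # hypothetical, not the project's real configuration.
        from datetime import datetime

        from airflow import DAG
        from airflow.operators.empty import EmptyOperator
        from airflow.operators.python import PythonOperator
        from airflow.providers.postgres.hooks.postgres import PostgresHook

        def check_row_counts():
            """Fail the task if any target table is empty after the ETL run."""
            hook = PostgresHook(postgres_conn_id="redshift")  # hypothetical connection id
            for table in ("songplays", "users", "songs"):
                records = hook.get_records(f"SELECT COUNT(*) FROM {table}")
                if not records or records[0][0] == 0:
                    raise ValueError(f"Data quality check failed: {table} is empty")

        with DAG(
            dag_id="etl_with_quality_checks",
            start_date=datetime(2024, 1, 1),
            schedule_interval="@hourly",
            catchup=False,
        ) as dag:
            stage_events = EmptyOperator(task_id="stage_events")  # stands in for S3 -> Redshift staging
            load_fact = EmptyOperator(task_id="load_fact_table")  # stands in for the fact-table load
            quality_checks = PythonOperator(
                task_id="run_data_quality_checks",
                python_callable=check_row_counts,
            )

            stage_events >> load_fact >> quality_checks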

    Languages

    English

    Skills

    Data Analytics

    AWS (Amazon Web Services)

    SQL

    Python

    Data Science

    Amazon S3

    Amazon RDS

    Big Data Engineer

    Amazon Elastic Compute Cloud (Amazon EC2)

    Extract, Transform, Load (ETL)

    Data Warehouse

    AWS Certified

    Spark

    Pipeline
