
Ayman Ibrahim
Data Engineer at Upwork - Freelancing
Engineering at Alexandria
Egypt
Hi, I'm Ayman Ibrahim!
Data Engineer at Upwork - Freelancing
A professional data engineer who informs business decisions by analyzing and visualizing data, drawing key insights, sharing findings, and providing data-driven recommendations. Looking for opportunities in data analytics to develop data models and build data warehouses, data lakes, and data pipelines that solve business challenges.
Experience
Upwork - Freelancing
Data Engineer
July 2019 - Present
• Create and optimize user-friendly relational and NoSQL data models (a sketch of a dimensional model follows below).
• Develop scalable, efficient, cloud-based data warehouses and data lakes.
• Develop and optimize Spark applications, and troubleshoot common errors.
• Configure and automate data pipelines with Airflow; monitor and debug production pipelines.
• Apply Extract, Transform, Load (ETL) pipelines to process massive datasets.
• Perform data quality tests and track data lineage to catch discrepancies in the datasets.
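A minimal sketch of what such a dimensional (star-schema) data model can look like, kept as SQL strings in Python in the same style as the sql_queries.py file mentioned in the projects below; the table and column names here are illustrative assumptions, not an actual schema from these projects.

# Illustrative star-schema definitions held as Python SQL strings.
# Table and column names are hypothetical, for illustration only.

fact_orders_create = """
CREATE TABLE IF NOT EXISTS fact_orders (
    order_id     BIGINT PRIMARY KEY,
    customer_id  BIGINT NOT NULL,
    order_date   DATE NOT NULL,
    amount       DECIMAL(12, 2)
);
"""

dim_customers_create = """
CREATE TABLE IF NOT EXISTS dim_customers (
    customer_id  BIGINT PRIMARY KEY,
    first_name   VARCHAR(128),
    last_name    VARCHAR(128),
    country      VARCHAR(64)
);
"""

create_table_queries = [dim_customers_create, fact_orders_create]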
Projects
Data Engineer
- Set up and launched Amazon EMR cluster.
- Loaded data from S3 and processed it into analytics tables using Spark to build a data lake.
- The raw datasets in JSON files reside on Amazon Simple Storage Service (S3) data storage.
- The Extract, Transform, Load (ETL) pipeline extracts JSON data files from S3, processes them with Apache Spark on Amazon EMR, and loads the data back into a data lake hosted on S3 as partitioned Parquet files of dimensional tables.
- The Spark application is written with PySpark, the Python API for Apache Spark (see the sketch below).
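A minimal PySpark sketch of that flow, reading raw JSON from S3 and writing a partitioned Parquet table back to the S3 data lake; the bucket paths, column names, and partition keys are illustrative assumptions, not the project's actual schema.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical bucket paths and columns, for illustration only.
INPUT_PATH = "s3://my-raw-bucket/events/*.json"
OUTPUT_PATH = "s3://my-lake-bucket/analytics/events_table/"

spark = SparkSession.builder.appName("s3-json-to-parquet").getOrCreate()

# Extract: read the raw JSON event files from S3.
events = spark.read.json(INPUT_PATH)

# Transform: derive partition columns and keep only the analytics fields
# (assumes an 'event_time' timestamp column exists in the raw data).
events_table = (
    events
    .withColumn("event_date", F.to_date("event_time"))
    .withColumn("year", F.year("event_date"))
    .withColumn("month", F.month("event_date"))
    .select("user_id", "event_type", "event_date", "year", "month")
)

# Load: write back to the S3 data lake as partitioned Parquet files.
events_table.write.mode("overwrite").partitionBy("year", "month").parquet(OUTPUT_PATH)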
Data Engineer
- Set up and launched Amazon Redshift cluster.
- Transformed data into a set of dimensional tables in Redshift.
- The raw datasets in JSON files reside on Amazon Simple Storage Service (S3) data storage, and the Amazon Redshift data warehouse uses a star schema with fact and dimension tables.
- The Extract, Transform, Load (ETL) pipeline extracts data from S3 and stages it in Redshift.
- Database queries are written in Structured Query Language (SQL) for the PostgreSQL database engine in the sql_queries.py file (see the sketch below).
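A minimal sketch of the kind of statements sql_queries.py can hold for this flow: a COPY that stages JSON data from S3 into Redshift, and an INSERT that fills a dimension table from the staging table. The bucket, IAM role, and table/column names are placeholders rather than the project's actual values.

# Illustrative SQL strings in the style of sql_queries.py (hypothetical names).

staging_events_copy = """
COPY staging_events
FROM 's3://my-raw-bucket/events/'
IAM_ROLE 'arn:aws:iam::123456789012:role/myRedshiftRole'
FORMAT AS JSON 'auto'
REGION 'us-west-2';
"""

dim_customers_insert = """
INSERT INTO dim_customers (customer_id, first_name, last_name, country)
SELECT DISTINCT customer_id, first_name, last_name, country
FROM staging_events
WHERE customer_id IS NOT NULL;
"""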
Data Engineer
- Created and automated data pipelines with Airflow.
- Configured and monitored the data pipelines.
- The raw datasets in JSON files reside on Amazon Simple Storage Service (S3) data storage.
- The Extract, Transform, Load (ETL) pipeline extracts JSON data files from S3, processes them, and loads the data into the Amazon Redshift data warehouse, orchestrated with Apache Airflow.
- Data quality tests run after the ETL steps have been executed to catch any discrepancies in the datasets.
- Airflow Directed Acyclic Graphs (DAGs) and plugins are written in Python (see the sketch below).
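A minimal Airflow DAG sketch along those lines: a placeholder load step followed by a data quality check that fails the run if a target table is empty. The DAG name, connection ID, and table name are assumptions for illustration; the actual project uses custom operators defined in its own plugins, as noted above.

from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator
from airflow.operators.python import PythonOperator
from airflow.providers.postgres.hooks.postgres import PostgresHook


def check_row_count(table, conn_id, **_):
    """Simple data quality test: fail the task if the target table is empty."""
    hook = PostgresHook(postgres_conn_id=conn_id)
    result = hook.get_first(f"SELECT COUNT(*) FROM {table}")
    if not result or result[0] < 1:
        raise ValueError(f"Data quality check failed: {table} returned no rows")


with DAG(
    dag_id="etl_with_quality_checks",      # hypothetical DAG name
    start_date=datetime(2023, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Placeholder for the real load step (custom operators in the actual project).
    load_to_redshift = EmptyOperator(task_id="load_to_redshift")

    run_quality_checks = PythonOperator(
        task_id="run_quality_checks",
        python_callable=check_row_count,
        op_kwargs={"table": "dim_customers", "conn_id": "redshift"},  # placeholders
    )

    load_to_redshift >> run_quality_checks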
Languages
English
Skills
Data Analytics
AWS (Amazon Web Services)
SQL
Python
Data Science
Amazon S3
Amazon RDS
Big Data Engineer
Amazon Elastic Compute Cloud (Amazon EC2)
Extract, Transform, Load (ETL)
Data Warehouse
AWS Certified
Spark
Pipeline