Jabir Bangrodi

Senior Data Engineer

About Me

Data Engineer with 4 years of hands-on experience designing, optimizing, and governing high-throughput, scalable data solutions, primarily in the financial sector. Expertise in PySpark/Spark SQL and the AWS ecosystem (Glue, S3, Athena, Lambda), with a proven history of migrating legacy systems to modern, high-throughput pipelines and achieving up to 50% faster execution times.

Proficient in data quality and governance, ensuring production data is reliable and accurate for key stakeholders, including analysts and ML engineers.

Work Experience

Data Engineer

Roads and Transport Authority

December 2025 – Present | Dubai, UAE

  • Develop data pipelines with Kafka and Spark Streaming that process real-time data and distribute it to multiple destinations, including sockets, Power BI dashboards, and MongoDB.
  • Perform data operations on databases including MongoDB and MSSQL.

Data Engineering Management and Governance Analyst

Accenture India Pvt Ltd

October 2021 – August 2025 | Bangalore, India

  • Developed highly scalable ETL pipelines using PySpark on AWS Glue to process and transform large data volumes from multiple source systems, with a focus on modular pipeline design.
  • Automated data ingestion workflows with AWS Lambda, Glue, S3, and Athena, reducing manual intervention and improving overall pipeline reliability.
  • Pioneered the migration of legacy SAS processes to PySpark, achieving a seamless transition with 100% output accuracy (Tableau) and a 30%+ improvement in processing speed.
  • Implemented advanced data transformation logic in PySpark (Spark with Python), including window functions, aggregations, and joins, to meet complex business requirements.
  • Optimized PySpark jobs through caching and resource management strategies, cutting job execution time by 50% and reducing cluster resource consumption.
  • Implemented data quality checks, monitoring, and validation routines in SQL to ensure data accuracy and reliability throughout the pipeline.
  • Worked with Hive and Impala to query and store large datasets, optimizing queries for faster retrieval.
  • Collaborated with product managers and engineers to gather and define technical requirements; conducted code reviews and mentored junior developers, fostering a culture of continuous improvement and knowledge sharing within the team.
  • Developed and maintained data pipelines using ETL processes and tools.
  • Provided technical support for big data systems and applications.
  • Analyzed code and data to troubleshoot dashboard exceptions and errors.
  • Developed shell scripts to automate daily jobs, reducing manual intervention, which had taken 7 hours per week, by 80%.
  • Imported and exported data to and from HDFS and Hive using Sqoop.
  • Participated in the development and implementation of a Cloudera Hadoop environment.
  • Loaded and transformed large structured and semi-structured datasets from UNIX file systems into HDFS.
  • Developed and optimized Hive queries for data visualization and reporting.
  • Converted Pig scripts to shell scripts to streamline data processing workflows.
  • Worked with PySpark, including RDDs, DataFrames, and optimization techniques.

Education

Bachelor of Engineering in Computer Science and Engineering

P A College of Engineering

Graduated: August 2021

My Skills

Big Data Technologies

Hadoop
HDFS
Hive
HBase
Oozie
Sqoop
Phoenix

Programming Languages

Python
SQL

Data Processing

ETL
PySpark
Spark SQL
Spark Scala

Cloud

AWS Lambda
AWS Glue
AWS S3
AWS Athena

Governance & Quality

Data Quality Checks
Data Profiling
Metadata Management

DevOps & CI/CD

Bitbucket
Git
GitLab
Shell Scripting

Tools

Jupyter Notebook
SAS EG
PuTTY
VS Code
Dev CLI
Power BI
Databricks
Office 365

Cluster Management Tools

Ambari
CDP
HDP

Data Querying Tools

Hue
SQuirreL
Athena
MongoDB

Data Modeling

DW Dimensional Modeling
Star/Snowflake Schemas

Web Technologies

HTML
CSS

SDLC

Jira
Confluence

Graphic Design

Adobe XD
Photoshop
CapCut

Soft Skills

Mentorship
Stakeholder Collaboration
Problem Solving
Planning
Self-Starter