Jabir Bangrodi

Work Experience

Data Engineer

Roads and Transport Authority

December 2025 – Present | Dubai, UAE

* Develop data pipelines using Kafka and Spark Streaming to process real-time data and distribute it efficiently to multiple destinations such as sockets, Power BI dashboards, and MongoDB.
* Query and perform data operations on multiple databases, including MongoDB and MSSQL.

Data Engineering Management and Governance Analyst

Accenture India Pvt Ltd

October 2021 – August 2025 | Bangalore, India

* Developed highly scalable ETL pipelines using PySpark on AWS Glue to process and transform large data volumes from multiple source systems, focusing on modern, modular pipeline design.
* Automated data ingestion workflows via AWS Lambda, Glue, S3, and Athena, reducing manual intervention and improving overall pipeline reliability.
* Pioneered migration of legacy SAS processes to PySpark, achieving a seamless transition with 100% output accuracy (Tableau) and a 30%+ improvement in processing speed.
* Implemented advanced data transformation logic in PySpark, including window functions, aggregations, and joins, to meet complex business requirements.
* Optimized PySpark jobs using effective caching and resource management strategies, reducing job execution time by 50% and lowering cluster resource consumption.
* Implemented data quality checks, monitoring, and validation routines using SQL to ensure data accuracy and reliability throughout the pipeline.
* Worked with Hive and Impala for querying and storing large datasets, optimizing queries for faster retrieval.
* Collaborated with product managers and engineers to gather and define technical requirements; conducted code reviews and mentored junior developers, fostering a culture of continuous improvement and knowledge sharing within the team.
* Developed and maintained data pipelines using ETL processes and tools.
* Provided technical support for big data systems and applications.
* Analyzed code and data to troubleshoot dashboard exceptions and errors.
* Developed shell scripts to automate daily jobs, cutting manual intervention that had taken 7 hours/week by 80%.
* Imported and exported data into and out of HDFS and Hive using Sqoop.
* Participated in the development and implementation of a Cloudera Hadoop environment.
* Loaded and transformed large structured and semi-structured data from UNIX file systems into HDFS.
* Developed and optimized Hive queries for data visualization and reporting.
* Converted Pig scripts to shell scripts to streamline data processing workflows.
* Worked with PySpark RDDs and DataFrames and applied optimization techniques.

Education

Bachelor of Engineering in Computer Science and Engineering

P A College of Engineering

Graduated: August 2021

My Skills

Big Data Technologies

Hadoop

HDFS

Hive

HBase

Oozie

Sqoop

Phoenix

Programming Languages

Python

SQL

Data Processing

ETL

PySpark

SparkSQL

Spark Scala

Cloud

AWS Lambda

AWS Glue

AWS S3

AWS Athena

Governance & Quality

Data Quality Checks

Data Profiling

Metadata Management

DevOps & CI/CD

Bitbucket

Git

GitLab

Shell Scripting

Tools

Jupyter Notebook

SAS EG

PuTTY

VS Code

Dev CLI

Power BI

Databricks

Office 365

Cluster Management Tools

Ambari

CDP

HDP

Data Querying Tools

Hue

SQuirreL

Athena

MongoDB

Data Modeling

DW Dimensional Modeling

Star/Snowflake Schemas

Web Technologies

HTML

CSS

SDLC

Jira

Confluence

Graphic Design

Adobe XD

Photoshop

CapCut

Soft Skills

Mentorship

Stakeholder Collaboration

Problem Solving

Planning

Self-Starter
