Role Overview:
We are seeking a highly skilled Senior Data Engineer with 5+ years of experience to join our team. This role is focused on designing and building robust data assets through high-performance data pipelines. You will be a key player in modernizing our data infrastructure, transitioning legacy codebases into clean, scalable architectures, and ensuring the highest standards of code quality through Test-Driven Development (TDD).
Key Responsibilities:
Data Pipeline Development:
Design, develop, and maintain complex ETL/ELT pipelines to build high-value data assets.
Legacy Modernization:
Lead the refactoring of legacy codebases, improving readability, maintainability, and performance.
System Optimization:
Perform deep code optimization using Spark SQL and PySpark to handle large-scale datasets efficiently.
Quality Assurance:
Implement a Test-Driven Development (TDD) approach, writing comprehensive unit tests to verify functionality and catch bugs early in the development lifecycle.
Complex Problem Solving:
Isolate and resolve difficult bugs, including those related to performance bottlenecks, concurrency issues, and complex logic flaws.
Cloud Architecture:
Design and deploy solutions across the AWS stack, and explain the trade-offs and benefits of specific services for each use case.
Technical Requirements:
Core Programming & Data Engineering
5+ years of hands-on programming experience with Python and PySpark.
Expertise in Boto3 and various Python frameworks and libraries, adhering strictly to Python best practices (PEP 8).
Strong experience in Spark SQL and PySpark optimization techniques (e.g., partitioning, caching, broadcast joins).
Cloud & Infrastructure (AWS)
Deep architectural knowledge of AWS services, including S3, EC2, Lambda, Redshift, and CloudFormation.
DevOps & Tools
Advanced understanding of Git (branching strategies, PR reviews).
Experience with JFrog Artifactory for dependency management and artifact storage.
Proficiency in CI/CD pipelines and automated testing frameworks.
Professional Attributes:
Analytical Mindset:
Ability to debug complex, non-obvious issues in distributed systems.
Clean Coder:
Passion for writing "clean code" and mentoring junior engineers on maintainability.
Architectural Thinking:
Ability to explain the "why" behind choosing specific AWS components over others.