Senior Data Platform Engineer
As a Senior Data Platform Engineer, you will drive the architectural execution of the data platform. You will own the system's scalability and consistency, ensuring the Operational Data Store serves as a reliable, highly available operational source for all global Master Data. You will act as a technical leader, mentoring the engineering team and collaborating with cross-functional integration teams to ensure seamless data flow.
Expanded Responsibilities
Platform Architecture & Management:
Architect and maintain mission-critical data lake and warehouse environments on Google Cloud Platform (GCP), utilizing services such as BigQuery, Pub/Sub, and Cloud Storage.
Infrastructure & Deployment:
Design, deploy, and manage robust, scalable infrastructure using Terraform to ensure environmental consistency across all deployment stages.
Data Pipeline Engineering:
Engineer high-performance, distributed data pipelines for real-time streaming and batch processing using technologies like Dataflow and Kafka. Lead workflow orchestration by designing, deploying, and monitoring complex Directed Acyclic Graphs (DAGs) in Apache Airflow (Cloud Composer).
Software Development:
Develop custom backend tooling, internal APIs, and microservices in Golang and Python to enhance platform capabilities.
Data Governance & Modelling:
Manage the end-to-end data warehouse lifecycle, employing dbt (Data Build Tool) for clean, documented, and tested data models.
DevOps & Reliability:
Drive DevOps and Site Reliability Engineering (SRE) practices, owning CI/CD pipelines and utilizing Docker for seamless, highly reliable deployments.
Database Management:
Architect and optimize various database engines (SQL and NoSQL), focusing on performance tuning, schema design, and security.
Technical Profile Additions
Programming - Advanced proficiency in Golang (for systems) and Python (for data/automation).
Infrastructure as Code (IaC):
Terraform
CI/CD & Version Control:
Jenkins, GitLab
Programming Languages:
Python (with expertise in FastAPI and Prefect; well-versed in functional programming and a strong grasp of object-oriented programming)
GCP Expertise:
Cloud Run (Services, Jobs, Functions)
Databases: BigQuery, Cloud SQL for PostgreSQL, Cloud Spanner
Cloud Storage
Managed Airflow (Cloud Composer)
Pub/Sub, Eventarc
IAM: Ability to construct necessary permissions following the principle of least privilege; capability to analyze permission requests for potential security consequences.
Artifact Registry
GCP Monitoring and Alerting
Secret Manager
Vertex AI and Workbench Instances: Setup and permission management.
GCP Networking
Containerization:
Docker
Architecture:
Advanced skills in architecting data pipelines in the cloud (GCP preferred)
Data Architecture Theory:
Advanced understanding of Data Mesh Architecture and its comparison to the centralized approach.
INTERMEDIATE LEVEL IS ACCEPTABLE:
Transformations:
dbt
Frontend Technologies:
React.js
NICE TO HAVE:
Secrets Management:
Akeyless
Project Management & Documentation:
Jira, Confluence, ServiceNow
Data Ingestion & Reverse ETL:
Fivetran (including Fivetran Activations)
Educational Background & Qualifications
Required
Education:
Master’s or Bachelor’s Degree in Computer Science, Data Engineering, Software Engineering, or a related technical field (e.g., Electromechanical Engineering with a focus on Automation)
Specialized Coursework:
Strong academic foundation in Distributed Systems, Database Management Systems (DBMS), Algorithm Design, and Real-time Computing.
Preferred
Certifications:
Google Cloud Professional Data Engineer or Google Cloud Professional Cloud Architect.
APICS or similar certifications in Logistics/Supply Chain (a significant plus for understanding the "why" behind the warehouse data).