Job Description:
Title:
Lead Data Engineer
DCF Level:
L40
About the Role
We are seeking a highly skilled and delivery-focused Lead GCP Data Engineer to support the design, development, and implementation of next-generation enterprise data and AI platforms on Google Cloud Platform (GCP).
This role will work closely with Enterprise Architects, platform leaders, and cross-functional engineering teams to build scalable, reusable, and AI-ready data foundations that enable advanced analytics, intelligent automation, and enterprise AI adoption.
The ideal candidate combines strong hands-on expertise in cloud-native data engineering, modern data platform development, semantic data enablement, and scalable pipeline engineering with the ability to lead engineering teams and drive high-quality delivery across multiple initiatives.
This role holds a critical leadership position within the engineering organization, driving implementation excellence, mentoring teams, and operationalizing modern data architecture patterns.
Key Responsibilities
1. Enterprise Data Platform Engineering
Design, develop, and optimize scalable cloud-native data platforms and pipelines on GCP.
Implement robust batch, streaming, and event-driven data processing solutions supporting enterprise analytics and AI use cases.
Collaborate with Enterprise Architects to translate target-state architecture into scalable engineering implementations.
Contribute to modernization of legacy data ecosystems into reusable, governed, and AI-ready cloud platforms.
Support implementation of scalable ingestion, transformation, serving, and orchestration frameworks.
2. Data Product Engineering
Develop reusable and domain-oriented data products aligned with data mesh and data-as-a-product principles.
Implement scalable and modular data pipelines supporting multiple downstream consumers including analytics, AI/ML, and operational applications.
Contribute to implementation of:
Data contracts
Schema management
Metadata enrichment
Data quality frameworks
Reusable transformation patterns
Enable discoverability, trust, and operational reliability of enterprise data assets.
3. Semantic Layer & Consumption Enablement
Support implementation of semantic and business-consumption layers that simplify enterprise data access.
Collaborate with analytics and BI teams to enable standardized business metrics, reusable dimensions, and governed KPI definitions.
Contribute to semantic modeling and metadata integration initiatives supporting self-service analytics and AI consumption.
Assist in improving enterprise data usability, consistency, and discoverability across platforms.
4. GCP-Native Engineering & Development
Develop and optimize solutions using GCP-native services and related tooling, including:
BigQuery
Dataflow
Dataproc
DBT
Pub/Sub
Cloud Storage
Cloud Composer (Airflow)
Cloud SQL
Build scalable ETL/ELT frameworks and real-time streaming pipelines.
Optimize data processing performance, reliability, scalability, and cost efficiency.
Implement CI/CD pipelines and engineering automation for data platform delivery.
5. AI/ML & GenAI Data Enablement
Build AI-ready data pipelines and scalable feature engineering workflows supporting enterprise AI initiatives.
Support integration with:
Vertex AI
BigQuery ML
Vector databases
LangChain
Generative AI Studio
Contribute to implementation of RAG architectures, semantic search, and AI-assisted data interaction patterns.
Partner with AI/ML teams to operationalize scalable ML and GenAI workflows.
6. Engineering Leadership & Delivery Excellence
Lead day-to-day engineering activities across multiple data engineering workstreams.
Guide and mentor junior and mid-level data engineers on modern engineering best practices.
Ensure adherence to coding standards, architecture guidelines, and operational best practices.
Drive engineering quality through automated testing, observability, monitoring, and performance optimization.
Collaborate with architects, product owners, analysts, and client stakeholders to ensure successful delivery outcomes.
7. Governance, Reliability & Observability
Implement data governance, lineage, monitoring, and observability frameworks.
Support enforcement of enterprise standards around security, reliability, scalability, and operational readiness.
Contribute to platform monitoring, incident management, and continuous improvement initiatives.
Ensure production readiness of pipelines and data services through robust testing and validation processes.
Technical Expertise Required
Area | Skills / Technologies
Cloud Data Engineering | GCP, BigQuery, Dataflow, Dataproc, Pub/Sub, Cloud Storage, Cloud SQL
Data Transformation | DBT, PySpark, SQL, ETL/ELT frameworks
Streaming & Pipelines | Apache Beam, real-time processing, event-driven architectures
Semantic Layer & Modeling | Semantic modeling concepts, Looker modeling, business metrics standardization
AI/ML Enablement | Vertex AI, BigQuery ML, LangChain, Vector Databases, GenAI integration
Orchestration & Automation | Cloud Composer (Airflow), CI/CD, Workflows
Metadata & Governance | Data Catalog, lineage, metadata management, observability frameworks
Programming | Python, SQL, PySpark
Qualifications
Bachelor’s or Master’s degree in Computer Science, Engineering, Information Systems, or a related field.
7+ years of experience in data engineering and cloud-native data platform development.
4+ years of hands-on experience delivering enterprise-scale solutions on GCP.
Strong expertise in building scalable batch and streaming data pipelines.
Experience working on modern enterprise data platforms supporting analytics, AI/ML, and GenAI use cases.
Good understanding of semantic layer concepts, reusable data models, and governed data consumption patterns.
Experience working within large-scale data modernization and cloud transformation initiatives.
Strong problem-solving, debugging, and performance optimization skills.
Proven ability to lead engineering teams and collaborate across architecture, product, and business functions.
Excellent communication and stakeholder management skills.
GCP certifications such as Professional Data Engineer preferred.
Location:
DGS India - Mumbai - Thane Ashar IT Park
Brand:
Merkle
Time Type:
Full time
Contract Type:
Permanent