Data Architect – Performance Optimization

Noida, India | Mid | Posted 2026-04-21

Don't apply into the void — reach the hiring manager

ResuMail finds the recruiters and hiring managers behind this Data Architect – Performance Optimization role at Algoworks, drafts a personalised outreach email, and schedules the send, so your application actually gets seen.


About this role

Role: Data Architect – Performance Optimization
Location: India, Remote
Experience: 8 Years
Company: Algoworks (www.algoworks.com)

About the company

Algoworks is an award-winning artificial intelligence, engineering services and experience transformation firm with offices across the United States, Europe, South America and India. We bring together a global team of engineers, architects, designers, researchers and operators united by rigor, accountability and a commitment to delivering measurable results.

For over 20 years, Algoworks has partnered with Fortune 500 organizations across the Americas, Europe and Asia to define, build and run technology that drives meaningful business outcomes. Our work combines human-centered design, engineering excellence and AI-powered capabilities to solve complex challenges with clarity and precision. Innovation, particularly in the responsible application of AI, is embedded in how teams approach problem-solving and continuous improvement.

At Algoworks, growth is continuous and closely tied to impact. Teams collaborate across geographies and disciplines, strengthening outcomes through shared insight and collective expertise. The culture values transparency, open dialogue and an environment where every voice is heard and contribution is recognized. Through collaboration, accountability and a focus on results, Algoworks operates at the intersection of technology and people, building not only advanced systems but strong global teams that elevate performance and create lasting impact.

Role overview

We are seeking a highly skilled Data Architect – Performance Optimization Expert with deep expertise in SQL, PySpark, and Databricks performance tuning. This role focuses on handling large-scale data workloads, optimizing complex transformations, and designing highly efficient data pipelines within a Medallion Architecture (Bronze, Silver, Gold layers).

The ideal candidate will bring strong hands-on experience in query optimization, partitioning strategies, clustering, and managing high-volume read/write operations using modern Lakehouse technologies such as Delta Lake and Apache Iceberg.

Key responsibilities

1. Data pipeline development and optimization
Design, develop, and optimize scalable data pipelines in Databricks using PySpark and SQL.
Optimize complex SQL queries and Spark jobs for performance, cost, and scalability.
Handle large-scale transformations and complex joins across TB–PB scale datasets.

2. Architecture and data modeling
Implement and maintain Medallion Architecture (Bronze, Silver, Gold layers).
Apply strong data modeling practices for efficient data processing and access.
Build reusable frameworks and enforce best practices for performance optimization.

3. Performance engineering
Define and enforce partitioning, bucketing, and clustering strategies (including Z-ordering).
Optimize read and write performance for large datasets.
Tune Spark jobs using techniques such as broadcast joins, caching and persistence, Adaptive Query Execution (AQE), and shuffle optimization.
Analyze execution plans (Spark UI, DAGs) and apply cost-based optimization techniques.
Address data skew, large joins and shuffle-intensive workloads.

4. Databricks and lakehouse optimization
Work extensively with Delta Lake and Apache Iceberg tables.
Optimize cluster configurations (autoscaling, instance types, memory tuning).
Manage Databricks jobs, workflows, and production pipelines.
Ensure efficient file sizing, compaction, and storage optimization.

5. Quality, reliability and troubleshooting
Identify performance bottlenecks and implement optimization strategies.
Ensure data quality, consistency, and reliability across pipelines.
Debug complex data processing and system performance issues.

6. Collaboration and stakeholder engagement
Collaborate with Analytics, BI, Product, and DevOps teams.
Translate business requirements into scalable and optimized data solutions.

Required skills and experience

Core technical skills
Strong expertise in SQL query optimization.
Advanced hands-on experience with PySpark / Apache Spark.
Hands-on experience with the Databricks platform.
Deep understanding of partitioning strategies, clustering (Z-ordering, indexing), and file sizing and compaction.
Experience with Delta Lake.
Experience with Apache Iceberg or similar open table formats.
Strong understanding of distributed data processing concepts.

Performance optimization expertise
Proven experience optimizing large joins, aggregations, and skewed data scenarios.
Strong knowledge of Spark execution plans, DAG analysis, and query pruning techniques.
Experience with cost-based optimization and performance tuning methodologies.

Data engineering and architecture
Strong experience in Medallion Architecture.
Experience building end-to-end ETL/ELT pipelines.
Understanding of batch and real-time data processing systems.
Familiarity with data modeling techniques.

Databricks and ecosystem
Expertise in cluster tuning and configuration.
Experience with Databricks Jobs, Workflows, and Notebooks.
Exposure to production-grade data pipeline orchestration.

Good to have skills
Experience with Azure Data Factory, Microsoft Fabric, or Azure Synapse Analytics.
Experience in multi-tenant data architectures.
Exposure to data governance and security frameworks.

Desired attributes
Strong problem-solving and analytical thinking.
Ability to debug and optimize complex systems.
Effective communication and stakeholder management skills.
Ability to work in a fast-paced, data-driven environment.

Mandatory skills
ETL | SQL | PySpark | Databricks | Delta Lake | Apache Iceberg | Microsoft Azure | Azure Data Factory | Microsoft Fabric | Azure Synapse Analytics

Interview process
2 rounds of discussion.

How to get this job at Algoworks

Don't rely on the portal. Cold applications for a role like Data Architect – Performance Optimization land in a pile of hundreds. A direct, personalised message to the hiring manager or a referrer is the fastest way in.
Find the right person. ResuMail surfaces the actual recruiters and hiring managers at Algoworks — not a generic careers inbox.
Send tailored outreach. ResuMail drafts an email personalised to your resume and this role, then paces and schedules sends so you stay out of spam.
Follow up. One polite nudge after 5–7 days roughly doubles reply rates — scheduled for you.

Reach Algoworks's hiring managers today.

Free to start. No credit card. Built for Indian job seekers.
