Come work at a place where innovation and teamwork come together to support the most exciting missions in the world!
Description:
We are seeking a talented Lead Software Engineer - Performance to deliver roadmap features of the Enterprise TruRisk Platform, which helps customers Measure, Communicate, and Eliminate Cyber Risks.
You will lead the performance engineering efforts across Spark, Kafka, Elasticsearch, and Middleware APIs, ensuring that our real-time data pipelines and services meet enterprise-grade SLAs.
As part of our high-performing engineering team, you will design and execute performance testing strategies, identify system bottlenecks, and work with development teams to implement performance improvements that support the processing of billions of cybersecurity events a day across our data platform.
Responsibilities:
Own the performance strategy for each release across distributed systems, including Hadoop, Spark, Kafka, Elasticsearch/OpenSearch, Big Data components, and APIs.
Define, develop, and execute performance test plans, load tests, stress tests, and soak tests.
Create realistic performance test scenarios for data pipelines and microservices based on production-like workloads.
Proactively identify bottlenecks, resource contention, and latency issues using tools such as JMeter, Spark UI, Kafka Manager, Elastic Monitoring, and AppDynamics.
Provide deep-dive analysis and recommendations on tuning and scaling Spark jobs, Kafka topics/partitions, ES queries, and API endpoints.
Collaborate with developers, architects, and infrastructure teams to integrate performance feedback into design and implementation.
Simulate and benchmark real-time and batch data flows at scale using synthetic and production-like datasets, and own the synthetic data generator framework end to end.
Lead the initiative to build a performance testing framework that integrates with CI/CD pipelines.
Establish and track SLAs for throughput, latency, CPU/memory utilization, and garbage collection.
Create performance dashboards and visualizations using Prometheus/Grafana, Kibana, or equivalent.
Document performance test findings and create technical reports for leadership and engineering teams.
Recommend performance optimizations to Dev and Platform groups.
Take responsibility for optimizing the overall cost.
Contribute to feature development and fixes in addition to performance benchmarking.
Qualifications:
Bachelor's degree in Computer Science, Engineering, or a related field.
8+ years of overall experience in distributed systems and backend performance engineering.
4+ years of Java development experience with microservices architecture.
Proficient in scripting (Python, Bash) for automation and test data generation.
4+ years of hands-on experience with Apache Spark: performance tuning, memory management, and DAG optimization.
3+ years of experience with Kafka: topic optimization, producer/consumer tuning, and lag monitoring.
3+ years of experience with Elasticsearch/OpenSearch: query profiling, indexing strategies, and cluster optimization.
3+ years of experience with performance testing tools such as JMeter or similar.
Excellent programming and design skills, and hands-on experience with Spring and Hibernate.
Deep understanding of middleware and microservices performance, including REST APIs.
Strong knowledge of profiling, debugging, and observability tools (e.g., Spark UI, Athena, Grafana, ELK).
Experience designing and running benchmarks at scale for high-throughput environments at petabyte (PB) scale.
Experience with containerized workloads and performance testing in Kubernetes/Docker environments.
Solid understanding of cloud-native architecture (OCI) and distributed systems design.
Strong knowledge of Linux operating systems and performance-related improvements.
Familiarity with CI/CD integration for performance testing (e.g., Jenkins, GitHub).
Knowledge of data lake architecture, caching solutions, and message queues.
Strong communication skills and experience influencing cross-functional engineering teams.
Additional Plus Competencies:
Prior experience with any analytics platform on Big Data would be a huge plus.