About SatSure
SatSure is a deep-tech decision intelligence company operating at the nexus of agriculture, infrastructure, and climate action. We turn earth observation data into actionable insights for governments, financial institutions, and enterprises across the developing world, reliably and at scale.
Role
You will be the architect of the model’s latent space, designing foundation models for multi-spectral, multi-temporal, and multi-resolution geospatial data.
This is a hands-on role involving prototyping, experimentation, and large-scale training. You will work across representation learning, model scaling, and spatiotemporal modeling to build systems that generalize across sensors, geographies, and time.
Responsibilities
Representation Learning
Design and implement self-supervised learning (SSL) objectives (e.g., Masked Autoencoders, DINO-style methods, contrastive learning) tailored for geospatial data
Develop multi-modal representations spanning optical, SAR, elevation, and derived signals
Ensure representations transfer effectively across tasks such as segmentation, classification, and change detection
Design evaluation strategies to measure generalization across geographies, sensors, and time
Model Development & Scaling
Design and scale models based on Vision Transformers (ViT), hybrid architectures, or State Space Models (e.g., Mamba) to large parameter regimes
Apply modern training techniques such as RMSNorm, FlashAttention, mixed precision, and gradient checkpointing
Run scaling experiments, ablations, and architecture explorations grounded in empirical rigor
Leverage insights from scaling behavior to make compute-efficient decisions across model size, data, and training strategy
Temporal Dynamics
Develop methods to model time-series satellite data, capturing:
Seasonal patterns
Temporal dependencies
Long-term land-use changes
Explore sequence modeling, memory mechanisms, and temporal tokenization strategies
Systems-Level Thinking
Design ML systems as end-to-end pipelines (data ingestion → curation → training → evaluation → deployment → feedback)
Make explicit trade-offs between model quality, latency, cost, and data freshness
Work with platform teams to optimize:
Distributed training (FSDP, DeepSpeed)
GPU utilization
Data pipelines and experiment throughput
Build reusable components and abstractions, not one-off models
Preferred Background
Experience
5–8 years of experience in ML research or applied research roles
Experience in large-scale foundation model development (vision, multimodal, speech, or related domains)
Experience training and/or fine-tuning billion-parameter models
Experience working with sequence, video, or temporal data
Exposure to geospatial foundation models such as:
Prithvi
Clay
Segment Anything Model (SAM) (nice to have)
Technical Skills
Expert-level proficiency in PyTorch or JAX
Strong experience with:
Distributed training (FSDP / DeepSpeed)
Large-scale datasets and training pipelines
Familiarity with transformer architectures and training dynamics
Bonus: CUDA / performance optimization experience
Additional Strengths
Familiarity with efficient scaling techniques (e.g., Mixture of Experts) is a plus
Strong experimental rigor and ability to design meaningful ablations
Track record of publishing or contributing to state-of-the-art research in representation learning or generative modeling