Troubleshoot and resolve technical issues across the Google Cloud AI/ML portfolio, focusing on customer-reported deployment failures, model performance degradation, and infrastructure-related problems.
Work directly with customers on their ML deployments, including generative AI models, to ensure production readiness and high availability.
Use coding and scripting skills (primarily Python) to read code and to debug and reproduce customer issues within their ML models (TensorFlow, PyTorch) or deployment environments (Kubernetes, Compute Engine).
Manage customer problems through effective diagnosis, clear documentation, and the development and implementation of new investigation tools to increase diagnostic speed.
Develop an understanding of Google Cloud's AI/ML solutions and share this knowledge to upskill the wider global support organization.
Minimum qualifications:
Bachelor's degree in Computer Science, Engineering, Mathematics, a related technical field, or equivalent practical experience.
2 years of experience in a technical role such as technical support, software engineering, or solutions engineering.
Experience coding in one or more general-purpose languages (e.g., Python, Java, Go, C, or C++), including data structures and algorithms.
Experience with Artificial Intelligence (AI) concepts and Machine Learning (ML) techniques.
Experience with computer networking (e.g., TCP/IP, DNS, load balancing, routing) and Linux/Unix system administration.
Preferred qualifications:
Professional-level certification on Google Cloud, such as the Professional Machine Learning Engineer or Professional Cloud Architect.
Experience with Google Cloud's AI/ML product portfolio, including Vertex AI (Vertex AI Workbench, Pipelines, Endpoints, TensorBoard) and Generative AI tools (Gemini, Gen AI Studio).
Experience in specialized ML areas such as Natural Language Processing (NLP), Computer Vision, or Recommendation Systems.
Experience with public cloud infrastructure and core services (e.g., Compute Engine, Cloud Storage, BigQuery).
Knowledge of ML frameworks such as TensorFlow, Keras, or PyTorch.