Career Category
Engineering
Job Description
The Sr Associate IS Engineer will focus on data foundations and emerging AI technologies, assisting in designing, developing, and maintaining foundational data engineering solutions that meet business needs and ensure the availability and performance of critical systems.
You will be part of a team specifically focused on Amgen’s Commercialization business. Amgen uses fully integrated, best-in-class technologies, including enterprise platforms such as AWS, Databricks, Salesforce, Planisware and Anaplan; enterprise collaboration platforms such as O365, SharePoint Online and MS Teams; and vertical-specific platforms in R&D, Operations, Process Development and Commercial/Marketing areas.
The candidate should also have experience working with large, diverse and globally dispersed teams within a matrixed organization. Extensive collaboration with global cross-functional teams is required to ensure seamless integration and operational excellence. The ideal candidate will have a strong background in the end-to-end software development lifecycle, strong experience with data integration, and be a Scaled Agile practitioner.
Roles & Responsibilities:
Design, develop, and maintain data solutions for data generation, collection, and processing
Be a key team member who assists in the design and development of data pipelines
Create data pipelines and ensure data quality by implementing ETL processes to migrate and deploy data across systems
Contribute to the design, development, and implementation of data pipelines, ETL/ELT processes, and data integration solutions
Work with the team to troubleshoot and resolve technical issues. Identify and fix bugs and defects
Develop and optimize data models, ETL processes, and workflows across structured and unstructured data sources.
Take ownership of data pipeline projects from inception to deployment, managing scope, timelines, and risks
Develop and maintain data models, data dictionaries, and other documentation to ensure data accuracy and consistency
Implement data security and privacy measures to protect sensitive data
Leverage cloud platforms (AWS preferred) to build scalable and efficient data solutions
Collaborate and communicate effectively with product teams
Collaborate with Architects, Business SMEs, and Data Scientists to design and develop end-to-end data pipelines to meet fast-paced business needs across geographic regions
Identify and resolve complex data-related challenges
Adhere to best practices for coding, testing, and designing reusable code and components
Explore new tools and technologies that will help to improve ETL platform performance
Participate in sprint planning meetings and provide estimations on technical implementation
Ensure data security, compliance, and role-based access control across data environments.
Basic Qualifications:
Master’s or Bachelor’s degree and 5 to 8 years of Computer Science, IT or related field experience
Must-Have Skills:
Hands-on experience with data technologies and platforms, such as Databricks, Apache Spark (PySpark, SparkSQL), workflow orchestration, performance tuning
Proficiency in data analysis tools (e.g., SQL)
Solid understanding of data modeling, schema design and data pipeline orchestration.
Experience with ETL tools such as Apache Spark, and with Python packages for data processing and machine learning model development
Strong understanding of data modeling, data warehousing, and data integration concepts
Demonstrated ability to design, build, and test automation workflows
Strong understanding of web services, databases, and ETL concepts
Eagerness to learn and grow in an engineering environment
Ability to work well within a team and communicate effectively
Working experience in Agile or Scrum methodologies
Good-to-Have Skills:
Experience with software engineering best practices, including but not limited to version control, infrastructure-as-code, CI/CD, and automated testing
Understanding of data governance frameworks, tools, and best practices.
Knowledge of data protection regulations and compliance requirements (e.g., GDPR)
Familiarity with vector databases for semantic search and embeddings-based retrieval.
Exposure to knowledge graph technologies (e.g., Neo4j, AWS Neptune, RDF/SPARQL, graph data modeling).
Experience working with Large Language Models (LLMs) or Generative AI systems, particularly in retrieval-augmented generation (RAG) or data augmentation contexts.
Understanding of MLOps or DataOps principles, including CI/CD for data workflows.
Ability to collaborate in cross-functional teams and communicate complex technical concepts clearly
Certifications related to Agile or any software or cloud platform are advantageous
Experience with automation tools or platforms (e.g., Microsoft Power Automate, UiPath, PowerShell, or similar)
Soft Skills:
Excellent analytical and troubleshooting skills
Comfortable working effectively with global, virtual teams
High degree of initiative and self-motivation