Key Responsibilities:
- Collaborative Engineering: Work within a larger team to rapidly develop proof-of-concept prototypes that validate research ideas, and integrate them into production systems and infrastructure.
- Performance Analysis: Conduct in-depth profiling and tuning of operating systems and large-scale distributed systems, leveraging heterogeneous hardware (CPU, NPU).
- Documentation and Reporting: Maintain clear technical documentation of research findings, design decisions, and implementation details to ensure reproducibility and facilitate knowledge transfer within the team.
- Research & Technology Exploration: Stay current with the latest advancements in AI infrastructure, cloud-native technologies, and operating systems, for example techniques for executing inference workloads efficiently through SW/HW co-design, or exploiting workload characteristics to prefetch memory and minimize communication.
- Stakeholder Communication: Present project milestones, performance metrics, and key findings to internal stakeholders.
Knowledge, Skills, Experience and Qualifications:

Required:
- Bachelor's or Master's degree in Computer Science or a related technical field.
- A solid background in operating systems, distributed systems, and/or ML systems.
- Excellent programming skills, with mastery of at least one language such as C/C++.
- Good communication and teamwork skills.
- Comfortable with research methodology.

Desired:
- Familiarity with current LLM architectures (e.g. Llama3, DeepSeek V3).
- Familiarity with production LLM serving systems and inference optimizations (e.g. vLLM).
- Experience with accelerator programming (e.g. CUDA, Triton) and communication libraries (e.g. NCCL).