Codec Avatars Large Scale Experimentation Engineer

Meta Reality Labs’ Codec Avatar Research team is building technology to enable immersive, photorealistic social presence. Codec Avatars are real-time, live-drivable representations that match the appearance of their users. As part of the Lab’s Instant Codec Avatar group, you’ll work to scale up Codec Avatar technology by modeling the diversity of human appearance and applying that model to the process of rapidly generating new avatars. This role is focused on our Large Scale Experimentation efforts, which both support our new Research SuperCluster compute resource and use that resource to run large-scale machine learning experiments that advance the state of the art in Codec Avatar technology. In this role, you will work with a team of software engineers, research engineers, and research scientists to plan and deliver the software systems needed to support large-scale model training over thousands of GPUs. These systems ingest, store, and serve some of the largest ML training datasets in the world, and coordinate complex workflows composed of a mixture of traditional graphics and ML algorithms. You’ll also design and run research experiments using those workflows to advance our understanding of how appearance modeling scales over large populations.
Codec Avatars Large Scale Experimentation Engineer Responsibilities
  • Develop and debug machine learning workflows on a large multi-node cluster
  • Automate data ingress into the cluster
  • Implement compute allocation policy for the cluster
  • Define and implement strategy for compute environment management and deployment
  • Develop a data read/access layer using a proprietary framework
  • Define and communicate cluster software requirements, based on research needs
  • Enable adoption of the cluster by additional research use cases
  • Define, design, and implement automated testing
  • Serve as point of contact for hardware and software questions regarding cluster capabilities
  • Report on progress and present technical risks, challenges, and status to executive management
  • Partner with Data Collection and Asset Generation teams to specify and ingest assets required for large scale training
  • Partner with Codec Avatars Universal Avatar Research team to support large-scale experimentation based on Python workflows
  • Partner with Research SuperCluster production engineering team to support reliable operation
  • Partner with Research SuperCluster storage engineering team to support development of features required for Codec Avatars datasets
  • Partner with security, privacy, and policy teams to ensure workflow compliance with company policy
Minimum Qualifications
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.
  • Experience with multi-node ML training workflows and frameworks
  • Experience developing and debugging distributed systems
  • Experience operating in a self-directed environment with multiple stakeholders across multiple teams
  • Proven communication skills, including experience driving decision making
  • Experience working with cross-functional teams including hardware, software, network, legal, privacy, and security
  • Proven Python experience
  • Proven Linux/shell scripting development experience
  • Experience developing and supporting reliable multi-stage data pipelines
  • Proven quantitative reasoning skills, analyzing trade-offs of different hardware and software solutions
Preferred Qualifications
  • Master's or higher degree in Computer Science or related technical field, or equivalent experience
  • 8+ years of experience developing software related to distributed systems or ML workflows
  • Experience developing or applying computer graphics algorithms
  • Experience developing or applying computer vision algorithms
  • 5+ years of experience developing workflows for large scale AI training
  • Understanding of deep neural network training
  • Experience with securing sensitive data (encryption, access control, audit logging)
  • Experience with HPC (High Performance Computing)
  • Experience with scheduling systems such as Slurm or Kubernetes
  • Experience with large scale object storage services (S3 or similar)
  • Experience in research or converting research to products
  • Experience using git
  • Experience using Conda
  • Experience with SQL databases
  • Experience with C++
About Meta
Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology. People who choose to build their careers by building with us at Meta help shape a future that will take us beyond what digital connection makes possible today—beyond the constraints of screens, the limits of distance, and even the rules of physics.
Meta is committed to providing reasonable support (called accommodations) in our recruiting processes for candidates with disabilities, long term conditions, mental health conditions or sincerely held religious beliefs, or who are neurodivergent or require pregnancy-related support. If you need support, please reach out to accommodations-ext@fb.com.
$173,000/year to $241,000/year + bonus + equity + benefits

Please note that the national salary range listed in the job posting reflects the new hire salary range across levels and U.S. locations that would be applicable to the position. Final salary will be commensurate with the candidate’s final level and final location. Also, this range represents base salary only and does not include the company bonus, incentive for sales roles, equity, or benefits, if applicable.
Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics. You may view our Equal Employment Opportunity notice here. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. We may use your information to maintain the safety and security of Meta, its employees, and others as required or permitted by law. You may view Meta's Pay Transparency Policy, Equal Employment Opportunity is the Law notice, and Notice to Applicants for Employment and Employees by clicking on their corresponding links. Additionally, Meta participates in the E-Verify program in certain locations, as required by law.

Meta is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, you may contact us at accommodations-ext@fb.com.