- Full Time
AI/ML DI Privacy Infrastructure Engineer
Salary Range: $124,800 – $145,600
Location: Seattle, WA – Hybrid role
The Data Infrastructure group within the AI/ML organization drives the analytics, experimentation, and ML feature engineering behind the machine learning technologies we all love in our devices. Our mission is to provide cutting-edge, reliable, and easy-to-use infrastructure for ingesting, storing, processing, and interacting with data while keeping users’ data private and secure.
The Data Security and Privacy team sits within the AI/ML Data Infrastructure group and is looking for engineers who want to bring their passion for security and privacy to help build world-class data infrastructure that puts users’ privacy first while enabling data engineers and scientists to produce best-in-class ML data products.
The ideal candidate will have outstanding communication skills, proven data infrastructure design and implementation capabilities, strong business acumen, and an innate drive to deliver results. They will be a self-starter, comfortable with ambiguity, and will enjoy working in a fast-paced, dynamic environment.
3+ years of experience scaling and operating distributed systems such as big data processing engines (e.g., Apache Hadoop, Apache Spark), distributed file systems and object stores (e.g., HDFS, Ceph, S3), streaming systems (e.g., Apache Flink, Apache Kafka), resource management systems (e.g., Apache Mesos, Kubernetes), or identity and access management systems (e.g., Apache Ranger, Apache Sentry, OPA)
3+ years of experience with infrastructure as code and systems automation
Fluency in Java or a similar language
Ability to debug complex issues in large scale distributed systems
Passion for building infrastructure that is reliable, easy to use, and easy to maintain
Excellent communication and collaboration skills
Experience with Spark and ETL processing pipelines is helpful, but not required
Experience with systems security, identity protocols, and encryption is helpful, but not required
Scale and operationalize privacy and security systems in a big data environment, leveraging technologies such as Spark, Kafka, Presto, Flink, and Hadoop, across both on-premises and AWS environments, through automation and infrastructure as code
Ensure the data infrastructure delivers reliable, high-quality data with consistent SLAs, supported by strong monitoring, alerting, and incident response, and by continual investment in reducing tech debt
Write code and documentation, and participate in code reviews and design sessions