AI/ML DI Privacy Infrastructure Engineer
W2 Contract – Bayside Solutions, Inc. is not able to sponsor any candidates at this time.
Salary Range: $114,400 – $135,200 per year
Location: Cupertino, CA – Hybrid Schedule
Summary
The Data Infrastructure division of the AI/ML department powers the analytics, testing, and ML feature engineering that underpin the machine learning technologies in our gadgets. Our goal is to offer state-of-the-art, dependable, and simple infrastructure for ingesting, storing, processing, and interacting with data while protecting the privacy and security of users.
Within the AI/ML Data Infrastructure group, the Data Security and Privacy team is looking for engineers who want to bring their passion for security and privacy to building a world-class data infrastructure that prioritizes user privacy while empowering data engineers and scientists to create unrivaled ML data products.
Key Qualifications
- 3+ years of experience scaling and operating distributed systems such as big data processing engines (e.g., Apache Hadoop, Apache Spark), distributed file systems (e.g., HDFS, Ceph, S3), streaming systems (e.g., Apache Flink, Apache Kafka), resource management systems (e.g., Apache Mesos, Kubernetes), or identity and access management systems (e.g., Apache Ranger, Apache Sentry, OPA)
- 3+ years of experience with infrastructure as code and systems automation
- Fluency in Java or a similar language
- Ability to debug complex issues in large-scale distributed systems
- Passion for building infrastructure that is reliable, easy to use, and easy to maintain
- Excellent communication and collaboration skills
- Experience with Spark and ETL processing pipelines is helpful, but not required
- Experience with systems security, identity protocols, and encryption is helpful, but not required
Responsibilities include:
- Scale and operationalize privacy and security systems in a big data environment, using technologies such as Spark, Kafka, Presto, Flink, and Hadoop in both on-premises and AWS environments, through automation and infrastructure as code
- Ensure the data infrastructure delivers reliable, high-quality data with consistent SLAs, backed by good monitoring, alerting, and incident response and by continual investment in reducing tech debt
- Write code and documentation, and participate in code reviews and design sessions