Machine Learning Engineer

Job Description

About Cisco ThousandEyes:

Cisco ThousandEyes is a Digital Experience Assurance platform that helps organizations ensure flawless digital experiences across all networks, including those they don’t directly own. Leveraging AI and a comprehensive collection of data from cloud, internet, and enterprise networks, ThousandEyes allows IT teams to proactively identify, diagnose, and fix issues before they affect end-users. It’s deeply integrated with the Cisco technology portfolio and other systems, enabling large-scale deployments and providing AI-driven insights within Cisco’s Networking, Security, Collaboration, and Observability offerings.

About the Job: Machine Learning Engineer (Alerts Team)

This role involves developing and optimizing anomaly detection algorithms for a highly scalable stream processing platform. As a Machine Learning Engineer, you’ll be at the forefront of applying cutting-edge AI/ML technologies to real-time data. The position blends the challenges of handling massive datasets with the innovation of applied machine learning to deliver actionable insights to customers.

Responsibilities:

Developing and Maintaining AI/ML Pipelines: Collaborate with a team to design, implement, and maintain large-scale AI/ML pipelines for real-time anomaly detection. This includes training, tuning, and evaluating models using various techniques.
Anomaly Detection Algorithm Design and Implementation: Design and implement sophisticated anomaly detection algorithms, such as Isolation Forests, LSTM-based models, and Variational Autoencoders, tailored to the unique characteristics of the data streams.
Model Evaluation and Framework Creation: Create robust evaluation frameworks and metrics to thoroughly assess the performance of the developed algorithms.
Stream Processing Optimization: Implement and optimize stream processing solutions using technologies like Flink and Kafka.
Working with Massive Datasets: Handle and process billions of events, pushing the boundaries of real-time anomaly detection.
Utilizing Various ML Models: Apply a range of machine learning models, including neural networks (such as transformer models and Large Language Models), decision trees, and other traditional machine learning models.
Leveraging ML Frameworks: Utilize machine learning frameworks such as scikit-learn, XGBoost, PyTorch, or TensorFlow to build efficient and scalable solutions.

Qualifications:

• 3-5 years of software development experience and a minimum of 2 internships with direct experience in building and evaluating ML models and delivering large-scale ML products.
• MS or PhD in a relevant field.
• Proficiency in crafting machine learning models, translating theoretical concepts into practical solutions.
• Fluency in Python and ability to transform abstract machine learning concepts into robust, efficient, and scalable code.
• Strong computer science fundamentals and object-oriented design skills.
• Experience building large-scale data processing systems.
• Experience in a fast-paced development environment.
• Strong team collaboration and communication skills.