Course Outline
Fundamentals of Agentic Systems in Production
- Agentic architectures: loops, tools, memory, and orchestration layers.
- The agent lifecycle: from development and deployment to continuous operation.
- Challenges of managing agents at production scale.
Infrastructure and Deployment Models
- Deploying agents within containerized and cloud environments.
- Scaling patterns: horizontal versus vertical scaling, concurrency, and throttling.
- Multi-agent orchestration and workload balancing.
Monitoring and Observability
- Key metrics: latency, success rate, memory consumption, and agent call depth.
- Tracing agent activity and call graphs.
- Instrumenting observability using Prometheus, OpenTelemetry, and Grafana.
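As an illustration of the key metrics listed above, here is a minimal sketch that aggregates latency, success rate, and call depth in plain Python. The `AgentMetrics` class and its method names are hypothetical and not part of Prometheus, OpenTelemetry, or Grafana; a production setup would more likely export these values through one of those tools.

```python
import math
from dataclasses import dataclass, field


@dataclass
class AgentMetrics:
    """Hypothetical in-memory aggregate of per-call agent metrics."""
    latencies: list = field(default_factory=list)
    successes: int = 0
    failures: int = 0
    max_call_depth: int = 0

    def record(self, latency_s, ok, call_depth):
        # Store one agent call: duration, outcome, and nesting depth.
        self.latencies.append(latency_s)
        if ok:
            self.successes += 1
        else:
            self.failures += 1
        self.max_call_depth = max(self.max_call_depth, call_depth)

    def success_rate(self):
        total = self.successes + self.failures
        return self.successes / total if total else 0.0

    def percentile(self, p):
        # Nearest-rank percentile of the recorded latencies.
        if not self.latencies:
            return 0.0
        ordered = sorted(self.latencies)
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]
```

Tail percentiles such as p95 are usually more informative than the mean for agent workloads, because a few deep call chains can dominate latency.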
Logging, Auditing, and Compliance
- Centralized logging and structured event collection.
- Ensuring compliance and auditability within agentic workflows.
- Designing audit trails and replay mechanisms for debugging purposes.
Performance Tuning and Resource Optimization
- Reducing inference overhead and optimizing agent orchestration cycles.
- Model caching and lightweight embeddings for accelerated retrieval.
- Load testing and stress scenarios for AI pipelines.
Cost Control and Governance
- Understanding cost drivers for agents: API calls, memory, compute, and external integrations.
- Tracking agent-level costs and implementing chargeback models.
- Establishing automation policies to prevent agent sprawl and idle resource consumption.
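Agent-level cost tracking with a chargeback model could be sketched as follows. The `Usage` record, the `chargeback` function, and the per-token prices are all assumptions made for illustration; the prices are not real vendor rates, and only token usage is counted here, not memory, compute, or external integrations.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Hypothetical usage record for one agent run."""
    agent: str
    team: str
    input_tokens: int
    output_tokens: int


# Assumed prices in USD per 1,000 tokens; placeholders, not real rates.
PRICE_PER_1K = {"input": 0.50, "output": 1.50}


def cost(usage):
    # Cost of a single run from its token counts.
    return (
        usage.input_tokens * PRICE_PER_1K["input"]
        + usage.output_tokens * PRICE_PER_1K["output"]
    ) / 1000


def chargeback(records):
    # Sum run costs per owning team so each team can be billed.
    totals = defaultdict(float)
    for usage in records:
        totals[usage.team] += cost(usage)
    return dict(totals)
```

Keying the totals by `usage.agent` instead of `usage.team` gives the per-agent view that helps spot idle or sprawling agents.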
CI/CD and Rollout Strategies for Agents
- Integrating agent pipelines into CI/CD systems.
- Testing, versioning, and rollback strategies for iterative agent updates.
- Progressive rollouts and safe deployment mechanisms.
Failure Recovery and Reliability Engineering
- Designing for fault tolerance and graceful degradation.
- Implementing retry, timeout, and circuit breaker patterns for agent reliability.
- Incident response and post-mortem frameworks for AI operations.
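The retry and circuit breaker patterns named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production implementation: `CircuitBreaker`, `CircuitOpenError`, and `retry` are hypothetical names, the thresholds are arbitrary, and per-call timeouts are left out for brevity.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is rejected."""


class CircuitBreaker:
    """Hypothetical breaker: opens after max_failures consecutive failures."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call rejected")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def retry(fn, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Retry fn with exponential backoff; re-raise the last error."""
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # do not keep hammering an open circuit
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

A caller might combine the two as `retry(lambda: breaker.call(run_agent_step))`, where `run_agent_step` stands in for any agent or tool invocation.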
Capstone Project
- Build and deploy an agentic AI system with comprehensive monitoring and cost tracking.
- Simulate load, measure performance, and optimize resource usage.
- Present the final architecture and monitoring dashboard to peers.
Summary and Next Steps
Requirements
- Proficiency in MLOps and production machine learning environments.
- Hands-on experience with containerized deployments (Docker and Kubernetes).
- Familiarity with cloud cost optimization strategies and observability tools.
Target Audience
- MLOps Engineers.
- Site Reliability Engineers (SREs).
- Engineering leaders responsible for AI infrastructure.
Duration: 21 Hours
Testimonials (3)
The trainer is patient and very helpful. He knows the topic well.
CLIFFORD TABARES - Universal Leaf Philippines, Inc.
Course - Agentic AI for Business Automation: Use Cases & Integration
Good mix of knowledge and practice
Ion Mironescu - Facultatea S.A.I.A.P.M.
Course - Agentic AI for Enterprise Applications
The mix of theory and practice and of high level and low level perspectives