EXO: End-to-End Local AI Cluster Deployment Training Course
EXO is an open-source framework that links Apple Silicon devices into a distributed AI cluster, allowing for the local inference of cutting-edge models that exceed the capacity of a single device.
This instructor-led, live training (available online or in-person) is designed for system administrators and DevOps engineers looking to deploy, configure, and manage EXO clusters to facilitate private LLM inference across multiple Apple Silicon or Linux nodes.
By the conclusion of this training, participants will be able to:
- Install and configure EXO on both macOS and Linux nodes.
- Activate automatic device discovery and construct multi-node clusters.
- Enable and verify RDMA over Thunderbolt 5 for ultra-low-latency communication between devices.
- Deploy frontier models (such as DeepSeek, Qwen, and Llama) across clustered devices.
- Monitor cluster health and resolve common deployment issues.
Course Format
- Interactive lectures and discussions.
- Extensive exercises and practice.
- Hands-on implementation within a live laboratory environment.
Customization Options
- To request customized training, please contact us to make arrangements.
Course Outline
Introduction to EXO and Local AI Clustering
- Overview of the EXO framework and the exo-explore ecosystem
- Comparing centralized cloud inference versus distributed local inference
- Architecture: libp2p device discovery, MLX backend, dashboard, and API layers
- Hardware requirements: Apple Silicon (M3 Ultra, M4 Pro/Max), Thunderbolt 5, and shared storage
Installing EXO on macOS
- Setting up Xcode, the Metal Toolchain, and macOS prerequisites
- Installing uv, Node.js, and the Rust nightly toolchain
- Installing the pinned macmon fork for Apple Silicon monitoring
- Cloning the repository and building the dashboard using npm
- Running EXO from source and verifying the localhost:52415 dashboard
Installing EXO on Linux
- Installing dependencies via apt or Homebrew on Linux
- Configuring uv, Node.js 18+, and Rust nightly
- Building the dashboard and running EXO in CPU-only mode
- Directory layout: XDG Base Directory paths for config, data, cache, and logs
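As an illustration of the directory layout topic above, the following minimal sketch resolves per-application directories according to the XDG Base Directory specification. The application folder name "exo" and the idea that logs live under the state directory are assumptions made for illustration; the paths EXO actually uses should be checked against its documentation.

```python
import os
from pathlib import Path

# XDG Base Directory variables and the defaults the specification
# prescribes when a variable is unset or empty.
XDG_DEFAULTS = {
    "config": ("XDG_CONFIG_HOME", ".config"),
    "data": ("XDG_DATA_HOME", ".local/share"),
    "cache": ("XDG_CACHE_HOME", ".cache"),
    "state": ("XDG_STATE_HOME", ".local/state"),  # assumed home for logs
}


def xdg_dirs(app="exo", env=None, home=None):
    """Return a dict mapping config/data/cache/state to a path for `app`.

    `app` is a hypothetical folder name, not a confirmed EXO path.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    dirs = {}
    for kind, (var, fallback) in XDG_DEFAULTS.items():
        base = env.get(var, "")
        # The specification says relative paths are invalid and must be ignored.
        root = Path(base) if base and os.path.isabs(base) else home / fallback
        dirs[kind] = root / app
    return dirs
```

Calling `xdg_dirs()` with no arguments reads the current environment, which makes it a quick way to see where a node would keep its files before inspecting them.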
Automatic Device Discovery and Cluster Formation
- Understanding libp2p-based auto-discovery across local networks
- Configuring custom namespaces with EXO_LIBP2P_NAMESPACE for cluster isolation
- Verifying node membership in the dashboard cluster view
- Handling discovery failures and network segmentation issues
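The namespace topic above can be sketched as a small helper that prepares the environment for an EXO process. Only the variable name EXO_LIBP2P_NAMESPACE comes from the outline; the whitespace check, the helper names, and the launch command are assumptions for illustration, not documented EXO behavior.

```python
import os
import subprocess


def cluster_env(namespace, base_env=None):
    """Return a copy of the environment with EXO_LIBP2P_NAMESPACE set.

    The validation rule below is an assumption, not an EXO requirement.
    """
    if not namespace or any(ch.isspace() for ch in namespace):
        raise ValueError("namespace must be a non-empty string without whitespace")
    env = dict(os.environ if base_env is None else base_env)
    env["EXO_LIBP2P_NAMESPACE"] = namespace
    return env


def launch(namespace, command=("uv", "run", "exo")):
    """Start a node in the given namespace.

    The default command is an assumed way of running EXO from source.
    """
    return subprocess.Popen(list(command), env=cluster_env(namespace))
```

Nodes started with the same namespace value are expected to discover each other, while nodes with different values should stay in separate clusters; for example, `launch("lab-a")` on two machines and `launch("lab-b")` on a third.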
Enabling RDMA over Thunderbolt 5
- RDMA architecture and the claimed 99 percent latency reduction
- Enabling RDMA in macOS Recovery mode with rdma_ctl
- Cable requirements and port topology constraints on Mac Studio
- Ensuring macOS versions are matched across all cluster nodes
- Troubleshooting RDMA discovery and DHCP configuration
Deploying Frontier Models
- Using the dashboard to load and shard DeepSeek v3.1, Qwen3-235B, and Llama family models
- Previewing instance placements with the /instance/previews API endpoint
- Creating model instances with pipeline or tensor-parallel sharding
- Configuring custom model cards from the HuggingFace hub
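For the placement preview topic above, a minimal sketch of a client for the /instance/previews endpoint might look like the following. The port 52415 and the endpoint path come from the outline; the `model_id` query parameter name and the JSON response format are assumptions that should be verified against the EXO API documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def previews_url(model_id, host="localhost", port=52415):
    """Build the preview URL; `model_id` as a query parameter is an assumption."""
    query = urlencode({"model_id": model_id})
    return f"http://{host}:{port}/instance/previews?{query}"


def fetch_previews(model_id, timeout=10.0, **kwargs):
    """Fetch placement previews from a running node and decode them as JSON.

    Assumes the endpoint answers a plain GET request with a JSON body.
    """
    with urlopen(previews_url(model_id, **kwargs), timeout=timeout) as resp:
        return json.load(resp)
```

Keeping URL construction separate from the network call makes the request easy to check without a running cluster, for example by printing `previews_url("llama-3.2-1b")` before calling `fetch_previews`.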
Monitoring and Troubleshooting
- Reading EXO logs and understanding distributed tracing
- Interpreting cluster health via the dashboard cluster view
- Diagnosing worker node failures and reconnection behavior
- Using EXO_TRACING_ENABLED for performance bottleneck analysis
Cluster Maintenance and Updates
- Updating EXO binaries and procedures for rebuilding the dashboard
- Migrating model caches and managing pre-downloaded models over NFS
- Gracefully removing nodes and rebalancing workloads
Requirements
- A solid understanding of networking fundamentals (IP addressing, subnetting, firewalls)
- Experience with command-line administration on macOS or Linux
- Familiarity with Python package management (pip/uv) and Node.js tooling
Target Audience
- System administrators
- DevOps engineers
- AI infrastructure architects tasked with on-premise LLM deployment
Open Training Courses require 5+ participants.
Related Courses
Advanced LangGraph: Optimization, Debugging, and Monitoring Complex Graphs
35 Hours
LangGraph is a framework designed for constructing stateful, multi-actor LLM applications as composable graphs, featuring persistent state and precise control over execution.
This instructor-led, live training (available online or onsite) is tailored for advanced-level AI platform engineers, AI DevOps professionals, and ML architects who aim to optimize, debug, monitor, and manage production-grade LangGraph systems.
By the conclusion of this training, participants will be able to:
- Design and optimize complex LangGraph topologies to enhance speed, reduce costs, and ensure scalability.
- Engineer reliability by implementing retries, timeouts, idempotency, and checkpoint-based recovery mechanisms.
- Debug and trace graph executions, inspect state data, and systematically reproduce production issues.
- Instrument graphs with logs, metrics, and traces; deploy to production environments; and monitor SLAs and costs.
Course Format
- Interactive lectures and discussions.
- Extensive exercises and practice.
- Hands-on implementation within a live-lab environment.
Customization Options
- To request customized training for this course, please contact us to arrange.
Building Coding Agents with Devstral: From Agent Design to Tooling
14 Hours
Devstral is an open-source model from Mistral AI built for agentic coding: it powers coding agents that interact with codebases, developer tools, and APIs to boost engineering productivity.
This instructor-led, live training (available online or onsite) targets intermediate to advanced ML engineers, developer-tooling teams, and SREs looking to design, implement, and optimize coding agents using Devstral.
Upon completion of this training, participants will be able to:
- Set up and configure Devstral for coding agent development.
- Design agentic workflows for codebase exploration and modification.
- Integrate coding agents with developer tools and APIs.
- Implement best practices for secure and efficient agent deployment.
Course Format
- Interactive lecture and discussion.
- Extensive exercises and practice sessions.
- Hands-on implementation in a live-lab environment.
Customization Options
- To request a customized training for this course, please contact us to arrange.
Open-Source Model Ops: Self-Hosting, Fine-Tuning and Governance with Devstral & Mistral Models
14 Hours
Devstral and Mistral models are open-source AI technologies designed for flexible deployment, fine-tuning, and scalable integration.
This instructor-led live training (available online or onsite) targets intermediate to advanced ML engineers, platform teams, and research engineers looking to self-host, fine-tune, and govern Mistral and Devstral models in production environments.
Upon completing this training, participants will be able to:
- Set up and configure self-hosted environments for Mistral and Devstral models.
- Apply fine-tuning techniques to optimize domain-specific performance.
- Implement versioning, monitoring, and lifecycle governance.
- Ensure security, compliance, and responsible usage of open-source models.
Course Format
- Interactive lectures and discussions.
- Hands-on exercises focused on self-hosting and fine-tuning.
- Live-lab sessions for implementing governance and monitoring pipelines.
Course Customization Options
- To request customized training for this course, please contact us to arrange.
Fiji: Image Processing for Biotechnology and Toxicology
14 Hours
This instructor-led, live training in India (online or onsite) is designed for beginner to intermediate-level researchers and laboratory professionals who need to process and analyze images of histological tissues, blood cells, algae, and other biological specimens.
Upon completing this training, participants will be able to:
- Navigate the Fiji interface and effectively use ImageJ’s core capabilities.
- Preprocess and enhance scientific images to improve analytical accuracy.
- Perform quantitative image analysis, including cell counting and area measurements.
- Automate routine tasks using macros and plugins.
- Tailor workflows to meet specific image analysis requirements in biological studies.
LangGraph Applications in Finance
35 Hours
LangGraph serves as a framework for constructing stateful, multi-agent LLM applications through composable graphs, enabling persistent state management and precise execution control.
This instructor-led live training, available online or onsite, targets intermediate to advanced professionals seeking to design, implement, and manage finance solutions based on LangGraph, ensuring proper governance, observability, and compliance.
Upon completion of this training, participants will be able to:
- Develop finance-specific LangGraph workflows that align with regulatory and audit requirements.
- Integrate financial data standards and ontologies into graph states and tooling.
- Implement reliability, safety, and human-in-the-loop controls for critical operations.
- Deploy, monitor, and optimize LangGraph systems to enhance performance, manage costs, and meet SLAs.
Course Format
- Interactive lectures and discussions.
- Extensive exercises and practice.
- Hands-on implementation within a live laboratory environment.
Customization Options
- To request custom training for this course, please contact us to arrange.
LangGraph Foundations: Graph-Based LLM Prompting and Chaining
14 Hours
LangGraph serves as a framework for constructing LLM applications with graph structures, enabling capabilities such as planning, branching, tool integration, memory management, and controllable execution.
This instructor-led training, available both online and onsite, is tailored for beginner-level developers, prompt engineers, and data professionals looking to design and implement robust, multi-step LLM workflows using LangGraph.
Upon completing this training, participants will be able to:
- Articulate the fundamental concepts of LangGraph (nodes, edges, state) and identify appropriate use cases.
- Construct prompt chains that support branching, tool invocation, and memory retention.
- Incorporate retrieval mechanisms and external APIs into graph-based workflows.
- Conduct testing, debugging, and evaluation of LangGraph applications to ensure reliability and safety.
Course Format
- Interactive lectures and guided discussions.
- Hands-on labs and code walkthroughs conducted in a sandbox environment.
- Scenario-based exercises focusing on design, testing, and evaluation.
Customization Options
- For personalized training arrangements for this course, please reach out to us.
LangGraph in Healthcare: Workflow Orchestration for Regulated Environments
35 Hours
LangGraph facilitates stateful, multi-actor workflows driven by LLMs, offering precise control over execution paths and state persistence. In the healthcare sector, these features are vital for ensuring compliance, enabling interoperability, and developing decision-support systems that align with medical workflows.
This instructor-led, live training (available online or onsite) is designed for intermediate to advanced professionals aiming to design, implement, and manage LangGraph-based healthcare solutions while addressing regulatory, ethical, and operational challenges.
Upon completion of this training, participants will be able to:
- Design healthcare-specific LangGraph workflows with a focus on compliance and auditability.
- Integrate LangGraph applications with medical ontologies and standards (FHIR, SNOMED CT, ICD).
- Apply best practices for reliability, traceability, and explainability in sensitive environments.
- Deploy, monitor, and validate LangGraph applications in healthcare production settings.
Format of the Course
- Interactive lecture and discussion.
- Hands-on exercises with real-world case studies.
- Implementation practice in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
LangGraph for Legal Applications
35 Hours
LangGraph serves as a framework for constructing stateful, multi-actor LLM applications through composable graphs, enabling persistent state management and precise control over execution flow.
This instructor-led live training, available online or on-site, targets intermediate to advanced professionals aiming to design, implement, and manage LangGraph-based legal solutions, ensuring necessary compliance, traceability, and governance controls.
Upon completion of this training, participants will be equipped to:
- Design LangGraph workflows tailored for legal requirements that maintain auditability and compliance.
- Integrate legal ontologies and document standards into graph states and processing logic.
- Implement guardrails, human-in-the-loop approval mechanisms, and traceable decision paths.
- Deploy, monitor, and maintain LangGraph services in production environments with robust observability and cost management.
Course Format
- Interactive lectures and discussions.
- Extensive exercises and practical sessions.
- Hands-on implementation within a live laboratory environment.
Customization Options
- For customized training arrangements, please reach out to us to coordinate.
Building Dynamic Workflows with LangGraph and LLM Agents
14 Hours
LangGraph serves as a framework designed for assembling graph-structured LLM workflows, offering support for branching, tool integration, memory management, and controllable execution.
This instructor-led live training (available online or onsite) is tailored for intermediate-level engineers and product teams seeking to merge LangGraph’s graph logic with LLM agent loops to develop dynamic, context-aware applications such as customer support agents, decision trees, and information retrieval systems.
Upon completing this training, participants will be equipped to:
- Design graph-based workflows that effectively coordinate LLM agents, tools, and memory.
- Implement conditional routing, retry mechanisms, and fallbacks to ensure robust execution.
- Integrate retrieval processes, APIs, and structured outputs into agent loops.
- Evaluate, monitor, and fortify agent behaviour to enhance reliability and safety.
Course Format
- Interactive lectures coupled with facilitated discussions.
- Guided labs and code walkthroughs conducted within a sandbox environment.
- Scenario-based design exercises and peer reviews.
Course Customization Options
- To request customized training for this course, please get in touch with us to make arrangements.
LangGraph for Marketing Automation
14 Hours
LangGraph serves as a graph-based orchestration framework that facilitates conditional, multi-step workflows involving Large Language Models (LLMs) and tools, making it ideally suited for automating and personalizing content pipelines.
This instructor-led live training, available either online or at the participant's site, targets intermediate-level marketers, content strategists, and automation developers looking to build dynamic, branching email campaigns and content generation pipelines using LangGraph.
Upon completing this training, participants will be capable of:
- Designing graph-structured workflows for email and content that incorporate conditional logic.
- Integrating LLMs, APIs, and various data sources to achieve automated personalization.
- Managing state, memory, and context across complex, multi-step marketing campaigns.
- Evaluating, monitoring, and optimizing the performance and delivery outcomes of these workflows.
Course Format
- Interactive lectures paired with group discussions.
- Practical, hands-on labs focused on implementing email workflows and content pipelines.
- Scenario-based exercises covering personalization, segmentation, and branching logic.
Customization Options
- For those interested in tailored training for this course, please get in touch with us to arrange a customized schedule.
Le Chat Enterprise: Private ChatOps, Integrations & Admin Controls
14 Hours
Le Chat Enterprise offers a private ChatOps solution, delivering secure, customizable, and governed conversational AI capabilities tailored for organizational needs. It supports Role-Based Access Control (RBAC), Single Sign-On (SSO), connectors, and seamless integration with enterprise applications.
This instructor-led live training, available online or onsite, targets intermediate-level product managers, IT leads, solution engineers, and security/compliance teams who aim to deploy, configure, and govern Le Chat Enterprise within enterprise settings.
Upon completion of this training, participants will be able to:
- Set up and configure Le Chat Enterprise for secure deployments.
- Enable RBAC, SSO, and compliance-driven controls.
- Integrate Le Chat with enterprise applications and data stores.
- Design and implement governance and admin playbooks for ChatOps.
Format of the Course
- Interactive lecture and discussion.
- Extensive exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
Cost-Effective LLM Architectures: Mistral at Scale (Performance / Cost Engineering)
14 Hours
Mistral is a suite of high-performance large language models, specifically engineered for cost-efficient production deployment at scale.
This instructor-led live training, available online or on-site, is designed for advanced infrastructure engineers, cloud architects, and MLOps leaders aiming to design, deploy, and optimize Mistral-based architectures to achieve maximum throughput with minimal cost.
Upon completing this training, participants will be equipped to:
- Implement scalable deployment patterns for Mistral Medium 3.
- Utilize batching, quantization, and efficient serving strategies.
- Optimize inference costs while preserving performance standards.
- Design production-ready serving topologies tailored for enterprise workloads.
Course Format
- Interactive lectures and discussions.
- Extensive exercises and practical sessions.
- Hands-on implementation within a live-lab environment.
Customization Options
- For customized training requests, please contact us to make arrangements.
Productizing Conversational Assistants with Mistral Connectors & Integrations
14 Hours
Mistral AI offers an open-source AI platform that empowers teams to build and integrate conversational assistants into enterprise operations and customer-facing workflows.
This instructor-led training session, available both online and on-site, is designed for beginner to intermediate product managers, full-stack developers, and integration engineers. It focuses on teaching participants how to design, integrate, and productize conversational assistants using Mistral connectors and integrations.
Upon completing this training, participants will be able to:
- Integrate Mistral conversational models with enterprise and SaaS connectors.
- Implement retrieval-augmented generation (RAG) to ensure grounded and accurate responses.
- Design user experience (UX) patterns for both internal and external chat assistants.
- Deploy assistants into product workflows to address real-world use cases.
Course Format
- Interactive lectures and discussions.
- Hands-on integration exercises.
- Live lab sessions for developing conversational assistants.
Course Customization Options
- For customized training tailored to your specific needs, please contact us to make arrangements.
Enterprise-Grade Deployments with Mistral Medium 3
14 Hours
Mistral Medium 3 is a high-performance, multimodal large language model built for production-grade deployment across enterprise settings.
This instructor-led live training (available online or on-site) targets intermediate to advanced AI/ML engineers, platform architects, and MLOps teams looking to deploy, optimize, and secure Mistral Medium 3 for enterprise use cases.
Upon completion of this training, participants will be able to:
- Deploy Mistral Medium 3 using both API and self-hosted approaches.
- Enhance inference performance and manage costs effectively.
- Implement multimodal use cases leveraging Mistral Medium 3.
- Apply security and compliance best practices suited for enterprise environments.
Course Format
- Interactive lectures and discussions.
- Extensive exercises and practice sessions.
- Hands-on implementation within a live lab environment.
Customization Options
- For customized training arrangements, please reach out to us.
Mistral for Responsible AI: Privacy, Data Residency & Enterprise Controls
14 Hours
Mistral AI offers an open, enterprise-ready platform designed to facilitate the secure, compliant, and responsible deployment of AI solutions.
This instructor-led training, available both online and on-site, is tailored for intermediate-level compliance leads, security architects, and legal or operations stakeholders. The programme focuses on embedding responsible AI practices within Mistral by utilising advanced privacy, data residency, and enterprise control mechanisms.
Upon completion of this training, participants will be equipped to:
- Deploy privacy-preserving techniques within Mistral environments.
- Apply data residency strategies to satisfy regulatory mandates.
- Establish enterprise-grade controls, including RBAC, SSO, and audit logging.
- Assess vendor and deployment choices to ensure alignment with compliance standards.
Course Format
- Interactive lectures and discussions.
- Compliance-oriented case studies and practical exercises.
- Hands-on configuration of enterprise AI controls.
Customisation Options
- For bespoke training arrangements, please contact us directly.