ONES 2.0 Unleashes Rule Engine and Alerting System for Seamless SONiC Operations
ONES Rule Engine is an advanced feature that adds an integrated alert and notification system to your network management. It monitors a broad set of metrics and lets you create rules at both the device and interface level, giving you tailored control over what is monitored and when alerts are raised. Upgrade your network management with ONES Rule Engine today!
10 Benefits of Using ONES Rule Engine for Comprehensive Network Monitoring
Comprehensive Monitoring: ONES Rule Engine takes a holistic approach to network monitoring by tracking diverse metrics such as CPU utilization, memory utilization, PSU status, fan speed, RX/TX, and more. This breadth of coverage gives a comprehensive view of the network for proactive issue resolution.
Device and Interface Level: Rules can be created at both the device and interface level. This fine-grained rule management lets you target specific devices or interfaces, allowing a tailored approach to network optimization and issue handling.
Rule Customization: Different network components have different requirements. With device-level rules based on Hardware SKU, Role, and OS version, administrators can fine-tune alerts to match the specific characteristics of their network infrastructure.
Figure 1: Rule Configuration
Device Inclusion & Exclusion: Flexibility is key in network management. The rule engine lets you include devices in, or exclude them from, individual rules, so that rules fit the specific needs of your network architecture and can be adjusted as the network environment changes.
Severity-Based Alerting: The Rule Engine supports Critical and Warning severity alerts, allowing administrators to prioritize responses based on the urgency and impact of potential issues. This tiered alerting helps ensure that critical problems are addressed promptly, minimizing downtime.
Alert Summary for Collaborative Issue Resolution: Users can generate a comprehensive report of all alerts and share it with the team. This simplifies collaborative resolution and supports efficient communication and knowledge transfer among team members.
Figure 2: Alert Summary
Integration with Slack for Real-Time Notifications: ONES' Slack integration delivers critical alerts directly to designated channels, keeping teams informed and in sync. Additionally, weekly Slack digests provide an overview of alerts and Zendesk ticket details, streamlining communication and collaboration.
Zendesk Integration for Streamlined Ticketing: The rule engine integrates with Zendesk and automatically creates tickets based on alerts. This simplifies the ticketing process and provides a centralized platform for tracking and managing network issues.
Preventing Redundant Alerts for Efficient Alerting: During rule creation, administrators can specify the maximum number of alerts for a particular metric on a specific device, which limits redundant notifications. This keeps alerting streamlined and efficient within the ONES 2.0 ecosystem.
Strengthening Monitoring and Response Capabilities with Detailed Alert Information: Each alert is enriched with essential details, including Metric Name, Type (Critical or Warning), Triggered Time, and Associated Rule Information. Alerts also include a URL that redirects users to the associated visual representations for better understanding. In addition, device information such as IP address, role, region, SKU, serial number, and NOS is part of the alert details. Interface-specific alerts carry additional related information such as the interface name, speed, and transceiver details, as shown in Figure 3.
Figure 3: Alert details on Zendesk
Figure 4: Alert Summary on Slack
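Several of the benefits above (rule scoping by Hardware SKU, Role, and OS version, device inclusion and exclusion, Critical and Warning severities, and a cap on alerts per metric per device) can be pictured together in a short Python sketch. This is an illustration under stated assumptions, not ONES code: the class and field names, the precedence between include and exclude, and the strict "greater than" comparison are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Device:
    name: str
    hwsku: str
    role: str
    os_version: str


@dataclass
class Rule:
    metric: str
    warning: float
    critical: float
    max_alerts: int                       # cap per metric per device
    hwsku: Optional[str] = None           # None means "match any"
    role: Optional[str] = None
    os_version: Optional[str] = None
    include: set = field(default_factory=set)   # device names always in scope
    exclude: set = field(default_factory=set)   # device names never in scope

    def applies_to(self, dev: Device) -> bool:
        # Assumed precedence: exclude wins, then include, then attribute match.
        if dev.name in self.exclude:
            return False
        if dev.name in self.include:
            return True
        pairs = (
            (self.hwsku, dev.hwsku),
            (self.role, dev.role),
            (self.os_version, dev.os_version),
        )
        return all(want is None or want == have for want, have in pairs)

    def severity(self, value: float) -> Optional[str]:
        # Check the critical threshold first so it takes priority over warning.
        if value > self.critical:
            return "Critical"
        if value > self.warning:
            return "Warning"
        return None


class RuleEngine:
    def __init__(self) -> None:
        # (device name, metric) -> number of alerts already emitted
        self._sent = defaultdict(int)

    def evaluate(self, rule: Rule, dev: Device, value: float) -> Optional[dict]:
        if not rule.applies_to(dev):
            return None
        sev = rule.severity(value)
        if sev is None:
            return None
        key = (dev.name, rule.metric)
        if self._sent[key] >= rule.max_alerts:
            return None                   # cap reached: suppress redundant alert
        self._sent[key] += 1
        return {"device": dev.name, "metric": rule.metric, "type": sev, "value": value}


engine = RuleEngine()
cpu_rule = Rule("cpu_utilization", warning=70, critical=90, max_alerts=2,
                role="Leaf", exclude={"leaf-9"})
leaf = Device("leaf-1", "sku-a", "Leaf", "os-1")   # placeholder SKU and OS names
print(engine.evaluate(cpu_rule, leaf, 95))
```

In this sketch a rule with no SKU, role, or OS version set matches every device that is not explicitly excluded, and the counter never resets; a real system would presumably clear it when the condition recovers or after a time window.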
Rule Engine coverage
System Health: Rules can be created to monitor system health metrics such as a device's CPU utilization, memory utilization, and CPU core temperatures, and to alert if those values exceed the critical or warning thresholds. The ONES UI also provides recommended thresholds for CPU and memory usage.
Alert on Component Failures: The rule engine can alert when a device fan or power supply unit (PSU) becomes faulty. The ONES backend continuously tracks component health and triggers an alert in case of failure.
Capacity Monitoring: Hardware switching is essential for high-speed data transmission in today's networks. When the switch ASIC's hardware limits are exhausted, forwarding can fall back to software, causing system instability. The ONES rule engine monitors this as well, and rules can be created to notify when ASIC IPv4/IPv6 utilization exceeds the warning and critical levels.
Traffic Monitoring: Set the utilization levels for traffic links and the acceptable thresholds for errors and discards, and alerts will be generated for links that cross the set levels.
Transceiver Health: Transceiver operational values such as voltage, temperature, and power are critical for error-free and lossless transmission. The rule engine monitors those metrics and alerts on transceivers that are on the verge of failing or that require attention.
SONiC Services Health: In addition to all of the above, alerts can be generated when a BGP neighbor goes down, for monitoring syncd, and for container CPU utilization.
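To make the traffic-monitoring item concrete, here is a minimal Python sketch of how a link utilization percentage might be derived from two byte-counter samples before being compared against a threshold. The function name, parameters, and sampling approach are assumptions for illustration; the source does not describe how ONES computes utilization.

```python
def link_utilization_pct(bytes_start: int, bytes_end: int,
                         interval_s: float, speed_bps: float) -> float:
    """Average utilization over the interval, as a percentage of link speed.

    Assumes the byte counter did not wrap or reset between the two samples.
    """
    if interval_s <= 0 or speed_bps <= 0:
        raise ValueError("interval_s and speed_bps must be positive")
    if bytes_end < bytes_start:
        raise ValueError("counter went backwards (wrap or reset)")
    bits_sent = (bytes_end - bytes_start) * 8
    return 100.0 * bits_sent / (interval_s * speed_bps)


# A 100 Gb/s link that moved 375 GB in 60 seconds.
util = link_utilization_pct(0, 375_000_000_000, 60, 100_000_000_000)
print(f"{util:.1f}% utilized")
```

Error and discard thresholds could be handled the same way, by taking the counter delta over the interval and comparing it with the limit set in the rule.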
Conclusion
Embrace the power of ONES 2.0's Rule Engine and Alerting System to elevate your network management experience. Real-time monitoring of hardware, network, components, counters, and transceiver health, combined with advanced alert management through Slack and Zendesk integrations, supports you throughout your SONiC journey.
The alerting system is not limited to the Slack and Zendesk integrations; it can be customized to fit other platforms based on your requirements.
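One way to picture this kind of extensibility is a small registry of per-platform formatters, sketched below in Python. The registry, function names, and alert fields are hypothetical and do not describe how ONES is implemented. The payload shapes follow the public formats for Slack incoming webhooks (a JSON body with a "text" field) and for creating a Zendesk ticket (a "ticket" object with subject, comment, and priority); mapping Critical to the "urgent" priority is an assumption. Nothing is sent over the network here.

```python
from typing import Callable, Dict

Alert = Dict[str, str]


def slack_payload(alert: Alert) -> dict:
    # Slack incoming webhooks accept a JSON body with a "text" field.
    return {"text": f"[{alert['type']}] {alert['metric']} on {alert['device']}"}


def zendesk_payload(alert: Alert) -> dict:
    # Assumed mapping from alert type to Zendesk ticket priority.
    priority = "urgent" if alert["type"] == "Critical" else "normal"
    return {
        "ticket": {
            "subject": f"{alert['type']}: {alert['metric']} on {alert['device']}",
            "comment": {"body": f"Rule: {alert['rule']}\nTriggered: {alert['triggered']}"},
            "priority": priority,
        }
    }


# Hypothetical registry: supporting another platform means adding a formatter.
FORMATTERS: Dict[str, Callable[[Alert], dict]] = {
    "slack": slack_payload,
    "zendesk": zendesk_payload,
}


def register(platform: str, formatter: Callable[[Alert], dict]) -> None:
    FORMATTERS[platform] = formatter


def render(alert: Alert, platform: str) -> dict:
    return FORMATTERS[platform](alert)


alert = {
    "type": "Critical",
    "metric": "cpu_utilization",
    "device": "leaf-1",
    "rule": "leaf-cpu",
    "triggered": "2024-01-01T00:00:00Z",
}
print(render(alert, "slack"))
```

Keeping formatting separate from delivery means the same alert record can be rendered for a chat channel, a ticketing system, or any other target without changing the rule logic.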
Stay tuned for our upcoming blog series, where we'll dive deep into these insightful topics:
RoCE Traffic Visibility in AI Fabric
Detailed security compliance with ONES
In-depth analysis regarding the measurement of NWSLA
Take a ‘test drive’ with ONES Center before your SONiC deployments, across well-known vendors of hardware, platforms, ASICs, and operating systems, at your convenience. Make an informed decision by testing it across multiple vendors, including Cisco SONiC, NVIDIA SONiC, Celestica SONiC, Marvell SONiC, Wistron SONiC, Edgecore Community SONiC, Arista SONiC, Supermicro SONiC, Enterprise SONiC, and DELL SONiC.
FAQs
1. What is the ONES Rule Engine and how does it enhance SONiC network monitoring? A. The ONES Rule Engine is a powerful monitoring and alerting feature introduced in ONES 2.0. It enables network teams to create customized rules at both device and interface levels for key metrics like CPU, memory, RX/TX, PSU, and fan status. With advanced alerting, real-time Slack/Zendesk integrations, and precise rule targeting, it elevates network observability and enables proactive issue resolution in multi-vendor SONiC deployments.
2. Can ONES Rule Engine integrate with Slack and Zendesk for real-time alerts and ticketing? A. Yes, ONES Rule Engine supports seamless integration with Slack and Zendesk. It delivers real-time alerts to Slack channels and creates automated support tickets in Zendesk. This ensures teams stay updated, improve collaboration, and streamline incident tracking and resolution processes.
3. How does ONES Rule Engine help prevent alert fatigue in large-scale network environments? A. ONES Rule Engine allows administrators to set alert thresholds and configure a maximum number of alerts per metric per device, helping avoid redundant notifications. This keeps alerting efficient and focused, minimizing noise while ensuring critical issues are addressed promptly.
4. What types of alerts can ONES 2.0 generate using the Rule Engine? A. ONES 2.0 can generate alerts for system health (CPU, memory), component failures (fan, PSU), traffic utilization, ASIC capacity, transceiver performance (voltage, temperature, power), and SONiC services (e.g., BGP neighbor down, container CPU usage). Each alert includes detailed metadata, such as device role, IP, SKU, NOS, and interface specs.
5. Why is ONES 2.0 Rule Engine important for multi-vendor SONiC network operations? A. In multi-vendor SONiC environments, consistent monitoring is critical. ONES 2.0 Rule Engine normalizes observability across different platforms, allowing tailored alerts, unified visibility, and centralized control. This helps organizations scale and secure their network operations while ensuring consistent SLA compliance.