Please fill in the form below, so we can support you in your request.
Please fill in the form below, so we can support you in your request.


    ASICAMD (Xilinx)AchronixIntel (Altera)LatticeMicrochip (MicroSemi)Other

    X
    CONTACT MLE
    Contact MLE for Solutions
    Please fill in the form below, so we can send you the relevant information regarding our Solutions.

      By submitting this form you are consenting to being contacted by the MLE via email and receiving marketing information.

      X
      CONTACT MLE

      Network Function Accelerators, FACs, NICs and SmartNICs

      NICs, SmartNICs, and Function Accelerator Cards with Network Accelerator

      A Network Interface Card (NIC) is a component that connects computers via networks, these days mostly via IEEE Ethernet – but what makes a NIC a SmartNIC? How can FPGA Network Accelerator make it operate more efficiently and enhance its performance to deliver deterministic networking?

      Performance gap between server CPU and the increasing computational demand of faster network port speeds over the past 40 yearsWith the push for Software-Defined Networking, (mostly open source) software running on standard server CPUs became a more flexible and cost-effective alternative to custom networking silicon and appliances. However, in the post Dennard scaling area, server CPU performance improvements cannot keep up with increasing computational demand of faster network port speeds.

      This widening performance gap creates the need for so-called SmartNICs. SmartNIC not only implement Domain-Specific Architecture for network processing but also offload host CPUs from running portions of the network processing stack and, thereby, free up CPU cores to run the “real” application.

      According to Gartner, Function Accelerator Cards (FACs) incorporate functions on the NIC that would have been done on dedicated network appliances. Hence, all FACs are essentially NICs, but not all NICs/SmartNICs are FACs. When deployed properly, FACs can increase bandwidth performance, can reduce transport latencies and can improve compute efficiency, which translates to less energy consumption.

      Deterministic Networking - Functional Accelerator Cards (FACs) - Landscape
      MLE has partnered with FPGA vendors, Fraunhofer Institutes and EMS partners to implement the FPGA Network Accelerator on FPGA-based FACs which deliver cost-efficient solutions for ultra-reliable, low-latency, deterministic networking.

      Features of FPGA Network Accelerator

      Ultra-Reliable, Low-Latency, Deterministic Networking

      With ultra-reliable, low-latency, deterministic networking we have borrowed a concept from 5G wireless communication (5G URLLC) and have applied this to LAN (Local Area Network) and WAN (Wide Area Network) wired communication:

      • Ultra-Reliable means no packets get lost in transport
      • Low-Latency means that packets get processed by a FAC at a fraction of CPU processing times
      • Deterministic means that there is an upper bound for transport and for processing latency

      We do this by combining the TCP protocol, fully accelerated (in FPGA or ASIC using NPAP), with TSN (Time Sensitive Networking) optimized for stream processing at data rates of 10/25/50/100 Gbps. These so-called TCP-TSN-Cores, the FPGA network accelerator, not only give us precise time synchronization but also traffic shaping, traffic scheduling and stream reservation with priorities.

      We believe that FPGAs are very well positioned as programmable compute engines for network processing because FPGAs can implement “stream processing” more efficiently than CPUs or GPUs can do. In particular, when the networking data stays local to the FPGA fabric Data-in-Motion processing can be done within 100s of clock cycles (which is 100s of nano-seconds) and can be sent back a few 100 clock cycles later, an aspect with is referred to as Full-Accelerated In-Network Compute.

      While FPGA technology has been on the forefront of Moore’s Law and modern devices such as AMD/Xilinx Versal Prime or Intel Agilex or Achronix Speedster7t can hold millions of gates, FPGA processing resources must be used wisely, when Bill-of-Materials costs are important. Therefore, at MLE we have put together a unique combination of FPGA and open-source software to achieve best-in-class performance while addressing cost metrics more in-line with CPU-based SmartNICs.

      Unique and Cost-Efficient Combination of Open Source

      The Open Source Technologies We Borrow From

      Linux kernel​

      Meanwhile highly optimized for networking​

      OpenvSwitch

      An open source multi-layer network switch​

      Corundum

      A vendor-neutral open-source high-performance FPGA-based NIC

      SONiC

      Software for Opensource Networking in the Cloud​​

      Intel Compiler for SystemC​​

      An open source High-Level Synthesis engine​​

      OpenNIC

      The GitHub project focusing on AMD/Xilinx Alveo cards​

      Xilinx Vitis HLS LLVM

      The High-Level Synthesis Frontend for Xilinx FPGAs

      High-Level Synthesis plays a vital role in our implementation as it allows MLE and MLE customers to turn algorithms implemented in C/C++/SystemC into efficient FPGA logic which is portable between different FPGA vendors.

      To build a high-performance FAC platform, portions of the above have been integrated together with proven 3rd party networking technologies:

      Corundum In-Network Compute + TCP Full Accelerator

      Corundum is an open-source FPGA-based NIC which features a high-performance datapath between multiple 10/25/50/100 Gigabit Ethernet ports and the PCIe link to the host CPU. Corundum has several unique architectural features: For example, transmit, receive, completion, and event queue states are stored efficiently in block RAM or ultra RAM, enabling support for thousands of individually-controllable queues.

      Corundum In-Network Compute with FPGA Network Accelerator (TCP Full Accelerator)MLE is a contributor to the Corundum project. Please visit our Developer Zone for services and downloads for Corundum full system stacks pre-built for various in-house and off-the-shelf FPGA boards.

      MLE combines the Corundum NIC with NPAP, the TCP/UDP/IP Full Accelerator from Fraunhofer HHI, via a so-called TCP Bypass which minimizes processing latency of network packets: Each packet gets processed in parallel by the Corundum NIC and by NPAP. The moment it can be determined that the packet shall be handled by NPAP (based on IP address and port number) this packet gets invalidated inside the Corundum NIC. If a packet shall not be processed by NPAP, it get’s dropped in NPAP and will solely be processed by the Corundum NIC.

      Fundamentally, this implements network protocol processing in multiple stages: Network data which is latency sensitive does get processed using full acceleration, while all other network traffic is handled either by a companion CPU and/or by the host CPU.

      Applications of FPGA Network Accelerator

      MLE’s Network Accelerators are of particular value where network bandwidth and latency constraints are key:

      • Wired and Wireless Networking
      • Acceleration of Software-Defined Wide Area Networks (SD-WAN)
        • Video Conferencing
        • Online Gaming
        • Industrial Internet-of-Things (IIoT)
      • Handling of Application Oriented Network Services
      • Mobile 5G User-Plane Function Acceleration
      • Mobile 5G URLLC Core Network Processing with TSN
      • Offloading OpenvSwitch (OvS), vRouter, etc

      Key Benefits

      The following shows the key benefits of MLE’s technology by comparing open-source SD-WAN switching in native CPU software mode against MLE’s FPGA Network Accelerator:

      Compared with plain CPU software processing MLE’s Ultra-Reliable Low-Latency Deterministic Networking increases network bandwidth and throughput close to Ethernet line rates, in particular for smaller packets, which reduces the need for over-provisioning within the backbone. And, processing latencies can be shortened significantly which is important, for example, when delivering a lively audio/video conferencing experience over WAN.

      Availability

      MLE’s FPGA Network Accelerator is available as a licensable full system stack and delivered as an integrated hardware/firmware/software solution. In close collaboration with partners in the FPGA ecosystem, MLE has ported and tested variations of the stack on a growing list of FPGA cards. Currently, this list comprises high-performance 3rd party hardware as well as MLE-designed cost-optimized hardware:

      FPGA Card

      Hardware Description & Features

      Status

      Verified modules of MLE FPGA Network Accelerator - NPAC-Ketch, MLE-designed single-slot FHHL PCIe card

      NPAC-Ketch, MLE-designed single-slot FHHL PCIe card

      • Cost-optimized Intel Stratix 10 GX 400 FPGA
      • Optional 4 GB DDR4 SO-DIMM attached to Programmable-Logic
      • 4x SFP+ (4x 10 GigE)PCIe 3.1 8 GT/sec x8 lanes
      • 50 Watts TDP passive cooling front-to-back
      Available
      Inquire
      Verified modules of MLE FPGA Network Accelerator - AMD/Xilinx Alveo U280

      Alveo U280, AMD/Xilinx-designed dual-slot FHFL PCIe card

      • AMD/Xilinx UltraScale+ FPGA
      • 32 GB DDR4 DRAM plus 8 GB HBM2 DRAM
      • 2x QSFP28 (2x 100 GigE or 4x 25 GigE or 8x 10 GigE)
      • PCIe 4.0 16 GT/sec x8 lanes
      • 225 Watts TDP active cooling

      Early Access

      Inquire

      Verified modules of MLE FPGA Network Accelerator - Intel N6000-PL

      N6000-PL, Intel-designed single-slot FHHL PCIe card

      • Intel Agilex AGF014 F Series FPGA
      • 4x 4 GB DDR4 SO-DIMM attached to Programmable-Logic
      • 2x QSFP28 (2x 100 GigE or 4x 25 GigE or 8x 10 GigE)
      • PCIe 4.0 16 GT/sec x16 lanes
      • ARM A53 Quad-Core CPU with 1 GB DDR4 DRAM running Linux
      • 125 Watts TDP passive cooling

      Early Access

      Inquire

      Documentation