.. include:: ../disclaimer-zh_CN.rst

:Original: Documentation/infiniband/opa_vnic.rst

:Translator:

 司延腾 Yanteng Si <siyanteng@loongson.cn>

:Proofreader:

 王普宇 Puyu Wang <realpuyuwang@gmail.com>
 时奎亮 Alex Shi <alexs@kernel.org>

.. _cn_infiniband_opa_vnic:

=================================================================
Intel Omni-Path (OPA) Virtual Network Interface Controller (VNIC)
=================================================================

The Intel Omni-Path (OPA) Virtual Network Interface Controller (VNIC) feature
supports Ethernet functionality over the Omni-Path fabric by encapsulating
Ethernet packets exchanged between HFI nodes.

Architecture
============

Omni-Path encapsulated Ethernet packets are exchanged through one or more
virtual Ethernet switches overlaid on the Omni-Path fabric topology. A subset
of the HFI nodes on the Omni-Path fabric is permitted to exchange encapsulated
Ethernet packets across a particular virtual Ethernet switch. The virtual
Ethernet switches are logical abstractions, achieved by configuring the HFI
nodes on the fabric for header generation and processing. In the simplest
configuration, all HFI nodes across the fabric exchange encapsulated Ethernet
packets over a single virtual Ethernet switch. A virtual Ethernet switch is
effectively an independent Ethernet network. The configuration is performed by
an Ethernet Manager (EM), which is part of the trusted Fabric Manager (FM)
application. An HFI node can have multiple VNICs, each connected to a different
virtual Ethernet switch. The diagram below shows a case of two virtual Ethernet
switches with two HFI nodes::

                               +-------------------+
                               |      Subnet/      |
                               |     Ethernet      |
                               |      Manager      |
                               +-------------------+
                                  /          /
                                /           /
                              /            /
                            /             /
  +-----------------------------+  +------------------------------+
  |  Virtual Ethernet Switch    |  |  Virtual Ethernet Switch     |
  |  +---------+    +---------+ |  | +---------+    +---------+   |
  |  | VPORT   |    |  VPORT  | |  | |  VPORT  |    |  VPORT  |   |
  +--+---------+----+---------+-+  +-+---------+----+---------+---+
           |                 \        /                 |
           |                   \    /                   |
           |                     \/                     |
           |                    /  \                    |
           |                  /      \                  |
       +-----------+------------+  +-----------+------------+
       |   VNIC    |    VNIC    |  |    VNIC   |    VNIC    |
       +-----------+------------+  +-----------+------------+
       |          HFI           |  |          HFI           |
       +------------------------+  +------------------------+


The Omni-Path encapsulated Ethernet packet format is described below.

==================== ================================
Bits                 Field
==================== ================================
Quad Word 0:
0-19                 SLID (lower 20 bits)
20-30                Length (in Quad Words)
31                   BECN bit
32-51                DLID (lower 20 bits)
52-56                SC (Service Class)
57-59                RC (Routing Control)
60                   FECN bit
61-62                L2 (=10, 16B format)
63                   LT (=1, Link Transfer Head Flit)

Quad Word 1:
0-7                  L4 type (=0x78 ETHERNET)
8-11                 SLID[23:20]
12-15                DLID[23:20]
16-31                PKEY
32-47                Entropy
48-63                Reserved

Quad Word 2:
0-15                 Reserved
16-31                L4 header
32-63                Ethernet packet

Quad Words 3 to N-1:
0-63                 Ethernet packet (pad extended)

Quad Word N (last):
0-23                 Ethernet packet (pad extended)
24-55                ICRC
56-61                Tail
62-63                LT (=01, Link Transfer Tail Flit)
==================== ================================

The Ethernet packet is padded on the transmit side to ensure that the VNIC OPA
packet is quad word aligned. The 'Tail' field contains the number of padding
bytes. On the receive side, the 'Tail' field is read and the padding is removed
(along with the ICRC, Tail and OPA header) before the packet is passed up the
network stack.

The L4 header field contains the ID of the virtual Ethernet switch that the
VNIC port belongs to. On the receive side, this field is used to de-multiplex
received VNIC packets to the different VNIC ports.

Driver Design
=============

The Intel OPA VNIC software design is shown in the diagram below. The OPA VNIC
functionality has a HW dependent component and a HW independent component.

Support has been added for IB devices to allocate and free RDMA netdev devices.
The RDMA netdev interfaces with the network stack, thus creating standard
network interfaces. OPA_VNIC is an RDMA netdev device type.

The HW dependent VNIC functionality is part of the HFI1 driver. It implements
the verbs to allocate and free the OPA_VNIC RDMA netdev, and it handles HW
resource allocation and management for the VNIC functionality. It interfaces
with the network stack and implements the required net_device_ops functions.
It expects Omni-Path encapsulated Ethernet packets in the transmit path and
provides HW access to them. It strips the Omni-Path header from received
packets before passing them up the network stack. It also implements the RDMA
netdev control operations.

The OPA VNIC module implements the HW independent VNIC functionality. It
consists of two parts. The VNIC Ethernet Management Agent (VEMA) registers
itself with the IB core as an IB client and interfaces with the IB MAD stack.
It exchanges management information with the Ethernet Manager (EM) and the
VNIC netdev. The VNIC netdev part allocates and frees the OPA_VNIC RDMA netdev
devices. Where required, it overrides the net_device_ops functions set by the
HW dependent VNIC driver to accommodate any control operation. It also handles
the encapsulation of Ethernet packets with an Omni-Path header in the transmit
path. For each VNIC interface, the information required for encapsulation is
configured by the EM via the VEMA MAD interface. The VNIC netdev part also
passes any control information to the HW dependent driver by invoking the RDMA
netdev control operations::

        +-------------------+ +----------------------+
        |                   | |       Linux          |
        |     IB MAD        | |      Network         |
        |                   | |       Stack          |
        +-------------------+ +----------------------+
                 |               |          |
                 |               |          |
        +----------------------------+      |
        |                            |      |
        |      OPA VNIC Module       |      |
        |  (OPA VNIC RDMA Netdev     |      |
        |     & EMA functions)       |      |
        |                            |      |
        +----------------------------+      |
                    |                       |
                    |                       |
           +------------------+             |
           |     IB core      |             |
           +------------------+             |
                    |                       |
                    |                       |
        +--------------------------------------------+
        |                                            |
        |      HFI1 Driver with VNIC support         |
        |                                            |
        +--------------------------------------------+
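
The encapsulation header layout and the transmit-side padding rule from the
packet format table can be illustrated with a short Python sketch. It is not
taken from the driver. The function names (``pack_qw0``, ``pack_qw1``,
``pad_bytes``) are hypothetical, and the sketch assumes that bit 0 in the table
is the least significant bit of each 64-bit quad word and that the Length field
counts the quad words of the whole packet.

```python
L2_16B = 0b10        # L2 field value for the 16B format (from the table)
L4_ETHERNET = 0x78   # L4 type value for encapsulated Ethernet (from the table)

HDR_BYTES = 20       # QW0 + QW1 + reserved/L4 header half of QW2 (8 + 8 + 4)
TRAILER_BYTES = 5    # ICRC (4 bytes) + Tail/LT byte (1 byte)


def pack_qw0(slid, dlid, length_qw, sc, rc=0, becn=0, fecn=0):
    """Build quad word 0, assuming bit 0 is the least significant bit."""
    qw = slid & 0xFFFFF                  # bits 0-19: SLID, lower 20 bits
    qw |= (length_qw & 0x7FF) << 20      # bits 20-30: length in quad words
    qw |= (becn & 0x1) << 31             # bit 31: BECN
    qw |= (dlid & 0xFFFFF) << 32         # bits 32-51: DLID, lower 20 bits
    qw |= (sc & 0x1F) << 52              # bits 52-56: SC
    qw |= (rc & 0x7) << 57               # bits 57-59: RC
    qw |= (fecn & 0x1) << 60             # bit 60: FECN
    qw |= L2_16B << 61                   # bits 61-62: L2 = 10 (16B format)
    qw |= 1 << 63                        # bit 63: LT = 1 (head flit)
    return qw


def pack_qw1(slid, dlid, pkey, entropy):
    """Build quad word 1; bits 48-63 are reserved and left as zero."""
    qw = L4_ETHERNET                     # bits 0-7: L4 type
    qw |= ((slid >> 20) & 0xF) << 8      # bits 8-11: SLID[23:20]
    qw |= ((dlid >> 20) & 0xF) << 12     # bits 12-15: DLID[23:20]
    qw |= (pkey & 0xFFFF) << 16          # bits 16-31: PKEY
    qw |= (entropy & 0xFFFF) << 32       # bits 32-47: entropy
    return qw


def pad_bytes(eth_len):
    """Padding needed so header + Ethernet packet + pad + trailer is a
    multiple of 8 bytes (quad word aligned)."""
    return -(HDR_BYTES + eth_len + TRAILER_BYTES) % 8


if __name__ == "__main__":
    eth_len = 60                                   # example Ethernet frame size
    pad = pad_bytes(eth_len)
    total = HDR_BYTES + eth_len + pad + TRAILER_BYTES
    qw0 = pack_qw0(slid=0x123456, dlid=0xABCDEF, length_qw=total // 8, sc=3)
    qw1 = pack_qw1(slid=0x123456, dlid=0xABCDEF, pkey=0x8001, entropy=0)
    print(f"pad={pad} total={total} qw0={qw0:#018x} qw1={qw1:#018x}")
```

On the receive side the same arithmetic runs in reverse: the Tail value gives
the number of padding bytes to drop together with the ICRC, the Tail byte and
the OPA header.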