Characterization of latency and jitter in TSN emulation · 2025-06-02
Timestamping frames is central to network profiling. In physical networks, it is primarily performed with network analyzers. Profiling network traffic in software has become crucial in emulated and containerized environments (e.g., Docker, Kubernetes, CNI plugins), bridging and tunneling, and cloudification. Software profiling allows different procedures for recording various timer values at different software layers, each with distinct overheads and trade-offs. The goal of this work is to explore and compare four different timestamping methods. We conduct our study in the context of Time-Sensitive Networking (TSN) emulation, but our methods and findings are applicable to a wide range of network emulation scenarios, including containerized networks and tunneling systems.
Time-Sensitive Networking (TSN) constitutes a set of IEEE standards at the Data Link layer aimed at achieving deterministic and ultra-fast transmissions over standard Ethernet and wireless technologies, capable of integrating different types of traffic. Current proprietary industrial networks are migrating to this new open and interoperable paradigm. TSN is at the basis of Industry 4.0, new intra-vehicle networks in automotive and aerospace industries, and the future Deterministic Internet.
TSN offers traffic shaping to maintain quality of service in networks where time-sensitive streams of varying criticality coexist with best-effort traffic. In particular, we focus on the IEEE 802.1Qbv shaper, the Time-Aware Shaper (TAS) [1]. The quest for optimal routing and scheduling solutions for specific use cases remains open, particularly when it comes to implementing a scheduling solution in actual networks. This implementation requires a verification process that often forces a revision of the schedule, because the theoretical scheduling problem was solved with estimated inputs. Utilizing a physical testbed demands significant time and resources, and simulation may not be practical due to the complexity of certain use cases. In this research, we focus on emulation, which aligns with the trend towards Software-Defined Networks (SDN) and cloudification, where physical testbeds are not available.
To this end, we leverage the Mininet [2] network emulator. It allows the definition and emulation of network nodes (hosts, i.e., TSN end points, and bridges) and links on a single Linux system. Hosts in Mininet behave like physical ones. The network can be administered through common tools (e.g., nm). User applications can send and receive frames through virtual interfaces such as veth, going through Mininet bridges (which can functionally act as switches). Both the emulated and the physical system can run the same binaries. Fig. 1 provides an overview of our TSN emulation setup and highlights the issues we address here. First, we define and configure the network: hosts, bridges, links, and TSN streams (talker and listener nodes, jitter and deadline bounds). Second, we characterize bridge latencies and the intrinsic jitter of this network. Third, we solve the TSN scheduling problem for those streams, taking into account the stream definitions and the latencies and intrinsic jitter we measured in the emulated network. Then, we deploy the schedule by configuring the TAS of the bridges. Finally, we start the system, collecting indicators to assess the correctness of the schedule.
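As an illustration of the first step (defining the network), the following is a minimal sketch of a linear topology with two bridges between a talker and a listener, written against the Mininet Python API (`Topo.addHost`, `Topo.addSwitch`, `Topo.addLink`). The node names are hypothetical, and the sketch uses Mininet's default switch class, which is not necessarily the bridge type used in this work.

```python
# Hypothetical node names; the text only states that frames cross two bridges.
HOSTS = ["talker", "listener"]
BRIDGES = ["br1", "br2"]
LINKS = [("talker", "br1"), ("br1", "br2"), ("br2", "listener")]


def build_topo():
    """Build a Mininet Topo for the linear talker-br1-br2-listener chain."""
    # Imported lazily so the helpers below work without Mininet installed.
    from mininet.topo import Topo

    topo = Topo()
    for name in HOSTS:
        topo.addHost(name)
    for name in BRIDGES:
        topo.addSwitch(name)
    for a, b in LINKS:
        topo.addLink(a, b)  # Mininet backs each link with a veth pair
    return topo


def path(src, dst):
    """Return the node sequence from src to dst along the linear chain."""
    next_hop = dict(LINKS)
    hops = [src]
    while hops[-1] != dst:
        hops.append(next_hop[hops[-1]])
    return hops
```

Starting the emulation would then amount to something like `Mininet(topo=build_topo()).start()` run as root; `path("talker", "listener")` only documents the expected route.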
This work provides the following contributions:
- A software timestamping methodology to measure end-to-end and bridge latencies, comparing options suitable for any network emulation and containerized networks, among others.
- Solutions to the principal issues underlying the Linux and Mininet setup for TSN emulation. As far as we know and as of this writing, this is the only work which explicitly provides pivotal configuration details for TSN emulation on Mininet.
- Experimental evidence leading to useful hints to conveniently set up the underlying emulation platform (Linux plus microprocessor) with and without Intel® TCC®, either on a high-end workstation or on an industrial PC.
In what follows, we review a few close contributions in Sec. 2. Secs. 3 and 6 introduce solutions to the Linux and Mininet issues related to TSN emulation, and timestamping-related background. Sec. 7 describes the experimental environment. Sec. 8 analyzes timestamping methods to characterize latencies at different levels and jitter, considering different platform configurations. Sec. 9 applies the characterization methodology to a use case, revealing and solving a few final problems. Sec. 10 provides conclusions and remarks.
There is a limited number of works related to the issues of emulating TSN on Mininet [3], [4]. Both identify integration problems of TSN components on Mininet bridges, although they do not always resolve them. The authors in [5] develop a measurement methodology comparing a software implementation of the TAS with a hardware testbed, but they provide no details.
TSN emulation on Mininet is also leveraged in [6], focusing on the SDN capabilities of Mininet rather than on the TSN mechanisms implemented in Linux. The authors develop a profiling methodology whose approach differs from that of [5] but obtains similar results. Also, [7] offers an interesting summary of most TSN utilities existing in Linux.
We exploit three key components to support TSN on Mininet: XDP (eXpress Data Path) [8], AF_PACKET [9], and qdisc (Linux queue disciplines) (Fig. 2).
XDP allows the execution of user-level filters using the BPF interface [10, 11]. In a physical system, these filters are attached to a kernel hook that runs right after the interrupt service routine triggered by each frame arrival; in an emulated node, they are attached inside the veth driver. The XDP routine can examine the frame and either drop it, copy it to user space, or forward it to another Network Interface Controller (NIC). AF_PACKET clones the frame and sends the copy to a user process, while the kernel's copy of the frame (sk_buff) proceeds to the Linux Network Stack (LNS).
Qdisc is a Linux framework, managed with the tc tool, to place predefined filters between an ingress (egress) port and the LNS. The key qdisc for TSN is the taprio qdisc, intended to emulate a simplified version of an IEEE 802.1Qbv TAS (Fig. 3). Besides taprio, we also leverage clsact for the complete integration of the taprio qdisc, in order to meet TSN common practices (Sec. 4), and netem to emulate the transmission time (Sec. 9.1).
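To make the role of taprio more concrete, the sketch below assembles the argument list of a `tc qdisc replace ... taprio` command from a gate control list. It is only an illustration: the option names (`num_tc`, `map`, `queues`, `base-time`, `sched-entry`, `clockid`) follow the tc-taprio manual page, while the interface name, the priority-to-class map, and the choice of CLOCK_TAI are assumptions rather than the configuration used in this work.

```python
def taprio_cmd(dev, gcl, num_tc=3, base_time_ns=0):
    """Return the argv list of a `tc qdisc replace ... taprio` command.

    gcl is a list of (gate_mask, interval_ns) pairs; bit k of gate_mask
    opens the gate of traffic class k for interval_ns nanoseconds.
    """
    # Map the 16 skb priorities onto traffic classes; in this sketch,
    # priorities >= num_tc fall back to the last traffic class.
    prio_map = [str(min(p, num_tc - 1)) for p in range(16)]
    # One queue per traffic class, written as "count@offset".
    queues = [f"1@{tc}" for tc in range(num_tc)]
    cmd = ["tc", "qdisc", "replace", "dev", dev, "parent", "root",
           "handle", "100", "taprio",
           "num_tc", str(num_tc), "map", *prio_map,
           "queues", *queues,
           "base-time", str(base_time_ns)]
    for mask, interval_ns in gcl:
        cmd += ["sched-entry", "S", f"{mask:02x}", str(interval_ns)]
    cmd += ["clockid", "CLOCK_TAI"]
    return cmd
```

For instance, `taprio_cmd("br1-eth2", [(0b111, 1_000_000)])` describes a schedule with all three gates permanently open; the resulting list could be passed to `subprocess.run` inside the namespace of the node that owns the (hypothetical) interface `br1-eth2`.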
Mininet runs on a single Linux computer. This obviates the emulation of IEEE 802.1AS devices to meet the TSN time-synchronization requirements, resulting in zero clock skew. Each network node (host or bridge) is a user process, which forks child processes as required (e.g., talkers and listeners at the end stations). Network links among nodes are set up using the virtual Ethernet driver veth. This virtual driver emulates the Data Link Layer firmware of the NICs, and serves as the OS Ethernet driver itself. All processes in a node share a single LNS and the same Linux namespace. Mininet imposes no requirement on the kernel preemption mode. We have opted for a fully preemptible kernel (RT) configuration, common in TSN nodes.
Configuring Mininet involves installing and setting up the taprio qdisc, which emulates the queue structure of IEEE 802.1Qbv. However, the veth driver defaults to a single queue, which suffices for one traffic class; supporting several classes requires patching the kernel [12]. Additionally, frames in TSN need to be identified by the traffic class of their respective streams. TSN bridges commonly utilize the PCP subfield within the VLAN tag of the Ethernet frame for this purpose. Although Mininet does not natively support virtual LANs, we can modify the Host class in the Mininet framework to enable the VLAN field.
In TSN, frames are tagged with their corresponding class ID at the talker's host. We do that with the Linux iptables tool, which stores the ID in the priority field of the sk_buff allocated to the frame (sk_buff->priority), for every stream sent through a specific egress port. Then, actual VLAN interfaces pass this ID to the PCP field of the Ethernet frame. We have configured the veth interfaces at the egress ports of Mininet hosts with VLAN support, so they can proceed the same way.
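For readers unfamiliar with where the PCP lives, the following minimal sketch packs and parses the 4-byte IEEE 802.1Q tag: a 16-bit TPID (0x8100) followed by the Tag Control Information, made of PCP (3 bits), DEI (1 bit), and VLAN ID (12 bits). The function names are illustrative only and are not part of any tool mentioned in the text.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value announcing an IEEE 802.1Q tag


def build_vlan_tag(pcp, vid, dei=0):
    """Pack the 4-byte 802.1Q tag: TPID, then PCP(3) | DEI(1) | VID(12)."""
    if not (0 <= pcp <= 7 and 0 <= vid <= 0xFFF and dei in (0, 1)):
        raise ValueError("pcp, dei or vid out of range")
    tci = (pcp << 13) | (dei << 12) | vid
    return struct.pack("!HH", TPID_8021Q, tci)


def parse_pcp(tag):
    """Extract the PCP (the traffic-class hint) from a 4-byte 802.1Q tag."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return tci >> 13
```

With three PCP bits, at most eight traffic classes can be signalled per frame, which matches the eight queues of an IEEE 802.1Qbv port.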
The veth interfaces at the ingress ports of Mininet bridges cannot be configured with VLAN capabilities; otherwise, they would strip the VLAN header. Since taprio determines the class ID of frames based on their sk_buff->priority field, we employ the clsact qdisc to copy the PCP value of incoming frames into their sk_buff->priority field. Fig. 4 provides an overview of our approach, illustrating how filters and qdiscs are used to emulate TSN within Mininet.
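One plausible way to express the PCP-to-priority copy with tc is sketched below, under the assumption that a flower classifier matching `vlan_prio` is combined with the `skbedit priority` action on the clsact ingress hook. The text does not state which classifier or action is actually used, and the interface name is hypothetical.

```python
def pcp_to_priority_cmds(dev, pcps=range(8)):
    """Return tc argv lists that copy the VLAN PCP into skb->priority."""
    # clsact provides the ingress (and egress) hook for the filters below.
    cmds = [["tc", "qdisc", "add", "dev", dev, "clsact"]]
    for pcp in pcps:
        # One filter per PCP value: match it, then set the same priority.
        cmds.append(["tc", "filter", "add", "dev", dev, "ingress",
                     "protocol", "802.1Q", "flower", "vlan_prio", str(pcp),
                     "action", "skbedit", "priority", str(pcp)])
    return cmds
```

Each returned list can be executed inside the bridge's namespace, for example with `subprocess.run(cmd, check=True)`; after that, taprio sees the PCP of every tagged frame through sk_buff->priority.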
As for time synchronization, we already mentioned that Mininet runs on a single Linux instance. Therefore, all processes can share the same clock, with no clock skew and no explicit emulation of the IEEE 802.1AS protocol.
Intel's Time Coordinated Computing (TCC®) [13] encompasses a set of optimizations that improve the real-time performance of the underlying platform:
- Power State Transition Optimizations limit the jitter in CPU execution due to frequency changes and other power-saving features.
- Memory/Cache Allocation Optimizations reduce the variability of the memory subsystem by allowing the shared caches to be partitioned, including the portions available to the GPU.
- Interrupt Request (IRQ) Optimizations streamline the critical path for interrupts in the CPU core, and also allow devices to deliver interrupts directly to the guest OS.
- Fabric and PCIe Virtual Channels provide different priorities for the transactions related to different workloads, so that real-time traffic can be treated as high priority.
- Intel® Speed Shift for Edge Compute Applications enables specific assignment of processor performance to where it is most needed.
- Precision Time Coordination and PCIe Precision Time Measurement (PTM) allow events to be coordinated across multiple SoC subsystems and components with independent time clocks.
A usual timestamping point is the boundary between the physical layer (PHY) and the medium (e.g., an Ethernet cable). Specialized hardware probes like network TAP devices allow the capture of such measurements. NICs may also include timestamping mechanisms that approximate those values. Some NICs include the ability to timestamp DMA requests as well. The OS can retrieve timestamps from devices (hardware timers at the microprocessor or NICs).
Linux stores a few hardware timestamps in the per-frame sk_buff structure, along with software timestamps, which can be reached through XDP, or through the recvmsg() syscall on AF_PACKET or AF_INET sockets. The OS also generates timestamps from OS-managed hardware timers, abstracted as OS clocks such as CLOCK_REALTIME, CLOCK_TAI, or CLOCK_MONOTONIC, which are reachable through the clock_gettime() syscall or a BPF helper function such as bpf_ktime_get_ns() called from an XDP program. Note that hardware counters may hold time values (e.g., in nanoseconds), or simply a number of ticks that must be translated to time values using a given frequency, as Linux does to provide the clock abstractions just mentioned. When it comes to networking, a typical hardware element used for timestamping is the PTP Hardware Clock (PHC) [14]. Linux offers an ioctl interface that relates the timestamps taken by the NIC to its own time-keeping mechanisms.
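The two ideas in this paragraph, reading OS clocks and converting raw ticks into time, can be sketched in a few lines. This is an illustrative example using Python's `time.clock_gettime_ns`, which wraps the clock_gettime() syscall; the helper names and the 3 GHz frequency in the usage note are assumptions, not values from this work.

```python
import time


def read_clocks():
    """Read CLOCK_REALTIME and CLOCK_MONOTONIC in nanoseconds."""
    return {
        "realtime_ns": time.clock_gettime_ns(time.CLOCK_REALTIME),
        "monotonic_ns": time.clock_gettime_ns(time.CLOCK_MONOTONIC),
    }


def ticks_to_ns(ticks, freq_hz):
    """Convert a raw tick count to nanoseconds with integer arithmetic."""
    return ticks * 1_000_000_000 // freq_hz
```

For a hypothetical counter running at 3 GHz, `ticks_to_ns(3, 3_000_000_000)` gives 1 ns. CLOCK_REALTIME can jump when the wall clock is adjusted, whereas CLOCK_MONOTONIC never goes backwards, which matters when subtracting two timestamps to obtain a latency.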
The hardware architecture provides specific timestamping mechanisms and ISA interfaces. For example, Intel 64 and IA-32 architectures define the operation of a Timestamp Counter (TSC), and instructions like rdtsc to read it [15]. The OS will normally use such architecture support as the foundation of its own time-keeping mechanisms.
Tab. 1 summarizes the three experimental setups we use in this work. C1 employs a preemptible kernel configuration, with no special optimization for real-time (RT), whereas C2 and C3 run a kernel with the PREEMPT_RT patch, configured with full RT preemption. The kernel in C2 is parameterized following Intel®'s recommendations for RT. C3 runs with Intel® TCC® activated.
We deploy 1000 random frames from talker to listener traversing two bridges for the characterization and evaluation measurements performed in Sec. 8, with taprio configured with all queues open to avoid delaying any frame. The TSN topology and streams of the use case are described in Sec. 9, with taprio configured according to the computed schedule. The default configuration is C2 unless stated otherwise.
Among the timestamping possibilities introduced in Sec. 6, we have shortlisted those compatible with the veth framework (which emulates a NIC besides acting as the NIC driver) and with the timestamping points most appropriate for obtaining the delays under consideration.
Fig. 5 shows the points where we record timestamps, along with the calculated latencies. Tab. 2 defines each timestamp and the method(s) used to read them.
We leverage three timestamping methods:
- M1 reads the values of either CLOCK_REALTIME (M1.1) or CLOCK_MONOTONIC (M1.2) using the clock_gettime() syscall.
- M2 records the CLOCK_REALTIME value stored in the sk_buff by veth_xmit() at veth pairs, in kernel space.
  - M2.1 records the timer value using an AF_INET (SOCK_DGRAM) socket from user space.
  - M2.2 records the timer value using an AF_PACKET socket from user space. AF_PACKET clones the sk_buff and sends a copy to user space, whereas the frame proceeds as usual through the LNS, where it can be validated.
- M3 records the value of a CLOCK_MONOTONIC timer in kernel space, using the BPF helper function bpf_ktime_get_ns() through the XDP framework.
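As a rough illustration of the M2.1 idea, the sketch below retrieves a kernel software receive timestamp from user space through recvmsg() ancillary data. It assumes that the timestamp is exposed via the SO_TIMESTAMPNS socket option on 64-bit Linux; the text does not name the socket option actually used, so this should be read as one possible realization rather than the method of this work.

```python
import socket
import struct

# Value of SO_TIMESTAMPNS on x86 and ARM Linux; defined here because not
# every Python build exports the constant. An assumption on other systems.
SO_TIMESTAMPNS = 35
TIMESPEC = struct.Struct("@ll")  # struct timespec: tv_sec, tv_nsec


def timestamp_from_ancdata(ancdata):
    """Return the receive timestamp in ns found in recvmsg() ancillary data."""
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == SO_TIMESTAMPNS:
            sec, nsec = TIMESPEC.unpack(data[:TIMESPEC.size])
            return sec * 1_000_000_000 + nsec
    return None  # no timestamp control message was attached


def open_udp_listener(port):
    """Open an AF_INET/SOCK_DGRAM socket with SO_TIMESTAMPNS enabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, SO_TIMESTAMPNS, 1)
    sock.bind(("", port))
    return sock


def recv_with_timestamp(sock, bufsize=2048):
    """Receive one datagram together with its CLOCK_REALTIME timestamp."""
    payload, ancdata, _flags, _addr = sock.recvmsg(
        bufsize, socket.CMSG_SPACE(TIMESPEC.size))
    return payload, timestamp_from_ancdata(ancdata)
```

A listener would call `open_udp_listener(port)` once and then `recv_with_timestamp(sock)` per frame. The returned value is stamped in kernel space, so it excludes the scheduling delay of the user process, unlike a clock_gettime() call issued after the receive returns.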
There are two relevant latencies we must calculate for TSN scheduling. First, the definition of each time-aware stream \(s_i\), in a set of time-aware streams indexed by \(i\), includes a maximum allowed delay \(D_i\) and jitter \(J_i\) for the stream. For all frames of stream \(s_i\), the delay \(d_{i,j}\) of each frame \(j\) must be equal to or lower than \(D_i\). In TSN scheduling, such delay (and its jitter) refers to the time span we measure in the emulated system as e2e.nic (\(T4 - T1\), Fig. 5). Thus, the TSN schedule deployed in the emulated system is correct if the e2e.nic value measured for every frame \(j\) of \(s_i\), which is the observed \(d_{i,j}\), satisfies \(d_{i,j} \leq D_i\).
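The correctness condition above can be checked mechanically once the e2e.nic samples of a stream are available. The following sketch is illustrative only: the function name is hypothetical, and jitter is taken as the difference between the largest and smallest sample, which is an assumption since the text does not define how jitter is computed.

```python
def check_stream(e2e_nic_ns, max_delay_ns, max_jitter_ns):
    """Check measured e2e.nic samples of one stream against D_i and J_i.

    Jitter is computed as max - min of the samples (an assumption).
    """
    worst = max(e2e_nic_ns)
    jitter = worst - min(e2e_nic_ns)
    return {
        "worst_delay_ns": worst,
        "jitter_ns": jitter,
        "delay_ok": worst <= max_delay_ns,    # d_{i,j} <= D_i for every j
        "jitter_ok": jitter <= max_jitter_ns,
    }
```

Checking the worst sample is enough for the delay bound, because every frame meets \(D_i\) exactly when the largest measured delay does.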
Second, in order to correctly calculate the gate opening and closing times synthesized as GCL entries in the IEEE 802.1Qbv TAS (Fig. 3), the scheduling algorithm must take into account the intrinsic jitter of the physical (or emulated) TSN system. The two main factors of this intrinsic jitter are the clock skew (which is zero in Mininet, Sec. 4) and the bridge latency. Actual TSN bridges have no user processes. In Mininet bridges, we deploy an instrumental user process which only runs when we activate profiling, using M2.2 and M3 to measure the bridge latency (br1L for any intermediate bridge, and br2L for the last bridge before the listener’s end station as in Fig. 5).
We have also instrumented the profiling system to calculate sendL, arrL, and e2e, useful to get an insight on the ways a frame can be processed in Linux.
Fig. 6 shows that the bridge latency (br1L) and average jitter measured using M2.2 (AF_PACKET) are slightly higher, and have a greater IQR, than those measured using M3 (XDP), due to the cloning performed by the former. XDP yields a greater absolute jitter if we consider the outliers. Differences are actually negligible (a few μs), favoring AF_PACKET (M2.2) because XDP (M3) is much harder to implement.
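For reference, the quantities read off a box plot (median, IQR, and outliers) can be computed from raw latency samples as sketched below. The 1.5 × IQR fences are the usual box-plot convention and are assumed here; the text does not say which outlier rule the figures use.

```python
import statistics


def summarize(latencies_us):
    """Return median, IQR and outliers (1.5 * IQR fences) of a sample."""
    q1, median, q3 = statistics.quantiles(latencies_us, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [x for x in latencies_us if x < low or x > high]
    return {"median": median, "iqr": iqr, "outliers": outliers}
```

Comparing `summarize(...)` for the samples of two methods separates the typical spread (IQR) from the worst-case excursions (outliers), which is the distinction drawn above between average and absolute jitter.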
The leftmost box plot in Fig. 7 shows e2e.nic measured using M1.1 for \(T1\) and M2 for \(T4\), with timestamping turned off in bridges. It serves as a baseline to estimate the overhead introduced by XDP (M3) and AF_PACKET (M2.2) when timestamping is active at br1L and br2L. Again, practical differences between XDP and AF_PACKET are negligible. Also, we can estimate the overhead of bridge timestamping while measuring e2e.nic at about 10 μs.
We have compared our methodology over the three platform configurations summarized in Tab. 1. The RT optimizations in C2 yield lower latency values across all timestamping methods, despite a few outliers (Fig. 8). With C3, values decrease even further, necessitating a rescaling of the y-axis to better visualize the impact (note the C3 dashed line at about 150 μs superimposed as a reference). We detail results with C3 in Fig. 9, marking the C3 reference given in Fig. 8 (150 μs), varying now the way we allocate processes to the four cores available in C3. Although the values per latency type roughly hold no matter the allocation, the middle plot (Allocation 2) eliminates the extreme outliers seen in other allocation schemes. We use XDP (M3) at T2, T3, and T4 in Fig. 8 and AF_PACKET (M2.2) in Fig. 9, but the results hold, with negligible variations.
我们在表 1 总结的三种平台配置上比较了我们的方法。尽管存在少量离群值(图 8),C2 中的 RT 优化在所有时间戳方法下都产生了更低的时延值。在 C3 中,数值进一步降低,因此需要重新缩放 \(y\) 轴,以便更好地可视化其影响(注意作为参考叠加显示的 C3 虚线约为 150 \(\mu s\))。我们在图 9 中详细给出了 C3 的结果,标出了图 8 中给出的 C3 参考值(150 \(\mu s\)),并且此时改变了将进程分配到 C3 中可用四个核心上的方式。虽然无论采用何种分配方式,各类时延的数值大致保持不变,但中间图(分配 2)消除了其他分配方案中出现的极端离群值。我们在图 8 中于 \(T2\)、\(T3\) 和 \(T4\) 使用 XDP(M3),在图 9 中使用 AF_PACKET(M2.2),但结果仍然成立,变化可以忽略不计。
RT、C2、C3、Allocation 2、XDP、AF_PACKET 等术语和配置编号已保留。数字 150 \(\mu s\)、四个核心、图表引用均完整。未发现明显问题。
We now deploy a simple TSN use case to achieve three key objectives (Tab. 3, Fig. 10). First, we validate the timestamping methodology outlined in Sec. 8. Second, we wrap up and test the TSN emulation settings described in Secs. 3 and 4. Finally, we demonstrate that the emulation platform and methodology can support the optimization of TSN scheduling design and deployment.
我们现在部署一个简单的 TSN 用例,以实现三个关键目标(表 3,图 10)。首先,我们验证第 8 节概述的时间戳方法。其次,我们总结并测试第 3 节和第 4 节中描述的 TSN 仿真设置。最后,我们展示该仿真平台和方法能够支持 TSN 调度设计与部署的优化。
“wrap-up and test”译为“总结并测试”,语义基本贴合但略依赖上下文。Sec. 8、Secs. 3 and 4 的引用已保留为中文节号表达。未发现明显问题。
We solve the TAS scheduling problem according to the method in [16]. The solver requires the four per-stream parameters shown in Tab. 3 plus their paths (traversed bridges, Fig. 10).
我们按照文献 [16] 中的方法求解 TAS 调度问题。求解器需要表 3 中所示的四个逐流参数,以及这些流的路径(所经过的网桥,图 10)。
“per-stream parameters”译为“逐流参数”,路径说明“traversed bridges”已补译为“所经过的网桥”。引用 [16]、表 3、图 10 均保留。未发现明显问题。
Also, solving the scheduling problem requires considering four network parameters: transmission time, propagation time, bridge latency, and intrinsic jitter (which includes clock skew, talker delay, and NIC/veth jitter). The transmission time is the interval from when the Time-Aware Shaper (TAS) begins transmitting a frame through an open gate to the physical medium until the transmission completes (Fig. 3). The propagation time (also known as propagation delay) is the time it takes for a signal to travel from the sender to the receiver across a physical communication channel.
此外,求解该调度问题还需要考虑四个网络参数:传输时间、传播时间、网桥时延和固有抖动(其中包括时钟偏斜、talker 时延以及 NIC/veth 抖动)。传输时间是指 Time-Aware Shaper(TAS)通过打开的门开始将一个帧发送到物理介质,直到传输完成之间的时间间隔(图 3)。传播时间(也称为传播时延)是指信号通过物理通信信道从发送方传播到接收方所需的时间。
transmission time 与 propagation time 分别译为“传输时间”和“传播时间”,避免混淆。talker、NIC/veth、TAS 已保留必要术语。未发现明显问题。
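The two definitions above reduce to simple ratios: transmission time is frame size over link rate, and propagation time is channel length over signal speed. The Python sketch below works through one example of each. The frame size, link rate, cable length, and the signal speed of roughly 2 × 10^8 m/s are illustrative assumptions, not values from the source, and the function names are hypothetical.

```python
def transmission_time_ns(frame_bytes, link_rate_bps):
    """Time to serialize a frame onto the link, in whole nanoseconds."""
    return frame_bytes * 8 * 1_000_000_000 // link_rate_bps


def propagation_time_ns(length_m, signal_speed_mps=200_000_000):
    """Time for a signal to cross the channel, in whole nanoseconds."""
    return length_m * 1_000_000_000 // signal_speed_mps


# Assumed example: a 1500-byte frame on a 1 Gb/s link over a 100 m cable.
print(transmission_time_ns(1500, 1_000_000_000))  # 12000 ns, i.e. 12 us
print(propagation_time_ns(100))                   # 500 ns
```

In this assumed example the propagation time is more than an order of magnitude smaller than the transmission time.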
In Mininet, the veth driver (Sec. 3) emulates the physical channel, meaning any parameter related to the latter may require kernel modifications. This is the only method we have found to accurately emulate propagation time. However, since propagation time is not central to our goals, we choose to omit it in our platform. Instead, we have devised a workaround to emulate transmission time by instantiating a netem qdisc as a child of taprio, as illustrated in Fig. 4. A drawback of this approach is that the first frame entering each netem queue is lost. Specifically, the first frame of stream 0 in Fig. 10 disappears at B1, and the second disappears at B2. To mitigate this, we perform a dry run before starting the profiling process.
在 Mininet 中,veth 驱动程序(第 3 节)仿真物理信道,这意味着任何与后者相关的参数都可能需要修改内核。这是我们发现的唯一一种能够准确仿真传播时间的方法。然而,由于传播时间并不是我们目标的核心,我们选择在平台中省略它。相反,我们设计了一种变通方法,通过将 netem qdisc 实例化为 taprio 的子 qdisc 来仿真传输时间,如图 4 所示。该方法的一个缺点是,进入每个 netem 队列的第一个帧会丢失。具体而言,图 10 中流 0 的第一个帧在 B1 处消失,第二个帧在 B2 处消失。为了缓解这一问题,我们在启动性能分析过程之前执行一次空运行。
veth、netem qdisc、taprio、B1、B2 等实现术语已保留。dry run 译为“空运行”,符合测试/预热语境。数字和顺序关系“第一个帧在 B1、第二个帧在 B2”已完整保留。未发现明显问题。
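The idea of hanging a netem qdisc under each taprio queue can be sketched as a small Python helper that only builds `tc` argument lists. This is not the authors' configuration, which the text does not show. The interface name, the handle `100`, the priority map, the schedule entries, and the choice of the netem `rate` option to model transmission time are all assumptions made for illustration. The sketch also presumes a veth interface that already exposes one transmit queue per traffic class.

```python
def taprio_with_netem_cmds(dev, sched_entries, queue_rates, base_time_ns=0):
    """Build tc argument lists: a taprio root qdisc plus one netem child per queue.

    sched_entries: list of (gate_mask_hex, interval_ns) pairs (hypothetical schedule).
    queue_rates:   list of netem rate strings, one per traffic class / queue.
    """
    num_tc = len(queue_rates)
    # Map the 16 priorities onto the available traffic classes (assumed mapping).
    prio_map = [str(min(p, num_tc - 1)) for p in range(16)]
    root = ["tc", "qdisc", "replace", "dev", dev, "parent", "root",
            "handle", "100", "taprio", "num_tc", str(num_tc), "map", *prio_map,
            "queues", *[f"1@{q}" for q in range(num_tc)],
            "base-time", str(base_time_ns)]
    for mask, interval_ns in sched_entries:
        root += ["sched-entry", "S", mask, str(interval_ns)]
    root += ["clockid", "CLOCK_TAI"]
    # taprio exposes its queues as classes 100:1, 100:2, ...; attach netem to each.
    children = [["tc", "qdisc", "add", "dev", dev, "parent", f"100:{q + 1}",
                 "netem", "rate", rate]
                for q, rate in enumerate(queue_rates)]
    return [root, *children]


# Hypothetical bridge port with two traffic classes, printed rather than executed.
for cmd in taprio_with_netem_cmds("b1-eth1", [("01", 300000), ("02", 300000)],
                                  ["100mbit", "100mbit"]):
    print(" ".join(cmd))
```

Building the commands as lists keeps them easy to inspect before any of them is passed to `subprocess.run` in a privileged Mininet session.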
As stated in Sec. 4, there is no clock skew in this experimental platform. To measure the talker delay, we recorded the actual transmission times at the talkers' end stations and found that the maximum difference from the scheduled transmission times is about 80 ns. The jitter at bridges is about 200 μs (Fig. 6), and the jitter in e2e.nic latency is around 10 μs. Based on these figures, we estimate the intrinsic jitter of our Mininet emulation platform at 500 μs.
如第 4 节所述,在该实验平台中不存在时钟偏斜。为了测量 talker 延迟,我们记录了 talker 端站处的实际发送时间,得到其与计划发送时间的最大差异约为 80 ns。网桥处的抖动约为 200 μs(图 6),而 e2e.nic 延迟中的抖动约为 10 μs。基于这些数值,我们估计 Mininet 仿真平台的固有抖动为 500 μs。
“ends-tations”应为“end-stations”的识别/排版错误,已按“端站”处理;“80 80 80 80 ns”“200 μ ... s”“10 μ ... s”“500 μ ... s”为公式/单位抽取重复,已还原为 80 ns、200 μs、10 μs、500 μs。术语 talker、e2e.nic、Mininet 保留。存在原文抽取噪声风险。
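The figures above can be put side by side with a few lines of arithmetic. The text does not say how the 500 μs estimate was derived, so the Python sketch below only adds the three reported contributions once each and shows the headroom that the estimate leaves. Treating the contributions as additive, and the names used, are assumptions.

```python
# Contributions reported in the text, converted to nanoseconds.
TALKER_DELAY_NS = 80
BRIDGE_JITTER_NS = 200_000
E2E_NIC_JITTER_NS = 10_000
INTRINSIC_JITTER_ESTIMATE_NS = 500_000  # the estimate adopted for the platform


def jitter_headroom_ns(contributions_ns, estimate_ns):
    """Return (sum of contributions, headroom left by the estimate) in ns."""
    total = sum(contributions_ns)
    return total, estimate_ns - total


total, headroom = jitter_headroom_ns(
    [TALKER_DELAY_NS, BRIDGE_JITTER_NS, E2E_NIC_JITTER_NS],
    INTRINSIC_JITTER_ESTIMATE_NS,
)
print(total, headroom)  # 210080 289920
```

Under this reading the estimate is a little more than twice the summed measurements, so it acts as a conservative bound rather than a tight one.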
Physical TSN systems are usually configured and run through a user network interface known as CUC (Centralized User Configuration), which operates upon a Centralized Network Configuration (CNC) component that performs the actual configuration and boots the system. We do this job through a Python script that sets up the network (Mininet hosts, bridges and links, user processes), configures the GCLs in the taprio qdiscs of the bridges, defines a timestamp as instant zero, and puts all processes in a waiting state. When instant zero is reached, all processes start.
物理 TSN 系统通常通过一种称为 CUC(Centralized User Configuration,集中式用户配置)的用户网络接口来配置和运行,该接口作用于 CNC(Centralized Network Configuration,集中式网络配置)组件;CNC 执行实际配置并启动系统。我们通过一个 Python 脚本完成这项工作:该脚本搭建网络(Mininet 主机、网桥和链路、用户进程),配置网桥的 taprio qdisc 中的 GCL,定义一个时间戳作为零时刻,并将所有进程置于等待状态。当到达零时刻时,所有进程启动。
CUC、CNC、GCL、taprio qdisc、Mininet 均为关键术语,已保留缩写并补充常见中文译名;抽取文本中的“CGL”已按前文统一更正为 GCL(Gate Control List)。逻辑关系为 CUC 操作 CNC、CNC 负责实际配置与启动,已保留。未发现明显问题。
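The "instant zero" start described above can be illustrated with a small Python sketch in which several workers wait for a shared future timestamp and then begin together. It is not the authors' script. It uses threads instead of Mininet processes to stay self-contained, and the function names, the lead time, and the sleep-then-spin waiting strategy are assumptions.

```python
import threading
import time


def wait_until(t0_ns):
    """Block until the monotonic clock reaches t0_ns (sleep first, then spin)."""
    while True:
        remaining = t0_ns - time.monotonic_ns()
        if remaining <= 0:
            return
        if remaining > 2_000_000:
            # Sleep for most of the wait, leaving about 1 ms for the final spin.
            time.sleep((remaining - 1_000_000) / 1e9)


def run_synchronized(workers, lead_time_ns=50_000_000):
    """Start every worker at a common instant zero; return (t0, start times)."""
    t0_ns = time.monotonic_ns() + lead_time_ns  # instant zero, set in the future
    start_times = {}

    def runner(name, fn):
        wait_until(t0_ns)
        start_times[name] = time.monotonic_ns()
        fn()

    threads = [threading.Thread(target=runner, args=(name, fn))
               for name, fn in workers.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return t0_ns, start_times


t0, starts = run_synchronized({"talker": lambda: None, "listener": lambda: None})
print({name: ts - t0 for name, ts in starts.items()})  # offsets from instant zero
```

Choosing instant zero far enough in the future gives every worker time to finish its setup and reach the waiting state before the clock gets there.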
Fig. 11 plots the values of e2e.nic for the three streams of the use case (configuration C2; timestamping activated in bridges with AF_PACKET, M2.2). All measured latencies are in the order of μs, an order of magnitude below the deadlines of the streams.
图 11 绘制了该用例中三个流的 e2e.nic 数值(配置 C2。网桥中启用了时间戳:AF_PACKET,M2.2)。所有测得的延迟均处于 μs 量级,比这些流的截止期限低一个数量级。
配置编号 C2、AF_PACKET、M2.2 和指标 e2e.nic 已保留。原文中 μ 的 LaTeX/抽取残留已还原为“μs”。“deadlines”译为“截止期限”。未发现明显问题。
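Checking measured e2e.nic values against stream deadlines amounts to verifying, for every frame of every stream, that the latency does not exceed that stream's deadline. A minimal Python sketch of such a check follows. The function name, the data layout, and the sample numbers are hypothetical and are not the values plotted in Fig. 11.

```python
def deadline_violations(e2e_nic_us, deadline_us):
    """Return {stream: [frame indexes whose latency exceeds the stream deadline]}."""
    violations = {}
    for stream, latencies in e2e_nic_us.items():
        late = [j for j, d in enumerate(latencies) if d > deadline_us[stream]]
        if late:
            violations[stream] = late
    return violations


# Made-up latencies (us) for two streams that share a 1000 us deadline.
measured = {0: [120, 130], 1: [90, 1100]}
deadlines = {0: 1000, 1: 1000}
print(deadline_violations(measured, deadlines))  # {1: [1]}
```

An empty result means that every frame met its deadline, which is the outcome the paragraph reports for the use case.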
We have also checked that the time windows at the TAS (taprio) gates are wide enough for the frames to pass through. Fig. 12 shows that the gates at the taprio in bridge 1 allocated to streams 1 (green) and 0 (blue) stay open long enough for the frames of those streams to pass through correctly. This means that the schedule solution has correctly taken into account the transmission times and the intrinsic jitter of the platform.
我们还检查了 TAS(taprio)门处的时间窗口是否足够宽,以便帧能够通过。图 12 表明,网桥 1 中 taprio 上分配给流 1(绿色)和流 0(蓝色)的门开启了足够长的时间,以确保这些流的帧能够正确通过。这意味着调度解已经正确考虑了发送时间以及平台的固有抖动。
TAS、taprio、stream 1/0 的颜色标注已保留。pass-through 译为“通过”,符合门控调度语境。逻辑为“门窗口足够宽”推出“调度考虑了发送时间和固有抖动”,已保留。未发现明显问题。
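One way to express the gate-window check in code is to test whether each frame's transmission interval lies entirely inside some open window of its gate. The Python sketch below does this under simplifying assumptions: windows are given as (open, close) pairs in nanoseconds, every frame has the same transmission time, and all names and numbers are hypothetical rather than read from Fig. 12.

```python
def frames_outside_windows(windows_ns, tx_starts_ns, tx_time_ns):
    """Return the start times of frames that do not fit inside any open window."""
    misfits = []
    for start in tx_starts_ns:
        end = start + tx_time_ns
        fits = any(opened <= start and end <= closed
                   for opened, closed in windows_ns)
        if not fits:
            misfits.append(start)
    return misfits


# Two made-up 600 us windows; the second frame starts too late to finish in time.
windows = [(0, 600_000), (1_000_000, 1_600_000)]
print(frames_outside_windows(windows, [100_000, 1_590_000], 12_000))  # [1590000]
```

An empty list corresponds to the situation described above, in which every window is wide enough for its frames despite the platform jitter.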
We have successfully set up a Mininet/Linux environment suitable for TSN emulation, with a timestamping methodology that allows the characterization of the latencies of the emulated network (bridge latency, intrinsic jitter). As an application example, and to complete the necessary configuration steps, we have tested the schedule of a use case on an emulated TSN network, solving the issue of emulating transmission times by leveraging the netem Linux qdisc. Emulating the propagation delay, however, still requires modifying the kernel.
我们已经成功搭建了一个适用于 TSN 仿真的 Mininet/Linux 环境,并采用了一种时间戳方法,使其能够表征仿真网络的延迟(网桥延迟、固有抖动)。作为一个应用示例,并且为了完成必要的配置步骤,我们在一个仿真 TSN 网络上测试了某个用例的调度,通过利用 Linux 的 netem qdisc 解决了发送时间仿真的问题。不过,仿真传播延迟仍需要修改内核。
bridge latency、intrinsic jitter、netem Linux qdisc、propagation delay 均已准确保留/翻译。最后一句 nevertheless 表示转折限制,已用“不过”体现。未发现明显问题。
Using XDP for timestamping yields slightly better latency bounds than using AF_PACKET, although differences are negligible as far as e2e.nic (the end-to-end latency which actually counts in TSN scheduling) is concerned.
使用 XDP 进行时间戳标记所得到的延迟界限略优于使用 AF_PACKET 的情况,尽管就 e2e.nic(在 TSN 调度中真正起作用的端到端延迟)而言,二者差异可以忽略不计。
XDP、AF_PACKET、e2e.nic 均保留。latency bounds 译为“延迟界限”;“actually counts”译为“真正起作用”,贴合调度关注对象。未发现明显问题。
Using XDP for timestamping offers slightly improved latency bounds over AF_PACKET, but the differences in e2e.nic, the end-to-end latency relevant to TSN scheduling, are minimal, and AF_PACKET is significantly easier to use. Leveraging a fully preemptible kernel along with Intel® TCC® substantially reduces the intrinsic jitter in all cases. However, outcomes depend on the process-to-core allocation scheme, which is crucial for industrial PCs with few cores.
使用 XDP 进行时间戳标记,相比 AF_PACKET 可提供略有改善的延迟界限,但在 e2e.nic,即与 TSN 调度相关的端到端延迟方面,差异很小,并且 AF_PACKET 明显更易使用。将完全可抢占内核与 Intel® 的 TCC® 结合使用,在所有情况下都显著降低了固有抖动。然而,结果取决于进程到核心的分配方案,而这种方案对于核心数量较少的工业 PC 至关重要。
此段与 P056 内容部分重复但措辞不同,按输入独立翻译。抽取文本中的“AFP”已按前文统一为 AF_PACKET。Intel®、TCC®、process-to-core allocation scheme 已保留含义。
We have experienced a number of compatibility issues when installing, configuring and adapting the necessary tools and frameworks (kernel and gcc versions, veth, qdiscs, and Mininet itself, among others). Possible improvements include updating veth to integrate taprio and the etf qdisc, emulating the transmission time, and updating hardware platforms.
在安装、配置和适配必要工具与框架时,我们遇到了若干兼容性问题(包括 kernel 和 gcc 版本、veth、qdiscs、Mininet 本身等)。改进方向包括更新 veth 以集成 taprio 和 etf qdisc、仿真发送时间,以及更新硬件平台。
kernel、gcc、veth、qdiscs、Mininet、taprio、etf qdisc 均保留为技术术语。原文“We have experimented a number of compatibility issues”疑为“We have experienced...”,已按“遇到”处理。最后一句并列结构可能存在“improvements are in the line of...”语义不够精确,已按“改进方向”处理。存在原文措辞/抽取风险。
This work was supported by the Spanish MCIN/AEI/10.13039/501100011033 (grant PID2022-136454NB-C22), by the Government of Aragon (research group T58_23R), and by Instituto de Investigación en Ingeniería de Aragón (I3A, Conv. de Ayudas a Prácticas con TFG 2023).
本工作得到了 Spanish MCIN/AEI/10.13039/501100011033(资助编号 PID2022-136454NB-C22)、Aragon 政府(研究组 T58_23R)以及 Instituto de Investigación en Ingeniería de Aragón(I3A,Conv. de Ayudas a Prácticas con TFG 2023)的支持。
资助机构、DOI/机构标识、项目编号 PID2022-136454NB-C22、研究组 T58_23R、I3A 名称均已保留。原文中“),by”缺少空格,已按正常标点处理。机构名称是否应译为中文取决于全篇术语表,当前保留原名以避免误译。未发现明显问题。
切换查看英文原文
\PARstart Timestamping frames is central to network profiling. It is primarily performed using network analyzers in physical networks. Profiling network traffic in software has become crucial in emulated and containerized environments (e.g., Docker, Kubernetes, CNI plugins), bridging and tunneling, and cloudification. It enables the use of different procedures for recording various timer values at different software layers, each with distinct overheads and trade-offs. The goal of this work is to explore and compare four different timestamping methods. We conduct our study in the context of Time-Sensitive Networking (TSN) emulation, but our methods and findings are applicable to a wide range of network emulation scenarios, including containerized networks and tunneling systems.
Time-Sensitive Networking (TSN) constitutes a set of IEEE standards at the Data Link layer aimed at achieving deterministic and ultra-fast transmissions over standard Ethernet and wireless technologies, capable of integrating different types of traffic. Current proprietary industrial networks are migrating to this new open and interoperable paradigm. TSN is at the basis of Industry 4.0, new intra-vehicle networks in automotive and aerospace industries, and the future Deterministic Internet.
TSN offers traffic shaping to maintain quality of service in networks where time-sensitive streams with varying criticalities coexist with best-effort traffic. Particularly, we focus on the IEEE 802.1Qbv shaper (TAS) [ 1 ]. The quest for optimal routing and scheduling solutions for specific use cases still remains open, particularly when it comes to implementing a scheduling solution in actual networks. This implementation requires a verification process that often necessitates revising the schedule due to estimates provided to the theoretical scheduling problem. Utilizing a physical testbed demands significant time and resources, and simulation may not be practical due to the complexity of certain use cases. In this research, we focus on emulation, which aligns with the trend towards Software-Defined Networks (SDN) and cloudification, where physical testbeds are not available.
To this end, we leverage the Mininet [ 2 ] network emulator. It allows the definition and emulation of network nodes (hosts —TSN end points—, and bridges) and links on a single Linux system. Hosts in Mininet behave as the physical ones. The net can be administered through common tools (e.g., nm). User applications can send and receive frames through virtual interfaces such as veth, going through Mininet bridges (which can functionally act as switches). Both the emulated and the physical system can run the same binaries. Fig. 1 provides an overview of our TSN emulation setup and highlights the issues we address here. First, we define and configure the network: hosts, bridges, links, and TSN streams (talker and listener nodes, jitter and deadline bounds). Second, we characterize bridge latencies and the intrinsic jitter of this network. Third, we solve the TSN scheduling problem for those streams, taking into account the stream definitions, and the latencies and intrinsic jitter we measured in the emulated network. Then, we deploy the schedule configuring the TAS of the bridges. Finally, we start the system, collecting indicators to assess the correctness of the schedule.
This work provides the following contributions:
• A software timestamping methodology to measure end-to-end and bridge latencies, comparing options suitable for any network emulation and containerized networks, among others.
• Solutions to the principal issues underlying the Linux and Mininet setup for TSN emulation. As far as we know and as of this writing, this is the only work which explicitly provides pivotal configuration details for TSN emulation on Mininet.
• Experimental evidence leading to useful hints to conveniently set up the underlying emulation platform (Linux plus microprocessor) with and without Intel®'s TCC®, either on a high-end workstation or on an industrial PC.
In what follows, we review a few close contributions in Sec. 2. Secs. 3 and 6 introduce solutions to the Linux and Mininet issues related to TSN emulation, and timestamping-related background. Sec. 7 describes the experimental environment. Sec. 8 analyzes timestamping methods to characterize latencies at different levels and jitter, considering different platform configurations. Sec. 9 applies the characterization methodology to a use case, revealing and solving a few final problems. Sec. 10 provides conclusions and remarks.
There is a limited number of works related to the issues of emulating TSN on Mininet [ 3 ] [ 4 ]. Both identify, but do not always resolve, integration problems of TSN components on Mininet bridges. The authors in [ 5 ] develop a measurement methodology comparing a software implementation of the TAS with a hardware testbed, but they provide no details.
TSN emulation on Mininet is also leveraged in [ 6 ], focusing on the SDN capabilities of Mininet rather than on the TSN mechanisms implemented in Linux. They develop a profiling methodology, whose approach differs from that of [ 5 ] but obtains similar results. Also, there is an interesting summary of most TSN utilities existing in Linux in [ 7 ].
We exploit three key components to support TSN on Mininet: XDP (eXpress Data Path) [ 8 ], AF_PACKET [ 9 ] and qdisc (Linux queue disciplines) (Fig. 2).
XDP allows the execution of user-level filters using the BPF interface [ 10, 11 ]. These filters are attached to a kernel hook right after the interrupt service routine which triggers upon each frame arrival, in a physical system, or just into the veth driver in an emulated node. The XDP routine can examine the frame and either drop it, copy it to user space, or forward it to another Network Interface Controller (NIC). AF_PACKET clones the frame and sends the copy to a user process, while the kernel’s copy of the frame (sk_buff) proceeds to the Linux Network Stack (LNS).
Qdisc is a Linux framework, managed with the tc tool, to place predefined filters between an ingress (egress) port and the LNS. The key qdisc for TSN is the taprio qdisc, intended to emulate a simplified version of an IEEE 802.1Qbv TAS (Fig. 3). Besides taprio, we also leverage clsact for the complete integration of the taprio qdisc, in order to meet TSN common practices (Sec. 4), and netem to emulate the transmission time (Sec. 9.1).
Mininet runs on a single Linux computer. This obviates the emulation of IEEE 802.1AS devices to meet the TSN time-synchronization requirements, resulting in a zero clock skew. Each network node (host or bridge) is a user process, which forks child processes as required (e.g., talkers and listeners at the end stations). Network links among nodes are set up leveraging the virtual Ethernet driver veth. This virtual driver emulates the Data Link Layer firmware of the NICs, and serves as the OS Ethernet driver itself. All processes in a node share a single LNS and the same Linux namespace. Mininet imposes no requirement on the kernel preemption mode. We have opted for a fully preemptible kernel (RT) configuration, common in TSN nodes.
Configuring Mininet involves installing and setting up the taprio qdisc, which emulates the queue structure of IEEE 802.1Qbv. However, the veth driver defaults to a single queue, which suffices for one traffic class; supporting several classes requires patching the kernel [ 12 ]. Additionally, frames in TSN need to be identified by the traffic class of their respective streams. TSN bridges commonly use the PCP subfield within the VLAN tag of the Ethernet frame for this purpose. Although Mininet does not natively support virtual LANs, we can modify the Host class in the Mininet framework to enable the VLAN field.
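As an illustration of what a taprio setup looks like, the following minimal sketch builds a `tc qdisc replace ... taprio` command line in software mode from a gate control list. The helper name `taprio_command`, the interface name, the priority-to-class map, and the one-queue-per-class layout are assumptions made for this example, not the configuration used in this work; the option names follow the tc-taprio(8) manual page.

```python
from typing import List, Tuple


def taprio_command(dev: str, num_tc: int, gcl: List[Tuple[int, int]],
                   base_time_ns: int, handle: int = 100) -> str:
    """Build a taprio command line (hypothetical helper, software mode)."""
    # Map the 16 skb priorities onto traffic classes; priorities that have
    # no class of their own fall back to class 0 (an assumption).
    prio_map = [p if p < num_tc else 0 for p in range(16)]
    # One TX queue per traffic class: "count@offset".
    queues = [f"1@{tc}" for tc in range(num_tc)]
    parts = ["tc", "qdisc", "replace", "dev", dev, "parent", "root",
             "handle", str(handle), "taprio",
             "num_tc", str(num_tc),
             "map", *[str(p) for p in prio_map],
             "queues", *queues,
             "base-time", str(base_time_ns)]
    # Each GCL entry is (gate mask, interval in ns); the mask is written in hex.
    for gate_mask, interval_ns in gcl:
        parts += ["sched-entry", "S", f"{gate_mask:02x}", str(interval_ns)]
    parts += ["clockid", "CLOCK_TAI"]
    return " ".join(parts)


if __name__ == "__main__":
    # Two classes: gate 0 open for 300 us, then gate 1 open for 700 us.
    print(taprio_command("s1-eth2", 2, [(0x01, 300_000), (0x02, 700_000)], 0))
```

The command is only printed here; applying it requires root privileges and an interface that exposes as many TX queues as the `queues` option requests.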
In TSN, frames are tagged with their corresponding class ID at the talker’s host. We do that leveraging the iptables tool of Linux, which stores the ID in the priority field of the sk_buff allocated to the frame (sk_buff→priority), for every stream sent through a specific egress port. Then, actual VLAN interfaces pass this ID to the PCP field of the Ethernet frame. We have configured the veth interfaces at the egress ports of Mininet hosts with VLAN support, so they can proceed the same way.
The veth interfaces at the ingress ports of Mininet bridges cannot be configured with VLAN capabilities; otherwise, they would strip the VLAN header. Since taprio determines the class ID of frames based on their sk_buff→priority field, we employ the clsact qdisc to copy the PCP value of incoming frames into their sk_buff→priority field. Fig. 4 provides an overview of our approach, illustrating how filters and qdiscs are used to emulate TSN within Mininet.
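The tagging path described above can be sketched as a pair of command generators: one for the talker side (iptables plus a VLAN sub-interface) and one for the bridge ingress side (clsact plus one filter per PCP value). This is a hedged illustration only: the helper names, the identification of a stream by its UDP destination port, and the use of `flower`/`skbedit` to copy the PCP are assumptions, and the commands actually used in this work may differ.

```python
from typing import List


def talker_tagging_commands(dev: str, vlan_id: int,
                            udp_port: int, pcp: int) -> List[str]:
    """Commands for a talker host (hypothetical helper)."""
    vlan_dev = f"{dev}.{vlan_id}"
    return [
        # VLAN sub-interface whose egress map copies skb priority into the PCP.
        f"ip link add link {dev} name {vlan_dev} type vlan id {vlan_id} "
        f"egress-qos-map {pcp}:{pcp}",
        # Store the class ID in skb->priority for the stream's frames.
        # Assumption: the stream is identified by its UDP destination port.
        f"iptables -t mangle -A POSTROUTING -o {vlan_dev} -p udp "
        f"--dport {udp_port} -j CLASSIFY --set-class 0:{pcp}",
    ]


def bridge_ingress_commands(dev: str, pcps: List[int]) -> List[str]:
    """Commands for a bridge ingress port (hypothetical helper)."""
    cmds = [f"tc qdisc add dev {dev} clsact"]
    for pcp in pcps:
        # Copy the PCP of the incoming VLAN-tagged frame into skb->priority.
        cmds.append(
            f"tc filter add dev {dev} ingress protocol 802.1Q "
            f"flower vlan_prio {pcp} action skbedit priority {pcp}")
    return cmds


if __name__ == "__main__":
    for line in talker_tagging_commands("h1-eth0", 10, 5000, 3):
        print(line)
    for line in bridge_ingress_commands("s1-eth1", [0, 3]):
        print(line)
```

The generators only return strings, so they can be inspected or logged before being executed with root privileges inside the corresponding Mininet node.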
As for time synchronization, we already mentioned that Mininet runs on a single Linux instance. Therefore, all processes share the same clock, with no clock skew and no explicit emulation of the IEEE 802.1AS protocol.
Intel’s Time Coordinated Computing (TCC®) [ 13 ] encompasses a set of optimizations that improve the real-time performance of the underlying platform:
• Power State Transition Optimizations limit the jitter in CPU execution due to frequency changes and other power-saving features.
• Memory/Cache Allocation Optimizations reduce the variability of the memory subsystem by allowing the shared caches to be partitioned, including the portions available to the GPU.
• Interrupt Request (IRQ) Optimizations streamline the critical path for interrupts in the CPU core, and also allow devices to deliver interrupts directly to the guest OS.
• Fabric and PCIe Virtual Channels provide different priorities for the transactions related to different workloads, so that real-time traffic can be treated as high priority.
• Intel® Speed Shift for Edge Compute Applications enables specific assignment of processor performance to where it is most needed.
• Precision Time Coordination and PCIe Precision Time Measurement (PTM) allow events to be coordinated across multiple SoC subsystems and components with independent time clocks.
A usual timestamping point is the boundary between the physical layer (PHY) and the medium (e.g., an Ethernet cable). Specialized hardware probes like network TAP devices allow the capture of such measurements. NICs may also include timestamping mechanisms that approximate those values. Some NICs include the ability to timestamp DMA requests as well. The OS can retrieve timestamps from devices (hardware timers at the microprocessor or NICs).
Linux stores a few hardware timestamps in the per-frame sk_buff structure, along with software timestamps; they can be read through XDP, or through the recvmsg() syscall on AF_PACKET or AF_INET sockets. The OS also generates timestamps from OS-managed hardware timers, abstracted as OS clocks such as CLOCK_REALTIME, CLOCK_TAI, or CLOCK_MONOTONIC, which are reachable through the clock_gettime() syscall or through a BPF helper function such as bpf_ktime_get_ns() called from an XDP program. Note that hardware counters may hold time values (e.g., in nanoseconds), or simply a number of ticks that must be translated to time values using a given frequency, as Linux does to provide the clock abstractions just mentioned. When it comes to networking, a typical hardware element used for timestamping is the PTP Hardware Clock (PHC) [ 14 ]. Linux offers an ioctl interface that relates the timestamps taken by the NIC to its own time-keeping mechanisms.
The hardware architecture provides specific timestamping mechanisms and ISA interfaces. For example, Intel 64 and IA-32 architectures define the operation of a Timestamp Counter (TSC), and instructions like rdtsc to read it [ 15 ]. The OS will normally use such architecture support as the foundation of its own time-keeping mechanisms.
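The tick-to-time translation mentioned above is simple arithmetic: a counter value is divided by the counter frequency. A minimal sketch follows, assuming a constant-rate counter whose frequency is known; the function name and the 3 GHz figure are illustrative only and do not describe the platforms used in this work.

```python
def ticks_to_ns(ticks: int, freq_hz: int) -> int:
    """Convert a raw tick count into nanoseconds (hypothetical helper).

    Assumes a counter that advances at a constant, known frequency.
    Integer arithmetic avoids floating-point rounding on large counts.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return ticks * 1_000_000_000 // freq_hz


if __name__ == "__main__":
    # 3,000,000 ticks of a 3 GHz counter correspond to 1 ms.
    print(ticks_to_ns(3_000_000, 3_000_000_000))
```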
Tab. 1 summarizes the three experimental setups we use in this work. C1 employs a preemptible kernel configuration, with no special optimization for real-time (RT), whereas C2 and C3 run a kernel with the PREEMPT_RT patch, configured with full RT preemption. The kernel in C2 is parameterized following Intel®'s recommendations for RT. C3 runs with Intel®'s TCC® system activated.
We deploy 1000 random frames from talker to listener traversing two bridges for the characterization and evaluation measurements performed in Sec. 8, with taprio configured with all queues open to avoid delaying any frame. The TSN topology and streams of the use case are described in Sec. 9, with taprio configured according to the computed schedule. The default configuration is C2 unless stated otherwise.
Among the timestamping possibilities introduced in Sec. 6, we have shortlisted the ones compatible with the veth framework, which actually emulates a NIC besides acting as the NIC driver, and with the timestamping points most appropriate to obtain the value of the delays under consideration.
Fig. 5 shows the points where we record timestamps, along with the calculated latencies. Tab. 2 defines each timestamp and the method(s) used to read them.
We leverage three timestamping methods:
• M1 reads the values of either CLOCK_REALTIME (M1.1) or CLOCK_MONOTONIC (M1.2) using the clock_gettime() syscall.
• M2 records the CLOCK_REALTIME value stored in the sk_buff by veth_xmit() at veth pairs, in kernel space.
  ◦ M2.1 records the timer value using a socket AF_INET (SOCK_DGRAM) from user space.
  ◦ M2.2 records the timer value using a socket AF_PACKET from user space. AF_PACKET clones the sk_buff and sends a copy to user space whereas the frame proceeds as usual through the LNS, where it can be validated.
• M3 records the value of a CLOCK_MONOTONIC timer in kernel space, using a BPF helper function bpf_ktime_get_ns() through the XDP framework.
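Method M1 can be illustrated from user space with Python's standard `time` module, which wraps the clock_gettime() syscall on Unix systems. This is a minimal sketch, not the instrumentation used in this work; the helper names are hypothetical, and the interpreter adds overhead that a C implementation would not have.

```python
import time


def read_clocks_ns() -> dict:
    """Read the two clocks used by M1.1 and M1.2, in nanoseconds."""
    return {
        "realtime": time.clock_gettime_ns(time.CLOCK_REALTIME),    # M1.1
        "monotonic": time.clock_gettime_ns(time.CLOCK_MONOTONIC),  # M1.2
    }


def elapsed_ns(work) -> int:
    """Time a callable with CLOCK_MONOTONIC, which never steps backwards."""
    t_start = time.clock_gettime_ns(time.CLOCK_MONOTONIC)
    work()
    return time.clock_gettime_ns(time.CLOCK_MONOTONIC) - t_start


if __name__ == "__main__":
    print(read_clocks_ns())
    # Cost of an empty call: a rough lower bound on the measurement overhead.
    print(elapsed_ns(lambda: None), "ns")
```

CLOCK_MONOTONIC is the safer choice for intervals, since CLOCK_REALTIME can be adjusted while a measurement is in progress.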
There are two relevant latencies we must calculate for TSN scheduling. First, the definition of each time-aware stream $s_i$, in a set of $i$ time-aware streams, includes a maximum allowed delay $D_i$ and jitter $J_i$ for the stream. For all frames of stream $s_i$, the delay $d_{i,j}$ of each frame $j$ must be equal to or lower than $D_i$. In TSN scheduling, such delay (and its jitter) refers to the time span we measure in the emulated system as e2e.nic ($T4 - T1$, Fig. 5). Thus, the TSN schedule deployed in the emulated system is correct if the actual e2e.nic value measured for every frame $j$, taken as $d_{i,j}$, satisfies $d_{i,j} \leq D_i$.
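The per-stream check can be sketched as follows: e2e.nic is computed per frame as T4 minus T1 and compared against the stream's delay bound, and the observed jitter is compared against the stream's jitter bound. The function names are hypothetical, and taking jitter as the difference between the largest and smallest latency is an assumption of this sketch, since other definitions are possible.

```python
from typing import List, Sequence


def e2e_nic_latencies(t1_ns: Sequence[int], t4_ns: Sequence[int]) -> List[int]:
    """Per-frame e2e.nic latency, computed as T4 - T1."""
    if len(t1_ns) != len(t4_ns):
        raise ValueError("T1 and T4 must have one timestamp per frame")
    return [t4 - t1 for t1, t4 in zip(t1_ns, t4_ns)]


def stream_meets_bounds(latencies_ns: Sequence[int],
                        max_delay_ns: int, max_jitter_ns: int) -> bool:
    """Check every frame against the delay bound and the jitter bound."""
    within_deadline = all(d <= max_delay_ns for d in latencies_ns)
    # Assumption: jitter is the spread between the worst and best frame.
    jitter_ns = max(latencies_ns) - min(latencies_ns)
    return within_deadline and jitter_ns <= max_jitter_ns


if __name__ == "__main__":
    latencies = e2e_nic_latencies([0, 1000, 2000], [150, 1160, 2140])
    print(latencies, stream_meets_bounds(latencies, 200, 30))
```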
Second, in order to correctly calculate the gate opening and closing times synthesized as GCL entries in the IEEE 802.1Qbv TAS (Fig. 3), the scheduling algorithm must take into account the intrinsic jitter of the physical (or emulated) TSN system. The two main factors of this intrinsic jitter are the clock skew (which is zero in Mininet, Sec. 4) and the bridge latency. Actual TSN bridges have no user processes. In Mininet bridges, we deploy an instrumental user process which only runs when we activate profiling, using M2.2 and M3 to measure the bridge latency (br1L for any intermediate bridge, and br2L for the last bridge before the listener’s end station as in Fig. 5).
We have also instrumented the profiling system to calculate sendL, arrL, and e2e, useful to get an insight on the ways a frame can be processed in Linux.
Fig. 6 shows that the bridge latency (brL1) and average jitter measured using M2.2 (AF_PACKET) are slightly higher, and have a greater IQR, than those measured using M3 (XDP), due to the cloning performed by the former. XDP yields a greater absolute jitter if we consider the outliers. Differences are actually negligible (a few μs), favoring AF_PACKET (M2.2) because XDP (M3) is much harder to implement.
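For readers reproducing this kind of comparison, the quantities read off a box plot (median, IQR, and outliers) can be computed with the standard library. This sketch uses made-up sample values and the common 1.5 × IQR whisker rule, which is an assumption; the plotting conventions behind Fig. 6 are not stated here.

```python
import statistics
from typing import Dict, Sequence


def box_stats(samples_us: Sequence[float]) -> Dict[str, object]:
    """Median, IQR and outliers of a latency sample (1.5 * IQR rule)."""
    q1, median, q3 = statistics.quantiles(samples_us, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [x for x in samples_us if x < low_fence or x > high_fence]
    return {"median": median, "iqr": iqr, "outliers": outliers}


if __name__ == "__main__":
    # Illustrative values only, not measurements from this work.
    print(box_stats([10, 11, 12, 13, 14, 15, 16, 17, 60]))
```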
The leftmost box plot in Fig. 7 shows e2e.nic measured using M1.1 for $T1$ and M2 for $T4$, with timestamping turned off in bridges. It provides an estimate of the overhead introduced by XDP (M3) and AF_PACKET (M2.2) when timestamping is active at brL1 and brL2. Again, practical differences between XDP and AF_PACKET are negligible. We also estimate the overhead that bridge timestamping adds to e2e.nic at about 10 μs.
We have compared our methodology over the three platform configurations summarized in Tab. 1. The RT optimizations in C2 yield lower latency values across all timestamping methods, despite a few outliers (Fig. 8). With C3, values decrease even further, necessitating a rescaling of the $y$-axis to better visualize the impact (note the C3 dashed line at about 150 μs superimposed as a reference). We detail results with C3 in Fig. 9, marking the C3 reference given in Fig. 8 (150 μs), varying now the way we allocate processes to the four cores available in C3. Although the values per latency type roughly hold no matter the allocation, the middle plot (Allocation 2) eliminates the extreme outliers seen in other allocation schemes. We use XDP (M3) at $T2$, $T3$, and $T4$ in Fig. 8 and AF_PACKET (M2.2) in Fig. 9, but the results hold, with negligible variations.
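Process-to-core allocation of the kind varied in Fig. 9 can be expressed on Linux with the scheduler affinity calls exposed by Python's `os` module. The sketch below pins the calling process to a single core; the helper name is hypothetical, and it does not reproduce any of the specific allocation schemes evaluated here.

```python
import os


def pin_to_core(core: int, pid: int = 0) -> set:
    """Pin a process (0 means the caller) to one core; Linux only."""
    allowed = os.sched_getaffinity(pid)
    if core not in allowed:
        raise ValueError(f"core {core} is not available to this process")
    os.sched_setaffinity(pid, {core})
    # Return the effective affinity mask so the caller can verify it.
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    first_core = min(os.sched_getaffinity(0))
    print(pin_to_core(first_core))
```

In an emulated network, each talker, listener, or bridge process could be pinned this way right after it is spawned, which makes an allocation scheme explicit and repeatable across runs.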
We now deploy a simple TSN use case to achieve three key objectives (Tab. 3, Fig. 10). First, we validate the timestamping methodology outlined in Sec. 8. Second, we wrap up and test the TSN emulation settings described in Secs. 3 and 4. Finally, we demonstrate that the emulation platform and methodology can support the optimization of TSN scheduling design and deployment.
We solve the TAS scheduling problem according to the method in [ 16 ]. The solver requires the four per-stream parameters shown in Tab. 3 plus their paths (traversed bridges, Fig. 10).
Also, solving the scheduling problem requires considering four network parameters: transmission time, propagation time, bridge latency, and intrinsic jitter (which includes clock skew, talker delay, and NIC/ veth jitter). The transmission time is the interval from when the Time-Aware Shaper (TAS) begins transmitting a frame through an open gate to the physical medium until the transmission completes (Fig. 3). The propagation time (also known as propagation delay) is the time it takes for a signal to travel from the sender to the receiver across a physical communication channel.
In Mininet, the veth driver (Sec. 3) emulates the physical channel, meaning any parameter related to the latter may require kernel modifications. This is the only method we have found to accurately emulate propagation time. However, since propagation time is not central to our goals, we choose to omit it in our platform. Instead, we have devised a workaround to emulate transmission time by instantiating a netem qdisc as a child of taprio, as illustrated in Fig. 4. A drawback of this approach is that the first frame entering each netem queue is lost. Specifically, the first frame of stream 0 in Fig. 10 disappears at B1, and the second disappears at B2. To mitigate this, we perform a dry run before starting the profiling process.
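To make the workaround concrete, the sketch below computes the transmission time of a frame from its size and the link rate, and builds a command that attaches a netem qdisc with that delay under one taprio child class. The helper names, the taprio handle, the class numbering (queue index plus one), and the use of netem's fixed `delay` option are assumptions of this example; the exact netem parameters used in this work are not given here.

```python
def transmission_time_ns(frame_bytes: int, link_rate_bps: int) -> int:
    """Time needed to serialize a frame onto the link, in nanoseconds."""
    if link_rate_bps <= 0:
        raise ValueError("link rate must be positive")
    return frame_bytes * 8 * 1_000_000_000 // link_rate_bps


def netem_child_command(dev: str, taprio_handle: int,
                        queue_index: int, delay_ns: int) -> str:
    """Attach netem under a taprio child class (hypothetical helper)."""
    # Assumption: child classes are numbered from 1, one per TX queue,
    # and class minor numbers are written in hex as tc expects.
    classid = f"{taprio_handle}:{queue_index + 1:x}"
    return (f"tc qdisc add dev {dev} parent {classid} "
            f"netem delay {delay_ns / 1000:g}us")


if __name__ == "__main__":
    tx_ns = transmission_time_ns(1500, 1_000_000_000)  # 1500 B at 1 Gb/s
    print(tx_ns, "ns")
    print(netem_child_command("s1-eth2", 100, 0, tx_ns))
```

A 1500-byte frame at 1 Gb/s takes 12 μs to transmit, which is the delay passed to netem in this example.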
As stated in Sec. 4, there is no clock skew in this experimental platform. To measure the talker delay, we have recorded the actual transmission times at the talkers’ end stations, finding that the maximum difference from the scheduled transmission times is about 80 ns. The jitter at bridges is about 200 μs (Fig. 6), and the jitter in e2e.nic latency is around 10 μs. Based on these figures, we estimate the intrinsic jitter of our Mininet emulation platform at 500 μs.
Physical TSN systems are usually configured and run through a user network interface known as CUC (Centralized User Configuration), operating upon a Centralized Network Configuration component (CNC) which performs the actual configuration and boots the system. We do this job through a Python script which sets up the network (Mininet hosts, bridges and links, user processes), configures the GCLs in the taprio qdiscs of the bridges, defines a timestamp as instant zero, and puts all processes in a waiting state. When instant zero is reached, all processes start.
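The "instant zero" mechanism can be sketched with a shared absolute timestamp that every worker waits for before starting. The example below uses threads and CLOCK_MONOTONIC purely for illustration; the function names are hypothetical, and the actual script coordinates separate processes in Mininet nodes, which this sketch does not attempt to reproduce.

```python
import threading
import time
from typing import List, Tuple


def wait_until(instant_zero_ns: int) -> int:
    """Block until CLOCK_MONOTONIC reaches instant zero; return wake-up time."""
    while True:
        now_ns = time.clock_gettime_ns(time.CLOCK_MONOTONIC)
        remaining_ns = instant_zero_ns - now_ns
        if remaining_ns <= 0:
            return now_ns
        # Sleep for the remaining time and re-check, in case of early wake-up.
        time.sleep(remaining_ns / 1e9)


def start_together(n_workers: int,
                   delay_ns: int = 50_000_000) -> Tuple[int, List[int]]:
    """Release n workers at a common instant zero set delay_ns in the future."""
    instant_zero_ns = time.clock_gettime_ns(time.CLOCK_MONOTONIC) + delay_ns
    start_times = [0] * n_workers

    def worker(index: int) -> None:
        start_times[index] = wait_until(instant_zero_ns)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return instant_zero_ns, start_times


if __name__ == "__main__":
    zero, starts = start_together(3)
    print([s - zero for s in starts])  # start offsets after instant zero, ns
```

The printed offsets show how late each worker actually started, which gives a rough idea of the start-up jitter of the host.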
Fig. 11 plots the values of e2e.nic for the three streams of the use case (configuration C2, with timestamping activated in bridges using AF_PACKET, M2.2). All measured latencies are on the order of microseconds, an order of magnitude below the deadlines of the streams.
We have also checked that the time windows at the TAS (taprio) gates are wide enough for the frames to pass through. Fig. 12 shows that the gates at the taprio in bridge 1 allocated to streams 1 (green) and 0 (blue) stay open long enough for the frames of those streams to pass through correctly. This means that the schedule solution has correctly taken into account transmission times and the intrinsic jitter of the platform.
We have successfully set up a Mininet/Linux environment suitable for TSN emulation, with a timestamping methodology that allows the characterization of the latencies of the emulated network (bridge latency, intrinsic jitter). As an application example, and to complete the necessary configuration steps, we have tested the schedule of a use-case on an emulated TSN network, solving the issue of emulating transmission times leveraging the netem Linux qdisc. Emulating the propagation delay requires the modification of the kernel, nevertheless.
Using XDP for timestamping yields slightly better latency bounds than using AF_PACKET, but the differences in e2e.nic, the end-to-end latency that actually counts in TSN scheduling, are negligible, and AF_PACKET is significantly easier to use. Leveraging a fully preemptible kernel along with Intel®'s TCC® substantially reduces the intrinsic jitter in all cases. However, outcomes depend on the process-to-core allocation scheme, which is crucial for industrial PCs with few cores.
We have encountered a number of compatibility issues when installing, configuring, and adapting the necessary tools and frameworks (kernel and gcc versions, veth, qdiscs, and Mininet itself, among others). Future improvements include updating veth to integrate taprio and the etf qdisc, emulating the transmission time, and updating hardware platforms.
This work was supported by the Spanish MCIN/AEI/10.13039/501100011033 (grant PID2022-136454NB-C22), by the Government of Aragon (research group T58_23R), and by Instituto de Investigación en Ingeniería de Aragón (I3A, Conv. de Ayudas a Prácticas con TFG 2023).