Impact of Packet Loss and Timing Errors on Scheduled Periodic Traffic with Time-Aware Shaping (TAS) in Time-Sensitive Networking (TSN) · 2025-10-06
We use the following acronyms.
The rise of Industry 4.0 applications, e.g., intelligent factory automation, results in steadily increasing demands on communication quality. Real-time applications, e.g., controlling the movement of a robotic arm, require guaranteed sub-microsecond end-to-end delays across multiple network nodes; such traffic is called isochronous traffic. As conventional Ethernet technology cannot meet these stringent Quality of Service (QoS) requirements, the Time-Sensitive Networking (TSN) standards have evolved to provide ultra-low latency, zero jitter, and no congestion-based packet loss. TSN consists of standards for time synchronization, deterministic latencies, and high reliability. TSN defines the generalized Precision Time Protocol (gPTP), a configuration of the packet-based synchronization protocol PTP that allows nanosecond-precise time synchronization of all TSN devices. This is the basis for any synchronous operation. The enhancements for scheduled traffic in IEEE 802.1Qbv [1] enable the Time-Aware Shaper (TAS) for deterministic forwarding with ultra-low delay. It consists of eight transmission gates that control the transmission selection from the eight priority queues in the egress port. While a gate is closed, its queue is prevented from sending until the gate opens. The so-called Gate Control List (GCL) defines the time-based state of the gates. By closing all gates but one, the TAS can provide exclusive link access for a single priority, which avoids any delay caused by an already started transmission of a lower-priority packet. By carefully planning each transmission time at the talkers and the forwarding delay at the switches, TSN implements per-stream scheduled traffic that avoids any congestion-based delay. The calculation of this TAS schedule, in combination with the precise sending time of each stream at the end stations, is based on different information, such as the QoS requirements, the network topology, and the traffic description.
The Industrial Internet Consortium (IIC) published the traffic types of Industrial Automation and Control Systems and mapped them to the TSN technologies. The IIC classifies critical applications into three classes: Isochronous, Cyclic, and Event-based traffic. This paper focuses on Isochronous traffic. The calculated schedule depends on the correct arrival time of frames at bridges. Missing, additional, early, or late frames might lead to excessive end-to-end delays or packet loss, either for the end stations that deviated from their planned sending behavior or even for other end stations that strictly adhered to their planned sending behavior. The effects of misconfigured or malfunctioning end stations on network-wide TAS schedules have not yet been investigated. The lack of knowledge about the fragility of Per-Stream Scheduled Traffic in TSN can lead to essential security measures being removed as a cost-cutting measure, resulting in fatal system failures. This paper makes four main contributions. First, it defines the general problem statement of Per-Stream Scheduled Traffic (PSST) with misconfigured or malfunctioning end stations. To this end, it presents a visualization methodology for TAS schedules over time. It explains the effects of faulty frames, i.e., frames that are missing, additional, early, or late, and visualizes the consequences for the TAS schedule. Second, it quantifies via simulation the probability of errors caused by faulty frames in network-wide TAS schedules. Finally, it demonstrates and simulates solution approaches that illustrate the necessity of time-based Per-Stream Filtering and Policing (PSFP) to resolve the fragility issue of TAS.
The paper is structured as follows. We review related work in Section II and introduce background information on TSN in Section III. Section IV explains the problem statement and presents our novel visualization methodology for TAS schedules. We explain and visualize the effects of faulty frames in Section V and present preliminary simulation results in Section VI. Finally, we conclude the paper in Section VII.
The TSN standard "IEEE Std. 802.1Qbv Enhancements for scheduled traffic" was first published in 2018. In the last 6 years, many researchers have published algorithms to create TSN schedules [2]. Some of them implement reliability features within their algorithms based on specific fault models [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]. Those reliability features are designed for specific faults and can be grouped according to their countermeasures. Dobirn et al. [10] as well as Park et al. [11] build their schedules to enable retransmissions of missing frames within the frame's deadline. To this end, they schedule enough time between frames that retransmissions do not interfere with other streams. A similar approach that copes with frame loss is taken by Feng et al. [5, 9] and Reusch et al. [7]. They send redundant copies of frames over the same path to ensure that the stream data is received even if a single frame is missing. Zhou et al. [13] go a step further by adding a routing mechanism that, based on the probability of a frame loss, no longer routes the copies over the same path; this does not necessarily result in disjoint paths. In contrast, Atallah et al. [12, 6], Gavrilut et al. [8], Syed et al. [18, 17], and Huang et al. [3] ensure that redundant copies are sent via disjoint paths. Only two of those works use IEEE Std. 802.1CB Frame Replication and Elimination for Reliability [19] for this approach [17, 3]. Syed et al. [20] present an encoding scheme to reduce the network capacity needed for redundancy information. They use the XOR of two streams as a redundancy stream, which is sent via a path that is disjoint from the paths of both other streams. Therefore, if one link fails, the information from the remaining two streams can be used to reconstruct the missing stream. Another approach is fast rescheduling and rerouting after a link failure, which is implemented by Pozo et al. [21].
Craciunas et al. defined Frame and Flow Isolation [22], which is implemented by some scheduling algorithms [23] (Frame Isolation [13]).
Craciunas et al. introduce a scheduling approach that is robust against clock drift [4]. As a result, small variations in sending time do not lead to interference between frames.
Zhou et al. implement a hardware selection method that chooses the TSN hardware based on its costs, tests, and the reliability requirements of the streams [15].
The 5G ACIA white paper lists typical use cases for TSN in factory environments as well as requirements of industrial communication. The white paper describes two kinds of scheduled traffic, called Isochronous and Cyclic-Synchronous. For both scheduled traffic types, IEEE 802.1Qbv and IEEE 802.1Qci are listed as mandatory without any reasoning. Although Qci is listed as mandatory there, according to IEC/IEEE 60802 it is currently an optional feature for TSN bridges.
We first give an overview of Time-Sensitive Networking (TSN) and explain the TAS. Afterwards, we focus on Per-Stream Scheduled Traffic (PSST) and PSFP.
Time-Sensitive Networking is an enhancement of Ethernet and is currently under standardization by the IEEE 802.1 TSN task group. TSN specifies shaping and policing mechanisms that enable data transmission with QoS requirements. Examples of QoS requirements are bounded jitter, zero congestion-based packet loss, and sub-microsecond end-to-end delays. End stations in a TSN network that transmit data are called talkers, and end stations that receive data are called listeners. Talkers and listeners are connected through one or more TSN bridges that apply shaping and policing mechanisms to provide QoS. Examples of shaping and policing mechanisms are the Credit-Based Shaper (CBS) [24], Per-Stream Filtering and Policing (PSFP) [25], the Time-Aware Shaper (TAS) [1], and Asynchronous Traffic Shaping (ATS) [26].
End stations signal their QoS requirements, e.g., maximum acceptable latency, to the network before they start data transmission. Further, they describe their data transmission by an interval, also called a period, the maximum number of frames that are transmitted during the interval, and the maximum frame size. This allows the calculation of an upper bound on the required data rate. A logical data transmission between a talker and one or more listeners is called a TSN stream. TSN streams are admitted to the network if the QoS demands of both the talker and the listeners can be met. This admission control decision is made either in a distributed manner or by a central configuration entity, the so-called Centralized Network Configuration (CNC). Finally, bridges on the path between the talker and the listeners are configured such that the previously signaled QoS demands are fulfilled.
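The upper bound on the required data rate follows directly from the three traffic-description values. The following is a minimal sketch, not taken from the paper; the function name, the integer microsecond unit, and the optional per-frame overhead parameter are assumptions made for illustration.

```python
def max_data_rate_bps(interval_us: int, max_frames: int, max_frame_size_bytes: int,
                      per_frame_overhead_bytes: int = 0) -> float:
    """Upper bound on the data rate of one stream in bit/s.

    interval_us:              stream interval (period) in microseconds
    max_frames:               maximum number of frames sent per interval
    max_frame_size_bytes:     maximum size of a single frame
    per_frame_overhead_bytes: hypothetical extra bytes per frame on the wire
                              (e.g., preamble and inter-frame gap), default 0
    """
    if interval_us <= 0:
        raise ValueError("interval must be positive")
    bits_per_interval = max_frames * (max_frame_size_bytes + per_frame_overhead_bytes) * 8
    return bits_per_interval * 1_000_000 / interval_us


# Two frames of at most 500 bytes every 1 ms.
rate = max_data_rate_bps(interval_us=1000, max_frames=2, max_frame_size_bytes=500)
print(f"{rate / 1e6:.1f} Mbit/s")
```

An admission control entity could sum such bounds over all streams on a link and compare the total against the link capacity; whether and how overhead is counted depends on the deployment.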
The policing and shaping mechanisms can provide different levels of QoS guarantees. For example, the Credit-Based Shaper (CBS) provides a guaranteed bandwidth share but cannot guarantee sub-microsecond end-to-end delays. IEEE 802.1Qbv [1] specifies the enhancements for scheduled traffic, often referred to as the Time-Aware Shaper (TAS). TAS enables sub-microsecond end-to-end delays for time-critical real-time communication in TSN networks. It is based on a gating mechanism that introduces time slots in which frames are eligible for transmission, similar to Time Division Multiple Access (TDMA) [27]. Figure 1 illustrates the structure of the queuing mechanism in a TSN-compliant bridge with TAS.
TSN bridges support up to eight priority queues per egress port. Frames of a TSN stream are enqueued in a specific queue based on the Priority Code Point (PCP) value of their VLAN tag. A priority queue is followed by a Transmission Selection Algorithm (TSA), e.g., CBS, and a transmission gate with two possible states: open and closed. Frames are only eligible for transmission if their corresponding gate is open. The GCL controls a periodic state change of all gates, called the schedule. A GCL entry maps a relative time interval $[T_{i}, T_{i+1}]$ to a bit vector, where each bit corresponds to a single transmission gate. If the bit is set, the gate is open. Otherwise, it is closed. The schedule repeats periodically for an indefinite number of times. The length of a single schedule is called the hyperperiod, which is typically the least common multiple of all stream periods. If multiple queues are eligible for transmission, i.e., several transmission gates are open, the queue with the highest priority is used.
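The gate lookup and queue selection described above can be sketched in a few lines. This is an illustrative model only: the names `GclEntry`, `open_gates`, and `select_queue` are hypothetical, intervals are treated as half-open, all gates are assumed closed when no entry covers the current time, and queue 7 is assumed to have the highest priority.

```python
from dataclasses import dataclass
from math import lcm


@dataclass(frozen=True)
class GclEntry:
    start_ns: int   # T_i, relative to the start of the schedule
    end_ns: int     # T_{i+1}
    gate_bits: int  # bit q set => transmission gate of queue q is open


def hyperperiod_ns(stream_periods_ns):
    """Least common multiple of all stream periods."""
    return lcm(*stream_periods_ns)


def open_gates(gcl, cycle_ns, now_ns):
    """Bit vector of open gates at absolute time now_ns (schedule repeats every cycle_ns)."""
    t = now_ns % cycle_ns
    for entry in gcl:
        if entry.start_ns <= t < entry.end_ns:
            return entry.gate_bits
    return 0  # assumption: every gate is closed outside the listed intervals


def select_queue(gate_bits, queue_lengths):
    """Highest-priority queue that is non-empty and has an open gate, else None."""
    for q in range(7, -1, -1):
        if (gate_bits >> q) & 1 and queue_lengths[q] > 0:
            return q
    return None


cycle = hyperperiod_ns([250_000, 500_000, 1_000_000])
gcl = [
    GclEntry(0, 200_000, 0b1000_0000),      # exclusive slot for queue 7
    GclEntry(200_000, cycle, 0b0111_1111),  # remaining time for queues 0..6
]
print(cycle, bin(open_gates(gcl, cycle, 1_100_000)))
```

In the second interval, a frame waiting in queue 7 stays queued even though lower-priority queues may send, which is exactly the gating effect the later sections build on.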
Prioritizing or delaying streams can be done by closing the gates of other priorities at the right time. When multiple bridge schedules are planned together, sub-microsecond end-to-end delays can be achieved.
To support streams with sub-microsecond end-to-end delays, frames need to arrive at the bridge in time, i.e., when the corresponding transmission gate is open. Otherwise, the frame is received while the gate is closed. This introduces additional queuing delay, which may violate the stream's deadline. Consequently, both the end stations and the network require precise time synchronization, e.g., with gPTP as specified in IEEE Std. 802.1AS [28]. Additionally, the sending time at the talker and the deployed TAS schedule need to be compatible. When the talker signals its QoS requirements, it additionally sends its earliest and latest possible sending times, called the earliest and latest transmit offset (ETO and LTO), as well as its synchronization jitter to the CNC. Based on this information, the CNC calculates the sending offset for the talker and an appropriate GCL for all bridges on the path, such that the previously signaled QoS requirements are met. The resulting schedule may or may not include queuing at some bridges. Section IV illustrates the consequences of different schedule decisions.
Calculating such a schedule is NP-complete and can be done by suitable algorithms [2]. The combination of sending offsets at talkers and TAS within the network is called Per-Stream Scheduled Traffic (PSST) [1].
Per-Stream Filtering and Policing (PSFP) was defined in IEEE Std. 802.1Qci in 2017 to restrict the enqueuing of unwanted packets into the priority queues at the egress ports. To this end, it can drop packets or change their priority queue based on a combination of the time, the frame size, the rate, and the burstiness of a stream. PSFP consists of three layered components: the Stream Filters, the Stream Gates, and the Flow Meters. PSFP matches each arriving packet to the first Stream Filter instance for which both the configured IEEE 802.1CB Stream Handle and the VLAN priority match. The Stream Filter allows a wildcard entry for both the Stream Handle and the VLAN priority. The matched Stream Filter instance may filter the frame based on the SDU size, updates some counters, and defines the corresponding Stream Gate instance and the optional Flow Meter instance.
The Stream Gate instance consists of a gate that can be opened and closed and that assigns an Internal Priority Value (IPV) based on a time-based gate control list. When a packet arrives while the gate is in the closed state, the packet is dropped. Otherwise, the packet passes the gate, and the switch's internal priority for that frame might be changed based on the IPV. In the following, we refer to this time-based gating mechanism as time-based PSFP.
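The drop-or-pass decision of a Stream Gate can be sketched as follows. This is a simplified illustration rather than the standardized state machine: `StreamGateEntry` and `police` are hypothetical names, the gate is assumed closed whenever no entry covers the arrival time, and an IPV of `None` is used here to mean that the frame keeps its own priority.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StreamGateEntry:
    start_ns: int
    end_ns: int
    is_open: bool
    ipv: Optional[int] = None  # None: leave the frame's priority unchanged


def police(entries, cycle_ns, arrival_ns, frame_priority):
    """Return the internal priority for an arriving frame, or None if it is dropped."""
    t = arrival_ns % cycle_ns
    for entry in entries:
        if entry.start_ns <= t < entry.end_ns:
            if not entry.is_open:
                return None  # gate closed: the frame is dropped
            return frame_priority if entry.ipv is None else entry.ipv
    return None  # assumption: gate is closed outside the listed intervals


# The stream may only arrive during the first 100 us of every 1 ms cycle.
stream_gate = [
    StreamGateEntry(0, 100_000, is_open=True, ipv=7),
    StreamGateEntry(100_000, 1_000_000, is_open=False),
]
print(police(stream_gate, 1_000_000, 50_000, frame_priority=3))     # passes with IPV 7
print(police(stream_gate, 1_000_000, 1_150_000, frame_priority=3))  # dropped
```

Because the decision is made on arrival, an early, late, or additional frame that falls outside its window never reaches the egress queue in this model.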
The optional Flow Meter is a two-rate three-color token bucket meter as defined in MEF 10.3 [29]. The token bucket is configured using the committed information rate, committed burst size, excess information rate, and excess burst size as parameters. This mechanism makes it possible to filter frames based on their data rate. In the following, we refer to this as rate-based PSFP.
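A two-rate three-color meter can be approximated with two token buckets, one filled at the committed rate and one at the excess rate. The sketch below is a deliberately simplified, color-blind variant without coupling between the buckets; it is not a faithful implementation of the MEF 10.3 algorithm, and the class and method names are hypothetical.

```python
class TwoRateThreeColorMeter:
    """Simplified color-blind meter with a committed and an excess token bucket."""

    def __init__(self, cir_bps, cbs_bytes, eir_bps, ebs_bytes):
        self.cir_bps, self.cbs = cir_bps, cbs_bytes
        self.eir_bps, self.ebs = eir_bps, ebs_bytes
        self.committed = float(cbs_bytes)  # assumption: both buckets start full
        self.excess = float(ebs_bytes)
        self.last_ns = 0

    def color(self, arrival_ns, frame_bytes):
        # Refill both buckets for the time elapsed since the previous frame.
        elapsed_s = (arrival_ns - self.last_ns) / 1e9
        self.last_ns = arrival_ns
        self.committed = min(self.cbs, self.committed + self.cir_bps / 8 * elapsed_s)
        self.excess = min(self.ebs, self.excess + self.eir_bps / 8 * elapsed_s)
        if frame_bytes <= self.committed:
            self.committed -= frame_bytes
            return "green"
        if frame_bytes <= self.excess:
            self.excess -= frame_bytes
            return "yellow"
        return "red"


meter = TwoRateThreeColorMeter(cir_bps=8_000_000, cbs_bytes=1000,
                               eir_bps=8_000_000, ebs_bytes=1000)
burst = [meter.color(0, 1000) for _ in range(3)]
later = meter.color(2_000_000, 1000)  # 2 ms later, the committed bucket is full again
print(burst, later)
```

A policer would typically drop red frames and may treat yellow frames as drop-eligible; which action applies is a configuration choice and not shown here.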
The combination of these three layers enables filtering of TSN packets based on their arrival time and data rate.
PSST combines precise sending times at talkers (transmission offsets) and valid GCLs in the network. Misconfigured or malfunctioning end stations can disrupt the deployed network schedule by deviating from the planned sending behavior. We categorize frames that deviate from their planned arrival time, so-called faulty frames, into the following groups: missing frames, additional frames, early frames, and late frames. A missing frame is a frame that is scheduled to arrive at a specific time at a bridge but is never received by the bridge. Reasons for a missing frame include a failed link, packet corruption (CRC failure), a broken end station, or an overloaded network component. An additional frame is a frame received by a bridge without being planned, which can be caused by a malfunctioning or misconfigured end station. Further, frames might arrive too early or too late at a bridge. Reasons for early and late frames are misconfigured end stations, synchronization errors, and non-deterministic queuing delay in a previous network component.
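The four categories can be made concrete with a small classifier that compares the planned arrival times of one stream with the arrivals a bridge actually observes. The matching rule is an assumption made for this sketch, not part of the paper: an observation is attributed to a planned arrival if it lies within a hypothetical `window` around it, and a deviation larger than `tolerance` counts as early or late.

```python
def classify_arrivals(planned, observed, tolerance, window):
    """Label observed arrivals of one stream relative to its planned arrivals.

    Returns a list of (time, label) with labels
    'on time', 'early', 'late', 'missing', or 'additional'.
    """
    labels = []
    unmatched = sorted(observed)
    for p in sorted(planned):
        candidates = [o for o in unmatched if abs(o - p) <= window]
        if not candidates:
            labels.append((p, "missing"))  # nothing arrived near the planned time
            continue
        o = min(candidates, key=lambda x: abs(x - p))
        unmatched.remove(o)
        if o < p - tolerance:
            labels.append((o, "early"))
        elif o > p + tolerance:
            labels.append((o, "late"))
        else:
            labels.append((o, "on time"))
    # Whatever could not be attributed to a planned arrival is additional.
    labels.extend((o, "additional") for o in unmatched)
    return labels


result = classify_arrivals(planned=[100, 200, 300, 400],
                           observed=[100, 190, 320, 350],
                           tolerance=2, window=40)
print(result)
```

Note that a very late frame is indistinguishable from a missing frame plus an additional frame; the window parameter decides which interpretation the sketch uses.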
Faulty frames can cause a part of the network to no longer operate as planned or cause a global network failure. We explain and visualize the consequences of missing, additional, early, and late frames for a network-wide schedule in Section V.
In this section, we first introduce a method to visualize the behavior of an egress queue with TAS and periodic traffic. Then, using minimal examples on a single link, we illustrate how faulty frames can delay later frames, possibly for a significantly long time, and cause a persistent or even continuously increasing queue.
In Figure 3, the occupancy of an egress queue is visualized over time. Frames of different streams are shown with their respective sizes in different colors on top of each other. In the example, the two frames have sizes of 1000 and 500 bytes. Frames arrive periodically and instantaneously, which is indicated by a sudden increase of the queue occupancy. In contrast, we illustrate frame transmissions by a linearly decreasing queue occupancy over time; the slope depends on the transmission speed. (Footnote 1: Instantaneous arrivals and continuous departures are inconsistent, but they simplify the reading of the figures. When arrivals are also depicted as continuous, simultaneous frame arrivals lead to superposed slopes, which look more complex but do not add any value to the discussion.) In the examples, all frames have the same period, and the beginning of a period is marked by a vertical line. In case of different periods, it is more useful to mark the start of a hyperperiod. The state of the periodic TAS gate is indicated on the x-axis. The gate is closed during red intervals, and it is open during green intervals. As a consequence, frames are queued during red intervals and transmitted during green intervals.
We briefly interpret the behavior illustrated in Figure 3. At times $t_{0}$ and $t_{1}$, the magenta and blue frames arrive with 1000 and 500 bytes, respectively. This can be seen from the sudden increases of the queue occupancy at $t_{0}$ and $t_{1}$ by these values. They are queued due to the closed gate. The gate opens at $t_{2}$. This starts the transmission of the magenta frame, which is indicated by the continuously decreasing queue occupancy. After its completion at $t_{3}$, the blue frame is transmitted until $t_{4}$. The described behavior repeats due to the periodicity of the streams.
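The occupancy curve described for Figure 3 can be reproduced numerically for one period. The sketch below computes the corner points of the curve; the concrete times and the link speed of 125 bytes per microsecond (1 Gbit/s) are assumptions chosen for illustration, since the figure's actual values are not given in the text.

```python
def occupancy_trace(arrivals, gate_open_us, link_bytes_per_us):
    """Corner points (time_us, queued_bytes) of the queue occupancy in one period.

    arrivals: list of (time_us, size_bytes); all frames are assumed to arrive
    before the gate opens and are sent in arrival order once it is open.
    """
    points = [(0, 0)]
    queued = 0
    for t, size in sorted(arrivals):
        points.append((t, queued))   # level just before the arrival
        queued += size
        points.append((t, queued))   # instantaneous jump at the arrival
    points.append((gate_open_us, queued))
    t = gate_open_us
    for _, size in sorted(arrivals):
        t += size / link_bytes_per_us  # linear decrease while the frame is sent
        queued -= size
        points.append((t, queued))
    return points


# Magenta frame (1000 bytes) at t0 = 10 us, blue frame (500 bytes) at t1 = 20 us,
# gate opens at t2 = 30 us.
trace = occupancy_trace([(10, 1000), (20, 500)], gate_open_us=30, link_bytes_per_us=125)
t3, t4 = trace[-2][0], trace[-1][0]
print(trace)
```

With these assumed values, the magenta frame finishes 8 us after the gate opens and the blue frame 4 us later; plotting the points with any line-plot tool yields the sawtooth shape described above.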
In the following, we illustrate the effects of faulty frames on a single link. The experiment setup is depicted in Figure 4. One or more talkers send streams to one listener via a single bridge. We consider the buffer occupancy on the link from the bridge to the listener. Periodic traffic is expected to arrive according to a schedule and to be sent within time slots controlled by the TAS. When the remaining time of the time slot is not large enough to accommodate the entire frame, the transmission of the frame is deferred.
We consider the impact of faulty frames, i.e., additional, late, early, or missing frames, on the potential delay of succeeding frames. Moreover, we show how unequal frame sizes can greatly increase that impact.
Figure 5 visualizes the impact of an additional frame. A magenta frame and a blue frame are scheduled for every period. They arrive in time before their respective transmission slots and are queued for a short time. The gate opens slightly after their arrival so that the frames can be sent without significant delay. However, in the second period, an additional orange frame arrives at a time $t_{a}$ before the arrival of the magenta frame. Therefore, the orange frame is transmitted when the gate opens, and the magenta frame remains queued until the gate opens again. The magenta frame then takes the transmission slot of the blue frame. When the blue frame arrives, it is queued until the gate opens the next time, which is the transmission slot of the magenta frame. Thus, both the magenta and the blue frame miss their slots and are sent in the transmission slots of other frames. A prerequisite for the latter is that there are at least two frames per period on the link.
In all following periods, whenever a frame arrives, there is already another frame in the queue, so that the old frame is sent while the new frame must wait again. This can also be viewed as follows. Normally, there are as many transmission slots as arriving frames. In case of additional frames, that number of frames is persistently buffered in the queue. As a consequence, all future frames will suffer from that queuing delay.
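The persistent delay caused by a single additional frame can be reproduced with a small single-link simulation. This is a sketch under assumptions not taken from the paper: a FIFO queue, a period of 100 us with one 10 us slot per stream, frames that take 8 us to send, and the hypothetical helper names `simulate` and `queuing_delays`. A frame is only sent if it fits completely into the current slot.

```python
def simulate(frames, slots):
    """Return {name: transmission start} for a FIFO egress queue behind a TAS gate.

    frames: list of (arrival_us, name, tx_us); slots: list of (open_us, close_us).
    """
    pending = sorted(frames)
    start = {}
    i = 0
    for open_us, close_us in slots:
        t = open_us
        while i < len(pending):
            arrival, name, tx = pending[i]
            t = max(t, arrival)
            if t + tx > close_us:
                break  # frame does not fit into the rest of the slot: deferred
            start[name] = t
            t += tx
            i += 1
    return start


def queuing_delays(extra_frames, periods=4, period_us=100, tx_us=8):
    slots, frames = [], list(extra_frames)
    for k in range(periods):
        base = k * period_us
        slots += [(base + 10, base + 20), (base + 60, base + 70)]
        frames += [(base + 5, ("magenta", k), tx_us), (base + 55, ("blue", k), tx_us)]
    start = simulate(frames, slots)
    return {name: start[name] - arrival for arrival, name, _ in frames if name in start}


planned = queuing_delays(extra_frames=[])
# One additional orange frame shortly before the magenta frame of the second period.
disturbed = queuing_delays(extra_frames=[(102, ("orange", 1), 8)])
print(planned[("magenta", 2)], disturbed[("magenta", 2)])
```

In this toy setup, every frame waits 5 us when all talkers behave as planned, whereas every frame after the orange one waits 55 us, i.e., until the slot of the other stream, and the delay never recovers.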
Figure 6 visualizes the impact of a late frame. Here, a magenta and a blue frame are scheduled for every period and normally arrive before two distinct transmission slots. Only in the second period, the magenta frame is late. As a consequence, the magenta frame cannot be sent within its time slot because the remaining time is not sufficient. Therefore, the frame is queued until the gate opens again. Thereby, the frame is delayed and takes the time slot of the blue frame, which in turn misses its time slot.
In the next period, the effect of the delayed frame is like the effect of an additional frame: it takes the time slot of that period so that the frame arriving in that period is again significantly delayed. This can also be viewed as follows. If a frame is delayed and the time slot for that frame cannot be used, this time slot is missing in the future so that at least one frame remains in the queue. As a consequence, all future frames will suffer from that queuing delay.
A stream is delayed if all its frames arrive late. Figure 6 visualizes the impact of a delayed stream. Here, a magenta and a blue frame are scheduled for every period, with distinct time slots that succeed their planned arrivals. However, all magenta frames arrive slightly too late so that they miss their time slot. As a consequence, they take the time slots of the blue frames, which are in turn delayed until the time slots of the later magenta frames. Below the line, we observe that magenta and blue frames are transmitted in different time slots. That means that all frames are delayed and arrive at the next link in a wrong time slot.
如果一个流的所有帧都迟到,则该流被延迟。图 6 可视化展示了延迟流的影响。这里,在每个周期中都调度一个洋红色帧和一个蓝色帧,并且它们具有不同的时间槽,这些时间槽位于其计划到达时间之后。然而,所有洋红色帧都到达得稍晚,以至于错过了它们的时间槽。因此,它们占用了蓝色帧的时间槽,而蓝色帧又被延迟到后续洋红色帧的时间槽。在线的下方,我们观察到洋红色帧和蓝色帧在不同的时间槽中被传输。这意味着,所有帧都被延迟,并且它们会在错误的时间槽到达下一条链路。
“stream”译为“流”,“delayed stream”译为“延迟流”。颜色标识、每周期调度、错过时间槽、占用蓝色帧时间槽以及到达下一链路的错误时间槽均保留。原文 “Below the line” 依赖图示上下文,译为“在线的下方”可能需要结合图 6 确认具体指代。
Figure 7 visualizes the impact of an early frame. Here, a magenta frame is scheduled and arrives before a blue frame so that it is transmitted first in the subsequent time slot. However, in the second period, the blue frame is early and therefore transmitted first. As a consequence, the magenta frame is transmitted afterwards so that it is slightly late. As the magenta frame will arrive slightly delayed at the next link, it may cause the problems of a late frame on that link (cf. Section V-B2).
图 7 可视化展示了早到帧的影响。这里,一个洋红色帧被调度,并且在一个蓝色帧之前到达,因此它会在随后的时间槽中首先被传输。然而,在第二个周期中,蓝色帧早到,因此首先被传输。因此,洋红色帧随后被传输,从而稍微迟到。由于洋红色帧将在下一条链路稍微延迟到达,它可能在该链路上引发迟到帧的问题(参见第 V-B2 节)。
“early frame”译为“早到帧”,“late frame”译为“迟到帧”。第二周期中蓝色帧早到并抢先传输,导致洋红色帧轻微迟到,再在下一链路触发迟到帧问题,逻辑完整。章节引用 Section V-B2 已保留。未发现明显问题。
Figure 8 visualizes the impact of a missing frame. Here, a magenta frame is scheduled and arrives before a blue frame so that it is transmitted first in the subsequent time slot. However, in the second period, the magenta frame is missing, and therefore the blue frame is transmitted as soon as the gate opens. As a consequence, the blue frame is transmitted earlier than planned. As the blue frame will arrive slightly early at the next link, it may cause the problems of an early frame on that link (cf. Section V-B4).
图 8 可视化展示了缺失帧的影响。这里,一个洋红色帧被调度,并且在一个蓝色帧之前到达,因此它会在随后的时间槽中首先被传输。然而,在第二个周期中,洋红色帧缺失,因此蓝色帧会在门打开后立即被传输。因此,蓝色帧比计划时间更早被传输。由于蓝色帧将在下一条链路稍微提前到达,它可能在该链路上引发早到帧的问题(参见第 V-B4 节)。
“missing frame”译为“缺失帧”,“gates opens”按 TAS 门控语境译为“门打开”。原文存在主谓一致错误 “gates opens”,译文按含义处理。因果关系为洋红色帧缺失、蓝色帧提前传输、下一链路早到并可能触发问题,逻辑完整。未发现明显问题。
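The effect of early and missing frames on the transmission order can be sketched in a few lines of Python. The model assumes a shared time slot, a strict FIFO queue, and arbitrary example values (gate opening at time 20, transmission time 10); none of these numbers are taken from the figures.

```python
GATE_OPEN = 20    # time at which the gate opens (assumed)
TX_TIME = 10      # transmission time of each frame (assumed)


def departures(arrivals):
    """Map each frame to its transmission start, serving frames in arrival order."""
    t = GATE_OPEN
    result = {}
    for name in sorted(arrivals, key=arrivals.get):
        t = max(t, arrivals[name])     # a frame cannot start before it has arrived
        result[name] = t
        t += TX_TIME
    return result


planned = departures({"magenta": 5, "blue": 10})
early_blue = departures({"magenta": 5, "blue": 3})   # blue overtakes magenta
missing_magenta = departures({"blue": 10})           # magenta never arrives

print(planned, early_blue, missing_magenta)
```

In the planned case, magenta starts at 20 and blue at 30. With an early blue frame, the order flips and magenta starts 10 units late. With a missing magenta frame, blue starts 10 units early. Both deviations are then carried to the next link.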
Figure 9 visualizes how a significantly delayed stream can cause a continuous increase of the queue occupancy in the presence of unequal frame sizes. We assume that three streams with small, medium, and large frames arrive within a period and that their time slots follow their frame arrivals. However, the stream with the small frames is delayed such that its frames arrive between the medium and the large frames. This can happen due to a slightly delayed stream (cf. Section V-B3). We observe that when the small frame is sent in the time slot of the large frame, the succeeding large frame no longer fits into the remaining slot time and waits until the time slot of the next large frame. Thereby, two time slots remain unused, which increases the queue occupancy by two frames. As this process continues, it leads to long frame delays and, in the end, to frame loss.
图 9 可视化展示了在帧大小不相等的情况下,一个显著延迟的流如何导致队列占用持续增加。我们假设三个分别具有小型、中型和大型帧的流在一个周期内到达,并且它们的时间槽位于其帧到达时间之后。然而,具有小型帧的流被延迟,使得其帧到达于具有中型和大型帧大小的帧之间。这可能由一个稍微延迟的流导致(参见第 V-B3 节)。我们观察到,当小型帧在大型帧的时间槽中被发送时,后续的大型帧会一直等待到下一个大型帧的时间槽。由此,两个时间槽保持未使用,这会使队列占用增加两个帧。随着这一过程持续,它会导致较长的帧延迟,并最终导致帧丢失。
“queue occupancy”译为“队列占用”,“unequal frame sizes”译为“帧大小不相等”。“small, medium, and large frames”以及“增加两个帧”的数量关系保留。原文 “As this process continuous” 应为 “continues”,译文按“持续”处理。未发现明显问题。
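The growth of the queue with unequal frame sizes can be illustrated with a second toy model in Python. All numbers are assumptions: a period of 60 time units, transmission times of 2, 6, and 10 units for the small, medium, and large frames, slots that are exactly as long as their frame, and a small stream that is delayed by 20 units so that it arrives between the medium and the large frame. The queue is strictly FIFO, so a head-of-line frame that does not fit into the remaining slot time blocks all frames behind it.

```python
from collections import deque

PERIOD = 60                                              # assumed period length
TX = {"small": 2, "medium": 6, "large": 10}              # assumed transmission times
SLOT_OPEN = {"small": 10, "medium": 20, "large": 40}     # assumed slot offsets
PLANNED_ARRIVAL = {"small": 5, "medium": 15, "large": 35}


def backlog_per_period(periods, small_delay=0):
    """Return the number of frames still queued at the end of each period."""
    queue = deque()
    backlog = []
    for p in range(periods):
        base = p * PERIOD
        arrivals = sorted(
            (base + offset + (small_delay if stream == "small" else 0), stream)
            for stream, offset in PLANNED_ARRIVAL.items()
        )
        queue.extend(arrivals)                 # FIFO order equals arrival order
        for slot in sorted(SLOT_OPEN, key=SLOT_OPEN.get):
            t = base + SLOT_OPEN[slot]
            close_t = t + TX[slot]             # slot is as long as its own frame
            while queue:
                arrived, stream = queue[0]
                start = max(t, arrived)
                if start + TX[stream] > close_t:
                    break                      # head-of-line frame does not fit
                queue.popleft()
                t = start + TX[stream]
        backlog.append(len(queue))
    return backlog


print(backlog_per_period(6))
print(backlog_per_period(6, small_delay=20))
```

With these assumed values, the fault-free run ends every period with an empty queue, while the delayed small stream produces the backlog sequence `[1, 3, 4, 6, 7, 9]`: only three frames leave in every two periods although six arrive.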
After showing the possible impacts of timing deviations in theory, this section simulates a small network using the network simulator OMNeT++ and its INET framework [30, 31]. An OMNeT++ simulation consists of three parts. First, the network topology file (Network.ned) defines the end stations, the TSN switches, and the connections between them. Second, the OMNeT++ simulation initialization file (omnetpp.ini) contains the configuration of all end stations and switches, including the GCL of every switch and the scheduled sending times of each stream. Third, a scenario file controls the fault generation. To enable stable fault generation, we had to apply a patch to the application generation, which we provide together with all simulation files and instructions on GitHub. The simulations omit the gPTP time synchronization procedure to avoid interference between time synchronization and TSN applications, and they prefill the MAC tables of all switches to prevent broadcast messages at the start of each simulation.
在从理论上展示了定时偏差的可能影响之后,本节使用网络仿真器 OMNeT++ 及其 INET 框架 [30, 31] 对一个小型网络进行仿真。OMNeT++ 仿真由三个部分组成。第一,网络拓扑文件(Network.ned),它定义终端站、TSN 交换机以及它们之间的连接。第二,OMNet++ 仿真初始化文件(omnetpp.ini),其中包含所有终端站和交换机的配置,包括所有交换机的 GCL 以及每个流的调度发送时间。第三,一个用于控制故障生成的场景文件。为了实现稳定的故障生成,我们必须对应用生成进行一个补丁修改,并且我们在 GitHub 上随所有仿真文件和说明一起提供该补丁。仿真省略了 gPTP 时间同步过程,以避免时间同步与 TSN 应用之间的干扰,并且预先填充所有交换机的 MAC 表,以防止每次仿真开始时出现广播消息。
工具与框架 OMNeT++、INET、文件名 Network.ned、omnetpp.ini、GCL、gPTP、MAC 表均保留。原文同时出现 “OMNeT++” 与 “OMNet++”,译文保留第二处原拼写。三部分结构、补丁原因、GitHub 提供材料、省略 gPTP 和预填 MAC 表的目的均完整。未发现明显问题。
To validate the theoretical results of Section V, all experiments were simulated with OMNeT++. The gate states over time and the transmission times in the event log of the simulation were compared with the graphs above. Both showed the same behavior. The source code for those simulations, together with a manual, is published on GitHub (https://github.com/EpplerM/TSN-Fault-Simulation).
为了验证第 V 节的理论结果,所有实验都使用 OMNeT++ 进行了仿真。将随时间变化的门状态以及传输时间在仿真的事件日志和上述图形之间进行了比较。两个版本表现出相同的行为。这些仿真的源代码以及一份手册已发布在 GitHub 上(https://github.com/EpplerM/TSN-Fault-Simulation)。
原文 “Section Section V” 存在重复,译文按“第 V 节”处理。比较对象为仿真事件日志与上述图形,指标为门状态随时间变化和传输时间,均保留。GitHub 链接完整。未发现明显问题。
Fig. 10 shows a simple network topology consisting of 5 bridges and 5 end-stations, connected with gigabit Ethernet links. Each bridge is connected to one end-station and two or three bridges. The end-stations send and receive scheduled streams, and the bridges use a TAS to forward them according to a predefined schedule. Table I shows the deployed streams as well as their scheduled path.
图 10 显示了一个简单的网络拓扑,该拓扑由 5 个网桥和 5 个终端站组成,并通过千兆以太网链路连接。每个网桥连接到一个终端站以及两个或三个网桥。终端站发送和接收调度流,而网桥使用 TAS 根据预定义调度对这些流进行转发。表 I 显示了所部署的流以及它们的调度路径。
数字 5、5、两个或三个均保留。“bridges”译为“网桥”,“end-stations”译为“终端站”,“gigabit Ethernet links”译为“千兆以太网链路”。TAS 转发依据为预定义调度,表 I 内容为部署流和调度路径,逻辑完整。未发现明显问题。
The streams are sent periodically with a period length of 60 μs and a frame size between 250 and 1250 bytes.
这些流以 60 μs 的周期长度周期性发送,帧大小介于 250 和 1250 Bytes 之间。
周期长度 60 μs、帧大小范围 250 到 1250 Bytes 均保留。输入中 “60 μ s \mu s” 似为 PDF 公式抽取产生的重复/残留,译文按 60 μs 处理。Bytes 可译为“字节”,此处保留原单位名以贴近原文。
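For orientation, the serialization times implied by these frame sizes on a gigabit link follow from simple arithmetic. The sketch below assumes that the stated sizes are the bytes on the wire; the optional 20-byte overhead (preamble, start-of-frame delimiter, and inter-frame gap) is an assumption for the case that they are not.

```python
LINK_RATE_BPS = 1_000_000_000   # gigabit Ethernet, as in the simulated topology
PERIOD_US = 60                  # stream period from the text


def serialization_time_us(frame_bytes, overhead_bytes=0):
    """Time in microseconds needed to put one frame on the wire."""
    bits = (frame_bytes + overhead_bytes) * 8
    return bits * 1_000_000 / LINK_RATE_BPS


for size in (250, 1250):
    plain = serialization_time_us(size)
    padded = serialization_time_us(size, overhead_bytes=20)  # assumed per-frame overhead
    share = plain / PERIOD_US
    print(f"{size} B: {plain:.2f} us ({share:.1%} of the period), {padded:.2f} us with overhead")
```

A 250-byte frame therefore occupies the link for 2 μs and a 1250-byte frame for 10 μs, which is one sixth of the 60 μs period.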
As long as all streams are sent as scheduled, all frames have a constant latency of less than 100 μs.
当所有流都按调度发送时,所有帧都具有小于 100 μs 的恒定时延。
“as scheduled”译为“按调度”,“constant latency”译为“恒定时延”。阈值小于 100 μs 保留。输入中 “100 μ s \mu s” 似为 PDF 公式抽取残留,译文按 100 μs 处理。未发现明显问题。
A fault is introduced after 10 ms by sending a single frame of stream A 10 μs later than scheduled. As a result, the frame remains in the queue in the same way as in the example in Section V-B2. Fig. 11 shows an increasing delay for all streams after the fault occurred. After less than 500 ms, all frames experience a delay of more than 10 times their regular delay. The simulation shows that, in a system using PSST with the TAS, a single delayed frame may lead to a fatal network error.
在 10 ms 之后引入一个故障,方式是将来自流 A 的单个帧比计划时间晚 10 μs 发送。因此,该帧以与第 V-B2 节中的示例相同的方式留在队列中。图 11 显示,在故障发生之后,所有流的时延都在增加。在不到 500 ms 之后,所有帧都经历了超过其常规时延 10 倍的时延。仿真表明,在将 PSST 与 TAS 一起使用的系统中,一个延迟帧可能导致致命的网络错误。
术语 PSST、TAS、stream、frame、delay 已按 TSN 语境译为“流、帧、时延”。数字 10 ms、10 μs、500 ms、10 倍均已保留。原文中 “10 μ s \mu s” 存在公式/OCR 冗余,已按 10 μs 处理。最后一句原文语法略不顺,译文按其含义处理为“将 PSST 与 TAS 一起使用的系统中”。公式/缩写无新增风险,但 10 μs 的源文本格式异常。
PSST in TSN depends on the TAS and on the sending times at the talkers. This makes the communication very fragile, as even a small problem may result in a fatal network error.
TSN 中的 PSST 依赖于 TAS 以及 talker 处的发送时间。这会导致非常脆弱的通信,因为即使一个很小的问题也可能导致致命的网络错误。
PSST、TSN、TAS、talker 均保留必要术语;talker 可译为“发送端”,但为避免 IEEE TSN 术语歧义,此处保留英文并加“处”。逻辑关系“依赖导致脆弱通信”已保留。未发现明显问题。
TSN scheduled traffic is very fragile. A single problem can bring down the entire scheduled traffic. The consequence is a factory outage. TSN scheduled traffic must always be safeguarded by some additional mechanism.
TSN 调度流量非常脆弱。单个问题就可能使整个调度流量瘫痪。其后果是工厂停产。TSN 调度流量必须始终通过某种机制加以保护。
原文为德语夹杂英文,且包含口语化表达“iwas”(某种东西/某种机制)以及疑似拼写错误“Frabrikausfall”(应为 Fabrikausfall,工厂停产)。译文按上下文还原。由于原文像草稿句而非正式论文正文,且存在拼写/OCR 风险,需人工复核。
Naive approaches such as adding spare slots for faults and enlarging the slots can lower the probability of faults, but they do not solve the original problem.
诸如为错误添加时隙以及扩大时隙这类朴素方法,可以降低错误发生的概率,但不能解决原始问题。
原文为德语,“ansätze”译为“方法”,“Slots”译为“时隙”。“Fehler”可指错误或故障,此处按上下文译为“错误”。逻辑转折“降低概率但不能解决问题”已保留。原文为草稿式德语,状态需人工复核。
PSFP can drop the faulty frames with a trivial schedule, which partially solves the problem. In combination with scheduling that implements frame isolation, this solves the fragility.
PSFP 可以通过一个简单的 schedule 丢弃有故障的帧,这部分地解决了该问题。与实现帧隔离的调度相结合时,这可以解决这种脆弱性。
PSFP 保留;schedule 在前半句可译为“调度表”,但原文混用英文,译文保留 “schedule” 以避免过度解释。frame isolation 译为“帧隔离”。原文为德英混杂,“scheduling das Frame isolation implements” 语法不完整,译文按“实现帧隔离的调度”处理。需人工复核。
TSN scheduled traffic needs Frame isolation and PSFP to ensure a working network.
TSN 调度流量需要帧隔离和 PSFP,以确保网络能够正常工作。
Frame isolation 译为“帧隔离”,PSFP 保留。逻辑为必要条件表达,已保留。未发现明显问题。
Current hardware is not capable of reliable PSFP filtering for each stream. This needs to be solved before TSN can be implemented in industrial environments.
当前硬件尚不能针对每个流执行可靠的 PSFP 过滤。在 TSN 能够被部署到工业环境之前,这一问题需要得到解决。
“Current Hardware”译为“当前硬件”,“for each stream”译为“针对每个流”。“implemented in industrial environments”此处按工程语境译为“部署到工业环境”。未发现明显问题。
In the future, a method must be found to protect against talking idiots and attackers. We suspect that a combination of frame isolation and PSFP could achieve this.
未来必须找到一种方法,以防护 Talking Idiots 和攻击者。我们推测,帧隔离与 PSFP 的组合可能能够实现这一点。
原文为德语夹杂英文。“Talking Idiots”可能是特定术语或作者自定义表达,未擅自翻译,仅保留英文。句中“methode”大小写异常但含义明确。由于术语 “Talking Idiots” 需要上下文确认,状态需人工复核。
The reliability of PSST in TSN strongly depends on the TAS configuration and the precise sending times of the talkers. Consequently, TSN scheduled traffic is inherently fragile – even minor timing deviations or configuration errors can lead to severe communication failures, potentially resulting in complete factory downtime.
TSN 中 PSST 的可靠性强烈依赖于 TAS 配置以及 talker 的精确发送时间。因此,TSN 调度流量本质上是脆弱的,即使是很小的定时偏差或配置错误,也可能导致严重的通信故障,并可能造成整个工厂停机。
PSST、TAS、talker 保留;timing deviations 译为“定时偏差”,configuration errors 译为“配置错误”。“complete factory downtime”译为“整个工厂停机”。因果逻辑和程度副词 strongly、inherently、minor、severe、potentially 均已体现。未发现明显问题。
Therefore, TSN scheduled traffic must always be safeguarded by additional mechanisms. The PSFP mechanism can mitigate part of this fragility by dropping faulty frames using a simple schedule. When combined with scheduling strategies that implement frame isolation, the overall robustness of TSN communication can be significantly improved. Thus, ensuring reliable TSN operation requires both PSFP and frame isolation.
因此,TSN 调度流量必须始终由附加机制加以保护。PSFP 机制可以通过使用一个简单的调度表丢弃故障帧,从而缓解这种脆弱性的一部分。当与实现帧隔离的调度策略相结合时,TSN 通信的整体鲁棒性可以得到显著提高。因此,确保 TSN 可靠运行需要同时具备 PSFP 和帧隔离。
PSFP、TSN 保留;simple schedule 译为“简单的调度表”;faulty frames 译为“故障帧”;robustness 译为“鲁棒性”。递进逻辑“因此、可以缓解、相结合、显著提高、因此需要”已保留。未发现明显问题。
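How a time-based filter can contain a late frame may be sketched with a toy model in Python. This is not the IEEE 802.1Qci data model or any switch API: the stream gate is reduced to a periodic acceptance window around the planned arrival, and the period, slot offsets, transmission time, tolerance, and lateness are all assumed values. The sketch covers only the late-frame case.

```python
from collections import deque

PERIOD = 100                               # assumed period length
TX_TIME = 10                               # assumed transmission time per frame
SLOT_OPEN = {"magenta": 20, "blue": 60}    # assumed TAS slot offsets
ARRIVAL = {"magenta": 10, "blue": 50}      # assumed planned arrival offsets
GATE_TOLERANCE = 5                         # assumed width of the acceptance window


def stream_gate_open(stream, t):
    """Hypothetical stream gate: open only shortly after the planned arrival."""
    offset = t % PERIOD
    return ARRIVAL[stream] <= offset <= ARRIVAL[stream] + GATE_TOLERANCE


def simulate(periods, late_by, use_psfp):
    """Return (queuing delays by (stream, period), list of dropped frames)."""
    arrivals = []
    for p in range(periods):
        for stream, offset in ARRIVAL.items():
            late = late_by if (stream, p) == ("magenta", 1) else 0
            arrivals.append((p * PERIOD + offset + late, stream, p))

    queue, dropped = deque(), []
    for t, stream, p in sorted(arrivals):
        if use_psfp and not stream_gate_open(stream, t):
            dropped.append((stream, p))    # frame arrives while its gate is closed
        else:
            queue.append((t, stream, p))

    delays = {}
    slots = sorted(p * PERIOD + o for p in range(periods) for o in SLOT_OPEN.values())
    for open_t in slots:
        if not queue:
            break
        close_t = open_t + TX_TIME         # each slot fits exactly one frame
        arrived, stream, p = queue[0]
        start = max(open_t, arrived)
        if start + TX_TIME <= close_t:
            queue.popleft()
            delays[(stream, p)] = start - arrived
    return delays, dropped


filtered_delays, filtered_drops = simulate(6, late_by=15, use_psfp=True)
unfiltered_delays, unfiltered_drops = simulate(6, late_by=15, use_psfp=False)
print(filtered_drops, max(filtered_delays.values()), max(unfiltered_delays.values()))
```

In this model, the filter sacrifices the one late frame, and all other frames keep their nominal delay of 10 units. Without the filter, no frame is dropped, but the delay of later frames rises to as much as 70 units.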
However, most current hardware implementations are not yet capable of performing reliable PSFP filtering for every stream. This limitation must be resolved before TSN can be deployed in safety-critical industrial environments.
然而,大多数当前的硬件实现尚不能够对每一条流执行可靠的 PSFP 过滤。在 TSN 能够部署于安全关键型工业环境之前,必须解决这一限制。
术语 PSFP 保留缩写,译为“过滤”符合上下文;“every stream”已译为“每一条流”;“safety-critical industrial environments”译为“安全关键型工业环境”准确。未发现明显问题。
Future research should focus on developing methods to protect TSN networks from both unintentional misbehavior (“talking idiots”) and deliberate attacks. We hypothesize that a combination of frame isolation and PSFP could provide an effective foundation for such protection.
未来研究应当重点关注开发方法,以保护 TSN 网络同时免受无意的不当行为(“talking idiots”)和蓄意攻击的影响。我们假设,帧隔离与 PSFP 的组合可以为这种保护提供一个有效的基础。
“talking idiots”为 TSN/网络安全语境中的特定表达,已保留英文原词并用引号标出,建议后续全文术语统一;“both...and...”的并列关系已体现;“frame isolation”译为“帧隔离”准确。未发现明显问题。
切换查看英文原文
We use the following acronyms.
The rise of Industry 4.0 applications, e.g., intelligent factory automation, results in steadily increasing demands on communication quality. Real-time applications, e.g., controlling the movement of a robotic arm, require guaranteed sub-microsecond end-to-end delays across multiple network nodes, so-called isochronous traffic. As conventional Ethernet technology cannot meet these stringent Quality of Service (QoS) requirements, the Time-Sensitive Networking (TSN) standards has been evolved to provide ultra-low latency, zero jitter, and no congestion-based packet loss. TSN consists of standards for time synchronization, deterministic latencies, and high reliability. TSN defines generalized Precision Time Protocol (gPTP), a configuration for the packet based synchronization protocol PTP that allows a nanosecond precise time synchronization of all TSN devices. This is the basis for any synchronous operation. The enhancements for scheduled traffic IEEE 802.1Qbv \cites 802.1Qbv enable the Time-Aware Shaper (TAS) for deterministic forwarding with ultra-low delay. It consists of eight transmission gates that control the transmission selection from the eight priority queues in the egress port. While a gate is closed, the queue is prevented from sending, until the gate opens. The so-called Gate Control List (GCL) defines the timing-based state of the gates. By closing all gates but one, the TAS can provide exclusive link access for a single priority omitting any delay from an already started transmission of a lower priority packet. By carefully planning each transmission time at the Talkers and forwarding delay at the switches, TSN implements a per-stream scheduled traffic that omits any congestion based delay. The calculation of this TAS schedule in combination with the precise sending time of each stream at the end stations is based on different information, such as the QoS requirements, network topology, and traffic description. 
The Industrial Internet Consortium (IIC) published the traffic types of Industrial Automation and Control Systems and maps them to the TSN technologies. The IIC classifies critical applications into 3 classes: Isochronous, Cyclic, and Event-based traffic. This paper focuses on the Isochronous traffic. The calculated schedule depends on the correct arrival time of frames at bridges. Missing, additional, early, or late frames might lead to too long end-to-end delays or packet loss, either for the end stations that deviated from their planned sending behavior or even for other end stations that strictly adhered to their planned sending behavior. The effects of misconfigured or malfunctioning end stations for network-wide TAS schedules have not yet been investigated. The lack of knowledge about the fragility of Per-Stream Scheduled Traffic in TSN can lead to essential security measures being removed as a cost-cutting measure, resulting in fatal system failures. The paper states four main contributions. First, it defines the general problem statement of Per-Stream Scheduled Traffic (PSST) with misconfigured or malfunctioning end stations. Therefore, it presents a visualization methodology for TAS schedules over time. It explains the effects of faulty frames, i.e., frames that are missing, additional, early, or late, and visualizes the consequences for the TAS schedule. Second, it quantifies the probability of errors based on faulty frames for network-wide TAS schedules via simulation. Finally, it demonstrates and simulates solution approaches that illustrate the necessity of time-based Per-Stream Filtering and Policing (PSFP) to resolve the fragility issue of TAS.
The paper is structured as follows. We review related work in Section II and introduce background information on TSN in Section III. Section IV explains the problem statement and presents our novel visualization methodology for TAS schedules. We explain and visualize the effects of faulty frames in Section V and present preliminary simulation results in Section VI. Finally, we conclude the paper in Section VII.
The TSN standard "IEEE Std. 802.1Qbv Enhancements for scheduled traffic" was first published in 2018. In the last six years, many researchers have published algorithms to create TSN schedules [ 2 ]. Some of them implement reliability features within their algorithms based on specific fault models [ 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17 ]. Those reliability features are designed for specific faults and can be structured according to their countermeasures. Dobrin et al. [ 10 ] as well as Park et al. [ 11 ] build their schedules to enable retransmissions of missing frames within the frame's deadline. To this end, they schedule enough time between frames such that retransmissions do not interfere with other streams. A similar approach that copes with frame loss is taken by Feng et al. [ 5, 9 ] and Reusch et al. [ 7 ]. They send redundant copies of frames over the same path to ensure that the stream data is received even if a single frame is missing. Zhou et al. [ 13 ] go a step further by adding a routing mechanism such that the copies are no longer routed over the same path. The routing is based on the probability of a frame loss and does not necessarily result in disjoint paths. In contrast, Atallah et al. [ 12, 6 ], Gavrilut et al. [ 8 ], Syed et al. [ 18, 17 ], and Huang et al. [ 3 ] ensure that redundant copies are sent via disjoint paths. Only two of those use IEEE Std. 802.1CB Frame Replication and Elimination for Reliability [ 19 ] for this approach [ 17, 3 ]. Syed et al. [ 20 ] present an encoding scheme to reduce the network capacity needed for redundancy information. They use the XOR of two streams as a redundancy stream, which is sent via a path that is disjoint from the paths of both other streams. Therefore, if one link fails, the information from the remaining two streams can be used to reconstruct the missing stream. Another approach is fast rescheduling and rerouting after a link failure, which is implemented by Pozo et al. [ 21 ].
Craciunas et al. defined Frame Isolation and Flow Isolation [ 22 ], which are implemented by some scheduling algorithms [ 23 ] (Frame Isolation [ 13 ]).
Craciunas et al. introduce a scheduling approach that is robust against clock drift [ 4 ]. As a result, small variations in sending time do not lead to interference between frames.
Zhou et al. implement a hardware selection that selects the TSN hardware based on its costs, tests, and the reliability requirements of the streams [ 15 ].
A 5G-ACIA white paper lists typical use cases for TSN in factory environments as well as the requirements of industrial communication. The white paper describes two kinds of scheduled traffic, called Isochronous and Cyclic-Synchronous. For both scheduled traffic types, IEEE 802.1Qbv and IEEE 802.1Qci are listed as mandatory without any reasoning. Although Qci is listed as mandatory there, it is currently an optional feature for TSN bridges according to IEC/IEEE 60802.
We first give an overview of Time-Sensitive Networking (TSN) and explain the TAS. Afterwards we focus on Per-Stream Scheduled Traffic (PSST) and the PSFP.
Time-Sensitive Networking is an enhancement of Ethernet and is currently under standardization by the IEEE 802.1 TSN task group. TSN specifies shaping and policing mechanisms that enable data transmission with QoS requirements. Examples for QoS requirements are bounded jitter, zero congestion-based packet loss, and sub-microsecond end-to-end delays. End stations in a TSN network that transmit data are called talkers, and end stations that receive data are called listeners. Talkers and listeners are connected through possibly multiple TSN bridges that apply shaping and policing mechanisms to provide QoS. Examples for shaping and policing mechanisms are Credit-Based Shaper (CBS) [ 24 ], Per-Stream Filtering and Policing (PSFP) [ 25 ], Time-Aware Shaper (TAS) [ 1 ], and Asynchronous Traffic Shaping (ATS) [ 26 ].
End stations signal their QoS requirements, e.g., maximum acceptable latency, to the network before they start data transmission. Further, they describe their data transmission in the form of an interval, also called a period, the maximum number of frames that are transmitted during the interval, and the maximum frame size. This allows the calculation of an upper bound on the required data rate. A logical data transmission between a talker and one or more listeners is called a TSN stream. TSN streams are admitted to the network if the QoS demands of both the talker and listeners can be met. This admission control decision is made either in a distributed manner or by a central configuration entity, the so-called Centralized Network Configuration (CNC). Finally, bridges on the path between the talker and listeners are configured such that the previously signaled QoS demands are fulfilled.
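The upper bound on the data rate follows directly from the three traffic parameters. The following is a minimal sketch; the function name is hypothetical, and the bound counts only the signaled frame size, i.e., it assumes that preamble and inter-frame gap are not included.

```python
def max_data_rate_bps(interval_s: float, max_frames: int, max_frame_size_bytes: int) -> float:
    """Upper bound on a stream's data rate in bit/s.

    Hypothetical helper: counts only the signaled frame size and ignores
    per-frame Ethernet overhead such as preamble and inter-frame gap.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return max_frames * max_frame_size_bytes * 8 / interval_s


# Example: at most one 1250-byte frame per 60 microsecond interval.
rate = max_data_rate_bps(60e-6, 1, 1250)
print(f"{rate / 1e6:.1f} Mbit/s")
```

With these example values, a single stream already occupies roughly one sixth of a gigabit link.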
The policing and shaping mechanisms can provide different levels of QoS guarantees. For example, the Credit-Based Shaper (CBS) provides a guaranteed bandwidth share but cannot guarantee sub-microsecond end-to-end delays. IEEE 802.1Qbv [ 1 ] specifies the enhancements for scheduled traffic, often referred to as the Time-Aware Shaper (TAS). TAS enables sub-microsecond end-to-end delays for time-critical real-time communication in TSN networks. It is based on a gating mechanism that introduces time slots in which frames are eligible for transmission, similar to Time Division Multiple Access (TDMA) [ 27 ]. Figure 1 illustrates the structure of the queuing mechanism in a TSN-compliant bridge with TAS.
TSN bridges support up to eight priority queues per egress port. Frames of a TSN stream are enqueued in a specific queue based on the Priority Code Point (PCP) value of their VLAN tag. A priority queue is followed by a Transmission Selection Algorithm (TSA), e.g., CBS, and a transmission gate with two possible states: open and closed. Frames are only eligible for transmission if their corresponding gate is open. The GCL controls a periodic state change of all gates, called the schedule. A GCL entry maps a relative time interval $[T_{i}, T_{i+1}]$ to a bit vector, where each bit corresponds to a single transmission gate. If the bit is set, the gate is open. Otherwise, it is closed. The schedule repeats periodically for an indefinite number of times. The length of a single schedule is called the hyperperiod, which is typically the least common multiple of all stream periods. If multiple queues are eligible for transmission, i.e., several transmission gates are open, the queue with the highest priority is used.
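The gating logic described above can be sketched in a few lines. This is an illustrative model, not an implementation of the standard: the names `GclEntry`, `open_gates`, and `select_queue` are hypothetical, intervals are treated as half-open, and queue 7 is assumed to have the highest priority.

```python
from dataclasses import dataclass
from math import lcm


@dataclass(frozen=True)
class GclEntry:
    start_ns: int  # T_i, relative to the start of the schedule
    end_ns: int    # T_{i+1}
    gates: int     # 8-bit vector: bit q set means the gate of queue q is open


def hyperperiod_ns(stream_periods_ns):
    """Least common multiple of all stream periods."""
    return lcm(*stream_periods_ns)


def open_gates(gcl, cycle_ns, t_ns):
    """Bit vector of open gates at absolute time t_ns (schedule repeats every cycle_ns)."""
    t = t_ns % cycle_ns
    for entry in gcl:
        if entry.start_ns <= t < entry.end_ns:  # half-open interval (assumption)
            return entry.gates
    return 0  # no entry covers t: treat all gates as closed (assumption)


def select_queue(gcl, cycle_ns, t_ns, backlog):
    """Highest-priority queue that has frames waiting and an open gate, or None."""
    gates = open_gates(gcl, cycle_ns, t_ns)
    for q in range(7, -1, -1):  # queue 7 = highest priority (assumption)
        if backlog[q] > 0 and gates & (1 << q):
            return q
    return None


# Example: queue 7 has exclusive access for the first 20 us of a 60 us cycle.
gcl = [GclEntry(0, 20_000, 0b1000_0000), GclEntry(20_000, 60_000, 0b0111_1111)]
backlog = [1, 0, 0, 0, 0, 0, 0, 1]  # frames waiting in queues 0 and 7
print(hyperperiod_ns([60_000, 120_000]))
print(select_queue(gcl, 60_000, 5_000, backlog))
print(select_queue(gcl, 60_000, 85_000, backlog))
```

In the example, queue 7 is selected at 5 µs, whereas at 85 µs (25 µs into the second cycle) its gate is closed and the waiting frame in queue 0 is selected instead.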
Prioritizing or delaying streams can be done by closing the gates of other priorities at the right time. When multiple bridge schedules are planned together, sub-microsecond end-to-end delays can be achieved.
To support streams with sub-microsecond end-to-end delays, frames need to arrive at the bridge in time, i.e., when the corresponding transmission gate is open. Otherwise, the frame is received while the gate is closed. This introduces additional queuing delay, which may violate the stream’s deadline. Consequently, both the end stations and the network require precise time synchronization, e.g., with gPTP IEEE Std. 802.1AS [ 28 ]. Additionally, the sending time at the talker and the deployed TAS schedule need to be compatible. When the talker signals its QoS requirements, it additionally sends its earliest and latest possible sending times, called the earliest and latest transmit offsets (ETO and LTO), as well as its synchronization jitter to the CNC. Based on this information, the CNC calculates the sending offset for the talker and an appropriate GCL for all bridges on the path such that the previously signaled QoS requirements are met. The resulting schedule may or may not include queuing at some bridges. Section IV illustrates the consequences of different schedule decisions.
Calculating such a schedule is NP-complete and can be done by suitable algorithms [ 2 ]. The combination of sending offsets at talkers and TAS within the network is called Per-Stream Scheduled Traffic (PSST) [ 1 ].
Per-Stream Filtering and Policing (PSFP) was defined in IEEE Std. 802.1Qci in 2017 to restrict the enqueuing of unwanted packets into the priority queues at the egress ports. To this end, it can drop packets or change their priority queue based on a combination of the time, the frame size, the rate, and the burstiness of a stream. PSFP consists of three layered components: the Stream Filters, the Stream Gates, and the Flow Meters. PSFP matches each arriving packet to the first Stream Filter instance where both the configured IEEE 802.1CB Stream Handle and the VLAN priority match. The Stream Filter allows a wildcard entry for both the Stream Handle and the VLAN priority. The matched Stream Filter instance may filter the frame based on the SDU size, updates some counters, and determines the corresponding Stream Gate instance and the optional Flow Meter instance.
The Stream Gate instance consists of a gate that can be opened and closed and that assigns an Internal Priority Value (IPV) based on a time-based gate control list. When a packet arrives while the gate is closed, the packet is dropped. Otherwise, the packet passes the gate, and the switch-internal priority of that frame may be changed according to the IPV. In the following, we refer to this time-based gating mechanism as time-based PSFP.
The optional Flow Meter is a two-rate three-color token bucket meter as defined in MEF 10.3 [ 29 ]. The token bucket is configured using the committed information rate, committed burst size, excess information rate, and excess burst size as parameters. This mechanism enables filtering of frames based on their data rate. In the following, we refer to this as rate-based PSFP.
The combination of these three layers enables filtering of TSN packets based on their arrival time and data rate.
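The behavior of such a Stream Gate can be sketched as follows. This is a simplified illustration under stated assumptions: the names `StreamGateEntry` and `stream_gate` are hypothetical, intervals are half-open, and a time that is not covered by any entry is treated as a closed gate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StreamGateEntry:
    start_ns: int
    end_ns: int
    is_open: bool
    ipv: Optional[int] = None  # None: keep the priority the frame arrived with


def stream_gate(entries, cycle_ns, arrival_ns, priority):
    """Return the internal priority of an accepted frame, or None if it is dropped."""
    t = arrival_ns % cycle_ns
    for entry in entries:
        if entry.start_ns <= t < entry.end_ns:
            if not entry.is_open:
                return None  # gate closed: the frame is dropped
            return priority if entry.ipv is None else entry.ipv
    return None  # no entry covers t: treated as closed (assumption)


# Example: frames are only accepted during the first 2 us of each 60 us cycle.
entries = [
    StreamGateEntry(0, 2_000, True),
    StreamGateEntry(2_000, 60_000, False),
]
print(stream_gate(entries, 60_000, 1_000, 7))   # on time: passes with priority 7
print(stream_gate(entries, 60_000, 11_000, 7))  # 10 us late: dropped (None)
```

A frame that misses its acceptance window is discarded at the ingress instead of occupying an egress queue.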
PSST combines precise sending times at talkers (transmission offsets) and valid GCLs in the network. Misconfigured or malfunctioning end stations can disrupt the deployed network schedule by deviating from the planned sending behavior. We categorize frames that deviate from their planned arrival time, so-called faulty frames, into the following groups: missing frames, additional frames, early frames, and late frames. A missing frame is a frame that is scheduled to arrive at a specific time at a bridge but is never received by the bridge. A missing frame may be caused by a failed link, packet corruption (CRC failure), a broken end station, or an overloaded network component. An additional frame is a frame received by a bridge without being planned, which can be caused by a malfunctioning or misconfigured end station. Further, frames might arrive too early or too late at a bridge. Reasons for early and late frames are misconfigured end stations, synchronization errors, and non-deterministic queuing delay in a previous network component.
Faulty frames can cause a part of the network to no longer operate as planned or cause a global network failure. We explain and visualize the consequences of missing, additional, early, and late frames for a network-wide schedule in Section V.
In this section, we first introduce a method to visualize the behavior of an egress queue with TAS and periodic traffic. Then, we use minimal examples on a single link to illustrate how faulty frames can delay later frames, possibly by a significant amount of time, and cause a persistent or even continuously increasing queue.
In Figure 3, the occupancy of an egress queue is visualized over time. Frames of different streams are shown with their respective size in different colors on top of each other. In the example, the two frames are 1000 and 500 bytes large. Frames arrive periodically and instantaneously, which is indicated by a sudden increase of the queue occupancy. In contrast, we illustrate frame transmissions by linearly decreasing queue occupancy over time; the slope depends on the transmission speed. (Footnote 1: Instantaneous arrivals and continuous departures are inconsistent but simplify the reading of the figures. If arrivals were also depicted as continuous, simultaneous frame arrivals would lead to superposed slopes, which looks more complex but does not add any value to the discussion.) In the examples, all frames have the same period, and the beginning of a period is marked by a vertical line. In case of different periods, it is more useful to mark the start of a hyperperiod. The state of the periodic TAS gate is indicated on the x-axis. The gate is closed during red intervals, and it is open during green intervals. As a consequence, frames are queued during red intervals and transmitted during green intervals.
We briefly interpret the behavior illustrated in Figure 3. At times $t_{0}$ and $t_{1}$, the magenta and blue frames arrive with 1000 and 500 bytes, respectively. This can be seen from the sudden increases of the queue occupancy at $t_{0}$ and $t_{1}$ by these values. They are queued due to the closed gate. The gate opens at $t_{2}$. This starts the transmission of the magenta frame, which is indicated by the continuously decreasing queue occupancy. After its completion at $t_{3}$, the blue frame is transmitted until $t_{4}$. The described behavior repeats due to the periodicity of the streams.
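As a worked illustration of this timeline, the transmission end times follow from the frame sizes and the link rate. The symbol $C$ for the link rate is introduced here for illustration only; the link rate of the example is not stated, and per-frame overhead such as preamble and inter-frame gap is ignored.

```latex
\begin{align}
t_{3} &= t_{2} + \frac{8 \cdot 1000\,\text{bit}}{C}, \\
t_{4} &= t_{3} + \frac{8 \cdot 500\,\text{bit}}{C}.
\end{align}
```

For an assumed link rate of $C = 1$ Gbit/s, this gives $t_{3} - t_{2} = 8\,\mu$s and $t_{4} - t_{3} = 4\,\mu$s.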
In the following, we illustrate the effects of faulty frames on a single link. The experiment setup is depicted in Figure 4. One or more talkers each send a stream to one listener via a single bridge. We consider the buffer occupancy on the link from the bridge to the listener. Periodic traffic is expected to arrive according to a schedule and is sent within time slots controlled by the TAS. When the remaining time of the time slot is not large enough to accommodate the entire frame, the transmission of the frame is deferred.
We consider the impact of faulty frames, i.e., additional, late, early, or missing frames on potential delay of succeeding frames. Moreover, we show how unequal frame sizes can greatly increase that impact.
Figure 5 visualizes the impact of an additional frame. A magenta frame and a blue frame are scheduled for every period. They arrive in time before their respective transmission slots and are queued for a short time. The gate opens slightly after their arrival so that the frames can be sent without significant delay. However, in the second period, an additional orange frame arrives at a time $t_{a}$ before the arrival of the magenta frame. Therefore, the orange frame is transmitted when the gate opens, and the magenta frame remains queued until the gate reopens. The magenta frame then takes the transmission slot of the blue frame. When the blue frame arrives, it is queued until the gate reopens the next time, which is the transmission slot of the magenta frame. Thus, both the magenta and the blue frame miss their slots and are sent in the transmission slots of other frames. A prerequisite for the latter is that there are at least two frames per period on the link.
In all following periods, whenever a frame arrives, there is already another frame in the queue, so the old frame is sent while the new frame must wait again. This can also be viewed as follows. Normally, there are as many transmission slots as arriving frames. In case of additional frames, that number of frames is persistently buffered in the queue. As a consequence, all future frames will suffer from that queuing delay.
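The persistent backlog can be reproduced with a small FIFO model. The sketch below is not the simulation used in this paper: the function `simulate` and all timing values are hypothetical, and each slot is assumed to fit exactly one frame.

```python
from collections import deque


def simulate(periods, arrivals_per_period, slot_starts, period_us, extra_arrivals=()):
    """FIFO egress queue in which every slot transmits exactly one waiting frame.

    Returns a list of (slot_time_us, frame_name, queuing_delay_us).
    Simplification (assumption): all frames have equal size and each slot fits one frame.
    """
    ARRIVAL, SLOT = 0, 1
    events = []
    for p in range(periods):
        base = p * period_us
        for name, offset in arrivals_per_period:
            events.append((base + offset, ARRIVAL, name))
        for offset in slot_starts:
            events.append((base + offset, SLOT, None))
    for t, name in extra_arrivals:
        events.append((t, ARRIVAL, name))
    # Process in time order; at equal times, arrivals are handled before slots.
    events.sort(key=lambda e: (e[0], e[1]))

    queue = deque()
    log = []
    for t, kind, name in events:
        if kind == ARRIVAL:
            queue.append((name, t))
        elif queue:
            sent, arrived = queue.popleft()
            log.append((t, sent, t - arrived))
    return log


# Two frames per 60 us period, each arriving 2 us before its slot.
planned = [("magenta", 0), ("blue", 30)]
slots = [2, 32]
# One additional frame arrives 1 us before the second period starts.
faulty = simulate(4, planned, slots, 60, extra_arrivals=[(59, "orange")])
for slot_time, frame, delay in faulty:
    print(f"t={slot_time:3d} us  sent {frame:8s} after {delay:2d} us in queue")
```

In this model, the queuing delay of every frame after the additional frame rises from 2 µs to 32 µs and never recovers, because one frame always remains in the queue.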
Figure 6 visualizes the impact of a late frame. Here, a magenta and a blue frame are scheduled for every period and normally arrive before two distinct transmission slots. Only in the second period, the magenta frame is late. As a consequence, the magenta frame cannot be sent within its time slot because the remaining time is not sufficient. Therefore, the frame is queued until the gate reopens. Thereby, the frame is delayed and takes the time slot of the blue frame, which in turn misses its time slot.
In the next period, the effect of the delayed frame is like the effect of an additional frame: it takes the time slot of that period so that the frame arriving in that period is again significantly delayed. This can also be viewed as follows. If a frame is delayed and the time slot for that frame cannot be used, this time slot is missing in the future so that at least one frame remains in the queue. As a consequence, all future frames will suffer from that queuing delay.
A stream is delayed if all its frames arrive late. Figure 6 visualizes the impact of a delayed stream. Here, a magenta and a blue frame are scheduled for every period, with distinct time slots that succeed their planned arrivals. However, all magenta frames arrive slightly too late so that they miss their time slot. As a consequence, they take the time slots of the blue frames, which are in turn delayed until the time slots of the later magenta frames. Below the line, we observe that magenta and blue frames are transmitted in each other's time slots. That means that all frames are delayed and arrive at the next link in a wrong time slot.
Figure 7 visualizes the impact of an early frame. Here, a magenta frame is scheduled and arrives before a blue frame so that it is transmitted first in the subsequent time slot. However, in the second period, the blue frame is early and therefore transmitted first. As a consequence, the magenta frame is transmitted afterwards so that it is slightly late. As the magenta frame will arrive slightly delayed at the next link, it may cause the problems of a late frame on that link (cf. Section V-B2).
Figure 8 visualizes the impact of a missing frame. Here, a magenta frame is scheduled and arrives before a blue frame so that it is transmitted first in the subsequent time slot. However, in the second period, the magenta frame is missing, and therefore the blue frame is transmitted as soon as the gate opens. As a consequence, the blue frame is transmitted earlier than planned. As the blue frame will arrive slightly early at the next link, it may cause the problems of an early frame on that link (cf. Section V-B4).
Figure 9 visualizes how a significantly delayed stream can cause a continuous increase of the queue occupancy in the presence of unequal frame sizes. We assume that three streams with small, medium, and large frames arrive within a period and that their time slots succeed their frame arrivals. However, the stream with the small frames is delayed such that its frames arrive between the frames with medium and large frame sizes. This can happen due to a slightly delayed stream (cf. Section V-B3). We observe that when the small frame is sent in the time slot of the large frame, the succeeding large frame waits until the time slot for the next large frame. Thereby, two time slots remain unused, which increases the queue occupancy by two frames. As this process continues, it leads to long frame delays and eventually to frame loss.
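The growing queue can be illustrated with a small model in which a frame is only sent if it fits into the remaining slot time. This is a sketch with assumed values and not a reproduction of Figure 9: the function `backlog_per_period`, the arrival offsets, the transmission times (2, 6, and 10 µs), and the slot boundaries are all hypothetical.

```python
from collections import deque


def backlog_per_period(periods, period_us, streams, slots):
    """Queue length at the end of each period for a FIFO queue behind a TAS gate.

    streams: list of (arrival_offset_us, tx_time_us), one frame per stream and period.
    slots:   list of (start_us, end_us) per period, sorted by start time.
    A frame is deferred if it does not fit into the remaining slot time.
    """
    arrivals = sorted(
        (p * period_us + offset, tx) for p in range(periods) for offset, tx in streams
    )
    queue = deque()
    nxt = 0  # index of the next frame that has not arrived yet
    backlog = []
    for p in range(periods):
        base = p * period_us
        for start, end in slots:
            t = base + start
            while True:
                # Enqueue every frame that has arrived by time t.
                while nxt < len(arrivals) and arrivals[nxt][0] <= t:
                    queue.append(arrivals[nxt][1])
                    nxt += 1
                # Send the head frame only if it fits into the rest of the slot.
                if queue and t + queue[0] <= base + end:
                    t += queue.popleft()
                else:
                    break
        # Frames that arrive after the last slot of this period.
        while nxt < len(arrivals) and arrivals[nxt][0] < base + period_us:
            queue.append(arrivals[nxt][1])
            nxt += 1
        backlog.append(len(queue))
    return backlog


slots = [(1, 3), (11, 17), (31, 41)]        # small, medium, large slot
on_time = [(0, 2), (10, 6), (30, 10)]       # each frame arrives 1 us before its slot
delayed = [(20, 2), (10, 6), (30, 10)]      # small stream arrives between medium and large
print(backlog_per_period(5, 60, on_time, slots))
print(backlog_per_period(5, 60, delayed, slots))
```

With these assumed values, the on-time schedule leaves the queue empty after every period, whereas the delayed small stream lets the backlog grow from period to period, because large frames at the head of the queue block the shorter slots.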
After showing the possible impacts of timing deviations in theory, this section simulates a small network using the network simulator OMNeT++ and its INET framework [ 30, 31 ]. The OMNeT++ simulations consist of three parts. The first is the network topology file (Network.ned), which defines the end stations, the TSN switches, and the connections between them. The second is the OMNeT++ simulation initialization file (omnetpp.ini), which contains the configuration of all end stations and switches, including the GCL for all switches and the scheduled sending times for each stream. The third is a scenario file that controls the fault generation. To enable stable fault generation, we had to apply a patch to the application generation, which we provide with all simulation files and instructions on GitHub. The simulations omit the gPTP time synchronization procedure to avoid interference between time synchronization and TSN applications, and they prefill the MAC tables of all switches to prevent broadcast messages at the start of each simulation.
To validate the theoretical results of Section V, all experiments were simulated with OMNeT++. The gate state over time as well as the transmission times were compared between the event log of the simulation and the above graphs. Both showed the same behavior. The source code for those simulations, together with a manual, is published on GitHub (https://github.com/EpplerM/TSN-Fault-Simulation).
Fig. 10 shows a simple network topology consisting of 5 bridges and 5 end stations, connected with gigabit Ethernet links. Each bridge is connected to one end station and two or three bridges. The end stations send and receive scheduled streams, and the bridges use a TAS to forward them according to a predefined schedule. Table I shows the deployed streams as well as their scheduled paths.
The streams are sent periodically with a period length of 60 $\mu$s and a frame size between 250 and 1250 bytes.
As long as all streams are sent as scheduled, all frames have a constant latency of less than 100 $\mu$s.
A fault is introduced after 10 ms by sending a single frame of stream A 10 $\mu$s later than scheduled. As a result, the frame remains in the queue in the same way as in the example in Section V-B2. Fig. 11 shows an increasing delay for all streams after the fault occurred. After less than 500 ms, all frames experience a delay of more than 10 times their regular delay. The simulation shows that, in a system using PSST with the TAS, a single delayed frame may lead to a fatal network error.
PSST in TSN depends on the TAS and the sending time at the talkers. This results in a very fragile communication, as even a small problem may result in a fatal network error.
TSN scheduled traffic is very fragile. A single problem can bring down all scheduled traffic, which results in a factory outage. TSN scheduled traffic must always be safeguarded by some additional mechanism.
Naive approaches, such as adding spare slots for faulty frames or enlarging the slots, can reduce the probability of errors but cannot solve the underlying problem.
PSFP can drop the faulty frames with a trivial schedule, which partially solves the problem. In combination with scheduling that implements frame isolation, this resolves the fragility.
TSN scheduled traffic needs Frame isolation and PSFP to ensure a working network.
Current Hardware is not capable of reliable PSFP filtering for each stream. This needs to be solved before TSN can be implemented in industrial environments.
In the future, a method must be found to protect against talking idiots and attackers. We suspect that a combination of Frame Isolation and PSFP could achieve this.
The reliability of PSST in TSN strongly depends on the TAS configuration and the precise sending times of the talkers. Consequently, TSN scheduled traffic is inherently fragile – even minor timing deviations or configuration errors can lead to severe communication failures, potentially resulting in complete factory downtime.
Therefore, TSN scheduled traffic must always be safeguarded by additional mechanisms. The PSFP mechanism can mitigate part of this fragility by dropping faulty frames using a simple schedule. When combined with scheduling strategies that implement frame isolation, the overall robustness of TSN communication can be significantly improved. Thus, ensuring reliable TSN operation requires both PSFP and frame isolation.
However, most current hardware implementations are not yet capable of performing reliable PSFP filtering for every stream. This limitation must be resolved before TSN can be deployed in safety-critical industrial environments.
Future research should focus on developing methods to protect TSN networks from both unintentional misbehavior (“talking idiots”) and deliberate attacks. We hypothesize that a combination of frame isolation and PSFP could provide an effective foundation for such protection.