面向工业 IoT 的弹性 TSN:配置与容错评估

Resilient Time-Sensitive Networking for Industrial IoT: Configuration and Fault-Tolerance Evaluation · 2025-07-15

可靠性/冗余 · 仿真与测试 · 工业网络 · CC BY,可公开对照

本页提供英文原文段落与中文逐段译稿。译稿包含自动复核状态;标记为需人工复核的段落应回到 PDF/HTML 校对公式、表格和符号。

本站范围:全文逐段对照
内容来源:本地英文段落 + 中文译稿
阅读规模:79/79 段已生成译稿
中文逐段译稿
P001已复核

确保确定性的、低时延的、并且具备容错能力的通信,对于工业自动化与控制系统至关重要。信息技术(Information Technology, IT)与运营技术(Operational Technology, OT)的融合,再加上网络物理系统的兴起,对网络基础设施提出了严格的性能与可靠性要求。尽管传统以太网仍被广泛使用,但它缺乏实时工业应用所要求的时序保证、流量隔离以及故障韧性。

术语方面,“deterministic”译为“确定性的”,“low-latency”译为“低时延”,“fault-tolerant”译为“具备容错能力”,“failure resilience”译为“故障韧性”,符合 TSN/工业网络语境。未发现数字、公式或缩写风险。逻辑为背景需求到传统以太网不足,未发现明显问题。

P002已复核

时间敏感网络(Time-Sensitive Networking, TSN)是一组 IEEE 802.1 标准,它通过精确时间同步、广义精确时间协议(Generalized Precision Time Protocol, gPTP)、时间感知整形(time aware shaping, TAS)、逐流过滤与监管、基于 VLAN 的优先级划分,以及用于冗余的帧复制与消除(Frame Replication and Elimination for Redundancy, FRER)等功能来增强以太网。这些能力共同支持确定性行为、低抖动和高可用性,使 TSN 成为工业物联网(Industrial Internet of Things, IIoT)和智能制造网络的一项关键使能技术。然而,在真实世界故障条件下评估 TSN 的有效性,仍然是一个开放的研究挑战。

IEEE 802.1、gPTP、TAS、VLAN、FRER、IIoT 等缩写均保留并扩展。术语“per-stream filtering and policing”译为“逐流过滤与监管”存在“policing”也可译为“管制/限速监管”的术语差异,但含义未偏离。未发现数字、逻辑或公式风险。

P003已复核

本文提出了一个基于 OMNeT++/INET 的模块化仿真框架,用于建模一个配备 TSN 功能的、现实化的、多单元工业网络。我们的仿真框架包括可配置的流量模式、基于虚拟局域网(Virtual Local Area Network, VLAN)的流优先级划分、基于 gPTP 的同步、逐流过滤,以及使用 FRER 的流冗余。关键在于,该框架集成了基于 XML 的故障场景,以模拟链路级和单元级中断,并评估在有冗余和无冗余情况下 TSN 的恢复行为。本文的主要贡献包括:• 现实化工业 TSN 建模:在 OMNeT++ 中实现了一个包含两个生产单元 [ 1 ] 的全规模工业网络,集成了时间同步、流过滤、流量整形和冗余等 TSN 机制。• 基于场景的故障注入:使用定时 XML 脚本对链路和节点故障进行建模,从而能够在受控中断条件下评估 FRER 和故障切换机制。• 韧性评估:该框架在基线 TSN 配置和冗余增强型 TSN 配置下,提供包括时延、抖动和恢复时间在内的详细性能洞察。

术语“multi-cell industrial network”译为“多单元工业网络”,“cell-level disruptions”译为“单元级中断”,可能也可按制造语境译为“生产单元级中断”。引用 [ 1 ]、OMNeT++/INET、XML、FRER 均保留。输入段落中贡献项后续 P004-P006 又单独重复出现,本文按当前段落完整翻译,未合并。未发现数字、逻辑或公式风险。

P004已复核

现实化工业 TSN 建模:在 OMNeT++ 中实现了一个包含两个生产单元 [ 1 ] 的全规模工业网络,集成了时间同步、流过滤、流量整形和冗余等 TSN 机制。

与 P003 中贡献项内容重复,但这是输入段落的独立段落,已单独输出。引用 [ 1 ] 和“两个生产单元”准确保留。术语“full-scale”译为“全规模”可接受。未发现明显问题。

P005已复核

基于场景的故障注入:使用定时 XML 脚本对链路和节点故障进行建模,从而能够在受控中断条件下评估 FRER 和故障切换机制。

“timed XML scripts”译为“定时 XML 脚本”,“failover mechanisms”译为“故障切换机制”,符合网络容错语境。FRER 缩写保留。未发现数字、公式或逻辑风险。

P006已复核

韧性评估:该框架在基线 TSN 配置和冗余增强型 TSN 配置下,提供包括时延、抖动和恢复时间在内的详细性能洞察。

“baseline”译为“基线”,“redundancy-enhanced”译为“冗余增强型”,指标 latency、jitter、recovery time 分别译为“时延、抖动、恢复时间”。未发现明显问题。

P007已复核

本文其余部分的结构如下:第二节回顾工业系统中 TSN 的相关工作。第三节概述系统架构。第四节详细介绍仿真设置,包括拓扑、流量建模和故障注入。第五节给出四种场景下的结果。第六节总结全文并概述未来方向。

Section II-VI 对应译为第二节至第六节;“four scenarios”译为“四种场景”。未发现术语、数字、逻辑或公式风险。

P008已复核

TSN 已经成为工业自动化、汽车以及其他时延敏感领域中确定性以太网通信的关键使能技术。TSN 由 IEEE 802.1 工作组定义,纳入了时间同步(IEEE 802.1AS)、时间感知调度(IEEE 802.1Qbv)、逐流过滤与监管(IEEE 802.1Qci),以及用于冗余的帧复制与消除(FRER, IEEE 802.1CB)等核心功能,以确保关键流量流的低时延、有界抖动和高可用性 [ 2 ]。这些机制使以太网能够满足工业物联网(IIoT)应用的严格要求,取代传统现场总线和专有实时解决方案 [ 3 ]。

标准号 IEEE 802.1AS、802.1Qbv、802.1Qci、802.1CB 准确保留。术语“bounded jitter”译为“有界抖动”,“critical traffic flows”译为“关键流量流”,后者略直译但技术含义准确。引用 [ 2 ]、[ 3 ] 保留。未发现明显问题。

P009已复核

大量研究已经证实,TSN 在混合关键性环境中能够降低抖动并改善确定性行为 [ 4 ]。基于仿真的研究,尤其是使用 OMNeT++ 和 INET 框架的研究,已经验证了 TSN 在工业场景中用于流量优先级划分、调度和可靠性方面的有效性 [ 5 ]。容错性是工业通信的基石,在 TSN 中通常通过冗余流以及 FRER 等快速故障切换技术来实现 [ 6 ]。这些功能在工厂网络中尤其相关,因为在此类网络中,正常运行时间和可预测性能至关重要。

“mixed-criticality environments”译为“混合关键性环境”,符合实时系统常用译法。“uptime”译为“正常运行时间”。引用 [ 4 ]、[ 5 ]、[ 6 ] 保留。未发现数字、公式或逻辑风险。

P010已复核

尽管 TSN 在汽车网络中已经获得关注 [ 7 ],但其在工业自动化中的部署,尤其是在现实化故障条件下的部署,仍然是一个开放的研究挑战。近期工作考察了 TSN 与 PROFINET 和 EtherCAT 等遗留协议的集成 [ 8 ],并提出了用于优化时延和带宽使用的新调度策略 [ 9 ]。除调度之外,若干研究聚焦于 TSN 和无线 TSN 系统中的容错性与冗余。Sudhakaran 等人 [ 10 ] 提出了一种动态调度方法,该方法利用无线 TSN 冗余为移动机器人实现零延迟漫游,并表明 FRER 能够在移动性条件下维持实时保证。Jover 等人 [ 11 ] 通过基于仿真的配置洞察,分析了 TSN 网络中可靠性与冗余成本之间的权衡。在无线方面,Cena 等人 [ 12 ] 提出了用于高可靠性 Wi-Fi 的无缝冗余技术,利用 Wi-Fi 7 多链路操作(Multi-Link Operation, MLO)来实现类似 FRER 的复制与去重。这些研究通过处理先进的故障处置方法,对我们的研究形成补充。然而,大多数现有工作聚焦于静态性能或孤立故障事件,忽视了具有同步工业流量的复杂故障注入场景。我们的工作通过引入一个场景驱动的仿真框架(IN2C)来弥补这一缺口,该框架对工业生产单元、实时流量和协调故障事件进行建模。它评估了四种不同的故障配置,每一种配置均分别在有 FRER 和无 FRER 的情况下进行评估,以便从时延、抖动和恢复时间方面量化 TSN 的故障恢复能力。本文的新颖性在于:• 使用 XML 触发的场景和受真实世界启发的生产逻辑,在分层工业流量流上进行全面故障注入;• 对有冗余和无冗余的 TSN 进行比较评估,从而能够定量洞察 FRER 和逐流保护在故障条件下的影响;• 多角色流量建模(传感器-执行器-PLC-SCADA-边缘单元-HMI),反映出现有 TSN 研究中很少以这种详细程度结合在一起的现实 IIoT 交互。

引用 [ 7 ] 至 [ 12 ]、PROFINET、EtherCAT、FRER、Wi-Fi 7、MLO、IN2C、PLC、SCADA、HMI 均保留。“zero-delay roaming”译为“零延迟漫游”,可能在无线移动机器人语境中也可译为“零时延漫游”。“per-flow protection”译为“逐流保护”。“sensor-actuator-PLC-SCADA-Edge unit-HMI”按链式角色翻译,保留关键缩写。未发现数字或公式风险;该段较长且包含项目符号,已按原逻辑保留。

P011已复核

使用由 XML 触发的场景以及受真实世界启发的生产逻辑,对分层工业流量流进行全面故障注入;

术语“fault injection”译为“故障注入”,“layered industrial traffic flows”译为“分层工业流量流”;XML、场景触发、生产逻辑等限定信息均已保留。未发现明显问题。

P012已复核

对具有冗余和不具有冗余的 TSN 进行比较评估,从而能够定量洞察 FRER 以及故障条件下按流保护的影响;

TSN、FRER 均保留缩写;“with and without redundancy”译为“具有冗余和不具有冗余”;“per-flow protection under failure”译为“故障条件下按流保护”。未发现明显问题。

P013已复核

多角色流量建模(传感器-执行器-PLC-SCADA-边缘单元-HMI),反映了现实的 IIoT 交互,而在现有 TSN 研究中,这些交互很少以这种详细程度结合在一起。

sensor、actuator、PLC、SCADA、Edge unit、HMI 的角色链已完整保留;“rarely combined at this level of detail”译为“很少以这种详细程度结合在一起”。未发现明显问题。

P014已复核

这些贡献加深了对工业环境中 TSN 韧性的理解,并为未来的 TSN 扩展和工业调度策略提供了一个可复用的仿真测试平台。

“resilience”按论文语境译为“韧性”;“reusable simulation testbed”译为“可复用的仿真测试平台”;因果和用途关系已保留。未发现明显问题。

P015已复核

本工作中设计的工业 TSN 网络对一个具有两个生产单元和集中式基础设施的现实智能工厂进行建模,在有线 TSN 链路上纳入了感知、执行、控制、检测、SCADA、HMI 和边缘计算单元。本节概述了网络架构、流量模型、节点配置、故障注入方法以及性能指标,这些内容用于评估系统在正常条件和故障条件下的行为。

“two production cells”译为“两个生产单元”;“centralized infrastructure”译为“集中式基础设施”;功能列表和评估条件均已完整保留。未发现明显问题。

P016已复核

该工业网络采用模块化、分层架构,旨在仿真现实的工业自动化环境。它由两个独立的生产单元和一个集中式基础设施组成;每个生产单元都能够自主运行,而集中式基础设施负责监督、监测和边缘处理。网络拓扑包括单元级 TSN 交换机、中央汇聚交换机,以及为冗余和时间同步而配置的互连链路。

“modular, hierarchical architecture”译为“模块化、分层架构”;“supervision, monitoring, and edge processing”分别译为“监督、监测和边缘处理”;冗余和时间同步配置均已保留。未发现明显问题。

P017需人工复核

每个生产单元集成五个关键功能单元:感知单元、执行单元、机械臂、检测系统和控制单元(PLC)。这些单元在表 I 中有详细说明,并通过两个具备 TSN 能力的交换机(SwitchA_X 和 SwitchB_X,其中 X 表示单元索引)进行双归属连接,从而形成一个容错的本地子网络。这种双交换机设计允许通过 centralSwitch_1、centralSwitch_2 和 centralSwitch_3 朝向中央基础设施提供冗余转发路径。骨干网由在核心层和汇聚层互连交换机的 1-Gbps 以太网链路组成。这些包括:centralSwitch_1:连接两个生产单元的核心交换机。centralSwitch_2 和 centralSwitch_3:与 SCADA、HMI、边缘单元以及质量控制交换机(qcSwitch_1、qcSwitch_2)互连。

原文包含疑似 PDF/LaTeX 识别残留,如 `c ⁢ e ⁢ n...`、斜体字符说明和重复的 `centralSwitch\_1` 等;译文按可识别的设备名 centralSwitch_1、centralSwitch_2、centralSwitch_3、qcSwitch_1、qcSwitch_2 处理。数字“五个”“两个”“1-Gbps”已保留;因存在公式/排版识别噪声和列表上下文残缺风险,需人工复核。

P018需人工复核

centralSwitch_1:连接两个生产单元的核心交换机。

原文包含疑似 PDF/LaTeX 识别残留和重复字符序列;译文保留可识别设备名 centralSwitch_1 及其功能说明。因存在识别噪声,需人工复核。

P019需人工复核

centralSwitch_2 和 centralSwitch_3:与 SCADA、HMI、边缘单元以及质量控制交换机(qcSwitch_1、qcSwitch_2)互连。

原文包含疑似 PDF/LaTeX 识别残留和重复字符序列;译文保留 centralSwitch_2、centralSwitch_3、SCADA、HMI、Edge unit、qcSwitch_1、qcSwitch_2 等关键术语与设备名。因存在识别噪声,需人工复核。

P020已复核

终端设备(例如 PLC、传感器、执行器)通过 100-Mbps 链路连接,表示工厂部署中常见的边缘连接。图 1 展示了该架构。所有交换机和主机都在 TSNBase 配置下配置了 TSN 能力。表 II 总结了网络中实现的 TSN 特性,这些特性是在基于 INET 的 `.ini` 文件中配置的。

100-Mbps、Figure 1、TSNBase、Table II、INET-based `.ini` 文件均已保留;“edge connections commonly found in factory deployments”译为“工厂部署中常见的边缘连接”。未发现明显问题。

P021已复核

这种架构设计为在现实的工业环境中评估 TSN 性能提供了一个稳健且灵活的测试平台,使研究者能够在故障条件下对时延、抖动以及恢复过程进行详细考察。

术语 TSN、latency、jitter、recovery 已按网络语境译为“TSN、时延、抖动、恢复过程”。“realistic industrial setting”译为“现实的工业环境”,语义完整;未涉及数字、公式或缩写展开风险。未发现明显问题。

P022已复核

该工业网络对一组多样化的业务流进行建模,这些业务流反映了工业自动化系统的通信需求。这些业务流按照源-目的角色、通信模式(周期性与事件驱动)、数据包大小以及关键性等级进行分类。TSN 机制,包括流识别以及基于 Priority Code Point(PCP)的优先级划分,被配置为确保高优先级业务即使在网络压力或故障下也能获得确定性服务。表 III 总结了仿真中建模的关键业务类型,以及它们关联的 PCP 值和时序约束。

“traffic streams”译为“业务流”,符合 TSN/工业网络语境;“source-destination roles”译为“源-目的角色”,保留原有限定;“periodic vs. event-driven”译为“周期性与事件驱动”;Priority Code Point 和 PCP 已保留。表号 Table III 译为“表 III”,数字与引用未遗漏。未发现明显问题。

P023已复核

所有业务流都使用 PCP 值进行标记和编码,具体由配置中的逐流映射定义。TSN 交换机按每条流执行入口分类、监管和整形,以实施带宽控制和突发控制。

“per-stream mapping”译为“逐流映射”,“ingress classification, policing, and shaping per stream”译为“按每条流执行入口分类、监管和整形”,术语较贴近网络 QoS/TSN 语境;PCP 缩写保留。无数字、公式风险。未发现明显问题。

P024需人工复核

为了评估 TSN 网络的故障韧性,引入了两类故障场景:• S1A1/S1A2 模拟单元中部链路故障(SwitchA_1 - SwitchB_1),• S2A1/S2A2 模拟单元断连(centralSwitch_2 - 边缘单元和 SCADA)。

原文段落中包含项目符号且后续 P025、P026 又分别重复列出两个故障场景,可能是 PDF 抽取导致的重复或列表拆分;本段仍按输入完整翻译。设备名 SwitchA_1、SwitchB_1、centralSwitch_2 和 SCADA 保留;“cell”按工业单元语境译为“单元”。由于列表上下文和重复段落存在抽取风险,建议人工核对。

P025需人工复核

S1A1/S1A2 模拟单元中部链路故障(SwitchA_1 - SwitchB_1),

本段似乎是 P024 中项目符号列表的单独抽取行,且末尾为逗号,句子不完整但与输入一致。场景编号和链路端点保留无误;“mid-cell link failure”译为“单元中部链路故障”。因段落可能由列表拆分产生,且末尾标点显示上下文不完整,需人工复核。

P026需人工复核

S2A1/S2A2 模拟单元断连(centralSwitch_2 - 边缘单元和 SCADA)。

本段同样可能是 P024 中项目符号列表的单独抽取行。centralSwitch_2 和 SCADA 保留;“Edge unit”译为“边缘单元”。未发现数字或缩写错误,但存在表格/列表上下文拆分风险。

P027需人工复核

表 IV 总结了这些故障场景及其恢复逻辑。以下指标用于评估故障事件期间的网络行为:• 端到端时延:从源到目的地的传输延迟。• 抖动:数据包延迟随时间的变化性。• 丢包:正常运行和故障运行期间被丢弃的数据包数量。• 恢复时间:故障之后重新建立流交付所需的时间。

表号 Table IV 已保留为“表 IV”。四个指标 End-to-End Latency、Jitter、Packet Loss、Recovery Time 已逐项翻译,定义中的来源、目的地、时间变化、正常/故障运行、故障后恢复均未遗漏。原文包含项目符号且后续 P028-P030 又拆出部分指标定义,存在列表重复抽取风险,需人工核对上下文。

P028已复核

端到端时延:从源到目的地的传输延迟。

“End-to-End Latency”译为“端到端时延”,“Transmission delay from source to destination”译为“从源到目的地的传输延迟”,语义准确。该段可能是 P027 指标列表中的单独抽取项,但本段本身完整。未发现明显问题。

P029已复核

抖动:数据包延迟随时间的变化性。

“Jitter”译为“抖动”,定义“Variability in packet delay over time”译为“数据包延迟随时间的变化性”,技术含义完整。该段可能来自 P027 的拆分列表项,但内容完整。未发现明显问题。

P030已复核

丢包:正常运行和故障运行期间被丢弃的数据包数量。

“Packet Loss”译为“丢包”;“normal and faulted operation”译为“正常运行和故障运行期间”,保留了两种运行状态;无数字、公式或缩写风险。该段可能来自 P027 的拆分列表项,但内容完整。未发现明显问题。

P031已复核

恢复时间:故障发生后重新建立流交付所需的时间。

术语“Recovery Time”译为“恢复时间”准确;“stream delivery”译为“流交付”保留 TSN 语境。未发现明显问题。
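
下面是一个示意性的最小草图,用来说明上述四项指标可以如何从收发时间戳中计算。函数名 `stream_metrics`、输入的数据结构,以及“恢复时间 = 故障后首个成功到达的数据包的到达时刻减去故障时刻”这一具体定义,都是为了说明而做的假设,并非论文或 OMNeT++ 的实际统计方式;抖动在这里按时延的标准差计算,与文中以标准差报告抖动的做法相对应。

```python
from statistics import mean, median, pstdev


def stream_metrics(sent, received, fault_time):
    """根据收发时间戳计算一条流的指标(假设性的示意实现)。

    sent:      {序号: 发送时刻(秒)}
    received:  {序号: 接收时刻(秒)},丢失的数据包不出现在其中
    fault_time: 故障注入时刻(秒)
    """
    # 端到端时延:每个已到达数据包的接收时刻减去发送时刻
    latencies = [received[seq] - sent[seq] for seq in sorted(received)]
    # 丢包:已发送但从未到达的数据包数量
    lost = len(sent) - len(received)
    # 恢复时间(假设的定义):故障后发送且成功到达的首个数据包的到达时刻 - 故障时刻
    arrivals_after_fault = [received[seq] for seq in received if sent[seq] >= fault_time]
    recovery = min(arrivals_after_fault) - fault_time if arrivals_after_fault else None
    return {
        "latency_mean": mean(latencies),
        "latency_median": median(latencies),
        "jitter_std": pstdev(latencies),  # 抖动:时延的总体标准差
        "lost": lost,
        "loss_ratio": lost / len(sent),
        "recovery_time": recovery,
    }


# 虚构的示例数据:4 个数据包中序号 2 在故障期间丢失
example = stream_metrics(
    sent={0: 0.0, 1: 1.0, 2: 2.0, 3: 3.0},
    received={0: 0.5, 1: 1.5, 3: 3.5},
    fault_time=1.5,
)
```

该草图假设至少有一个数据包成功到达;若全部丢失,`mean` 会抛出异常,实际使用时需要另行处理。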

P032已复核

IN2C 仿真框架使用 INET 框架在 OMNeT++ 中实现,并通过 TSN 专用模块进行扩展,以仿真工业通信行为。该仿真对两个同步的生产单元以及集中式基础设施进行建模,捕获传感器、PLC、机械臂、检测单元、SCADA、边缘单元和 HMI 系统之间的交互。INET 的 TSN 扩展,包括时间感知交换机、gPTP 时钟、逐流过滤以及流量整形,使确定性和容错通信建模成为可能。

OMNeT++、INET、TSN、PLC、SCADA、Edge unit、HMI、gPTP 等缩写和专名已保留;“production cells”译为“生产单元”符合工业语境;“per-stream filtering”译为“逐流过滤”准确。未发现明显问题。

P033需人工复核

所有节点都根据其角色,由 INET 的 TsnDevice 和 StandardHost 模板构建。每个功能单元(例如 Sensing_1、Control_1、Inspection_2)包括:带有振荡器漂移的本地时钟模块、用户数据报协议(UDP)流量生成器和接收端、带有 PCP 标签映射的流编码器,以及用于 TSN 运行的桥接与调度逻辑。

“sinks”译为“接收端”可接受,也可译为“流量接收器”;PCP-tag mapping 保留为“PCP 标签映射”。原文中该段包含项目符号,且后续 P034-P037 似乎重复拆分这些项目,存在版面抽取重复风险,但本段本身可完整翻译。

P034需人工复核

带有振荡器漂移的本地时钟模块,

该段是列表项片段,末尾逗号来自原文;与 P033 内容重复,可能是 PDF/文本抽取导致的列表重复或表格上下文拆分。术语无明显问题。

P035需人工复核

用户数据报协议(UDP)流量生成器和接收端,

“traffic generators and sinks”译为“流量生成器和接收端”准确;该段为列表项片段,且与 P033 内容重复,存在抽取上下文问题。缩写 UDP 已保留。

P036需人工复核

带有 PCP 标签映射的流编码器,

“Stream coders”译为“流编码器”准确;PCP-tag mapping 保留关键缩写。该段为列表项片段,且与 P033 内容重复,需结合原版排版确认。

P037需人工复核

用于 TSN 运行的桥接与调度逻辑。

术语“bridging and scheduling logic”译为“桥接与调度逻辑”准确;该段为列表项片段,且与 P033 内容重复,可能存在抽取重复。未发现术语错误。

P038已复核

单元交换机(SwitchA_X、SwitchB_X)和汇聚交换机(centralSwitch_X、qcSwitch_X)使用 TsnSwitch 类型,并按配置启用 FRER、流过滤和流量整形。

“Cell switches”译为“单元交换机”,“aggregation switches”译为“汇聚交换机”准确;FRER、TsnSwitch 和设备名已保留;“as configured”译为“按配置”未省略限定。未发现明显问题。

P039已复核

主时钟(masterClock)作为全局时间参考运行,并通过 gPTP 在整个拓扑中分发同步。每个节点都将其本地时钟对齐到该时间源,从而确保时间敏感流具有一致的传输时序。

“global time reference”译为“全局时间参考”准确;“coherent transmission timing”译为“一致的传输时序”符合上下文;gPTP 和 masterClock 已保留。未发现明显问题。
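
为了直观理解“带漂移的本地时钟周期性对齐到主时钟”的效果,下面给出一个高度简化的草图。它只模拟“每次同步时把偏移清零”,并不实现 gPTP 的报文交换、路径时延测量或频率校正;`simulate_clock` 这一函数名以及 100 ppm 漂移、125 ms 同步间隔都是示例性假设,并非论文的配置。

```python
def simulate_clock(drift_ppm, sync_interval_s, duration_s, step_s=0.001):
    """返回本地时钟相对主时钟的最大绝对偏移(秒)。

    简化假设:振荡器漂移恒定;每次同步时偏移被直接清零。
    """
    offset = 0.0
    max_offset = 0.0
    steps = int(round(duration_s / step_s))
    sync_every = int(round(sync_interval_s / step_s))
    for i in range(1, steps + 1):
        # 每个时间步内,本地时钟因漂移累积的偏差
        offset += step_s * drift_ppm * 1e-6
        max_offset = max(max_offset, abs(offset))
        # 到达同步点:对齐到主时钟(此处简化为偏移清零)
        if i % sync_every == 0:
            offset = 0.0
    return max_offset


# 示例取值(假设):100 ppm 漂移,每 125 ms 同步一次,仿真 1 秒
worst_offset = simulate_clock(drift_ppm=100, sync_interval_s=0.125, duration_s=1.0)
```

在这些假设下,最大偏移约等于“漂移率 × 同步间隔”,即同步越频繁,节点间的时序偏差越小。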

P040已复核

工作流遵循事件驱动模型:初始化:配置节点时钟,实例化流量应用,应用流编码器,并建立同步。流量生成:应用模块生成周期性或事件驱动的 UDP 流,这些流具有预先配置的数据包大小、间隔和目的地。调度:时间感知和基于信用的整形模块在有界时延约束下管理排队和分发。故障处理:基于 XML 的 ScenarioManager 脚本触发链路或节点故障(例如 linkfailure.xml 或 cellfailure.xml)。冗余/恢复:如果启用了冗余,则基于 FRER 的故障切换会自动介入。否则,通信可能停滞。指标收集:在整个仿真过程中,OMNeT++ 记录时延、抖动、丢包以及故障恢复持续时间。

“event-driven model”“Initialization”“Traffic Generation”“Scheduling”“Failure Handling”“Redundancy/Recovery”“Metrics Collection”均逐项翻译;XML、ScenarioManager、FRER、OMNeT++、UDP 及文件名已保留;“bounded latency constraints”译为“有界时延约束”准确。该段包含多个列表项,未见公式残缺,但版面为列表压缩成单段。

P041已复核

初始化:配置节点时钟,实例化流量应用,应用流编码器,并建立同步。

术语“Initialization”译为“初始化”准确;“node clocks”“traffic applications”“stream encoders”“synchronization”均已保留技术含义。未发现明显问题。

P042已复核

流量生成:应用模块生成周期性或事件驱动的 UDP 流,这些流具有预先配置的数据包大小、发送间隔和目的地。

“cyclic or event-driven UDP streams”译为“周期性或事件驱动的 UDP 流”准确;“packet size, interval, and destination”三个配置项未遗漏。未发现明显问题。

P043已复核

调度:时间感知整形和基于信用的整形模块在有界时延约束下管理排队和发送。

“Time-aware”译为“时间感知”,“credit-based shaping”译为“基于信用的整形”;“bounded latency constraints”译为“有界时延约束”。“dispatch”译为“发送”符合网络调度语境。未发现明显问题。

P044已复核

故障处理:基于 XML 的 ScenarioManager 脚本触发链路/节点故障,例如 `linkfailure.xml` 或 `cellfailure.xml`。

保留了 ScenarioManager、XML 文件名;“link/node failures”译为“链路/节点故障”准确。未发现明显问题。

P045已复核

冗余/恢复:如果启用了冗余,则基于 FRER 的故障切换会自动介入。否则,通信可能会停滞。

FRER 缩写保留;“failover automatically engages”译为“故障切换会自动介入”准确;“may stall”表达为“可能会停滞”,保留不确定性。未发现明显问题。

P046已复核

指标收集:在整个仿真过程中,OMNeT++ 会记录时延、抖动、丢包以及故障恢复持续时间。

“delay, jitter, packet drops, and fault recovery durations”四类指标均已保留;OMNeT++ 名称准确。未发现明显问题。

P047已复核

同步由 IEEE 802.1AS 驱动。每个交换机都会向下游转发同步消息,以保持精确的偏移和频率对齐。TSN 流量调度使用始终开启的 TAS 门控以及按流的基于信用的整形。该配置是有意选择的,其目的是通过避免时间感知门控调度所引入的额外复杂性,来隔离整形机制和冗余机制的影响。

IEEE 802.1AS、TAS 等标准/缩写已保留;“always-open TAS gates”译为“始终开启的 TAS 门控”;因果逻辑“通过避免……来隔离……”已保留。未发现明显问题。

P048已复核

数据流根据源地址和 PCP 进行过滤与计量。故障通过计划好的断开/连接事件在仿真时间中引入。冗余配置使用 FRER 树和数据包复制来触发备用路径。这种模块化、事件驱动的结构,使得研究者能够在现实工厂网络条件下,对流量整形、同步、流冗余以及恢复行为进行受控实验。

PCP、FRER 缩写已保留;“scheduled disconnect/connect events”译为“计划好的断开/连接事件”;“controlled experimentation”译为“受控实验”。未发现明显问题。

P049已复核

本节分析 IN2C TSN 网络在影响链路和单元连接性的故障条件下的性能。评估了四种场景,以考察冗余对时延、抖动、丢包和恢复时间的影响。结果反映了 TSN 机制在智能制造环境中遭受现实扰动时的表现。

“cell connectivity”译为“单元连接性”,结合后文生产单元语境合理;四项指标 latency、jitter、packet loss、recovery time 均未遗漏。未发现明显问题。

P050需人工复核

评估基于 IN2C 仿真框架 [13],该框架使用 OMNeT++ 并结合 INET-TSN 扩展开发。该网络如图 2 所示,由两个支持 TSN 的生产单元组成,这两个生产单元通过 1-Gbps 骨干网连接到集中式基础设施,包括 SCADA、Edge unit 和 HMI。终端设备通过 100-Mbps 链路进行通信。每次仿真都会捕获初始化、稳态、故障和恢复阶段的行为。在这一语境下,定义了四种场景:• S1A1 / S1A2:模拟在第 4 秒时 SwitchA_1 与 SwitchB_1 之间发生链路故障。S1A2,如图 3 所示,启用 FRER;S1A1 不启用。运行时间:10 s。• S2A1 / S2A2:通过在第 2 秒断开 centralSwitch_2 与 Edge unit 和 SCADA 的连接,模拟单元隔离。S2A2,如图 4 所示,使用 FRER 实现冗余。运行时间:5 s。

数字和单位包括 1-Gbps、100-Mbps、第 4 秒、第 2 秒、10 s、5 s 均已保留;SCADA、Edge unit、HMI、FRER、OMNeT++、INET-TSN 等缩写/名称已保留。原文中运行时间出现“OCR/公式重复”形式,如 `10 ⁢ s 10 𝑠 10s 10 italic_s` 和 `5 ⁢ s 5 𝑠 5s 5 italic_s`,译文按语义整理为“10 s”和“5 s”,但该处存在识别噪声,需人工核对版面原文。

P051需人工复核

S1A1 / S1A2:在第 4 秒模拟 SwitchA_1 与 SwitchB_1 之间的链路故障。S1A2(图 3)启用 FRER;S1A1 不启用。运行时间:10 s。

术语 FRER 保留为缩写;节点名 SwitchA_1、SwitchB_1 保持原样;数字 4 秒与 10 s 已保留。原文运行时间处存在重复 OCR/公式识别内容,已按语义整理为 10 s。

P052需人工复核

S2A1 / S2A2:通过在第 2 秒断开 centralSwitch_2 与 Edge 单元和 SCADA 的连接,模拟单元隔离。S2A2(图 4)使用 FRER 实现冗余。运行时间:5 s。

centralSwitch_2、Edge、SCADA、FRER 均按术语或模块名保留;数字 2 秒与 5 s 已保留。原文运行时间处存在重复 OCR/公式识别内容,已按语义整理为 5 s。

P053已复核

故障通过基于 XML 的 ScenarioManager 脚本引入,这些脚本会显式断开模块之间的以太网门,并在之后恢复这些门。不使用节点级故障;相反,链路级扰动用于模拟实际故障情形,例如电缆损坏或汇聚交换机故障。

XML、ScenarioManager、Ethernet gates 的含义已保留并译为“以太网门”;“disconnect and later restore”的先后逻辑已保留;节点级与链路级故障区分明确。未发现明显问题。
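
上述“先断开、后恢复”的链路级故障可以抽象为一张带时间戳的事件表。下面的草图只用 Python 表达这一时间线,并不是 INET ScenarioManager 的 XML 语法;`LinkOutage`、`links_down` 以及节点名 `Edge`、`SCADA` 的写法都是说明用的假设,断开与恢复时刻取自文中对 S1(4 s 至 6 s)和 S2(2 s 至 3 s)的描述。

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkOutage:
    scenario: str   # 场景组:S1 或 S2
    link: tuple     # 被断开的链路两端(名称写法为假设)
    down_at: float  # 断开时刻(秒)
    up_at: float    # 恢复时刻(秒)


OUTAGES = [
    LinkOutage("S1", ("SwitchA_1", "SwitchB_1"), 4.0, 6.0),
    LinkOutage("S2", ("centralSwitch_2", "Edge"), 2.0, 3.0),
    LinkOutage("S2", ("centralSwitch_2", "SCADA"), 2.0, 3.0),
]


def links_down(scenario, t):
    """返回某场景在仿真时刻 t 处于断开状态的链路列表。"""
    return [
        o.link
        for o in OUTAGES
        if o.scenario == scenario and o.down_at <= t < o.up_at
    ]
```

例如 `links_down("S1", 5.0)` 返回 SwitchA_1 与 SwitchB_1 之间的链路,而在 6 s 恢复之后返回空列表。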

P054已复核

冗余机制,包括用于冗余的帧复制与消除(Frame Replication and Elimination for Redundancy,IEEE 802.1CB),仅在 S1A2 和 S2A2 中通过 StreamRedundancyConfigurator 激活。门控控制和时钟同步分别使用 EagerGateScheduleConfigurator,以及通过集中式 masterClock 实现的符合 IEEE 802.1AS 的 gPTP。完整的仿真源代码、场景和 TSN 配置已在 [13] 公开提供。

IEEE 802.1CB、IEEE 802.1AS、gPTP、StreamRedundancyConfigurator、EagerGateScheduleConfigurator、masterClock 均已保留;“only in S1A2 and S2A2”的限定已保留;引用 [13] 已保留。未发现明显问题。

P055需人工复核

图 5 展示了 S1 流在 SwitchA_1 与 SwitchB_1 之间发生链路故障时的性能,该故障在 t = 4 s 注入,并在 t = 6 s 解决。在非冗余设置(S1A1)中,故障发生后时延变得高度可变,其中位数为 24.01 μs,且分布范围较宽(均值为 18.93 μs)。抖动显著增加,其标准差为 6.29 μs,原因是不稳定的重传以及路径恢复效应。尽管存在物理备用路径,但由于它们未包含在活动流配置中,该流容易受到该故障影响。

t = 4 s、t = 6 s、24.01 μs、18.93 μs、6.29 μs 均已保留;“median 高于 mean”的统计关系按原文保留;“alternate paths”译为“备用路径”;“active stream configuration”的限定已保留。原文公式/单位处有重复识别内容,已按语义整理。

P056需人工复核

相比之下,冗余配置(S1A2)使用 FRER 将流副本预先分发到不相交路径上。这消除了动态重路由或恢复信令的需求。正如预期的那样,尽管存在故障,时延仍始终保持有界,不过其中位数略微上升至 29.81 μs。这一增加源于重复帧处理引入的缓冲时延。该权衡带来了更稳定的平均时延(27.43 μs)和增强的容错能力。由于 FRER 机制确保接收端的第一个有效帧会被无延迟交付,从而完全屏蔽该故障,抖动仍受到控制(标准差 8.13 μs)。

FRER、S1A2、29.81 μs、27.43 μs、8.13 μs 已保留;“disjoint paths”译为“不相交路径”;“first valid frame”译为“第一个有效帧”;“without delay”已保留为“无延迟”。原文单位处有重复识别内容,已按语义整理。
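
“接收端交付第一个有效帧、丢弃其后到达的副本”这一行为,可以用基于序号的去重来示意。下面是一个简化草图:`SequenceRecovery` 类名和 `history` 窗口参数均为假设,它忽略了序号回绕等细节,不等同于 IEEE 802.1CB 规定的恢复算法,也不是 INET 的实现。

```python
class SequenceRecovery:
    """按序号消除重复帧的简化示意(假设性实现)。"""

    def __init__(self, history=16):
        self.history = history  # 记住最近多少个序号(假设的窗口大小)
        self.seen = set()
        self.highest = -1

    def accept(self, seq):
        """首次见到的序号返回 True(交付),重复或过旧的序号返回 False(丢弃)。"""
        if seq in self.seen or seq <= self.highest - self.history:
            return False
        self.seen.add(seq)
        self.highest = max(self.highest, seq)
        # 只保留窗口内的序号,避免集合无限增长
        self.seen = {s for s in self.seen if s > self.highest - self.history}
        return True


# 虚构的到达序列:路径 A 在序号 1 之后中断,路径 B 持续送达
arrivals = [("A", 0), ("B", 0), ("A", 1), ("B", 1), ("B", 2), ("B", 3)]
recovery = SequenceRecovery()
delivered = [seq for _path, seq in arrivals if recovery.accept(seq)]
```

在这个例子中,路径 A 中断后序号 2、3 仍经路径 B 交付,接收端看到的序列没有缺口,这对应文中所说的“屏蔽故障”。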

P057需人工复核

图 6 展示了 S2 流在单元级故障下的情况,其中 centralSwitch_2 在 t = 2 s 时与 Edge 和 SCADA 单元断开连接,并在 t = 3 s 时重新连接。

S2、centralSwitch_2、Edge、SCADA、t = 2 s、t = 3 s 均已保留;“cell-level fault”译为“单元级故障”,与前文“cell isolation”一致。原文公式时间处有重复识别内容,已按语义整理。

P058需人工复核

在非冗余情形(S2A1)中,时延急剧飙升,其中位数为 51.28 μs,峰值超过 90 μs。抖动保持在中等水平(标准差 17.32 μs),反映出一旦故障得到解决,恢复行为是一致的。

S2A1、51.28 μs、90 μs、17.32 μs 均已保留;“latency spikes sharply”和“jitter remains moderate”的对比已保留;“once the fault is resolved”的条件逻辑已保留。原文单位处有重复识别内容,已按语义整理。

P059需人工复核

相比之下,采用 FRER 的 S2A2 呈现出更复杂的特征。尽管中位时延略微改善至 45.99 μs,但均值增加至 63.27 μs,并且尾部变得明显:时延最高达到 879 μs,抖动在 -224 到 +224 μs 之间摆动。这表明,尽管 FRER 屏蔽了故障本身,但即使在正常运行下,它也引入了显著的交付可变性。centralSwitch_1 与 centralSwitch_3 之间的拥塞很可能由重复流转发导致,它放大了路径不平衡,并造成较大的抖动波动。这些结果表明,FRER 并非普遍有益。虽然它提供故障切换保护,但对于检测流量这类突发性或高带宽流,它可能降低性能。

S2A2、FRER、45.99 μs、63.27 μs、879 μs、-224 到 +224 μs 均已保留;“heavy tail/尾部明显”与后文统计表述一致;“inspection traffic”译为“检测流量”,与前文“检测系统”“检测单元”的译法保持一致。原文单位处有重复识别内容,已按语义整理。

P060已复核

两幅图右侧的箱线图从统计上证实了这些观察结果。在 S1 中,非冗余配置在故障期间表现出离群值和抖动分散,而 FRER(S1A2)实现了始终较低的方差,验证了其对关键低流量流的有效性。然而,在 S2 中,与非冗余设置(S2A1)更稳定的特征相比,FRER(S2A2)显示出重尾分布和较宽的抖动范围。这些结果强调,尽管 FRER 能防止数据包丢失并确保连续性,但它可能在正常负载下引入交付不可预测性,尤其是对于高吞吐量或突发性流量。

S1、S2、S1A2、S2A2、S2A1、FRER 均已保留;“statistically”“outliers”“heavy-tailed distribution”“normal load”等限定已保留;结论中的适用条件“high-throughput or bursty traffic”已保留。未发现明显问题。

P061已复核

为了量化每种配置的韧性,表 V 汇总了所有场景下的数据包传输、丢失和恢复特性。

术语“resilience”译为“韧性”符合 TSN/网络可靠性语境;Table V、packet transmission、loss、recovery characteristics 信息完整。未发现明显问题。

P062需人工复核

对于 Sensing_1 → Control_1 流(S1),其平均每 7.5 ms 传输一次;S1A1 配置在 2 秒链路故障期间经历了 270 个丢失数据包,产生 20.2% 的丢包率。恢复完全是反应式的,并且会延迟到链路恢复之后才发生。相比之下,启用 FRER 的 S1A2 通过不相交路径维持连续交付,没有数据包丢失,并实现瞬时故障切换。

原文箭头存在重复/OCR 残留,译文按单一方向箭头处理;7.5 ms、2 秒、270、20.2% 均保留;FRER、S1A1、S1A2 保留。因箭头文本可能存在识别噪声,需人工确认路径符号是否应仅为单箭头。
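
文中 S1A1 的丢包数字可以用简单算术粗略核对。下面的草图假设发送间隔恒为 7.5 ms、丢包率以 10 s 的整段运行时间为分母;这两点都是为了核对而做的假设,函数名 `expected_loss` 也是虚构的。

```python
def expected_loss(interval_s, outage_s, run_s):
    """按恒定发送间隔估算:中断期间的丢包数、总发送数、丢包率。"""
    sent_total = run_s / interval_s   # 整段运行期间的发送数据包数
    lost = outage_s / interval_s      # 中断期间发送(因而丢失)的数据包数
    return lost, sent_total, lost / sent_total


# S1A1:间隔 7.5 ms,链路中断 2 s,运行时间 10 s
lost_est, sent_est, ratio_est = expected_loss(7.5e-3, 2.0, 10.0)

# 用文中报告的 270 个丢失数据包除以估算的总发送数
reported_ratio = 270 / sent_est
```

估算得到约 267 个丢失数据包,与文中报告的 270 个接近;用 270 除以估算的约 1333 个总发送数,得到约 20.25%,与文中的 20.2% 相符。两者的细微差别与“平均每 7.5 ms 传输一次”这一表述是相容的。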

P063需人工复核

Inspection_2 → Edge unit 流以高速率运行(66 μs 间隔),S2A1 设置在 1 秒隔离期间遭受 19,776 个丢弃数据包,对应 26.3% 的丢包率。尽管存在可用的物理路径,但由于缺少冗余,无法使用这些路径。S2A2 利用 FRER,通过使用重复流避免了丢包;然而,如前文所讨论的,这以显著的抖动和时延可变性为代价。

原文箭头和 μ 符号存在重复/OCR 残留,译文按“66 μs”处理;19,776、1 秒、26.3% 保留;“dropped packets”译为“丢弃数据包”,“duplicate streams”译为“重复流”。因公式/符号识别存在噪声,需人工确认单位和路径方向。

P064已复核

本研究将 FRER 与非冗余基线进行比较,排除了传统保护方案,例如 Spanning Tree Protocol(STP)、Rapid STP(RSTP)和 Ethernet Ring Protection Switching(ERPS);这些方案需要数十到数百毫秒才能恢复,对于具有亚毫秒级期限的 Time-Sensitive Networking(TSN)流量而言过于缓慢。

STP、RSTP、ERPS、TSN 缩写均保留并展开;“tens to hundreds of milliseconds”“sub-millisecond deadlines”含义完整。未发现明显问题。

P065已复核

FRER 通过在不相交路径上的数据平面复制实现零恢复时间,但会引入带宽开销,并且可能由于路径不平衡而产生抖动。替代方案包括带 Scheduled Traffic 的 FRER(FRER-ST),它改善了确定性但需要集中式调度;5G 中的 Packet Duplication and Elimination Function(P-DEF),它专用于无线场景;以及 Multi-Protocol Label Switching - Transport Profile(MPLS-TP)和 IEEE 802.1Qca 等反应式选项,它们以增加复杂性为代价提供快速故障切换。虽然 FRER 对关键流是理想的,但对于高带宽或突发型流量,建议采用选择性或混合式使用。

FRER-ST、P-DEF、MPLS-TP、IEEE 802.1Qca 均保留;因果与转折关系完整;“wireless-specific”译为“专用于无线场景”准确。未发现明显问题。

P066需人工复核

尽管 FRER 消除了数据包丢失并实现了无缝故障切换,但它引入了可测量的带宽开销。图 7 显示了 Sensing_1 → SwitchB_1 链路随时间变化的链路利用率,该链路在 t = 4 s 发生链路故障之前保持稳定利用率(4–5%)。随后利用率降为零,确认发生了完全断开,并且只在 t = 6 s 恢复之后才重新出现。缺少替代转发会导致数据丢失。

原文箭头与 t=4、t=6 存在重复/OCR 残留,译文规范为 t = 4 s、t = 6 s;4–5%、图 7、链路名称均保留。因公式符号识别有噪声,需人工确认时间表达和路径名称。

P067需人工复核

在 S1A2 中,FRER 将流量复制到不相交路径上:SwitchA_1 → centralSwitch_1 链路持续达到 13–15%,而 centralSwitch_1 → SwitchB_1 承载 5%。与此同时,直接路径仍保持活动状态,但负载降低(2%)。这验证了重复帧的连续交付,并确认了 FRER 所引入的开销。

原文箭头存在重复/OCR 残留,译文按单向链路处理;13–15%、5%、2% 均保留;“duplicate frames”译为“重复帧”。因链路箭头识别异常,需人工复核路径方向。
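
文中提到的直接路径与经 centralSwitch_1 的不相交路径,可以用一个小型图搜索来示意。下面的链路表只包含文中明确提到的三条交换机间链路;Sensing_1 接在 SwitchA_1、Control_1 接在 SwitchB_1 上是为了说明而做的假设,函数名 `shortest_path` 也是虚构的,并非论文的完整拓扑。

```python
from collections import deque

LINKS = [
    ("Sensing_1", "SwitchA_1"),        # 假设的接入关系
    ("SwitchA_1", "SwitchB_1"),        # 直接路径(S1 中被断开的链路)
    ("SwitchA_1", "centralSwitch_1"),  # 文中提到的冗余路径
    ("centralSwitch_1", "SwitchB_1"),
    ("SwitchB_1", "Control_1"),        # 假设的接入关系
]


def shortest_path(links, src, dst, failed=frozenset()):
    """广度优先搜索最短路径;failed 为已断开链路的集合(每条链路用 frozenset 表示)。"""
    adjacency = {}
    for a, b in links:
        if frozenset((a, b)) in failed:
            continue
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    queue = deque([[src]])
    visited = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in adjacency.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


direct = shortest_path(LINKS, "Sensing_1", "Control_1")
backup = shortest_path(
    LINKS, "Sensing_1", "Control_1",
    failed={frozenset(("SwitchA_1", "SwitchB_1"))},
)
```

搜索结果说明物理上存在备用路径;按文中的解释,S1A1 之所以仍然丢包,是因为该路径未被纳入活动流配置,而 S1A2 通过 FRER 预先在两条路径上同时发送副本。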

P068需人工复核

图 8 展示了 FRER 在 S2 场景中的影响:该场景中,Inspection_2 → centralSwitch_3 路径上的高带宽流式传输使链路利用率接近饱和(100%)。即使在无故障运行期间,由于持续复制,这些链路仍保持高负载。这种拥塞不仅如前所示增加了时延和抖动,而且还减少了可供其他流量使用的余量。

原文路径表达“from Inspection_2 → centralSwitch_3 to near saturation”略显不完整,译文尽量保留其含义;100%、Figure 8、S2、FRER 保留。由于句子可能依赖图表上下文,且路径/饱和对象略不明确,需人工复核。

P069已复核

这些发现表明,虽然 FRER 提供了故障韧性,但其代价是显著的链路开销。必须仔细考虑这一权衡,尤其是对于高速率或突发型流而言,因为在共享核心网段中,冗余可能会放大拥塞风险。

“fault resilience”译为“故障韧性”,“substantial link overhead”译为“显著的链路开销”;转折和因果逻辑完整。未发现明显问题。

P070需人工复核

建议:在带宽受限或融合的 TSN 网络中,FRER 的使用必须具有选择性和策略性,优先考虑具有严格时延和丢失约束的关键控制流。我们的评估(第 V 节)表明,不加区分的复制会增加拥塞和抖动,特别是对于检测视频这类高带宽流,除非配置了额外容量,否则它们从 FRER 中获得的收益有限。这些观察结果与先前关于 TSN 系统中冗余开销和拥塞风险的分析一致 [11],[12]。为了缓解这些风险并维持确定性性能,我们建议如下:• 采用拓扑感知的流分配,防止冗余路径在容易拥塞的链路上重叠。• 进行逐链路容量规划,同时考虑标称条件和故障条件下的 FRER 复制开销(例如,带宽增加最高可达 2×)。• 采用流感知的整形和监管,在复制之前限制突发大小,并避免队列积压。• 采用基于 VLAN 的隔离来限制流量类别,并避免故障切换事件期间的跨流争用。• 采用自适应冗余级别,即根据流关键性、系统负载或链路健康指标动态启用冗余(如 [10] 所探讨)。

Section V、[11]、[12]、[10]、2×、VLAN、FRER、TSN 均保留;列表项目在原文中位于同一段内,译文保持同一 P070 小节未拆分;“accounts FRER duplication overhead”原文疑似缺少介词,译为“考虑……开销”。因原文项目符号被压缩在一个段落中,需人工确认排版是否应恢复为独立列表。

P071已复核

采用拓扑感知的流分配,以防止冗余路径在容易发生拥塞的链路上重叠。

“Topology-aware”译为“拓扑感知”,“stream allocation”译为“流分配”;“congestion-prone links”保留了“容易发生拥塞的链路”的风险含义。未发现明显问题。

P072已复核

进行逐链路容量规划,在标称条件和故障条件下都将 FRER 复制开销考虑在内(例如,带宽最多增加到 2×)。

“Per-link”译为“逐链路”;“nominal and failure conditions”译为“标称条件和故障条件”;“up to 2× bandwidth increase”存在两种理解:可译为“带宽最多增加 2 倍”或“增加到 2×”,此处为避免歧义采用“增加到 2×”。未发现明显问题。

P073已复核

采用流感知的整形与监管,以在复制之前限制突发规模,并避免队列积聚。

“shaping and policing”按网络 QoS 语境译为“整形与监管”;“burst sizes”译为“突发规模”;“queue buildup”译为“队列积聚”。未发现明显问题。

P074已复核

采用基于 VLAN 的隔离,以约束流量类别,并避免在故障切换事件期间发生跨流竞争。

“traffic classes”译为“流量类别”;“cross-stream contention”译为“跨流竞争”;“failover events”译为“故障切换事件”。未发现明显问题。

P075已复核

采用自适应冗余级别,即根据流的关键性、系统负载或链路健康指标动态启用冗余(如 [10] 中所探讨的那样)。

“flow criticality”译为“流的关键性”;“link health indicators”译为“链路健康指标”;引用 [10] 已保留。未发现明显问题。

P076已复核

这些措施由我们的仿真结果和近期研究共同提供依据,有助于确保 FRER 实现其预期收益,即无损故障切换和零恢复时间,同时不使更广泛的 TSN 网络失稳。

“informed by”译为“提供依据”;“lossless failover and zero recovery time”译为“无损故障切换和零恢复时间”;“broader TSN network”译为“更广泛的 TSN 网络”。未发现明显问题。
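
上述“逐链路容量规划”和“避免冗余路径重叠”两条建议,可以用一个简单的负载累加来示意。下面的草图中,节点名(`src`、`sw1` 等)、60 Mbps 的流速率和 100 Mbps 的链路容量都是虚构的示例值,`link_utilization` 也是假设的函数名;它只说明计算思路,不代表论文的拓扑或流量参数。

```python
def link_utilization(streams, capacity_bps):
    """累加每条链路上的流量,返回 {链路: 利用率}。

    每条流的每个副本路径都按完整速率计入,体现 FRER 的复制开销。
    """
    load = {}
    for stream in streams:
        for path in stream["paths"]:
            for link in zip(path, path[1:]):
                load[link] = load.get(link, 0.0) + stream["rate_bps"]
    return {link: bps / capacity_bps for link, bps in load.items()}


# 虚构示例:一条 60 Mbps 的流经两条路径复制,两条路径在 sw3 -> dst 上重叠
streams = [
    {
        "name": "inspection",
        "rate_bps": 60e6,
        "paths": [
            ["src", "sw1", "sw3", "dst"],
            ["src", "sw2", "sw3", "dst"],
        ],
    }
]
utilization = link_utilization(streams, capacity_bps=100e6)
overloaded = sorted(link for link, u in utilization.items() if u > 1.0)
```

在这个例子里,两条副本路径重叠的链路承担了 2 倍的流速率并超出容量,而不重叠的链路只承担 1 倍;这正是建议采用拓扑感知分配并按复制开销预留容量的原因。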

P077已复核

本文提出了一个场景驱动的仿真框架,用于评估工业网络中故障条件下的 TSN 韧性。通过对具有流级冗余的同步生产单元进行建模,我们评估了链路和单元故障对时延、抖动、丢包和恢复时间的影响。

“scenario-driven simulation framework”译为“场景驱动的仿真框架”;“synchronized production cells”译为“同步生产单元”;“cell failures”在工业网络语境中译为“单元故障”。未发现明显问题。

P078已复核

结果表明,FRER 在故障期间实现了零丢失故障切换和不间断运行。然而,这种可靠性以带宽使用量显著增加为代价,在某些情况下由于冗余流量可达到 2–3×。这些发现强调,在带宽受限环境中,需要以选择性的、流量感知的方式应用冗余。

“zero-loss failover”译为“零丢失故障切换”;“up to 2–3×”译为“可达到 2–3×”,保留数字范围;“traffic-aware application of redundancy”译为“流量感知的方式应用冗余”。未发现明显问题。

P079已复核

未来工作将通过动态重配置能力增强该框架,使网络能够通过门控调度更新、流重路由和流优先级划分来自主适应故障。我们还计划仿真多故障和瞬态故障场景,以评估 TSN 在更现实的工业条件下的鲁棒性。

“dynamic reconfiguration capabilities”译为“动态重配置能力”;“gate schedule updates”译为“门控调度更新”;“stream rerouting”译为“流重路由”;“flow prioritization”译为“流优先级划分”。未发现明显问题。

英文原文
P001Block 5

Ensuring deterministic, low-latency, and fault-tolerant communication is vital for industrial automation and control systems. The convergence of Information Technology (IT) and Operational Technology (OT), coupled with the rise of cyber-physical systems, imposes stringent performance and reliability requirements on network infrastructures. While traditional Ethernet remains widespread, it lacks the timing guarantees, traffic isolation, and failure resilience required by real-time industrial applications.

P002Block 6

Time-Sensitive Networking (TSN), a suite of IEEE 802.1 standards, augments Ethernet with features such as precise time synchronization, Generalized Precision Time Protocol (gPTP), time aware shaping (TAS), per-stream filtering and policing, VLAN-based prioritization, and Frame Replication and Elimination for Redundancy (FRER). These capabilities collectively enable deterministic behavior, low jitter, and high availability, making TSN a key enabler for industrial Internet of Things (IIoT) and smart manufacturing networks. However, assessing TSN’s effectiveness under real-world failure conditions remains an open research challenge.

P003Block 7

This paper presents a modular OMNeT++/INET-based simulation framework that models a realistic, multi-cell industrial network equipped with TSN features. Our simulation framework includes configurable traffic patterns, Virtual Local Area Network (VLAN)-based stream prioritization, gPTP-based synchronization, per-stream filtering, and stream redundancy using FRER. Crucially, the framework integrates XML-based failure scenarios to simulate both link and cell-level disruptions and evaluates TSN’s recovery behavior with and without redundancy. The key contributions of this work are: • Realistic Industrial TSN Modeling: A full-scale industrial network with two production cells [ 1 ] is implemented in OMNeT++, integrating TSN mechanisms such as time synchronization, stream filtering, traffic shaping, and redundancy. • Scenario-Based Fault Injection: Link and node failures are modeled using timed XML scripts, enabling the evaluation of FRER and failover mechanisms under controlled disruptions. • Resilience Evaluation: The framework provides detailed performance insights—including latency, jitter, and recovery time—under both baseline and redundancy-enhanced TSN configurations.

P004Block 8

Realistic Industrial TSN Modeling: A full-scale industrial network with two production cells [ 1 ] is implemented in OMNeT++, integrating TSN mechanisms such as time synchronization, stream filtering, traffic shaping, and redundancy.

P005Block 9

Scenario-Based Fault Injection: Link and node failures are modeled using timed XML scripts, enabling the evaluation of FRER and failover mechanisms under controlled disruptions.

P006Block 10

Resilience Evaluation: The framework provides detailed performance insights—including latency, jitter, and recovery time—under both baseline and redundancy-enhanced TSN configurations.

P007Block 11

The remainder of this paper is structured as follows: Section II reviews related work on TSN for industrial systems. Section III outlines the system architecture. Section IV details the simulation setup, including topology, traffic modeling, and fault injection. Section V presents results across the four scenarios. Section VI concludes and outlines future directions.

P008Block 13

TSN has emerged as a key enabler of deterministic Ethernet communication for industrial automation, automotive, and other latency-sensitive domains. Defined by the IEEE 802.1 working group, TSN incorporates core features such as time synchronization (IEEE 802.1AS), time-aware scheduling (IEEE 802.1Qbv), per-stream filtering and policing (IEEE 802.1Qci), and Frame Replication and Elimination for Redundancy (FRER, IEEE 802.1CB) to ensure low-latency, bounded jitter, and high availability of critical traffic flows [ 2 ]. These mechanisms allow Ethernet to meet the stringent requirements of Industrial Internet of Things (IIoT) applications, replacing traditional fieldbus and proprietary real-time solutions [ 3 ].

P009Block 14

Extensive research has confirmed the benefits of TSN in reducing jitter and improving deterministic behavior in mixed-criticality environments [ 4 ]. Simulation-based studies—particularly those using OMNeT++ and the INET framework—have validated the effectiveness of TSN for traffic prioritization, scheduling, and reliability in industrial scenarios [ 5 ]. Fault tolerance, a cornerstone of industrial communication, is typically achieved in TSN using redundant streams and fast fail-over techniques such as FRER [ 6 ]. These features are especially relevant in factory networks where uptime and predictable performance are critical.

P010Block 15

While TSN has gained traction in automotive networking [ 7 ], its deployment in industrial automation—particularly under realistic fault conditions—remains an open research challenge. Recent work has examined TSN’s integration with legacy protocols such as PROFINET and EtherCAT [ 8 ], and proposed new scheduling strategies for optimizing latency and bandwidth usage [ 9 ]. Beyond scheduling, several studies have focused on fault-tolerance and redundancy in TSN and wireless TSN systems. Sudhakaran et al. [ 10 ] introduced a dynamic scheduling approach enabling zero-delay roaming for mobile robots using wireless TSN redundancy, showing that FRER can uphold real-time guarantees under mobility. Jover et al. [ 11 ] analyzed the trade-off between reliability and redundancy cost in TSN networks through simulation-based configuration insights. On the wireless side, Cena et al. [ 12 ] proposed seamless redundancy techniques for high-reliability Wi-Fi, leveraging Wi-Fi 7 Multi-Link Operation (MLO) to implement FRER-like duplication and de-duplication. Those complement our study by tackling advanced fault-handling approaches. However, most existing efforts focus on static performance or isolated failure events, overlooking complex failure injection scenarios with synchronized industrial traffic. Our work addresses this gap by introducing a scenario-driven simulation framework (IN2C) that models industrial production cells, real-time traffic, and coordinated failure events. It evaluates four distinct failure configurations, each with and without FRER, to quantify TSN’s fault-recovery capabilities in terms of latency, jitter, and recovery time. 
The novelty of this work lies in: • Comprehensive fault injection across layered industrial traffic flows, using XML-triggered scenarios and real-world-inspired production logic; • Comparative evaluation of TSN with and without redundancy, enabling quantitative insights into the impact of FRER and per-flow protection under failure; • Multi-role traffic modeling (sensor-actuator-PLC-SCADA-Edge unit-HMI), reflecting realistic IIoT interactions rarely combined at this level of detail in existing TSN studies.

P011Block 16

Comprehensive fault injection across layered industrial traffic flows, using XML-triggered scenarios and real-world-inspired production logic;

P012Block 17

Comparative evaluation of TSN with and without redundancy, enabling quantitative insights into the impact of FRER and per-flow protection under failure;

P013Block 18

Multi-role traffic modeling (sensor-actuator-PLC-SCADA-Edge unit-HMI), reflecting realistic IIoT interactions rarely combined at this level of detail in existing TSN studies.

P014Block 19

These contributions provide a deeper understanding of TSN’s resilience in industrial environments and offer a reusable simulation testbed for future TSN extensions and industrial scheduling strategies.

P015Block 21

The industrial TSN network designed in this work models a realistic smart factory with two production cells and centralized infrastructure, incorporating sensing, actuation, control, inspection, SCADA, HMI, and edge computing units over wired TSN links. The section outlines the network architecture, traffic models, node configurations, fault injection methods, and performance metrics used to evaluate system behavior under both normal and failure conditions.

P016Block 23

The industrial network follows a modular, hierarchical architecture designed to emulate realistic industrial automation environments. It comprises two independent production cells, each capable of autonomous operation, and a centralized infrastructure responsible for supervision, monitoring, and edge processing. The network topology includes cell-level TSN switches, central aggregation switches, and interconnecting links configured for redundancy and time synchronization.

P017Block 24

Each production cell integrates five key functional units: a sensing unit, actuating unit, robotic arm, inspection system, and a control unit (PLC). These units, detailed in Table I, are dual-homed through two TSN-capable switches (SwitchA_X and SwitchB_X, where X denotes the cell index), forming a fault-tolerant local sub-network. This dual-switch design allows for redundant forwarding paths toward central infrastructure via centralSwitch_1, centralSwitch_2, and centralSwitch_3. The backbone consists of 1-Gbps Ethernet links interconnecting switches at the core and aggregation layers. These include: • centralSwitch_1: Core switch connecting both cells. • centralSwitch_2 and centralSwitch_3: Interconnect with SCADA, HMI, Edge unit, and quality-control switches (qcSwitch_1, qcSwitch_2).

P020Block 27

End devices (e.g., PLCs, sensors, actuators) are connected via 100-Mbps links, representing edge connections commonly found in factory deployments. Figure 1 illustrates this architecture. All switches and hosts are configured with TSN capabilities under the TSNBase configuration. Table II summarizes the TSN features implemented in the network, as configured in the INET-based .ini file.

P021Block 28

This architectural design provides a robust and flexible testbed for evaluating TSN performance in a realistic industrial setting, enabling detailed investigation of latency, jitter, and recovery under fault conditions.

P022Block 30

The industrial network models a diverse set of traffic streams that reflect the communication requirements of industrial automation systems. These streams are classified by source-destination roles, communication patterns (periodic vs. event-driven), packet size, and criticality level. TSN mechanisms—including stream identification and Priority Code Point (PCP)-based prioritization—are configured to ensure that high-priority traffic receives deterministic service, even under network stress or failure. Table III summarizes key traffic types modeled in the simulation, along with their associated PCP values and timing constraints.

P023Block 31

All streams are tagged and encoded using PCP values, as defined by per-stream mapping in the configuration. TSN switches perform ingress classification, policing, and shaping per stream to enforce bandwidth and burst control.

P024Block 32

To evaluate the fault resilience of the TSN network, two types of failure scenarios are introduced:
• S1A1/S1A2 simulate a mid-cell link failure (SwitchA_1 - SwitchB_1).
• S2A1/S2A2 simulate cell disconnection (centralSwitch_2 - Edge unit and SCADA).

P027Block 35

Table IV summarizes these fault scenarios and their recovery logic. The following metrics are used to evaluate network behavior during fault events:
• End-to-End Latency: Transmission delay from source to destination.
• Jitter: Variability in packet delay over time.
• Packet Loss: Number of packets dropped during normal and faulted operation.
• Recovery Time: Time required to reestablish stream delivery after a failure.
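
The four metrics above can be computed from per-packet send and receive timestamps. The following is a minimal Python sketch, not the framework's own code. It assumes integer timestamps (for example, microseconds) keyed by a sequence number that increases with send time. It takes jitter as the change in latency between consecutive deliveries and recovery time as the interval from the fault to the first delivery after the last lost packet. The paper does not give formal definitions, so both are illustrative choices, and the function name `stream_metrics` is hypothetical.

```python
from statistics import mean, median, pstdev


def stream_metrics(sent, received, fault_start):
    """Summarize one stream.

    sent:      {seq: send_time}, one entry per transmitted packet
    received:  {seq: receive_time}, only for packets that arrived
    fault_start: time at which the fault was injected (same unit)
    """
    delivered = sorted(seq for seq in sent if seq in received)
    # End-to-end latency of each delivered packet, in sequence order.
    latency = [received[seq] - sent[seq] for seq in delivered]
    # Jitter (assumed definition): latency change between consecutive deliveries.
    jitter = [b - a for a, b in zip(latency, latency[1:])]

    lost = sorted(seq for seq in sent if seq not in received)
    lost_after_fault = [seq for seq in lost if sent[seq] >= fault_start]

    # Recovery time (assumed definition): zero if nothing sent after the
    # fault was lost; otherwise fault start to the first arrival that
    # follows the last lost packet; None if delivery never resumes.
    recovery = 0
    if lost_after_fault:
        resumed = [seq for seq in delivered if seq > lost_after_fault[-1]]
        recovery = received[resumed[0]] - fault_start if resumed else None

    return {
        "latency_mean": mean(latency),
        "latency_median": median(latency),
        "jitter_std": pstdev(jitter),
        "lost": len(lost),
        "loss_rate": len(lost) / len(sent),
        "recovery_time": recovery,
    }


# Hypothetical trace: 10 packets every 10 000 us, 1 000 us latency,
# packets 4 and 5 lost after a fault at t = 35 000 us.
sent = {i: i * 10_000 for i in range(10)}
received = {i: t + 1_000 for i, t in sent.items() if i not in (4, 5)}
print(stream_metrics(sent, received, fault_start=35_000))
```

With this definition, a stream that loses no packets after the fault reports a recovery time of zero.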

P032Block 41

The IN2C simulation framework is implemented in OMNeT++ using the INET framework, extended with TSN-specific modules to emulate industrial communication behavior. The simulation models two synchronized production cells and centralized infrastructure, capturing interactions across sensors, PLCs, robot arms, inspection units, SCADA, Edge unit, and HMI systems. INET’s TSN extensions—including time-aware switches, gPTP clocks, per-stream filtering, and traffic shaping—enable deterministic and fault-tolerant communication modeling.

P033Block 42

All nodes are built from INET’s TsnDevice and StandardHost templates, depending on their role. Each functional unit (e.g., Sensing_1, Control_1, Inspection_2) includes:
• A local clock module with oscillator drift,
• User Datagram Protocol (UDP) traffic generators and sinks,
• Stream coders with PCP-tag mapping,
• Bridging and scheduling logic for TSN operation.

P038Block 47

Cell switches (SwitchA_X, SwitchB_X) and aggregation switches (centralSwitch_X, qcSwitch_X) use the TsnSwitch type, with FRER, stream filtering, and shaping enabled as configured.

P039Block 48

The master clock (masterClock) operates as a global time reference, distributing synchronization via gPTP across the topology. Each node aligns its local clock to this source, ensuring coherent transmission timing for time-sensitive streams.

P040Block 49

The workflow follows an event-driven model:
• Initialization: Node clocks are configured, traffic applications instantiated, stream encoders applied, and synchronization established.
• Traffic Generation: Application modules produce cyclic or event-driven UDP streams with preconfigured packet size, interval, and destination.
• Scheduling: Time-aware and credit-based shaping modules manage queueing and dispatch under bounded latency constraints.
• Failure Handling: XML-based ScenarioManager scripts trigger link/node failures (e.g., linkfailure.xml or cellfailure.xml).
• Redundancy/Recovery: If redundancy is enabled, FRER-based failover automatically engages. Otherwise, communication may stall.
• Metrics Collection: Throughout the simulation, OMNeT++ records delay, jitter, packet drops, and fault recovery durations.

P047Block 56

Synchronization is driven by IEEE 802.1AS. Each switch forwards sync messages downstream to maintain precise offset and frequency alignment. TSN traffic scheduling uses always-open TAS gates and per-stream credit-based shaping. This configuration is intentionally chosen to isolate the impact of shaping and redundancy mechanisms by avoiding the additional complexity introduced by time-aware gate scheduling.
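
To illustrate what credit-based shaping does to a burst when the gates are always open, the following Python sketch applies the standard credit rule: credit drains at sendSlope = idleSlope − portRate while a frame is sent, and recovers at idleSlope while the queue waits. It is a simplified model under stated assumptions: a single queue, equal-size frames queued back to back, and no interfering traffic. The function name and all numeric values are hypothetical and are not taken from the paper's configuration.

```python
def cbs_departures(n_frames, frame_bits, port_rate_bps, idle_slope_bps):
    """Transmission start times (seconds) of n back-to-back frames
    leaving one credit-based shaper queue with no competing traffic."""
    send_slope = idle_slope_bps - port_rate_bps  # negative while sending
    t = 0.0
    credit = 0.0
    starts = []
    for _ in range(n_frames):
        if credit < 0:
            # The queue must wait until the credit climbs back to zero.
            t += -credit / idle_slope_bps
            credit = 0.0
        starts.append(t)
        tx_time = frame_bits / port_rate_bps
        credit += send_slope * tx_time  # credit drains during transmission
        t += tx_time
    return starts


# Hypothetical numbers: 1000-bit frames, 100-Mbps port, 25-Mbps idleSlope.
starts = cbs_departures(3, 1000, 100e6, 25e6)
print([round(s * 1e6, 3) for s in starts])  # start times in microseconds
```

Under these assumptions the burst is spread so that consecutive frames start frame_bits / idleSlope apart, which is how the shaper bounds the bandwidth a stream can claim on a link.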

P048Block 57

Streams are filtered and metered based on source and PCP. Faults are introduced at simulation time using scheduled disconnect/connect events. Redundant configurations trigger alternate paths using FRER trees and packet duplication. This modular, event-driven structure enables controlled experimentation with traffic shaping, synchronization, stream redundancy, and recovery behavior under realistic factory network conditions.

P049Block 59

This section analyzes the performance of the IN2C TSN network under fault conditions affecting link and cell connectivity. Four scenarios are evaluated to examine the impact of redundancy on latency, jitter, packet loss, and recovery time. Results reflect how TSN mechanisms perform when subject to realistic disruptions in smart manufacturing environments.

P050Block 61

The evaluation is based on the IN2C simulation framework [ 13 ], developed in OMNeT++ using the INET-TSN extension. The network (Figure 2) consists of two TSN-enabled production cells connected to centralized infrastructure (SCADA, Edge unit, HMI) through a 1-Gbps backbone. End devices communicate via 100-Mbps links. Each simulation captures behavior across initialization, steady-state, failure, and recovery phases. In this context, four scenarios are defined:
• S1A1 / S1A2: Simulate a link failure between SwitchA_1 and SwitchB_1 at 4 seconds. S1A2 (Figure 3) enables FRER; S1A1 does not. Runtime: 10 s.
• S2A1 / S2A2: Simulate cell isolation by disconnecting centralSwitch_2 from both Edge unit and SCADA at 2 seconds. S2A2 (Figure 4) uses FRER for redundancy. Runtime: 5 s.

P053Block 64

Failures are introduced using XML-based ScenarioManager scripts, which explicitly disconnect and later restore Ethernet gates between modules. No node-level failures are used; instead, link-level disruptions emulate practical fault cases such as cable damage or aggregation switch malfunction.
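
As an illustration of such a script, the Python sketch below generates a disconnect-then-reconnect scenario for the S1 link fault (failure at 4 s, restoration at 6 s). The element and attribute names (`at`, `disconnect`, `connect`, `src-module`, `src-gate`, `dest-module`, `dest-gate`, `channel-type`) follow INET's ScenarioManager command set as commonly documented and should be verified against the INET version in use. The gate names and channel type are hypothetical placeholders, and the paper's actual linkfailure.xml may differ.

```python
import xml.etree.ElementTree as ET


def link_fault_script(src_module, src_gate, dest_module, dest_gate,
                      t_fail, t_restore, channel_type):
    """Build a ScenarioManager-style XML string that cuts one link at
    t_fail and reconnects it at t_restore."""
    scenario = ET.Element("scenario")

    # Fault injection: remove the connection on the source gate.
    down = ET.SubElement(scenario, "at", t=t_fail)
    ET.SubElement(down, "disconnect",
                  {"src-module": src_module, "src-gate": src_gate})

    # Recovery: recreate the connection with the given channel type.
    up = ET.SubElement(scenario, "at", t=t_restore)
    ET.SubElement(up, "connect", {
        "src-module": src_module,
        "src-gate": src_gate,
        "dest-module": dest_module,
        "dest-gate": dest_gate,
        "channel-type": channel_type,
    })
    return ET.tostring(scenario, encoding="unicode")


# Gate indices and channel type below are assumptions for illustration.
print(link_fault_script("SwitchA_1", "ethg[0]", "SwitchB_1", "ethg[0]",
                        "4s", "6s", "inet.node.ethernet.Eth1G"))
```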

P054Block 65

Redundancy mechanisms, including Frame Replication and Elimination for Redundancy (IEEE 802.1CB), are activated only in S1A2 and S2A2 via StreamRedundancyConfigurator. Gate control and clock synchronization are implemented using EagerGateScheduleConfigurator and IEEE 802.1AS-compliant gPTP via a centralized masterClock. The full simulation source code, scenarios, and TSN configurations are publicly available at [ 13 ].

P055Block 67

Figure 5 shows the performance of the S1 stream under a link failure between SwitchA_1 and SwitchB_1 injected at t = 4 s and resolved at t = 6 s. In the non-redundant setup (S1A1), latency becomes highly variable post-failure, with a median of 24.01 μs and a wide spread (mean 18.93 μs). Jitter increases significantly, with a standard deviation of 6.29 μs due to unstable retransmissions and path recovery effects. Despite the existence of physical alternate paths, their absence from the active stream configuration renders the stream vulnerable to the failure.

P056Block 68

In contrast, the redundant configuration (S1A2) uses FRER to pre-distribute stream replicas over disjoint paths. This eliminates the need for dynamic rerouting or recovery signaling. As expected, latency remains consistently bounded despite the fault, though its median rises slightly to 29.81 μs. This increase stems from buffering delay introduced by duplicate frame handling. The trade-off yields a more stable mean delay (27.43 μs) and enhanced fault tolerance. Jitter remains controlled (std 8.13 μs) as the FRER mechanism ensures that the first valid frame at the receiver is delivered without delay, masking the failure entirely.
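
The "first valid frame wins" behavior can be sketched in a few lines of Python. This is a deliberately simplified model of duplicate elimination: it remembers a short history of sequence numbers and drops any repeat. It is not the sequence recovery algorithm specified in IEEE 802.1CB, nor INET's implementation. The class and function names, the history length, and the arrival times are all hypothetical.

```python
from collections import deque


class SequenceRecovery:
    """Pass the first copy of each sequence number, drop later copies."""

    def __init__(self, history=16):
        # Bounded memory of recently accepted sequence numbers.
        self.seen = deque(maxlen=history)

    def accept(self, seq):
        if seq in self.seen:
            return False  # duplicate from the other path: eliminate
        self.seen.append(seq)
        return True


def merge_paths(path_a, path_b):
    """Each path is a list of (arrival_time, seq) tuples.
    Returns the frames handed to the application, in arrival order."""
    recovery = SequenceRecovery()
    delivered = []
    for arrival, seq in sorted(path_a + path_b):
        if recovery.accept(seq):
            delivered.append((arrival, seq))
    return delivered


# Path A loses frames 2 and 3 (link down); path B is slower but intact.
path_a = [(10 * i + 1, i) for i in range(5) if i not in (2, 3)]
path_b = [(10 * i + 3, i) for i in range(5)]
print(merge_paths(path_a, path_b))
```

In this toy trace every sequence number is still delivered once. Frames 2 and 3 simply arrive via the slower path, which is why the failure shows up as a small latency shift rather than as loss.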

P057Block 69

Figure 6 presents the S2 stream under a cell-level fault, where centralSwitch_2 is disconnected from the Edge and SCADA units at t = 2 s and reconnected at t = 3 s.

P058Block 70

In the non-redundant case (S2A1), latency spikes sharply, with a median of 51.28 μs and peaks over 90 μs. Jitter remains moderate (std 17.32 μs), reflecting consistent recovery behavior once the fault is resolved.

P059Block 71

In contrast, S2A2 with FRER exhibits a more complex profile. While the median latency improves slightly to 45.99 μs, the mean increases to 63.27 μs, and the tail becomes pronounced: latency reaches up to 879 μs, and jitter swings from -224 to +224 μs. This indicates that while FRER masks the failure itself, it introduces significant delivery variability even under normal operation. The congestion between centralSwitch_1 and centralSwitch_3, likely due to duplicate stream forwarding, amplifies path imbalance and leads to large jitter fluctuations. These results reveal that FRER is not universally beneficial. While it provides failover protection, it may degrade performance for bursty or high-bandwidth streams like inspection traffic.

P060Block 72

The right-hand boxplots in both figures confirm these observations statistically. In S1, the non-redundant configuration exhibits outliers and jitter spread during failure, while FRER (S1A2) achieves consistently low variance, validating its effectiveness for critical, low-volume streams. However, in S2, FRER (S2A2) displays a heavy-tailed distribution and wide jitter range, in contrast to the more stable profile of the non-redundant setup (S2A1). These results highlight that while FRER prevents packet loss and ensures continuity, it can introduce delivery unpredictability under normal load, particularly for high-throughput or bursty traffic.

P061Block 74

To quantify the resilience of each configuration, Table V summarizes packet transmission, loss, and recovery characteristics for all scenarios.

P062Block 75

For the Sensing_1 → Control_1 stream (S1), which transmits every 7.5 ms on average, the S1A1 configuration experiences 270 lost packets during the 2-second link failure, yielding a 20.2% loss rate. Recovery is purely reactive and delayed until the link is restored. In contrast, FRER-enabled S1A2 maintains continuous delivery across disjoint paths, with no packet loss and instantaneous failover.

P063Block 76

The Inspection_2 → Edge unit stream operates at a high rate (66 μs intervals), and the S2A1 setup suffers 19,776 dropped packets during a 1-second isolation, corresponding to a 26.3% loss rate. Despite available physical paths, the absence of redundancy prevents their use. S2A2, leveraging FRER, avoids loss by using duplicate streams; however, as discussed earlier, this comes at the cost of significant jitter and latency variability.
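
As a rough cross-check of these figures, the Python sketch below estimates how many packets a strictly periodic sender would lose during an outage. Strict periodicity is an assumption: the paper gives the S1 interval only "on average", and Table V is not reproduced here. The function names are hypothetical.

```python
def periodic_loss_estimate(period_s, runtime_s, outage_s):
    """Packets sent and lost if one packet leaves every period_s seconds
    and every packet sent during the outage is dropped."""
    sent = runtime_s / period_s
    lost = outage_s / period_s
    return {"sent": sent, "lost": lost, "loss_rate": lost / sent}


def implied_outage(lost_packets, period_s):
    """Outage duration that would explain a given loss count."""
    return lost_packets * period_s


# S1: 7.5 ms interval, 10 s runtime, 2 s link failure.
s1 = periodic_loss_estimate(7.5e-3, 10.0, 2.0)
# S2: 66 us interval, 5 s runtime, 1 s isolation.
s2 = periodic_loss_estimate(66e-6, 5.0, 1.0)

print(round(s1["lost"]), round(s1["loss_rate"], 3))
print(round(s2["lost"]), round(s2["loss_rate"], 3))
print(round(implied_outage(19_776, 66e-6), 3))
```

For S1 the estimate (about 267 packets, 20%) sits close to the reported 270 packets and 20.2%. For S2 the same model predicts about 15,152 lost packets, fewer than the reported 19,776, which at 66 μs spacing corresponds to roughly 1.3 s without delivery. The source does not explain the difference, so the sketch only shows that S2A1's loss exceeds what the 1-second disconnection alone would produce under this simple model.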

P064Block 77

This study compares FRER against a non-redundant baseline, excluding traditional protection schemes like Spanning Tree Protocol (STP), Rapid STP (RSTP), and Ethernet Ring Protection Switching (ERPS), which require tens to hundreds of milliseconds to recover—too slow for Time-Sensitive Networking (TSN) traffic with sub-millisecond deadlines.

P065Block 78

FRER enables zero recovery time via data-plane replication over disjoint paths but introduces bandwidth overhead and possible jitter due to path imbalance. Alternatives include FRER with Scheduled Traffic (FRER-ST), which improves determinism but needs centralized scheduling; Packet Duplication and Elimination Function (P-DEF) in 5G, which is wireless-specific; and reactive options like Multi-Protocol Label Switching - Transport Profile (MPLS-TP) and IEEE 802.1Qca, which offer fast failover at the cost of added complexity. While FRER is ideal for critical flows, selective or hybrid use is advised for high-bandwidth or bursty traffic.

P066Block 80

While FRER eliminates packet loss and enables seamless failover, it introduces measurable bandwidth overhead. Figure 7 shows link utilization over time for Sensing_1 → SwitchB_1, with steady utilization (4–5%) until the link failure at t = 4 s. Utilization then drops to zero, confirming complete disconnection, and only resumes after recovery at t = 6 s. The lack of alternative forwarding results in data loss.

P067Block 81

In S1A2, FRER replicates traffic across disjoint paths: the SwitchA_1 → centralSwitch_1 link consistently reaches 13–15%, while centralSwitch_1 → SwitchB_1 carries 5%. Meanwhile, the direct path remains active with a reduced load (2%). This validates continuous delivery of duplicate frames, confirming the overhead introduced by FRER.

P068Block 82

Figure 8 captures the impact of FRER in the S2 scenario, which involves high-bandwidth streaming from Inspection_2 → centralSwitch_3 to near saturation (100%). Even during fault-free operation, these links remain heavily loaded due to constant duplication. This congestion not only increases latency and jitter, as shown earlier, but also reduces available headroom for other traffic.

P069Block 83

These findings demonstrate that while FRER offers fault resilience, it comes at the cost of substantial link overhead. This trade-off must be carefully considered—especially for high-rate or bursty streams—where redundancy may amplify congestion risks in shared core segments.

P070Block 84

Recommendation: In bandwidth-constrained or converged TSN networks, FRER use must be selective and strategic, prioritizing critical control flows with strict latency and loss constraints. Our evaluation (Section V) shows that indiscriminate replication can increase congestion and jitter, particularly for high-bandwidth streams like inspection video, which experience marginal benefit from FRER unless provisioned with excess capacity. These observations are consistent with prior analyses on redundancy overhead and congestion risk in TSN systems [ 11 ], [ 12 ]. To mitigate these risks and maintain deterministic performance, we recommend the following:
• Topology-aware stream allocation to prevent redundant paths from overlapping at congestion-prone links.
• Per-link capacity planning that accounts for FRER duplication overhead under both nominal and failure conditions (e.g., up to 2× bandwidth increase).
• Flow-aware shaping and policing to bound burst sizes before replication and avoid queue buildup.
• VLAN-based isolation to confine traffic classes and avoid cross-stream contention during failover events.
• Adaptive redundancy levels, where redundancy is enabled dynamically based on flow criticality, system load, or link health indicators (as explored in [ 10 ]).
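
The per-link capacity planning item can be sketched as a simple accounting exercise: add each stream's rate to every link on each of its replica paths, then compare the total with the link capacity. The Python below is a minimal illustration under stated assumptions. Each replica path is listed from the replication point onward, rates are constant, and the stream rates and the 75% budget are hypothetical values, not figures from the paper. The switch names and the 1-Gbps backbone capacity come from the text; the function names are hypothetical.

```python
from collections import defaultdict


def link_utilization(streams, capacity_bps):
    """streams: list of (rate_bps, replica_paths); each replica path is a
    list of (src, dst) links. Returns {link: fraction of capacity used}."""
    load = defaultdict(float)
    for rate_bps, replica_paths in streams:
        for path in replica_paths:
            for link in path:
                load[link] += rate_bps  # every replica consumes bandwidth
    return {link: bps / capacity_bps[link] for link, bps in load.items()}


def over_budget(utilization, budget=0.75):
    """Links whose planned load exceeds the chosen utilization budget."""
    return sorted(link for link, u in utilization.items() if u > budget)


DIRECT = [("SwitchA_1", "SwitchB_1")]
DETOUR = [("SwitchA_1", "centralSwitch_1"), ("centralSwitch_1", "SwitchB_1")]
capacity = {link: 1e9 for link in DIRECT + DETOUR}  # 1-Gbps backbone links

control = (5e6, [DIRECT, DETOUR])          # hypothetical critical flow, replicated
video_plain = (800e6, [DIRECT])            # hypothetical bulk flow, no FRER
video_frer = (800e6, [DIRECT, DETOUR])     # same flow with FRER enabled

print(over_budget(link_utilization([control, video_plain], capacity)))
print(over_budget(link_utilization([control, video_frer], capacity)))
```

In this toy plan, replicating only the control flow keeps the detour links lightly loaded, while replicating the bulk flow pushes every link it touches past the budget. That is the selective-use argument in numerical form.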

P076Block 90

These measures, informed by both our simulation results and recent studies, help ensure that FRER delivers its intended benefits—lossless failover and zero recovery time—without destabilizing the broader TSN network.

P077Block 92

This paper introduced a scenario-driven simulation framework for evaluating TSN resilience under fault conditions in industrial networks. By modeling synchronized production cells with stream-level redundancy, we assessed the impact of link and cell failures on latency, jitter, packet loss, and recovery time.

P078Block 93

Results show that FRER achieves zero-loss failover and uninterrupted operation during faults. However, this reliability comes at the cost of significantly increased bandwidth usage—up to 2–3× in some cases—due to redundant traffic. These findings underscore the need for selective, traffic-aware application of redundancy in bandwidth-constrained environments.

P079Block 94

Future work will enhance the framework with dynamic reconfiguration capabilities, enabling networks to autonomously adapt to faults via gate schedule updates, stream rerouting, and flow prioritization. We also plan to simulate multi-fault and transient failure scenarios to assess TSN’s robustness under more realistic industrial conditions.