Hybrid-Line Remanufacturing Process Optimization for Multi-Type Factories with Twin Delayed Deep Deterministic Policy Gradient Algorithm
Keywords:
Multi-objective optimization, reinforcement learning, twin delayed deep deterministic policy gradient algorithm, remanufacturing process
Abstract
In response to the growing complexity of remanufacturing systems, this work investigates a novel Multi-objective Hybrid-line Multi-type Factory Remanufacturing Optimization Problem. The proposed model accounts for the disassembly technologies used by different factories, the selection among heterogeneous disassembly lines, and task-related constraints such as precedence and conflict relations. The goal is to assign end-of-life products to appropriate disassembly factories and to schedule their tasks on suitable lines, so that the approach remains scalable and efficient in large-scale dynamic environments. To solve this problem, we formulate a multi-objective mixed-integer programming model that simultaneously maximizes overall profit and minimizes factory cycle time. The model is validated with a commercial solver to confirm its feasibility and correctness. Because the problem is dynamic and sequential, we also employ the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm to learn assignment and scheduling strategies through interaction with the environment. Experiments on several benchmark cases show that TD3 significantly outperforms the baseline reinforcement learning algorithms Deep Deterministic Policy Gradient (DDPG), Soft Actor-Critic (SAC), and Advantage Actor-Critic (A2C) in both convergence stability and solution quality. TD3 also approximates Pareto-optimal solutions more closely than these baselines, which makes it suitable for real-world remanufacturing scenarios.
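For readers unfamiliar with TD3, the following minimal sketch illustrates the two ingredients that distinguish it from DDPG when computing the critic's learning target: target policy smoothing and the clipped double-Q estimate. It is a generic, textbook-style illustration and not the implementation used in this work. The function name `td3_target`, the scalar action, the callables passed in, and all default hyperparameter values are assumptions made for the example.

```python
import random


def td3_target(reward, done, next_state, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=random):
    """Compute a TD3 critic target for one transition with a scalar action.

    All names and default values here are illustrative assumptions.
    """
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = rng.gauss(0.0, noise_std)
    noise = max(-noise_clip, min(noise_clip, noise))
    next_action = target_actor(next_state) + noise
    next_action = max(action_low, min(action_high, next_action))

    # Clipped double-Q: use the smaller of the two target critics' estimates
    # to reduce overestimation of the action value.
    q_min = min(target_q1(next_state, next_action),
                target_q2(next_state, next_action))

    # Do not bootstrap beyond a terminal transition.
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * q_min


if __name__ == "__main__":
    # Toy stand-ins for the target networks (hypothetical, for illustration).
    actor = lambda s: 0.5
    q1 = lambda s, a: 10.0
    q2 = lambda s, a: 8.0
    print(td3_target(1.0, False, None, actor, q1, q2, gamma=0.9, noise_std=0.0))
```

In a full TD3 agent, both critics are regressed toward this target at every step, while the actor and the target networks are updated less frequently; that delayed update is the third element of the algorithm's name.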
License
Copyright (c) 2026 International Journal of Artificial Intelligence and Green Manufacturing

This work is licensed under a Creative Commons Attribution 4.0 International License.