Comparison of Two Maintenance Policies for a Two-Unit Series System Considering General Repair

In recent years, maintenance optimization has attracted special attention due to the growth of industrial systems complexity. Maintenance costs are high for many systems, and preventive maintenance is effective when it increases operations' reliability and safety at a reduced cost. The novelty of this research is to consider general repair in the modeling of multi-unit series systems and solve the maintenance problem for such systems using the semi-Markov decision process (SMDP) framework. We propose an opportunistic maintenance policy for a series system composed of two main units. Unit 1, which is more expensive than unit 2, is subjected to condition monitoring, and its deterioration is modeled using a gamma process. Unit 1 hazard rate is estimated by the proportional hazards model (PHM), and two hazard rate control limits are considered as the thresholds of maintenance interventions for unit 1. Maintenance is performed on unit 2, considering an age control limit. The objective is to find the optimal control limits and minimize the long-run expected average cost per unit time. The proposed algorithm is applied to a numerical example to compare the effectiveness of the proposed policy (policy Ⅰ) with policy Ⅱ, which is similar to policy Ⅰ, but instead of general repair, replacement is performed. Results show that policy Ⅰ leads to lower average cost compared with policy Ⅱ. 

Optimal Production and Maintenance Policy for a Partially Observable Production System with Stochastic Demand

In this paper, the joint optimization of the economic manufacturing quantity (EMQ), safety stock level, and condition-based maintenance (CBM) is presented for a partially observable, deteriorating system subject to random failure. The demand is stochastic and it is described by a Poisson process. The stochastic model is developed and the optimization problem is formulated in the semi-Markov decision process framework. A modification of the policy iteration algorithm is developed to find the optimal policy. A numerical example is presented to compare the optimal policy with the policy considering zero safety stock.

A Condition-Based Maintenance Policy for Multi-Unit Systems Subject to Deterioration

In this paper, we propose a condition-based maintenance policy for multi-unit systems considering the existence of economic dependency among units. We consider a system composed of N identical units, where each unit deteriorates independently. Deterioration process of each unit is modeled as a three-state continuous time homogeneous Markov chain with two working states and a failure state. The average production rate of units varies in different working states and demand rate of the system is constant. Units are inspected at equidistant time epochs, and decision regarding performing maintenance is determined by the number of units in the failure state. If the total number of units in the failure state exceeds a critical level, maintenance is initiated, where units in failed state are replaced correctively and deteriorated state units are maintained preventively. Our objective is to determine the optimal number of failed units to initiate maintenance minimizing the long run expected average cost per unit time. The problem is formulated and solved in the semi-Markov decision process (SMDP) framework. A numerical example is developed to demonstrate the proposed policy and the comparison with the corrective maintenance policy is presented.

Optimal Opportunistic Maintenance Policy for a Two-Unit System

This paper presents a maintenance policy for a system consisting of two units. Unit 1 is gradually deteriorating and is subject to soft failure. Unit 2 has a general lifetime distribution and is subject to hard failure. Condition of unit 1 of the system is monitored periodically and it is considered as failed when its deterioration level reaches or exceeds a critical level N. At the failure time of unit 2 system is considered as failed, and unit 2 will be correctively replaced by the next inspection epoch. Unit 1 or 2 are preventively replaced when deterioration level of unit 1 or age of unit 2 exceeds the related preventive maintenance (PM) levels. At the time of corrective or preventive replacement of unit 2, there is an opportunity to replace unit 1 if its deterioration level reaches the opportunistic maintenance (OM) level. If unit 2 fails in an inspection interval, system stops operating although unit 1 has not failed. A mathematical model is derived to find the preventive and opportunistic replacement levels for unit 1 and preventive replacement age for unit 2, that minimize the long run expected average cost per unit time. The problem is formulated and solved in the semi-Markov decision process (SMDP) framework. Numerical example is provided to illustrate the performance of the proposed model and the comparison of the proposed model with an optimal policy without opportunistic maintenance level for unit 1 is carried out.

Optimal Bayesian Control of the Proportion of Defectives in a Manufacturing Process

In this paper, we present a model and an algorithm for the calculation of the optimal control limit, average cost, sample size, and the sampling interval for an optimal Bayesian chart to control the proportion of defective items produced using a semi-Markov decision process approach. Traditional p-chart has been widely used for controlling the proportion of defectives in various kinds of production processes for many years. It is well known that traditional non-Bayesian charts are not optimal, but very few optimal Bayesian control charts have been developed in the literature, mostly considering finite horizon. The objective of this paper is to develop a fast computational algorithm to obtain the optimal parameters of a Bayesian p-chart. The decision problem is formulated in the partially observable framework and the developed algorithm is illustrated by a numerical example.

Optimal Maintenance and Improvement Policies in Water Distribution System: Markov Decision Process Approach

The Markov decision process (MDP) based methodology is implemented in order to establish the optimal schedule which minimizes the cost. Formulation of MDP problem is presented using the information about the current state of pipe, improvement cost, failure cost and pipe deterioration model. The objective function and detailed algorithm of dynamic programming (DP) are modified due to the difficulty of implementing the conventional DP approaches. The optimal schedule derived from suggested model is compared to several policies via Monte Carlo simulation. Validity of the solution and improvement in computational time are proved.

Trajectory-Based Modified Policy Iteration

This paper presents a new problem solving approach that is able to generate optimal policy solution for finite-state stochastic sequential decision-making problems with high data efficiency. The proposed algorithm iteratively builds and improves an approximate Markov Decision Process (MDP) model along with cost-to-go value approximates by generating finite length trajectories through the state-space. The approach creates a synergy between an approximate evolving model and approximate cost-to-go values to produce a sequence of improving policies finally converging to the optimal policy through an intelligent and structured search of the policy space. The approach modifies the policy update step of the policy iteration so as to result in a speedy and stable convergence to the optimal policy. We apply the algorithm to a non-holonomic mobile robot control problem and compare its performance with other Reinforcement Learning (RL) approaches, e.g., a) Q-learning, b) Watkins Q(λ), c) SARSA(λ).

Opportunistic Routing with Secure Coded Wireless Multicast Using MAS Approach

Many Wireless Sensor Network (WSN) applications necessitate secure multicast services for the purpose of broadcasting delay sensitive data like video files and live telecast at fixed time-slot. This work provides a novel method to deal with end-to-end delay and drop rate of packets. Opportunistic Routing chooses a link based on the maximum probability of packet delivery ratio. Null Key Generation helps in authenticating packets to the receiver. Markov Decision Process based Adaptive Scheduling algorithm determines the time slot for packet transmission. Both theoretical analysis and simulation results show that the proposed protocol ensures better performance in terms of packet delivery ratio, average end-to-end delay and normalized routing overhead.

Markov Game Controller Design Algorithms

Markov games are a generalization of Markov decision process to a multi-agent setting. Two-player zero-sum Markov game framework offers an effective platform for designing robust controllers. This paper presents two novel controller design algorithms that use ideas from game-theory literature to produce reliable controllers that are able to maintain performance in presence of noise and parameter variations. A more widely used approach for controller design is the H∞ optimal control, which suffers from high computational demand and at times, may be infeasible. Our approach generates an optimal control policy for the agent (controller) via a simple Linear Program enabling the controller to learn about the unknown environment. The controller is facing an unknown environment, and in our formulation this environment corresponds to the behavior rules of the noise modeled as the opponent. Proposed controller architectures attempt to improve controller reliability by a gradual mixing of algorithmic approaches drawn from the game theory literature and the Minimax-Q Markov game solution approach, in a reinforcement-learning framework. We test the proposed algorithms on a simulated Inverted Pendulum Swing-up task and compare its performance against standard Q learning.

A Novel Approach of Route Choice in Stochastic Time-varying Networks

Many exist studies always use Markov decision processes (MDPs) in modeling optimal route choice in stochastic, time-varying networks. However, taking many variable traffic data and transforming them into optimal route decision is a computational challenge by employing MDPs in real transportation networks. In this paper we model finite horizon MDPs using directed hypergraphs. It is shown that the problem of route choice in stochastic, time-varying networks can be formulated as a minimum cost hyperpath problem, and it also can be solved in linear time. We finally demonstrate the significant computational advantages of the introduced methods.

Periodic Storage Control Problem

Considering a reservoir with periodic states and different cost functions with penalty, its release rules can be modeled as a periodic Markov decision process (PMDP). First, we prove that policy- iteration algorithm also works for the PMDP. Then, with policy- iteration algorithm, we obtain the optimal policies for a special aperiodic reservoir model with two cost functions under large penalty and give a discussion when the penalty is small.