Abstract: Markov games can be effectively used to design
controllers for nonlinear systems. The paper presents two novel
controller design algorithms by incorporating ideas from game theory
literature that address safety and consistency issues of the
'learned' control strategy. A more widely used approach to
controller design is H∞ optimal control, which suffers from high
computational demand and, at times, may be infeasible. We generate
an optimal control policy for the agent (controller) via a simple
Linear Program enabling the controller to learn about the unknown
environment. The controller faces an unknown environment, which
in our formulation corresponds to the behavior rules
of the noise, modeled as the opponent. The proposed approaches aim to
achieve 'safe-consistent' and 'safe-universally consistent' controller
behavior by hybridizing 'min-max', 'fictitious play' and 'cautious
fictitious play' approaches drawn from game theory. We empirically
evaluate the approaches on a simulated Inverted Pendulum swing-up
task and compare their performance against standard Q-learning.
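
To make the 'fictitious play' ingredient concrete, the following is a minimal sketch of plain fictitious play on a two-player zero-sum matrix game. It is not the paper's hybrid algorithm: the function name `fictitious_play`, the payoff matrix, the simultaneous update order, and the tie-breaking rule are all assumptions chosen for illustration. Each player best-responds to the opponent's empirical action counts.

```python
def fictitious_play(payoff, rounds=5000):
    """Plain fictitious play on a zero-sum matrix game (illustrative sketch).

    payoff[i][j] is the row player's payoff; the column player receives
    -payoff[i][j]. Returns the empirical action frequencies of both players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols

    # Assumption: both players start with their first action.
    row_counts[0] += 1
    col_counts[0] += 1

    for _ in range(rounds - 1):
        # Payoff of each row action against the column player's counts so far.
        row_values = [
            sum(payoff[i][j] * col_counts[j] for j in range(n_cols))
            for i in range(n_rows)
        ]
        # Payoff conceded by each column action against the row player's counts.
        col_values = [
            sum(payoff[i][j] * row_counts[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        # Row maximizes, column minimizes; ties go to the lowest index.
        best_row = max(range(n_rows), key=lambda i: row_values[i])
        best_col = min(range(n_cols), key=lambda j: col_values[j])
        row_counts[best_row] += 1
        col_counts[best_col] += 1

    row_freq = [c / rounds for c in row_counts]
    col_freq = [c / rounds for c in col_counts]
    return row_freq, col_freq


if __name__ == "__main__":
    # Matching pennies, used here only as a hypothetical test game.
    print(fictitious_play([[1, -1], [-1, 1]]))
```

On matching pennies the empirical frequencies drift toward the mixed equilibrium in which each action is played about half the time. A 'cautious' variant would replace the hard best response with a smoothed one.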
Abstract: A buyer coalition with a combination of items is a group of buyers who join together to purchase a combination of items at a larger discount. The primary aim of existing research on such coalitions is to generate a large total discount. However, this aim is hard to achieve because the research assumes that each buyer knows the other buyers' information completely, or that at least one buyer in a coalition learns the other buyers' information through an exchange of information. These assumptions contrast with the real-world environment, where buyers join a coalition with incomplete information, i.e., they are concerned only with their own expected discounts. Therefore, this paper proposes a new buyer community coalition formation scheme with a combination of items, called the Community Compromised Combinatorial Coalition scheme, for such an environment of incomplete information. To generate a larger total discount, buyers who want to join a coalition first propose their minimum required savings, and a coalition structure that gives the maximum total retail price is then formed. The total discount of the coalition is then divided among its buyers according to their minimum required savings, and the division is Pareto optimal. In a mathematical analysis, we compare the concepts of this scheme with those of the existing buyer coalition scheme. The results show that the total discount of the coalition under this scheme is larger than that under the existing buyer coalition scheme.
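
The abstract does not state the exact division rule, so the following sketch only illustrates one way a total discount could be divided according to minimum required savings. It assumes a proportional rule; the function name `divide_discount` and the numbers in the usage note are hypothetical and are not taken from the paper.

```python
def divide_discount(total_discount, min_savings):
    """Split a coalition's total discount among its buyers (illustrative sketch).

    Assumed rule: each buyer's share is proportional to the minimum required
    saving that buyer proposed. Returns None when the total discount cannot
    cover every buyer's minimum, i.e. the coalition is not viable.
    """
    required = sum(min_savings)
    if required <= 0 or total_discount < required:
        return None
    # The full discount is handed out, and each share is at least the
    # buyer's own minimum because total_discount >= required.
    return [total_discount * m / required for m in min_savings]
```

For example, with a total discount of 90 and minimum required savings of 10, 20, and 30, this rule yields shares of 15, 30, and 45. Because the whole discount is distributed, no buyer's share can be raised without lowering another's, which is the sense in which such a division is Pareto optimal.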