Abstract: In this paper, we study a distributed control algorithm
for the problem of unknown area coverage by a network of robots.
The coverage objective is to locate a set of targets in the area and
to minimize the robots’ energy consumption. The robots have no prior knowledge of either the locations or the number of targets in the area. One efficient way to compensate for this lack of knowledge is to incorporate an auxiliary learning algorithm into the control scheme: the learning algorithm allows the robots to explore the unknown environment and gradually overcome their lack of knowledge. The control algorithm itself is modeled using game theory, with the networked robots using their collective information to play a non-cooperative potential game. The algorithm is tested via simulations to verify its
performance and adaptability.
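To make the game-theoretic formulation above concrete, the following is a minimal sketch of a potential-game coverage setup driven by log-linear learning, one standard learning rule for potential games. The grid world, Chebyshev sensing radius, marginal-contribution (wonderful-life) utility, and temperature are illustrative assumptions rather than details taken from the abstract, and the sketch lets each robot evaluate coverage against the full target set, glossing over the exploration aspect that the paper's learning algorithm handles.

```python
import math
import random

# Illustrative setup (not from the paper): robots occupy grid cells, a target
# counts as covered if some robot is within sensing range, and each robot's
# utility is its marginal contribution to total coverage, which makes the game
# a potential game whose potential is the coverage count.
GRID, SENSE, TAU = 10, 1, 0.2
TARGETS = {(2, 3), (7, 7), (5, 1)}            # unknown to the robots beforehand

def coverage(positions):
    """Number of targets within Chebyshev distance SENSE of any robot."""
    return sum(
        any(max(abs(t[0] - p[0]), abs(t[1] - p[1])) <= SENSE for p in positions)
        for t in TARGETS
    )

def marginal_utility(i, positions):
    """Wonderful-life utility of robot i (its marginal contribution)."""
    others = positions[:i] + positions[i + 1:]
    return coverage(positions) - coverage(others)

def neighbors(p):
    """Candidate cells: stay put or move one step in a cardinal direction."""
    moves = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
    return [(p[0] + dx, p[1] + dy) for dx, dy in moves
            if 0 <= p[0] + dx < GRID and 0 <= p[1] + dy < GRID]

def log_linear_step(positions):
    """One asynchronous log-linear (Boltzmann) update for a random robot."""
    i = random.randrange(len(positions))
    candidates = neighbors(positions[i])
    utilities = []
    for c in candidates:
        trial = list(positions)
        trial[i] = c
        utilities.append(marginal_utility(i, trial))
    weights = [math.exp(u / TAU) for u in utilities]
    positions[i] = random.choices(candidates, weights=weights)[0]
    return positions

positions = [(0, 0), (9, 0), (0, 9)]
for _ in range(500):
    positions = log_linear_step(positions)
print(positions, coverage(positions))
```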
Abstract: In this paper, we provide a literature survey on artificial stock markets (ASMs). The paper begins by exploring the complexity of the stock market and the need for ASMs. An ASM
aims to investigate the link between individual behaviors (micro
level) and financial market dynamics (macro level). The variety of
patterns at the macro level is a function of the ASM's complexity. The
financial market system is a complex system where the relationship
between the micro and macro level cannot be captured analytically.
Computational approaches, such as simulation, are expected to capture this connection. Agent-based simulation is a technique commonly used to build ASMs. The paper proceeds by
discussing the components of an ASM. We consider the role of behavioral finance (BF) alongside the traditional risk-aversion assumption in constructing agents' attributes. The influence of social networks on the design of agent interactions is also addressed: network topologies such as small-world, distance-based, and scale-free networks may be used to model economic collaborations. In addition, the primary methods for developing agents' learning and adaptive abilities are summarized, including approaches such as Genetic Algorithms, Genetic Programming, artificial neural networks, and Reinforcement Learning.
The most common statistical properties of stock markets (the stylized facts) that are used for calibration and validation of ASMs are also discussed. We further review the major related studies and categorize the approaches they employ. Finally, research directions and potential research questions are discussed: future work on ASMs may focus on the macro level, by analyzing market dynamics, or on the micro level, by investigating the wealth distributions of the agents.
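As a purely illustrative companion to the survey above, the sketch below shows the kind of agent-based price-formation loop that many artificial stock markets build on: heterogeneous agents submit demands and the price moves with aggregate excess demand. The single asset, the fundamentalist/chartist split, and the linear price-impact rule are assumptions made for the example; the surveyed models differ considerably in their details.

```python
import random

# Toy artificial-market loop (illustrative only): fundamentalists trade toward
# a fundamental value, chartists follow the recent trend, and the price moves
# proportionally to aggregate excess demand plus noise.
N_AGENTS, STEPS = 100, 1000
FUNDAMENTAL, IMPACT = 100.0, 0.01

agents = [{"type": random.choice(["fundamentalist", "chartist"]),
           "aggressiveness": random.uniform(0.1, 1.0)} for _ in range(N_AGENTS)]

prices = [FUNDAMENTAL, FUNDAMENTAL]
for _ in range(STEPS):
    p, prev = prices[-1], prices[-2]
    demand = 0.0
    for a in agents:
        if a["type"] == "fundamentalist":
            demand += a["aggressiveness"] * (FUNDAMENTAL - p)   # buy if underpriced
        else:
            demand += a["aggressiveness"] * (p - prev)          # chase the trend
    noise = random.gauss(0.0, 0.5)
    prices.append(p + IMPACT * demand + noise)

returns = [prices[t + 1] - prices[t] for t in range(len(prices) - 1)]
print(min(prices), max(prices), sum(abs(r) for r in returns) / len(returns))
```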
Abstract: Investigating language acquisition is one of the most challenging problems in the study of language. Syllable learning, as one level of language acquisition, is of considerable significance because of the important role it plays in acquiring a language. Because language acquisition cannot be studied directly in children, especially during its developmental phases, computer models are useful for examining it. In this paper, a computer model of early language learning for syllable learning is proposed. It is guided by a conceptual model of syllable learning, the Directions Into Velocities of Articulators (DIVA) model. The computer model uses simple associational and reinforcement learning rules, inspired by neuroscience, within a neural network architecture. Our simulation results verify the ability of the proposed computer model to produce phonemes during babbling and early speech. The model also provides a framework for examining the neural basis of language learning and communication disorders.
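The abstract does not state the learning equations, so the following is only a sketch of one common form that "simple associational and reinforcement learning rules" can take: a reward-modulated Hebbian update that strengthens a mapping from auditory targets to articulator commands whenever a randomly perturbed (babbled) command yields a sound closer to the target. The dimensions, the linear vocal-tract stand-in, and the update rule itself are illustrative assumptions, not the proposed model.

```python
import numpy as np

# Illustrative reward-modulated associational update (not the paper's exact
# equations): a linear map from auditory targets to articulator commands is
# adjusted according to whether a randomly perturbed command ("babbling")
# produces speech closer to the target than the unperturbed command.
rng = np.random.default_rng(0)
N_AUD, N_ART = 8, 6
W = np.zeros((N_ART, N_AUD))                  # auditory-to-articulatory weights

def vocal_tract(command):
    """Stand-in forward model: articulator command -> auditory consequence."""
    A = np.linspace(0.5, 1.5, N_AUD * N_ART).reshape(N_AUD, N_ART)
    return A @ command

eta, sigma = 0.05, 0.3
for trial in range(2000):
    target = rng.random(N_AUD)                # desired sound (e.g., a syllable)
    command = W @ target + sigma * rng.standard_normal(N_ART)   # babble
    produced = vocal_tract(command)
    reward = -np.linalg.norm(produced - target)                 # closer is better
    baseline = -np.linalg.norm(vocal_tract(W @ target) - target)
    # Hebbian term (perturbation x target) gated by the reward improvement.
    W += eta * (reward - baseline) * np.outer(command - W @ target, target)
```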
Abstract: Modeling the behavior of dialogue management in
the design of a spoken dialogue system using statistical methodologies
is currently a growing research area. This paper presents work
on developing an adaptive learning approach to optimize dialogue
strategy. At the core of our system is a method formalizing dialogue
management as sequential decision making under uncertainty whose underlying probabilistic structure is a Markov chain. Researchers have mostly focused on model-free algorithms for automating the design of dialogue management using machine learning techniques such as reinforcement learning, but model-free algorithms face a dilemma in balancing exploration against exploitation.
Hence we present a model-based online policy learning
algorithm using interconnected learning automata for optimizing
dialogue strategy. The proposed algorithm is capable of deriving
an optimal policy that prescribes what action should be taken in
various states of conversation so as to maximize the expected total
reward for attaining the goal, and it incorporates good exploration and exploitation in its updates to improve the naturalness of human-computer interaction. We test the proposed approach with the PARADISE evaluation framework on access to a railway information system.
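The abstract names interconnected learning automata as the policy-learning mechanism; the sketch below shows a single variable-structure learning automaton with the standard linear reward-inaction (L_R-I) update as an illustrative building block. The dialogue actions, step size, and toy feedback are assumptions for the example, and the paper's interconnection of automata across dialogue states is not reproduced here.

```python
import random

# Illustrative building block (details assumed, not taken from the paper): a
# single variable-structure learning automaton with the linear reward-inaction
# (L_R-I) update. In a dialogue manager, one automaton per dialogue state could
# choose among system actions (e.g., ask, confirm, present information).
class LearningAutomaton:
    def __init__(self, n_actions, a=0.1):
        self.p = [1.0 / n_actions] * n_actions   # action probabilities
        self.a = a                                # reward step size

    def choose(self):
        return random.choices(range(len(self.p)), weights=self.p)[0]

    def update(self, action, beta):
        """beta in [0, 1]: environment feedback, 1 = full reward (no penalty step)."""
        for i in range(len(self.p)):
            if i == action:
                self.p[i] += self.a * beta * (1.0 - self.p[i])
            else:
                self.p[i] -= self.a * beta * self.p[i]

# Toy usage: action 2 is rewarded most often, so its probability should grow.
la = LearningAutomaton(n_actions=3)
for _ in range(5000):
    act = la.choose()
    p_success = 0.8 if act == 2 else 0.2
    la.update(act, 1.0 if random.random() < p_success else 0.0)
print([round(x, 3) for x in la.p])
```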
Abstract: Conceptualization strengthens intelligent systems in generalization skill, effective knowledge representation, real-time inference, and the management of uncertain and indefinite situations, in addition to facilitating knowledge communication for learning agents situated in the real world. Concept learning introduces a form of abstraction by which the continuous state space is organized into entities called concepts; these are connected to the action space and thus, in a sense, summarize the complex action space. Among computational concept learning approaches, action-based conceptualization is favored because of its simplicity and its mirror-neuron foundations in neuroscience. In this paper, a new biologically inspired concept learning approach based on a probabilistic framework is proposed. This approach exploits and extends the mirror neuron's role in conceptualization for a reinforcement learning agent in nondeterministic environments. In the proposed method, instead of building a huge numerical knowledge base, the concepts are learnt gradually from rewards through interaction with the environment. Moreover, the probabilistic formation of the concepts is employed to deal with the uncertain and dynamic nature of real problems, in addition to providing the ability to generalize. These characteristics as a whole distinguish the proposed learning algorithm from both a pure classification algorithm and typical reinforcement learning. Simulation results show the advantages of the proposed framework in terms of convergence speed as well as generalization and asymptotic behavior, because it utilizes both successful and failed attempts through the received rewards. Experimental results, on the other hand, show the applicability and effectiveness of the proposed method in continuous and noisy environments for a real robotic task such as maze traversal, as well as the benefits of implementing an incremental learning scenario in artificial agents.
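The abstract does not give the formal construction, so the following is a rough, assumption-laden illustration of reward-driven probabilistic concept formation: concepts are prototypes in a continuous state space with graded membership, each holding action values that are updated from rewards in proportion to membership, which yields generalization across nearby states. None of the specific choices (Gaussian membership, softmax action selection, the update rule) are taken from the paper.

```python
import math
import random

# Rough illustration only: concepts are prototypes in a continuous state space,
# each holding a soft preference over actions that is sharpened by rewards
# (both successes and failures) received while the agent is near that concept.
class Concept:
    def __init__(self, prototype, n_actions):
        self.prototype = prototype
        self.value = [0.0] * n_actions            # reward-driven action values

    def membership(self, state, width=1.0):
        d2 = sum((s - p) ** 2 for s, p in zip(state, self.prototype))
        return math.exp(-d2 / (2 * width ** 2))   # Gaussian membership degree

    def action_probs(self, tau=0.5):
        w = [math.exp(v / tau) for v in self.value]
        z = sum(w)
        return [x / z for x in w]

def act_and_learn(concepts, state, reward_fn, alpha=0.1):
    # Generalization: every concept is updated in proportion to its membership.
    m = [c.membership(state) for c in concepts]
    winner = concepts[m.index(max(m))]
    action = random.choices(range(len(winner.value)), weights=winner.action_probs())[0]
    r = reward_fn(state, action)
    for c, mu in zip(concepts, m):
        c.value[action] += alpha * mu * (r - c.value[action])
    return action, r

# Toy usage: two concepts, two actions, reward favors action 0 in the left half.
concepts = [Concept((0.0, 0.0), 2), Concept((1.0, 1.0), 2)]
for _ in range(1000):
    s = (random.random(), random.random())
    act_and_learn(concepts, s, lambda st, a: 1.0 if (a == 0) == (st[0] < 0.5) else -1.0)
print([c.value for c in concepts])
```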
Abstract: This research presents a system for post-processing of data that takes mined flat rules as input and discovers crisp as well as fuzzy hierarchical structures using a Learning Classifier System approach. A Learning Classifier System (LCS) is a machine learning technique that combines evolutionary computing, reinforcement learning, supervised or unsupervised learning, and heuristics to produce adaptive systems. An LCS learns by interacting
with an environment from which it receives feedback in the form of
a numerical reward. Learning is achieved by trying to maximize the amount of reward received. A crisp description of a concept usually cannot represent human knowledge completely and practically. In the proposed Learning Classifier System, the initial population is constructed as a random collection of HPR-trees (related production rules), and crisp/fuzzy hierarchies are evolved. A fuzzy subsumption relation is suggested for the proposed system, and based on a Subsumption Matrix (SM), a suitable fitness function is proposed. Suitable genetic operators are proposed for the chosen chromosome representation. For implementing reinforcement, a suitable reward and punishment scheme is also proposed. Experimental results are
presented to demonstrate the performance of the proposed system.
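The abstract does not define its fuzzy subsumption relation; as one possible illustration, the sketch below uses Kosko's fuzzy subsethood measure to fill a small Subsumption Matrix between the (hypothetical) coverage memberships of two rules. The actual relation, and the fitness function built on the SM in the proposed system, may differ.

```python
# Illustrative only: one standard fuzzy subsethood measure (Kosko's) that could
# populate a Subsumption Matrix between fuzzy rule descriptions; the system's
# actual subsumption relation and fitness function are not reproduced here.
def subsethood(mu_a, mu_b):
    """Degree to which fuzzy set A (memberships mu_a) is contained in B."""
    num = sum(min(a, b) for a, b in zip(mu_a, mu_b))
    den = sum(mu_a)
    return num / den if den else 1.0

# Memberships of two rules' coverage over the same five records (made up).
rule_general  = [0.9, 0.8, 0.7, 0.6, 0.9]
rule_specific = [0.9, 0.0, 0.7, 0.0, 0.8]
SM = [[subsethood(a, b) for b in (rule_general, rule_specific)]
      for a in (rule_general, rule_specific)]
print(SM)   # SM[i][j]: degree to which rule i is subsumed by rule j
```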
Abstract: A cognitive collaborative reinforcement learning
algorithm (CCRL) that incorporates an advisor into the learning
process is developed to improve supervised learning. An autonomous
learner is enabled with a self-awareness cognitive skill to decide
when to solicit instructions from the advisor. The learner can also
assess the value of advice, and accept or reject it. The method is
evaluated for robotic motion planning using simulation. Tests are
conducted for advisors with skill levels from expert to novice. The
CCRL algorithm, and a combined method integrating its logic with Clouse's Introspection Approach, outperformed a baseline fully autonomous learner and demonstrated robust performance across advisor skill levels, learning to accept advice received from an expert while rejecting that of less skilled collaborators. Although the CCRL algorithm is based on RL, it fits other machine learning methods, since the advisor's actions are only added to the outer layer.
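The criteria the learner uses are not spelled out in the abstract, so the sketch below is a hedged illustration of the general idea: a Q-learning agent solicits advice when its Q-values for the current state are nearly indistinguishable (low self-assessed confidence) and keeps accepting advice only while a running estimate of the advisor's usefulness stays non-negative. The confidence margin, the TD-based advisor score, and the epsilon-greedy fallback are assumptions made for the example.

```python
import random
from collections import defaultdict

# Hedged sketch (criteria assumed, not from the paper): a Q-learner asks the
# advisor when it is not confident in the current state, and judges the value
# of the advice by the temporal-difference error it produces.
class AdviceSeekingQLearner:
    def __init__(self, actions, alpha=0.1, gamma=0.95, conf_margin=0.05):
        self.Q = defaultdict(lambda: {a: 0.0 for a in actions})
        self.actions = actions
        self.alpha, self.gamma, self.conf_margin = alpha, gamma, conf_margin
        self.advisor_score = 0.0                 # running estimate of advice value

    def confident(self, state):
        vals = sorted(self.Q[state].values(), reverse=True)
        return (vals[0] - vals[1]) > self.conf_margin   # needs >= 2 actions

    def select(self, state, advisor=None, epsilon=0.1):
        if advisor and not self.confident(state) and self.advisor_score >= 0.0:
            return advisor(state), True          # solicit and accept advice
        if random.random() < epsilon:
            return random.choice(self.actions), False
        return max(self.Q[state], key=self.Q[state].get), False

    def learn(self, s, a, r, s_next, advised):
        target = r + self.gamma * max(self.Q[s_next].values())
        td = target - self.Q[s][a]
        self.Q[s][a] += self.alpha * td
        if advised:                              # judge the advisor by the TD signal
            self.advisor_score += 0.1 * (td - self.advisor_score)

# Minimal usage with a hypothetical advisor that always suggests action 1.
learner = AdviceSeekingQLearner(actions=[0, 1])
a, advised = learner.select("greet", advisor=lambda s: 1)
learner.learn("greet", a, 1.0, "ask_slot", advised)
```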
Abstract: In the recent past, Learning Classifier Systems have been successfully used for data mining. A Learning Classifier System (LCS) is a machine learning technique which combines evolutionary computing, reinforcement learning, supervised or unsupervised learning, and heuristics to produce adaptive systems. An LCS learns by interacting with an environment from which it receives feedback in the form of a numerical reward. Learning is achieved by trying to maximize the amount of reward received. All
LCS models, more or less, comprise four main components: a finite population of condition–action rules, called classifiers; the performance component, which governs the interaction with the environment; the credit assignment component, which distributes the reward received from the environment to the classifiers accountable for the rewards obtained; and the discovery component, which is responsible for discovering better rules and improving existing ones through a genetic algorithm. When the concatenation of the production rules in the LCS forms the genotype, the GA operates on a population of classifier systems; this approach is known as the 'Pittsburgh' style of Classifier Systems. LCSs that instead perform their GA at the rule level within a single population are known as 'Michigan' Classifier
Systems. The most predominant representation of the discovered
knowledge is the standard production rules (PRs) in the form of IF P
THEN D. The PRs, however, are unable to handle exceptions and do
not exhibit variable precision. Censored Production Rules (CPRs), an extension of PRs proposed by Michalski and Winston, exhibit variable precision and support an efficient mechanism for handling exceptions. A CPR is an augmented
production rule of the form: IF P THEN D UNLESS C, where
Censor C is an exception to the rule. Such rules are employed in situations in which the conditional statement IF P THEN D holds frequently and the assertion C holds rarely. Using a rule of this type, we are free to ignore the exception condition when the resources needed to establish its presence are tight, or when there is simply no information available as to whether it holds. Thus, the IF P
THEN D part of a CPR expresses important information, while the
UNLESS C part acts only as a switch and changes the polarity of D
to ~D. In this paper, the Pittsburgh-style LCS approach is used for the automated discovery of CPRs. An appropriate encoding scheme is suggested to represent a chromosome consisting of a fixed-size set of CPRs. Suitable genetic operators are designed for the set of CPRs and for individual CPRs, and an appropriate fitness function that incorporates basic constraints on CPRs is also proposed. Experimental results are
presented to demonstrate the performance of the proposed learning
classifier system.
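To make the CPR semantics described above concrete, the following sketch encodes IF P THEN D UNLESS C and its evaluation: when the censor cannot be checked (or is false) the rule concludes D, and when the censor is known to hold the conclusion flips to ~D. The Python representation, the example predicates, and the check_censor switch are illustrative assumptions; the chromosome encoding, genetic operators, and fitness function of the proposed system are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Sketch of CPR semantics (IF P THEN D UNLESS C): when the censor cannot be
# checked, the rule still fires with D; when the censor is known to hold, the
# decision flips to ~D.
@dataclass
class CensoredProductionRule:
    premise: Callable[[Dict], bool]      # P
    decision: str                        # D
    censor: Callable[[Dict], bool]       # C (exception, rarely true)

    def fire(self, facts: Dict, check_censor: bool = True):
        if not self.premise(facts):
            return None                              # rule does not apply
        if check_censor and self.censor(facts):
            return "not " + self.decision            # censor flips D to ~D
        return self.decision                         # default conclusion D

# Hypothetical example rule: IF bird THEN can_fly UNLESS penguin.
rule = CensoredProductionRule(
    premise=lambda f: f.get("bird", False),
    decision="can_fly",
    censor=lambda f: f.get("penguin", False),
)
print(rule.fire({"bird": True}))                                       # 'can_fly'
print(rule.fire({"bird": True, "penguin": True}))                      # 'not can_fly'
print(rule.fire({"bird": True, "penguin": True}, check_censor=False))  # resource-tight mode
```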
Abstract: Trust management and reputation models are becoming an integral part of Internet-based applications such as CSCW, e-commerce, and Grid Computing. The trust dimension is also a significant social structure and a key to social relations within a collaborative community. Collaborative Decision Making (CDM) is a difficult task in a distributed environment, where information is spread across different geographical locations and multidisciplinary decisions are involved, as in a Virtual Organization (VO). To aid team decision making in a VO, Decision Support System and social network analysis approaches are integrated. In such situations, social learning helps an organization in terms of relationships, team formation, partner selection, etc. In this paper we focus on trust
learning. Trust learning is an important activity in terms of
information exchange, negotiation, collaboration and trust
assessment for cooperation among virtual team members. We propose a reinforcement learning approach that enhances the trust decision-making capability of interacting agents during collaboration in a problem-solving activity. The trust computational model with learning that we present is adapted for selecting the best alternative for a new project in the organization. We verify our model in a multi-agent simulation where the agents in the community learn to identify trustworthy members and the inconsistent and conflicting behavior of agents.
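The update rule is not given in the abstract, so the following is an illustrative sketch of reinforcement-style trust learning: each agent keeps a trust value per partner, nudges it toward the outcome of every collaboration, and prefers high-trust partners when selecting collaborators, with occasional exploration. The initial trust, step size, and toy partner behaviors are assumptions made for the example.

```python
import random

# Illustrative trust-learning sketch (update rule assumed, not taken from the
# paper): trust per partner is reinforced by collaboration outcomes, and
# trustworthy partners are preferred when selecting collaborators.
class TrustLearner:
    def __init__(self, partners, alpha=0.2, epsilon=0.1):
        self.trust = {p: 0.5 for p in partners}   # neutral prior trust
        self.alpha, self.epsilon = alpha, epsilon

    def pick_partner(self):
        if random.random() < self.epsilon:        # keep exploring other members
            return random.choice(list(self.trust))
        return max(self.trust, key=self.trust.get)

    def update(self, partner, outcome):
        """outcome in [0, 1]: quality of the collaboration just finished."""
        self.trust[partner] += self.alpha * (outcome - self.trust[partner])

# Toy community: A is unreliable, B is inconsistent, C is reliable.
behavior = {"A": 0.2, "B": 0.5, "C": 0.9}
learner = TrustLearner(behavior)
for _ in range(500):
    p = learner.pick_partner()
    learner.update(p, 1.0 if random.random() < behavior[p] else 0.0)
print(sorted(learner.trust.items(), key=lambda kv: -kv[1]))
```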
Abstract: A novel biologically inspired controller for the autonomous
navigation of a mobile robot in an evasion task is
proposed. The controller takes advantage of the environment by
calculating a measure of danger and subsequently choosing the
parameters of a reinforcement-learning-based decision process. Two different reinforcement learning algorithms were used: Q-learning and Sarsa(λ). Simulations show that selecting the parameters dynamically reduces the time spent in the decision-making process, so the robot can obtain a policy to succeed at the escape task in a realistic time.
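The danger measure and the parameter schedule are not specified in the abstract; the sketch below illustrates the idea with assumed choices: a scalar danger signal (inverse distance to the pursuer) scales the learning rate up and the exploration rate down in a Q-learning escape policy on a one-dimensional toy world.

```python
import random
from collections import defaultdict

# Sketch of the idea only (danger measure and parameter schedule assumed): a
# scalar "danger" signal scales the Q-learning parameters so the robot learns
# and acts more decisively when the threat is close.
def danger(robot_pos, pursuer_pos):
    d = abs(robot_pos - pursuer_pos)
    return 1.0 / (1.0 + d)                      # in (0, 1], higher = more danger

Q = defaultdict(float)                          # Q[(state, action)]
ACTIONS = (-1, +1)                              # move left / right on a line
GAMMA = 0.9

robot, pursuer = 5, 0
for step in range(1000):
    dgr = danger(robot, pursuer)
    alpha = 0.1 + 0.4 * dgr                     # learn faster under threat
    epsilon = 0.3 * (1.0 - dgr)                 # explore less under threat
    state = robot - pursuer
    if random.random() < epsilon:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: Q[(state, a)])
    robot += action
    pursuer += 1 if pursuer < robot else -1     # naive pursuer steps toward robot
    reward = abs(robot - pursuer)               # reward for keeping distance
    next_state = robot - pursuer
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + GAMMA * best_next - Q[(state, action)])
```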