Strength or Accuracy: Credit Assignment in Learning Classifier Systems

From the Distinguished Dissertations series

Fr. 191.00

incl. statutory VAT

Also available as:

  • Hardcover: from Fr. 191.00
  • Paperback: from Fr. 182.00
  • eBook: from Fr. 187.90

Description

Review

From the reviews:



"This book is a monograph on learning classifier systems … . The main objective of the book is to compare strength-based classifier systems with accuracy-based systems. … The book is equipped with nine appendices. … The biggest advantage of the book is its readability. The book is well written and is illustrated with many convincing examples." (Jerzy W. Grzymal-Busse, Mathematical Reviews, Issue 2005 k)

Details

Binding: Hardcover
Publication date: 20.01.2004
Publisher: Springer London
Number of pages: 307
Dimensions (L/W/H): 24.1/16/2.3 cm
Weight: 658 g
Edition: 2004
Language: English
ISBN: 978-1-85233-770-4

Further volumes in the Distinguished Dissertations series

Table of Contents
1 Introduction
  1.1 Two Example Machine Learning Tasks
  1.2 Types of Task
    1.2.1 Supervised and Reinforcement Learning
    1.2.2 Sequential and Non-sequential Decision Tasks
  1.3 Two Challenges for Classifier Systems
    1.3.1 Problem 1: Learning a Policy from Reinforcement
    1.3.2 Problem 2: Generalisation
  1.4 Solution Methods
    1.4.1 Method 1: Reinforcement Learning Algorithms
    1.4.2 Method 2: Evolutionary Algorithms
  1.5 Learning Classifier Systems
    1.5.1 The Tripartite LCS Structure
    1.5.2 LCS = Policy Learning + Generalisation
    1.5.3 Credit Assignment in Classifier Systems
    1.5.4 Strength and Accuracy-based Classifier Systems
  1.6 About the Book
    1.6.1 Why Compare Strength and Accuracy
    1.6.2 Are LCS EC- or RL-based
    1.6.3 Moving in Design Space
  1.7 Structure of the Book
2 Learning Classifier Systems
  2.1 Types of Classifier Systems
    2.1.1 Michigan and Pittsburgh LCS
    2.1.2 XCS and Traditional LCS?
  2.2 Representing Rules
    2.2.1 The Standard Ternary Language
    2.2.2 Other Representations
    2.2.3 Summary of Rule Representation
    2.2.4 Notation for Rules
  2.3 XCS
    2.3.1 Wilson’s Motivation for XCS
    2.3.2 Overview of XCS
    2.3.3 Wilson’s Explore/Exploit Framework
    2.3.4 The Performance System
      2.3.4.1 The XCS Performance System Algorithm
      2.3.4.2 The Match Set and Prediction Array
      2.3.4.3 Action Selection
      2.3.4.4 Experience-weighting of System Prediction
    2.3.5 The Credit Assignment System
      2.3.5.1 The MAM Technique
      2.3.5.2 The Credit Assignment Algorithm
      2.3.5.3 Sequential and Non-sequential Updates
      2.3.5.4 Parameter Update Order
      2.3.5.5 XCS Parameter Updates
    2.3.6 The Rule Discovery System
      2.3.6.1 Random Initial Populations
      2.3.6.2 Covering
      2.3.6.3 The Niche Genetic Algorithm
      2.3.6.4 Alternative Mutation Schemes
      2.3.6.5 Triggering the Niche GA
      2.3.6.6 Deletion of Rules
      2.3.6.7 Classifier Parameter Initialisation
      2.3.6.8 Subsumption Deletion
  2.4 SB-XCS
    2.4.1 Specification of SB-XCS
    2.4.2 Comparison of SB-XCS and Other Strength LCS
  2.5 Initial Tests of XCS and SB-XCS
    2.5.1 The 6 Multiplexer
    2.5.2 Woods2
  2.6 Summary
3 How Strength and Accuracy Differ
  3.1 Thinking about Complex Systems
  3.2 Holland’s Rationale for CS-1 and his Later LCS
    3.2.1 Schema Theory
    3.2.2 The Bucket Brigade
    3.2.3 Schema Theory + Bucket Brigade = Adaptation
  3.3 Wilson’s Rationale for XCS
    3.3.1 A Bias towards Accurate Rules
    3.3.2 A Bias towards General Rules
    3.3.3 Complete Maps
    3.3.4 Summary
  3.4 A Rationale for SB-XCS
  3.5 Analysis of Populations Evolved by XCS and SB-XCS
    3.5.1 SB-XCS
    3.5.2 XCS
    3.5.3 Learning Rate
  3.6 Different Goals, Different Representations
    3.6.1 Default Hierarchies
    3.6.2 Partial and Best Action Maps
    3.6.3 Complete Maps
    3.6.4 What do XCS and SB-XCS Really Learn?
  3.7 Complete and Partial Maps Compared
    3.7.1 Advantages of Partial Maps
    3.7.2 Disadvantages of Partial Maps
    3.7.3 Complete Maps and Strength
    3.7.4 Contrasting Complete and Partial Maps in RL Terminology
    3.7.5 Summary of Comparison
  3.8 Ability to Express Generalisations
    3.8.1 Mapping Policies and Mapping Value Functions
    3.8.2 Adapting the Accuracy Criterion
    3.8.3 XCS-hard and SB-XCS-easy Functions
    3.8.4 Summary of Generalisation and Efficiency
  3.9 Summary
4 What Should a Classifier System Learn?
  4.1 Representing Boolean Functions
    4.1.1 Truth Tables
    4.1.2 On-sets and Off-sets
    4.1.3 Sigma Notation
    4.1.4 Disjunctive Normal Form
    4.1.5 Representing Functions with Sets of Rules
  4.2 How Should a Classifier System Represent a Solution?
    4.2.1 The Value of a Single Rule
    4.2.2 The Value of a Set of Rules
    4.2.3 Complete and Correct Representations
    4.2.4 Minimal Representations
    4.2.5 Non-overlapping Representations
    4.2.6 Why XCS Prefers Non-overlapping Populations
    4.2.7 Should we Prefer Non-overlapping Populations?
    4.2.8 Optimal Rule Sets: [O]s
    4.2.9 Conflicting Rules
    4.2.10 Representation in XCS
  4.3 How Should We Measure Performance?
    4.3.1 Measures of Performance
    4.3.2 Measures of Population State
    4.3.3 Measuring Performance and Measuring State
    4.3.4 New Population State Metrics
    4.3.5 Testing XCS with %[PI]
    4.3.6 Testing XCS with %[m-DNF]
    4.3.7 Summary of Metrics and Properties
  4.4 Summary
5 Prospects for Adaptation
  5.1 Known Problems with Strength LCS
  5.2 Methodology for Rule Type Analysis
  5.3 Analysis of Rule Types
    5.3.1 Correct and Incorrect Actions
    5.3.2 Overgeneral Rules
    5.3.3 Strong Overgeneral Rules
    5.3.4 Fit Overgeneral Rules
    5.3.5 Parallel Definitions of Strength and Fitness
  5.4 When are Strong and Fit Overgenerals Possible?
    5.4.1 Biases in the Reward Function are Relevant
    5.4.2 Competition for Action Selection
    5.4.3 Competition for Reproduction
  5.5 Strong Overgenerals in XCS
    5.5.1 Biases between Actions do not Produce Strong Overgenerals
    5.5.2 Some Properties of Accuracy-based Fitness
  5.6 Strong Overgenerals in SB-XCS
    5.6.1 When are Strong Overgenerals Impossible in SB-XCS?
    5.6.2 What Makes Strong Overgenerals Possible in SB-XCS?
  5.7 Fit Overgenerals and the Survival of Rules under the GA
    5.7.1 Comparison on an Unbiased Reward Function
    5.7.2 Comparison on a Biased Reward Function
    5.7.3 Discussion
  5.8 Designing Strong and Fit Overgenerals for XCS
    5.8.1 Biased Variance Functions
    5.8.2 Empirical Results
    5.8.3 Avoiding Fit Overgenerals
    5.8.4 SB-XCS and Biased Variance Functions
  5.9 Strong and Fit Undergeneral Rules
  5.10 Why Bias the Reward Function
    5.10.1 Some State-actions are more Important than Others
    5.10.2 A Rule Allocation Bias can Focus Resources
  5.11 Rule Allocation Reconsidered
    5.11.1 Knowing What Not to Do
    5.11.2 Managing Exploration
    5.11.3 Complete and Partial Maps Revisited
    5.11.4 Alternatives to Biasing the Reward Function
    5.11.5 Can SB-XCS Avoid Strong and Fit Overgenerals?
  5.12 Sequential Tasks
    5.12.1 The Need to Pass Values Back
    5.12.2 The Need for Discounting
    5.12.3 How Q-functions become Biased
    5.12.4 Examples
    5.12.5 Woods2 Revisited
    5.12.6 When Will the Value Function be Unbiased?
  5.13 What Tasks can we Solve with SB-XCS?
  5.14 Extensions
    5.14.1 Fitness Sharing
    5.14.2 Other Factors Contributing to Strong Overgenerals
    5.14.3 Qualitative and Quantitative Approaches
  5.15 Summary
6 Classifier Systems and Q-learning
  6.1 Classifier Systems and Q-learning
    6.1.1 Q-learning in Classifier Systems
    6.1.2 Is it Really Q-learning?
    6.1.3 XCS is a Proper Generalisation of Tabular Q-learning
    6.1.4 Summary
  6.2 The GA-view and RL-view Revisited
    6.2.1 How SB-XCS Determines Policies
    6.2.2 How XCS Determines Policies
    6.2.3 Three Approaches to Determining a Policy
    6.2.4 The GA-view and the RL-view
    6.2.5 Combining Evolution and Q-learning
  6.3 XCS is Closer to Tabular Q-learning than to SB-XCS
  6.4 Summary
7 Conclusion
  7.1 The Capacities of Various Types of LCS
  7.2 Contributions
  7.3 The Take-home Message
  7.4 Open Problems and Future Work
    7.4.1 Fitness Sharing and Strength-based Fitness
    7.4.2 Further Study of Accuracy-based Fitness
  7.5 Concluding Remarks
    7.5.1 The Moral of the Story: The Need for a Complex Systems Design Methodology
    7.5.2 Classifier Systems and Reinforcement Learning
    7.5.3 The Future
Appendices
A Evaluation of Macroclassifiers
B Example XCS Cycle
  B.1 The Performance System Algorithm
  B.2 The Credit Assignment Algorithm
  B.3 The Rule Discovery Algorithm
C Learning from Reinforcement
  C.1 Three Learning Paradigms
    C.1.1 Supervised Learning
    C.1.2 Reinforcement Learning
    C.1.3 Unsupervised Learning
  C.2 The Explore/Exploit Dilemma: a Feature of RL
  C.3 Sequential and Non-sequential Tasks
    C.3.1 Immediate Reward and Long-term Value
    C.3.2 Sequential Decisions Imply RL
    C.3.3 Episodic and Continuing Tasks
  C.4 The Agent’s Goal: Maximising Return
    C.4.1 Return and Reward
    C.4.2 Sequential Formulations of Return
  C.5 Formalising RL Tasks
    C.5.1 Environment
    C.5.2 Learning Agent
    C.5.3 Agent-environment Interaction
  C.6 Summary
D Generalisation Problems
  D.1 Why Generalise?
    D.1.1 The Curse of Dimensionality
    D.1.2 The Need for Generalisation
  D.2 Generalisation in RL
    D.2.1 Generalising Over Policies and Value Functions
  D.3 State Aggregation
  D.4 State Space and Generalisation Space
  D.5 Summary
E Value Estimation Algorithms
  E.1 The Value of State-actions
  E.2 Non-sequential RL: Estimating Reward Functions
    E.2.1 The Value of State-actions in Non-sequential Tasks
  E.3 Estimating Expectations with Sample Averages
    E.3.1 Incremental Updates
    E.3.2 A General Form of Incremental Update
    E.3.3 Setting StepSize in Incremental Updates
    E.3.4 A Prediction Algorithm for Non-sequential RL
  E.4 Sequential RL: Estimating Long-term Value Functions
    E.4.1 Long-term Value Functions
    E.4.2 The Value of State-actions in Sequential Tasks
    E.4.3 The Value of a Policy
    E.4.4 Estimating Values with Monte Carlo Methods
    E.4.5 Estimating Values with Temporal Difference Methods
    E.4.6 Russell and Norvig’s Maze: A Sequential RL Task
    E.4.7 Summary of Sequential RL
  E.5 State Aggregation
    E.5.1 Fixed and Adaptive Aggregation Schemes
    E.5.2 The Value of Aggregations I: Return
    E.5.3 The Value of Aggregations II: Predictive Utility
  E.6 Storing Value Estimates
    E.6.1 Storing Estimates of Aggregations
    E.6.2 Sparse Estimators, Models and Search
    E.6.3 Function Approximators
  E.7 Summary
F Generalised Policy Iteration Algorithms
  F.1 Policy Improvement
  F.2 Optimal Policies
  F.3 Generalised Policy Iteration
    F.3.1 How Well must we Evaluate a Policy?
    F.3.2 Convergence Properties of GPI Control Algorithms
    F.3.3 Initialising Value Functions
    F.3.4 What Characterises GPI Algorithms?
  F.4 State-value Functions
  F.5 Summary
G Evolutionary Algorithms
  G.1 Evolution
  G.2 Elements of EAs
    G.2.1 A Generic EA
    G.2.2 Population-based Search
    G.2.3 Fitness Functions
    G.2.4 Probabilistic Selection of Parents
    G.2.5 Genetic Operators
    G.2.6 Replacement
  G.3 EAs as Search
    G.3.1 Local and Global Optima
  G.4 The Generalisation Problem
  G.5 Niching and Mating Restriction
    G.5.1 Fitness Sharing
    G.5.2 Crowding
    G.5.3 Mating Restriction
  G.6 RL with EAs
    G.6.1 Non-associative RL with an EA
    G.6.2 Associative RL with an EA
    G.6.3 Sequential RL with an EA
  G.7 Comparing GPI and EA Methods for RL
    G.7.1 Similarities between GPI and EA Methods
  G.8 Summary
H The Origins of Sarsa
  H.1 Modified Connectionist Q-learning
  H.2 ZCS’s Implicit Bucket Brigade
  H.3 Who Invented Sarsa?
I Notation
References