Methods

From aHuman Wiki
Algorithms and Methods


General Methods

  • back-propagation (Werbos, 1974)
  • backward modeling
  • batch training
      • batch back-propagation
      • training by epoch
  • boosting (Schapire, 2001)
  • cascade algorithm
  • constructive training (of neural network)
  • convex optimization
  • covariance training
  • direct error minimization
  • discriminative training methods
  • dynamic algorithms
  • ensemble learning (Krogh and Vedelsby, 1995; Dietterich, 2000)
  • explorative algorithms
  • exploitive algorithms
  • forward modeling
  • genetic algorithms (Goldberg, 1989)
  • global optimization techniques
  • gradient methods
  • greedy learning
  • incremental training (of neural network)
      • incremental back-propagation
      • online training
      • training by pattern
  • kernel methods
  • learning
      • learning-by-example
  • mini-batch training (Wilson and Martinez, 2003)
  • off-line learning
  • off-policy learning
  • on-policy learning
  • online learning
  • policy iteration algorithm (Kaelbling et al., 1996)
  • reinforcement learning
      • episodic reinforcement learning
      • model-based reinforcement learning
      • model-free reinforcement learning
      • non-episodic reinforcement learning
      • off-policy learning
      • on-policy learning
      • tabular reinforcement learning
  • supervised training
  • temporal difference learning algorithms
  • training
  • unsupervised training
  • value iteration algorithm
  • wake-sleep algorithm
      • contrastive wake-sleep
  • weight update algorithm

Common Parameters of Algorithms

  • fixed step-size
  • dynamic step-size
  • patience parameter
  • momentum
  • steepness parameter
  • step size
      • fixed step-size
      • dynamic step-size
  • variational bound
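
As an illustration of how the step size and momentum parameters listed above typically enter a gradient-based weight update, here is a minimal sketch. It is not taken from any entry on this page: the function names, the toy objective f(w) = sum(w_i^2), and the default values are assumptions chosen only for the example.

```python
def sgd_momentum_step(w, velocity, grad, step_size=0.1, momentum=0.9):
    """One weight update with a fixed step size and a momentum term.

    The velocity keeps a decaying sum of past gradient steps, so the
    momentum parameter controls how much of the previous update is reused.
    """
    new_velocity = [momentum * v - step_size * g for v, g in zip(velocity, grad)]
    new_w = [wi + vi for wi, vi in zip(w, new_velocity)]
    return new_w, new_velocity


def minimize_quadratic(steps=200, step_size=0.1, momentum=0.9):
    """Minimize the hypothetical toy objective f(w) = sum(w_i^2)."""
    w = [3.0, -2.0]
    velocity = [0.0, 0.0]
    for _ in range(steps):
        grad = [2.0 * wi for wi in w]  # gradient of sum(w_i^2)
        w, velocity = sgd_momentum_step(w, velocity, grad, step_size, momentum)
    return w


final_w = minimize_quadratic()
```

A dynamic step size would replace the constant `step_size` with a value that changes between calls, for example one that shrinks as training progresses.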

Named Algorithms

  • Backprop
  • Bayesian techniques (Neal, 1996)
  • Cascade 2
      • Cascade 2 with caching
  • Cascade Correlation (Prechelt, 1997)
  • Casper algorithm (Treadgold and Gedeon, 1997)
  • Cerebellar Model Articulation Controller (CMAC) (Albus, 1975; Glanz et al., 1991; Sutton and Barto, 1998)
  • Contrastive Divergence Learning
  • Dyna-Q (Sutton and Barto, 1998)
  • Explicit Explore or Exploit (Kearns and Singh, 1998)
  • Gibbs Sampling
      • Alternating Gibbs Sampling
  • K-Nearest Neighbor
  • Learning Vector Quantization (LVQ)
  • Levenberg-Marquardt (Moré, 1977)
  • Locality-Sensitive Hashing (LSH)
  • Maximum Likelihood (ML) Learning
  • Model-Based Interval Estimation (MBIE) (Strehl and Littman, 2004)
  • Model-Based Policy Gradient methods (MBPG) (Wang and Dietterich, 2003)
  • Monte Carlo Algorithms
  • N-Step Return Algorithm
  • Neural Fitted Q Iteration (Riedmiller, 2005)
      • NFQ-SARSA(λ)
  • Optimal Brain Damage (LeCun et al., 1990)
  • Orthogonal Least Squares (OLS)
  • Particle Swarm (Kennedy and Eberhart, 1995)
  • Prioritized Sweeping (Sutton and Barto, 1998)
  • Q-Learning
      • Delayed Q-learning (Strehl et al., 2006)
      • Generalized Policy Iteration (GPI) (Sutton and Barto, 1998)
      • Naive Q(λ) (Sutton and Barto, 1998)
      • One-Step Q-Learning
      • Peng’s Q(λ) (Peng and Williams, 1994)
      • Q(λ)
      • Watkins’s Q(λ) (Watkins, 1989)
  • Q-SARSA(λ)
      • One-Step Q-SARSA Algorithm
  • Quickprop (Fahlman, 1988)
  • R-Max (Brafman and Tennenholtz, 2002)
  • RPROP (Riedmiller and Braun, 1993)
      • iRPROP- (Igel and Hüsken, 2000)
  • SARSA(λ)
      • On-Policy SARSA-Learning
      • SARSA with Linear Function Approximators (Gordon, 2000)
  • Simulated Annealing (Kirkpatrick et al., 1987)
  • Sliding Window Cache
  • SONN (Tenorio and Lee, 1989)
  • Temporal Difference (TD) Learning
      • Monte Carlo Prediction
      • TD(0) Algorithm (Sutton, 1988)
      • TD(λ) Algorithm
      • TD(λ) Algorithm With Linear Function Approximators (Tsitsiklis and Van Roy, 1996)
      • Temporal Difference (TD) Prediction Algorithm
  • λ-Return Approach
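
To make one of the entries above concrete, the following is a minimal sketch of tabular One-Step Q-Learning. The five-state chain environment, the function names, and all parameter values are hypothetical and chosen only for illustration; they do not come from any source cited on this page.

```python
import random

N_STATES = 5          # states 0..4; state 4 is terminal (hypothetical toy chain)
LEFT, RIGHT = 0, 1


def step(state, action):
    """Deterministic chain: reward 1.0 only when the terminal state is entered."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, step_size=0.5, gamma=0.9, seed=0):
    """Tabular one-step Q-learning with a uniformly random behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Q-learning is off-policy: actions are picked at random here,
            # while the update target uses the greedy value of the next state.
            action = rng.choice((LEFT, RIGHT))
            next_state, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += step_size * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy action in each non-terminal state after training.
greedy = [max((LEFT, RIGHT), key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

Replacing `max(q[next_state])` in the target with the value of the action actually taken in the next state would turn this into an on-policy, SARSA-style update.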