
Pattern Selection Using the Bias and Variance of Ensemble




Abstract: A useful pattern is one that contributes much to learning. For a classification problem, the patterns near the class boundary surfaces carry more information to the classifier. For a regression problem, the ones near the estimated surface carry more information. In both cases, usefulness is defined only for patterns with no error or with negligible error. Using only the useful patterns gives several benefits. First, the memory and time required for learning decrease. Second, overfitting is avoided even when the learner is over-sized. Third, learning results in more stable learners. In this paper, we propose a pattern "utility index" that measures the utility of an individual pattern. The utility index is based on the bias and variance of a pattern trained by a network ensemble. In classification, a pattern with a low bias and a high variance gets a high score. In regression, on the other hand, a pattern with a low bias and a low variance gets a high score. Based on the distribution of the utility index, the original training set is divided into a high-score group and a low-score group. Only the high-score group is then used for training. The proposed method is tested on synthetic and real-world benchmark datasets and gives better or at least similar performance.
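
The selection idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formula for the utility index, so everything here is an assumption made for illustration: the function names, the per-pattern bias and variance definitions (squared error of the ensemble mean and population variance for regression; 0/1 error of the majority vote and the disagreement rate for classification), the scoring rule in `utility`, and the fixed `threshold` used to split the training set. It is not the authors' method.

```python
from collections import Counter
from statistics import mean, pvariance


def regression_bias_variance(member_preds, target):
    """Squared bias and variance of one pattern's ensemble predictions."""
    avg = mean(member_preds)
    return (avg - target) ** 2, pvariance(member_preds)


def classification_bias_variance(member_preds, target):
    """0/1 bias of the majority vote and disagreement rate with that vote."""
    main = Counter(member_preds).most_common(1)[0][0]
    bias = 0.0 if main == target else 1.0
    variance = sum(p != main for p in member_preds) / len(member_preds)
    return bias, variance


def utility(bias, variance, task):
    """Hypothetical score (an assumption, not the paper's index)."""
    if task == "classification":
        # Low bias and high variance should score high.
        return (1.0 - bias) * variance
    # Regression: low bias and low variance should score high.
    return -(bias + variance)


def select_high_score(scores, threshold):
    """Indices of patterns whose score reaches the (assumed) threshold."""
    return [i for i, s in enumerate(scores) if s >= threshold]


# Toy classification data: each row holds the labels predicted for one
# pattern by four ensemble members.
ensemble_preds = [[0, 0, 0, 0], [1, 0, 1, 1], [0, 0, 0, 1]]
targets = [0, 1, 1]

scores = [
    utility(*classification_bias_variance(preds, y), task="classification")
    for preds, y in zip(ensemble_preds, targets)
]
selected = select_high_score(scores, threshold=0.1)
```

In this toy run the first pattern is predicted unanimously and correctly (zero variance), the third is misclassified by the majority vote (high bias), and only the second, which is correct on the vote but contested among members, is selected. How the split point is actually derived from the distribution of the index is left open here.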

Author(s): Shin, H. and Cho, S.
Journal: Journal of the Korean Institute of Industrial Engineers
Volume: 28
Number (issue): 1
Pages: 112-127
Year: 2001
Month: March

Department(s): Empirical Inference
Bibtex Type: Article (article)

Institution: Seoul National University, Seoul, Korea
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik


  title = {Pattern Selection Using the Bias and Variance of Ensemble},
  author = {Shin, H. and Cho, S.},
  journal = {Journal of the Korean Institute of Industrial Engineers},
  volume = {28},
  number = {1},
  pages = {112-127},
  organization = {Max-Planck-Gesellschaft},
  institution = {Seoul National University, Seoul, Korea},
  school = {Biologische Kybernetik},
  month = mar,
  year = {2001},
  month_numeric = {3}