Text-Based Information Retrieval Using Exponentiated Gradient Descent, 
Ron Papka, James P. Callan and Andrew G. Barto 
 3 

Why did TD-Gammon Work?,
Jordan B. Pollack and Alan D. Blair 
 10 

Neural Models for Pain- Whole Hierarchies, 
Maximilian Riesenhuber and Peter Dayan 
 17 

Temporal Low-Order Statistics of Natural Sounds,
H. Attias and C. E. Schreiner 
27 

Reconstructing Stimulus Velocity from Neuronal Responses in Area MT, 
Wyeth Bair, James R. Cavanaugh and J. Anthony Movshon 
 34 

3D Object Recognition: A Model of View-Tuned Neurons,
Emanuela Bricolo, Tomaso Poggio and Nikos Logothetis 
 41 

A Hierarchical Model of Visual Rivalry,
Peter Dayan 
 48 

Neural Network Models of Chemotaxis in the Nematode Caenorhabditis Elegans,
Thomas C. Ferr6e, Ben A. Marcotte and Shawn R. Lockery 
 55 

Extraction of Temporal Features in the Electrosensory System of Weakly Electric Fish,
Fabrizio Gabbiani, Walter Metzner, Ralf Wessel and Christof Koch
 62 

A Neural Model of Visual Contour Integration,
Zhaoping Li 
 69 

Learning Exact Patterns of Quasi-synchronization among Spiking Neurons from Data on Multi-unit Recordings,
Laura Martignon, Kathryn Laskey, Gustavo Deco and Eilon Vaadia
 76 

Complex- Cell Responses Derived from Center-Surround Inputs: The Surprising Power of Intradendritic Computation,
Bartlett W. Mel, Daniel L. Ruderman and Kevin A. Archie 
 83 

Orientation Contrast Sensitivity from Long-range Interactions in Visual Cortex,
Klaus R. Pawelzik, Udo Ernst, Fred Wolf and Theo Geisel 
 90 

Statistically Efficient Estimations Using Comical Lateral Connections,
Alexandre Pouget and Kechen Zhang 
 97 

An Architectural Mechanism for Direction-tuned Comical Simple Cells: The Role of Mutual Inhibition,
Silvio P. Sabatini, Fabio Solari and Giacomo M. Bisio 
 104 

Cholinergic Modulation Preserves Spike Timing Under Physiologically Realistic Fluctuating Input,
Akaysha C. Tang, Andreas M. Bartels and Terrence J. Sejnowski
 111

A Model of Recurrent Interactions in Primary Visual Cortex,
Emanuel Todorov, Athanassios Siapas and David Somers 
118 

Neural Learning in Structured Parameter Spaces m Natural Riemannian Gradient,
 Shun-ichi Amari 
 127 

For Valid Generalization, the Size of the Weights is More Important than the Size of the Network
Peter L. Bartlett 
 134 

Dynamics of Training,
 Siegfried Bos and Manfred Opper 
 141 

Multilayer Neural Networks: One or Two Hidden Layers?,
G. Brightwell, C. Kenyon and Helene Paugam-Moisy 
 148 

Support Vector Regression Machines,
 Harris Dmcker, Chris J.C. Burges, Linda Kaufman, Alex Smola and Vladimir Vapnik 
 155 

Size of Multilayer Networks for Exact Learning: Analytic Approach,
Andr6 Elisseeff and Helene Paugam-Moisy 
 162 

The Effect of Correlated Input Data on the Dynamics of Learning, 
Soren Halkjaer and Ole Winther 
 169 

Practical Confidence and Prediction Intervals,
 Tom Heskes 
 176 

Statisticai Mechanics of the Mixture of Experts,
 Kukjin Kang and Jong-Hoon Oh
183 

MLP Can Provably Generalize Much Better than VC-bounds Indicate, 
A. Kowalczyk and H. Ferra 
 190 

Radial Basis Function Networks and Complexity Regularization in Function Learning,
 Adam Kowalczyk and Tamas Linder 
 197 

An Apobayesian Relative of Winnow,
 Nick Littlestone and Chris Mesterharm 
204 

Noisy Spiking Neurons with Temporal Coding have more Computational Power than Sigmoidal Neurons, 
Wolfgang Maass 
 211 

On the Effect of Analog Noise in Discrete-Time Analog Computations, 
Wolfgang Maass and Pekka Orponen 
 218 

A Mean Field Algorithm for Bayes Learning in Large Feed-forward Neural Networks,
 Manfred Opper and Ole Winther 
 225 

Removing Noise in On-Line Search using Adaptive Batch Sizes,
 Genevieve B. Orr 
232 

Are Hopfield Networks Faster than Conventional Computers ?, 
Ian Parberry and Hung-Li Tseng 
 239 

Hebb Learning of Features based on their Information Content, 
Ferdinand Peper and Hideki Noda 
 246 

The Generalisation Cost of RAMnets,
 Richard Rohwer and Michal Morciniec 
253 

Learning with Noise and Regularizers in Multilayer Neural Networks, 
David Saad and Sara A. Solla 
260 

A Variational Principle for Model-based Morphing,
Lawrence K. Saul and Michael I. Jordan 
267 

Online Learning from Finite Training Sets: An Analytical Case Study,
Peter Sollich and David Barber 
274 

Support Vector Method for Function Approximation,  Regression Estimation, and Signal Processing, 
Vladimir Vapnik, Steven E. Golowich and Alex Smola
281 

The Learning Dynamcis of a Universal Approximator,
Ansgar H. L. West, David Saad and Ian T. Nabney 
 288 

Computing with Infinite Networks,
 Christopher K. I. Williams 
 295 

Microscopic Equations in Rough Energy Landscape for Neural Networks, 
K. Y. Michael Wong 
 302 

Time Series Prediction using Mixtures of Experts,
Assaf J. Zeevi, Ron Meir and Robert J. Adler 
309 

Genetic Algorithms and Explicit Search Statistics,
 Shumeet B aluja 
 319 

Consistent Classification,
 Firm and Soft, Yoram Baram 
 326 

Bayesian Model Comparison by Monte Carlo Chaining,
David Barber and Christopher M. Bishop 
 333 

Gaussian Processes for Bayesian Classification via Hybrid Monte Carlo,
David Barber and Christopher K. I. Williams 
 340 

Regression with Input-Dependent Noise: A Bayesian Treatment,
Christopher M. Bishop and Cazhaow S. Qazaz 
 347 

GTM: A Principled Alternative to the Self- Organizing Map,
Christopher M. Bishop, Markus Svens6n and Christopher K. I. Williams
 354 

The CONDENSATION Algorithm-- Conditional Density Propagation and Applications to Visual Tracking,
 A. Blake and M. Isard 
 361 

Clustering via Concave Minimization,
P.S. Bradley, O. L. Mangasarian and W. N. Street 
 368 

Improving the Accuracy and Speed of Support Vector Machines,
Chris J.C. Burges and B. Sch61kopf 
 375 

Estimating Equivalent Kernels for Neural Networks: A Data Perturbation Approach,
 A. Neil Burgess 
 382 

Promoting Poor Features to Supervisors: Some Inputs Work Better as Outputs,
Rich Caruana and Virginia R. de Sa 
 389 

Self-Organizing and Adaptive Algorithms for Generalized Eigen-Decomposition,
Chanchal Chatterjee and Vwani P. Roychowdhury 
 396 

Representation and Induction of Finite State Machines using Time-Delay Neural Networks, 
Daniel S. Clouse, C. Lee Giles, Bill G. Horne and Garrison W. Cottrell
403 

"488" Solutions to the XOR Problem,
 Frans M. Coetzee and Virginia L. Stonick
410 

Minimizing Statistical Bias with Queries, 
David A. Cohn 
 417 

MIMIC: Finding Optima by Estimating Probability Densities,
Jeremy S. de Bonet, Charles L. Isbell, and Paul Viola
  424 

On a Modification to the Mean Field EM Algorithm in Factorial Learning,
A. P. Dunmur and D. M. Titterington 
 431 

Softening Discrete Relaxation,
 Andrew M. Finch, Richard C. Wilson and Edwin R. Hancock 
 438 

Limitations of Self-organizing Maps for Vector Quantization and Multidimensional Scaling,
 Arthur Flexer 
 445 

Continuous Sigmoidal Belief Networks Trained using Slice Sampling,
 Brendan J. Frey 
 452 

Adaptively Growing Hierarchical Mixtures of Experts,
 Juergen Fritsch, Michael Finke and Alex Waibel 
 459 

Balancing Between Bagging and Bumping,
 Tom Heskes 
 466 

LSTM can Solve Hard Long Time Lag Problems,
 Sepp Hochreiter and Jurgen Schmidhuber 
 473 

One-unit Learning Rules for Independent Component Analysis,
 Aapo Hyvarinen and Erkki Oja 
 480 

Recursive Algorithms for Approximating Probabilities in Graphical Models,
 Tommi S. Jaakkola and Michael I. Jordan 
 487 

Combinations of Weak Classifiers,
 Chuanyi Ji and Sheng Ma 
 494 

Hidden Markov Decision Trees,
 Michael I. Jordan, Zoubin Ghahramani and Lawrence K. Saul 
 501 

Unification of lnformation Maximization and Minimization,
 Ryotaro Kamimura 
508 

Unsupervised Learning by Convex and Conic Coding,
 D. D. Lee and H. S. Seung 
515 

ARC-LH: A New Adaptive Resampling Algorithm for Improving ANN Classifiers, 
Friedrich Leisch and Kurt Hornik 
 522 

Bayesian Unsupervised Learning of Higher Order Structure,
Michael S. Lewicki and Terrence J. Sejnowski 
 529 

Source Separation and Density Estimation by Faithful Equivariant SOM,
Juan K. Lin, Jack D. Cowan and David G. Grier 
 536 

NeuroScale: Novel Topographic Feature Extraction using RBF Networks,
David Lowe and Michael E. Tipping 
 543 

Ordered Classes and Incomplete Examples in Classification,
 Mark Mathieson 
550 

Triangulation by Continuous Embedding, 
Marina Meila and Michael I. Jordan 
557 

Combining Neural Network Regression Estimates with Regularized Linear 
Weights, Christopher J. Merz and Michael J. Pazzani 
 564 

A Mixture of Experts Classifier with Learning Based on Both Labelled and Unlabelled Data,
 David J. Miller and Hasan S. Uyar 
 571 

Learning Bayesian Belief Networks with Neural Network Estimators,
 Stefano Monti and Gregory F. Cooper 
 578 

Smoothing Regularizers for Projectire Basis Function Networks,
 John E. Moody and Thorsteinn S. R6gnvaldsson 
 585 

Competition Among Networks Improves Committee Performance,
 Paul W. Munro and Bambang Parmanto 
 592 

Adaptive On-line Learning in Changing Environments,
 Noboru Murata, Klaus-Robert Miiller, Andreas Ziehe and Shun-ichi Amari
  599 

Using Curvature Information for Fast Stochastic Search,
 Genevieve B. Orr and Todd K. Leen 
 606 

Maximum Likelihood Blind Source Separation: A Context-Sensitive Generalization of ICA,
 Barak A. Pearlmutter and Lucas C. Parra 
 613 

A Convergence Proof for the Softassign Quadratic Assignment Algorithm,
Anand Rangarajan, Alan Yuille, Steven Gold and Eric Mjolsness 
 620 

Second-order Learning Algorithm with Squared Penalty Term,
 Kazumi Saito and Ryohei Nakano 
 627 

Monotonicity Hints,
 Joseph Sill and Yaser S. Abu-Mostafa 
 634 

Training Algorithms for Hidden Markov Models using Entropy Based Distance Functions,
 Yoram Singer and Manfred K. Warmuth 
 641 

Clustering Sequences with Hidden Markov Models,
 Padhraic Smyth 
 648 

Fast Network Pruning and Feature Extraction by using the Unit-OBS Algorithm,
Achim Stahlberger and Martin Riedmiller 
 655 

Separating Style and Content,
 Joshua B. Tenenbaum and William T. Freeman
662 

Early Brain Damage, 
Volker Tresp, Ralph Neuneier and Hans Georg Zimmermann 
 669 

Probabilistic Interpretation of Population Codes,
Richard S. Zemel, Peter Dayan and Alexandre Pouget 
 676 

VLSI Implementation of Cortical Visual Motion Detection Using an Analog Neural Computer,
 Ralph Etienne-Cummings, Jan van der Spiegel, Naomi Takahashi, Alyssa Apscl and Paul Mueller 
685 

A Spike Based Learning Neuron in Analog VLSI,
Philipp H'fifliger, Misha Mahowald and Lloyd Watts 
 692 

An Analog Implementation of the Constant Average Statistics Constraint For Sensor Calibration,
 John G. Harris and Yu-Ming Chiang 
 699 

Analog VLSI Circuits for Attention-Based, Visual Tracking, 
Timothy Horiuchi, Tonia G. Morris, Christof Koch and Stephen P. DeWeerth
706 

Dynamically Adaptable CMOS Winner-Take-All Neural Network, 
Kunihiko Iizuka, Masayuki Miyamoto and Hirofumi Matsui 
 713 

An Adaptive WTA using Floating Gate Technology,
W. Fritz Kruger, Paul Hasler, Bradley A. Minch and Christof Koch
 720 

A Micropower Analog VLSI HMM State Decoder for Wordspotting,
John Lazzaro, John Wawrzynek and Richard Lippmann 
 727 

Bangs, Clicks, Snaps, Thuds and Whacks: An Architecture for Acoustic Transient Processing, 
Fernando J. Pineda, Gert Cauwenberghs and R. Timothy Edwards 
734 

A Silicon Model of Amplitude Modulation Detection in the Auditory Brainstem, 
Andre van Schaik, Eric Fragniere and Eric Vittoz 
741 

Dynamic Features for Visual Speechreading: A Systematic Comparison,
Michael S. Gray, Javier R. Movellan and Terrence J. Sejnowski
 751 

Blind Separation of Delayed and Convolved Sources,
Te-Won Lee, Anthony J. Bell and Russell H. Lambert 
 758 

A Constructive RBF Network for Writer Adaptation,
John C. Platt and Nada P. Mati6 
 765 

A New Approach to Hybrid HMM/ANN Speech Recognition using Mutual Information Neural Networks,
 G. Rigoll and C. Neukirchen 
 772 

Neural Network Modeling of Speech and Music Signals,
 Alex R6bel 
 779 

A Constructive Learning Algorithm for Discriminant Tangent Models,
Diego Sona, Alessandro Sperduti and Antonina Starita 
 786 

Dual Kalman Filtering Methods for Nonlinear Prediction, Smoothing and Estimation, 
Eric A. Wan and Alex T. Nelson 
 793 

Ensemble Methods for Phoneme Classification,
 Steve Waterhouse and Gary Cook 
800 

Effective Training of a Neural Network Character Classifier for Word 
Recognition, Larry Yaeger, Richard Lyon and Brandyn Webb 
 807 

Viewpoint Invariant Face Recognition using Independent Component Analysis and Attractor Networks,
 Marian Stewart Bartlett and Terrence J. Sejnowski 
 817 

Learning Temporally Persistent Hierarchical Representations,
Suzanna Becker 
824 

Edges are the "Independent Components" of Natural Scenes, 
Anthony J. Bell and Terrence J. Sejnowski 
 831 

Compositionality, MDL Priors, and Object Recognition, 
Elie Bienenstock, Stuart Geman and Daniel Potter 
 838 

Learning Appearance Based Models: Mixtures of Second Moment Experts,
 Christoph Bregler and Jitendra Malik 
 845 

Spatial Decorrelation in Orientation Tuned Cortical Cells,
 Alexander Dimitrov and Jack D. Cowan 
 852 

Spatiotemporal Coupling and Scaling of Natural Images and Human Visual Sensitivities,
 Dawei W. Dong 
 859 

Selective Integration: A Model for Disparity Estimation,
 Michael S. Gray, Alexandre Pouget, Richard S. Zemel, Steven J. Nowlan and Terrence J. Sejnowski 
866 

ARTEX: A Self-organizing Architecture for Classifying Image Regions, 
Stephen Grossberg and James R. Williamson 
 873 

Contour Organisation with the EM Algorithm,
J. A. F. Leite and Edwin R. Hancock 
 880 

Visual Cortex Circuitry and Orientation Tuning,
Trevor Mundel, Alexander Dimitrov and Jack D. Cowan 
 887 

Representing Face Images for Emotion Classification,
Curtis Padgett and Garrison W. Cottrell 
 894 

Rapid Visual Processing using Spike Asynchrony,
Simon J. Thorpe and Jacques Gautrais 
 901 

Interpreting Images by Propagating Bayesian Beliefs,
 Yair Weiss 
 908 

Salient Contour Extraction by Temporal Binding in a Cortically-based Network,
Shih-Cheng Yen and Leif H. Finkel 
 915 

An Orientation Selective Neural Network for Pattern Identification in Particle Detectors,
Halina Abramowicz, David Horn, Ury Naftaly and Cannit Sahar-Pikielny
 925 

Adaptive Access Control Applied to Ethernet Data,
 Timothy X. Brown 
 932 

Predicting Lifetimes in Dynamically Allocated Memory, 
David A. Cohn and Satinder Singh 
 939 

Multi-Task Learning for Stock Selection,
 Joumana Ghosn and Yoshua Bengio 
946 

The Neurothermostat: Predictive Optimal Control of Residential Heating 
Systems, Michael C. Mozer, Lucky Vidmar and Robert H. Dodier 
 953 

Sequential Tracking in Pricing Financial Options using Model Based and Neural Network Approaches,
 Mahesan Niranjan 
 960 

A Comparison between Neural Networks and other Statistical Techniques for Modeling the Relationship between Tobacco and Alcohol and Cancer,
Tony Plate, Pierre Band, Joel Bert and John Grace 
967 

Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems,
Satinder Singh and Dimitri Bertsekas 
974 

Spectroscopic Detection of Cervical Pre-Cancer through Radial Basis Function Networks,
 Kagan Tumer, Nirmala Ramanujam, Rebecca Richards-Kortum and Joydeep Ghosh 
 981 

Interpolating Earth-science Data using RBF Networks and Mixtures of Experts,
 Ernest Wan and Don Bone 
 988 

Multi-effect Decompositions for Financial Data Modeling,
 Lizhong Wu and John E. Moody 
 995 

Multidimensional Triangulation and Interpolation for Reinforcement Learning,
 Scott Davies 
 1005 

Efficient Nonlinear Control with Actor-Tutor Architecture,
 Kenji Doya 
 1012 

Local Bandit Approximation for Optimal Learning Problems,
Michael O. Duff and Andrew G. Barto 
 1019 

Reinforcement Learning for Mixed Open-loop and Closed-loop Control,
Eric A. Hansen, Andrew G. Barto and Shlomo Zilberstein 
 1026 

Multi-Grid Methods for Reinforcement Learning in Controlled Diffusion Processes,
 Stephan Pareigis 
 1033 

Learning from Demonstration,
 Stefan Schaal 
 1040 

Exploiting Model Uncertainty Estimates for Safe Dynamic Control Learning,
Jeff G. Schneider 
 1047 

Analytical Mean Squared Error Curves in Temporal Difference Learning,
Satinder Singh and Peter Dayan 
 1054 

Learning Decision Theoretic Utilities through Reinforcement Learning,
Magnus Stensmo and Terrence J. Sejnowski 
 1061 

On-line Policy Improvement using Monte-Carlo Search,
Gerald Tesauro and Gregory R. Galperin 
 1068 

Analysis of Temporal-Diffference Learning with Function Approximation,
John N. Tsitsiklis and Benjamin Van Roy 
 1075 

Approximate Solutions to Optimal Stopping Problems,
John N. Tsitsiklis and Benjamin Van Roy 
 1082 

