767 
SELF-ORGANIZATION OF ASSOCIATIVE DATABASE 
AND ITS APPLICATIONS 
Hisashi Suzuki and Suguru Arimoto 
Osaka University, Toyonaka, Osaka 560, Japan 
ABSTRACT 
An efficient method of self-organizing associative databases is proposed together with 
applications to robot eyesight systems The proposed databases cn associate ny input 
with some output. In the first half prt of discussion, n algorithm of self-organization is 
proposed. From an aspect of hardware, it produces a new style of neural network. In the 
latter half part, an pplicability to handwritten letter recognition and that to an autonomous 
mobile robot system are demonstrated. 
INTRODUCTION 
Let  mpping f: X --, Y be given. Here, X is a finite or infinite set, and Y is another 
finite or infinite set. A learning machine observes any set of pairs (x, y) sampled randomly 
from X x Y. (X x Y means the Cartesian product of X and Y.) And, it computes some 
estimate f: X --, Y of f to make small, the estimation error in some measure. 
Usually we say that: the faster the decrease of estimation error with increase of the num- 
ber of samples, the better the learning machine. However, such expression on performance 
is incomplete. Since, it lacks consideration on the candidates of f of f assumed prelimi- 
narily. Then, how should we find out good learning machines? To clarify this conception, 
let us discuss for a while on some types of learning machines. And, let us advance the 
understanding of the self-organization of associative database. 
 Parameter Type 
An ordinary type of learning machine assumes an equation relatin x's and y's with 
parameters being indefi^nite,namely, a structure of f. It is equivalent to define implicitly a 
set 1  of candidates of f. (F is some subset of mappings from X to Y.) And, it computes 
values of the parameters based on the observed samples. We call such type a parameter 
type. 
For a learning machine defined well, if 1   f, f pproaches f as the number of samples 
increases. In the alternative case, however, some estimation error remains eternally. Thus, 
a problem of designing  learning machine returns to find out a proper structure of f in this 
sense. 
On the other hand, the assumed structure of f is demanded to be as compact as possible 
to achieve a fast learning. In other words, the number of parameters should be small. Since, 
if the parameters are few, some f can be uniquely determined even though the observed 
samples are few. However, this demand of being proper contradicts to that of being compact. 
Consequently, in the parameter type, the better the compactness of the assumed structure 
that is proper, the better the learning machine. This is the most elementary conception 
when we design learning ma&ines. 
 Universality and Ordinary Neural Networks 
Now suppose that a sufficient knowledge on f is given though j itself is unknown. In 
this case, it is comparatively easy to find out proper and compact structures of j. In the 
alternative case, however, it is sometimes difficult. A possible solution is to give up the 
compactness and assume an almighty structure that can cover various f's. A combination 
of some orthogonal bases of the infinite dimension is such a structure. Neural networks '2 
are its approximations obtained by truncating finitely the dimension for implementation 
@ American Institute of Physics 1988 
768 
A min topic in designing neural networks is to establish such desirable structures of f. 
This work includes developing practical procedures that compute values of coefiScients from 
the observed samples. Such discussions are fiourishing since 1980 while many efiScient meth- 
ods have been proposed. Recently, even hardware units computing coefiScients in pardlid 
for speed-up are sold, e.g., ANZA, Mark III, Odyssey and 
Nevertheless, in neural networks, there always exists a danger of some error remaining 
eternally in estimating f. Precisely speaking, suppose that a combination of the bases of a 
finite number can define a structure of f essentially. In other words, suppose that P /, or 
f is located near 1 '. In such case, the estimation error is none or negligible. However, if f 
is distant from i', the estimation error never becomes negligible. Indeed, many researches 
report that the following situation appears when f is too complex. Once the estimation 
error converges to some value (> 0) as the number of samples increases, it decreases hardly 
even though the dimension is heighten. This property sometimes is a considerable defect of 
neural networks. 
 Recursive Type 
The recursive type is founded on another methodology of learning that should be as 
follows. At the initial stage of no sample, the set F0 (instead of notation ) of candidates 
of f' equals to the set of all mappings from X to Y. After observing the first sample 
(x,,)  X x Y, & is reduced to  so that f(x) =  for any f  . Aaer observing 
th eona sample (xa, ya)  X x Y, g is further reduced to Pa so that f() = yx and 
f(xa) = ya for any f  P. Thus, the candidate set P becom gruy smM1  observaaon 
of samples proceeds. The f after observing i-staples, which we write fi, is one of the most 
hkehhood timation of f selected in . Hence, contrily to the pameter type, the 
recursive type guantees surely that f approhes to f  the number of samples increase. 
The recursive type, if observes a sample (xi, Yi), rewrit vMues A-()'s to A()'s for 
some 's correlated to the staple. Hence, this type h an architecture composed of a rule 
for rewriting and a free memory spce. Such architecture for natury a nd of dtabase 
that builds up magement syste of data in a self-orgizing way. However, this datable 
differs kom ordiny on in the following sense. It does not only record the samples Mready 
observed, but computes some estimation of f(x} for any x  X. We cM1 such datable  
socitive datable. 
The first subject in constructing sociative datables is how we estabhsh the rule for 
rewriting. For this purpose, we Mapt a measure cMled the dissimilarity. Here,  dissilarity 
means a mapping d: X x X  {reals > 0} such that for any (x, )  X x X, d(x, ) > 0 
whenever f(x)  (). However, it is not necessily aea,a with  single formula. It is 
definable with, for example, a collection of rules written in forms of "if ... then ...." 
The dissilarity d defines a structure of f locy in X x Y. Hence, even though 
the knowledge on f is imperfect, we can reflect it on d in some heuristic way. Hence, 
contrarily to neurM networks, it is possible to celerate the speed of leaning by establishing 
d wall. Especiy, we c eily find out simple d's for those f's which process anMocy 
information like a human. (See the pplications in this paper.) And, for such f's, the 
rursive type shows strongly its effectiveness. 
We denote a sequence of observed sampl by (x, y), (xa, ya)," '. One of the simplest 
constructions of associative databases fi after observing i-staples (i = 1, 2,.-.) is as follows 
Algorithm 1. At the initiM stage, let S0 be the empty set. For every i = 
1,2,..., let A-(x) for any x  X equM some y* such that (x*,y*)  Si- and 
(1) 
Furthermore, d (xi, y) to S_ to produce S, i.e., S = S_ u {(xi, yi)}. 
769 
Another version improved to economize the memory is as follows. 
Algorithm 2. At the initial stage, let So be composed of an arbitrary element 
in X x Y. For every i = 1, 2,-.-, let fi-l(a:) for any a:  X equal some ,* such 
that (x*, ?,,*)  Si-1 and 
d(x,x'): rain 
Furthermore, if fi-l(a:i) :/: ' then let S = Si-1, or dad (a:i,q) to Si-1 to 
produce Si, i.e., Si: Si-10 {(a:i, 'i)}. 
In either construction, fl approaches to f as i increases. However, the computation time 
grows proportionally to the size of Si. The second subject in constructing associative 
databases is what addressing rule we should employ to economize the computation time. In 
the subsequent chapters, a construction of associative database for this purpose is proposed. 
It manages data in a form of binary tree. 
SELF-ORGANIZATION OF ASSOCIATIVE DATABASE 
Given a sample sequence (xl, ?,q), (x, ?,,a),..., the algorithm for constructing associative 
database is as follows. 
Algorithm 3.' 
Step l(Initialization): Let (a:[rootl,,[root]) = (aq,q). Here, a:[-] and ,[-] are 
warlables assigned for respective nodes to memorize data. Furthermore, let t = 1. 
Step 2: Increase t by 1, and put a:t in. After reset a pointer n to the root, repeat 
the following until n axrives at some terminal node, i.e., le. 
Notations  and 2 mean the descendant nodes of ,. If d(a:,,a:[a]) < 
d(,,tl), let n: a. Otherwise, let n: 
Step 3: Display g,[n] as the related information. Next, put g, in. If ,[n] = q, bak 
to step 2. Otherwise, first establish new descendant nodes a and h. Secondly, 
let 
: (2) 
(x[h],y[iz]) = (:rt,yt). (3) 
Finally, back to step 2. Here, the loop of step 2-3 can be stopped at any time 
and also can be continued. 
Now, suppose that gate elements, namely, artificial "synapses" that play the role of branch- 
ing by d are prepared. Then, we obtain a new style of neural network with gate elements 
being randomly connected by this algorithm. 
LETTER HECOGNITION 
Recently, the vertical slitting method for recognizing typographic English letters 3, the 
elastic matching method for recognizing handwritten discrete English letters 4, the global 
training and fuzzy logic search method for recognizing Chinese characters written in square 
style , etc. axe published. The self-organization of associative database realizes the recogni- 
tion of handwritten continuous English letters. 
770 
Fig. 1. Source document. 
Wi ndow --' Bundary 
Fig. 2. Windowing. 
0 1000 2000 3000 4000 0 1000 
Nuner of samples 
2000 3000 4000 
Number of samples 
Fig. 3. An experiment result. 
An image scanner takes a document image (Fig. 1). The letter recognizer uses a par- 
allelogram window that at least can cover the maximal letter (Fig. 2), and processes the 
sequence of letters while shifting the window. That is, the recognizer scans a word in a 
slant direction. And, it places the window so that its left vicinity may be on the first black 
point detected. Then, the window catches a letter and some part of the succeeding letter. 
If recognition of the head letter is performed, its end position, namely, the boundary line 
between two letters becomes known. Hence, by starting the scanning from this boundary 
and repeating the above operations, the recognizer accomplishes recurslvely the task. Thus 
the major problem comes to identifying the head letter in the window. 
Considering it, we define the following. 
 Regard window images as x's, and define X accordingly. 
 For a (x, :)  X x X, denote by_/} a black point in the left area from the boundary on 
window image . Project each B onto window image x. Then, measure the Euclidean 
distance 5 between/} and a black point B on x being the dosest to/}. Let d(_x, :) be 
the summation of 5's for all black points/}'s on  divided by the number of B's. 
 Regard couples of the "reading" and the position of boundary as y's, and define Y 
accordingly. 
An operator teaches the recognizer in interaction the relation between window image and 
reading&boundary with algorithm 3. Precisely, if the recalled reading is incorrect, the 
operator teaches a correct reading via the console. Moreover, if the boundary position is 
incorrect, he teaches a correct position via the mouse. 
Fig. 1 shows partially a document image used in this experiment. Fig. 3 shows the 
change of the number of nodes and that of the recognition rate defined as the relative 
frequency of correct answers in the past 1000 trials. Specifications of the window are height 
= 20dot, width = 10dot, and slant angular = 68deg. In this example, the levels of tree 
were distributed in 6-19 at time 4000 and the recognition rate converged to about 74%. 
Experimentally, the recognition rate converges to about 60-85% in most cases, and to 95% at 
a rare case. However, it does not attain 100% since, e.g., "c" and "e" are not distinguishable 
because of excessive fluctuation in writing. If the consistency of the x, ?,,-rdation is not 
assured like this, the number of nodes increases endlessly (cf. Fig. 3). Hence, it is clever to 
stop the learning when the recognition rate attains some upper limit. To improve further 
the recognition rate, we must consider the spelling of words. It is one of future subjects. 
771 
OBSTACLE AVOIDING MOVEMENT 
Various systems of camera type autonomous mobile robot are reported flourishingly 6-1. 
The system made up by the authors (Fig. 4) also belongs to this category. Now, in math- 
ematical methodologies, we solve usually the problem of obstacle avoiding movement as 
a cost minimization problem under some cost criterion established artificially. Contrarily, 
the self-organization of associative database reproduces faithfully the cost criterion of an 
operator. Therefore, motion of the robot after learning becomes very natural. 
Now, the length, width and height of the robot are all about 0.7m, and the weight is 
about 30kg. The visual angle of camera is about 55deg. The robot has the following three 
factors of motion. It turns less than 4-30deg, advances less than lm, and controls speed less 
than 3km/h. The experiment was done on the passageway of width 2.5m inside a building 
which the authors' laboratories exist in (Fig. 5). Because of an experimental intention, we 
arrange boxes, smoking stands, gas cylinders, stools, handcarts, etc. on the passage way at 
random. We let the robot take an image through the camera, recall a similar image, and 
trace the route preliminarily recorded on it. For this purpose, we define the following. 
Let the camera face 28deg downward to take an image, and process it through a low 
pass filter. Scanning verticMly the filtered image from the bottom to the top, search 
the first point C where the luminanee changes excessively. Then, substitute all points 
from the bottom to C for white, and all points from C to the top for black (Fig. 6). 
(If no obstacle exists just in front of the robot, the white area shows the "free" area 
where the robot can move around.) Regard binary 32 x 32dot images processed thus 
as a:'s, and define X accordingly. 
For every (x, :) e X x X, let d(x, :) be the number of black points on the exclusive-or 
image between x and E, 
Regard as ?,,'s the images obtained by drawing routes on images a:'s, and define Y 
accordingly. 
The robot superimposes, on the current camera image a:, the route recalled for a:, and 
inquires the operator instructions. The operator judges subjectively whether the suggested 
route is appropriate or not. In the negative answer, he draws a desirable route on a: with the 
mouse to teach a new , to the robot. This operation defines implicitly a sample sequence 
of (a:, ,) reflecting the cost criterion of the operator. 
t 
moni tot dl splay 
Mouse 
rwor-ariven  
wheel yen  .'toni ball 
Mobile unit (robot) Stationary unit 
Fig. 4. Configuration of 
autonomous mobile robot system. 
22 ; 11 
Room/ 13 - 
14 
23 
, 24 
Block 
nber 
North 
Fig. 5. Experimental 
environment. 
772 
Wal I 
A reprocessing 
Fig. 6. Processing for 
obstacle avoiding movement. 
Canera image 
Preprocess ing  
 Course 
suggest ion 
x  
Fig. 7. Processing for 
position identification. 
We define the satisfaction rate by the relative frequency of acceptable suggestions of 
route in the past 100 trials. In a typical experiment, the change of satisfaction rate showed 
a similar tendency to Fig. 3, and it attains about 95% axound time 800. Here, notice that 
the rest 5% does not mean directly the percentage of collision. (In practice, we prevent the 
collision by adopting some supplementary measure.) At time 800, the number of nodes was 
145, d the levels of tree were distributed in ;-17. 
The proposed method reflects delicately various characters of operator. For exarnple, a 
robot trained by an operator O moves slowly with enough space against obstacles while one 
trained by other operator O' brushes quickly against obstacles. This fat gives us a hint 
on a method of printing "characters" into machines. 
POSITION IDENTIFICATION 
The robot can identify its position by recalling a similar landscape with the position data 
to a camera image. For this purpose, in principle, it suffices to regard camera images and 
position data as a:'s and ?,,'s, respectively. However, the memory capacity is finite in actuaJ 
computers. Hence, we cannot but compress the carhera images at a slight loss of information. 
Such compression is admlttable as long as the precision of position identification is in  
aceptable area. Thus, the major problem comes to find out some suitable compression 
method. 
In the experimental environment (Fig. 5), juts axe on the passageway at intervals of 
3.6ra, and each section between adjacent juts has at most one door. The robot identifies 
roughly from a surrounding landscape which section itself places in. And, it uses temporarily 
a triangulax surveying technique if an exact measure is necessary. To realize the former task, 
we define the following. 
Turn the camera to take a paatorama image of 360deg. Scanning horizontally the 
center line, substitute the points where the luminanee excessively changes for blak 
d the other points for white (Fig. ?). Regaxd binary 360dot line images processed 
thus as a:'s, and define X acordingly. 
For every (a:, :)  X x X, project each black point J on : onto a:. And, measure the 
Euclidean distce 5 between j and a blak point A on a: being the closest to 4. Let 
the summation of 5 be S. Similarly, calculate , by exchging the roles of a: and :. 
Denoting the numbers of A's and J's respectively by n and , define 
773 
(4) 
 Regard positive integers lbeled on sections as y's (cf. Fig. 5), nd define Y ccord- 
ingly. 
In the learning mode, the robot checks exactly its position with  counter tlt is reset pe- 
riodically by the operator. The robot runs rbitraily on the passageways within 18m axe 
and learns the relation between lndscpes nd position dt. (Position identification be- 
yond 18m re is clieved by crossing plural dtbses one another.) This task is utomtlc 
excepting the periodic reset of counter, nmely, it is  kind of learning without teacher. 
We define the identlfiction rte by the relative frequency of correct recalls of position 
dt in the past 100 trials. In  typical example, it converged to bout 83% round time 
400. At time 400, the number of levels was 202, ad the levels of tree were distributed in 5- 
22. Since the identification failures of 17% cn be rejected by considering the trajectory, no 
problem rises in practical use. In order to improve the identification rte, the compression 
rtio of camer images must be loosened. Such possibility depends on improvement of the 
lrdwre in the future. 
Fig. 8 shows n example of ctual motion of the robot based on the dtbase for obstacle 
voiding movement nd that for position identification. This exaxnple corresponds to  case 
of moving from 14 to 23 in Fig. 5. Here, the time interval per frame is bout 40sec. 
Fig. 8. Actual motion of the robot. 
774 
CONCLUSION 
A method of self-orgazizing associative databases was proposed with the application to 
robot eyesight systems. The machine decomposes a global structure unknown into a set of 
local structures known and learns universally any input-output response. This framework 
of problem implies a wide application area other than the examples shown in this paper. 
A defect of the algorithm 3 of self-orgazization is that the tree is balanced well only 
for a subclass of structures of f. A subject imposed us is to widen the class. A probable 
solution is to abolish the aAdressing rule depending directly on values of d and, insteaA, to 
establish another rule depending on the distribution function of values of d. It is now under 
investigation. 
REFERENCES 
o 
o 
10. 
Hopfield, J. J. and D. W. Tank, "Computing with Neural Circuit: A Model," 
Science 233 (1986), pp. 625--633. 
Rumelhaxt, D. E. et al., "Learning Representations by Back-Propagating Er- 
rors," Nature 323 (1986), pp. 533-536. 
Hull, J. J., "Hypothesis Generation in a Computational Model for Visual Word 
Recognition," IEEE Expert, Fall (1986), pp. 63-70. 
Kurtzberg, J. M., "Feature Analysis for Symbol Recognition by Elastic Match- 
ing," IBM J. Res. Develop. 31-1 (1987), pp. 91-95. 
Wang, Q. R. azd C. Y. Suen, "Large Tree Classifier with Heuristic Search and 
Global Training," IEEE Trans. Pattern. Anal. &Mach. Intell. PAMI 9-1 
(1987) pp. 91-102. 
Brooks, R. A. et al, "Self Calibration of Motion and Stereo Vision for Mobile 
Robots," 4th Int. Symp. of Robotics Research (1987), pp. 267-276. 
Goto, Y. and A. Stentz, "The CMU System for Mobile Robot Navigation," 1987 
IEEE Int. Conf. on Robotics & Automation (1987), pp. 99-105. 
Madarasz, R. et al., "The Design of an Autonomous Vehicle for the Disabled," 
IEEE Jour. of Robotics & Automation RA 2-3 (1986), pp. 117-125. 
Triendl, E. and D. J. Kriegman, "Stereo Vision and Navigation within Build- 
ings," 1987 IEEE Int. Conf. on Robotics & Automation (1987), pp. 1725-1730. 
Turk, M. A. et al., "Video RoaA-Following for the Autonomous Land Vehicle," 
1987 IEEE Int. Conf. on Robotics & Automation (1987), pp. 273-279. 
