March 1999

Issue 1, Volume 1

About the Author

Anuj Ranjan
is a computer science student at the University of Alberta.


Hello World!

Artificial Neural Networks: An Introduction

-- Anuj Ranjan

This article is the first in a three-part introduction to the world of artificial neural networks (ANNs). Don't be fooled by the word 'introduction' in the title: by the end of these three parts, you should have the information you need to understand the concepts behind neural nets and even to design your own ANNs to perform complex tasks. Part I, below, introduces the concept of a neural network in computing terms and describes how one could go about creating a simple ANN using grandmother cells and the Grandmothering technique (don't ask why they call it that). Parts II and III will cover more complex algorithms for neural net design, namely the Adaline and the Backpropagation networks.

Part I: What is a Neural Network?

An excellent one-sentence description of a neural network is provided by Robert Hecht-Nielsen: "A neural network is a computing system which is made up of a number of simple, highly interconnected processing elements, and which processes information by its dynamical state response to external inputs."

Now, a lot of you may be thinking, "What the heck does that mean?" Well, it means a number of things. It says that a neural network is not a serial computer (it does not execute a sequential set of instructions), it is not deterministic, and it has no separate memory array for the storage of data. In fact, knowledge within a neural network is not stored in any particular memory location. Instead, knowledge is stored in the way the processing elements are connected (how the outputs of one processing element are used as inputs to others) and in the weighting (or importance) of each input to the processing elements. Knowledge is more a function of the network's architecture than of its contents.

Neural network design was inspired by studies of the cerebral cortex in the human brain. The cerebral cortex is made up of billions of neurons (processing elements, or neurodes, in our context) and the interconnections between them. When presented with input (electrical signals; in our terms, binary 1 for a strong signal and 0 for no signal), each of these neurons either fires or does not fire. The output of each neuron is then sent via the interconnections to many other neurons (and possibly back to the neuron itself) as an input signal.

The structure of a neural network is made up of the interconnection architecture between the neurodes, the function that determines whether or not a neurode will fire (the transfer function), and the rules that determine how the importance (weighting) of each neurode's inputs changes (the training laws, which will be discussed at greater length later).

Thus, a neural network developer spends their time specifying the interconnections, transfer functions, and training laws, which is quite unlike traditional programming. This is because a neural network is not a traditional computer system. Instead of executing programs like most systems, neural nets react, self-organize, learn, and even forget according to their inputs.

Why should we learn about ANNs?

Neural networks appear to be able to solve "monster" problems of AI that traditional systems have had difficulty with. These include, but are not limited to, speech recognition and synthesis, vision, and pattern recognition.

Neural nets appear to be good at solving the kinds of problems that people can also solve easily. However, they are usually terrible at solving problems that traditional computers are very good at. For example, a neural net is poorly suited to precise numerical computation (which is the basis of traditional systems). On the other hand, a neural net can be taught to recognize whether or not a visual image of a face is that of a particular person, even with a different facial expression or hairdo. People are very good at this, but try doing it with a conventionally programmed digital computer!

It is also important to point out that neural networks are not a replacement for traditional systems, but rather a partner to them. Most neural networks are used in conjunction with other systems, which call on the network through a procedure whenever a task suited to it comes up.

Basics of Neural Networks

A neurode is basically an extremely simple processing element that has a number of input signals and only one output signal (see Figure 1). Each input signal x_i has an associated weight w_i, so that the effective input to the neurode is the weighted total input (the sum of the products of each input and its assigned weight). The simplest kind of neurode simply compares this weighted sum to an arbitrary threshold. If the weighted sum is greater than the threshold, the neurode will fire, generating an output signal. Otherwise, the processing element will not fire and no output will be generated.

Figure 1: A neurode is a simple processing element that has input signals (each with an assigned weight) and an output signal.
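
The threshold neurode described above can be sketched in a few lines of Python. This is only an illustration under assumed values: the function names, the inputs, the weights, and the thresholds are all hypothetical and are not taken from the article.

```python
def weighted_sum(inputs, weights):
    """Sum of each input multiplied by its assigned weight."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights))


def neurode_fires(inputs, weights, threshold):
    """Return 1 if the weighted total input exceeds the threshold, else 0."""
    return 1 if weighted_sum(inputs, weights) > threshold else 0


# Hypothetical example: three binary inputs with hand-picked weights.
inputs = [1, 0, 1]
weights = [0.5, 0.4, 0.3]

# The weighted sum is 0.5 + 0.3 = 0.8, so the neurode fires for a
# threshold of 0.6 but stays silent for a threshold of 0.9.
print(neurode_fires(inputs, weights, threshold=0.6))
print(neurode_fires(inputs, weights, threshold=0.9))
```

Note that the second input contributes nothing here because its signal is 0, no matter how large its weight is.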

The output signal of a neurode then fans out to act as an input to other neurodes (see Figure 2). It can also act as an input to the neurode itself, depending on the network architecture. These outputs (and the inputs they become) can be either excitatory or inhibitory: the signal either tends to cause the receiving neurode to fire or tends to keep it from firing.

Figure 2: A neurode's output signal splits out to act as inputs to other neurodes (or even back as input to itself)

If you think of the inputs and their corresponding weights as vectors, then the total input signal is just the dot (or inner) product of the weight and input vectors. It follows that the projection of the weight vector on the input vector will be largest when the two point in almost the same direction, and will be close to zero when the two are nearly perpendicular. For vectors of comparable length, the projection is therefore a measure of how close the two vectors are to each other.
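
To make the geometric picture concrete, here is a small sketch that computes the projection of a weight vector onto an input vector. The helper names and the two-dimensional example vectors are assumptions chosen for illustration only.

```python
import math


def dot(u, v):
    """Dot (inner) product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(dot(v, v))


def projection_length(w, x):
    """Signed length of the projection of w onto the direction of x."""
    return dot(w, x) / norm(x)


# Hypothetical input vector and three weight vectors of equal length.
x = [1.0, 0.0]
aligned = [2.0, 0.0]        # same direction as x
perpendicular = [0.0, 2.0]  # at right angles to x
opposite = [-2.0, 0.0]      # pointing away from x

for name, w in [("aligned", aligned),
                ("perpendicular", perpendicular),
                ("opposite", opposite)]:
    print(name, projection_length(w, x))
```

The aligned vector gives the largest projection, the perpendicular one gives zero, and the opposite one gives a negative value, which corresponds to an inhibitory total input.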

Now, imagine a network with a set of only four neurodes, each with the same set of inputs but different weights on those inputs. Also, suppose only one of these neurodes can fire, and that the firing neurode is the one with the largest input signal (its weight and input vectors are closest together). We can imagine the four neurodes with their weight vectors pointing in completely different directions. The one whose weight vector points most closely in the direction of the input vector will be the one that fires (see Figure 3). This visual image of weight and input vectors is extremely helpful in gaining insight into how such networks operate.

Figure 3: The neurode that will fire will be the one whose weight vector is pointing closest to the direction of the input vector. Thus the neurode represented by weight vector w1 will fire strongest.

This system, although primitive, can already perform useful functions. Suppose we want a neural net to recognize inputs and classify them into one of four distinct patterns. We could set the weights of the four neurodes so that each weight vector points to one of the four patterns we want the system to recognize. We then present an input vector from some unknown sample, and the neurode whose weight vector is closest to the input vector (i.e. the best match) will fire with the greatest strength, thereby classifying the given pattern.
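
The four-neurode classifier just described might look like the following sketch. The pattern labels, the weight vectors, and the sample input are hypothetical. The weight vectors are deliberately given the same length, an assumption that keeps the comparison of dot products fair.

```python
def dot(u, v):
    """Dot (inner) product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def winner(weight_vectors, x):
    """Index of the neurode whose weighted total input is largest."""
    totals = [dot(w, x) for w in weight_vectors]
    return max(range(len(totals)), key=totals.__getitem__)


# Hypothetical patterns: four unit-length weight vectors pointing in
# four different directions, one per neurode.
labels = ["right", "up", "left", "down"]
weight_vectors = [
    [1.0, 0.0],
    [0.0, 1.0],
    [-1.0, 0.0],
    [0.0, -1.0],
]

# An unknown sample that points mostly to the right.
sample = [0.9, 0.2]
print(labels[winner(weight_vectors, sample)])  # prints "right"
```

Only the neurode with the largest weighted input is reported as firing, which mirrors the "only one of these neurodes can fire" rule above.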

You might think that neurodes would have to be much more complicated in order to make up interesting neural nets. Well, I hate to disappoint you, but as this example shows, even very simple networks can be made to perform surprisingly complex tasks. This simple system is, in fact, an example of the use of grandmother cells (I have no idea where the term comes from), which are processing elements that respond to exactly one type of pattern.

Grandmothering, which is the type of network we've just described, is not really learning (as some of you might have noticed), but is instead memorization. It is a static system in which the weights of a neurode are never changed during the system's operation. To recognize a new pattern, you have to add a new grandmother cell, and as you can imagine, for real-world tasks the number of grandmother cells can quickly get out of hand. Sure, memorizing that 25 + 56 = 81 is useful, but it would be much more useful if the system could use a general technique (such as training) to come up with the sum of any two numbers.
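
The memorization problem can be seen in a short sketch. The class below is a hypothetical illustration, not a standard API: each grandmother cell is a fixed weight vector that is never adjusted, so the only way to handle a new pattern is to add another cell.

```python
class GrandmotherNet:
    """Hypothetical container holding one fixed weight vector per pattern."""

    def __init__(self):
        # label -> weight vector; weights never change once a cell is added
        self.cells = {}

    def add_cell(self, label, pattern):
        """Memorize a pattern by dedicating a new cell to it."""
        self.cells[label] = list(pattern)

    def classify(self, x):
        """Label of the cell with the largest weighted total input."""
        def total(label):
            return sum(w * xi for w, xi in zip(self.cells[label], x))
        return max(self.cells, key=total)


net = GrandmotherNet()
net.add_cell("A", [1, 0, 0])
net.add_cell("B", [0, 1, 0])

# A new kind of input matches neither memorized pattern, so the net can
# only answer with one of the labels it already knows.
new_input = [0, 0, 1]
print(net.classify(new_input) in ("A", "B"))

# The only remedy is to add a grandmother cell for the new pattern.
net.add_cell("C", [0, 0, 1])
print(net.classify(new_input), len(net.cells))
```

Every new pattern costs one more cell, which is why the number of cells grows with the number of things to be recognized.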

Part II, "The Adaline", will cover one of the first effective learning laws, the Widrow-Hoff LMS algorithm. This learning algorithm avoids the disadvantages of the grandmothering system (it does not simply memorize). It is still widely used in network architectures and offers excellent solutions for specific problems.
