The GENETIC ALGORITHM is a model of machine learning which derives its behavior from a metaphor of the processes of EVOLUTION in nature. This is done by the creation within a machine of a POPULATION of INDIVIDUALs represented by CHROMOSOMEs, in essence a set of character strings that are analogous to the base-4 chromosomes that we see in our own DNA. The individuals in the population then go through a process of evolution. We should note that EVOLUTION (in nature or anywhere else) is not a purposive or directed process. That is, there is no evidence to support the assertion that the goal of evolution is to produce Mankind. Indeed, the processes of nature seem to boil down to different INDIVIDUALs competing for resources in the ENVIRONMENT. Some are better than others. Those that are better are more likely to survive and propagate their genetic material. In nature, we see that the encoding for our genetic information (GENOME) is done in a way that admits asexual REPRODUCTION (such as by budding) typically results in OFFSPRING that are genetically identical to the PARENT. Sexual REPRODUCTION allows the creation of genetically radically different offspring that are still of the same general flavor (SPECIES). At the molecular level what occurs (wild oversimplification alert!) is that a pair of CHROMOSOMEs bump into one another, exchange chunks of genetic information and drift apart. This is the RECOMBINATION operation, which GA/GPers generally refer to as CROSSOVER because of the way that genetic material crosses over from one chromosome to another. The CROSSOVER operation happens in an ENVIRONMENT where the SELECTION of who gets to mate is a function of the FITNESS of the INDIVIDUAL, i.e. how good the individual is at competing in its environment. Some GENETIC ALGORITHMs use a simple function of the fitness measure to select individuals (probabilistically) to undergo genetic operations such as crossover or asexual REPRODUCTION (the propagation of genetic material unaltered). This is fitness-proportionate selection. Other implementations use a model in which certain randomly selected individuals in a subgroup compete and the fittest is selected. This is called tournament selection and is the form of selection we see in nature when stags rut to vie for the privilege of mating with a herd of hinds. The two processes that most contribute to EVOLUTION are crossover and fitness based selection/reproduction. As it turns out, there are mathematical proofs that indicate that the process of FITNESS proportionate REPRODUCTION is, in fact, near optimal in some senses. MUTATION also plays a role in this process, although how important its role is continues to ba a matter of debate (some refer to it as a backgroud operator, while others view it as playing the dominant role in the evolutionary process). It cannot be stressed too strongly that the GENETIC ALGORITHM (as a SIMULATION of a genetic process) is not a random search for a solution to a problem (highly fit INDIVIDUAL). The genetic algorithm uses stochastic processes, but the result is distinctly non-random (better than random). GENETIC ALGORITHMs are used for a number of different application areas. An example of this would be multidimensional OPTIMIZATION problems in which the character string of the CHROMOSOME can be used to encode the values for the different parameters being optimized. In practice, therefore, we can implement this genetic model of computation by having arrays of bits or characters to represent the CHROMOSOMEs. Simple bit manipulation operations allow the implementation of CROSSOVER, MUTATION and other operations. Although a substantial amount of research has been performed on variable- length strings and other structures, the majority of work with GENETIC ALGORITHMs is focussed on fixed-length character strings. We should focus on both this aspect of fixed-lengthness and the need to encode the representation of the solution being sought as a character string, since these are crucial aspects that distinguish GENETIC PROGRAMMING, which does not have a fixed length representation and there is typically no encoding of the problem. When the GENETIC ALGORITHM is implemented it is usually done in a manner that involves the following cycle: Evaluate the FITNESS of all of the INDIVIDUALs in the POPULATION. Create a new population by performing operations such as CROSSOVER, fitness-proportionate REPRODUCTION and MUTATION on the individuals whose fitness has just been measured. Discard the old population and iterate using the new population. One iteration of this loop is referred to as a GENERATION. There is no theoretical reason for this as an implementation model. Indeed, we do not see this punctuated behavior in POPULATIONs in nature as a whole, but it is a convenient implementation model. The first GENERATION (generation 0) of this process operates on a POPULATION of randomly generated INDIVIDUALs. From there on, the genetic operations, in concert with the FITNESS measure, operate to improve the population. PSEUDO CODE Algorithm GA is // start with an initial time t := 0; // initialize a usually random population of individuals initpopulation P (t); // evaluate fitness of all initial individuals of population evaluate P (t); // test for termination criterion (time, fitness, etc.) while not done do // increase the time counter t := t + 1; // select a sub-population for offspring production P' := selectparents P (t); // recombine the "genes" of selected parents recombine P' (t); // perturb the mated population stochastically mutate P' (t); // evaluate it's new fitness evaluate P' (t); // select the survivors from actual fitness P := survive P,P' (t); od end GA.Go Back Up