Genetic Algorithms Digest     Tuesday, July 5, 1994     Volume 8 : Issue 23

- Send submissions to GA-List@AIC.NRL.NAVY.MIL
- Send administrative requests to GA-List-Request@AIC.NRL.NAVY.MIL
- Anonymous ftp archive: FTP.AIC.NRL.NAVY.MIL [192.26.18.68]
  (Info in /pub/galist/FTP)

Today's Topics:

- One offspring or two (Re: v8n18/19/20/21/22)
- Convergence, Vose (Re: v8n22) [2 messages]
- Asymptotes and Genetic Invariance
- Permutation Mapping (long) (Re: v8n22)
- C++ simple GA code
- Ph.D. dissertation available

----------------------------------------------------------------------

****************************************************************************
CALENDAR OF GA-RELATED ACTIVITIES: (with GA-List issue reference)

FOGA94 Foundations of GAs Wkshop, Estes Park, Colorado (v7n26) Jul 30-Aug 3, 94
SAB94 3rd Intl Conf on Sim of Adaptive Behavior, Brighton (v7n11) Aug 8-12, 94
ECAI-94, 11th European Conference on AI, Amsterdam (v7n23) Aug 8-12, 94
ECAI-94 Wkshp on Applied Genetic & Other Evol Algs, Amsterdam (v8n5) Aug 9, 94
IEEE/Nagoya Univ WW Wkshp on Fuzzy Logic & NNs/GAs, Japan (v7n33) Aug 9-10, 94
ISRAM94 Special Session on Robotics & GAs, Maui, Hawaii (v7n22) Aug 14-17, 94
TECOM AI Conference, Aberdeen Proving Ground, Maryland (v8n19) Sep 13-16, 94
Evolution Artificielle 94, Toulouse, France (v8n10) Sep 19-23, 94
COMPLEX94 2nd Australian National Conference, Australia (v7n34) Sep 26-28, 94
PPSN-94 Parallel Problem Solving from Nature, Israel (v7n32) Oct 9-14, 94
GAs in Image Processing and Vision Colloquium, Savoy Place (v8n16) Oct 20, 94
AI'94 Workshop on Evol Comp, Armidale, NSW, Australia (v8n15) Nov 22, 94
ACM SAC '95 Symposium on Applied Computing, Tennessee (v8n20) Feb 26-28, 95
EP95 4th Ann Conf on Evolutionary Programming, San Diego, CA (v8n6) Mar 1-4, 95
ICTS95 3rd Int'l Conf. on Telecommunications, Tennessee (v8n21) Mar 16-19, 95
ICANNGA95 Intl Conf on Artificial NNs and GAs, France (v8n10) Apr 18-21, 95
ECAL95 3rd European Conf on Artificial Life, Granada, Spain (v8n5) Jun 4-6, 95
ASI-AA-95 Practice and Future of Autonomous Agents (v8n19) Sep 23-Oct 1, 95

(Send announcements of other activities to GA-List@aic.nrl.navy.mil)
****************************************************************************

------------------------------

From: booker@azrael.mitre.org (Lashon Booker)
Date: Thu, 30 Jun 94 09:22:31 EDT
Subject: Re: One offspring or two (Re: v8n18/19/20/21/22)

One brief comment on the issue of one vs. two offspring. Much of the
discussion has focused on whether there is any difference from a "long
run", expected-outcome standpoint. My motivation for advocating two
offspring was more pragmatic. In the small populations typically used in
GA implementations, the two-offspring technique produces dramatically
lower rates of allele loss than the one-offspring approach. The reason is
obvious: when you save both offspring, crossover is not a source of allele
loss; when you save only one offspring, crossover becomes a source of
allele loss.

Lashon

------------------------------

From: Mark Cronshaw
Date: Wed, 29 Jun 1994 13:25:28 -0600 (MDT)
Subject: Convergence, Vose (Re: v8n22)

I entirely agree with Vose's characterization of the dynamic behavior of
GAs. If the mutation rate is small, one would expect long periods in which
every member of the population is identical, punctuated by transient
periods initiated by a mutation.

The justification for this is that the Markov chain is ergodic, so the
long-run time-average behavior is the limit probability distribution of
the Markov process. (This requires some qualification regarding
uniqueness, but with mutation the limit distribution is unique.) If the
population ever reached a state with identical members, neither
reproduction nor crossover would change the population.
Mutation would, but it occurs with low probability. So once in such a
state, the population would stay there until a mutation occurred. It
follows that the number of "converged" or long-run states must be at least
as large as the number of possible types of agent. It would be interesting
to know whether there are long-run states other than those with identical
agents, and also to know the relative likelihood of the long-run states.

Mark B. Cronshaw, Dept. of Economics     Tel. (303) 492-6310
University of Colorado                   Fax  (303) 492-8960
Boulder, CO 80309-0256

------------------------------

From: Philippe.Preux@lifl.fr
Date: Thu, 30 Jun 1994 12:01:36 +0200
Subject: Convergence continued ... (Re: v8n22)

This submission is a follow-up to M. Vose's submission in v8n22.

Being interested in GAs as practical heuristics, I have begun studying the
behavior of GAs in what I call an "accessible" or "reachable" future. This
is a vague term that I can hardly define precisely, but it corresponds to
the convergence M. Vose mentions in his contribution in GA-List v8n22.
That is, I observed that GA populations uniformize around a certain
genotype (or, more accurately, a set S of genotypes). Hence convergence
means uniformization of the population: only a few equally fit genotypes
remain present in the population (possibly with several individuals
sharing the same genotype).

My point in these studies is that if GAs are to be used as valuable
heuristics, we should not content ourselves with convergence observed
after an infinite time of evolution. The important questions are:

- How long does it take a GA to "find" a solution, that is, how long does
  it take the population to stabilize/uniformize in a region of the
  search space?
- Where does the population uniformize to? What is the value of the
  solution?

I performed many experiments on numerical functions as well as on symbolic
problems (B. Spears' satisfiability problems) to observe how the GA
behaves, in order to model it. In summary, my observations were:

- The population uniformizes very rapidly.
- Once uniformized, the population remains uniform, even if it has
  uniformized around a local optimum; the genotype no longer changes. The
  population is stable with regard to the GA evolution. Though
  re-diversification is theoretically predicted and intuitively sound, I
  have never observed it, even in very long runs.
- There does not really seem to be an exploration phase followed by an
  exploitation phase. The GA always tends to uniformize immediately toward
  a region of the search space (corresponding to the genotypes of the set
  S): we should speak of a "rush" rather than an "exploration". The word
  "exploration" suggests a carefulness that we have not observed!

An immediate consequence of these remarks is that if the GA does not find
the global optimum immediately, it loses any chance of converging toward
it later, at least in a reachable future.

Accordingly, I am trying to understand toward which kind of individuals
the population uniformizes, and at what rate. Another major issue is: what
can we do to make the GA converge toward an "interesting" region? Or: can
we slow down the rush, or even transform it into something more deserving
of the name "exploration"?

From these experiments I have deduced that the stability attained so
rapidly is due to crossover acting on a set S of genotypes that is closed
under crossover: for any two individuals Xi, Xj in S, the offspring
(Yi, Yj) = Xover(Xi, Xj) are also elements of S, where Xover(Xi, Xj)
denotes the application of crossover to the two individuals Xi, Xj,
yielding the two individuals Yi, Yj. We will say that such a set S is
"crossover-closed".
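[As an illustration of the closure property just defined (not from the
original post): a schema is one example of a crossover-closed set under
one-point crossover. A minimal Python sketch verifying closure for a small
schema:]

```python
import itertools

def one_point_xover(a, b, point):
    """One-point crossover of two equal-length tuples at the given cut point."""
    return a[:point] + b[point:], b[:point] + a[point:]

# Example of a crossover-closed set S: all 4-bit strings matching the
# schema 1**0 (first bit fixed to 1, last bit fixed to 0).
S = {bits for bits in itertools.product((0, 1), repeat=4)
     if bits[0] == 1 and bits[3] == 0}

# Closure check: crossing any two members of S at any cut point yields
# members of S, so once every individual lies in S, crossover alone can
# never produce an individual outside S.
closed = all(c in S and d in S
             for a, b in itertools.product(S, repeat=2)
             for point in range(1, 4)
             for c, d in [one_point_xover(a, b, point)])
print(closed)
```

[The same check holds for any schema, since each offspring inherits every
fixed position from one of its two parents.]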
Once uniformized, that is, once all individuals are elements of S,
crossover can no longer generate individuals outside S. Hence, using
crossover alone, the GA is trapped. Mutation should then play the role of
leading the population to other regions of the search space. However, the
probability of generating new, fitter individuals is very low, which
explains why this re-diversification is not observed in practice.

It should be noted that a crossover-closed population may diversify as
long as mutation can yield fitter individuals with high probability. This
was confirmed experimentally using initially crossover-closed populations
that do not correspond to a local optimum. In this case the population
rapidly diversifies and then uniformizes. It is striking that the whole
diversification-uniformization process is very rapid, almost as rapid as
the single uniformization observed when evolution begins from a random
population.

To understand what goes on as a GA population evolves, I have started
studying very simple fitness functions. One thing that struck me concerns
a "flat function", that is, a fitness function under which all individuals
of the search space are equally fit. The result is that the population
uniformizes, as rapidly as usual, around a certain set of genotypes. In
one sense it is normal that the population uniformizes; in another sense,
since no point is fitter than the others, how does the GA choose the point
toward which it converges?

Before reading M. Vose's comments in the GA-List, I had never heard or
read of this kind of study. I would be greatly interested in hearing from
anyone who has performed studies along these lines, and in receiving any
feedback on the ideas presented here.
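[Uniformization on a flat fitness function is consistent with genetic
drift: with finite-population selection-with-replacement, diversity is
lost by sampling alone, even when every individual is equally fit. A
minimal sketch (all parameters are illustrative, not from the original
post):]

```python
import random

random.seed(1)
N, L, GENS = 20, 16, 200  # population size, genome length, generations (hypothetical)

def one_point_xover(a, b):
    """Return one child of a one-point crossover of two equal-length tuples."""
    p = random.randrange(1, len(a))
    return a[:p] + b[p:]

# Random initial population; fitness is flat, so "selection" reduces to
# uniform sampling with replacement.
pop = [tuple(random.randint(0, 1) for _ in range(L)) for _ in range(N)]
initial_diversity = len(set(pop))

for _ in range(GENS):
    pop = [one_point_xover(random.choice(pop), random.choice(pop))
           for _ in range(N)]

final_diversity = len(set(pop))
print(initial_diversity, final_diversity)
```

[Under neutral drift each locus fixes in a time on the order of the
population size, so after enough generations the population has
uniformized even though no genotype was ever preferred.]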
I have recently written a paper on all this, which will appear in the
proceedings of the workshop on "applied genetic and other evolutionary
techniques" at ECAI'94. It is available via anonymous ftp at
ftp.lifl.fr:pub/users/preux/ecai.ps.gz

The abstract of this paper is:

  The work reported here aims at describing and understanding the behavior
  of GAs within a ``reachable'' future, rather than their asymptotic
  behavior. This paper presents a study of the population diversity as a
  GA evolves, that is, of the population uniformization. Diversity is
  measured at the genotypic level. Extensive experiments have been
  performed to observe this uniformization on numerical functions and on
  some specifically tailored functions, using various strategies. Among
  other things, it is shown that the uniformization/exploration phase is
  always rapid, whichever objective function is optimized. We study the
  selection pressure and its crucial impact on the speed of
  uniformization. Analytical development of the notion of diversity is in
  progress. Some analysis of the effect of the various steps of the GA on
  diversity is given.

Any feedback is welcome.

P. Preux

------------------------------

From: econec@vax.ox.ac.uk
Date: Thu, 30 Jun 1994 11:34:39 +0100
Subject: Asymptotes and Genetic Invariance

Dear List,

Two questions:

1) Every GA person has in their subconscious an image of the "ragged
asymptotic curve" of GA convergence: the one that zooms up and then climbs
gradually in steps that become ever shorter and less frequent. Well, I'm
still fooling with that MLE problem and managed to get something looking
rather like a sigmoid instead. At first I thought it was just because my
fitness covered such a wide range. Doubling a fitness of 0.25 isn't going
to look like much on a graph that has to cope with a doubling of 25
million later on! So I tried log scales and graphs of percentage increase.
The former was just a mess and the latter still looked pretty sigmoid.
Appealing to biology, I had visions of some "choking" principle at the
early stages that was preventing full "growth", but then I reflected
(perhaps incorrectly) that even starting the search in the wrong space
shouldn't give a sigmoid, just a much flatter asymptote. Does anybody have
a similar experience, and did they figure out what the "problem" was? (I'm
getting perfectly good convergence, but I feel I might get there much
quicker if I could figure out the slow start ... It might also tell me
something interesting about the search space.)

2) Has anybody got any opinions on the GENE INVARIANT GENETIC ALGORITHM
(GIGA)? I've read two papers on it by Lewchuk and Culberson, but have
never seen it discussed anywhere else. It seems like quite an interesting
algorithm, but I can't get a very clear intuition about what it does; that
is to say, with a GA I don't just know the algorithm, I also have some
feeling for the events "behind" it. Is it the opinion of the GA community
that GIGA is a poor idea, or has it just not received much discussion?

Will summarise to taste,

Thanks,
Edmund

------------------------------

From: jvr@animal.inescn.pt (Joao Vasco Ranito)
Date: Thu, 30 Jun 1994 12:38:46 +0000
Subject: Re: Permutation Mapping (long)

>From: mgix@jpn.thomson-di.fr (Emmanuel Mogenet)
>Date: Tue, 28 Jun 1994 13:26:51 +0900 (JST)
>Subject: Permutation Mapping
>
>I was wondering if anybody could help me on that one:

Perhaps!

>I am looking for a way to compute a perfect one to one mapping between
>the set of all the permutations of the integers [0,n[ and the set of
>integers [0, n![

[snip]

If you represent a permutation by its "inversion table", you will get the
mapping you want. To get the inversion table of a permutation, you just
write down, for each element i, the number of elements bigger than i to
its left.
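[A small Python sketch of this construction, together with the
factorial-base numbering and the naive O(n^2) decoding described later in
this message; the function names are made up for illustration:]

```python
from math import factorial

def inversion_table(perm):
    """table[i] = number of elements bigger than i to the left of i in perm."""
    return [sum(1 for v in perm[:perm.index(i)] if v > i)
            for i in range(len(perm))]

def rank(table):
    """Read the inversion table as a factorial-base number in [0, n!-1]."""
    n = len(table)
    return sum(d * factorial(n - 1 - i) for i, d in enumerate(table))

def permutation(table):
    """Naive O(n^2) decoding: insert elements from largest to smallest.

    When element i is inserted, everything already placed is bigger than i,
    so inserting at position table[i] puts exactly table[i] bigger elements
    to its left; later (smaller) insertions cannot change that count.
    """
    n = len(table)
    perm = [n - 1]
    for i in range(n - 2, -1, -1):
        perm.insert(table[i], i)
    return perm
```

[For the worked example in this message, inversion_table([0, 3, 2, 1])
gives [0, 2, 1, 0], and rank([3, 2, 1, 0]) gives 23 = 4!-1.]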
For example, the permutation (0 3 2 1) has the inversion table (0 2 1 0),
meaning:

  "element 0 has 0 elements bigger than itself to its left"
  "element 1 has 2 elements bigger than itself to its left"
  "element 2 has 1 element bigger than itself to its left"
  "element 3 has 0 elements bigger than itself to its left"

(BTW, the biggest element obviously has 0 elements bigger than itself to
its left, so we can always drop the last 0.)

You can use the inversion table as a chromosome, with normal crossover
(yes, it always generates legal offspring!) and mutation (yes, it
sometimes generates *illegal* offspring!).

If you read the elements of the inversion table as the digits of a
factorial-base number (a*(n-1)! + b*(n-2)! + ... + 0*0!), you will have
your permutations "numbered", and you will have your mapping (although I
STRONGLY advise you to use the inversion table instead!!! ;-):

  (0 1 2 3) = (0 0 0 0) = 0*3! + 0*2! + 0*1! + 0*0! = 0
  ...
  (3 2 1 0) = (3 2 1 0) = 3*3! + 2*2! + 1*1! + 0*0! = 23 = 4!-1

Note: this numbering, or ordering, reflects the permutation's "entropy" or
disorganization.

To get the permutation back from the inversion table there is a "naive"
algorithm, O(n^2), and a "smart" algorithm, O(n log n), n being the number
of elements.

>Any pointers ?

I would recommend the "bible":

  "The Art of Computer Programming, Vol. 3: Sorting and Searching",
  D. E. Knuth, for inversion tables and algorithms on them.

  "The Art of Computer Programming, Vol. 2: Seminumerical Algorithms",
  D. E. Knuth, for factorial bases.

For the use of all this in GAs, along with "entropy", permutation
numbering, etc., see below, please ;-)...

>The reason for this is to be able to encode a permutation in a smarter,
>purely binary representation and perform genetic manipulations on that
>code.

I have tried it. I even started writing a paper on it (5 pages, so far),
but:

Pros:

a) You can use normal crossover and mutation, although mutation can
   produce illegal offspring.
But that's normal whenever you encode a parameter whose upper limit is not
a power of two.

b) If you take a closer look at the inversion table, you will notice that
   you can save some space on the chromosome: element i cannot have more
   than n-1-i elements bigger than itself.

c) To generate the initial population, you generate inversion tables, not
   permutations! Very good for getting a good sampling of the state space
   without fiddling with swapped elements!!

So, a good theoretical bonus! :-)))

Cons:

a) It doesn't work very well :-(. Although not a complete disaster, it is
   _easily_ beaten by specific crossover operators (PMX, OX, Cycle) on the
   "normal" encoding. I tried it on TSP and job-shop problems and the
   results were not very encouraging, in both convergence speed and
   solution quality. Nevertheless, I think that understanding _why_ it
   does not work would be interesting...

b) It is dead slow! You have to decode the inversion table into a
   permutation every time you apply the fitness function (usually defined
   on the permutation space), and this decoding is HEAVY for useful
   permutation sizes (100+). The best decoding algorithm is O(n log n),
   and it cannot beat the O(1) algorithm (no decoding at all! ;-) used in
   the "traditional" approach... So much for the theoretical bonus...

You know, it is frustrating to have a good ;-) idea and see that it
doesn't work... That's why I never finished the paper... but perhaps now
I'll do it: after all, I've never seen this published anywhere before.

Hope this helps,

Joao Vasco Ranito        e-mail: jvr@animal.inescn.pt
INESC Porto              tel: (02) 2094033
Lg. Mompilher 22
4000 Porto

------------------------------

From: Chris Jasek
Date: Wed, 29 Jun 1994 15:20:44 -0500
Subject: C++ simple GA code

I need to write a simple GA in C++, and I was wondering whether anyone
knows of any publicly available C++ GA code. I have code in C, but really
need it in C++. Any ideas? Thanks.
Chris

------------------------------

From: David Levine
Date: Wed, 29 Jun 1994 16:09:54 -0500
Subject: Ph.D. dissertation available

The following Ph.D. dissertation is now available:

  A PARALLEL GENETIC ALGORITHM FOR THE SET PARTITIONING PROBLEM
  DAVID LEVINE

In this dissertation we report on our efforts to develop a parallel
genetic algorithm and apply it to the solution of the set partitioning
problem, a difficult combinatorial optimization problem used by many
airlines as a mathematical model for flight crew scheduling.

We developed a distributed steady-state genetic algorithm, in conjunction
with a specialized local search heuristic, for solving the set
partitioning problem. The genetic algorithm is based on an island model
in which multiple independent subpopulations each run a steady-state
genetic algorithm, and fit strings occasionally migrate between the
subpopulations.

Tests on forty real-world set partitioning problems were carried out on up
to 128 nodes of an IBM SP1 parallel computer. We found that performance,
as measured by the quality of the solution found and the iteration on
which it was found, improved as additional subpopulations were added to
the computation. With larger numbers of subpopulations the genetic
algorithm was regularly able to find the optimal solution to problems
having up to a few thousand integer variables. In two cases, high-quality
integer feasible solutions were found for problems with 36,699 and 43,749
integer variables, respectively. A notable limitation we found was the
difficulty of solving problems with many constraints.

The dissertation is available:

(1) Via anonymous ftp: file /pub/tech_reports/reports/ANL9423.ps.Z from
    info.mcs.anl.gov
(2) Via the Web: http://www.mcs.anl.gov/home/levine/GA/ga.html
(3) Via hardcopy: send electronic mail to Judy Beumer (beumer@mcs.anl.gov)
    and ask for tech report ANL-94-23.
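[As a rough illustration of the island-model scheme described in the
abstract, a toy Python sketch: every size, rate, the ring topology, and
the onemax fitness function are hypothetical stand-ins, not taken from the
dissertation.]

```python
import random

random.seed(2)
ISLANDS, POP, L = 4, 10, 12            # hypothetical sizes
GENS, MIGRATE_EVERY = 60, 10           # hypothetical schedule

def fitness(ind):
    return sum(ind)                    # toy onemax fitness, not set partitioning

def steady_state_step(pop):
    """One steady-state step: a child of two tournament winners replaces the worst."""
    a = max(random.sample(pop, 2), key=fitness)
    b = max(random.sample(pop, 2), key=fitness)
    p = random.randrange(1, L)
    child = a[:p] + b[p:]              # one-point crossover, keep one child
    if random.random() < 0.1:          # occasional single-bit mutation
        i = random.randrange(L)
        child = child[:i] + (1 - child[i],) + child[i + 1:]
    pop.remove(min(pop, key=fitness))
    pop.append(child)

# Independent subpopulations (islands) of random bitstrings.
islands = [[tuple(random.randint(0, 1) for _ in range(L)) for _ in range(POP)]
           for _ in range(ISLANDS)]

for gen in range(1, GENS + 1):
    for pop in islands:
        steady_state_step(pop)
    if gen % MIGRATE_EVERY == 0:       # occasional migration of fit strings
        for k in range(ISLANDS):
            migrant = max(islands[k], key=fitness)
            neighbor = islands[(k + 1) % ISLANDS]
            neighbor.remove(min(neighbor, key=fitness))
            neighbor.append(migrant)

best = max((ind for pop in islands for ind in pop), key=fitness)
print(fitness(best))
```

[In the real system each island would run on its own processor and the
fitness function would embed the set-partitioning constraints; the sketch
only shows the steady-state-plus-migration control flow.]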
David Levine                       levine@mcs.anl.gov
MCS 221 C-216                      http://www.mcs.anl.gov/home/levine
Argonne National Laboratory        (708) 252-6735
Argonne, Illinois 60439            Fax: (708) 252-5986

------------------------------

End of Genetic Algorithms Digest
******************************