Genetic Algorithms Digest    Wednesday, June 22, 1994    Volume 8 : Issue 21

 - Send submissions to GA-List@AIC.NRL.NAVY.MIL
 - Send administrative requests to GA-List-Request@AIC.NRL.NAVY.MIL
 - anonymous ftp archive: FTP.AIC.NRL.NAVY.MIL (Info in /pub/galist/FTP)

Today's Topics:
 - One offspring or two. (Re: v8n18/19/20)
 - Convergence of Canonical GA (Re: v8n20)
 - Quadruplets better than twins? (Re: v8n18/19/20)
 - Feature Selection using Genetic Algorithms
 - GA in MATLAB
 - GA accelerator hardware
 - CFP: GA's in Telecommunication Systems
 - Papers available (2 messages)

----------------------------------------------------------------------

****************************************************************************
CALENDAR OF GA-RELATED ACTIVITIES: (with GA-List issue reference)

The IEEE Conference on Evolutionary Computation, Orlando (v7n26)  Jun 26-30, 94
FOGA94 Foundations of GAs Wkshop, Estes Park, Colorado (v7n26) Jul 30-Aug 3, 94
SAB94 3rd Intl Conf on Sim of Adaptive Behavior, Brighton (v7n11)  Aug 8-12, 94
ECAI-94, 11th European Conference on AI, Amsterdam (v7n23)         Aug 8-12, 94
ECAI-94 Wkshp on Applied Genetic & Other Evol Algs, Amsterdam (v8n5)  Aug 9, 94
IEEE/Nagoya Univ WW Wkshp on Fuzzy Logic & NNs/GAs, Japan (v7n33)  Aug 9-10, 94
ISRAM94 Special Session on Robotics & GAs, Maui, Hawaii (v7n22)   Aug 14-17, 94
TECOM AI Conference (v8n19)                                       Sep 13-16, 94
Evolution Artificielle 94, Toulouse, France (v8n10)               Sep 19-23, 94
COMPLEX94 2nd Australian National Conference, Australia (v7n34)   Sep 26-28, 94
PPSN-94 Parallel Problem Solving from Nature, Israel (v7n32)       Oct 9-14, 94
GAs in Image Processing and Vision Colloquium, Savoy Place (v8n16)   Oct 20, 94
AI'94 Workshop on Evol Comp, Armidale, NSW, Australia (v8n15)        Nov 22, 94
ACM SAC '95 Symposium on Applied Computing, Tennessee (v8n20)     Feb 26-28, 95
EP95 4th Ann Conf on Evolutionary Programming, San Diego, CA (v8n6)  Mar 1-4, 95
ICTS95 3rd Int'l Conf.
on Telecommunications, Tennessee (v8n21)                          Mar 16-19, 95
ICANNGA95 Intl Conf on Artificial NNs and GAs, France (v8n10)     Apr 18-21, 95
ECAL95 3rd European Conf on Artificial Life, Granada, Spain (v8n5)  Jun 4-6, 95
ASI-AA-95 Practice and Future of Autonomous Agents (v8n19)    Sep 23 - Oct 1, 95

(Send announcements of other activities to GA-List@aic.nrl.navy.mil)
****************************************************************************

------------------------------

From: terry@santafe.edu
Date: Wed, 15 Jun 94 03:33:33 MDT
Subject: Re: One offspring or two.

Although Nick Radcliffe's question wasn't specifically about diversity,
diversity was the difference that people seemed to think would be affected.
It may seem a trivial point to some, but you can't answer that question
until you say what diversity is.  There are many ways to do that.  Some
measures will tell you one form of crossover produces more diversity than
the other, while other measures will tell you they produce the same amount.

Suppose we have a population of size N (where N is even), binary strings of
length L, and two methods of producing a new population:

  M1  One-point crossover: 2 parents (chosen randomly, with replacement)
      produce 2 children, and one of these is chosen (with equal
      probability) to enter the next population.  This is done N times to
      fill the next population.

  M2  One-point crossover: 2 parents (chosen randomly, with replacement)
      produce 2 (complementary) children, which both go into the next
      population.  This is done N/2 times to fill the next population.

Suppose we have a diversity measure d() : P -> Reals.  The function d gives
us a measure of diversity.  If we take a population P and produce P1 from P
using M1 and P2 from P using M2, we are interested in the relationship
between d(P1) and d(P2).  There are diversity measures that would consider
these methods equal in terms of their diversity production, e.g. d(X) = 0
for all X.
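For concreteness, the two production methods M1 and M2 can be sketched in
Python (a sketch only; the function names are mine, and, as in the
definitions above, parents are chosen uniformly at random with replacement
and no fitness is involved):

```python
import random

def one_point_cross(p1, p2):
    """One-point crossover of two equal-length binary strings (as lists).
    Returns the two complementary children."""
    point = random.randrange(1, len(p1))  # cut point in 1 .. L-1
    c1 = p1[:point] + p2[point:]
    c2 = p2[:point] + p1[point:]
    return c1, c2

def method_m1(pop):
    """M1: N times, pick 2 parents with replacement, cross them,
    and keep ONE of the two children, chosen with equal probability."""
    return [random.choice(one_point_cross(random.choice(pop),
                                          random.choice(pop)))
            for _ in range(len(pop))]

def method_m2(pop):
    """M2: N/2 times, pick 2 parents with replacement, cross them,
    and keep BOTH complementary children."""
    new_pop = []
    for _ in range(len(pop) // 2):
        c1, c2 = one_point_cross(random.choice(pop), random.choice(pop))
        new_pop.extend([c1, c2])
    return new_pop
```

Both methods fill a new population of size N; the only difference is
whether one or both of the complementary children survive.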
Another measure, d(X) = the number of strings in X whose complement is not
also in X, would usually result in d(P1) > d(P2).  Both of these are
trivial, but they at least illustrate that how you define diversity can
change the answer to the question.

There are many other, more realistic diversity measures.  For instance:

 - Mean, minimum, or maximum Hamming distance.
 - Range or distribution of fitnesses in the population.
 - Some measure of allele distribution at each locus.
 - Number or distribution of schemata in the population.
 - How much you can infer about the parent population's individuals.

This is just the beginning; there are at least 30 measures of complexity
that could arguably be used to measure diversity.

As a concrete example, the following is an additive measure based on the
variation found at each locus:

              L
             ---
    d(P)  =  \    min(F_i, 1 - F_i)
             /
             ---
             i=1

where
                   N
              1   ---
    F_i   =   -   \    P_i(j)
              N   /
                  ---
                  j=1

and

    P_i(j) = 1 if the jth member of P has a 1 at locus i, 0 otherwise.

[This is all merely a long way of saying that we compute a value for each
locus and add them to get the overall diversity.  The value computed for a
locus is 0.5 when the locus is one-half 0's and one-half 1's, and falls to
0.0 when the locus is fully converged.  Thus 0 <= d(P) <= L/2, and a higher
d(P) value is taken as indicating greater diversity.]

For this (perfectly reasonable) measure of diversity, I claim without proof
that E(d(P1)) = E(d(P2)).  That is, the expected value of d() for the
resultant populations is the same.  I have most of a proof of this
(Mathematica has the rest) if anyone is interested.

If we divide the interesting (whatever that is) diversity measures into 3
classes:

  0 = the measure indicates the methods produce equal diversity.
  1 = the measure indicates M1 produces more diversity.
  2 = the measure indicates M2 produces more diversity.

and use #n to denote the number of measures in class n, then my feeling is
that #0 > #2 and #1 > #2.  This seems to run counter to the responses to
Nick's posting.
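The locus-wise measure above translates directly into code.  A minimal
sketch (the function name is mine), assuming the population is a list of
equal-length 0/1 lists:

```python
def locus_diversity(pop):
    """Additive locus-wise diversity d(P): for each locus i, F_i is the
    fraction of individuals carrying a 1 there, and the locus contributes
    min(F_i, 1 - F_i).  Result ranges from 0 (fully converged) to L/2."""
    n = len(pop)
    length = len(pop[0])
    total = 0.0
    for i in range(length):
        f_i = sum(ind[i] for ind in pop) / n  # fraction of 1s at locus i
        total += min(f_i, 1.0 - f_i)
    return total
```

A fully converged population scores 0, and a population that is exactly
half 1s at every locus scores L/2, matching the bounds stated above.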
I'm not sure why people think that M2 would generate more diversity (but
I'm also not sure what they mean by "diversity").

Terry Jones.

[ WMS: To be provocative, let me ask whether an expected value analysis is
sufficient.  Isn't the full distribution of importance? ]

------------------------------

From: Guenter Rudolph
Date: Sat, 18 Jun 94 19:16:52 +0200
Subject: Re: Convergence of Canonical GA

Aaron Konstam cited my paper "Convergence Analysis of Canonical Genetic
Algorithms", IEEE Trans. on Neural Networks 5(1), Jan 1994, pp. 96-101,
and asked for explanations.  Here they come:

The 'Canonical Genetic Algorithm' (CGA) is the standard GA using bit
strings, mutation probabilities > 0, arbitrary crossover operators, and
proportionate reproduction (scaling is not considered, but it does not
affect the results).

First, one has to define what is meant by convergence.  Let Z_t denote the
best fitness value in the population at generation t.  Then the CGA is
said to converge to the global optimum if Prob(Z_t = f*) -> 1 as
t -> infinity, where f* is the value of the global optimum.  Clearly, Z_t
is a random variable.  There are many different modes of stochastic
convergence of random variables (E. Lukacs, Stochastic Convergence, 2nd
ed., Academic Press: New York, 1975, offers about a dozen different
concepts).  The mode of convergence used above is termed 'convergence in
probability'.  This kind of convergence is weak enough to be a minimum
requirement for stochastic algorithms.

What I have shown via Markov chain analysis is that Prob(Z_t = f*) < 1 as
t -> inf. for the CGA unless the fitness function is constant.  Therefore,
the CGA is not globally convergent.

Now consider another sequence of random variables: Let

    R_t = max{ Z_i : i = 0, 1, ..., t }

be the 'record sequence' of the sequence Z_0, Z_1, ...  In other words,
R_t is the fitness value of the best solution found up to generation t.
Via Markov chains I have shown that Prob(R_t = f*) -> 1 as t -> inf. for
the CGA.
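The distinction between Z_t and R_t can be illustrated with a toy
simulation (this is an illustration, not Rudolph's Markov-chain argument):
a minimal Python sketch of proportionate reproduction with per-bit
mutation and no elitism, using a OneMax fitness and omitting crossover for
brevity; all names and parameter values are mine.  Z_t may fall after the
optimum has been seen, while the record sequence R_t, being a running
maximum, never decreases:

```python
import random

def toy_cga(n=20, length=10, p_mut=0.05, generations=200, seed=1):
    """Track Z_t (best fitness in generation t) and R_t (best found so
    far) for a toy GA: fitness-proportionate selection plus bitwise
    mutation, no elitism.  Fitness is OneMax (the number of 1-bits)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(n)]
    z, r = [], []
    for _ in range(generations):
        fits = [sum(ind) for ind in pop]
        z.append(max(fits))
        r.append(max(r[-1], z[-1]) if r else z[-1])
        # proportionate reproduction; +1 keeps all-zero strings selectable
        parents = rng.choices(pop, weights=[f + 1 for f in fits], k=n)
        # flip each bit independently with probability p_mut
        pop = [[(1 - b) if rng.random() < p_mut else b for b in ind]
               for ind in parents]
    return z, r
```

Because there is no elitism, an optimal string that appears in some
generation can be lost again, so Z_t oscillates; R_t is monotone, which is
the sense in which the best-so-far solution converges.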
That means that the CGA will generate the optimal solution at some time.
Why does this happen?  Since the mutation probability is larger than zero,
any point x of the search space can be generated via mutation in one step
from an arbitrary individual, with a probability bounded away from zero.
This probability is very small, but larger than zero.  Moreover, the
expected time until the occurrence of this event is finite (the event
occurs with probability one), but the time to wait until it occurs may be
transcomputationally long!

The result for the record sequence R_t is also contained in the paper of
Joe Suzuki in the 5th ICGA proceedings (1993).  He used a Markov model
(obviously independently) introduced by M. Vose and T. E. Davis, but the
result is the same.

But why does Z_t not converge?  When using proportionate reproduction, an
optimal solution present at time t is lost with probability one at some
later moment > t, even if there are n optimal individuals (n = popsize).
But mutation guarantees that the optimal solution will be found again.
Then it will be lost, found, lost, ...  In short: the sequence Z_t
oscillates and cannot converge.

Conclusion:

 o One can use a CGA for static optimization if one keeps track of the
   best solution.
 o We must be careful with the usage of terms like 'convergence': a
   population does NOT converge in a mathematical sense!  Also, 'premature
   convergence' of GAs does not express 'convergence' in a mathematical
   sense: here, the GA is attracted by a locally optimal solution with a
   wide basin, and it may take a long, long time until this basin is left
   (but it will happen!).

Hope this helps!

Guenter Rudolph
Informatik Centrum Dortmund e.V.
Joseph-von-Fraunhofer-Str. 20
D-44227 Dortmund (Germany)

------------------------------

From: Gary Roberts
Date: Wed, 15 Jun 94 16:00:10 PDT
Subject: Quadruplets better than twins?

What do we mean by "children"?  I think a look back at mammalian biology
might be of interest.
Please realize that I am not a biologist and no doubt have an
oversimplified view.  When I use "recombination" here, I mean it in the
genetic algorithm sense.  "Recombination" is not analogous to the union of
egg and sperm.  The production of germ cells (egg or sperm) is where the
analogue for "recombination" might be derived.  In the biological process,
one cell with diploid chromosomes produces four haploid germ cells.  No
genes are lost in this process.  Of course, it is unlikely that any of the
germ cells will actually reproduce, let alone all four.  Still, this seems
to follow the multiple-children approach, at least so far as germ cell
fitness is concerned.  I guess what we have here is partial genetic
information from two pairs of grandparents, recombined into four germ
cells.  The "fitness" function is nontrivial.

------------------------------

From: oyman@ds5500.cc.boun.edu.tr (A.Irfan Oyman-BogaziciUni-CmpEngDept)
Date: Wed, 15 Jun 94 10:09:26 -0400
Subject: Feature Selection using Genetic Algorithms

Dear researchers,

I am a research assistant at Bogazici (Bosphorus) University, Turkey.  My
research area is feature selection using Genetic Algorithms.  This is also
the topic of my Master of Science thesis.  Can you recommend any papers in
the area of 'feature selection'?

(thank you in advance, irfan)

------------------------------

From: Sourav Kundu
Date: Thu, 16 Jun 1994 13:17:58 +0900 (JST)
Subject: GA in MATLAB

Hello GA-list Subscribers,

I would like to know if anybody has used or written a GA engine in MATLAB.
Is there any public domain MATLAB code for GAs available?  I would like to
link a SIMULINK model with a GA engine.  I know I could do it via a link
to C with MEX files, but I would prefer to do it directly in MATLAB.  If
any code is available for use, it would save me the time of writing one.
Any pointers would be appreciated.

Rgds to all,
-kundu

*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*
SOURAV KUNDU                      ~ Internet:
Control Engg.
Laboratory                        ~ sourav@control.prec.metro-u.ac.jp
Department of Precision Engg.     ~ sourav@yamabuki.comp.metro-u.ac.jp
Faculty of Technology,            ~ sourav@archsci.arch.su.EDU.AU
Tokyo Metropolitan University     ~ Voice : +81(426)-77-1111 EXTN.4251
1-1 Minami-Ohsawa, Hachioji-shi   ~ Fax   : +81(426)-77-2717
Tokyo 192-03, JAPAN               ~ Time  : GMT +9 Hours

"In nature, there are neither rewards nor punishments -- there are
consequences."

------------------------------

From: "Charles C. Palmer"
Date: Fri, 17 Jun 94 14:28:38 -0500
Subject: CFP: GA's in Telecommunication Systems

I'm putting together a session at this conference on GA's in
Telecommunications.  The conference is kept small so as to encourage
interaction between the participants and to maintain the quality of the
presentations.  The previous conferences have been superb and well worth
the attendees' while.  The conference itself is VERY well run and has been
quite interesting in the past.  The talks will be 20 minutes each, given
to an audience of design and optimization folks.

If you are interested, please send your paper (preferred), or titles and
abstracts of potential presentations to be considered for the conference,
via email (for speed) either to me at cpalmer@watson.ibm.com or directly
to the conference chair, Bezalel Gavish (address below), by August 1,
1994.  Sending more than one abstract is encouraged!

Charles C. Palmer  ( cpalmer@swamp.watson.ibm.com )
Network Design Tools
Tie 863-7411    914/784-7411
IBM T. J.
Watson Research Center
P.O. Box 704
Yorktown Heights, NY 10598

Bezalel Gavish
Owen Graduate School of Management
Vanderbilt University
Nashville, TN 37203
Bitnet: GAVISHB@VUCTRVAX
Tel: (615) 322-3659
FAX: (615) 343-7177

[ WMS: The complete Call for Papers is on our FTP site under
pub/galist/info/conferences/ICTS95. ]

------------------------------

From: Kazuhiro M. Saito
Date: Mon, 20 Jun 94 18:29:15 EDT
Subject: A paper available

Hi.  The following paper is available via anonymous ftp:

    Meshing of Engineering Domains by Meitotic Cell Division

    Kazuhiro Saitou
    Mark J. Jakiela
    Computer-Aided Design Laboratory
    Massachusetts Institute of Technology

    To appear in Proceedings of ALife IV, July 6-8 1994, MIT,
    Cambridge, MA

Abstract

A new technique for triangular meshing of a given two-dimensional domain
is developed, where meshing is done by a process of simulated meitotic
cell division.  Dead cells become the elements of the discretized domain,
and the domain is filled by a process of cell growth.  A classifier
system evolves the rules for developmental cell growth in the domain.  A
given domain is meshed repeatedly using the growth rules, and the quality
of each mesh is checked with a fitness function.  The preliminary results
show that an evolved set of classifiers for a particular test domain
could also be effectively used for geometrically and topologically
similar domains.

To obtain an electronic copy of this paper (6 pages):

    ftp ftp.cognet.ucla.edu
    Name: anonymous
    Password:
    cd /pub/alife/public
    binary
    get kazu.ps.gz
    quit

Then at your system:

    gunzip kazu.ps.gz
    lpr kazu.ps

If you cannot obtain an electronic copy, please send a request for a hard
copy to kazu@mit.edu

Kazuhiro Saitou
Graduate Student Research Assistant
Computer-Aided Design Laboratory
Massachusetts Institute of Technology
kazu@mit.edu

------------------------------

From: "Laurence D.
Merkle"
Date: Fri, 17 Jun 1994 08:22:21 -0400
Subject: Availability of Paper

    Application of the Parallel Fast Messy Genetic Algorithm
    to the Protein Folding Problem

    Laurence D. Merkle, George H. Gates, Jr., Gary B. Lamont
    Department of Electrical and Computer Engineering
    Air Force Institute of Technology Graduate School of Engineering
    Wright-Patterson AFB, OH 45433
    {lmerkle, ggates, lamont}@afit.af.mil

    Ruth Pachter
    Wright Laboratory
    Wright-Patterson AFB, OH 45433
    pachterr@ml.wpafb.af.mil

The capability to predict a polypeptide's tertiary structure given its
amino acid sequence is important to numerous scientific, medical, and
engineering applications, including understanding the mechanism by which
specific enzymes function and/or designing new polypeptides with specific
biochemical, mechanical, and/or optical properties.  A common theme in
many approaches to the protein folding problem is the minimization of an
energy function in conformational space.

We discuss a parallel genetic algorithm development effort which focuses
on the global minimization of a semi-empirical energy model.  In
particular, a parallel fast messy GA (PFMGA) is designed using the island
model and implemented on the Intel Paragon.  Each processor independently
performs probabilistically complete initialization and the primordial
phase.  The subpopulations are then recombined into a single population,
and the juxtapositional phase is carried out serially on a single
processor.  We discuss the relationship between expected execution time
and processor count for this design.

Each individual in the PFMGA population, once overlaid on the competitive
template, is a binary string representation of a subset of the internal
coordinates required to fully specify a conformation.  The values of
additional coordinates sufficient to fully specify the molecule's
conformation, as well as empirical physical constants used by the energy
model, are specified during initialization.
Thus, the fitness of an individual is determined by overlaying it on the
competitive template, `decoding' each of the independently variable
coordinates, calculating the interatomic distances, and then calculating
the energy associated with the conformation.

We describe experiments designed to assess the scalability of the PFMGA
design, in particular for the application to the energy minimization of
[Met]-Enkephalin.  The PFMGA obtains sub-linear speedup through 32
processors, at which point execution time begins to increase.  The
sublinear nature of the speedup is due primarily to the sequential nature
of the juxtapositional phase, and secondarily to the increased
communication required for recombination of the population between the
primordial and juxtapositional phases.  The energies obtained by the
PFMGA are of the same order of magnitude as the lowest known for the
model.  Also, there is no statistically significant difference between
the energies obtained at different processor counts.

Capt Laurence D. Merkle (Larry)
PhD Student
Air Force Institute of Technology Graduate School of Engineering

------------------------------

End of Genetic Algorithms Digest
******************************