Genetic Algorithms Digest Thursday, August 8 1991 Volume 5 : Issue 21 - Send submissions to GA-List@AIC.NRL.NAVY.MIL - Send administrative requests to GA-List-Request@AIC.NRL.NAVY.MIL Today's Topics: - Site Sought for ICGA-93 - Re: Building Block Hypothesis Considered Harmful (2 Messages) - GA Software Survey - Corrections to paper in ICGA-91 proceedings ********************************************************************** CALENDAR OF GA-RELATED ACTIVITIES: (with GA-List issue reference) IJCAI 91, International Joint Conference on AI, Sydney, AU Aug 25-30, 1991 First European Conference on Artificial Life (v5n10) Dec 11-13, 1991 Canadian AI Conference, Vancouver, (CFP 1/7) May 11-15, 1992 10th National Conference on AI, San Jose, (CFP 1/15) Jul 12-17, 1992 ECAI 92, 10th European Conference on AI (v5n13) Aug 3-7, 1992 Parallel Problem Solving from Nature, Brussels, (CFP 4/15) Sep 28-30, 1992 (Send announcements of other activities to GA-List@aic.nrl.navy.mil) ********************************************************************** ---------------------------------------------------------------------- From: goldberg@gal2.ge.uiuc.edu Date: Tue, 6 Aug 91 14:31:28 GMT Subject: Site Sought The ICGA Conference Committee is seeking proposals for siting ICGA 93. Sites should be capable of handling 400-500 people in a large auditorium, poster sessions, and workshops. A mix of low-cost and not-so-Spartan housing should be available, and one or more volunteers to handle local arrangements should be identified. If you are interested in exploring the possibility of hosting ICGA 93 send email to Dave Schaffer (ds1@philabs.philips.com) or Dave Goldberg (goldberg@gal2.ge.uiuc.edu or goldberg@vmd.cso.uiuc.edu). Dave Goldberg ------------------------------ From: das@CS.ColoState.EDU (rajarshi das) Date: Sun, 4 Aug 91 11:34:39 MDT Subject: Re: Building Block Hypothesis Considered Harmful In GA-List v5n19, Grefenstette raises some very important issues regarding the Strong Building Block Hypothesis (SBBH). His claim is that SBBH does not work in practice because several factors contributes to its failure. The most interesting factor according to him is the biased sampling due to previous convergence in a GA. To prove this Grefenstette provides the following example: " It is easy to define problems that are "Deceptive" in the sense implied by the SBBH, but are in fact easy for the GA to optimize. Consider a 10-bit space representing the interval [0 .. 1] in binary encoding. That is, 0000000000 represents 0.0 and 1111111111 represents 1.0. We want to maximize f. Let f be defined: f(x) = x*x, except for the following special cases: f(0111111111) = 1.01 ===> String A0 f(0011111111) = 1.02 ===> String A1 f(0001111111) = 1.03 ===> String A2 f(0000111111) = 1.04 ===> String A3 f(0000011111) = 1.05 ===> String A4 f(0000001111) = 1.06 ===> String A5 f(0000000111) = 1.07 ===> String A6 f(0000000011) = 1.08 ===> String A7 f(0000000001) = 1.09 ===> String A8 f(0000000000) = 1.10 ===> String A9 This is nearly as "Deceptive" as you can get: an enumeration of all schema competitions shows that except for 0000000000 and 000000000#, every schema representing the optimum has a true average payoff less than the competing schema representing 1111111111. Yet, the GENESIS program with population size 100 and the default parameter settings finds the optimum after a few thousand trials." Although this is an interesting problem, I am not sure that it succeeds in proving Grefenstette's hypothesis. As Grefenstette himself explains, the optimum is found because once the population converges to the suboptimal 1111111111, the special cases arise one by one, thanks to mutation. The question is, how does the search proceed once the population has converged in this problem. It is easy to see that the special cases provide the link (in the hamming space) between 1111111111 and 0000000000 such that mutation is able to "hill-climb" from the suboptimum to the optimum by flipping the necessary bits, one at a time. To prove this, lets remove say, String A5 from the list of special cases. Then GENESIS would fail to find the optimum every time (in say 15,000 trials) and converge to String A4, unless one of the four strings from A6 to A9 happen to be in the initial population or two (extremely lucky) consecutive mutations occur at positions 5 and 6 within string A4. Going back to the original problem, if we change the crossover probability to zero after the population has converged to the suboptimal 1111111111, it will have no effect and GENESIS is still going to find the optimum with other parameters set to their default values (pm = 0.001). In fact, we can speed up the search process by decreasing the population size to just two (or one) individuals (GENESIS finds the optimum in less than 200 trials). All this goes to show that: (1) After the population has converged in this problem, we are talking of an entirely different situation where the usefulness of crossover or having a population of individuals (read GA) is nil. It is only mutation that is working its way up the "hill"--a hill that happens to be a nice monotonically increasing function in hamming space. (2) Such a special case might not say a whole lot regarding the validity of SBBH in general, especially when there is a lot of diversity in the gene pool. (3) And the claim that "this (problem) is easy for the GA to optimize" should be made with the much caution. However, I do share some of Grefenstette's concern that a static analysis of the (average) fitness of low order schema in the initial population might not tell us a whole lot about how the GA is going to allocate trials in succeeding generations when the sampling in the population is biased. Grefenstette discusses such a problem in his paper in the Proceedings of ICGA 89. ------------------------------ From: gref@AIC.NRL.Navy.Mil Date: Wed, 7 Aug 91 09:15:04 EDT Subject: Re: Building Block Hypothesis Considered Harmful Perhaps I did not make the logical form of my message clear. I am arguing that *some* problems that are "deceptive" in the sense implied by the SBBH are in fact easy for GAs. I did not assert that *all* deceptive problems are easy, or that every variation of my example (e.g. removing string A5) would also be easy. I showed that the example is highly deceptive, and that it is actually easy for a standard GA to optimize. The observation that it is contrived or in some ways unusual is not relevant to my point. Nevertheless, I would claim that the example in fact shows a fairly common feature of a lot of practical problems: that convergence of the population often shifts the competition in favor of some schemas by eliminating their competitors, and that this happens in ways that the usual static analysis of the schema averages cannot reveal. By the way, I recommend that people interested in this topic read the discussion by Melanie Mitchell and Stephanie Forrest in their paper, "What is deception anyway, and what does it have to do with GAs?". They had a draft of this at the conference. ------------------------------ From: Jose Luiz Ribeiro Filho Date: Fri, 19 Jul 91 10:34:09 +0100 Subject: GA Software Survey Dear all GA friends, We are writing a survey on the available software packages for Genetic Algorithms. One of our main problem is how to gather information on such tools. This sound like a "call for information". Could you help us? We are looking for a brief description of the software tools, their high level structure and, if possible, the technical details. Thank you very much for your help. Please reply to: Jose Luiz Ribeiro Filho Department of Computer Science University College London Gower Street London WC1E 6BT England e-mail: zeluiz@cs.ucl.ac.uk - Jose Luiz ------------------------------ From: lipsitch@sfi.santafe.edu (Marc Lipsitch) Date: Thu, 1 Aug 91 19:08:03 MDT Subject: Corrections to paper in ICGA-91 proceedings THis is to correct some mistakes in my recent paper in the proceedings of ICGA91. Some of the data, in particular the data for the ALTERN and RANDOM target strings on runs with crossover, was mistaken. The new runs, with mistakes corrected, do not contradict any of the major conclusions. However, the following things have changed: 1) For the RANDOM string, the distribution of mutation-only-better vs. mut&xover-better is similar to that of ALLONES. The tables in the appendix I distributed at the conference show that for over half the runs the two methods of adaptation did the same. That was a mistake. 2) For the ALTERN string, for runs with crossover, the mean-walk-length vs. performance graph looks more like the run with mutation-only than like the graph shown in the appendix. 3) For the ALTERN target string, as for the others, crossover was more helpful at long correlation lengths than at short ones. The paper and appendix mistakenly show that crossover was helpful for all correlation lengths for the ALTERN string. I apologize for these mistakes and hope they have not inconvenienced anyone. A corrected version of the Appendix will be available in a few days. Write to me at: Marc Lipsitch 431 West Wesley Road NW Atlanta, GA 30305. ------------------------------ End of Genetic Algorithms Digest ******************************