Genetic Algorithms Digest      Tuesday, March 14, 1995      Volume 9 : Issue 13

- Do NOT send email or reply to gadistr@aic.nrl.navy.mil (GA List Moderator)
- Send submissions (articles) to GA-List@AIC.NRL.NAVY.MIL
- Send administrative (subscribe, unsubscribe, change of address, etc.)
  requests to GA-List-Request@AIC.NRL.NAVY.MIL
******************************************************************************
- You can access back issues, GA code, conference announcements, etc., either
  through the WWW at URL http://www.aic.nrl.navy.mil/galist/ or through
  anonymous ftp at ftp.aic.nrl.navy.mil [192.26.18.68] in /pub/galist.
******************************************************************************

Today's Topics:

- Super-Linear Speedup (Re: v9n11) (2 Messages)
- No Free Lunch (Re: v9n12)
- Macready's NFL comments (Re: v9n12)
- EP reply (Re: v9n12)

----------------------------------------------------------------------
*******************************************************************************
CALENDAR OF GA-RELATED ACTIVITIES: (with GA-List issue reference)

ICTS95     3rd Intl Conf on Telecommunications, Tennessee    (v8n21) Mar 16-19, 95
AISB       Wrkshp on Evol Comp, Univ of Sheffield, UK        (v8n45) April 3-4, 95
ICANNGA95  Intl Conf on Artificial NNs and GAs, France       (v8n10) Apr 18-21, 95
IAS-95     Intelligent Adaptive Systems, Melbourne Beach FL  (v8n32) April 26, 95
CAD95      Computer Aided Design / "GA Day", Yalta City, CIS (v8n46) May 4-13, 95
CSCS10     10th Intl Conf on CONTROL SYSTEMS & CS, Romania   (v8n37) May 24-26, 95
ECAL95     3rd European Conf on Artificial Life, Granada, Spain (v8n5) Jun 4-6, 95
GWIC95     Golden West Intl Conf on Intell Systems, San Fran (v8n49) Jun 12-14, 95
ML95       GP Workshop: Theory to real-world applications, CA (v9n8) July 9, 95
AAICG95    Symposium on Appl of AI in Geophysics, Boulder, CO (v8n39) Jul 12, 95
ICGA-95    Sixth Intl Conference on GAs, Pittsburgh PA       (v8n32) July 15-20, 95
PASE95     Wrkshp Parallel Appl in Stat and Econ, Trier      (v8n39) Aug 29-Sep 2, 95
EA95       Evolution Artificielle, Telecom Bretagne, Brest, Fr (v9n1) Sept 4-8, 95
GALESIA'95 GAs in Eng Systems, Univ of Sheffield, UK         (v8n45) Sept 12-14, 95
AIPetro95  AI in the Petroleum Ind, Lillehammer, Norway      (v8n48) Sept 13-15, 95
ASI-AA-95  Practice and Future of Autonomous Agents          (v8n19) Sep 23-Oct 1, 95
SOCO95     Soft Computing: Fuzzy Logic, NNs and GAs, Rochester (v9n5) Oct 24-27, 95
           Genetic Methods for Routing & Scheduling, New Orleans (v8n50) Oct 29-Nov 1, 95
AAAI       GP Symposium (Fall Series), Cambridge, MA         (v8n43) Nov 10-12, 95
IEEE       Wrkshp on Natural Algorithms in Signal Proc, Essex (v9n7) Nov 12-14, 95
WWW'95     on Fuzzy Logic and NNs/Evol Comp, Nagoya, Japan   (v9n6) Nov 14-15, 95
ICEC '95   IEEE Intl Conf on Evol. Computing, Perth          (v8n36) Nov 29-Dec 1, 95

(Send announcements of other activities to GA-List@aic.nrl.navy.mil)
*******************************************************************************

------------------------------

From: giani@DI.UniPi.IT (Antonella Giani)
Date: Thu, 9 Mar 1995 10:43:38 +0100
Subject: Re: Super-Linear Speedup when Parallelizing GAs

> From time to time, I have heard it mentioned that GAs may
> deliver SUPER-LINEAR SPEEDUP in the sense that the
> parallel GA solves (at least for some problems) faster when
> semi-isolated sub-populations (Sewall Wright's demes) are
> used. In other words, the problem SOLVES IN FEWER
> GENERATIONS with subpopulations than without them.
>
> Is anyone aware of any published results in this area?
>
> John Koza
> Stanford University

We observed a super-linear speedup in two parallel classifier systems.

The first one, PECS, applies the punctuated-equilibria model to CSs based on
the Michigan approach. Several CSs, working on distinct sets of rules, evolve
in parallel and periodically interact by exchanging sets of rules, chosen
probabilistically according to their fitness. PECS learns in fewer
generations than a sequential CS.
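The periodic, fitness-biased rule exchange described above can be sketched in
a few lines. This is a toy reconstruction from the description only, not the
authors' code: the ring topology, the `(rule, fitness)` pair layout, and all
function names are my own illustrative assumptions.

```python
import random

def pick_migrants(population, n, rng):
    """Fitness-proportional (roulette-wheel) choice of n rules to send.
    `population` is a list of (rule, fitness) pairs."""
    weights = [fitness for _, fitness in population]
    return rng.choices(population, weights=weights, k=n)

def exchange(populations, n, rng):
    """Every population sends n fitness-chosen rules to its successor
    in a ring, modelling the periodic interaction between parallel CSs."""
    batches = [pick_migrants(pop, n, rng) for pop in populations]
    for i, batch in enumerate(batches):
        populations[(i + 1) % len(populations)].extend(batch)

rng = random.Random(42)
# Three toy populations of five rules each, with random fitness values.
pops = [[(f"rule{i}.{j}", rng.random()) for j in range(5)] for i in range(3)]
exchange(pops, n=2, rng=rng)
print([len(pop) for pop in pops])  # [7, 7, 7]: each population gained 2 rules
```

In a real system the receiving population would then be trimmed back to a
fixed size by its own GA; the sketch only shows the migration step itself.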
In our opinion, it shows a super-linear speedup because co-operating parallel
CSs induce co-operation among rules, whereas in a sequential CS the GA
exploits the competition among rules.

The other parallel CS, PANIC (just presented at EP-95), is based on the Pitt
approach, i.e. it evolves a population of genotypes, where each genotype is a
set of rules. It uses an asynchronous and decentralized parallel GA with
dynamic subpopulations which overlap in time. It does not solve the problem
in fewer generations (we suspect this is due to the small subpopulations it
must use, given the high computational cost of evaluating a genotype), but it
does show a super-linear speedup in time: although it requires more
generations than a synchronous panmictic model to find good solutions, it
takes much less time to evolve its populations.

A description of the two parallel CSs, together with a first comparison, has
been submitted to PPAI-95 (Third International Workshop on Parallel
Processing for Artificial Intelligence). The title of the paper is
"Cooperation and Learning in Parallel Classifier Systems".

Antonella

Antonella Giani (giani@di.unipi.it)
Dip. di Informatica, Universita' di Pisa
Corso Italia, 40
I-56125 Pisa, Italy
tel: +39-50-887229    fax: +39-50-887226

------------------------------

From: darwen@csadfa.cs.adfa.oz.au (Paul Darwen)
Date: Fri, 10 Mar 1995 11:19:29 +1000
Subject: Re: Super-Linear Speedup

In volume 9, issue 11 of the GA digest, John Koza wrote:

> From time to time, I have heard it mentioned that GAs may
> deliver SUPER-LINEAR SPEEDUP in the sense that the
> [...]
> Is anyone aware of any published results in this area?

Heinz Muhlenbein has written several papers about PGA (his parallel GA
system), which achieves superlinear speedup for many search problems. The
following paper is perhaps the best place to start.

AUTHOR:        Muhlenbein, H.; Schomisch, M.; Born, J.
CORP SOURCE:   Gesellschaft fur Math.
               und Datenverarbeitung mbH, Sankt Augustin, Germany
TITLE:         The parallel genetic algorithm as function optimizer
SOURCE:        Parallel Computing, vol. 17, no. 6-7, p. 619-32
ISSN:          0167-8191
PLACE OF PUBL: Netherlands
LANGUAGE:      English
YEAR:          Sept. 1991

ABSTRACT: The parallel genetic algorithm PGA is applied to the optimization
of continuous functions. The PGA uses a mixed strategy. Subpopulations try to
locate good local minima. If a subpopulation does not progress after a number
of generations, hillclimbing is done. Good local minima of a subpopulation
are diffused to neighboring subpopulations. Many simulation results are given
with popular test functions. The PGA is at least as good as other genetic
algorithms on simple problems. A comparison with mathematical optimization
methods is done for very large problems. The PGA is able to find the global
minimum of Rastrigin's function of dimension 400 on a 64-processor system!
Furthermore, the authors give an example of a superlinear speedup. (18 Refs.)  <--------

I haven't read the following paper, but it analyses superlinear speedup for
algorithms like the GA and simulated annealing. Ron Shonkwiler's email is
shenk@math.gatech.edu

AUTHOR:        Shonkwiler, R.; Van Vleck, E.
CORP SOURCE:   Sch. of Math., Georgia Inst. of Technol., Atlanta, GA, USA
TITLE:         Parallel speed-up of Monte Carlo methods for global optimization
SOURCE:        Journal of Complexity, vol. 10, no. 1, p. 64-95
ISSN:          0885-064X
PLACE OF PUBL: USA
LANGUAGE:      English
YEAR:          March 1994

ABSTRACT: We introduce the notion of expected hitting time to a goal as a
measure of the convergence rate of a Monte Carlo optimization method. The
techniques developed apply to simulated annealing, genetic algorithms, and
other stochastic search schemes. The expected hitting time can itself be
calculated from the more fundamental complementary hitting time distribution
(CHTD), which completely characterizes a Monte Carlo method.
The CHTD is asymptotically a geometric series, (1/s)/(1-lambda), characterized
by two parameters, s and lambda, related to the search process in a simple
way. The main utility of the CHTD is in comparing Monte Carlo algorithms. In
particular, we show that independent, identical Monte Carlo algorithms run in
parallel, IIP parallelism, exhibit superlinear speedup. We give conditions
under which this occurs and note that equally likely search is linearly sped
up. We observe that a serial Monte Carlo search can have an infinite expected
hitting time, but the same algorithm, when parallelized, can have a finite
expected hitting time. One consequence of the observed superlinear speedup
is an improved uniprocessor algorithm by the technique of in-code  <---------
parallelism. (20 Refs.)

Paul Darwen                                 darwen@canth.cs.adfa.oz.au
Department of Computer Science              Phone: +61 6 268 8182
University College, University of New South Wales    Fax: +61 6 268 8581
Australian Defence Force Academy
Canberra ACT 2601 AUSTRALIA

------------------------------

From: efalkena@ulb.ac.be (Falkenauer Emanuel)
Date: Fri, 10 Mar 95 05:10:48 +0100
Subject: Re: No Free Lunch

Hello,

It looks like the NFL discussion is getting overheated, so I will add
something Bill Macready and David Wolpert probably do not expect (too bad):
an agreeing voice from somebody who not only likes GAs, but also earns his
living with them.

Yes, I fully agree that the question whether the distribution of
'interesting' functions is uniform or not is IRRELEVANT. Indeed, they are
right: if you don't spend the due time thinking about which side of the
(supposedly) non-uniform distribution you (and your algorithm) are on w.r.t.
the particular problem you are trying to tackle, then you may as well assume
it IS uniform, for your chances really are 50-50.

Yes, once again, I very much welcome their conclusion that the right way to
proceed is FROM the function TO the algorithm, not the other way around, as
is so frequently done.
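As an aside on the Shonkwiler/Van Vleck abstract quoted earlier: their
infinite-versus-finite hitting-time observation can be made concrete with a
toy calculation. This is my own illustration, not the paper's: assume a
single run's hitting time T has the heavy tail P(T > t) = 1/t for t >= 1.
The expected hitting time is the integral of the tail, which diverges for one
run; the minimum over k independent parallel runs has tail t**(-k), which
converges for k >= 2, so the parallel speedup is unbounded (superlinear).

```python
import math

def truncated_expected_hitting_time(k: int, horizon: float) -> float:
    """E[min of k i.i.d. hitting times], truncated at `horizon`, assuming
    each run's tail is P(T > t) = 1/t for t >= 1 (an illustrative choice).
    Uses the tail formula E[T] = integral_0^horizon P(min > t) dt,
    where P(min > t) = t**(-k) for t >= 1 and 1 below."""
    if k == 1:
        return 1.0 + math.log(horizon)                      # grows without bound
    return 1.0 + (1.0 - horizon ** (1 - k)) / (k - 1)       # converges to k/(k-1)

# Serial search: the truncated mean keeps growing -> infinite expectation.
print(truncated_expected_hitting_time(1, 1e3))   # ~7.9
print(truncated_expected_hitting_time(1, 1e6))   # ~14.8

# Two independent parallel runs: the mean converges to 2 -> finite.
print(truncated_expected_hitting_time(2, 1e6))   # ~2.0
```

The same closed form shows k = 3 runs converging to 1.5, and so on: each
extra independent run shortens the expected hitting time, consistent with the
IIP-parallelism result the abstract describes.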
And yes, I do agree that a lot of the converse (i.e. ALGO->FUNC) has been
published concerning GAs in particular, although I suspect that is the case
for most metaheuristics (yes, I do think the convergence proofs for SA are
pretty useless, and there is nothing at all [to my knowledge] for TS). Let's
face it: how many of you have NEVER seen a nice curve going down or up
(minimization or maximization) over the generations, standing as a 'proof'
that 'it works'... without any comparison to OTHER approaches?

Having said that, I have a few comments.

One thing that struck me about the NFL correspondence is the complete
reliance on objective FUNCTIONS in the discussion: not once has anybody
mentioned that there are also PROBLEMS with their INSTANCES, even though
THAT's the common way of classifying functions in NP-completeness theory.
Why? I suspect it is because problems define a certain STRUCTURE of the
objective functions involved (even though from one instance to another,
things can vary quite a lot).

Of course, any NP-complete problem can be transformed to SAT, so that remark
might seem irrelevant. BUT remember that most of NP-completeness theory only
considers GLOBALLY optimal behaviour, which is NOT the aim of GAs (well, it
is nice to show that a GA can find the global optimum most of the time, but I
don't think large NP-complete decision problems will ever be in the realm of
GAs). Once we are willing to admit that GAs are not GUARANTEED global
optimizers, the possible transformations to SAT become largely irrelevant,
and the STRUCTURE of a problem (as opposed to a function, which pertains to
one particular instance of a problem) becomes very much of interest. Indeed,
if arguments can be found supporting the efficiency of an algorithm on a
PROBLEM, then they will apply to most of its instances, of which there is an
infinite number.
I find results of this kind quite useful, and they do NOT contradict the NFL
stuff (of that I'm convinced). I find it rather strange that Bill and David
never hinted at that (which might be understandable after all...), and even
more so that Bill Spears did not in his posting, which I found very
insightful.

Now to the GAs in particular. I got the impression that Bill and David still
have in mind the classic Holland-style, binary-coded GA. In that respect,
their (pretty negative) conclusions concerning GAs are quite correct.
However, GAs are MUCH more than that!!! Indeed, I wouldn't even say GAs are
algorithms: 'the GA' is a PARADIGM, while an algorithm is made of it by a
particular ENCODING and (subsequently) OPERATORS. In other words, it is time
to get used to the idea that NO encoding is good for ALL problems.

But of course, I do not pretend to have found this by myself, for it is
already written in the Bible of all of us, i.e. John Holland's "Adaptation
in...". Had they read it, I do not think Bill and David would have written
things like "don't be so wed to a GA" (their last posting). The point is
this: John Holland's book provides a good deal of robustness results,
yielding statements like "...whatever the complexity of the function..." (in
italics on p. 129) and the like. While results like these probably led a lot
of people to believe that 'THE GA' will work optimally (w.r.t. trial
allocation) on ALL functions, they must have overlooked a very key point:
most of these results (including the Schema Theorem) are based on the
assumption that the ENCODING suits the problem!!!

In short, John Holland worked out a beautiful theory that supports the
validity of his (i.e. the classic) GA IN THE PARTICULAR CASE where the binary
encoding is relevant. But do not blame him for the crowd of overzealous
followers: he did, in the same book, warn against such overconfidence.
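A tiny, standard illustration of the encoding point (my example, not
Falkenauer's): under plain binary encoding, a one-step move in problem space
can require flipping many bits at once (a "Hamming cliff"), whereas a
reflected Gray code keeps neighbouring integers exactly one bit apart, so
single-bit mutation behaves very differently under the two encodings of the
same problem.

```python
def gray(n: int) -> int:
    """Reflected binary Gray code of n."""
    return n ^ (n >> 1)

def hamming(a: int, b: int) -> int:
    """Number of differing bits between a and b."""
    return bin(a ^ b).count("1")

# Bits flipped when moving between consecutive integers 0..15:
binary_steps = [hamming(i, i + 1) for i in range(15)]
gray_steps = [hamming(gray(i), gray(i + 1)) for i in range(15)]

print(max(binary_steps))  # 4 -- the 7 -> 8 cliff: 0111 vs 1000
print(set(gray_steps))    # {1} -- every neighbour is one mutation away
```

Neither encoding is "the right one" in general, which is precisely the
point: whether single-bit mutation can reach a nearby better solution depends
on how the encoding maps problem structure onto bit strings.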
Indeed, on page 117, Bill and David could have read:

Holland> ...Moreover they do NOT compensate the major shortcoming of genetic
Holland> plans [that use just crossover, mutation and inversion]. That short-
Holland> coming is the complete dependence of such plans upon the detectors
Holland> determining the representation. If the set of detectors is inadequate,
Holland> in any way, the plan must operate within that constraint.

Gosh, doesn't that look PRECISELY the same as their

B.M & D.W.> If the correlations you are "exploiting" are not
B.M & D.W.> present in f then at best (!) you can hope to perform as well as a
B.M & D.W.> random algorithm.

(?!) What bothers me a lot is the very next sentence of theirs:

B.M & D.W.> And since you don't know precisely what correlations
B.M & D.W.> are exploited by a GA you run the risk they are not present in f.

Who told you that? If I wanted to be very harsh, I would say "Don't mix MY
GAs with YOURS, because they are obviously NOT the same!"

So, what's the conclusion? I think Bill and David should get up to date on
GAs: they (the GAs) are not as naive as they might look to an outside
observer and, yes, there ARE people who consider FIRST the problem and THEN
try to design encodings and operators that exploit its structure (although we
will probably still see for a long time statements like "we have torn and
bent the structure of the problem so that the standard binary operators of
Holland could be applied... [even when it doesn't make any sense]"). I
sincerely think Bill and David would then be less harsh.

On the other hand, the GAers should realize that Bill and David's work is
nothing more (but also nothing less!) than a formal proof of what was clear
from the outset: THE ENCODING AND OPERATORS MUST SUIT THE STRUCTURE OF THE
PROBLEM being tackled.

Well, what if there is no obvious structure to exploit?
Then we should be honest enough to admit that a GA is probably not the way to
go, and should maybe leave the problem to an SA or a TS, or whatever.

That, for my part, defines the next 'holy grail' in GAs: let's try to develop
tools that will indicate whether a particular problem has any structure to
exploit or whether it does not. In the latter case, such a problem should be
classified GA-hard. But that might turn out to be a bit hard. I would
therefore at least welcome a tool that would indicate whether GIVEN
encoding/operators are suitable for a given problem. Needless to say, I am
not yet aware of any tools of either kind. In that respect, I join Bill and
David: we still have a long way to go.

Cheers,
Emanuel

------------------------------

From: andy.keane@eng.ox.ac.uk (Andy Keane)
Date: Fri, 10 Mar 95 09:31:36 GMT
Subject: AT LAST! (Re: Macready's NFL comments)

It is good to see someone else pushing for GA workers, etc. to compare their
methods to other optimizers. This has been something of a personal crusade of
mine for some years now. Unless you try a variety of methods you can have
little confidence that you are using an appropriate one. Over the years I
have built up a software package called OPTIONS that allows many methods to
be readily applied and compared on any problem of interest. Please see my
entry on the WWW at http://www.eng.ox.ac.uk/~oedajk/ for papers that describe
some of this work and also the OPTIONS package.

It is, however, worth noting that, given no knowledge of how a function
behaves, that it has many dimensions, and that you want a global optimum, a
GA is a good place to start, often followed by other methods for fine tuning,
etc. However, it is also not uncommon to find that classical slope-driven
methods and modern versions such as DHC can be well worth trying. In my
experience, the real distinction comes when the number of variables is very
large and the function evaluations very expensive.
Say you have >50 variables and can afford only 5,000 evaluations. What do you
do then? Answer: use the GA for, say, 2,000 evaluations to get a feel for
things. Use another 2,000, split into four lots of 500, on each of four other
methods, starting from some of the best places thrown up by the GA search.
Finally, use the last 1,000 on whichever method has performed best, starting
from the best solution so far. At the end of this you may have some
confidence in what is going on. Equally, you cannot guarantee that some
untried method won't come along and produce massively better results in a
few trials.

Andy Keane
Oxford
10/3/95

------------------------------

From: fogel@ece.ucsd.edu (Fogel)
Date: Fri, 10 Mar 95 13:22:26 PST
Subject: EP reply (Re: v9n12)

A recent posting in ga-list asserted that the evolutionary programming method
of Fogel et al. (1966) was not capable of solving any but the simplest
problems. This statement is a repetition of similar comments that have
appeared in some of the GA literature, but it is not correct. Indeed, in
Fogel (1993), I used this exact procedure to evolve strategies for playing
the iterated prisoner's dilemma in the form of finite state machines. If
anyone would like a copy of this paper and does not have access to the
journal, please send me some email and I'll forward you a hardcopy.

I'd be pleased to discuss the potential of evolutionary programming methods
for general problem solving with anyone who is interested in following up.
All of the methods of evolutionary computation (genetic algorithms, evolution
strategies, evolutionary programming) have the potential to enhance our
ability to address difficult optimization problems.

Regards,
David

------------------------------

End of Genetic Algorithms Digest
******************************