Genetic Algorithms Digest Friday, June 27, 1986 Volume 1 : Issue 4 - Send submissions to GA-List@NRL-AIC.ARPA - Send administrative requests to GA-List-Request@NRL-AIC.ARPA Today's Topics: - GA-List status - alternative bibliography formats - Re: alternative bibliography formats - some genetic learning questions - GA Clearinghouse - Machine Learning Group - Holland correspondence ------------------------------ Date: Fri June 27 1986 From: GA-List Moderator To: GA-List Members Subject: GA-List status Ok, as promised, here's the first of smaller, higher frequency digests. Remember to send material to GA-List. Material sent to gadistr gets buried in a mountain of electronic junk mail and may be lost for months. Ken ------------------------------ Date: Thu, 12 Dec 85 15:24 EST From: JUDD%umass-cs.csnet@CSNET-RELAY.ARPA To: gadistr%nrl-aic.arpa@csnet-relay.arpa Subject: RE: Volume 1, Number 2 I would like to see some discussion as to the format of this bibliographic database. I tend to favor a BIBTEK format which allows for the inclusion of keywords, comments, summaries, etc. For example, title= author= keywords= note= abstract= . . . The BIBTEX standard also includes the type of publication being refered to (eg PhD thesis, journal, book, tech.rep, etc.). This is useful for a variety of reasons: 1) a retrieval program could answer queries like "give me all books that..." 2) a bibliographical typesetter can make use of the distinctions. 3) further distinctiveness or more redundancy in specification. 4) checking for completeness in the entry (eg. a journal needs a volume#) 5) bigger and more is better and bigger. etc and so on etc. sj ------------------------------ Date: Fri, 27 Jun 86 10:15:02 edt From: Lashon Booker To: GA-List Subject: Bibliography format I volunteered to put together a bibliography to be distributed over this list, and suggested using the REFER format. Perhaps that format is not the best choice: [ see previous message ] I suggested REFER because it is available on our VAX, period. If there is a strong feeling that BIBTEX is better, and BIBTEX is widely available, *and* we can get a version that runs under 4bsd Unix, I don't mind using it. Comments? Lashon ------------------------------ To: GA-List@nrl-aic.arpa Subject: Some genetic learning questions. Date: Mon, 16 Jun 86 15:47:22 +0000 From: Julian Onions Sender: jpo%computer-science.nottingham.ac.uk@Cs.Ucl.AC.UK OK, here's some reasonably dumb questions I've been pondering on, anyone care to give some answers? 1) If I have a system that depends on (say) 5 parameters that I vary using genetic algorithms until I have (after many trials) got a good gene pool, what are the best tactics if I then decide that 6 parameters might more closely fit the problem. Should I start from a random pool, or could I add random 6th DNA strands to my current gene pool. e.g. I have some pseudo-random thing that I think can be modelled by some relation X = a1*y+a2*y+a3*y+a4*y+a5*y now I decide perhaps a better fit is X = a1*y+a2*y+a3*y+a4*y+a5*y+a6*y If I start from scratch, I throw away the 1000's of trials done for the first case. Now if I modify that gene pool to include a random additional a6, does this a) save me time. b) have no impact, I might as well have started again. c) take much longer as the gene pool has to be effectively unfocus on the old situation and then focus on the new model. I suspect this depends on how closely the 6th parameter interacts with the others, but I can't decide. [ My experience suggests c) is most likely the case with traditional GA implementations in which loss of variability is hard to undo. - moderator ] 2) Is there any good rule of thumb for the size of the gene pool? I guess its obvious that the bigger the pool, the more trials needed to get rid of poor genes. Also a small gene pool is not going to provide much variety for crossing. Would anyone care to take a guess at a ratio of number of trials to gene pool size? E.g. I'm going to do 3000 trials, what size should my gene pool be, or conversely a gene pool of 35, how many trials are likely to be needed before I get a reasonably good/perfect gene pool. I guess this depends also depends on the the values of the DNA but these values could be normalised. [ DeJong's thesis and Grefenstette's paper in IEE SMC Jan86 contain experimental results and discussions of these issues. Briefly, there seems to be a lower bound on the population size around 35-50. Below that performance degrades due to too little memory (genetic drift). On the other hand, some of Holland's theorems suggest that the information capacity (hyperplane information) of a population increases exponentially with population size. Experimentally this shows up in that little advantage is seen in increasing population size beyond 100-200. The number of trials is related more to the complexity of the space being searched and the length of an individual (number of genes). Generally speaking, with population sizes of 50-100, 2000-3000 trials are a minimum requirement. - moderator ] 3) If there are two solutions to an answer, can this technique discover either or both answers? My thoughts on the possibilities here are a) The gene pool will sway around randomly until it approaches one of the solutions, and then latch onto that. b) The gene pool will be tugged in alternate directions going into a sort of flip-flop state going from one solution to the other. c) The gene pool will sway from side to side never getting to either of the solutions. d) the gene pool will be divided on the issue and there will the 50% of the genes on each answer. What does the panel think? [ Any of the above, none of the above, ... It depends, unfortunately, on the representation chosen. Hence all the discussion on representation issues. - moderator ] Julian. ------------------------------ Date: Fri, 27 Jun 86 09:11:28 edt From: Lashon Booker To: GA-List Subject: Clearinghouse Status: R I'm passing along this letter Dave Goldberg sent o all GA conference attendees, just in case some GA-List readers haven't seen it. Enclosed with the letter were copies of an extensive bibliography and a recent technical report. Lashon ----------------------------------------------------------- Dear Colleague: As an attendee at last year's conference on genetic algorithms, I thought you might be interested in the two enclosed publications. One is a bibliography of genetic algorithm literature, and the other is some musings on schemata in binary-coded populations. Because genetic algorithm publishing is such a hit and miss affair, I have established The Clearinghouse for Genetic Algorithms (TCGA - double entendre intended) here at Alabama as a publisher of last resort. I also have most of the publications cited in the bibliography in our TCGA files and can help you get copies if you are unable to obtain them through normal channels. If there is interest in the Clearinghouse, we can formalize both the publishing and information dissemination functions of TCGA. I want to emphasize that these efforts are intended to complement, not compete with, the electronic information exchange established at NRL by Lashon Booker and Ken DeJong. As we all know, sometimes there is no substitute for the printed word, and TCGA will focus its efforts there. Please let me know what you think about these publications and TCGA. Sincerely yours, David E. Goldberg Assistant Professor Dept. of Engineering Mechanics College of Engineering The University of Alabama University, Alabama 35486 205/348-7241 ------------------------------ Date: Fri, 27 Jun 86 10:18:40 edt From: Kenneth Dejong To: ga-list Subject: Machine Learning Group Status: R NCARAI, the Navy AI Center, is gearing up for a substantial effort in machine learning. In addition to Lashon and myself, we are delighted that John Grefenstette has accepted a position starting this fall. Several graduate students from local universities are also involved. More details will follow on the direction and scope of our activities. Ken ------------------------------ Date: Fri, 27 Jun 86 09:27:32 edt From: Lashon Booker To: GA-List Subject: News from JHH Status: R This is an edited version of two mail messages I recently received from John Holland. There are some news items here of interest to GA-List. Lashon ------------------------------------------------------------------------ Lashon, Good to hear from you! I've been out of the country for a month (in Norway at the invitation of the Norwegian Research Council) so I'm behind but I did get a glimpse at the most recent GA-List and intend to contribute. As far as I know there is no consistent compilation of errors in ANAS, though my copy is filled with sheets of paper noting down this and that -- the most serious thing I know of is the error in the proof of exponential allocation for the two-armed bandit problem (Dan Frantz found it) which in- volves the use of one limit that is a function of another. We both believe the theorem is still correct but no one has taken the time to fix it. Did I tell you that I've been elected Ulam Scholar at Los Alamos for a year (from an endowment set up for this purpose)? It gives me a year to pursue basic research in whatever direction I please, and I'll be in residence there for about 3/4 of the time. Also, did you see the June issue of SCIENCE 86 -- it has a pretty extensive exposition (at a popular level) of classifier systems and their relation to some of what I'll be doing at Los Alamos. Our book is due out from MIT press sometime next month. All in all we're reasonably satisfied with the final version (which, as you can see, went to press in a great hurry). I'm really glad to hear John is going to be with you guys; and with another addition you should be THE hotbed of work in this area. It seems that several parts of the defense establishment have picked up on some of this stuff -- I'm currently consulting for Honeywell on an Air Force 6.1 RFP that mentions this work by name. Since I don't much like to do that stuff I put an impossibly high consulting rate, but they accepted anyhow! Haven't heard anything from Dave Davis! Stewart Wilson gave a presen- tation at the "pushing out the door" ceremony for the Connection Machine (with Wolfram going out the door muttering "maybe there is some science that can be done there [machine learning] after all"). Since Davis was there maybe Wilson has more recent info. I seem to remember something from you a while back about emerging hierarchies and system coherence. If it still relevant to you, I think this is the most important single issue to be understood both analytically and with appropriate simulation. I have some new analytic results (some of them obtained in Norway), but nothing particularly new w.r.t. simula- tion (it's almost impossible to find large enough hunks of time to run it productively during the academic year). Stewart Wilson has some really nice stuff about the efficacy of crossover on boolean problems. I hear through Rick Riolo who just came back from a couple of days at Thinking Machines that they now have two of their own people (with Rick as consultant) working on implementation of classifier systems! I also hear from him that at BBN they were doing some connectionist experiments with settings given to them from CMU and then decided to use the genetic algor- ithm to explore the parameter space with the result that they came up with MUCH better settings than those recommended! I'll try to get back to you later with more. If you want to forward this to the GA-List, it's okay with me. Let me hear more about what you guys are up to. John ------------------------------ End of Genetic Algorithms Digest ********************