Abstract: This project was motivated by the poor results of the previous project.
In that project, sentences came out jumbled and incoherent because the grammar
itself was generated by the genetic algorithm and, as a result, was disrupted by
crossover and mutation. In this project, the goal was to evolve good sentences by
evolving categories of words and plugging words from those categories into a
sentence template. For example, when evolving a subject, the program can take a
family noun such as "sister" or "baby" and a violent verb such as "kick" or
"punch" and plug them into a template to get "The baby liked to punch." Each
sentence is then given a fitness assigned by a user. Overall, this new method
produced much better sentences than the previous project, which supports my
hypothesis. This is largely because the sentence templates provide a much better
grammatical structure than the previous project could evolve. Another reason is
that the algorithm kept producing sentences similar to the user's tastes, so the
user would see many sentences based on the ones they had chosen before. For the
future, it is recommended to allow individuals to carry more than just one
subject and one verb. It would also be nice to see this implemented on Twitter,
as the last project was.
Methods/Algorithms: The genetic algorithm used a population of 5 individuals,
each containing a subject-category and a verb-category. Each individual is
assigned a subject from the list for its subject-category (iguana from ANIMALS)
and a verb from the list for its verb-category (paints from CREATIVE). These
words are then plugged into a shared template (The SUBJECT was a master at VERB.)
to get an actual sentence (The iguana was a master at paint.). The resulting
sentences are ranked by a user, and each individual is assigned a fitness based
on that ranking. For the next generation, the best individual is appended to the
new population as elitism. Then two individuals are chosen by a tournament
selection function and crossed over. Crossover swaps either the subject-category
or the verb-category between the two individuals. Once an individual receives a
new category from the other individual, it is assigned a random word from that
category. An example of crossover can be seen in Figure 1: the first individual
is assigned the second individual's category of KITCHEN APPLIANCE, so it picks
fridge.
Figure 1: Crossover. Green text marks the subjects; yellow text marks the verbs.
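The representation, template filling, and crossover described above can be sketched in Python. This is only an illustration under stated assumptions, not the project's actual code: the names `Individual`, `fill_template`, and `crossover` are hypothetical, the category name FAMILY is invented for the "family nouns" mentioned in the abstract, and the word lists contain only words that appear in this report (the real lists are presumably longer). The sketch also does no word inflection, so it prints the verb exactly as stored.

```python
import random
from dataclasses import dataclass, replace

# Illustrative word lists (assumption): only words mentioned in the report.
SUBJECTS = {
    "ANIMALS": ["iguana"],
    "FAMILY": ["sister", "baby"],  # hypothetical category name
    "KITCHEN APPLIANCE": ["fridge"],
}
VERBS = {
    "CREATIVE": ["paints"],
    "VIOLENCE": ["kick", "punch"],
}

TEMPLATE = "The SUBJECT was a master at VERB."


@dataclass
class Individual:
    subject_category: str
    verb_category: str
    subject_word: str
    verb_word: str


def fill_template(ind, template=TEMPLATE):
    """Plug the individual's two words into the shared sentence template."""
    return template.replace("SUBJECT", ind.subject_word).replace("VERB", ind.verb_word)


def crossover(a, b, rng=random):
    """Swap the subject categories or the verb categories (50/50 chance).

    Each child then draws a random word from the category it received;
    the untouched category and its word are kept as they were.
    """
    if rng.random() < 0.5:
        child_a = replace(a, subject_category=b.subject_category,
                          subject_word=rng.choice(SUBJECTS[b.subject_category]))
        child_b = replace(b, subject_category=a.subject_category,
                          subject_word=rng.choice(SUBJECTS[a.subject_category]))
    else:
        child_a = replace(a, verb_category=b.verb_category,
                          verb_word=rng.choice(VERBS[b.verb_category]))
        child_b = replace(b, verb_category=a.verb_category,
                          verb_word=rng.choice(VERBS[a.verb_category]))
    return child_a, child_b


first = Individual("ANIMALS", "CREATIVE", "iguana", "paints")
second = Individual("KITCHEN APPLIANCE", "VIOLENCE", "fridge", "punch")
child_one, child_two = crossover(first, second)
print(fill_template(child_one))
print(fill_template(child_two))
```

If the subject categories happen to be swapped, the first child receives KITCHEN APPLIANCE and can only pick fridge, which matches the situation shown in Figure 1.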
Algorithm: Generational
Population size: 5
Selection method: Tournament selection of 2
Elitism (if used): Elitism is implemented by appending the highest-ranked
individual to the next generation.
Population initialization: A population of 5 individuals is initialized. Each
individual picks a subject category (ex. POLITICS) and a verb category (ex.
BODILYFUNCTIONS). Then its subject_word is assigned a random noun from the list
of that category (ex. Obama in POLITICS) and its verb_word is assigned a random
verb from the list of that category (ex. fart from BODILYFUNCTIONS). Then each
individual is given the same sentence template, which it fills with its two
words. A user then reads each individual's sentence and ranks all individuals.
Crossover method: 50% chance to swap subject categories, 50% chance to swap verb
categories.
Crossover rate: 30%
Mutation method: 50% chance that a random subject category is assigned to the
individual, 50% chance that a random verb category is assigned to it.
Mutation rate: 40%
Fitness function: Based on the ranking the user assigns to each individual's
sentence.

Verb: FOOD, VIOLENCE, FUN, LAZY, SOCIAL, CREATIVE, ENGINEER
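The parameters above can be combined into one generation step, sketched below in Python. Several details are assumptions, because the report does not spell them out: individuals are reduced to (subject category, verb category) pairs with the words left out, a higher fitness value is taken to be better, "tournament selection of 2" is read as a tournament of size 2, the crossover rate is applied per selected pair and the mutation rate per child, and the population is kept at 5. All function names and the subject category list are hypothetical.

```python
import random

# Subject categories named in the report's examples (the full list is not given).
SUBJECT_CATEGORIES = ["ANIMALS", "POLITICS", "KITCHEN APPLIANCE"]
# Verb categories as listed in the table above.
VERB_CATEGORIES = ["FOOD", "VIOLENCE", "FUN", "LAZY", "SOCIAL", "CREATIVE", "ENGINEER"]

POP_SIZE = 5
CROSSOVER_RATE = 0.30
MUTATION_RATE = 0.40


def tournament(population, fitness, rng):
    """Size-2 tournament (assumption): sample two individuals, keep the fitter."""
    i, j = rng.sample(range(len(population)), 2)
    return population[i] if fitness[i] >= fitness[j] else population[j]


def crossover(a, b, rng):
    """Swap subject categories or verb categories, each with probability 0.5."""
    (subj_a, verb_a), (subj_b, verb_b) = a, b
    if rng.random() < 0.5:
        return (subj_b, verb_a), (subj_a, verb_b)
    return (subj_a, verb_b), (subj_b, verb_a)


def mutate(ind, rng):
    """Assign a random subject category or a random verb category (50/50)."""
    subject, verb = ind
    if rng.random() < 0.5:
        return (rng.choice(SUBJECT_CATEGORIES), verb)
    return (subject, rng.choice(VERB_CATEGORIES))


def next_generation(population, fitness, rng=random):
    """Build the next population from (subject_category, verb_category) pairs."""
    best = max(range(len(population)), key=lambda i: fitness[i])
    new_population = [population[best]]  # elitism: keep the top-ranked individual
    while len(new_population) < POP_SIZE:
        a = tournament(population, fitness, rng)
        b = tournament(population, fitness, rng)
        if rng.random() < CROSSOVER_RATE:
            a, b = crossover(a, b, rng)
        for child in (a, b):
            if rng.random() < MUTATION_RATE:
                child = mutate(child, rng)
            if len(new_population) < POP_SIZE:
                new_population.append(child)
    return new_population
```

In the real program each new individual would also draw a fresh word from any category it received, and the user would rank the resulting sentences before the next call.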
Results:
[Figure 2 image: a list of ranked sentences, each beginning "If you like"; the rankings shown are 2, 3, 1, 4, 5.]
Figure 2: An example of an initial population evolving. The first list consists
of the first 5 individuals. The green text marks the subjects and the yellow text
marks the verbs. The numbers to the left of the sentences are their rankings.
[Plot image: one line per individual, x-axis labeled Generation, y-axis ticks from 0 to 15.]
Figure 2: Fitness of all 5 individuals over time. Notice that Individual 1 is the
best individual and Individual 5 is the worst. The best evolved sentence was
"Being a Hillary Clinton means you must fart."
Conclusion:
The hypothesis was supported. Evolving categories of words resulted in more
readable and interesting sentences than before. The evolutionary algorithm stuck
to the user's preferences and tended to create similar sentences. As seen in
Figure 1, POLITICS and CREATIVE were the categories for the best individual in
the initial population, and these categories appear again in the next generation
("President Obama" in the first sentence, "read" in the second sentence,
"congress" in the fourth sentence). This made the results more fun for the user,
since they leaned toward the user's tastes. For example, I have a childish sense
of humor, so most of my sentences involved POLITICS and BODILYFUNCTIONS as the
respective subject and verb categories. Since users tended to pick more of the
same subjects and verbs, there tended to be one clear best individual. In Figure
2, Individual 1 is the clear winner. However, diversity is still visible in how
the individuals vary in fitness, for example in Generation 6, where the fitness
rankings of many individuals change.
While the results were positive, there is room for improvement in the future.
For example, individuals should be able to carry more than just one subject and
one verb. It would be nice to evolve varying numbers of words, such as an
individual with 2 subjects and 1 verb or 5 subjects and 6 verbs. This would make
the results substantially more unpredictable, but would allow more complex
category combinations. I would also like to see this implemented on Twitter. This
was not done in this project because each individual had to be ranked from 1 to
5, and the Twitter program from the last project did not support that. In the
future, implementing polling on a Twitter bot account would allow online users to
collectively vote on sentences and allow for more interesting category
combinations. Lastly, I would like to see more subject categories, verb
categories, and sentence templates. For this project, I had to think of them all
myself, so there isn't a large range of categories to pick from. Increasing the
number would allow for more complex sentences. This could be done by finding a
database of these categories or by writing a program to extract them from a text.
Overall, this was a very fun project to create, and the success of the results
shows that my hypothesis was correct.