Folksonomy: giving definition to the world

Daniel J. Pool

Issues in Folksonomy - Pool 2

Introduction
In recent years the practice of tagging content in an information system has grown from
a private hobby of specialists into an everyday occurrence. Ten years ago the pound symbol (#)
was either the notation for a sharp in sheet music or the key pressed after a selection in a
call center's telephone menu. Today it has taken on the meaning of "hash tag" and is used to
precede a user-indexed term. The technological landscape of information science is changing
drastically as new technology allows more people than ever to commit time and energy to
practices once reserved for curators of information systems such as museums and libraries. As
this happened, a new term for the classification of non-structured indexing was formed.
However, is this new technique a passing whimsy or a permanent function of the emerging
information society? To better understand the folksonomy phenomenon, one should understand
what a folksonomy is, its history, its structure, and the current issues facing the world
community at large.
Current Research on Folksonomy
Folksonomy is a method of annotating digital content with open-ended, socially generated
metadata (Noruzi 2006). It has no fixed categories or hierarchies and relies on end users to create
the organization. Whereas a taxonomy is primarily created to be an invisible structure behind
information systems (Bates 1999), a folksonomy is created to use the natural language of the
user to index or tag content online for future searching (Noruzi 2006).
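The flat, user-driven structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the (user, resource, tag) triple layout, the function names, and all sample values are hypothetical and are not taken from any of the cited systems.

```python
from collections import defaultdict

# Hypothetical tagging events: (user, resource, tag). All values are made up
# for illustration; none come from a real service.
taggings = [
    ("ana", "photo-17", "Tiger"),
    ("ben", "photo-17", "tiger"),
    ("ben", "photo-17", "zoo"),
    ("cy", "article-4", "coffee"),
]

def build_index(events):
    """Map each lower-cased tag to the set of resources it was applied to."""
    index = defaultdict(set)
    for _user, resource, tag in events:
        index[tag.strip().lower()].add(resource)
    return dict(index)

def search(index, tag):
    """Return the resources carrying a tag, sorted for stable output."""
    return sorted(index.get(tag.strip().lower(), set()))

index = build_index(taggings)
print(search(index, "tiger"))
```

Nothing in the index relates one tag to another. Each tag is simply a key, which is what makes the structure flat.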
The term was created by combining "taxonomy," for structural organization, with
"folk," representing the social aspect of the system (Dye 2006). It was created at the IA Institute
(formerly the Asilomar Institute for Information Architecture) when researchers Gene Smith,
Eric Scheid, and Thomas Vander Wal were discussing the socially constructed bookmarking sites
that were starting to gain momentum online, such as del.icio.us and Flickr (Vander Wal 2007).
Eric Scheid called these annotated lists of metadata "folk classification" and Vander Wal
restated it using the term "folksonomy."
At the time, the term described the "user-created bottom-up categorical structure
development with an emergent thesaurus" behind the tags created on these up-and-coming
popular websites (Vander Wal 2007, 1). This non-structured approach became popular
because the web crawlers that traditionally indexed information online could
not keep up with the amount of content being created (Dye 2006).
However, this method has become controversial because it is a flat set of terms that tries to
derive "aboutness" from the collective wisdom of a community (Dye 2006). This community is
not always the best judge of how an item should be classified and, moreover, is prone to error in
those classifications. Often the classifications become inefficient and chaotic, adding no
significant value to the content they describe.
The chaos created as a byproduct of the folksonomic structure comes from the social
nature of its collaborative creation (Bouillet, Feblowitz, Feng, Liu, Ranganatha, and Riabov
2008). In stark contrast to ontologies, which are controlled by a close-knit group of individuals
who can create systematic, if inflexible, organizational hierarchies, folksonomies allow for
organizational anarchy. Taxonomies rely on parent-child (subconcept-superconcept) relationships
that create a controlled vocabulary (Kiu and Tsui 2011).
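For contrast, the parent-child relationships of a taxonomy can be sketched as a mapping from each term to its superconcept. The terms and helper names below are hypothetical and serve only to illustrate the idea of a controlled vocabulary. They are not drawn from Kiu and Tsui (2011).

```python
# Hypothetical taxonomy: each term maps to its single parent (None marks the root).
parent_of = {
    "animals": None,
    "mammals": "animals",
    "cats": "mammals",
    "tigers": "cats",
}

def ancestors(term, parents):
    """Walk from a term up to the root, returning its superconcepts in order."""
    chain = []
    current = parents[term]  # raises KeyError if the term is not in the vocabulary
    while current is not None:
        chain.append(current)
        current = parents[current]
    return chain

def is_allowed(term, parents):
    """A controlled vocabulary accepts only terms already in the hierarchy."""
    return term in parents

print(ancestors("tigers", parent_of))
print(is_allowed("kitty", parent_of))
```

A term such as "kitty" is rejected here, whereas a pure folksonomy would accept it without question.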
This is not unlike the history of early museums and libraries. At one point private
individuals would haphazardly gather oddities into collections (Hedstrom and King 2006). Over
time these collections required more and more specialized knowledge to handle and maintain.
This gave way to the introduction of information science to make the collections accessible
not just to the private owners but also to the general public. On the internet, private
individuals collect data in a haphazard way that over time requires more and more powerful
systems to oversee proper storage and usability. This leads a collection from its start as a
folksonomy toward a taxonomy as the need for more classification grows.
Types of Folksonomy
The user in a pure folksonomy holds all the power of the system with no oversight
(Bouillet, Feblowitz, Feng, Liu, Ranganatha, and Riabov 2008). By socially tagging content with
keywords, users allow for automatic discovery by their peer group using similar terms.
Users create uncontrolled terms, with no relationships between them, out of natural language
(Kiu and Tsui 2011).
There are five commonly described folksonomic structures in use for online content
management (Kiu and Tsui 2011). The first is the pure folksonomy, or no-structure organization.
In this format anyone can add metadata to an object without any oversight. The second is called
co-existent, where a taxonomy and a folksonomy exist side by side but neither
communicates or interacts with the other in any way.
The third is called the folksonomy-directed taxonomy, in which items have a place within
a classical taxonomy but users can also annotate the articles with folksonomy-based tags (Kiu
and Tsui 2011). In this system the two are still mostly separate; however, users get the
benefit of easily finding items while the basic structure of a taxonomy is retained. The
drawback is that a formal vetting process must be in place to process the terms used to
describe objects.
The fourth system is the taxonomy-directed folksonomy, in which choices or suggestions
for tags are given to the user from a pre-constructed taxonomy (Kiu and Tsui 2011). This
version still requires a great deal of oversight from administrators to keep systems orderly. The
consistency of tagging is much greater, but it still suffers from some of the issues of controlled
taxonomies.
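As a rough illustration of a taxonomy-directed folksonomy, the sketch below offers tag suggestions from a pre-built vocabulary as the user types. Prefix matching is an assumption made here for simplicity, and the function name and vocabulary terms are hypothetical. The source does not specify how suggestions are generated.

```python
def suggest_tags(typed, vocabulary, limit=5):
    """Return up to `limit` vocabulary terms that start with what the user typed."""
    prefix = typed.strip().lower()
    if not prefix:
        return []
    matches = sorted(term for term in vocabulary if term.lower().startswith(prefix))
    return matches[:limit]

# Hypothetical controlled vocabulary maintained by administrators.
vocabulary = {"printer", "printing", "print shop", "coffee", "copier"}

print(suggest_tags("pri", vocabulary))
print(suggest_tags("xyz", vocabulary))
```

The user still chooses the tag, but only terms the administrators have approved are ever offered, which is where the added consistency comes from.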
Lastly, folksonomy hierarchies/ontologies create taxonomies based on the terms
tagged by users (Kiu and Tsui 2011). These are created by clustering algorithms
that produce hierarchies based on the frequency, usage, and metadata within the system. This
system consumes extensive resources and can be manipulated by purposely inputting incorrect
keywords.
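One simple way to derive a hierarchy from tag frequency is a co-occurrence subsumption heuristic: tag A is treated as broader than tag B when A appears on nearly every item that carries B, but not the reverse. The sketch below uses that heuristic as a stand-in. It is not the clustering method of Kiu and Tsui (2011), and the 0.8 threshold, the function name, and the sample data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import permutations

def subsumption_pairs(tagged_items, threshold=0.8):
    """Return sorted (broader, narrower) tag pairs.

    `broader` must appear on at least `threshold` of the items that carry
    `narrower`, while `narrower` must not appear on every item with `broader`.
    """
    tag_count = Counter()
    pair_count = Counter()
    for tags in tagged_items.values():
        unique = set(tags)
        tag_count.update(unique)
        # Count every ordered pair of distinct tags that share an item.
        pair_count.update(permutations(sorted(unique), 2))
    pairs = []
    for (broad, narrow), both in pair_count.items():
        p_broad_given_narrow = both / tag_count[narrow]
        p_narrow_given_broad = both / tag_count[broad]
        if p_broad_given_narrow >= threshold and p_narrow_given_broad < 1.0:
            pairs.append((broad, narrow))
    return sorted(pairs)

# Hypothetical tagged items (item id -> tags applied by users).
items = {
    "img1": ["animal", "tiger"],
    "img2": ["animal", "tiger"],
    "img3": ["animal", "dog"],
    "img4": ["animal"],
}

print(subsumption_pairs(items))
```

Because the result depends only on counts, a handful of deliberately wrong tags on a rarely tagged item is enough to change the derived hierarchy, which is the manipulation risk noted above.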
Assessment
Though folksonomy structures seem like a revolution to some (Reamy 2009), they have
limitations that make them less than desirable for completely replacing the taxonomy structures
that came before (Mathes 2004). The ambiguity of user-generated keywords means that the
same object can be annotated in several different ways even by the same user, much less by
several thousand or even millions of users.
This is because of the chaotic decisions that a crowd can make (Reamy 2009). Bandwagon
effects, time of day, and region can all affect the annotation a user gives to an object. With
enough tagging an object can be correctly defined, but this can take time.
Also, there is a lack of control over the synonyms that are generated (Mathes 2004). This is
because there is no central authority, as it is a bottom-up architecture (Reamy 2009). All of the
control is then left in the hands of the user.
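To make the synonym problem concrete, the sketch below shows what a central authority would add: a mapping from variant tags to one preferred term. The mapping, its entries, and the function names are hypothetical. A pure folksonomy has no such table, which is why variants accumulate.

```python
# Hypothetical synonym table: variant tag -> preferred term.
SYNONYMS = {"kitty": "cat", "cats": "cat", "nyc": "new york city"}

def canonical(tag, synonyms=SYNONYMS):
    """Lower-case a tag, collapse its whitespace, and map it to a preferred term."""
    key = " ".join(tag.lower().split())
    return synonyms.get(key, key)

def merge_tags(tags, synonyms=SYNONYMS):
    """Collapse a list of raw tags into a sorted list of distinct preferred terms."""
    return sorted({canonical(tag, synonyms) for tag in tags})

print(merge_tags(["Cats", "kitty", "cat", "NYC"]))
```

Without the table, the same four tags would remain four separate index terms.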
Because of this, there is a trend toward social memory (Hedstrom and King 2006). This means
that what is considered important or worth remembering, indexing, defining, or archiving is in
the hands of the user. As generations grow older, those social groups' views of importance
change, and so do their terms. The problem is that popularity alone does not create organization
(Reamy 2009).
The great issue burning on the minds of information professionals, however, is whether or
not the metadata is actually helpful (Fichter 2006). The number of tags added to a digital item
does not matter if they do not add value for the end user. The research reviewed here
consistently found tagging to be useful, as in Kiu and Tsui's research (2011), the models of
Bouillet, Feblowitz, Feng, Liu, Ranganatha, and Riabov (2008), and even Reamy's studies (2009).
Researchers agree that the systems add value for the end user.
Researchers do disagree, however, on how far taxonomies and folksonomies should be
integrated. Some believe the two should never cross (Reamy 2009), while others see integration
as the only future for later systems (Kiu and Tsui 2011).
Libraries across the United States have already begun incorporating folksonomy structures
into their existing systems (Baker 2012). The University of Pennsylvania's PennTags project let
users annotate digital objects with keywords to search for later. LibraryThing, a social reading
website, has been integrated into the California State University-Northridge Oviatt Library to
allow user-generated content without contaminating systems with junk data.
These successful projects offer libraries the ability to continue the same service they were
created to perform without losing control of those systems (Baker 2012).
Promising studies, however, conclude that one does not need to be a professional indexer in
order to create meaningful content (Bates 1999). One does not need extensive expertise in order
to correctly judge what field a material belongs in. Information merely needs to be retrievable,
which does not mean it needs to be intricately described for everyday users.

Additionally, the library system has always been designed to fulfill a social need (Bates
1999). The institution of information science came about because of the social drive to share
private collections (folksonomic collections) with the general public in a controlled way
(taxonomic collections) through a series of controls (Hedstrom and King 2006). In this way,
folksonomies are akin to early libraries and museums in that they are privately maintained
collections. Through advancements in technology we have finally found a significant way to
share those collections publicly.
Information science seeks to answer three major questions, concerning the physical, social,
and design aspects of information services (Bates 1999). This is framed by the guiding principle
of information science as the deliverer of information. In this way, information science is the
study of data transfer. For a folksonomy to be considered useful (Reamy 2009), it must answer
these questions and be able to deliver information to end users (Bates 1999).
In this regard, folksonomies are built upon existing physical frameworks (Dye 2006). For
example, the gift lists on Amazon's commercial website are integrated to the point that they are
indistinguishable from the rest of the site. Moreover, end users expect folksonomic structures
to be included in their services (Fichter 2006).
This leads to the second perspective, the social question (Bates 1999): can people
relate to the information retrieval process? The answer is an overwhelming yes. Not only are
folksonomic structures easy for users to relate to, as they are in their native, natural
vernacular, but they are designed and curated by the users (Noruzi 2006). It is the "folk" they
are talking about in folksonomy (Vander Wal 2007). Folksonomies are easy for anyone to use
because they are cognitively designed the way a person thinks (Reamy 2009).

Lastly, and possibly most importantly for information science, is the question of whether the
folksonomic system effectively and rapidly supplies information to the user (Bates 1999). This
is where folksonomy has trouble (Reamy 2009). Users often tag objects that hold personal value
for them with terms that reflect that personal value. With a small collection of a couple
hundred items, users feel compelled to update the items. With a couple million items, users
begin to skip tagging because they believe someone else will do it. This is a problem of scale
that a folksonomy can face.
Research
To investigate this phenomenon further, an informal study was devised. In order to look at
how a folksonomy structure forms, a questionnaire of twenty-five items was created. Some
questions were open-ended and allowed respondents to enter any number of terms to
tag an item. Other questions had respondents choose an answer from a list. For this initial
study, a selection of images, text, and ideas was supplied.
The group was composed mostly of women over the age of twenty-five. The majority held
a college degree or higher, and all respondents were white non-Hispanic. All participants were
required to agree to terms of use, declare they were over eighteen, and complete the majority of
questions to be considered for the results. Thirty-one individuals started the survey but
only twenty-two completed every question.
Five questions allowed users to make up and input any number of tags for images and
text. Five additional questions asked participants to choose from a list of terms to tag a similar
list of items. The other questions on the survey were split between demographics and opinion
questions about the respondents' use of internet tags. Questions were designed to create a basis
for further research.

Findings indicate that users overwhelmingly choose similar words for graphical images.
One image, a tiger with googly eyes, was accurately described by users with a specific term,
"tiger." Textual objects, however, were often tagged either with the same word (tagging
"coffee" with the term "coffee") or with a completely different word that no other user had
chosen ("duplicator" or "butt multiplier" as tags for the word "printer").
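The kind of agreement described in these findings could be quantified as the share of respondents who chose the most common tag for an item. The sketch below shows one way to compute that share. The function name is hypothetical, and the response lists are invented for illustration: they reuse tag words mentioned above but are not the survey data, and "cat" and "copier" are added placeholders.

```python
from collections import Counter

def top_tag_agreement(responses):
    """Return (most common tag, share of non-empty responses that used it)."""
    normalized = [r.strip().lower() for r in responses if r.strip()]
    if not normalized:
        return None, 0.0
    tag, count = Counter(normalized).most_common(1)[0]
    return tag, count / len(normalized)

# Invented responses, NOT the study's data.
image_responses = ["tiger", "Tiger", "tiger", "cat"]
text_responses = ["printer", "duplicator", "butt multiplier", "copier"]

print(top_tag_agreement(image_responses))
print(top_tag_agreement(text_responses))
```

A share near 1.0 means respondents converged on one term. A share near 1/n, where n is the number of respondents, means nearly every respondent chose a different term.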
Results
Initial results suggest that certain types of information are easier for a community to
describe. The more personal history a person has with an object, the greater the validity of
their annotations. The less personal history a person has, the less valid their annotations
will be.
This study had several validity issues. Chiefly, the sample was too small and was not
representative of the total population in age, gender, or ethnicity. Secondly, it became clear
over the course of this research that certain questions were unfair or not easy to tag. For
example, single-term text questions did not deliver reliable results. The associations
drawn from these questions are suspect, based both on the answers and on research from
other studies.
Further research is needed to confirm the present findings. Using the groundwork created
here, the next study would incorporate narrative samples. Short paragraphs that contain a story,
a series of facts, or an explanation of a procedure would be given to participants. They would
then need to choose tags, create their own, or create tags within a framework presented to them.
Whether or not folksonomies are theoretically stable or appropriate for application in a
professional information organization system, they are here to stay. End users the world over not
only use these systems but hold them as an industry standard in their everyday lives, not only
for their ease of use, cheap scalability, and reliability but because they are now actually
pragmatic.
As technology closes the gap between the organizational anarchy of folksonomies and
well-ordered taxonomy, their use will only increase. Systems and users will become more
sophisticated as time passes, which means these systems will only continue to be integrated into
every aspect of information technology and organization.
Conclusion
In closing, folksonomies are not a proper organization system. They are a socially
constructed collection of annotated metadata that is neither reliable nor predictable. However,
they have become a necessary component of both technological and socially constructed
environments in order to make data accessible. Early experiments in combined
folksonomy-taxonomy systems have shown them to be promising ways to use the best of each
system without nearly as many of the drawbacks of either.

References
Baker, Kate. (2012). "Folksonomies and Social-Tagging." The Idaho Librarian.
http://theidaholibrarian.wordpress.com/2012/11/13/social-tagging-2012/.
Bates, Marcia. (1999). "The Invisible Substance of Information Science." Journal of the
American Society for Information Science and Technology 50, no. 12: 1043.
Bouillet, Eric, Mark Feblowitz, Hanhua Feng, Zhen Liu, Anand Ranganatha, and Anton Riabov.
(2008). "A Folksonomy-Based Model of Web Services for Discovery and Automatic
Composition." 2008 IEEE International Conference on Services Computing. 389-96.
Dye, Jessica. (2006). "Folksonomy: A Game of High-tech (and High-stakes) Tag." EContent 29,
no. 3: 38-43.
Fichter, D. (2006). "Intranet librarian. Intranet applications for tagging and folksonomies."
Information Today 30, no. 3: 43-45.
Hedstrom, Margaret, and John L. King. (2006). Epistemic Infrastructure in the Rise of the
Knowledge Economy. Cambridge: MIT Press.
Kiu, Ching-Chieh, and Eric Tsui. (2011). "TaxoFolk: A hybrid taxonomy-folksonomy structure
for knowledge classification and navigation." Expert Systems With Applications 38, no.
5: 6049-58.
Mathes, Adam. (2004). "Folksonomies - Cooperative Classification and Communication
Through Shared Metadata." Computer Mediated Communication.
Noruzi, Alireza. (2006). "Folksonomies: (Un)Controlled Vocabulary?" Knowledge Organization
33, no. 4: 199-203.
Reamy, Tom. (2009). "Folksonomy Folktales." KM World 18, no. 9: 6-8.
Vander Wal, Thomas. (2007). "Folksonomy." vanderwal.net.
http://vanderwal.net/folksonomy.html.
Yoo, Donghee, Keunho Choi, Yongmoo Suh, and Gunwoo Kim. (2013). "Building and
evaluating a collaboratively built structured folksonomy." Journal of Information
Science 39.5: 593-607.

Appendix
