(Autonomous)
Freshman Engineering Department
Green Fields, Vaddeswaram, Guntur-522502
ANDHRA PRADESH, INDIA
MINI PROJECT
INTERNET SEARCHING
BY
Lecturer In-charge(s)
CERTIFICATE
This is to certify that the students of I/IV B. tech. Mr. PAWAN
RAJ PHUYAL (Y8IT284), Mr. SANTOSH BHANDARI(Y8IT297)
& Md. SHAHBAZ HAIDER(Y8CE240) have done a mini project in
the field of Internet Searching in the year 2009-06.
2: History of Internet
3: Uses of Internet
Reasons why people use internet
Why people put things on the web
6: Steps of Searching
7: Browsing the Internet
Web browsers
Useful web browsers
9: Conclusion
Chapter 1
Introduction
The Internet is a global system of interconnected computer networks that
use the standardized Internet Protocol Suite (TCP/IP). It is a network of networks
that consists of millions of private and public, academic, business, and government
networks of local to global scope that are linked by copper wires, fiber-optic
cables, wireless connections, and other technologies.
The Internet carries a vast array of information resources and services, most
notably, the inter-linked hypertext documents of the World Wide Web (WWW)
and the infrastructure to support electronic mail, in addition to popular services
such as online chat, file transfer and file sharing, online gaming, and Voice over
Internet Protocol (VoIP) person-to-person communication via voice and video.
The origins of the Internet reach back to the 1960s when the United States funded
research projects of its military agencies to build robust, fault-tolerant and
distributed computer networks. This research spawned world-wide participation in
the development of new networking technologies and led to the commercialization
of an international network and the popularization of countless applications in
virtually every aspect of modern human life. By 2009, an estimated quarter of
Earth's population uses the services of the Internet.
There are several different ways to look at what the Internet actually is.
At the lowest level, it is the hardware behind the computer networks - the
computers, modems, phone lines and cables that link together to form a huge
network.
The Internet is a kind of anarchy. Everyone looks after their own little
Internet 'patch', but no one is responsible for looking after it as a whole. It would
be nearly impossible to control the Internet now - and trying to would certainly
destroy it. However, the data available on the Internet is updated, refreshed and
released by the sites and organizations responsible for it.
hypermedia: It contains various types of media (text, pictures, sound, movies ...)
and hyperlinks that connect pages to one another.
Chapter 2
History of the Internet
The Internet was born in 1969 as a U.S. Defense Department
network called the ARPANET. On the Internet you could exchange information and
comments with millions of people all over the world, and get a fast answer to any
question imaginable on a scientific, computing, technical, business, investment, or
any other subject. You could join over 11,000 electronic conferences, anytime, on any
subject, and you would be broadcasting your views, questions, and information to
millions of other participants.
There has never been anything like it in the history of the world, and in this
English class we've covered a lot of history. At a growing rate of about 20% per
month the Internet is only getting bigger and if people don't start utilizing its
resources they could be road kill on this Information Superhighway. Hey, I'll bet in
the middle of that last sentence another computer just got on-line to the Net. There
are three major features of the Internet: on-line discussion groups, universal
electronic mail, and files and software. There are about 11,000 on-line discussion
groups, called Newsgroups, on almost any topic you can imagine. If you are on the Net, you
can participate in any of these discussions in any of these newsgroups. The next
thing is Universal Electronic Mail or E-mail. E-mail is the biggest and cheapest
system on the Net and is also one of its biggest attractions. Since all commercial
on-line services have something called gateways for sending and receiving
electronic mail messages on the Internet, you're able to send and receive messages
or files to anyone else who is on-line, anywhere in the world and in seconds. The
third feature I mentioned was files and software. This in my opinion is the most
impressive one. All the thousands of individual computer facilities connected to
the Internet are also vast storage repositories for hundreds of thousands of software
programs, information text files, video and sound clips, and other computer based
resources. And they're all accessible in minutes from any personal computer on-line
to the Internet. So I could do all this stuff on the Internet; why should I take notice?
Because of its sheer size, volume of messages, and its incredible monthly growth.
From the latest statistics I was able to get, there are currently 30 million people
who use the Internet worldwide.
The Internet developed from a network called the ARPANET, which the U.S.
military had developed. It was restricted to military personnel and the
people who developed it. Only after it was privatized was it allowed to be
used commercially.
The Internet has developed to give many benefits to mankind, access
to information being one of the most important. Students can now have
access to libraries around the world. Some charge a fee but most provide
free services. Previously, students had to spend hours and hours in libraries,
but now at the touch of a button students have a huge database in front of
them.
A little vagueness won’t hurt. I can always muddle through and change things up
in response to market conditions or personal interest. No need to be perfect from
the start.
I looked at many websites to study their methods, to learn what made them a success. I started
planning what specific niche I wanted to explore and suddenly realized that I was
thinking about the whole thing in a roundabout way.
There’s really no need to think hard about having the perfect idea. The foundations
of popular and profitable websites/services are deeply related to the basic reasons
why people get online and use the internet. Let’s do some reverse engineering
from that perspective.
This is very much a fundamental human need. People like to meet and talk
to other people through the internet. They use it to maintain new or existing
relationships. They want to communicate ideas and find solidarity with
others who share similar interests. So do something which facilitates
communication. Hyper-local or cross-border communities, social networks,
virtual worlds, apps or services built on existing communication/social
protocols and services. Bring human social activities onto the internet grid.
Socialize existing web functions and emphasize connecting people.
People want to find things online. So help them. Create a system which
provides information or filters existing content. Monetize the flow of data.
Blogs, training courses, social news, aggregated news, paid membership
sites, online journals, one-stop entertainment portals, video, image and game
hubs with a specific focus.
To support hypertext documents, the web uses a special protocol called the
hypertext transfer protocol, or HTTP. A hypertext document is a specially encoded
file that uses the hypertext markup language, or HTML. This language allows
the document author to embed hypertext links (also called hyperlinks, or just
links) in the web document. HTTP and hypertext links are the foundation of the World
Wide Web.
As you read a hypertext document (more commonly called a web page) on
screen, you can click a word or picture encoded as a hypertext link and
immediately jump to another location within the same document or to a different web
page. The second page may be located on the same computer as the original page or
anywhere else on the Internet. Because you do not have to learn separate commands
and addresses to jump to a new location, the World Wide Web organizes widely
scattered resources into a seamless whole.
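To make the idea of a hypertext link concrete, here is a minimal sketch in Python that pulls the link targets out of a small HTML page using the standard library's `html.parser` and `urllib.parse.urljoin`. The sample page, the `example.org` address and the names `LinkExtractor` and `extract_links` are assumptions made up for illustration. They are not taken from any real site or browser.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the target of every <a href="..."> tag in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own address.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page used only for illustration.
SAMPLE_PAGE = """
<html><body>
<p>See the <a href="history.html">history</a> chapter or
<a href="http://www.google.com/">a search engine</a>.</p>
</body></html>
"""

print(extract_links(SAMPLE_PAGE, "http://example.org/report/index.html"))
```

The first link is relative, so it is resolved against the address of the page that contains it. The second is already a full address and is kept as it is. This is roughly what lets a reader jump either within one site or to a different computer with the same click.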
A collection of related web pages is called a web site. Web sites are housed on
web servers, Internet host computers that often store thousands of individual pages.
Copying a page onto a server is called publishing the page; the process is also
called posting or uploading.
When we place the cursor in the browser's input box and start
to type, the search tool looks throughout the web for things
related to our input. It will find what we want to search for
only if our input is proper and correct.
Chapter 5
Searching the Internet
4.1 Finding Things on the Web
The Web is a very big and very disorganized place. Just about any information
you would ever want to know (and a whole lot more that you wouldn't) exists on
the Web somewhere. But finding it is another story.
The reason for this is that it was never designed as a global information retrieval
system, hence there is no central place monitoring where or how information is
stored. The added complication of hypertext makes it very easy to lose your focus
and get lost.
Search Tools
Trailblazer pages are lists of links to other sites related to a particular subject.
The most useful trailblazer pages have links divided into categories and
descriptions of why each site is useful.
Trailblazer pages can be very useful in your Web searching. You will often find
links to pages that don't show up in search engines or directories. However, it can
be frustrating to jump from one trailblazer page to another without finding any
pages with actual content!
Portal Sites
These are sites that aim to be an Internet 'one-stop-shop', either to the whole Internet, or
for one particular broad subject (e.g. Education). As well as link directories and
search engines they might offer a range of other services such as discussion
forums, online shopping malls and news reports. They can be quite useful,
especially for new users to get orientated to what kinds of things the Internet can
offer them. No portal can cover the entire Internet though, so eventually you might
find their range of subjects limiting and prefer to go on a wider hunt for the
information you require.
Pick one or more search tools to use for your search. Here are some to get you
started:
Yahoo
AltaVista
Google
Infoseek
Excite
Search NZ
How long did it take to find what you were looking for?
How satisfied were you with the page you found? How well did it fit what you
were looking for?
How well do you think your chosen search tool/tools performed in your search?
This is a facility that you may "bookmark" or add to your "favorites". It is no longer
regularly updated or maintained, as I personally use the
Google toolbar for most of my searching. If I need something "special" I can try
something from this list, or I may use Speciality Search Engines or perhaps the
huge resource at Special Search Engines. There is also a search engine that
searches for specialist search engines but, ironically, I cannot find it at the moment.
Google
http://www.google.com
Yahoo
http://www.yahoo.com
Yahoo!
Launched in 1994, Yahoo is the web's oldest "directory," a place where human
editors organize web sites into categories. However, in October 2002, Yahoo made
a giant shift to crawler-based listings for its main results. These came from Google
until February 2004. Now, Yahoo uses its own search technology.
In addition to excellent search results, you can use tabs above the search box on the
Yahoo home page to seek images, Yellow Page listings or use Yahoo's excellent
shopping search engine. Or visit the Yahoo Search home page, where even more
specialized search options are offered.
The Yahoo Directory still survives. You'll notice "category" links below some of
the sites listed in response to a keyword search. When offered, these will take you to
a list of web sites that have been reviewed and approved by a human editor.
AltaVista
http://www.altavista.com
AltaVista is a web search engine owned by Yahoo!. AltaVista was once one of the
most popular search engines but its popularity has waned due to the rise of Google.
AltaVista opened in December 1995 and for several years was the "Google" of its
day, in terms of providing relevant results and having a loyal group of users that
loved the service.
Ask
http://www.ask.com
Ask Jeeves initially gained fame in 1998 and 1999 as being the "natural language"
search engine that let you search by asking questions and responded with what
seemed to be the right answer to everything. In reality, technology wasn't what
made Ask Jeeves perform so well. Behind the scenes, the company at one point
had about 100 editors who monitored search logs. They then went out onto the web
and located what seemed to be the best sites to match the most popular queries.
In 1999, Ask acquired Direct Hit, which had developed the world's first "click
popularity" search technology. Then, in 2001, Ask acquired Teoma's unique index
and search relevancy technology. Teoma was based upon the clustering concept of
subject-specific popularity.
AOL Search
http://aolsearch.aol.com (internal)
http://search.aol.com/(external)
AOL Search provides users with editorial listings that come from Google's crawler-based
index. Indeed, the same search on Google and AOL Search will come up
with very similar matches. So, why would you use AOL Search? Primarily because
you are an AOL user. The "internal" version of AOL Search provides links to
content only available within the AOL online service. In this way, you can search
AOL and the entire web at the same time. The "external" version lacks these links.
Why wouldn't you use AOL Search? If you like Google, many of Google's features
such as "cached" pages are not offered by AOL Search.
Live Search
http://www.live.com/
Live Search is the name of Microsoft's web search engine, successor to MSN
Search, designed to compete with the industry leaders Google and Yahoo. The
search engine offers some innovative features, such as the ability to view
additional search results on the same web page and the ability to adjust the amount
of information displayed for each search-result. It also allows the user to save
searches and see them updated automatically on Live.com.
Look Smart
http://www.looksmart.com
Lycos
http://www.lycos.com
Lycos is one of the oldest search engines on the web,
launched in 1994. It ceased crawling the web for its own listings in April 1999 and
instead provides access to human-powered results from LookSmart for popular
queries and crawler-based results from Yahoo for others.
Netscape Search
http://search.netscape.com
Owned by AOL Time Warner, Netscape Search uses Google for its main listings,
just as does AOL's other major search site, AOL Search. So why use Netscape
Search rather than Google? Unlike with AOL Search, there's no compelling reason
to consider it. The main difference between Netscape Search and Google is that
Netscape Search will list some of Netscape's own content at the top of its results.
Netscape also has a completely different look and feel than Google. If you like
either of these reasons, then try Netscape Search. Otherwise, you're probably better
off just searching at Google.
4.2 Limitations of Search Engines
The ambiguities of language mean that the list of retrieved documents may
contain a high percentage of irrelevant material.
Some search only document titles and others search the entire document.
Being electronic, they can't discriminate between valuable documents and
ones of dubious quality.
With millions of people using the Internet they sometimes become
overloaded.
Chapter 6
Steps of Internet Searching
How is it that an Internet search engine can find the answers to a query so quickly?
It is a four-step process:
1. Crawling the web: following links to find pages.
2. Indexing the pages: to create an index from every word to every place it occurs.
3. Ranking the pages: so that the best ones show up first.
4. Displaying the results: in a way that is easy for the user to understand.
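Before pages can be indexed they have to be discovered, and a search engine does this by following links from page to page. The sketch below shows the idea on a tiny made-up "web" held in a Python dictionary. The page names and the `crawl` function are hypothetical, and a real crawler would fetch pages over HTTP, respect site rules and run on many machines at once.

```python
from collections import deque

# Hypothetical miniature "web": each page maps to the pages it links to.
TOY_WEB = {
    "home": ["news", "about"],
    "news": ["home", "sports"],
    "about": [],
    "sports": ["news"],
    "orphan": ["home"],  # nothing links here, so a crawl from "home" never finds it
}


def crawl(start, web):
    """Breadth-first crawl: visit each reachable page once, in discovery order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        for target in web.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


print(crawl("home", TOY_WEB))
```

The page called "orphan" is never visited because no other page links to it, which is one reason some pages do not show up in search engines.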
A search engine’s index is similar to the index in the back of a book: it is used to
find the pages on which a word occurs. There are two main differences: the search
engine’s index lists every occurrence of every word, not just the important
concepts, and the number of pages is in the billions, not hundreds. Various
techniques of compression and clever representation are used to keep the index
"small," but it is still measured in terabytes (millions of megabytes), which again
means that distributed computing is required. Most modern search engines index
link data as well as word data. It is useful to know how many pages link to a given
page, and what the quality of those pages is. This kind of analysis is similar to
citation analysis in bibliographic work, and helps establish which pages are
authoritative. Algorithms such as PageRank and HITS are used to assign a numeric
measure of authority to each page. For example, the PageRank algorithm says that
the rank of a page is a function of the sum of the ranks of the pages that link to the
page. If we let PR(p) be the PageRank of page p, Out(p) be the number of outgoing
links from page p, Links(p) be the set of pages that link to page p and N be the
total number of pages in the index, then we can define PageRank by

$$PR(p) = \frac{r}{N} + (1 - r) \sum_{i \in Links(p)} \frac{PR(i)}{Out(i)}$$

where r is a parameter that indicates the probability that a user will choose not to
follow a link, but will instead restart at some other page. The r/N term means that
each of the N pages is equally likely to be the restart point, although it is also
possible to use a smaller subset of well-known pages as the restart candidates. Note
that the formula for PageRank is recursive – PR appears on both the right- and left-
hand sides of the equation. The equation can be solved by iterating several times,
or by standard linear algebra techniques for computing the eigenvalues of a (3-
billion-by-3-billion) matrix.
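The "iterating several times" approach can be sketched in a few lines of Python. The sketch assumes the usual form of the formula, PR(p) = r/N + (1 - r) times the sum of PR(i)/Out(i) over the pages i that link to p. The four-page graph, the value r = 0.15 and the fixed count of 50 iterations are assumptions chosen for illustration, and every page is assumed to have at least one outgoing link.

```python
def pagerank(out_links, r=0.15, iterations=50):
    """Iterate PR(p) = r/N + (1 - r) * sum(PR(i) / Out(i) for i in Links(p)).

    out_links maps each page to the list of pages it links to. Every page
    is assumed to have at least one outgoing link (no dangling pages).
    """
    pages = list(out_links)
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}  # start from a uniform guess
    for _ in range(iterations):
        new_pr = {p: r / n for p in pages}  # the restart term r/N
        for i, targets in out_links.items():
            # Page i passes an equal share of its rank along each outgoing link.
            share = (1 - r) * pr[i] / len(targets)
            for p in targets:
                new_pr[p] += share
        pr = new_pr
    return pr


# Hypothetical four-page web.
GRAPH = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

ranks = pagerank(GRAPH)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

In this toy graph page C receives links from three pages and ends up with the highest rank, while page D, which nothing links to, keeps only the restart share r/N.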
The two steps above are query independent—they do not depend on the user’s
query, and thus can be done before a query is issued with the cost shared among all
users. This is why a search takes a second or less, rather than the days it would take
if a search engine had to crawl the web anew for each query. We now consider
what happens when a user types a query. Consider the query ["National
Academies" computer science], where the square brackets denote the beginning
and end of the query, and the quotation marks indicate that the enclosed words
must be found as an exact phrase match. The first step in responding to this query
is to look in the index for the hit lists corresponding to each of the four words
"National," "Academies," "computer" and "science." These four lists are then
intersected to yield the set of pages that mention all four words. Because "National
Academies" was entered as a phrase, only hits where these two words appear
adjacent and in that order are counted. The result is a list of 19,000 or so pages.
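The lookup just described can be sketched with a small positional index in Python. The three sample pages and the function names (`build_index`, `pages_with_all`, `has_phrase`, `search`) are assumptions made up for illustration. A production index is compressed and spread over many machines, which this sketch does not attempt.

```python
from collections import defaultdict

# Hypothetical three-page collection used only for illustration.
PAGES = {
    1: "the national academies publish computer science reports",
    2: "national computer science academies meet annually",
    3: "the national academies building",
}


def build_index(pages):
    """Map every word to {page_id: [positions]} (a positional inverted index)."""
    index = defaultdict(dict)
    for page_id, text in pages.items():
        for position, word in enumerate(text.lower().split()):
            index[word].setdefault(page_id, []).append(position)
    return index


def pages_with_all(index, words):
    """Intersect the hit lists of the given words."""
    hit_sets = [set(index.get(w, {})) for w in words]
    return set.intersection(*hit_sets) if hit_sets else set()


def has_phrase(index, page_id, phrase_words):
    """True if the words occur adjacent and in order on the page."""
    first_positions = index[phrase_words[0]][page_id]
    return any(
        all(
            start + offset in index[word][page_id]
            for offset, word in enumerate(phrase_words)
        )
        for start in first_positions
    )


def search(index, phrase_words, other_words):
    candidates = pages_with_all(index, phrase_words + other_words)
    return sorted(p for p in candidates if has_phrase(index, p, phrase_words))


index = build_index(PAGES)
print(search(index, ["national", "academies"], ["computer", "science"]))
```

Pages 1 and 2 both contain all four words, but only page 1 has "national" immediately followed by "academies", so it is the only page that survives the phrase check.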
The next step is ranking these 19,000 pages to decide which ones are most
relevant. In traditional information retrieval this is done by counting the number of
occurrences of each word, weighing rare words more heavily than frequent words,
and normalizing for the length of the page. A number of refinements on this
scheme have been developed, so it is common to give more credit for pages where
the words occur near each other, where the words are in bold or large font, or in a
title, or where the words occur in the anchor text of a link that points to the page.
In addition, the query-independent authority of each page is factored in. The result is
a numeric score for each page that can be used to sort them best-first. For our four-
word query, most search engines agree that the Computer Science and
Telecommunications Board home page at www7.nationalacademies.org/cstb/ is the
best result, although one preferred the National
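The traditional scoring idea described above (count occurrences, weigh rare words more heavily, normalize for page length) is often called TF-IDF. Below is a minimal Python sketch of one simple variant. The sample documents and the exact weighting formula are assumptions chosen for illustration. Real search engines combine many more signals, including the link-based authority mentioned above.

```python
import math
from collections import Counter

# Hypothetical documents used only for illustration.
DOCS = {
    "a": "computer science board computer science reports",
    "b": "science news and more news about many other things in the world",
    "c": "gardening tips for the spring",
}


def score(query, docs):
    """Score each document by the summed TF-IDF of the query words."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    scores = {}
    for name, words in tokenized.items():
        counts = Counter(words)
        total = 0.0
        for term in query.lower().split():
            doc_freq = sum(1 for w in tokenized.values() if term in w)
            if doc_freq == 0:
                continue  # the word appears nowhere, so it cannot help rank
            idf = math.log(n_docs / doc_freq)  # rare words weigh more
            tf = counts[term] / len(words)     # normalize for page length
            total += tf * idf
        scores[name] = total
    return scores


def rank(query, docs):
    """Return document names sorted best-first."""
    scores = score(query, docs)
    return sorted(scores, key=scores.get, reverse=True)


print(rank("computer science", DOCS))
```

Document "a" comes first because it mentions both query words often relative to its length, and "computer" is the rarer word in this collection so it counts for more.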
Studies have shown that the most popular uses of computers are email, word
processing and Internet searching. Of the three, Internet searching is by far the
most sophisticated example of computer science technology. Building a high-
quality search engine requires extensive knowledge and experience in information
retrieval, data structure design, user interfaces, and distributed systems
implementation.
The major web browsers in order of usage according to Net Applications are
Windows Internet Explorer, Mozilla Firefox, Apple Safari, Google Chrome, and
Opera.
Firefox
Firefox is a browser from Mozilla. It was released in 2004 and is one of the
most popular browsers today.
Netscape
Netscape was the first commercial Internet browser. Netscape was
introduced in 1994, but gradually lost its popularity to Internet Explorer.
The development of Netscape officially ended in February 2008.
Mozilla
The Mozilla Project has grown from the ashes of Netscape. Browsers based
on Mozilla code are the largest browser-family on the Internet today.
Chapter 8
Demerits of Internet Searching
Theft of Personal information
If you use the Internet, your personal information, such as your name, address
and credit card number, may be at risk, because it can be accessed and misused by
criminals.
Spamming
Spamming refers to sending unwanted e-mails in bulk, which serve no
purpose and needlessly clog the entire system. Such activities can be
very frustrating, so instead of just ignoring them, you should make an
effort to try and stop them so that using the Internet becomes that
much safer.
Virus threat
A virus is a program that disrupts the normal functioning of your
computer system. Computers connected to the Internet are more prone to virus attacks,
which can end up crashing your whole hard disk and causing you considerable
headache.
Pornography:
This is perhaps the biggest threat related to your children’s healthy mental life.
A very serious issue concerning the Internet.
Time wasting
If we are not sure what we are searching for, or if we do not choose proper search
terms, searching will take a long time.
Chapter 9
Conclusion
This is the 21st century. Many people around the world use the
Internet, and it has become an essential part of daily life. People use the Internet
even for small tasks at home. The modernization seen in the world
in a short period of time, and the world's rapid development, owe much to the
evolution of the computer and the Internet. We have already discussed the features and
uses of the Internet above.
Therefore, from my experience during this mini project, I understood the
importance of Internet searching. We can get anything from the Internet if we
search with the proper combination of search tips and proper words. When we
surf the Internet, we also need to use it in the proper way.