
IBM 100 A Computer Called Watson

Theodor STANESCU
http://www.youtube.com/watch?v=mjdRgBAY278
theodor.stanescu@ro.ibm.com Strategy and Architecture Services Manager IBM ROMANIA

We are Entering a New Era


[Figure: Computer Intelligence plotted against Time]

Eras: Tabulating Era, Computing Era, Smart Systems Era (Learning Systems)

Milestones, in chronological order:
Abacus (circa 3500 BC)
Antikythera Astronomical Computer (ca 87 BC)
Napier's Rods (circa 1600)
Counting Machine (circa 1820)
ENIAC (circa 1945)
System/360 (1964)
Deep Blue (1997)
Watson (2010)

We Have Entered a New Wave of Innovation


[Figure: waves of Innovation over time]

1st Wave (1770): The Industrial Revolution
2nd Wave (1830): Age of Steam and Railways
3rd Wave (1875): Age of Steel, Electricity and Heavy Engineering
4th Wave (1920): Age of Oil, Cars and Mass Production
5th Wave (1970): Age of IT and Telecom
6th Wave (2010): Smarter Products & Services

Source: Next Generation Green: Tomorrow's Innovation Green Business Leaders, Business Week, Feb 4, 2008

Our World is Getting Smarter


Our world is becoming INSTRUMENTED.
Our world is becoming INTERCONNECTED.
Virtually all things, processes and ways of working are becoming INTELLIGENT.

Macro Trends are Changing the IT Landscape


Information from Everywhere: Data and content; Apps, web and sensors; At rest and in motion; Integrated and federated

Radical Flexibility: Cloud computing; Virtualization at every level; Automated administration; Easy-to-use analytics

Extreme Scalability: Big data analytics; Real-time stream processing; Efficient parallelism; Workload-optimized

Smarter Planet: How Clients Engage

Each Smarter Industry is listed with its three stages, in this order: Instrument to Manage; Integrate to Innovate; Optimize to Transform.

Healthcare: Electronic Medical Records; Health Information Exchange; Collaborative Care
Oil and Gas: Smart Sensoring; Integrated Oilfield Management; Oilfield Modeling
Energy & Utilities: Smart Meters; Integrated Asset Management; Intelligent Utility Network
Transportation: Asset Instrumentation; Integrated Fare Management; Congestion Management
Telecommunications: Service Innovation; Carrier-grade Platform; New Services
Retail: RFID Tagging; Integrated Demand/Supply Sys; Smarter Commerce
Banking: Core Banking; Single View of the Customer; Market Expansion
Government: Digital Surveillance; Crime Information Warehouse; Crime Prediction and Prevention
Electronics: Track and Trace; Product Design Collaboration; Supply Chain Optimization

What Is Watson?
The Next Grand Challenge

Over the last century, IBM has achieved numerous scientific breakthroughs through its commitment to research and its tradition of Grand Challenges. These Grand Challenges work to push science in ways that weren't thought possible before.

Jeopardy! The IBM Challenge poses a specific question with very real business implications: Can a system be designed that applies advanced data management and analytics to natural language in order to uncover a single, reliable insight in a fraction of a second?

"We look at areas where there's an enormous gap in current capability and use that as a challenge. We call them Grand Challenges."
Dr. John E. Kelly III, Senior Vice President and Director of IBM Research

What Is Watson?
Watson is unique in that it attends to heterogeneous sources, postulates multiple possible answers, considers evidence across multiple dimensions, and learns.

What Is Watson?
Watson does not understand nor does it think.

Statistics
Development Team: 25 people
Project Duration: 4 years
Software: 1,000,000+ SLOC (700K Java, 300K C++, plus other bits); ~130 components
Hardware: 90 IBM Power 750 servers; 2880 Power7 cores @ 80+ TFLOPS; 20 TB memory; 10 Gbps network

Play the movie IBM Watson IBM Next Grand Challenge and comment on it for the audience (5 minutes):
http://www.youtube.com/watch?v=VjHMYuGkzlU

Related Systems

PIQUANT (Practical Intelligent Question Answering Technology) @ IBM.
OpenEphyra @ CMU.
UIMA (Unstructured Information Management Architecture) @ IBM -> Apache.

http://www.research.ibm.com/UIMA/Project_QA.htm
http://www.ephyra.info/
http://uima.apache.org/

Architecture: In Their Own Words


[Architecture diagram]

Input: Case/Question/Topic Data
Main flow: Question/Case Analysis -> Fact Finding and Question Decomposition -> Hypothesis Generation -> Hypothesis and Evidence Scoring -> Synthesis -> Final Merging and Confidence & Ranking
Output: Confidence-Weighted Differential Diagnosis
Components shown: Answer Sources, Primary Search, Candidate Answer Generation, Hypothesis Scoring, Evidence Sources, Evidence Retrieval, Deep Evidence Scoring

Learned Models help combine and weigh the Evidence.
The Hypothesis Generation and Hypothesis and Evidence Scoring stages appear more than once in the diagram (...).
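
The last steps of the diagram (evidence scoring, learned models that weigh the evidence, then confidence and ranking) can be illustrated with a small sketch. This is not Watson's code: the scorer names, the weights, the bias, the threshold, and the choice of a logistic model are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """A candidate answer with evidence scores keyed by (hypothetical) scorer name."""
    answer: str
    evidence: dict = field(default_factory=dict)
    confidence: float = 0.0


# Hypothetical weights; a real system would learn these from training data.
WEIGHTS = {"passage_support": 2.0, "type_match": 1.5, "popularity": 0.5}
BIAS = -2.0


def score_confidence(candidate: Candidate) -> float:
    """Combine the evidence scores into one confidence in (0, 1) via a logistic model."""
    z = BIAS + sum(WEIGHTS.get(name, 0.0) * value
                   for name, value in candidate.evidence.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank(candidates: list) -> list:
    """Assign a confidence to every candidate and sort best-first."""
    for c in candidates:
        c.confidence = score_confidence(c)
    return sorted(candidates, key=lambda c: c.confidence, reverse=True)


def answer_or_pass(candidates: list, threshold: float = 0.5):
    """Return the top answer only if its confidence clears the threshold, else None."""
    ranked = rank(candidates)
    if ranked and ranked[0].confidence >= threshold:
        return ranked[0].answer
    return None


candidates = [
    Candidate("Toronto", {"passage_support": 0.4, "type_match": 0.0, "popularity": 0.9}),
    Candidate("Chicago", {"passage_support": 0.8, "type_match": 1.0, "popularity": 0.7}),
]
print(answer_or_pass(candidates))
```

Returning None when no candidate clears the threshold mirrors the idea of declining to answer when confidence is low.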

Sources
Wikipedia/Wikiquote/Wiktionary/Wikibooks (The Free Encyclopedia) @ http://wikipedia.org
YAGO2 (A Spatially and Temporally Enhanced Knowledge Base from Wikipedia) @ http://www.mpi-inf.mpg.de/yago-naga
DBpedia (Extracting Structured Information from Wikipedia) @ http://dbpedia.org
WordNet (A Lexical Database for English) @ http://wordnet.princeton.edu
Web expansion of many primary sources.
Various licensed encyclopedias, dictionaries, books of quotations, and wire news.

Deployment View: Watson


90 IBM Power 750 servers.
4 Power7 processors/server.
8 cores/processor.
10 TB memory/server.
SUSE Linux Enterprise OS.
SONAS storage @ 20 TB.
Juniper switch @ 10 Gbps.
2 x 20-air conditioning units.
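
As a quick cross-check, the server, processor, and core counts on this slide multiply out to the 2880 Power7 cores quoted on the Statistics slide. The variable names below are illustrative only.

```python
# Figures taken from the deployment slide.
servers = 90
processors_per_server = 4
cores_per_processor = 8

# Total core count implied by the three figures above.
total_cores = servers * processors_per_server * cores_per_processor
print(total_cores)  # 2880
```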

Deployment View: Actuator

Players receive a clue simultaneously (humans visually, Watson electronically).
Human reaction time to a light stimulus is approximately 190 ms; Watson's actuator reaction time is approximately 5-10 ms, giving it at best 180 ms to compute.

Players can buzz in only after an enable light. Humans can anticipate; Watson cannot.
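
The timing figures above can be worked through as a simple subtraction. The sketch below assumes the compute budget is just the human reaction time minus the actuator latency; the function and constant names are hypothetical. The slide's 180 ms figure corresponds to the slower (10 ms) end of the actuator range.

```python
HUMAN_REACTION_MS = 190        # approximate human reaction time, from the slide
ACTUATOR_LATENCY_MS = (5, 10)  # fastest and slowest actuator reaction, from the slide


def compute_budget_ms(human_reaction_ms: int, actuator_latency_ms: int) -> int:
    """Time left to compute if Watson is to buzz no later than a reacting human."""
    return human_reaction_ms - actuator_latency_ms


fast, slow = ACTUATOR_LATENCY_MS
budget_fast_actuator = compute_budget_ms(HUMAN_REACTION_MS, fast)
budget_slow_actuator = compute_budget_ms(HUMAN_REACTION_MS, slow)
print(budget_fast_actuator, budget_slow_actuator)  # 185 180
```

This budget only matters against a human who reacts to the enable light; a human who anticipates the light can buzz sooner.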

Deployment View: Avatar

Developed by Joshua Davis and Branden Hall of Automata Studios.
Designed using Adobe Illustrator CS5 and scripted with Adobe Flash Professional CS5 using the HYPE visual framework.
Deployed using Adobe Flash Player 10.1.

http://www.hypeframework.org/blog/content/ibm-watson-and-the-jeopardy-challenge/

Deployment View: Voice

Must handle a large, open-ended vocabulary with many difficult words.
Text-to-speech synthesis constructed from the phonemes extracted from 10 hours of the voice of Jeff Woodman.
Rules-based linguistic analysis front end.
Prosody and acoustic model back end.

http://www.anythinggoesradio.com/Interviews/JeffWoodman_02_14_11.MP3

Deployment View: Watson

Play the movie IBM Watson: A System Designed for Answers (3 minutes):
http://www.youtube.com/watch?v=cU-AhmQ363I

Organization: Principal Investigator


David Ferrucci

Organization: Algorithms

Organization: Systems

Organization: Strategy

Organization: Speech

Organization: Annotations

Organization: Labs

Organization: Universities

Organization: Project Management

Organization: Applications

Tools

Eclipse.
Subversion -> RTC.
Watson Error Analysis Tool (WEAT).
Feature Analysis Tool (FAT).
BlueJ Automatic Distributed Execution Environment tools (BAIDE).
Data repository tools.

Process

Agile development.
War room setting.
Weekly integration.
End-to-end testing.
~6,000 experiments; 10 gigabits of test data/week.

Watson's Future: New Domains

Clinical decision support/physician's assistant.
Business analytics.
Customer service.
Legal research.

References: On The Web


http://www.ibm.com/innovation/us/watson/
http://www.ted.com/webcast/archive/event/ibmwatson
http://www.pbs.org/wgbh/nova/tech/smartest-machine-on-earth.html
http://www.youtube.com/watch?v=FC3IryWr4c8
http://www.youtube.com/watch?v=3G2H3DZ8rNc&feature=relmfu

Play the movie IBM A century of progress (about 2 minutes):
http://www.youtube.com/watch?v=gLlJDUPg-kY

THANK YOU
www.ibm.com www.ibmwatson.com

Copyright IBM Corporation 2011. All rights reserved. The information contained in these materials is provided for informational purposes only, and is provided AS IS without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, these materials. Nothing contained in these materials is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software. References in these materials to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. Product release dates and/or capabilities referenced in these materials may change at any time at IBM's sole discretion based on market opportunities or other factors, and are not intended to be a commitment to future product or feature availability in any way. IBM, the IBM logo, Rational, the Rational logo, Telelogic, the Telelogic logo, and other IBM products and services are trademarks of the International Business Machines Corporation, in the United States, other countries or both. Other company, product, or service names may be trademarks or service marks of others.
