Open Source Intelligence:
Access All Intelligence, in All Languages, All the Time

Presented by
Abe Lederman, President and CTO
Deep Web Technologies, LLC
IOP ’06 Sheraton Premier, Tysons Corner, Virginia January 16-20
About Deep Web Technologies (DWT)
DWT is a New Mexico based company focused on providing
state-of-the-art software solutions which search, retrieve,
aggregate, and analyze content.

• Deployed first “federated search” portal in the Federal Government, 1999
• Major clients include:
– DOE Office of Scientific & Technical Information
– Defense Technical Information Center
– Science.gov Alliance
– DOE Office of Science
– National Agricultural Library
Open Source Intelligence
The Problem:
• Collecting and analyzing enormous
quantities of information in any language,
in myriad formats, located anywhere,
accessible through a large variety of
means, with a majority not accessible
through the Internet
Shared Challenge:
OSINT and Knowledge Discovery/Diffusion

[Diagram: OSINT Challenges | Knowledge Discovery/Diffusion Challenges]

For the past six years, DWT has been the lead technical
organization addressing these challenges, in collaboration
with the DOE Office of Scientific & Technical Information.
The DWT Proposition

To apply DWT’s technology, expertise
and ongoing innovations* to address
the challenges of OSINT

*Developed in partnership with DOE/OSTI


Challenges in Working with
Thousands of Data Sources
Locate Reliable Sources

Categorize Sources by Content

Configure Sources for Searching

Maintain Sources
Challenges in Searching
Thousands of Sources
• Automatically Select Sources to Search
• Perform Many Searches in Parallel
• Translate, Analyze and Organize Results
– Relevance Rank
– Extract Key Information
– Cluster/Visualize
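
The "Perform Many Searches in Parallel" step can be illustrated with a minimal Python sketch. This is not DWT's implementation: the connector functions (`search_reports`, `search_patents`, `search_broken`) and the `search_all` helper are hypothetical stand-ins, and a thread pool is only one of several ways to fan a query out to many sources.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def search_all(sources, query, max_workers=8):
    """Send `query` to every source concurrently.

    `sources` maps a source name to a callable taking the query and
    returning a list of result dicts. Returns (results, errors); a failing
    source is recorded in `errors` instead of aborting the whole search.
    """
    results, errors = [], {}
    if not sources:
        return results, errors
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fn, query): name for name, fn in sources.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                for hit in future.result():
                    results.append({"source": name, **hit})
            except Exception as exc:  # one slow or broken source must not stop the rest
                errors[name] = str(exc)
    return results, errors


# Hypothetical connectors standing in for real data sources.
def search_reports(query):
    return [{"title": f"Report on {query}"}]


def search_patents(query):
    return [{"title": f"Patent: {query}"}, {"title": "Unrelated"}]


def search_broken(query):
    raise TimeoutError("source did not respond")


SOURCES = {
    "reports": search_reports,
    "patents": search_patents,
    "broken": search_broken,
}

if __name__ == "__main__":
    hits, failed = search_all(SOURCES, "fusion")
    print(len(hits), "results;", "failed sources:", sorted(failed))
```

Collecting errors per source, rather than raising, reflects the point made on the slide: with thousands of sources, some will always be unavailable, and the remaining results should still be delivered.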
ResearchAssistant™
DWT’s State-of-the-art
Federated Search Engine
• Scalable, grid-computing based federated
search engine
• Sophisticated Search Conductor
• Supports custom connectors
• Multi-tier relevance ranking
• Framework accepts integration of advanced
linguistic, analysis, and visualization
modules
Grid Computing:
Distributing the Workload
Search Conductor
[Flowchart]
• Select sources to search
• Perform search
• Enough good results?
– YES: Deliver results to user
– NO: Can I get more results from “good” sources? (YES / NO)
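
The Search Conductor decision loop can be sketched in Python as follows. The flowchart text does not say where the YES and NO branches of the second question lead, so this sketch assumes that YES means requesting another page from sources that returned good results, and NO means selecting additional sources. The names `conduct_search`, `make_source`, `wanted`, `good_score` and `batch` are hypothetical, as is the numeric score attached to each result.

```python
def conduct_search(query, sources, wanted=5, good_score=0.5, batch=2):
    """Loop until enough good results are found or no source is left.

    `sources` maps a name to fetch(query, page) -> list of (title, score).
    """
    untried = list(sources)  # sources not yet selected
    active = {}              # selected source name -> next page to request
    good = []                # (source, title, score) judged good so far

    while len(good) < wanted:            # "Enough good results?"
        if not active:
            # Assumed NO branch: good sources are exhausted, select more sources.
            if not untried:
                break
            for name in untried[:batch]:
                active[name] = 0
            untried = untried[batch:]

        # "Perform search" on every currently selected source.
        for name in list(active):
            hits = sources[name](query, active[name])
            new_good = [(name, title, score)
                        for title, score in hits if score >= good_score]
            good.extend(new_good)
            if new_good:
                active[name] += 1        # assumed YES branch: ask this source for more
            else:
                del active[name]         # poor or exhausted source

    # "Deliver results to user", best first.
    return sorted(good, key=lambda r: r[2], reverse=True)[:wanted]


def make_source(pages):
    """Hypothetical paged source backed by a fixed list of result pages."""
    def fetch(query, page):
        return pages[page] if page < len(pages) else []
    return fetch


SOURCES = {
    "a": make_source([[("A1", 0.9), ("A2", 0.2)], [("A3", 0.7)]]),
    "b": make_source([[("B1", 0.1)]]),
    "c": make_source([[("C1", 0.8)]]),
}

if __name__ == "__main__":
    for source, title, score in conduct_search("q", SOURCES, wanted=3):
        print(source, title, score)
```

With these sample sources, source "b" is dropped after one poor page, source "a" is asked for a second page, and source "c" is only selected once "a" has nothing more to give.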
Multi-tier Relevance Ranking
• QuickRank™ – Ranks results based on
occurrence of search terms in title and
snippet
• MetaRank™ – Ranks results utilizing
custom algorithms applied to metadata
• DeepRank™ – Downloads and indexes
full-text documents
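
As an illustration of the first tier only, the sketch below ranks results by counting search-term occurrences in the title and snippet. It is not the QuickRank™ algorithm itself, whose details are not given here; the `title_weight` parameter (counting title matches double) and the function names are assumptions made for the example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def quick_score(query, title, snippet, title_weight=2.0):
    """Score = weighted count of query-term occurrences in title and snippet."""
    terms = set(tokenize(query))
    title_hits = sum(1 for tok in tokenize(title) if tok in terms)
    snippet_hits = sum(1 for tok in tokenize(snippet) if tok in terms)
    return title_weight * title_hits + snippet_hits


def quick_rank(query, results):
    """Return results sorted by descending score; ties keep their input order."""
    return sorted(
        results,
        key=lambda r: quick_score(query, r["title"], r["snippet"]),
        reverse=True,
    )


# Hypothetical results as they might come back from several sources.
RESULTS = [
    {"title": "Annual budget summary",
     "snippet": "Funding for solar research programs."},
    {"title": "Solar energy storage",
     "snippet": "Advances in solar energy storage systems."},
    {"title": "Wind turbines",
     "snippet": "Energy output of offshore turbines."},
]

if __name__ == "__main__":
    for r in quick_rank("solar energy", RESULTS):
        print(quick_score("solar energy", r["title"], r["snippet"]), r["title"])
```

A scheme like this needs only the title and snippet already returned by each source, which is why it can run before any metadata processing or full-text download.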
Science.gov Alliance Consortium of
12 Federal Government Agencies
• Dept of Agriculture
• Dept of Commerce
• Dept of Defense
• Dept of Education
• Dept of Energy
• Dept of Health/Human Services
• Dept of Interior
• Environmental Protection Agency
• NASA
• National Science Foundation
• US Government Printing Office
• National Archives & Records Administration

Sponsoring the Science.gov Portal
(Access to most of Federal Government R&D)
Science.gov Advanced Search Page
Science.gov Results Page
A Science.gov Document
Next Steps
Identify sponsors and development
partners that can collaborate on the
development of a pilot that integrates
best-of-breed technologies of value to OSINT.

This pilot will result in a portal that
aggregates content of different types,
generating actionable intelligence.
Contact Us
Abe Lederman
122 Longview Drive
Los Alamos, NM 87544
abe@deepwebtech.com
www.deepwebtech.com

http://www.deepwebtech.com/talks/IOP.ppt
