
Fedora

New Features, New Collaborations, Bright Future

Fedora Users Conference


Copenhagen, Denmark
September 28, 2005

Sandy Payette
Co-Director Fedora Project
Cornell University
Fedora Brief History
• Cornell Research (1997-present)
– DARPA and NSF-funded research
– First reference implementation developed
– Interoperable Repositories (experiments with CNRI)
– Policy Enforcement

• First Application (1999-2001)


– University of Virginia digital library prototype
– Technical implementation: adapted to web; RDBMS storage
– Scale/stress testing for 10,000,000 objects

• Open Source Software (2002-present)


– Andrew W. Mellon Foundation grants
– Technical implementation: XML and web services
– Fedora 1.0 (May 2003)
– Fedora 2.0 (Jan 2005)
– Fedora 2.1 (coming soon!)
Fedora Development Team

Cornell University
• Sandy Payette (co-director)
• Chris Wilper
• Carl Lagoze
• Eddie Shin

University of Virginia
• Thorny Staples (co-director)
• Ross Wayland
• Ronda Grizzle
• Bill Niebel
• Bob Haschart
• Tim Sigmon
“Fedora Inside”
Known Use Cases

• Digital Library Collections


• Institutional Repository
• Educational Software
• Information Network Overlay
• Digital Archives and Records Management
• Digital Asset Management
• File Cabinet / Document Management
• Scholarly publishing
Fedora Repository and Web Services

[Diagram: layered architecture, clients on top, repository modules and storage below]
• Clients: Client App, Web Browser, Batch Program, Other Service (connecting via REST and SOAP)
• Web Services Exposure: Manage, Access, Basic Search, RDF Search, OAI Provider
• Fedora Repository Modules: Manage, AuthN, AuthZ, Access, Validation, RDF Resource Index, Storage, Dissemination, Registry
• Backing stores: files, rdbms
The Basics: Fedora Digital Object Model
Container View

Persistent ID (PID) Digital object identifier


Relations (RELS-EXT)

Dublin Core (DC) Reserved Datastreams


Key object metadata
Audit Trail (AUDIT)

Datastream
Datastreams
Datastream
Aggregate content or metadata items

Default Disseminator Disseminators


Pointers to service definitions to
Disseminator provide service-mediated views
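
The container view above can be sketched as a small in-memory data structure. This is only an illustration of the model's parts (PID, reserved datastreams, datastreams, disseminators); the class names, field names, and sample identifiers are assumptions and not part of any Fedora API.

```python
from dataclasses import dataclass, field

# Reserved datastream IDs named on the slide.
RESERVED_IDS = ("DC", "RELS-EXT", "AUDIT")


@dataclass
class Datastream:
    """One content or metadata item aggregated by a digital object."""
    ds_id: str
    mime_type: str
    content: bytes = b""


@dataclass
class Disseminator:
    """Pointer to a service definition that provides a service-mediated view."""
    service_definition_pid: str


@dataclass
class DigitalObject:
    """Container: a persistent ID plus datastreams and disseminators."""
    pid: str
    datastreams: dict = field(default_factory=dict)
    disseminators: list = field(default_factory=list)

    def add_datastream(self, ds: Datastream) -> None:
        self.datastreams[ds.ds_id] = ds

    def reserved(self) -> list:
        """Return the reserved datastreams present on this object."""
        return [self.datastreams[i] for i in RESERVED_IDS if i in self.datastreams]


obj = DigitalObject(pid="demo:1")  # hypothetical PID
obj.add_datastream(Datastream("DC", "text/xml", b"<dc/>"))
obj.add_datastream(Datastream("IMAGE", "image/jpeg"))  # hypothetical datastream ID
obj.disseminators.append(Disseminator("demo:bdef-1"))  # hypothetical service definition
```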
Fedora – Object Model XML

• FOXML (Fedora Object XML)


– Simple XML format directly expresses Fedora object model
– Easily adapts to new and planned Fedora features
– Easily translated to other well-known formats

• Enhanced Ingest/Export of objects


– FOXML, METS (Fedora extension)
– Extensible to accommodate new XML formats
– Planned: METS 1.4, MPEG21 DIDL
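
To give a feel for how directly FOXML expresses the object model, here is a minimal sketch that builds a skeletal object with Python's standard library. The element and attribute names are assumptions based on the published FOXML schema and should be verified against the schema shipped with your release; the PID and title are hypothetical, and a real DC datastream carries a fuller Dublin Core record than the single title shown here.

```python
import xml.etree.ElementTree as ET

FOXML_NS = "info:fedora/fedora-system:def/foxml#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("foxml", FOXML_NS)
ET.register_namespace("dc", DC_NS)


def q(tag: str) -> str:
    """Qualify a tag name with the FOXML namespace."""
    return f"{{{FOXML_NS}}}{tag}"


def build_object(pid: str, dc_title: str) -> ET.Element:
    """Build a skeletal object holding one inline-XML DC datastream."""
    root = ET.Element(q("digitalObject"), {"PID": pid})
    ds = ET.SubElement(root, q("datastream"), {"ID": "DC", "CONTROL_GROUP": "X"})
    version = ET.SubElement(
        ds, q("datastreamVersion"), {"ID": "DC.0", "MIMETYPE": "text/xml"}
    )
    content = ET.SubElement(version, q("xmlContent"))
    title = ET.SubElement(content, f"{{{DC_NS}}}title")
    title.text = dc_title
    return root


doc = build_object("demo:1", "Example title")  # hypothetical PID and title
print(ET.tostring(doc, encoding="unicode"))
```

The output is not schema-validated; it only shows the nesting of object, datastream, version, and content.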
Fedora 2.1

“Release Notes”
Fedora Service Framework
(Fedora 2.1)

[Diagram: the Fedora Repository surrounded by services, with client applications below]
• Services: PROAI (OAI Provider Service), Directory Ingest Service (ZIP or JAR input), Future Service, Other Service
• Apps: Administrator, Dir Ingest Client
2.1 Release Notes

• Authentication plug-ins
– HTTP Basic auth
– Tomcat realms and login modules
• Plug-in #1 : Tomcat user/password file or database
• Plug-in #2 : LDAP tie-in
• Plug-in #3 : Radius Authentication

• Support for SSL

• Authorization module
– XML-based policies using XACML
– Repository-wide policies
– Object-specific policies
– Fine-grained policy enforcement
• API actions X subject attributes X object attributes
XACML Policy Examples

• Repository-wide Policy
– [xacml-1] Deny access to DC datastream to specific user group

• Object-specific Policy
– Deny all access to the object “cornell:cs100” if the user is not a Cornellian.

• Genre-oriented Policy
– [xacml-2] For objects with content model of “uva-image”, permit students
access to disseminations but deny them access to raw datastreams;
allow professors access to both.

• Time-oriented Policy
– Permit students access to “answers” datastream of learning object cs:125
after May 15, 2005

• Backend Service Security Policy


– Deny callback by the external MRSID service identified as “bmech:10”
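
The genre-oriented example above can be read as a decision over the three dimensions named earlier: API action, subject attributes, and object attributes. The sketch below is a toy decision function in plain Python, not XACML and not Fedora's authorization module; the dictionary keys, role names, and action names are illustrative assumptions.

```python
def evaluate(request: dict) -> str:
    """Toy decision for the genre-oriented policy on 'uva-image' objects.

    Returns "Permit", "Deny", or "NotApplicable" (decision names borrowed
    from XACML; everything else here is a simplification).
    """
    if request["object"].get("content_model") != "uva-image":
        return "NotApplicable"

    role = request["subject"].get("role")
    action = request["action"]

    if role == "professor" and action in ("getDissemination", "getDatastream"):
        return "Permit"  # professors may use both access paths
    if role == "student":
        # students see service-mediated views only, never raw datastreams
        return "Permit" if action == "getDissemination" else "Deny"
    return "NotApplicable"


image = {"content_model": "uva-image"}
student_raw = {"action": "getDatastream", "subject": {"role": "student"}, "object": image}
print(evaluate(student_raw))
```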
2.1 Release Notes
• Review of RDF-based Resource Index

– “Relationships” Datastream
– Ontology of common relationships (RDF schema)
– RDF stored in datastream identified by “RELS-EXT”

– Resource Index (RI)


– RDF-based index of repository (automatic indexing into Kowari triple-store)
– Graph-based index includes:
– Object properties and Dublin Core
– Object-to-object relationships
– Datastream Disseminations (and properties)

– RI Search (Search the repository as a graph)


– Powerful querying of graph of inter-related objects
– REST-based query interface (using RDQL or ITQL)
– Results in different formats (triples, tuples, sparql)
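
"Searching the repository as a graph" amounts to matching triple patterns and following relationships from object to object. The sketch below shows the idea with an in-memory list of triples; it is not the RI Search interface and does not use RDQL or ITQL. The predicate names are shortened for readability and the sample data is illustrative.

```python
# Illustrative triples of the kind the Resource Index holds.
TRIPLES = [
    ("info:fedora/collection:1", "hasMember", "info:fedora/image:11"),
    ("info:fedora/collection:1", "hasMember", "info:fedora/image:12"),
    ("info:fedora/image:11", "hasRep", "info:fedora/image:11/BLDG"),
    ("info:fedora/image:12", "hasRep", "info:fedora/image:12/BLDG"),
]


def match(pattern, triples=TRIPLES):
    """Return triples matching (subject, predicate, object); None is a wildcard."""
    return [
        t for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    ]


def member_representations(collection):
    """Two-hop traversal: collection -> members -> their representations."""
    reps = []
    for _, _, member in match((collection, "hasMember", None)):
        reps += [rep for _, _, rep in match((member, "hasRep", None))]
    return reps


print(member_representations("info:fedora/collection:1"))
```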
2.1 Release Notes
• New in Fedora 2.1 for Resource Index

– Resource Index corruption problems diagnosed and fixed (Kowari memory bug)

– Minor RI model changes (may require modification of existing static queries by users)

– Relaxation of validation rules on RELS-EXT: now accepts (objectURI --- relation/property ---> URI/literal)

– Method Disseminations (and properties), with option for method X parmVal permutations

– Scale and Performance Testing (NSDL 2M objects, >100M triples)

– Sesame support for triplestore


RI: Fedora Objects
RDF Graph view
[Figure: RDF graph in which a collection object (info:fedora/collection:1) is linked by hasMember to two member objects (info:fedora/image:11 and info:fedora/image:12). Each object carries dc:creator and lastModDate literals and hasRep links to its representations: info:fedora/image:11/BLDG, info:fedora/image:11/bdef:2/getRelatedLetter, info:fedora/image:12/BLDG, info:fedora/image:12/bdef:2/getHIGH, and info:fedora/collection:1/bdef:1/MEMBERS.]
Fedora 2.1 Release Notes

• PROAI Server (Advanced OAI Provider)


– Harvest multiple metadata formats
– Harvest datastreams and disseminations
– Support for incremental harvest by modified date
– Support for OAI sets
– Highly configurable via queries against Resource Index

• Directory Ingest Service


– Facilitate ingest of hierarchical directories of files
– Submit files as .zip or .jar (with a METS manifest)
– Automatically asserts parent-child relationships in RELS-EXT
– Stages content and ingests as FOXML objects into repository

• Directory Ingest Client


– Web client (signed applet)
– Browse directory trees, select dir/files, add metadata, add relations
– Auto-generates METS manifest for entire collection
– Packages as zip/jar and ingests into Fedora repository
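
The parent-child relationships that the Directory Ingest Service asserts follow from the directory hierarchy itself. As a rough sketch of that derivation (not the service's actual code), the function below turns archive entry names into (parent, child) pairs; the sample paths are hypothetical, and in practice the names could come from `zipfile.ZipFile.namelist()`.

```python
from pathlib import PurePosixPath


def parent_child_pairs(paths):
    """Derive sorted (parent, child) pairs from slash-separated entry names."""
    pairs = set()
    for name in paths:
        parts = PurePosixPath(name).parts
        # Every prefix of a path is the parent of the next-longer prefix.
        for i in range(1, len(parts)):
            parent = "/".join(parts[:i])
            child = "/".join(parts[: i + 1])
            pairs.add((parent, child))
    return sorted(pairs)


entries = ["thesis/ch1/text.pdf", "thesis/ch1/fig.png", "thesis/abstract.txt"]
for parent, child in parent_child_pairs(entries):
    print(f"{parent} -> {child}")
```

Each pair would then map to one relationship assertion in the child's (or parent's) RELS-EXT datastream.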
2.1 Release Notes

• Rebuild Utility for Repository Indices


• Improved logging using log4j
– Trippi.log
– Kowari.log
– Repository log
• Handle System Plug-in for PID Generation
• Command-line utility syntax changes
• New Command-line utilities
– fedora-reload-policies
– validate-policy
– fedora-rebuild
• FedoraClient utility class for building new clients
Fedora Future
2006-2007
You asked…

• “We wish for an “out-of-box” end-user client for Fedora.”

• “Can’t you put the DSpace interface on top of a Fedora repository?”

• “We need something to show people Fedora right away (before we get $$ for development resources).”

• “We love Fedora. It would be really great if you distributed a default end-user client.”
The Answer: FIRE Client

• Web-based client for “institutional repository”


• End-user content submission
• Object creation template for “content models”
• Configurable Workflows
• XACML policies coordinated with workflow
• Search/Browse collections

Development in progress!
Fedora Service Framework (2005-07)

[Diagram: the Fedora Repository at the center of a framework of services, external systems, and client applications]
• Fedora Services: Federation, PID Resolution, PROAI (OAI Provider), Preservation Monitoring, Preservation Integrity, Event Notification, Directory Ingest, Fedora Search, Fedora Workflow, External Workflow, InterDisseminator Service, Other
• External systems and tools: aDORe, arXiv, DSpace, Pathways, JHOVE, GDFR, OpenURL Access Point
• Apps: Administrator, Policy Builder, FIRE Client (web-based submission and basic workflow)
Fedora Development Priorities
2006-2007
• Fedora Framework Services
• Federated Repositories
– “Fedorations” with name service
– Federation with other repositories (DSpace, aDORE, arXiv)
• Cornell/LANL NSF Pathways project
• InterDisseminator
• “Content Model” Specification Language
• Advanced Object Creation Workbenches
• Tools for RDF browse and graph traversal
• Scalability/Performance – very large repositories
• Web services security and Shibboleth
• Code Refactoring
• Fedora as web app (.war)
• Fedora Showcase and News (on new website)
• Community Coordination and Co-Development
Collaboration:
Fedora Community Working Groups
• Preservation Working Group (Ron Jantz, Rutgers)
– Requirements for preservation services
– Define service APIs and technical integration with Fedora 2.1 +
– Preservation metadata recommendations for Fedora
– Prototyping of new services
– Development plan for deployment of new services
Collaboration:
Fedora Community Working Groups

• Workflow Working Group (Peter Murray, OhioLink)


– Sep 05: WORKFLOW WG chartered and begins work
– Oct 05: Submit "terminology and problem statement" document to fedora-users
for review
– Nov 05: Submit modeling diagrams, workflow process descriptions, and
recommendation for workflow engine to fedora-users for review
– Feb 06: Release alpha-quality version of ingestion workflow engine
– Apr 06: Release beta-quality version of ingestion workflow engine
– Aug 06: Release production-quality version of ingestion workflow engine
– Nov 06: Revise documents based upon implementation experience
– Feb 07: Release alpha-quality version 2.0 of ingestion workflow engine
– Apr 07: Release beta-quality version 2.0 of ingestion workflow engine
– Aug 07: Release production-quality version 2.0 of ingestion workflow engine
– Sep 07: Close or recharter the WG
Sample Workflows
• Ingest-oriented process (SIP)
– Validate byte-streams → Ingest to Repo → Link to Simulation Service → Assign Access Policy → Index and Register

• Review-oriented process
– Steps shown: Submit thesis, Review, Edit, Assign Policy, Publish

• Preservation-oriented process
– Steps shown: Digital Object, Object Versioning In Repo, Diagnose Problems, Format Migration, Make Copies, Ingest To Archive
Collaboration:
Fedora Community Working Groups
• Outreach Working Group (Linda Langschied, Rutgers)
– Improve content of Fedora web site
– More user-oriented information (currently technical focus)
– Community Showcase – demos, graphics
– Survey database with simple web form to profile users
– Collaboration Environment
– Wiki, Confluence, other?

• Content Model Working Group (under charter)


– Formalization of notion of Fedora content model
– XML schema to define content models
– Investigate ontology-based content model definition
– Round up existing content models and publish to promote reuse
Fedora Community

• Fedora Advisory Board


– Vision
– Commission Working Groups
– Prioritize Development
– Define Sustainability Model

• Collaborative Development Opportunities

• Share Tools via www.fedora.info


– User-contributed Tools, Apps, Services
Fedora Community (a sampling)

• General questions
• Hot topics
– Workflow
– Digital object typing
– RDF and relationships
– Search and indexing
– Collaboration models
– Other
• Demos
– Encyclopedia of Chicago
– NSDL
New Fedora Web Site!

www.fedora.info
