Alfresco 3 Business Solutions
Ebook, 1,163 pages, 5 hours


About this ebook

In Detail

Alfresco is a renowned, multiple award-winning open source Enterprise Content Management system that allows you to build, design, and implement your very own ECM solutions. It offers much more advanced and cutting-edge features than its commercial counterparts with its modularity and scalability. If you are looking for quick and effective ways to use Alfresco to design and implement effective and world-class business solutions that meet your organizational needs, your search ends with this book.

Welcome to Alfresco 3 Business Solutions, your practical and easy-to-use guide which, instead of teaching you just how to use Alfresco, teaches you how to live Alfresco. It will guide you through implementing real-world solutions through real-world scenarios. Each ECM problem is treated as a separate case study and has its own chapter, enabling you to uncover the practical aspects of an ECM implementation. You want more than just the theoretical details: you want practical insights into building, designing, and implementing nothing less than world-class business solutions with Alfresco, and Alfresco 3 Business Solutions is your solution.

This practical companion cuts short the preamble and you dive right into the world of business solutions with Alfresco.

Learn all techniques, basic and advanced, required to design and implement different solutions with Alfresco in easy and efficient ways. Learn all you need to know about Document Management, Records Management, the lot. Connect Alfresco with directory servers. Learn how to use CIFS and troubleshoot all types of problems. Migrate data when you have an existing network drive with documents and want to merge them into Alfresco. Implement Business Process Design Solutions with Swimlane diagrams. Easily extract content from Alfresco and build mashups in a portal like Liferay. Gain insights into mobile access and email integration.

This book will teach you to implement all that and more, in real world environments.

Approach

This book guides you through all the practical aspects of the Alfresco CMS with numerous case studies and real life scenarios. It is packed with illustrative examples and diagrams to make learning easier and straightforward.

Who this book is for

This book is designed for system administrators and business owners who want to learn and implement Alfresco Business Solutions in their teams or business organizations. General familiarity with Java and Alfresco is required.

Language: English
Release date: Feb 8, 2011
ISBN: 9781849513357
Author

Martin Bergljung

Martin Bergljung is a Principal ECM Architect at Ixxus, a UK platinum Alfresco partner. He has over 20 years of experience in the IT sector, where he has worked with the Java platform since 1997. Martin began working with Alfresco in 2007 developing an email management extension for Alfresco called OpsMailmanager. In 2009 he started doing Alfresco consulting projects and has worked with customers such as Virgin Money, ITF, Unibet, and BNP Paribas.


    Book preview

    Alfresco 3 Business Solutions - Martin Bergljung


    Copyright © 2011 Packt Publishing

    All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

    Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

    Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

    First published: February 2011

    Production Reference: 1030211

    Published by Packt Publishing Ltd. 32 Lincoln Road Olton Birmingham, B27 6PA, UK.

    ISBN 978-1-849513-34-0

    www.packtpub.com

    Cover Image by Ed Maclean (edmaclean@gmail.com)

    Credits

    Author

    Martin Bergljung

    Reviewers

    Johnny Gee

    Sivasundaram Umapathy

    Adrián Efrén Jiménez Vega

    Acquisition Editor

    Steven Wilding

    Development Editor

    Maitreya Bhakal

    Technical Editors

    Arun Nadar

    Namita Sahni

    Aditi Suvarna

    Copy Editor

    Laxmi Subramanian

    Editorial Team Leader

    Aditya Belpathak

    Project Team Leader

    Lata Basantani

    Project Coordinator

    Leena Purkait

    Proofreader

    Mario Cecere

    Indexers

    Tejal Daruwale

    Hemangini Bari

    Production Coordinator

    Aparna Bhagat

    Cover Work

    Aparna Bhagat

    About the Author

    Martin Bergljung is a Principal ECM Architect at Ixxus, a UK platinum Alfresco partner. He has over 20 years of experience in the IT sector, where he has worked with the Java platform since 1997.

    Martin began working with Alfresco in 2007 developing an e-mail management extension for Alfresco called OpsMailmanager. In 2009 he started doing Alfresco consulting projects and has worked with customers such as Virgin Money, ITF, Unibet, and BNP Paribas.

    I would like to thank Steven Wilding at Packt Publishing for suggesting the project and getting it on track.

    A thanks goes also to Leena Purkait, my Project Coordinator, who was always pushing me to deliver the next chapter. My Development Editor Maitreya Bhakal was also very helpful the last couple of months by pushing me to get the chapters finished in time and in line with the final size of the book. Thank you also to the entire Packt Publishing team for working so diligently to help bring out a high quality product.

    Thanks to all the book reviewers who gave me invaluable feedback during the whole project. I must also thank the talented team of developers who created the Alfresco open source product. It opens up a new way for everyone that wants to build any kind of ECM business solution.

    I would like to thank Paul Samuel at Ixxus for supporting my book project.

    Finally, I would like to give a special thanks to Robin Bramley for contributing source code for the chapter about mobile applications, and to Michael Walton and Oliver Bradley who let me do a lot of work and research for the book when I was working for Opsera.

    About the Reviewers

    Johnny Gee is the Chief Technology Officer at Beach Street Consulting, Inc. In that role, he is responsible for architecting solutions for multiple clients across various industries and building Content Enabled Vertical Applications (CEVAs) on the Documentum platform. He has over 13 years of experience in ECM system design and implementation, with a proven record of successful ECM project implementations.

    In addition to earning his undergraduate degree in Aerospace Engineering from University of Maryland, Johnny achieved two graduate degrees—one in Aerospace Engineering from Georgia Institute of Technology, and another in Information Systems Technology from George Washington University.

    Johnny is an EMC Proven Professional Specialist in Application Development in Content Management and has helped co-author the EMC Documentum Server Programming certification exam. He holds the position of top contributor to the EMC Support Forums and is one of the twenty EMC Community Experts worldwide. He has been invited on multiple occasions to the EMC Software Developer Conference and has spoken at EMC World. He also has a blog dedicated to designing Documentum solutions.

    Johnny was the technical reviewer for Pawan Kumar's revision to Documentum Content Management Foundations: EMC Proven Professional Certification Exam E20-120 Study Guide. He was also a technical reviewer for Munwar Shariff's book on Alfresco 3 Web Content Management.

    Sivasundaram Umapathy is currently working as a Technical Architect with Sella Servizi Bancari, the IT division of Gruppo Banca Sella, Italy, where he is leading the organization's transition to Alfresco and Liferay technologies. He has a Post Graduate program in Software Enterprise Management (PGSEM) from IIM, Bangalore and an MS in Software Systems from BITS, Pilani. He has an array of certifications ranging from CGEIT, TOGAF 8, PMP, SCEA, OCA, SCBCD, SCWCD, and SCMAD to SCJP. He has co-authored SCMAD Exam Guide (ISBN 9780070077881) and been a technical reviewer of Head First EJB (ISBN 9780596005719). His current interests are Enterprise Architecture, IT Governance, IT-Business mismatch, Tech Startups, and Entrepreneurship. He can be reached at siva@sivasundaram.com or via his LinkedIn profile at http://bit.ly/sivasundaram

    Adrián Efrén Jiménez Vega works at the Center of Information Technologies (CTI) of the University of the Balearic Islands, in Mallorca (Spain). For four years, he has built and deployed various applications based on Alfresco.

    Since registering in the Alfresco Spanish forum approximately two years ago, he has dedicated time and openly shared his experience posting more than 600 messages, and contributed many practical solutions and useful hints for members of the Community. The 'mini-guides' he developed are now widely used and referenced among developers in Spain and Spanish speaking countries. He obtained the Alfresco Chumby Awards for Community Achievement in November 2008.

    He won the Web Script Developer Challenge with a Web Script solution to limit the space for users, including e-mail notification.

    He collaborated as technical reviewer for the book Alfresco 3 Enterprise Content Management Implementation (Packt Publishing) in 2009 and recently he has reviewed the book Alfresco 3 Web Services (Packt Publishing) in 2010.

    I would like to thank all the people who made possible my participation in this project. In particular I would thank my parents (despite the distance), my sister, and my friends at CTI.

    www.PacktPub.com

    Support files, eBooks, discount offers, and more

    You might want to visit www.PacktPub.com for support files and downloads related to your book.

    Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at service@packtpub.com for more details.

    At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

    http://PacktLib.PacktPub.com

    Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can access, read and search across Packt's entire library of books.

    Why Subscribe?

    Fully searchable across every book published by Packt

    Copy and paste, print and bookmark content

    On demand and accessible via web browser

    Free Access for Packt account holders

    If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view nine entirely free books. Simply use your login credentials for immediate access.

    This book is dedicated to my wife, Veronika, for always believing in me and accepting that her husband spent most weekends this year in front of the computer, and to my parents Sven-Erik and Irene, without you nothing would have been possible.

    Preface

    Alfresco is a renowned and multiple award-winning open source Enterprise Content Management System that allows you to build, design, and implement your very own ECM solutions. It offers much more advanced and cutting-edge features than its commercial counterparts with its modularity and scalability. If you are looking for quick and effective ways to use Alfresco to design and implement effective and world class business solutions that meet your organizational needs, your search ends with this book.

    Welcome to Alfresco 3 Business Solutions: Your practical and easy-to-use guide, which instead of teaching you just how to use Alfresco, teaches you how to live Alfresco. It will guide you through implementing real-world solutions through real-world scenarios. Each ECM problem is treated as a separate case study and has its own chapter, enabling you to uncover the practical aspects of an ECM implementation. You want more than just the theoretical details—you want practical insights to building, designing, and implementing nothing less than world class business solutions with Alfresco—and Alfresco 3 Business Solutions is your solution.

    This practical companion cuts short the preamble and you dive right into the world of business solutions with Alfresco.

    Learn all techniques, basic and advanced, required to design and implement different solutions with Alfresco in easy and efficient ways

    Learn all you need to know about document management

    Connect Alfresco with directory servers

    Learn how to use CIFS and troubleshoot all types of problems

    Migrate data when you have an existing network drive with documents and want to merge them into Alfresco

    Implement Business Process Design Solutions with Swimlane diagrams

    Easily extract content from Alfresco and build mashups in a portal such as Liferay

    Gain insights into mobile access and e-mail integration

    This book will teach you to implement all that and more, in real-world environments.

    What this book covers

    Chapter 1, The Alfresco Platform, introduces the architecture behind the repository and goes through important concepts such as store, node, and association. It describes the major features that are available such as rules, events, metadata extractors, transformers, subsystems, patches, and so on. It also explains the directory structure and database schema of an Alfresco installation.

    Chapter 2, The Alfresco APIs, presents the remote and embedded APIs that are available out of the box. It focuses on the embedded Foundation API and the JavaScript API and shows how to create, update, delete, and search for content.

    Chapter 3, Setting Up a Development Environment and a Release Process, shows you how to set up a development environment so you can build both Alfresco Explorer extensions and Alfresco Share extensions. It also covers how to manage a release process.

    Chapter 4, Authentication and Synchronization Solutions, describes how the authentication subsystem works and how to configure it for LDAP and Microsoft Active Directory. It also covers how to synchronize user and group data with these directories.

    Chapter 5, File System Access Solutions, teaches you what CIFS is, how the underlying technology works, and how to troubleshoot it. It will take you through different configurations on different platforms. It also covers WebDAV and how to use that instead of CIFS.

    Chapter 6, Document and Records Management Solutions, covers how to design folder hierarchies with permissions, rules, and custom metadata using a folder template. It shows a lot of examples on how to use the JavaScript API when implementing DM solutions. It also introduces the Alfresco RM module.

    Chapter 7, Content Model Definition Solutions, explains the XSD Schema/meta model that describes the Alfresco content modeling language. It shows a lot of examples on how to design custom content models on top of the out-of-the-box Alfresco content models and how to display the custom data in Alfresco Explorer and Alfresco Share. It also shows you a couple of design patterns that can be used for content modeling.

    Chapter 8, Document Migration Solutions, teaches you strategies for implementing a document migration solution. It explains the advantages and disadvantages of different import tools such as ACP files, CIFS, and the Alfresco Bulk Filesystem Import Tool.

    Chapter 9, Business Process Design Solutions, shows you how Swimlane diagrams can be used to design business processes. It explains how a task-naming convention can be useful to distinguish between tasks and how to design using phases and sub-processes.

    Chapter 10, Business Process Implementation Solutions: Part 1, introduces the JBoss jBPM workflow engine that is used by Alfresco. It takes you through implementing a simple workflow and introduces jPDL concepts such as task node and decision node.

    Chapter 11, Business Process Implementation Solutions: Part 2, digs deeper into implementing workflows with JBoss jBPM and introduces concepts such as fork node, join node, phases, and sub-processes.

    Chapter 12, Enterprise Application Integration (EAI) Solutions, shows you how to build portlets that fetch content from Alfresco via remote Web Script calls.

    Chapter 13, Types of E-mail Integration Solutions, describes different types of e-mail management solutions and goes through how to configure Alfresco's built-in IMAP service.

    Chapter 14, Mobile Phone Access Solutions, takes you through building a small mobile client with Groovy and Grails. The CMIS interface is used to fetch content from Alfresco.

    What you need for this book

    To build the examples in this book you will need JDK 6, Apache Ant, and Alfresco SDK 3.x. To run the examples, an Alfresco 3.x (including MySQL) installation is needed. Some examples (Chapter 4) require a directory server such as OpenLDAP or the Apache Directory Server. For the portal integration example (Chapter 12), you will need to install Liferay Portal 6 and download GWT 1.7 and GXT 2.2. For the mobile application example (Chapter 14), you will need to install Grails 1.3.x.

    Who this book is for

    This book is designed for system administrators and business owners who want to learn and implement Alfresco Business Solutions in their teams or business organizations. General familiarity with Java and Alfresco is required.

    Conventions

    In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.

    Code words in text are shown as follows: Next, we'll add a function to the CModel class that will allow us to set a given effect to any given mesh part.

    A block of code is set as follows:

    private LdapTemplate m_ldapTemplate;

    private String m_userBase;

    When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

    {namespace:typeName}>

    {description of file domain type}

    cm:cmobject

    ...

    Any command-line input or output is written as follows:

    18:28:39,842 INFO [management.subsystems.ChildApplicationContextFactory] Starting 'Authentication' subsystem, ID: [Authentication, managed, bestmoneyLDAP]

    New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: "This Swimlane diagram shows a subprocess called Work Process that is called from a parent process called Studio Process."

    Warnings or important notes appear in a box like this.

    Tips and tricks appear like this.

    Reader feedback

    Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.

    To send us general feedback, simply send an e-mail to feedback@packtpub.com, and mention the book title in the subject of your message.

    If there is a book that you need and would like to see us publish, please send us a note in the SUGGEST A TITLE form on www.packtpub.com or e-mail suggest@packtpub.com.

    If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.

    Customer support

    Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

    Downloading the example code for the book

    You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

    Errata

    Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/support, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded on our website, or added to any list of existing errata, under the Errata section of that title. Any existing errata can be viewed by selecting your title from http://www.packtpub.com/support.

    Piracy

    Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

    Please contact us at copyright@packtpub.com with a link to the suspected pirated material.

    We appreciate your help in protecting our authors, and our ability to bring you valuable content.

    Questions

    You can contact us at questions@packtpub.com if you are having a problem with any aspect of the book, and we will do our best to address it.

    Chapter 1. The Alfresco Platform

    Before we dive into implementing ECM solutions, we are going to have a look at the Alfresco platform. There are some key concepts and features that are important for us to know about before implementing anything on top of Alfresco.

    It helps to think about Alfresco as a big toolbox for building Content Management Systems (CMS). Alfresco, out of the box, can obviously be used straightaway but usually you want to configure it and customize it for the organization that is going to use it.

    This is important to think about, as otherwise you are missing the full potential of Alfresco. It enables organizations to tweak Alfresco, so that it works with their business processes and business rules. It does not impose a special way of working that the organization has to adopt. Instead, Alfresco adapts to the organization.

    In a lot of cases, organizations buy proprietary turnkey solutions that look really good out of the box with predefined content models, domain process definitions, business rules, and so on. However, after a while they usually realize that things are not working exactly as they want them to. Then they find that customizing the solution will cost far more than if they had built it from scratch, or that the functionality in the proprietary solution cannot be customized at all.

    In this chapter, you will learn:

    Important repository concepts

    How to use content rules

    What a metadata extractor is

    Why content transformers are used

    How to trigger custom code from events

    What a Servlet Command is

    What a subsystem is

    How the system can be bootstrapped in different ways

    What user interfaces are available

    About the directory structure created by the installation

    How to access content information directly from the database

    Throughout the book, we will be working with Best Money, a fictitious financial institution that offers financial products such as credit cards, loans, and insurance. Best Money wants to complete its range of financial products by offering personal banking products. It is therefore under pressure to improve efficiency by automating business processes, structuring document storage, classifying documents, implementing document lifecycles, improving the level of auditing, managing e-mail content, and meeting many other challenges in a complex business environment subject to heavy regulatory oversight.

    Best Money realizes that to do all of this it needs to put in place an Enterprise Content Management solution and it has selected Alfresco.

    Platform overview

    Alfresco is an open source content management system written entirely in Java that can be run in a standard Servlet container such as Apache Tomcat, or in a JEE server such as JBoss. The Alfresco platform is built using many third-party open source Java libraries, and it is good to know about these libraries as we will use many of them when building extensions and solutions.

    The platform has many Application Programming Interfaces (APIs) and configuration techniques that we can use to build custom solutions on top of Alfresco.

    The following figure gives an overview of the platform:

    The Alfresco-specific components, modules, and user interfaces are depicted in a lighter color and the third-party libraries are depicted in a darker color. The Alfresco platform is presented in a layered approach as follows:

    Repository: The bottom layer comprises the database, the search indices, and the content files.

    Java Platform: Everything runs on Java, so it is independent of hardware, operating systems, and also of databases as they are accessed via Hibernate.

    Core: This layer contains all of the modules and libraries used by Alfresco to implement the CMS functionality.

    APIs: The interface layer contains a variety of application programming interfaces that can be used to communicate with Alfresco both in-process and remotely.

    Sub-systems: This layer consists of self-contained components that extend the CMS with important functionality that often needs to be configured during installation. Sub-systems can be started and stopped while the server is running.

    Bootstrap: System integrators can use bootstrap extensions to perform a variety of tasks, for example, to import content or patch the content with some custom metadata.

    Modules: The modules usually extend the Alfresco system with some major extra functionality such as web content management or records management. We will use a module for all new custom functionality we implement for the Best Money ECM system.

    User Interfaces (UI): Alfresco comes with a number of user interfaces that can be used to upload and manage content.

    Now, let's have a detailed look at each layer starting with the Repository.

    Repository concepts and definitions

    Before doing any custom coding for Alfresco, it is important to get to know the concepts and definitions around the repository. We need to get familiar with things such as Store and Association. And what does it mean when we talk about a Node in the repository?

    Repository

    When we talk about Alfresco, we often refer to the Alfresco Repository. So what is the repository more specifically? The repository is an abstract term for where everything is stored in Alfresco. It is also often called just repo and one of the main packages in the source code is also called repo.

    The following figure gives you an overview of the Alfresco Repository:

    The repository consists of a hierarchy of different types of nodes, for example, folder nodes or leaf nodes representing a file. Each node except the top root node is associated with a parent node via a parent-child relationship. Nodes can also be related to each other via source-target associations (that is, peer associations). If the node represents a file, then it is also associated with a file in the filesystem. This is a somewhat simplified view of the repository, as each node actually resides in a store, as in the following figure:

    The repository is made up of a number of stores, and each one of them contains a node hierarchy. The nodes make up the metadata for the content and are stored in the database. The actual content itself, such as document files, is stored in the filesystem.
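
    The relationships described above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: RepoNode and its fields are hypothetical names invented for this sketch and are not Alfresco classes, and the file path in the usage example is made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the repository concepts; NOT part of the Alfresco API.
final class RepoNode {
    final String name;
    final RepoNode parent;                                 // null only for the top root node
    final List<RepoNode> children = new ArrayList<>();     // parent-child relationships
    final List<RepoNode> peerTargets = new ArrayList<>();  // source-target (peer) associations
    final String contentFilePath;                          // set only when the node represents a file

    RepoNode(String name, RepoNode parent, String contentFilePath) {
        this.name = name;
        this.parent = parent;
        this.contentFilePath = contentFilePath;
        if (parent != null) {
            parent.children.add(this);                     // register with the parent node
        }
    }

    // Walks up the parent chain; the root contributes an empty string.
    String path() {
        return parent == null ? "" : parent.path() + "/" + name;
    }
}
```

    For example, creating a root node, a folder node named Loans under it, and a file node named application.pdf under the folder gives the file node the path /Loans/application.pdf. Only the file node carries a reference to a physical file; the folder and root nodes are pure metadata.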

    Stores

    There are a couple of stores that you will come into contact with if you work with Alfresco for a while. First, we have the Working Store, which is the main store where metadata for all live content is contained; this store is often referred to as just DM (Document Management). This is the content that you can access from all the different user interface clients. The default behavior when something is deleted from the Working Store via any of the user interfaces is that both the content file and the content metadata end up in a store called the Archive Store.

    Content is not permanently removed from the disk when it is in the Archive Store. To physically remove the content from the disk, you need to manage content via the Admin user profile screen or configure a content store cleaner (http://wiki.alfresco.com/wiki/Content_Store_Configuration).

    If you turn on versioning for a document, then you will see some activity in the Version Store where all the previous versions of a piece of content will be stored. This is called the version history and there is one Node created per version history.

    The complete file for a previous version is stored and the system does not store the delta between versions.

    Whenever you install a new application module such as Records Management or Web Content Management, the data about this module is stored in the System Store. One example of the data stored there is the module version number.

    Finally, we have the Content Store, which contains all the physical files. Unlike the other stores, which live in the database, it lives on disk. It is called Content Store even though it does not behave like the stores that live in the database; it is more of an abstract term for where the physical content files are located.
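
    In Alfresco, a store is typically addressed by a reference made up of a protocol and an identifier, written as protocol://identifier. The sketch below parses such a reference. It is an assumption-laden illustration: StoreRefSketch is a hypothetical class written for this example, and the reference workspace://SpacesStore mentioned afterwards does not appear in this chapter, so check the references used by your own installation.

```java
// Hypothetical helper for illustration; it only mirrors the protocol://identifier form.
final class StoreRefSketch {
    final String protocol;    // for example "workspace" or "archive"
    final String identifier;  // for example "SpacesStore"

    StoreRefSketch(String protocol, String identifier) {
        this.protocol = protocol;
        this.identifier = identifier;
    }

    // Splits "protocol://identifier" and rejects anything that does not match that form.
    static StoreRefSketch parse(String ref) {
        int idx = ref.indexOf("://");
        if (idx <= 0 || idx + 3 >= ref.length()) {
            throw new IllegalArgumentException("Not a store reference: " + ref);
        }
        return new StoreRefSketch(ref.substring(0, idx), ref.substring(idx + 3));
    }

    @Override
    public String toString() {
        return protocol + "://" + identifier;
    }
}
```

    Parsing workspace://SpacesStore yields the protocol workspace and the identifier SpacesStore. A string without the :// separator is rejected with an IllegalArgumentException.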

    The Content Store

    So why are the physical files stored in the filesystem and not in the database as Binary Large Objects (BLOBs)? It would be easier to back up the whole system and also to set up replication if everything was in the database. And system administrators would not have to manage both database space and filesystem space.

    There are several reasons why content files are stored in the filesystem:

    Random access to files: One of the big advantages of Alfresco is that users can keep working the way they are used to by accessing Alfresco as a shared drive (that is, via the CIFS interface). This is only practical because Alfresco stores the files in the filesystem, where they can be randomly accessed (sometimes also referred to as direct access). Supporting frequent updating and reading of database BLOBs would slow down the performance of the CIFS interface to an unacceptable level.

    Real-time streaming: It is a lot easier to stream large content such as video and audio using files. A content file can be streamed directly back to the browser, as the file input stream is copied directly to the HTTP Response Output stream. If BLOBs were used, you would first have to read the BLOB and then create a temporary file to stream from. Also, when writing BLOBs to the database, a lot of databases require you to know the stream size when inserting the record, so a temp file needs to be created. Furthermore, some databases such as MySQL have problems sending very large binaries from the JDBC driver to the database.

    Standard database access: Most database systems support BLOBs with vendor-specific access methods and objects. These usually perform much better than the standard JDBC BLOB objects and access methods, so it would be difficult to use Hibernate to access BLOBs in a standard way for all databases without giving up performance. For example, if you wanted to manage BLOBs with Oracle, you would get the best performance using Oracle's own BLOB object. Also, the caching of BLOBs in databases is known to slow down the rest of the metadata access.

    Faster access: It is much faster to access content that is stored as files, which means that the user experience is much better and this leads to happier customers.
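
    The real-time streaming point above can be illustrated with a minimal sketch that copies a content file straight to an output stream without any intermediate temporary file. The class name FileStreamingSketch and the method streamContent are hypothetical, and a ByteArrayOutputStream stands in for the HTTP response stream; this is not Alfresco's own streaming code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration class; not part of the Alfresco API.
final class FileStreamingSketch {

    // Copies the bytes of a content file directly to the given output stream
    // and returns the number of bytes copied. No temporary file is needed
    // because the source is already a plain file on disk.
    static long streamContent(Path contentFile, OutputStream out) throws IOException {
        return Files.copy(contentFile, out);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a temporary file plays the role of a file in the content store.
        Path contentFile = Files.createTempFile("content-", ".bin");
        try {
            Files.write(contentFile, "example content".getBytes(StandardCharsets.UTF_8));

            // Assumption: in a web application this would be the response output stream.
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            long copied = streamContent(contentFile, response);
            System.out.println("Streamed " + copied + " bytes");
        } finally {
            Files.deleteIfExists(contentFile);
        }
    }
}
```

    With a BLOB-based design, the equivalent code would typically need an extra step to materialize the binary before it could be handed to the response stream.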

    Content Store policies

    Content Store policies let us decide what media the selected content is stored on. Quite often, content is relevant for a certain lifetime and then becomes obsolete, and content store policies provide a solution for this. We do not want to get rid of the content files, but we do want to store them on a cheaper, slower-access disk.

    So we might use very fast tier 1 storage, such as Solid-State Drives (SSDs), for our most important content files, and based on business policies that we control, gradually move the data to cheaper tier 2 drives such as Fibre Channel (FC) drives or Serial ATA drives as it becomes less important. In this way, we can manage the storage of content more cost-effectively.

    So you could have one part of the repository store its files on one disk and another part store its files on another disk. The following figure illustrates this:

    In the preceding figure, we can see that the system has been configured to store images on one disk and documents on another disk. This sort of content store configuration can be done with Content Store Selectors.
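
    The routing idea behind this kind of configuration can be sketched in a few lines of Java. This is only an illustration of the concept, assuming a simple rule based on the MIME type prefix; the class StoreRoutingSketch, its methods, and the directory paths are hypothetical, and the sketch does not show how Content Store Selectors are actually configured in Alfresco.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; not part of the Alfresco API.
final class StoreRoutingSketch {

    // Maps a MIME type prefix (for example "image/") to a store root directory.
    private final Map<String, String> rootByMimePrefix = new LinkedHashMap<>();
    private final String defaultRoot;

    StoreRoutingSketch(String defaultRoot) {
        this.defaultRoot = defaultRoot;
    }

    // Registers a rule: content whose MIME type starts with the prefix
    // is stored under the given root directory.
    void route(String mimePrefix, String rootDirectory) {
        rootByMimePrefix.put(mimePrefix, rootDirectory);
    }

    // Returns the root directory for the first matching rule,
    // or the default root when no rule matches.
    String rootFor(String mimeType) {
        if (mimeType != null) {
            for (Map.Entry<String, String> rule : rootByMimePrefix.entrySet()) {
                if (mimeType.startsWith(rule.getKey())) {
                    return rule.getValue();
                }
            }
        }
        return defaultRoot;
    }

    public static void main(String[] args) {
        // Assumption: both paths are made-up mount points for two different disks.
        StoreRoutingSketch routing = new StoreRoutingSketch("/mnt/disk1/contentstore");
        routing.route("image/", "/mnt/disk2/contentstore");

        System.out.println(routing.rootFor("image/png"));
        System.out.println(routing.rootFor("application/pdf"));
    }
}
```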

    The AVM Store

    One store that has not been mentioned so far is the special store introduced for the Alfresco WCM module. It is called the Advanced Versioning Manager (AVM) Store and it is modeled after Apache Subversion to be able to support extra features such as:

    File-level version control

    File-level branching

    Directory-level version control

    Directory-level branching

    Store-level version control (snapshots)

    Store-level branching

    These extra features are needed to be able to create user and staging sandboxes, so that web content can be created and previewed by each user independently. A staging environment is also supported where different snapshots of the website can be managed and deployed to production servers.

    There are some major differences between the Working Store (DM) and the AVM Store (WCM) that are good to know about when we are planning an ECM project. The following list explains some of the differences:

    Permissions can be set on object level in DM but only on Web Project level in WCM

    Types are defined with XML Schema files in WCM but with XML files in DM

    AVM folders do not support rules as in DM

    In WCM, we can search only one Web Project at a time, whereas in DM the complete repository is searchable

    E-mailing with SMTP or IMAP is not supported in WCM, but it is in DM

    Content can be cross-copied between the DM store and the AVM store in either direction.

    Work is under way to update Alfresco WCM so that it can use the normal Alfresco DM Working Store, which would let web content reside along with all other content and be searchable in the same way.

    Store reference

    When you work with the application interfaces, you often have to pass a so-called store reference into a method call. This store reference tells Alfresco what store you are working with. A store reference is constructed from two pieces: the protocol and an identifier.

    The protocol basically specifies what store you are interested in. For example, two of the protocols are workspace and archive. You also need an identifier to create a store reference. It tells us what kind of store it is, for example, whether it contains spaces or keeps version information. Most of the time we are accessing a store with folders (that is, spaces) and the identifier is then called SpacesStore.

    So if you wanted to access the Working Store from the previous figures, you would create the following store reference: workspace://SpacesStore. And this is the store that you will use most of the time.

    The following is a complete list of store references:

    workspace://SpacesStore: Contains the live content; this is the store reference that will be used in most situations

    workspace://lightWeightVersionStore: Version history for content

    workspace://Version2Store: Next-generation version history for content

    archive://SpacesStore: Archived files (that is, deleted files)

    user://alfrescoUserStore: User management

    system://system: Installed modules information

    avm://sitestore: Alfresco WCM content
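
    The protocol://identifier structure of the store references listed above can be made concrete with a small sketch. The class StoreRefSketch is a hypothetical stand-in written for illustration only; it is not Alfresco's own store reference class, which should be used when working against the real application interfaces.

```java
// Hypothetical illustration class; not part of the Alfresco API.
final class StoreRefSketch {

    static final String SEPARATOR = "://";

    final String protocol;    // for example "workspace" or "archive"
    final String identifier;  // for example "SpacesStore"

    StoreRefSketch(String protocol, String identifier) {
        this.protocol = protocol;
        this.identifier = identifier;
    }

    // Splits a string such as "workspace://SpacesStore" into its two pieces.
    static StoreRefSketch parse(String text) {
        int idx = text.indexOf(SEPARATOR);
        // Reject input with a missing separator, an empty protocol, or an empty identifier.
        if (idx <= 0 || idx + SEPARATOR.length() >= text.length()) {
            throw new IllegalArgumentException("Not a store reference: " + text);
        }
        return new StoreRefSketch(text.substring(0, idx),
                                  text.substring(idx + SEPARATOR.length()));
    }

    @Override
    public String toString() {
        return protocol + SEPARATOR + identifier;
    }

    public static void main(String[] args) {
        StoreRefSketch live = parse("workspace://SpacesStore");
        System.out.println(live.protocol + " / " + live.identifier);

        StoreRefSketch archive = new StoreRefSketch("archive", "SpacesStore");
        System.out.println(archive);
    }
}
```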

    Nodes

    Each store in the repository contains nodes and every piece of content that is saved in the repository is represented by a node. This can be a document, a folder, a forum, an e-mail, an image, a person, and so on. Everything in a store is a node. A node is stored in the database and contains the following metadata for a piece of content:

    Type: A node can be of one type.

    Aspects: A node can have many aspects.

    Properties: A node can have properties defined in the type or the aspects. One of the properties points to the actual physical content file.

    Permissions: Permissions for this node.

    Associations: Associations to other nodes.

    Each node is of a certain Type such as Folder, Content, VersionHistory, Forum, and so on. Each type can have one or more properties associated with it. A node can only be of one type, so a Folder cannot also be a VersionHistory. This may seem obvious, but it is worth stating so that there are no misunderstandings when we start creating custom types.

    So what if we wanted to have properties from two different types associated with a node? We would use something called an aspect. A node can be associated with more than one aspect. Examples of aspects are Versionable, Emailed, and Auditable. This means that an MS Word document could be of type Content and also be Versionable and Emailed.

    The following figure shows a folder node called User Guides that contains one document called userguide.pdf, which in turn is associated with an image node called logo.png:

    Some nodes such as folder nodes are not associated with any content; they just contain metadata, permission settings, and an association to the parent folder.
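
    The rule that a node has exactly one type but any number of aspects, with properties coming from either, can be modeled with a minimal sketch. The class NodeSketch and its methods are hypothetical and exist only to illustrate the data model; they do not mirror how Alfresco stores nodes or exposes them through its services.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; not part of the Alfresco API.
final class NodeSketch {

    private final String type;                                   // exactly one type, fixed at creation
    private final Set<String> aspects = new LinkedHashSet<>();   // zero or more aspects
    private final Map<String, Object> properties = new LinkedHashMap<>();

    NodeSketch(String type) {
        this.type = type;
    }

    String getType() {
        return type;
    }

    void addAspect(String aspect) {
        aspects.add(aspect);
    }

    boolean hasAspect(String aspect) {
        return aspects.contains(aspect);
    }

    void setProperty(String name, Object value) {
        properties.put(name, value);
    }

    Object getProperty(String name) {
        return properties.get(name);
    }

    public static void main(String[] args) {
        // A document of type Content that is also Versionable and Emailed.
        NodeSketch document = new NodeSketch("Content");
        document.addAspect("Versionable");
        document.addAspect("Emailed");
        document.setProperty("Name", "userguide.pdf");

        System.out.println(document.getType() + " node named " + document.getProperty("Name"));
        System.out.println("Versionable? " + document.hasAspect("Versionable"));
    }
}
```

    Because the type is a final field while aspects live in a set, the sketch makes it impossible for a node to be both a Folder and a VersionHistory, yet easy to attach further aspects later.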

    Root node

    Every node except one has to have a parent node. The exception is the top-level node in the store, which is called the store root because it does not have a parent. The root node has an aspect applied to it called sys:aspect_root. It might look like a good idea to search for this aspect to get to the root node in a store, but it does not work as there are other nodes, such as the root node for categories, that also have this aspect set.

    An easy way to get to the root node in any of the stores is to do a Lucene search with PATH:/ or if we are using the Java Foundation Service API, we can use the Node Service to get the root node for a particular Store Reference.

    Node reference

    So, we have heard about all these nodes and seen how they can have properties, and so on. But how can one uniquely identify one node in the repository? This is where node references come into the picture. They are used to identify a specific node in one of the stores in the repository. You construct a node reference by combining a Store Reference such as workspace://SpacesStore with an identifier.

    The identifier is a Universally Unique Identifier (UUID) and it is generated automatically when a node is created. A UUID looks something like this: 986570b5-4a1b-11dd-823c-f5095e006c11 and it represents a 128-bit value. A complete Node Reference looks like workspace://SpacesStore/986570b5-4a1b-11dd-823c-f5095e006c11.

    The node reference is one of the most important concepts when developing custom behavior for Alfresco as it is required by a lot of the application interface methods.
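
    How a node reference combines a store reference with a UUID can be shown with a short sketch. The class NodeRefSketch is a hypothetical stand-in, not Alfresco's own node reference class, and the use of java.util.UUID.randomUUID() to produce a new identifier is an assumption made for illustration rather than a statement about how Alfresco generates its identifiers.

```java
import java.util.UUID;

// Hypothetical illustration class; not part of the Alfresco API.
final class NodeRefSketch {

    final String storeRef;  // for example "workspace://SpacesStore"
    final String id;        // for example "986570b5-4a1b-11dd-823c-f5095e006c11"

    NodeRefSketch(String storeRef, String id) {
        this.storeRef = storeRef;
        this.id = id;
    }

    // Creates a reference with a freshly generated identifier (assumption: random UUID).
    static NodeRefSketch newIn(String storeRef) {
        return new NodeRefSketch(storeRef, UUID.randomUUID().toString());
    }

    // Splits "protocol://identifier/uuid" at the last slash.
    static NodeRefSketch parse(String text) {
        int schemeEnd = text.indexOf("://");
        int lastSlash = text.lastIndexOf('/');
        // The last slash must come after a non-empty store identifier
        // and must be followed by a non-empty node identifier.
        if (schemeEnd <= 0 || lastSlash <= schemeEnd + 3 || lastSlash == text.length() - 1) {
            throw new IllegalArgumentException("Not a node reference: " + text);
        }
        return new NodeRefSketch(text.substring(0, lastSlash), text.substring(lastSlash + 1));
    }

    @Override
    public String toString() {
        return storeRef + "/" + id;
    }

    public static void main(String[] args) {
        NodeRefSketch ref = parse("workspace://SpacesStore/986570b5-4a1b-11dd-823c-f5095e006c11");
        System.out.println("Store: " + ref.storeRef);
        System.out.println("Id:    " + ref.id);

        // The identifier from the text is a well-formed UUID, so this call succeeds.
        UUID uuid = UUID.fromString(ref.id);
        System.out.println("UUID version: " + uuid.version());

        System.out.println(newIn("workspace://SpacesStore"));
    }
}
```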

    Node properties

    Properties contain all the information about the Node and are often referred to as metadata. When you create a new node of a certain type, such as Content, a default set of properties is set. These are properties such as Name, Created Date, Author, and so on.

    What properties are set, and if they are set automatically, depends on the MIME type of the content that is being added to the repository.

    Multipurpose Internet Mail Extensions (MIME) is an Internet standard that extends the format of e-mail to support message bodies with multiple parts, text in character sets other than ASCII, non-text attachments, header information in non-ASCII character sets, and so on.

    The MIME type is these days referred to as a content type after the header name Content-Type. But in the Alfresco environment they are called MIME types.

    Some MIME types, such as the one for an MS Word document, have so-called Metadata Extractors available that automatically extract properties from the content when it is added. So when we add an MS Word document to the repository, we will see that some properties have been filled in automatically by the metadata extraction.

    Properties can be defined either as part of a type or as part of an aspect. When you list the properties in the UI, it does not show what type or aspect they belong to. You have to programmatically query the system to find out what properties belong to an aspect or type.

    What if we wanted to add a property called Name but it is already defined for type Content,
