
The Oracle Coherence 12c Cookbook

David Whitmarsh with Phil Wheeler


November 20, 2014

Copyright © 2013, 2014
David Whitmarsh and Phil Wheeler
Published by Shadowmist Ltd, Hassocks, West Sussex, UK
ISBN: 978-0-9931260-1-7
This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.
org/licenses/by/4.0/ or send a letter to Creative Commons, PO Box 1866,
Mountain View, CA 94042, USA.
Oracle and Java are registered trademarks of Oracle and/or its affiliates.
Other names may be trademarks of their respective owners.

Contents

Preface

1 Introduction
   1.1 About This Book
   1.2 About the product
   1.3 Testing

2 Integration Recipes
   2.1 Introduction
   2.2 Write Your Own Main Class
   2.3 Build a CacheFactory with Spring Framework
       2.3.1 Ordered Hierarchy of Application Contexts
       2.3.2 Preventing Premature Cluster Startup
       2.3.3 Waiting for Cluster Startup to Complete
       2.3.4 Application Context in a Class Scheme
       2.3.5 Putting it Together: the Main Class
       2.3.6 Avoid Static CacheFactory Methods
       2.3.7 Rewiring Deserialised Objects
       2.3.8 Using Spring with Littlegrid
       2.3.9 Other Strategies
   2.4 Linking Spring and Coherence JMX Support
   2.5 Using Maven Repositories
       2.5.1 Install and Extract Jars
       2.5.2 Applying Oracle Patches
       2.5.3 Remove Default Configuration
       2.5.4 Select Maven Co-ordinates

3 Serialisation
   3.1 Introduction
   3.2 Tips and Tricks with POF
       3.2.1 Use Uniform Collections
       3.2.2 Put Often Used Fields First
       3.2.3 Define Common Extractors as Static Member Variables
       3.2.4 Determining Serialised Object Sizes
   3.3 Testing POF Serialisation
   3.4 Polymorphic Caches
       3.4.1 Value Objects of Different Types
       3.4.2 Value Objects with Properties of Different Types
   3.5 Codecs
       3.5.1 Serialiser Codec
       3.5.2 Serialise Enum by Ordinal
       3.5.3 Enforcing Collection Type
   3.6 Testing Evolvable
       3.6.1 Testing with binary serialised data
       3.6.2 Testing with Duplicated Classes
       3.6.3 Classloader-based Testing
   3.7 Implementing Custom Serialisation
       3.7.1 Setup
       3.7.2 A Coherence Serialiser for Google Protocol Buffers
       3.7.3 Custom Serialisation and EntryExtractor
   3.8 Avoiding Deserialisation
       3.8.1 Tracking Deserialisation While Testing
       3.8.2 Avoid Reflection Filters
       3.8.3 Consider Member Failure

4 Queries
   4.1 Introduction
       4.1.1 Useful Idioms
   4.2 Projection Queries
       4.2.1 Projection Queries
       4.2.2 Covered Indexes
       4.2.3 DeserializationAccelerator
   4.3 Conditional Indexes
       4.3.1 Conditional Index on a Polymorphic Cache
   4.4 Querying Collections
       4.4.1 A Collection Element Extractor
       4.4.2 A POF Collection Extractor
       4.4.3 Querying With The Collection Extractor
       4.4.4 Derived Values
   4.5 Custom Indexes
       4.5.1 IndexAwareFilter on a SimpleMapIndex
       4.5.2 A Custom Index Implementation
       4.5.3 Further Reading

5 Grid Processing
   5.1 Introduction
       5.1.1 EntryAggregator vs EntryProcessor
       5.1.2 Useful Idioms
   5.2 Void EntryProcessor
   5.3 Keeping the Service Guardian Happy
   5.4 Writing a custom aggregator
   5.5 Exceptions in EntryProcessors
       5.5.1 Setting up the examples
       5.5.2 Exception When Invoking with a Filter
       5.5.3 Exception When Invoking with a Set of Keys
       5.5.4 When Many Exceptions Are Thrown
       5.5.5 How To Manage Exceptions in an EntryProcessor
       5.5.6 Return Exceptions
       5.5.7 Invoking Per Partition
       5.5.8 Use an AsynchronousProcessor
   5.6 Using Invocation Service on All Partitions
       5.6.1 Create the Invocable
       5.6.2 Test setup
       5.6.3 A Sunny Day Test
       5.6.4 Testing Member Failure During Invocation
       5.6.5 Other Variations
   5.7 Working With Many Caches
       5.7.1 Referencing Another Cache From an EntryProcessor
       5.7.2 Partition-Local Atomic Operations
       5.7.3 Updating Many Entries
       5.7.4 Backing Map Queries
       5.7.5 Backing Map Deadlocks
   5.8 Joins
       5.8.1 Many-to-One Joins
       5.8.2 One-to-Many Joins
       5.8.3 Join Using a ValueExtractor
       5.8.4 Joins Without Key Affinity
       5.8.5 Further Reading

6 Persistence
   6.1 Introduction
       6.1.1 Expiry and Eviction
       6.1.2 Thread Model
       6.1.3 Read-Through and Write-Through
       6.1.4 Read-Ahead
       6.1.5 Write-Behind
       6.1.6 Consistency Model
   6.2 A JDBC CacheStore
       6.2.1 The Example Table
       6.2.2 Loading
       6.2.3 Erasing
       6.2.4 Updating
       6.2.5 Testing with JDBC and Littlegrid
   6.3 A Controllable Cache Store
       6.3.1 Using Invocation
       6.3.2 Wiring the Invocable with Spring
       6.3.3 Using A Control Cache
       6.3.4 Wiring the CacheStore with Spring
       6.3.5 Decorated Values Anti-pattern
   6.4 Error Handling in a CacheStore
       6.4.1 Mitigating CacheStore Failure Problems
       6.4.2 Handling Exceptions In A CacheStore
       6.4.3 Notes on Categorising Exceptions
       6.4.4 Limiting the Number of Retries
   6.5 Priming Caches
       6.5.1 Mapping Keys to Partitions
       6.5.2 Testing the Prime
       6.5.3 Increasing Parallelism
       6.5.4 Ensuring All Partitions Are Loaded
       6.5.5 Interaction With CacheStore and CacheLoader
   6.6 Persist Without Deserialising
       6.6.1 Binary Key And Value
       6.6.2 Character Encoded Keys

7 Events
   7.1 Introduction
       7.1.1 A Coherence Event Taxonomy
   7.2 Have I Lost Data?
       7.2.1 A Lost Partition Listener
   7.3 Event Storms
   7.4 Transactional Persistence
       7.4.1 Synchronous or Asynchronous operation
       7.4.2 Implementation
       7.4.3 Catching Side-effects
       7.4.4 Database Contention
   7.5 Singleton Service
       7.5.1 MemberListener
       7.5.2 Instantiating the Service
       7.5.3 The Leaving Member

8 Configuration
   8.1 Introduction
       8.1.1 Operating System
       8.1.2 Hardware Considerations
       8.1.3 Virtualisation
       8.1.4 Breaking These Rules
   8.2 Cache Configuration Best Practices
       8.2.1 Avoid Wildcard Cache Names
       8.2.2 User-defined Macros in Cache Configuration
       8.2.3 Avoid Unnecessary Service Proliferation
       8.2.4 Separate Service Definitions and Cache Templates
       8.2.5 Minimise Duplication in Configuration for Different Roles
       8.2.6 Service Parameters
       8.2.7 Partitioned backing map
       8.2.8 Overflow Scheme
       8.2.9 Beware of large high-units settings
   8.3 Operational Configuration Best Practice
       8.3.1 Use the Same Operational Configuration In All Environments
       8.3.2 Service Guardian Configuration
       8.3.3 Specify Authorised Hosts
       8.3.4 Use Unique Multicast Addresses
   8.4 Validate Configuration With A NameSpaceHandler
   8.5 NUMA
   8.6 Eliminate Swapping
       8.6.1 Mitigation Strategies
       8.6.2 Prevention Strategy

Appendix A Dependencies
Appendix B Additional Resources

Preface
This book started with a contact from a publisher looking for someone to write it. I discussed it with Phil Wheeler and we decided to give it a go, but personal commitments and pressure of work meant that we felt we would not be able to fit in with the publisher's timetable. However, it sounded like an interesting challenge, so we decided to go ahead anyway. Now, about eighteen months later, the book is complete after many hours' work, mostly on Southern trains on the London to Brighton line, and a few on Brighton Belle¹ while sailing from Brighton to Viveiro in Spain. Much smaller in scope than our original conception, whole chapters have been sacrificed to the expediency of getting something out there before it is obsolete: chapters on security, extend, WAN and replication. A few months' more work would have meant I could produce a more polished book, more diagrams, perhaps make some of the explanatory text clearer, add the missing chapters, and so on. But then, it would be the Oracle 13c cookbook and I'd have to rework all the existing material. Time to publish something now, even if imperfect.

¹ An Oyster 55 ketch owned by the Brighton Belle Sailing Club, brightonbelle.org, open for membership applications or sail as a guest.
Most of the concepts and recipes in this book have been used in some form
in real production projects, though while developing the example code to
demonstrate them it was sobering to discover just how much of what I
thought I knew proved to be incorrect, or at least incomplete.
Personal reasons have meant that Phil has not been able to provide as much
material as we had originally hoped, though his help and encouragement have
been invaluable, not least in figuring out how to use some of the features of
LaTeX.
There are many people who I'd like to thank for their help and encouragement in writing this book. Too many to name them all, but I would like to particularly thank Dave Felcey of Oracle and Andrew Wilson for their encouragement and support.
This book is available as a free download in PDF format under a Creative Commons Attribution license. I only ask that if you find it useful, you make a donation to WaterAid based on how much a similar book might have cost:

www.justgiving.com/coherencecookbook

WaterAid transforms people's lives: providing water and sanitation leads to improvements in health and prosperity in communities across the globe. You can find out more about WaterAid and the work they do at their website, www.wateraid.org.

All profits from publication of this book through any commercial channel will also be donated to WaterAid.

David Whitmarsh
david@cohbook.org
November 2014

Chapter 1

Introduction

1.1 About This Book

This is a book about Oracle Coherence aimed at developers and architects who already have some limited knowledge of the product. The intention is to help the reader progress from a basic knowledge of Coherence to a more detailed understanding of how it works, what some best practices for using it are, and which patterns to avoid and why.
If you have read, or have an understanding equivalent to that provided in
the book Oracle Coherence 3.5 by Aleksandar Seovic, then I hope that you
will find this volume useful.
The book is structured as a cookbook with many distinct recipes that may be
read largely independently of one another, though some sections in a chapter
build on prior sections or occasionally reference other chapters. I have tried
to indicate where familiarity with another section would be useful.
This is emphatically not a sales manifesto for Oracle Coherence. I am not
trying to promote or recommend the product, but I hope the background
material included may help an informed assessment of whether Coherence
is appropriate for your project. I have tried to be objective, although my
assessment of best practice and my view of some features are coloured by my
personal opinions on good design and software engineering practice.

The example code to accompany this book may be downloaded in zip or tgz
format from
www.cohbook.org/cookbook/code-examples.zip
www.cohbook.org/cookbook/code-examples.tgz

1.2 About the product

Oracle Coherence is one of the most mature, if not the most mature, data grid products on the market today. This brings some strengths, in that it has been well tried and tested in some very demanding applications over a number of years, but it also brings a certain amount of baggage, with an API that has grown organically, and not always in the most elegant and consistent fashion. It is also rather expensive.

I consider the particular strengths of Coherence to be in the way it is engineered at a low level. A considerable amount of effort has gone into ensuring that it performs well and reliably even under heavy load. I am aware of one company where one of Coherence's competitors was replaced by Coherence because throughput plummeted when the number of updates requested rose above a threshold¹. Coherence's ability to deliver large numbers of events to many clients with minimal reduction in throughput is another of Coherence's strengths that I have seen working in practice. These are the product's strengths.

Amongst its weaknesses is an API that can most kindly be described as idiosyncratic. The boundary between internal and public APIs is not well defined, sometimes leading to unpleasant surprises when new releases appear, sometimes even with point releases. The heavy use of static initialisation and system properties makes the Coherence cluster member in your application an evil singleton, leading to significant overheads in effectively testing your code. A particular concern is that while it is very easy to get started with Coherence, it is also very easy to make any one of a number of design and implementation errors that can seriously compromise the reliability of the cluster (the vast majority of incidents of data loss are caused by design, management, or monitoring errors rather than by issues with Coherence's resilience or the reliability of the underlying platform). It is therefore essential on any but the most trivial project to bring in someone with expertise². And for those most trivial projects, you should probably be looking at a cheaper option anyway.

It also has to be said that, having been subsumed into the commercial behemoth that is Oracle Corporation, the Coherence product has suffered somewhat from the influence of marketecture. What Oracle judges to be their commercial priority does not always align with the needs of us mere users of the product, from the priority given to closer integration with WebLogic rather than more urgent issues such as built-in replication support, to the arbitrary jump in version number and the replacement of the relatively straightforward installation procedure of version 3.7 with the arcane procedure of version 12.1.2 onwards³.

¹ This was long enough ago that it would not be fair to name the competitor; they may have remedied the problem.
² Contact david@cohbook.org for rates and availability.
³ Light the candles, ring the bell, read from the book...

One final positive note: many of the original engineering team from Tangosol have stayed with Coherence since they were bought by Oracle, and have been supplemented by some very able individuals. Both formally through the technical support process and informally through personal contacts, I have always found them to be responsive and competent.

1.3 Testing

The evil singleton pattern means that it is not a simple matter to test cluster-based operations in a single JVM, and to solve this problem at least four separate projects have arisen:
Project        Author                           URL
Oracle Tools   Brian Oliver, Jonathan Knight    https://github.com/coherence-community/oracle-tools
GridMan        Jonathan Knight, Andrew Wilson   obsolete
GridLab        Alexey Ragozin                   https://code.google.com/p/gridkit/wiki/GridLab
Littlegrid     Jonathan Hall                    http://www.littlegrid.org/

Of these, GridMan has been subsumed into Oracle Tools since one of the authors, JK, has joined Oracle. Each of them will enable testing of a cluster in a single JVM by starting each member in its own classloader. Oracle Tools and GridLab also support automated startup of members, each in its own JVM. GridLab is also intended to become a general-purpose cluster deployment and management tool. Littlegrid is in many respects the least feature-rich, but the best documented and easiest to use, hence the basis for tests in this book and the downloadable examples.
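To give a flavour of the approach taken in later chapters, here is a minimal sketch of a Littlegrid-based JUnit test. The builder calls shown are assumptions based on littlegrid's published API rather than code taken from this book's downloadable examples.

public class SimpleClusterTest {

    private ClusterMemberGroup memberGroup;

    @Before
    public void setUp() {
        // start two storage-enabled members in this JVM, each in its own classloader,
        // and configure this test's classloader as a storage-disabled client
        memberGroup = ClusterMemberGroupUtils.newBuilder()
                .setStorageEnabledCount(2)
                .buildAndConfigureForStorageDisabledClient();
    }

    @Test
    public void putAndGet() {
        NamedCache cache = CacheFactory.getCache("test");
        cache.put("key", "value");
        assertEquals("value", cache.get("key"));
    }

    @After
    public void tearDown() {
        // shut down the CacheFactory before stopping the cluster members
        ClusterMemberGroupUtils.shutdownCacheFactoryThenClusterMemberGroups(memberGroup);
    }
}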

Chapter 2

Integration Recipes

2.1 Introduction

This first chapter of recipes covers integration and configuration, in the sense of how to build and run a Coherence node instance. This is, superficially at least, extremely simple: static initialisation and extensive use of default settings within Coherence make it trivially easy to run up a JVM and create and use a cache. As soon as you start trying to apply sound, current software engineering practice, however, many issues arise. Here we try to highlight some of these and offer some suggestions for techniques to work around them.

Many of the examples are illustrated using Spring Framework. We do not assert that this is the only, or even the best, tool for the job, but it is the most widely used and understood by the development community. The patterns we illustrate could equally be implemented with other configuration frameworks, or in plain Java code; Spring simply allows us to be less verbose in our examples.

We have specifically not covered the use of the incubator patterns. There is some overlap with our examples, but largely the principles we espouse are orthogonal to their use. We also specifically do not consider the Coherence container and associated .gar deployment structure, based on integrating Coherence with a stripped-down WebLogic instance. While this may have appeal where WebLogic is already in use, we find the case for it far from compelling; it is perhaps useful if your organisation already has infrastructure in place for managing WebLogic.


Listing 2.1: A simple storage node main method
public static void main(String[] args) throws IOException {
    loadCoherenceProperties();
    new DefaultCacheServer(
            CacheFactory.getConfigurableCacheFactory()).startAndMonitor(5000);
}

private static void loadCoherenceProperties() throws IOException {
    Properties newSystemProperties = System.getProperties();
    InputStream is = new FileInputStream("cluster.properties");
    newSystemProperties.load(is);
    is = new FileInputStream("storagenode.properties");
    newSystemProperties.load(is);
    System.setProperties(newSystemProperties);
}

2.2 Write Your Own Main Class

Objective
Discuss the benefits of providing a Main class, rather than using
DefaultCacheServer, illustrated with simple examples.
Coherence relies heavily on static initialisation and configuration by system
properties. We would often like to set these configuration options programmatically, but once Coherence gets into its initialisation routines, those properties have been read and it is too late to change them. The simple answer is
to provide a main class and set properties in the main method before calling
any Coherence methods.
To start a storage-enabled node, we might use something like listing 2.1. By separating those properties that are common across the entire cluster (e.g. tangosol.coherence.clusteraddress, tangosol.coherence.ttl) from those specific to a particular type of node (e.g. tangosol.coherence.distributed.localstorage) into separate files, cluster.properties and storagenode.properties, we ease maintenance and deployment. More generally, we could use the value of the system property tangosol.coherence.role to construct the node-specific property file name.
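As a minimal sketch of that idea (the file naming convention is an assumption for illustration, not code from the downloadable examples), the loadCoherenceProperties method of listing 2.1 might become:

private static void loadCoherenceProperties() throws IOException {
    Properties newSystemProperties = System.getProperties();
    try (InputStream is = new FileInputStream("cluster.properties")) {
        newSystemProperties.load(is);
    }
    // e.g. a role of "storagenode" selects storagenode.properties
    String role = System.getProperty("tangosol.coherence.role", "storagenode");
    try (InputStream is = new FileInputStream(role + ".properties")) {
        newSystemProperties.load(is);
    }
    System.setProperties(newSystemProperties);
}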
The DefaultCacheServer.startAndMonitor(int interval) method will start all of the Coherence services (to be precise, those defined with autostart true in the cache configuration) and spawn a non-daemon thread that will check service status every interval milliseconds, restarting any that have failed. The process will continue to run until explicitly shut down or killed. For a non-storage node that performs application services, we generally prefer the application logic to be in control of the process lifecycle, whether it be a user-facing GUI, a message-driven server process, or an embedded web application. We may choose to allow our first call to a Coherence method to instantiate Coherence services on demand, or we could call the DefaultCacheServer.startDaemon() method in our main class (or in ServletContext initialisation for a web application). It is arguable whether it is appropriate to use DefaultCacheServer.startAndMonitor even for a storage node; if services die it may be wiser to terminate and restart the entire member.
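For a non-storage member where application code owns the process lifecycle, a minimal sketch might look like the following; runApplication is a hypothetical stand-in for whatever drives the process.

public static void main(String[] args) throws IOException {
    loadCoherenceProperties();
    // start the autostart services on daemon threads;
    // the application logic owns the process lifecycle from here on
    DefaultCacheServer.startDaemon();
    runApplication();
}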

2.3 Build a CacheFactory with Spring Framework

Objective
Illustrate a way of cleanly instantiating cluster nodes using Spring
Framework. The principles should be equally applicable to other IoC
containers. In particular, we discuss the subtle problems that can arise
from lazy instantiation of objects as Coherence services start.
Prerequisites
An understanding of the Spring BeanFactory and of Coherence initialisation
Code examples
Code samples are found in the configuration module of the downloadable code examples
Dependencies
The examples use Spring Framework
The internals of the cluster startup mechanism are complex and include the following steps:

- Reading and validating configuration files and system properties
- Instantiating the singleton cache factory builder and various other static initialisations
- Starting the cluster service, locating and negotiating with other cluster members to establish membership of a cluster
- Starting each configured service, which itself involves a number of steps for each such service:
  - Establishing membership of the service with other cluster members running the same service (not all members run all services)
  - Starting worker thread pools
  - For partitioned services, locally instantiating existing caches and transferring partitions from other members
  - Lazily instantiating any configured objects (listeners, cache stores etc.) needed for these caches

That last step can be problematic. Instantiation of dependent objects can be time-consuming as external resources such as database connections are established, yet they are performed lazily on demand from a pool of worker threads. When starting a new, empty cluster this tends to all proceed in an orderly fashion. There are no caches until explicitly created. Empty partitions are all assigned before any caches are created, or are created empty.

Joining a new storage-enabled member to an existing, running cluster can in practice be a much more dynamic and disorderly process with everything happening at once, with a greater chance of synchronisation issues and race conditions manifesting in some misbehaviour. These can be very difficult to identify, diagnose, and test for. For this reason I strongly advocate instantiating all classes that Coherence depends on in a JVM before starting the cluster.
There is a fundamental conflict between the Coherence way of doing things and the philosophy behind IoC frameworks like Spring. Coherence is somewhat schizophrenic in the design of its API: is it a component, to be embedded within and provide services to the larger application, or is it itself a framework, to control and manage the construction of components that perform application tasks? I, and I believe the majority of users, take the former viewpoint: we would prefer to allow our IoC framework to manage the construction of the application. The first obstacle is the manner in which Coherence seeks to obtain instances of classes such as CacheStore or MapListener. We may choose to construct instances, or obtain them from a static factory method using <class-scheme>, where we would prefer to obtain them from the Spring BeanFactory.
If you wish to use Spring to build your application, your first Google search on the topic will lead you to SpringAwareCacheFactory. This has the convenience of allowing you to access bean instances by name from a BeanFactory associated with the CacheFactory instance, but it does rely on incubator code, with all the caveats that that implies. Here we describe an approach that does not require SpringAwareCacheFactory, though it is compatible with it. To follow the advice given above to instantiate dependencies before starting Coherence, avoid declarative configuration of SpringAwareCacheFactory; always construct it programmatically using a BeanFactory constructed in advance, rather than by providing a Spring XML configuration file. Otherwise you may encounter the following issues:
- Race conditions as many Coherence service threads concurrently attempt to obtain bean instances may have unpredictable effects, as nodes that appear to have started successfully cause backlogs on services when thread deadlocks occur in obtaining bean instances¹.
- Failures in constructing beans may occur while your cluster node is partly running, having joined some services but not yet fully able to participate in them. The debris in log messages, membership changes, and partition reassignments as the node shuts down again is at best confusing and at worst destabilising².
- Recursive instantiation, as a Spring bean calls a Coherence API, which tries to access a Spring bean, which fails to instantiate because you're already instantiating a bean further up the stack³.

¹ Seen in more than one production system.
² This one too.
³ Hopefully you'd pick this one up while still in development.
More recent, and using the Coherence 12.1.2 namespace support, is the Coherence Spring Integration project at https://java.net/projects/cohspr. This project provides a neat way of referencing Coherence objects from Spring bean definitions, as well as beans within Coherence cache configurations, but it does require that the application context itself is created by Coherence and is therefore, in its current form, incompatible with the approach described in this section.

2.3.1 Ordered Hierarchy of Application Contexts

Some beans we would like to create before initialising Coherence: those that Coherence itself will use, such as CacheStore instances or backing map listeners.

Other beans belonging to our application logic may have direct or transitive
runtime dependencies on Coherence and so must be instantiated after Coherence has started. Though these two sets of beans will not normally reference
each other (as Coherence itself is the link-layer between them), they may
have common dependencies on resources such as data sources or security
contexts. These groups of beans may be separated into three application
contexts.

[Diagram: order of initialisation. coherenceBeansContext and applicationBeansContext both depend on the shared parent utilBeansContext; Coherence depends on the beans in coherenceBeansContext, and applicationBeansContext depends on Coherence.]

We will instantiate the Coherence application context and the common parent UtilBeansContext.xml before starting Coherence, and the application beans
context afterwards.
In listing 2.5 we construct a master context that defines each of the above contexts as beans, taking care of the parent-child relationships through constructor arguments. Any context that we don't want immediately instantiated (the applicationBeansContext in this case) we declare to be lazily instantiated. Refer to the Spring documentation for full details, but briefly, we have defined three separate Spring bean factories, described following the listings.


Listing 2.2: UtilBeansContext.xml


<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans.xsd">
    <bean id="exampleDataSource"
          class="org.springframework.jdbc.datasource.DriverManagerDataSource">
        <constructor-arg>
            <value>jdbc:h2:mem:test;DB_CLOSE_DELAY=-1</value>
        </constructor-arg>
    </bean>
</beans>

Listing 2.3: CoherenceBeansContext.xml


<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans.xsd">
    <bean id="exampleCacheLoader"
          class="org.cohbook.configuration.spring.ExampleCacheLoader">
        <constructor-arg>
            <ref bean="exampleDataSource"/>
        </constructor-arg>
    </bean>
</beans>

Listing 2.4: ApplicationBeansContext.xml


<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans.xsd">
    <bean id="exampleCache"
          class="org.cohbook.configuration.spring.ExampleCacheBean"/>
</beans>


Listing 2.5: beanRefFactory.xml


<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:context="http://www.springframework.org/schema/context"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans.xsd
           http://www.springframework.org/schema/context
           http://www.springframework.org/schema/context/spring-context-3.2.xsd">
    <bean id="utilBeansContext" lazy-init="false"
          class="org.springframework.context.support.ClassPathXmlApplicationContext">
        <constructor-arg>
            <value>UtilBeansContext.xml</value>
        </constructor-arg>
    </bean>
    <!-- child of above -->
    <bean id="coherenceBeansContext" lazy-init="true"
          class="org.springframework.context.support.ClassPathXmlApplicationContext">
        <constructor-arg>
            <list><value>CoherenceBeansContext.xml</value></list>
        </constructor-arg>
        <constructor-arg>
            <ref bean="utilBeansContext"/>
        </constructor-arg>
    </bean>
    <!-- child of above -->
    <bean id="applicationBeansContext" lazy-init="true"
          class="org.springframework.context.support.ClassPathXmlApplicationContext">
        <constructor-arg>
            <list><value>ApplicationBeansContext.xml</value></list>
        </constructor-arg>
        <constructor-arg>
            <ref bean="utilBeansContext"/>
        </constructor-arg>
    </bean>
</beans>


utilBeansContext: eagerly instantiated, containing our common beans.

coherenceBeansContext: lazily instantiated, containing beans needed by Coherence in order to start all of its services. Beans defined in the parent factory utilBeansContext are also visible.

applicationBeansContext: lazily instantiated, containing beans that may depend on Coherence. Beans defined in the parent factory utilBeansContext are also visible.
Our objective is to avoid bean instantiation during startup of the cluster member and its services, especially for beans with time-consuming or
contention-bound initialisation, such as creating and testing connections
to databases. Placing these in a separate context allows them to be preinstantiated, reducing or eliminating startup race conditions.
Some of the beans are associated with a service and will be obtained from the coherenceBeansContext as the service starts during member startup. Others are associated with individual caches and will be obtained when either:

- a new cache is created under the control of application logic, or
- a new storage node joins an existing cluster, and partitions containing an existing cache are transferred to the new member.

We do not usually expect that beans used in the construction of our application logic will need direct access to the beans defined in coherenceBeansContext; Coherence itself forms the layer between them.

2.3.2 Preventing Premature Cluster Startup

Given that we do not want to start Coherence before we have instantiated the ApplicationContext with its dependencies, we would perhaps like to enforce this: if, during development, we inadvertently reference a Coherence API that starts the cluster prematurely, it would be useful to have a clear and unambiguous exception that tells us so. We need to find some point in Coherence initialisation that we can intercept to test our readiness to start. One possibility amongst several is to override the getConfigurableCacheFactory methods of DefaultCacheFactoryBuilder; listing 2.6 shows an implementation, LifecycleValidatingCacheFactoryBuilder, using a statically accessed boolean to govern the check.

So we can prepare our dependencies with confidence that Coherence will not start too soon, until we explicitly set the buildOk flag.



Listing 2.6: LifecycleValidatingCacheFactoryBuilder

public class LifecycleValidatingCacheFactoryBuilder extends
        DefaultCacheFactoryBuilder {

    private static volatile boolean buildOk = false;

    @Override
    public ConfigurableCacheFactory getConfigurableCacheFactory(
            ClassLoader loader) {
        if (!buildOk) {
            throw new IllegalStateException(
                    "Attempt to build a cache factory too early");
        }
        return super.getConfigurableCacheFactory(loader);
    }

    @Override
    public ConfigurableCacheFactory getConfigurableCacheFactory(
            String sConfigURI, ClassLoader loader) {
        if (!buildOk) {
            throw new IllegalStateException(
                    "Attempt to build a cache factory too early");
        }
        return super.getConfigurableCacheFactory(sConfigURI, loader);
    }

    static void enableBuild() {
        buildOk = true;
    }
}
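Coherence must, of course, be told to use this builder before anything else touches the CacheFactory. One way of doing so, an assumption here rather than something shown in the book's examples, is to install it programmatically at the top of the main method; the cache-factory-builder-config element of the operational override file is an alternative.

// install the guarded builder before any other Coherence call
CacheFactory.setCacheFactoryBuilder(new LifecycleValidatingCacheFactoryBuilder());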

One convenient means of setting the flag is by using the Spring SmartLifecycle interface; for example, the inner class in listing 2.7. When this bean's start method is called, we remove the block on instantiating the cluster and then perform the instantiation. Other beans can have their start methods invoked before or after the cluster is instantiated, according to whether their getPhase method returns less than or greater than CLUSTER_PHASE.

<bean class="org.cohbook.configuration.spring.BuilderLifeCycle"/>

2.3.3 Waiting for Cluster Startup to Complete

When initialising Coherence within a JVM, there are many services and
threads that are spawned. To follow the principle of ensuring a well-managed,
sequential startup, it would be best to wait until all those threads are running
before starting our application logic. Be aware, though, that what is started depends very much on the approach taken:


Listing 2.7: BuilderLifeCycle


public static final int BEFORE_CLUSTER_PHASE = 100;
public static final int CLUSTER_PHASE = 200;
public static final int AFTER_CLUSTER_PHASE = 300;

public static class BuilderLifeCycle implements SmartLifecycle {

    @Override
    public void start() {
        enableBuild();
        CacheFactory.ensureCluster();
    }

    @Override
    public void stop() {
    }

    @Override
    public boolean isRunning() {
        return buildOk;
    }

    @Override
    public int getPhase() {
        return CLUSTER_PHASE;
    }

    @Override
    public boolean isAutoStartup() {
        return true;
    }

    @Override
    public void stop(Runnable callback) {
    }
}


DefaultCacheServer.start()            Joins the cluster and starts all services marked autostart=true.
CacheFactory.getCache("cachename")    Joins the cluster and starts only the service that owns the requested cache.
CacheFactory.ensureCluster()          Joins the cluster but starts no user services.

Since version 12.1.2, Coherence has provided a LifecycleEvent that can be used with an EventInterceptor to provide notification of when a CacheFactory is activating, activated, or disposing. We can create and register an interceptor in the StorageDisabledMain or StorageEnabledMain class. Here is a simple static implementation of the interceptor and the method to register it:
private static Semaphore startupComplete = new Semaphore(0);

@Interceptor(identifier = "ClusterStarted")
public static class ClusterStarted implements EventInterceptor<LifecycleEvent> {
    @Override
    public void onEvent(LifecycleEvent event) {
        if (event.getType() == LifecycleEvent.Type.ACTIVATED) {
            startupComplete.release();
        }
    }
}

private static ClusterStarted CLUSTERSTARTED = new ClusterStarted();

private void registerLifecycleListener(ConfigurableCacheFactory factory) {
    factory.getInterceptorRegistry().registerEventInterceptor(
            CLUSTERSTARTED, RegistrationBehavior.IGNORE);
}

The interceptor works by releasing the semaphore to indicate that startup has completed (the event type is ACTIVATED). Specifying RegistrationBehavior.IGNORE means that repeated calls to the registration method are silently ignored.

Now, in the initialise method shown in listing 2.8, we register the interceptor before starting the cluster. To do this we have to instantiate the cache factory by calling CacheFactory.getConfigurableCacheFactory. That's fine though: once instantiated, the subsequent call to DefaultCacheServer.start will use the same CacheFactory instance, as it is internally held in a static. Finally, we block on the semaphore until the ACTIVATED event has been received. After this initialise method returns we can carry on and instantiate our applicationBeansContext, secure in the knowledge that all dependent Coherence services are fully up and running.

Be careful of other mechanisms for starting the cluster. If you simply call CacheFactory.ensureCluster() or CacheFactory.getCache(String) there will be no LifecycleEvent; the services have not all been started.


Listing 2.8: Waiting for the cache factory to be ACTIVATED


public static void initialise() throws IOException {
    loadCoherenceProperties();
    BeanLocator.getContext("coherenceBeansContext");
    registerLifecycleListener(CacheFactory.getConfigurableCacheFactory());
    DefaultCacheServer.start();
    try {
        startupComplete.acquire();
    } catch (InterruptedException e) {
        throw new RuntimeException(e);
    }
}

Another subtle problem can arise if you need your own implementation of ConfigurableCacheFactory: be sure to subclass ExtensibleConfigurableCacheFactory rather than DefaultConfigurableCacheFactory, which is not, despite the name, the default, and is in fact now deprecated.
A more naïve approach would be to simply register the interceptor immediately after the DefaultCacheServer.start() method, but that exposes us to a potential race condition where cluster startup completes before the interceptor is registered, so that the event is never received.

2.3.4 Application Context in a Class Scheme

We can obtain an instance of a bean from a BeanFactory by providing a static factory. Listing 2.9 is a simple utility class, BeanLocator, to obtain beans from a named context, as defined in our master context. It is a simple matter to get a bean instance in a class-scheme using our utility class, giving it the context name and the bean name as in listing 2.10.
Oracle have stated in response to an SR regarding a problem instantiating a
Serializer instance in version 12.1.3 that in future, the declared return type
of a factory method will be checked, so it may become necessary to add a
separate method per return type:
public static CacheStore getCacheStoreBean(String contextName, String beanName) {
    return getContext(contextName).getBean(beanName, CacheStore.class);
}


Listing 2.9: BeanLocator.java


public class BeanLocator {

    private static ApplicationContext applicationContexts =
            new ClassPathXmlApplicationContext("classpath:beanRefFactory.xml");

    public static BeanFactory getContext(String contextName) {
        return (BeanFactory) applicationContexts.getBean(contextName);
    }

    public static Object getBean(String contextName, String beanName) {
        return getContext(contextName).getBean(beanName);
    }

    public static Object getBean(String contextName, String beanName,
            String propertyName, Object propertyValue) {
        if (propertyValue instanceof Value) {
            propertyValue = ((Value) propertyValue).get();
        }
        Object bean = getContext(contextName).getBean(beanName);
        PropertyAccessor accessor = PropertyAccessorFactory.forBeanPropertyAccess(bean);
        accessor.setPropertyValue(propertyName, propertyValue);
        return bean;
    }
}

Listing 2.10: Using BeanLocator in a cache configuration


<distributed-scheme>
    <scheme-name>distributedService</scheme-name>
    <backing-map-scheme>
        <read-write-backing-map-scheme>
            <cachestore-scheme>
                <class-scheme>
                    <class-factory-name>
                        org.cohbook.configuration.spring.BeanLocator
                    </class-factory-name>
                    <method-name>getBean</method-name>
                    <init-params>
                        <init-param>
                            <param-value>coherenceBeansContext</param-value>
                        </init-param>
                        <init-param>
                            <param-value>exampleCacheLoader</param-value>
                        </init-param>
                    </init-params>
                </class-scheme>
            </cachestore-scheme>
        </read-write-backing-map-scheme>
    </backing-map-scheme>
    <autostart>true</autostart>
</distributed-scheme>

2.3.5 Putting it Together: the Main Class

We construct the coherenceBeansContext in the main class of a simple storage node by calling BeanLocator.getContext before instantiating Coherence.
Note the difference between the start and startAndMonitor methods. The former simply starts the services, all as daemon threads, while the latter starts an additional non-daemon thread that periodically checks the status of the services and restarts any that have failed.
Listing 2.11: ExampleStorageMain.java
public static void main(String[] args) {
    loadCoherenceProperties();
    BeanLocator.getContext("coherenceBeansContext");
    new DefaultCacheServer(
            CacheFactory.getConfigurableCacheFactory())
            .startAndMonitor(5000);
}

We may want to create other services in a storage-enabled node, and define them in an applicationBeansContext. We would implement an initialising stage that registers the listener and waits for the ACTIVATED lifecycle event after calling DefaultCacheServer.startAndMonitor.

In a simple non-storage-enabled member performing application functions, we will often not require the Coherence beans: CacheStore instances and backing map listeners, for example, even if defined in the cache configuration, will never be referenced if storage is disabled. The initialise method is as defined above in listing 2.8; it starts the cluster and waits for startup to complete before returning.
public static void main(String[] args) throws InterruptedException, IOException {
    // may not be needed
    BeanLocator.getContext("coherenceBeansContext");
    initialise();
    BeanFactory bf = BeanLocator.getContext("applicationBeansContext");
}

2.3.6 Avoid Static CacheFactory Methods

Making calls to static CacheFactory.getCache and CacheFactory.getService methods within our application beans makes it harder to unit test them. Better to inject a CacheFactory, or even a NamedCache or Service instance.


<bean id="cluster" class="com.tangosol.net.CacheFactory"
      factory-method="getCluster"/>
<bean id="cacheFactory" class="com.tangosol.net.CacheFactory"
      factory-method="getConfigurableCacheFactory"/>
<bean id="exampleCache" factory-bean="cacheFactory"
      factory-method="ensureCache">
    <constructor-arg value="test"/>
    <constructor-arg><null/></constructor-arg>
</bean>
<bean id="distributedService" factory-bean="cacheFactory"
      factory-method="ensureService">
    <constructor-arg value="exampleDistributedService"/>
</bean>
<bean id="invocationService" factory-bean="cacheFactory"
      factory-method="ensureService">
    <constructor-arg value="exampleInvocationService"/>
</bean>
<bean id="exampleCacheUsingBean"
      class="org.cohbook.configuration.spring.ExampleCacheUsingBean">
    <property name="cache" ref="exampleCache"/>
</bean>

The non-static ensureCache method for obtaining cache instances from a ConfigurableCacheFactory takes two parameters: the cache name and the ClassLoader to use. We can pass null for the second parameter unless we're crossing ClassLoader boundaries, e.g. in a JEE or OSGi container.
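The bean receiving the cache can then be a plain POJO. As a minimal sketch (its body is an assumption; only the class name appears in the configuration above), ExampleCacheUsingBean might look like this, which makes it trivial to hand it a mock NamedCache in a unit test:

public class ExampleCacheUsingBean {

    private NamedCache cache;

    // injected by Spring from the exampleCache bean defined above,
    // or set directly with a mock in a unit test
    public void setCache(NamedCache cache) {
        this.cache = cache;
    }

    public void recordGreeting(String name) {
        cache.put(name, "hello " + name);
    }
}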

2.3.7 Rewiring Deserialised Objects

There are many kinds of object passed between cluster participants (members and extend clients): domain objects, instances of implementations of Filter, ValueExtractor and ValueUpdater, EntryProcessor, Invocable, etc. Some of these may need to access objects managed by a Spring application context. We could, in each implemented class, explicitly obtain the ApplicationContext using the BeanLocator pattern above, but that's not really much of a dependency-injection model. Better to inject the values when deserialising. Easy enough if we're writing our own implementations of PofSerializer for the classes, but a more general approach would be better. Spring provides a class AutowireCapableBeanFactory that has a most useful method⁴:
void autowireBeanProperties(Object existingBean,
        int autowireMode, boolean dependencyCheck)
// Autowire the bean properties of the given bean instance
// by name or type.

⁴ Details at http://static.springsource.org/spring/docs/3.2.x/javadoc-api/org/springframework/beans/factory/config/AutowireCapableBeanFactory.html


This handy method will apply the selected autowire strategy to any object with @Autowired annotations to inject dependencies. The existingBean argument does not need to have been created by a Spring application context. We could write a Serializer that autowires objects after they are deserialised; the deserialisation itself is delegated to a standard Serializer instance⁵:
public class SpringSerializer implements Serializer {

    private final Serializer delegate;
    private final String applicationContextName;

    public SpringSerializer(String applicationContextName, Serializer delegate) {
        this.applicationContextName = applicationContextName;
        this.delegate = delegate;
    }

    private AutowireCapableBeanFactory getBeanFactory() {
        BeanFactory bf = BeanLocator.getContext(applicationContextName);
        if (bf instanceof ApplicationContext) {
            return ((ApplicationContext) bf).getAutowireCapableBeanFactory();
        } else {
            return null;
        }
    }

    public Object deserialize(BufferInput in) throws IOException {
        Object result = delegate.deserialize(in);
        getBeanFactory().autowireBeanProperties(
                result, AutowireCapableBeanFactory.AUTOWIRE_BY_TYPE, false);
        return result;
    }

    @Override
    public void serialize(BufferOutput bufferoutput, Object obj)
            throws IOException {
        delegate.serialize(bufferoutput, obj);
    }
}

⁵ We could instead subclass ConfigurablePofContext, but by implementing as a decorator, SpringSerializer can be agnostic as to whether we are using a ConfigurablePofContext, a DefaultSerializer or something else to perform the serialisation.

We will generally want to use the SpringSerializer to inject values into beans that depend upon Coherence, i.e. from our applicationBeansContext rather than the coherenceBeansContext, so in order to realise our desire for ordered, controlled startup without excessive use of lazy bean instantiation, we must avoid the use of the applicationBeansContext until Coherence is fully up and running. Coherence itself will make extensive use of serialisation and deserialisation as it starts up its services; at this time applicationBeansContext has not yet been instantiated, so we cannot reference it in SpringSerializer. One solution is to retrieve the applicationBeansContext only when deserialising an object that explicitly requires it. Reflectively checking every object for @Autowired annotations on every method and field would be prohibitively expensive, but we can define our own class-level annotation for classes that require autowiring when deserialised:
@Target({ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
public @interface DeserialiseAutowire {
}

Our deserialise method then becomes:


public Object deserialize(BufferInput in) throws IOException {
    Object result = delegate.deserialize(in);
    if (result != null && result.getClass().isAnnotationPresent(
            DeserialiseAutowire.class)) {
        getBeanFactory().autowireBeanProperties(result,
                AutowireCapableBeanFactory.AUTOWIRE_BY_TYPE,
                false);
    }
    return result;
}

So, what does a class look like that we want to autowire from the Spring context? Here's an implementation of Invocable that finds the difference between system time and cluster time on each member of the cluster and returns the difference in milliseconds:
public class ExampleSpringInvocable implements Invocable, Serializable {

    private transient Cluster cluster;
    private transient long result;

    public ExampleSpringInvocable() {
    }

    @Autowired
    public void setCluster(Cluster cluster) {
        this.cluster = cluster;
    }

    @Override
    public void init(InvocationService invocationservice) {
    }

    @Override
    public void run() {
        result = cluster.getTimeMillis() - new Date().getTime();
    }

    @Override
    public Object getResult() {
        return result;
    }
}

Rather than injecting the cluster object from the application context, we could obtain it in the init() method of Invocable by calling invocationService.getCluster(). The point here is that autowired injection is not necessarily just for external resource connections - JDBC, JMS, HTTP, etc. - but can equally be used for Coherence resources, greatly simplifying unit testing in many cases.
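To make the point concrete, here is a minimal test sketch (assuming Mockito is available on the test classpath; it is not among this recipe's stated dependencies) showing that ExampleSpringInvocable can be exercised without any running cluster:

import java.util.Date;
import org.junit.Assert;
import org.junit.Test;
import org.mockito.Mockito;
import com.tangosol.net.Cluster;

public class ExampleSpringInvocableTest {
    @Test
    public void runComputesDriftAgainstInjectedCluster() {
        // Stub the cluster to report a time 42ms ahead of system time
        Cluster cluster = Mockito.mock(Cluster.class);
        Mockito.when(cluster.getTimeMillis()).thenReturn(new Date().getTime() + 42L);

        ExampleSpringInvocable invocable = new ExampleSpringInvocable();
        invocable.setCluster(cluster);
        invocable.run();

        // Allow some slack for time elapsed between stubbing and running
        long drift = (Long) invocable.getResult();
        Assert.assertTrue(Math.abs(drift - 42L) < 1000L);
    }
}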


Listing 2.12: DynamicAutowire annotation

@Target({ElementType.FIELD, ElementType.METHOD})
@Retention(RetentionPolicy.RUNTIME)
public @interface DynamicAutowire {
    public static final String DEFAULT_CONTEXT = "";
    public String contextName() default DEFAULT_CONTEXT;
    public String beanNameProperty();
}

Which Bean to Inject

A limitation of using Spring's @Autowired annotation, or the similar JSR-250 @Resource annotation, is that we may select the instance to inject only at build time, either by name using @Qualifier, or implicitly by type. If we wish to specify the bean instance at runtime - perhaps one of many of the correct type available in the context - then we must implement our own annotation, as in listing 2.12.

This annotation allows us to select the name of the bean to inject by nominating another property of the annotated object as providing that bean name. We may also provide the name of the application context from which to obtain the bean, giving us some additional flexibility. Now, in listing 2.13, we can add the code to find and use this annotation in our SpringSerializer.

I'll leave it as an exercise for you, the reader, to extend this concept to allow dynamic injection through annotations on superclasses and/or implemented interfaces as you see fit.
Bear in mind that reflection like this is computationally relatively expensive. It may be appropriate for control functions, or for operational objects (filters, updaters, etc.) whose use is infrequent relative to the total number of object transfers that occur. It may not be appropriate for large numbers of frequently updated or retrieved domain objects, though a design that calls for injection of context information into domain objects warrants careful review anyway. If performance is an issue, we can cache the results of the class introspection, or use a library that provides indexed access to reflection data, such as Reflections at http://code.google.com/p/reflections/
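As an illustration of the caching idea, here is a minimal sketch (not from the book's sample code) of members that could be added to SpringSerializer so that the reflective scan for DynamicAutowire happens only once per class; it requires java.lang.reflect.Method and java.util.concurrent imports:

// Cache of annotated setter methods keyed by class; fields could be cached the same way.
private static final ConcurrentMap<Class<?>, List<Method>> ANNOTATED_METHODS =
        new ConcurrentHashMap<>();

private List<Method> annotatedMethods(Class<?> clazz) {
    List<Method> methods = ANNOTATED_METHODS.get(clazz);
    if (methods == null) {
        methods = new ArrayList<>();
        for (Method method : clazz.getMethods()) {
            if (method.isAnnotationPresent(DynamicAutowire.class)) {
                methods.add(method);
            }
        }
        ANNOTATED_METHODS.putIfAbsent(clazz, methods);
    }
    return methods;
}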


Listing 2.13: SpringSerializer with dynamic autowire

@Override
public Object deserialize(BufferInput in) throws IOException {
    Object result = delegate.deserialize(in);
    if (result != null && result.getClass().isAnnotationPresent(
            DeserialiseAutowire.class)) {
        getBeanFactory().autowireBeanProperties(
                result, AutowireCapableBeanFactory.AUTOWIRE_BY_TYPE, false);
        try {
            autowireObject(result);
        } catch (IllegalAccessException | IllegalArgumentException
                | InvocationTargetException e) {
            throw new IOException("failed to inject properties", e);
        }
    }
    return result;
}

public void autowireObject(Object bean) throws IllegalAccessException,
        IllegalArgumentException, InvocationTargetException {
    Class<?> clazz = bean.getClass();
    BeanWrapper beanWrapper = PropertyAccessorFactory.forBeanPropertyAccess(bean);
    for (Method method : clazz.getMethods()) {
        DynamicAutowire dynwire = method.getAnnotation(DynamicAutowire.class);
        if (dynwire != null) {
            method.invoke(bean, getBeanToInject(dynwire, beanWrapper));
        }
    }
    for (Field field : clazz.getFields()) {
        DynamicAutowire dynwire = field.getAnnotation(DynamicAutowire.class);
        if (dynwire != null) {
            field.setAccessible(true);
            field.set(bean, getBeanToInject(dynwire, beanWrapper));
        }
    }
}

private Object getBeanToInject(DynamicAutowire dynwire, BeanWrapper beanWrapper) {
    String beanName = (String) beanWrapper.getPropertyValue(dynwire.beanNameProperty());
    String contextName = dynwire.contextName().equals(
            DynamicAutowire.DEFAULT_CONTEXT)
            ? applicationContextName
            : dynwire.contextName();
    return BeanLocator.getBean(contextName, beanName);
}


Instantiating the Serialiser

But how do we instantiate the serialiser itself? As a Spring bean, of course! If we define the bean in the coherenceBeansContext:

Listing 2.14: CoherenceBeansContext.xml
<bean id="autowireSerialiser"
      class="org.cohbook.configuration.spring.SpringSerializer">
    <constructor-arg>
        <value>applicationBeansContext</value>
    </constructor-arg>
</bean>

then we might try loading it from Spring in the cache configuration:

Listing 2.15: cache-config.xml
<defaults>
    <serializer>
        <instance>
            <class-factory-name>
                org.cohbook.configuration.spring.BeanLocator
            </class-factory-name>
            <method-name>getBean</method-name>
            <init-params>
                <init-param>
                    <param-type>String</param-type>
                    <param-value>coherenceBeansContext</param-value>
                </init-param>
                <init-param>
                    <param-type>String</param-type>
                    <param-value>autowireSerialiser</param-value>
                </init-param>
            </init-params>
        </instance>
    </serializer>
</defaults>

Unfortunately, as of Coherence version 12.1.3, there is a check that the factory method is declared to return an instance of Serializer, so the getBean method of BeanLocator won't work, as it is declared to return Object.6 We have to add a new method to that class:

public static Serializer getSerializerBean(String contextName, String beanName) {
    return getContext(contextName).getBean(beanName, Serializer.class);
}

and modify our cache configuration to call that method instead:
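The only change from listing 2.15 is the factory method name; the rest of the serializer element is unchanged:

<method-name>getSerializerBean</method-name>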


Caution is needed in instantiating Coherence classes before starting the cluster. Many perform static initialisations that read system properties, so subsequent changes to those properties have no effect.

6 I understand that Oracle are reconsidering that change and the former behaviour may be re-instated in a future release.

The instance dependencies bear re-iteration here:
- CoherenceBeansContext.xml contains the beans that the Coherence cache factory requires to start, including the serialiser bean, autowireSerialiser.
- Beans used by the application, some of which may have dependencies on Coherence, are defined in applicationBeansContext.xml, which is instantiated after the cluster has started.
- The autowireSerialiser may need to wire classes with service beans from applicationBeansContext, leading to a potential circular dependency. This is broken by the fact that we do not autowire classes passed between cluster nodes while the cluster is starting - not usually an issue, as these should only be native Java or Coherence classes, or cache domain objects.

Wiring Configuration-Time Dependencies

The approach outlined so far works well enough for simple singleton beans, but there are some beans that we may wish to create dynamically. The Coherence cache configuration syntax allows us to inject the cache name or backing map manager context in an init-param of a class-scheme element. For any given service, we would expect all caches to reference the same backing map manager context, so we can still use a singleton bean, but we require a means of injecting that context. For the cache name, however, we'll need to create a new object instance each time we create a cache.

Let's say we have a cache mapping with a wildcard element:
Lets say we have a cache mapping with a wildcard element:
<cache-mapping>
    <cache-name>dyn-*</cache-name>
    <scheme-name>dynamicScheme</scheme-name>
</cache-mapping>

We do not know how many of these caches we will be creating at runtime, and we want each cache to have its own CacheLoader instance, initialised with the cache name. Our CacheLoader implementation in listing 2.16 has a setter method for the cache name.

In the Spring XML, we define the bean with scope prototype:

<bean id="dynamicCacheLoader"
      class="org.cohbook.configuration.spring.DynamicCacheLoader"
      scope="prototype">
    <constructor-arg>
        <ref bean="exampleDataSource"/>
    </constructor-arg>
</bean>


Listing 2.16: DynamicCacheLoader.java

public class DynamicCacheLoader implements CacheLoader {
    private DataSource dataSource;
    private String cacheName;

    public DynamicCacheLoader(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public void setCacheName(String cacheName) {
        this.cacheName = cacheName;
    }
    // rest of implementation
}

Now we need, in the cache configuration, a way of obtaining the bean instance and injecting the cache name. The simple, but somewhat clunky, way of doing it is to add another method to our BeanLocator class:

public static Object getBean(String contextName, String beanName,
        String propertyName, Object propertyValue) {
    if (propertyValue instanceof Value) {
        propertyValue = ((Value) propertyValue).get();
    }
    Object bean = getContext(contextName).getBean(beanName);
    PropertyAccessor accessor = PropertyAccessorFactory.forBeanPropertyAccess(bean);
    accessor.setPropertyValue(propertyName, propertyValue);
    return bean;
}

This method takes two extra parameters: a property name and a property value. We get the bean instance as in the simple case (though now, because our bean has prototype scope, we get a new instance for each call), then use a Spring PropertyAccessor to set the value. An additional complication is that if our value is a Coherence expression - {cache-name} - the value we want will be wrapped in an instance of com.tangosol.config.expression.Value.

Finally, the dynamicScheme in the cache configuration of listing 2.17 provides the extra parameters to the new BeanLocator.getBean method.

I described this approach as clunky; it is quite verbose, especially if more parameters need to be injected. A more elegant approach is to use a custom namespace, as the Coherence Spring integration project does. More on custom namespaces later, in section 8.4: Validate Configuration With A NameSpaceHandler.



Listing 2.17: Instantiating a prototype CacheLoader
<distributed-scheme>
    <scheme-name>dynamicScheme</scheme-name>
    <service-name>exampleDistributedService</service-name>
    <backing-map-scheme>
        <read-write-backing-map-scheme>
            <internal-cache-scheme><local-scheme/></internal-cache-scheme>
            <cachestore-scheme>
                <class-scheme>
                    <class-factory-name>
                        org.cohbook.configuration.spring.BeanLocator
                    </class-factory-name>
                    <method-name>getBean</method-name>
                    <init-params>
                        <init-param>
                            <param-value>coherenceBeansContext</param-value>
                        </init-param>
                        <init-param>
                            <param-value>dynamicCacheLoader</param-value>
                        </init-param>
                        <init-param>
                            <param-value>cacheName</param-value>
                        </init-param>
                        <init-param>
                            <param-value>{cache-name}</param-value>
                        </init-param>
                    </init-params>
                </class-scheme>
            </cachestore-scheme>
        </read-write-backing-map-scheme>
    </backing-map-scheme>
    <autostart>true</autostart>
</distributed-scheme>

2.3.8 Using Spring with Littlegrid

The default behaviour of Littlegrid when starting a storage node is to call DefaultCacheServer.start(), so the initialisation code we have placed in our own ExampleStorageMain is not called. We must first separate the initialisation code into its own method:
public static void main(String[] args) {
    initialise();
    new DefaultCacheServer(
            CacheFactory.getConfigurableCacheFactory())
            .startAndMonitor(5000);
}

public static void initialise() {
    loadCoherenceProperties();
    BeanLocator.registerApplicationContext(
            "coherenceBeansContext");
}

Next we subclass Littlegrid's DefaultClusterMember class:


Listing 2.18: LittleGridClusterMember.java

public class LittleGridClusterMember extends DefaultClusterMember {
    @Override
    public void doBeforeStart() {
        ExampleStorageMain.initialise();
    }
}

Finally, we must tell Littlegrid to use our class:

memberGroup = ClusterMemberGroupUtils.newBuilder()
        .setClusterMemberInstanceClassName(
                "org.cohbook.configuration.spring.LittleGridClusterMember")
        .setStorageEnabledCount(2)
        .buildAndConfigureForStorageDisabledClient();

2.3.9 Other Strategies

Single Application Context

I've worked on a number of projects that use Coherence with Spring, some using the SpringAwareCacheFactory, that are configured with a single application context containing both the beans that Coherence requires and the beans that depend on Coherence. Quite apart from the disordered startup of Coherence as beans are created in parallel, it is necessary to ensure that beans that depend on Coherence are started last. This can be done by setting explicit dependencies between beans, but is more often achieved by using lazy instantiation. Such configurations tend to be fragile and difficult to maintain; a mistake leads to startup failures.
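For reference, the explicit-dependency form mentioned above is just Spring's depends-on attribute; the bean names here are hypothetical:

<!-- force the Coherence startup bean to be initialised before any bean that uses caches -->
<bean id="accountService" class="org.example.AccountService" depends-on="coherenceStartup"/>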

Coherence-Spring project
As it currently stands, this project requires instantiation of the application context concurrently with the CacheFactory - all the issues of timing, race conditions, and deadlocks described above apply.

2.4 Linking Spring and Coherence JMX Support

Objective
Demonstrate how to create JMX MBeans in Spring, and make them visible in the Coherence JMX node.
Prerequisites
An understanding of Spring, and of Coherence JMX support.
Code examples
The sample code is in package org.cohbook.configuration.springjmx of the configuration project.
Dependencies
Spring, Coherence, and Littlegrid.

Spring provides convenient and easy ways of creating and registering JMX MBeans - programmatically, declaratively in XML, and by annotation - without the need to define a separate class and interface as required by the low-level JMX API. In any Coherence application, it is often useful to create MBeans to monitor or control application behaviour and register them with Coherence so that they are visible through the JMX node. In this section, we'll look at a simple way of joining these together.
Coherence does provide the facility to export MBeans from the local MBean server by configuring a query in the custom-mbeans.xml file, but this will only export those MBeans that are already registered at the time that Coherence is initialised; beans created later, during or after Coherence configuration, are never exported. MBeans created in an application-level context, instantiated after Coherence initialisation, need to be exported via the Coherence API.

We do this by linking Spring's MBeanExporter and the Coherence Registry. We'll subclass the former, overriding the methods that register the MBeans with the local server to instead register them with Coherence:
public class SpringCoherenceJMXExporter extends MBeanExporter {
    private Registry cachedRegistry = null;
    private final Map<ObjectName, String> registeredBeans = new ConcurrentHashMap<>();

    protected synchronized Registry getRegistry() {
        if (cachedRegistry == null) {
            cachedRegistry = CacheFactory.ensureCluster().getManagement();
        }
        return cachedRegistry;
    }

    @Override
    protected void doRegister(Object mbean, ObjectName objectName)
            throws JMException {
        Registry registry = getRegistry();
        String sname = registry.ensureGlobalName(
                objectName.getKeyPropertyListString());
        if (registry.isRegistered(sname)) {
            registry.unregister(sname);
        }
        registry.register(sname, mbean);
        registeredBeans.put(objectName, sname);
    }

    @Override
    protected void doUnregister(ObjectName objectName) {
        Registry registry = getRegistry();
        String sname = registeredBeans.get(objectName);
        if (sname != null) {
            registry.unregister(sname);
        }
    }

    @Override
    protected void unregisterBeans() {
        for (ObjectName objectName : registeredBeans.keySet()) {
            doUnregister(objectName);
        }
        registeredBeans.clear();
    }
}

Listing 2.19: Using the SpringCoherenceJMXExporter
<context:annotation-config/>
<bean id="exporter"
      class="org.cohbook.configuration.springjmx.SpringCoherenceJMXExporter">
    <property name="assembler" ref="assembler"/>
    <property name="namingStrategy" ref="namingStrategy"/>
    <property name="autodetect" value="true"/>
</bean>
<bean id="jmxAttributeSource"
      class="org.springframework.jmx.export.annotation.AnnotationJmxAttributeSource"/>
<bean id="assembler"
      class="org.springframework.jmx.export.assembler.MetadataMBeanInfoAssembler">
    <property name="attributeSource" ref="jmxAttributeSource"/>
</bean>
<bean id="namingStrategy"
      class="org.springframework.jmx.export.naming.MetadataNamingStrategy">
    <property name="attributeSource" ref="jmxAttributeSource"/>
</bean>

Note that the Spring method identifies beans by the JMX ObjectName, whereas Coherence manages the beans using a String representation of the bean name that is globally unique in the cluster (i.e. it has the node identity added). We use the Coherence registry's ensureGlobalName method to generate the string representation and store the mapping in the registeredBeans map in case we need to unregister the bean later.

We can use this class exactly as we would the Spring MBeanExporter in our application context; listing 2.19 demonstrates the use of Spring annotations to detect and export MBeans.
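For illustration, a hypothetical annotated bean (not part of the book's sample code) that the configuration in listing 2.19 would autodetect - provided it is itself declared as a bean in the same context - might look like this:

import java.util.concurrent.atomic.AtomicLong;
import org.springframework.jmx.export.annotation.ManagedAttribute;
import org.springframework.jmx.export.annotation.ManagedResource;

// Autodetected by the exporter because of @ManagedResource; registered with the
// Coherence registry, so the attribute appears under the cluster's JMX node.
@ManagedResource(objectName = "Cohbook:type=ExampleStatistics")
public class ExampleStatistics {
    private final AtomicLong requestCount = new AtomicLong();

    public void recordRequest() {
        requestCount.incrementAndGet();
    }

    @ManagedAttribute(description = "Number of requests handled by the application")
    public long getRequestCount() {
        return requestCount.get();
    }
}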
Our class does call the Coherence CacheFactory.ensureCluster() method. This will happen during the initialisation phase of the context, so if you are following the pattern described in section 2.3: Build a CacheFactory with Spring Framework, especially subsection 2.3.2: Preventing Premature Cluster Startup, this class can only be used in its current form in an application context instantiated after the Coherence cluster has been started. To use it within the same context, we'd need to defer registrations until after Coherence is running.

2.5 Using Maven Repositories

Objective
To describe how to install the Coherence jar in a repository.

Oracle describes support for Maven as one of the new features of Oracle 12c. But this is support in the sense that piles of bricks will support a car when you take the wheels off - it'll hold up, but it completely misses the point of owning the vehicle. The documentation on this feature states:

    It is recommended that in such a situation set up one Maven repository for each environment that you wish to target. For example, a Maven test repository that contain artefacts that matches the versions and patches installed in the test environment and a Maven QA repository that contains artefacts that match the versions and patches installed in the QA environment.

The implication is that the Oracle Maven plugin might update patched Coherence artefacts in the repository with the same co-ordinates as the original version. Needless to say, we don't recommend following this guidance. Here are our instructions for obtaining the jar and installing it in a repository.

2.5.1 Install and Extract Jars

The Oracle 12c distribution adds considerable complexity to obtaining a usable jar compared to the process for the previous version. Here are the steps to follow:

1. Download the release - something like ofm_coherence_generic_12.1.2.0.0_disk1_1of1.zip

2. Unzip it to obtain coherence_121200.jar

3. Run the jar with java -jar coherence_121200.jar - this will run the GUI installer. If you accept all the defaults you will have a directory ~/Oracle/Middleware/Oracle_Home

Interesting files in here are:

- coherence/lib/coherence.jar
- coherence/doc/api/CoherenceJavaDoc-12.1.2.0.0.jar
- coherence/plugins/maven/com/oracle/coherence/coherence/12.1.2/coherence.12.1.2.pom
The pom provided with release 12.1.2.0 gives the Maven co-ordinates as:

<groupId>com.oracle.coherence</groupId>
<artifactId>coherence</artifactId>
<version>12.1.2-0-0</version>

There are also the usual associated jars - coherence-jpa.jar etc. - and support scripts for performing multicast tests and the like.

At this point there is a Maven plugin you can use to upload the artefacts to a repository, but it's hard to see what this offers over and above the conventional tools for deploying to a repository, at least for the basic coherence jar, though it may have some benefit for those with more complex dependencies. You can use mvn install:install-file to place the jar and javadoc in your local repository as normal using this pom, or provide your own preferred co-ordinates, but read on about patches and versioning before deciding your policy.

mvn install:install-file -Dfile=lib/coherence.jar \
    -DpomFile=plugins/maven/com/oracle/coherence/coherence/12.1.2/coherence.12.1.2.pom
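If you also want the Javadoc jar available from the repository, the standard classifier mechanism of install:install-file works; the co-ordinates below are those from the Oracle pom, so adjust them to whatever versioning policy you settle on:

mvn install:install-file -Dfile=coherence/doc/api/CoherenceJavaDoc-12.1.2.0.0.jar \
    -DgroupId=com.oracle.coherence -DartifactId=coherence -Dversion=12.1.2-0-0 \
    -Dpackaging=jar -Dclassifier=javadoc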

2.5.2 Applying Oracle Patches

Downloading the patch version 12.1.2.0.1 gives a zip file that includes a readme with complex and arcane instructions for installing the patch - a fragile process that must be performed with access to the original downloaded version. Fortunately, one can simply extract the updated coherence.jar, which you can then upload to your repository in the normal manner, but of course with an updated version number. One presumes the appropriate version number under the default Oracle scheme would be 12.1.2-0-1.

2.5.3 Remove Default Configuration

Before uploading coherence.jar to your repository, we strongly recommend deleting the default coherence-cache-config.xml from it.7 We consider it vastly preferable for Coherence to fail to start if you neglect to specify a cache configuration, rather than start with a default configuration that is not suitable for any real application. We have too often been asked for help by project teams who find that their data inexplicably expires after one hour.

zip -d coherence.jar coherence-cache-config.xml

2.5.4 Select Maven Co-ordinates

We may infer from the stated Oracle policy on how to use Coherence with Maven that patched versions of Coherence will not necessarily involve a change in Maven artefact version, though those we have seen so far always have. You may therefore wish to apply your own versioning scheme and increase the version number with each patch received. This may become complex if you use a central repository for many projects and these projects have different sets of patches, which, according to the Oracle documentation, might happen. If you are unable to enforce a single patch progression across all projects in your organisation, we suggest versioning the Coherence artefacts separately per project rather than per environment, as we'll assume that you all follow sound engineering practice and promote artefacts from test to production environments. In this case you might use a separate repository per project (rather than per environment as Oracle suggest), or, in a shared repository, use a distinct groupId.

7 Alternatively, rename it to example-cache-config.xml so that it is still available for developers to look at, but will never be used by default.

Chapter 3

Serialisation

3.1 Introduction

We assume you are familiar with the concepts of serialisation in Coherence: the inbuilt support for Java serialisation and POF, and the benefits in portability, speed and memory use of POF over Java serialisation - our tests indicate memory use decreases by a factor of between four and ten when using POF as compared to native Java serialisation. But there is more to POF than simply saving space.

A consequence of the more concise representation is a reduction in the bandwidth required or, conversely, an increase in the throughput of objects on the network. For some applications this can be significant: network bandwidth can be the limiting factor for many operations, including large-scale event distribution, queries with large result sets, and the redistribution of partitions as members join or leave the cluster.

Many operations can be performed without deserialising. This saves not only CPU, but also reduces churn on the heap. Caution is needed here though: a garbage collector carefully tuned for steady-state behaviour may react adversely to abnormal events. Testing of all scenarios is essential, and one of the sections in this chapter deals specifically with identifying and avoiding unnecessary deserialisation.

As always, these benefits don't come for free. The cost is more complexity and the consequent need for additional testing to support the use of POF, so we devote considerable attention to testing, starting with relatively straightforward domain objects and moving on to the greater demands of using Evolvable. Sad to relate, but true: the authors are acquainted with more projects that start using Evolvable and then abandon it because of the problems of ensuring correctness than projects that use it successfully.

Finally, we consider the use of other serialisation schemes. Our example uses Google Protocol Buffers, but our intention is to demonstrate the approach so that you may use any other suitable library.

3.2 Tips and Tricks with POF

Objective
Provide hints and advice on optimal use of POF.
Prerequisites
A basic understanding of POF.

3.2.1 Use Uniform Collections

What's the difference between

pofwriter.writeCollection(0, collection);

and

pofwriter.writeCollection(0, collection, MyClass.class);

The answer is collection.size() - 1 bytes in the size of the serialised object, which can be significant where there are many small elements in the collection. A uniform collection (the second form) stores the runtime type of the collection's elements only once for the whole collection and is therefore more compact; the simple form stores it separately for each element. On deserialisation, all elements of a uniform collection are inflated to the same type. The advice is therefore to always serialise as a uniform collection unless the collection may contain elements of different types.
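For completeness, the read side looks the same whichever form was used to write; a minimal sketch of deserialising the property written above (MyClass and the collection type are assumptions for the example):

// readCollection fills and returns the supplied collection; for a uniform
// collection every element is inflated as MyClass.
@SuppressWarnings("unchecked")
Collection<MyClass> collection =
        (Collection<MyClass>) pofreader.readCollection(0, new ArrayList<MyClass>());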

3.2.2 Put Often Used Fields First

With large and complex object graphs, the cost of reading through the serialised POF stream to obtain a value can become significant, even to the point of being more expensive than deserialising the object when several values are to be obtained. Fields that are often accessed via PofExtractor should be serialised first to minimise the cost of extraction.

pofwriter.writeString(0, oftenUsedField);
.
.
.
pofwriter.writeString(87, seldomUsedField);

3.2.3 Define Common Extractors as Static Member Variables

It goes without saying that POF indexes for a class should be defined as named constants rather than literal integers;

new PofExtractor(String.class, Account.ACCOUNT_NUMBER_POF)

is somewhat less opaque than

new PofExtractor(String.class, 12)

Better still, in the Account class (or its PofSerializer) you can write:

public static final int ACCOUNT_NUMBER_POF = 12;
public static final ValueExtractor ACCOUNT_NUMBER_EXTRACTOR
        = new PofExtractor(String.class, ACCOUNT_NUMBER_POF);

Use of a singleton extractor has the added benefit of guaranteeing consistency; by using the same extractor instance when defining an index and performing a query you can be sure that the two are compatible.
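A short sketch of that consistency in practice, assuming a NamedCache named cache holding Account values:

// The same extractor instance defines the index and drives the query,
// so the index is guaranteed to be usable by the filter.
cache.addIndex(Account.ACCOUNT_NUMBER_EXTRACTOR, false, null);
Set matches = cache.keySet(
        new EqualsFilter(Account.ACCOUNT_NUMBER_EXTRACTOR, "10012345"));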

3.2.4 Determining Serialised Object Sizes

At its simplest, we can determine the memory consumed by an object serialised as a Binary by adding a fixed overhead to the length of its content:

Binary binaryObject;
.
.
.
int size = com.tangosol.net.cache.SimpleMemoryCalculator.SIZE_BINARY +
        binaryObject.length();

For a complete cache entry, we need the size of the serialised key and value, and the fixed overhead of the BinaryEntry object. There is a convenient API method to do this for us:

BinaryEntry binaryEntry;
.
.
.
int size = BinaryMemoryCalculator.INSTANCE.calculateUnits(
        binaryEntry.getBinaryKey(), binaryEntry.getBinaryValue());

To investigate how big the cache entries are in a populated cluster, we can put this in an EntryExtractor:

public class EntrySizeExtractor extends EntryExtractor {
    public static final EntrySizeExtractor INSTANCE = new EntrySizeExtractor();

    public Object extractFromEntry(Entry entry) {
        BinaryEntry binaryEntry = (BinaryEntry) entry;
        UnitCalculator calculator = BinaryMemoryCalculator.INSTANCE;
        return calculator.calculateUnits(
                binaryEntry.getBinaryKey(), binaryEntry.getBinaryValue());
    }
}

So we can now find the size of the largest entry in a cache:

Long maxsize = (Long) cache.aggregate(
        AlwaysFilter.INSTANCE, new LongMax(EntrySizeExtractor.INSTANCE));

This gives us an idea of the size of the cache entries alone, but does not include the sizes of indexes. We can enhance the EntrySizeExtractor to determine and add in the size of each of the indexes:

public Object extractFromEntry(Entry entry) {
    BinaryEntry binaryEntry = (BinaryEntry) entry;
    UnitCalculator calculator = BinaryMemoryCalculator.INSTANCE;
    int result = calculator.calculateUnits(
            binaryEntry.getBinaryKey(), binaryEntry.getBinaryValue());
    for (MapIndex index : binaryEntry.getBackingMapContext().getIndexMap().values()) {
        Object indexedValue = index.get(binaryEntry.getBinaryKey());
        UnitCalculator indexCalculator =
                (IndexCalculator) ((SimpleMapIndex) index).getCalculator();
        result += indexCalculator.calculateUnits(null, indexedValue);
    }
    return result;
}

How this works depends on the configured unit-calculator in the cache configuration. The default setting of FIXED will simply add one to the result for each index. For correct behaviour, configure the backing map unit-calculator to BINARY:

<distributed-scheme>
    <scheme-name>distributedService</scheme-name>
    <backing-map-scheme>
        <local-scheme>
            <unit-calculator>BINARY</unit-calculator>
        </local-scheme>
    </backing-map-scheme>
</distributed-scheme>


Rather than using a hard-coded BinaryMemoryCalculator.INSTANCE, we can use the configured unit-calculator for calculating the cache entry size:

UnitCalculator calculator =
        ((ConfigurableCacheMap) binaryEntry.getBackingMapContext()
                .getBackingMap()).getUnitCalculator();

though again, if we did this with the default FIXED calculator, it would give an answer of 1 for every cache entry.

This is a useful technique for measuring the relative sizes of entries with different characteristics, but it is not in itself sufficient as a basis for capacity analysis of an entire cluster; there are other variables and overheads to consider.

3.3 Testing POF Serialisation

Objective
Show how we can prove that POF serialisation and deserialisation of an object is correct.
Prerequisites
An understanding of the basic concepts of POF and the use of annotations.
Code examples
Some domain classes and related files referred to in this section are also used in other sections of this chapter. These can be found in the package org.cohbook.serialisation.domain in the serialisation project. Classes specifically for this section are in org.cohbook.serialisation.poftest.
Dependencies
As well as Coherence, we use JUnit and Apache commons-lang3.
Look at the simple domain class in listing 3.1. A few important points to note about this class:

- member variables can be private, but not final
- we've written the class with no setters, and will populate it in the constructor
- we have to provide a no-args constructor for the POF annotations to work
- we have opted to specify the POF index for each field, rather than allow it to be automatically assigned

Listing 3.1: A simple domain class

@Portable
public class Person {
    public static final int POF_FIRSTNAME = 0;
    public static final int POF_LASTNAME = 1;

    @PortableProperty(POF_FIRSTNAME) private String firstName;
    @PortableProperty(POF_LASTNAME) private String lastName;

    public Person() {
    }

    public Person(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    public String getFirstName() {
        return firstName;
    }

    public String getLastName() {
        return lastName;
    }

    @Override public int hashCode() {
        // use your favourite tool to generate hashCode() and equals() methods
    }

    @Override public boolean equals(Object obj) {
        // use your favourite tool to generate hashCode() and equals() methods
    }
}

The net result is that we have a class whose instances are effectively immutable in normal use, even though we haven't been able to make the member variables final. Automatic assignment of POF indexes limits somewhat how you can use POF; we recommend, as best practice for domain objects stored in a cache, using named constants and always specifying index values.

As usual, we also need to provide the POF configuration file. We will call it person-pof-config.xml and place it in package org.cohbook.serialisation.domain as follows:
<?xml version="1.0"?>
<pof-config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://xmlns.oracle.com/coherence/coherence-pof-config"
    xsi:schemaLocation=
        "http://xmlns.oracle.com/coherence/coherence-pof-config coherence-pof-config.xsd">
    <user-type-list>
        <user-type>
            <type-id>1001</type-id>
            <class-name>
                org.cohbook.serialisation.domain.Person
            </class-name>
        </user-type>
    </user-type-list>
</pof-config>


Now we would like to verify that the serialisation is working correctly, ideally without needing to start up a whole cluster. Fortunately, this is easily done, and should be done for every serialisable class. Write a unit test - our example is in package org.cohbook.serialisation.poftest; we're using JUnit 4, but the concept translates easily to other frameworks:

public class PersonSerialisationTest {
    @Test
    public void testPersonSerialise() {
        PofContext pofContext = new ConfigurablePofContext(
                "/org/cohbook/serialisation/domain/person-pof-config.xml");
        Person me = new Person("David", "Whitmarsh");
        Binary binaryMe = ExternalizableHelper.toBinary(me, pofContext);
        Person meAgain =
                (Person) ExternalizableHelper.fromBinary(binaryMe, pofContext);
        Assert.assertEquals(me, meAgain);
    }
}

This will work just fine so long as your serialisable class has a correct override of the equals method. We will have many objects to test, so let's use a helper class to reduce boilerplate:

public class SerialisationTestHelper {
    private final Serializer serialiser;

    public SerialisationTestHelper(String configFileName) {
        this.serialiser = new ConfigurablePofContext(configFileName);
    }

    public void equalsCheckSerialisation(Object object) {
        Binary binaryObject = ExternalizableHelper.toBinary(object, serialiser);
        Object objectAgain = ExternalizableHelper.fromBinary(binaryObject, serialiser);
        Assert.assertEquals(object, objectAgain);
    }
}

Now our test becomes:

public class PersonSerialisationTest {
    private SerialisationTestHelper serialisationTestHelper;

    public PersonSerialisationTest() {
        serialisationTestHelper = new SerialisationTestHelper(
                "/org/cohbook/serialisation/domain/person-pof-config.xml");
    }

    @Test public void testPersonSerialise() {
        serialisationTestHelper.equalsCheckSerialisation(
                new Person("Phil", "Wheeler"));
    }
}


What if our domain object doesn't override equals? We shouldn't require it purely for the purpose of this test; we do not need to implement equals and hashCode on types used as keys in the cache, as Coherence will hash and compare the binary, serialised form of the key in its internal map. We will need to implement some other means of testing equality. One approach is to use the Apache EqualsBuilder class from commons-lang3. If we add the appropriate dependency to our project, we can write the following in our SerialisationTestHelper class:

public void reflectionCheckSerialisation(Object object) {
    Binary binaryObject = ExternalizableHelper.toBinary(object, serialiser);
    Object objectAgain = ExternalizableHelper.fromBinary(binaryObject,
            serialiser);
    Assert.assertTrue(EqualsBuilder.reflectionEquals(object, objectAgain));
}

The reflectionEquals method is a useful convenience for unit tests, but we wouldn't want to use it in production code, as reflection is expensive and we might expect equality checks to be performed very frequently. Whichever approach we use, we must remember that, for the purpose of testing, "correct" means that all serialised fields are written and read with perfect fidelity (order, type, value, completeness). There are other possibilities: it may, for example, be appropriate to provide a Comparator to perform the evaluation.
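A minimal sketch of that Comparator-based variant, added to SerialisationTestHelper (the method name is our own invention):

public <T> void compareCheckSerialisation(T object, Comparator<? super T> comparator) {
    Binary binaryObject = ExternalizableHelper.toBinary(object, serialiser);
    @SuppressWarnings("unchecked")
    T objectAgain = (T) ExternalizableHelper.fromBinary(binaryObject, serialiser);
    // The comparator encodes whatever notion of serialisation fidelity the test requires
    Assert.assertEquals(0, comparator.compare(object, objectAgain));
}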

3.4 Polymorphic Caches

Objective
To consider how we can query a cache containing values of different types, or with values that have properties of different types, by looking at type information, all without deserialising the binary value.
Prerequisites
An understanding of POF concepts.
Code examples
The domain classes and their POF configuration referred to in this section are also used in other sections of this chapter. These can be found in the package org.cohbook.serialisation.domain in the serialisation project. Classes belonging specifically to this section are in the package org.cohbook.serialisation.polymorph.
Dependencies
As well as Oracle Coherence, the examples use JUnit and Littlegrid for performing cluster tests in a single JVM.

3.4.1 Value Objects of Different Types

First, we are going to look at how we can query a cache that contains objects of different types. The example uses classes based on a common interface, though that isn't a requirement for the solution to apply.

We are going to build a cache to store details of players of games. Each player has a name:

public interface Player {
    public static final int POF_FIRSTNAME = 0;
    public static final int POF_LASTNAME = 1;

    String getFirstName();
    String getLastName();
}

We are declaring the POF constants here so that we can use them consistently in implementations of this interface. Now we will introduce two types of player: players of Go in listing 3.2, and of Chess in listing 3.3. Top Go players are ranked numerically from 1st Dan to 9th Dan. There are many Chess rating systems; here we'll demonstrate with the FIDE named ranks - Grandmaster, International Master etc.

The difference between the Go player and the Chess player is in the additional property used to rank players. They have different names and different types; after all, it is not meaningful to compare rankings between players of different games. It would be useful if we could attach the @PortableProperty annotations for the common elements to the interface, but unfortunately the default Coherence POF serialiser won't correctly serialise and deserialise the objects if you do that.

Listing 3.4 shows our first attempt at a Littlegrid test to insert one player of each type1 into a cache, and perform POF-based queries on them:

1 I should point out that Phil is not actually a Chess grandmaster, and I am not really a 9th Dan Go player.


Listing 3.2: GoPlayer.java

public class GoPlayer implements Player {
    public static final int POF_DAN = 11;

    private String firstName;
    private String lastName;
    private int dan;

    public GoPlayer() {
    }

    public GoPlayer(String firstName, String lastName, int dan) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.dan = dan;
    }

    @Override
    @PortableProperty(POF_FIRSTNAME) public String getFirstName() {
        return firstName;
    }

    @Override
    @PortableProperty(POF_LASTNAME) public String getLastName() {
        return lastName;
    }

    @PortableProperty(POF_DAN) public int getDan() {
        return dan;
    }
    // Implement hashCode and equals as usual
}

Listing 3.3: ChessPlayer.java

public class ChessPlayer implements Player {
    public static final int POF_RANK = 11;

    private String firstName;
    private String lastName;
    private String rank;

    public ChessPlayer() {
    }

    public ChessPlayer(String firstName, String lastName, String rank) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.rank = rank;
    }

    @Override
    @PortableProperty(POF_FIRSTNAME) public String getFirstName() {
        return firstName;
    }

    @Override
    @PortableProperty(POF_LASTNAME) public String getLastName() {
        return lastName;
    }

    @PortableProperty(POF_RANK) public String getRank() {
        return rank;
    }
    // Implement hashCode and equals as usual
}


Listing 3.4: A flawed polymorphic filter test

public class PolymorphicFilterTest {
    @Test
    public void test() {
        ClusterMemberGroup memberGroup = null;
        System.setProperty("tangosol.coherence.serializer", "pof");
        try {
            memberGroup = ClusterMemberGroupUtils.newBuilder()
                    .setStorageEnabledCount(2)
                    .buildAndConfigureForStorageDisabledClient();
            final NamedCache cache = CacheFactory.getCache("test");
            cache.put(1, new ChessPlayer("Phil", "Wheeler", "Grandmaster"));
            cache.put(2, new GoPlayer("David", "Whitmarsh", 9));
            Assert.assertEquals(1, cache.keySet(new EqualsFilter(
                    new PofExtractor(String.class, ChessPlayer.POF_FIRSTNAME), "Phil")).size());
            Assert.assertEquals(1, cache.keySet(new EqualsFilter(
                    new PofExtractor(Integer.class, GoPlayer.POF_DAN), 9)).size());
        } finally {
            ClusterMemberGroupUtils.shutdownCacheFactoryThenClusterMemberGroups(memberGroup);
        }
    }
}

So, what happens when we run this test? The first query completes successfully, but the second one fails with the message:

Caused by: Portable(java.io.IOException): unable to convert type -15 to a numeric type

This is because we used the same POF index value for the Go player's Dan rating and the Chess player's rank, and queried for the Dan rating's type: Integer. There are two solutions to this problem:

1. Ensure that every property of every class in a hierarchy has a distinct POF index value, perhaps by centralising their definitions. This is easily manageable for simple, stable class hierarchies, but it introduces undesirable coupling between the classes and, moreover, is not a very interesting solution.

2. Test the type of the class encoded in the POF stream within our filter. Much more interesting; this is what we'll do.

We'll use the POF type-id, as defined in the pof-config.xml, to identify the class. To get at this information, we need to introduce several Coherence APIs:
com.tangosol.util.filter.EntryFilter
allows us to implement a filter that has direct access to the serialised binary form of the cache entry.
com.tangosol.util.BinaryEntry
is the interface implemented by the serialised binary cache entry. We'll meet this guy many times in this book.
com.tangosol.io.pof.reflect.PofValue
encapsulates a single element within a serialised POF stream, providing access to the value, type, and sub-elements.
com.tangosol.io.pof.PofContext
gives us access to the POF serialiser used by Coherence, and is needed by:
com.tangosol.io.pof.reflect.PofValueParser
which allows us to extract a PofValue from a Binary value.

Of these, EntryFilter and BinaryEntry are agnostic to the type of serialisation used, and can be used with serialised forms other than POF.
Putting all of these together, we can construct a class, SimpleTypeIdFilter, as follows:2

public class SimpleTypeIdFilter implements EntryFilter {
    private static final int POF_TYPEID = 0;

    @PortableProperty(POF_TYPEID) private int typeId;

    public SimpleTypeIdFilter() {
    }

    public SimpleTypeIdFilter(int typeId) {
        this.typeId = typeId;
    }

    @Override
    public boolean evaluateEntry(Entry entry) {
        BinaryEntry binEntry = (BinaryEntry) entry;
        PofContext ctx = (PofContext) binEntry.getSerializer();
        PofValue value = PofValueParser.parse(binEntry.getBinaryValue(), ctx);
        int valueType = value.getTypeId();
        return valueType == typeId;
    }

    @Override
    public boolean evaluate(Object obj) {
        throw new UnsupportedOperationException();
    }
    // Implement hashCode and equals as usual
}

2 The need to implement hashCode and equals will become apparent when we look at indexes.


The filter is itself POF serialisable, so we must remember to add it into our pof-config.xml:

<user-type>
    <type-id>1004</type-id>
    <class-name>
        org.cohbook.serialisation.filter.SimpleTypeIdFilter
    </class-name>
</user-type>

Then we can change the GoPlayer search in the unit test as follows:

Filter goFilter = new AndFilter(new SimpleTypeIdFilter(1002),
        new EqualsFilter(new PofExtractor(Integer.class, GoPlayer.POF_DAN), 9));
Assert.assertEquals(1, cache.keySet(goFilter).size());

There's one final enhancement. It really isn't elegant having to provide the POF type-id for the class we are filtering on, but it is possible to look it up from the PofContext:

ConfigurablePofContext ctx =
        (ConfigurablePofContext) cache.getCacheService().getSerializer();
int typeId = ctx.getUserTypeIdentifier(GoPlayer.class);
Filter goFilter = new AndFilter(new SimpleTypeIdFilter(typeId),
        new EqualsFilter(new PofExtractor(Integer.class, GoPlayer.POF_DAN), 9));

3.4.2 Value Objects with Properties of Different Types

What if our cache contains objects all of the same type, but that type has properties which may be one of many run-time types? We'll re-implement the players of games in listing 3.5 as a generic type, and we'll use enumerations for the types of rating:

public enum ChessRating {
    candidate_master, master, grand_master, international_master
}

and

public enum GoRating {
    first_dan, second_dan, third_dan, fourth_dan,
    fifth_dan, sixth_dan, seventh_dan, eighth_dan, ninth_dan
}

All of these types will need to be added to our POF configuration file, person-pof-config.xml, in listing 3.6, which also shows the convenient serialiser that Coherence provides for enums.

As before, the naïve test of listing 3.7 fails, like this:

Caused by: Portable(java.lang.ClassCastException): org.cohbook.serialisation.domain.ChessRating is not assignable to org.cohbook.serialisation.domain.GoRating


Listing 3.5: A generic player class

@Portable
public class GenericPlayer<T> implements Player {
    public static final int POF_RATING = 11;

    private String firstName;
    private String lastName;
    private T rating;

    public GenericPlayer() {
    }

    public GenericPlayer(String firstName, String lastName, T rating) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.rating = rating;
    }

    @Override
    @PortableProperty(POF_FIRSTNAME) public String getFirstName() {
        return firstName;
    }

    @Override
    @PortableProperty(POF_LASTNAME) public String getLastName() {
        return lastName;
    }

    @PortableProperty(POF_RATING) public T getRating() {
        return rating;
    }
    // equals and hashCode
}

Listing 3.6: Generic player POF configuration

<user-type>
    <type-id>1004</type-id>
    <class-name>
        org.cohbook.serialisation.domain.GenericPlayer
    </class-name>
</user-type>
<user-type>
    <type-id>1005</type-id>
    <class-name>
        org.cohbook.serialisation.domain.GoRating
    </class-name>
    <serializer>
        <class-name>
            com.tangosol.io.pof.EnumPofSerializer
        </class-name>
    </serializer>
</user-type>
<user-type>
    <type-id>1006</type-id>
    <class-name>
        org.cohbook.serialisation.domain.ChessRating
    </class-name>
    <serializer>
        <class-name>
            com.tangosol.io.pof.EnumPofSerializer
        </class-name>
    </serializer>
</user-type>


Listing 3.7: Another failing query

@Test
public void test() {
    ClusterMemberGroup memberGroup = null;
    System.setProperty("tangosol.coherence.serializer", "pof");
    try {
        memberGroup = ClusterMemberGroupUtils.newBuilder()
                .setStorageEnabledCount(2)
                .buildAndConfigureForStorageDisabledClient();
        final NamedCache cache = CacheFactory.getCache("test");
        cache.put(1, new GenericPlayer<ChessRating>(
                "Phil", "Wheeler", ChessRating.grand_master));
        cache.put(2, new GenericPlayer<GoRating>(
                "David", "Whitmarsh", GoRating.ninth_dan));
        Assert.assertEquals(1,
                cache.keySet(new EqualsFilter(
                        new PofExtractor(String.class, ChessPlayer.POF_FIRSTNAME), "Phil"))
                        .size());
        ConfigurablePofContext ctx =
                (ConfigurablePofContext) cache.getCacheService().getSerializer();
        Filter valueFilter = new EqualsFilter(
                new PofExtractor(GoRating.class, GenericPlayer.POF_RATING),
                GoRating.ninth_dan);
        Assert.assertEquals(1, cache.keySet(valueFilter).size());
    } finally {
        ClusterMemberGroupUtils
                .shutdownCacheFactoryThenClusterMemberGroups(
                        memberGroup);
    }
}


We need a more capable variant of our SimpleTypeIdFilter to examine the runtime type of a property. Allow me to introduce you to PofTypeIdFilter:
@Portable
public class PofTypeIdFilter implements EntryFilter {
    @PortableProperty(0) private int typeId;
    @PortableProperty(1) private int target;
    @PortableProperty(2) private PofNavigator navigator;

    public PofTypeIdFilter() {
        super();
    }

    public PofTypeIdFilter(
            int typeId,
            int target,
            PofNavigator navigator) {
        this.typeId = typeId;
        this.target = target;
        this.navigator = navigator;
    }

    @Override
    public boolean evaluateEntry(Entry entry) {
        BinaryEntry binEntry = (BinaryEntry) entry;
        PofContext ctx = (PofContext) binEntry.getSerializer();
        com.tangosol.util.Binary binTarget;
        switch (target) {
        case AbstractExtractor.KEY:
            binTarget = binEntry.getBinaryKey();
            break;
        case AbstractExtractor.VALUE:
            binTarget = binEntry.getBinaryValue();
            break;
        default:
            throw new IllegalArgumentException("invalid target");
        }
        PofValue value = PofValueParser.parse(binTarget, ctx);
        if (navigator != null) {
            value = navigator.navigate(value);
        }
        if (value == null) {
            return false;
        }
        int valueType = value.getTypeId();
        return valueType == typeId;
    }

    @Override
    public boolean evaluate(Object obj) {
        throw new UnsupportedOperationException();
    }
}

This class allows us to check the type of a nested property in either the key or value of a POF-encoded BinaryEntry. We construct it with the type-id we wish to compare, the target - whether to extract from the key or the value (here we follow the precedent set by various Coherence classes of using the constants from AbstractExtractor) - and a PofNavigator instance to walk the object graph.


Now we can construct a filter that checks both the type and the value of the rating by combining a PofTypeIdFilter and an EqualsFilter:

ConfigurablePofContext ctx =
        (ConfigurablePofContext) cache.getCacheService().getSerializer();
int typeId = ctx.getUserTypeIdentifier(GoRating.class);
PofNavigator nav = new SimplePofPath(GenericPlayer.POF_RATING);
Filter typeIdFilter = new PofTypeIdFilter(
        typeId, AbstractExtractor.VALUE, nav);
Filter valueFilter = new EqualsFilter(
        new PofExtractor(GoRating.class, nav), GoRating.ninth_dan);
Filter goFilter = new AndFilter(typeIdFilter,
        valueFilter);
Assert.assertEquals(1, cache.keySet(goFilter).size());

We could easily subclass AndFilter to create a single type-and-value-checking TypeEqualsFilter, defined entirely in the constructor:

public class TypeEqualsFilter extends AndFilter {
    public TypeEqualsFilter(int typeId, int target, PofNavigator navigator, Object value) {
        super(
                new PofTypeIdFilter(typeId, target, navigator),
                new EqualsFilter(new PofExtractor(null, navigator, target), value));
    }
}

We need not define serialisation, as the class will by default be serialised and deserialised as an AndFilter. Our simplified test now becomes:

ConfigurablePofContext ctx =
        (ConfigurablePofContext) cache.getCacheService().getSerializer();
int typeId = ctx.getUserTypeIdentifier(GoRating.class);
PofNavigator nav = new SimplePofPath(GenericPlayer.POF_RATING);
Filter goFilter = new TypeEqualsFilter(
        typeId, AbstractExtractor.VALUE, nav, GoRating.ninth_dan);
Assert.assertEquals(1, cache.keySet(goFilter).size());

3.5 Codecs

Objective
To demonstrate the use of Codec with POF annotations, illustrating some simple but useful POF serialisation tricks.

The @PortableProperty annotation permits us to define the class that will be used to serialise a property, overriding the default serialisation behaviour for that type of property. For example, this snippet of code specifies that the failureReason property of a class will be serialised using the SerialiserCodec - of which more below:

@PortableProperty(value = 2, codec = SerialiserCodec.class)
private Exception failureReason;

The encodings outlined here could just as easily be provided in an implementation of PofSerializer or in the readExternal and writeExternal methods of the PortableObject interface, but a Codec implementation provides a convenient means of defining the serialisation for a specific field, rather than for all fields of a given type in the POF configuration file.

3.5.1 Serialiser Codec

Occasionally, you may encounter the need to add a field to a cache object that is not POF serialisable, and where the effort of defining a serialiser for it is not justified - perhaps because it contains a complex, polymorphic object graph. If the field's class implements Serializable, and if the limitations of Java serialisation are acceptable in this context (larger serialised form, no introspection without deserialisation), then we can use a Codec to embed the Java-serialised object within the POF stream. This Codec does just that, serialising the field into a byte array in the stream:

public class SerialiserCodec implements Codec {
    public Object decode(PofReader pofreader, int i) throws IOException {
        byte[] bytes = pofreader.readByteArray(i);
        InputStream bis = new ByteArrayInputStream(bytes);
        ObjectInputStream ois = new ObjectInputStream(bis);
        try {
            return ois.readObject();
        } catch (ClassNotFoundException e) {
            throw new IOException(e);
        }
    }

    public void encode(PofWriter pofwriter, int i, Object obj)
            throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutput out = new ObjectOutputStream(bos);
        out.writeObject(obj);
        out.flush();
        pofwriter.writeByteArray(i, bos.toByteArray());
    }
}

3.5.2  Serialise Enum by Ordinal

In section 3.4: Polymorphic Caches we saw how we can define serialisation
for an Enum in the POF configuration:

<user-type>
    <type-id>1005</type-id>
    <class-name>
        org.cohbook.serialisation.domain.GoRating
    </class-name>
    <serializer>
        <class-name>
            com.tangosol.io.pof.EnumPofSerializer
        </class-name>
    </serializer>
</user-type>

This works well enough, though if you look at the POF stream in detail
you will find that it embeds the name of the serialised instance as a string.
For small objects with few distinct enum values and long, descriptive names
this can be quite inefficient. We can write an alternative serialiser, or for
this example, a Codec that will place the ordinal value of the enum in the
stream. As these are integers, typically with small values, the serialised form
is particularly concise. Unfortunately, to deserialise we need to know the
specific enum type, so we must write an abstract codec base class and subclass
the codec for each enum type to be serialised.
public abstract class AbstractEnumOrdinalCodec implements Codec {
    private final Class<?> enumType;
    protected AbstractEnumOrdinalCodec(Class<?> enumType) {
        this.enumType = enumType;
    }
    public Object decode(PofReader pofreader, int i) throws IOException {
        int ordinal = pofreader.readInt(i);
        for (Object e : enumType.getEnumConstants()) {
            if (ordinal == ((Enum<?>) e).ordinal()) {
                return e;
            }
        }
        throw new IllegalArgumentException(
            "invalid ordinal " + ordinal + " for type " + enumType.getName());
    }
    public void encode(PofWriter pofwriter, int i, Object obj)
            throws IOException {
        pofwriter.writeInt(i, ((Enum<?>) obj).ordinal());
    }
}

Given an enum:
public enum Status { success, failure }

we create the codec as:


public class StatusCodec extends AbstractEnumOrdinalCodec {
    public StatusCodec() {
        super(Status.class);
    }
}

and use it thus:

@PortableProperty(value = 1, codec = StatusCodec.class)
private Status status;

Extra care is needed if your serialised objects are persisted across code
changes: adding or removing names in the enum can change the name-to-ordinal
mapping.
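One way to guard against this is a unit test that pins the expected ordinals, so
that an accidental reordering of the constants fails the build rather than silently
corrupting persisted data. A minimal sketch for the Status enum above (the test
method name is just illustrative):

@Test
public void testStatusOrdinalsAreStable() {
    // If any of these assertions fails, streams written with the old ordinal
    // mapping would deserialise to the wrong constant.
    Assert.assertEquals(0, Status.success.ordinal());
    Assert.assertEquals(1, Status.failure.ordinal());
    Assert.assertEquals(2, Status.values().length);
}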

3.5.3  Enforcing Collection Type

We have a property in our class of declared type Map<String,String>, but we
choose to instantiate it with a runtime type of, say, LinkedHashMap. We can, by
implementing a simple codec, ensure both that the map is serialised most efficiently
as a uniform collection and that it is instantiated as a LinkedHashMap after
deserialisation:
public class LinkedHashMapCodec implements Codec {
    public Object decode(PofReader pofreader, int i) throws IOException {
        return pofreader.readMap(i, new LinkedHashMap<>());
    }
    public void encode(PofWriter pofwriter, int i, Object obj)
            throws IOException {
        pofwriter.writeMap(i, (Map) obj, String.class, String.class);
    }
}

As before, we simply specify the codec to use in the property annotation:


@PortableProperty(value = 3, codec = LinkedHashMapCodec.class)
private Map<String, String> params;
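A quick round-trip test confirms both properties: the deserialised map is a
LinkedHashMap and the insertion order survives. This is only a sketch; Settings is
a hypothetical holder class for the annotated params field above, and
settings-pof-config.xml a hypothetical POF configuration that registers it:

@Test
public void testParamsRoundTrip() {
    Serializer serialiser = new ConfigurablePofContext(
        "org/cohbook/serialisation/settings-pof-config.xml");
    Map<String, String> params = new LinkedHashMap<>();
    params.put("zebra", "1");
    params.put("apple", "2");
    Settings original = new Settings(params);
    Binary binary = ExternalizableHelper.toBinary(original, serialiser);
    Settings copy = (Settings) ExternalizableHelper.fromBinary(binary, serialiser);
    // the codec reinstates the declared runtime type...
    Assert.assertTrue(copy.getParams() instanceof LinkedHashMap);
    // ...and the keys come back in the order they were inserted
    Assert.assertEquals(
        new ArrayList<String>(params.keySet()),
        new ArrayList<String>(copy.getParams().keySet()));
}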

3.6  Testing Evolvable

Objective
    To explore the several distinct use cases for Evolvable and the problems
    with testing that these introduce, and to propose some solutions.
Prerequisites
    This section builds on concepts introduced in section 3.3: Testing POF
    Serialisation. An understanding of POF and Evolvable is needed.
Code examples
    The examples in this section can be found in the downloadable code
    in the projects evolvableCurrent and evolvableTest. A copy of an earlier
    version of evolvableCurrent is in project evolvableLegacy for ease of
    reference.


Dependencies
    As well as Oracle Coherence, the examples use JUnit and the Littlegrid
    library.
To avoid confusion, the terms forward compatible and backward compatible
are interpreted in this section as follows:

Backward compatible: a later version of a class can be successfully instantiated from the serialised form of an earlier version of that class.

Forward compatible: an older version of a class can be successfully instantiated from the serialised form of a later version of that class.

With Evolvable correctly implemented, we can translate data between different
release versions of a data model without loss. "Correctly implemented" is a
caveat that warrants careful consideration; there are several mistakes that
could be made that would break Evolvable. Many projects that start using an
evolvable data model suffer from failures during upgrade and give up on it,
simply because the migration of objects between versions has not been
adequately tested. So, if you are considering using Evolvable, it is imperative
that you consider your testing strategy. Before we can think about testing a
solution, we must understand the problem. What are the use cases for Evolvable?
Support rolling restarts across releases
    As nodes are stopped and restarted with the new data model, partitions
    will be transferred back and forth between nodes. This will at times
    include transfers from new to old nodes as well as old to new, so we
    will need to verify backward and forward compatibility. There are many
    conditions under which a rolling restart and upgrade is not possible:
    changes in cache configuration, in operational configuration, and some
    version upgrades of Coherence itself. You will always need to have a
    procedure for restarting the cluster from cold, so do you really need
    to support rolling restarts, and are you prepared to perform the
    necessary testing to ensure that you can? In practice, very few projects
    actually use this capability, and even fewer do so across releases.
Release extend clients independently of the cluster
    You may be working in a landscape where several separately managed
    projects connect to a central cluster via extend and where it would
    be impractical to synchronise releases. Are these clients read-only, or
    read-write? If read-only, and if you can always assure that the cluster
    data model is updated before the clients, then you will need only to
    test forward compatibility. You will need to test for all data model
    versions extant in your wider architecture.

Backward compatibility with persisted serialised data
    Storing POF serialised data in a database or other store has obvious
    benefits. It is fast and efficient, and simple to manage. Evolvable is
    then an obvious mechanism for managing data model upgrades. In this
    case, only backward compatibility need be supported, and we must test
    all extant versions of serialised objects with the new model for each
    release.

Whatever the rationale for considering Evolvable, you must factor into your
development plan the effort involved in testing cross-version compatibility,
and the strategy you will use to perform that testing. We'll now consider
some testing techniques.

3.6.1  Testing with binary serialised data

We can capture and store the binary serialised form of a test object. As the
test classes evolve over newer versions, the preserved binary artefacts from
older versions will allow us to validate instantiation of the new version from
the older binary forms. We can use this technique to prove backward
compatibility, but not forward compatibility, so it is appropriate for the
persisted serialised data use case.
We start with a simple evolvable domain object Fruity in listing 3.8 and its
POF configuration in fruity-pof-config.xml, listing 3.9. Next, we create our
unit test, listing 3.10, to verify that it serialises correctly, as in section 3.3:
Testing POF Serialisation of this chapter.


Listing 3.8: A simple evolvable domain object


@Portable
public class Fruity extends AbstractEvolvable {
    public static final int POF_NAME = 0;
    public static final int POF_FRUIT = 1;
    @PortableProperty(POF_NAME) private String name;
    @PortableProperty(POF_FRUIT) private String favouriteFruit;
    public Fruity() {
    }
    public Fruity(String name, String favouriteFruit) {
        this.name = name;
        this.favouriteFruit = favouriteFruit;
    }
    @Override
    public int getImplVersion() {
        return 1;
    }
    public String getName() {
        return name;
    }
    public String getFavouriteFruit() {
        return favouriteFruit;
    }
    // Generate equals() and hashCode() as usual
}

Listing 3.9: POF config for the simple evolvable object


<pof-config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://xmlns.oracle.com/coherence/coherence-pof-config"
    xsi:schemaLocation=
        "http://xmlns.oracle.com/coherence/coherence-pof-config
        coherence-pof-config.xsd">
    <user-type-list>
        <user-type>
            <type-id>2001</type-id>
            <class-name>
                org.cohbook.serialisation.evolvable.Fruity
            </class-name>
        </user-type>
    </user-type-list>
</pof-config>

Listing 3.10: Verify serialisation of the domain object

public class TestFruitySerialisation {
    private SerialisationTestHelper serialisationTestHelper;
    public TestFruitySerialisation() {
        serialisationTestHelper = new SerialisationTestHelper(
            "org/cohbook/serialisation/"
            + "evolvable/legacy/fruity-pof-config.xml");
    }
    @Test
    public void testFruitySerialisation() throws IOException {
        Object object = new Fruity("Mark", "Grapes");
        serialisationTestHelper.equalsCheckSerialisation(object);
    }
}

When we want to implement a change to our domain object, Fruity, we first
need to capture an example of its serialised form. Listing 3.11 adds another
method to the SerialisationTestHelper base class to give us the ability to
do this. Our domain object unit test can then compare an object to one
constructed from the serialised binary data, listing 3.12.
After running this once, we have a file called mark-grapes.bin with the
serialised form of this version of the code. For strict accuracy, you should
conduct this test based on precisely the code base currently in your production
system, including the dependent version of Coherence, in case of any change in
the way serialisation is performed. Only the steps to capture the serialised
form should be added, and that might be better done through run-time
configuration than code change, as in the sketch below.
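For example, the capture could be guarded by a system property, so the same test
can be run unchanged against the production code base and only writes the binary
file when explicitly asked to. This is a variant of the test in listing 3.12, and
the property name is purely illustrative:

@Test
public void testFruitySerialisation() throws IOException {
    Object object = new Fruity("Mark", "Grapes");
    serialisationTestHelper.equalsCheckSerialisation(object);
    // Capture the binary form only when run with -Dcohbook.capture.binary=true
    if (Boolean.getBoolean("cohbook.capture.binary")) {
        serialisationTestHelper.saveBinary(object, "mark-grapes.bin");
    }
}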
The captured binary file should then be added to your development branch
as a classpath resource. Now that we have captured the old form, we can
modify our domain object for the next version by adding an extra field:
listing 3.13.

Listing 3.11: Saving the binary serialised form


public void saveBinary(Object object, String fileName) throws IOException {
    Binary binaryObject = ExternalizableHelper.toBinary(object, serialiser);
    byte[] bytes = binaryObject.toByteArray();
    try (OutputStream os = new FileOutputStream(fileName)) {
        os.write(bytes);
    }
}


Listing 3.12: Checking the binary serialised form


@Test
public void testFruitySerialisation() throws IOException {
    Object object = new Fruity("Mark", "Grapes");
    serialisationTestHelper.equalsCheckSerialisation(object);
    serialisationTestHelper.saveBinary(object, "mark-grapes.bin");
}

Listing 3.13: Evolvable domain object version 2


@Portable
public class Fruity extends AbstractEvolvable {
    public static final int POF_NAME = 0;
    public static final int POF_FRUIT = 1;
    public static final int POF_CHEESE = 2;
    @PortableProperty(POF_NAME) private String name;
    @PortableProperty(POF_FRUIT) private String favouriteFruit;
    @PortableProperty(POF_CHEESE) private String favouriteCheese;
    public Fruity() {
    }
    public Fruity(String name, String favouriteFruit) {
        this(name, favouriteFruit, null);
    }
    public Fruity(String name, String favouriteFruit, String favouriteCheese) {
        this.name = name;
        this.favouriteFruit = favouriteFruit;
        this.favouriteCheese = favouriteCheese;
    }
    @Override
    public int getImplVersion() {
        return 2;
    }
    public String getName() {
        return name;
    }
    public String getFavouriteFruit() {
        return favouriteFruit;
    }
    public String getFavouriteCheese() {
        return favouriteCheese;
    }
    // Generate equals() and hashCode() as usual
}


A few key points to note here:

- The existing POF constants are unchanged, and the new one has a higher
  numeric value: these are requirements of Evolvable.
- We've increased the value returned by getImplVersion.
- We've kept the original constructor, mostly as a convenience to ensure
  that we can use the existing unit tests; we'll create a new test to
  validate the extra property.

For brevity, we've omitted the content of the generated hashCode and equals
methods: it is essential to regenerate these after adding the new field. We
can add another test case to TestFruitySerialisation to check that a fully
populated object still serialises correctly.
@Test
public void testCheesySerialisation() throws IOException {
    Object object = new Fruity("Elizabeth", "Banana", "Wensleydale");
    serialisationTestHelper.equalsCheckSerialisation(object);
}

We now need to verify that the serialised form of the captured binary object
deserialises correctly into the new form. Time for another convenience method
in our SerialisationTestHelper class; this will read the serialised data,
deserialise it and compare to an exemplar:

public void equalsSavedBinary(Object object, String fileName) throws IOException {
    Binary binaryObject = new Binary(getByteArrayFromFile(fileName));
    Object objectAgain = ExternalizableHelper.fromBinary(binaryObject, serialiser);
    Assert.assertEquals(object, objectAgain);
}
private byte[] getByteArrayFromFile(String fileName) throws IOException {
    try (InputStream input = new FileInputStream(fileName);
            ByteArrayOutputStream output = new ByteArrayOutputStream()) {
        byte[] buffer = new byte[1024];
        int n = 0;
        while (-1 != (n = input.read(buffer))) {
            output.write(buffer, 0, n);
        }
        return output.toByteArray();
    }
}

and now we can add another check to our unit test:


@Test
public void testFruityEvolvable() throws IOException {
    Object object = new Fruity("Mark", "Grapes");
    serialisationTestHelper.equalsSavedBinary(object, "mark-grapes.bin");
}

This last test is sensitive to how the new field is initialised, how the equals
comparison is made, and how serialisation is implemented. For example, if
we want new objects to default the value of favouriteCheese to "Cheddar", we
might set the default in the constructors:

public Fruity() {
    this.favouriteCheese = "Cheddar";
}
public Fruity(String name, String favouriteFruit) {
    this(name, favouriteFruit, "Cheddar");
}

then the test, as we have written it here, will fail. This is because the
annotation-based POF serialiser will explicitly set the favouriteCheese
property to null if it is not found in the stream. There are a number of
possible solutions to this; the simplest is to apply the annotation to the
accessor rather than the member field, and set the default value in the setter:

@PortableProperty(POF_CHEESE)
public String getFavouriteCheese() {
    return favouriteCheese;
}
public void setFavouriteCheese(String favouriteCheese) {
    this.favouriteCheese =
        favouriteCheese == null ? "Cheddar" : favouriteCheese;
}

For more complex transformations between versions, it may become necessary to provide our own serialisation instead of using annotations, either by
implementing PortableObject or by providing a PortableObjectSerializer.
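As a minimal sketch of that route, assuming the Fruity fields and POF indexes
above (this is not taken from the downloadable example code), a hand-written
PortableObject implementation could consult the data version recorded in the
stream and supply whatever default is appropriate:

public void readExternal(PofReader in) throws IOException {
    name = in.readString(POF_NAME);
    favouriteFruit = in.readString(POF_FRUIT);
    // getVersionId() returns the implementation version recorded in the stream,
    // so values written before POF_CHEESE existed get an explicit default.
    favouriteCheese = in.getVersionId() >= 2
        ? in.readString(POF_CHEESE)
        : "Cheddar";
}
public void writeExternal(PofWriter out) throws IOException {
    out.writeString(POF_NAME, name);
    out.writeString(POF_FRUIT, favouriteFruit);
    out.writeString(POF_CHEESE, favouriteCheese);
}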

3.6.2  Testing with Duplicated Classes

Saving binary data and instantiating from it is fine for testing backward
compatibility, but is of no help if your use-case also needs forward
compatibility. It also requires a certain amount of prescience in working out
which test cases will be needed and capturing the binaries before updating the
data model. We'll now explore an alternative approach.
We'll start with the same Fruity example class as above, but before adding
favouriteCheese we'll copy the domain classes and POF configuration file into
a new legacy package. In the downloadable example, which uses maven to build,
we follow maven conventions and place this new package under src/test rather
than src/main as it will only be used for testing. So, we copy the class
Fruity and the POF configuration fruity-pof-config.xml from the original
package org.cohbook.serialisation.evolvable to the simulated legacy package
org.cohbook.serialisation.evolvable.legacy, also changing the package names
of the classes referenced in the copied POF configuration file to point to the
legacy package.
<user-type-list>
    <user-type>
        <type-id>2001</type-id>
        <class-name>
            org.cohbook.serialisation.evolvable.legacy.Fruity
        </class-name>
    </user-type>
</user-type-list>

I know this is a little cumbersome and contrived; we'll work towards a
solution that is more elegant from the code maintenance point of view (but
less so for the build process) in the next section.
We now make our changes, adding favouriteCheese to Fruity as before. Then
we can instantiate separate POF contexts for the new and legacy versions:

Serializer legacyPofContext = new ConfigurablePofContext(
    "org/cohbook/serialisation/evolvable/legacy/fruity-pof-config.xml");
Serializer currentPofContext = new ConfigurablePofContext(
    "org/cohbook/serialisation/evolvable/fruity-pof-config.xml");

For convenience, let's place these in a helper class, as before:

public class Serialisation2WayTestHelper {
    private final Serializer currentSerialiser;
    private final Serializer legacySerialiser;
    public Serialisation2WayTestHelper(
            String legacyConfigFileName, String currentConfigFileName) {
        legacySerialiser = new ConfigurablePofContext(legacyConfigFileName);
        currentSerialiser = new ConfigurablePofContext(currentConfigFileName);
    }
}

It now becomes a simple task to add methods to convert between current
and legacy forms of the same class:

public Object currentToLegacy(Object currentObject) {
    Binary binaryObject = ExternalizableHelper.toBinary(currentObject, currentSerialiser);
    return ExternalizableHelper.fromBinary(binaryObject, legacySerialiser);
}
public Object legacyToCurrent(Object legacyObject) {
    Binary binaryObject = ExternalizableHelper.toBinary(legacyObject, legacySerialiser);
    return ExternalizableHelper.fromBinary(binaryObject, currentSerialiser);
}


Then we can easily check that an object constructed with the legacy class
and converted to the current class is equal to the directly constructed new
class:

public void legacyToCurrentCheck(Object legacyObject, Object currentObject) {
    Assert.assertEquals(currentObject, legacyToCurrent(legacyObject));
}

and vice-versa:

public void currentToLegacyCheck(Object legacyObject, Object currentObject) {
    Assert.assertEquals(legacyObject, currentToLegacy(currentObject));
}

And that each form survives round-trip conversion to the other:


public void roundTripCheck(Object legacyObject, Object currentObject) {
    Assert.assertEquals(currentObject, legacyToCurrent(currentToLegacy(currentObject)));
    Assert.assertEquals(legacyObject, currentToLegacy(legacyToCurrent(legacyObject)));
}

So we can write a complete test of the evolvable class that will verify all of
these conversions like this:
public class TestFruitySerialisation2Way {
    private Serialisation2WayTestHelper serialisation2WayTestHelper;
    public TestFruitySerialisation2Way() {
        serialisation2WayTestHelper = new Serialisation2WayTestHelper(
            "org/cohbook/serialisation/evolvable/" +
                "legacy/fruity-pof-config.xml",
            "org/cohbook/serialisation/evolvable/" +
                "fruity-pof-config.xml");
    }
    @Test
    public void testFruitySerialisation() throws IOException {
        Object legacyObject = new org.cohbook.serialisation.evolvable.legacy.Fruity(
            "Mark", "Grapes");
        Object currentObject = new org.cohbook.serialisation.evolvable.Fruity(
            "Mark", "Grapes");
        serialisation2WayTestHelper.roundTripCheck(legacyObject, currentObject);
        serialisation2WayTestHelper.currentToLegacyCheck(legacyObject, currentObject);
        serialisation2WayTestHelper.legacyToCurrentCheck(legacyObject, currentObject);
    }
    @Test
    public void testCheesySerialisation() throws IOException {
        Object legacyObject = new org.cohbook.serialisation.evolvable.legacy.Fruity(
            "Elizabeth", "Banana");
        Object currentObject = new org.cohbook.serialisation.evolvable.Fruity(
            "Elizabeth", "Banana", "Wensleydale");
        serialisation2WayTestHelper.roundTripCheck(legacyObject, currentObject);
        serialisation2WayTestHelper.currentToLegacyCheck(legacyObject, currentObject);
    }
}


In the second example we are testing the behaviour when setting a property
that is not present in the legacy form. The legacy form cannot, therefore, be
correctly converted to the new form, so we have omitted that check.
With this approach we have a solution that can be adapted to test conversions
of different versions of classes; by using a suitable package naming convention
we could even deal with more than two versions at the same time.
The weaknesses of this approach are:

- it relies on a potentially error-prone process of copying and modifying
  domain classes to new packages
- the test may be incomplete or inaccurate if dependent classes, not
  least Coherence itself, differ between the current and legacy
  implementations actually found in the deployed environment.

To address these shortcomings, we'll have to look at how to instantiate the
two POF contexts in different classloaders.

3.6.3  Classloader-based Testing

We will use a ClassLoader for each version so that we can serialise an object
using one version of a project, and deserialise with another. Testing across
different versions of a project is problematic as build systems in Java are
not usually able to satisfy dependencies on more than one version of the
same project. In our example code we use maven, and have worked around
this limitation by placing our test code in a separate project that has no
configured dependency on either our example data model or Coherence itself.
The code itself is not dependent on use of maven (except that we use a path
name typical of a local maven repository), so the general approach should be
adaptable to other build systems. In the downloadable example code, the old
and new versions of the code below are versions 1.0.0 and 2.0.0 of the
evolvableCurrent project. Version 1.0.0 is duplicated in project
evolvableLegacy for convenience of reference, though this isn't used in the
test. The test code itself is in project evolvableTest. Version 1.0.0 of the
built binaries is in directory historical-artefacts of the examples.
The domain object we'll be testing with is the Fruity example above, version
2.0.0 with the added favouriteCheese. In addition to the domain classes, the
evolvableCurrent project includes a test support class, listing 3.14, used to
simplify the implementation of the test itself.


Listing 3.14: Convert between object and byte array


public class SerialiserTestSupport {
    private Serializer serialiser;
    public SerialiserTestSupport(final String pofConfigName) {
        serialiser = new ConfigurablePofContext(pofConfigName);
    }
    public byte[] serialise(Object object)
            throws IllegalAccessException, IllegalArgumentException,
            InvocationTargetException {
        return ExternalizableHelper.toBinary(object, serialiser).toByteArray();
    }
    public Object deserialise(byte[] bytearray)
            throws IllegalAccessException, IllegalArgumentException,
            InvocationTargetException, InstantiationException {
        return ExternalizableHelper.fromBinary(new Binary(bytearray), serialiser);
    }
}

Listing 3.15: Instantiate a domain object by reflection


public Object createBean(final String className, final Map<String, Object> properties)
        throws ClassNotFoundException, InstantiationException, IllegalAccessException,
        NoSuchMethodException, SecurityException, IllegalArgumentException,
        InvocationTargetException {
    Class<?> beanClass = this.getClass().getClassLoader().loadClass(className);
    Object bean = beanClass.newInstance();
    for (Map.Entry<String, Object> propEntry : properties.entrySet()) {
        Method setter = beanClass.getMethod(
            propEntry.getKey(), propEntry.getValue().getClass());
        setter.invoke(bean, propEntry.getValue());
    }
    return bean;
}

We follow maven conventions, adding this class under src/test/java and
constructing a tests jar separate from the main jar.
The class encapsulates a Serializer, in this case specifically a PofContext,
and provides simple methods for converting an object to and from a byte array.
We'll be accessing the class by reflection through a ClassLoader. We also need
a method for creating instances of our domain object. Listing 3.15 shows a
very simplistic example using reflection.
This gives us the tools to create, serialise and deserialise objects; now we
need to wrap this up in a ClassLoader that allows these methods to work with a
particular version of our domain objects, and of Coherence. In our evolvableTest
project we'll create a class, DelegatingClassLoaderSerialiserTestSupport,
beginning with listing 3.16, that can delegate these operations across a
ClassLoader boundary.

Listing 3.16: Creating reflection methods

public class DelegatingClassLoaderSerialiserTestSupport
        implements ClassLoaderSerialiserTestSupport {
    private final Object delegate;
    private final ClassLoader classLoader;
    private final Method createBean;
    private final Method serialise;
    private final Method deserialise;
    private static final String DELEGATECLASSNAME =
        "org.cohbook.evolvable.SerialiserTestSupport";

    public DelegatingClassLoaderSerialiserTestSupport(String[] jarPaths, String pofConfigName)
            throws MalformedURLException, ClassNotFoundException, NoSuchMethodException,
            SecurityException, InstantiationException, IllegalAccessException,
            IllegalArgumentException, InvocationTargetException {
        super();
        URL[] jarUrls = new URL[jarPaths.length];
        int i = 0;
        for (String jarPath : jarPaths) {
            jarUrls[i++] = new File(jarPath).toURI().toURL();
        }
        classLoader = new URLClassLoader(jarUrls, Thread.currentThread()
            .getContextClassLoader());
        Class<?> delegateClass = classLoader.loadClass(DELEGATECLASSNAME);
        Constructor<?> constructor = delegateClass.getConstructor(String.class);
        delegate = constructor.newInstance(pofConfigName);
        createBean = delegateClass.getMethod("createBean", String.class, Map.class);
        serialise = delegateClass.getMethod("serialise", Object.class);
        deserialise = delegateClass.getMethod("deserialise", byte[].class);
    }
}

This creates a ClassLoader using the jar file paths we provide, then
instantiates an instance of our SerialiserTestSupport class within that
ClassLoader, and finally extracts Method objects for the various operations we
wish to perform. We will now also need methods on this class that delegate to
our SerialiserTestSupport instance, as in listing 3.17.
We now have the tools to access and use a PofContext in one version of
our code. We instantiate DelegatingClassLoaderSerialiserTestSupport twice, to
delegate to one SerialiserTestSupport instance for each version of our project.
We'll likely need to do this many times, so we'll follow our earlier pattern
and place this in a helper class, listing 3.18.
This example assumes it will find the build artefacts in a maven repository
structure. We are explicitly loading the domain objects jar, the corresponding
tests jar (to get our SerialiserTestSupport class) and the version of Coherence
used by that version.


Listing 3.17: Delegating to the reflected methods


public Object createBeanInClassLoader(String className, Map<String, Object> properties)
        throws InstantiationException, IllegalAccessException, ClassNotFoundException,
        NoSuchMethodException, SecurityException, IllegalArgumentException,
        InvocationTargetException {
    return createBean.invoke(delegate, className, properties);
}
public byte[] serialise(Object object)
        throws IllegalAccessException, IllegalArgumentException,
        InvocationTargetException {
    return (byte[]) serialise.invoke(delegate, object);
}
public Object deserialise(byte[] bytearray)
        throws IllegalAccessException, IllegalArgumentException,
        InvocationTargetException, InstantiationException {
    return deserialise.invoke(delegate, bytearray);
}

Listing 3.18: Plugging in the before and after jars


public class Serialisation2WayTestHelper {
    private final ClassLoaderSerialiserTestSupport currentPofTestSupport;
    private final ClassLoaderSerialiserTestSupport legacyPofTestSupport;
    private final String JAR2TEST =
        "/org/cohbook/evolvableCurrent/2.0.0/" +
        "evolvableCurrent-2.0.0-tests.jar";
    private final String JAR2 =
        "/org/cohbook/evolvableCurrent/2.0.0/" +
        "evolvableCurrent-2.0.0.jar";
    private final String JAR1TEST =
        "/org/cohbook/evolvableCurrent/1.0.0/" +
        "evolvableCurrent-1.0.0-tests.jar";
    private final String JAR1 =
        "/org/cohbook/evolvableCurrent/1.0.0/" +
        "evolvableCurrent-1.0.0.jar";
    private final String COHJAR =
        "/com/oracle/coherence/12.1.3.0/" +
        "coherence-12.1.3.0.jar";

    public Serialisation2WayTestHelper(String configFileName)
            throws Exception {
        String rootDirectory = System.getProperty("jarRootDir");
        if (rootDirectory == null) {
            rootDirectory = System.getProperty("user.home") + "/.m2/repository";
        }
        currentPofTestSupport = new DelegatingClassLoaderSerialiserTestSupport(
            new String[] { rootDirectory + JAR2TEST,
                           rootDirectory + JAR2,
                           rootDirectory + COHJAR },
            configFileName);
        legacyPofTestSupport = new DelegatingClassLoaderSerialiserTestSupport(
            new String[] { rootDirectory + JAR1TEST,
                           rootDirectory + JAR1,
                           rootDirectory + COHJAR },
            configFileName);
    }


Listing 3.19: Test support utility methods adapted for the classloader model
public Object currentToLegacy(Object currentObject)
        throws IllegalAccessException, IllegalArgumentException,
        InvocationTargetException, InstantiationException {
    byte[] binaryObject = currentPofTestSupport.serialise(currentObject);
    return legacyPofTestSupport.deserialise(binaryObject);
}
public Object legacyToCurrent(Object legacyObject)
        throws IllegalAccessException, IllegalArgumentException,
        InvocationTargetException, InstantiationException {
    byte[] binaryObject = legacyPofTestSupport.serialise(legacyObject);
    return currentPofTestSupport.deserialise(binaryObject);
}
public void currentToLegacyCheck(Object legacyObject, Object currentObject)
        throws IllegalAccessException, IllegalArgumentException,
        InvocationTargetException, InstantiationException {
    Assert.assertEquals(legacyObject, currentToLegacy(currentObject));
}
public void legacyToCurrentCheck(Object legacyObject, Object currentObject)
        throws IllegalAccessException, IllegalArgumentException,
        InvocationTargetException, InstantiationException {
    Assert.assertEquals(currentObject, legacyToCurrent(legacyObject));
}
public void roundTripCheck(Object legacyObject, Object currentObject)
        throws IllegalAccessException, IllegalArgumentException,
        InvocationTargetException, InstantiationException {
    Assert.assertEquals(currentObject, legacyToCurrent(currentToLegacy(currentObject)));
    Assert.assertEquals(legacyObject, currentToLegacy(legacyToCurrent(legacyObject)));
}

For build systems other than maven, you'll need to modify this accordingly.
We now have two instances of DelegatingClassLoaderSerialiserTestSupport,
each working with a different version of our code. In listing 3.19 we adapt
the utility methods of Serialisation2WayTestHelper from the class-copying
technique to the classloader model. The principal difference is that, as
Binary is a Coherence class, it is on the child side of our ClassLoader
context and so cannot be used to transfer serialised forms between the
versions; we therefore define the interfaces using byte[].
Our domain object test class in listing 3.20 is very similar to the one we
defined earlier using copied classes.


Listing 3.20: Two way serialisation test


public class TestFruitySerialisation2Way {
    private Serialisation2WayTestHelper serialisation2WayTestHelper;
    public TestFruitySerialisation2Way() throws MalformedURLException,
            ClassNotFoundException, NoSuchMethodException, SecurityException,
            InstantiationException, IllegalAccessException, IllegalArgumentException,
            InvocationTargetException {
        serialisation2WayTestHelper =
            new Serialisation2WayTestHelper("fruity-pof-config.xml");
    }
    @Test
    public void testFruitySerialisation()
            throws IOException, InstantiationException, IllegalAccessException,
            ClassNotFoundException, NoSuchMethodException, SecurityException,
            IllegalArgumentException, InvocationTargetException {
        Map<String, Object> props = new HashMap<String, Object>();
        props.put("setName", "Mark");
        props.put("setFavouriteFruit", "Grapes");
        Object currentObject = serialisation2WayTestHelper.getCurrentPofTestSupport().
            createBeanInClassLoader("org.cohbook.evolvable.Fruity", props);
        Object legacyObject = serialisation2WayTestHelper.getLegacyPofTestSupport().
            createBeanInClassLoader("org.cohbook.evolvable.Fruity", props);
        serialisation2WayTestHelper.roundTripCheck(legacyObject, currentObject);
        serialisation2WayTestHelper.currentToLegacyCheck(legacyObject, currentObject);
        serialisation2WayTestHelper.legacyToCurrentCheck(legacyObject, currentObject);
    }
    @Test
    public void testCheesySerialisation()
            throws IOException, InstantiationException, IllegalAccessException,
            ClassNotFoundException, NoSuchMethodException, SecurityException,
            IllegalArgumentException, InvocationTargetException {
        Map<String, Object> props = new HashMap<String, Object>();
        props.put("setName", "Elizabeth");
        props.put("setFavouriteFruit", "Banana");
        Object legacyObject = serialisation2WayTestHelper.getLegacyPofTestSupport().
            createBeanInClassLoader("org.cohbook.evolvable.Fruity", props);
        props.put("setFavouriteCheese", "Wensleydale");
        Object currentObject = serialisation2WayTestHelper.getCurrentPofTestSupport().
            createBeanInClassLoader("org.cohbook.evolvable.Fruity", props);
        serialisation2WayTestHelper.roundTripCheck(legacyObject, currentObject);
        serialisation2WayTestHelper.currentToLegacyCheck(legacyObject, currentObject);
    }
}


Our earlier technique of copying classes into a new package suffered from the
risk of error in manually copying the domain classes and POF configuration
into new packages. This technique removes that problem: we are testing
against the actual classes that were released for the earlier version. The
trade-off is a somewhat cumbersome adaptation to the build system. In
particular, when making changes to the current version, a full build and
install of the generated jars is needed before the test project can be executed.
Incorporating the current version into the parent classloader, and hence
avoiding this last problem, is possible, but beyond the scope of this book.
We would first have to implement a child-first classloader, because the default
behaviour of URLClassLoader is parent first; a sketch of the idea follows.
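The core of such a loader is a loadClass override that tries its own URLs before
delegating to the parent. This is only an outline of the idea, not part of the
downloadable example code, and omits the further work needed to integrate it with
the test helpers above:

public class ChildFirstClassLoader extends URLClassLoader {
    public ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }
    @Override
    protected synchronized Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        Class<?> c = findLoadedClass(name);
        if (c == null) {
            try {
                c = findClass(name);              // look in our own URLs first
            } catch (ClassNotFoundException e) {
                c = super.loadClass(name, false); // fall back to the parent
            }
        }
        if (resolve) {
            resolveClass(c);
        }
        return c;
    }
}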

3.7  Implementing Custom Serialisation

Objective
To understand how to implement support for another serialisation format in Coherence, using Google Protocol Buffers as an example.
Prerequisites
A familiarity with POF is useful to provide context. Familiarity with
Google Protocol Buffers is not needed.
Code examples
The classes and resources used in this section may be found in the
org.cohbook.serialisation.protobuf package in the serialisation project.
Dependencies
As well as Oracle Coherence, the examples use JUnit and, of course,
Google Protocol Buffers.

3.7.1  Setup

The example project uses maven. Create a project with the standard plugins
and dependencies described earlier; there is a maven plugin available for
generating the protocol buffer classes, and it can be added to your pom.xml
as shown in listing 3.21.
If not using maven, then follow the instructions at https://developers.
google.com/protocol-buffers/docs/overview for downloading and using
protocol buffers.


Listing 3.21: Using the protobuf-maven-plugin


<plugin>
    <groupId>com.github.igor-petruk.protobuf</groupId>
    <artifactId>protobuf-maven-plugin</artifactId>
    <version>0.6.2</version>
    <executions>
        <execution>
            <goals>
                <goal>run</goal>
            </goals>
        </execution>
    </executions>
</plugin>

Out of the box, Coherence gives you the choice of POF or native Java
serialisation. There are many reasons to prefer POF that are adequately
described in the Oracle documentation and other texts, but you may have good
reason to consider other serialisation formats. You may need to have your
serialised data accessible to applications written in languages that do not
support POF, for example by persisting it directly into a database using a
BinaryStore, or by passing it to a messaging system in a trigger or
interceptor. There are many technologies available for serialising data into
binary or readable form; we'll illustrate the approach using Google Protocol
Buffers, hereafter referred to as GPB. First of all, let's consider some of the
differences between POF and GPB:

- GPB generates the classes it serialises, so cannot be used for arbitrary
  types.
- The GPB serialised form does not identify the type that has been
  serialised, so we can only deserialise streams when we know the type
  that they contain.

We'll work around the latter restriction by defining a GPB wrapper type
that effectively contains a union of all the other types that we define (a
technique described in the GPB documentation at https://developers.google.com/protocol-buffers/docs/techniques#union).
Our example uses a re-implementation of our ChessPlayer and GoPlayer classes.
Create the file player.proto as in listing 3.22; if using maven this should be
in the directory src/main/protobuf of your project so that the maven plugin
finds it.
finds it.
The protobuf-maven-plugin will create the Google protocol buffer serialiser
class org.cohbook.serialisation.protobuf.Player in target/generated-sources;
if not using maven, the GPB code generator can be run manually as per the
instructions on the protocol buffers project site.

Listing 3.22: Protocol buffers definition

package org.cohbook.serialisation.protobuf;

option java_package = "org.cohbook.serialisation.protobuf";
message Person {
    required string firstname = 1;
    required string lastname = 2;
}
message GoPlayer {
    required Person person = 1;
    required int32 dan = 2;
}
message ChessPlayer {
    required Person person = 1;
    required string rank = 2;
}
message Wrapper {
    enum MessageType {
        GOPLAYER = 1;
        CHESSPLAYER = 2;
    }
    required MessageType type = 1;
    optional GoPlayer goPlayer = 2;
    optional ChessPlayer chessPlayer = 3;
}

The class Player itself contains member classes Person, ChessPlayer, GoPlayer,
and Wrapper, along with additional interfaces. These are our domain objects and
the builders used to construct them.

3.7.2  A Coherence Serialiser for Google Protocol Buffers

Now that we have generated our domain classes, we can write the serialiser
code itself, listing 3.23, implementing the com.tangosol.io.Serializer
interface. A few things to notice about this code:

- We can use ExternalizableHelper to convert the Coherence BufferInput
  and BufferOutput into InputStream and OutputStream.
- We use GPB's writeDelimitedTo and parseDelimitedFrom rather than
  writeTo and parseFrom. This is because Coherence will sometimes use
  the same buffer for several objects and GPB needs the delimited form
  in order to identify the object boundaries.
- We use our Wrapper class within the serialiser to implement the GPB
  union pattern.


Listing 3.23: A first stab at a GPB serialiser


public class ProtobufSerialiser implements Serializer {
    @Override
    public void serialize(final BufferOutput bufferoutput, final Object obj)
            throws IOException {
        OutputStream output = ExternalizableHelper.getOutputStream(bufferoutput);
        Player.Wrapper.Builder wrapperBuilder = Player.Wrapper.newBuilder();
        if (obj instanceof ChessPlayer) {
            wrapperBuilder.setChessPlayer((ChessPlayer) obj);
            wrapperBuilder.setType(
                Player.Wrapper.MessageType.CHESSPLAYER);
            wrapperBuilder.build().writeDelimitedTo(output);
        } else if (obj instanceof GoPlayer) {
            wrapperBuilder.setGoPlayer((GoPlayer) obj);
            wrapperBuilder.setType(
                Player.Wrapper.MessageType.GOPLAYER);
            wrapperBuilder.build().writeDelimitedTo(output);
        } else {
            throw new IllegalArgumentException(
                "Don't know how to serialise a " + obj.getClass());
        }
    }
    @Override
    public Object deserialize(final BufferInput bufferinput) throws IOException {
        Player.Wrapper wrapper = Player.Wrapper.parseDelimitedFrom(ExternalizableHelper.
            getInputStream(bufferinput));
        switch (wrapper.getType()) {
        case GOPLAYER:
            return wrapper.getGoPlayer();
        case CHESSPLAYER:
            return wrapper.getChessPlayer();
        }
        throw new RuntimeException("unexpected message type: " + wrapper.getType());
    }
}


Perhaps we should have done this next bit first, but we now need to write
a unit test to validate the serialisation. First we need to make a small
enhancement to our SerialisationTestHelper support class. Previously we
gave it the name of a POF configuration file as a constructor argument,
but now we are using our own serialiser, not a ConfigurablePofContext. The
member variable itself doesn't specify the implementation, so we need only
add an alternate constructor:

private final Serializer serialiser;
public SerialisationTestHelper(String configFileName) {
    this.serialiser = new ConfigurablePofContext(configFileName);
}
public SerialisationTestHelper(Serializer serializer) {
    this.serialiser = serializer;
}

Our test then follows the same pattern as the POF serialisation test.
public class TestProtobufSerializer {
    private SerialisationTestHelper serialisationTestHelper;
    public TestProtobufSerializer() {
        serialisationTestHelper = new SerialisationTestHelper(new ProtobufSerialiser());
    }
    @Test
    public void testSerialiseGoPlayer() throws IOException {
        Player.GoPlayer.Builder builder = Player.GoPlayer.newBuilder();
        Player.Person.Builder personBuilder = Player.Person.newBuilder();
        personBuilder.setFirstname("David");
        personBuilder.setLastname("Whitmarsh");
        builder.setPerson(personBuilder.build());
        builder.setDan(9);
        Player.GoPlayer object = builder.build();
        serialisationTestHelper.equalsCheckSerialisation(object);
    }

We've now shown that our serialiser can correctly serialise and deserialise
our domain objects, but now we need to see it working in a cluster. We'll
create a cache configuration in listing 3.24 that uses the serialiser for one
service.
A note of caution here: the cache configuration schema shows that the
serializer element is defined per scheme, and the name of the service used is
also defined per scheme. This might lead you to expect that it would be
possible to define two schemes, each referring to the same service name, but
with different serializer configurations. Such a configuration would pass
schema validation, but would not behave as you expect: both schemes will be
given the same serializer configuration, whichever one happens to be started
first (this issue is covered further in chapter 8: Configuration, subsection
8.2.4: Separate Service Definitions and Cache Templates and section 8.4:
Validate Configuration With A NameSpaceHandler).


Listing 3.24: Defining the serialiser for a service


<cache-config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://xmlns.oracle.com/coherence/coherence-cache-config"
    xsi:schemaLocation=
        "http://xmlns.oracle.com/coherence/coherence-cache-config
        coherence-cache-config.xsd">
    <caching-scheme-mapping>
        <cache-mapping>
            <cache-name>test</cache-name>
            <scheme-name>distributedService</scheme-name>
        </cache-mapping>
    </caching-scheme-mapping>
    <caching-schemes>
        <distributed-scheme>
            <scheme-name>distributedService</scheme-name>
            <serializer>
                <instance>
                    <class-name>
                        org.cohbook.serialisation.protobuf.ProtobufSerialiser
                    </class-name>
                </instance>
            </serializer>
            <backing-map-scheme>
                <local-scheme/>
            </backing-map-scheme>
            <autostart>true</autostart>
        </distributed-scheme>
    </caching-schemes>
</cache-config>

Finally, we're ready to write a Littlegrid test to insert an object into a
cache, listing 3.25. But, with what we have done so far, the test would fail
like this:

2013-04-10 08:08:31.072/6.076 Oracle Coherence GE 3.7.1.3 <Error> (thread=DistributedCache, member=1): java.lang.IllegalArgumentException:
Don't know how to serialise a class java.lang.String
    at org.cohbook.serialisation.protobuf.ProtobufSerialiser.serialize(ProtobufSerialiser.java:37)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.Service.writeObject(Service.CDB:4)
    at com.tangosol.coherence.component.util.ServiceConfig.writeObject(ServiceConfig.CDB:1)
    at com.tangosol.coherence.component.util.ServiceConfig$Map.writeObject(ServiceConfig.CDB:1)

Not only did we forget that we need to serialise our cache's key types as
well as its value types, but it also turns out that Coherence uses the
service's serialiser to handle all kinds of objects used in its own internal
protocol. Look at coherence-pof-config.xml in the Coherence jar to see the
kinds of things it needs to deal with.


Listing 3.25: Testing in a littlegrid cluster


public class ProtobufClusterTest {
    @Test
    public void test() {
        ClusterMemberGroup memberGroup = null;
        try {
            memberGroup = ClusterMemberGroupUtils.newBuilder()
                .setStorageEnabledCount(2)
                .setCacheConfiguration(
                    "org/cohbook/serialisation/protobuf/cache-config.xml")
                .buildAndConfigureForStorageDisabledClient();
            final NamedCache cache = CacheFactory.getCache("test");
            Player.GoPlayer.Builder builder = Player.GoPlayer.newBuilder();
            Player.Person.Builder personBuilder = Player.Person.newBuilder();
            personBuilder.setFirstname("David");
            personBuilder.setLastname("Whitmarsh");
            builder.setPerson(personBuilder.build());
            builder.setDan(9);
            Player.GoPlayer object = builder.build();
            cache.put(Integer.valueOf(99), object);
            Player.GoPlayer object2 =
                (org.cohbook.serialisation.protobuf.Player.GoPlayer) cache.get(Integer.valueOf(99));
            Assert.assertEquals(object, object2);
        } finally {
            ClusterMemberGroupUtils.
                shutdownCacheFactoryThenClusterMemberGroups(memberGroup);
        }
    }
}


Listing 3.26: Protobuf specification with a POF stream


package org.cohbook.serialisation.protobuf;
option java_package = "org.cohbook.serialisation.protobuf";
message Person {
    required string firstname = 1;
    required string lastname = 2;
}
message GoPlayer {
    required Person person = 1;
    required int32 dan = 2;
}
message ChessPlayer {
    required Person person = 1;
    required string rank = 2;
}
message Wrapper {
    enum MessageType {
        GOPLAYER = 1;
        CHESSPLAYER = 2;
        POFSTREAM = 3;
    }
    required MessageType type = 1;
    optional GoPlayer goPlayer = 2;
    optional ChessPlayer chessPlayer = 3;
    optional bytes pofStream = 4;
}

There are several strategies we could use to solve this problem:


1. We could extend our GPB schema to include all of the types covered
by coherence-pof-config.xml. This looks to be somewhat laborious, as
well as being fragile in the event of changes in new Coherence versions
2. We could create a new GPB wrapper class containing only a byte
array. We would make this class POF serialisable, but the byte array
value would contain the GPB serialised form of our domain class. The
disadvantage of this is that we would then have the small additional
overhead of the POF type-id in every stored value.
3. Conversely, we could define our GPB wrapper to manage a byte array
type: the POF serialised value of any type not explicitly managed by
our GPB serialiser. This is exactly the inverse of the previous solution,
but now the additional overhead is on the POF serialised forms that
are used in communicating between nodes rather than in the persisted
values.
We'll implement this last option here. The first step is to modify player.proto
to add the POF stream type as shown in listing 3.26.


We need to make some changes to our ProtobufSerialiser class. First of all we
need a POF context for serialising objects of types unknown to our serialiser.
We'll create this in a member variable:

private Serializer pofDelegate = new ConfigurablePofContext("coherence-pof-config.xml");

Whereas previously we would throw an exception in the serialize method
when attempting to serialise an object of unknown type, now we'll simply
delegate to the POF context:

if (obj instanceof ChessPlayer) {
    .
    .
    .
} else {
    wrapperBuilder.setPofStream(ByteString.copyFrom(
        ExternalizableHelper.toBinary(obj,
            pofDelegate).toByteArray()));
    wrapperBuilder.setType(
        Player.Wrapper.MessageType.POFSTREAM);
    wrapperBuilder.build().writeDelimitedTo(output);
}

And when deserialising, we must delegate POFSTREAM values to the POF context:

switch (wrapper.getType()) {
case GOPLAYER:
    return wrapper.getGoPlayer();
case CHESSPLAYER:
    return wrapper.getChessPlayer();
case POFSTREAM:
    return ExternalizableHelper.fromBinary(
        new Binary(wrapper.getPofStream().toByteArray()), pofDelegate);
default:
    throw new RuntimeException("unexpected message type: " + wrapper.getType());
}

We need to test this arrangement. We'll add a new test method to our
TestProtobufSerializer unit test to round-trip a class unknown to our GPB
schema. A String will do:

@Test
public void testSerialiseString() throws IOException {
    serialisationTestHelper.equalsCheckSerialisation("a test string");
}

And now, finally, our ProtobufClusterTest with Littlegrid should execute
successfully.
I must re-iterate here the lesson of this section: the serialiser configured
for a service affects not just the persisted values, but keys, EntryProcessors,
Filters, exceptions and the whole zoo of types that are transferred between
members of a cluster in relation to the service. In our example, most of
these types will be serialised as POF streams within a GPB stream; only the
types we specifically handle in our protobuf serialiser are handled as native
GPB streams. Our example uses an Integer key, so within the binary backing map
the key value will itself be stored as POF within GPB, though it's a simple
exercise for you, the reader, to extend the GPB definition to handle an Integer
key natively. In the next section we'll look at how we deal with a GPB binary
backing map entry.

3.7.3  Custom Serialisation and EntryExtractor

With POF, we can use a PofExtractor and PofUpdater to read and manipulate
binary streams without deserialising the entire value object. The Coherence
API does a reasonably good job of abstracting the handling of binary data
away from the Serializer implementation so that, insofar as the underlying
serialisation mechanism supports it, Coherence will allow you to perform
similar manipulations for any binary data. That is a significant caveat,
though: there are many tools out there to map objects to streams and back
again, but not many natively allow you to operate directly on the stream in
the way that POF does. GPB does not support introspection of the stream
through its public API, though the stream itself does contain the data needed
to support it. Purely for the purposes of illustration and education, I have
prepared a simple utility class in listing 3.27 to aid in extracting values
from a GPB stream, based on copying and modifying the GPB WireFormat class.
I'm not recommending, or even suggesting, that you do such a thing in
production code, and especially not using my minimally tested hacked example;
my purpose here is, as I say, to illustrate and educate.
Without going into too much detail (after all, this is a book about Coherence,
not Google Protocol Buffers), GPB maintains two distinct concepts of type:
the field type, as declared in the .proto file, fully specifies the type of a
field in the generated class; the wire type, as stored in the encoded stream,
contains just enough information to navigate the stream. The tag, in
listing 3.27, combines the field number and the wire type. These are sufficient
to extract the binary serialised form of a field from the stream, but we also
need the field type to be able to correctly deserialise the field. In order to
extract a value from the stream, we need to know the nested sequence of field
numbers; for example, to obtain a GoPlayer's Dan rating, we look for the Dan
field, as identified by the constant Player.GoPlayer.DAN_FIELD_NUMBER within
the GoPlayer stream, itself identified by Player.Wrapper.GOPLAYER_FIELD_NUMBER.
Altogether, we therefore need to hold the expected field type and the nested

Listing 3.27: Google's WireFormat, hacked

public class WireFormat {

    private WireFormat() {}

    public static final int WIRETYPE_VARINT           = 0;
    public static final int WIRETYPE_FIXED64          = 1;
    public static final int WIRETYPE_LENGTH_DELIMITED = 2;
    public static final int WIRETYPE_START_GROUP      = 3;
    public static final int WIRETYPE_END_GROUP        = 4;
    public static final int WIRETYPE_FIXED32          = 5;

    static final int TAG_TYPE_BITS = 3;
    static final int TAG_TYPE_MASK = (1 << TAG_TYPE_BITS) - 1;

    /** Given a tag value, determines the wire type (the lower 3 bits). */
    static int getTagWireType(final int tag) {
        return tag & TAG_TYPE_MASK;
    }

    /** Given a tag value, determines the field number (the upper 29 bits). */
    public static int getTagFieldNumber(final int tag) {
        return tag >>> TAG_TYPE_BITS;
    }

    /** Makes a tag value given a field number and wire type. */
    static int makeTag(final int fieldNumber, final int wireType) {
        return (fieldNumber << TAG_TYPE_BITS) | wireType;
    }

    // Field numbers for fields in MessageSet wire format.
    static final int MESSAGE_SET_ITEM    = 1;
    static final int MESSAGE_SET_TYPE_ID = 2;
    static final int MESSAGE_SET_MESSAGE = 3;

    // Tag numbers.
    static final int MESSAGE_SET_ITEM_TAG =
        makeTag(MESSAGE_SET_ITEM, WIRETYPE_START_GROUP);
    static final int MESSAGE_SET_ITEM_END_TAG =
        makeTag(MESSAGE_SET_ITEM, WIRETYPE_END_GROUP);
    static final int MESSAGE_SET_TYPE_ID_TAG =
        makeTag(MESSAGE_SET_TYPE_ID, WIRETYPE_VARINT);
    static final int MESSAGE_SET_MESSAGE_TAG =
        makeTag(MESSAGE_SET_MESSAGE, WIRETYPE_LENGTH_DELIMITED);

    public static Object readField(final CodedInputStream stream,
            final int tag, final Descriptors.FieldDescriptor.Type fieldType)
            throws IOException {
        switch (WireFormat.getTagWireType(tag)) {
        case WireFormat.WIRETYPE_VARINT:
            return stream.readInt32();
        case WireFormat.WIRETYPE_FIXED64:
            return stream.readRawLittleEndian64();
        case WireFormat.WIRETYPE_LENGTH_DELIMITED:
            switch (fieldType) {
            case STRING:
                return stream.readString();
            case MESSAGE:
                return null;
            case GROUP:
                return null;
            default:
                return null;
            }
        case WireFormat.WIRETYPE_START_GROUP:
            stream.skipField(tag);
            return null;
        case WireFormat.WIRETYPE_END_GROUP:
            return null;
        case WireFormat.WIRETYPE_FIXED32:
            return stream.readRawLittleEndian32();
        default:
            throw new RuntimeException("invalid wire type");
        }
    }
}


field list as member variables of our extractor. We'll extend EntryExtractor,
which also identifies the target of extraction: key or value:
public class ProtobufExtractor extends EntryExtractor {
    private int[] fields;
    private Descriptors.FieldDescriptor.Type fieldType;

    public ProtobufExtractor(int nTarget, int fields[],
            Descriptors.FieldDescriptor.Type fieldType) {
        super(nTarget);
        this.fields = fields;
        this.fieldType = fieldType;
    }

We must provide an implementation of extractFromEntry that identifies and
obtains the correct binary target, then wraps it in a GPB CodedInputStream
that we can then parse:
@Override
public Object extractFromEntry(Entry untypedentry) {
    BinaryEntry entry = (BinaryEntry) untypedentry;
    Binary bin;
    switch (super.m_nTarget) {
    case KEY:
        bin = entry.getBinaryKey();
        break;
    case VALUE:
        bin = entry.getBinaryValue();
        break;
    default:
        throw new RuntimeException("invalid target " + m_nTarget);
    }
    CodedInputStream stream = CodedInputStream.newInstance(bin.toByteArray());
    try {
        return readFieldFromMessageStream(stream, fields);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}

and now implement readFieldFromMessageStream, which recurses over our nested
field list until we reach the field we wish to extract:
private Object readFieldFromMessageStream(CodedInputStream stream,
        int[] fields) throws IOException {
    int nextField = fields[0];
    int rtag;
    while ((rtag = stream.readTag()) != 0) {
        int fieldRead = WireFormat.getTagFieldNumber(rtag);
        if (fieldRead == nextField) {
            if (fields.length == 1) {
                return WireFormat.readField(stream, rtag, fieldType);
            } else {
                return readFieldFromMessageStream(
                    stream, Arrays.copyOfRange(fields, 1, fields.length));
            }
        } else {
            stream.skipField(rtag);
        }
    }
    return null;
}


There remains one issue: in order to execute our extractor, we must be able
to serialise it to send it to the storage nodes. Should we use GPB or POF to
do this? If the principal reason for using GPB is to make serialised domain
objects accessible to other technologies, we really don't need to care about
how we serialise extractors and aggregators, so it is probably simpler to use
POF, especially as our superclass EntryExtractor already implements
PortableObject.
// Start field numbers from 10 to avoid collisions with superclass
private static final int FIELDS_INDEX = 10;
private static final int FIELDTYPE_INDEX = 11;

@Override
public void readExternal(PofReader in) throws IOException {
    super.readExternal(in);
    fields = in.readIntArray(FIELDS_INDEX);
    fieldType = Descriptors.FieldDescriptor.Type.valueOf(
        in.readString(FIELDTYPE_INDEX));
}

@Override
public void writeExternal(PofWriter out) throws IOException {
    super.writeExternal(out);
    out.writeIntArray(FIELDS_INDEX, fields);
    out.writeString(FIELDTYPE_INDEX, fieldType.name());
}

We now need to test the extractor. We'll start in listing 3.28 with a simple
unit test that mocks the BinaryEntry.
The more thorough integration test of listing 3.29 uses Littlegrid to run up a
cluster and execute the extractor in a storage node. To run this test we will
need to include our ProtobufExtractor in the delegate PofContext used by our
PofSerializer. Change the declaration of the pofDelegate member variable in
ProtobufSerialiser from
private Serializer pofDelegate = new ConfigurablePofContext("coherence-pof-config.xml");

to
private Serializer pofDelegate = new ConfigurablePofContext(
    "org/cohbook/serialisation/protobuf/protobuf-pof-config.xml");

and create protobuf-pof-config.xml as:

<?xml version="1.0"?>
<pof-config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://xmlns.oracle.com/coherence/coherence-pof-config"
    xsi:schemaLocation=
        "http://xmlns.oracle.com/coherence/coherence-pof-config
         coherence-pof-config.xsd">
    <user-type-list>
        <include>coherence-pof-config.xml</include>
        <user-type>
            <type-id>1001</type-id>
            <class-name>
                org.cohbook.serialisation.protobuf.ProtobufExtractor
            </class-name>
        </user-type>
    </user-type-list>
</pof-config>


Listing 3.28: Testing the ProtobufExtractor


public class TestProtobufExtractor {
    private Serializer serialiser;
    private Mockery context;

    @Before
    public void setup() {
        serialiser = new ProtobufSerialiser();
        context = new Mockery();
    }

    @Test
    public void test() {
        Player.GoPlayer.Builder builder = Player.GoPlayer.newBuilder();
        Player.Person.Builder personBuilder = Player.Person.newBuilder();
        personBuilder.setFirstname("David");
        personBuilder.setLastname("Whitmarsh");
        builder.setPerson(personBuilder.build());
        builder.setDan(9);
        Player.GoPlayer object = builder.build();
        final Binary binaryObject = ExternalizableHelper.toBinary(object, serialiser);
        final BinaryEntry entry = context.mock(BinaryEntry.class);
        context.checking(new Expectations() {{
            oneOf(entry).getBinaryValue();
            will(returnValue(binaryObject));
        }});
        EntryExtractor extractor = new ProtobufExtractor(AbstractExtractor.VALUE,
            new int[] { Player.Wrapper.GOPLAYER_FIELD_NUMBER,
                        Player.GoPlayer.DAN_FIELD_NUMBER },
            Descriptors.FieldDescriptor.Type.INT32);
        Assert.assertEquals(9, extractor.extractFromEntry(entry));
        context.assertIsSatisfied();
    }
}



Listing 3.29: Testing the protobuf extractor in a cluster


@Test
public void testExtract() {
    ClusterMemberGroup memberGroup = null;
    try {
        memberGroup = ClusterMemberGroupUtils.newBuilder()
            .setStorageEnabledCount(2)
            .setCacheConfiguration(
                "org/cohbook/serialisation/protobuf/cache-config.xml")
            .buildAndConfigureForStorageDisabledClient();
        final NamedCache cache = CacheFactory.getCache("test");
        Player.GoPlayer.Builder builder = Player.GoPlayer.newBuilder();
        Player.Person.Builder personBuilder = Player.Person.newBuilder();
        personBuilder.setFirstname("David");
        personBuilder.setLastname("Whitmarsh");
        builder.setPerson(personBuilder.build());
        builder.setDan(9);
        Player.GoPlayer object = builder.build();
        cache.put("DJW", object);
        EntryExtractor extractor = new ProtobufExtractor(AbstractExtractor.VALUE,
            new int[] { Player.Wrapper.GOPLAYER_FIELD_NUMBER,
                        Player.GoPlayer.DAN_FIELD_NUMBER },
            Descriptors.FieldDescriptor.Type.INT32);
        Integer extDan = (Integer) cache.invoke("DJW", new ExtractorProcessor(extractor));
        Assert.assertEquals(Integer.valueOf(9), extDan);
    } finally {
        ClusterMemberGroupUtils.
            shutdownCacheFactoryThenClusterMemberGroups(memberGroup);
    }
}



3.8 Avoiding Deserialisation

Objective
To develop an understanding of the costs and risks of deserialising
objects, show one method of analysing frequency of deserialisation,
illustrated with a couple of specific scenarios. In particular, to
demonstrate the importance of analysis and testing to determine the
impact of design decisions.
Prerequisites
Familiarity with POF concepts
Code examples
The classes and resources in this section may be found in the package
org.cohbook.serialisation.tracker in the serialisation project. Domain
objects are also used from the org.cohbook.serialisation.domain package.
Dependencies
As well as Oracle Coherence, the examples use JUnit and Littlegrid.

Listing 3.30: A POF context that counts deserialisations

public class DeserialisationCheckingPofContext extends ConfigurablePofContext {

    private static ConcurrentHashMap<Class<?>, AtomicInteger>
        deserialisationCount = new ConcurrentHashMap<>();

    public DeserialisationCheckingPofContext() {
    }

    public DeserialisationCheckingPofContext(String sLocator) {
        super(sLocator);
    }

    public DeserialisationCheckingPofContext(XmlElement xml) {
        super(xml);
    }

    @Override
    public Object deserialize(BufferInput in) throws IOException {
        Object result = super.deserialize(in);
        if (result != null) {
            Class<?> resultClass = result.getClass();
            deserialisationCount.putIfAbsent(resultClass, new AtomicInteger(0));
            deserialisationCount.get(resultClass).incrementAndGet();
        }
        return result;
    }

    public static Integer getDeserialisationCount(Class<?> clazz) {
        AtomicInteger counter = deserialisationCount.get(clazz);
        return counter == null ? 0 : counter.getAndSet(0);
    }
}

You have defined your data model, built your domain classes, and configured
your serialisers. Now you can sit back and watch your application fly, a model
of computational efficiency.
Hold on a moment: are you really sure you aren't deserialising unnecessarily?
Make no mistake, when you have millions of objects, or very large and complex
objects, or worse, millions of large and complex objects, inflating them
from serialised form is expensive, and if you then immediately throw them
away, you're feeding that garbage collection like nobody's business.

3.8.1 Tracking Deserialisation While Testing

So, how can you be sure? There are profiling tools out there, but using
them on a distributed cluster adds another level of complexity. We'd ideally
want some simple way of running our load tests and seeing just how much
deserialisation goes on. A simple convenient way is to subclass the serialiser
to keep track, perhaps like listing 3.30. We will also need a way of collating
the results as they are distributed around the cluster. A simple and obvious
way is to use an aggregator, such as in listing 3.31.


Listing 3.31: An aggregator to retrieve the deserialisation count


@Portable
public class DeserialisationAggregator implements ParallelAwareAggregator {
    @PortableProperty(0) private String className;

    public DeserialisationAggregator() {
    }

    public DeserialisationAggregator(Class<?> clazz) {
        this.className = clazz.getName();
    }

    @Override
    public Object aggregate(Set set) {
        Class<?> clazz;
        try {
            clazz = Class.forName(className);
        } catch (ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
        int count = DeserialisationCheckingPofContext.
            getDeserialisationCount(clazz);
        return count;
    }

    @Override
    public EntryAggregator getParallelAggregator() {
        return this;
    }

    @Override
    public Object aggregateResults(Collection collection) {
        Integer total = 0;
        for (Integer partial : (Collection<Integer>) collection) {
            total += partial;
        }
        return total;
    }
}


There is a mismatch between our DeserialisationCheckingPofContext class,
which collects statistics for a member, and our DeserialisationAggregator,
which will usually collect once per partition. Our workaround trick is to
clear the count when reading (counter.getAndSet(0)) so that even if the
aggregator is called several times in a node, the result will be correct.
There are a number of other possible approaches to gathering the statistics:

- An Invocable, which would be invoked once per member, but needs an
  InvocationService to be configured; see the sketch below.
- A JMX MBean per node. Good for visibility, but needs an external
  mechanism to sum the totals per node.
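As a rough sketch only (the class name and constructor here are my own, not part of the book's accompanying code), the Invocable approach could reuse the static counter of listing 3.30 and return one result per member through an InvocationService:

// Hypothetical sketch: returns this member's deserialisation count for a class.
public class DeserialisationCountInvocable extends AbstractInvocable {
    private String className;

    public DeserialisationCountInvocable() {          // required for serialisation
    }

    public DeserialisationCountInvocable(Class<?> clazz) {
        this.className = clazz.getName();
    }

    @Override
    public void run() {
        try {
            setResult(DeserialisationCheckingPofContext
                .getDeserialisationCount(Class.forName(className)));
        } catch (ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }
}

It would be invoked with something like invocationService.query(new DeserialisationCountInvocable(GoPlayer.class), null), which returns a map of member to count that the caller must then sum.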

3.8.2 Avoid Reflection Filters

By way of a demonstration, we'll write a Littlegrid test to store some GoPlayer
instances in a cache and then run an IsNullFilter to look for null last names.
First we'll need a POF configuration that includes the domain object and our
DeserialisationAggregator:
<?xml version="1.0"?>
<pof-config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://xmlns.oracle.com/coherence/coherence-pof-config"
    xsi:schemaLocation=
        "http://xmlns.oracle.com/coherence/coherence-pof-config
         coherence-pof-config.xsd">
    <user-type-list>
        <include>coherence-pof-config.xml</include>
        <include>
            org/cohbook/serialisation/domain/person-pof-config.xml
        </include>
        <user-type>
            <type-id>2001</type-id>
            <class-name>
                org.cohbook.serialisation.tracker.DeserialisationAggregator
            </class-name>
        </user-type>
    </user-type-list>
</pof-config>

IsNullFilter uses reflection, and will deserialise the entire domain object to
extract and test a single value. In listing 3.32 we're inserting three values,
one of which has a nasty data quality issue: we've put the entire name in the
firstName field. Our test shows that all three objects are deserialised.
Running our DeserialisationAggregator with the AlwaysFilter instance ensures
it will be run against all partitions. Several Coherence query-related classes
that have no constructor arguments have an INSTANCE member as a
convenience; check NeverFilter, PresentFilter, IdentityExtractor.


Listing 3.32: Testing the count


public class TrackNullFilter {
    private ClusterMemberGroup memberGroup;
    private NamedCache cache;

    @Before
    public void setup() {
        memberGroup = ClusterMemberGroupUtils.newBuilder()
            .setStorageEnabledCount(2)
            .setCacheConfiguration(
                "org/cohbook/serialisation/tracker/cache-config.xml")
            .buildAndConfigureForStorageDisabledClient();
        cache = CacheFactory.getCache("test");
        cache.put(1, new GoPlayer("Honinbo", "Sansa", 9));
        cache.put(2, new GoPlayer("Nakamura", "Doseki", 9));
        cache.put(3, new GoPlayer("Nobuaki Ansai", null, 4));
    }

    @After
    public void teardown() {
        ClusterMemberGroupUtils.
            shutdownCacheFactoryThenClusterMemberGroups(memberGroup);
    }

    @Test
    public void testIsNullFilter() {
        Filter isNullFilter = new IsNullFilter("getLastName");
        int count = (int) cache.aggregate(isNullFilter, new Count());
        assertEquals(1, count);
        int deserial = (int) cache.aggregate(
            AlwaysFilter.INSTANCE,
            new DeserialisationAggregator(GoPlayer.class));
        assertEquals(3, deserial);
    }
}

Listing 3.33: Testing the count with POF
@Test
public void testPofNullFilter() {
    Filter isNullFilter = new EqualsFilter(
        new PofExtractor(null, GoPlayer.POF_LASTNAME), null);
    int count = (int) cache.aggregate(isNullFilter, new Count());
    assertEquals(1, count);
    int deserial = (int) cache.aggregate(
        AlwaysFilter.INSTANCE,
        new DeserialisationAggregator(GoPlayer.class));
    assertEquals(0, deserial);
}

Listing 3.34: A POF filter for null values


public class EqNullFilter extends EqualsFilter {
    public EqNullFilter(ValueExtractor extractor) {
        super(extractor, null);
    }

    public EqNullFilter(int pofindex) {
        super(new PofExtractor(null, pofindex), null);
    }

    public EqNullFilter(int[] pofIndexes) {
        super(new PofExtractor(null, new SimplePofPath(pofIndexes)), null);
    }
}

Having proven that we are deserialising, maybe we should do something
about it. There's no equivalent filter in the Coherence API that can work
with POF, but in listing 3.33 we simply use an EqualsFilter with a PofExtractor
and test for equality with a null value.
There is a simple little syntactic sugar trick we can perform here. In
listing 3.34 we create a subclass of EqualsFilter to give us a simple
IsNullFilter replacement.
We don't even need to add this to a POF configuration: it will serialise as
an EqualsFilter and be constructed that way in the storage node where it is
used. This works for adding new constructors, but not if we try to modify
runtime behaviour by overriding any of the other methods apart from the
constructors. With this convenience class, we can replace in our last test
method, testPofNullFilter, the somewhat verbose construction:
Filter isNullFilter = new EqualsFilter(
    new PofExtractor(null, GoPlayer.POF_LASTNAME),
    null);

with:


Filter isNullFilter = new EqNullFilter(GoPlayer.POF_LASTNAME);

If the property we are checking for null has values that are themselves large
and complex, it would seem somewhat wasteful to deserialise them simply to
check if they are null, but there is a way to test for null without deserialising
even the indexed property. Null values in the POF stream are represented
by a special type-id defined as a constant V_REFERENCE_NULL in the PofConstants
class (the value of which is -37). We can therefore use the PofTypeIdFilter
we introduced in subsection 3.4.2: Value Objects with Properties of Different
Types:
Filter isNullFilter = new PofTypeIdFilter(
    PofConstants.V_REFERENCE_NULL,
    AbstractExtractor.VALUE,
    new SimplePofPath(GoPlayer.POF_LASTNAME));

Subclassing PofTypeIdFilter to create a PofNullFilter in a manner analogous to
the EqNullFilter is left as an exercise for the reader; one possible shape is
sketched below.
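Purely as a sketch (not part of the book's code), and assuming the PofTypeIdFilter(typeId, target, navigator) constructor used in the example above, such a constructor-only subclass might look like this:

// Hypothetical PofNullFilter, in the same spirit as EqNullFilter above.
public class PofNullFilter extends PofTypeIdFilter {
    public PofNullFilter(int target, PofNavigator navigator) {
        super(PofConstants.V_REFERENCE_NULL, target, navigator);
    }

    public PofNullFilter(int target, int... pofIndexes) {
        this(target, new SimplePofPath(pofIndexes));
    }
}

It would then be used as new PofNullFilter(AbstractExtractor.VALUE, GoPlayer.POF_LASTNAME), assuming PofTypeIdFilter itself is registered in the POF configuration as in section 3.4.2.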

3.8.3 Consider Member Failure

Is it strictly necessary to obsessively eliminate reflection at all costs? Clearly
not. Consider a cache that is accessed only by key, or whose queries are all
well supported with indexes. We'll prepare a cache of Go players again, this
time with a few more entries:
public class TestTrackWithIndex {
    private ClusterMemberGroup memberGroup;
    private NamedCache cache;
    private static final int ENTRIES = 2000;

    @Before
    public void setup() {
        memberGroup = ClusterMemberGroupUtils.newBuilder()
            .setStorageEnabledCount(2)
            .setCacheConfiguration(
                "org/cohbook/serialisation/tracker/cache-config.xml")
            .buildAndConfigureForStorageDisabledClient();
        cache = CacheFactory.getCache("test");
        for (int i = 0; i < ENTRIES; i++) {
            cache.put(i, new GoPlayer("THX", Integer.valueOf(i).toString(), 9));
        }
        cache.addIndex(new ReflectionExtractor("getLastName"), false, null);
        // clear the counters
        cache.aggregate(AlwaysFilter.INSTANCE, new DeserialisationAggregator(GoPlayer.class));
    }

Creating the index will deserialise each entry, updating the counter before we
start the behaviour we want to test, so we use our DeserialisationAggregator

Listing 3.35: Testing member failure
@Test
public void testMemberFail() {
    int firstMember = memberGroup.getStartedMemberIds()[0];
    memberGroup.stopMember(firstMember);
    int deserial = (int) cache.aggregate(
        AlwaysFilter.INSTANCE,
        new DeserialisationAggregator(GoPlayer.class));
    System.out.println("deserialised " + deserial + " entries");
    assertTrue(deserial > 0);
}

to reset the counter after creating the index. Now, let's perform a reflection
query on the cache:
@Test
public void testGetOne() {
    Filter firstNameFilter = new EqualsFilter("getFirstName", "THX");
    Filter lastNameFilter = new EqualsFilter("getLastName", "1138");
    int resultSetSize = cache.keySet(
            new AndFilter(firstNameFilter, lastNameFilter))
        .size();
    assertEquals(1, resultSetSize);
    int deserial = (int) cache.aggregate(
        AlwaysFilter.INSTANCE,
        new DeserialisationAggregator(GoPlayer.class));
    assertEquals(1, deserial);
}

This demonstrates that only one cache entry is deserialised. If we had queried
solely with getLastName then no entries would have been deserialised. So far,
so good; we have an efficient query with none of that tiresome messing around
with POF. This will scale quite well with increasing cache size (though not
with query rate, as the filter will be executed on all nodes in parallel). Also,
it won't scale with increasing update rates, as the object must be deserialised
to update the index, or with increasing object size, as the entire object must
be deserialised.
For large caches, we must also consider what happens when we lose one or
more members, as demonstrated by listing 3.35.
We find that on killing one of two nodes, approximately half of the entries
are deserialised. This represents the updating of indexes in nodes where
partitions are redistributed. If the cache is large but distributed over few
machines and one of those machines fails, the total CPU overhead can be
significant. Worse still, the sudden surge in object creation might trigger full


GCs in several nodes simultaneously, potentially producing a cascade failure
of the cluster.6
The lessons here:

- Be sure you fully understand the implications before choosing to use
  reflection to query a cache.
- Test the impact of all query scenarios.
- Test resilience in failure scenarios (do this anyway, whether using
  reflection or not).

6 This isn't idle speculation; the author has seen this happen.


Chapter 4

Queries

4.1 Introduction

4.1.1 Useful Idioms

There are a few handy shortcuts in the standard Coherence API that might
save you some time if you know about them. Here's a selection.

Extract the Whole Value


You may find that you need to provide a ValueExtractor, but the value
you want to extract is the entire entry value. Use the IdentityExtractor.
For instance, if the value objects in a cache are simple strings, you might
write:
Set mungLoverKeys = favouriteFoodsCache.keySet(
    new EqualsFilter(IdentityExtractor.INSTANCE, "mung beans"));

See also subsection 4.2.3: DeserializationAccelerator, which describes the new
DeserializationAccelerator in version 12.1.3.
Several other useful classes have a static INSTANCE value where there are no
constructor arguments.

Extract from the Key


You may wish to extract a part of a compound key. Many of the standard
ValueExtractor implementations inherit from AbstractExtractor, which has an
extractFromEntry method capable of extracting values from either the key or
value of a cache entry. So, for a cache entry with a key of class Species
with a getGenus() method, we could write:

ValueExtractor extractor = new ReflectionExtractor(
    "getGenus", null, AbstractExtractor.KEY);
Set vignakeys = speciesCache.keySet(new EqualsFilter(extractor, "Vigna"));

java.util.Map semantics
Here are two ways of obtaining a collection of all the entries in a cache; what
is the difference between them?

Set entries1 = cache.entrySet();                       // i
Set entries2 = cache.entrySet(AlwaysFilter.INSTANCE);  // ii

The first is defined by the standard java.util.Map interface and the second
by com.tangosol.util.QueryMap. The javadoc for com.tangosol.util.QueryMap
gives part of the answer; of the entrySet methods it says:

    Unlike the Map.entrySet() method, the set returned by this method
    may not be backed by the map, so changes to the set may not
    be reflected in the map, and vice-versa.

So any call to methods on the set entries1, or on its iterator, that modify
the set will change the underlying cache; in particular the Set.remove(Object)
method deletes the corresponding entry from the cache. Modifications to
entries2 are not propagated to the underlying cache. The text "changes to
the set may not be reflected in the map" should perhaps be read as "changes
to the set will not be reflected in the map".1
The unwritten implication becomes apparent if you attempt to do this for
a very large cache. The first form will work as internally it fetches entries
from the cluster on demand, the second will give an OutOfMemoryError as every
entry in the cache is sent to the client to instantiate the entire set in one
go.
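To make the difference concrete, here is a small sketch (hypothetical cache name and contents, not one of the book's examples) of the behaviour described above:

NamedCache cache = CacheFactory.getCache("test");
cache.put(1, "one");

// (i) backed by the cache: removing from this set deletes the cache entry
Set entries1 = cache.entrySet();
entries1.remove(entries1.iterator().next());
// cache.size() == 0 at this point

cache.put(1, "one");

// (ii) a detached result set: per the javadoc quoted above it is not backed
// by the cache, so changes made to it are not reflected in the cache
Set entries2 = cache.entrySet(AlwaysFilter.INSTANCE);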
1 Or perhaps as an instruction to an implementor of the interface rather than advice
for a user of it.


Equality of Extractors
If you plan to use a custom ValueExtractor when constructing an index it is
imperative to correctly implement equals and hashcode; Coherence maintains
internally a map of indexes keyed by ValueExtractor, which the query resolver
uses to identify candidate indexes. A ValueExtractor that has no properties
can calculate equality simply by comparison of runtime type.
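As a minimal sketch (not taken from the book's accompanying code), such an extractor's equality methods need be no more than a comparison of runtime class:

// Equality by runtime type alone: sufficient for an extractor with no state.
@Override
public boolean equals(Object o) {
    return o != null && o.getClass() == getClass();
}

@Override
public int hashCode() {
    return getClass().hashCode();
}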
For a query to use an index, the ValueExtractor used in the query must be
equal to that used in the index, in particular a reflection extractor for a field
will never be equal to a POF extractor for the same field. Filters that have
a String method name argument will implicitly use the ReflectionExtractor
for that method.
For extractors that have no member variables, an alternative to implementing
equals and hashcode is to make the class a singleton with a private constructor,
and provide an inner PofSerializer class that returns the singleton, thus the
inherited Object.equals and Object.hashcode suffice:
public class OrderValueExtractor extends AbstractExtractor {
    private static final OrderValueExtractor INSTANCE = new OrderValueExtractor();

    private OrderValueExtractor() {
    }
    .
    .
    .
    public class Serializer implements PofSerializer {
        public void serialize(PofWriter pofwriter, Object obj)
                throws IOException {
            pofwriter.writeRemainder(null);
        }

        public Object deserialize(PofReader pofreader) throws IOException {
            pofreader.readRemainder();
            return INSTANCE;
        }
    }
}

Defining the same index more than once is normally harmless, although it
can have an adverse effect on performance as locks are held on the cache for
a short time while the index map is checked.

Query Optimisation
Coherence appears to implement a very basic set of query optimisations. If
you use nested AndFilter or an AllFilter, there is an effort to examine those
filters and execute the ones that are supported by an index before those that


are not, and, it seems, to prefer to execute indexed EqualsFilter before any
indexed range filters. However, there does not appear to be any attempt
to identify and apply the most selective filters first. This means that if you
have many indexes on a cache, it is very important to construct your query
filters with the most selective clauses first. For example, in a cache with a
million trades, each with a unique id, and a trade status that has only three
distinct values, and both trade id and status are indexed:
Filter filter = new AndFilter(
    new EqualsFilter("getTradeId", "12345"),
    new EqualsFilter("getStatus", "OPEN"));

will perform significantly better than:


Filter filter = new AndFilter(
    new EqualsFilter("getStatus", "OPEN"),
    new EqualsFilter("getTradeId", "12345"));

Low cardinality indexes are not an entirely pointless exercise as they can be
used for index covering of queries as described below in subsection 4.2.2: Covered Indexes. Alexey Ragozin has written in his blog about low-cardinality
indexes.2
Restricting Queries by Member
Sometimes, it is not necessary to have a query execute on all members.
Coherence will execute key or key set based operations on only the members
that own the given key or key set. Filter queries will be executed in parallel
on all members. If you wish to restrict results to only entries matching an
associated key, it is simple enough to wrap the query in a KeyAssociatedFilter,
but what if you want to...
Return entries that belong to a set of known keys, but also match
other criteria. You could simply include an InFilter clause in your query,
but that would execute on all members - even those that cannot contain your
set of keys:
Set entries = orderCache.entrySet(new AndFilter(
    new InFilter(new KeyExtractor(IdentityExtractor.INSTANCE), keyset),
    subQuery));

Here's a cunning way of getting the same information, but only executing
on members that own those keys:
2 See http://blog.ragozin.info/2013/07/coherence-101-filters-performance-and.html.


Map result = orderCache.aggregate(keyset, GroupAggregator.createInstance(
    new KeyExtractor(IdentityExtractor.INSTANCE),
    new ReducerAggregator(IdentityExtractor.INSTANCE),
    subQuery));

Because we're aggregating over a keyset, Coherence knows to only execute
the aggregator on members that own those keys. We use a GroupAggregator,
but the grouping criterion is the key itself so it will always return a distinct
entry for each key. The oddity here is that each value in the result map
is itself a map with one value. An exercise for the reader is to write an
EntryAggregator that simply applies a filter and returns the map of matching
results; one possible shape is sketched below.
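As a rough sketch only (the class name and shape are mine, not the book's accompanying code), such an aggregator could follow the same ParallelAwareAggregator pattern as the DeserialisationAggregator of listing 3.31, evaluating the filter against each entry it is handed and merging the per-partition maps:

public class FilteringAggregator implements ParallelAwareAggregator, Serializable {
    private Filter filter;

    public FilteringAggregator(Filter filter) {
        this.filter = filter;
    }

    @Override
    public Object aggregate(Set setEntries) {
        // collect the key/value pairs of entries matching the filter
        Map<Object, Object> result = new HashMap<>();
        for (Object o : setEntries) {
            InvocableMap.Entry entry = (InvocableMap.Entry) o;
            if (InvocableMapHelper.evaluateEntry(filter, entry)) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    @Override
    public EntryAggregator getParallelAggregator() {
        return this;
    }

    @Override
    public Object aggregateResults(Collection collection) {
        // merge the partial maps returned for each partition
        Map<Object, Object> merged = new HashMap<>();
        for (Object partial : collection) {
            merged.putAll((Map<Object, Object>) partial);
        }
        return merged;
    }
}

It would be invoked as orderCache.aggregate(keyset, new FilteringAggregator(subQuery)); in practice you would also register it for POF serialisation like the other aggregators in this chapter.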

Return entries that match a set of known associated keys. A query
using KeyAssociatedFilter will execute only on the owning member of a single
associated key, but there is no equivalent for a set of associated keys, and
as KeyAssociatedFilter must be the outermost filter in a query we can't use
a collection of them in an AnyFilter.
We can iterate over the set of associated keys to find the partitions they
belong to and construct a PartitionedFilter that will execute only on the
members where those keys will be found. Here is a static utility method
that creates such a filter:
public static Filter getFilterPartitionForKeys(
        NamedCache cache,
        Set<? extends Object> associatedKeys,
        Filter filter,
        ValueExtractor associatedKeyExtractor) {
    PartitionedService service = (PartitionedService) cache.getCacheService();
    PartitionSet partitionSet = new PartitionSet(service.getPartitionCount());
    KeyPartitioningStrategy partstrat = service.getKeyPartitioningStrategy();
    for (Object key : associatedKeys) {
        partitionSet.add(partstrat.getKeyPartition(key));
    }
    Filter keyFilter = new InFilter(associatedKeyExtractor, associatedKeys);
    return new PartitionedFilter(new AndFilter(keyFilter, filter), partitionSet);
}

where the arguments are:


- cache: the cache to be queried
- associatedKeys: the set of associated keys to filter by
- filter: contains any additional filter clauses to be applied
- associatedKeyExtractor: a ValueExtractor used to extract the associated
  key from the key of the entry. A subclass of AbstractExtractor with
  target set to AbstractExtractor.KEY.

Unconditional Conditional Indexes


A conditional index is one that is created only for some entries in a cache.
We'll explore how they are constructed and used in section 4.3: Conditional
Indexes. But we can create a conditional index that indexes all entries, an
unconditional conditional index, by specifying AlwaysFilter.INSTANCE when
creating the ConditionalExtractor used to define the index:

cache.addIndex(
    new ConditionalExtractor(AlwaysFilter.INSTANCE, valueExtractor, false),
    true,
    null);

This is not quite as daft an idea as it sounds. The rationale lies in the
third constructor argument to ConditionalExtractor; the boolean false value
suppresses creation of the forward index so that this index is more space
efficient, but cannot be used to satisfy queries by index covering as described
below in subsection 4.2.2: Covered Indexes. When you have many indexes
and do not need to use index covering, this is a useful technique to reduce
the memory overhead of those indexes.

4.2 Projection Queries

Objective
Demonstrate how to perform projection queries, extracting one or more
fields from a set of objects in a cache.

4.2.1 Projection Queries

The QueryMap interface provides a means of selecting entries from a cache,
equivalent to the WHERE clause in an SQL statement, but if we want to return
only some part of each entry, we need to perform a transformation on the
entries. Coherence provides a means to do this using the ReducerAggregator
and the aggregate methods on the InvocableMap interface.


ReducerAggregator
If we have a cache of on-line customer orders, and we wish to obtain the
customer name for an order, rather than writing:

CustomerOrder order = (CustomerOrder) orderCache.get(orderId);
String customerName = order.getCustomerName();

we could write:
EntryAggregator reducer = new ReducerAggregator("getCustomerName");
Map result = (Map) orderCache.aggregate(
    Collections.singleton(orderId), reducer);
String customerName = (String) result.get(orderId);

Thus only the value we are interested in is transferred across the network
to the client. For queries involving large sets, where the fields of interest are small relative to the objects that contain them, the savings can be
significant. Of course, if we use POF and the ValueExtractor constructor
of ReducerAggregator, there is also a considerable efficiency improvement in
the storage nodes as it is no longer necessary to deserialise any of the objects.
To return a list of fields for each matching object we can use a MultiExtractor
with an array of ValueExtractor for the fields we wish to return. The return
value of the ReducerAggregator is a map whose key is the cache key and whose
value is the list of extracted values:
NamedCache orderCache = CacheFactory.getCache("order");
ValueExtractor extractor = new MultiExtractor(new ValueExtractor[] {
    new ReflectionExtractor("getCustomerName"),
    new ReflectionExtractor("getPostCode")
});
EntryAggregator aggregator = new ReducerAggregator(extractor);
Map<Integer, List<Object>> resultMap =
    (Map<Integer, List<Object>>) orderCache.aggregate(
        AlwaysFilter.INSTANCE, aggregator);
for (List<Object> values : resultMap.values()) {
    String customerName = (String) values.get(0);
    String postCode = (String) values.get(1);
    // Do something
}

If we frequently need to extract the same set of fields in different queries, or
if our aversion to the lack of type-safety in using MultiExtractor is sufficiently
strong, we might choose to write a custom class to hold the extracted object,
and a ValueExtractor implementation to construct it. To encapsulate the
name and postcode above we could write a SummaryCustomerDetail class as in

Listing 4.1: Custom domain object for a projection

public class SummaryCustomerDetail implements Serializable {

    private String customerName;
    private String postCode;

    public SummaryCustomerDetail(String customerName, String postCode) {
        this.customerName = customerName;
        this.postCode = postCode;
    }

    public String getCustomerName() {
        return customerName;
    }

    public String getPostCode() {
        return postCode;
    }
}

Listing 4.2: Custom value extractor


public class CustomerSummaryExtractor implements ValueExtractor, Serializable {
    public static final CustomerSummaryExtractor INSTANCE = new CustomerSummaryExtractor();

    public Object extract(Object obj) {
        CustomerOrder order = (CustomerOrder) obj;
        return new SummaryCustomerDetail(
            order.getCustomerName(), order.getPostCode());
    }
}

listing 4.1, and a ValueExtractor to create it, CustomerSummaryExtractor,
listing 4.2.
We can then perform the projection query:
EntryAggregator aggregator =
    new ReducerAggregator(CustomerSummaryExtractor.INSTANCE);
Map<Integer, SummaryCustomerDetail> resultMap =
    (Map<Integer, SummaryCustomerDetail>)
        orderCache.aggregate(AlwaysFilter.INSTANCE, aggregator);

If we are using POF, we have introduced the penalty of deserialising the entire CustomerOrder object in the storage node in order to extract the two fields.
Assuming that we follow the best practice described in subsection 3.2.3:
Define Common Extractors as Static Member Variables of defining simple
extractors in the domain class that they refer to, like this:
@Portable
public class CustomerOrder implements Serializable {
    private static final int POF_CUSTOMERNAME = 1;
    private static final int POF_POSTCODE = 3;

    @PortableProperty(POF_CUSTOMERNAME)
    private String customerName;

    @PortableProperty(POF_POSTCODE)
    private String postCode;

    public static final AbstractExtractor POSTCODEEXTRACTOR =
        new PofExtractor(String.class, POF_POSTCODE);
    public static final AbstractExtractor CUSTOMERNAMEEXTRACTOR =
        new PofExtractor(String.class, POF_CUSTOMERNAME);
    .
    .
    .
}

Then we can rewrite our CustomerSummaryExtractor to implement instead the
EntryExtractor interface:
public class CustomerSummaryEntryExtractor extends EntryExtractor {
    public static final CustomerSummaryEntryExtractor INSTANCE =
        new CustomerSummaryEntryExtractor();

    public Object extractFromEntry(Entry entry) {
        InvocableMap.Entry iEntry = (com.tangosol.util.InvocableMap.Entry) entry;
        return new SummaryCustomerDetail(
            (String) iEntry.extract(CustomerOrder.CUSTOMERNAMEEXTRACTOR),
            (String) iEntry.extract(CustomerOrder.POSTCODEEXTRACTOR));
    }
}

4.2.2 Covered Indexes

We do not have to use POF to be able to extract fields without deserialisation. If the ValueExtractor we provide to the ReducerAggregator is also used
to define an index on the cache, the ReducerAggregator will extract the field
directly from the internal forward index without referring to the cache value
at all. We can verify this using the DeserialisationCheckingPofContext we developed in subsection 3.8.1: Tracking Deserialisation While Testing.
@Test
public void testCoveredExtractor() {
    NamedCache orderCache = CacheFactory.getCache("order");
    orderCache.put(1, new CustomerOrder(
        1, "David Cameron", "10 Downing Street", "SW1A 2AA"));
    ValueExtractor extractor = new ReflectionExtractor("getCustomerName");
    orderCache.addIndex(extractor, false, null);
    DeserialisationAggregator aggCheck =
        new DeserialisationAggregator(CustomerOrder.class);
    // resets the deserialisation count after constructing the index
    orderCache.aggregate(AlwaysFilter.INSTANCE, aggCheck);
    EntryAggregator aggregator = new ReducerAggregator(extractor);
    Map<Integer, String> resultMap = (Map<Integer, String>)
        orderCache.aggregate(AlwaysFilter.INSTANCE, aggregator);
    for (String customerName : resultMap.values()) {
        assertEquals("David Cameron", customerName);
    }
    assertEquals(1, resultMap.size());
    // check we haven't deserialised the order
    assertEquals(Integer.valueOf(0),
        (Integer) orderCache.aggregate(AlwaysFilter.INSTANCE, aggCheck));
}

A point to note: there are many interfaces and classes in Coherence that
are given a Map.Entry to work with, such as EntryExtractor. How do you take
advantage of index covering in your own code? Naively, we might extract a
value from an entry thus:

value = extractor.extractFromEntry(entry);

but the preferred way is:

value = ((InvocableMap.Entry) entry).extract(extractor);

The limitation of the latter approach is that it requires an InvocableMap.Entry
rather than a Map.Entry, but in most cases of interest the runtime type we're
dealing with can be so cast. There are two advantages:

- the former approach works only for a subclass of EntryExtractor such
  as a PofExtractor; the latter will work equally for a ReflectionExtractor.
- if extractor defines an index, the value will be obtained from the index,
  which is more efficient even when using a PofExtractor.
There is a convenient utility method that does the necessary type checking
and calls the most appropriate method:
value = InvocableMapHelper.extractFromEntry(extractor, entry);

4.2.3 DeserializationAccelerator

In Coherence 12.1.3, the DeserializationAccelerator was introduced. This is
an IndexAwareExtractor that creates a forward-only index containing the
deserialised value. The result is that extracting the cache value in an
EntryProcessor or EntryAggregator becomes a very cheap operation as the
deserialised value can be obtained instantly by index covering. There are
costs though:

- Memory use is increased as both the serialised and deserialised forms
  of the cache value are stored.
- There is an increased CPU overhead when updating values.
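As a minimal usage sketch (not an example from the book's code, and assuming the no-argument constructor of com.tangosol.util.extractor.DeserializationAccelerator), the accelerator is installed like any other index:

// The "index" built here simply caches the deserialised value of each entry;
// no ordering or comparator is needed.
NamedCache orderCache = CacheFactory.getCache("order");
orderCache.addIndex(new DeserializationAccelerator(), false, null);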

4.3 Conditional Indexes

Objective
Give an example of the use of a conditional index, and elaborate on
behaviour
Prerequisites
Our example uses POF and the polymorphic cache concept described
in section 3.4: Polymorphic Caches
Code examples
In the org.cohbook.queries package of the queries module.
The Oracle on-line documentation contains a good summary of conditional
indexes, with a simple example showing how to index only the non-null values
of a field: http://docs.oracle.com/middleware/1212/coherence/COHDG/api_querycache.htm#CDECFCHF.
Here we'll look at creating a conditional index that indexes a field that is
only defined for some entries in a polymorphic cache.

4.3.1 Conditional Index on a Polymorphic Cache

We have a polymorphic cache containing orders from account customers and
from individual non-account customers. Both are based on an AbstractOrder
containing details of the items ordered. The AccountOrder of listing 4.3 has
an account id, whereas the CustomerOrder of listing 4.4 has name and address
details.

Listing 4.3: Account order domain object


@Portable
public class AccountOrder extends AbstractOrder {
    public static final int POF_CUSTOMERACCOUNT = 2;

    @PortableProperty(POF_CUSTOMERACCOUNT)
    private int customerAccount;

    public static final AbstractExtractor CUSTOMERACCOUNTEXTRACTOR =
        new PofExtractor(Integer.class, POF_CUSTOMERACCOUNT);

    // constructors, getters etc
}

Listing 4.4: Customer order domain object

@Portable
public class CustomerOrder extends AbstractOrder implements Serializable {
    private static final long serialVersionUID = -2463128160190979868L;
    private static final int POF_CUSTOMERNAME = 2;
    private static final int POF_CUSTOMERADDRESS = 3;
    private static final int POF_POSTCODE = 4;

    @PortableProperty(POF_CUSTOMERNAME)
    private String customerName;

    @PortableProperty(POF_CUSTOMERADDRESS)
    private String customerAddress;

    @PortableProperty(POF_POSTCODE)
    private String postCode;

    public static final AbstractExtractor POSTCODEEXTRACTOR =
        new PofExtractor(String.class, POF_POSTCODE);
    public static final AbstractExtractor CUSTOMERNAMEEXTRACTOR =
        new PofExtractor(String.class, POF_CUSTOMERNAME);

    // constructors, getters etc
}

It would be useful to index this cache by the customerAccount, but this is only
valid for orders of type AccountOrder. We can, however, construct the index
using the CUSTOMERACCOUNTEXTRACTOR with an instance of SimpleTypeIdFilter that
we described in section 3.4: Polymorphic Caches.
NamedCache orderCache = CacheFactory.getCache("order");
ConfigurablePofContext pofContext = (ConfigurablePofContext)
    orderCache.getCacheService().getSerializer();
int accountOrderTypeId = pofContext.getUserTypeIdentifier(AccountOrder.class);
Filter filter = new SimpleTypeIdFilter(accountOrderTypeId);
ConditionalExtractor extractor = new ConditionalExtractor(
    filter, AccountOrder.CUSTOMERACCOUNTEXTRACTOR, true);
orderCache.addIndex(extractor, true, null);

In section 3.4 we described the technique of using a SimpleTypeIdFilter to
restrict a query to a particular run-time type at the time that we perform the
query, but now that we have defined a conditional index, this isn't necessary
for queries that use the indexed extractor:
@Test
public void testQuery() {
    NamedCache orderCache = CacheFactory.getCache("order");
    orderCache.put(1, new CustomerOrder(
        1, "David Cameron", "10 Downing Street", "SW1A 2AA"));
    orderCache.put(2, new AccountOrder(42));
    Collection<Integer> keys = orderCache.keySet(new EqualsFilter(
        AccountOrder.CUSTOMERACCOUNTEXTRACTOR, 42));
    Assert.assertEquals(1, keys.size());
    Assert.assertEquals(Integer.valueOf(2), (Integer) keys.iterator().next());
}


This query examines only entries that match the conditional index. If we
perform the test without the index in place, we get an exception as the
CUSTOMERACCOUNTEXTRACTOR is applied to the CustomerOrder instance in the cache,
which just happens to use the same POF index for a field of a different
type:
Portable(com.tangosol.util.WrapperException): (Wrapped: Failed request execution for
DistributedCache service on Member(Id=1, Timestamp=2014-05-01 13:06:46.63,
Address=127.0.0.1:22000, MachineId=30438, Location=site:DefaultSite,rack:DefaultRack,
machine:DefaultMachine,process:27370, Role=DedicatedStorageEnabledMember)
(Wrapped) unable to convert type -15 to a numeric type) unable to convert type -15
to a numeric type

A word of caution here: conventional indexes may affect how quickly a result
set is returned, but should not change the content of the result set. The
presence of a conditional index implicitly filters queries that use that index,
changing the semantics of the query.

4.4 Querying Collections

Objective
We look at how to perform queries - extractions, filters, and indexes - on
fields that are themselves collections of other objects, using reflection
and POF.
Prerequisites
An understanding of the concepts covered in the earlier sections of this
chapter, and of POF serialisation covered in chapter 3: Serialisation
Code examples
Are in the org.cohbook.queries.collections package of the queries project.
We also use the domain objects from org.cohbook.queries.domain
The Coherence API contains filters that can be used to query simple collections. For example, if we have a cache of Order instances, and Order has a
method Collection<String> getProducts() that returns all the products on the
order, then we can query for all orders that include green widgets with:
NamedCache orderCache = CacheFactory.getCache("order");
Set orderEntries = orderCache.entrySet(new ContainsFilter("getProducts", "GRNWDG"));

We can create an index to support this query; the ValueExtractor implied by
the above filter is a new ReflectionExtractor("getProducts"). If we create an
index using this extractor, Coherence will recognise that the value returned
by the extractor is a collection and will individually index each value in the

Listing 4.5: Example Order Data as JSON

{
    "CustomerOrder": {
        "orderId": 42,
        "customerName": "David Cameron",
        "customerAddress": "10 Downing Street",
        "postCode": "SW1A 2AA",
        "orderLines": [
            {
                "product": "BLUWDG",
                "itemPrice": 0.23,
                "quantity": 100
            },
            {
                "product": "GLDWDG",
                "itemPrice": 7.99,
                "quantity": 100
            }
        ]
    }
}

collection. This works just as well for a POF query, if the POF serialised
order object contains a collection of product codes.
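As a one-line illustration (hypothetical index setup, not shown in the book's code), such an index would be created with the same extractor that the ContainsFilter above implies:

// Coherence detects that getProducts() returns a collection and indexes each
// element individually, so the ContainsFilter query above can be satisfied by it.
orderCache.addIndex(new ReflectionExtractor("getProducts"), false, null);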
But things are a little more complicated in our example object model. Listing
4.5 shows a CustomerOrder that contains a collection of OrderLine, each of which
has a product, illustrated as JSON.
We could simply add a Collection<String> getProducts() method on the order
that iterates the collection of order lines and assembles a collection of product
codes to return (roughly as sketched below), but that would require us to
deserialise the entire order to perform the query. Alternatively, we could add
a field that contains the list of product codes, maintained in line with the
order lines. The products could then be efficiently extracted using a POF
extractor, but we are adding complexity to the domain object to support a
particular query, and also duplicating data. We really need to write a custom
ValueExtractor to extract the products from the order lines.
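For clarity, this is roughly what that rejected convenience accessor would look like; a hypothetical sketch assuming OrderLine exposes getProduct() as in listing 4.7:

// Works, but the whole CustomerOrder must be deserialised to answer the query.
public Collection<String> getProducts() {
    Collection<String> products = new ArrayList<>();
    for (OrderLine line : getOrderLines()) {
        products.add(line.getProduct());
    }
    return products;
}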

4.4.1 A Collection Element Extractor

We'll start with listing 4.6, a ValueExtractor that encompasses a pair of
ValueExtractor fields: one to extract a collection, and another that can
extract a field from an element in that collection. We can use this extractor
with a ReducerAggregator to obtain the list of products on a particular order
as shown in listing 4.7.
This will work just fine, so long as collectionExtractor and elementExtractor
are both instances of ReflectionExtractor, but a PofExtractor inherits its im-


Listing 4.6: A collection element extractor


@Portable
public class CollectionElementExtractor implements ValueExtractor, Serializable {
    @PortableProperty(0)
    private ValueExtractor collectionExtractor;

    @PortableProperty(1)
    private ValueExtractor elementExtractor;

    public CollectionElementExtractor() {
    }

    public CollectionElementExtractor(
            ValueExtractor collectionExtractor, ValueExtractor elementExtractor) {
        this.collectionExtractor = collectionExtractor;
        this.elementExtractor = elementExtractor;
    }

    public Object extractFromCollection(Collection<Object> input) {
        Collection<Object> result = new ArrayList<>(input.size());
        for (Object element : input) {
            result.add(elementExtractor.extract(element));
        }
        return result;
    }

    public Object extract(Object oTarget) {
        Collection<Object> collection = (Collection<Object>)
            collectionExtractor.extract(oTarget);
        return extractFromCollection(collection);
    }

    // hashcode and equals (essential for index support!)
}

Listing 4.7: Extracting products with a ReducerAggregator


NamedCache orderCache = CacheFactory.getCache("order");
ValueExtractor productExtractor = new CollectionElementExtractor(
        new ReflectionExtractor("getOrderLines"),
        new ReflectionExtractor("getProduct"));
EntryAggregator reducer = new ReducerAggregator(productExtractor);
int orderId = 42;
Map<Integer, Collection<String>> orderProducts = (Map<Integer, Collection<String>>)
        orderCache.aggregate(Collections.singleton(orderId), reducer);


A PofExtractor is intended to be called via its extractFromEntry method; to
support this, we must do the same, creating a subclass of AbstractExtractor
and implementing extractFromEntry:
public class CollectionElementExtractor extends AbstractExtractor {
    .
    .
    .
    public Object extractFromEntry(Entry entry) {
        if (collectionExtractor instanceof AbstractExtractor) {
            Collection<Object> collection = (Collection<Object>)
                    ((AbstractExtractor) collectionExtractor).extractFromEntry(entry);
            return extractFromCollection(collection);
        } else {
            return extract(entry.getValue());
        }
    }

This will work for either a ReflectionExtractor or a PofExtractor to obtain the
collection:
ValueExtractor productExtractor = new CollectionElementExtractor(
        new PofExtractor(AbstractOrder.POF_ORDERLINES),
        new ReflectionExtractor("getProduct"));

In some cases this might be a worthwhile efficiency gain, if the collection
is small compared to the object that contains it. We can't easily apply
a PofExtractor to the elements of the collection using this approach - we
don't have a BinaryEntry instance to pass to the elementExtractor. To do that
we need to take a different approach and write a POF-specific collection
extractor.

4.4.2 A POF Collection Extractor

When we are dealing with POF, we can use a PofNavigator to extract a
PofValue instance from the serialised value without deserialising. Where the
PofValue represents a serialised collection, we can cast it to a PofCollection,
which provides a getChild(int index) method to obtain the PofValue representing
a single element within the collection. So, in listing 4.8 we use
two PofNavigator instances: collectionNavigator to give us the collection, and
elementNavigator, which can be applied to each element of the collection to
get the field of interest.3
There's no free lunch here. There is CPU overhead in traversing the POF
stream to obtain the unserialised value.

3 We could also write a ValueExtractor to extract just a single field from the nth
element in the collection. I leave that as an exercise for you, reader.


Listing 4.8: A POF collection element extractor


@Portable
public class PofCollectionElementExtractor extends AbstractExtractor {
    @PortableProperty(0)
    private PofNavigator collectionNavigator;
    @PortableProperty(1)
    private PofNavigator elementNavigator;

    public PofCollectionElementExtractor() {
    }

    public PofCollectionElementExtractor(PofNavigator collectionNavigator,
            PofNavigator elementNavigator) {
        this.collectionNavigator = collectionNavigator;
        this.elementNavigator = elementNavigator;
    }

    public Object extractFromEntry(Entry entry) {
        BinaryEntry binaryEntry = (BinaryEntry) entry;
        PofContext pofContext = (PofContext) binaryEntry.getSerializer();
        PofValue pofValue = PofValueParser.parse(binaryEntry.getBinaryValue(), pofContext);
        PofCollection pofCollection = (PofCollection) collectionNavigator.navigate(pofValue);
        Collection<Object> result = new ArrayList<>(pofCollection.getLength());
        for (int i = 0; i < pofCollection.getLength(); i++) {
            PofValue pofElement = pofCollection.getChild(i);
            PofValue extractedValue = elementNavigator.navigate(pofElement);
            result.add(extractedValue.getValue());
        }
        return result;
    }

    // As before, we must implement hashCode and equals if we want to use
    // this extractor to define an index
}


In some cases this may even be more expensive than deserialisation, though
with less heap churn and GC load.
The extractor for the ReducerAggregator now becomes:
ValueExtractor productExtractor = new PofCollectionElementExtractor(
        new SimplePofPath(AbstractOrder.POF_ORDERLINES),
        new SimplePofPath(OrderLine.POF_PRODUCT));

4.4.3 Querying With The Collection Extractor

Either CollectionElementExtractor or PofCollectionElementExtractor can now
be used to find all orders that include blue widgets:
ValueExtractor productExtractor = new PofCollectionElementExtractor(
        new SimplePofPath(AbstractOrder.POF_ORDERLINES),
        new SimplePofPath(OrderLine.POF_PRODUCT));
Filter filter = new ContainsFilter(productExtractor, "BLUWDG");
Collection<Integer> blueOrderKeys = orderCache.keySet(filter);

and provided that we have correctly implemented hashCode and equals on the
extractor, we can improve the efficiency of the query by creating an index:
NamedCache orderCache = CacheFactory.getCache("order");
ValueExtractor productExtractor = new PofCollectionElementExtractor(
        new SimplePofPath(AbstractOrder.POF_ORDERLINES),
        new SimplePofPath(OrderLine.POF_PRODUCT));
orderCache.addIndex(productExtractor, true, null);
Filter filter = new ContainsFilter(productExtractor, "BLUWDG");
Collection<Integer> blueOrderKeys = orderCache.keySet(filter);

Consider that when using an index, there is no significant difference in the
efficiency of evaluating the query whether you use POF or reflection, as
the index stores the values in object form. The deserialisation cost of using
reflection occurs at the time of update rather than at query time when using an
index. You may consider that the overhead is acceptable and opt for the
simpler reflection approach, but there is one easily overlooked, yet critical,
case to consider: when a repartitioning occurs - typically because of a storage
node or machine failure. As a partition is received by a storage member,
a reflection-based index will cause every entry in that partition to be deserialised.
This has been observed in some circumstances to cause a sudden
surge of object creation in the heap, triggering a full GC, ejection of the
member from the cluster, causing more repartitioning, ultimately leading to
a cascading failure of the entire cluster. As always, test.

4.4.4 Derived Values

Our AbstractOrder class contains a method to obtain the total order value:
public double getOrderValue() {
    double result = 0.0;
    for (OrderLine orderline : orderLines) {
        result += orderline.getValue();
    }
    return result;
}

which calls a method in OrderLine:
public double getValue() {
    return itemPrice * quantity;
}

The order value and the order line value are not stored in the serialised form. To
query them we must either:
Write a custom serialiser that calls getOrderValue() to obtain the derived
value and stores it in the POF stream so that it is available for a
PofExtractor.4 The derived value is discarded when deserialising.
Write a custom PofExtractor that performs the calculation independently.
Use a reflection extractor.
Which approach is appropriate depends on your requirements - is it more
important to you to minimise the size of the stored serialised objects and the
CPU and GC load, or to minimise complexity? By way of illustration, listing
4.9 shows the implementation of a POF extractor to calculate the order value
with minimum deserialisation.

4 Storing a derived value in the POF stream falls down if you use a PofUpdater to
change any of the field values used in the derivation.


Listing 4.9: Extracting a derived value


public class OrderValueExtractor extends AbstractExtractor {
    private static final PofNavigator ORDERLINESNAVIGATOR =
            new SimplePofPath(AbstractOrder.POF_ORDERLINES);
    private static final PofNavigator ITEMPRICENAVIGATOR =
            new SimplePofPath(OrderLine.POF_ITEMPRICE);
    private static final PofNavigator QUANTITYNAVIGATOR =
            new SimplePofPath(OrderLine.POF_QUANTITY);

    public Object extractFromEntry(Entry entry) {
        BinaryEntry binaryEntry = (BinaryEntry) entry;
        PofContext pofContext = (PofContext) binaryEntry.getSerializer();
        Binary binaryValue = binaryEntry.getBinaryValue();
        PofValue pofValue = PofValueParser.parse(binaryValue, pofContext);
        PofCollection pofCollection = (PofCollection) ORDERLINESNAVIGATOR.navigate(pofValue);
        double result = 0.0;
        for (int i = 0; i < pofCollection.getLength(); i++) {
            PofValue pofElement = pofCollection.getChild(i);
            int quantity = (int) QUANTITYNAVIGATOR
                    .navigate(pofElement).getValue();
            double itemprice = (double) ITEMPRICENAVIGATOR
                    .navigate(pofElement).getValue();
            result += quantity * itemprice;
        }
        return result;
    }
    .
    .
    .
}
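For illustration, a query using this extractor might look like the following sketch (the 100.0 threshold is an arbitrary value, and equals and hashCode would be needed before using the extractor to back an index):

NamedCache orderCache = CacheFactory.getCache("order");
Filter bigOrders = new GreaterFilter(new OrderValueExtractor(), 100.0);
Set bigOrderKeys = orderCache.keySet(bigOrders);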

4.5 Custom Indexes

Objective
Demonstrate the use of custom index-aware filters and custom indexes,
starting with use of a custom filter against a conventional index.
Dependencies
Littlegrid, Apache commons-math3
There are three interfaces involved in implementing and using a custom
index:
an implementation of MapIndex that contains the index data structure
and whose methods will be called as entries are added, modified, or
removed from the cache.
an implementation of IndexAwareExtractor that will be used to create
and destroy the MapIndex instance when used in a call to QueryMap.addIndex
or QueryMap.removeIndex. It also serves as the key to extract the correct
index from the index map passed to IndexAwareFilter.applyIndex.
an implementation of IndexAwareFilter that uses a ValueExtractor (which
may or may not be an IndexAwareExtractor) to find which index to use
(which may or may not be a custom implementation of MapIndex).
The conditional index that we met in section 4.3: Conditional Indexes works
by using the ConditionalExtractor class, which implements IndexAwareExtractor,
to create an instance of type ConditionalIndex, which implements MapIndex.
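By way of reminder, creating such a conditional index is an ordinary addIndex call with a ConditionalExtractor. A minimal sketch, in which the filter, the 100.0 threshold, and the getCustomerName() accessor are illustrative assumptions:

// Index the customer name, but only for orders whose total value exceeds 100.0.
Filter bigOrders = new GreaterFilter(new ReflectionExtractor("getOrderValue"), 100.0);
ValueExtractor customerName = new ReflectionExtractor("getCustomerName");
orderCache.addIndex(new ConditionalExtractor(bigOrders, customerName, true), false, null);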

4.5.1 IndexAwareFilter on a SimpleMapIndex

The IndexAwareFilter provides a means of accessing the index map when
evaluating a query via the public Filter applyIndex(Map indexmap, Set keyset)
method. The indexmap argument is the map of indexes available on the cache
being queried, and keyset is the set of keys that we wish to evaluate the
index against. This might be the entire set on the member, or a subset after
applying other filters. The implementation works by removing keys that
do not pass the filter evaluation from the keyset argument. If a Filter is
returned, that filter will subsequently be applied to the remaining contents
of keyset. A return value of null indicates that no further filtering of the
candidate key set is required.


To demonstrate the functionality, we will write a filter that passes only those
keys that have values unique for an associated key. For example, if our order cache
has a key comprising order id and depot id, with depot id as an associated
key, then we can find cases where there is only a single order for a customer
(the indexed value) at a depot (the associated key):
@Portable
public class OrderKey implements KeyAssociation {
    public static final int POF_ORDERID = 0;
    public static final int POF_DEPOTID = 1;
    @PortableProperty(POF_ORDERID)
    private int orderId;
    @PortableProperty(POF_DEPOTID)
    private int depotId;

    public OrderKey() {
    }

    public OrderKey(int orderId, int depotId) {
        this.orderId = orderId;
        this.depotId = depotId;
    }

    public int getOrderId() {
        return orderId;
    }

    public int getDepotId() {
        return depotId;
    }

    @Override
    public Object getAssociatedKey() {
        return getDepotId();
    }

    // hashCode and equals
}

We could achieve the objective of finding keys that match these criteria
using a custom aggregator. The advantage of using a filter is that we may
then perform operations on the set of keys (e.g. via an EntryProcessor) in a
single pass. We shall write an implementation of IndexAwareFilter and call it
UniqueValueFilter. In the applyIndex method we must first obtain the index
from the passed Map. The map key is the ValueExtractor used to construct the index,
so we hold this in a member variable of the filter. The index we obtain will
usually be an instance of SimpleMapIndex, though it could be a class of our own
devising if we create a custom index - more on this in the next section.
There are two particular methods of interest on SimpleMapIndex (a short usage
sketch follows the list):
Object get(Object key) returns the indexed value for the given cache entry
key. This is from the internal forward index.
Map getIndexContents() returns the index map. The key to this map is
the indexed value, and each value is the collection of cache keys that
match that value.
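As a sketch of how those two methods are used together (variable names are illustrative only):

SimpleMapIndex index = (SimpleMapIndex) indexMap.get(indexExtractor);
Object indexedValue = index.get(cacheKey);                            // forward: key -> value
Collection<?> keysForValue =
        (Collection<?>) index.getIndexContents().get(indexedValue);  // inverse: value -> keys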


We iterate over each of the candidate keys in keyset, find the corresponding
indexed value, and determine whether there are other keys in the candidate
set that map to the same value and that share the same associated key.
@Portable
public class UniqueValueFilter implements IndexAwareFilter {
    @PortableProperty(0)
    private ValueExtractor indexExtractor;

    @Override
    public Filter applyIndex(Map indexmap, Set keyset) {
        SimpleMapIndex index = (SimpleMapIndex) indexmap.get(indexExtractor);
        if (index == null) {
            throw new IllegalArgumentException(
                    "This filter only works with a supporting index");
        }
        Collection<Object> matchingKeys = new ArrayList<>(keyset.size());
        for (Object candidateKey : keyset) {
            Object value = index.get(candidateKey);
            Collection<Object> otherKeys =
                    (Collection<Object>) index.getIndexContents().get(value);
            if (isCandidateUniqueInSet(candidateKey, keyset, otherKeys)) {
                matchingKeys.add(candidateKey);
            }
        }
        keyset.retainAll(matchingKeys);
        return null;
    }

    private boolean isCandidateUniqueInSet(
            Object candidateKey, Set<Object> keyset, Collection<Object> otherKeys) {
        Object candidateAssociatedKey = getAssociatedKey(candidateKey);
        for (Object compareKey : otherKeys) {
            if (!candidateKey.equals(compareKey)
                    && keyset.contains(compareKey)
                    && candidateAssociatedKey.equals(
                            getAssociatedKey(compareKey))) {
                return false;
            }
        }
        return true;
    }
}

Now we must implement the getAssociatedKey(Object key) method, and here
there is a problem. The object instances we are working with for both
keys and values are serialised Binary objects - we don't have a means of
extracting the associated key from the binary. We could deserialise the key
and call its getAssociatedKey method, or we could use a ValueExtractor to
deserialise only the associated key, but either of these techniques requires
access to the PofContext for the service. Unfortunately we have no means
within the IndexAwareFilter or SimpleMapIndex API of obtaining it. It is held in
a protected field, SimpleMapIndex.m_ctx, so we could use the Apache commons
lang FieldUtils class to get it:


private PofContext getPofContext(SimpleMapIndex mapIndex) {
    BackingMapContext bmc;
    try {
        bmc = (BackingMapContext) FieldUtils.readField(mapIndex, "m_ctx", true);
    } catch (IllegalAccessException e) {
        throw new RuntimeException(e);
    }
    return (PofContext) bmc.getManagerContext().getCacheService().getSerializer();
}

For those less sanguine about diving beneath the supported API, you might
add the cache name as a field of the filter and use it to resolve the service:
private String cacheName;
.
.
.
private PofContext getPofContext() {
    return (PofContext) CacheFactory.getCache(cacheName)
            .getCacheService().getSerializer();
}

Now we can write the getAssociatedKey method, first initialising a transient
serialiser field in the applyIndex method. We also provide a PofNavigator
field to take the associated key from the cache key5:
@PortableProperty(0)
private ValueExtractor indexExtractor;
@PortableProperty(1)
private PofNavigator navigator;
private transient PofContext serialiser;
.
.
.
public Filter applyIndex(Map indexmap, Set keyset) {
    SimpleMapIndex index = (SimpleMapIndex) indexmap.get(indexExtractor);
    if (index == null) {
        throw new IllegalArgumentException(
                "This filter only works with a supporting index");
    }
    serialiser = getPofContext(index);
    .
    .
    .
}

private Object getAssociatedKey(Object key) {
    PofValue pofKey = PofValueParser.parse((Binary) key, serialiser);
    return navigator.navigate(pofKey).getValue();
}

There are other methods we need to implement to complete our filter. The
calculateEffectiveness method is used by Coherence to decide whether to call
applyIndex.

5 Alternatively, we could obtain the KeyAssociator from the cache's
PartitionedService, with the same caveat as obtaining the PofContext - we have
no convenient means of obtaining it. Using a KeyAssociator would also require us to
deserialise each key to obtain the associated key. For variety, we use this approach later
in subsection 4.5.2: A Custom Index Implementation.


For our example, we must always use applyIndex, as evaluate and
evaluateEntry do not have the full context needed to perform the evaluation.
We therefore implement calculateEffectiveness and the evaluate methods as
follows:
public int calculateEffectiveness(Map map, Set set) {
    return 1;
}

public boolean evaluateEntry(Entry entryArg) {
    throw new UnsupportedOperationException(
            "This filter only works with a supporting index");
}

public boolean evaluate(Object obj) {
    throw new UnsupportedOperationException(
            "This filter only works with a supporting index");
}

And finally the complete constructor and fields for the class: the extractor
that identifies the index to use, the PofNavigator to extract the associated
key from the primary key, and the transient PofContext:
@PortableProperty(0)
private ValueExtractor indexExtractor;
@PortableProperty(1)
private PofNavigator navigator;
private transient PofContext serialiser;

// For POF
public UniqueValueFilter() {
}

public UniqueValueFilter(
        ValueExtractor indexExtractor, int associatedKeyPofIndex) {
    this.indexExtractor = indexExtractor;
    this.navigator = new SimplePofPath(associatedKeyPofIndex);
}

Now, we can use our filter to find all of the orders where the order is the
only one for a customer at the depot:
NamedCache orderCache = CacheFactory.getCache("order");
ValueExtractor customerExtractor =
        new PofExtractor(Integer.class, AccountOrder.POF_CUSTOMERACCOUNT);
orderCache.addIndex(customerExtractor, true, null);
Filter filter = new UniqueValueFilter(customerExtractor, OrderKey.POF_DEPOTID);
Set<OrderKey> uniqueCustomerKeys = orderCache.keySet(filter);

We can modify the UniqueValueFilter to handle collection indexes. In the
applyIndex method we check whether the indexed value in the forward index
of SimpleMapIndex is a collection, and if so we perform our uniqueness test
separately on each of the items in the value collection:

for (Object candidateKey : keyset) {
    Object value = index.get(candidateKey);
    if (value instanceof Collection) {
        for (Object valueItem : (Collection<Object>) value) {
            Collection<Object> otherKeys = (Collection<Object>)
                    index.getIndexContents().get(valueItem);
            if (isCandidateUniqueInSet(candidateKey, keyset, otherKeys)) {
                matchingKeys.add(candidateKey);
            }
        }
    } else {
        Collection<Object> otherKeys =
                (Collection<Object>) index.getIndexContents().get(value);
        if (isCandidateUniqueInSet(candidateKey, keyset, otherKeys)) {
            matchingKeys.add(candidateKey);
        }
    }
}

Now we can use a ValueExtractor that returns a collection in order to define
the index and to search for unique values, including, for example, the
PofCollectionElementExtractor we defined in section 4.4: Querying Collections.
In this piece of code we find all the orders at each depot that contain a
product that is not referenced by any other order at that depot:
NamedCache orderCache = CacheFactory.getCache("order");
ValueExtractor productExtractor = new PofCollectionElementExtractor(
        new SimplePofPath(AbstractOrder.POF_ORDERLINES),
        new SimplePofPath(OrderLine.POF_PRODUCT));
orderCache.addIndex(productExtractor, true, null);
Filter filter = new UniqueValueFilter(productExtractor, OrderKey.POF_DEPOTID);
Set<OrderKey> uniqueProductKeys = orderCache.keySet(filter);

4.5.2 A Custom Index Implementation

The IndexAwareFilter allows us to access the underlying data structure of
an index. I endeavoured in the previous section to demonstrate how we
might use this to achieve results with the default SimpleMapIndex structure
that are not possible using a conventional filter - though admittedly the
example use-case is a little strained, it served to demonstrate the first part
of a more useful pattern: the custom index based on any arbitrary data
structure you might choose to implement. The first line of the
UniqueValueFilter.applyIndex method is:
SimpleMapIndex index = (SimpleMapIndex) indexmap.get(indexExtractor);

But the value we obtain need not be an instance of SimpleMapIndex. We can
define our own class for an index implementing the MapIndex interface. To
construct the index instance, we create an implementation of IndexAwareExtractor
which is used in the call to addIndex on the cache.


Confusingly, although IndexAwareExtractor extends ValueExtractor, the inherited
extract method is not used as, in this example, we never use the extractor in a query.
We are concerned only with the IndexAwareExtractor methods createIndex and
destroyIndex, and with the equals and hashCode methods used when selecting
the index from the index map in the corresponding IndexAwareFilter.

When might we want to use a custom index? We can already index any value
that may be derived from a cache entry, even if not present in that entry,
by using a custom ValueExtractor. We already have ConditionalExtractor for
those cases where we might wish to index only some entries. The answer
is that custom indexes are of use where the query evaluation requires some
additional context, such as how the indexed value relates to other entries.
Generally this is only useful for relationships between associated keys as
other keys may be in other cluster members and so not easily accessible
during evaluation. We could even consider indexes that evaluate based on
the relationship with data in a separate cache, though race conditions might
be problematic.
The objective of using indexes is to make data retrieval faster, albeit at
the cost of slower data updates. In particular, if queries are executed far
more frequently than updates, it may be a net saving to perform even quite
complex computations at update time if it improves query efficiency. We'll
look at how we can use a custom index to maintain statistics on the index
contents, recalculated at the time of index update. Our index maintains an
Apache commons-math3 DescriptiveStatistics object for each associated key,
containing the statistics for the set of indexed values whose keys share that
associated key. The indexed value for this index must therefore be a double,
as that is the only type directly supported by DescriptiveStatistics.
Listing 4.10 shows an implementation, StatisticsIndex, in which we define
an inner class to contain the DescriptiveStatistics object, with methods to
add and remove entries. A limitation of DescriptiveStatistics is that, while
there is a method to add a new value, there is no mechanism to remove an
arbitrary value from the set, so we must maintain our own map of values
to support cache entry deletion. We maintain one instance of this class
per associated key, so define a map to maintain that association.
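As a reminder of the commons-math3 API we are leaning on, a minimal sketch (not from the example project):

DescriptiveStatistics stats = new DescriptiveStatistics();
stats.addValue(3.5);
stats.addValue(7.0);
double mean = stats.getMean();   // 5.25
// There is no removeValue(double), which is why the index below rebuilds the
// statistics when an entry is deleted.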
We'll need a couple of utility methods to extract the associated key from a
key, and to deserialise a key in Binary form. We'll keep the KeyAssociator and
Converter for these in fields and initialise them from the BackingMapContext in
the constructor. Also in the constructor we provide the ValueExtractor that
will be used to get the value from the cache entry:


Listing 4.10: A custom index to maintain statistics


public class StatisticsIndex implements MapIndex {
    .
    .
    .
    private class AssociatedKeyIndex {
        private final Map<Object, Double> forwardIndex;
        private DescriptiveStatistics statistics;

        private AssociatedKeyIndex() {
            this.forwardIndex = new HashMap<>();
            this.statistics = new DescriptiveStatistics();
        }

        private void addValue(Object key, double value) {
            forwardIndex.put(key, value);
            statistics.addValue(value);
        }

        private void removeValue(Object key) {
            forwardIndex.remove(key);
            statistics = new DescriptiveStatistics();
            for (Double value : forwardIndex.values()) {
                statistics.addValue(value);
            }
        }
    }

    private final Map<Object, AssociatedKeyIndex> associatedKeyMap;
    private final ValueExtractor valueExtractor;
    private final KeyAssociator keyAssociator;
    private final Converter keyFromInternalConverter;

    public StatisticsIndex(ValueExtractor valueExtractor, BackingMapContext context) {
        PartitionedService service =
                (PartitionedService) context.getManagerContext().getCacheService();
        this.valueExtractor = valueExtractor;
        this.keyAssociator = service.getKeyAssociator();
        this.keyFromInternalConverter =
                context.getManagerContext().getKeyFromInternalConverter();
        associatedKeyMap = new HashMap<>();
    }

    private Object convertKeyFromBinary(Object key) {
        if (key instanceof Binary) {
            return keyFromInternalConverter.convert(key);
        } else {
            return key;
        }
    }

    private Object getAssociatedKey(Object primaryKey) {
        primaryKey = convertKeyFromBinary(primaryKey);
        return keyAssociator.getAssociatedKey(primaryKey);
    }


The real work of maintaining the index is done in the insert, delete, and
update methods of the MapIndex interface. These are called by Coherence
as the cache contents change. The main work is done in the inner class
AssociatedKeyIndex defined above, so we now need to simply invoke the appropriate methods on that class, creating or deleting instances as necessary:
public void insert(Entry entry) {
    Object key = entry.getKey();
    double value = (double) InvocableMapHelper.extractFromEntry(valueExtractor, entry);
    insertInternal(key, value);
}

private void insertInternal(Object key, double value) {
    Object associatedKey = getAssociatedKey(key);
    AssociatedKeyIndex s = associatedKeyMap.get(associatedKey);
    if (s == null) {
        s = new AssociatedKeyIndex();
        associatedKeyMap.put(associatedKey, s);
    }
    s.addValue(key, value);
}

public void delete(Entry entry) {
    Object key = entry.getKey();
    deleteInternal(key);
}

private void deleteInternal(Object key) {
    Object associatedKey = getAssociatedKey(key);
    AssociatedKeyIndex s = associatedKeyMap.get(associatedKey);
    if (s != null) {
        s.removeValue(key);
        if (s.forwardIndex.isEmpty()) {
            associatedKeyMap.remove(associatedKey);
            s = null;
        }
    }
}

public void update(Entry entry) {
    Object key = entry.getKey();
    double value = (double) InvocableMapHelper.extractFromEntry(valueExtractor, entry);
    deleteInternal(key);
    insertInternal(key, value);
}

We also need methods to get data out of the index. We implement the
MapIndex.get method to retrieve the indexed value by key, and we also provide
a method to obtain the DescriptiveStatistics object associated with a key.
In some use cases we may want to call these methods with a Binary serialised
key, so it is convenient to call our convertKeyFromBinary method to ensure we
have a deserialised key.
public Object get(Object key) {
    key = convertKeyFromBinary(key);
    Object associatedKey = getAssociatedKey(key);
    AssociatedKeyIndex s = associatedKeyMap.get(associatedKey);
    return s == null ? null : s.forwardIndex.get(key);
}

public DescriptiveStatistics getStatistics(Object key) {
    key = convertKeyFromBinary(key);
    AssociatedKeyIndex s = associatedKeyMap.get(getAssociatedKey(key));
    return s == null ? null : s.statistics;
}


Now we have an index that will maintain a comprehensive set of statistics
per associated key. But how do we create an instance? We must implement
an IndexAwareExtractor to do so:
public class StatisticsIndexBuildExtractor implements IndexAwareExtractor {
    private ValueExtractor valueExtractor;

    public StatisticsIndexBuildExtractor() {
    }

    public StatisticsIndexBuildExtractor(ValueExtractor valueExtractor) {
        this.valueExtractor = valueExtractor;
    }

    public Object extract(Object obj) {
        throw new UnsupportedOperationException("not yet implemented");
    }

    public MapIndex createIndex(boolean flag, Comparator comparator, Map map,
            BackingMapContext backingmapcontext) {
        MapIndex index = new StatisticsIndex(valueExtractor, backingmapcontext);
        map.put(this, index);
        return index;
    }

    public MapIndex destroyIndex(Map map) {
        return (MapIndex) map.remove(this);
    }

    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result
                + ((valueExtractor == null) ? 0 : valueExtractor.hashCode());
        return result;
    }

    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null) {
            return false;
        }
        if (getClass() != obj.getClass()) {
            return false;
        }
        StatisticsIndexBuildExtractor other = (StatisticsIndexBuildExtractor) obj;
        if (valueExtractor == null) {
            if (other.valueExtractor != null) {
                return false;
            }
        } else if (!valueExtractor.equals(other.valueExtractor)) {
            return false;
        }
        return true;
    }
}

The work of creating and destroying the index is performed by the createIndex
and destroyIndex methods, but as with any ValueExtractor used to construct
an index, the equals and hashCode methods are crucial, as the extractor is used
as the key to the index map associated with the cache. To create the index,
we simply use this extractor in the usual way:
ValueExtractor indexBuildExtractor =
        new StatisticsIndexBuildExtractor(underlyingExtractor);
cache.addIndex(indexBuildExtractor, true, null);


where underlyingExtractor extracts a double value from the cache entry.
The extract method throws an UnsupportedOperationException because the
object argument to the method does not provide the context we need to locate
and use the index. We could also implement EntryExtractor, and in some
cases that may be appropriate, but here we have many values associated
with the same index in the DescriptiveStatistics object, so we'll instead
implement another, separate EntryExtractor that can extract the mean value from
the statistics.
public class MeanExtractor extends AbstractExtractor {
    private static final long serialVersionUID = -3722194152372224211L;
    @PortableProperty(0)
    private ValueExtractor valueExtractor;

    public MeanExtractor() {
    }

    public MeanExtractor(ValueExtractor valueExtractor) {
        this.valueExtractor = valueExtractor;
    }

    public Object extractFromEntry(Entry entry) {
        BackingMapContext context = ((BinaryEntry) entry).getBackingMapContext();
        ValueExtractor indexMapKey = new StatisticsIndexBuildExtractor(valueExtractor);
        StatisticsIndex index = (StatisticsIndex) context.getIndexMap().get(indexMapKey);
        if (index == null) {
            throw new IllegalStateException("No index defined to support this extractor");
        }
        DescriptiveStatistics stats = index.getStatistics(entry.getKey());
        return stats == null ? null : stats.getMean();
    }
}

We could provide other implementations to extract other statistics values,
or generalise this one with a parameter to select which of the statistics
provided by DescriptiveStatistics to extract. Now, let us be clear about what
the extractor gives us: it is the average of the set of values from all the cache
entries that share the same associated key, so it would return the same value
across all those keys. A query like:
Filter filter = new LessFilter(new MeanExtractor(underlyingExtractor), 5.0);
Set keySet = orderCache.keySet(filter);

would return all the keys that map to the set of associated keys where
the average value is less than 5.0. Perhaps more useful would be a query
that allows us to retrieve all the entries where the indexed value was, say,
more than the average for the set of associated entries. To do that, we need to
write an IndexAwareFilter that understands our custom index. We've already
covered the basics of writing an IndexAwareFilter, so we'll just concentrate
on the applyIndex method:


public class AboveAverageFilter implements IndexAwareFilter {
    private ValueExtractor valueExtractor;
    .
    .
    .
    public Filter applyIndex(Map indexMap, Set keySet) {
        ValueExtractor statisticsIndexKey = new StatisticsIndexBuildExtractor(valueExtractor);
        StatisticsIndex index = (StatisticsIndex) indexMap.get(statisticsIndexKey);
        if (index == null) {
            throw new IllegalArgumentException("No matching StatisticsIndex found");
        }
        for (Iterator<Object> keyit = keySet.iterator(); keyit.hasNext();) {
            Object key = keyit.next();
            Double value = (Double) index.get(key);
            DescriptiveStatistics statistics = index.getStatistics(key);
            if (value <= statistics.getMean()) {
                keyit.remove();
            }
        }
        return null;
    }
}

Executing this on a distributed service, we will find that the keys passed
to applyIndex are in serialised form, hence the need in our implementation of
StatisticsIndex to allow for the use of Binary keys. It would be more efficient to
implement the StatisticsIndex to use binary keys internally, but we would not
then be able to use the KeyAssociator from the PartitionedService to extract
the associated key, and would instead need to provide a PofExtractor as we
did in the UniqueValueFilter described in subsection 4.5.1: IndexAwareFilter
on a SimpleMapIndex above.

4.5.3 Further Reading

Alexey Ragozin has an excellent write-up on using a custom index to support
time series data.6 The principles Alexey espouses in that article form the
basis of my own experimental MVCC support for Coherence.7

6 See http://blog.ragozin.info/2011/10/grid-pattern-managing-versioned-data.html
7 http://www.shadowmist.co.uk/coherence-mvcc.pdf, code at https://code.google.com/p/shadow-mvcc-coherence/

Chapter 5

Grid Processing

5.1 Introduction

In this chapter we'll look at Coherence's capabilities for distributing computation through the cluster using EntryAggregator, EntryProcessor, and Invocable.
This is a complex area, full of incompletely documented subtleties of behaviour.

5.1.1 EntryAggregator vs EntryProcessor

EntryProcessor and EntryAggregator are similar in many respects:
They are both used to operate on sets of data distributed through the
cluster.
Both are guaranteed to execute against all relevant partitions in the
event of member failure and partition moves.
Both pin a partition while executing. A pinned partition will not move
to another member until execution has completed.
The principal difference is that EntryAggregator is strictly read-only. Any call
to a method on an Entry or BinaryEntry that would modify the entry will fail.
The following table summarises the key differences.

                EntryProcessor   EntryAggregator
Pin partition   Y                Y
Lock Keys       Y                N
Sort Keys       Y                N
Speed           Slower           Faster
Modify          Y                N
Reduce          Partial          Y

You can perform a reduce operation in a limited fashion in the processAll
method of an EntryProcessor, reducing the result set to a map with one entry
per member or partition.
An EntryAggregator is faster because, as it cannot modify an entry, it need
not lock or sort the keys it operates on. For this reason, EntryAggregator
should always be preferred over EntryProcessor for read-only operations on
data sets.
For any real-world use case we would always implement an aggregator as a
ParallelAwareAggregator. This gives us complete control of the reduce phase
of an aggregation operation, allowing us to manage combining partial result
sets from around the grid. EntryProcessor does give us some limited scope
for controlling the returned data set: the processAll method returns a map intended to be keyed by the cache keys affected, but we can implement it to
return anything we like in the map - Coherence will return a map combining
all of the maps from the individual invocations in partitions and members.
Provided that we can ensure that no two members or partitions will use the
same key, we can store anything we like in that returned map. Conversely,
EntryAggregator returns an Object, but that object could be a collection, a
selection and/or projection of the objects processed rather than a simple
aggregation.
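To make the per-member reduce idea concrete, here is a hedged sketch (the class name and the returned count are illustrative, not code from the example project):

public class CountingProcessor extends AbstractProcessor {
    @Override
    public Object process(InvocableMap.Entry entry) {
        // ... do the real per-entry work here ...
        return null;
    }

    @Override
    public Map processAll(Set setEntries) {
        for (InvocableMap.Entry entry : (Set<InvocableMap.Entry>) setEntries) {
            process(entry);
        }
        // One entry per member, keyed by member id, so results from different
        // members never collide when Coherence combines them.
        int memberId = CacheFactory.getCluster().getLocalMember().getId();
        return Collections.singletonMap(memberId, setEntries.size());
    }
}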

5.1.2 Useful Idioms

There are a few handy shortcuts in the standard Coherence API that might
save you some time if you know about them. Here's a selection.


Efficient Put
NamedCache.put(Object key, Object value) inherits java.util.Map semantics and
returns the previous value for the key. This is cheap for a simple map in one
JVM as it simply returns a reference to an easily accessible object. For a
distributed cache it can be expensive, as the old value is copied across the
network and deserialised, especially for large value objects. Rather than
writing:
cache.put(key, value);

it may be more efficient to write:
cache.putAll(Collections.singletonMap(key, value));

putAll is a void method, so the old value is never transferred.

Conditional Insert
Well documented - once you know it is there - to insert a value only if the
key is absent:
cache.invoke(key, new ConditionalPut(new NotFilter(PresentFilter.INSTANCE), value));

Several other useful classes have a static INSTANCE value where there are no
constructor arguments.
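A few of these singletons, by way of illustration (not an exhaustive list):

Filter always = AlwaysFilter.INSTANCE;                  // matches every entry
Filter never = NeverFilter.INSTANCE;                    // matches no entry
ValueExtractor identity = IdentityExtractor.INSTANCE;   // returns the value itself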

5.2 Void EntryProcessor

Objective
Show a technique for improving the efficiency of EntryProcessor execution where return values are not needed.
Invoking an EntryProcessor against a large number of cache entries does not
scale well. A call to invokeAll against a large cache can crash a client or proxy
node with an OutOfMemoryError, even if the EntryProcessor.process method
returns null. This is because the return value of invokeAll is a map of all
the keys affected. Even if the client ignores the return value, it is assembled
from the return values received from the individual cluster members.
Any EntryProcessor that subclasses AbstractProcessor inherits a processAll
method that iterates over a set of entries (all of the selected entries in a
partition or member, depending on configuration), and constructs the return
map for all of those entries. The client or proxy node that sends the requests
into the cluster assembles the map into a single result map for the cluster. If
we have no interest in the return value, then we must override processAll to
return an empty map; we can do this with an alternate abstract class:
public abstract class AbstractVoidProcessor implements EntryProcessor {
    public Map processAll(Set set) {
        for (InvocableMap.Entry entry : (Set<InvocableMap.Entry>) set) {
            process(entry);
        }
        return Collections.EMPTY_MAP;
    }
}

When you don't care about EntryProcessor return values, this approach ensures that client and proxy memory use is independent of cache size. It also
eliminates the network and CPU deserialisation overhead of transferring potentially large numbers of keys to a client.
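As a hypothetical illustration (not from the example project), a bulk delete processor built on this base class might look like:

public class BulkRemoveProcessor extends AbstractVoidProcessor {
    @Override
    public Object process(InvocableMap.Entry entry) {
        entry.remove(false);   // false: a real removal, not a synthetic one
        return null;
    }
}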

5.3 Keeping the Service Guardian Happy

Objective
To show how to stop the service guardian thinking there is a problem
when there isn't.
In the previous section, section 5.2: Void EntryProcessor, we looked at limiting
the memory use of an EntryProcessor that is invoked against large numbers
of cache entries. Another potential issue with such a scenario is the
time it takes for the EntryProcessor to execute, and the danger of the service
guardian reacting and terminating the thread or JVM, or raising an alert
(depending on configuration - more on that later in subsection 8.3.2: Service
Guardian Configuration). If we have the service guardian set for a thirty
second timeout, and our EntryProcessor takes 30ms to process each entry, an
invocation need only affect one thousand entries on a member before we reach
the guardian timeout. Fortunately, there is an API to allow us to reset the
timeout. First, we need to obtain an instance of GuardContext for the service
worker thread that we are executing in; we do this via the GuardSupport utility
class:
GuardContext guardContext = GuardSupport.getThreadContext();


Then, at intervals during our processing we can tell the guardian that we
are still alive, awake, and doing (hopefully) useful work:
guardContext.heartbeat();

We can modify our AbstractVoidProcessor to keep the guardian happy by
giving a heartbeat after processing each individual entry:
public Map processAll(Set set) {
    GuardContext guardContext = GuardSupport.getThreadContext();
    for (InvocableMap.Entry entry : (Set<InvocableMap.Entry>) set) {
        process(entry);
        if (guardContext != null) {
            guardContext.heartbeat();
        }
    }
    return Collections.EMPTY_MAP;
}

The null check is advisable in case we ever execute in a thread that does not
have an associated guardian, particularly in unit tests.
You might consider adding guardian heartbeats to any potentially long-running
processing elements: EntryProcessor, Aggregator, Invocable, CacheStore,
etc.

5.4 Writing a custom aggregator

Objective
To demonstrate how to write a simple and correct custom aggregator.
The principle is simple. Don't ship the data to the code; run the code against
the data in the cluster, then just ship the result.
Coherence aggregators work in two stages: a parallel stage in each storage
node, then a final aggregation of the partial results. The client only sees the
final result.
In most cases the client node is a storage-disabled application node, but it
can also be a storage node. For a TCP*Extend client the proxy node runs
the final aggregation before passing the result back to the client.
The InvocableMap interface provides two methods for calling aggregators:
Object aggregate(filter, aggregator) : executes the aggregator against
a set of entries matching a filter.



Object aggregate(keySet, aggregator) : executes the aggregator against
the entries matching a set of keys.

Rolling our own
Oracle provides AbstractAggregator as a base for custom aggregators. This
dog food is consumed by Coherence's own aggregators. You have three
abstract methods: finalizeResult(boolean fFinal), init(boolean fFinal), and
process(Object o, boolean fFinal). The process method gets called for each
entry the aggregator will consider. The fFinal parameter tells you whether
this is a partial or final aggregation stage.
I don't like AbstractAggregator much:
AbstractAggregator iterates the set of entries so we can't. For example,
we can't terminate the loop early.
The process method is void. This requires our aggregator to be stateful
by keeping intermediate and final results as member variables.
The fFinal parameter complicates things, and has subtly different semantics
in the init and process methods.
Instead, we use the com.tangosol.util.InvocableMap.ParallelAwareAggregator
interface. This has methods for our partial and final aggregation stages:
Object aggregate(Set entries) : aggregate a set of entries into a partial
result (the parallel stage).
Object aggregateResults(Collection partialResults) : aggregate a final
result from the parallel results (the final stage).
The meaning of these is clear and obvious, although there are still some
annoyances for us to tackle later.
Example
Let's invent a data model that is complicated enough to warrant a custom
aggregator. A cache contains Player objects representing guitarists. We don't
cater for drummers, however admirable they may be. Each Player contains a
list of Guitar objects that hold the makes and models each player is associated
with. For example, Jimi Hendrix is forever linked with a Fender Stratocaster
(played upside down), and Jimmy Page with his Les Paul. We'd like to count
the guitar makes used by a set of players by looking at their embedded guitar
collections, and return that as a map of make to count.


Listing 5.1: Players


@Portable
public class Player {
    public static final int POF_NAME         = 0;
    public static final int POF_COUNTRY_CODE = 1;
    public static final int POF_GUITARS      = 2;

    @PortableProperty(POF_NAME)
    private String name;
    @PortableProperty(POF_COUNTRY_CODE)
    private String countryCode;
    @PortableProperty(POF_GUITARS)
    private List<Guitar> guitars;

    public Player() {} // deserialisation constructor

    public Player(String name, String countryCode, Guitar... guitars) {
        ...
    }

    public Collection<Guitar> getGuitars() {
        return Collections.unmodifiableCollection(guitars);
    }

    public static ValueExtractor getCountryCodeExtractor() {
        return new PofExtractor(String.class, POF_COUNTRY_CODE);
    }

    @Override
    public String toString() {
        return String.format("player{name=%s, guitars=%s}", name, guitars);
    }
}

Listings 5.1 and 5.2 show the simple Player and Guitar classes, and listing 5.3 the
data.
Eric has more than one Strat, as any guitar nerd knows, though his cherry-red
335 is no longer with him. It raised about $850,000 a few years ago in an
auction for the Crossroads charity.
Testing the aggregator
We test using the Littlegrid framework so each part of the aggregation runs
in a separate node. It is vital to test a parallel aggregator on multiple nodes
as in listing 5.4.
We aggregate the popularity of guitar makes for UK players. This recipe
has some interesting seasoning already. The Player provides a static method
that returns a PofExtractor for country code. This idiom saves clients from
having to use the POF index and specify the field's class. So that we
don't need to cast Coherence's Object return value into the results map,
our GuitarUseByMakeAggregator provides a type-safe method for invoking it on
a cache. We've encapsulated that type conversion as part of the aggregator
to make the client code cleaner and less brittle.

The aggregator
We decided to use ParallelAwareAggregator in preference to AbstractAggregator,
but there are annoyances with that approach too:
We must implement getParallelAggregator - normally by simply returning this.


Listing 5.2: Guitars


@Portable
public class Guitar {
    public static final int POF_MAKE  = 0;
    public static final int POF_MODEL = 1;

    @PortableProperty(POF_MAKE)
    private String make;
    @PortableProperty(POF_MODEL)
    private String model;

    public Guitar() {} // deserialisation constructor

    public Guitar(String make, String model) { ... }

    public String getMake() {
        return make;
    }

    @Override
    public String toString() {
        return String.format("guitar{make=%s, model=%s}", make, model);
    }
}

Listing 5.3: Player and guitar data


final Guitar strat   = new Guitar("fender", "stratocaster");
final Guitar tele    = new Guitar("fender", "telecaster");
final Guitar lespaul = new Guitar("gibson", "les paul");
final Guitar es335   = new Guitar("gibson", "es 335");

final Player[] players = {
    new Player("eric clapton",  "UK", strat, strat, strat, es335),
    new Player("david gilmour", "UK", strat, tele),
    new Player("peter green",   "UK", lespaul),
    new Player("jimi hendrix",  "US", strat),
    new Player("jimmy page",    "UK", lespaul, tele),
    new Player("larry carlton", "US", es335, lespaul),
    new Player("keith richard", "UK", tele, es335),
};

Listing 5.4: Testing the aggregator


final ClusterMemberGroup memberGroup = ClusterMemberGroupUtils.newBuilder()
        .setStorageEnabledCount(2)
        .setFastStartJoinTimeoutMilliseconds(100)
        .buildAndConfigureForStorageDisabledClient();
try {
    NamedCache cache = CacheFactory.getCache("distributed-test-cache");
    for (int i = 0; i < players.length; i++) {
        cache.put(i, players[i]);
    }
    GuitarUseByMakeAggregator aggregator = new GuitarUseByMakeAggregator();
    Filter uk = new EqualsFilter(Player.getCountryCodeExtractor(), "UK");
    Map<String, Integer> ukResults = aggregator.aggregate(cache, uk);
    assertEquals(7, (int) ukResults.get("fender"));
    assertEquals(4, (int) ukResults.get("gibson"));
}
finally {
    ClusterMemberGroupUtils.shutdownCacheFactoryThenClusterMemberGroups(memberGroup);
}


The aggregate methods receive raw Set and Collection types that must
be cast, to a set of BinaryEntry and partial results respectively.
Coherence will sometimes pass us an empty set of entries. We'll also
get a collection of partial results to finalise, with some null entries.
Let's implement a base class to hide the messiness from our custom aggregators, listing 5.5.
This thin base class handles the type conversions and cleans up the collections.
It takes two type parameters <P, F> to define the types of the Partial
and Final aggregation results. These can be different. Concrete aggregator
subclasses will implement two new type-safe methods:
P aggregatePartial(Set<BinaryEntry> entries);
F aggregateFinal(Collection<P> partialResults);

We've also added a pair of aggregate methods so that clients can invoke the
aggregator, with either a Filter or a key set, and get the correct return type
<F> of the final aggregation. Finally, in listing 5.6 we implement the actual
aggregator.
Note that we don't need to store any state in the aggregator. In this case,
the return type of both partial and final results is the same, but our design
would let us have two different types. The slight complexity in the final
aggregation stage is the partialResults parameter. This is a collection, each
entry of which is the partial result from one node. In our case each partial result
is a map of make to count, so we need to accumulate those counts into the
totals map. The generic base class helps considerably here by making the
types clear. We might have gone further, and made the base class aware
of the key and entry types in the cache. We chose not to do that, since
we don't always extract and deserialise keys and values. For example, we
could write an aggregator that uses a PofExtractor instead of deserialising
the entire entry. Sometimes that is more efficient. Imagine a cache of values
with dozens of large fields. We might only care about a single number, so it
would make sense to extract only the field of interest.
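As a rough sketch of that alternative (an assumption on my part, not code from the example project), aggregatePartial could extract just the guitars field from each BinaryEntry rather than deserialising the whole Player:

// Extract only the guitars property via POF; the rest of the Player is never deserialised.
PofExtractor guitarsExtractor = new PofExtractor(null, Player.POF_GUITARS);
for (BinaryEntry be : entries) {
    List<Guitar> guitars = (List<Guitar>) be.extract(guitarsExtractor);
    // ... accumulate usage counts from guitars as before ...
}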
Our custom aggregator implements POF, and needs its own entry in the POF
configuration file, pof-config.xml. Likewise, any custom types used for partial
and intermediate results must also be portable, since these are transmitted
between nodes. In this example we used java.util.Map<String, Integer>, which
is already handled in the POF configuration in coherence.jar.


Listing 5.5: Aggregator base class


/**
 * Abstract base class for parallel aggregators to make writing custom aggregators
 * simpler. Provides type-safe methods for both stages of the aggregation (partial and final)
 * and removes null entries from collections. Subclasses can avoid type-casts for the
 * results or for cache entries.
 *
 * Provides type-safe invocation methods to return a type-safe final result and avoid user
 * code having to cast from Object:
 *
 * {@link aggregate(NamedCache, Filter)}
 * {@link aggregate(NamedCache, Set)}
 *
 * @param <P> the type of the intermediate, partial, result of the 1st level aggregation
 * @param <F> the type of the final overall result of the aggregation
 */
public abstract class AbstractParallelAggregator<P, F> implements ParallelAwareAggregator {
    @Override
    public P aggregate(Set entries) {
        return entries.size() > 0 ? aggregatePartial(entries) : null;
    }

    /**
     * Return a partial result from the first level of aggregation
     *
     * @param entries set of entries to consider in a partial aggregation stage
     * @return partial result from this stage
     */
    public abstract P aggregatePartial(Set<BinaryEntry> entries);

    @Override
    public F aggregateResults(Collection partialResults) {
        partialResults.removeAll(Collections.singleton(null));   // get rid of null entries
        return aggregateFinal(partialResults);
    }

    /**
     * Final aggregation
     *
     * @param partialResults collection of partial results from each parallel aggregation stage
     * @return overall results
     */
    public abstract F aggregateFinal(Collection<P> partialResults);

    /**
     * Run this aggregator on the cache using a filter and return typed result
     *
     * @return overall result of the aggregation
     */
    public F aggregate(NamedCache cache, Filter filter) {
        return (F) cache.aggregate(filter, this);
    }

    /**
     * Run this aggregator on the cache using a key set and return typed result
     *
     * @return overall result of the aggregation
     */
    public F aggregate(NamedCache cache, Collection<?> keys) {
        return (F) cache.aggregate(keys, this);
    }

    @Override
    public EntryAggregator getParallelAggregator() {
        return this;
    }
}


Listing 5.6: Guitar use by make


@Portable
public final class GuitarUseByMakeAggregator
        extends AbstractParallelAggregator<Map<String, Integer>, Map<String, Integer>> {
    /**
     * @return map of guitar usage by make
     */
    @Override
    public Map<String, Integer> aggregatePartial(Set<BinaryEntry> entries) {
        Map<String, Integer> usage = new HashMap<String, Integer>();
        for (BinaryEntry be : entries) {
            Player player = (Player) be.getValue();
            for (Guitar guitar : player.getGuitars()) {
                String make = guitar.getMake();
                int count = usage.containsKey(make) ? usage.get(make) : 0;
                usage.put(make, count + 1);
            }
        }
        return usage;
    }

    /**
     * @return aggregated map of guitar usage by make
     */
    @Override
    public Map<String, Integer> aggregateFinal(
            Collection<Map<String, Integer>> partialResults) {
        Map<String, Integer> totals = new HashMap<String, Integer>();
        for (Map<String, Integer> usage : partialResults) {
            for (Map.Entry<String, Integer> entry : usage.entrySet()) {
                String make = entry.getKey();
                int count = entry.getValue();
                int total = totals.containsKey(make) ? totals.get(make) : 0;
                totals.put(make, total + count);
            }
        }
        return totals;
    }
}

Our guitar aggregator doesn't have any constructor arguments that need to be sent to the storage nodes; if it did, we would define those as portable properties too. Always test the correctness of your POF implementation for components such as EntryAggregator and EntryProcessor implementations in the same way you test the domain classes that live in caches.
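For reference, the registration might look something like the sketch below; the type-id, package name, and the explicit PofAnnotationSerializer are illustrative assumptions, and your own pof-config.xml will differ.

<!-- illustrative entry only: type-id and package are assumptions -->
<user-type>
    <type-id>1050</type-id>
    <class-name>org.cohbook.gridprocessing.GuitarUseByMakeAggregator</class-name>
    <serializer>
        <class-name>com.tangosol.io.pof.PofAnnotationSerializer</class-name>
        <init-params>
            <init-param><param-type>int</param-type><param-value>{type-id}</param-value></init-param>
            <init-param><param-type>class</param-type><param-value>{class}</param-value></init-param>
        </init-params>
    </serializer>
</user-type>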

5.5 Exceptions in EntryProcessors

Objective
To provide a thorough understanding of the effects of throwing exceptions from within an EntryProcessor invocation, with examples and
discussion of strategies for dealing with them
Prerequisites
An understanding of services, threads, and partitions as described in
the introduction to this chapter, together with a basic understanding
of the functioning of an EntryProcessor
Code examples
The classes and resources in this section are in the gridprocessing
project under package org.cohbook.gridprocessing.entryprocessor.
Dependencies
As well as Oracle Coherence, the examples use JUnit and make extensive use of Littlegrid. Logging is via slf4j and logback.
What happens when an exception is thrown from the process(Entry entry) method of an EntryProcessor? Starting with the trivial case of an invocation against a single key using InvocableMap.invoke(Object key, EntryProcessor processor), any modifications to the entry are discarded and never written to the cache's backing map (this is true so long as you don't do anything dubious like getting hold of the backing map and manipulating it directly¹).
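As a minimal sketch (the cache contents and the FailingProcessor class are hypothetical stand-ins, not code from this chapter):

NamedCache cache = CacheFactory.getCache("test");
cache.put(1, "foo");
try {
    // FailingProcessor is a hypothetical stand-in that modifies the entry, then throws
    cache.invoke(1, new FailingProcessor());
} catch (RuntimeException e) {
    // the exception propagates to the caller...
}
// ...and cache.get(1) still returns "foo": the discarded change never reached the backing map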
When we invoke an EntryProcessor against many entries in a single invocation, things get a little more complicated. Rather than just describe what happens, we'll create some sample code to elucidate the behaviours discussed.
¹ Safe ways of doing this are covered later in subsection 5.7.2: Partition-Local Atomic Operations.


Listing 5.7: A simple key partitioning strategy


public class IntKeyPartitioningStrategy implements KeyPartitioningStrategy {
    private int npartitions;
    @Override
    public void init(PartitionedService partitionedservice) {
        npartitions = partitionedservice.getPartitionCount();
    }
    @Override
    public int getKeyPartition(Object obj) {
        return ((int) obj) % npartitions;
    }
    @Override
    public PartitionSet getAssociatedPartitions(Object obj) {
        PartitionSet result = new PartitionSet(npartitions);
        result.add(getKeyPartition(obj));
        return result;
    }
}

5.5.1 Setting up the examples

The behaviour of an EntryProcessor is closely coupled with the way a cache is split into partitions, which are distributed among members of the cluster, and with the service's worker threads that are used to execute the EntryProcessor. We will create a fairly simple, but somewhat artificial, cache structure in order to illustrate the behaviours that we are interested in. This consists of:

A service with thirteen (a prime number) partitions.

Two members - Coherence will spread the partitions as evenly as possible, six on one member, seven on the other.

A cache with thirty-nine entries, constructed so that there are precisely three in each partition.

There are several mechanisms for controlling the allocation of entries to partitions. The one that suits our purpose here is to create an implementation of KeyPartitioningStrategy that generates the partition id for an integer key as simply the modulo of the key by the number of partitions, listing 5.7. Then in listing 5.8 we define the service that uses this IntKeyPartitioningStrategy and three worker threads.
We'll create a unit test to exercise some behaviours. The first step, in listing 5.9, is to initialise the cache with our thirty-nine entries. Now, time to test some scenarios.


Listing 5.8: Configure the key partitioning strategy


<?xml version="1.0"?>
<cache-config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://xmlns.oracle.com/coherence/coherence-cache-config"
    xsi:schemaLocation=
        "http://xmlns.oracle.com/coherence/coherence-cache-config
         coherence-cache-config.xsd">
  <caching-scheme-mapping>
    <cache-mapping>
      <cache-name>test</cache-name>
      <scheme-name>distributedService</scheme-name>
    </cache-mapping>
  </caching-scheme-mapping>
  <caching-schemes>
    <distributed-scheme>
      <scheme-name>distributedService</scheme-name>
      <thread-count>3</thread-count>
      <partition-count>13</partition-count>
      <key-partitioning>
        <class-name>
          org.cohbook.gridprocessing.entryprocessor.IntKeyPartitioningStrategy
        </class-name>
      </key-partitioning>
      <backing-map-scheme>
        <local-scheme/>
      </backing-map-scheme>
      <autostart>true</autostart>
    </distributed-scheme>
  </caching-schemes>
</cache-config>

Listing 5.9: Setting up the unit test


public class TestEntryProcessorException {
    private ClusterMemberGroup memberGroup;
    private NamedCache cache;
    private static final Logger LOG =
            LoggerFactory.getLogger(TestEntryProcessorException.class);
    @Before
    public void setup() {
        memberGroup = ClusterMemberGroupUtils.newBuilder()
                .setStorageEnabledCount(2)
                .setCacheConfiguration(
                        "org/cohbook/gridprocessing/"
                        + "entryprocessor/cache-config.xml")
                .buildAndConfigureForStorageDisabledClient();
        cache = CacheFactory.getCache("test");
        PartitionedService service = (PartitionedService) cache.getCacheService();
        int partitionCount = service.getPartitionCount();
        for (int i = 0; i < partitionCount; i++) {
            cache.put(i, "foo");
            cache.put(i + partitionCount, "foo");
            cache.put(i + partitionCount * 2, "foo");
        }
    }
    .
    .
    .
}


Listing 5.10: EntryProcessor with a subtle bug


public class ArbitrarilyFailingEntryProcessor extends AbstractProcessor {
    private static final Logger LOG = LoggerFactory.getLogger(
            ArbitrarilyFailingEntryProcessor.class);
    public static final ArbitrarilyFailingEntryProcessor INSTANCE =
            new ArbitrarilyFailingEntryProcessor();
    private int invocationCount;
    public ArbitrarilyFailingEntryProcessor() {
    }
    @Override
    public Object process(Entry entry) {
        BinaryEntry bentry = (BinaryEntry) entry;
        int key = (int) entry.getKey();
        PartitionedService ps = (PartitionedService) bentry.getContext().getCacheService();
        int partition = ps.getKeyPartitioningStrategy().getKeyPartition(key);
        LOG.info("invoked with key=" + key + ", partition=" + partition);
        if (partition == 7 && invocationCount++ > 0) {
            throw new RuntimeException("Failed on key=" + key);
        }
        return entry.setValue(entry.getValue() + "bar");
    }
    @Override
    public Map processAll(Set set) {
        invocationCount = 0;
        LOG.info("invoked with set size=" + set.size());
        return super.processAll(set);
    }
}

5.5.2 Exception When Invoking with a Filter

We have in listing 5.10 an EntryProcessor that appends "bar" to the value of any entry it is invoked against. Unfortunately, due to a subtle, hard-to-detect programming error, it throws a RuntimeException the second time it is asked to update an entry in partition number seven.
You might be particularly interested in the way that it so cunningly accidentally obtains a reference to the PartitionedService that owns this cache from the BinaryEntry and then carelessly uses that service's KeyPartitioningStrategy (which is, of course, the IntKeyPartitioningStrategy we defined earlier) to find the partition that this key belongs to, before finally, by terrible mischance, throwing the exception when that partition id happens to be seven (if we've already processed one entry).
We will first, in listing 5.11, invoke this ArbitrarilyFailingEntryProcessor against our entire cache using the AlwaysFilter.


Listing 5.11: Arbitrary failure invoked against the entire cache

@Test
public void testInvokeFilter() {
    try {
        cache.invokeAll(
                AlwaysFilter.INSTANCE, ArbitrarilyFailingEntryProcessor.INSTANCE);
    } catch (RuntimeException e) {
        LOG.info("caught", e);
    }
    int setSize = cache.entrySet(new EqualsFilter(
            IdentityExtractor.INSTANCE, "foobar")).size();
    LOG.info(setSize + " entries updated");
}

When we execute this test, we find that the processAll(Set set) method of ArbitrarilyFailingEntryProcessor is called twice, once with eighteen entries (six partitions) and once with twenty-one entries (seven partitions), i.e. the EntryProcessor is invoked once per member in the cluster.
So, how many entries are actually updated?


22:48:52.891 [main] INFO o.c.g.e.TestEntryProcessorException - 18 entries updated

It turns out that none of the changes applied on the member that contains
partition seven are successful, even though (as you can see from the log)
invocations of the process(Entry entry) method for many of those entries
have succeeded.

5.5.3 Exception When Invoking with a Set of Keys

We can invoke against the same thirty-nine entries in the cache by explicitly
providing the set of keys:
@Test
public void testInvokeKeys() {
    try {
        cache.invokeAll(
                cache.keySet(), ArbitrarilyFailingEntryProcessor.INSTANCE);
    } catch (RuntimeException e) {
        LOG.info("caught", e);
    }
    int setSize = cache.entrySet(new EqualsFilter(
            IdentityExtractor.INSTANCE, "foobar")).size();
    LOG.info(setSize + " entries updated");
}


This time, we see that the processAll(Set set) method is invoked thirteen times, with three entries each time:
23:09:26.107 [DistributedCacheWorker:0] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with set size=3

If you study the log output closely, you will see that each invocation covers the three entries in a single partition. Also, these invocations are run
in parallel across the six worker threads (a pool of three threads in each
member).
And how many entries are successfully updated?
23:09:26.171 [main] INFO o.c.g.e.TestEntryProcessorException - 36 entries updated

i.e. all but three entries. As we know the values before and after the update, we can query to find which particular entries weren't updated:
LOG.info("unchanged keys: "
        + cache.keySet(new EqualsFilter(
                IdentityExtractor.INSTANCE, "foo")));

We find that keys 7, 20, and 33 were not updated. These, of course, are the keys belonging to partition seven (each of 7, 20, and 33 is congruent to 7 modulo 13).

5.5.4 When Many Exceptions Are Thrown

In each of these two tests, we have thrown an exception that tells us which key was affected; it wouldn't be hard to extend this to identify the partition or member:
Caused by: java.lang.RuntimeException: Failed on key=20

But what if we have exceptions thrown in the processing of more than one entry? These are being processed in parallel in many threads on many cluster members. We've replaced our unfortunate developer, the one who let this last subtle bug slip in, with someone who is, frankly, incompetent. Every entry processed results in an exception:


public class AlwaysFailingEntryProcessor extends AbstractProcessor {
    private static final Logger LOG = LoggerFactory.getLogger(
            AlwaysFailingEntryProcessor.class);
    public static final AlwaysFailingEntryProcessor INSTANCE =
            new AlwaysFailingEntryProcessor();
    public AlwaysFailingEntryProcessor() {
    }
    @Override
    public Object process(Entry entry) {
        BinaryEntry bentry = (BinaryEntry) entry;
        int key = (int) entry.getKey();
        PartitionedService ps = (PartitionedService) bentry.getContext().getCacheService();
        int partition = ps.getKeyPartitioningStrategy().getKeyPartition(key);
        LOG.info("invoked with key=" + key + ", partition=" + partition);
        throw new RuntimeException("Failed on key=" + key);
    }
    @Override
    public Map processAll(Set set) {
        MDC.put("member", Integer.valueOf(CacheFactory.getCluster()
                .getLocalMember().getId()).toString());
        LOG.info("invoked with set size=" + set.size());
        return super.processAll(set);
    }
}

So, let us test this on the same thirty-nine entries:


@Test
public void testAlwaysFailKeys() {
    try {
        cache.invokeAll(cache.keySet(), AlwaysFailingEntryProcessor.INSTANCE);
    } catch (RuntimeException e) {
        LOG.info("caught", e);
    }
    int setSize = cache.entrySet(new EqualsFilter(
            IdentityExtractor.INSTANCE, "foobar")).size();
    LOG.info(setSize + " entries updated");
}

Examining the log output will show that, as before when invoking over a key
set, the EntryProcessor is invoked thirteen times over three keys each time.
Unsurprisingly, we find that no entries were updated.
.
.
Caused by: java.lang.RuntimeException: Failed on key=5
.
.
07:04:28.742 [main] [member=] INFO o.c.g.e.TestEntryProcessorException - 0 entries updated


Although many exceptions were thrown, we can, of course, catch only one of them. Now let's summarise the result of these experiments. If an exception is thrown:

We know that none of the updates in the same partition (for invocations by key set) or member (for invocations by filter) will have been processed.

When we catch an exception, we have no way of knowing whether other partitions or members have also failed.

Any return values from the invocations that did succeed are lost.

There is one further variation we haven't tested - the above applies if we define a thread pool for the service, i.e. include the <thread-count> element in the cache scheme definition. If we don't, then we find that invoking by key set produces one execution per member rather than per partition - the behaviour is the same as for invocation by filter.
Before you collapse in paroxysms of shock and outrage at such careless disregard for your processing results in such a mature, well-regarded product, I should point out that this behaviour is, to coin a phrase, not a bug - it's a feature. Throwing an exception from an EntryProcessor is, by design, the means by which we may perform a rollback of changes made, but not yet committed - at least within the scope of the execution of an instance on a single partition or member.

5.5.5 How To Manage Exceptions in an EntryProcessor

The fundamental problem here is that, when we catch an exception, we can have no knowledge of which entries have been processed, nor what the results of that processing were. There are four ways of solving this problem:
1. Never invoke an EntryProcessor on filters or sets of keys; operate on a single key at a time.
2. Never throw an exception from an EntryProcessor
(a) ...by ensuring exceptions are never thrown in the first place
(b) ...by catching the exception and placing it in the return value within the EntryProcessor



Listing 5.12: Result class to encapsulate caught exceptions

public class Result {
    private Object returnvalue;
    private Exception exception;
    public Result(Object returnvalue, Exception exception) {
        this.returnvalue = returnvalue;
        this.exception = exception;
    }
    public Object getReturnvalue() {
        return returnvalue;
    }
    public Exception getException() {
        return exception;
    }
}

3. Explicitly invoke on a single partition at a time, so that you know either all entries succeeded, or none did.
4. If you are using Coherence 12.1.3 or later, you can use a custom AsynchronousProcessor to return all values and exceptions.²
The first of these options is simple and straightforward, but significantly increases the cost of performing operations against a set of entries. The second, we'll cover now.
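A minimal sketch of the first option, assuming the test cache and processor from the examples above; each failure is now confined to a single entry, at the price of one network round trip per key:

Map<Object, Object> results = new HashMap<>();
Map<Object, Exception> failures = new HashMap<>();
for (Object key : cache.keySet()) {
    try {
        results.put(key, cache.invoke(key, ArbitrarilyFailingEntryProcessor.INSTANCE));
    } catch (RuntimeException e) {
        failures.put(key, e);   // only this one entry is affected by the rollback
    }
}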

5.5.6 Return Exceptions

A simple way to retrofit exception-catching behaviour to an existing code base is to use the decorator pattern: create a new EntryProcessor class that deals with exception handling and delegates to another instance that does the work. To give us a clean interface, we'll define a Result class in listing 5.12 to separate the intended return value from any exception, and then our simple decorator class in listing 5.13.

² See subsection 5.5.8: Use an AsynchronousProcessor


Listing 5.13: Exception catching EntryProcessor decorator


public class ExceptionCatchingEntryProcessor extends AbstractProcessor {
    private EntryProcessor delegate;
    public ExceptionCatchingEntryProcessor() {
    }
    public ExceptionCatchingEntryProcessor(EntryProcessor delegate) {
        this.delegate = delegate;
    }
    @Override
    public Object process(Entry entry) {
        try {
            Object resultObject = delegate.process(entry);
            return new Result(resultObject, null);
        } catch (Exception e) {
            return new Result(null, e);
        }
    }
}

We can test the ArbitrarilyFailingEntryProcessor again, but now with the decorator:
@Test
public void testCatchExceptions() {
    Map<Integer, Result> results = cache.invokeAll(cache.keySet(),
            new ExceptionCatchingEntryProcessor(
                    ArbitrarilyFailingEntryProcessor.INSTANCE));
    int setSize = cache.entrySet(
            new EqualsFilter(IdentityExtractor.INSTANCE, "foobar")).size();
    LOG.info(setSize + " entries updated");
    LOG.info("unchanged keys: " + cache.keySet(
            new EqualsFilter(IdentityExtractor.INSTANCE, "foo")));
    for (Map.Entry<Integer, Result> entry : results.entrySet()) {
        if (entry.getValue().getReturnvalue() == null) {
            LOG.info("exception for key " + entry.getKey() +
                    ": " + entry.getValue().getException().getMessage());
        }
    }
}

We find that two keys, 7 and 20, are unchanged; 33 has been processed correctly even though it belongs to partition seven. Our log messages confirm that we have captured both exceptions.
08:05:24.770 [main] [member=] INFO o.c.g.e.TestEntryProcessorException - 37 entries updated
08:05:24.776 [main] [member=] INFO o.c.g.e.TestEntryProcessorException - unchanged keys : ConverterCollection{7, 20}
08:05:24.780 [main] [member=] INFO o.c.g.e.TestEntryProcessorException - exception for key 20: Failed on key=20
08:05:24.783 [main] [member=] INFO o.c.g.e.TestEntryProcessorException - exception for key 7: Failed on key=7

We must consider carefully the implications of catching and returning the exception. The fact that the update to key 33 has been applied tells us that


we have subverted Coherence's partition rollback logic. Now look at what happens if we make a small change to the ArbitrarilyFailingEntryProcessor so that we update the cache entry before throwing the exception:
@Override
public Object process(Entry entry) {
    BinaryEntry bentry = (BinaryEntry) entry;
    int key = (int) entry.getKey();
    PartitionedService ps = (PartitionedService) bentry.getContext().getCacheService();
    int partition = ps.getKeyPartitioningStrategy().getKeyPartition(key);
    LOG.info("invoked with key=" + key + ", partition=" + partition);
    Object result = entry.setValue(entry.getValue() + "bar");
    if (partition == 7 && invocationCount++ > 0) {
        throw new RuntimeException("Failed on key=" + key);
    }
    return result;
}

The test results now:
08:11:25.555 [main] [member=] INFO o.c.g.e.TestEntryProcessorException - 39 entries updated
08:11:25.562 [main] [member=] INFO o.c.g.e.TestEntryProcessorException - unchanged keys : ConverterCollection{}
08:11:25.567 [main] [member=] INFO o.c.g.e.TestEntryProcessorException - exception for key 20: Failed on key=20
08:11:25.569 [main] [member=] INFO o.c.g.e.TestEntryProcessorException - exception for key 7: Failed on key=7

We still have the two exceptions, but all the entries have been updated. The conclusion: we can capture the exceptions, but the cost is the loss of rollback functionality. Any change made before the exception is thrown is retained. If we carefully structure the EntryProcessor.process method such that any exceptions are thrown before changes are made, then we can still ensure that changes are correctly made per entry, rather than with the default per-partition or per-member behaviour - which is fine if it suits our use case.
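A minimal sketch of that ordering (computeNewValue and validate are hypothetical helpers, not methods from the examples above):

@Override
public Object process(Entry entry) {
    Object newValue = computeNewValue(entry.getValue());  // hypothetical: compute the change first
    validate(newValue);                                   // hypothetical: anything that can throw happens here
    // by the time we mutate the entry, nothing further can fail
    return entry.setValue(newValue);
}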

5.5.7 Invoking Per Partition

If we wish to preserve the atomic update per partition, we can iterate over
the partitions and apply the EntryProcessor once per partition. We use the
slightly odd PartitionSet class to do this. This is not, as its name might
imply, a Set of integer partition ids, but a class in its own right. We must
first construct an instance giving the total number of partitions configured
in the service we wish to interrogate:
PartitionedService service = (PartitionedService) cache.getCacheService();
int partitionCount = service.getPartitionCount();
PartitionSet partitionSet = new PartitionSet(partitionCount);


Listing 5.14: Invoke against a PartitionedFilter


public void testPartitionFilter() {
    PartitionedService service = (PartitionedService) cache.getCacheService();
    int partitionCount = service.getPartitionCount();
    PartitionSet partitionSet = new PartitionSet(partitionCount);
    for (int i = 0; i < partitionCount; i++) {
        partitionSet.clear();
        partitionSet.add(i);
        Filter filter = new PartitionedFilter(AlwaysFilter.INSTANCE, partitionSet);
        try {
            cache.invokeAll(filter, ArbitrarilyFailingEntryProcessor.INSTANCE);
        } catch (Exception e) {
            LOG.info("caught while processing partition " + i, e);
        }
    }
    int setSize = cache.entrySet(new EqualsFilter(
            IdentityExtractor.INSTANCE, "foobar")).size();
    LOG.info(setSize + " entries updated");
    LOG.info("unchanged keys: " + cache.keySet(new EqualsFilter(
            IdentityExtractor.INSTANCE, "foo")));
}

Next we add the specific ids of the partitions we wish to operate on:
partitionSet.add(i);

We may add several ids, and we may clear the set. Finally, we can use the
PartitionSet in a PartitionedFilter to restrict the operation of another filter
to only the entries in that set of partitions. So we can iterate over all the
partitions one at a time as in listing 5.14.
Functionally, this gives us what we need:
We catch all the exceptions (there can only be one per partition).
We know which entries have not been modified (those belonging to the
failed partitions).
The output of this test shows us that an exception is thrown for partition 7,
and that entries 7, 20, and 33 are not updated.
18:14:48.007 [main] [member=] INFO o.c.g.e.TestEntryProcessorException - caught while processing partition 7
.
.
.
18:14:48.050 [main] [member=] INFO o.c.g.e.TestEntryProcessorException - 36 entries updated
18:14:48.055 [main] [member=] INFO o.c.g.e.TestEntryProcessorException - unchanged keys : ConverterCollection{33, 7, 20}
.
.
.


Functionally it gives us what we need, but it isn't very efficient, especially as there may be thousands of partitions. We single-thread the entire operation and cause a network dialogue for each partition rather than broadcasting the request to all members simultaneously. We could try to improve matters by discovering which members own each partition and invoking the EntryProcessor once per member. There is an API available for that discovery; we could write:
PartitionedService service = (PartitionedService) cache.getCacheService();
Set members = service.getOwnershipEnabledMembers();
for (Member member : (Set<Member>) members) {
    PartitionSet partitionSet = service.getOwnedPartitions(member);
    Filter filter = new PartitionedFilter(AlwaysFilter.INSTANCE, partitionSet);
    try {
        cache.invokeAll(filter, ArbitrarilyFailingEntryProcessor.INSTANCE);
    } catch (Exception e) {
        LOG.info("caught while processing member " + member.getId(), e);
    }
}

We've already established that when invoking by filter, the EntryProcessor is invoked atomically over the entire member rather than just the partition. Unfortunately, this approach suffers from race conditions. If a partition moves in between service.getOwnedPartitions(member) and cache.invokeAll, then the invocation will go to more than one member and we may again lose exceptions. There is also the problem of ensuring that we have executed once and once only for every partition. The next section, section 5.6: Using Invocation Service on All Partitions, explores this problem in more detail.
If we know the subset of keys we wish to operate on, we can improve things somewhat, depending on the size of that set. We must turn the set of keys into a set of partition ids (not to be confused with a PartitionSet - see above):
Set<Integer> keys = new HashSet<>();
Collections.addAll(keys, 10, 11, 13, 14, 15, 20, 21, 22, 23, 24, 25, 26);
PartitionedService service = (PartitionedService) cache.getCacheService();
KeyPartitioningStrategy kps = service.getKeyPartitioningStrategy();
Set<Integer> partitions = new HashSet<>();
for (int key : keys) {
    partitions.add(kps.getKeyPartition(key));
}
int partitionCount = service.getPartitionCount();
PartitionSet partitionSet = new PartitionSet(partitionCount);
for (int i : partitions) {
    partitionSet.clear();
    partitionSet.add(i);
    .
    .
    .


To give us both correct functionality and performance, we need a different approach - an approach that has wider applicability than correctly handling exceptions, so merits a section of its own. Read on...

5.5.8 Use an AsynchronousProcessor

Coherence 12.1.3 introduced the AsynchronousProcessor with the intent of providing a mechanism for clients to invoke several operations concurrently, then wait for the results. The basic approach is well covered by the Oracle documentation. Not mentioned is that the new API gives us a much cleaner way of returning full information about successful and failed invocations to the caller. To do so, we must provide our own subclass. The provided AsynchronousProcessor has a method void onException(Throwable eReason) that is called when an exception is returned from the invocation on a member. The default implementation will mark the processing as complete when the first exception occurs, which still leaves us with the problem of not knowing what else has failed. We can override the method with one that simply captures and stores any exceptions, and then continues to wait for all other members:
public class ExceptionCatchingAsyncProcessor extends AsynchronousProcessor {
    public ExceptionCatchingAsyncProcessor(EntryProcessor processor,
            boolean fFlowControl, int iUnitOrderId) {
        super(processor, fFlowControl, iUnitOrderId);
    }
    public ExceptionCatchingAsyncProcessor(EntryProcessor processor,
            boolean fFlowControl) {
        super(processor, fFlowControl);
    }
    public ExceptionCatchingAsyncProcessor(EntryProcessor processor) {
        super(processor);
    }
    private List<Throwable> exceptions = new ArrayList<>();
    @Override
    public void onException(Throwable eReason) {
        exceptions.add(eReason);
    }
    public Collection<Throwable> getExceptions()
            throws InterruptedException, ExecutionException {
        // get() waits for completion
        get();
        return exceptions;
    }
}

Given an EntryProcessor ep that we wish to execute against a Filter filter, we write:



ExceptionCatchingAsyncProcessor asyncProcessor =
        new ExceptionCatchingAsyncProcessor(ep);
cache.invokeAll(filter, asyncProcessor);
Map result = (Map) asyncProcessor.get();
Collection<Throwable> exceptions = asyncProcessor.getExceptions();

Neither get() nor getExceptions() will return until all results and exceptions have been collected from all members, so we have reliable results on which entries have been updated and what exceptions have been thrown.
The use of service worker threads changes with an AsynchronousProcessor. The usual behaviour is that filter invocations are executed once per member and key-set invocations once per partition; with AsynchronousProcessor, it appears to always invoke once per member.

5.6 Using Invocation Service on All Partitions

Objective
To demonstrate how to correctly and efficiently ensure that some processing takes place once against every partition in a partitioned service,
using as an example the invocation of an EntryProcessor against each
partition without losing any information about exceptions that have
occurred³.
Prerequisites
A basic understanding of the concept of the Coherence invocation service is useful. To understand the use-case for the specific example
covered in this section, please read section 5.5: Exceptions in EntryProcessors
Code examples
The classes and resources in this section are in the gridprocessing
project under package org.cohbook.gridprocessing.invocation.
Dependencies
As well as Oracle Coherence, the examples use JUnit and Littlegrid
³ In practice, this use case is better covered by AsynchronousProcessor as discussed in the previous section; however, the example still serves to illustrate the use of the InvocationService per partition.


In simple cases the NamedCache provides efficient ways of distributing execution of a task through the cluster, making maximum use of the parallelism available and minimising network latency by using few network interactions between members. The invocation service allows us to perform more complex operations in the cluster members where the data are held, perhaps where we need to access several entries in separate caches, or where we wish to perform operations on the result or parameter sets locally in the cluster. In this example we invoke an EntryProcessor separately on each partition in order to capture all exceptions thrown, but we perform all the invocations for a single member on that member. This requires a single network request and a single response per member to the originator of the request, rather than one per partition as with the approach described in subsection 5.5.7: Invoking Per Partition.

5.6.1 Create the Invocable

We will start by defining an invocation service in the cache configuration. Extending the configuration we used in the previous section, we have:
<?xml version="1.0"?>
<cache-config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://xmlns.oracle.com/coherence/coherence-cache-config"
    xsi:schemaLocation=
        "http://xmlns.oracle.com/coherence/coherence-cache-config
         coherence-cache-config.xsd">
  <caching-scheme-mapping>
    <cache-mapping>
      <cache-name>test</cache-name>
      <scheme-name>distributedService</scheme-name>
    </cache-mapping>
  </caching-scheme-mapping>
  <caching-schemes>
    <distributed-scheme>
      <scheme-name>distributedService</scheme-name>
      <thread-count>3</thread-count>
      <partition-count>13</partition-count>
      <key-partitioning>
        <class-name>
          org.cohbook.gridprocessing.entryprocessor.IntKeyPartitioningStrategy
        </class-name>
      </key-partitioning>
      <backing-map-scheme>
        <local-scheme/>
      </backing-map-scheme>
      <autostart>true</autostart>
    </distributed-scheme>
    <invocation-scheme>
      <scheme-name>invocationScheme</scheme-name>
      <service-name>invocationService</service-name>
      <autostart>true</autostart>
    </invocation-scheme>
  </caching-schemes>
</cache-config>


We can use the invocation service to execute code on selected members in the cluster. The object we send to the service implements the Invocable interface, which itself extends Runnable, with the addition of a getResult() method to obtain a return value⁴.
public interface Invocable
        extends Runnable, Serializable
{
    void init(InvocationService invocationservice);
    void run();
    Object getResult();
}

As well as implementing this interface, we must define a return type that will send back to the invoking client the results of the invocation. This will contain a map of EntryProcessor results by cache key, a map of exceptions thrown by partition number, and the set of partitions that were processed on a member.
public class InvocationResult implements Serializable {
    private Map results;
    private Map<Integer, Exception> partitionExceptions;
    private PartitionSet partitionsProcessed;
    public InvocationResult(Map results, Map<Integer, Exception> partitionExceptions,
            PartitionSet partitionsProcessed) {
        this.results = results;
        this.partitionExceptions = partitionExceptions;
        this.partitionsProcessed = partitionsProcessed;
    }
    public Map getResults() {
        return results;
    }
    public Map<Integer, Exception> getPartitionExceptions() {
        return partitionExceptions;
    }
    public PartitionSet getPartitionsProcessed() {
        return partitionsProcessed;
    }
}

Our Invocable must first determine the set of partitions present on the member on which it is executing. To do this, we obtain the PartitionedService instance containing the cache and the Member object for the local member:
⁴ The Invocable interface pre-dates Java 1.5, when Callable was introduced, which would have been a more logical choice.


NamedCache cache = CacheFactory.getCache(cacheName);
PartitionedService service = (PartitionedService) cache.getCacheService();
Member member = CacheFactory.getCluster().getLocalMember();
PartitionSet partitionSet = service.getOwnedPartitions(member);

For each partition in that set, we must construct a PartitionedFilter for that single partition encapsulating the filter, here called queryFilter, for our query:
PartitionSet filterPartSet = new PartitionSet(partitionCount);
filterPartSet.add(partitionId);
Filter filter = new PartitionedFilter(queryFilter, filterPartSet);

Putting it all together, we construct in listing 5.15 an Invocable that, given a cache name, a Filter, and an EntryProcessor, will execute the EntryProcessor against all partitions on the local member, one partition at a time, on entries matching the Filter.
To execute this against the entire cache, we must invoke it against all storage
enabled members for that cache. Given a fully populated PartitionSet for
cacheService, we obtain the set of Member objects and invoke our Invocable for
all those members:
Set<Member> memberSet = new HashSet<>();
for (int partition : partitionSet.toArray()) {
    memberSet.add(
            cacheService.getPartitionOwner(partition));
}
invocationService.execute(invocable, memberSet, observer);

The observer is an instance of an implementation of InvocationObserver, which collects all the result objects from the invocations on each member. We'll implement this in listing 5.16 so that it aggregates the contents of each InvocationResult into a single result object. Also, we'll pass it a Semaphore to release once all the invocations have completed.
The memberCompleted method is invoked once for every member on which our Invocable successfully completed processing, so we use this method to update our aggregatedResult with the results and exceptions, and the set of partitions that have been processed. invocationCompleted will be called when all invocations have completed or failed. We'll use this to release a Semaphore that the test method thread can wait on.


Listing 5.15: Invoke an EntryProcessor against local partitions


public class PartitionEntryProcessorInvoker implements Invocable {
    private EntryProcessor entryProcessor;
    private Filter queryFilter;
    private String cacheName;
    private transient InvocationResult result = null;
    public PartitionEntryProcessorInvoker(EntryProcessor entryProcessor,
            Filter queryFilter, String cacheName) {
        this.entryProcessor = entryProcessor;
        this.queryFilter = queryFilter;
        this.cacheName = cacheName;
    }
    @Override
    public void init(InvocationService invocationservice) {
    }
    protected PartitionSet getPartitionSetToProcess(NamedCache cache) {
        PartitionedService service = (PartitionedService) cache.getCacheService();
        Member member = CacheFactory.getCluster().getLocalMember();
        PartitionSet partitionSet = service.getOwnedPartitions(member);
        return partitionSet;
    }
    @Override
    public void run() {
        NamedCache cache = CacheFactory.getCache(cacheName);
        PartitionSet partitionSet = getPartitionSetToProcess(cache);
        Map resultMap = new HashMap<>();
        Map<Integer, Exception> exceptionMap = new HashMap<>();
        for (int partitionId : partitionSet.toArray()) {
            PartitionSet filterPartSet = new PartitionSet(partitionSet.getPartitionCount());
            filterPartSet.add(partitionId);
            Filter filter = new PartitionedFilter(queryFilter, filterPartSet);
            try {
                resultMap.putAll(
                        cache.invokeAll(filter, entryProcessor));
            } catch (Exception e) {
                exceptionMap.put(partitionId, e);
            }
        }
        result = new InvocationResult(resultMap, exceptionMap, partitionSet);
    }
    @Override
    public Object getResult() {
        return result;
    }
}


Listing 5.16: An invocation observer


class InvocationResultObserver implements InvocationObserver {
    private static final Logger LOG = LoggerFactory.getLogger(
            InvocationResultObserver.class);
    private final InvocationResult aggregatedResult;
    private final Semaphore invocationComplete;
    public InvocationResultObserver(
            InvocationResult aggregatedResult, Semaphore invocationComplete) {
        this.aggregatedResult = aggregatedResult;
        this.invocationComplete = invocationComplete;
    }
    @Override
    public void memberLeft(Member member) {
        LOG.info("member " + member.getId() + " has left");
    }
    @Override
    public void memberFailed(Member member, Throwable throwable) {
    }
    @Override
    public void memberCompleted(Member member, Object obj) {
        LOG.info("member " + member.getId() + " completed");
        InvocationResult invocationResult = (InvocationResult) obj;
        aggregatedResult.getResults().putAll(
                invocationResult.getResults());
        aggregatedResult.getPartitionExceptions().putAll(
                invocationResult.getPartitionExceptions());
        aggregatedResult.getPartitionsProcessed().add(
                invocationResult.getPartitionsProcessed());
    }
    @Override
    public void invocationCompleted() {
        LOG.info("invocation completed");
        invocationComplete.release();
    }
}


So we will have a client member that constructs an Invocable and submits it for execution against each storage-enabled member of the cluster. Each of these determines which partitions it owns, then executes and returns the results (including exceptions) to the originating member. There's a race condition here: if a member joins or leaves during the interval between the Invocable being submitted and executed, we may find that some partitions have been missed. But as each execution of our Invocable returns the set of partitions that it processed, we can track which partitions have been processed and loop over those that remain. To do this we must avoid reprocessing those partitions that have not moved, so we will modify the PartitionEntryProcessorInvoker to give it the set of partitions that need to be processed; during execution on each member, we'll determine the intersection of the set of partitions we require to be processed and the set local to that member. First we add the set of required partitions to the constructor:
@PortableProperty(0)
private EntryProcessor entryProcessor;
@PortableProperty(1)
private Filter queryFilter;
@PortableProperty(2)
private String cacheName;
@PortableProperty(3)
private PartitionSet requiredPartitions;

public PartitionEntryProcessorInvoker(EntryProcessor entryProcessor,
        Filter queryFilter, String cacheName, PartitionSet requiredPartitions) {
    this.entryProcessor = entryProcessor;
    this.queryFilter = queryFilter;
    this.cacheName = cacheName;
    this.requiredPartitions = requiredPartitions;
}

Then, when we obtain the set of local partitions, we retain only those in the
required set:
PartitionSet partitionSet = service.getOwnedPartitions(member);
partitionSet.retain(requiredPartitions);

5.6.2 Test setup

We'll construct a test class in listing 5.17, initialising a cache in exactly the same way as in section 5.5: Exceptions in EntryProcessors, except that we use our new cache configuration that includes the invocation service definition.


Listing 5.17: Setting up a test for the Invocable


public class TestPartitionEntryProcessorInvoker {
    private ClusterMemberGroup memberGroup;
    private NamedCache cache;
    private static final Logger LOG = LoggerFactory.getLogger(
            TestPartitionEntryProcessorInvoker.class);
    @Before
    public void setup() {
        memberGroup = ClusterMemberGroupUtils.newBuilder()
                .setStorageEnabledCount(2)
                .setCacheConfiguration(
                        "org/cohbook/gridprocessing/invocation/cache-config.xml")
                .buildAndConfigureForStorageDisabledClient();
        cache = CacheFactory.getCache("test");
        PartitionedService service = (PartitionedService) cache.getCacheService();
        int partitionCount = service.getPartitionCount();
        for (int i = 0; i < partitionCount; i++) {
            cache.put(i, "foo");
            cache.put(i + partitionCount, "foo");
            cache.put(i + partitionCount * 2, "foo");
        }
    }
    .
    .
    .
}

5.6.3 A Sunny Day Test

In the test method itself, we first need to define variables to hold the accumulated set of results and exceptions. We must also initialise the set of
required partitions to the full set for the cache service.
@Test
public void testInvocation() throws InterruptedException {
    PartitionedService cacheService =
            (PartitionedService) cache.getCacheService();
    final InvocationResult aggregatedResult = new InvocationResult(
            new HashMap<>(),
            new HashMap<Integer, Exception>(),
            new PartitionSet(cacheService.getPartitionCount()));
    final PartitionSet requiredPartitions = new PartitionSet(
            cacheService.getPartitionCount());
    requiredPartitions.fill();
    final Semaphore invocationComplete = new Semaphore(0);
    InvocationObserver observer = new InvocationResultObserver(
            aggregatedResult, invocationComplete);
    .
    .
    .

Now we can write the loop that invokes the Invocable until all partitions have
been processed:



In voc ati on Ser vic e inv oca tio nSe rvi ce =
( I nvo cat ion Se rvi ce ) CacheFactory . getService ( " i nvo cat ion Ser vi ce " );
while (! r eq u ir e dP a rt i ti o ns . isEmpty ()) {
Set < Member > memberSet = new HashSet < >();
for ( int partition : re q ui r ed Pa r ti t io n s . toArray ()) {
memberSet . add (
cacheService . get Par tit ion Own er ( partition ));
}
Invocable invocable = new P a r t i t i o n E n t r y P r o c e s s o r I n v o k e r (
A r b i t r a r i l y F a i l i n g E n t r y P r o c e s s o r . INSTANCE , AlwaysFilter . INSTANCE ,
" test " , re q ui r ed Pa r ti t io n s );
in voc ati onS erv ice . execute ( invocable , memberSet , observer );
i nv o ca t io n Co m pl e te . acquire ();
r eq u ir e dP a rt i ti o ns . remove ( aggregatedResult . g e t P a r t i t i o n s P r o c e s s e d ());
}

Under normal circumstances, this loop will be executed only once; only if partitions move during execution would we find that requiredPartitions is not empty after the first iteration.
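Once the loop exits we can inspect the aggregated result; a minimal sketch using the variables defined above (the assertion and log statements are illustrative, not part of the original test):

// every partition has now been processed by some member
assertTrue(requiredPartitions.isEmpty());
// partitions on which the EntryProcessor threw, collected from all members
LOG.info("partitions with exceptions: "
        + aggregatedResult.getPartitionExceptions().keySet());
// per-entry return values from the invocations that succeeded
LOG.info(aggregatedResult.getResults().size() + " per-entry results returned");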

5.6.4 Testing Member Failure During Invocation

There are a number of scenarios to consider for member failure:

Failure before the Invocable has determined the partitions in the local member.

Failure after local partitions have been determined.

Failure after some partitions have been processed.

The first of these is uninteresting: partitions moved will be found in the new members and processed in the first iteration. To simulate the second, we need to co-ordinate our test thread with the PartitionEntryProcessorInvoker execution in the storage node's service thread. We'll subclass this class to release a Semaphore once it has determined the set of partitions for the member:
public class SynchPartitionEntryProcessorInvoker extends PartitionEntryProcessorInvoker {
    .
    .
    .
    @Override
    protected PartitionSet getPartitionSetToProcess(NamedCache cache) {
        PartitionSet partitionSet = super.getPartitionSetToProcess(cache);
        Semaphore semaphore = getRunReleaseSemaphore();
        if (semaphore != null) {
            semaphore.release();
        }
        return partitionSet;
    }


Unfortunately, because the Invocable instance being executed is not the same one we construct in the test method - having been serialised, then deserialised in the storage node's class loader - getting the test thread and the storage node to see the same Semaphore instance is not trivial. I provide here a utility class that allows us to obtain a reference to a static member variable of a given class in a parent classloader:
public class ReflectionUtils {
    public static Object getFieldFromTopClassLoader(Class<?> clazz, String fieldName) {
        ClassLoader loader = clazz.getClassLoader();
        ClassLoader parent = loader.getParent();
        if (parent != null) {
            loader = parent;
        }
        Class<?> topClass;
        try {
            topClass = loader.loadClass(clazz.getName());
        } catch (ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
        Field resultField;
        try {
            resultField = topClass.getDeclaredField(fieldName);
        } catch (NoSuchFieldException | SecurityException e) {
            throw new RuntimeException(e);
        }
        resultField.setAccessible(true);
        try {
            return resultField.get(null);
        } catch (IllegalArgumentException | IllegalAccessException e) {
            throw new RuntimeException(e);
        }
    }
}

We can augment SynchPartitionEntryProcessorInvoker to use this utility class:
private static Semaphore runRelease;
public static void setRunReleaseSemaphore(Semaphore semaphore) {
    runRelease = semaphore;
}
public static Semaphore getRunReleaseSemaphore() {
    return (Semaphore) ReflectionUtils.getFieldFromTopClassLoader(
            SynchPartitionEntryProcessorInvoker.class, "runRelease");
}

We must now be careful of the behaviour of our chosen test framework. With Littlegrid, the test thread runs in the parent ClassLoader of the other cluster nodes. If you use the JUnit support of gridman or Oracle Tools⁵, the test thread's ClassLoader is a sibling of the nodes', so the set method must also reflectively reference the parent⁶.
Now, in our test method, we can use SynchPartitionEntryProcessorInvoker and wait until partitions have been assigned, then kill one of the members:
⁵ http://thegridman.com/ and https://github.com/coherence-community/oracle-tools respectively
⁶ Another exercise for the committed reader.


@Test
public void testWithDeadMember() throws InterruptedException {
    .
    .
    .
    final int memberToStop = memberGroup.getStartedMemberIds()[0];
    int iterations = 0;
    Semaphore runRelease = new Semaphore(0);
    SynchPartitionEntryProcessorInvoker.setRunReleaseSemaphore(
            runRelease);
    while (!requiredPartitions.isEmpty()) {
        iterations++;
        Set<Member> memberSet = new HashSet<>();
        for (int partition : requiredPartitions.toArray()) {
            memberSet.add(cacheService.getPartitionOwner(
                    partition));
        }
        Invocable invocable = new SynchPartitionEntryProcessorInvoker(
                ArbitrarilyFailingEntryProcessor.INSTANCE,
                AlwaysFilter.INSTANCE,
                "test",
                requiredPartitions);
        invocationService.execute(invocable, memberSet, observer);
        if (iterations == 1) {
            runRelease.acquire(STORAGE_MEMBER_COUNT);
            memberGroup.stopMember(memberToStop);
        }
        invocationComplete.acquire();
        requiredPartitions.remove(
                aggregatedResult.getPartitionsProcessed());
    }
    .
    .
    .
}

We find when running this that killing one member causes backup partitions
to be promoted:
2013-06-03 18:04:07.219/11.797 Oracle Coherence GE 3.7.1.3 <Info> (thread=DistributedCache, member=2): Restored from backup 1 partitions: PartitionSet{12}
2013-06-03 18:04:07.223/11.801 Oracle Coherence GE 3.7.1.3 <Info> (thread=DistributedCache, member=4): Restored from backup 3 partitions: PartitionSet{9, 10, 11}

These partitions are then processed in the second iteration, with the now-familiar final result:
18:04:08.940 [main] [member=] INFO o.c.g.i.TestPartitionEntryProcessorInvoker - Completed iteration 2
18:04:08.965 [main] [member=] INFO o.c.g.i.TestPartitionEntryProcessorInvoker - 36 entries updated
18:04:08.971 [main] [member=] INFO o.c.g.i.TestPartitionEntryProcessorInvoker - unchanged keys: ConverterCollection{33, 7, 20}

But what if the member is killed after some partitions have been processed? We'll make this happen by wrapping the ArbitrarilyFailingEntryProcessor


with one that allows the test thread to synchronise on a semaphore during the second invocation of processAll, i.e. after the first invocation has completed. This again uses the ClassLoader trickery to ensure the test thread and storage nodes see the same Semaphore object:
public class DelegatingSynchEntryProcessor implements EntryProcessor, Serializable {
    private EntryProcessor delegate;
    private int memberToBlock;
    private static Semaphore runRelease;
    private transient int invocationCount = 0;
    public static void setRunReleaseSemaphore(Semaphore semaphore) {
        runRelease = semaphore;
    }
    public static Semaphore getRunReleaseSemaphore() {
        return (Semaphore) ReflectionUtils.getFieldFromTopClassLoader(
                DelegatingSynchEntryProcessor.class, "runRelease");
    }
    public DelegatingSynchEntryProcessor(EntryProcessor delegate, int memberToBlock) {
        this.delegate = delegate;
        this.memberToBlock = memberToBlock;
    }
    @Override
    public Object process(Entry entry) {
        return delegate.process(entry);
    }
    @Override
    public Map processAll(Set setEntries) {
        if (CacheFactory.getCluster().getLocalMember().getId() == memberToBlock) {
            if (invocationCount++ > 0) {
                getRunReleaseSemaphore().release();
                try {
                    Thread.sleep(10000);
                    throw new RuntimeException("waited too long to be killed");
                } catch (InterruptedException e) {
                    throw new RuntimeException(e);
                }
            }
        }
        return delegate.processAll(setEntries);
    }
}

Our test calls it as follows:


public void testWithDeadMemberAfterProcessing() throws InterruptedException {
    PartitionedService cacheService = (PartitionedService) cache.getCacheService();
    final InvocationResult aggregatedResult = new InvocationResult(
            new HashMap<>(),
            new HashMap<Integer, Exception>(),
            new PartitionSet(cacheService.getPartitionCount()));
    final PartitionSet requiredPartitions = new PartitionSet(
            cacheService.getPartitionCount());
    requiredPartitions.fill();
    final Semaphore invocationComplete = new Semaphore(0);
    InvocationObserver observer = new InvocationResultObserver(
            aggregatedResult, invocationComplete);
    final int memberToStop = memberGroup.getStartedMemberIds()[0];
    InvocationService invocationService =
            (InvocationService) CacheFactory.getService("invocationService");
    int iterations = 0;
    Semaphore runRelease = new Semaphore(0);
    DelegatingSynchEntryProcessor.setRunReleaseSemaphore(
            runRelease);
    while (!requiredPartitions.isEmpty()) {
        iterations++;
        Set<Member> memberSet = new HashSet<>();
        for (int partition : requiredPartitions.toArray()) {
            memberSet.add(cacheService.getPartitionOwner(
                    partition));
        }
        Invocable invocable = new PartitionEntryProcessorInvoker(
                new DelegatingSynchEntryProcessor(
                        ArbitrarilyFailingEntryProcessor.INSTANCE, memberToStop),
                AlwaysFilter.INSTANCE,
                "test",
                requiredPartitions);
        invocationService.execute(invocable, memberSet, observer);
        if (iterations == 1) {
            runRelease.acquire();
            memberGroup.stopMember(memberToStop);
        }
        invocationComplete.acquire();
        requiredPartitions.remove(
                aggregatedResult.getPartitionsProcessed());
        LOG.info("Completed iteration " + iterations);
        if (!memberExceptions.isEmpty()) {
            fail("member exceptions thrown");
            break;
        }
    }
    int setSize = cache.entrySet(new EqualsFilter(
            IdentityExtractor.INSTANCE, "foobar")).size();
    int dupSetSize = cache.entrySet(new EqualsFilter(
            IdentityExtractor.INSTANCE, "foobarbar")).size();
    LOG.info(setSize + " entries updated once, " + dupSetSize
            + " entries updated twice");
    LOG.info("unchanged keys: " + cache.keySet(
            new EqualsFilter(IdentityExtractor.INSTANCE, "foo")));
}

So what happens when we run this test?


...
07:49:53.954 [DistributedCacheWorker:0] [member=1] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with set size=3
07:49:53.976 [DistributedCacheWorker:0] [member=1] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with key=9, partition=9
07:49:53.977 [DistributedCacheWorker:0] [member=1] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with key=35, partition=9
07:49:53.977 [DistributedCacheWorker:0] [member=1] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with key=22, partition=9
...
Jun 12, 2013 7:49:54 AM org.littlegrid.impl.DefaultClusterMemberGroup stopMember
INFO: About to stop cluster member with id 1

Member 1 processes a partition and is then killed.


...
07:49:54.112 [Invocation:invocationService] [member=] INFO o.c.g.i.InvocationResultObserver - member 1 has left
...
2013-06-12 07:49:54.616/11.832 Oracle Coherence GE 3.7.1.3 <Info> (thread=DistributedCache, member=2): Restored from backup 2 partitions: PartitionSet{11, 12}
2013-06-12 07:49:54.618/11.834 Oracle Coherence GE 3.7.1.3 <Info> (thread=DistributedCache, member=4): Restored from backup 1 partitions: PartitionSet{10}
2013-06-12 07:49:54.620/11.836 Oracle Coherence GE 3.7.1.3 <Info> (thread=DistributedCache, member=3): Restored from backup 1 partitions: PartitionSet{9}
...
07:49:56.073 [main] [member=] INFO o.c.g.i.TestPartitionEntryProcessorInvoker - Completed iteration 1

The member's partitions are promoted in the remaining members, and we
reach the end of the first iteration. In the next iteration, all of the partitions
formerly belonging to member 1 are reprocessed:
07:49:56.087 [DistributedCacheWorker:2] [member=4] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with set size=3
07:49:56.088 [DistributedCacheWorker:2] [member=4] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with key=36, partition=10
07:49:56.088 [DistributedCacheWorker:2] [member=4] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with key=23, partition=10
07:49:56.088 [DistributedCacheWorker:2] [member=4] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with key=10, partition=10
07:49:56.095 [DistributedCacheWorker:1] [member=3] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with set size=3
07:49:56.095 [DistributedCacheWorker:1] [member=3] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with key=9, partition=9
07:49:56.095 [DistributedCacheWorker:1] [member=3] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with key=35, partition=9
07:49:56.095 [DistributedCacheWorker:1] [member=3] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with key=22, partition=9
07:49:56.101 [Invocation:invocationService] [member=] INFO o.c.g.i.InvocationResultObserver - member 4 completed
07:49:56.102 [Invocation:invocationService] [member=] INFO o.c.g.i.InvocationResultObserver - member 3 completed
07:49:56.103 [DistributedCacheWorker:1] [member=2] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with set size=3
07:49:56.103 [DistributedCacheWorker:1] [member=2] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with key=37, partition=11
07:49:56.103 [DistributedCacheWorker:1] [member=2] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with key=11, partition=11

07:49:56.103 [DistributedCacheWorker:1] [member=2] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with key=24, partition=11
07:49:56.106 [DistributedCacheWorker:2] [member=2] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with set size=3
07:49:56.107 [DistributedCacheWorker:2] [member=2] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with key=38, partition=12
07:49:56.107 [DistributedCacheWorker:2] [member=2] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with key=12, partition=12
07:49:56.107 [DistributedCacheWorker:2] [member=2] INFO o.c.g.e.ArbitrarilyFailingEntryProcessor - invoked with key=25, partition=12
07:49:56.111 [Invocation:invocationService] [member=] INFO o.c.g.i.InvocationResultObserver - member 2 completed
07:49:56.111 [Invocation:invocationService] [member=] INFO o.c.g.i.InvocationResultObserver - invocation completed
07:49:56.111 [main] [member=] INFO o.c.g.i.TestPartitionEntryProcessorInvoker - Completed iteration 2

In particular, partition 9 has been processed again. Our EntryProcessor appends "bar" to the end of the value to turn "foo" into "foobar", but if we now query the cache, we'll find the three entries from partition 9 have the value "foobarbar" - these have been processed twice.
The best we can do using this approach is to guarantee that each entry will be processed at least once - the InvocationObserver can tell us that a member has left, but there is no way to determine from the caller's perspective what, if any, processing had been performed on that member. This is a common theme with Coherence; you will find that this is precisely Coherence's guarantee in many circumstances, including event handlers, CacheStore etc. It is always safest to code defensively on the assumption that your code may be executed more than once.
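To make the point concrete, here is a minimal sketch of a defensively coded EntryProcessor in the spirit of the test above. The class name and the marker value are hypothetical, not from the book's example source; the idea is simply to check whether the update has already been applied before re-applying it, so a repeated invocation after partition promotion is harmless.

import com.tangosol.util.InvocableMap;
import com.tangosol.util.processor.AbstractProcessor;
import java.io.Serializable;

public class IdempotentAppendProcessor extends AbstractProcessor implements Serializable {
    // Appends "bar" to the cached String value, but only if it has not
    // already been appended, so re-execution after failover is a no-op.
    @Override
    public Object process(InvocableMap.Entry entry) {
        String value = (String) entry.getValue();
        if (!value.endsWith("bar")) {
            entry.setValue(value + "bar");
        }
        return null;
    }
}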

5.6.5 Other Variations

In this section we have concentrated on the model of sending a single Invocable object to all members, so the member variables of that Invocable available to parameterise the execution will be the same on each node. There are other variations we could choose to use, but in order to keep this book down to a manageable size we won't explore them in full here - generally they involve putting together the bits of API we have covered, but in different ways.
If we were starting with a set of keys rather than a filter, we might find that we need to invoke only against a subset of partitions, and even a subset of members. Identifying the partition owning a key and the member owning a partition is a cheap operation, executed entirely locally, so restricting the invocation to only those members where it is needed will be more efficient. If the set of keys is large, it may be preferable to construct a distinct Invocable
for each member and send it separately. We can still invoke all of them before waiting for any, and can use a single InvocationObserver to track completion. In this case we would be determining the partitions owned by a member in the caller, rather than in the Invocable itself - this does increase the window during which a partition might move before invocation starts, though this would generally affect only performance and not correctness and should not be of undue concern - far better to try to prevent partitioning events than to optimise for them.
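As a rough sketch of the kind of local lookup involved (the cache and key variables here are illustrative, not from the book's example source):

// All of these calls are resolved from locally held partition ownership
// state; no network round trip is required.
PartitionedService service = (PartitionedService) cache.getCacheService();
int partition = service.getKeyPartitioningStrategy().getKeyPartition(key);
Member owner = service.getPartitionOwner(partition);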

5.7 Working With Many Caches

Objective
To illustrate techniques of working with many caches in the cluster, and their limitations. In doing so we demonstrate several useful techniques including partition-local atomic operations and the use of backing-map queries. We also explore two causes of deadlocks, their consequences, and how to avoid them.
Prerequisites
You should be acquainted with the EntryProcessor and BinaryEntry API
Code examples
Source is in the org.cohbook.gridprocessing.reentrancy package of module gridprocessing
Dependencies
The examples use JUnit and Littlegrid for executing test cases
There are some tricks to performing operations in the grid the most efficient way, and some pitfalls for the unwary. We'll base our examples on the business of our favourite airline, BozoJet. BozoJet needs to keep details of seat availability and existing reservations in a cache to provide a fast, accurate response to its online customers and other booking systems. We'll model flights, uniquely identified by a flight number, passengers, uniquely identified by name, and bookings. A booking comprises one or more passenger reservations on one or more flights. The number of available seats on a flight must accurately reflect the reservations made for that flight so that bookings are rejected if there is insufficient availability.

5.7.1 Referencing Another Cache From an EntryProcessor

We could take locks on the various cache entries we'd like to update, perform the updates, then release the locks, but that approach is inefficient and somewhat fragile (we have to track and release locks ourselves). Also, explicit locks in Coherence don't really work in some circumstances (where there are multi-threaded TCP*Extend clients, to be precise). So, instead, we'll start with an EntryProcessor. We have two caches, flight and reservation. The flight value object looks like this:
public class Flight implements Serializable {
    private String flightId;
    private String origin;
    private String destination;
    private Date departureTime;
    private int availableBusiness = 42;
    private int availableEconomy = 298;

    // getters, setters etc

We'll take flightId as our unique cache key (not to be confused with the flight number, which conventionally has the same value for the same flight on different days).
An individual passenger reservation looks like this:
public class Reservation implements Serializable {
    private String bookingRef;
    private String passengerName;
    private String flightId;
    private boolean checkedIn = false;

    enum SeatType { economy, business };

    private SeatType seatType;
    // getters, setters etc

A booking comprises a collection of these passenger reservations sharing a common booking reference. We won't currently model this as an entity in itself. The first thing we might think to do is write an EntryProcessor that is invoked to insert or update a Reservation, and will update the seat availability on the Flight as it does so, listing 5.18. This will cope with an amended reservation (entry.isPresent() == true) or a deleted reservation (this.reservation == null) as well as a new reservation. The updateFlight method works by invoking another EntryProcessor on the flight cache.
There are two serious problems with this approach:
- We might update the flight cache twice for the same reservation.
- We may exhaust the available service threads and deadlock.


Listing 5.18: First attempt to update two caches


public class ReservationProcessor1 extends AbstractProcessor {
    private Reservation reservation;
    public ReservationProcessor1(Reservation reservation) {
        this.reservation = reservation;
    }
    public Object process(Entry entry) {
        if (entry.isPresent()) {
            Reservation previous = (Reservation) entry.getValue();
            updateFlight(previous.getFlightId(), previous.getSeatType(), -1);
        }
        if (reservation == null) {
            entry.remove(false);
        } else {
            entry.setValue(reservation);
            updateFlight(reservation.getFlightId(), reservation.getSeatType(), 1);
        }
        return null;
    }
    private void updateFlight(String flightId, SeatType seatType, int seats) {
        FlightUpdateProcessor processor = new FlightUpdateProcessor(seatType, seats);
        NamedCache flightCache = CacheFactory.getCache("flight");
        flightCache.invoke(flightId, processor);
    }
    ...
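The FlightUpdateProcessor itself is not shown in the text; a minimal sketch of what such a processor might look like follows (the real class in the book's example source may differ) - it is simply an EntryProcessor that adjusts the availability counters on the Flight:

public class FlightUpdateProcessor extends AbstractProcessor implements Serializable {
    private SeatType seatType;
    private int seats;

    public FlightUpdateProcessor(SeatType seatType, int seats) {
        this.seatType = seatType;
        this.seats = seats;
    }

    @Override
    public Object process(InvocableMap.Entry entry) {
        // Adjust the relevant availability counter and write the flight back.
        Flight flight = (Flight) entry.getValue();
        switch (seatType) {
        case business:
            flight.setAvailableBusiness(flight.getAvailableBusiness() - seats);
            break;
        case economy:
            flight.setAvailableEconomy(flight.getAvailableEconomy() - seats);
            break;
        }
        entry.setValue(flight);
        return null;
    }
}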

The first of these might happen if a cluster member is lost during the invocation of ReservationProcessor1 but after FlightUpdateProcessor has completed. Coherence will repartition the cache, the reservation cache entry will be assigned to another member, and the ReservationProcessor1 instance will be invoked on the same key on the new member. There is no record that the flight has been updated, so it will be updated again. There is no transactional integrity between the updates.
The second problem may occur if several worker threads are concurrently executing ReservationProcessor1 for different reservations: each will call out, requiring a new worker thread to execute the FlightUpdateProcessor. Each invocation of FlightUpdateProcessor blocks waiting for a thread to become free, but if all the busy threads are waiting for the same thing they will never become free. The example code includes a Littlegrid example, BookingProcessor1Test, that illustrates the problem by running a service on a single storage node with two worker threads, while making ReservationProcessor1 invocations on two client threads. This problem can be remedied by placing the reservation and flight caches on different services, so long as there are cross-service calls in only one direction - i.e. if we also had EntryProcessor invocations on the flight cache calling out to the reservation cache service, we could still encounter thread starvation deadlocks.
In general, consider carefully the consequences of calling services from within a service worker thread, as happens in this case. To avoid deadlocks, ensure that such calls happen only between services, and not within the same service, and ensure that they can occur in only one direction. Even then, cross-service calls may have a detrimental impact on latency and throughput as additional network hops may be incurred, tying up the caller thread for the duration of the nested call. You can avoid or mitigate this by calling out to a replicated service or to a distributed service with a near cache. If you must perform updates through a nested call, these must be idempotent because of the possibility of a repeated call.

5.7.2 Partition-Local Atomic Operations

We can update the flight cache consistently with the reservation through the BackingMapContext, but only if the reservation entry and flight entry are in the same partition of the same service. We ensure this by using key affinity, making the reservation cache key class implement KeyAssociation (there are other methods - we could configure a KeyAssociator or a KeyPartitioningStrategy on the service - but this is the simplest way):


public class ReservationKey implements KeyAssociation , Serializable {
private int bookingId ;
private String passengerName ;
private int flightId ;
public ReservationKey ( Reservation booking ) {
this . bookingId = booking . getBookingId ();
this . passengerName = booking . getPassengerName ();
this . flightId = booking . getFlightId ();
}
@Override
public Object getAssociatedKey () {
return flightId ;
}
// getters
}
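As a rough illustration of what key affinity buys us, and assuming the default key partitioning strategy (which takes key association into account), a reservation key and the flightId it is associated with should resolve to the same partition. The variable names below are illustrative, not from the book's source:

// Both caches share the same partitioned service, so partition assignment
// can be compared directly. Because getAssociatedKey() returns the flightId,
// the reservation entry should land in the same partition as its flight entry.
PartitionedService service =
        (PartitionedService) CacheFactory.getCache("flight").getCacheService();
KeyPartitioningStrategy strategy = service.getKeyPartitioningStrategy();

Reservation reservation = new Reservation(); // populated elsewhere
ReservationKey reservationKey = new ReservationKey(reservation);
assert strategy.getKeyPartition(reservationKey)
        == strategy.getKeyPartition(reservation.getFlightId());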

We'll write a new ReservationProcessor2 which will update the flight cache entry through the BackingMapContext. First, we have to get the context:

public Object process(Entry entry) {
    BackingMapManagerContext bmc = ((BinaryEntry) entry).getContext();
    BackingMapContext flightContext = bmc.getBackingMapContext("flight");

Then, from the flightContext, we can get the BinaryEntry - the cache entry in the backing map - for the flight. The getBackingMapEntry method needs the binary serialised form of the key, in this case the flightId from the key of the entry in the reservation cache.
int flightId = ((ReservationKey) entry.getKey()).getFlightId();
Binary serFlightId = (Binary) bmc.getKeyToInternalConverter().convert(flightId);
BinaryEntry flightEntry =
        (BinaryEntry) flightContext.getBackingMapEntry(serFlightId);
if (!flightEntry.isPresent()) {
    throw new IllegalArgumentException("No such flight " + flightId);
}
Flight flight = (Flight) flightEntry.getValue();

Now we can update the availability on the flight, again taking account of
new, amended, or removed reservations. We must call the setValue method
on the flight entry we obtained from the backing map context.


if (entry.isPresent()) {
    Reservation previous = (Reservation) entry.getValue();
    updateFlight(flight, previous.getSeatType(), -1);
}
if (reservation != null) {
    updateFlight(flight, reservation.getSeatType(), 1);
    entry.setValue(reservation);
} else {
    entry.remove(false);
}
flightEntry.setValue(flight);
return null;

The updateFlight method is now quite trivial.


private void updateFlight(Flight flight, SeatType seatType, int seats) {
    switch (seatType) {
    case business:
        flight.setAvailableBusiness(flight.getAvailableBusiness() - seats);
        break;
    case economy:
        flight.setAvailableEconomy(flight.getAvailableEconomy() - seats);
        break;
    }
    if (flight.getAvailableEconomy() < 0 || flight.getAvailableBusiness() < 0) {
        throw new IllegalStateException("insufficient seat availability");
    }
}

Unlike the previous example, the updates to the flight and reservation caches are now atomic. If, at any time before the completion of execution of the ReservationProcessor2, the member owning the partition dies, the entries will be transferred to the new owning member in a consistent state and the processor will be run again. We are also protected against updates from other worker threads, as the call to getBackingMapEntry takes a lock on the entry until the EntryProcessor invocation is complete. We have handled the insufficient availability condition by throwing an exception. This has the effect of telling Coherence to roll back all changes made through the BackingMapManagerContext, so again, we ensure transactional consistency. We could choose instead to return a boolean true/false value from the process method to indicate whether the update had succeeded or not, but then it would be up to us to ensure that we tested for the insufficient seats condition and only called setValue or remove on any of the backing map entries if the test passed [7].

[7] See section 5.5: Exceptions in EntryProcessors for more on handling exceptions, in particular for invokeAll calls.
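Invoking the processor is then no different from any other EntryProcessor. A minimal sketch, assuming ReservationProcessor2 has a constructor analogous to ReservationProcessor1's (the variable names are illustrative):

// The key association on ReservationKey guarantees that the reservation and
// its flight are co-located, so the processor can update both atomically.
NamedCache reservationCache = CacheFactory.getCache("reservation");
Reservation reservation = new Reservation();
// ... populate bookingRef, passengerName, flightId and seatType ...
reservationCache.invoke(new ReservationKey(reservation),
        new ReservationProcessor2(reservation));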

5.7.3 Updating Many Entries

We've looked at how to atomically update a flight with the details of a single reservation, but a single booking may contain reservations for a number of passengers on the same flight. We'd like to have the flight updated once, and only if there is sufficient availability for the entire group. To do that, we'll turn the implementation around and invoke an EntryProcessor on the flight cache with many reservations.
public class FlightReservationProcessor extends AbstractProcessor {
    private List<Reservation> reservations;
    public FlightReservationProcessor(List<Reservation> reservations) {
        this.reservations = reservations;
    }
    public Object process(Entry entry) {
        Flight flight = (Flight) entry.getValue();
        BackingMapManagerContext bmc = ((BinaryEntry) entry).getContext();
        BackingMapContext reservationContext = bmc.getBackingMapContext("reservation");
        Converter keyToInternalConverter = bmc.getKeyToInternalConverter();
        for (Reservation reservation : reservations) {
            ReservationKey resKey = new ReservationKey(reservation);
            Binary binResKey = (Binary) keyToInternalConverter.convert(resKey);
            Entry reservationsEntry = reservationContext.getBackingMapEntry(binResKey);
            if (reservationsEntry.isPresent()) {
                Reservation previous = (Reservation) reservationsEntry.getValue();
                updateFlight(flight, previous.getFlightId(), previous.getSeatType(), -1);
            }
            updateFlight(flight, reservation.getFlightId(), reservation.getSeatType(), 1);
            reservationsEntry.setValue(reservation);
        }
        if (flight.getAvailableEconomy() < 0 || flight.getAvailableBusiness() < 0) {
            throw new IllegalStateException("Insufficient availability");
        }
        entry.setValue(flight);
        return null;
    }
    private void updateFlight(Flight flight, int flightId, SeatType seatType, int seats) {
        if (flight.getFlightId() != flightId) {
            return;
        }
        switch (seatType) {
        case business:
            flight.setAvailableBusiness(flight.getAvailableBusiness() - seats);
            break;
        case economy:
            flight.setAvailableEconomy(flight.getAvailableEconomy() - seats);
            break;
        }
    }
}

The pattern is similar, in that we obtain the BackingMapManagerContext for the other cache, though here we start from the flight cache and get the reservation cache, but then we iterate over a list of reservations, storing each one and updating the parent flight's seat availability. If, at the end of the operation, we find that we have exceeded the seat availability, then the exception will cause the entire invocation to be rolled back. Note also that in the updateFlight method we have these lines:

if (flight.getFlightId() != flightId) {
    return;
}

so that any reservation against a flight other than the one we are invoking for is silently ignored. If we have a booking that consists of reservations on many flights, we can invokeAll a single FlightReservationProcessor against the collection of flight keys affected, though such an invocation would be atomic only for each flight [8].

5.7.4 Backing Map Queries

We can now deal with inserted or amended reservations on the flight, but to remove deleted ones - i.e. those present in the cache but not in the collection in the FlightReservationProcessor - we need to query the backing map. We'll start by creating a map of the current reservations keyed by the reservation's serialised key, then add into the map the keys of any existing entries for the same booking that are not already in the map, using a null value to indicate that the entry must be removed from the cache:
Map<Binary, Reservation> reservationMap = new HashMap<>();
for (Reservation reservation : reservations) {
    Binary binKey = (Binary) keyToInternalConverter.convert(
            new ReservationKey(reservation));
    reservationMap.put(binKey, reservation);
}
for (Binary key : findReservations(
        reservationContext, (Integer) entry.getKey(), bookingRef)) {
    if (!reservationMap.containsKey(key)) {
        reservationMap.put(key, null);
    }
}

There is a utility class in the Coherence API, InvocableMapHelper, that allows us to perform filter queries against simple maps; we can use this to perform a filter query against the backing map to find all entries for the current flight and booking id.

[8] Though an exception thrown for one flight may roll back updates for other flights on the same partition or member; again, see section 5.5: Exceptions in EntryProcessors.

private Collection<Binary> findReservations(
        BackingMapContext context, final int flightId, final int bookingRef) {
    final BackingMapManagerContext mgr = context.getManagerContext();
    EntryFilter filter = new AndFilter(
            new EqualsFilter("getBookingId", bookingRef),
            new EqualsFilter("getFlightId", flightId));
    return InvocableMapHelper.query(context.getBackingMap(), filter, false, false, null);
}

In a style of API design almost worthy of Microsoft, the runtime type of the return value from InvocableMapHelper.query depends on the third parameter; here we're obtaining the set of keys of the matching entries. If it were true we would have obtained the set of entries. There is a problem here. For a distributed service, the backing map is effectively a java.util.Map<Binary,Binary> with serialised keys and values. Implementations of Filter require that the map entry passed to them return the deserialised values from the getKey() and getValue() methods, and in some cases that the map entry passed is an implementation of BinaryEntry. We can write an adapter that fulfils these requirements by constructing a BackingMapBinaryEntry and passing that to the filter.
final class BinaryEntryAdapterFilter implements EntryFilter {
    private final BackingMapManagerContext mgr;
    private final EntryFilter delegate;
    public BinaryEntryAdapterFilter(BackingMapManagerContext mgr, EntryFilter filter) {
        this.mgr = mgr;
        this.delegate = filter;
    }
    public boolean evaluate(Object obj) {
        throw new UnsupportedOperationException();
    }
    public boolean evaluateEntry(java.util.Map.Entry entry) {
        return delegate.evaluateEntry(new BackingMapBinaryEntry(
                (Binary) entry.getKey(),
                (Binary) entry.getValue(),
                null,
                mgr));
    }
}

So that in the findReservations method we write:

...
Filter filterAdapter = new BinaryEntryAdapterFilter(mgr, filter);
return InvocableMapHelper.query(
        context.getBackingMap(), filterAdapter, false, false, null);

Now, in the process method, we have a map of current reservations, and the keys of those no longer current. We iterate over that map updating the reservation and flight for each:

public Object process(Entry entry) {
    Flight flight = (Flight) entry.getValue();
    BackingMapManagerContext bmc = ((BinaryEntry) entry).getContext();
    BackingMapContext reservationContext = bmc.getBackingMapContext("reservation");
    Converter keyToInternalConverter = bmc.getKeyToInternalConverter();
    Map<Binary, Reservation> reservationMap = new HashMap<>();
    for (Reservation reservation : reservations) {
        Binary binKey = (Binary) keyToInternalConverter.convert(
                new ReservationKey(reservation));
        reservationMap.put(binKey, reservation);
    }
    for (Binary key : findReservations(
            reservationContext, (Integer) entry.getKey(), bookingRef)) {
        if (!reservationMap.containsKey(key)) {
            reservationMap.put(key, null);
        }
    }
    for (Map.Entry<Binary, Reservation> mapEntry : reservationMap.entrySet()) {
        Entry reservationsEntry = reservationContext
                .getBackingMapEntry(mapEntry.getKey());
        if (reservationsEntry.isPresent()) {
            Reservation previous = (Reservation) reservationsEntry.getValue();
            updateFlight(flight, previous.getFlightId(), previous.getSeatType(), -1);
        }
        Reservation reservation = mapEntry.getValue();
        if (reservation == null) {
            reservationsEntry.remove(false);
        } else {
            updateFlight(flight, reservation.getFlightId(), reservation.getSeatType(), 1);
            reservationsEntry.setValue(reservation);
        }
    }
    if (flight.getAvailableEconomy() < 0 || flight.getAvailableBusiness() < 0) {
        throw new IllegalStateException("Insufficient availability");
    }
    entry.setValue(flight);
    return null;
}

5.7.5 Backing Map Deadlocks

The InvocableMapHelper.query method called in findReservations does not take any locks on the backing map or individual entries. Only when we call reservationContext.getBackingMapEntry is a lock taken on that backing map entry. If we modify the reservation cache in another worker thread without first locking the corresponding flight cache entry, we risk a race condition where the cache contents change between executing the query and performing the updates. We must therefore be careful in our application design: any operation that changes the reservation cache in a manner that might change the results of the query must be performed via an EntryProcessor against the flight cache, updating the reservation cache through the backing map context just as this example does. In this way, updates are synchronised on the flight cache entry. There are still operations we can perform on the reservation cache that do not interfere. For example, we may update the check-in status of a set of reservations:
EntryProcessor checkinProcessor = new UpdaterProcessor("setCheckedIn", true);
reservationCache.invokeAll(keys, checkinProcessor);

Now, if some subset of the keys passed to invokeAll are in the same partition, as they will be if they refer to the same flightId, then they will be updated atomically: each is locked in turn and the locks are released only when all keys in the partition have been updated. We can therefore create a deadlock with the FlightReservationProcessor where the order of iteration of keys in reservationMap differs from the order of invocation of the checkinProcessor. Prior to version 12, Coherence would truly deadlock at this point, only recoverable by the service guardian if configured to do so. In version 12 [9], the condition is detected:
(Wrapped: Failed request execution for DistributedCache service on Member(Id=1, Timestamp=2014-03-18 08:10:07.45, Address=127.0.0.1:22000, MachineId=30438, Location=site:DefaultSite,rack:DefaultRack,machine:DefaultMachine,process:7792, Role=DedicatedStorageEnabledMember)) java.lang.IllegalMonitorStateException: Deadlock while trying to lock a cache resource.
    at com.tangosol.util.Base.ensureRuntimeException(Base.java:286)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.tagException(Grid.CDB:50)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.onInvokeRequest(PartitionedCache.CDB:61)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$InvokeRequest.run(PartitionedCache.CDB:1)
    at com.tangosol.coherence.component.util.DaemonPool$WrapperTask.run(DaemonPool.CDB:1)
    at com.tangosol.coherence.component.util.DaemonPool$WrapperTask.run(DaemonPool.CDB:32)
    at com.tangosol.coherence.component.util.DaemonPool$Daemon.onNotify(DaemonPool.CDB:66)
    at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:51)
    at java.lang.Thread.run(Thread.java:744)

The solution in this case is straightforward. In the FlightReservationProcessor, we must ensure we iterate and lock the reservation cache entries in the natural ordering of the serialised binary key, which is precisely what we get by using a TreeMap rather than a HashMap.

Map<Binary, Reservation> reservationMap = new TreeMap<>();

5.8 Joins

Objective
Show how to perform join processing between two caches, distributing the work through the cluster.

[9] Feature only announced for 12.1.3, so perhaps should not be relied on in 12.1.2.

Prerequisites
An understanding of key affinity as explored in section 5.7: Working
With Many Caches. Also, read through section 5.4: Writing a custom aggregator for an understanding of how aggregators work and best
practices using them.
Code examples
Source code is in the org.cohbook.gridprocessing.joins package of the
gridprocessing module. We also re-use the domain objects from the
org.cohbook.gridprocessing.reentrancy package from the previous section, section 5.7: Working With Many Caches.
Dependencies
We use Littlegrid in the example tests
The first question to answer when you want to join data from two caches together is: why? We deal with objects here, not a flat relational model. Why not store the joined data as a single parent object containing a collection of the related child objects? It is possible, with some limitations, to query and mutate the embedded collection, even to create indexes on it. Having said that, there are valid cases where a separate cache does make sense - updating a larger, more complex object will be more expensive than adding a simple object to a separate cache, and that may be an overriding factor. So, having established the need for separate caches and a join strategy, we must first consider the constraints.
To perform join processing between two distributed caches within the cluster
without hitting service re-entrancy problems similar to those we discussed
in subsection 5.7.1: Referencing Another Cache From an EntryProcessor,
we must ensure that the cache entries on both sides of the join are in the
same cluster member, that they are stored in separate caches within the
same distributed service, and linked with key affinity. We will demonstrate
using the Flight and Reservation classes from section 5.7: Working With
Many Caches. Recall that the flight cache is keyed by the int flightId
and the reservation cache is keyed by a ReservationKey, which implements
KeyAssociation using the flightId.
Recall also from subsection 5.1.1: EntryAggregator vs EntryProcessor that we have two mechanisms for invoking in-situ processing of cache entries, the EntryProcessor and the EntryAggregator. Each can operate on cache entries, but the EntryAggregator offers significantly better performance with the limitation that it cannot modify the cache. Wherever possible you should

use EntryAggregator in preference to EntryProcessor, and we shall do so in the remainder of this section.

5.8.1 Many-to-One Joins

We start with the simple case of a many-to-one join, where we use a foreign key in one cache to look up an entry in another. In our BozoJet data model, a booking consists of many reservations, each linking a passenger to a flight. We'd like to produce an itinerary for a booking by finding all reservations for a given booking id, and joining the flight details for each stage. The ParallelAwareAggregator.aggregate method is passed a Set which we can cast to Set<BinaryEntry>:
public class ItineraryAggregator implements ParallelAwareAggregator {
    public Object aggregate(Set set) {
        for (BinaryEntry reservationEntry : (Set<BinaryEntry>) set) {
            Reservation reservation = (Reservation) reservationEntry.getValue();
            ...
        }

For each reservation, we obtain the flight id and use this to look up in the backing map of the flight cache. We cannot call getBackingMapEntry() for the flight cache - that method is only usable in an EntryProcessor; it would throw an IllegalStateException if we called it here. But there is an alternative, BackingMapContext.getReadOnlyEntry, that we can use:

BackingMapManagerContext managerContext =
        reservationEntry.getContext();
BackingMapContext flightContext =
        managerContext.getBackingMapContext("flight");
int flightId = reservation.getFlightId();
Object binaryKey = managerContext
        .getKeyToInternalConverter().convert(flightId);
Flight flight = (Flight) flightContext
        .getReadOnlyEntry(binaryKey).getValue();

We now have the Reservation object and its corresponding Flight; what are we going to return from the aggregate method? We could write a simple wrapper class that contains the flight and its reservation, or use a generic Pair implementation, and return a collection of these. But, again, we are not dealing with a flat relational model - we should structure the data according to its use. If the ultimate goal is to produce a booking itinerary that lists each flight with the names of the passengers booked on that flight, we should return the data in that form. Let's create a new value object class:

public class ItineraryStage implements Serializable {
    private int flightId;
    private String origin;
    private String destination;
    private Date departureTime;
    private List<String> passengerNames;

    public ItineraryStage(Flight flight) {
        flightId = flight.getFlightId();
        origin = flight.getOrigin();
        destination = flight.getDestination();
        departureTime = flight.getDepartureTime();
        passengerNames = new ArrayList<>();
    }
    public void addPassenger(String passenger) {
        passengerNames.add(passenger);
    }
    // getters
}

The return value for the aggregate method will be a collection of ItineraryStage:

public Object aggregate(Set set) {
    Map<Integer, ItineraryStage> flightItineraries =
            new HashMap<Integer, ItineraryStage>();
    for (BinaryEntry reservationEntry : (Set<BinaryEntry>) set) {
        Reservation reservation = (Reservation) reservationEntry.getValue();
        int flightId = reservation.getFlightId();
        if (!flightItineraries.containsKey(flightId)) {
            BackingMapManagerContext managerContext = reservationEntry.getContext();
            BackingMapContext flightContext =
                    managerContext.getBackingMapContext("flight");
            Object binaryKey = managerContext.getKeyToInternalConverter()
                    .convert(flightId);
            Object binaryValue = flightContext.getBackingMap().get(binaryKey);
            Flight flight = (Flight) managerContext.getValueFromInternalConverter()
                    .convert(binaryValue);
            ItineraryStage stage = new ItineraryStage(flight);
            flightItineraries.put(flightId, stage);
        }
        flightItineraries.get(flightId).addPassenger(
                reservation.getPassengerName());
    }
    return new ArrayList<ItineraryStage>(flightItineraries.values());
}

We have to construct a new ArrayList for the return value because this will be serialised and sent to another node, but the runtime return type of HashMap.values() is not Serializable.
The aggregate method gives the results for a single member (or partition, depending on configuration). A single booking may involve several flights

on different partitions so our implementation of the aggregateResults method must combine all of these results. All reservations for a single flight will be on the same partition (through key affinity), so we simply need to add all the collections together:
public Object aggregateResults ( Collection collection ) {
Collection < ItineraryStage > result = new ArrayList < >();
for ( Collection < ItineraryStage > partialResult :
( Collection < Collection < ItineraryStage > >) collection ) {
result . addAll ( partialResult );
}
return result ;
}

We have followed the practices in section 5.4: Writing a custom aggregator to produce a stateless aggregator, so the last method that we must implement is trivial:

public EntryAggregator getParallelAggregator() {
    return this;
}

Finally, the aggregator itself has no arguments, so for convenience we can define a single public static instance:

public class ItineraryAggregator implements ParallelAwareAggregator {
    public static final ItineraryAggregator INSTANCE = new ItineraryAggregator();
    ...
To invoke, we use a filter to find all keys in the reservation cache that contain the required booking id:

NamedCache reservationCache = CacheFactory.getCache("reservation");
ValueExtractor extractor = new ReflectionExtractor(
        "getBookingId", null, AbstractExtractor.KEY);
int bookingId = 23;
Filter filter = new EqualsFilter(extractor, bookingId);
Collection<ItineraryStage> itinerary =
        (Collection<ItineraryStage>) reservationCache.aggregate(
                filter, ItineraryAggregator.INSTANCE);

5.8.2 One-to-Many Joins

Now consider the requirement to produce a passenger manifest containing details for a flight, including the lists of passengers booked in economy and business class. We'll define our result object first:

public class PassengerManifest implements Serializable {
    private int flightId;
    private String origin;
    private String destination;
    private Date departureTime;
    private List<String> economyPassengerNames;
    private List<String> businessPassengerNames;

    public PassengerManifest(Flight flight) {
        flightId = flight.getFlightId();
        origin = flight.getOrigin();
        destination = flight.getDestination();
        departureTime = flight.getDepartureTime();
        economyPassengerNames = new ArrayList<>();
        businessPassengerNames = new ArrayList<>();
    }
    public void addPassenger(String passenger, SeatType seatType) {
        switch (seatType) {
        case business:
            businessPassengerNames.add(passenger);
            break;
        case economy:
            economyPassengerNames.add(passenger);
            break;
        }
    }
    // getters
}

We start our PassengerManifestAggregator by iterating over the flight entries, constructing a manifest for each:

public class PassengerManifestAggregator implements ParallelAwareAggregator {
    public Object aggregate(Set set) {
        Collection<PassengerManifest> manifests = new ArrayList<>();
        for (BinaryEntry flightEntry : (Set<BinaryEntry>) set) {
            manifests.add(
                    getManifest((Flight) flightEntry.getValue(), flightEntry.getContext()));
        }
        return manifests;
    }

In the getManifest method, we must first obtain the reservation cache backing map and query it for all reservations for the flight. As we found in subsection 5.7.4: Backing Map Queries, the backing map keys and values are serialised, so, unless we're using POF, we must use the BinaryEntryAdapterFilter we developed in that section to convert the search values. We use the magic third parameter to InvocableMapHelper.query to give us the entries that match, rather than the keys:

private PassengerManifest getManifest(
        Flight flight, BackingMapManagerContext managerContext) {
    BackingMapContext reservationContext =
            managerContext.getBackingMapContext("reservation");
    ValueExtractor flightIdExtractor = new ReflectionExtractor(
            "getFlightId", null, AbstractExtractor.KEY);
    EntryFilter filter = new EqualsFilter(flightIdExtractor, flight.getFlightId());
    Filter filterAdapter = new BinaryEntryAdapterFilter(managerContext, filter);
    Collection<Entry> reservations = InvocableMapHelper.query(
            reservationContext.getBackingMap(), filterAdapter, true, false, null);

Now we can construct our PassengerManifest and iterate over the reservations, adding each passenger to the manifest. Again, we must deserialise the reservation in order to extract the passenger name and seat type:

PassengerManifest manifest = new PassengerManifest(flight);
Converter reservationConverter = managerContext.getValueFromInternalConverter();
for (Entry binaryReservationEntry : reservations) {
    Object binaryReservation = binaryReservationEntry.getValue();
    Reservation reservation =
            (Reservation) reservationConverter.convert(binaryReservation);
    manifest.addPassenger(reservation.getPassengerName(), reservation.getSeatType());
}
return manifest;
}

The class is completed with a static instance and remaining methods exactly as for the ItineraryAggregator.

public class PassengerManifestAggregator implements ParallelAwareAggregator {
    public static final PassengerManifestAggregator INSTANCE =
            new PassengerManifestAggregator();
    ...
    public EntryAggregator getParallelAggregator() {
        return this;
    }
    public Object aggregateResults(Collection collection) {
        Collection<PassengerManifest> result = new ArrayList<>();
        for (Collection<PassengerManifest> partialResult :
                (Collection<Collection<PassengerManifest>>) collection) {
            result.addAll(partialResult);
        }
        return result;
    }
}

Invoke for a single flight using the keyset form of InvocableMap.aggregate:

int flightId = 23;
NamedCache flightCache = CacheFactory.getCache("flight");
Collection<PassengerManifest> manifests =
        (Collection<PassengerManifest>) flightCache.aggregate(
                Collections.singleton(flightId), PassengerManifestAggregator.INSTANCE);
PassengerManifest manifest = manifests.iterator().next();

5.8.3 Join Using a ValueExtractor

An alternative implementation for one-to-many joins is to place the join logic in an EntryExtractor:
public class PassengerManifestExtractor extends EntryExtractor {
    public static final PassengerManifestExtractor INSTANCE
            = new PassengerManifestExtractor();
    public Object extractFromEntry(java.util.Map.Entry entry) {
        BinaryEntry flightEntry = (BinaryEntry) entry;
        Flight flight = (Flight) flightEntry.getValue();
        BackingMapManagerContext managerContext = flightEntry.getContext();
        BackingMapContext reservationContext =
                managerContext.getBackingMapContext("reservation");
        ValueExtractor flightIdExtractor = new ReflectionExtractor(
                "getFlightId", null, AbstractExtractor.KEY);
        EntryFilter filter = new EqualsFilter(flightIdExtractor, flight.getFlightId());
        Filter filterAdapter = new BinaryEntryAdapterFilter(managerContext, filter);
        @SuppressWarnings("unchecked")
        Collection<Entry> reservations = InvocableMapHelper.query(
                reservationContext.getBackingMap(), filterAdapter, true, false, null);
        PassengerManifest manifest = new PassengerManifest(flight);
        Converter reservationConverter = managerContext.getValueFromInternalConverter();
        for (Entry binaryReservationEntry : reservations) {
            Object binaryReservation = binaryReservationEntry.getValue();
            Reservation reservation =
                    (Reservation) reservationConverter.convert(binaryReservation);
            manifest.addPassenger(reservation.getPassengerName(), reservation.getSeatType());
        }
        return manifest;
    }
}

This extractor can be used with an instance of ReducerAggregator to return a map of flight id to manifest, so for a single flight we can write:

NamedCache flightCache = CacheFactory.getCache("flight");
EntryAggregator manifestAggregator =
        new ReducerAggregator(PassengerManifestExtractor.INSTANCE);
Map<Integer, PassengerManifest> manifestMap =
        (Map<Integer, PassengerManifest>) flightCache.aggregate(
                Collections.singleton(FLIGHTID), manifestAggregator);
PassengerManifest manifest = manifestMap.entrySet().iterator().next().getValue();

5.8.4 Joins Without Key Affinity

A common use-case is to join transaction data to relatively slow-changing reference data: trades to instruments, orders to products, etc. Where the size of the reference data is not too large, this can be stored in a replicated cache. A service re-entrant call to NamedCache.get or NamedCache.query in an EntryProcessor or EntryAggregator will be quickly satisfied from the local member. If the reference data is too large or too fast-changing to use a replicated cache, you might consider using a distributed cache with a near cache - but be careful of deadlocks, and of the performance impact of cache misses.
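A rough sketch of the kind of re-entrant lookup described above, assuming a replicated "instrument" cache and hypothetical Trade and Instrument classes (none of these names are from the book's example source):

// Inside an EntryProcessor or EntryAggregator running against the trade cache.
// Because "instrument" is a replicated cache on a different service, this get
// is satisfied from the local member and does not risk the re-entrancy
// deadlocks discussed in section 5.7.1.
Trade trade = (Trade) entry.getValue();
NamedCache instrumentCache = CacheFactory.getCache("instrument");
Instrument instrument = (Instrument) instrumentCache.get(trade.getInstrumentId());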

5.8.5 Further Reading

Ben Stopford gave a talk at a London Coherence SIG covering the use of a snowflake schema and a strategy for dealing with data sets too large to replicate to all members of a cluster [10].

[10] See http://www.slideshare.net/benstopford/beyond-the-data-grid-coherencenormalisation-joins-and-linear-scalability

Chapter 6

Persistence

6.1 Introduction

Persistence. In the world of Oracle Coherence, this usually means keeping cached data and data stored in some relational database in line using implementations of CacheLoader or CacheStore, though there are many variations on the theme:
- The storage engine does not have to be an RDBMS. NoSQL solutions, flat files, and JMS publishers to some remote storage engine have all been used.
- There are other points, such as triggers and interceptors, where changes to the cache may be intercepted. In particular, see section 7.4: Transactional Persistence to see how to map partition-local transactions in Coherence to a database transaction.
Mostly in this chapter we'll be investigating how to write, configure, and use CacheStore and CacheLoader. For such a simple and intuitive interface, it is surprising just how many subtle traps there are for the unwary, so we'll start with a brief overview of how a distributed cache service works and how it interacts with your CacheStore.
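As a point of reference for the rest of the chapter, here is a minimal sketch of a JDBC-backed CacheLoader supporting read-through. The table, column names, and the way the DataSource is obtained are illustrative only, not taken from the book's example source:

import com.tangosol.net.cache.AbstractCacheLoader;
import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class OrderCacheLoader extends AbstractCacheLoader {
    private final DataSource dataSource; // assumed to be supplied by the application

    public OrderCacheLoader(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public Object load(Object key) {
        // Called by Coherence on a read-through when a get() misses the cache.
        try (Connection con = dataSource.getConnection();
             PreparedStatement stmt = con.prepareStatement(
                     "select payload from orders where order_id = ?")) {
            stmt.setString(1, (String) key);
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }
}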
6.1.1 Expiry and Eviction

These are terms that often cause confusion. Just to make it absolutely clear what we're talking about:
Expiry is the mechanism by which Coherence determines that a cache entry is out-of-date. When the entry expires, Coherence does nothing at all; specifically, the entry is not deleted at the time of expiry. All expiry means is that the entry will be discarded at some future time and will not be returned without being refreshed from the CacheLoader (if any). Experiment suggests that expired entries are removed on any access to the backing map that holds them, but this is not explicitly part of the contract. Expiry is concerned with ensuring that cache entries are up-to-date, not with cache size.
Eviction is the removal of entries from the cache when the size of the backing map has grown too large. Note that the test applies to each backing map separately (which may be either a partition, or all partitions in a member, depending on whether the service is configured for a partitioned backing map).
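For instance, an expiry can be set programmatically on an individual put - a sketch only, with an illustrative cache name; the same effect can be configured declaratively with <expiry-delay>:

NamedCache orders = CacheFactory.getCache("orders");
// The third argument is the expiry delay in milliseconds: the entry expires one
// minute after being put. It is not physically removed at that moment, but it
// will not be returned stale - a later get() goes back to the CacheLoader, if any.
orders.put("order-1", "some value", 60 * 1000L);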

6.1.2 Thread Model

A single distributed cache service has, in each storage-enabled member, one main thread associated with it. An additional pool of one or more worker threads may also be configured. If no pool is configured, all cache operations are performed on the main service thread; otherwise, they are delegated to the worker thread pool. This thread pool is created per service, and shared by all caches on that service.
Synchronous CacheStore operations (read-through and write-through) are performed on the thread (service or worker) that invoked them. Each individual cache that has read-ahead configured will have one read-ahead thread, and each cache configured for write-behind will have one write-behind thread.

Here is a summary in table form:

Thread                        Number       Per
Service thread                1            Service
Service worker thread pool    0 or more    Service
Read-ahead                    0 or 1       Cache
Write-behind                  0 or 1       Cache

6.1.3 Read-Through and Write-Through

This is the simplest strategy to implement and to manage. Database operations are performed in the service or worker thread that invokes them. Exceptions can be thrown back all the way to the invoking client, causing the cache operation to fail, including rolling back any enclosing partition-local operation. There is a configuration option to have Coherence swallow the exception, but using it seems like choosing to have all the disadvantages of synchronous and asynchronous operations combined.
Database operations will typically take considerably longer than cache operations: tens or hundreds of milliseconds compared to perhaps a millisecond for a cache operation. The executing service thread will be in use for all of this time, so you should consider the duration and frequency of database operations when sizing the worker thread pool so as to avoid task backlogs in the service. Remember that service worker threads are shared by many caches, so slow-running operations in one cache can cause thread starvation and task backlogs which affect caches that themselves have no database I/O.
Synchronous Batch Operations
Under what circumstances will Coherence call the set-based CacheStore operations loadAll, storeAll, and eraseAll?
Any cache operation that affects a set of entries in the same partition or member will be handled as a batch. A number of factors, including whether the cache operation itself is key- or filter-based and whether a service thread pool has been configured, will affect whether batches are at partition or member level.
If the <cachestore-scheme> is configured for operation bundling, concurrent operations from different threads may be batched together.

6.1.4 Read-Ahead

For each cache configured for read-ahead, a single thread per member will refresh cache entries approaching expiry, in order to avoid the synchronous database read latency for read-through of an expired entry. Use with caution, and experiment with different values of <refresh-ahead-factor> under both typical and abnormal load scenarios, as bursts of read-ahead activity can tie up the thread and increase, rather than decrease, read latency.

6.1.5 Write-Behind

Cache updates are recorded in a queue per cache per member; a single thread per cache per member reads this queue to perform database updates in the background. Operations can be batched through the storeAll method. Write-behind is enabled by setting the <write-delay> configuration element to a non-zero value.

The Write-Behind Queue

The queue contains the keys of the entries that need to be persisted; internally, Coherence annotates the cache entries using the decorations API methods (see ExternalizableHelper in the Coherence API documentation). If a partition moves, or is lost and promoted from backup, the queue state is reconstructed on the new owning member and no updates are lost, though there is the potential for an update to have been committed to the database but not yet recorded as such in the queue, which leads to the update being applied again by the member that becomes the new owner of the partition. Hence the importance of making database updates idempotent.

Set Sizes and Batch Operations


By default, Coherence will batch up to 128 pending writes into a single batch for a call to storeAll, though this number can be changed using the <write-max-batch-size> configuration element (since version 3.7.1). Precisely which elements will be written at any given time is governed by the <write-delay> and <write-batch-factor> settings. The documentation on these isn't very clear, so here's how it works with some example settings:
write-delay   write-batch-factor   What gets written

1 second      1.0                  When one update has been pending for one second, all
                                   updates are batched and CacheStore.storeAll is called.
                                   Some cache entries may be written almost as soon as
                                   they are updated.

1 second      0.75                 When one update has been pending for one second, those
                                   updates that are more than a quarter of a second old are
                                   batched and passed to the CacheStore. No cache entry
                                   will be written until it is at least a quarter of a
                                   second old.

1 second      0.0                  Entries are batched together only when they are 1 second
                                   old; each entry will be written about 1 second after it
                                   is updated. There appear to be some vagaries in time
                                   resolution here - you might expect that this would mean
                                   that entries are all individually submitted, but this is
                                   not the case.

Erase Operations
Calls to CacheStore.erase and CacheStore.eraseAll are always synchronous, even if write-behind is enabled. (We've tried several times to persuade Oracle that this is a bug, so far without success. Please bend the ear of your Oracle representative if you concur.)

Sizing a Connection Pool

Your cache operations already have to wait long enough for database I/O to complete; you don't want to compound that by having threads also waiting for a database connection to become available from a pool. If you want to ensure that never happens, how many connections do you need available in a member?
- One connection per cache that has read-ahead enabled
- One connection per cache that has write-behind enabled
- As many connections as there are worker threads in the thread pool of each service that has caches configured with a CacheStore. This is true even if all the caches are configured with read-ahead and write-behind (erase is always synchronous).
For example, a service with eight worker threads whose caches include two with write-behind, one of which also has read-ahead, would need 8 + 2 + 1 = 11 connections to guarantee that no thread ever waits for a connection.

Coalescing Updates
If a cache entry is modified and enters the write-behind queue, then is modified again before being written, there will be only one call to store or storeAll
for that entry. You might say that the updates are coalesced, or that your
CacheStore is given only the latest state of the entry to persist. It is therefore essential not to make any assumption as to whether a store operation
represents an insert or an update.

Error Handling
What happens if the store or storeAll operation throws an exception back to Coherence? Simple enough for write-through: as we've already mentioned, you can choose whether to throw the exception back to the caller or silently discard it. For write-behind there is no such choice; the caller already believes that everything has succeeded and is already at the boarding gate for the next Bozojet flight to Dubrovnik for a weekend drinking lager with its mates.
What happens to the failed entry (or entries, if the exception was thrown from storeAll) depends on the setting of <write-requeue-threshold> - a poorly documented and inaccurately named configuration option:
- If <write-requeue-threshold> is zero (the default), the offending entry (or entries) is discarded.
- If <write-requeue-threshold> is non-zero, the entries are placed back on the write-behind queue and retried one minute later. There does not appear to be any option to configure the retry interval, nor is there any option to limit the number of retries.


There are detailed recommendations and an example of how to most safely deal with exceptions in section 6.4: Error Handling in a CacheStore.

6.1.6 Consistency Model

For an application that has a cache in front of a persistent store, there are a number of factors to consider in maintaining consistency between the two.
Synchronous or Asynchronous Operation
Synchronous read-through and write-through are the simplest case. The persistent store is kept in line with any changes made to the cache. If the data may be modified by another path, i.e., not through the cache, then the cached data may be stale. This is most simply managed by setting an appropriate expiry time for the cached data. If no acceptable trade-off between cache refresh rate and age of data exists, then the cache must be more actively managed. Either the processes that update the database must be modified to do so through the cache, or the database must be actively monitored for changes and these propagated to the cache, e.g. by calling Java cache update code from within an Oracle trigger, feeding a Sybase replication stream to a cache update process, or making use of the Oracle Coherence GoldenGate HotCache.
Write-behind introduces additional complications:
- Persisted data will be stale by an indeterminate time dependent on the size of the write-behind queue, the database connection throughput, and the <write-delay> setting.
- Lost partitions in the cluster will result in lost database updates, as the contents of the write-behind queue for those partitions are also lost. The system management strategy must include a mechanism for recovering from this situation.
- As discussed in section 6.1.5: Error Handling above, exceptions occurring during database updates may result in inconsistency between cache and database.
- There is no transactional consistency between database updates for related cache updates. There is no transactional consistency between cache updates to start with unless using the transaction service (for which we can't use a CacheStore) or partition-local transactions. We consider a solution to this last in section 7.4: Transactional Persistence.

Full or Partial Cache


The decision as to whether our cache will contain at all times the full set of data in the persistent store has important implications for the design and management of the application. The full-store approach allows some potentially useful optimisations: we can stub the load and loadAll operations and avoid database calls, on the basis that a cache miss always represents an absent key. With a partial set, we can mitigate the cost of repeated access to the same missing keys by configuring a cache-miss cache.
Holding the full data set in the cache is a prerequisite for supporting correct filter queries.
One disadvantage of holding the full set is a loss of tolerance to lost data. When we're building around a partial set, a lost partition incurs additional overhead as the resulting cache misses cause read-through of the lost data, but the correctness of the results is not impacted for key-based operations.
Another disadvantage is the need to ensure the cache is completely primed at startup before providing application services. More on cache priming in section 6.5: Priming Caches.

6.2 A JDBC CacheStore

Objective
To demonstrate best practice in persisting and retrieving data efficiently from a relational database
Prerequisites
An understanding of SQL, JDBC, and the Spring framework's database support. A basic understanding of the typical architecture of an enterprise relational database server, with its layers of disk, network, and CPU, is assumed rather than explained here.
Code examples
Source code is in package org.cohbook.persistence.modelcachestore of the persistence module of the downloadable examples


Dependencies
To simplify the examples, we use Spring's JdbcTemplate and related classes. The example code uses the h2 database for tests, though it's easy enough to adapt to other databases by using the appropriate driver and modifying the SQL.
To minimise latency and maximise throughput in our Coherence cluster, we must first try to avoid unnecessary database access. Once that is done, we must look at how to make what database access remains as efficient as possible.

6.2.1 The Example Table

We can illustrate the key points using the simplest possible cache, storing
strings as keys and values, and persisting to a simple example table:
CREATE TABLE EXAMPLE_TABLE (
  KEY VARCHAR(10) NOT NULL PRIMARY KEY,
  VALUE VARCHAR(100) NOT NULL
)

6.2.2 Loading

The implementation of the load method is trivially simple:


public Object load(Object key) {
    return jdbcTemplate.queryForObject(
        "SELECT VALUE FROM EXAMPLE_TABLE WHERE KEY = ?",
        new Object[] { key }, String.class);
}

where jdbcTemplate is an instance of a Spring JdbcTemplate. The loadAll method is only slightly more interesting:
public Map loadAll(Collection keys) {
    final Map<String, String> result = new HashMap<String, String>();
    RowMapper<String[]> rowmapper = new RowMapper<String[]>() {
        @Override
        public String[] mapRow(ResultSet rs, int rowNum) throws SQLException {
            result.put(rs.getString(1), rs.getString(2));
            return null;
        }
    };
    Map<String, Object> params = new HashMap<>();
    params.put("keys", keys);
    npjdbcTemplate.query(
        "SELECT KEY, VALUE FROM EXAMPLE_TABLE WHERE KEY in (:keys)",
        params, rowmapper);
    return result;
}


The important point here is to choose the most efficient means of querying many rows; for many databases, an IN clause on a key column is as efficient as you'll get. Note that we use NamedParameterJdbcTemplate here, as the ? placeholder syntax doesn't as easily support collections.
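Neither template is constructed in the snippets above. A minimal sketch of how both might be created from a single DataSource, assuming the constructor signature used later by CacheStoreFactory (new SpringJdbcCacheStore(dataSource)); the field names jdbcTemplate and npjdbcTemplate are the ones the examples assume, and the downloadable class may differ in detail:

public class SpringJdbcCacheStore implements CacheStore {
    private final JdbcOperations jdbcTemplate;
    private final NamedParameterJdbcOperations npjdbcTemplate;

    public SpringJdbcCacheStore(DataSource dataSource) {
        // Both templates share the same DataSource; the named-parameter variant
        // is used wherever an IN (:keys) clause is needed.
        this.jdbcTemplate = new JdbcTemplate(dataSource);
        this.npjdbcTemplate = new NamedParameterJdbcTemplate(dataSource);
    }
    // load, loadAll, store, storeAll, erase and eraseAll as shown in this section
}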

6.2.3 Erasing

Even simpler than loading; the same points arise, so here's the code:
public void erase(Object key) {
    jdbcTemplate.update(
        "DELETE FROM EXAMPLE_TABLE WHERE KEY = ?", key);
}

public void eraseAll(Collection keys) {
    Map<String, Object> params = new HashMap<>();
    params.put("keys", keys);
    npjdbcTemplate.update(
        "DELETE FROM EXAMPLE_TABLE WHERE KEY IN (:keys)", params);
}

6.2.4 Updating

We have often been asked, "How do I find the value that was in the cache before the change was made, so I can tell whether to insert into or update the database?" There are two possible answers to that question. The wrong answer is "use a BinaryStore" (which we'll look at later in section 6.6: Persist Without Deserialising). The correct answer is "don't".
In general terms, it's better to apply changes to a database based on what was previously in the database, rather than on a cached value which hopefully, so long as nothing has gone wrong and various other conditions are fulfilled, should, with a bit of luck and a following wind, be the same as what was in the database.
Here are some situations in which an insert/update decision based on the previous cache value will be wrong:
- The old cached state was stale, missing a recent database change.
- A member failed after the CacheStore.store method completed but before the write-behind queue was updated; the member owning the promoted backup partition applies the update again.
- A previous database update failed, perhaps because the old cache value caused a constraint violation (more on exceptions later in section 6.4: Error Handling in a CacheStore).
The need to implement database updates in an idempotent manner is orthogonal to the concerns around use of database transactions. Where a cache update maps to a single-row update in a database, transactions add little, if any, value in a CacheStore and add latency, in that an extra database call is usually required to perform the commit. The choice of whether or not to use transactions should be based on the desired behaviour if database errors occur. More on that topic later in section 6.4: Error Handling in a CacheStore.
Most databases have an upsert syntax that will allow a row to be updated or created atomically whether or not it previously existed. We strongly recommend that your CacheStore.store implementations always use this approach. For our h2 example, this is very simple:
public void store(Object key, Object value) {
    jdbcTemplate.update(
        "MERGE INTO EXAMPLE_TABLE VALUES (?, ?)",
        key, value);
}

The equivalent SQL syntax for Oracle is a little more complex:
MERGE INTO EXAMPLE_TABLE USING DUAL
  ON (KEY = ?)
WHEN MATCHED THEN
  UPDATE SET VALUE = ?
WHEN NOT MATCHED THEN
  INSERT (KEY, VALUE) VALUES (?, ?)
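Because the Oracle statement repeats the key and value placeholders, a store implementation using it has to bind four parameters rather than two. A minimal sketch, assuming the statement above is held in a constant we have called ORACLE_MERGE_SQL:

public void store(Object key, Object value) {
    // Bind order follows the statement: ON clause key, UPDATE value,
    // then key and value again for the INSERT branch.
    jdbcTemplate.update(ORACLE_MERGE_SQL, key, value, key, value);
}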

Consult your database documentation, or someone who really understands the product, and test the performance of the possible approaches. The objective is to construct an efficient, atomic, idempotent request to the database.
For loadAll and eraseAll, we were able to provide a single statement that worked with all the keys, though that would not be so easy with a compound key, and it certainly isn't easy for our upsert operation. The most efficient solution is to use a JDBC batch:
public void storeAll(Map map) {
    List<Object[]> batchValues = new ArrayList<>(map.size());
    for (Map.Entry<String, String> entry :
            ((Map<String, String>) map).entrySet()) {
        batchValues.add(new Object[] {
            entry.getKey(), entry.getValue() });
    }
    jdbcTemplate.batchUpdate(
        "MERGE INTO EXAMPLE_TABLE VALUES (?, ?)",
        batchValues);
}

6.2.5 Testing with JDBC and Littlegrid

For brevity, we omit the full text of the unit tests for all of the code developed in this chapter; it can all be found in the downloadable source code. Here is a brief description of how we write self-contained unit tests for a cluster using JDBC with an in-memory database. There must be a single instance of the database shared by all of the cluster members, each of which runs in its own class loader. Set up a TCP server for the database:
private static final String DBURL = "jdbc:h2:tcp://localhost/mem:test;DB_CLOSE_DELAY=-1";
private static final String TABLESQL = "CREATE TABLE EXAMPLE_TABLE ("
    + "KEY VARCHAR(10) NOT NULL PRIMARY KEY,"
    + "VALUE VARCHAR(100) NOT NULL"
    + ");";

private Server h2Server;
private JdbcOperations jdbcop;

@Before
public void setUp() throws SQLException {
    h2Server = Server.createTcpServer().start();
    DataSource dataSource = new DriverManagerDataSource(DBURL);
    jdbcop = new JdbcTemplate(dataSource);
    jdbcop.execute(TABLESQL);
    // ...
}

With h2, the DB_CLOSE_DELAY flag causes the database to remain open when there are no active connections; we will need to explicitly close it again after the test:
@After
public void tearDown() {
    // ...
    h2Server.shutdown();
}

Now, in this example, when each member constructs its CacheStore it gets the database URL from a system property:
private static final DataSource dataSource = new DriverManagerDataSource(
    System.getProperty("database.url"));

And this can be set while creating the Littlegrid cluster:
memberGroup = ClusterMemberGroupUtils.newBuilder()
    .setStorageEnabledCount(1)
    .setCacheConfiguration(
        "org/cohbook/persistence/cachectrldstore/cache-config.xml")
    .setAdditionalSystemProperty("database.url", DBURL)
    .buildAndConfigureForStorageDisabledClient();

But we find the test fails:


(Wrapped: Failed request execution for DistributedCache service on Member(Id=1, Timestamp=2014-10-01 08:04:51.415, Address=127.0.0.1:22000, MachineId=30438, Location=site:DefaultSite,rack:DefaultRack,machine:DefaultMachine,process:4504, Role=DedicatedStorageEnabledMember)
(Wrapped: Failed to store key="1") Could not get JDBC Connection; nested exception is
java.sql.SQLException: No suitable driver found for jdbc:h2:tcp://localhost/mem:test;DB_CLOSE_DELAY=-1)
org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection;
nested exception is java.sql.SQLException: No suitable driver found for
jdbc:h2:tcp://localhost/mem:test;DB_CLOSE_DELAY=-1
    at com.tangosol.util.Base.ensureRuntimeException(Base.java:289)
    ...
    at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:51)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection;
nested exception is java.sql.SQLException: No suitable driver found for
jdbc:h2:tcp://localhost/mem:test;DB_CLOSE_DELAY=-1
    at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:80)
    at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:575)
    at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:818)
    at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:874)
    at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:882)
    at org.cohbook.persistence.modelcachestore.SpringJdbcCacheStore.store(SpringJdbcCacheStore.java:64)
    at org.cohbook.persistence.cachectrldstore.ControllableCacheStore.store(ControllableCacheStore.java:32)
    at com.tangosol.net.cache.ReadWriteBackingMap$CacheStoreWrapper.storeInternal(ReadWriteBackingMap.java:5930)
    ...
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.onPutRequest(PartitionedCache.CDB:40)
    ... 10 more
Caused by: java.sql.SQLException: No suitable driver found for jdbc:h2:tcp://localhost/mem:test;DB_CLOSE_DELAY=-1
    at java.sql.DriverManager.getConnection(DriverManager.java:596)
    at java.sql.DriverManager.getConnection(DriverManager.java:187)
    at org.springframework.jdbc.datasource.DriverManagerDataSource.getConnectionFromDriverManager(DriverManagerDataSource.java:173)
    at org.springframework.jdbc.datasource.DriverManagerDataSource.getConnectionFromDriver(DriverManagerDataSource.java:164)
    at org.springframework.jdbc.datasource.AbstractDriverBasedDataSource.getConnectionFromDriver(AbstractDriverBasedDataSource.java:153)
    at org.springframework.jdbc.datasource.AbstractDriverBasedDataSource.getConnection(AbstractDriverBasedDataSource.java:119)
    at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:111)
    at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:77)
    ... 24 more

The JDBC driver resolution mechanism is confused by the ClassLoader isolation. The workaround is to tell Littlegrid to exclude the h2 jar from the
ClassLoader classpath:
memberGroup = ClusterMemberGroupUtils.newBuilder()
    .setStorageEnabledCount(1)
    .setCacheConfiguration(
        "org/cohbook/persistence/cachectrldstore/cache-config.xml")
    .setAdditionalSystemProperty("database.url", DBURL)
    .setJarsToExcludeFromClassPath("h2-1.3.172.jar")
    .buildAndConfigureForStorageDisabledClient();

6.3 A Controllable Cache Store

Objective
Enable and disable CacheStore writes under program control so that we can prime a cache without writing back to the database
Prerequisites
Basic familiarity with CacheStore. The example code references the SpringJdbcCacheStore described in section 6.2: A JDBC CacheStore; we also briefly discuss integration with the Spring application context described in section 2.3: Build a CacheFactory with Spring Framework
Code examples
Source code is in the org.cohbook.persistence.controllablecachestore package of the persistence module of the downloadable examples
Dependencies
As well as Coherence, we use JUnit. h2 is used as an example database.
It's the archetypal chicken-and-egg problem: a cache in front of a database table. Writes to the cache must be written through to the database, and at startup we want to load the cache from the database, but everything we write to the cache during startup will then be written back to the database. Generally harmless, but inefficient and slow. To solve this problem we want to switch the CacheStore on and off, disabling write-through during the cache priming phase.

6.3.1 Using Invocation

We can easily add a setEnabled method to our CacheStore implementation; the problem is, how do we call it from the member that will control the priming, given that we must do so in every storage-enabled member of the service? One solution is to use an invocation service to send a switch to each member.
First we must add the control method to our CacheStore. We'll use a decorator pattern for this example, a wrapper that only delegates the store methods when enabled, ControllableCacheStore, shown in listing 6.1. We also need an implementation of Invocable to perform the switch, CacheStoreSwitcher in listing 6.2.

Listing 6.1: ControllableCacheStore.java


public class ControllableCacheStore implements CacheStore {
    private final CacheStore delegate;
    private boolean enabled = false;

    public ControllableCacheStore(CacheStore delegate) {
        this.delegate = delegate;
    }
    public Object load(Object obj) {
        return delegate.load(obj);
    }
    public Map loadAll(Collection collection) {
        return delegate.loadAll(collection);
    }
    public void store(Object obj, Object obj1) {
        if (enabled) {
            delegate.store(obj, obj1);
        }
    }
    public void storeAll(Map map) {
        if (enabled) {
            delegate.storeAll(map);
        }
    }
    public void erase(Object obj) {
        if (enabled) {
            delegate.erase(obj);
        }
    }
    public void eraseAll(Collection collection) {
        if (enabled) {
            delegate.eraseAll(collection);
        }
    }
    public void setEnabled(boolean enabled) {
        this.enabled = enabled;
    }
}

Listing 6.2: CacheStoreSwitcher.java

public class CacheStoreSwitcher implements Invocable, Serializable {

    private final boolean enable;
    private final String cacheStoreName;
    private transient Object result = null;
    private transient ControllableCacheStore cacheStore;

    public CacheStoreSwitcher(String cacheStoreName, boolean enable) {
        this.enable = enable;
        this.cacheStoreName = cacheStoreName;
    }
    public void init(InvocationService invocationservice) {
    }
    public void run() {
        try {
            cacheStore.setEnabled(enable);
        } catch (RuntimeException e) {
            result = e;
        }
    }
    public Object getResult() {
        return result;
    }
}

The missing piece in this puzzle is how we construct the ControllableCacheStore and how we find that instance in the CacheStoreSwitcher. One approach is to construct a utility class with a static factory method:
Listing 6.3: CacheStoreFactory.java
public class CacheStoreFactory {
    private static final Map<String, ControllableCacheStore> cacheStoreMap = new HashMap<>();
    private static final DataSource dataSource = new DriverManagerDataSource(
        System.getProperty("database.url"));

    public static ControllableCacheStore getCacheStore(String cacheStoreName) {
        synchronized (cacheStoreMap) {
            ControllableCacheStore result = cacheStoreMap.get(cacheStoreName);
            if (result == null) {
                CacheStore jc = new SpringJdbcCacheStore(dataSource);
                result = new ControllableCacheStore(jc);
                cacheStoreMap.put(cacheStoreName, result);
            }
            return result;
        }
    }
}

We'll test with a cache configuration that defines a distributed scheme that obtains the CacheStore instance using this factory, and also defines an invocation scheme that we can use to perform the switch:


<?xml version="1.0"?>
<cache-config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://xmlns.oracle.com/coherence/coherence-cache-config"
    xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-cache-config
        coherence-cache-config.xsd">
  <caching-scheme-mapping>
    <cache-mapping>
      <cache-name>test</cache-name>
      <scheme-name>distributedService</scheme-name>
    </cache-mapping>
  </caching-scheme-mapping>
  <caching-schemes>
    <distributed-scheme>
      <scheme-name>distributedService</scheme-name>
      <backing-map-scheme>
        <read-write-backing-map-scheme>
          <internal-cache-scheme>
            <local-scheme/>
          </internal-cache-scheme>
          <cachestore-scheme>
            <class-scheme>
              <class-factory-name>
                org.cohbook.persistence.controllablecachestore.CacheStoreFactory
              </class-factory-name>
              <method-name>getCacheStore</method-name>
              <init-params>
                <init-param>
                  <param-value>testcachestore</param-value>
                </init-param>
              </init-params>
            </class-scheme>
          </cachestore-scheme>
        </read-write-backing-map-scheme>
      </backing-map-scheme>
      <autostart>true</autostart>
    </distributed-scheme>
    <invocation-scheme>
      <scheme-name>invocationScheme</scheme-name>
      <service-name>invocationService</service-name>
      <autostart>true</autostart>
    </invocation-scheme>
  </caching-schemes>
</cache-config>
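To operate the switch from the member that controls priming, the Invocable is submitted to the invocation service defined above. A minimal sketch, assuming the service and cache store names used in this recipe (passing null as the member set asks the service to run the task on all of its members):

InvocationService invocationService =
    (InvocationService) CacheFactory.getService("invocationService");
// Disable write-through on every member before priming...
invocationService.query(new CacheStoreSwitcher("testcachestore", false), null);
// ...prime the cache, then re-enable write-through afterwards.
invocationService.query(new CacheStoreSwitcher("testcachestore", true), null);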

This approach can form a component of a startup process, but caution is needed; there are circumstances where it will fail:
1. A new member joining the cluster will start with the default state; it won't be aware of previous enable/disable switch events, so if you enable by default, you must ensure no new members join while priming with the ControllableCacheStore disabled.
2. The assumption in the design is that priming will not take place while business-as-usual processing is happening. If you wish to re-prime, you'll also need to suspend BAU processing, taking particular care with write-behind - ensure that the queues are drained before disabling the ControllableCacheStore.

6.3.2 Wiring the Invocable with Spring

If we use Spring to instantiate the ControllableCacheStore and follow the approach described in section 2.3: Build a CacheFactory with Spring Framework, then we can dispense with the CacheStoreFactory. Here is the equivalent
Spring bean definition:
<bean id="exampleDataSource"
    class="org.springframework.jdbc.datasource.DriverManagerDataSource">
  <constructor-arg>
    <value>jdbc:h2:mem:test;DB_CLOSE_DELAY=-1</value>
  </constructor-arg>
</bean>
<bean id="testcachestore"
    class="org.cohbook.persistence.controllablecachestore.ControllableCacheStore">
  <constructor-arg>
    <bean class="org.cohbook.persistence.modelcachestore.SpringJdbcCacheStore">
      <constructor-arg>
        <ref bean="exampleDataSource"/>
      </constructor-arg>
    </bean>
  </constructor-arg>
</bean>

The CacheStoreSwitcher can now have the ControllableCacheStore injected:


@DeserialiseAutowire
public class CacheStoreSwitcher implements Invocable, Serializable {
    // ...
    @Autowired
    @Qualifier("testcachestore")
    private transient ControllableCacheStore cacheStore;
    // ...
    @Override
    public void init(InvocationService invocationservice) {
    }

If there are many instances to control, you might instead inject, say, a map bean of name versus instance, or the ApplicationContext itself and use the name to extract the instance. A cleaner solution is to use the @DynamicAutowire annotation described in section 2.3: Build a CacheFactory with Spring Framework.
Listing 6.4: CacheStoreSwitcher.java
private final boolean enable;
private final String cacheStoreName;
private transient Object result = null;

@DynamicAutowire(beanNameProperty = "cacheStoreName")
private transient ControllableCacheStore cacheStore;

public CacheStoreSwitcher(String cacheStoreName, boolean enable) {
    this.enable = enable;
    this.cacheStoreName = cacheStoreName;
}
@Override
public void init(InvocationService invocationservice) {
}

6.3.3 Using A Control Cache

An alternative approach to controlling a CacheStore is to use a separate cache of control information. Consider a cache that has the name of a ControllableCacheStore instance as its key, and a Boolean as its value, true and false indicating that the named instance should be enabled or disabled.
We will decouple the enable status check from the ControllableCacheStore through an interface:
public interface EnablementStatusChecker {
    boolean isEnabled();
}

and
public class ControllableCacheStore implements CacheStore {
    private final CacheStore delegate;
    private final EnablementStatusChecker check;

    public ControllableCacheStore(CacheStore delegate, EnablementStatusChecker check) {
        this.delegate = delegate;
        this.check = check;
    }
    @Override
    public void store(Object obj, Object obj1) {
        if (check.isEnabled()) {
            delegate.store(obj, obj1);
        }
    }
    // etc
}

The implementation, CacheEnablementStatusChecker, performs the lookup in the control cache:
public class CacheEnablementStatusChecker implements EnablementStatusChecker {
    public static final String CONTROLCACHE = "control-cache";
    private final String key;

    public CacheEnablementStatusChecker(String key) {
        this.key = key;
    }
    @Override
    public boolean isEnabled() {
        Boolean enabled = (Boolean) CacheFactory.getCache(CONTROLCACHE).get(key);
        return enabled == null ? false : enabled;
    }
}
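With this in place, the member controlling the priming no longer needs an invocation service; it simply writes to the control cache, and every storage-enabled member picks up the new state on its next CacheStore call. A minimal sketch, using the cache and key names assumed above:

NamedCache control = CacheFactory.getCache("control-cache");
control.put("testcachestore", Boolean.FALSE);   // disable write-through while priming
// ... prime the data cache ...
control.put("testcachestore", Boolean.TRUE);    // re-enable write-through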

We must configure our control cache such that this cache lookup always
happens locally in every storage node. This is one of the few situations in
which a replicated cache is appropriate: low update rate and little concern
for atomicity of updates.


<cache-config>
  <caching-scheme-mapping>
    ...
    <cache-mapping>
      <cache-name>control-cache</cache-name>
      <scheme-name>replicatedService</scheme-name>
    </cache-mapping>
    ...
  </caching-scheme-mapping>
  <caching-schemes>
    ...
    <replicated-scheme>
      <scheme-name>replicatedService</scheme-name>
    </replicated-scheme>
    ...
  </caching-schemes>
</cache-config>

Finally, we must construct and inject the CacheEnablementStatusChecker when we construct the ControllableCacheStore:
Listing 6.5: CacheStoreFactory.java
public static ControllableCacheStore getCacheStore(String cacheStoreName) {
    synchronized (cacheStoreMap) {
        ControllableCacheStore result = cacheStoreMap.get(cacheStoreName);
        if (result == null) {
            CacheStore jc = new SpringJdbcCacheStore(dataSource);
            result = new ControllableCacheStore(jc,
                new CacheEnablementStatusChecker(cacheStoreName));
            cacheStoreMap.put(cacheStoreName, result);
        }
        return result;
    }
}

6.3.4 Wiring the CacheStore with Spring

We can eliminate the need for the CacheStoreFactory by using the approach outlined in section 2.3: Build a CacheFactory with Spring Framework.
Caution is advisable in any situation where one service depends on another, especially as members start and join a cluster - it's easy to create race conditions and circular dependencies, hence the advice in section 2.3: Build a CacheFactory with Spring Framework to impose an ordered startup sequence, constructing Spring beans in advance. Key to that is avoiding any calls to Coherence API methods while instantiating objects. We must ensure that testing the control cache does not occur until the cluster has been started. This we achieve by modifying the CacheEnablementStatusChecker to block until the cluster has started, using the Spring SmartLifecycle as described in subsection 2.3.2: Preventing Premature Cluster Startup:
public class CacheEnablementStatusChecker implements EnablementStatusChecker, SmartLifecycle {
    public static final String CONTROLCACHE = "control-cache";
    private final String key;
    private final CountDownLatch latch = new CountDownLatch(1);

    public CacheEnablementStatusChecker(String key) {
        this.key = key;
    }
    @Override
    public boolean isEnabled() {
        try {
            latch.await();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        Boolean enabled = (Boolean) CacheFactory.getCache(CONTROLCACHE).get(key);
        return enabled == null ? false : enabled;
    }
    @Override
    public void start() {
        latch.countDown();
    }
    @Override
    public void stop() {
    }
    @Override
    public boolean isRunning() {
        return latch.getCount() == 0;
    }
    @Override
    public int getPhase() {
        return LifecycleValidatingCacheFactoryBuilder.AFTER_CLUSTER_PHASE;
    }
    @Override
    public boolean isAutoStartup() {
        return true;
    }
    @Override
    public void stop(Runnable callback) {
    }
}


Finally, we wire it all together:


<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:context="http://www.springframework.org/schema/context"
    xsi:schemaLocation="http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans.xsd
        http://www.springframework.org/schema/context
        http://www.springframework.org/schema/context/spring-context.xsd">
  <context:property-placeholder/>
  <bean id="cacheEnablementStatusChecker"
      class="org.cohbook.persistence.springcontrollablecachestore.CacheEnablementStatusChecker">
    <constructor-arg value="testcachestore"/>
  </bean>
  <bean id="dataSource"
      class="org.springframework.jdbc.datasource.DriverManagerDataSource">
    <constructor-arg>
      <value>${database.url}</value>
    </constructor-arg>
  </bean>
  <bean id="jdbcCacheStore"
      class="org.cohbook.persistence.modelcachestore.SpringJdbcCacheStore">
    <constructor-arg>
      <ref bean="dataSource"/>
    </constructor-arg>
  </bean>
  <bean id="testcachestore"
      class="org.cohbook.persistence.cachectrldstore.ControllableCacheStore">
    <constructor-arg>
      <ref bean="jdbcCacheStore"/>
    </constructor-arg>
    <constructor-arg>
      <ref bean="cacheEnablementStatusChecker"/>
    </constructor-arg>
  </bean>
  <bean class="org.cohbook.configuration.spring.LifecycleValidatingCacheFactoryBuilder.BuilderLifeCycle"/>
</beans>

6.3.5 Decorated Values Anti-pattern

One pattern that we have seen independently invented by various project teams in different organisations at different times is the persist flag on the value object, either as a direct property of the value object's class or as some decoration or encapsulation of it. The theory is that when priming, one sets the flag to the "don't persist" value, and during normal operation to "persist". The CacheStore implementation examines this flag to determine whether or not to store.
We cannot advise sufficiently strongly against pursuing this strategy. There are a number of flaws:
1. It is ugly, contaminating your business-level objects with low-level control information.
2. It is susceptible to error - what action do you take if you use an EntryProcessor to modify a value before it has been persisted?
3. It is unreliable. With write-behind, successive updates may be coalesced, possibly losing the desired state of the persist flag.
4. The CacheStore cannot modify the value to indicate that it has been written, so it is not possible to mitigate the above problems by detecting whether the entry has been stored. Although a BinaryEntryStore may make modifications to the entry it has been passed, Coherence will only update the cache with the modified entry on a best-efforts basis.

6.4 Error Handling in a CacheStore

Objective
To understand the behaviour of Coherence when exceptions occur while storing data, and to describe some best practices for dealing with them
What happens when a CacheStore or CacheLoader implementation throws an exception? In the case of a single synchronous operation, the default answer is simple: the cache operation fails and the exception is thrown back to the caller. It is possible, by setting the rollback-cachestore-failures element to false in the read-write-backing-map-scheme, to have the cache operation succeed and no error thrown back to the caller, but valid use-cases for this option are uncommon.
The situation is more complex for asynchronous (read-ahead, write-behind) configurations; an exception will result in a discrepancy between cache and database for one or more entries. So, what can you do about it?

6.4.1 Mitigating CacheStore Failure Problems

Prevention Is Better Than Cure


The first rule is to minimise the likelihood of this happening. Once you choose to use asynchronous database updates, you have also passed responsibility for data validation to the application, rather than the database. Make your database design as tolerant as possible:
- no constraints, including relational integrity constraints
- all columns large enough to accommodate the largest values they may receive
- allow nulls unless they really cannot occur.

If using an existing schema that is beyond your application's control, you must ensure that any such constraints are checked in your domain object before writing to the cache, perhaps in a trigger or interceptor.

Know When There Is A Problem


Use of asynchronous updates demands an adequate monitoring solution to raise alerts when problems occur, and the procedures to ensure that alerts are acted upon and the problem resolved. You can choose to raise alerts by log-scraping, or by using JMX to monitor the storeFailures attribute of the cache MBean, but you'll need to refer to logs to find the caches and keys affected (you will be logging those, right?).

Have A Way Of Fixing The Problem


You've raised an alert, your procedures are all in place, and your diligent first-line support analyst has spoken to the DBA and found and resolved the underlying cause of the problem, but you still have a discrepancy between cache and database. Short of shutting down and restarting the whole system, how do you bring everything back into line? This might be as simple as using the existing interfaces to resubmit a piece of work, or it may require special coding to reload particular keys or caches, perhaps via a JMX MBean. The important thing is that, once you have decided to use an asynchronous CacheStore, you have considered the consequences and mitigation strategy for when (not if) things go wrong.

6.4.2 Handling Exceptions In A CacheStore

Recall from the introduction to this chapter the way that Coherence handles exceptions thrown from a CacheStore: the setting of write-requeue-threshold determines whether or not failures are retried. Some exceptions may arise from situations that are temporary or transient; we would like to set a non-zero write-requeue-threshold so that these can be requeued for a later retry. Some errors will not succeed on retry - a value constraint violation, for example. If we requeue those, they will retry and fail for ever, or at least until either the cluster is shut down or the cache entry is removed or updated with a valid value. We might consider distinguishing between those errors that are worth retrying and those that we'll simply raise alerts for and discard. We will create a decorator around our SpringJdbcCacheStore to manage any exceptions it throws.
public class ExceptionHandlingCacheStore implements CacheStore {
    private static final Logger LOG =
        LoggerFactory.getLogger(ExceptionHandlingCacheStore.class);
    private final CacheStore delegate;

    public ExceptionHandlingCacheStore(CacheStore delegate) {
        this.delegate = delegate;
    }
    public Object load(Object obj) {
        return delegate.load(obj);
    }
    public Map loadAll(Collection collection) {
        return delegate.loadAll(collection);
    }
    // store and erase methods
}

Spring defines a useful exception hierarchy from its DataAccessException type. There are three immediate subtypes that all others subclass:
- RecoverableDataAccessException
- TransientDataAccessException
- NonTransientDataAccessException
It is a reasonable starting assumption that the first two are worth retrying, but the last (and any other RuntimeException) is not, though we'll revisit that assumption later. We might therefore write a store method like this:
public void store(Object key, Object value) {
    try {
        delegate.store(key, value);
    } catch (RecoverableDataAccessException | TransientDataAccessException ex) {
        throw ex;
    } catch (RuntimeException ex) {
        handleNonTransientFailure(key, value, ex);
    }
}

protected void handleNonTransientFailure(Object key, Object value, Exception exception) {
    LOG.error("failed to store " + key + " : " + value, exception);
}


so that for any error not worth retrying, we raise an alert (of course, we've a log-scraping monitor in place) and swallow the exception, so Coherence won't retry it. Any retryable or transient exception gets thrown back to Coherence for the entry to be re-queued. For some error conditions, an immediate retry may succeed, so it may be worth putting a retry loop in:
private static final int MAX_RETRY_COUNT = 3;

public void store(Object key, Object value) {
    try {
        storeWithRetry(key, value);
    } catch (RecoverableDataAccessException | TransientDataAccessException ex) {
        throw ex;
    } catch (RuntimeException ex) {
        handleNonTransientFailure(key, value, ex);
    }
}

private void storeWithRetry(Object key, Object value) {
    int retryCount = 0;
    RuntimeException lastException = null;
    while (retryCount < MAX_RETRY_COUNT) {
        try {
            delegate.store(key, value);
            return;
        } catch (RecoverableDataAccessException | TransientDataAccessException ex) {
            lastException = ex;
            retryCount++;
        }
    }
    throw lastException;
}

So far, so good. We could use the same approach for the storeAll method, but there is an added complication: if we've submitted a batch of entries and catch an exception, we don't know which entry caused it. Also, some of the database updates may have been applied successfully before the exception was thrown (we aren't here running the batch update in a transaction). For a retryable exception, we can retry the whole batch, but it might be better to at least persist those entries that are good. For a non-transient exception, we should split the batch and process each entry separately. For this to succeed, we need either to perform the batch update in a transaction, or better, follow the advice in section 6.2: A JDBC CacheStore and make the updates idempotent so we can apply them more than once.


public void storeAll(Map rawmap) {
    Map<Object, Object> map = rawmap;
    try {
        storeAllWithRetry(map);
    } catch (RuntimeException ex) {
        handleBatchFailure(map);
    }
}

private void storeAllWithRetry(Map<Object, Object> map) {
    int retryCount = 0;
    RuntimeException lastException = null;
    while (retryCount < MAX_RETRY_COUNT) {
        try {
            delegate.storeAll(map);
            return;
        } catch (RecoverableDataAccessException | TransientDataAccessException ex) {
            lastException = ex;
            retryCount++;
        }
    }
    throw lastException;
}

private void handleBatchFailure(Map<Object, Object> map) {
    RuntimeException lastException = null;
    Iterator<Map.Entry<Object, Object>> iterator = map.entrySet().iterator();
    while (iterator.hasNext()) {
        Map.Entry<Object, Object> entry = iterator.next();
        try {
            store(entry.getKey(), entry.getValue());
            iterator.remove();
        } catch (RecoverableDataAccessException | TransientDataAccessException ex) {
            lastException = ex;
        } catch (RuntimeException ex) {
            handleNonTransientFailure(entry.getKey(), entry.getValue(), ex);
            iterator.remove();
        }
    }
    if (lastException != null) {
        throw lastException;
    }
}

Even though we've updated the database with everything that did succeed, and logged the details of all those that did not, Coherence would still re-queue the entire batch. Referring to the API doc for storeAll, we see that we can selectively resubmit by removing those entries that we don't wish to resubmit from the input map, achieved by the iterator.remove() statement above. We remove both the entries that succeeded and those that we deem not worth retrying, so only entries with transient and retryable exceptions remain in the map.

6.4.3 Notes on Categorising Exceptions

We've used the Spring exception categorisation here as an example, but no such scheme will be perfect out of the box. For example, an Oracle database unable to extend a tablespace will return error code ORA-01653, which Spring will map to an UncategorizedDataAccessException, a subclass of NonTransientDataAccessException. You might consider this an exception worth retrying - an alert DBA might extend the database or purge old data to make space. Spring does provide mechanisms for changing the exception mapping. The point remains that there will be those exceptions that you wish to retry, those you wish to log and discard, and those that you haven't thought of (uncategorised). It is down to the requirements of your application whether you consider it safer to discard these, or to retry indefinitely (until cluster shutdown, or until the cache entry is modified or deleted).

6.4.4 Limiting the Number of Retries

Coherence has no option to limit the number of times that a failing update will be re-queued, and it isn't really practical to do so yourself within the CacheStore implementation, as we have no way of updating the cache entry or its decorations. The simplest approach would be to timestamp entries as they are created or updated, perhaps with a trigger. We could then implement our CacheStore to log and discard exceptions for entries over a certain age.
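As a sketch of that idea, suppose the value class carries a last-updated timestamp; the TimestampedValue type, its getLastUpdated() method, and the ten-minute limit below are all assumptions for illustration. The store method from earlier could then stop re-queueing entries that have been failing for too long:

private static final long MAX_RETRY_AGE_MILLIS = 10 * 60 * 1000; // assumed limit

public void store(Object key, Object value) {
    try {
        storeWithRetry(key, value);
    } catch (RecoverableDataAccessException | TransientDataAccessException ex) {
        long age = System.currentTimeMillis()
            - ((TimestampedValue) value).getLastUpdated();   // hypothetical value type
        if (age > MAX_RETRY_AGE_MILLIS) {
            handleNonTransientFailure(key, value, ex);       // log, alert and discard
        } else {
            throw ex;                                        // re-queue for another attempt
        }
    } catch (RuntimeException ex) {
        handleNonTransientFailure(key, value, ex);
    }
}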
A BinaryStore could be used to update the entry with a retry count in a custom decoration each time it fails, until it reaches the limit, though the best-efforts update policy implemented by Coherence means that the count may not always be updated.

6.5 Priming Caches

Objective
A discussion of methods of priming a cache on cluster startup, culminating in the most efficient method we have found to date
Prerequisites
An understanding of SQL, JDBC, and Coherence partitions
Often, the design of an application requires that data must be pre-loaded into a cache before it can be used. Even where read-through is an option, the latency incurred by the first reads may be unacceptable. It is usually more efficient to bulk-load the data before starting the application, typically from a relational database - which we'll use as our example - though similar considerations apply to other data sources.
First, some ways not to do it:
- Except for small, maybe replicated, caches, don't read the entire data set into memory and then call NamedCache.putAll(). You are presumably using a clustered cache because there is too much data to accommodate in a single JVM. Most of us have made this mistake or something similar at least once, so don't feel too bad about it.
- Don't read all the keys into memory and then iterate over them doing a NamedCache.get(). This combines all the disadvantages of the cache and the database - we're single-threading the whole process and then performing a separate database read for every item. You might as well not bother and just let the application load the data on demand.
A single JDBC query, iterating over the result set some number of rows at a time with batched calls to NamedCache.putAll(), is a reasonable starting point. We stream the data from the data source and take advantage of some degree of parallelism in the cache updates.
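A minimal sketch of that starting point, assuming the EXAMPLE_TABLE schema and Spring JdbcTemplate used earlier in this chapter, a cache name of test, and a batch size picked to suit your value sizes:

final NamedCache cache = CacheFactory.getCache("test");
final int batchSize = 1000;                        // assumed; tune to your value sizes
final Map<String, String> buffer = new HashMap<>();

jdbcTemplate.query("SELECT KEY, VALUE FROM EXAMPLE_TABLE", new RowCallbackHandler() {
    @Override
    public void processRow(ResultSet rs) throws SQLException {
        buffer.put(rs.getString(1), rs.getString(2));
        if (buffer.size() >= batchSize) {
            cache.putAll(buffer);                  // one network batch per thousand rows
            buffer.clear();
        }
    }
});
if (!buffer.isEmpty()) {
    cache.putAll(buffer);                          // flush the final partial batch
}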
If you can reasonably partition the data into roughly equal tranches with
suitable WHERE clauses, you can speed up the load by performing several
database reads in parallel, perhaps executing these on different hosts on
the cluster.
To really minimise network I/O while priming the cache, we would ideally like to load the data directly from the source into the storage node that owns the partition. We can do this if we store the partition id in the database, and perform the load from an Invocable. We'll use a simple table to prove the concept:
CREATE TABLE EXAMPLE_TABLE (
  PARTITION INTEGER NOT NULL,
  KEY VARCHAR(10) NOT NULL PRIMARY KEY,
  VALUE VARCHAR(100) NOT NULL
);

We create an implementation of Invocable to run on every storage-enabled member that will:
1. find which partitions are owned by that member
2. query the database for all the rows that are owned by those partitions
3. instantiate the objects for those rows and place them in the cache


@Portable
public class CachePrimeInvocable implements Invocable {
    private transient Member member;
    private transient NamedParameterJdbcOperations jdbcTemplate;
    private transient NamedCache cache;
    private transient PartitionSet partitionSet;

    public RowMapper<Object> getRowMapper() {
        return new RowMapper<Object>() {
            public Object mapRow(ResultSet rs, int rowNum) throws SQLException {
                cache.put(rs.getString(1), rs.getString(2));
                return null;
            }
        };
    }
    public void run() {
        PartitionedService service = (PartitionedService) cache.getCacheService();
        partitionSet = service.getOwnedPartitions(member);
        List<Integer> parts = Arrays.asList(ArrayUtils.toObject(partitionSet.toArray()));
        Map<String, Object> parameters = new HashMap<>();
        parameters.put("PARTITIONS", parts);
        jdbcTemplate.query(
            "SELECT KEY, VALUE FROM EXAMPLE_TABLE WHERE PARTITION IN (:PARTITIONS)",
            parameters, getRowMapper());
    }
}

Of the transient member variables, the member object, the JDBC template, and the cache can be injected, perhaps using the pattern described in subsection 2.3.7: Rewiring Deserialised Objects. For the sake of simplicity in our example, we'll just construct them in the init method of the Invocable:
public void init(InvocationService invocationservice) {
    member = invocationservice.getCluster().getLocalMember();
    DataSource dataSource = new DriverManagerDataSource(
        System.getProperty("database.url"));
    jdbcTemplate = new NamedParameterJdbcTemplate(dataSource);
    cache = CacheFactory.getCache(CACHENAME);
}

The last transient member variable, partitionSet, is set by the run() method so that it can be returned by the getResult() method. We need to do this to cope with the situation where a node dies or is killed before the Invocable completes. Section 5.6: Using Invocation Service on All Partitions contains a discussion and examples of how to ensure that an Invocable is executed against all partitions, so we won't repeat that here.
public Object getResult() {
    return partitionSet;
}

If we use an InvocationService to execute this Invocable on every storage-enabled member of the service that owns the cache, then we distribute the loading across the entire cluster whilst also ensuring that the data items are read by and inserted into the cache on the node that owns them. The NamedCache.put call requires no network hops, so will take microseconds rather than milliseconds to update the primary copy, although a network hop is still required to update backup copies. There are edge cases to consider:
- A repartitioning after the owned partition set has been identified may mean that, for affected partitions, the process is less efficient.
- A failed node may mean some partitions are not loaded. As mentioned above, the pattern described in section 5.6: Using Invocation Service on All Partitions deals with that problem.

6.5.1 Mapping Keys to Partitions

All of this presupposes that we can correctly set the owning partition in the database table. How we do this depends on how the cache and key are configured, and the nature of the key class. In the simple case, Coherence calculates the partition id based on the hash of the serialised form of the key and the number of partitions configured for the service. The Coherence Binary class has a method calculateNaturalPartition(int cPartitions) that implements the mapping; this is independent of the Serializer implementation used to construct the binary. For example, given an instance of NamedCache and a key object, we can write:
PartitionedService cacheService = (PartitionedService) cache.getCacheService();
int partitionCount = cacheService.getPartitionCount();
Serializer serialiser = cacheService.getSerializer();
Binary keyBinary = ExternalizableHelper.toBinary(key, serialiser);
int part = keyBinary.calculateNaturalPartition(partitionCount);

This gives us the partition id part in which the key key will be stored. This will work in most common cases, but there are configuration options that affect the way the partition is calculated.
If the key implements KeyPartitioningStrategy.PartitionAwareKey, then we simply call that key's getPartitionId() method to obtain the partition. If the service has a KeyPartitioningStrategy configured, then we call that implementation's getKeyPartition(java.lang.Object oKey) method to obtain the partition.
If a key implements KeyAssociation, we must call that key's getAssociatedKey() method and apply the above strategy to that method's return value rather than to the key itself. Similarly, if the service has a KeyAssociator, we must call that object's getAssociatedKey method and operate on its result.
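A sketch that pulls those rules together; the helper name partitionFor is our own, the ordering of checks follows the description above, and a production version should be verified against the exact strategy classes configured for your service:

public static int partitionFor(PartitionedService service, Object key) {
    // Key association is applied first: the associated key decides the partition.
    Object target = key;
    if (key instanceof KeyAssociation) {
        target = ((KeyAssociation) key).getAssociatedKey();
    } else if (service.getKeyAssociator() != null) {
        Object associated = service.getKeyAssociator().getAssociatedKey(key);
        if (associated != null) {
            target = associated;
        }
    }
    // A PartitionAwareKey states its partition directly.
    if (target instanceof KeyPartitioningStrategy.PartitionAwareKey) {
        return ((KeyPartitioningStrategy.PartitionAwareKey) target).getPartitionId();
    }
    // A configured KeyPartitioningStrategy takes precedence over the natural hash.
    KeyPartitioningStrategy strategy = service.getKeyPartitioningStrategy();
    if (strategy != null) {
        return strategy.getKeyPartition(target);
    }
    // Otherwise fall back to the hash of the key's serialised form.
    Binary binary = ExternalizableHelper.toBinary(target, service.getSerializer());
    return binary.calculateNaturalPartition(service.getPartitionCount());
}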


There are several configuration changes that can invalidate the stored value of partition id:
a change to the partition count of the cache service
changes to a configured KeyAssociator class
changes to a configured KeyPartitioningStrategy class (not to be confused with the partition assignment strategy, which assigns partitions to members)
any change in the key class that results in a change in the serialised value
If any such change is made, then it will be necessary to calculate the new value for each row and update the database before the next prime, as in the sketch below.
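A minimal sketch of such a refresh, assuming the EXAMPLE_TABLE layout used in listing 6.6 below (KEY and PARTITION columns), the simple natural-partition case, and that jdbcop is a Spring JdbcOperations as in that test:

PartitionedService cacheService =
        (PartitionedService) CacheFactory.getCache("testCache").getCacheService();
int partitionCount = cacheService.getPartitionCount();
Serializer serialiser = cacheService.getSerializer();
// Recompute and rewrite the PARTITION column for every row
List<String> keys = jdbcop.queryForList("SELECT KEY FROM EXAMPLE_TABLE", String.class);
for (String key : keys) {
    int part = ExternalizableHelper.toBinary(key, serialiser)
            .calculateNaturalPartition(partitionCount);
    jdbcop.update("UPDATE EXAMPLE_TABLE SET PARTITION = ? WHERE KEY = ?", part, key);
}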

6.5.2 Testing the Prime

We can test this behaviour using Littlegrid and an in-memory database as in listing 6.6. As written, this does not guarantee that the keys are actually loaded in the correct member, only that all are correctly loaded. The CachePrimeInvocable itself will still provide the correct result if the stored partition ids are incorrect (provided that they are all within the valid range for the service), but without the efficiency benefit. We could choose to verify during loading that we are only loading rows in the correct member by changing the RowMapper implementation to that in listing 6.7.

Listing 6.6: Testing cache priming
public class CachePrimeInvocableTest {
    private static final String DBURL = "jdbc:h2:tcp://localhost/mem:test;DB_CLOSE_DELAY=-1";
    private static final String TABLESQL = "CREATE TABLE EXAMPLE_TABLE ("
            + " PARTITION INTEGER NOT NULL,"
            + " KEY VARCHAR(10) NOT NULL PRIMARY KEY,"
            + " VALUE VARCHAR(100) NOT NULL"
            + ");";
    private static final String INSERT_STATEMENT =
            "INSERT INTO EXAMPLE_TABLE VALUES (?, ?, ?);";
    private JdbcOperations jdbcop;
    private ClusterMemberGroup memberGroup;
    private Server h2Server;

    @Before
    public void setUp() throws SQLException {
        h2Server = Server.createTcpServer().start();
        DataSource dataSource = new DriverManagerDataSource(DBURL);
        jdbcop = new JdbcTemplate(dataSource);
        jdbcop.execute(TABLESQL);
        memberGroup = ClusterMemberGroupUtils.newBuilder()
                .setStorageEnabledCount(4)
                .setCacheConfiguration(
                        "org/cohbook/persistence/distprime/cache-config.xml")
                .setAdditionalSystemProperty(
                        "tangosol.coherence.log.level", 6)
                .setAdditionalSystemProperty("database.url", DBURL)
                .setJarsToExcludeFromClassPath("h2-1.3.172.jar")
                .buildAndConfigureForStorageDisabledClient();
        NamedCache cache = CacheFactory.getCache("testCache");
        PartitionedService cacheService = (PartitionedService) cache.getCacheService();
        int partitionCount = cacheService.getPartitionCount();
        Serializer serialiser = cacheService.getSerializer();
        for (int i = 0; i < 100; i++) {
            String key = Integer.valueOf(i).toString();
            String value = "value-" + key;
            int part = ExternalizableHelper.toBinary(key, serialiser)
                    .calculateNaturalPartition(partitionCount);
            jdbcop.update(INSERT_STATEMENT, part, key, value);
        }
    }

    @Test
    public void testRun() throws Exception {
        CachePrimeInvocable primer = new CachePrimeInvocable();
        InvocationService invocationService =
                (InvocationService) CacheFactory.getService("invocationService");
        NamedCache cache = CacheFactory.getCache("testCache");
        PartitionedService cacheService = (PartitionedService) cache.getCacheService();
        Map results = invocationService.query(
                primer, cacheService.getOwnershipEnabledMembers());
        for (Object result : results.values()) {
            Assert.assertNotNull(result);
        }
        Assert.assertEquals(100, cache.size());
    }
}

Listing 6.7: Verifying the partition while loading
public RowMapper<Object> getRowMapper() {
    return new RowMapper<Object>() {
        private final KeyPartitioningStrategy keystrat =
                ((PartitionedService) cache.getCacheService())
                        .getKeyPartitioningStrategy();

        public Object mapRow(ResultSet rs, int rowNum)
                throws SQLException {
            Object key = rs.getString(1);
            LOG.debug("put key " + key + " from member " + member.getId());
            int rowpart = keystrat.getKeyPartition(key);
            if (!partitionSet.contains(rowpart)) {
                throw new IllegalStateException(
                        "partition " + rowpart + " for key " + key +
                        " is not on member " + member.getId());
            }
            cache.put(key, rs.getString(2));
            return null;
        }
    };
}

6.5.3 Increasing Parallelism

Depending on the performance characteristics of the database, the network, and the machines your cluster is running on, there may be additional benefit in processing several queries and loads in parallel on each host. One way would be to construct a separate Invocable per partition and execute each specifically on the node that owns that partition. Perhaps simpler is to use an ExecutorService within the Invocable to run several queries in parallel. A natural division of the work is one task per partition. We can create a Runnable inner class of the Invocable:
private class Primer implements Runnable {
    private final int partition;

    public Primer(int partition) {
        this.partition = partition;
    }

    public void run() {
        Map<String, Integer> parameters = new HashMap<>();
        parameters.put("PARTITION", Integer.valueOf(partition));
        jdbcTemplate.query(
                "SELECT KEY, VALUE FROM EXAMPLE_TABLE WHERE PARTITION = :PARTITION",
                parameters, getRowMapper());
    }
}
We add another transient member variable to the Invocable which, for simplicity in this example, we'll construct in the init method; in the real world we might inject it.
// ...
private transient ExecutorService executor;

public void init(InvocationService invocationservice) {
    // ...
    executor = Executors.newFixedThreadPool(5);
}
Then in the Invocable.run method, we create, submit, and wait for one task
per partition on the local member:
public void run() {
    PartitionedService service = (PartitionedService) cache.getCacheService();
    partitionSet = service.getOwnedPartitions(member);
    int parts[] = partitionSet.toArray();
    List<Future<?>> futures = new ArrayList<>(parts.length);
    for (int partition : parts) {
        futures.add(executor.submit(new Primer(partition)));
    }
    for (Future<?> future : futures) {
        try {
            future.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
    }
}

6.5.4 Ensuring All Partitions Are Loaded

It may be that processing on one member throws an exception, or that a member dies before reporting the completion status of the Invocable, or that a race with a repartitioning causes some partitions not to be loaded. Ensuring that an Invocable is executed at least once for every partition in a service is a general problem that we cover in section 5.6: Using Invocation Service on All Partitions.

6.5.5 Interaction With CacheStore and CacheLoader

Priming the cache using this technique has no impact on how you might implement a CacheLoader as, by definition, the load and loadAll methods are called from the member that owns the partition. In a CacheStore we will need to calculate and store the partition of each entry we save. Here we can isolate ourselves from the specifics of the partitioning strategy, as we can get the implementation from the underlying service.
public class PartitionAwareCacheStore implements CacheStore {
    private PartitionedService service;

    protected PartitionAwareCacheStore(BackingMapManagerContext bmmc) {
        this.service = (PartitionedService) bmmc.getCacheService();
    }

    public void store(Object key, Object value) {
        int partition = service.getKeyPartitioningStrategy().getKeyPartition(key);
        /* rest of store implementation */
    }
We need to provide the BackingMapManagerContext when constructing an instance of this CacheStore. This is available using the standard macro substitutions in the cache configuration.
<class-name>org.cohbook.persistence.distprime.PartitionAwareCacheStore</class-name>
<init-params>
    <init-param>
        <param-type>com.tangosol.net.BackingMapManagerContext</param-type>
        <param-value>{manager-context}</param-value>
    </init-param>
</init-params>
When priming a cache that has a CacheStore, we will want to disable the store methods during the prime phase using one of the techniques described in section 6.3: A Controllable Cache Store. It may be tempting to enable the CacheStore for a member as the last operation in the Invocable that performs the prime, but if a repartitioning occurs after one member has been primed, that member may receive partitions that have not yet been primed. Better to wait until the cache is completely primed before enabling any CacheStore.
In a BinaryEntryStore we don't need the constructor argument. We can get the service from the passed BinaryEntry:
public class PartitionAwareBinaryEntryStore implements BinaryEntryStore {
    public void store(BinaryEntry binaryentry) {
        PartitionedService service =
                (PartitionedService) binaryentry.getContext().getCacheService();
        int partition = service.getKeyPartitioningStrategy()
                .getKeyPartition(binaryentry.getKey());
        /* rest of store implementation */
    }
In the simple case, where we have not configured a key partitioning strategy
or key association, we can calculate the partition without deserialising the
key:
public void store(BinaryEntry binaryentry) {
    PartitionedService service =
            (PartitionedService) binaryentry.getContext().getCacheService();
    int partition = binaryentry.getBinaryKey()
            .calculateNaturalPartition(service.getPartitionCount());

6.6 Persist Without Deserialising

Objective
Show how to use BinaryEntryStore to persist a cache to a database and load it back without deserialising the data
Prerequisites
An understanding of SQL, and familiarity with Coherence CacheStore and CacheLoader
Code examples
The storebinary package of the persistence module.
Dependencies
The examples use Spring's NamedParameterJdbcTemplate, the h2 database, and JMock

6.6.1 Binary Key And Value

Where a database is used only as the backing store for a Coherence cache, it may not be necessary to flatten the cached object model to a relational model, or it may be complex to do so. It is possible, using a BinaryEntryStore instead of a CacheStore, to extract the serialised form of cache entries, persist it to a database and load it back again without deserialising the object, thus reducing CPU overhead, churn in the new generation space and, potentially, network bandwidth to the database. In the example here we consider a traditional database, but the technique is equally applicable to any other storage technology that can deal with arbitrary byte arrays.
Using h2 as our database, the table definition is:
CREATE TABLE BINTABLE (
    KEY BINARY(10) NOT NULL PRIMARY KEY,
    VALUE BINARY(100) NOT NULL,
    PARTITION INT NOT NULL
);
Both key and value are byte arrays, and we will also store the owning partition so that we can prime efficiently as described in section 6.5: Priming Caches.
Consider the differences between CacheStore and BinaryEntryStore. For the sake of clarity, imagine that these Coherence interfaces were updated with generics; this is how they would look:

Listing 6.8: CacheStore.java
public interface CacheStore<K, V> extends CacheLoader<K, V> {
    void store(K key, V value);
    void storeAll(Map<K, V> map);
    void erase(K key);
    void eraseAll(Collection<K> collection);

    // from CacheLoader...
    V load(K key);
    Map<K, V> loadAll(Collection<K> collection);
}
Listing 6.9: BinaryEntryStore.java

public interface BinaryEntryStore {
    void load(BinaryEntry binaryentry);
    void loadAll(Set<BinaryEntry> set);
    void store(BinaryEntry binaryentry);
    void storeAll(Set<BinaryEntry> set);
    void erase(BinaryEntry binaryentry);
    void eraseAll(Set<BinaryEntry> set);
}
Whereas the CacheStore methods deal in key and value objects, and collections and maps of these, the BinaryEntryStore deals only with BinaryEntry and sets thereof, giving us access to the raw, serialised form of the data and much additional useful context information. These are all void methods; results are returned by directly modifying the BinaryEntry.
To load an entry from the database:
public class ExampleBinaryStore implements BinaryEntryStore {
    private final NamedParameterJdbcOperations jdbcTemplate;

    public ExampleBinaryStore(DataSource dataSource) {
        jdbcTemplate = new NamedParameterJdbcTemplate(dataSource);
    }

    public void load(BinaryEntry binaryentry) {
        Binary binarykey = binaryentry.getBinaryKey();
        byte[] bytesvalue = jdbcTemplate.getJdbcOperations().queryForObject(
                "SELECT VALUE FROM BINTABLE WHERE KEY = ?",
                byte[].class,
                binarykey.toByteArray());
        binaryentry.updateBinaryValue(new Binary(bytesvalue));
    }
To store the entry, we extract the binary key and value from the entry and
then calculate the partition id before calling the database MERGE command to
update the row, remembering the importance of making all of our database
operations idempotent:
public void store(BinaryEntry binaryentry) {
    Binary key = binaryentry.getBinaryKey();
    Binary value = binaryentry.getBinaryValue();
    PartitionedService service =
            (PartitionedService) binaryentry.getContext().getCacheService();
    int partition = key.calculateNaturalPartition(
            service.getPartitionCount());
    jdbcTemplate.getJdbcOperations().update(
            "MERGE INTO BINTABLE VALUES (?, ?, ?)",
            key.toByteArray(),
            value.toByteArray(),
            partition);
}
As described in section 6.5: Priming Caches, this method of calculating a partition from the serialised form is valid only for the default key partitioning strategy and only if no key association is defined. Otherwise it will be necessary to deserialise at least the key:
int partition = service.getKeyPartitioningStrategy()
        .getKeyPartition(binaryentry.getKey());
This example shows how to persist both key and value as binary without deserialising. It may be more appropriate for you to store the key in a human-readable form; getKey() may be used instead of getBinaryKey() to get the deserialised key object. If you would like to store particular fields of the value object in their own database columns without deserialising the entire object, you may be able to do so, depending on the serialisation used. For example, with POF, use a PofExtractor.
private PofExtractor col1Extractor = new PofExtractor(String.class, 23);

public void store(BinaryEntry binaryentry) {
    String field23 = (String) col1Extractor.extractFromEntry(binaryentry);
    // ...
}
Prior to this, nothing in this example is specific to any particular serialisation format.

6.6.2 Character Encoded Keys

Some databases, most notably Oracle, do not play well with binary data as
a primary key or indexed value. For these we can simply encode the binary
key as a string. So the table definition becomes:
CREATE TABLE BINTABLE (
    KEY VARCHAR(40) NOT NULL PRIMARY KEY,
    VALUE BINARY(100) NOT NULL,
    PARTITION INT NOT NULL
);
In the BinaryEntryStore implementation we translate the binary key to and from a String. There are many packages that can be used for this; one option is to use Coherence's own Base64OutputStream and Base64InputStream classes, both in package com.tangosol.io, which have convenient static methods for the conversion. The load method becomes:
public void load(BinaryEntry binaryentry) {
    Binary binarykey = binaryentry.getBinaryKey();
    String encodedKey = new String(Base64OutputStream.encode(binarykey.toByteArray()));
    byte[] bytesvalue = jdbcTemplate.getJdbcOperations().queryForObject(
            SELECT_ONE_SQL,
            byte[].class,
            encodedKey);
    binaryentry.updateBinaryValue(new Binary(bytesvalue));
}
The rest is fairly easy to figure out, but can be seen in full in the example code in org.cohbook.persistence.storebinary.EncodedKeyBinaryStore; a sketch of the corresponding store method follows.
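A sketch of that store method, assuming a MERGE_SQL constant analogous to the binary-key version (MERGE INTO BINTABLE VALUES (?, ?, ?)):

public void store(BinaryEntry binaryentry) {
    Binary binarykey = binaryentry.getBinaryKey();
    String encodedKey = new String(Base64OutputStream.encode(binarykey.toByteArray()));
    PartitionedService service =
            (PartitionedService) binaryentry.getContext().getCacheService();
    int partition = binarykey.calculateNaturalPartition(service.getPartitionCount());
    // Same idempotent MERGE as before, but keyed by the encoded string
    jdbcTemplate.getJdbcOperations().update(
            MERGE_SQL,
            encodedKey,
            binaryentry.getBinaryValue().toByteArray(),
            partition);
}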

Chapter 7

Events

7.1 Introduction

There are many products, commercial and open source, that provide object stores or caches over a wide spectrum of performance, reliability, resilience, and cost characteristics. For many use-cases Coherence will not be the most cost-effective option. But one area where Coherence really delivers more than any of its competitors is in the efficiency and power of its event API. Version 12c introduces a whole new API for events, the unified event model, which should be preferred over the older model where there is a choice, though as yet its coverage is not complete.

7.1.1 A Coherence Event Taxonomy

Pre-12c, created programmatically These event handlers are created by API calls. The events are received in the cluster member where the handler is registered but may originate in other members. They are not superseded by 12c interceptors, though PartitionListener functionality is partially covered by the 12c TransferEvent interceptor.
MapListener
MemberListener
ServiceListener
PartitionListener
Pre-12c, created by configuration These event handlers are created by configuration, and in each case only a single instance can be created against a given cache or service. CacheStore contains functionality not reproduced in the interceptor model, though for some purposes an interceptor may be more appropriate, as we will explore in this chapter. Backing map listeners and triggers are superseded by the new event model and should not be used in new code.
Backing map listener
Trigger
CacheStore
12c Interceptors These are used to respond to events in the node in which the triggering change occurs. EntryProcessorEvent and TransactionEvent in particular add significant new functionality. Unlike programmatically created listeners, interceptors for these events are called in the cluster member where the originating event occurs, so they are not suited to generating notifications to non-storage members or extend clients.
EntryEvent
EntryProcessorEvent
TransferEvent
TransactionEvent
LifecycleEvent
7.2 Have I Lost Data?

Objective
Illustrate how to use a PartitionListener to identify when data has been lost from the cluster
Code examples
In the org.cohbook.events.partitionloss package of the events project.
Dependencies
Tests use Littlegrid
The Coherence service MBean will tell us that, at a given moment, our data is MACHINE-SAFE, NODE-SAFE, or ENDANGERED, i.e. that it would require the loss of more than one machine, a single machine, or a single storage-enabled member for data to be lost from the cluster. What it does not tell us is whether any data has been lost. There are a number of ways of solving this problem:
1. Check the logs. The senior member of a service that has lost data will log a message of the form "Assigned %n1 orphaned primary partitions". You might configure a log-monitoring application like Splunk or Logstash to raise an alert when this happens.
2. Construct a canary cache¹ in the service that is initialised on cluster startup with a single entry in each partition, and periodically check that the number of entries present is the same as the number of partitions (see the sketch after this list).
3. Register a PartitionListener on the service to notify of data loss.
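The check behind option 2 might look like this minimal sketch, assuming a cache named canary that was primed at startup with exactly one entry per partition:

NamedCache canary = CacheFactory.getCache("canary");
PartitionedService service = (PartitionedService) canary.getCacheService();
// If any partition has been lost, its canary entry is gone with it
if (canary.size() < service.getPartitionCount()) {
    // raise an alert: at least one partition has lost its data
}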

7.2.1 A Lost Partition Listener

Many projects I have worked with have independently devised something like this; perhaps it should be a feature of the product. We create an instance of a class that implements PartitionListener and registers an MBean (most simply, itself) with the Coherence registry. First we define the MBean interface²:
public interface LostPartitionListenerMBean {
    int getPartitionsLostCount();
}
Then a class that implements both PartitionListener, to receive the events, and LostPartitionListenerMBean, to publish the count to JMX, is shown in listing 7.1. The ensureRegistered method registers the listener itself with the Coherence management registry so that the MBean is automatically visible through a JMX node.
¹ A recipe that didn't make it into this book.
² Unless you are following the pattern described in section 2.4: Linking Spring and Coherence JMX Support.

Listing 7.1: A lost partition listener
public class LostPartitionListener implements PartitionListener, LostPartitionListenerMBean {
    private int partitionsLostCount;
    private boolean registered = false;

    @Override
    public int getPartitionsLostCount() {
        return partitionsLostCount;
    }

    public void onPartitionEvent(PartitionEvent partitionevent) {
        if (partitionevent.getId() == PartitionEvent.PARTITION_LOST) {
            ensureRegistered(partitionevent.getService()
                    .getInfo().getServiceName());
            partitionsLostCount += partitionevent.getPartitionSet().cardinality();
        }
    }

    private synchronized void ensureRegistered(String servicename) {
        if (registered) {
            return;
        }
        registered = true;
        Registry registry = CacheFactory.getCluster().getManagement();
        String name = "Coherence:type=LostPartitionListener,service=" + servicename;
        registry.register(registry.ensureGlobalName(name), this);
    }
}
Finally, configure the listener for the service in the cache configuration.
<caching-schemes>
    <distributed-scheme>
        <scheme-name>distributedScheme</scheme-name>
        <service-name>exampleDistributedService</service-name>
        <!-- small partition count, zero backups, for testing -->
        <partition-count>13</partition-count>
        <backup-count>0</backup-count>
        <partition-listener>
            <class-name>
                org.cohbook.events.partitionloss.LostPartitionListener
            </class-name>
        </partition-listener>
        <backing-map-scheme>
            <local-scheme/>
        </backing-map-scheme>
        <autostart>true</autostart>
    </distributed-scheme>
</caching-schemes>
In this cache configuration, we've set a small partition count and no backups so that we can easily test the listener. If we create a cluster of two storage nodes and kill one of them, as shown in listing 7.2, we'll lose half (six or seven) of the partitions as we have no backups.

Listing 7.2: Testing the lost partition listener
public class LostPartitionListenerTest {
    private ClusterMemberGroup memberGroup;

    @Before
    public void setUp() throws Exception {
        memberGroup = ClusterMemberGroupUtils.newBuilder()
                .setCacheConfiguration("cache-config.xml")
                .setStorageEnabledCount(2)
                .setJmxMonitorCount(1)
                .buildAndConfigureForStorageDisabledClient();
    }

    @Test
    public void testPartitionListener()
            throws MalformedObjectNameException, AttributeNotFoundException,
            InstanceNotFoundException, MBeanException, ReflectionException,
            InterruptedException {
        NamedCache cache = CacheFactory.getCache("test");
        cache.put(1, "A");
        memberGroup.stopMember(memberGroup.getStartedMemberIds()[0]);
        Thread.sleep(3000);
        ObjectName name = new ObjectName(
                "Coherence:type=LostPartitionListener,"
                + "service=exampleDistributedService,*");
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        Integer plc = 0;
        for (ObjectName mbean : mbs.queryNames(name, null)) {
            plc += (Integer) mbs.getAttribute(mbean, "PartitionsLostCount");
        }
        Assert.assertTrue(plc == 6 || plc == 7);
    }
}
While this is a considerable improvement, it isn't totally foolproof. We can imagine an unstable cluster losing members, resulting in partitions being lost, and then the senior member (whose LostPartitionListener has recorded the loss) itself dying without further partition loss. If all of this happens in the interval between polls by your JMX monitoring solution, the partition loss will remain undetected. It might be more robust to communicate directly with your monitoring infrastructure from the LostPartitionListener, e.g. by emitting a JMX notification, or an SNMP trap, as in the sketch below.
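As an illustration, a variant of the listener that pushes a JMX notification as soon as the loss is detected might look like the following sketch; the class and notification type names are our own, and MBean registration is handled as before:

public class NotifyingLostPartitionListener extends NotificationBroadcasterSupport
        implements PartitionListener {
    private final AtomicLong sequence = new AtomicLong();

    @Override
    public void onPartitionEvent(PartitionEvent partitionevent) {
        if (partitionevent.getId() == PartitionEvent.PARTITION_LOST) {
            int lost = partitionevent.getPartitionSet().cardinality();
            // Push the alert out immediately rather than waiting to be polled
            sendNotification(new Notification(
                    "coherence.partition.lost", this, sequence.incrementAndGet(),
                    lost + " partitions lost on service "
                            + partitionevent.getService().getInfo().getServiceName()));
        }
    }
}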

7.3 Event Storms

Objective
Discussion of the circumstances in which large numbers of events can destabilise the cluster, and strategies for avoiding the problem

A particular system at one of my clients has:
17 machines, each with
9 storage nodes (153 total)
2 proxy nodes (34 total)
170 extend clients (5 per proxy node)
One cache contained about one million entries of 1KB each. The extend clients held near caches of this data with invalidation strategy set to auto, which defaulted to all³ in that version of Coherence.
Because of a production issue, the support team decided to invoke clear() on this cache.
As a consequence, each of the 153 storage nodes tried to send all of its data in the cache to each of the 170 extend clients via the 34 extend nodes, immediately saturating the network and causing growing backlogs in the outgoing queues of both storage and extend nodes. This led to OOM errors in both types of node and brought down the entire cluster. Lessons from the experience:
Be very careful with NamedCache.clear() when you have large numbers of clients listening to large numbers of entries. This applies to near cache, continuous query, or an explicitly registered MapListener.
Be very careful with your choice of invalidation strategy for near caches. Use auto or all only when necessary. Prefer present.

³ In versions up to 3.7.1 auto is equivalent to all. From 12.1.2 it is equivalent to present.
One of the main strengths of Coherence is the reliability and efficiency of its event handling, but it can't work miracles. The network traffic arising from an action will increase approximately in proportion to the product of the number of entries affected by that action, the size of those entries, and the number of listeners. When all of these numbers are large there will be a surge of network activity. Do the arithmetic for any large-scale actions in your system and, if necessary, break those actions down into smaller units and trickle them in, e.g. per partition, as in the sketch below.
Further Reading: https://blogs.oracle.com/OracleCoherence/entry/oracle_coherence_near_cache_best has an excellent write-up on the differences between cache invalidation strategies.

7.4 Transactional Persistence

Objective
Show how to mirror a partition-local transaction update with a single database transaction updating the corresponding database entries
Prerequisites
We build on the partition-local transaction concept explored in section 5.7: Working With Many Caches, adding database persistence to those operations using the domain model developed there. An understanding of the conventional Coherence persistence model as described in chapter 6: Persistence, in particular section 6.4: Error Handling in a CacheStore, may be useful to better understand when this alternative approach might be appropriate.
Code examples
Are in the org.cohbook.events.transaction package of the events module.
Dependencies
Littlegrid, h2, and the gridprocessing module. For simplicity we use Spring's JDBC and transaction support.
Partition-local transactions allow related entries in different caches to be updated atomically so that consistency can be enforced; however, that enforcement cannot pass through to the database using the standard CacheStore model. Each cache is updated separately and there is no mechanism for maintaining transactional integrity between them, whether using write-through or write-behind.
The Coherence 12c unified event model introduces the transaction interceptor, which receives a TransactionEvent on the completion of each partition-local transaction. The TransactionEvent contains the details of every cache entry, in all of the caches, affected by the transaction. We can therefore propagate these changes to the database in a single transaction and ensure that the persisted data is internally consistent.

7.4.1 Synchronous or Asynchronous Operation

A transaction interceptor can be configured to intercept either the COMMITTING or the COMMITTED event.

COMMITTING - a pre-commit interceptor
A COMMITTING event is executed in the service worker thread, before the cache
changes are committed and while cache entries are still locked. An exception
thrown by the interceptor will cause all of the cache updates to be rolled
back. This gives us the best possible guarantee of consistency between the
database and cache, but cache entry locks are held and the worker thread tied
up for the duration of the database transaction. The Oracle documentation
says:
Precommit event types allow event interceptors to modify entries before the entries are committed to a cache. The interceptors are processed synchronously and must not perform long
running operations (such as database access) that could potentially block or slow cache operations. Calls to external resource
must always return as quickly as possible to avoid blocking cache
operations.
But the same could be said of using CacheStore with write-through. I would advise that you be aware of the implications of using COMMITTING, configure the worker thread pool and database connection pool sizes appropriately for the latency and throughput requirements, and design to avoid large transactions or hot entries that might give rise to excessive contention.

COMMITTED - a post-commit interceptor
A COMMITTED event is executed in the service event dispatcher thread and does not begin until after the cache updates are committed and locks released. We therefore avoid the problem of latency and contention for the service worker threads, but lose the guarantee of consistency between database and cache. A failure in the interceptor means that the database update is lost. We do still ensure that the database is internally consistent. Whereas configuring a CacheStore for write-behind persistence uses a thread for each cache, a post-commit transaction interceptor is run in a single event dispatcher thread for the entire service, potentially restricting throughput. You could, within the interceptor, delegate the events to a thread pool for concurrent execution, but care would be needed to ensure that related transactions were not performed concurrently, or even out of order. A striped executor service as described by Heinz Kabutz⁴ might serve, using the partition number as the identity for striping.

7.4.2 Implementation

We implement EventInterceptor, giving the event type, TransactionEvent, as the generic argument. The event contains a collection of cache entries that have been added, updated, or removed by the transaction. Remember that the interceptor is configured per service; if we follow the best practice of minimising the number of services, our interceptor will receive transactions for all caches in the service, some of which we may not wish to persist by this means. Rather than tightly coupling our implementation to many distinct components in our application, we could simply delegate persistence to a separate class, providing a map of implementations by cache name. Our cache-specific persistors will implement this interface⁵:
⁴ See http://www.javaspecialists.eu/archive/Issue206.html and https://github.com/kabutz/striped-executor-service
⁵ If you are persisting binary data as in section 6.6: Persist Without Deserialising, then you might instead define the method void persistAll(Set<BinaryEntry> entrySet).
public interface CachePersistor<K, V> {
    void persistAll(Map<K, V> map);
}
So our event interceptor iterates over the entries updated by the transaction, dividing them up by cache, and passing the entries for each cache to
the appropriate CachePersistor, if one is defined. The actual persistence is
performed within a database transaction, but we only want to pay the cost
of starting that transaction if there actually is anything to persist.
public class TransactionalCachePersistor implements EventInterceptor<TransactionEvent> {
    protected TransactionTemplate transactionTemplate;
    protected Map<String, CachePersistor> cachePersistorMap = new HashMap<>();

    public void onEvent(final TransactionEvent event) {
        final Map<String, Map<Object, Object>> updatesByCache = new HashMap<>();
        for (BinaryEntry entry : event.getEntrySet()) {
            String cacheName = entry.getBackingMapContext().getCacheName();
            if (cachePersistorMap.containsKey(cacheName)) {
                Map<Object, Object> updateMap = updatesByCache.get(cacheName);
                if (updateMap == null) {
                    updateMap = new HashMap<>();
                    updatesByCache.put(cacheName, updateMap);
                }
                updateMap.put(entry.getKey(), entry.getValue());
            }
        }
        if (updatesByCache.size() > 0) {
            transactionTemplate.execute(new TransactionCallback<Object>() {
                public Object doInTransaction(TransactionStatus status) {
                    for (Map.Entry<String, Map<Object, Object>> entry :
                            updatesByCache.entrySet()) {
                        cachePersistorMap.get(entry.getKey())
                                .persistAll(entry.getValue());
                    }
                    return null;
                }
            });
        }
    }
}
We must initialise the transactionTemplate and cachePersistorMap member variables. We could add setter methods and instantiate using Spring and a factory scheme as described in section 2.3: Build a CacheFactory with Spring Framework, but for simplicity in this example we will define a subclass that sets up persistence for the flight and reservation caches we used in section 5.7: Working With Many Caches.
public class FlightReservationTransactionalPersistor extends
        TransactionalCachePersistor {
    public FlightReservationTransactionalPersistor(String dburl) {
        DataSource dataSource = new DriverManagerDataSource(dburl);
        Map<String, CachePersistor> storeMap = new HashMap<>();
        storeMap.put("flight", new FlightStore(dataSource));
        storeMap.put("reservation", new ReservationStore(dataSource));
        cachePersistorMap = storeMap;
        PlatformTransactionManager transactionManager =
                new DataSourceTransactionManager(dataSource);
        transactionTemplate = new TransactionTemplate(transactionManager);
    }
}
Implementations of CachePersistor.persistAll(Map<K, V> map) must cope with a map of entries that represents a mixture of new, updated, and removed entries. Adhering to best practice, we use a MERGE statement so that we need not be concerned by the distinction between new and updated rows. A removed entry will be indicated by a null value in the map. Given a flight table in the database:
CREATE TABLE FLIGHT (
    FLIGHTID INTEGER NOT NULL PRIMARY KEY,
    ORIGIN VARCHAR(100) NOT NULL,
    DESTINATION VARCHAR(100) NOT NULL,
    DEPARTURETIME DATETIME NOT NULL,
    AVAILABLEBUSINESS INTEGER NOT NULL,
    AVAILABLEECONOMY INTEGER NOT NULL
)
We can write a reasonably efficient CachePersistor by iterating over the map of changed entries, segregating the updates and removes, then performing a batch update for each as required. Listing 7.3 demonstrates this for our flight cache. The implementation for the reservation cache follows a similar pattern, and is provided in full in the example code.
We must also register the interceptor. The simplest way is declaratively in the cache configuration, as in listing 7.4. If we had not added the @Interceptor annotation to the class, we would find the interceptor is called twice, once for the COMMITTING event and once for COMMITTED.
There is a Littlegrid unit test, FlightReservationTransactionalPersistorTest, in the example code that exercises the FlightReservationTransactionalPersistor

Listing 7.3: FlightStore
@Interceptor(transactionEvents = TransactionEvent.Type.COMMITTING)
public class FlightStore implements CachePersistor<Integer, Flight> {
    private static final String MERGE_SQL =
            "MERGE INTO FLIGHT VALUES (?, ?, ?, ?, ?, ?);";
    private static final String DELETE_MANY_SQL =
            "DELETE FROM FLIGHT WHERE FLIGHTID IN (:FLIGHTIDS);";
    private final NamedParameterJdbcOperations jdbcTemplate;

    public FlightStore(DataSource dataSource) {
        jdbcTemplate = new NamedParameterJdbcTemplate(dataSource);
    }

    @Override
    public void persistAll(Map<Integer, Flight> map) {
        List<Object[]> updateBatchValues = new ArrayList<>(map.size());
        Collection<Integer> removeKeys = new ArrayList<>(map.size());
        for (Map.Entry<Integer, Flight> entry : map.entrySet()) {
            Flight flight = entry.getValue();
            if (flight == null) {
                removeKeys.add(entry.getKey());
            } else {
                Object[] row = new Object[6];
                row[0] = entry.getKey();
                row[1] = flight.getOrigin();
                row[2] = flight.getDestination();
                row[3] = flight.getDepartureTime();
                row[4] = flight.getAvailableBusiness();
                row[5] = flight.getAvailableEconomy();
                updateBatchValues.add(row);
            }
        }
        if (updateBatchValues.size() > 0) {
            jdbcTemplate.getJdbcOperations().batchUpdate(MERGE_SQL, updateBatchValues);
        }
        if (removeKeys.size() > 0) {
            Map<String, Object> parameters = new HashMap<>();
            parameters.put("FLIGHTIDS", removeKeys);
            jdbcTemplate.update(DELETE_MANY_SQL, parameters);
        }
    }
}

Listing 7.4: Registering the interceptor
<caching-schemes>
    <distributed-scheme>
        <scheme-name>distributedService</scheme-name>
        <thread-count>2</thread-count>
        <partition-count>13</partition-count>
        <backing-map-scheme>
            <local-scheme/>
        </backing-map-scheme>
        <autostart>true</autostart>
        <interceptors>
            <interceptor>
                <instance>
                    <class-name>
                        org.cohbook.events.transaction.FlightReservationTransactionalPersistor
                    </class-name>
                    <init-params>
                        <init-param>
                            <param-value system-property="jdbc.url"/>
                        </init-param>
                    </init-params>
                </instance>
            </interceptor>
        </interceptors>
    </distributed-scheme>
</caching-schemes>
using the FlightReservationProcessor from section 5.7: Working With Many Caches.

7.4.3 Catching Side-effects

A pre-commit interceptor may modify the transaction, and there may be many interceptors configured for an event, so it is possible for the transaction to be modified by another interceptor after we've persisted it. To prevent this we could:
Make sure the persistence interceptor is last in the chain. Possibly tricky if we have a combination of declarative and programmed registration of interceptors.
Ensure that the persistence interceptor is registered with order set to LOW and all others with order HIGH. Unfortunately the default is LOW.
Make the persistence interceptor fire all the other interceptors before performing the persistence.
The last approach can be achieved by starting the onEvent method:
public void onEvent(final TransactionEvent event) {
    event.nextInterceptor();
    // ...
}
This fires all other interceptors before continuing with this one. There is
the possibility that an interceptor higher up the chain has also executed
event.nextInterceptor(), so it could still modify the transaction after this one
has executed. To avoid this, as a rule, never modify an entry or transaction
in an interceptor after calling event.nextInterceptor().

7.4.4 Database Contention

The TransactionalCachePersistor provides an efficient general framework for performing consistent database updates, but without consideration of contention in the database. If all updates to the affected rows and tables are made through the cache then this will not be a problem. In synchronous operation, the affected rows are already guarded by cache locks on the corresponding entries, and in asynchronous operation, updates originating from the same cache service partition will be performed in order. But if there are database updates from other sources, or if your persistence data model does not map simply to the cache data model, then you may need to take more care with the order in which updates are applied within the transaction to minimise the risk of database deadlocks.

7.5 Singleton Service

Objective
Demonstrate how to implement a class that performs some task continuously on a single node, automatically failing over to another if necessary
Code examples
In the org.cohbook.events.singletonservice package of the events project.

7.5.1 MemberListener

I have often found the need to run some piece of code, continuously or repeatedly, in one place somewhere in the cluster. One recent example was to periodically poll a cache, then prepare and publish a JMS message summarising the contents. The overhead of the task may be low, but it is important that it runs in one place at a time and is resilient against failure of a member or machine. How do we decide which member to run our code on? A simple approach is to use the senior member of a service. At startup, each member checks if it is the senior member, and if so, starts running the service code. For a given Service we identify the senior member from that service's ServiceInfo object.
Member oldestMember = service.getInfo().getOldestMember();
Each member also needs to know its own identity:
Member localMember = service.getCluster().getLocalMember();
We then have to detect when the current senior member leaves the cluster; each member must then check again to see if it has become the senior member. We can achieve all of this in a class that registers itself as a MemberListener, given a Service instance to evaluate against and a Runnable to execute when it is the senior member, as shown in listing 7.5.

Listing 7.5: Singleton service
public class SingletonService implements MemberListener {
    private static final Logger LOG = LoggerFactory.getLogger(SingletonService.class);
    private final Runnable runnable;
    private Thread thread = null;

    public SingletonService(Service service, Runnable runnable) {
        this.runnable = runnable;
        service.addMemberListener(this);
        startIfThisIsOldest(service);
    }

    @Override
    public void memberLeft(MemberEvent memberevent) {
        startIfThisIsOldest(memberevent.getService());
    }

    @Override
    public void memberJoined(MemberEvent memberevent) {
    }

    @Override
    public void memberLeaving(MemberEvent memberevent) {
    }

    private synchronized void startIfThisIsOldest(Service service) {
        Member oldestMember = service.getInfo().getOldestMember();
        Member localMember = service.getCluster().getLocalMember();
        if (oldestMember.equals(localMember) && thread == null) {
            thread = new Thread(runnable);
            thread.start();
        }
    }
}

7.5.2 Instantiating the Service

The next problem is how to create our SingletonService instance. It must be instantiated on every member that runs the service, and must start after the cluster has been created and services started. One option is to create it in our main method after starting the cluster.
public static void main(String[] args) throws InterruptedException, IOException {
    CacheFactory.ensureCluster();
    Service service = CacheFactory.getService("exampleDistributedService");
    Runnable runnable = getRunnable();
    SingletonService singletonService = new SingletonService(service, runnable);
This couples our SingletonService implementation to the framework code, but worse, may prove unreliable; there is no guarantee that the service has started when we try to use it. Better would be to register a LifecycleEvent interceptor to instantiate the service once the cluster and its services have started. A suitable implementation would be something like:
public class SingletonServiceController implements EventInterceptor<LifecycleEvent> {
    private SingletonService singletonService = null;

    public void onEvent(LifecycleEvent event) {
        switch (event.getType()) {
        case ACTIVATED:
            Service service = event.getConfigurableCacheFactory()
                    .ensureService(getServiceName());
            singletonService = new SingletonService(service, getRunnable());
            break;
        default:
        }
    }

    protected String getServiceName() {
        // ...
    }

    protected Runnable getRunnable() {
        // ...
    }
}
Complete the missing methods as appropriate. We register the listener in the cache configuration:
<?xml version="1.0"?>
<cache-config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://xmlns.oracle.com/coherence/coherence-cache-config"
    xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-cache-config
        coherence-cache-config.xsd">
    <interceptors>
        <interceptor>
            <name>SingletonServiceInterceptor</name>
            <instance>
                <class-name>
                    org.cohbook.events.singletonservice.SingletonServiceController
                </class-name>
            </instance>
        </interceptor>
    </interceptors>
    <caching-scheme-mapping>
        <cache-mapping>
            <cache-name>test</cache-name>
            <scheme-name>distributedScheme</scheme-name>
        </cache-mapping>
    </caching-scheme-mapping>
    <caching-schemes>
        <distributed-scheme>
            <scheme-name>distributedScheme</scheme-name>
            <service-name>exampleDistributedService</service-name>
            <partition-count>13</partition-count>
            <backup-count>0</backup-count>
            <backing-map-scheme>
                <local-scheme/>
            </backing-map-scheme>
            <autostart>true</autostart>
        </distributed-scheme>
    </caching-schemes>
</cache-config>

7.5.3 The Leaving Member

We need not normally be concerned with terminating the thread that executes the Runnable, as a member ceases to be the senior member only when its JVM terminates. But when running in a container (in a GAR or a WAR for example, or under an in-JVM test framework like Littlegrid), the member may leave the cluster without terminating the thread. We might try intercepting the DISPOSING lifecycle event and interrupting the thread, but a cluster member may terminate without that event being generated. You may find it more reliable to check the cluster state periodically within the worker thread. For example, in the Runnable used in the example test for this code, we check the Cluster.isRunning method:
public class TestRunnable implements Runnable {
    public void run() {
        int member = CacheFactory.getCluster().getLocalMember().getId();
        for (;;) {
            if (!CacheFactory.getCluster().isRunning()) {
                System.out.println("Goodbye from member " + member);
                return;
            }
            System.out.println("Hello from member " + member);
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                System.out.println("member " + member + " interrupted");
                return;
            }
        }
    }
}
Chapter 8

Configuration

8.1 Introduction

We will examine several aspects of configuration in this chapter, covering Coherence operational and cache configuration, and JVM configuration. First, a brief discussion of the environment in which your cluster runs.

8.1.1 Operating System

Coherence, written in Java, can run in any environment that supports Java, though in practice you will probably consider Windows, Linux, or some UNIX variant. By far the most common choice is Linux, with good reason:
Coherence is sensitive to pauses caused by memory paging. Windows lacks adequate means to ensure that the process memory is not paged out. Even on a lightly loaded machine with plenty of free memory, Windows may eventually page out parts of a cluster member's memory.
Windows imposes a much higher CPU overhead. The same hardware running a cluster on Linux will provide up to 30% greater throughput.
When might you consider running Coherence on Windows? If you are providing caching services to a Windows application, with no other Linux systems in your application landscape, you might consider that the costs and overheads of sticking with the familiar are justified. Though we would strongly
recommend restarting the servers at least weekly. Not long ago we researched whether there was any native Windows caching solution that might be appropriate for this use-case, but found nothing that really measured up to the demands of an enterprise-class application.

8.1.2 Hardware Considerations

Once you have an idea of the demands on your cluster - CPU, memory, disk and network I/O - there are a number of factors to consider in specifying the hardware: how many machines, how many CPUs per machine, how much memory, how many NICs, and what speed? You may be constrained in these choices by the standard offerings available in your organisation's datacentres. Here are a few things to consider:
Standard Oracle Coherence is priced per CPU, and this license cost is often a significant part of overall system cost, so you might consider minimising the number of CPUs in your cluster, allowing of course for machine failures. Provide sufficient memory and NICs that CPU is the limiting factor in capacity and throughput.
Each machine should have dual, bonded NICs connected via separate switches. There is no point having redundancy within the cluster if a switch failure can take out the cluster. Each NIC should, alone, have sufficient bandwidth for the node to run at capacity. Consider, and test, the volume of network traffic that will result from repartitioning after a node or machine failure.
If your cluster is network-bound, you may find that TCMP traffic within the cluster swamps TCP traffic from extend clients. Consider using separate pairs of NICs and switches for TCMP traffic, separating internal cluster communication from external connections.
Every machine in the cluster should be on the same switch (pair) and subnet. The increased latency and network contention of sending cluster traffic through longer routes is potentially destabilising. In particular, clustering over the WAN between datacentres is usually not a good idea.

8.1.3 Virtualisation

Virtualisation is a handy way of sharing one machine among many applications. Coherence is a handy way of sharing one application over many machines. In particular, Coherence does not like being swapped out, time-sliced, moved on the network, or any of the other things that virtualisation provides. If you must use virtualisation, e.g. to conform to ill-thought-out corporate standards (we've all seen plenty of those), then the environment must be configured not to overcommit memory, CPU, or bandwidth, and to be on the same physical switch and subnet. At the time of writing, Oracle officially support only their own virtualisation product for Coherence (and, in fact, for all their products, including WebLogic). As far as we are able to determine, this is purely for marketing reasons; you should not infer that Coherence works any better on Oracle's product than, say, VMWare. Anecdotal reports of the success of using virtual servers are mixed - some apparently successful, some not. Informally, Oracle have been known to provide support for Coherence clusters on VMWare and other platforms, though they may require you to demonstrate a problem on physical hardware if there is some suspicion that the environment is responsible.

8.1.4 Breaking These Rules

You can ignore any or all of the recommendations above and still have a cluster that runs perfectly happily and fulfils its requirements under normal circumstances. But you have gone to the trouble and expense of using Coherence presumably for its high-availability features. Break the rules above and you compromise the robustness of the cluster. It may work fine under normal circumstances, but you want it to keep working even when things go wrong.

8.2 Cache Configuration Best Practices

Objective
Understand the relationship between declarative XML cache configuration and the instantiated cluster member, and develop some conventions and practices for maintainable configuration.

8.2.1 Avoid Wildcard Cache Names

If you look into coherence-cache-config.xml in the distributed coherence.jar, you will find the following cache-mapping element:
<!-- never do this -->
<cache-mapping>
    <cache-name>*</cache-name>
    <scheme-name>example-distributed</scheme-name>
</cache-mapping>
As mentioned in section 2.5: Using Maven Repositories, we recommend removing this file from the jar before installing it in your own artefact repository, but the point we make here is that you should avoid ever placing a clause like this in your own cache configuration files. Wherever possible, list each cache name separately. If you have a set of dynamically created caches, use a naming convention that will not overlap any of your individually defined caches, and never include a final catch-all: a typo or forgotten mapping might otherwise result in a cache being created on an inappropriate service. Better to have a hard failure while creating the cache during development.
<!-- this is ok for a set of caches whose
     names are dynamically generated -->
<cache-mapping>
    <cache-name>dyn-*</cache-name>
    <scheme-name>dynamic-caches-scheme</scheme-name>
</cache-mapping>

User-defined Macros in Cache Configuration

Coherence provides a set of macros for injecting values into cache configurations. The example below shows a distributed-scheme that uses a read-write backing map; the CacheStore has a constructor argument set using the built-in cache-name macro, but the scheme also references two user-defined macros, write-max-batch-size with default value 128, and write-delay with default value 1s.
<caching-schemes>
    <distributed-scheme>
        <scheme-name>write-behind-cache-scheme</scheme-name>
        <scheme-ref>distributed-service-scheme</scheme-ref>
        <backing-map-scheme>
            <read-write-backing-map-scheme>
                <scheme-name>write-behind-backing-map-scheme</scheme-name>
                <internal-cache-scheme><local-scheme/></internal-cache-scheme>
                <write-max-batch-size>
                    {write-max-batch-size 128}
                </write-max-batch-size>
                <cachestore-scheme>
                    <class-scheme>
                        <class-name>ExampleCacheStore</class-name>
                        <init-params>
                            <init-param>
                                <param-value>{cache-name}</param-value>
                            </init-param>
                        </init-params>
                    </class-scheme>
                </cachestore-scheme>
                <write-delay>{write-delay 1s}</write-delay>
            </read-write-backing-map-scheme>
        </backing-map-scheme>
    </distributed-scheme>
Now we define cache1, a cache that uses the default values for batch size and
write delay, and cache2, which overrides these values:
<caching-scheme-mapping>
    <cache-mapping>
        <cache-name>cache1</cache-name>
        <scheme-name>write-behind-cache-scheme</scheme-name>
    </cache-mapping>
    <cache-mapping>
        <cache-name>cache2</cache-name>
        <scheme-name>write-behind-cache-scheme</scheme-name>
        <init-params>
            <init-param>
                <param-name>write-max-batch-size</param-name>
                <param-value>512</param-value>
            </init-param>
            <init-param>
                <param-name>write-delay</param-name>
                <param-value>15s</param-value>
            </init-param>
        </init-params>
    </cache-mapping>
</caching-scheme-mapping>
This technique helps to avoid defining large numbers of similar schemes where
only a few element values differ.

8.2.3 Avoid Unnecessary Service Proliferation

When should you place caches in separate services, rather than keeping them all in a single service? The answer is: only when absolutely necessary. Creating additional services adds costs - more services mean more thread pools, more threads and hence more frequent context switches, more JMX MBeans to be monitored, and more complex configuration to maintain. If you have slow operations on some caches - because of read-through, write-through, or other expensive operations - then moving those caches onto a separate service will often not help performance: the same work still needs to be done for the calling application. Always start with the minimum set of services and only add to them if you have a clearly identifiable need, and testing shows that separating the services does improve things. Here are a few examples of cases where we have found additional services to be useful:

Different backup requirements


Consider a read-through cache that consumes a significant proportion of your
storage space, that holds some fraction of an underlying data set limited by
the high-units setting. It may improve performance to set the backup-count on such a cache to zero. You will increase the space available, and hence the
proportion of your underlying data set that you can hold in memory. If a
node is lost, so are its partitions, but you would lose that amount of data anyway, as repartitioning would trigger eviction in overfull members - the
only difference is in which entries are lost. backup-count is an attribute of a
service, so can only be set for specific caches by placing them on a distinct
service.
Remember that if you configure partitioned backing maps, the high-units
setting applies per partition rather than per member, so a node loss will
result in more entries being stored per node rather than the eviction of
entries.
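As a rough sketch (the scheme, service, and limit values here are our own invention, not taken from the example project), such a cache could be mapped to a scheme of its own like this:

<!-- hypothetical: a size-limited read-through cache on its own service, with no backups -->
<distributed-scheme>
    <scheme-name>NoBackupReadThroughScheme</scheme-name>
    <service-name>NoBackupReadThroughService</service-name>
    <backup-count>0</backup-count>
    <backing-map-scheme>
        <read-write-backing-map-scheme>
            <internal-cache-scheme>
                <local-scheme>
                    <high-units>100000</high-units>
                </local-scheme>
            </internal-cache-scheme>
            <!-- cachestore-scheme for the read-through CacheLoader omitted for brevity -->
        </read-write-backing-map-scheme>
    </backing-map-scheme>
    <autostart>true</autostart>
</distributed-scheme>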

High priority use-cases


The argument for not separating a high-latency and a low-latency cache into separate services holds when both caches are accessed as part of the same application use-case. If, say, the low-latency cache is used by end-user interactions but the high-latency cache is used by background batch operations, then it is worth considering splitting them onto separate services so that the user requests do not get caught in a backlog of long-running high-latency requests. This will only really help if the user interaction involves no use of the high-latency service.
In this scenario, we can also tune up the thread-count of the high-latency
service to maximise throughput.

8.2.4 Separate Service Definitions and Cache Templates

We will begin this section with a simple assertion: the Coherence cache
configuration schema is fundamentally broken in its design. It is possible
to create legal, schema conformant configurations that are ambiguous or
inconsistent; configurations in which elements are ignored at runtime without error or warning. The problem lies with the caching-schemes elements, which serve two distinct purposes: providing a template for configuration
of individual caches; and defining the runtime properties of services. It is
possible to define many elements within caching-schemes that reference the
same service, but with contradictory parameters. Have a look at this cache
configuration:


<cache-config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns="http://xmlns.oracle.com/coherence/coherence-cache-config"
              xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-cache-config
                                  coherence-cache-config.xsd">
    <caching-scheme-mapping>
        <cache-mapping>
            <cache-name>slow-cache</cache-name>
            <scheme-name>SlowCacheScheme</scheme-name>
        </cache-mapping>
        <cache-mapping>
            <cache-name>fast-cache</cache-name>
            <scheme-name>FastCacheScheme</scheme-name>
        </cache-mapping>
    </caching-scheme-mapping>
    <caching-schemes>
        <distributed-scheme>
            <scheme-name>SlowCacheScheme</scheme-name>
            <service-name>exampleDistributedService</service-name>
            <thread-count>100</thread-count>
            <backing-map-scheme>
                <read-write-backing-map-scheme>
                    <internal-cache-scheme><local-scheme/></internal-cache-scheme>
                    <cachestore-scheme>
                        <class-scheme>
                            <class-name>com.example.SlowDatabaseCacheLoader</class-name>
                        </class-scheme>
                    </cachestore-scheme>
                </read-write-backing-map-scheme>
            </backing-map-scheme>
            <autostart>true</autostart>
        </distributed-scheme>
        <distributed-scheme>
            <scheme-name>FastCacheScheme</scheme-name>
            <service-name>exampleDistributedService</service-name>
            <thread-count>5</thread-count>
            <backing-map-scheme>
                <read-write-backing-map-scheme>
                    <internal-cache-scheme><local-scheme/></internal-cache-scheme>
                </read-write-backing-map-scheme>
            </backing-map-scheme>
            <autostart>true</autostart>
        </distributed-scheme>
    </caching-schemes>
</cache-config>

The author of this scheme is defining two caches, one with a slow, high-latency CacheLoader and another that operates purely in memory. As read-through access to the high-latency cache may tie up worker threads for a long time, it seems reasonable to specify a larger thread-count for that cache's scheme. The problem here is that thread-count specifies the number of worker threads for the service, and both of these distributed-scheme elements reference the same service name. So, how many threads will the service have? Five, or one hundred? The answer is not clearly defined, but appears in practice to depend on the order in which caches are created: Coherence will instantiate the service with the parameters of the first scheme it is asked to instantiate, and will silently ignore alternative settings for the same service.


Though in some circumstances, different nodes can reach different choices as to how to configure a service, and a warning will be logged as these are reconciled and one member coerced to configure the service the same as the rest
of the cluster. It is easy enough to see what is happening in this simple example with two schemes and one service, but a configuration file with many
schemes and services with complex chains of inheritance via the scheme-ref
element can become very hard to maintain correctly.
The simplest solution to this problem is to establish a convention separating service configuration schemes from cache template schemes:
- Schemes that define services contain only service-related elements.
- Schemes that define cache templates contain no service-related elements.
- Use a naming convention that clearly identifies whether the scheme is service- or cache-related.
- Each service name should only occur once in the configuration, in the service-name of the scheme that defines that service.
- A scheme that names a service may inherit from abstract service schemes that do not define a service name.
Here's how the caching-schemes section of the above example looks when we refactor it to these rules, creating a different service for each of the two caches so that we can size the thread pools appropriately. First, the schemes that define the services. Here we're defining two concrete schemes, each with a defined service-name:
<!-- Service schemes - no cache template parameters -->
<distributed-scheme>
    <scheme-name>ManyThreadServiceScheme</scheme-name>
    <scheme-ref>AbstractServiceScheme</scheme-ref>
    <service-name>ManyThreadDistributedService</service-name>
    <thread-count>100</thread-count>
</distributed-scheme>
<distributed-scheme>
    <scheme-name>FewThreadServiceScheme</scheme-name>
    <scheme-ref>AbstractServiceScheme</scheme-ref>
    <service-name>FewThreadDistributedService</service-name>
    <thread-count>5</thread-count>
</distributed-scheme>

Both of these schemes inherit from an abstract scheme to set common parameters:
<distributed-scheme>
    <scheme-name>AbstractServiceScheme</scheme-name>
    <autostart>true</autostart>
</distributed-scheme>


All three of these schemes define only elements that are applied to services and none that apply to caches; that is, they may have any elements except backing-map-scheme and listener. If any cache-mapping element references one of these schemes directly, then configuration will fail because no backing-map-scheme is defined.
Now we define the cache schemes that use the services. This time we define
schemes that reference a service scheme, and that have only backing-map-scheme,
and optionally, listener elements.
<!-- Cache template schemes - no service parameters -->
<distributed-scheme>
    <scheme-name>SlowCacheScheme</scheme-name>
    <scheme-ref>ManyThreadServiceScheme</scheme-ref>
    <backing-map-scheme>
        <read-write-backing-map-scheme>
            <internal-cache-scheme>
                <local-scheme/>
            </internal-cache-scheme>
            <cachestore-scheme>
                <class-scheme>
                    <class-name>
                        com.example.SlowDatabaseCacheLoader
                    </class-name>
                </class-scheme>
            </cachestore-scheme>
        </read-write-backing-map-scheme>
    </backing-map-scheme>
</distributed-scheme>
<distributed-scheme>
    <scheme-name>FastCacheScheme</scheme-name>
    <scheme-ref>FewThreadServiceScheme</scheme-ref>
    <backing-map-scheme>
        <read-write-backing-map-scheme>
            <internal-cache-scheme>
                <local-scheme/>
            </internal-cache-scheme>
        </read-write-backing-map-scheme>
    </backing-map-scheme>
</distributed-scheme>

Adopting these conventions does not solve the fundamental problem that inconsistent configuration can occur without warning; it merely makes it easier
to avoid the problem in complex configurations. A later section will explore
use of a custom namespace handler to provide some level of validation.

8.2.5 Minimise Duplication in Configuration for Different Roles

A cluster will normally consist of members performing different roles: storage, proxy, JMX, perhaps several varieties of application nodes. Not all services need run on all nodes:

- The proxy service will run only on the proxy nodes.
- The JMX node usually needs only the cluster service.
- Replicated services are needed only on the members that reference the replicated caches, which may be only some application nodes, for example.

Using XML configuration with the Coherence core product alone, we must
define all the services for a node in a single cache configuration XML file
(more precisely, all the services for a single CacheFactory). Obviously, we
would prefer not to have a separate configuration file per role with large
swathes of duplicated cache mappings and cache schemes. Two solutions
present themselves:
1. use the element namespace from the coherence-common incubator package and split the configuration into sections that can separately be
included per role as required.
2. Use a single cache configuration for all roles, but only start services in
members where they are required.
A trap for the unwary with the latter approach is that an undocumented
change in Coherence version 12.1.3 validates the type correctness of classes
referenced in the cache configuration during startup, even if those classes
are never used - such as a CacheStore in a storage-disabled member [1]. It is
therefore necessary to ensure that all such classes are on the classpath of all
members.
This configuration fragment defines a replicated service that will start automatically only if the system property com.example.startreplicated is set to true:
<replicated-scheme>
    <scheme-name>replicated-service</scheme-name>
    <service-name>MyReplicatedCache</service-name>
    <autostart system-property="com.example.startreplicated">
        false
    </autostart>
</replicated-scheme>

Though remember that even if autostart is not set to true, the service will
start in a member if it is referenced, e.g. by accessing a cache that maps to
that service.
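For example, a member that should host the replicated service might be started with the property set on the command line (classpath and other options elided; the main class shown is just the stock DefaultCacheServer):

$ java -Dcom.example.startreplicated=true -cp ... com.tangosol.net.DefaultCacheServer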
[1] Oracle appear to be taking seriously the fact that this change is causing problems for at least one major user, and may well fix this soon.


The JMX node is a special case: usually it requires no access to services other than the cluster service, so a separate stub cache configuration may be used:
<?xml version="1.0"?>
<cache-config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns="http://xmlns.oracle.com/coherence/coherence-cache-config"
              xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-cache-config
                                  coherence-cache-config.xsd">
</cache-config>

8.2.6 Service Parameters

Some guidelines for tuning services.


Threads
Tune the thread-count of a service to maximise throughput. On your busiest services you should configure a sufficient number of threads that all the cores on the machine may be used:

    thread-count × members per machine ≥ cores per machine

Test under peak load conditions; if you see no task backlog on the service, reduce the thread-count. If there is a task backlog, increase the thread count
until either the backlog disappears, or until no improvement in throughput
is seen. Err on the side of too many, rather than too few. If the latency
of cache operations is limited by external resources, e.g. database access,
check where the bottleneck lies - is your connection pool sufficiently large?
Does performance of the underlying resource degrade with too many parallel
connections? You need to address performance tuning holistically across the
whole stack. This tuning process becomes impossibly complex if you have a
large number of services. The more variables, the harder the optimisation
becomes.
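While iterating on this tuning, it can help to make the thread-count overridable from the command line rather than editing the configuration for every test run. A minimal sketch using Coherence's system-property attribute (the scheme, service, and property names are our own invention):

<distributed-scheme>
    <scheme-name>TunableServiceScheme</scheme-name>
    <service-name>TunableDistributedService</service-name>
    <!-- default of 8 worker threads, overridable with -Dappname.service.threads=N -->
    <thread-count system-property="appname.service.threads">8</thread-count>
    <autostart>true</autostart>
</distributed-scheme>

Started with -Dappname.service.threads=16 the service gets sixteen worker threads; with no property set, the default of eight applies.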
Partitions
What is the optimum number of partitions? There's no simple answer to arrive at a correct number, but there are constraints:

Figure 8.1: A service with three caches and seven partitions running on a single storage node
- more than some moderate multiple of the number of storage nodes, otherwise the cluster will become unbalanced
- a prime number? Folk wisdom says so, but we've no clear evidence. Maybe this advice is a hangover from the early days when Java hash maps behaved optimally with a prime number of buckets (no longer true)
- some operations may be performed in parallel across partitions, particularly when using a PartitionedFilter. Ensure that you have at least as many partitions per machine as there are cores, to maximise opportunities for parallel execution
- large enough that a full partition takes no more than about one second to transfer across the network if repartitioning occurs, so dependent on the speed of your NICs
- anecdotally, performance degrades if the number of partitions exceeds about 8000

These last two conditions together impose a limit on the volume of data that may be stored on a single service of roughly 8000 times the amount of data your NIC can transfer in one second, so around 8TB when using 10GigE.
All of this is guidance rather than a strict formula. Try it, test it.
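For reference, the partition count is set per service with the partition-count element; a sketch (the names and the value are illustrative only, chosen per the guidance above):

<distributed-scheme>
    <scheme-name>LargeDataServiceScheme</scheme-name>
    <service-name>LargeDataDistributedService</service-name>
    <!-- 1021 is prime, and comfortably more than a small multiple of the expected node count -->
    <partition-count>1021</partition-count>
    <autostart>true</autostart>
</distributed-scheme>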

Figure 8.2: When another member joins the cluster, Coherence moves partitions to even out partitions per member

backup-count
The default backup count for a distributed service is one, meaning that one
backup copy of each partition is held on a different member, normally on a
different host. You may consider increasing the number for additional security, but unless you are being equally paranoid in all other aspects of your
architecture and your software engineering approach, you may be indulging
in a pretence of security. Consider that even with one backup, loss of data
from a Coherence cluster happens far more frequently from application software defects or inadequate monitoring and management than from hardware
failure.
For a cache that contains a read-only subset of data from a backing store
(i.e. a cache that is being used as a cache), where the cache population is
limited by the high-units setting, you might consider setting backup-count to
zero.

8.2.7 Partitioned backing map

You may choose whether or not to use a partitioned backing map:


<backing-map-scheme>
    <partitioned>true</partitioned>
    <local-scheme/>
    <high-units>10000</high-units>
</backing-map-scheme>

This means that Coherence will use a separate map for each partition of a
cache rather than a single map for all partitions on a member. There are a
number of implications for behaviour and configuration.
Some operations, most notably the execution of a backing map listener, will
hold a lock on the backing map while they execute. A partitioned backing
map will improve the granularity of these locks.
Backing map configuration options such as high-units apply to the backing
map, so that in the example above, there is a limit of 10,000 entries per
partition and entries are evicted from any partition that exceeds that limit,
even if other partitions on the same member have fewer entries or are empty.
If members are lost from the cluster, each remaining member will have more
partitions and will therefore be permitted to hold a greater number of entries.
With a non-partitioned backing map, the limit applies to the member as a
whole; the number of entries in individual partitions is immaterial.


Filter operations will be executed on separate threads per backing map; i.e. in a non-partitioned map the filter will be executed once per member on a single worker thread. With a partitioned map, each partition will be executed as a
separate task, in parallel where there are sufficient worker threads. You can
always force execution in parallel by using a PartitionedFilter.
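A minimal sketch of that approach follows (the wrapper class and method name are our own invention; the Coherence types are real). The query is sliced one partition at a time; to get genuine parallelism you would submit the slices from separate threads, or use per-member partition sets:

import java.util.HashSet;
import java.util.Set;

import com.tangosol.net.NamedCache;
import com.tangosol.net.PartitionedService;
import com.tangosol.net.partition.PartitionSet;
import com.tangosol.util.Filter;
import com.tangosol.util.filter.PartitionedFilter;

public class PartitionSlicedQuery {
    // Evaluate the filter partition by partition rather than cluster-wide in one shot.
    public static Set keySetByPartition(NamedCache cache, Filter filter) {
        PartitionedService service = (PartitionedService) cache.getCacheService();
        int partitionCount = service.getPartitionCount();
        Set result = new HashSet();
        for (int i = 0; i < partitionCount; i++) {
            PartitionSet partitions = new PartitionSet(partitionCount);
            partitions.add(i);
            // Each call touches only the backing map(s) for one partition.
            result.addAll(cache.keySet(new PartitionedFilter(filter, partitions)));
        }
        return result;
    }
}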

8.2.8 Overflow Scheme

Be wary of using overflow-scheme. If you routinely want to store more data in cache than can be accommodated by physical memory, consider instead configuring elastic data. Though the documentation indicates that the journal features are designed for use with flash memory, they have been successfully used with spinning disk. Use of overflow-scheme should be restricted to the use case of avoiding data loss in the event of multiple node failure where data in caches is not externally persisted, or where refresh by read-through is too expensive. Test carefully the performance impact of overflow if you do wish to use it. If you do wish to use an overflow scheme (or external scheme for any other purpose), under no circumstance should you use the now deprecated LH file manager.

8.2.9 Beware of large high-units settings

Internally, Coherence stores the high-units setting for a cache as an int. Values larger than 2147483646 are silently ignored and set to zero, disabling any intended cache eviction strategy. If you are using large heaps in storage nodes with cache sizes in this range, you must use a unit-factor setting to bring high-units down within the acceptable range.
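A sketch of the workaround, assuming a BINARY unit calculator so that units are measured in bytes (values illustrative): with a unit-factor of 1048576 each unit represents one megabyte, so a roughly 10 GB limit is expressed as 10240 units, comfortably inside the int range.

<backing-map-scheme>
    <local-scheme>
        <!-- one unit = 1 MB (unit-factor 1048576), so 10240 units is roughly 10 GB -->
        <high-units>10240</high-units>
        <unit-calculator>BINARY</unit-calculator>
        <unit-factor>1048576</unit-factor>
    </local-scheme>
</backing-map-scheme>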

8.3 Operational Configuration Best Practice

Objective
Provide guidelines and discuss considerations for assembling an operational configuration


8.3.1 Use the Same Operational Configuration In All Environments

You have several clusters for an application. Perhaps different environments: development, regression test, UAT, production. Perhaps different regions. Minimise the differences between them, and maximise the validity of your test cycles, by using the same operational configuration in all environments and externalising the differences in system properties. There are pre-defined properties available for many of the configuration values you will want to configure. For any others, define your own, as for authorized-hosts below.

8.3.2 Service Guardian Configuration

The service guardian sits in the background in each member of the cluster, timing the activity on each thread of each service. There are two things to be configured with the service guardian:
1. How long a task should be allowed to run before we wish to take some
action - the guardian-timeout
2. What action to take if the guardian timeout limit is exceeded - the
service-failure-policy.
Firstly, what is a sensible maximum for the time limit? That is highly
dependent on the nature of your application. If your application deals in large
numbers of key-based operations, entirely in-memory (no external access in
cache loaders etc.), then tens of milliseconds might seem a generous time
allowance. On the other hand, executing an EntryProcessor or EntryAggregator
against large data sets may take considerably longer. Any operations that
result in calls to high-latency external resources, databases or web services for
example, may involve variable, unpredictable latencies. Guardian timeouts
of many seconds, or even minutes may be appropriate.
What action to take on a timeout is another problem. The options for
service-failure-policy are:
exit-cluster
exit-process
logging


In the first two, Coherence will attempt to recover threads that appear to
be unresponsive. This entails performing a Thread.interrupt() and spawning
a new thread. In many cases this is a dangerous operation. If you use this
option you need to be certain that your code, and any libraries you use,
handle thread interrupts safely. Many libraries, including all but the most recent Oracle JDBC drivers [2], do not. The risk is that resources will not be
closed, and you may be left with an orphaned thread holding locks or other
resources.
For this reason we recommend that you always set the service guardian
failure policy to logging and ensure that your monitoring alerts you when
this happens so that you can investigate the underlying cause. Here is an
example operational configuration that does this:
<?xml version="1.0"?>
<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
           xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config
                               coherence-operational-config.xsd">
    <cluster-config>
        <service-guardian>
            <service-failure-policy>logging</service-failure-policy>
        </service-guardian>
    </cluster-config>
</coherence>

Causes of guardian timeouts fall into two classes:
- Operations proceeding correctly but more slowly than expected.
- Deadlocks.
We suggest that the former should usually be left to complete, or for manual
intervention to understand and rectify the underlying cause. The latter
should, by careful design and through testing, never be permitted to occur
in the first place. Some causes of deadlocks are discussed in section 5.7:
Working With Many Caches.
Only for service instances where you can be certain that thread interrupts are
handled correctly should you configure exit-cluster or exit-process.
If you configure exit-cluster, the service may be stopped by the guardian,
but the JVM will continue to run, so you may want to have some mechanism for restarting failed services. The DefaultCacheServer methods main, startAndMonitor, and startDaemon do this automatically using an instance of ServiceMonitor.

[2] Version 12.1.0.1; however, you must set the system property oracle.jdbc.javaNetNio=true to enable the use of NIO by the JDBC driver.

8.3.3 Specify Authorised Hosts

Specifying a set or range of authorised hosts in your operational configuration prevents new members joining the cluster from other machines. This
is in part a security measure in that it mitigates against deliberate attacks
on the cluster from elsewhere on your network, but it is also protection
against accidental disruption, for example, by a developer starting a node
on their own PC accidentally using the production configuration (there are
other protections against accidental clustering). As a security measure, this is not a complete solution: knowledgeable attackers could conceivably cause disruption by sending carefully crafted spoofed packets to the cluster, but they would be unlikely by this means to be able to extract data. You can choose whether to specify the hosts as individual IP addresses, or as a range. Specifying them individually, whilst maintaining the practice of using a single configuration file for all environments, means defining as many system properties as there are hosts in your largest cluster; using a range is simpler:
<?xml version="1.0"?>
<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
           xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config
                               coherence-operational-config.xsd">
    <cluster-config>
        <authorized-hosts>
            <host-range>
                <from-address system-property="appname.authorised.hosts.from">
                    192.168.0.0
                </from-address>
                <to-address system-property="appname.authorised.hosts.to">
                    192.168.0.254
                </to-address>
            </host-range>
        </authorized-hosts>
    </cluster-config>
</coherence>
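A member in a given environment is then started with the range supplied as system properties, for example (classpath and other options elided, addresses illustrative):

$ java -Dappname.authorised.hosts.from=192.168.10.0 \
       -Dappname.authorised.hosts.to=192.168.10.254 \
       -cp ... com.tangosol.net.DefaultCacheServer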

8.3.4 Use Unique Multicast Addresses

There are predefined system properties for setting the multicast address and
port: tangosol.coherence.clusteraddress and tangosol.coherence.clusterport.
If you configure the cluster to use multicast, consider how routers and hosts handle it.



Many routers track which routes a multicast address is used on, and will
not propagate multicast packets on routes where that address is not used,
so to reduce network traffic (and potentially exposure of sensitive data),
separate clusters on different network segments should always use different
multicast addresses. Interestingly, this is often a more efficient means of limiting propagation of multicast packets than time-to-live: some routers will handle packets with expired TTL in software far less efficiently. At one client, a Coherence test cluster so overwhelmed a router discarding packets in software that it rendered a vital production system inaccessible; the problem was resolved by increasing TTL so that the router used its hardware-based routing logic to discard the packets. It may be better to use a moderately high TTL, and avoid re-use of multicast addresses.
You may choose to run more than one cluster on a set of hosts (this can be
a useful way of setting up several lower-capacity development environments
on a limited hardware budget). For these, there is no harm in sharing the
same multicast address, with distinct multicast port per cluster.
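As a sketch, each cluster's launch scripts would pin its own address and port using the pre-defined properties mentioned above (the values here are illustrative only):

$ java -Dtangosol.coherence.clusteraddress=237.8.1.1 \
       -Dtangosol.coherence.clusterport=7001 \
       -cp ... com.tangosol.net.DefaultCacheServer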

8.4 Validate Configuration With A NameSpaceHandler

Objective
To demonstrate the use of a NameSpaceHandler by providing a means
of enforcing the cache configuration best practices previously discussed
Prerequisites
An understanding of the problem of ambiguous cache configuration,
and the conventions used to avoid them described in section 8.2: Cache
Configuration Best Practices. Also we recommend skimming through
the Coherence javadoc for NameSpaceHandler, AbstractNameSpaceHandler,
and DocumentPreprocessor and related classes and interfaces
Code examples
In the package org.cohbook.configuration.cache in the integration project
We've seen how easy it is to produce cache configurations that are inconsistent, and which do not behave as we might naïvely expect, and how we can mitigate this by clearly distinguishing between service configuration and cache configuration. Coherence does provide a mechanism for adding our own XML namespace to a cache configuration, and for examining and modifying the configuration as it is processed.
We will use this facility to define a new XML namespace that defines a
new attribute for the distributed-scheme element. The attribute name is
scheme-type and the permitted values are:
cache this scheme element provides a template for a cache. It may not have
any service-specific parameters, and its scheme-ref element may refer
to another scheme of type cache, or one that defines a service.
service defines a single service. If there is a service-name element, it must be
unique. If not (i.e. it describes the default service), it must be the only
one. Only elements that are applied to service configuration may be
included (no backing-map-scheme or listener). It may have a scheme-ref
element that refers to a scheme of type abstract-service
abstract-service defines a template for service schemes. Only service-related elements are allowed, no service-name, and its scheme-ref element may only reference another abstract-service.
We introduce our namespace using the prefix scheme, implemented with our
CacheConfigNameSpaceHandler class
<cache-config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns="http://xmlns.oracle.com/coherence/coherence-cache-config"
              xmlns:scheme="class://org.cohbook.configuration.cache.CacheConfigNameSpaceHandler"
              xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-cache-config
                                  coherence-cache-config.xsd">

Here is an example showing one scheme of each type:


<caching-schemes>
    <!-- Cache template schemes - no service parameters -->
    <distributed-scheme scheme:scheme-type="cache">
        <scheme-name>SlowCacheScheme</scheme-name>
        <scheme-ref>ManyThreadServiceScheme</scheme-ref>
        <backing-map-scheme>
            <read-write-backing-map-scheme>
                <internal-cache-scheme><local-scheme/></internal-cache-scheme>
            </read-write-backing-map-scheme>
        </backing-map-scheme>
    </distributed-scheme>
    <!-- Service schemes - no cache template parameters -->
    <distributed-scheme scheme:scheme-type="service">
        <scheme-name>ManyThreadServiceScheme</scheme-name>
        <scheme-ref>AbstractServiceScheme</scheme-ref>
        <service-name>ManyThreadDistributedService</service-name>
        <thread-count>100</thread-count>
    </distributed-scheme>
    <distributed-scheme scheme:scheme-type="abstract-service">
        <scheme-name>AbstractServiceScheme</scheme-name>
        <autostart>true</autostart>
    </distributed-scheme>

We could take a more comprehensive approach and define our own elements
to replace distributed-scheme but that would be much more complex, and
harder to maintain if the underlying Coherence cache configuration schema
changed in a later release.
The starting point is the NameSpaceHandler interface. This allows us to provide our own class instances to be called when processing the configuration
document, or individual elements or attributes. Because we are interested in validating the consistency of the document as a whole, we will provide
an implementation of DocumentPreprocessor called ServiceSchemePreprocessor.
Our NameSpaceHandler implementation merely sets the DocumentPreprocessor to
use:
public class CacheConfigNameSpaceHandler extends AbstractNamespaceHandler {
    public void onStartNamespace(ProcessingContext processingcontext,
            XmlElement element, String prefix, URI uri) {
        setDocumentPreprocessor(new ServiceSchemePreprocessor(prefix));
    }
}

Our preprocessor has a constructor with one argument, the prefix declared for the namespace handler. We'll need this to identify our new attribute:
public class ServiceSchemePreprocessor implements DocumentPreprocessor {
    private final QualifiedName schemeType;

    public ServiceSchemePreprocessor(String prefix) {
        schemeType = new QualifiedName(prefix, "scheme-type");
    }

First we've defined some member variables to allow us to record the cache scheme hierarchy as we process it:
    private Map<String, String> schemeParentMap;
    private Map<String, String> schemeTypeMap;
    private Set<String> definedServiceNames;

- schemeParentMap is a map of schemes that define a parent using the scheme-ref element. The key is the child (referring) scheme and the value is the parent (referenced) scheme.
- schemeTypeMap records our scheme type (cache, service, abstract-service) for each scheme.
- definedServiceNames is the set of service names defined so far, including any implicit definition of the default service by defining an unnamed scheme of type service.

The interface has a single method we need to implement:


    private static final String ALREADYPROCESSEDCOOKIE =
            ServiceSchemePreprocessor.class.getName();

    @Override
    public boolean preprocess(ProcessingContext context,
            XmlElement xmlelement) throws ConfigurationException {
        if (Boolean.TRUE.equals(context.getCookie(Boolean.class, ALREADYPROCESSEDCOOKIE))) {
            return false;
        }
        schemeParentMap = new HashMap<>();
        schemeTypeMap = new HashMap<>();
        definedServiceNames = new HashSet<>();
        XmlElement schemes = xmlelement.getElement("caching-schemes");
        for (XmlElement schemeElement : (List<XmlElement>) schemes.getElementList()) {
            if (schemeElement.getName().equals("distributed-scheme")) {
                validateSchemeElement(schemeElement);
            }
        }
        for (Map.Entry<String, String> entry : schemeParentMap.entrySet()) {
            String parent = entry.getValue();
            String child = entry.getKey();
            validateHierarchy(parent, child);
        }
        context.addCookie(Boolean.class, ALREADYPROCESSEDCOOKIE, Boolean.TRUE);
        return false;
    }

We must iterate over the caching-schemes element looking for instances of distributed-scheme, validating that they have a valid scheme-type and that their elements conform to the rules. This is a simplistic approach, as we haven't considered distributed-scheme elements within a near-scheme or overflow-scheme element [3]. The validateSchemeElement method also populates the member variables defined above so that we can validate the hierarchy.
Coherence will call this preprocess method more than once. Whilst the precise rationale for this is not documented, it appears that one or more iterations occur of expanding scheme-ref elements, merging schemes with their parents. Our validation must occur only on the first iteration, with the unprocessed configuration, so we set a cookie in the context at the end of processing, and check it at the start.
We define two collections of element names, one to list those elements that we
consider valid only in cache schemes, and those that are valid in all schemes.
By implication, anything else is valid only in service schemes:
    private static final Set<String> CACHE_SCHEME_ELEMENTS =
            new HashSet<String>(Arrays.asList(new String[] {
                    "backing-map-scheme", "listener" }));
    private static final Set<String> ALL_SCHEME_ELEMENTS =
            new HashSet<String>(Arrays.asList(new String[] {
                    "scheme-name", "scheme-ref" }));

[3] Another exercise for you, dear reader.

In the validateSchemeElement method, we check the type and content of each element, and also verify that there are no duplicate service names:
    private void validateSchemeElement(XmlElement xmlelement) {
        String schemeTypeName = schemeType.getName();
        XmlValue attribute = xmlelement.getAttribute(schemeTypeName);
        if (attribute == null) {
            raiseError("no scheme-type attribute", xmlelement);
        }
        String type = attribute.getString();
        switch (type) {
        case "cache":
            for (XmlElement subelement : (List<XmlElement>) xmlelement.getElementList()) {
                if (!CACHE_SCHEME_ELEMENTS.contains(subelement.getName())
                        && !ALL_SCHEME_ELEMENTS.contains(subelement.getName())) {
                    raiseError("cache scheme contains invalid element "
                            + subelement, xmlelement);
                }
            }
            break;
        case "service":
            String serviceName = getChildElementValue(xmlelement, "service-name");
            if (serviceName == null) {
                serviceName = "DefaultDistributedService";
            }
            if (definedServiceNames.contains(serviceName)) {
                raiseError("duplicate service name " + serviceName, xmlelement);
            }
            definedServiceNames.add(serviceName);
            for (XmlElement subelement : (List<XmlElement>) xmlelement.getElementList()) {
                if (CACHE_SCHEME_ELEMENTS.contains(subelement.getName())) {
                    raiseError("service scheme contains invalid element "
                            + subelement, xmlelement);
                }
            }
            break;
        case "abstract-service":
            for (XmlElement subelement : (List<XmlElement>) xmlelement.getElementList()) {
                String elementName = subelement.getName();
                if (elementName.equals("service-name")
                        || CACHE_SCHEME_ELEMENTS.contains(elementName)) {
                    raiseError("service scheme contains invalid element "
                            + subelement, xmlelement);
                }
            }
            break;
        default:
            raiseError("invalid scheme-type " + type, xmlelement);
        }
        String schemeName = getChildElementValue(xmlelement, "scheme-name");
        schemeTypeMap.put(schemeName, type);
        String schemeRef = getChildElementValue(xmlelement, "scheme-ref");
        if (schemeRef != null) {
            schemeParentMap.put(schemeName, schemeRef);
        }
    }


We also have a couple of utility methods to reduce boilerplate: extracting the value from an element if it is present, and raising an error with a bit of context to help us locate the cause if we do have a validation failure.
    private String getChildElementValue(XmlElement element, String childElementName) {
        String result = null;
        XmlElement childElement = element.getElement(childElementName);
        if (childElement != null) {
            result = childElement.getString();
        }
        return result;
    }

    public void raiseError(String problem, XmlElement xmlelement) {
        String context = xmlelement.toString();
        XmlElement schemeElement = xmlelement.getElement("scheme-name");
        if (schemeElement != null) {
            context = "scheme " + schemeElement.getString();
        }
        throw new ConfigurationException(problem + " in " + context, "");
    }

Finally, the method that checks the hierarchy:


    private void validateHierarchy(String parent, String child) {
        String parentType = schemeTypeMap.get(parent);
        if (parentType == null) {
            throw new ConfigurationException("scheme " + child +
                    " references non-existent scheme " + parent, "");
        }
        String childType = schemeTypeMap.get(child);
        switch (childType) {
        case "cache":
            if (parentType.equals("abstract-service")) {
                throw new ConfigurationException("cache scheme " + child +
                        " references abstract service scheme " + parent,
                        "scheme type \"cache\" may reference only schemes " +
                        "of scheme type \"cache\" or \"service\"");
            }
            break;
        case "service":
        case "abstract-service":
            if (!parentType.equals("abstract-service")) {
                throw new ConfigurationException("cache scheme " + child +
                        " references scheme " + parent + " of scheme type \"" +
                        childType + "\"",
                        "scheme type \"" + childType +
                        "\" may reference only schemes of scheme type \"abstract-service\"");
            }
        }
    }

8.5 NUMA

Objective
Show how to maximise CPU/memory performance in a multi-CPU/NUMA
architecture by pinning each JVM to a single CPU


Dependencies
A multi-cpu system with numactl installed.
Modern multi-CPU, multi-core systems have several layers of memory: on-chip cache per core, cache shared per CPU, and main memory accessible from all CPUs. Main memory may be divided into regions owned per CPU; it is more expensive for CPU A to access memory connected to CPU B than its own attached memory. This constitutes a NUMA (non-uniform memory access) architecture. By default, the kernel may schedule separate threads of a single
process to execute on different processors, with its memory image distributed
across the memory attached to different processors.
Figure 8.3: A CPU's access to its own memory region is faster than to another CPU's region
The Oracle Hotspot JVM offers a command line argument, -XX:+UseNUMA: with this flag, an object created by one thread will be allocated in heap memory local to the CPU on which that thread is running. Though this is of some benefit for short-lived objects referenced only by that thread, we may still have access from other CPUs for cached objects and objects communicated between threads, especially in the network layers. UseNUMA also allows some optimisation of parallel GC, so that GC threads collect from memory attached to the processor they run on. Be aware that older Linux kernels have a bug that may cause a crash with this flag enabled; see http://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html#numa
The Linux command numactl can be used to bind a JVM to a single CPU.
$ numactl --cpubind=0 --membind=0 "$JAVA_HOME/bin/java" java-args ...

will ensure that all threads of the java process will be bound to CPU 0, and
that memory allocations will be made from the memory connected to CPU
0.
Use the --hardware option to see what CPUs are available on a host and how
the memory is divided between them.
$ numactl --hardware
available: 2 nodes (0-1)
node 0 size: 96936 MB
node 0 free: 85767 MB
node 1 size: 96960 MB
node 1 free: 83328 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

To maintain a balanced cluster, you should spread the members on each host across all the CPUs; ideally the number of members should be a multiple of the number of CPUs. The performance gain is sensitive to too many
of the number of CPUs. The performance gain is sensitive to too many
variables to give any simple figure, but one of our colleagues reported a 43%
improvement in cluster throughput on one test scenario. 10% to 25% was
more typical.
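For example, on a two-CPU host running four storage members, a launch script might alternate the binding - a sketch only, with JVM arguments elided:

$ numactl --cpubind=0 --membind=0 "$JAVA_HOME/bin/java" ... com.tangosol.net.DefaultCacheServer &
$ numactl --cpubind=0 --membind=0 "$JAVA_HOME/bin/java" ... com.tangosol.net.DefaultCacheServer &
$ numactl --cpubind=1 --membind=1 "$JAVA_HOME/bin/java" ... com.tangosol.net.DefaultCacheServer &
$ numactl --cpubind=1 --membind=1 "$JAVA_HOME/bin/java" ... com.tangosol.net.DefaultCacheServer &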

8.6 Eliminate Swapping

Objective
Describe a technique for preventing cluster members from being swapped
out
Code examples
The code for this example is in the org.cohbook.configuration.memlock package in the configuration project
Dependencies
We use the JNA (Java Native Access) library for accessing native functions. This technique has been tested on several modern Linux variants; it may be applicable to UNIX variants, but we haven't tested it.


Swapping is a disaster for a Coherence cluster, even worse than long garbage collection cycles. Members that are partly or completely swapped out stand a very good chance of being ejected from the cluster, but will still be alive
and will attempt to rejoin. When all the members on one or more hosts are
being swapped in and out, the chances of a cluster performing any useful
work are slim, and the likelihood of partition loss is correspondingly great.
So, what precautions can we take to reduce or eliminate the likelihood of
swapping?

8.6.1 Mitigation Strategies

Swappiness
The kernel parameter swappiness controls the extent to which the kernel will
prefer to use memory for page cache rather than for process memory images.
Setting it to zero will mean the kernel will never swap a process image page
out in preference to a cache page. If you do nothing else, do this. It will
not guarantee that your processes are never swapped, but it will make it less
likely - swapping will only occur if the total size of running processes exceeds
available memory.
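On most Linux distributions the parameter can be applied immediately with sysctl and persisted across reboots in /etc/sysctl.conf:

$ sysctl -w vm.swappiness=0
# and, to survive a reboot, add this line to /etc/sysctl.conf:
vm.swappiness = 0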

Measure your memory use


A running JVM's memory size is larger than the configured heap size, possibly by a large margin. Always fix the size of the various heap regions,
then perform extensive load testing while monitoring the size of your JVMs.
Use of off-heap storage, e.g. with nio, may lead to considerable variation
in memory use. To be sure that you will not exceed available memory, you
need to know how much you will use.

Schmooze the Sysadmins


Especially in large organisations, sysadmins can be a law unto themselves, leaving you little control over your own application's hardware. One cluster we've seen would regularly fall over at 6pm on a Friday evening. Eventually we traced this to a backup utility that would look at the size of physical memory on the machine to decide how much to allocate for itself. Talk to your sysadmin team, be nice to them, buy them beer and get them on your
side - it may save you much pain.
Leave enough headroom
Once you know how much memory your cluster members will use, you need
to allow enough additional memory for operating system overhead and any
other administrative process (such as backup utilities) that may run while
your cluster is up - consider the worst case scenario. Talk to the sysadmins
about what they may run - backups, system updates, monitoring tools, etc.
Ultimately your calculation of headroom requirements must depend on your
confidence in the accuracy of these measurements and estimates, and your
appetite for the risk of cluster failure. If you do not implement the approach
outlined in the next section, you should be very generous with your overhead
allowance.

8.6.2 Prevention Strategy

The POSIX standard defines functions, mlockall and munlockall, to lock/unlock the address space of a process - see http://www.unix.com/man-page/POSIX/3posix/mlockall/ or man mlockall on your UNIX/Linux system.
This simple class provides a static method that will lock all current and future memory allocations for the process in physical memory:
import com.sun.jna.Library;
import com.sun.jna.Native;

import org.apache.commons.lang3.SystemUtils;

public class MemLock {
    public static final int MCL_CURRENT = 1;
    public static final int MCL_FUTURE = 2;

    private interface CLibrary extends Library {
        int mlockall(int flags);
    }

    public synchronized static void mlockall() {
        if (!SystemUtils.IS_OS_LINUX) {
            // Allow development and testing on non-Linux platforms.
            return;
        }
        CLibrary instance = (CLibrary) Native.loadLibrary("c", CLibrary.class);
        int errno = instance.mlockall(MCL_CURRENT | MCL_FUTURE);
        if (errno != 0) {
            throw new RuntimeException("mlockall failed with errno = " + errno);
        }
    }
}

I've chosen to make failure to lock an exception; I'd let the process fail rather than run without locked memory. You may choose to log a warning or return the failure to the caller. Refer to the mlockall man page for a full explanation
of possible reasons for failure. This page warrants careful reading but there
are a few key points to consider.
Call the method as soon as possible in your program, ideally at the beginning of your main method. The SystemUtils.IS_OS_LINUX guard check will
allow you to develop and test on Windows but deploy to Linux. For other
environments, you'll need to test out what works.
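A sketch of how this might look in a storage node's main class (the wrapper class is illustrative, not part of the example project):

public class LockedStorageNode {
    public static void main(String[] args) {
        // Lock the process image before any significant allocation has happened.
        MemLock.mlockall();
        // Then start Coherence as usual.
        com.tangosol.net.DefaultCacheServer.main(args);
    }
}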
There is a per-process limit to the amount of memory that can be locked.
This is set per user in /etc/limits.conf using the memlock key. You should
set this to be larger than the maximum image size (not the heap size) of a
running cluster member.
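For example, to let an account named coherence (a user name we've invented for illustration) lock up to 8 GB, the entries might look like this - the memlock value is given in kilobytes:

coherence    soft    memlock    8388608
coherence    hard    memlock    8388608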
You can verify that memory is locked by looking at /proc/<pid>/status where
<pid> is the process id of the running JVM
Name:    java
State:   S (sleeping)
...
VmPeak:  3960000 kB
VmSize:  3959996 kB
VmLck:   3959996 kB
VmHWM:   3958052 kB
VmRSS:   3958048 kB
VmData:  3945308 kB
VmStk:        40 kB
VmExe:        40 kB
VmLib:     14136 kB
VmPTE:      7752 kB
Threads: 47
...

shows a process with current vm image size (VmSize) of 3959996 kB, all of it
locked in memory (VmLck)
Caution is advised: the configured limit is per-process; there is no limit on the number of processes that lock memory. Running too many, too-large
cluster members may leave so little headroom that the system may start
thrashing swap or giving out of memory errors on the unlocked processes.
However, this may be preferable to the cluster members themselves being
swapped out.
If the memlock limit is set too small, processes may fail with out of memory
errors as allocations fail because they cannot be locked. Again, you may
consider that a hard failure of a member trying to use too much memory
(perhaps nio buffers) is preferable to the gradual collapse of the cluster caused
by the onset of swapping.
Finally, heavy paging of other processes may still have an adverse effect on
the cluster as available CPU and i/o capacity is consumed, leaving so little for the cluster that members do not respond quickly enough. This may be mitigated by setting a higher priority for the cluster members [4].
mlockall should be considered another useful tool in making memory use somewhat more deterministic. While it may allow you to more confidently limit the OS headroom, it is not a way of managing with less memory than you really need.

[4] Or using realtime round-robin scheduling with the sched_setscheduler system call. If you try this and it works, let me know.

Appendix A

Dependencies
This is the list of dependencies we used in preparing this book, with Maven co-ordinates and URLs for further information. Note that the versions were correct at the time of writing. If you later download the code samples, it's possible that we'll have updated them with a later version.
Name                       Group                  Artefact            Version
Littlegrid                 org.littlegrid         littlegrid          2.14
Spring framework           springframework        spring-core         1.2.6
Apache commons lang        org.apache.commons     commons-lang3       3.1
Apache commons io          org.apache.commons     commons-io          1.3.2
Google protocol buffers    com.google.protobuf    protobuf-java       2.5.0
SLF4J                      org.slf4j              slf4j-api           1.7.2
                           org.slf4j              log4j-over-slf4j    1.7.2
                           org.slf4j              jcl-over-slf4j      1.7.2
Logback                    ch.qos.logback         logback-classic     1.0.10
JUnit                      junit                  junit               4.11
JMock                      org.jmock              jmock               2.6.0

The version of the Google Protocol Buffers Maven artefact must match the version number of the Google Protocol Buffers compiler, protoc, installed on your system. Check with the command protoc --version.


Appendix B

Additional Resources
Here are a few selected sources of further information from people who have
been working with Coherence for far longer than I have, some of them formerly of Tangosol, now Oracle.
Where                                     What
www.shadowmist.co.uk                      This book, the code, some older presentations, possibly some addenda and additional material in time
www.cohbook.org                           Direct link to the Coherence cookbook section of the above
bookshops                                 Oracle Coherence 3.5 by Aleksander Seovic
blog.ragozin.info                         Alexey Ragozin's blog
thegridman.com                            JK's blog
littlegrid.org                            Jon Hall's littlegrid
www.benstopford.com                       Ben Stopford's blog
blogs.oracle.com/felcey                   Dave Felcey's blog
brianoliver.wordpress.com                 Brian Oliver's blog
wibble.atlassian.net/wiki/display/COH     Andrew Wilson's space


Index
/etc/limits.conf, 275
AbstractAggregator, 132, 133
AbstractEnumOrdinalCodec, 53
AbstractExtractor, 50, 96
extract, 110
extractFromEntry, 96, 110
KEY, 100
AbstractNameSpaceHandler, 265
AbstractVoidProcessor, 130, 131
addIndex
QueryMap, 115
aggregate
InvocableMap, 131, 183
ParallelAwareAggregator, 132, 179
aggregateResults
ParallelAwareAggregator, 132
AllFilter, 97
AndFilter, 97
AnyFilter, 99
applyIndex
IndexAwareFilter, 115, 119
ArbitrarilyFailingEntryProcessor, 141, 162
processAll, 141
AsynchronousProcessor, 146, 151, 152
get, 152
onException, 151
authorized-hosts, 262
AutowireCapableBeanFactory, 20
Autowired, 2123

backing-map-scheme, 255, 266


BackingMapBinaryEntry, 175
BackingMapContext, 121, 170, 171
getBackingMapEntry, 171, 176
getBinaryEntry, 179
getKeyToInternalConverter, 171
getReadOnlyEntry, 179
BackingMapManagerContext, 222
getBackingMapContext, 171
backup-count, 251, 252, 260
Base64InputStream, 226
Base64OutputStream, 226
BeanLocator, 17, 25
Binary, 46, 68, 117, 121, 123, 126
calculateNaturalPartition, 217
BinaryEntry, 46, 50, 82, 83, 110, 127, 135, 141, 171, 175, 222, 224
isPresent, 168
BinaryEntryAdapterFilter, 182
BinaryEntryStore, 209, 222224, 226
BinaryMemoryCalculator, 37, 38
BinaryStore, 196, 214
BufferInput, 72
BufferOutput, 72
cache-mapping, 249, 255
cache-name, 250
CacheConfigNameSpaceHandler, 266
CacheLoader, 187, 188, 209, 221
load, 195
loadAll, 195
CachePersistor, 236
CachePrimeInvocable, 218
CacheStore, 187189, 191, 192, 198, 200, 202, 205, 208210, 214, 221224,
228, 234, 235
erase, 191
store, 192, 196, 197
storeAll, 190, 192, 212
cachestore-scheme, 189
CacheStoreFactory, 204, 206
CacheStoreSwitcher, 200

caching-schemes, 252, 254, 268
calculateEffectiveness
IndexAwareFilter, 118
calculateNaturalPartition
Binary, 217
ClassLoader, 6466, 68, 199
clear
NamedCache, 232
Cluster
getLocalMember, 241
isRunning, 244
Codec, 52
CodedInputStream, 81
coherence-cache-config.xml, 249
CollectionElementExtractor, 110, 112
COMMITTED, 234, 235, 237
COMMITTING, 234, 237
ConditionalExtractor, 100, 115, 121
ConditionalIndex, 115
ConfigurableCacheFactory, 17
ensureCache, 20
startAndMonitor, 6, 7
ControllableCacheStore, 200, 202205
Converter, 121
createIndex
IndexAwareExtractor, 121, 124
custom-mbeans.xml, 30
DataAccessException, 211
DB_CLOSE_DELAY, 198
deadlock, 9, 167, 168, 170, 176, 177, 185, 240, 263
DefaultCacheFactory, 16
DefaultCacheFactoryBuilder, 13
DefaultCacheServer, 7
main, 263
start, 16, 17
startAndMonitor, 19, 264
startDaemon, 264
DefaultClusterMember, 28
DefaultConfigurableCacheFactory, 17


delete
MapIndex, 123
DescriptiveStatistics, 121, 123, 125
DeserialisationAggregator, 86, 88, 91
DeserialisationCheckingPofContext, 86, 88, 103
DeserialiseAutowire, 22
DeserializationAccelerator, 104
destroyIndex
IndexAwareExtractor, 121, 124
distributed-scheme, 250, 253, 266268
DocumentPreprocessor, 265, 267
DynamicAutowire, 23, 204
ENDANGERED, 229
ensureCache, 20
ConfigurableCacheFactory, 20
Entry
InvocableMap, 104
Map, 104
EntryAggregator, 99, 104, 127, 128, 262
EntryEvent, 228
EntryExtractor, 81, 82, 103, 104, 125, 184
extractFromEntry, 81
EntryFilter, 45
evaluateEntry, 119
EntryProcessor, 104, 116, 127, 128, 130, 138, 139, 141, 142, 150, 152155,
168, 209, 262
process, 138, 148
processAll, 128
EntryProcessorEvent, 228
entrySet
Map, 96
QueryMap, 96
EntrySizeExtractor, 38
EnumPofSerializer, 47, 52
EqNullFilter, 90
EqualsBuilder, 42
EqualsFilter, 98
erase
CacheStore, 191

evaluate
Filter, 119
evaluateEntry
EntryFilter, 119
Event
nextInterceptor, 240
EventInterceptor, 16, 235
eviction, 188
Evolvable, 36, 55, 60
getImplVersion, 60
exit-cluster
service-failure-policy, 262, 263
exit-process
service-failure-policy, 262, 263
expiry, 188
ExternalizableHelper, 72, 190
fromBinary, 41
toBinary, 41
extract
AbstractExtractor, 110
PofExtractor, 108
extractFromEntry
AbstractExtractor, 96, 110
EntryExtractor, 81
InvocableMapHelper, 104
PofExtractor, 110
FieldUtils, 117
Filter, 155
evaluate, 119
fromBinary
ExternalizableHelper, 41
get
AsynchronousProcessor, 152
MapIndex, 123
SimpleMapIndex, 116
getAssociatedKey
KeyAssociation, 217
KeyAssociator, 217

getBackingMapContext
BackingMapManagerContext, 171
getBackingMapEntry
BackingMapContext, 171, 176
getBinaryEntry
BackingMapContext, 179
getChild
PofCollection, 110
getImplVersion
Evolvable, 60
getIndexContents
SimpleMapIndex, 116
getKeyPartition
KeyAssignmentStrategy, 217
getKeyToInternalConverter
BackingMapContext, 171
getLocalMember
Cluster, 241
getOldestMember
ServiceInfo, 241
getOwnedPartitions
PartitionedService, 150
getParallelAggregator
ParallelAwareAggregator, 133
getPartitionId
PartitionAwareKey, 217
getReadOnlyEntry
BackingMapContext, 179
Google Protocol Buffers, 70
GroupAggregator, 99
GuardContext, 130
guardian-timeout, 262
GuardSupport, 130
high-units, 251, 252, 260, 261
IdentityExtractor, 95
IndexAwareExtractor, 104, 115, 120, 121, 124
createIndex, 121, 124
destroyIndex, 121, 124
IndexAwareFilter, 115–117, 120, 121, 125
applyIndex, 115, 119
calculateEffectiveness, 118
InFilter, 98
init
Invocable, 216, 220
InputStream, 72
insert
MapIndex, 123
installation, 32
Interceptor, 237
interrupt
Thread, 263
invalidation strategy, 232, 233
Invocable, 22, 88, 127, 154, 155, 158–161, 166, 167, 200, 215, 216, 220–222
init, 216, 220
run, 221
InvocableMap, 100
aggregate, 131, 183
Entry, 104
invoke, 138
invokeAll, 150
InvocableMapHelper
extractFromEntry, 104
query, 174176, 182
InvocableObserver, 166, 167
invocationCompleted, 155
memberCompleted, 155
invocationCompleted
InvocableObserver, 155
InvocationObserver, 155
InvocationResult, 155
InvocationService, 152, 216
invoke
InvocableMap, 138
invokeAll
InvocableMap, 150
IsNullFilter, 88, 90
isPresent
BinaryEntry, 168
isRunning
Cluster, 244
JdbcTemplate, 195
JNA, 272
KEY
AbstractExtractor, 100
KeyAssignmentStrategy
getKeyPartition, 217
KeyAssociatedFilter, 98, 99
KeyAssociation, 170
getAssociatedKey, 217
KeyAssociator, 118, 121, 126, 170, 218
getAssociatedKey, 217
KeyPartitioningStrategy, 139, 141, 170, 218
LifecycleEvent, 16, 228, 243
listener, 255, 266
load
CacheLoader, 195
loadAll
CacheLoader, 195
logging
service-failure-policy, 262, 263
LostPartitionListener, 232
LostPartitionListenerMBean, 229
m_ctx
SimpleMapIndex, 117
MACHINE-SAFE, 229
main
DefaultCacheServer, 263
Map
Entry, 104
entrySet, 96
MapIndex, 115, 120
delete, 123
get, 123
insert, 123
update, 123
MapListener, 227, 233
maven, 32, 34
MBeanExporter, 30, 31
Member, 154, 155
memberCompleted
InvocableObserver, 155
MemberListener, 227, 241
memlock, 275
Method, 66
mlockall, 274–276
MultiExtractor, 101
munlockall, 274
NamedCache
clear, 232
putAll, 215
NamedParameterJdbcTemplate, 196
NameSpaceHandler, 265, 267
near cache, 232, 233
near-scheme, 268
nextInterceptor
Event, 240
NODE-SAFE, 229
NonTransientDataAccessException, 211, 214
NUMA, 270
numactl, 272
onException
AsynchronousProcessor, 151
operation bundling, 189
OutputStream, 72
overflow-scheme, 261, 268
ParallelAwareAggregator, 128, 132, 133
aggregate, 132, 179
aggregateResults, 132
getParallelAggregator, 133
parseDelimitedFrom, 72
parseFrom, 72
partition, 139
partition-local transaction, 170, 189, 194, 233
PartitionAwareKey
getPartitionId, 217
partitioned backing map, 188
PartitionedFilter, 99, 149, 155
PartitionedService, 118, 126, 141, 154
getOwnedPartitions, 150
PartitionEntryProcessorInvoker, 155, 158, 160, 164
PartitionListener, 227–229
PartitionSet, 148150, 155
patch, 32, 34
POF, 35, 36
PofCollection, 110
getChild, 110
PofCollectionElementExtractor, 112, 120
PofConstants
V_REFERENCE_NULL, 91
PofContext, 46, 65, 66, 82, 117–119
PofExtractor, 37, 79, 90, 104, 110, 113, 126, 133, 135
extract, 108
extractFromEntry, 110
PofNavigator, 110, 119
PofSerializer, 52, 82, 97
PofTypeIdFilter, 50, 91
PofUpdater, 79, 113
PofValue, 46, 110
PofValueParser, 46, 50
PortableObject, 52, 61, 82
PortableObjectSerializer, 61
PortableProperty, 43, 51, 53
process
EntryProcessor, 138, 148
processAll
ArbitrarilyFailingEntryProcessor, 141
EntryProcessor, 128
PropertyAccessor, 27
protobuf-maven-plugin, 70, 71
ProtobufClusterTest, 78
ProtobufExtractor, 81, 82
ProtobufSerialiser, 72, 78
putAll
NamedCache, 215
Qualifier, 23
query
InvocableMapHelper, 174176, 182
QueryMap, 100
addIndex, 115
entrySet, 96
removeIndex, 115
read-ahead, 188, 190
read-through, 188, 189
read-write-backing-map-scheme, 209
RecoverableDataAccessException, 211
ReducerAggregator, 100, 101, 103, 108, 112, 184
ReflectionExtractor, 97, 104, 108, 110
RegistrationBehaviour, 16
remove
Set, 96
removeIndex
QueryMap, 115
replicated cache, 185
Resource, 23
rollback-cachestore-failures, 209
rolling restart, 55
run
Invocable, 221
RuntimeException, 211
sched_setscheduler, 276
scheme-ref, 254, 266, 267
scheme-type, 266
senior member, 241
Serialisation2WayTestHelper, 62, 68
SerialisationTestHelper, 41, 42, 58, 60, 74
SerialiserCodec, 52
SerialiserTestSupport, 65, 66
Serializer, 65, 79
Service, 241
service-failure-policy, 262
exit-cluster, 262, 263
exit-process, 262, 263
logging, 262, 263
service-name, 254, 266
service-scheme, 266
ServiceInfo, 241
getOldestMember, 241
ServiceListener, 227
ServiceMonitor, 264
Set
remove, 96
SimpleMapIndex, 116, 117, 119, 120
get, 116
getIndexContents, 116
m_ctx, 117
SimpleMemoryCalculator, 37
SimpleTypeIdFilter, 46, 50, 106
SingletonService, 243
SmartLifecycle, 14, 207
SpringAwareCacheFactory, 9, 29
SpringCoherenceJMXExporter, 30
SpringSerializer, 21, 23, 25
start
DefaultCacheServer, 16, 17
startAndMonitor
ConfigurableCacheFactory, 6, 7
DefaultCacheServer, 19, 264
startDaemon
DefaultCacheServer, 264
StorageDisabledMain, 16
StorageEnabledMain, 16
store
CacheStore, 192, 196, 197
storeAll
CacheStore, 190, 192, 212
storeFailures, 210
swappiness, 273
SynchPartitionEntryProcessorInvoker, 161
tangosol.coherence.clusteraddress, 264
tangosol.coherence.clusterport, 264
testPofNullFilter, 90
TestProtobufSerialiser, 78
Thread
interrupt, 263
thread-count, 139, 145, 252, 253, 257
toBinary
ExternalizableHelper, 41
TransactionEvent, 228, 234, 235
transactions, 193
TransferEvent, 227, 228
TransientDataAccessException, 211
type-id, 45, 50, 91
TypeEqualsFilter, 51
UncategorizedDataAccessException, 214
uniform collection, 36, 54
UniqueValueFilter, 116, 117, 119, 126
unit-calculator, 38
unit-factor, 261
UnsupportedOperationException, 110, 125
update
MapIndex, 123
URLClassLoader, 70
UseNUMA, 271
V_REFERENCE_NULL
PofConstants, 91
ValueExtractor, 95–97, 100–103, 107, 108, 110, 115–118, 120, 121, 124
WireFormat, 79
worker thread, 139, 143, 152, 188, 234
write-behind, 188, 190, 196
write-requeue-threshold, 192, 210, 211
write-through, 188, 189
writeDelimitedTo, 72
writeTo, 72