Building data warehouses using open source technologies
Michel Jansen
1 Table of Contents
1 Table of Contents
2 Introduction
  2.1 Intended audience
  2.2 Why build a data warehouse
3 Building a data warehouse
  3.1 Designing a dimensional data warehouse
    3.1.1 Asking questions
    3.1.2 Modeling structures
    3.1.3 Picking a fact grain
    3.1.4 Adding dimensions
  3.2 Constructing the data warehouse
    3.2.1 Designing transformations using Spoon
      Some preparations
      Updating type 1 dimensions
      Updating type 2 dimensions and the fact table
      Aggregating the data
    3.2.2 Putting it all together
4 Using the data warehouse
  4.1 Preparing for online analytical processing
    4.1.1 From relational to dimensional
    4.1.2 Doing it in Mondrian
  4.2 Asking multidimensional queries
  4.3 Visualization and presentation
5 References
6 History of this document
  1st of September 2006 Initial release
Appendix A: Technology overview
  Kettle
  Mondrian
  JPivot
Appendix B: Generating a date dimension using JavaScript
  JavaScript code
Appendix C: Example Mondrian XML schema
License
2 Introduction
This article is about building data warehouses. A data warehouse is a computer database that collects, integrates and stores an organization's data with the aim of producing accurate and timely management information and supporting data analysis [1]. The article explains the importance of a good data warehouse and covers the process of building such a specialized database using open source technologies.
2.1 Intended audience
2.2 Why build a data warehouse
There are many reasons to justify building a data warehouse, but almost all of them boil down to the same basic wish: provide means for analysis of data to support management decisions. You are probably already providing management with data like site usage statistics, referrer trends and the number of registered users of their information system. This is basic information, which can be retrieved directly from the information system itself. However, this has some important drawbacks.

First of all, since all of this data is retrieved directly from the information system, it places a considerable burden on this system. Analysis often requires huge amounts of data to be processed, which is often a problem if, for instance, the system's database tables get locked for retrieval. A data warehouse can be completely detached from the information system, even running on a different system.

Secondly, an OLTP [2] system's data model is rarely optimized for analysis. We all learn to develop systems to use a normalized database modeled after our entity relationships. This is a good thing for the information system, since the underlying model is a close reflection of the system itself, but it makes querying for large sums of aggregated data a costly operation. Furthermore, redundancy is rarely part of this database design, because redundancy is hard to maintain, often causing data inconsistencies or worse. For data analysis, redundancy can be great, because it speeds up the process.

Moreover, data in an OLTP system might change over time. A customer might move to another country, leaving you, the data analysis provider, with an impossible choice: either you update the

[1] http://en.wikipedia.org/wiki/Data_warehouse
[2] OnLine Transaction Processing
3 Building a data warehouse
3.1 Designing a dimensional data warehouse
3.1.1 Asking questions
The most important step in building a data warehouse is designing it. You have to ask yourself: what does the management want to know? First, you'll have to figure out which questions need to be answered by analysis of the data warehouse to.be [3]. For example, is there a correlation between users of a web-based system and the pages it provides? Do certain groups of users visit other pages than others? Where do the visitors of different pages come from? My belief is that at least 80% of all management questions about the data generated by an information system can be answered by a decent data warehouse.

3.1.2 Modeling structures

Once you get a good view on what questions you'll want to ask the data warehouse later on, you have to determine the different data arrangements that come with these questions. For example, a question about sales will have to operate on a different structure than one about employment. These structures are called cubes in the world of OLAP [4], because they are essentially an extension to the two-dimensional array of data that is stored in a conventional (for example SQL) database. Depending on the data, a cube may have more than two or even three dimensions. We'll model these cubes in a relational database as a star schema, containing a single fact table linking together multiple dimension tables.

Illustration 1: Request cube containing only a time dimension

3.1.3 Picking a fact grain

The art in making up this structure is finding a good basis for a fact table. The more aggregated the

[3] This is not a typo
[4] OnLine Analytical Processing
3.1.4 Adding dimensions

Now that we have a central starting point for our data cube, it is time to decorate the basic facts by adding new dimensions, containing aspects that can be attributed to the facts. If you can't think of any dimension that is valid for every value in your fact table, then you've probably chosen the wrong grain to begin with. The most trivial of all dimensions is that of time. Every request takes place at a certain point in time, so we create a dimension table time and have every request entry in the fact table link to the entry in this table corresponding to the time it took place. If two requests took place at the same moment, they will link to the same entry in the time dimension's table.

Illustration 2: A cube modelled as an OLAP star in a relational database

Another dimension we want to add is that of pages. For each distinct page associated with a request in the fact table, an entry will exist in the page dimension table. The same goes for referrers.

Finally, we'll want to link requests to the users that made them. Every request in the fact table was made either by a known user, or by no user at all, which will be a special entry in the user dimension's table.

As we create and link dimensions, an important rule is to never use the existing keys from the online system, as we have no control over how they might change or even disappear. Instead, every dimension will get its own surrogate key, called the technical key, which is unique to the data warehouse. This also goes for the fact table.
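The linking rules above can be illustrated with a small in-memory sketch. It is not how Kettle stores anything; the helper names (createDimension, lookupOrInsert, toFactRow), the request fields and the choice of 0 as the key of the special "no user" entry are all assumptions made for illustration. Only the key names time_id, page_id and user_id follow the rest of this article.

```javascript
// Hypothetical in-memory dimension: rows get a warehouse-owned technical key,
// and the key from the online system is only kept to find that row again.
function createDimension() {
  return {
    nextKey: 1,
    byNaturalKey: new Map(),
    // Assumption: technical key 0 is reserved for the "unknown / no user" entry.
    rows: [{ id: 0, naturalKey: null }]
  };
}

// Return the technical key for a natural key, adding a row on first sight.
function lookupOrInsert(dim, naturalKey, attributes) {
  if (naturalKey === null || naturalKey === undefined) {
    return 0; // falls back to the special entry
  }
  if (!dim.byNaturalKey.has(naturalKey)) {
    const id = dim.nextKey++;
    dim.byNaturalKey.set(naturalKey, id);
    dim.rows.push(Object.assign({ id: id, naturalKey: naturalKey }, attributes));
  }
  return dim.byNaturalKey.get(naturalKey);
}

const timeDim = createDimension();
const pageDim = createDimension();
const userDim = createDimension();

// One fact row per request, holding technical keys only.
function toFactRow(request) {
  return {
    time_id: lookupOrInsert(timeDim, request.timestamp, {}),
    page_id: lookupOrInsert(pageDim, request.url, {}),
    user_id: lookupOrInsert(userDim, request.email, {})
  };
}
```

Two requests with the same timestamp end up with the same time_id, and a request without a known user points at the special entry instead of carrying an empty key.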
3.2 Constructing the data warehouse
3.2.1 Designing transformations using Spoon

We start by using Spoon to make the transformations that will populate our data warehouse. In order to be able to fill the central fact table, the keys to all of the dimensions must be known. Therefore we make a distinction between two types of dimensions (not to be confused with Ralph Kimball's slowly changing dimension types):
1. Dimensions consisting of data already known to the online system.
2. Dimensions that are to be generated from the fact data and surrounding sources.

We generate or update the dimensions of type two while filling our fact table and thus know their keys at that time; we cannot do this for the dimensions of the first kind.

In our example, the time and page dimensions are of the second kind, meaning they are generated when updating our fact table. The user dimension has to be known before then, because it is based on some independent tables in the online system. Because of this, we will fill our data warehouse in two separate transformations, ensuring first the existence of our type 1 user dimension and later the other dimensions and the fact table itself.
Some preparations
Before we can begin using Spoon, we have to define the data sources. In our case, we have only one data source: the database of our online system. We need to add this database as a Connection to Spoon as described in its documentation about Database connections. We also define the connection to our target database this way.
dimension's table and it is almost always the most useful one.

In our example, we want the user dimension to be of type two, so the transformation in Spoon will look like Illustration 3. First, the data is read from the online system's user table, then the data we want to put in our dimension table is filtered using a select step and finally, it all gets inserted into our data warehouse's user_dim table. Because all of the steps are quite accurately described in Spoon's documentation, we will only explain how to configure the dimension lookup/update step for this example.

Illustration 3: The "update user dimension" transformation in Spoon

Because we create this dimension directly from a table from the online system, the key field to use in this dimension is trivial. It is the key used in the online system, which in our case is the user's email. As said before, this field will not be used as a key for our data warehouse, as we replace it by a technical key unique to the warehouse. It is merely needed to be able to retrieve this technical key later, as we will need it to link the dimension to the fact table. In our case, this technical key field is called user_id.

In the Fields tab, we specify the fields that are of importance to this dimension, such as hierarchical data. In our case, we keep track of information like the user's full name, the company they are part of and their sex.

Illustration 4: Dimension Lookup/Update dialog

In case of type II dimensions, the Version field is used to keep track of the different versions in a slowly changing dimension. The date ranges will be used to indicate the period of validity for each version in that case. If the online system keeps track of when records change, it is useful to use the Stream Datefield for better validity, but there are many more applications possible, which are not in the scope of this document. Usually, the default (now) will suffice.
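To make the roles of the lookup key, the technical key, the version and the date range more concrete, here is a minimal sketch of what a type II update does to the rows of user_dim. It is an illustration under stated assumptions, not Spoon's implementation: the function name updateDimension and the column names version, date_from and date_to are hypothetical, and the new technical key is simply the next row number.

```javascript
// Hypothetical sketch of a type II (history-keeping) dimension update.
// rows:   the current contents of the dimension table (an array of objects)
// email:  the lookup key taken from the online system
// fields: the attributes tracked for this dimension (e.g. name, company)
// now:    the moment from which a new version becomes valid
function updateDimension(rows, email, fields, now) {
  // The currently valid version is the one whose validity period is still open.
  const current = rows.find(r => r.email === email && r.date_to === null);
  if (current) {
    const unchanged = Object.keys(fields).every(k => current[k] === fields[k]);
    if (unchanged) {
      return current.user_id; // nothing changed, reuse the existing technical key
    }
    current.date_to = now; // close the old version's period of validity
  }
  const row = Object.assign({}, fields, {
    user_id: rows.length + 1, // assumption: keys are handed out sequentially
    email: email,
    version: current ? current.version + 1 : 1,
    date_from: now,
    date_to: null
  });
  rows.push(row);
  return row.user_id;
}
```

A changed attribute therefore never overwrites the old row. It produces a new row with its own technical key, so facts loaded earlier keep pointing at the version that was valid at the time.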
We start by reading the basic grain for the fact table from the request log.

Then, the date dimension is generated, which is done by using the JavaScript step on the timestamp field in the request log, as described in Appendix B: Generating a date dimension using JavaScript. Using the Dimension Lookup/Update step, we put this data into the database, and keep references to it in the form of the generated technical key time_id, which is added to the stream as an extra column.

Next up is the user dimension. Remember, we already updated that one in the previous step, so now all we have to do is link its corresponding entry to each request in the fact table. Through a complex lookup using the session key to match each request to a login action from the actions table, we are able to finally match a user to each request. If no user can be found for this session, we set its key (the email) to NULL, so it will match a special case created by Spoon's Dimension Lookup/Update step: the unknown user. Because the filtering creates separate threads, we sort the user streams afterwards using Spoon's sort step. When it reaches the Lookup user_dim step, the stream contains an extra field named email, which contains the email of the user behind each request, or NULL if it is not known. Remember we specified email as the lookup key when updating this dimension earlier, so using this field, the transformation step can find a technical key for each entry in the stream, adding it as an extra field called user_id.

All of the information needed to fill the page dimension is already in the request log's stream. While filling the dimension table with hierarchical data about this request's domain, path and page, the newly generated technical key field page_id is added to the stream.

The same is done for a referrer dimension we added for fun, which was not mentioned earlier, but is similar to the page dimension.

Finally, the data that is to go into the fact table is filtered so that only the technical keys to the dimensions and the grain's facts remain. It is then inserted into a table, request_facts in this example.

Illustration 5: Dimension and fact update transformation
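The user-matching part of this transformation can be sketched as plain functions. This is only an illustration of the data flow, not of Spoon's steps: the function names, the shapes of the request and action records, and the value 0 for the unknown user's technical key are assumptions. The time_id, page_id and referrer keys would be added to each row in the same way and are left out to keep the sketch short.

```javascript
// Assumption: the special "unknown user" entry has technical key 0.
const UNKNOWN_USER_ID = 0;

// Match a request to a user through its session: take the email of the
// login action recorded for that session, or null when nobody logged in.
function emailForRequest(request, loginActions) {
  const login = loginActions.find(a => a.session === request.session);
  return login ? login.email : null;
}

// Turn request log entries into fact rows that carry technical keys only.
function buildFacts(requestLog, loginActions, userDim) {
  return requestLog.map(request => {
    const email = emailForRequest(request, loginActions);
    const user = email === null
      ? undefined
      : userDim.find(u => u.email === email);
    return {
      request_id: request.id,
      // The email itself is dropped; only the warehouse key remains.
      user_id: user ? user.user_id : UNKNOWN_USER_ID
    };
  });
}
```

The email only travels through the stream long enough to find user_id; it never reaches the fact table.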
3.2.2 Putting it all together

Now that we've defined the transformations that fill our data warehouse, it's time to prove that one plus one equals three by using Kettle's Chef and Kitchen to combine the transformations into one job, adding logging and debugging functionality in the process.

The tool Chef doesn't require much explanation, as its interface is almost equal to that of Spoon. Where Spoon is used to combine steps to create a transformation, Chef links together transformations to make a job. Because Chef is already pretty well documented, a simple screenshot says it all for this example.

Because there are two types of dimensions in our case, the process of filling the data warehouse is divided in two transformation phases, as described earlier. First, Chef will execute the transformation that updates our type 1 User dimension. On failure, it will send an email with more details on what went wrong. If not, it will continue with the second phase: the transformation that updates the other dimensions and the fact table. In this example, I also wanted to receive an email on success, so I also linked the last transformation to the Mail step.

Illustration 6: The Chef job to fill the data warehouse

A final feature of Chef is that its jobs can be scheduled. Using Kitchen as a command line tool, this makes Kettle a very powerful system for creating, updating and maintaining data warehouses.
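The control flow of such a job is simple enough to sketch in a few lines. The function runJob and the shape of its arguments are hypothetical and have nothing to do with how Chef stores or runs jobs; the sketch only shows the ordering and the two mail paths described above.

```javascript
// Hypothetical sketch of the job's control flow.
// transformations: ordered list of { name, run } objects, where run() throws on failure
// sendMail: callback that receives the message to send
function runJob(transformations, sendMail) {
  for (const t of transformations) {
    try {
      t.run();
    } catch (err) {
      // Later phases depend on earlier ones, so report and stop here.
      sendMail("Job failed in '" + t.name + "': " + err.message);
      return false;
    }
  }
  sendMail("Job finished successfully");
  return true;
}
```

Running the user dimension update first and the fact update second is what guarantees that every user_id looked up in the second phase already exists.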
4 Using the data warehouse
4.1 Preparing for online analytical processing
In section 3.1.2: Modeling structures, we started by structuring data cubes in the data warehouse as a relational star schema, containing a single fact table linking together multiple dimension tables. This way, each dimension is linked to every other dimension in an n-to-n way for every cube it is part of. Because of the special way the data is organized, it is possible to query the data in a generalized form. This is the task for an OLAP server. In our example, we will use the open source relational OLAP server Mondrian to do so.

4.1.1 From relational to dimensional
As explained earlier, the data in the warehouse is stored in a relational form. A fact table links together multiple dimensions, each in their own table, containing their own data. In an OLAP cube, data is organized in a multidimensional form that consists of all of these tables together.

Each cube can contain multiple measures, which can be obtained from the fact table. These measures are used to calculate aggregations and group data. A cube's measures can be anything, like sales quantity, product price or a calculated total of these facts. In the case of factless fact tables there doesn't even have to be a measure; it can be 1 all the time.

Every cube also has multiple dimensions. A dimension is one way of looking at the data, usually constructed from one dimension table in the relational database. The most trivial example of a dimension is, of course, that of time.

Every dimension can have one or more hierarchies. A hierarchy is, as you might expect, an ordered set of increasingly aggregated levels of information about the dimension. In case of a time dimension, one hierarchy might look like: day → month → quarter → year. Hierarchies are very useful in grouping and aggregating data while analyzing.

These concepts can be used by an OLAP server to query the data in a multidimensional way. I will now tell you how to configure the Mondrian OLAP server to do so.

4.1.2 Doing it in Mondrian
After the installation of the Mondrian OLAP server, which is trivial and outside the scope of this document, we need to tell it about our data warehouse's structure. This is done using a fairly well documented XML format, containing information about how the OLAP cubes and their dimensions, hierarchies and measures are to be built from tables in a relational database.

Let's do this for our example. As can be seen in Appendix C: Example Mondrian XML schema, the configuration starts by telling Mondrian the name of the schema. In our case, the schema contains only one cube, which is called Requests. The Requests cube is linked directly to the request_facts table, and has one measure and four dimensions.

Because the only facts in the fact table are the variables and we don't really care about those right now, the measure is a trivial one: the count of request ids, which is always equal to one for each request.

The four dimensions are defined as being part of the cube. In our case we have a User, Page, Referrer and Time dimension, each containing one or more hierarchies. For every dimension, Mondrian needs to know by which table it is represented and by what key it is linked to the fact table. For each dimension, a table column or calculated value is to be defined.

Once Mondrian knows how to interpret the data warehouse's relational database, it can act as a translation layer, allowing multidimensional queries to be run on a relational database.
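What "counting requests along a hierarchy" amounts to can be shown with a small conceptual sketch. It says nothing about how Mondrian works internally, which translates such questions into SQL; the function name countByLevel and the row shapes are assumptions, with the level names borrowed from the date fields of Appendix B.

```javascript
// Conceptual roll-up: count requests per member of one hierarchy level.
// facts:   fact rows, each holding a time_id
// timeDim: time dimension rows, each holding time_id plus one column per level
// level:   the column to group by, e.g. "quarter" or "year"
function countByLevel(facts, timeDim, level) {
  const byId = new Map(timeDim.map(t => [t.time_id, t]));
  const counts = {};
  for (const fact of facts) {
    const member = byId.get(fact.time_id)[level];
    // The measure is trivial: every request counts as one.
    counts[member] = (counts[member] || 0) + 1;
  }
  return counts;
}
```

Asking the same question at a higher level of the hierarchy, for example "year" instead of "quarter", only changes the column that is grouped on; the fact rows stay the same.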
4.2 Asking multidimensional queries
Illustration 7: Basic MDX syntax
4.3 Visualization and presentation
5 References
A Dimensional Modeling Manifesto, Ralph Kimball, 1997
Slowly Changing Dimensions, Ralph Kimball, 1996
Spoon Documentation, Pentaho, 2001-2006
Appendix A: Technology overview

Kettle
http://www.kettle.be
Kettle is an open source ETL suite for Extracting data from various sources, Transforming it and Loading it into the data warehouse. Kettle consists of two sets of tools. Spoon and Pan are used to respectively create and execute transformations on data. Chef and Kitchen are used to organize transformations into jobs and execute them in a scheduled way.

Illustration 9: Used technologies and their place
Mondrian
http://mondrian.sourceforge.net
Mondrian is a relational OLAP server, which acts as a layer on top of a relational database to allow for multidimensional queries to be performed on the data.
JPivot
http://jpivot.sourceforge.net
JPivot is a tool that can act as a presentation layer on top of Mondrian. It generates multidimensional queries and displays their results as interactive pivot tables.
Appendix B: Generating a date dimension using JavaScript

JavaScript code
// The fields we want to calculate
var day_of_month;
var week_of_year;
var month_of_year;
var year;
var quarter;
var name_day;
var name_month;

// Calculate! Every field is formatted from a clone of the incoming
// dateTime value, so the value in the stream itself is left untouched.
day_of_month = dateTime.Clone().dat2str("dd");
week_of_year = dateTime.Clone().dat2str("ww");
month_of_year = dateTime.Clone().dat2str("MM");
year = dateTime.Clone().dat2str("yyyy");
name_day = dateTime.Clone().dat2str("E").getString();
name_month = dateTime.Clone().dat2str("MMMM").getString();

// The quarter is derived from the month number (1 to 12).
if (month_of_year <= 3) {
  quarter = "Q1";
} else if (month_of_year <= 6) {
  quarter = "Q2";
} else if (month_of_year <= 9) {
  quarter = "Q3";
} else {
  quarter = "Q4";
}
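For readers who want to try the date logic outside of Kettle, the same fields can be approximated with the standard JavaScript Date object. This is a sketch under assumptions, not part of the original transformation: dateFields is a hypothetical helper, it works in UTC, and it leaves out week_of_year and the day and month names because plain Date has no direct accessor for them.

```javascript
// Plain-JavaScript approximation of the date dimension fields.
// Assumption: dates are interpreted in UTC.
function dateFields(date) {
  const month = date.getUTCMonth() + 1; // getUTCMonth() is zero-based
  return {
    day_of_month: date.getUTCDate(),
    month_of_year: month,
    year: date.getUTCFullYear(),
    // Months 1-3 map to Q1, 4-6 to Q2, 7-9 to Q3 and 10-12 to Q4.
    quarter: "Q" + Math.ceil(month / 3)
  };
}
```

Calling dateFields(new Date(Date.UTC(2006, 8, 1))) describes the 1st of September 2006, which falls in the third quarter.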
License
THE WORK (AS DEFINED BELOW) IS PROVIDED UNDER THE TERMS OF THIS CREATIVE COMMONS PUBLIC LICENSE ("CCPL" OR "LICENSE"). THE WORK IS PROTECTED BY COPYRIGHT AND/OR OTHER APPLICABLE LAW. ANY USE OF THE WORK OTHER THAN AS AUTHORIZED UNDER THIS LICENSE OR COPYRIGHT LAW IS PROHIBITED.

BY EXERCISING ANY RIGHTS TO THE WORK PROVIDED HERE, YOU ACCEPT AND AGREE TO BE BOUND BY THE TERMS OF THIS LICENSE. THE LICENSOR GRANTS YOU THE RIGHTS CONTAINED HERE IN CONSIDERATION OF YOUR ACCEPTANCE OF SUCH TERMS AND CONDITIONS.

1. Definitions
   1. "Collective Work" means a work, such as a periodical issue, anthology or encyclopedia, in which the Work in its entirety in unmodified form, along with a number of other contributions, constituting separate and independent works in themselves, are assembled into a collective whole. A work that constitutes a Collective Work will not be considered a Derivative Work (as defined below) for the purposes of this License.
   2. "Derivative Work" means a work based upon the Work or upon the Work and other pre-existing works, such as a translation, musical arrangement, dramatization, fictionalization, motion picture version, sound recording, art reproduction, abridgment, condensation, or any other form in which the Work may be recast, transformed, or adapted, except that a work that constitutes a Collective Work will not be considered a Derivative Work for the purpose of this License. For the avoidance of doubt, where the Work is a musical composition or sound recording, the synchronization of the Work in timed-relation with a moving image ("synching") will be considered a Derivative Work for the purpose of this License.
   3. "Licensor" means the individual or entity that offers the Work under the terms of this License.
   4. "Original Author" means the individual or entity who created the Work.
   5. "Work" means the copyrightable work of authorship offered under the terms of this License.
   6. "You" means an individual or entity exercising rights under this License who has not previously violated the terms of this License with respect to the Work, or who has received express permission from the Licensor to exercise rights under this License despite a previous violation.
   7. "License Elements" means the following high-level license attributes as selected by Licensor and indicated in the title of this License: Attribution, ShareAlike.
2. Fair Use Rights. Nothing in this license is intended to reduce, limit, or restrict any rights arising from fair use, first sale or other limitations on the exclusive rights of the copyright owner under copyright law or other applicable laws.

3. License Grant. Subject to the terms and conditions of this License, Licensor hereby grants You a worldwide, royalty-free, non-exclusive, perpetual (for the duration of the applicable copyright) license to exercise the rights in the Work as stated below:
   1. to reproduce the Work, to incorporate the Work into one or more Collective Works, and to reproduce the Work as incorporated in the Collective Works;
   2. to create and reproduce Derivative Works;
   3. to distribute copies or phonorecords of, display publicly, perform publicly, and perform publicly by means of a digital audio transmission the Work including as incorporated in Collective Works;
   4. to distribute copies or phonorecords of, display publicly, perform publicly, and perform publicly by means of a digital audio transmission Derivative Works.
   5. For the avoidance of doubt, where the work is a musical composition:
      1. Performance Royalties Under Blanket Licenses. Licensor waives the exclusive right to collect, whether individually or via a performance rights society (e.g. ASCAP, BMI, SESAC), royalties for the public performance or public digital performance (e.g. webcast) of the Work.
      2. Mechanical Rights and Statutory Royalties. Licensor waives the exclusive right to collect, whether individually or via a music rights society or designated agent (e.g. Harry Fox Agency), royalties for any phonorecord You create from the Work ("cover version") and distribute, subject to the compulsory license created by 17 USC Section 115 of the US Copyright Act (or the equivalent in other jurisdictions).
   6. Webcasting Rights and Statutory Royalties. For the avoidance of doubt, where the Work is a sound recording, Licensor waives the exclusive right to collect, whether individually or via a performance-rights society (e.g. SoundExchange), royalties for the public digital performance (e.g. webcast) of the Work, subject to the compulsory license created by 17 USC Section 114 of the US Copyright Act (or the equivalent in other jurisdictions).
The above rights may be exercised in all media and formats whether now known or hereafter devised. The above rights include the right to make such modifications as are technically necessary to exercise the rights in other media and formats. All rights not expressly granted by Licensor are hereby reserved.

4. Restrictions. The license granted in Section 3 above is expressly made subject to and limited by the following restrictions:
   1. You may distribute, publicly display, publicly perform, or publicly digitally perform the Work only under the terms of this License, and You must include a copy of, or the Uniform Resource Identifier for, this License with every copy or phonorecord of the Work You distribute, publicly display, publicly perform, or publicly digitally perform. You may not offer or impose any terms on the Work that alter or restrict the terms of this License or the recipients' exercise of the rights granted hereunder. You may not sublicense the Work. You must keep intact all notices that refer to this License and to the disclaimer of warranties. You may not distribute, publicly display, publicly perform, or publicly digitally perform the Work with any technological measures that control access or use of the Work in a manner inconsistent with the terms of this License Agreement. The above applies to the Work as incorporated in a Collective Work, but this does not require the Collective Work apart from the Work itself to be made subject to the terms of this License. If You create a Collective Work, upon notice from any Licensor You must, to the extent practicable, remove from the Collective Work any credit as required by clause 4(c), as requested. If You create a Derivative Work, upon notice from any Licensor You must, to the extent practicable, remove from the Derivative Work any credit as required by clause 4(c), as requested.
   2. You may distribute, publicly display, publicly perform, or publicly digitally perform a Derivative Work only under the terms of this License, a later version of this License with the same License Elements as this License, or a Creative Commons iCommons license that contains the same License Elements as this License (e.g. Attribution-ShareAlike 2.5 Japan). You must include a copy of, or the Uniform Resource Identifier for, this License or other license specified in the previous sentence with every copy or phonorecord of each Derivative Work You distribute, publicly display, publicly perform, or publicly digitally perform. You may not offer or impose any terms on the Derivative Works that alter or restrict the terms of this License or the recipients' exercise of the rights granted hereunder, and You must keep intact all notices that refer to this License and to the disclaimer of warranties. You may not distribute, publicly display, publicly perform, or publicly digitally perform the Derivative Work with any technological measures that control access or use of the Work in a manner inconsistent with the terms of this License Agreement. The above applies to the Derivative Work as incorporated in a Collective Work, but this does not require the Collective Work apart from the Derivative Work itself to be made subject to the terms of this License.
   3. If you distribute, publicly display, publicly perform, or publicly digitally perform the Work or any Derivative Works or Collective Works, You must keep intact all copyright notices for the Work and provide, reasonable to the medium or means You are utilizing: (i) the name of the Original Author (or pseudonym, if applicable) if supplied, and/or (ii) if the Original Author and/or Licensor designate another party or parties (e.g. a sponsor institute, publishing entity, journal) for attribution in Licensor's copyright notice, terms of service or by other reasonable means, the name of such party or parties; the title of the Work if supplied; to the extent reasonably practicable, the Uniform Resource Identifier, if any, that Licensor specifies to be associated with the Work, unless such URI does not refer to the copyright notice or licensing information for the Work; and in the case of a Derivative Work, a credit identifying the use of the Work in the Derivative Work (e.g., "French translation of the Work by Original Author," or "Screenplay based on original Work by Original Author"). Such credit may be implemented in any reasonable manner; provided, however, that in the case of a Derivative Work or Collective Work, at a minimum such credit will appear where any other comparable authorship credit appears and in a manner at least as prominent as such other comparable authorship credit.

5. Representations, Warranties and Disclaimer

UNLESS OTHERWISE AGREED TO BY THE PARTIES IN WRITING, LICENSOR OFFERS THE WORK AS-IS AND MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND CONCERNING THE MATERIALS, EXPRESS, IMPLIED, STATUTORY OR OTHERWISE, INCLUDING, WITHOUT LIMITATION, WARRANTIES OF TITLE, MERCHANTIBILITY, FITNESS FOR A PARTICULAR PURPOSE, NONINFRINGEMENT, OR THE ABSENCE OF LATENT OR OTHER DEFECTS, ACCURACY, OR THE PRESENCE OF ABSENCE OF ERRORS, WHETHER OR NOT DISCOVERABLE. SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION OF IMPLIED WARRANTIES, SO SUCH EXCLUSION MAY NOT APPLY TO YOU.

6. Limitation on Liability. EXCEPT TO THE EXTENT REQUIRED BY APPLICABLE LAW, IN NO EVENT WILL LICENSOR BE LIABLE TO YOU ON ANY LEGAL THEORY FOR ANY SPECIAL, INCIDENTAL, CONSEQUENTIAL, PUNITIVE OR EXEMPLARY DAMAGES ARISING OUT OF THIS LICENSE OR THE USE OF THE WORK, EVEN IF LICENSOR HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

7. Termination
   1. This License and the rights granted hereunder will terminate automatically upon any breach by You of the terms of this License. Individuals or entities who have received Derivative Works or Collective Works from You under this License, however, will not have their licenses terminated provided such individuals or entities remain in full compliance with those licenses. Sections 1, 2, 5, 6, 7, and 8 will survive any termination of this License.
   2. Subject to the above terms and conditions, the license granted here is perpetual (for the duration of the applicable copyright in the Work). Notwithstanding the above, Licensor reserves the right to release the Work under different license terms or to stop distributing the Work at any time; provided, however that any such election will not serve to withdraw this License (or any other license that has been, or is required to be, granted under the terms of this License), and this License will continue in full force and effect unless terminated as stated above.

8. Miscellaneous
   1. Each time You distribute or publicly digitally perform the Work or a Collective Work, the Licensor offers to the recipient a license to the Work on the same terms and conditions as the license granted to You under this License.
   2. Each time You distribute or publicly digitally perform a Derivative Work, Licensor offers to the recipient a license to the original Work on the same terms and conditions as the license granted to You under this License.
   3. If any provision of this License is invalid or unenforceable under applicable law, it shall not affect the validity or enforceability of the remainder of the terms of this License, and without further action by the parties to this agreement, such provision shall be reformed to the minimum extent necessary to make such provision valid and enforceable.
   4. No term or provision of this License shall be deemed waived and no breach consented to unless such waiver or consent shall be in writing and signed by the party to be charged with such waiver or consent.
   5. This License constitutes the entire agreement between the parties with respect to the Work licensed here. There are no understandings, agreements or representations with respect to the Work not specified here. Licensor shall not be bound by any additional provisions that may appear in any communication from You. This License may not be modified without the mutual written agreement of the Licensor and You.