CS 473 EX
February 6, 2016
Agenda
Course Structure and Expectations
Logistics
Term Project team, submissions
Assignments
Quizzes
Module 2
CSM
Estimation
Learning Objectives
Upon successful completion of this course, you will be prepared to:
Introduction
Focus on learning. You should assume that you will be successful in the course, and that grades will reflect that. Focus on the best practices you learn here and on being able to actually apply them.
This is an overview course. It delves into multiple topics. Buckle up: you cannot afford to spend all your time on a single topic.
Time-box your assignments. This is not a PhD thesis. This is about becoming aware of an important concept and then moving on to the next concept.
This course is not about a specific language or framework.
Some assignments and discussions are optional, aimed to stimulate your thinking.
The term project brings it all together. All predefined deliverables of the term project are introduced in lectures.
How does it all fit together? It is all about how to do software better, about software engineering and program management. You cannot function as a professional software engineer or a project manager if you are unaware of the key concepts covered in this class.
MET CS473 Principles
[Topic map pairing course modules with their guiding principles: Process / Process Improvement ("Process has no goal in itself"); Architecture ("A glue. A heartbeat. A librarian."; data-driven design); Test Essentials / Unit Test; Continuous Delivery (deployment pipeline); Globalization (offshoring taxonomy); Requirements (backlog grooming); Engineering Management; Software Configuration Management (Identification, Control, Auditing, Reporting); Estimation (precision equal to accuracy); Software Design; Agile; SW Tools Evaluation; Peer Reviews.]
MET CS473 Responsibilities
[Topic map annotating the same modules with key responsibilities: find the shortest path; separate released from drafts; control changes; retain historical estimates; split; ask for and provide comments. Modules shown: Unit Test, Continuous Delivery, Requirements Backlog Grooming, Process Improvement, Engineering Management, Process, Architecture, Test Essentials, Globalization Offshoring Taxonomy, Software Configuration Management, Estimation, Software Design, Agile, SW Tools Evaluation, Peer Reviews.]
[Keyword cloud of course concepts: UML, Git, ISO, Test Essentials, Faults & Failures, Cost of Delay, Black Swan, Requirements, Pivotal, VersionOne, Rally, Backlog Grooming, 5/20 Rule, Story Points, Architecture, Asset Library, Motivation, Peer Reviews, Estimation, Continuous Delivery, SW Tools Evaluation, Agile Manifesto, Scrum, CMM, Regression, Cone of Uncertainty, Software Process, Configuration Management, MVC, Unit Test, Process Improvement, SCM.]
Software Configuration Management
What is SCM?
There is a certain tradition to the way SCM (Software Configuration Management) is described and taught. SEI CMMI has a level-2 process area called SCM. Eric Braude's book on Software Engineering dedicates Chapter Six to this topic. This long-standing tradition also stems from the immovable CM audits that government agencies conduct on all their vendors, large and small. Whether you use an Agile or a Waterfall methodology, whether you are a small startup or a Fortune 500 company, the format and template of the CM Plan that you have to produce and demonstrate to an auditor remain identical.
Software Configuration Management (SCM) is the discipline of maintaining the integrity of product parts as they propagate along a project life cycle.
There are many natural organizational forces that can impede the consistency, completeness and traceability of the various product pieces. If no systematic improvement effort is applied, the amount of disorder is bound to increase. This is the fundamental law of entropy we learn in high school.
The SCM discipline decreases the risk of losing integrity by establishing four overriding processes: Identification, Control, Auditing and Reporting.
It is useful to trace the chain of events of a software change, to understand in detail, first, how following these four steps prevents unintended software changes, and second, how, if you do not follow these steps, sooner or later chaos will prevail.

Configuration Identification: the discipline of identifying an item and its configuration, and documenting its functional and physical characteristics. "What version of the file is this?"

Configuration Control: the creation and exercising of established procedures to classify, approve or disapprove, release, implement and confirm changes to agreed specifications and baselines. "How many changes, and which changes, went into the latest version of this product?"

Configuration Auditing and Configuration Reporting complete the four processes.

Here is a quote from James Bach's article "The highs and lows of change control", where he describes the balancing act: both too much and too little change control can cause issues.

Change Control is vital. But the forces that make it necessary also make it annoying. We worry about change because a tiny perturbation in code can create a big failure in the product. But it can also fix a big failure or enable wonderful new capabilities. We worry about change because a single rogue developer could sink the project; yet brilliant ideas originate in the minds of those rogues, and a burdensome change control process could effectively discourage them from doing creative work.
Here is an illustration of a CI list similar to the one you produce for your term project. In a commercial environment this is an extensive list that brings together different Owners and various Repositories. One has to become accustomed to dealing with the multiple tools of a large organization. It is best if the CI List is a report produced auto-magically from the various repositories, although in a real-life project manual intervention is common.
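As a sketch of how such a report could be produced auto-magically, here is a minimal Python model of a CI list. The item and repository names are hypothetical illustrations, not taken from any specific tool; "na" marks items that are stored but not controlled, as in the list below.

```python
from dataclasses import dataclass

@dataclass
class ConfigurationItem:
    name: str        # hypothetical deliverable name
    version: str     # "na" for stored-but-not-controlled items
    date: str
    owner: str
    repository: str  # hypothetical repository name

def ci_list_report(items):
    """Render the CI list as numbered rows, like the term-project list."""
    lines = [f"{'#':<4}{'Item':<16}{'Version':<9}{'Date':<8}{'Owner':<9}Repository"]
    for i, ci in enumerate(items, start=1):
        lines.append(f"{i:<4}{ci.name:<16}{ci.version:<9}{ci.date:<8}{ci.owner:<9}{ci.repository}")
    return "\n".join(lines)

items = [
    ConfigurationItem("Project Plan", "1.4", "9-Sep", "John S", "DocStore"),
    ConfigurationItem("Build Scripts", "na", "na", "na", "SourceRepo"),  # not controlled
]
print(ci_list_report(items))
```

In practice each row would be pulled from its owning repository's API rather than typed by hand, which is what makes the report trustworthy.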
The list matches the contents of the project repository:

 #   Version   Date     Owner
 1   1.4       9-Sep    John S
 2   1.0       9-Sep    Mary W
 3   1.3       9-Sep    Joe B
 4   2.0       16-Sep   Joe B
 5   2.1       23-Sep   Alex B
 6   1.7       23-Sep   John S
 7   4.3       23-Sep   Harry V
 8   5.8       23-Sep   Harry V
 9   0.8       6-Oct    Bill K
10   0.4       23-Sep   Sam C
11   na        na       na
12   na        na       na
13   1.0       9-Sep    Peter Z

(The original table also carries a Repository column for each item.)

Baselines confirm traceability between items. Example: after a Baseline Audit, all versions are bumped to 5.0.
The version structure shows whether an item is a draft.
The list itself has a version.
Some items have no version, since they are not controlled. One has to decide upfront which items are controlled and which are merely stored.
Change Requests
A simple bug fix follows a defined workflow:

start → SUB1: the Submitter submits and assigns the CR (state: New)
CMT3: the Change Management Team (BugReview) prioritizes the CR. Possible outcomes: accept, reject, defer, or Already Resolved.
CMT1: deferred CRs go to the managed deferred list; CMT2: rejects are managed as well.
accept → state: Assigned
DEV3: the assigned Developer resolves the CR, or passes it to someone else → state: Resolved
QA1: a QA Engineer verifies the resolution. Is the resolution acceptable?
No, the resolution is incorrect → state: ReOpened (back to the developer), or a new bug is submitted.
Yes → state: Verified. The Submitter verifies and accepts → close → state: Closed
CMT4: once a regression test case exists and the CR is closed for good → state: Zombie
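The workflow above is a small state machine and can be sketched as a transition table. The state and outcome names follow the diagram; the exact set of transitions is a simplified assumption, not a complete reproduction of the chart.

```python
# Allowed transitions of a change request (CR). Keys are (state, action);
# action names are illustrative labels for the diagram's outcomes.
TRANSITIONS = {
    ("New", "prioritize"): "Assigned",        # CMT accepts and assigns
    ("New", "reject"): "Closed",
    ("New", "defer"): "Deferred",             # goes to the managed deferred list
    ("Assigned", "resolve"): "Resolved",
    ("Resolved", "verify_ok"): "Verified",    # QA: resolution acceptable
    ("Resolved", "verify_fail"): "ReOpened",  # resolution incorrect
    ("ReOpened", "resolve"): "Resolved",
    ("Verified", "accept"): "Closed",         # submitter accepts
    ("Closed", "regression_test_added"): "Zombie",  # closed for good
}

def advance(state, action):
    """Apply one action; illegal transitions are refused, which is the point
    of Configuration Control."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"illegal transition: {action!r} from {state!r}")

# A simple bug fix: prioritize -> resolve -> QA verify -> accept
state = "New"
for action in ["prioritize", "resolve", "verify_ok", "accept"]:
    state = advance(state, action)
print(state)  # -> Closed
```

Encoding the workflow as data makes the "four overriding processes" auditable: every state change is classified, approved and reportable.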
[Chart: the four parts of SCM (Identification, Control, Auditing, Reporting) applied to four work product types: Code, User Docs, Internal Docs, Tools.]
This chart shows the four parts of SCM applied to four different types of work products. It can serve as an indicator of the adoption of best practices: you can see that Code is well-controlled, but the User Docs are not under control.
A project baseline consists of a set of baselined work products that are required at the end of
each phase or at some designated interim quality checkpoint. The project baseline evolves
through the life of a project.
A baseline audit checks the integrity of each baseline. Artifacts are traceable to each other within each baseline, although they might NOT be traceable across different baselines.
Requirements Baseline: PRD, ERL, Proj Plan.
Baseline audit: release content is clear and stable.
Development Baseline: Func Spec, Doc, Proj Plan, Sched, Test Plan, Test Cases, ERL.
Baseline audit: the ERL matches the Test Plan; test cases match the FS; the FS is updated.
Release Baseline: Func Spec, User Doc, Proj Sched, Test Report.
Baseline audit.
Not To Do List
Here is the so-called "Not-To-Do-List": the following actions are greatly discouraged.
Versioning Standard
A mature organization adopts a consistent definition of versioning. It can take a full page to describe the various examples of versions, as well as the scenarios of transition between versions, for both code and docs.
A common question: why do I need to manually insert a version (at the bottom right corner of a document's footer) if the repository itself assigns a version automatically? The point is that each time you hit the commit button, the version of your artifact is bumped. You could insert ten blank spaces, resulting in exactly ten additional versions. Obviously these new versions have no corresponding software change worth mentioning. You still have to tie the meaningful software changes to the sequence of versions.
A versioning standard could be quite simple,
X.Y
<X> for major
<Y> for minor
or it could be more complex,
A.B.C.D
<A> designates a Major Release
<B> ... Minor Release
<C> ... Major Draft
<D> ... Minor Cosmetic Change
One of the quiz questions reads: "If a consistent versioning standard is adopted, it is enough to examine the structure of a version number to determine whether a configuration item is a draft or a release." The expected answer is "YES", regardless of the specifics of the versioning standard. It would be quite unusual for a versioning standard not to distinguish a release from a draft.
Often folks set "C" and "D" to zero after a customer release or after a baseline audit. Also, customers are never exposed to internal versioning, so the only numbers customers see are "A" and "B".
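Under the A.B.C.D standard above, the draft check is a one-liner. This sketch assumes the convention just described, that C and D are zeroed on a customer release or after a baseline audit:

```python
def is_draft(version: str) -> bool:
    """True if an A.B.C.D version denotes a draft.

    Assumes the convention that <C> (major draft) and <D> (minor cosmetic
    change) are reset to zero when the item is released."""
    a, b, c, d = (int(part) for part in version.split("."))
    return c != 0 or d != 0

print(is_draft("5.0.0.0"))  # -> False: customer release, C and D zeroed
print(is_draft("5.0.1.3"))  # -> True: internal draft of the next release
```

A malformed version (not four numeric fields) raises an error, which is itself useful: it flags an item that does not follow the adopted standard.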
A transition from one version to another corresponds to a certain software change. A missing version is indicative of a software change that has not been accounted for, which should raise a red flag.
4. Projects A and B are rebased, and then change #12345 is shared (integrated) from project A to project B
5. Change list #12346 is made to project A
6. Project A codeline is moved to deployed
7. A release is deployed to the network
8. Project B is rebased, during which a conflict occurs
9. Branch specification C is created and integrated
10. Project codeline C is rebased
11. A move to deployed is scheduled for Project C
Taxonomy
Codeline
An area in the repository file system, consisting of all the files contained under a certain directory. No two files in a codeline can share a common ancestor, except in the case of a 'rename' operation. Examples are the deployed codeline and project codelines.
Branch Specification
An object that stores a list of relationships between codelines, used to simplify integration commands. Branch specifications indicate how a project's code changes are to be integrated into the deployed codeline.
Rebase
To integrate changes from a 'parent' codeline into a 'child' codeline, keeping the child codeline up to date. Most rebases will be from the deployed codeline (the parent) to a project codeline (the child). Projects are not released unless they are rebased from the head of the deployed codeline; this ensures that changes to a component are not unintentionally lost.
Promote
To integrate changes from a 'child' codeline into a 'parent' codeline. Promotions occur from project codelines to the deployed codeline, when projects are deployed.
Key Concepts
The only way to ship software to the field is through the Deployed Codeline. In some systems it is called Trunk, or Master. On the pictorial, such an event is designated with a yellow square labeled "Release is Deployed". Understand that the sole purpose of all our activities is to succeed with the Release to customers. Here is a link to an interesting discussion of the "Power of One", covering the concept of a single delivery codeline: http://blogs.workday.com/why_weve_moved_to_single_codeline_development_at_workday.html
Synchronization among all branches of the repository should be as frequent as possible. The longer branches are kept out of sync, the more difficult it is to merge them. "Building Often" is a key principle of a modern organization. Reading "Fast is better than slow" on Google's front site, one can appreciate that the effectiveness of a huge enterprise is driven by a simple parameter: how fast can you build.
Resolution of code conflicts is an inherently manual process. Each source control tool offers a variety of features to simplify the merge; however, the final decisions are always made by a developer who is familiar with the context.
More to the point, in most cases a conflict can only be resolved by the developer who introduced it in the first place. This brings about the principle aired by Jez Humble: "Never Go Home On A Broken Build". Imagine yourself shutting down the Trunk and forcing folks in different time zones to sit on their thumbs for the next twenty-four hours while waiting for your return.
The Branching and Merging Strategy should be well-documented. Who has their own branch and who does not should be determined, not improvised. Multiple branches kept for a long time without a rebase are bound to hinder the operation.
Tutorials
The Module 4 assignment covers several source control tools: AccuRev, CVS, Subversion, Git, ClearCase, and Perforce. Here is a link to an explicit Git tutorial: http://stackoverflow.com/questions/315911/git-for-beginners-the-definitive-practical-guide
Note that the terminology of each tool differs slightly, while the basic concepts remain common.
https://github.com/
[GitHub workflow illustration: search the class repository, fork the class repository, merge changes.]
What is Estimation?
Reconstruct and correlate historical data.

How does a simple release of documentation fit into the Life Cycle?
Which phase covers daily stand-ups of software development?
What comes first, a Functional Spec or a Test Plan?
What comes first, Code or Test Cases?
How is the MRD traceable to unit test cases?
How does bug fixing of customer-reported defects fit into the Life Cycle?
These points shed light on grey areas and have to be addressed to assure the life cycle is adapted to your specific environment.
Life Cycle
A lifecycle is the set of steps a program follows from its conception, through its design, development, test, manufacture, service, and disposal. Multiple programs follow the different steps of a predefined lifecycle.
A product life cycle is an organizational concept, not a project-level concept. Refinements and improvements can be identified in a program and in a lifecycle. A lifecycle is general guidance, not a strict instruction set. Program deliverables differ greatly depending on the program type.
The lifecycle is an important part of the organizational process set. Like all controlled processes, a lifecycle is under version control and is periodically reviewed and updated.
Here we introduce the concept of the Cone of Uncertainty as thick red lines overlaying the product life cycle. I encourage you to examine the textbook for this course, Steve McConnell's "Software Estimation", which dedicates Chapter 4.2 to this topic. The red lines on the pictorial below show the Estimation Accuracy.

[Life-cycle pictorial. Phases: 1 Initiate, 2 Plan, 3 Define, 4 Build, 5 Optimize, 6 Release, 7 Introduce, 8 Support. Checkpoints: CP-0: Begin (Project Charter, Rough Schedule), CP-1: Defined Requirements (PRD Complete), through CP-5: Ready (quality assessment). Control docs along the way: MRD, Functional Specs, Test Plans, RTS, RTM, EOL, EOS. Legend: Checkpoints, Control Docs, Review Meetings.]
First, you see that Estimation is iterative: the same features are re-estimated.
Second, the accuracy of estimation improves dramatically. At the end it is one hundred percent accurate, since while standing on the shipping dock we know exactly when the product shipped. In the beginning, the gap is huge: when the initial concept is being analyzed, an estimate can be several hundred percent off.
The fundamental question being asked is: how to beat the Cone? How to reduce variability on a project? The cone does not narrow itself. We need to maintain well-defined requirements, consistent code reviews and effective unit tests.
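The narrowing of the cone can be expressed numerically. The multipliers below are the classic scope (effort/size) ranges first published by Barry Boehm and reproduced in McConnell's "Software Estimation"; applying them to a nominal estimate is a sketch of the idea, not a substitute for the book's tables.

```python
# Scope variability at each milestone, per the classic Cone of Uncertainty
# (low multiplier, high multiplier) applied to a single-point estimate.
CONE = {
    "Initial Concept":             (0.25, 4.00),
    "Approved Product Definition": (0.50, 2.00),
    "Requirements Complete":       (0.67, 1.50),
    "UI Design Complete":          (0.80, 1.25),
    "Detailed Design Complete":    (0.90, 1.10),
}

def estimate_range(nominal, milestone):
    """Turn a single-point estimate into the range the cone justifies."""
    low, high = CONE[milestone]
    return nominal * low, nominal * high

lo, hi = estimate_range(12, "Initial Concept")  # 12 staff-months nominal
print(f"{lo:.0f} to {hi:.0f} staff-months")     # -> 3 to 48 staff-months
```

Quoting the range instead of the point estimate is one way to keep precision equal to accuracy at each checkpoint.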
It's worth mentioning the names of three gurus, industry practitioners who have been advancing the subject of estimation over the last thirty years. Barry Boehm, Capers Jones and Steve McConnell have each written well-known books on the subject. It is prudent to use the wealth of knowledge these software gurus offer. Here we map the Cone first published by Barry Boehm onto our standard life cycle. Note that the variability of Scope in the textbook Cone (4X) is much greater than the variability of Schedule (1.6X).
The fundamental reason we introduce this training is to improve the Estimation Accuracy on your project. Usually, Program Management is in a position to track the delivery dates of all release types. Below we illustrate the concept with a Cone based on dates.
Several conclusions and lessons can be derived immediately from the chart, which is a slightly modified example of real data.
The chart is lopsided, as no one ships early. Schedules are always over-optimistic: programs frequently slip and are hardly ever completed before a commitment.
Bigger programs have a greater margin of error.
The earlier we acknowledge an upcoming slippage, the easier it is to implement a correction. Late corrections are costly, as it is difficult to prepare for a dramatic change of course.
A 15 percent correction is expected at program start.
A 50 percent correction is the maximum one can anticipate at program start.
Let's compare these numbers (15% typical and 50% max) against the numbers quoted by Steve McConnell and Barry Boehm. The textbook chart above shows 1.6X (60%) variability. We should be able to justify this difference.
In our example, at "program start" a significant part of the analysis had been completed, which mitigated some large sources of variability.
The data shown relates to Schedule, not to Effort and not to Size. As we all know, some releases are compelled to drop features to keep the train on schedule.
Let us review the framework for estimation in some detail. There are three buckets that all measurements should fall into: Size, Effort, and Duration.
Examples of Size are Lines of Code, Screens, APIs, and, of course, Story Points.
Effort is measured in Person-Hours (or ideal hours). Only 7 hours per day are counted, since this is the time spent productively; staff meetings, vacations, and other non-productive time are excluded.
Duration is measured in Months.
The structured estimation process proceeds in this same order: from Size to Effort to Duration. Skipping a step has its ramifications. Managers are encouraged not to ask for a date first, and not to focus on "when are you going to finish", but instead to first ask for the details of "what are you going to do".
Estimating Size is the key to a prudent process. It is a game changer, so to speak: instead of first giving dates and then dealing with the consequences, our strategy is to
retain size measures mapped into Effort and Duration,
maintain a historical estimation repository, and
eventually re-use this historical data on the next projects.
In fact these three points constitute the most basic method, called Estimating By Analogy.
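A minimal sketch of Estimating By Analogy, following the Size → Effort → Duration order. The historical record and all numbers are hypothetical illustrations:

```python
# Step 0: a (hypothetical) historical record with its actuals.
historical = {"screens": 25, "effort_person_hours": 500}
productivity = historical["effort_person_hours"] / historical["screens"]  # 20 h/screen

# Step 1: Size of the new work, counted first, never guessed as a date.
new_size_screens = 8

# Step 2: Effort, mapped from size via historical productivity.
effort = new_size_screens * productivity            # person-hours

# Step 3: Duration, from effort and capacity (7 productive h/day, ~20 days/month).
capacity_hours_per_month = 7 * 20
duration_months = effort / capacity_hours_per_month

print(effort, round(duration_months, 2))            # -> 160.0 1.14
```

The value of the method lives entirely in the historical repository: without retained actuals there is nothing to draw the analogy from.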
The distinction between accuracy and precision is critical. Project stakeholders make
assumptions about project accuracy based on the precision with which an estimate is
presented.
When you present an estimate of 395.7 days, stakeholders assume the estimate is accurate to 4 significant digits! The accuracy of the estimate might be better reflected by estimating 1 year, 4 quarters, or 13 months rather than 395.7 days. Using an estimate of 395.7 days instead of 1 year is like representing pi as 3.37882: the number is more precise, but it's really less accurate.
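A sketch of matching the precision of a presented estimate to its accuracy; the unit thresholds below are an illustrative assumption, not a standard:

```python
def present(days: float, relative_error: float) -> str:
    """Pick a unit coarse enough that the last digit is meaningful."""
    if relative_error >= 0.5:               # cone still wide open
        return f"~{round(days / 365)} year(s)"
    if relative_error >= 0.1:               # typical mid-project accuracy
        return f"~{round(days / 30.4)} months"
    return f"~{round(days)} days"           # only when genuinely accurate

print(present(395.7, 0.30))  # -> ~13 months
print(present(395.7, 0.60))  # -> ~1 year(s)
```

The same 395.7-day estimate is reported as "13 months" or "1 year" depending on how much the cone has narrowed, so the precision never overstates the accuracy.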
[Estimation record example: for each size measure (# fields, # reports, # pages, # issues reported, # use cases, design complexity S/M/L), the record lists a count and the per-unit Coding and Test effort in person-hours.]
There are many assumptions and decisions made while completing this estimation record. Making decisions about small project components (for example, about GUI screens) is much easier than making such decisions about the whole project. That is why Decomposition is an effective method for producing estimates within acceptable ranges.
Analyzing an estimation record can lead to some useful conclusions. One might observe that too much effort is spent on coding. If we re-distribute the effort to put additional emphasis on requirements and design, then coding and testing can be significantly reduced, so the total effort will be smaller. Needless to say, without a consolidated estimation record such analysis is impossible.
A collection of estimation records constitutes a historical estimation repository. Some organizations maintain databases with thousands of estimation records along with their corresponding actuals. Note the significant value of the actuals, since all future estimates are based on past actuals.
Here is an estimation record example/template that fits well into the context of the term
project. It should account for each member of the project team.
Annotations on the record:
The Decomposition method is used to identify the deliverables that need to be done.
Size Measures; Size (how many of each size measure); Effort.
Effort is rounded up to the next Fibonacci number.
The Analogy method draws from previous projects.
The Expert Judgment method relies on two SMEs.
Schedule is not estimated, since duration is time-boxed.
Activity         Deliverable                          Size Measure    Typical Effort per    Size  Expert A  Expert B  Final Size
                                                                      Size Measure                Effort    Effort    (Fibonacci)
                                                                      (person/hours)
Requirements     Definition of users (personas)       # roles         0.20                  5     1         1         1
Requirements     Definition of Scope and Limitations  # attributes    0.10                  10    1         1         1
Requirements     Engineering Requirements List        # requirements  0.40                  5     2         1         2
Config Mngmt     Configuration Items List             # CI items      0.09                  11    1         1         1
Estimation       Estimation Record                    # activities    0.08                  13    1         1         1
Design           State Transition Diagram             # states        0.25                  4     1         0.5       1
Design           Definition of Use Cases              # use cases     0.50                  4     2         2         2
Design           Definition of Fields                 # fields        0.10                  20    2         2         2
Design           Definition of Reports                # reports       1                     2     2         2         2
Peer Reviews     Issues from Peer Reviews retained    # issues        0.16                  25    4         4         5
Implementation   Defect Tracking System               # User Screens  6                     1     6         8         8
Test Design      Test Cases                           # test cases    0.33                  6     2         2         2
Test Execution   Defects from testing recorded        # defects       0.30                  10    3         3         3
Total:                                                                                                                31

The final estimate is rounded up to the next Fibonacci number to create a buffer.
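The arithmetic of the record can be sketched as follows, using the requirements and configuration-management rows above; the round-up-to-the-next-Fibonacci step is what creates the buffer.

```python
import math

FIB = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

def next_fib(x):
    """Round up to the next Fibonacci number to create a buffer."""
    return next(f for f in FIB if f >= x)

# (deliverable, typical person-hours per size unit, size count),
# values taken from the first four rows of the record above.
rows = [
    ("Definition of users (personas)", 0.20, 5),
    ("Definition of Scope and Limitations", 0.10, 10),
    ("Engineering Requirements List", 0.40, 5),
    ("Configuration Items List", 0.09, 11),
]

total = 0
for name, per_unit, size in rows:
    effort = per_unit * size                 # raw person-hours
    final = next_fib(math.ceil(effort))      # buffered final estimate
    total += final
print(total)  # -> 5, the subtotal for these four rows
```

The full record sums to 31 in the same way; expert judgment (the two SME columns) adjusts the raw arithmetic before the Fibonacci round-up.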
Planning Poker
Planning Poker is the key technique used to estimate stories. The whole Scrum team is involved in a Planning Poker session. The same or a similar technique can be used successfully on non-agile projects to estimate features and individual requirements. Here are several considerations.
Prepare to estimate. The process will not work if you are surprised when the Product Owner describes a story. In fact the biggest myth about this method is that one can waltz into a planning poker session and respond with a reasonable prediction.
The process is so effective because the experiences of several SMEs are combined, and the total is greater than the simple sum of its parts.
Ask questions and be ready to reconcile assumptions.
During the initial discussion, numbers must not be mentioned at all in relation to feature size, to avoid anchoring, where team members are influenced prematurely.
Reveal cards all at once. This prevents folks from influencing each other's estimates.
A value of 3 needs to be 3 times bigger than a value of 1. These proxy measures need to be grounded in an actual Size: for example, a number of test cases, bug fixes or GUI changes.
The larger the estimate, the larger the error. Precision should be equal to accuracy.
Remember Leonardo Fibonacci, who brought us the buffer.
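The simultaneous-reveal rule can be sketched as code. The team member names are illustrative, and the consensus rule (discuss the outliers, then re-vote) is a simplified assumption about how a session converges:

```python
# One round of Planning Poker: estimates are chosen privately and
# revealed all at once, so no one anchors on another's number.
FIB_DECK = [1, 2, 3, 5, 8, 13, 21]

def poker_round(private_estimates):
    """private_estimates: dict of team member -> card chosen in secret."""
    for card in private_estimates.values():
        assert card in FIB_DECK, "only deck values are allowed"
    revealed = dict(private_estimates)   # simultaneous reveal
    consensus = min(revealed.values()) == max(revealed.values())
    # if not consensus: the outliers explain their reasoning, then re-vote
    return revealed, consensus

revealed, consensus = poker_round({"Ann": 5, "Bo": 8, "Cy": 5})
print(consensus)  # -> False; Bo explains the 8, then the team votes again
```

Note that the deck itself enforces "precision equal to accuracy": larger stories can only land on coarser values.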
Adopting a consistent estimation practice across a large enterprise usually meets with considerable resistance, which is natural for any organizational change.
Here is a relevant LinkedIn discussion on the topic that might provide a taste of common deployment issues. We could continue this discussion within the framework of our class.
Product Management is searching for a direct measure of Customer Value and states that Story Points are an inappropriate measure of Value. Please suggest a way of measuring Value, to help Product Management.
What are the ramifications of skipping the first step in our estimation process, namely, dropping the idea of Size and jumping directly to Effort?
Here is an interesting site called "Ballpark": http://www.stridenyc.com/ballpark/
It has a certain methodology for a rough initial estimate.
Apparently the site is trying to differentiate itself by juxtaposing Budgeting and Estimation. Obviously these two concepts are not in competition: first comes estimation, then budgeting. They are not substitutes for each other. I feel there is some confusion with terminology, as a methodology like T-Shirt Sizing is what these folks call Ballpark. Although, glancing at their site, it looks like Monte Carlo Simulation is used as a first estimation method.
Estimation Tools
Most commercial tools are based on a proprietary database of thousands of completed projects. Initially, you derive an estimate from industry averages. After your company-specific data is collected, you start customizing, substituting your specific data for the industry data.
Checkpoint was originally introduced by Capers Jones, the founder of SPR (Software Productivity Research) of Burlington, with projects contributed by AT&T. It uses Function Points and employs more than a hundred parameters in five categories:
Process
Technology
Personnel
Environment
Special factors
SLIM (Software Lifecycle Management) has a proprietary algorithm, supported by QSM (Quantitative Software Management). Commonly used by government contractors, it uses a variety of parameters, including software size and the Productivity Index (PI). The PI is calibrated to an organization's historical software development environment.