You are on page 1of 311

Introduction to Programming Using Java Part II

Version 5.0, December 2006


(Version 5.1, with minor updates and corrections, June 2009)

David J. Eck
Hobart and William Smith Colleges

ii

c 19962009, David J. Eck


David J. Eck (eck@hws.edu) Department of Mathematics and Computer Science Hobart and William Smith Colleges Geneva, NY 14456 This book can be distributed in unmodied form with no restrictions. Modied versions can be made and distributed provided they are distributed under the same license as the original. More specically: This work is licensed under the Creative Commons Attribution-Share Alike 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/bysa/2.5/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA. The web site for this book is: http://math.hws.edu/javanotes

Contents
Preface 8 Correctness and Robustness 8.1 Introduction to Correctness and Robustness . 8.1.1 Horror Stories . . . . . . . . . . . . . . 8.1.2 Java to the Rescue . . . . . . . . . . . 8.1.3 Problems Remain in Java . . . . . . . 8.2 Writing Correct Programs . . . . . . . . . . . 8.2.1 Provably Correct Programs . . . . . . 8.2.2 Robust Handling of Input . . . . . . . 8.3 Exceptions and try..catch . . . . . . . . . . . 8.3.1 Exceptions and Exception Classes . . 8.3.2 The try Statement . . . . . . . . . . . 8.3.3 Throwing Exceptions . . . . . . . . . . 8.3.4 Mandatory Exception Handling . . . . 8.3.5 Programming with Exceptions . . . . 8.4 Assertions . . . . . . . . . . . . . . . . . . . . 8.5 Introduction to Threads . . . . . . . . . . . . 8.5.1 Creating and Running Threads . . . . 8.5.2 Operations on Threads . . . . . . . . . 8.5.3 Mutual Exclusion with synchronized 8.5.4 Wait and Notify . . . . . . . . . . . . 8.5.5 Volatile Variables . . . . . . . . . . . . 8.6 Analysis of Algorithms . . . . . . . . . . . . . Exercises for Chapter 8 . . . . . . . . . . . . . . . Quiz on Chapter 8 . . . . . . . . . . . . . . . . . . 9 Linked Data Structures and Recursion 9.1 Recursion . . . . . . . . . . . . . . . . 9.1.1 Recursive Binary Search . . . . 9.1.2 Towers of Hanoi . . . . . . . . 9.1.3 A Recursive Sorting Algorithm 9.1.4 Blob Counting . . . . . . . . . 9.2 Linked Data Structures . . . . . . . . 9.2.1 Recursive Linking . . . . . . . 9.2.2 Linked Lists . . . . . . . . . . . 9.2.3 Basic Linked List Processing . 9.2.4 Inserting into a Linked List . . iii . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii 373 . 373 . 374 . 375 . 377 . 378 . 378 . 381 . 385 . 386 . 388 . 390 . 392 . 393 . 396 . 399 . 400 . 404 . 405 . 408 . 412 . 413 . 419 . 424 . . . . . . . . . . 427 427 428 430 432 435 439 439 441 441 445

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

iv 9.2.5 Deleting from a Linked List . 9.3 Stacks, Queues, and ADTs . . . . . . 9.3.1 Stacks . . . . . . . . . . . . . 9.3.2 Queues . . . . . . . . . . . . 9.3.3 Postx Expressions . . . . . . 9.4 Binary Trees . . . . . . . . . . . . . 9.4.1 Tree Traversal . . . . . . . . 9.4.2 Binary Sort Trees . . . . . . 9.4.3 Expression Trees . . . . . . . 9.5 A Simple Recursive Descent Parser . 9.5.1 Backus-Naur Form . . . . . . 9.5.2 Recursive Descent Parsing . . 9.5.3 Building an Expression Tree . Exercises for Chapter 9 . . . . . . . . . . Quiz on Chapter 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

CONTENTS

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

447 448 449 452 456 459 460 462 467 470 470 472 476 479 482

10 Generic Programming and Collection Classes 10.1 Generic Programming . . . . . . . . . . . . . . 10.1.1 Generic Programming in Smalltalk . . . 10.1.2 Generic Programming in C++ . . . . . 10.1.3 Generic Programming in Java . . . . . . 10.1.4 The Java Collection Framework . . . . . 10.1.5 Iterators and for-each Loops . . . . . . . 10.1.6 Equality and Comparison . . . . . . . . 10.1.7 Generics and Wrapper Classes . . . . . 10.2 Lists and Sets . . . . . . . . . . . . . . . . . . . 10.2.1 ArrayList and LinkedList . . . . . . . . 10.2.2 Sorting . . . . . . . . . . . . . . . . . . 10.2.3 TreeSet and HashSet . . . . . . . . . . . 10.2.4 EnumSet . . . . . . . . . . . . . . . . . 10.3 Maps . . . . . . . . . . . . . . . . . . . . . . . . 10.3.1 The Map Interface . . . . . . . . . . . . 10.3.2 Views, SubSets, and SubMaps . . . . . 10.3.3 Hash Tables and Hash Codes . . . . . . 10.4 Programming with the Collection Framework . 10.4.1 Symbol Tables . . . . . . . . . . . . . . 10.4.2 Sets Inside a Map . . . . . . . . . . . . 10.4.3 Using a Comparator . . . . . . . . . . . 10.4.4 Word Counting . . . . . . . . . . . . . . 10.5 Writing Generic Classes and Methods . . . . . 10.5.1 Simple Generic Classes . . . . . . . . . . 10.5.2 Simple Generic Methods . . . . . . . . . 10.5.3 Type Wildcards . . . . . . . . . . . . . 10.5.4 Bounded Types . . . . . . . . . . . . . . Exercises for Chapter 10 . . . . . . . . . . . . . . . . Quiz on Chapter 10 . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

485 . 485 . 486 . 487 . 488 . 489 . 491 . 492 . 495 . 496 . 496 . 499 . 500 . 503 . 504 . 505 . 506 . 509 . 511 . 511 . 512 . 515 . 517 . 519 . 520 . 521 . 523 . 527 . 531 . 535

CONTENTS

v 537 . 537 . 537 . 539 . 540 . 541 . 544 . 545 . 546 . 546 . 549 . 552 . 554 . 555 . 557 . 559 . 561 . 568 . 569 . 571 . 572 . 574 . 578 . 581 . 582 . 585 . 588 . 596 . 596 . 598 . 600 . 606 . 609 611 611 611 617 620 621 622 624 625 627 629 630 633

11 Files and Networking 11.1 Streams, Readers, and Writers . . . . . 11.1.1 Character and Byte Streams . . 11.1.2 PrintWriter . . . . . . . . . . . . 11.1.3 Data Streams . . . . . . . . . . . 11.1.4 Reading Text . . . . . . . . . . . 11.1.5 The Scanner Class . . . . . . . . 11.1.6 Serialized Object I/O . . . . . . 11.2 Files . . . . . . . . . . . . . . . . . . . . 11.2.1 Reading and Writing Files . . . . 11.2.2 Files and Directories . . . . . . . 11.2.3 File Dialog Boxes . . . . . . . . . 11.3 Programming With Files . . . . . . . . . 11.3.1 Copying a File . . . . . . . . . . 11.3.2 Persistent Data . . . . . . . . . . 11.3.3 Files in GUI Programs . . . . . . 11.3.4 Storing Objects in Files . . . . . 11.4 Networking . . . . . . . . . . . . . . . . 11.4.1 URLs and URLConnections . . . 11.4.2 TCP/IP and Client/Server . . . 11.4.3 Sockets . . . . . . . . . . . . . . 11.4.4 A Trivial Client/Server . . . . . 11.4.5 A Simple Network Chat . . . . . 11.5 Network Programming and Threads . . 11.5.1 A Threaded GUI Chat Program. 11.5.2 A Multithreaded Server . . . . . 11.5.3 Distributed Computing . . . . . 11.6 A Brief Introduction to XML . . . . . . 11.6.1 Basic XML Syntax . . . . . . . . 11.6.2 XMLEncoder and XMLDecoder 11.6.3 Working With the DOM . . . . . Exercises for Chapter 11 . . . . . . . . . . . . Quiz on Chapter 11 . . . . . . . . . . . . . . 12 Advanced GUI Programming 12.1 Images and Resources . . . . . . . 12.1.1 Images and BueredImages 12.1.2 Working With Pixels . . . . 12.1.3 Resources . . . . . . . . . . 12.1.4 Cursors and Icons . . . . . 12.1.5 Image File I/O . . . . . . . 12.2 Fancier Graphics . . . . . . . . . . 12.2.1 Measuring Text . . . . . . . 12.2.2 Transparency . . . . . . . . 12.2.3 Antialiasing . . . . . . . . . 12.2.4 Strokes and Paints . . . . . 12.2.5 Transforms . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

vi 12.3 Actions and Buttons . . . . . . . . 12.3.1 Action and AbstractAction 12.3.2 Icons on Buttons . . . . . . 12.3.3 Radio Buttons . . . . . . . 12.3.4 Toolbars . . . . . . . . . . . 12.3.5 Keyboard Accelerators . . . 12.3.6 HTML on Buttons . . . . . 12.4 Complex Components and MVC . 12.4.1 Model-View-Controller . . . 12.4.2 Lists and ListModels . . . . 12.4.3 Tables and TableModels . . 12.4.4 Documents and Editors . . 12.4.5 Custom Components . . . . 12.5 Finishing Touches . . . . . . . . . 12.5.1 The Mandelbrot Set . . . . 12.5.2 Design of the Program . . . 12.5.3 Internationalization . . . . 12.5.4 Events, Events, Events . . . 12.5.5 Custom Dialogs . . . . . . . 12.5.6 Preferences . . . . . . . . . Exercises for Chapter 12 . . . . . . . . . Quiz on Chapter 12 . . . . . . . . . . . Appendix: Source Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

CONTENTS

. . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . .

636 636 638 639 642 643 645 646 646 647 650 654 655 660 660 662 664 666 668 669 671 673 675

Preface
Introduction to Programming Using Java is a free introductory computer programming textbook that uses Java as the language of instruction. It is suitable for use in an introductory programming course and for people who are trying to learn programming on their own. There are no prerequisites beyond a general familiarity with the ideas of computers and programs. There is enough material for a full year of college-level programming. Chapters 1 through 7 can be used as a textbook in a one-semester college-level course or in a year-long high school course. This version of the book covers Java 5.0. It also works well with later versions of Java. (While Java 5.0 introduced major new features that need to be covered in an introductory programming course, Java 6.0 and the upcoming Java 7.0 do not.) Many of the examples in the book use features that were not present before Java 5.0. Note that Java applets appear throughout the pages of the on-line version of this book. Many of those applets will be nonfunctional in Web browsers that do not support Java 5.0. The home web site for this book is http://math.hws.edu/javanotes/. The page at that address contains links for downloading a copy of the web site and for downloading a PDF version of the book.

In style, this is a textbook rather than a tutorial. That is, it concentrates on explaining concepts rather than giving step-by-step how-to-do-it guides. I have tried to use a conversational writing style that might be closer to classroom lecture than to a typical textbook. Youll nd programming exercises at the end of most chapters, and you will nd a detailed solution for each exercise, with the sort of discussion that I would give if I presented the solution in class. (Solutions to the exercises can be found in the on-line version only.) I strongly advise that you read the exercise solutions if you want to get the most out of this book. This is certainly not a Java reference book, and it is not even close to a comprehensive survey of all the features of Java. It is not written as a quick introduction to Java for people who already know another programming language. Instead, it is directed mainly towards people who are learning programming for the rst time, and it is as much about general programming concepts as it is about Java in particular. I believe that Introduction to Programming using Java is fully competitive with the conventionally published, printed programming textbooks that are available on the market. (Well, all right, Ill confess that I think its better.) There are several approaches to teaching Java. One approach uses graphical user interface programming from the very beginning. Some people believe that object oriented programming should also be emphasized from the very beginning. This is not the approach that I take. The approach that I favor starts with the more basic building blocks of programming and builds from there. After an introductory chapter, I cover procedural programming in Chapters 2, 3, and 4. Object-oriented programming is introduced in Chapter 5. Chapters 6 covers the closely related topic of event-oriented programming and graphical user interfaces. Arrays are covered in vii

viii

Preface

Chapter 7. Chapter 8 marks a turning point in the book, moving beyond the fundamental ideas of programming to cover more advanced topics. Chapter 8 is mostly about writing robust and correct programs, but it also has a section on parallel processing and threads. Chapters 9 and 10 cover recursion and data structures, including the Java Collection Framework. Chapter 11 is about les and networking. Finally, Chapter 12 returns to the topic of graphical user interface programming to cover some of Javas more advanced capabilities.


Major changes were made for the fth edition of this book. Perhaps the most signicant change is the use of parameterized types in the chapter on generic programming. Parameterized typesJavas version of templateswere the most eagerly anticipated new feature in Java 5.0. Other new features in Java 5.0 are also covered. Enumerated types are introduced, although they are not covered in their full complexity. The for-each loop is covered and is used extensively. Formatted output is also used extensively, and the Scanner class is covered (though not until Chapter 11). Static import is covered briey, as are variable arity methods. The non-standard TextIO class that I use for input in the rst half of the book has been rewritten to support formatted output. I have also added some le I/O capabilities to this class to make it possible to cover some examples that use les early in the book. Javadoc comments are covered for the rst time in the fth edition. Almost all code examples have been revised to use Javadoc-style comments. The coverage of graphical user interface programming has been reorganized, much of it has been rewritten, and new material has been added. In previous editions, I emphasized applets. Stand-alone GUI applications were covered at the end, almost as an afterthought. In the fth edition, the emphasis on applets is gone, and almost all examples are presented as stand-alone applications. However, applet versions of each example are still presented on the web pages of the on-line version of the book. The chapter on advanced GUI programming has been moved to the end, and a signicant amount of new material has been added, including coverage of some of the features of Graphics2D. Aside from the changes in content, the appearance of the book has been improved, especially the appearance of the PDF version. For the rst time, the quality of the PDF approaches that of conventional textbooks. Version 5.1 of this book is a minor update of Version 5.0. A number of typographical and coding errors in Version 5.0 have been corrected. Also, the discussion of the Eclipse IDE in Section 2.6 has been updated to be consistent with more recent versions of Eclipse.


The latest complete edition of Introduction to Programming using Java is always available on line at http://math.hws.edu/javanotes/. The rst version of the book was written in 1996, and there have been several editions since then. All editions are archived at the following Web addresses: First edition: http://math.hws.edu/eck/cs124/javanotes1/ (Covers Java 1.0.) Second edition: http://math.hws.edu/eck/cs124/javanotes2/ (Covers Java 1.1.) Third edition: http://math.hws.edu/eck/cs124/javanotes3/ (Covers Java 1.1.) Fourth edition: http://math.hws.edu/eck/cs124/javanotes4/ (Covers Java 1.4.) Fifth edition: http://math.hws.edu/eck/cs124/javanotes5/ (Covers Java 5.0.) Introduction to Programming using Java is free, but it is not in the public domain. As of Version 5.0, it is published under the terms of the Creative Commons Attribution-Share

Preface

ix

Alike 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/bysa/2.5/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA. This license allows redistribution and modication under certain terms. For example, you can: Post an unmodied copy of the on-line version on your own Web site (including the parts that list the author and state the license under which it is distributed!). Give away or sell printed, unmodied copies of this book, as long as they meet the requirements of the license. Make modied copies of the complete book or parts of it and post them on the web or otherwise distribute them, provided that attribution to the author is given, the modications are clearly noted, and the modied copies are distributed under the same license as the original. This includes translations to other languages. While it is not actually required by the license, I do appreciate hearing from people who are using or distributing my work.


A technical note on production: The on-line and PDF versions of this book are created from a single source, which is written largely in XML. To produce the PDF version, the XML is processed into a form that can be used by the TeX typesetting program. In addition to XML les, the source includes DTDs, XSLT transformations, Java source code les, image les, a TeX macro le, and a couple of scripts that are used in processing. I have not made the source materials available for download, since they are not in a clean enough form to be publishable, and because it would require a fair amount of expertise to make any use of them. However, they are not meant to be secret, and I am willing to make them available on request.


Professor David J. Eck Department of Mathematics and Computer Science Hobart and William Smith Colleges Geneva, New York 14456, USA Email: eck@hws.edu WWW: http://math.hws.edu/eck/

Preface

Chapter 8

Correctness and Robustness


previous chapters, we have covered the fundamentals of programming. The chapters that follow will cover more advanced aspects of programming. The ideas that are presented will be a little more complex and the programs that use them a little more complicated. This chapter is a kind of turning point in which we look at the problem of getting such complex programs right. Computer programs that fail are much too common. Programs are fragile. A tiny error can cause a program to misbehave or crash. Most of us are familiar with this from our own experience with computers. And weve all heard stories about software glitches that cause spacecraft to crash, telephone service to fail, and, in a few cases, people to die. Programs dont have to be as bad as they are. It might well be impossible to guarantee that programs are problem-free, but careful programming and well-designed programming tools can help keep the problems to a minimum. This chapter will look at issues of correctness and robustness of programs. It also looks more closely at exceptions and the try..catch statement, and it introduces assertions, another of the tools that Java provides as an aid in writing correct programs. This chapter also includes sections on two topics that are only indirectly related to correctness and robustness. Section 8.5 will introduce threads while Section 8.6 looks briey at the Analysis of Algorithms. Both of these topics do t into this chapter in its role as a turning point, since they are part of the foundation for more advanced programming.

In

8.1

Introduction to Correctness and Robustness

A program is correct if it accomplishes the task that it was designed to perform. It is robust if it can handle illegal inputs and other unexpected situations in a reasonable way. For example, consider a program that is designed to read some numbers from the user and then print the same numbers in sorted order. The program is correct if it works for any set of input numbers. It is robust if it can also deal with non-numeric input by, for example, printing an error message and ignoring the bad input. A non-robust program might crash or give nonsensical output in the same circumstance. Every program should be correct. (A sorting program that doesnt sort correctly is pretty useless.) Its not the case that every program needs to be completely robust. It depends on who will use it and how it will be used. For example, a small utility program that you write for your own use doesnt have to be particularly robust. The question of correctness is actually more subtle than it might appear. A programmer
373

374

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

works from a specication of what the program is supposed to do. The programmers work is correct if the program meets its specication. But does that mean that the program itself is correct? What if the specication is incorrect or incomplete? A correct program should be a correct implementation of a complete and correct specication. The question is whether the specication correctly expresses the intention and desires of the people for whom the program is being written. This is a question that lies largely outside the domain of computer science.

8.1.1

Horror Stories

Most computer users have personal experience with programs that dont work or that crash. In many cases, such problems are just annoyances, but even on a personal computer there can be more serious consequences, such as lost work or lost money. When computers are given more important tasks, the consequences of failure can be proportionately more serious. Just a few years ago, the failure of two multi-million space missions to Mars was prominent in the news. Both failures were probably due to software problems, but in both cases the problem was not with an incorrect program as such. In September 1999, the Mars Climate Orbiter burned up in the Martian atmosphere because data that was expressed in English units of measurement (such as feet and pounds) was entered into a computer program that was designed to use metric units (such as centimeters and grams). A few months later, the Mars Polar Lander probably crashed because its software turned o its landing engines too soon. The program was supposed to detect the bump when the spacecraft landed and turn o the engines then. It has been determined that deployment of the landing gear might have jarred the spacecraft enough to activate the program, causing it to turn o the engines when the spacecraft was still in the air. The unpowered spacecraft would then have fallen to the Martian surface. A more robust system would have checked the altitude before turning o the engines! There are many equally dramatic stories of problems caused by incorrect or poorly written software. Lets look at a few incidents recounted in the book Computer Ethics by Tom Forester and Perry Morrison. (This book covers various ethical issues in computing. It, or something like it, is essential reading for any student of computer science.) In 1985 and 1986, one person was killed and several were injured by excess radiation, while undergoing radiation treatments by a mis-programmed computerized radiation machine. In another case, over a ten-year period ending in 1992, almost 1,000 cancer patients received radiation dosages that were 30% less than prescribed because of a programming error. In 1985, a computer at the Bank of New York started destroying records of on-going security transactions because of an error in a program. It took less than 24 hours to x the program, but by that time, the bank was out $5,000,000 in overnight interest payments on funds that it had to borrow to cover the problem. The programming of the inertial guidance system of the F-16 ghter plane would have turned the plane upside-down when it crossed the equator, if the problem had not been discovered in simulation. The Mariner 18 space probe was lost because of an error in one line of a program. The Gemini V space capsule missed its scheduled landing target by a hundred miles, because a programmer forgot to take into account the rotation of the Earth. In 1990, AT&Ts long-distance telephone service was disrupted throughout the United States when a newly loaded computer program proved to contain a bug. These are just a few examples. Software problems are all too common. As programmers, we need to understand why that is true and what can be done about it.

8.1. INTRODUCTION TO CORRECTNESS AND ROBUSTNESS

375

8.1.2

Java to the Rescue

Part of the problem, according to the inventors of Java, can be traced to programming languages themselves. Java was designed to provide some protection against certain types of errors. How can a language feature help prevent errors? Lets look at a few examples. Early programming languages did not require variables to be declared. In such languages, when a variable name is used in a program, the variable is created automatically. You might consider this more convenient than having to declare every variable explicitly. But there is an unfortunate consequence: An inadvertent spelling error might introduce an extra variable that you had no intention of creating. This type of error was responsible, according to one famous story, for yet another lost spacecraft. In the FORTRAN programming language, the command DO 20 I = 1,5 is the rst statement of a counting loop. Now, spaces are insignicant in FORTRAN, so this is equivalent to DO20I=1,5. On the other hand, the command DO20I=1.5, with a period instead of a comma, is an assignment statement that assigns the value 1.5 to the variable DO20I. Supposedly, the inadvertent substitution of a period for a comma in a statement of this type caused a rocket to blow up on take-o. Because FORTRAN doesnt require variables to be declared, the compiler would be happy to accept the statement DO20I=1.5. It would just create a new variable named DO20I. If FORTRAN required variables to be declared, the compiler would have complained that the variable DO20I was undeclared. While most programming languages today do require variables to be declared, there are other features in common programming languages that can cause problems. Java has eliminated some of these features. Some people complain that this makes Java less ecient and less powerful. While there is some justice in this criticism, the increase in security and robustness is probably worth the cost in most circumstances. The best defense against some types of errors is to design a programming language in which the errors are impossible. In other cases, where the error cant be completely eliminated, the language can be designed so that when the error does occur, it will automatically be detected. This will at least prevent the error from causing further harm, and it will alert the programmer that there is a bug that needs xing. Lets look at a few cases where the designers of Java have taken these approaches. An array is created with a certain number of locations, numbered from zero up to some specied maximum index. It is an error to try to use an array location that is outside of the specied range. In Java, any attempt to do so is detected automatically by the system. In some other languages, such as C and C++, its up to the programmer to make sure that the index is within the legal range. Suppose that an array, A, has three locations, A[0], A[1], and A[2]. Then A[3], A[4], and so on refer to memory locations beyond the end of the array. In Java, an attempt to store data in A[3] will be detected. The program will be terminated (unless the error is caught, as discussed in Section 3.7). In C or C++, the computer will just go ahead and store the data in memory that is not part of the array. Since there is no telling what that memory location is being used for, the result will be unpredictable. The consequences could be much more serious than a terminated program. (See, for example, the discussion of buer overow errors later in this section.) Pointers are a notorious source of programming errors. In Java, a variable of object type holds either a pointer to an object or the special value null. Any attempt to use a null value as if it were a pointer to an actual object will be detected by the system. In some other languages, again, its up to the programmer to avoid such null pointer errors. In my old Macintosh computer, a null pointer was actually implemented as if it were a pointer to memory location zero. A program could use a null pointer to change values stored in memory near location zero. Unfortunately, the Macintosh stored important system data in those locations. Changing that

376

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

data could cause the whole system to crash, a consequence more severe than a single failed program. Another type of pointer error occurs when a pointer value is pointing to an object of the wrong type or to a segment of memory that does not even hold a valid object at all. These types of errors are impossible in Java, which does not allow programmers to manipulate pointers directly. In other languages, it is possible to set a pointer to point, essentially, to any location in memory. If this is done incorrectly, then using the pointer can have unpredictable results. Another type of error that cannot occur in Java is a memory leak. In Java, once there are no longer any pointers that refer to an object, that object is garbage collected so that the memory that it occupied can be reused. In other languages, it is the programmers responsibility to return unused memory to the system. If the programmer fails to do this, unused memory can build up, leaving less memory for programs and data. There is a story that many common programs for older Windows computers had so many memory leaks that the computer would run out of memory after a few days of use and would have to be restarted. Many programs have been found to suer from buer overow errors. Buer overow errors often make the news because they are responsible for many network security problems. When one computer receives data from another computer over a network, that data is stored in a buer. The buer is just a segment of memory that has been allocated by a program to hold data that it expects to receive. A buer overow occurs when more data is received than will t in the buer. The question is, what happens then? If the error is detected by the program or by the networking software, then the only thing that has happened is a failed network data transmission. The real problem occurs when the software does not properly detect buer overows. In that case, the software continues to store data in memory even after the buer is lled, and the extra data goes into some part of memory that was not allocated by the program as part of the buer. That memory might be in use for some other purpose. It might contain important data. It might even contain part of the program itself. This is where the real security issues come in. Suppose that a buer overow causes part of a program to be replaced with extra data received over a network. When the computer goes to execute the part of the program that was replaced, its actually executing data that was received from another computer. That data could be anything. It could be a program that crashes the computer or takes it over. A malicious programmer who nds a convenient buer overow error in networking software can try to exploit that error to trick other computers into executing his programs. For software written completely in Java, buer overow errors are impossible. The language simply does not provide any way to store data into memory that has not been properly allocated. To do that, you would need a pointer that points to unallocated memory or you would have to refer to an array location that lies outside the range allocated for the array. As explained above, neither of these is possible in Java. (However, there could conceivably still be errors in Javas standard classes, since some of the methods in these classes are actually written in the C programming language rather than in Java.) Its clear that language design can help prevent errors or detect them when they occur. Doing so involves restricting what a programmer is allowed to do. Or it requires tests, such as checking whether a pointer is null, that take some extra processing time. Some programmers feel that the sacrice of power and eciency is too high a price to pay for the extra security. In some applications, this is true. However, there are many situations where safety and security are primary considerations. Java is designed for such situations.

8.1. INTRODUCTION TO CORRECTNESS AND ROBUSTNESS

377

8.1.3

Problems Remain in Java

There is one area where the designers of Java chose not to detect errors automatically: numerical computations. In Java, a value of type int is represented as a 32-bit binary number. With 32 bits, its possible to represent a little over four billion dierent values. The values of type int range from -2147483648 to 2147483647. What happens when the result of a computation lies outside this range? For example, what is 2147483647 + 1? And what is 2000000000 * 2? The mathematically correct result in each case cannot be represented as a value of type int. These are examples of integer overow . In most cases, integer overow should be considered an error. However, Java does not automatically detect such errors. For example, it will compute the value of 2147483647 + 1 to be the negative number, -2147483648. (What happens is that any extra bits beyond the 32-nd bit in the correct answer are discarded. Values greater than 2147483647 will wrap around to negative values. Mathematically speaking, the result is always correct modulo 232 .) For example, consider the 3N+1 program, which was discussed in Subsection 3.2.2. Starting from a positive integer N, the program computes a certain sequence of integers:
while ( N != 1 ) { if ( N % 2 == 0 ) // If N is even... N = N / 2; else N = 3 * N + 1; System.out.println(N); }

But there is a problem here: If N is too large, then the value of 3*N+1 will not be mathematically correct because of integer overow. The problem arises whenever 3*N+1 > 2147483647, that is when N > 2147483646/3. For a completely correct program, we should check for this possibility before computing 3*N+1:
while ( N != 1 ) { if ( N % 2 == 0 ) // If N is even... N = N / 2; else { if (N > 2147483646/3) { System.out.println("Sorry, but the value of N has become"); System.out.println("too large for your computer!"); break; } N = 3 * N + 1; } System.out.println(N); }

The problem here is not that the original algorithm for computing 3N+1 sequences was wrong. The problem is that it just cant be correctly implemented using 32-bit integers. Many programs ignore this type of problem. But integer overow errors have been responsible for their share of serious computer failures, and a completely robust program should take the possibility of integer overow into account. (The infamous Y2K bug was, in fact, just this sort of error.) For numbers of type double, there are even more problems. There are still overow errors, which occur when the result of a computation is outside the range of values that can be represented as a value of type double. This range extends up to about 1.7 times 10 to the

378

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

power 308. Numbers beyond this range do not wrap around to negative values. Instead, they are represented by special values that have no real numerical equivalent. The special values Double.POSITIVE INFINITY and Double.NEGATIVE INFINITY represent numbers outside the range of legal values. For example, 20 * 1e308 is computed to be Double.POSITIVE INFINITY. Another special value of type double, Double.NaN, represents an illegal or undened result. (NaN stands for Not a Number.) For example, the result of dividing by zero or taking the square root of a negative number is Double.NaN. You can test whether a number x is this special non-a-number value by calling the boolean-valued function Double.isNaN(x). For real numbers, there is the added complication that most real numbers can only be represented approximately on a computer. A real number can have an innite number of digits after the decimal point. A value of type double is only accurate to about 15 digits. The real number 1/3, for example, is the repeating decimal 0.333333333333..., and there is no way to represent it exactly using a nite number of digits. Computations with real numbers generally involve a loss of accuracy. In fact, if care is not exercised, the result of a large number of such computations might be completely wrong! There is a whole eld of computer science, known as numerical analysis, which is devoted to studying algorithms that manipulate real numbers. So you see that not all possible errors are avoided or detected automatically in Java. Furthermore, even when an error is detected automatically, the systems default response is to report the error and terminate the program. This is hardly robust behavior! So, a Java programmer still needs to learn techniques for avoiding and dealing with errors. These are the main topics of the rest of this chapter.

8.2

Writing Correct Programs

Correct

programs dont just happen. It takes planning and attention to detail to avoid errors in programs. There are some techniques that programmers can use to increase the likelihood that their programs are correct.

8.2.1

Provably Correct Programs

In some cases, it is possible to prove that a program is correct. That is, it is possible to demonstrate mathematically that the sequence of computations represented by the program will always produce the correct result. Rigorous proof is dicult enough that in practice it can only be applied to fairly small programs. Furthermore, it depends on the fact that the correct result has been specied correctly and completely. As Ive already pointed out, a program that correctly meets its specication is not useful if its specication was wrong. Nevertheless, even in everyday programming, we can apply some of the ideas and techniques that are used in proving that programs are correct. The fundamental ideas are process and state. A state consists of all the information relevant to the execution of a program at a given moment during its execution. The state includes, for example, the values of all the variables in the program, the output that has been produced, any input that is waiting to be read, and a record of the position in the program where the computer is working. A process is the sequence of states that the computer goes through as it executes the program. From this point of view, the meaning of a statement in a program can be expressed in terms of the eect that the execution of that statement has on the computers state. As a simple example, the meaning of the assignment statement x = 7; is that after this statement is executed, the value of the variable x will be 7. We can be absolutely

8.2. WRITING CORRECT PROGRAMS

379

sure of this fact, so it is something upon which we can build part of a mathematical proof. In fact, it is often possible to look at a program and deduce that some fact must be true at a given point during the execution of a program. For example, consider the do loop:
do { TextIO.put("Enter a positive integer: "); N = TextIO.getlnInt(); } while (N <= 0);

After this loop ends, we can be absolutely sure that the value of the variable N is greater than zero. The loop cannot end until this condition is satised. This fact is part of the meaning of the while loop. More generally, if a while loop uses the test while ( condition ), then after the loop ends, we can be sure that the condition is false. We can then use this fact to draw further deductions about what happens as the execution of the program continues. (With a loop, by the way, we also have to worry about the question of whether the loop will ever end. This is something that has to be veried separately.) A fact that can be proven to be true after a given program segment has been executed is called a postcondition of that program segment. Postconditions are known facts upon which we can build further deductions about the behavior of the program. A postcondition of a program as a whole is simply a fact that can be proven to be true after the program has nished executing. A program can be proven to be correct by showing that the postconditions of the program meet the programs specication. Consider the following program segment, where all the variables are of type double:
disc = B*B - 4*A*C; x = (-B + Math.sqrt(disc)) / (2*A);

The quadratic formula (from high-school mathematics) assures us that the value assigned to x is a solution of the equation A*x2 + B*x + C = 0, provided that the value of disc is greater than or equal to zero and the value of A is not zero. If we can assume or guarantee that B*B-4*A*C >= 0 and that A != 0, then the fact that x is a solution of the equation becomes a postcondition of the program segment. We say that the condition, B*B-4*A*C >= 0 is a precondition of the program segment. The condition that A != 0 is another precondition. A precondition is dened to be condition that must be true at a given point in the execution of a program in order for the program to continue correctly. A precondition is something that you want to be true. Its something that you have to check or force to be true, if you want your program to be correct. Weve encountered preconditions and postconditions once before, in Subsection 4.6.1. That section introduced preconditions and postconditions as a way of specifying the contract of a subroutine. As the terms are being used here, a precondition of a subroutine is just a precondition of the code that makes up the denition of the subroutine, and the postcondition of a subroutine is a postcondition of the same code. In this section, we have generalized these terms to make them more useful in talking about program correctness. Lets see how this works by considering a longer program segment:
do { TextIO.putln("Enter A, B, and C. TextIO.put("A = "); A = TextIO.getlnDouble(); TextIO.put("B = "); B = TextIO.getlnDouble(); TextIO.put("C = "); B*B-4*A*C must be >= 0.");

380

CHAPTER 8. CORRECTNESS AND ROBUSTNESS


C = TextIO.getlnDouble(); if (A == 0 || B*B - 4*A*C < 0) TextIO.putln("Your input is illegal. } while (A == 0 || B*B - 4*A*C < 0); disc = B*B - 4*A*C; x = (-B + Math.sqrt(disc)) / (2*A);

Try again.");

After the loop ends, we can be sure that B*B-4*A*C >= 0 and that A != 0. The preconditions for the last two lines are fullled, so the postcondition that x is a solution of the equation A*x2 + B*x + C = 0 is also valid. This program segment correctly and provably computes a solution to the equation. (Actually, because of problems with representing numbers on computers, this is not 100% true. The algorithm is correct, but the program is not a perfect implementation of the algorithm. See the discussion in Subsection 8.1.3.) Here is another variation, in which the precondition is checked by an if statement. In the rst part of the if statement, where a solution is computed and printed, we know that the preconditions are fullled. In the other parts, we know that one of the preconditions fails to hold. In any case, the program is correct.
TextIO.putln("Enter your values for A, B, and C."); TextIO.put("A = "); A = TextIO.getlnDouble(); TextIO.put("B = "); B = TextIO.getlnDouble(); TextIO.put("C = "); C = TextIO.getlnDouble(); if (A != 0 && B*B - 4*A*C >= 0) { disc = B*B - 4*A*C; x = (-B + Math.sqrt(disc)) / (2*A); TextIO.putln("A solution of A*X*X + B*X + C = 0 is " + x); } else if (A == 0) { TextIO.putln("The value of A cannot be zero."); } else { TextIO.putln("Since B*B - 4*A*C is less than zero, the"); TextIO.putln("equation A*X*X + B*X + C = 0 has no solution."); }

Whenever you write a program, its a good idea to watch out for preconditions and think about how your program handles them. Often, a precondition can oer a clue about how to write the program. For example, every array reference, such as A[i], has a precondition: The index must be within the range of legal indices for the array. For A[i], the precondition is that 0 <= i < A.length. The computer will check this condition when it evaluates A[i], and if the condition is not satised, the program will be terminated. In order to avoid this, you need to make sure that the index has a legal value. (There is actually another precondition, namely that A is not null, but lets leave that aside for the moment.) Consider the following code, which searches for the number x in the array A and sets the value of i to be the index of the array element that contains x:

8.2. WRITING CORRECT PROGRAMS


i = 0; while (A[i] != x) { i++; }

381

As this program segment stands, it has a precondition, namely that x is actually in the array. If this precondition is satised, then the loop will end when A[i] == x. That is, the value of i when the loop ends will be the position of x in the array. However, if x is not in the array, then the value of i will just keep increasing until it is equal to A.length. At that time, the reference to A[i] is illegal and the program will be terminated. To avoid this, we can add a test to make sure that the precondition for referring to A[i] is satised:
i = 0; while (i < A.length && A[i] != x) { i++; }

Now, the loop will denitely end. After it ends, i will satisfy either i == A.length or A[i] == x. An if statement can be used after the loop to test which of these conditions caused the loop to end:
i = 0; while (i < A.length && A[i] != x) { i++; } if (i == A.length) System.out.println("x is not in the array"); else System.out.println("x is in position " + i);

8.2.2

Robust Handling of Input

One place where correctness and robustness are importantand especially dicultis in the processing of input data, whether that data is typed in by the user, read from a le, or received over a network. Files and networking will be covered in Chapter 11, which will make essential use of material that will be covered in the next two sections of this chapter. For now, lets look at an example of processing user input. Examples in this textbook use my TextIO class for reading input from the user. This class has built-in error handling. For example, the function TextIO.getDouble() is guaranteed to return a legal value of type double. If the user types an illegal value, then TextIO will ask the user to re-enter their response; your program never sees the illegal value. However, this approach can be clumsy and unsatisfactory, especially when the user is entering complex data. In the following example, Ill do my own error-checking. Sometimes, its useful to be able to look ahead at whats coming up in the input without actually reading it. For example, a program might need to know whether the next item in the input is a number or a word. For this purpose, the TextIO class includes the function TextIO.peek(). This function returns a char which is the next character in the users input, but it does not actually read that character. If the next thing in the input is an end-of-line, then TextIO.peek() returns the new-line character, \n. Often, what we really need to know is the next non-blank character in the users input. Before we can test this, we need to skip past any spaces (and tabs). Here is a function that does

382

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

this. It uses TextIO.peek() to look ahead, and it reads characters until the next character in the input is either an end-of-line or some non-blank character. (The function TextIO.getAnyChar() reads and returns the next character in the users input, even if that character is a space. By contrast, the more common TextIO.getChar() would skip any blanks and then read and return the next non-blank character. We cant use TextIO.getChar() here since the object is to skip the blanks without reading the next non-blank character.)
/** * Reads past any blanks and tabs in the input. * Postcondition: The next character in the input is an * end-of-line or a non-blank character. */ static void skipBlanks() { char ch; ch = TextIO.peek(); while (ch == || ch == \t) { // Next character is a space or tab; read it // and look at the character that follows it. ch = TextIO.getAnyChar(); ch = TextIO.peek(); } } // end skipBlanks()

(In fact, this operation is so common that it is built into the most recent version of TextIO. The method TextIO.skipBlanks() does essentially the same thing as the skipBlanks() method presented here.) An example in Subsection 3.5.3 allowed the user to enter length measurements such as 3 miles or 1 foot. It would then convert the measurement into inches, feet, yards, and miles. But people commonly use combined measurements such as 3 feet 7 inches. Lets improve the program so that it allows inputs of this form. More specically, the user will input lines containing one or more measurements such as 1 foot or 3 miles 20 yards 2 feet. The legal units of measure are inch, foot, yard, and mile. The program will also recognize plurals (inches, feet, yards, miles) and abbreviations (in, ft, yd, mi). Lets write a subroutine that will read one line of input of this form and compute the equivalent number of inches. The main program uses the number of inches to compute the equivalent number of feet, yards, and miles. If there is any error in the input, the subroutine will print an error message and return the value -1. The subroutine assumes that the input line is not empty. The main program tests for this before calling the subroutine and uses an empty line as a signal for ending the program. Ignoring the possibility of illegal inputs, a pseudocode algorithm for the subroutine is
inches = 0 // This will be the total number of inches while there is more input on the line: read the numerical measurement read the units of measure add the measurement to inches return inches

We can test whether there is more input on the line by checking whether the next non-blank character is the end-of-line character. But this test has a precondition: Before we can test the next non-blank character, we have to skip over any blanks. So, the algorithm becomes

8.2. WRITING CORRECT PROGRAMS


inches = 0 skipBlanks() while TextIO.peek() is not \n: read the numerical measurement read the unit of measure add the measurement to inches skipBlanks() return inches

383

Note the call to skipBlanks() at the end of the while loop. This subroutine must be executed before the computer returns to the test at the beginning of the loop. More generally, if the test in a while loop has a precondition, then you have to make sure that this precondition holds at the end of the while loop, before the computer jumps back to re-evaluate the test. What about error checking? Before reading the numerical measurement, we have to make sure that there is really a number there to read. Before reading the unit of measure, we have to test that there is something there to read. (The number might have been the last thing on the line. An input such as 3, without a unit of measure, is illegal.) Also, we have to check that the unit of measure is one of the valid units: inches, feet, yards, or miles. Here is an algorithm that includes error-checking:
inches = 0 skipBlanks() while TextIO.peek() is not \n: if the next character is not a digit: report an error and return -1 Let measurement = TextIO.getDouble(); skipBlanks() // Precondition for the next test!! if the next character is end-of-line: report an error and return -1 Let units = TextIO.getWord() if the units are inches: add measurement to inches else if the units are feet: add 12*measurement to inches else if the units are yards: add 36*measurement to inches else if the units are miles: add 12*5280*measurement to inches else report an error and return -1 skipBlanks() return inches

As you can see, error-testing adds signicantly to the complexity of the algorithm. Yet this is still a fairly simple example, and it doesnt even handle all the possible errors. For example, if the user enters a numerical measurement such as 1e400 that is outside the legal range of values of type double, then the program will fall back on the default error-handling in TextIO. Something even more interesting happens if the measurement is 1e308 miles. The number 1e308 is legal, but the corresponding number of inches is outside the legal range of

384

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

values for type double. As mentioned in the previous section, the computer will get the value Double.POSITIVE INFINITY when it does the computation. Here is the subroutine written out in Java:
/** * Reads the users input measurement from one line of input. * Precondition: The input line is not empty. * Postcondition: If the users input is legal, the measurement * is converted to inches and returned. If the * input is not legal, the value -1 is returned. * The end-of-line is NOT read by this routine. */ static double readMeasurement() { double inches; // Total number of inches in users measurement. // One measurement, // such as the 12 in "12 miles" // The units specified for the measurement, // such as "miles"

double measurement; String units; char ch;

// Used to peek at next character in the users input. // No inches have yet been read.

inches = 0;

skipBlanks(); ch = TextIO.peek(); /* As long as there is more input on the line, read a measurement and add the equivalent number of inches to the variable, inches. If an error is detected during the loop, end the subroutine immediately by returning -1. */ while (ch != \n) { /* Get the next measurement and the units. Before reading anything, make sure that a legal value is there to read. */ if ( ! Character.isDigit(ch) ) { TextIO.putln( "Error: Expected to find a number, but found " + ch); return -1; } measurement = TextIO.getDouble(); skipBlanks(); if (TextIO.peek() == \n) { TextIO.putln( "Error: Missing unit of measure at end of line."); return -1; } units = TextIO.getWord(); units = units.toLowerCase(); /* Convert the measurement to inches and add it to the total. */ if (units.equals("inch") || units.equals("inches") || units.equals("in")) { inches += measurement;

8.3. EXCEPTIONS AND TRY..CATCH


} else if (units.equals("foot") || units.equals("feet") || units.equals("ft")) { inches += measurement * 12; } else if (units.equals("yard") || units.equals("yards") || units.equals("yd")) { inches += measurement * 36; } else if (units.equals("mile") || units.equals("miles") || units.equals("mi")) { inches += measurement * 12 * 5280; } else { TextIO.putln("Error: \"" + units + "\" is not a legal unit of measure."); return -1; } /* Look ahead to see whether the next thing on the line is the end-of-line. */ skipBlanks(); ch = TextIO.peek(); } // end while

385

return inches; } // end readMeasurement()

The source code for the complete program can be found in the le LengthConverter2.java.

8.3

Exceptions and try..catch

Getting a program to work under ideal circumstances is usually a lot easier than making the program robust. A robust program can survive unusual or exceptional circumstances without crashing. One approach to writing robust programs is to anticipate the problems that might arise and to include tests in the program for each possible problem. For example, a program will crash if it tries to use an array element A[i], when i is not within the declared range of indices for the array A. A robust program must anticipate the possibility of a bad index and guard against it. One way to do this is to write the program in a way that ensures that the index is in the legal range. Another way is to test whether the index value is legal before using it in the array. This could be done with an if statement:
if (i < 0 || i >= A.length) { ... // Do something to handle the out-of-range index, i } else { ... // Process the array element, A[i] }

386

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

There are some problems with this approach. It is dicult and sometimes impossible to anticipate all the possible things that might go wrong. Its not always clear what to do when an error is detected. Furthermore, trying to anticipate all the possible problems can turn what would otherwise be a straightforward program into a messy tangle of if statements.

8.3.1

Exceptions and Exception Classes

We have already seen that Java (like its cousin, C++) provides a neater, more structured alternative method for dealing with errors that can occur while a program is running. The method is referred to as exception handling . The word exception is meant to be more general than error. It includes any circumstance that arises as the program is executed which is meant to be treated as an exception to the normal ow of control of the program. An exception might be an error, or it might just be a special case that you would rather not have clutter up your elegant algorithm. When an exception occurs during the execution of a program, we say that the exception is thrown. When this happens, the normal ow of the program is thrown o-track, and the program is in danger of crashing. However, the crash can be avoided if the exception is caught and handled in some way. An exception can be thrown in one part of a program and caught in a dierent part. An exception that is not caught will generally cause the program to crash. (More exactly, the thread that throws the exception will crash. In a multithreaded program, it is possible for other threads to continue even after one crashes. We will cover threads in Section 8.5. In particular, GUI programs are multithreaded, and parts of the program might continue to function even while other parts are non-functional because of exceptions.) By the way, since Java programs are executed by a Java interpreter, having a program crash simply means that it terminates abnormally and prematurely. It doesnt mean that the Java interpreter will crash. In eect, the interpreter catches any exceptions that are not caught by the program. The interpreter responds by terminating the program. In many other programming languages, a crashed program will sometimes crash the entire system and freeze the computer until it is restarted. With Java, such system crashes should be impossiblewhich means that when they happen, you have the satisfaction of blaming the system rather than your own program. Exceptions were introduced in Section 3.7, along with the try..catch statement, which is used to catch and handle exceptions. However, that section did not cover the complete syntax of try..catch or the full complexity of exceptions. In this section, we cover these topics in full detail.


When an exception occurs, the thing that is actually thrown is an object. This object can carry information (in its instance variables) from the point where the exception occurs to the point where it is caught and handled. This information always includes the subroutine call stack , which is a list of the subroutines that were being executed when the exception was thrown. (Since one subroutine can call another, several subroutines can be active at the same time.) Typically, an exception object also includes an error message describing what happened to cause the exception, and it can contain other data as well. All exception objects must belong to a subclass of the standard class java.lang.Throwable. In general, each dierent type of exception is represented by its own subclass of Throwable, and these subclasses are arranged in a fairly complex class hierarchy that shows the relationship among various types of exception. Throwable has two direct subclasses, Error and Exception. These two subclasses in turn have

8.3. EXCEPTIONS AND TRY..CATCH

387

many other predened subclasses. In addition, a programmer can create new exception classes to represent new types of exception. Most of the subclasses of the class Error represent serious errors within the Java virtual machine that should ordinarily cause program termination because there is no reasonable way to handle them. In general, you should not try to catch and handle such errors. An example is a ClassFormatError, which occurs when the Java virtual machine nds some kind of illegal data in a le that is supposed to contain a compiled Java class. If that class was being loaded as part of the program, then there is really no way for the program to proceed. On the other hand, subclasses of the class Exception represent exceptions that are meant to be caught. In many cases, these are exceptions that might naturally be called errors, but they are errors in the program or in input data that a programmer can anticipate and possibly respond to in some reasonable way. (However, you should avoid the temptation of saying, Well, Ill just put a thing here to catch all the errors that might occur, so my program wont crash. If you dont have a reasonable way to respond to the error, its best just to let the program crash, because trying to go on will probably only lead to worse things down the roadin the worst case, a program that gives an incorrect answer without giving you any indication that the answer might be wrong!) The class Exception has its own subclass, RuntimeException. This class groups together many common exceptions, including all those that have been covered in previous sections. For example, IllegalArgumentException and NullPointerException are subclasses of RuntimeException. A RuntimeException generally indicates a bug in the program, which the programmer should x. RuntimeExceptions and Errors share the property that a program can simply ignore the possibility that they might occur. (Ignoring here means that you are content to let your program crash if the exception occurs.) For example, a program does this every time it uses an array reference like A[i] without making arrangements to catch a possible ArrayIndexOutOfBoundsException. For all other exception classes besides Error, RuntimeException, and their subclasses, exception-handling is mandatory in a sense that Ill discuss below. The following diagram is a class hierarchy showing the class Throwable and just a few of its subclasses. Classes that require mandatory exception-handling are shown in italic:
e n o i t p e c " x e l E t b e . a k s f n c w e o o s o o i s S r e t a h p l m e T c o c " s b x s u E d s s n O a n s o l a I i t n t c i o i p e t e h p c n e x T c o i E x t F E p s e O n d c E o n x i u t E o p d e B e f c t x p O t u E u r r O e t x e n I d l n I b y a a w r r o r n A h o n n i T t o o i i p t t e p p c e e x c c E x x e r E E t t o m r a n i r t e m E n r m u o u R F g f r r A e l b a g m e u l l N I

The class Throwable includes several instance methods that can be used with any exception object. If e is of type Throwable (or one of its subclasses), then e.getMessage() is a function

388

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

that returns a String that describes the exception. The function e.toString(), which is used by the system whenever it needs a string representation of the object, returns a String that contains the name of the class to which the exception belongs as well as the same string that would be returned by e.getMessage(). And e.printStackTrace() writes a stack trace to standard output that tells which subroutines were active when the exception occurred. A stack trace can be very useful when you are trying to determine the cause of the problem. (Note that if an exception is not caught by the program, then the system automatically prints the stack trace to standard output.)

8.3.2

The try Statement

To catch exceptions in a Java program, you need a try statement. We have been using such statements since Section 3.7, but the full syntax of the try statement is more complicated than what was presented there. The try statements that we have used so far had a syntax similar to the following example:
try { double determinant = M[0][0]*M[1][1] System.out.println("The determinant of } catch ( ArrayIndexOutOfBoundsException e ) System.out.println("M is the wrong size e.printStackTrace(); } M[0][1]*M[1][0]; M is " + determinant); { to have a determinant.");

Here, the computer tries to execute the block of statements following the word try. If no exception occurs during the execution of this block, then the catch part of the statement is simply ignored. However, if an exception of type ArrayIndexOutOfBoundsException occurs, then the computer jumps immediately to the catch clause of the try statement. This block of statements is said to be an exception handler for ArrayIndexOutOfBoundsException. By handling the exception in this way, you prevent it from crashing the program. Before the body of the catch clause is executed, the object that represents the exception is assigned to the variable e, which is used in this example to print a stack trace. However, the full syntax of the try statement allows more than one catch clause. This makes it possible to catch several dierent types of exception with one try statement. In the above example, in addition to the possible ArrayIndexOutOfBoundsException, there is a possible NullPointerException which will occur if the value of M is null. We can handle both possible exceptions by adding a second catch clause to the try statement:
try { double determinant = M[0][0]*M[1][1] System.out.println("The determinant of } catch ( ArrayIndexOutOfBoundsException e ) System.out.println("M is the wrong size } catch ( NullPointerException e ) { System.out.print("Programming error! M } M[0][1]*M[1][0]; M is " + determinant); { to have a determinant.");

doesnt exist." + );

Here, the computer tries to execute the statements in the try clause. If no error occurs, both of the catch clauses are skipped. If an ArrayIndexOutOfBoundsException occurs, the computer

8.3. EXCEPTIONS AND TRY..CATCH

389

executes the body of the rst catch clause and skips the second one. If a NullPointerException occurs, it jumps to the second catch clause and executes that. Note that both ArrayIndexOutOfBoundsException and NullPointerException are subclasses of RuntimeException. Its possible to catch all RuntimeExceptions with a single catch clause. For example:
try { double determinant = M[0][0]*M[1][1] - M[0][1]*M[1][0]; System.out.println("The determinant of M is " + determinant); } catch ( RuntimeException err ) { System.out.println("Sorry, an error has occurred."); System.out.println("The error was: " + err); }

The catch clause in this try statement will catch any exception belonging to class RuntimeException or to any of its subclasses. This shows why exception classes are organized into a class hierarchy. It allows you the option of casting your net narrowly to catch only a specic type of exception. Or you can cast your net widely to catch a wide class of exceptions. Because of subclassing, when there are multiple catch clauses in a try statement, it is possible that a given exception might match several of those catch clauses. For example, an exception of type NullPointerException would match catch clauses for NullPointerException, RuntimeException, Exception, or Throwable. In this case, only the rst catch clause that matches the exception is executed. The example Ive given here is not particularly realistic. You are not very likely to use exception-handling to guard against null pointers and bad array indices. This is a case where careful programming is better than exception handling: Just be sure that your program assigns a reasonable, non-null value to the array M. You would certainly resent it if the designers of Java forced you to set up a try..catch statement every time you wanted to use an array! This is why handling of potential RuntimeExceptions is not mandatory. There are just too many things that might go wrong! (This also shows that exception-handling does not solve the problem of program robustness. It just gives you a tool that will in many cases let you approach the problem in a more organized way.)


I have still not completely specied the syntax of the try statement. There is one additional element: the possibility of a nally clause at the end of a try statement. The complete syntax of the try statement can be described as:
try { statements } optional-catch-clauses optional-finally-clause

Note that the catch clauses are also listed as optional. The try statement can include zero or more catch clauses and, optionally, a finally clause. The try statement must include one or the other. That is, a try statement can have either a finally clause, or one or more catch clauses, or both. The syntax for a catch clause is
catch ( exception-class-name statements } variable-name ) {

390 and the syntax for a finally clause is


finally { statements }

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

The semantics of the finally clause is that the block of statements in the finally clause is guaranteed to be executed as the last step in the execution of the try statement, whether or not any exception occurs and whether or not any exception that does occur is caught and handled. The finally clause is meant for doing essential cleanup that under no circumstances should be omitted. One example of this type of cleanup is closing a network connection. Although you dont yet know enough about networking to look at the actual programming in this case, we can consider some pseudocode:
try { open a network connection } catch ( IOException e ) { report the error return // Dont continue if connection cant be opened! } // At this point, we KNOW that the connection is open. try { communicate over the connection } catch ( IOException e ) { handle the error } finally { close the connection }

The finally clause in the second try statement ensures that the network connection will denitely be closed, whether or not an error occurs during the communication. The rst try statement is there to make sure that we dont even try to communicate over the network unless we have successfully opened a connection. The pseudocode in this example follows a general pattern that can be used to robustly obtain a resource, use the resource, and then release the resource.

8.3.3

Throwing Exceptions

There are times when it makes sense for a program to deliberately throw an exception. This is the case when the program discovers some sort of exceptional or error condition, but there is no reasonable way to handle the error at the point where the problem is discovered. The program can throw an exception in the hope that some other part of the program will catch and handle the exception. This can be done with a throw statement. You have already seen an example of this in Subsection 4.3.5. In this section, we cover the throw statement more fully. The syntax of the throw statement is:
throw exception-object ;

8.3. EXCEPTIONS AND TRY..CATCH

391

The exception-object must be an object belonging to one of the subclasses of Throwable. Usually, it will in fact belong to one of the subclasses of Exception. In most cases, it will be a newly constructed object created with the new operator. For example:
throw new ArithmeticException("Division by zero");

The parameter in the constructor becomes the error message in the exception object; if e refers to the object, the error message can be retrieved by calling e.getMessage(). (You might nd this example a bit odd, because you might expect the system itself to throw an ArithmeticException when an attempt is made to divide by zero. So why should a programmer bother to throw the exception? Recall that if the numbers that are being divided are of type int, then division by zero will indeed throw an ArithmeticException. However, no arithmetic operations with oating-point numbers will ever produce an exception. Instead, the special value Double.NaN is used to represent the result of an illegal operation. In some situations, you might prefer to throw an ArithmeticException when a real number is divided by zero.) An exception can be thrown either by the system or by a throw statement. The exception is processed in exactly the same way in either case. Suppose that the exception is thrown inside a try statement. If that try statement has a catch clause that handles that type of exception, then the computer jumps to the catch clause and executes it. The exception has been handled . After handling the exception, the computer executes the finally clause of the try statement, if there is one. It then continues normally with the rest of the program, which follows the try statement. If the exception is not immediately caught and handled, the processing of the exception will continue. When an exception is thrown during the execution of a subroutine and the exception is not handled in the same subroutine, then that subroutine is terminated (after the execution of any pending finally clauses). Then the routine that called that subroutine gets a chance to handle the exception. That is, if the subroutine was called inside a try statement that has an appropriate catch clause, then that catch clause will be executed and the program will continue on normally from there. Again, if the second routine does not handle the exception, then it also is terminated and the routine that called it (if any) gets the next shot at the exception. The exception will crash the program only if it passes up through the entire chain of subroutine calls without being handled. (In fact, even this is not quite true: In a multithreaded program, only the thread in which the exception occurred is terminated.) A subroutine that might generate an exception can announce this fact by adding a clause throws exception-class-name to the header of the routine. For example:
/** * Returns the larger of the two roots of the quadratic equation * A*x*x + B*x + C = 0, provided it has any roots. If A == 0 or * if the discriminant, B*B - 4*A*C, is negative, then an exception * of type IllegalArgumentException is thrown. */ static public double root( double A, double B, double C ) throws IllegalArgumentException { if (A == 0) { throw new IllegalArgumentException("A cant be zero."); } else { double disc = B*B - 4*A*C; if (disc < 0) throw new IllegalArgumentException("Discriminant < zero.");

392
return } }

CHAPTER 8. CORRECTNESS AND ROBUSTNESS


(-B + Math.sqrt(disc)) / (2*A);

As discussed in the previous section, the computation in this subroutine has the preconditions that A != 0 and B*B-4*A*C >= 0. The subroutine throws an exception of type IllegalArgumentException when either of these preconditions is violated. When an illegal condition is found in a subroutine, throwing an exception is often a reasonable response. If the program that called the subroutine knows some good way to handle the error, it can catch the exception. If not, the program will crashand the programmer will know that the program needs to be xed. A throws clause in a subroutine heading can declare several dierent types of exception, separated by commas. For example:
void processArray(int[] A) throws NullPointerException, ArrayIndexOutOfBoundsException { ...

8.3.4

Mandatory Exception Handling

In the preceding example, declaring that the subroutine root() can throw an IllegalArgumentException is just a courtesy to potential readers of this routine. This is because handling of IllegalArgumentExceptions is not mandatory. A routine can throw an IllegalArgumentException without announcing the possibility. And a program that calls that routine is free either to catch or to ignore the exception, just as a programmer can choose either to catch or to ignore an exception of type NullPointerException. For those exception classes that require mandatory handling, the situation is dierent. If a subroutine can throw such an exception, that fact must be announced in a throws clause in the routine denition. Failing to do so is a syntax error that will be reported by the compiler. On the other hand, suppose that some statement in the body of a subroutine can generate an exception of a type that requires mandatory handling. The statement could be a throw statement, which throws the exception directly, or it could be a call to a subroutine that can throw the exception. In either case, the exception must be handled. This can be done in one of two ways: The rst way is to place the statement in a try statement that has a catch clause that handles the exception; in this case, the exception is handled within the subroutine, so that any caller of the subroutine will never see the exception. The second way is to declare that the subroutine can throw the exception. This is done by adding a throws clause to the subroutine heading, which alerts any callers to the possibility that an exception might be generated when the subroutine is executed. The caller will, in turn, be forced either to handle the exception in a try statement or to declare the exception in a throws clause in its own header. Exception-handling is mandatory for any exception class that is not a subclass of either Error or RuntimeException. Exceptions that require mandatory handling generally represent conditions that are outside the control of the programmer. For example, they might represent bad input or an illegal action taken by the user. There is no way to avoid such errors, so a robust program has to be prepared to handle them. The design of Java makes it impossible for programmers to ignore the possibility of such errors. Among the exceptions that require mandatory handling are several that can occur when using Javas input/output routines. This means that you cant even use these routines unless you understand something about exception-handling. Chapter 11 deals with input/output and uses mandatory exception-handling extensively.

8.3. EXCEPTIONS AND TRY..CATCH

393

8.3.5

Programming with Exceptions

Exceptions can be used to help write robust programs. They provide an organized and structured approach to robustness. Without exceptions, a program can become cluttered with if statements that test for various possible error conditions. With exceptions, it becomes possible to write a clean implementation of an algorithm that will handle all the normal cases. The exceptional cases can be handled elsewhere, in a catch clause of a try statement. When a program encounters an exceptional condition and has no way of handling it immediately, the program can throw an exception. In some cases, it makes sense to throw an exception belonging to one of Javas predened classes, such as IllegalArgumentException or IOException. However, if there is no standard class that adequately represents the exceptional condition, the programmer can dene a new exception class. The new class must extend the standard class Throwable or one of its subclasses. In general, if the programmer does not want to require mandatory exception handling, the new class will extend RuntimeException (or one of its subclasses). To create a new exception class that does require mandatory handling, the programmer can extend one of the other subclasses of Exception or can extend Exception itself. Here, for example, is a class that extends Exception, and therefore requires mandatory exception handling when it is used:
public class ParseError extends Exception { public ParseError(String message) { // Create a ParseError object containing // the given message as its error message. super(message); } }

The class contains only a constructor that makes it possible to create a ParseError object containing a given error message. (The statement super(message) calls a constructor in the superclass, Exception. See Subsection 5.6.3.) Of course the class inherits the getMessage() and printStackTrace() routines from its superclass. If e refers to an object of type ParseError, then the function call e.getMessage() will retrieve the error message that was specied in the constructor. But the main point of the ParseError class is simply to exist. When an object of type ParseError is thrown, it indicates that a certain type of error has occurred. (Parsing , by the way, refers to guring out the syntax of a string. A ParseError would indicate, presumably, that some string that is being processed by the program does not have the expected form.) A throw statement can be used in a program to throw an error of type ParseError. The constructor for the ParseError object must specify an error message. For example:
throw new ParseError("Encountered an illegal negative number.");

or
throw new ParseError("The word " + word + " is not a valid file name.");

If the throw statement does not occur in a try statement that catches the error, then the subroutine that contains the throw statement must declare that it can throw a ParseError by adding the clause throws ParseError to the subroutine heading. For example,
void getUserData() throws ParseError { . . . }

394

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

This would not be required if ParseError were dened as a subclass of RuntimeException instead of Exception, since in that case exception handling for ParseErrors would not be mandatory. A routine that wants to handle ParseErrors can use a try statement with a catch clause that catches ParseErrors. For example:
try { getUserData(); processUserData(); } catch (ParseError pe) { . . . // Handle the error }

Note that since ParseError is a subclass of Exception, a catch clause of the form catch (Exception e) would also catch ParseErrors, along with any other object of type Exception. Sometimes, its useful to store extra data in an exception object. For example,
class ShipDestroyed extends RuntimeException { Ship ship; // Which ship was destroyed. int where x, where y; // Location where ship was destroyed. ShipDestroyed(String message, Ship s, int x, int y) { // Constructor creates a ShipDestroyed object // carrying an error message plus the information // that the ship s was destroyed at location (x,y) // on the screen. super(message); ship = s; where x = x; where y = y; } }

Here, a ShipDestroyed object contains an error message and some information about a ship that was destroyed. This could be used, for example, in a statement:
if ( userShip.isHit() ) throw new ShipDestroyed("Youve been hit!", userShip, xPos, yPos);

Note that the condition represented by a ShipDestroyed object might not even be considered an error. It could be just an expected interruption to the normal ow of a game. Exceptions can sometimes be used to handle such interruptions neatly.


The ability to throw exceptions is particularly useful in writing general-purpose subroutines and classes that are meant to be used in more than one program. In this case, the person writing the subroutine or class often has no reasonable way of handling the error, since that person has no way of knowing exactly how the subroutine or class will be used. In such circumstances, a novice programmer is often tempted to print an error message and forge ahead, but this is almost never satisfactory since it can lead to unpredictable results down the line. Printing an error message and terminating the program is almost as bad, since it gives the program no chance to handle the error. The program that calls the subroutine or uses the class needs to know that the error has occurred. In languages that do not support exceptions, the only alternative is to return some special value or to set the value of some variable to indicate that an error has occurred. For

8.3. EXCEPTIONS AND TRY..CATCH

395

example, the readMeasurement() function in Subsection 8.2.2 returns the value -1 if the users input is illegal. However, this only does any good if the main program bothers to test the return value. It is very easy to be lazy about checking for special return values every time a subroutine is called. And in this case, using -1 as a signal that an error has occurred makes it impossible to allow negative measurements. Exceptions are a cleaner way for a subroutine to react when it encounters an error. It is easy to modify the readMeasurement() subroutine to use exceptions instead of a special return value to signal an error. My modied subroutine throws a ParseError when the users input is illegal, where ParseError is the subclass of Exception that was dened above. (Arguably, it might be reasonable to avoid dening a new class by using the standard exception class IllegalArgumentException instead.) The changes from the original version are shown in italic:
/** * Reads the users input measurement from one line of input. * Precondition: The input line is not empty. * Postcondition: If the users input is legal, the measurement * is converted to inches and returned. * @throws ParseError if the users input is not legal. */ static double readMeasurement() throws ParseError { double inches; // Total number of inches in users measurement. // One measurement, // such as the 12 in "12 miles." // The units specified for the measurement, // such as "miles."

double measurement; String units; char ch;

// Used to peek at next character in the users input. // No inches have yet been read.

inches = 0;

skipBlanks(); ch = TextIO.peek(); /* As long as there is more input on the line, read a measurement and add the equivalent number of inches to the variable, inches. If an error is detected during the loop, end the subroutine immediately by throwing a ParseError. */ while (ch != \n) { /* Get the next measurement and the units. Before reading anything, make sure that a legal value is there to read. */ if ( ! Character.isDigit(ch) ) { throw new ParseError("Expected to find a number, but found " + ch); } measurement = TextIO.getDouble(); skipBlanks(); if (TextIO.peek() == \n) { throw new ParseError("Missing unit of measure at end of line."); } units = TextIO.getWord(); units = units.toLowerCase();

396

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

/* Convert the measurement to inches and add it to the total. */ if (units.equals("inch") || units.equals("inches") || units.equals("in")) { inches += measurement; } else if (units.equals("foot") || units.equals("feet") || units.equals("ft")) { inches += measurement * 12; } else if (units.equals("yard") || units.equals("yards") || units.equals("yd")) { inches += measurement * 36; } else if (units.equals("mile") || units.equals("miles") || units.equals("mi")) { inches += measurement * 12 * 5280; } else { throw new ParseError("\"" + units + "\" is not a legal unit of measure."); } /* Look ahead to see whether the next thing on the line is the end-of-line. */ skipBlanks(); ch = TextIO.peek(); } // end while

return inches; } // end readMeasurement()

In the main program, this subroutine is called in a try statement of the form
try { inches = readMeasurement(); } catch (ParseError e) { . . . // Handle the error. }

The complete program can be found in the le LengthConverter3.java. From the users point of view, this program has exactly the same behavior as the program LengthConverter2 from the previous section. Internally, however, the programs are signicantly dierent, since LengthConverter3 uses exception-handling.

8.4 In

Assertions

this short section, we look at assertions, another feature of the Java programming language that can be used to aid in the development of correct and robust programs. Recall that a precondition is a condition that must be true at a certain point in a program, for the execution of the program to continue correctly from that point. In the case where

8.4. ASSERTIONS

397

there is a chance that the precondition might not be satisedfor example, if it depends on input from the userthen its a good idea to insert an if statement to test it. But then the question arises, What should be done if the precondition does not hold? One option is to throw an exception. This will terminate the program, unless the exception is caught and handled elsewhere in the program. In many cases, of course, instead of using an if statement to test whether a precondition holds, a programmer tries to write the program in a way that will guarantee that the precondition holds. In that case, the test should not be necessary, and the if statement can be avoided. The problem is that programmers are not perfect. In spite of the programmers intention, the program might contain a bug that screws up the precondition. So maybe its a good idea to check the preconditionat least during the debugging phase of program development. Similarly, a postcondition is a condition that is true at a certain point in the program as a consequence of the code that has been executed before that point. Assuming that the code is correctly written, a postcondition is guaranteed to be true, but here again testing whether a desired postcondition is actually true is a way of checking for a bug that might have screwed up the postcondition. This is something that might be desirable during debugging. The programming languages C and C++ have always had a facility for adding what are called assertions to a program. These assertions take the form assert( condition ), where condition is a boolean-valued expression. This condition expresses a precondition or postcondition that should hold at that point in the program. When the computer encounters an assertion during the execution of the program, it evaluates the condition. If the condition is false, the program is terminated. Otherwise, the program continues normally. This allows the programmers belief that the condition is true to be tested; if if it not true, that indicates that the part of the program that preceded the assertion contained a bug. One nice thing about assertions in C and C++ is that they can be turned o at compile time. That is, if the program is compiled in one way, then the assertions are included in the compiled code. If the program is compiled in another way, the assertions are not included. During debugging, the rst type of compilation is used. The release version of the program is compiled with assertions turned o. The release version will be more ecient, because the computer wont have to evaluate all the assertions. Although early versions of Java did not have assertions, an assertion facility similar to the one in C/C++ has been available in Java since version 1.4. As with the C/C++ version, Java assertions can be turned on during debugging and turned o during normal execution. In Java, however, assertions are turned on and o at run time rather than at compile time. An assertion in the Java source code is always included in the compiled class le. When the program is run in the normal way, these assertions are ignored; since the condition in the assertion is not evaluated in this case, there is little or no performance penalty for having the assertions in the program. When the program is being debugged, it can be run with assertions enabled, as discussed below, and then the assertions can be a great help in locating and identifying bugs.


An assertion statement in Java takes one of the following two forms:
assert condition ;

or
assert condition : error-message ;

where condition is a boolean-valued expression and error-message is a string or an expression of type String. The word assert is a reserved word in Java, which cannot be used as an

398

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

identier. An assertion statement can be used anyplace in Java where a statement is legal. If a program is run with assertions disabled, an assertion statement is equivalent to an empty statement and has no eect. When assertions are enabled and an assertion statement is encountered in the program, the condition in the assertion is evaluated. If the value is true, the program proceeds normally. If the value of the condition is false, then an exception of type java.lang.AssertionError is thrown, and the program will crash (unless the error is caught by a try statement). If the assert statement includes an error-message , then the error message string becomes the message in the AssertionError. So, the statement assert condition : error-message ;" is similar to
if ( condition == false ) throw new AssertionError( error-message );

except that the if statement is executed whenever the program is run, and the assert statement is executed only when the program is run with assertions enabled. The question is, when to use assertions instead of exceptions? The general rule is to use assertions to test conditions that should denitely be true, if the program is written correctly. Assertions are useful for testing a program to see whether or not it is correct and for nding the errors in an incorrect program. After testing and debugging, when the program is used in the normal way, the assertions in the program will be ignored. However, if a problem turns up later, the assertions are still there in the program to be used to help locate the error. If someone writes to you to say that your program doesnt work when he does such-and-such, you can run the program with assertions enabled, do such-and-such, and hope that the assertions in the program will help you locate the point in the program where it goes wrong. Consider, for example, the root() method from Subsection 8.3.3 that calculates a root of a quadratic equation. If you believe that your program will always call this method with legal arguments, then it would make sense to write the method using assertions instead of exceptions:
/** * Returns the larger of the two roots of the quadratic equation * A*x*x + B*x + C = 0, provided it has any roots. * Precondition: A != 0 and B*B - 4*A*C >= 0. */ static public double root( double A, double B, double C ) { assert A != 0 : "Leading coefficient of quadratic equation cannot be zero."; double disc = B*B - 4*A*C; assert disc >= 0 : "Discriminant of quadratic equation cannot be negative."; return (-B + Math.sqrt(disc)) / (2*A); }

The assertions are not checked when the program is run in the normal way. If you are correct in your belief that the method is never called with illegal arguments, then checking the conditions in the assertions would be unnecessary. If your belief is not correct, the problem should turn up during testing or debugging, when the program is run with the assertions enabled. If the root() method is part of a software library that you expect other people to use, then the situation is less clear. Suns Java documentation advises that assertions should not be used for checking the contract of public methods: If the caller of a method violates the contract by passing illegal parameters, then an exception should be thrown. This will enforce the contract whether or not assertions are enabled. (However, while its true that Java programmers expect the contract of a method to be enforced with exceptions, there are reasonable arguments for using assertions instead, in some cases.)

8.5. INTRODUCTION TO THREADS

399

On the other hand, it never hurts to use an assertion to check a postcondition of a method. A postcondition is something that is supposed to be true after the method has executed, and it can be tested with an assert statement at the end of the method. If the postcodition is false, there is a bug in the method itself, and that is something that needs to be found during the development of the method.


To have any eect, assertions must be enabled when the program is run. How to do this depends on what programming environment you are using. (See Section 2.6 for a discussion of programming environments.) In the usual command line environment, assertions are enabled by adding the option -enableassertions to the java command that is used to run the program. For example, if the class that contains the main program is RootFinder, then the command
java -enableassertions RootFinder

will run the program with assertions enabled. The -enableassertions option can be abbreviated to -ea, so the command can alternatively be written as
java -ea RootFinder

In fact, it is possible to enable assertions in just part of a program. An option of the form -ea: class-name enables only the assertions in the specied class. Note that there are no spaces between the -ea, the :, and the name of the class. To enable all the assertions in a package and in its sub-packages, you can use an option of the form -ea: package-name .... To enable assertions in the default package (that is, classes that are not specied to belong to a package, like almost all the classes in this book), use -ea:.... For example, to run a Java program named MegaPaint with assertions enabled for every class in the packages named paintutils and drawing, you would use the command:
java -ea:paintutils... -ea:drawing... MegaPaint

If you are using the Eclipse integrated development environment, you can specify the -ea option by creating a run conguration. Right-click the name of the main program class in the Package Explorer pane, and select Run As from the pop-up menu and then Run. . . from the submenu. This will open a dialog box where you can manage run congurations. The name of the project and of the main class will be already be lled in. Click the Arguments tab, and enter -ea in the box under VM Arguments. The contents of this box are added to the java command that is used to run the program. You can enter other options in this box, including more complicated enableassertions options such as -ea:paintutils.... When you click the Run button, the options will be applied. Furthermore, they will be applied whenever you run the program, unless you change the run conguration or add a new conguration. Note that it is possible to make two run congurations for the same class, one with assertions enabled and one with assertions disabled.

8.5

Introduction to Threads

Like people, computers can multitask .

That is, they can be working on several dierent tasks at the same time. A computer that has just a single central processing unit cant literally do two things at the same time, any more than a person can, but it can still switch its attention back and forth among several tasks. Furthermore, it is increasingly common for computers to have more than one processing unit, and such computers can literally work on several tasks simultaneously. It is likely that from now on, most of the increase in computing power will

400

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

come from adding additional processors to computers rather than from increasing the speed of individual processors. To use the full power of these multiprocessing computers, a programmer must do parallel programming , which means writing a program as a set of several tasks that can be executed simultaneously. Even on a single-processor computer, parallel programming techniques can be useful, since some problems can be tackled most naturally by breaking the solution into a set of simultaneous tasks that cooperate to solve the problem. In Java, a single task is called a thread . The term thread refers to a thread of control or thread of execution, meaning a sequence of instructions that are executed one after another the thread extends through time, connecting each instruction to the next. In a multithreaded program, there can be many threads of control, weaving through time in parallel and forming the complete fabric of the program. (Ok, enough with the metaphor, already!) Every Java program has at least one thread; when the Java virtual machine runs your program, it creates a thread that is responsible for executing the main routine of the program. This main thread can in turn create other threads that can continue even after the main thread has terminated. In a GUI program, there is at least one additional thread, which is responsible for handling events and drawing components on the screen. This GUI thread is created when the rst window is opened. So in fact, you have already done parallel programming! When a main routine opens a window, both the main thread and the GUI thread can continue to run in parallel. Of course, parallel programming can be used in much more interesting ways. Unfortunately, parallel programming is even more dicult than ordinary, single-threaded programming. When several threads are working together on a problem, a whole new category of errors is possible. This just means that techniques for writing correct and robust programs are even more important for parallel programming than they are for normal programming. (Thats one excuse for having this section in this chapteranother is that we will need threads at several points in future chapters, and I didnt have another place in the book where the topic ts more naturally.) Since threads are a dicult topic, you will probably not fully understand everything in this section the rst time through the material. Your understanding should improve as you encounter more examples of threads in future sections.

8.5.1

Creating and Running Threads

In Java, a thread is represented by an object belonging to the class java.lang.Thread (or to a subclass of this class). The purpose of a Thread object is to execute a single method. The method is executed in its own thread of control, which can run in parallel with other threads. When the execution of the method is nished, either because the method terminates normally or because of an uncaught exception, the thread stops running. Once this happens, there is no way to restart the thread or to use the same Thread object to start another thread. There are two ways to program a thread. One is to create a subclass of Thread and to dene the method public void run() in the subclass. This run() method denes the task that will be performed by the thread; that is, when the thread is started, it is the run() method that will be executed in the thread. For example, here is a simple, and rather useless, class that denes a thread that does nothing but print a message on standard output:
public class NamedThread extends Thread { private String name; // The name of this thread. public NamedThread(String name) { // Constructor gives name to thread. this.name = name; } public void run() { // The run method prints a message to standard output.

8.5. INTRODUCTION TO THREADS


System.out.println("Greetings from thread " + name + "!"); } }

401

To use a NamedThread, you must of course create an object belonging to this class. For example,
NamedThread greetings = new NamedThread("Fred");

However, creating the object does not automatically start the thread running. To do that, you must call the start() method in the thread object. For the example, this would be done with the statement
greetings.start();

The purpose of the start() method is to create a new thread of control that will execute the Thread objects run() method. The new thread runs in parallel with the thread in which the start() method was called, along with any other threads that already existed. This means that the code in the run() method will execute at the same time as the statements that follow the call to greetings.start(). Consider this code segment:
NamedThread greetings = new NamedThread("Fred"); greetings.start(); System.out.println("Thread has been started.");

After greetings.start() is executed, there are two threads. One of them will print Thread has been started. while the other one wants to print Greetings from thread Fred !. It is important to note that these messages can be printed in either order. The two threads run simultaneously and will compete for access to standard output, so that they can print their messages. Whichever thread happens to be the rst to get access will be the rst to print its message. In a normal, single-threaded program, things happen in a denite, predictable order from beginning to end. In a multi-threaded program, there is a fundamental indeterminacy. You cant be sure what order things will happen in. This indeterminacy is what makes parallel programming so dicult! Note that calling greetings.start() is very dierent from calling greetings.run(). Calling greetings.run() will execute the run() method in the same thread, rather than creating a new thread. This means that all the work of the run() will be done before the computer moves on to the statement that follows the call to greetings.run() in the program. There is no parallelism and no indeterminacy.


I mentioned that there are two ways to program a thread. The rst way was to dene a subclass of Thread. The second is to dene a class that implements the interface java.lang.Runnable. The Runnable interface denes a single method, public void run(). An object that implements the Runnable interface can be passed as a parameter to the constructor of an object of type Thread. When that threads start method is called, the thread will execute the run() method in the Runnable object. For example, as an alternative to the NamedThread class, we could dene the class:
public class NamedRunnable implements Runnable { private String name; // The name of this thread. public NamedRunnable(String name) { // Constructor gives name to object. this.name = name; }

402

CHAPTER 8. CORRECTNESS AND ROBUSTNESS


public void run() { // The run method prints a message to standard output. System.out.println("Greetings from thread " + name +"!"); } }

To use this version of the class, we would create a NamedRunnable object and use that object to create an object of type Thread:
NamedRunnable greetings = new NamedRunnable("Fred"); Thread greetingsThread = new Thread(greetings); greetingsThread.start();

Finally, Ill note that it is sometimes convenient to dene a thread using an anonymous inner class (Subsection 5.7.3). For example:
Thread greetingsFromFred = new Thread() { public void run() { System.out.println("Greetings from Fred!"); } }; greetingsFromFred.start();


To help you understand how multiple threads are executed in parallel, we consider the sample program ThreadTest1.java. This program creates several threads. Each thread performs exactly the same task. The task is to count the number of integers less than 1000000 that are prime. (The particular task that is done is not important for our purposes here.) On my computer, this task takes a little more than one second of processing time. The threads that perform this task are dened by the following static nested class:
/** * When a thread belonging to this class is run it will count the * number of primes between 2 and 1000000. It will print the result * to standard output, along with its ID number and the elapsed * time between the start and the end of the computation. */ private static class CountPrimesThread extends Thread { int id; // An id number for this thread; specified in the constructor. public CountPrimesThread(int id) { this.id = id; } public void run() { long startTime = System.currentTimeMillis(); int count = countPrimes(2,1000000); // Counts the primes. long elapsedTime = System.currentTimeMillis() - startTime; System.out.println("Thread " + id + " counted " + count + " primes in " + (elapsedTime/1000.0) + " seconds."); } }

The main program asks the user how many threads to run, and then creates and starts the specied number of threads:

8.5. INTRODUCTION TO THREADS

403

public static void main(String[] args) { int numberOfThreads = 0; while (numberOfThreads < 1 || numberOfThreads > 25) { System.out.print("How many threads do you want to use (1 to 25) ? "); numberOfThreads = TextIO.getlnInt(); if (numberOfThreads < 1 || numberOfThreads > 25) System.out.println("Please enter a number between 1 and 25 !"); } System.out.println("\nCreating " + numberOfThreads + " prime counting threads..."); CountPrimesThread[] worker = new CountPrimesThread[numberOfThreads]; for (int i = 0; i < numberOfThreads; i++) worker[i] = new CountPrimesThread( i ); for (int i = 0; i < numberOfThreads; i++) worker[i].start(); System.out.println("Threads have been created and started."); }

It would be a good idea for you to compile and run the program or to try the applet version, which can be found in the on-line version of this section. When I ran the program with one thread, it took 1.18 seconds for my computer to do the computation. When I ran it using six threads, the output was:
Creating 6 prime counting threads... Threads have been created and started. Thread 1 counted 78498 primes in 6.706 Thread 4 counted 78498 primes in 6.693 Thread 0 counted 78498 primes in 6.838 Thread 2 counted 78498 primes in 6.825 Thread 3 counted 78498 primes in 6.893 Thread 5 counted 78498 primes in 6.859

seconds. seconds. seconds. seconds. seconds. seconds.

The second line was printed immediately after the rst. At this point, the main program has ended but the six threads continue to run. After a pause of about seven seconds, all six threads completed at about the same time. The order in which the threads complete is not the same as the order in which they were started, and the order is indeterminate. That is, if the program is run again, the order in which the threads complete will probably be dierent. On my computer, six threads take about six times longer than one thread. This is because my computer has only one processor. Six threads, all doing the same task, take six times as much processing as one thread. With only one processor to do the work, the total elapsed time for six threads is about six times longer than the time for one thread. On a computer with two processors, the computer can work on two tasks at the same time, and six threads might complete in as little as three times the time it takes for one thread. On a computer with six or more processors, six threads might take no more time than a single thread. Because of overhead and other reasons, the actual speedup will probably be smaller than this analysis indicates, but on a multiprocessor machine, you should see a denite speedup. What happens when you run the program on your own computer? How many processors do you have? Whenever there are more threads to be run than there are processors to run them, the computer divides its attention among all the runnable threads by switching rapidly from one thread to another. That is, each processor runs one thread for a while then switches to another thread and runs that one for a while, and so on. Typically, these context switches occur about 100 times or more per second. The result is that the computer makes progress on all

404

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

the tasks, and it looks to the user as if all the tasks are being executed simultaneously. This is why in the sample program, in which each thread has the same amount of work to do, all the threads complete at about the same time: Over any time period longer than a fraction of a second, the computers time is divided approximately equally among all the threads. When you do parallel programming in order to spread the work among several processors, you might want to take into account the number of available processors. You might, for example, want to create one thread for each processor. In Java, you can nd out the number of processors by calling the function
Runtime.getRuntime().availableProcessors()

which returns an int giving the number of processors that are available to the Java Virtual Machine. In some cases, this might be less than the actual number of processors in the computer.

8.5.2

Operations on Threads

The Thread class includes several useful methods in addition to the start() method that was discussed above. I will mention just a few of them. If thrd is an object of type Thread, then the boolean-valued function thrd.isAlive() can be used to test whether or not the thread is alive. A thread is alive between the time it is started and the time when it terminates. After the thread has terminated it is said to be dead. (The rather gruesome metaphor is also used when we refer to killing or aborting a thread.) The static method Thread.sleep(milliseconds) causes the thread that executes this method to sleep for the specied number of milliseconds. A sleeping thread is still alive, but it is not running. While a thread is sleeping, the computer will work on any other runnable threads (or on other programs). Thread.sleep() can be used to insert a pause in the execution of a thread. The sleep method can throw an exception of type InterruptedException, which is an exception class that requires mandatory exception handling (see Subsection 8.3.4). In practice, this means that the sleep method is usually used in a try..catch statement that catches the potential InterruptedException:
try { Thread.sleep(lengthOfPause); } catch (InterruptedException e) { }

One thread can interrupt another thread to wake it up when it is sleeping or paused for some other reason. A Thread, thrd, can be interrupted by calling its method thrd.interrupt(), but you are not likely to do this until you start writing rather advanced applications, and you are not likely to need to do anything in response to an InterruptedException (except to catch it). Its unfortunate that you have to worry about it at all, but thats the way that mandatory exception handling works. Sometimes, its necessary for one thread to wait for anther thread to die. This is done with the join() method from the Thread class. Suppose that thrd is a Thread. Then, if another thread calls thrd.join(), that other thread will go to sleep until thrd terminates. If thrd is already dead when thrd.join() is called, then it simply has no eect the thread that called thrd.join() proceeds immediately. The method join() can throw an InterruptedException, which must be handled. As an example, the following code starts several threads, waits for them all to terminate, and then outputs the elapsed time:

8.5. INTRODUCTION TO THREADS


CountPrimesThread[] worker = new CountPrimesThread[numberOfThreads]; long startTime = System.currentTimeMillis(); for (int i = 0; i < numberOfThreads; i++) { worker[i] = new CountPrimesThread(); worker[i].start(); } for (int i = 0; i < numberOfThreads; i++) { try { worker[i].join(); // Sleep until worker[i] has terminated. } catch (InterruptedException e) { } } // At this point, all the worker threads have terminated. long elapsedTime = System.currentTimeMillis() - startTime; System.out.println("Elapsed time: " + (elapsedTime/1000.0) + " seconds.");

405

An observant reader will note that this code assumes that no InterruptedException will occur. To be absolutely sure that the thread worker[i] has terminated in an environment where InterruptedExceptions are possible, you would have to do something like:
while (worker[i].isAlive()) { try { worker[i].join(); } catch (InterruptedException e) { } }

8.5.3

Mutual Exclusion with synchronized

Programming several threads to carry out independent tasks is easy. The real diculty arises when threads have to interact in some way. One way that threads interact is by sharing resources. When two threads need access to the same resource, such as a variable or a window on the screen, some care must be taken that they dont try to use the same resource at the same time. Otherwise, the situation could be something like this: Imagine several cooks sharing the use of just one measuring cup, and imagine that Cook A lls the measuring cup with milk, only to have Cook B grab the cup before Cook A has a chance to empty the milk into his bowl. There has to be some way for Cook A to claim exclusive rights to the cup while he performs the two operations: Add-Milk-To-Cup and Empty-Cup-Into-Bowl. Something similar happens with threads, even with something as simple as adding one to a counter. The statement
count = count + 1;

is actually a sequence of three operations:


Step 1. Step 2. Step 3. Get the value of count Add 1 to the value. Store the new value in count

406

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

Suppose that several threads perform these three steps. Remember that its possible for two threads to run at the same time, and even if there is only one processor, its possible for that processor to switch from one thread to another at any point. Suppose that while one thread is between Step 2 and Step 3, another thread starts executing the same sequence of steps. Since the rst thread has not yet stored the new value in count, the second thread reads the old value of count and adds one to that old value. After both threads have executed Step 3, the value of count has gone up only by 1 instead of by 2! This type of problem is called a race condition. This occurs when one thread is in the middle of a multi-step operation, and another thread changes some value or condition that the rst thread is depending upon. (The rst thread is in a race to complete all the steps before it is interrupted by another thread.) Another example of a race condition can occur in an if statement. Suppose the following statement, which is meant to avoid a division-by-zero error is executed by a thread:
if ( A != 0 ) B = C / A;

If the variable A is shared by several threads, and if nothing is done to guard against the race condition, then it is possible that a second thread will change the value of A to zero between the time that the rst thread checks the condition A != 0 and the time that it does the division. This means that the thread ends up dividing by zero, even though it just checked that A was not zero! To x the problem of race conditions, there has to be some way for a thread to get exclusive access to a shared resource. This is not a trivial thing to implement, but Java provides a high level and relatively easy-to-use approach to exclusive access. Its done with synchronized methods and with the synchronized statement. These are used to protect shared resources by making sure that only one thread at a time will try to access the resource. Synchronization in Java actually provides only mutual exclusion, which means that exclusive access to a resource is only guaranteed if every thread that needs access to that resource uses synchronization. Synchronization is like a cook leaving a note that says, Im using the measuring cup. This will get the cook exclusive access to the cupbut only if all the cooks agree to check the note before trying to grab the cup. Because this is a dicult topic, I will start with a simple example. Suppose that we want to avoid the race condition that occurs when several threads all want to add 1 to a counter. We can do this by dening a class to represent the counter and by using synchronized methods in that class:
public class ThreadSafeCounter { private int count = 0; // The value of the counter.

synchronized public void increment() { count = count + 1; } synchronized public int getValue() { return count; } }

If tsc is of type ThreadSafeCounter, then any thread can call tsc.increment() to add 1 to the counter in a completely safe way. The fact that tsc.increment() is synchronized means that only one thread can be in this method at a time; once a thread starts executing this

8.5. INTRODUCTION TO THREADS

407

method, it is guaranteed that it will nish executing it without having another thread change the value of tsc.count in the meantime. There is no possibility of a race condition. Note that the guarantee depends on the fact that count is a private variable. This forces all access to tsc.count to occur in the synchronized methods that are provided by the class. If count were public, it would be possible for a thread to bypass the synchronization by, for example, saying tsc.count++. This could change the value of count while another thread is in the middle of the tsc.increment(). Synchronization does not guarantee exclusive access; it only guarantees mutual exclusion among all the threads that are properly synchronized. The ThreadSafeCounter class does not prevent all possible race conditions that might arise when using a counter. Consider the if statement:
if ( tsc.getValue() == 0 ) doSomething();

where doSomething() is some method that requires the value of the counter to be zero. There is still a race condition here, which occurs if a second thread increments the counter between the time the rst thread tests tsc.getValue() == 0 and the time it executes doSomething(). The rst thread needs exclusive access to the counter during the execution of the whole if statement. (The synchronization in the ThreadSafeCounter class only gives it exclusive access during the time it is evaluating tsc.getValue().) We can solve the race condition by putting the if statement in a synchronized statement:
synchronized(tsc) { if ( tsc.getValue() == 0 ) doSomething(); }

Note that the synchronized statement takes an objecttsc in this caseas a kind of parameter. The syntax of the synchronized statement is:
synchronized( object statements } ) {

In Java, mutual exclusion is always associated with an object; we say that the synchronization is on that object. For example, the if statement above is synchronized on tsc. A synchronized instance method, such as those in the class ThreadSafeCounter, is synchronized on the object that contains the instance method. In fact, adding the synchronized modier to the denition of an instance method is pretty much equivalent to putting the body of the method in a synchronized statement, synchronized(this) {...}. It is also possible to have synchronized static methods; a synchronized static method is synchronized on a special class object that represents the class that contains the static method. The real rule of synchronization in Java is: Two threads cannot be synchronized on the same object at the same time; that is, they cannot simultaneously be executing code segments that are synchronized on that object. If one thread is synchronized on an object, and a second thread tries to synchronize on the same object, the second thread is forced to wait until the rst thread has nished with the object. This is implemented using something called a lock . Every object has a lock, and that lock can be held by only one thread at a time. To enter a synchronized statement or synchronized method, a thread must obtain the associated objects lock. If the lock is available, then the thread obtains the lock and immediately begins executing the synchronized code. It releases the lock after it nishes executing the synchronized code. If Thread A tries to obtain a lock that is already held by Thread B, then Thread A has

408

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

to wait until Thread B releases the lock. In fact, Thread A will go to sleep, and will not be awoken until the lock becomes available.


As a simple example of shared resources, we return to the prime-counting problem. Suppose that we want to count all the primes in a given range of integers, and suppose that we want to divide the work up among several threads. Each thread will be assigned part of the range of integers and will count the primes in its assigned range. At the end of its computation, the thread has to add its count to the overall total number of primes found. The variable that represents the total is shared by all the threads. If each thread just says
total = total + count;

then there is a (small) chance that two threads will try to do this at the same time and that the nal total will be wrong. To prevent this race condition, access to total has to be synchronized. My program uses a synchronized method to add the counts to the total:
synchronized private static void addToTotal(int x) { total = total + x; System.out.println(total + " primes found so far."); }

The source code for the program can be found in ThreadTest2.java. This program counts the primes in the range 3000001 to 6000000. (The numbers are rather arbitrary.) The main() routine in this program creates between 1 and 5 threads and assigns part of the job to each thread. It then waits for all the threads to nish, using the join() method as described above, and reports the total elapsed time. If you run the program on a multiprocessor computer, it should take less time for the program to run when you use more than one thread. You can compile and run the program or try the equivalent applet in the on-line version of this section.


Synchronization can help to prevent race conditions, but it introduces the possibility of another type of error, deadlock . A deadlock occurs when a thread waits forever for a resource that it will never get. In the kitchen, a deadlock might occur if two very simple-minded cooks both want to measure a cup of milk at the same time. The rst cook grabs the measuring cup, while the second cook grabs the milk. The rst cook needs the milk, but cant nd it because the second cook has it. The second cook needs the measuring cup, but cant nd it because the rst cook has it. Neither cook can continue and nothing more gets done. This is deadlock. Exactly the same thing can happen in a program, for example if there are two threads (like the two cooks) both of which need to obtain locks on the same two objects (like the milk and the measuring cup) before they can proceed. Deadlocks can easily occur, unless great care is taken to avoid them. Fortunately, we wont be looking at any examples that require locks on more than one object, so we will avoid that source of deadlock.

8.5.4

Wait and Notify

Threads can interact with each other in other ways besides sharing resources. For example, one thread might produce some sort of result that is needed by another thread. This imposes some restriction on the order in which the threads can do their computations. If the second thread gets to the point where it needs the result from the rst thread, it might have to stop and wait for the result to be produced. Since the second thread cant continue, it might as well go to sleep. But then there has to be some way to notify the second thread when the result is

8.5. INTRODUCTION TO THREADS

409

ready, so that it can wake up and continue its computation. Java, of course, has a way to do this kind of waiting and notication: It has wait() and notify() methods that are dened as instance methods in class Object and so can be used with any object. The reason why wait() and notify() should be associated with objects is not obvious, so dont worry about it at this point. It does, at least, make it possible to direct dierent notications to a dierent recipients, depending on which objects notify() method is called. The general idea is that when a thread calls a wait() method in some object, that thread goes to sleep until the notify() method in the same object is called. It will have to be called, obviously, by another thread, since the thread that called wait() is sleeping. A typical pattern is that Thread A calls wait() when it needs a result from Thread B, but that result is not yet available. When Thread B has the result ready, it calls notify(), which will wake Thread A up so that it can use the result. It is not an error to call notify() when no one is waiting; it just has no eect. To implement this, Thread A will execute code similar to the following, where obj is some object:
if ( resultIsAvailable() == false ) obj.wait(); // wait for noification that the result is available useTheResult();

while Thread B does something like:


generateTheResult(); obj.notify(); // send out a notification that the result is available

Now, there is a really nasty race condition in this code. The two threads might execute their code in the following order:
1. 2. 3. Thread so Thread Thread A checks resultIsAvailable() and finds that the result is not ready, it decides to execute the obj.wait() statement, but before it does, B finishes generating the result and calls obj.notify() A calls obj.wait() to wait for notification that the result is ready.

In Step 3, Thread A is waiting for a notication that will never come, because notify() has already been called. This is a kind of deadlock that can leave Thread A waiting forever. Obviously, we need some kind of synchronization. The solution is to enclose both Thread As code and Thread Bs code in synchronized statements, and it is very natural to synchronize on the same object, obj, that is used for the calls to wait() and notify(). In fact, since synchronization is almost always needed when wait() and notify() are used, Java makes it an absolute requirement. In Java, a thread can legally call obj.wait() or obj.notify() only if that thread holds the synchronization lock associated with the object obj. If it does not hold that lock, then an exception is thrown. (The exception is of type IllegalMonitorStateException, which does not require mandatory handling and which is typically not caught.) One further complication is that the wait() method can throw an InterruptedException and so should be called in a try statement that handles the exception. To make things more denite, lets consider a producer/consumer problem where one thread produces a result that is consumed by another thread. Assume that there is a shared variable named sharedResult that is used to transfer the result from the producer to the consumer. When the result is ready, the producer sets the variable to a non-null value. The producer can check whether the result is ready by testing whether the value of sharedResult is null. We will use a variable named lock for synchronization. The the code for the producer thread could have the form:

410

CHAPTER 8. CORRECTNESS AND ROBUSTNESS


makeResult = generateTheResult(); // Not synchronized! synchronized(lock) { sharedResult = makeResult; lock.notify(); }

while the consumer would execute code such as:


synchronized(lock) { while ( sharedResult == null ) { try { lock.wait(); } catch (InterruptedException e) { } } useResult = sharedResult; } useTheResult(useResult); // Not synchronized!

The calls to generateTheResult() and useTheResult() are not synchronized, which allows them to run in parallel with other threads that might also synchronize on lock. Since sharedResult is a shared variable, all references to sharedResult should be synchronized, so the references to sharedResult must be inside the synchronized statements. The goal is to do as little as possible (but not less) in synchronized code segments. If you are uncommonly alert, you might notice something funny: lock.wait() does not nish until lock.notify() is executed, but since both of these methods are called in synchronized statements that synchronize on the same object, shouldnt it be impossible for both methods to be running at the same time? In fact, lock.wait() is a special case: When the consumer thread calls lock.wait(), it gives up the lock that it holds on the synchronization object, lock. This gives the producer thread a chance to execute the synchronized(lock) block that contains the lock.notify() statement. After the producer thread exits from this block, the lock is returned to the consumer thread so that it can continue. The producer/consumer pattern can be generalized and made more useful without making it any more complex. In the general case, multiple results are produced by one or more producer threads and are consumed by one or more consumer threads. Instead of having just one sharedResult object, we keep a list of objects that have been produced but not yet consumed. Producer threads add objects to this list. Consumer threads remove objects from this list. The only time when a thread is blocked from running is when a consumer thread tries to get a result from the list, and no results are available. It is easy to encapsulate the whole producer/consumer pattern in a class (where I assume that there is a class ResultType that represents the result objects):
/** * An object of type ProducerConsumer represents a list of results * that are available for processing. Results are added to the list * by calling the produce method and are remove by calling consume. * If no result is available when consume is called, the method will * not return until a result becomes available. */ private static class ProducerConsumer { private ArrayList<ResultType> items = new ArrayList<ResultType>();

8.5. INTRODUCTION TO THREADS

411

// This ArrayList holds results that have been produced and are waiting // to be consumed. See Subsection 7.3.3 for information on ArrayList. public void produce(ResultType item) { synchronized(items) { items.add(item); // Add item to the list of results. items.notify(); // Notify any thread waiting in consume() method. } } public ResultType consume() { ResultType item; synchronized(items) { // If no results are available, wait for notification from produce(). while (items.size() == 0) { try { items.wait(); } catch (InterruptedException e) { } } // At this point, we know that at least one result is available. item = items.remove(0); } return item; } }

For an example of a program that uses a ProducerConsumer class, see ThreadTest3.java. This program performs the same task as ThreadTest2.java, but the threads communicate using the producer/consumer pattern instead of with a shared variable. Going back to our kitchen analogy for a moment, consider a restaurant with several waiters and several cooks. If we look at the ow of customer orders into the kitchen, the waiters produce the orders and leave them in a pile. The orders are consumed by the cooks; whenever a cook needs a new order to work on, she picks one up from the pile. The pile of orders, or course, plays the role of the list of result objects in the producer/consumer pattern. Note that the only time that a cook has to wait is when she needs a new order to work on, and there are no orders in the pile. The cook must wait until one of the waiters places an order in the pile. We can complete the analogy by imagining that the waiter rings a bell when he places the order in the pileringing the bell is like calling the notify() method to notify the cooks that an order is available. A nal note on notify: It is possible for several threads to be waiting for notication. A call to obj.notify() will wake only one of the threads that is waiting on obj. If you want to wake all threads that are waiting on obj, you can call obj.notifyAll(). And a nal note on wait: There is another version of wait() that takes a number of milliseconds as a parameter. A thread that calls obj.wait(milliseconds) will wait only up to the specied number of milliseconds for a notication. If a notication doesnt occur during that period, the thread will wake up and continue without the notication. In practice, this feature is most often used to let a waiting thread wake periodically while it is waiting in order to perform some periodic task, such as causing a message Waiting for computation to nish to blink.

412

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

8.5.5

Volatile Variables

And a nal note on communication among threads: In general, threads communicate by sharing variables and accessing those variables in synchronized methods or synchronized statements. However, synchronization is fairly expensive computationally, and excessive use of it should be avoided. So in some cases, it can make sense for threads to refer to shared variables without synchronizing their access to those variables. However, a subtle problem arises when the value of a shared variable is set is one thread and used in another. Because of the way that threads are implemented in Java, the second thread might not see the changed value of the variable immediately. That is, it is possible that a thread will continue to see the old value of the shared variable for some time after the value of the variable has been changed by another thread. This is because threads are allowed to cache shared data. That is, each thread can keep its own local copy of the shared data. When one thread changes the value of a shared variable, the local copies in the caches of other threads are not immediately changed, so the other threads continue to see the old value. When a synchronized method or statement is entered, threads are forced to update their caches to the most current values of the variables in the cache. So, using shared variables in synchronized code is always safe. It is still possible to use a shared variable outside of synchronized code, but in that case, the variable must be declared to be volatile. The volatile keyword is a modier that can be added to a variable declaration, as in
private volatile int count;

If a variable is declared to be volatile, no thread will keep a local copy of that variable in its cache. Instead, the thread will always use the ocial, main copy of the variable. This means that any change made to the variable will immediately be available to all threads. This makes it safe for threads to refer to volatile shared variables even outside of synchronized code. (Remember, though, that synchronization is still the only way to prevent race conditions.) When the volatile modier is applied to an object variable, only the variable itself is declared to be volatile, not the contents of the object that the variable points to. For this reason, volatile is generally only used for variables of simple types such as primitive types and enumerated types. A typical example of using volatile variables is to send a signal from one thread to another that tells the second thread to terminate. The two threads would share a variable
volatile boolean terminate = false;

The run method of the second thread would check the value of terminate frequently and end when the value of terminate becomes true:
public void run() { while (true) { if (terminate) return; . . // Do some work . } }

This thread will run until some other thread sets the value of terminate to true. Something like this is really the only clean way for one thread to cause another thread to die.

8.6. ANALYSIS OF ALGORITHMS

413

(By the way, you might be wondering why threads should use local data caches in the rst place, since it seems to complicate things unnecessarily. Caching is allowed because of the structure of multiprocessing computers. In many multiprocessing computers, each processor has some local memory that is directly connected to the processor. A threads cache is stored in the local memory of the processor on which the thread is running. Access to this local memory is much faster than access to other memory, so it is more ecient for a thread to use a local copy of a shared variable rather than some master copy that is stored in non-local memory.)

8.6

Analysis of Algorithms

In practice, another issue is also important: eciency . When analyzing a program in terms of eciency, we want to look at questions such as, How long does it take for the program to run? and Is there another approach that will get the answer more quickly? Eciency will always be less important than correctness; if you dont care whether a program works correctly, you can make it run very quickly indeed, but no one will think its much of an achievement! On the other hand, a program that gives a correct answer after ten thousand years isnt very useful either, so eciency is often an important issue. The term eciency can refer to ecient use of almost any resource, including time, computer memory, disk space, or network bandwidth. In this section, however, we will deal exclusively with time eciency, and the major question that we want to ask about a program is, how long does it take to perform its task? It really makes little sense to classify an individual program as being ecient or inecient. It makes more sense to compare two (correct) programs that perform the same task and ask which one of the two is more ecient, that is, which one performs the task more quickly. However, even here there are diculties. The running time of a program is not well-dened. The run time can be dierent depending on the number and speed of the processors in the computer on which it is run and, in the case of Java, on the design of the Java Virtual Machine which is used to interpret the program. It can depend on details of the compiler which is used to translate the program from high-level language to machine language. Furthermore, the run time of a program depends on the size of the problem which the program has to solve. It takes a sorting program longer to sort 10000 items than it takes it to sort 100 items. When the run times of two programs are compared, it often happens that Program A solves small problems faster than Program B, while Program B solves large problems faster than Program A, so that it is simply not the case that one program is faster than the other in all cases. In spite of these diculties, there is a eld of computer science dedicated to analyzing the eciency of programs. The eld is known as Analysis of Algorithms. The focus is on algorithms, rather than on programs as such, to avoid having to deal with multiple implementations of the same algorithm written in dierent languages, compiled with dierent compilers, and running on dierent computers. Analysis of Algorithms is a mathematical eld that abstracts away from these down-and-dirty details. Still, even though it is a theoretical eld, every working programmer should be aware of some of its techniques and results. This section is a very brief introduction to some of those techniques and results. Because this is not a mathematics book, the treatment will be rather informal. One of the main techniques of analysis of algorithms is asymptotic analysis. The term asymptotic here means basically the tendency in the long run. An asymptotic analysis of

This chapter has concentrated mostly on correctness of programs.

414

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

an algorithms run time looks at the question of how the run time depends on the size of the problem. The analysis is asymptotic because it only considers what happens to the run time as the size of the problem increases without limit; it is not concerned with what happens for problems of small size or, in fact, for problems of any xed nite size. Only what happens in the long run, as the problem size increases without limit, is important. Showing that Algorithm A is asymptotically faster than Algorithm B doesnt necessarily mean that Algorithm A will run faster than Algorithm B for problems of size 10 or size 1000 or even size 1000000it only means that if you keep increasing the problem size, you will eventually come to a point where Algorithm A is faster than Algorithm B. An asymptotic analysis is only a rst approximation, but in practice it often gives important and useful information.


Central to asymptotic analysis is Big-Oh notation. Using this notation, we might say, for example, that an algorithm has a running time that is O(n2 ) or O(n) or O(log(n)). These notations are read Big-Oh of n squared, Big-Oh of n, and Big-Oh of log n (where log is a logarithm function). More generally, we can refer to O(f(n)) (Big-Oh of f of n), where f(n) is some function that assigns a positive real number to every positive integer n. The n in this notation refers to the size of the problem. Before you can even begin an asymptotic analysis, you need some way to measure problem size. Usually, this is not a big issue. For example, if the problem is to sort a list of items, then the problem size can be taken to be the number of items in the list. When the input to an algorithm is an integer, as in the case of algorithm that checks whether a given positive integer is prime, the usual measure of the size of a problem is the number of bits in the input integer rather than the integer itself. More generally, the number of bits in the input to a problem is often a good measure of the size of the problem. To say that the running time of an algorithm is O(f(n)) means that for large values of the problem size, n, the running time of the algorithm is no bigger than some constant times f(n). (More rigorously, there is a number C and a positive integer M such that whenever n is greater than M, the run time is less than or equal to C*f(n).) The constant takes into account details such as the speed of the computer on which the algorithm is run; if you use a slower computer, you might have to use a bigger constant in the formula, but changing the constant wont change the basic fact that the run time is O(f(n)). The constant also makes it unnecessary to say whether we are measuring time in seconds, years, CPU cycles, or any other unit of measure; a change from one unit of measure to another is just multiplication by a constant. Note also that O(f(n)) doesnt depend at all on what happens for small problem sizes, only on what happens in the long run as the problem size increases without limit. To look at a simple example, consider the problem of adding up all the numbers in an array. The problem size, n, is the length of the array. Using A as the name of the array, the algorithm can be expressed in Java as:
total = 0; for (int i = 0; i < n; i++) total = total + A[i];

This algorithm performs the same operation, total = total + A[i], n times. The total time spent on this operation is a*n, where a is the time it takes to perform the operation once. Now, this is not the only thing that is done in the algorithm. The value of i is incremented and is compared to n each time through the loop. This adds an additional time of b*n to the run time, for some constant b. Furthermore, i and total both have to be initialized to zero; this adds some constant amount c to the running time. The exact running time would then be (a+b)*n+c, where the constants a, b, and c depend on factors such as how the code is compiled

8.6. ANALYSIS OF ALGORITHMS

415

and what computer it is run on. Using the fact that c is less than or equal to c*n for any positive integer n, we can say that the run time is less than or equal to (a+b+c)*n. That is, the run time is less than or equal to a constant times n. By denition, this means that the run time for this algorithm is O(n). If this explanation is too mathematical for you, we can just note that for large values of n, the c in the formula (a+b)*n+c is insignicant compared to the other term, (a+b)*n. We say that c is a lower order term. When doing asymptotic analysis, lower order terms can be discarded. A rough, but correct, asymptotic analysis of the algorithm would go something like this: Each iteration of the for loop takes a certain constant amount of time. There are n iterations of the loop, so the total run time is a constant times n, plus lower order terms (to account for the initialization). Disregarding lower order terms, we see that the run time is O(n).


Note that to say that an algorithm has run time O(f(n)) is to say that its run time is no bigger than some constant times n (for large values of n). O(f(n)) puts an upper limit on the run time. However, the run time could be smaller, even much smaller. For example, if the run time is O(n), it would also be correct to say that the run time is O(n2 ) or even O(n10 ). If the run time is less than a constant times n, then it is certainly less than the same constant times n2 or n10 . Of course, sometimes its useful to have a lower limit on the run time. That is, we want to be able to say that the run time is greater than or equal to some constant times f(n) (for large values of n). The notation for this is (f(n)), read Omega of f of n. Omega is the name of a letter in the Greek alphabet, and is the upper case version of that letter. (To be technical, saying that the run time of an algorithm is (f(n)) means that there is a positive number C and a positive integer M such that whenever n is greater than M, the run time is greater than or equal to C*f(n).) O(f(n)) tells you something about the maximum amount of time that you might have to wait for an algorithm to nish; (f(n)) tells you something about the minimum time. The algorithm for adding up the numbers in an array has a run time that is (n) as well as O(n). When an algorithm has a run time that is both (f(n)) and O(f(n)), its run time is said to be (f(n)), read Theta of f of n. (Theta is another letter from the Greek alphabet.) To say that the run time of an algorithm is (f(n)) means that for large values of n, the run time is between a*f(n) and b*f(n), where a and b are constants (with b greater than a, and both greater than 0). Lets look at another example. Consider the algorithm that can be expressed in Java in the following method:
/** * Sorts the n array elements A[0], A[1], ..., A[n-1] into increasing order. */ public static simpleBubbleSort( int[] A, int n ) { for (int i = 0; i < n; i++) { // Do n passes through the array... for (int j = 0; j < n-1; j++) { if ( A[j] > A[j+1] ) { // A[j] and A[j+1] are out of order, so swap them int temp = A[j]; A[j] = A[j+1]; A[j+1] = temp;

416
} } } }

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

Here, the parameter n represents the problem size. The outer for loop in the method is executed n times. Each time the outer for loop is executed, the inner for loop is exectued n-1 times, so the if statement is executed n*(n-1) times. This is n2 -n, but since lower order terms are not signicant in an asymptotic analysis, its good enough to say that the if statement is executed about n2 times. In particular, the test A[j] > A[j+1] is executed about n2 times, and this fact by itself is enough to say that the run time of the algorithm is (n2 ), that is, the run time is at least some constant times n2 . Furthermore, if we look at other operationsthe assignment statements, incrementing i and j, etc.none of them are executed more than n2 times, so the run time is also O(n2 ), that is, the run time is no more than some constant times n2 . Since it is both (n2 ) and O(n2 ), the run time of the simpleBubbleSort algorithm is (n2 ). You should be aware that some people use the notation O(f(n)) as if it meant (f(n)). That is, when they say that the run time of an algorithm is O(f(n)), they mean to say that the run time is about equal to a constant times f(n). For that, they should use (f(n)). Properly speaking, O(f(n)) means that the run time is less than a constant times f(n), possibly much less.


So far, my analysis has ignored an important detail. We have looked at how run time depends on the problem size, but in fact the run time usually depends not just on the size of the problem but on the specic data that has to be processed. For example, the run time of a sorting algorithm can depend on the initial order of the items that are to be sorted, and not just on the number of items. To account for this dependency, we can consider either the worst case run time analysis or the average case run time analysis of an algorithm. For a worst case run time analysis, we consider all possible problems of size n and look at the longest possible run time for all such problems. For an average case analysis, we consider all possible problems of size n and look at the average of the run times for all such problems. Usually, the average case analysis assumes that all problems of size n are equally likely to be encountered, although this is not always realisticor even possible in the case where there is an innite number of dierent problems of a given size. In many cases, the average and the worst case run times are the same to within a constant multiple. This means that as far as asymptotic analysis is concerned, they are the same. That is, if the average case run time is O(f(n)) or (f(n)), then so is the worst case. However, later in the book, we will encounter a few cases where the average and worst case asymptotic analyses dier.


So, what do you really have to know about analysis of algorithms to read the rest of this book? We will not do any rigorous mathematical analysis, but you should be able to follow informal discussion of simple cases such as the examples that we have looked at in this section. Most important, though, you should have a feeling for exactly what it means to say that the running time of an algorithm is O(f(n)) or (f(n)) for some common functions f(n). The main point is that these notations do not tell you anything about the actual numerical value of the running time of the algorithm for any particular case. They do not tell you anything at all

8.6. ANALYSIS OF ALGORITHMS

417

about the running time for small values of n. What they do tell you is something about the rate of growth of the running time as the size of the problem increases. Suppose you compare two algorithm that solve the same problem. The run time of one algorithm is (n2 ), while the run time of the second algorithm is (n3 ). What does this tell you? If you want to know which algorithm will be faster for some particular problem of size, say, 100, nothing is certain. As far as you can tell just from the asymptotic analysis, either algorithm could be faster for that particular caseor in any particular case. But what you can say for sure is that if you look at larger and larger problems, you will come to a point where the (n2 ) algorithm is faster than the (n3 ) algorithm. Furthermore, as you continue to increase the problem size, the relative advantage of the (n2 ) algorithm will continue to grow. There will be values of n for which the (n2 ) algorithm is a thousand times faster, a million times faster, a billion times faster, and so on. This is because for any positive constants a and b, the function a*n3 grows faster than the function b*n2 as n gets larger. (Mathematically, the limit of the ratio of a*n3 to b*n2 is innite as n approaches innity.) This means that for large problems, a (n2 ) algorithm will denitely be faster than a (n3 ) algorithm. You just dont knowbased on the asymptotic analysis aloneexactly how large large has to be. In practice, in fact, it is likely that the (n2 ) algorithm will be faster even for fairly small values of n, and absent other information you would generally prefer a (n2 ) algorithm to a (n3 ) algorithm. So, to understand and apply asymptotic analysis, it is essential to have some idea of the rates of growth of some common functions. For the power functions n, n2 , n3 , n4 , . . . , the larger the exponent, the greater the rate of growth of the function. Exponential functions such as 2n and 10n , where the n is in the exponent, have a growth rate that is faster than that of any power function. In fact, exponential functions grow so quickly that an algorithm whose run time grows exponentially is almost certainly impractical even for relatively modest values of n, because the running time is just too long. Another function that often turns up in asymptotic analysis is the logarithm function, log(n). There are actually many dierent logarithm functions, but the one that is usually used in computer science is the so-called logarithm to the base two, which is dened by the fact that log(2x ) = x for any number x. (Usually, this function is written log2 (n), but I will leave out the subscript 2, since I will only use the base-two logarithm in this book.) The logarithm function grows very slowly. The growth rate of log(n) is much smaller than the growth rate of n. The growth rate of n*log(n) is a little larger than the growth rate of n, but much smaller than the growth rate of n2 . The following table should help you understand the dierences among the rates of grows of various functions:
) n ( g o l / n 2 n 4 3 3 0 0 6 6 6 6 5 9 3 7 0 0 0 5 5 0 0 2 5 8 0 0 4 0 0 6 4 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 8 0 8 4 4 4 8 5 6 4 4 6 ) 3 0 5 8 2 n 0 1 ( 2 2 1 3 5 g 9 3 o l 9 7 * 1 9 n 8 9 2 ) n ( 8 0 0 0 4 6 g 1 3 2 o l 0 0 6 4 6 4 1 5 0 0 6 2 0 0 0 2 n 1 0 0 0 0 0 0 1 0 0 0 1

The reason that log(n) shows up so often is because of its association with multiplying and dividing by two: Suppose you start with the number n and divide it by 2, then divide by 2 again, and so on, until you get a number that is less than or equal to 1. Then the number of divisions is equal (to the nearest integer) to log(n).

0 7 0 1 3 4

. . . . . . 0 3 7

4 2 2 1 3 0 7 7 1 1 7 0 7 5 4

418

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

As an example, consider the binary search algorithm from Subsection 7.4.1. This algorithm searches for an item in a sorted array. The problem size, n, can be taken to be the length of the array. Each step in the binary search algorithm divides the number of items still under consideration by 2, and the algorithm stops when the number of items under consideration is less than or equal to 1 (or sooner). It follows that the number of steps for an array of length n is at most log(n). This means that the worst-case run time for binary search is (log(n)). (The average case run time is also (log(n)).) By comparison, the linear search algorithm, which was also presented in Subsection 7.4.1 has a run time that is (n). The notation gives us a quantitative way to express and to understand the fact that binary search is much faster than linear search. In binary search, each step of the algorithm divides the problem size by 2. It often happens that some operation in an algorithm (not necessarily a single step) divides the problem size by 2. Whenever that happens, the logarithm function is likely to show up in an asymptotic analysis of the run time of the algorithm. Analysis of Algorithms is a large, fascinating eld. We will only use a few of the most basic ideas from this eld, but even those can be very helpful for understanding the dierences among algorithms.

Exercises

419

Exercises for Chapter 8


1. Write a program that uses the following subroutine, from Subsection 8.3.3, to solve equations specied by the user.
/** * Returns the larger of the two roots of the quadratic equation * A*x*x + B*x + C = 0, provided it has any roots. If A == 0 or * if the discriminant, B*B - 4*A*C, is negative, then an exception * of type IllegalArgumentException is thrown. */ static public double root( double A, double B, double C ) throws IllegalArgumentException { if (A == 0) { throw new IllegalArgumentException("A cant be zero."); } else { double disc = B*B - 4*A*C; if (disc < 0) throw new IllegalArgumentException("Discriminant < zero."); return (-B + Math.sqrt(disc)) / (2*A); } }

Your program should allow the user to specify values for A, B, and C. It should call the subroutine to compute a solution of the equation. If no error occurs, it should print the root. However, if an error occurs, your program should catch that error and print an error message. After processing one equation, the program should ask whether the user wants to enter another equation. The program should continue until the user answers no. 2. As discussed in Section 8.1, values of type int are limited to 32 bits. Integers that are too large to be represented in 32 bits cannot be stored in an int variable. Java has a standard class, java.math.BigInteger, that addresses this problem. An object of type BigInteger is an integer that can be arbitrarily large. (The maximum size is limited only by the amount of memory on your computer.) Since BigIntegers are objects, they must be manipulated using instance methods from the BigInteger class. For example, you cant add two BigIntegers with the + operator. Instead, if N and M are variables that refer to BigIntegers, you can compute the sum of N and M with the function call N.add(M). The value returned by this function is a new BigInteger object that is equal to the sum of N and M. The BigInteger class has a constructor new BigInteger(str), where str is a string. The string must represent an integer, such as 3 or 39849823783783283733. If the string does not represent a legal integer, then the constructor throws a NumberFormatException. There are many instance methods in the BigInteger class. Here are a few that you will nd useful for this exercise. Assume that N and M are variables of type BigInteger. N.add(M) a function that returns a BigInteger representing the sum of N and M. N.multiply(M) a function that returns a BigInteger representing the result of multiplying N times M.

420

CHAPTER 8. CORRECTNESS AND ROBUSTNESS N.divide(M) a function that returns a BigInteger representing the result of dividing N by M, discarding the remainder. N.signum() a function that returns an ordinary int. The returned value represents the sign of the integer N. The returned value is 1 if N is greater than zero. It is -1 if N is less than zero. And it is 0 if N is zero. N.equals(M) a function that returns a boolean value that is true if N and M have the same integer value. N.toString() a function that returns a String representing the value of N. N.testBit(k) a function that returns a boolean value. The parameter k is an integer. The return value is true if the k-th bit in N is 1, and it is false if the k-th bit is 0. Bits are numbered from right to left, starting with 0. Testing if (N.testBit(0)) is an easy way to check whether N is even or odd. N.testBit(0) is true if and only if N is an odd number. For this exercise, you should write a program that prints 3N+1 sequences with starting values specied by the user. In this version of the program, you should use BigIntegers to represent the terms in the sequence. You can read the users input into a String with the TextIO.getln() function. Use the input value to create the BigInteger object that represents the starting point of the 3N+1 sequence. Dont forget to catch and handle the NumberFormatException that will occur if the users input is not a legal integer! You should also check that the input number is greater than zero. If the users input is legal, print out the 3N+1 sequence. Count the number of terms in the sequence, and print the count at the end of the sequence. Exit the program when the user inputs an empty line.

3. A Roman numeral represents an integer using letters. Examples are XVII to represent 17, MCMLIII for 1953, and MMMCCCIII for 3303. By contrast, ordinary numbers such as 17 or 1953 are called Arabic numerals. The following table shows the Arabic equivalent of all the single-letter Roman numerals:
M D C L 1000 500 100 50 X V I 10 5 1

When letters are strung together, the values of the letters are just added up, with the following exception. When a letter of smaller value is followed by a letter of larger value, the smaller value is subtracted from the larger value. For example, IV represents 5 - 1, or 4. And MCMXCV is interpreted as M + CM + XC + V, or 1000 + (1000 - 100) + (100 - 10) + 5, which is 1995. In standard Roman numerals, no more than thee consecutive copies of the same letter are used. Following these rules, every number between 1 and 3999 can be represented as a Roman numeral made up of the following one- and two-letter combinations:
M CM D CD C XC 1000 900 500 400 100 90 X IX V IV I 10 9 5 4 1

Exercises
L XL 50 40

421

Write a class to represent Roman numerals. The class should have two constructors. One constructs a Roman numeral from a string such as XVII or MCMXCV. It should throw a NumberFormatException if the string is not a legal Roman numeral. The other constructor constructs a Roman numeral from an int. It should throw a NumberFormatException if the int is outside the range 1 to 3999. In addition, the class should have two instance methods. The method toString() returns the string that represents the Roman numeral. The method toInt() returns the value of the Roman numeral as an int. At some point in your class, you will have to convert an int into the string that represents the corresponding Roman numeral. One way to approach this is to gradually move value from the Arabic numeral to the Roman numeral. Here is the beginning of a routine that will do this, where number is the int that is to be converted:
String roman = ""; int N = number; while (N >= 1000) { // Move 1000 from N to roman. roman += "M"; N -= 1000; } while (N >= 900) { // Move 900 from N to roman. roman += "CM"; N -= 900; } . . // Continue with other values from the above table. .

(You can save yourself a lot of typing in this routine if you use arrays in a clever way to represent the data in the above table.) Once youve written your class, use it in a main program that will read both Arabic numerals and Roman numerals entered by the user. If the user enters an Arabic numeral, print the corresponding Roman numeral. If the user enters a Roman numeral, print the corresponding Arabic numeral. (You can tell the dierence by using TextIO.peek() to peek at the rst character in the users input. If that character is a digit, then the users input is an Arabic numeral. Otherwise, its a Roman numeral.) The program should end when the user inputs an empty line. 4. The source code le le Expr.java denes a class, Expr, that can be used to represent mathematical expressions involving the variable x. The expression can use the operators +, -, *, /, and ^ (where ^ represents the operation of raising a number to a power). It can use mathematical functions such as sin, cos, abs, and ln. See the source code le for full details. The Expr class uses some advanced techniques which have not yet been covered in this textbook. However, the interface is easy to understand. It contains only a constructor and two public methods. The constructor new Expr(def) creates an Expr object dened by a given expression. The parameter, def, is a string that contains the denition. For example,

422

CHAPTER 8. CORRECTNESS AND ROBUSTNESS new Expr("x^2") or new Expr("sin(x)+3*x"). If the parameter in the constructor call does not represent a legal expression, then the constructor throws an IllegalArgumentException. The message in the exception describes the error. If func is a variable of type Expr and num is of type double, then func.value(num) is a function that returns the value of the expression when the number num is substituted for the variable x in the expression. For example, if Expr represents the expression 3*x+1, then func.value(5) is 3*5+1, or 16. If the expression is undened for the specied value of x, then the special value Double.NaN is returned. Finally, func.toString() returns the denition of the expression. This is just the string that was used in the constructor that created the expression object. For this exercise, you should write a program that lets the user enter an expression. If the expression contains an error, print an error message. Otherwise, let the user enter some numerical values for the variable x. Print the value of the expression for each number that the user enters. However, if the expression is undened for the specied value of x, print a message to that eect. You can use the boolean-valued function Double.isNaN(val) to check whether a number, val, is Double.NaN. The user should be able to enter as many values of x as desired. After that, the user should be able to enter a new expression. In the on-line version of this exercise, there is an applet that simulates my solution, so that you can see how it works.

5. This exercise uses the class Expr, which was described in Exercise 8.4 and which is dened in the source code le Expr.java. For this exercise, you should write a GUI program that can graph a function, f(x), whose denition is entered by the user. The program should have a text-input box where the user can enter an expression involving the variable x, such as x^2 or sin(x-3)/x. This expression is the denition of the function. When the user presses return in the text input box, the program should use the contents of the text input box to construct an object of type Expr. If an error is found in the denition, then the program should display an error message. Otherwise, it should display a graph of the function. (Note: A JTextField generates an ActionEvent when the user presses return.) The program will need a JPanel for displaying the graph. To keep things simple, this panel should represent a xed region in the xy-plane, dened by -5 <= x <= 5 and -5 <= y <= 5. To draw the graph, compute a large number of points and connect them with line segments. (This method does not handle discontinuous functions properly; doing so is very hard, so you shouldnt try to do it for this exercise.) My program divides the interval -5 <= x <= 5 into 300 subintervals and uses the 301 endpoints of these subintervals for drawing the graph. Note that the function might be undened at one of these x-values. In that case, you have to skip that point. A point on the graph has the form (x,y) where y is obtained by evaluating the users expression at the given value of x. You will have to convert these real numbers to the integer coordinates of the corresponding pixel on the canvas. The formulas for the conversion are:
a b = = (int)( (x + 5)/10 * width ); (int)( (5 - y)/10 * height );

where a and b are the horizontal and vertical coordinates of the pixel, and width and height are the width and height of the panel. You can nd an applet version of my solution in the on-line version of this exercise.

Exercises

423

6. Exercise 3.2 asked you to nd the integer in the range 1 to 10000 that has the largest number of divisors. Now write a program that uses multiple threads to solve the same problem. By using threads, your program will take less time to do the computation when it is run on a multiprocessor computer. At the end of the program, output the elapsed time, the integer that has the largest number of divisors, and the number of divisors that it has. The program can be modeled on the sample prime-counting program ThreadTest2.java from Subsection 8.5.3.

424

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

Quiz on Chapter 8
1. What does it mean to say that a program is robust? 2. Why do programming languages require that variables be declared before they are used? What does this have to do with correctness and robustness? 3. What is a precondition? Give an example. 4. Explain how preconditions can be used as an aid in writing correct programs. 5. Java has a predened class called Throwable. What does this class represent? Why does it exist? 6. Write a method that prints out a 3N+1 sequence starting from a given integer, N. The starting value should be a parameter to the method. If the parameter is less than or equal to zero, throw an IllegalArgumentException. If the number in the sequence becomes too large to be represented as a value of type int, throw an ArithmeticException. 7. Rewrite the method from the previous question, using assert statements instead of exceptions to check for errors. What the dierence between the two versions of the method when the program is run? 8. Some classes of exceptions require mandatory exception handling. Explain what this means. 9. Consider a subroutine processData() that has the header
static void processData() throws IOException

Write a try..catch statement that calls this subroutine and prints an error message if an IOException occurs. 10. Why should a subroutine throw an exception when it encounters an error? Why not just terminate the program? 11. Suppose that a program uses a single thread that takes 4 seconds to run. Now suppose that the program creates two threads and divides the same work between the two threads. What can be said about the expected execution time of the program that uses two threads? 12. Consider the ThreadSafeCounter example from Subsection 8.5.3:
public class ThreadSafeCounter { private int count = 0; // The value of the counter.

synchronized public void increment() { count = count + 1; } synchronized public int getValue() { return count; } }

Quiz

425 The increment() method is synchronized so that the caller of the method can complete the three steps of the operation Get value of count, Add 1 to value, Store new value in count without being interrupted by another thread. But getValue() consists of a single, simple step. Why is getValue() synchronized? (This is a deep and tricky question.)

426

CHAPTER 8. CORRECTNESS AND ROBUSTNESS

Chapter 9

Linked Data Structures and Recursion


In this chapter, we look at two advanced programming techniques, recursion and linked data
structures, and some of their applications. Both of these techniques are related to the seemingly paradoxical idea of dening something in terms of itself. This turns out to be a remarkably powerful idea. A subroutine is said to be recursive if it calls itself, either directly or indirectly. That is, the subroutine is used in its own denition. Recursion can often be used to solve complex problems by reducing them to simpler problems of the same type. A reference to one object can be stored in an instance variable of another object. The objects are then said to be linked. Complex data structures can be built by linking objects together. An especially interesting case occurs when an object contains a link to another object that belongs to the same class. In that case, the class is used in its own denition. Several important types of data structures are built using classes of this kind.

9.1 At

Recursion

one time or another, youve probably been told that you cant dene something in terms of itself. Nevertheless, if its done right, dening something at least partially in terms of itself can be a very powerful technique. A recursive denition is one that uses the concept or thing that is being dened as part of the denition. For example: An ancestor is either a parent or an ancestor of a parent. A sentence can be, among other things, two sentences joined by a conjunction such as and. A directory is a part of a disk drive that can hold les and directories. In mathematics, a set is a collection of elements, which can themselves be sets. A statement in Java can be a while statement, which is made up of the word while, a boolean-valued condition, and a statement. Recursive denitions can describe very complex situations with just a few words. A definition of the term ancestor without using recursion might go something like a parent, or a grandparent, or a great-grandparent, or a great-great-grandparent, and so on. But saying and so on is not very rigorous. (Ive often thought that recursion is really just a rigorous way of saying and so on.) You run into the same problem if you try to dene a directory as a le that is a list of les, where some of the les can be lists of les, where some of those les can be lists of les, and so on. Trying to describe what a Java statement can look like, without using recursion in the denition, would be dicult and probably pretty comical. 427

428

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION

Recursion can be used as a programming technique. A recursive subroutine is one that calls itself, either directly or indirectly. To say that a subroutine calls itself directly means that its denition contains a subroutine call statement that calls the subroutine that is being dened. To say that a subroutine calls itself indirectly means that it calls a second subroutine which in turn calls the rst subroutine (either directly or indirectly). A recursive subroutine can dene a complex task in just a few lines of code. In the rest of this section, well look at a variety of examples, and well see other examples in the rest of the book.

9.1.1

Recursive Binary Search

Lets start with an example that youve seen before: the binary search algorithm from Subsection 7.4.1. Binary search is used to nd a specied value in a sorted list of items (or, if it does not occur in the list, to determine that fact). The idea is to test the element in the middle of the list. If that element is equal to the specied value, you are done. If the specied value is less than the middle element of the list, then you should search for the value in the rst half of the list. Otherwise, you should search for the value in the second half of the list. The method used to search for the value in the rst or second half of the list is binary search. That is, you look at the middle element in the half of the list that is still under consideration, and either youve found the value you are looking for, or you have to apply binary search to one half of the remaining elements. And so on! This is a recursive description, and we can write a recursive subroutine to implement it. Before we can do that, though, there are two considerations that we need to take into account. Each of these illustrates an important general fact about recursive subroutines. First of all, the binary search algorithm begins by looking at the middle element of the list. But what if the list is empty? If there are no elements in the list, then it is impossible to look at the middle element. In the terminology of Subsection 8.2.1, having a non-empty list is a precondition for looking at the middle element, and this is a clue that we have to modify the algorithm to take this precondition into account. What should we do if we nd ourselves searching for a specied value in an empty list? The answer is easy: If the list is empty, we can be sure that the value does not occur in the list, so we can give the answer without any further work. An empty list is a base case for the binary search algorithm. A base case for a recursive algorithm is a case that is handled directly, rather than by applying the algorithm recursively. The binary search algorithm actually has another type of base case: If we nd the element we are looking for in the middle of the list, we are done. There is no need for further recursion. The second consideration has to do with the parameters to the subroutine. The problem is phrased in terms of searching for a value in a list. In the original, non-recursive binary search subroutine, the list was given as an array. However, in the recursive approach, we have to able to apply the subroutine recursively to just a part of the original list. Where the original subroutine was designed to search an entire array, the recursive subroutine must be able to search part of an array. The parameters to the subroutine must tell it what part of the array to search. This illustrates a general fact that in order to solve a problem recursively, it is often necessary to generalize the problem slightly. Here is a recursive binary search algorithm that searches for a given value in part of an array of integers:
/** * Search in the array A in positions numbered loIndex to hiIndex, * inclusive, for the specified value. If the value is found, return * the index in the array where it occurs. If the value is not found,

9.1. RECURSION
* return -1. Precondition: The array must be sorted into increasing * order. */ static int binarySearch(int[] A, int loIndex, int hiIndex, int value) { if (loIndex > hiIndex) { // The starting position comes after the final index, // so there are actually no elements in the specified // range. The value does not occur in this empty list! return -1; } else { // Look at the middle position in the list. If the // value occurs at that position, return that position. // Otherwise, search recursively in either the first // half or the second half of the list. int middle = (loIndex + hiIndex) / 2; if (value == A[middle]) return middle; else if (value < A[middle]) return binarySearch(A, loIndex, middle - 1, value); else // value must be > A[middle] return binarySearch(A, middle + 1, hiIndex, value); } } // end binarySearch()

429

In this routine, the parameters loIndex and hiIndex specify the part of the array that is to be searched. To search an entire array, it is only necessary to call binarySearch(A, 0, A.length - 1, value). In the two base caseswhen there are no elements in the specied range of indices and when the value is found in the middle of the rangethe subroutine can return an answer immediately, without using recursion. In the other cases, it uses a recursive call to compute the answer and returns that answer. Most people nd it dicult at rst to convince themselves that recursion actually works. The key is to note two things that must be true for recursion to work properly: There must be one or more base cases, which can be handled without using recursion. And when recursion is applied during the solution of a problem, it must be applied to a problem that is in some sense smallerthat is, closer to the base casesthan the original problem. The idea is that if you can solve small problems and if you can reduce big problems to smaller problems, then you can solve problems of any size. Ultimately, of course, the big problems have to be reduced, possibly in many, many steps, to the very smallest problems (the base cases). Doing so might involve an immense amount of detailed bookkeeping. But the computer does that bookkeeping, not you! As a programmer, you lay out the big picture: the base cases and the reduction of big problems to smaller problems. The computer takes care of the details involved in reducing a big problem, in many steps, all the way down to base cases. Trying to think through this reduction in detail is likely to drive you crazy, and will probably make you think that recursion is hard. Whereas in fact, recursion is an elegant and powerful method that is often the simplest approach to solving a complex problem. A common error in writing recursive subroutines is to violate one of the two rules: There must be one or more base cases, and when the subroutine is applied recursively, it must be applied to a problem that is smaller than the original problem. If these rules are violated, the

430

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION

result can be an innite recursion, where the subroutine keeps calling itself over and over, without ever reaching a base case. Innite recursion is similar to an innite loop. However, since each recursive call to the subroutine uses up some of the computers memory, a program that is stuck in an innite recursion will run out of memory and crash before long. (In Java, the program will crash with an exception of type StackOverflowError.)

9.1.2

Towers of Hanoi

Binary search can be implemented with a while loop, instead of with recursion, as was done in Subsection 7.4.1. Next, we turn to a problem that is easy to solve with recursion but dicult to solve without it. This is a standard example known as The Towers of Hanoi. The problem involves a stack of various-sized disks, piled up on a base in order of decreasing size. The object is to move the stack from one base to another, subject to two rules: Only one disk can be moved at a time, and no disk can ever be placed on top of a smaller disk. There is a third base that can be used as a spare. The starting situation for a stack of ten disks is shown in the top half of the following picture. The situation after a number of moves have been made is shown in the bottom half of the picture. These pictures are from the applet at the end of Section 9.5, which displays an animation of the step-by-step solution of the problem.

The problem is to move ten disks from Stack 0 to Stack 1, subject to certain rules. Stack 2 can be used as a spare location. Can we reduce this to smaller problems of the same type, possibly generalizing the problem a bit to make this possible? It seems natural to consider the size of the problem to be the number of disks to be moved. If there are N disks in Stack 0, we know that we will eventually have to move the bottom disk from Stack 0 to Stack 1. But before we can do that, according to the rules, the rst N-1 disks must be on Stack 2. Once weve moved the N-th disk to Stack 1, we must move the other N-1 disks from Stack 2 to Stack 1 to complete the solution. But moving N-1 disks is the same type of problem as moving N disks, except that its a smaller version of the problem. This is exactly what we need to do recursion! The problem has to be generalized a bit, because the smaller problems involve moving disks from Stack 0 to Stack 2 or from Stack 2 to Stack 1, instead of from Stack 0 to Stack 1. In the recursive subroutine that solves the problem, the stacks that serve as the source and destination

9.1. RECURSION

431

of the disks have to be specied. Its also convenient to specify the stack that is to be used as a spare, even though we could gure that out from the other two parameters. The base case is when there is only one disk to be moved. The solution in this case is trivial: Just move the disk in one step. Here is a version of the subroutine that will print out step-by-step instructions for solving the problem:
/** * Solve the problem of moving the number of disks specified * by the first parameter from the stack specified by the * second parameter to the stack specified by the third * parameter. The stack specified by the fourth parameter * is available for use as a spare. Stacks are specified by * number: 0, 1, or 2. */ static void TowersOfHanoi(int disks, int from, int to, int spare) { if (disks == 1) { // There is only one disk to be moved. Just move it. System.out.println("Move a disk from stack number " + from + " to stack number " + to); } else { // Move all but one disk to the spare stack, then // move the bottom disk, then put all the other // disks on top of it. TowersOfHanoi(disks-1, from, spare, to); System.out.println("Move a disk from stack number " + from + " to stack number " + to); TowersOfHanoi(disks-1, spare, to, from); } }

This subroutine just expresses the natural recursive solution. The recursion works because each recursive call involves a smaller number of disks, and the problem is trivial to solve in the base case, when there is only one disk. To solve the top level problem of moving N disks from Stack 0 to Stack 1, it should be called with the command TowersOfHanoi(N,0,1,2). The subroutine is demonstrated by the sample program TowersOfHanoi.java. Here, for example, is the output from the program when it is run with the number of disks set equal to 3:
Move Move Move Move Move Move Move Move Move Move Move Move Move Move Move a a a a a a a a a a a a a a a disk disk disk disk disk disk disk disk disk disk disk disk disk disk disk from from from from from from from from from from from from from from from stack stack stack stack stack stack stack stack stack stack stack stack stack stack stack number number number number number number number number number number number number number number number 0 0 2 0 1 1 0 0 2 2 1 2 0 0 2 to to to to to to to to to to to to to to to stack stack stack stack stack stack stack stack stack stack stack stack stack stack stack number number number number number number number number number number number number number number number 2 1 1 2 0 2 2 1 1 0 0 1 2 1 1

432

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION

The output of this program shows you a mass of detail that you dont really want to think about! The diculty of following the details contrasts sharply with the simplicity and elegance of the recursive solution. Of course, you really want to leave the details to the computer. Its much more interesting to watch the applet from Section 9.5, which shows the solution graphically. That applet uses the same recursive subroutine, except that the System.out.println statements are replaced by commands that show the image of the disk being moved from one stack to another. There is, by the way, a story that explains the name of this problem. According to this story, on the rst day of creation, a group of monks in an isolated tower near Hanoi were given a stack of 64 disks and were assigned the task of moving one disk every day, according to the rules of the Towers of Hanoi problem. On the day that they complete their task of moving all the disks from one stack to another, the universe will come to an end. But dont worry. The number of steps required to solve the problem for N disks is 2N - 1, and 264 - 1 days is over 50,000,000,000,000 years. We have a long way to go. (In the terminology of Section 8.6, the Towers of Hanoi algorithm has a run time that is (2n ), where n is the number of disks that have to be moved. Since the exponential function 2n grows so quickly, the Towers of Hanoi problem can be solved in practice only for a small number of disks.)


By the way, in addtion to the graphical Towers of Hanoi applet at the end of this chapter, there are two other end-of-chapter applets in the on-line version of this text that use recursion. One is a maze-solving applet from the end of Section 11.5, and the other is a pentominos applet from the end of Section 10.5. The Maze applet rst builds a random maze. It then tries to solve the maze by nding a path through the maze from the upper left corner to the lower right corner. This problem is actually very similar to a blob-counting problem that is considered later in this section. The recursive maze-solving routine starts from a given square, and it visits each neighboring square and calls itself recursively from there. The recursion ends if the routine nds itself at the lower right corner of the maze. The Pentominos applet is an implementation of a classic puzzle. A pentomino is a connected gure made up of ve equal-sized squares. There are exactly twelve gures that can be made in this way, not counting all the possible rotations and reections of the basic gures. The problem is to place the twelve pentominos on an 8-by-8 board in which four of the squares have already been marked as lled. The recursive solution looks at a board that has already been partially lled with pentominos. The subroutine looks at each remaining piece in turn. It tries to place that piece in the next available place on the board. If the piece ts, it calls itself recursively to try to ll in the rest of the solution. If that fails, then the subroutine goes on to the next piece. A generalized version of the pentominos applet with many more features can be found at http://math.hws.edu/xJava/PentominosSolver/. The Maze applet and the Pentominos applet are fun to watch, and they give nice visual representations of recursion.

9.1.3

A Recursive Sorting Algorithm

Turning next to an application that is perhaps more practical, well look at a recursive algorithm for sorting an array. The selection sort and insertion sort algorithms, which were covered in Section 7.4, are fairly simple, but they are rather slow when applied to large arrays. Faster

9.1. RECURSION

433

sorting algorithms are available. One of these is Quicksort, a recursive algorithm which turns out to be the fastest sorting algorithm in most situations. The Quicksort algorithm is based on a simple but clever idea: Given a list of items, select any item from the list. This item is called the pivot. (In practice, Ill just use the rst item in the list.) Move all the items that are smaller than the pivot to the beginning of the list, and move all the items that are larger than the pivot to the end of the list. Now, put the pivot between the two groups of items. This puts the pivot in the position that it will occupy in the nal, completely sorted array. It will not have to be moved again. Well refer to this procedure as QuicksortStep.
e n i s a i g f , l a e 3 d s 2 t e i f v o 3 o t 2 f m r e l e e b b e u h o t n t s s o e e t e h l v . s a T t s r h r h e . e t g b ' 3 i b h n r 2 t m s m s f u f e t u o n i o o n t d e e o t h t h n a d t g o e h i n t i t r t a l r c o e o 3 n e s s h l 2 o t , i s e t t r s n o i s e t i a , s l b s s h o r t r e p m e e r h l u t b b e a t n g m n a m e n u e u i r h n t n t s g r t f e i e o s o s h r g n t t e i n s h t b i a s r l y r i o r m d n a s a u A e o n d t . r o n l d e T a a p s n e a a t c t S t f s r i e l o h t s s t k i n c i i o u t 3 2 e Q i , l s y l r 3 e p 2 b p a n m a o u h T

QuicksortStep is not recursive. It is used as a subroutine by Quicksort. The speed of Quicksort depends on having a fast implementation of QuicksortStep. Since its not the main point of this discussion, I present one without much comment.
/** * Apply QuicksortStep to the list of items in locations lo through hi * in the array A. The value returned by this routine is the final * position of the pivot item in the array. */ static int quicksortStep(int[] A, int lo, int hi) { int pivot = A[lo]; // // // // // // // // Get the pivot value.

The numbers hi and lo mark the endpoints of a range of numbers that have not yet been tested. Decrease hi and increase lo until they become equal, moving numbers bigger than pivot so that they lie above hi and moving numbers less than the pivot so that they lie below lo. When we begin, A[lo] is an available space, since it used to hold the pivot.

while (hi > lo) { while (hi > lo && A[hi] > pivot) { // Move hi down past numbers greater than pivot. // These numbers do not have to be moved. hi--; } if (hi == lo) break;

n t

434

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION


// The number A[hi] is less than pivot. Move it into // the available space at A[lo], leaving an available // space at A[hi]. A[lo] = A[hi]; lo++; while (hi > lo && A[lo] < pivot) { // Move lo up past numbers less than pivot. // These numbers do not have to be moved. lo++; } if (hi == lo) break; // The number A[lo] is greater than pivot. Move it into // the available space at A[hi], leaving an available // space at A[lo]. A[hi] = A[lo]; hi--; } // end while // // // // At this point, lo has become equal to hi, and there is an available space at that position. This position lies between numbers less than pivot and numbers greater than pivot. Put pivot in this space and return its location.

A[lo] = pivot; return lo; } // end QuicksortStep

With this subroutine in hand, Quicksort is easy. The Quicksort algorithm for sorting a list consists of applying QuicksortStep to the list, then applying Quicksort recursively to the items that lie to the left of the new position of the pivot and to the items that lie to the right of that position. Of course, we need base cases. If the list has only one item, or no items, then the list is already as sorted as it can ever be, so Quicksort doesnt have to do anything in these cases.
/** * Apply quicksort to put the array elements between * position lo and position hi into increasing order. */ static void quicksort(int[] A, int lo, int hi) { if (hi <= lo) { // The list has length one or zero. Nothing needs // to be done, so just return from the subroutine. return; } else { // Apply quicksortStep and get the new pivot position. // Then apply quicksort to sort the items that // precede the pivot and the items that follow it. int pivotPosition = quicksortStep(A, lo, hi); quicksort(A, lo, pivotPosition - 1); quicksort(A, pivotPosition + 1, hi);

9.1. RECURSION
} }

435

As usual, we had to generalize the problem. The original problem was to sort an array, but the recursive algorithm is set up to sort a specied part of an array. To sort an entire array, A, using the quickSort() subroutine, you would call quicksort(A, 0, A.length - 1). Quicksort is an interesting example from the point of view of the analysis of algorithms (Section 8.6), because its average case run time diers greatly from its worst case run time. Here is a very informal analysis, starting with the average case: Note that an application of quicksortStep divides a problem into two sub-problems. On the average, the subproblems will be of approximately the same size. A problem of size n is divided into two problems that are roughly of size n/2; these are then divided into four problems that are roughly of size n/4; and so on. Since the problem size is divided by 2 on each level, there will be approximately log(n) levels of subdivision. The amount of processing on each level is proportional to n. (On the top level, each element in the array is looked at and possibly moved. On the second level, where there are two subproblems, every element but one in the array is part of one of those two subproblems and must be looked at and possibly moved, so there is a total of about n steps in both subproblems combined. Similarly, on the third level, there are four subproblems and a total of about n steps in all four subproblems combined on that level. . . .) With a total of n steps on each level and approximately log(n) levels in the average case, the average case run time for Quicksort is (n*log(n)). This analysis assumes that quicksortStep divides a problem into two approximately equal parts. However, in the worst case, each application of quicksortStep divides a problem of size n into a problem of size 0 and a problem of size n-1. This happens when the pivot element ends up at the beginning or end of the array. In this worst case, there are n levels of subproblems, and the worst-case run time is (n2 ). The worst case is very rareit depends on the items in the array being arranged in a very special way, so the average performance of Quicksort can be very good even though it is not so good in certain rare cases. There are sorting algorithms that have both an average case and a worst case run time of (n*log(n)). One example is MergeSort, which you can look up if you are interested.

9.1.4

Blob Counting

The program Blobs.java displays a grid of small white and gray squares. The gray squares are considered to be lled and the white squares are empty. For the purposes of this example, we dene a blob to consist of a lled square and all the lled squares that can be reached from it by moving up, down, left, and right through other lled squares. If the user clicks on any lled square in the program, the computer will count the squares in the blob that contains the clicked square, and it will change the color of those squares to red. The program has several controls. There is a New Blobs button; clicking this button will create a new random pattern in the grid. A pop-up menu species the approximate percentage of squares that will be lled in the new pattern. The more lled squares, the larger the blobs. And a button labeled Count the Blobs will tell you how many dierent blobs there are in the pattern. You can try an applet version of the program in the on-line version of the book. Here is a picture of the program after the user has clicked one of the lled squares:

436

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION

Recursion is used in this program to count the number of squares in a blob. Without recursion, this would be a very dicult thing to implement. Recursion makes it relatively easy, but it still requires a new technique, which is also useful in a number of other applications. The data for the grid of squares is stored in a two dimensional array of boolean values,
boolean[][] filled;

The value of filled[r][c] is true if the square in row r and in column c of the grid is lled. The number of rows in the grid is stored in an instance variable named rows, and the number of columns is stored in columns. The program uses a recursive instance method named getBlobSize() to count the number of squares in the blob that contains the square in a given row r and column c. If there is no lled square at position (r,c), then the answer is zero. Otherwise, getBlobSize() has to count all the lled squares that can be reached from the square at position (r,c). The idea is to use getBlobSize() recursively to get the number of lled squares that can be reached from each of the neighboring positions, (r+1,c), (r-1,c), (r,c+1), and (r,c-1). Add up these numbers, and add one to count the square at (r,c) itself, and you get the total number of lled squares that can be reached from (r,c). Here is an implementation of this algorithm, as stated. Unfortunately, it has a serious aw: It leads to an innite recursion!
int getBlobSize(int r, int c) { // BUGGY, INCORRECT VERSION!! // This INCORRECT method tries to count all the filled // squares that can be reached from position (r,c) in the grid. if (r < 0 || r >= rows || c < 0 || c >= columns) { // This position is not in the grid, so there is // no blob at this position. Return a blob size of zero. return 0; } if (filled[r][c] == false) { // This square is not part of a blob, so return zero. return 0; } int size = 1; // Count the square at this position, then count the

9.1. RECURSION
// the blobs that are connected to this square // horizontally or vertically. size += getBlobSize(r-1,c); size += getBlobSize(r+1,c); size += getBlobSize(r,c-1); size += getBlobSize(r,c+1); return size; // end INCORRECT getBlobSize()

437

Unfortunately, this routine will count the same square more than once. In fact, it will try to count each square innitely often! Think of yourself standing at position (r,c) and trying to follow these instructions. The rst instruction tells you to move up one row. You do that, and then you apply the same procedure. As one of the steps in that procedure, you have to move down one row and apply the same procedure yet again. But that puts you back at position (r,c)! From there, you move up one row, and from there you move down one row. . . . Back and forth forever! We have to make sure that a square is only counted and processed once, so we dont end up going around in circles. The solution is to leave a trail of breadcrumbsor on the computer a trail of boolean valuesto mark the squares that youve already visited. Once a square is marked as visited, it wont be processed again. The remaining, unvisited squares are reduced in number, so denite progress has been made in reducing the size of the problem. Innite recursion is avoided! A second boolean array, visited[r][c], is used to keep track of which squares have already been visited and processed. It is assumed that all the values in this array are set to false before getBlobSize() is called. As getBlobSize() encounters unvisited squares, it marks them as visited by setting the corresponding entry in the visited array to true. When getBlobSize() encounters a square that is already visited, it doesnt count it or process it further. The technique of marking items as they are encountered is one that used over and over in the programming of recursive algorithms. Here is the corrected version of getBlobSize(), with changes shown in italic:
/** * Counts the squares in the blob at position (r,c) in the * grid. Squares are only counted if they are filled and * unvisited. If this routine is called for a position that * has been visited, the return value will be zero. */ int getBlobSize(int r, int c) { if (r < 0 || r >= rows || c < 0 || c >= columns) { // This position is not in the grid, so there is // no blob at this position. Return a blob size of zero. return 0; } if (filled[r][c] == false || visited[r][c] == true) { // This square is not part of a blob, or else it has // already been counted, so return zero. return 0; } visited[r][c] = true; // Mark the square as visited so that // we wont count it again during the // following recursive calls. int size = 1; // Count the square at this position, then count the // the blobs that are connected to this square

438

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION


// horizontally or vertically. size += getBlobSize(r-1,c); size += getBlobSize(r+1,c); size += getBlobSize(r,c-1); size += getBlobSize(r,c+1); return size; // end getBlobSize()

In the program, this method is used to determine the size of a blob when the user clicks on a square. After getBlobSize() has performed its task, all the squares in the blob are still marked as visited. The paintComponent() method draws visited squares in red, which makes the blob visible. The getBlobSize() method is also used for counting blobs. This is done by the following method, which includes comments to explain how it works:
/** * When the user clicks the "Count the Blobs" button, find the * number of blobs in the grid and report the number in the * message label. */ void countBlobs() { int count = 0; // Number of blobs. /* First clear out the visited array. The getBlobSize() method will mark every filled square that it finds by setting the corresponding element of the array to true. Once a square has been marked as visited, it will stay marked until all the blobs have been counted. This will prevent the same blob from being counted more than once. */ for (int r = 0; r < rows; r++) for (int c = 0; c < columns; c++) visited[r][c] = false; /* For each position in the grid, call getBlobSize() to get the size of the blob at that position. If the size is not zero, count a blob. Note that if we come to a position that was part of a previously counted blob, getBlobSize() will return 0 and the blob will not be counted again. */ for (int r = 0; r < rows; r++) for (int c = 0; c < columns; c++) { if (getBlobSize(r,c) > 0) count++; } repaint(); // Note that all the filled squares will be red, // since they have all now been visited.

message.setText("The number of blobs is " + count); } // end countBlobs()

9.2. LINKED DATA STRUCTURES

439

9.2

Linked Data Structures

Every useful object contains instance variables.

When the type of an instance variable is given by a class or interface name, the variable can hold a reference to another object. Such a reference is also called a pointer, and we say that the variable points to the object. (Of course, any variable that can contain a reference to an object can also contain the special value null, which points to nowhere.) When one object contains an instance variable that points to another object, we think of the objects as being linked by the pointer. Data structures of great complexity can be constructed by linking objects together.

9.2.1

Recursive Linking

Something interesting happens when an object contains an instance variable that can refer to another object of the same type. In that case, the denition of the objects class is recursive. Such recursion arises naturally in many cases. For example, consider a class designed to represent employees at a company. Suppose that every employee except the boss has a supervisor, who is another employee of the company. Then the Employee class would naturally contain an instance variable of type Employee that points to the employees supervisor:
/** * An object of type Employee holds data about one employee. */ public class Employee { String name; // Name of the employee.

Employee supervisor; // The employees supervisor. . . . // (Other instance variables and methods.)

} // end class Employee

If emp is a variable of type Employee, then emp.supervisor is another variable of type Employee. If emp refers to the boss, then the value of emp.supervisor should be null to indicate the fact that the boss has no supervisor. If we wanted to print out the name of the employees supervisor, for example, we could use the following Java statement:
if ( emp.supervisor == null) { System.out.println( emp.name + " is the boss and has no supervisor!" ); } else { System.out.print( "The supervisor of " + emp.name + " is " ); System.out.println( emp.supervisor.name ); }

Now, suppose that we want to know how many levels of supervisors there are between a given employee and the boss. We just have to follow the chain of command through a series of supervisor links, and count how many steps it takes to get to the boss:
if ( emp.supervisor == null ) { System.out.println( emp.name + " is the boss!" ); } else {

440

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION


Employee runner; // For "running" up the chain of command. runner = emp.supervisor; if ( runner.supervisor == null) { System.out.println( emp.name + " reports directly to the boss." ); } else { int count = 0; while ( runner.supervisor != null ) { count++; // Count the supervisor on this level. runner = runner.supervisor; // Move up to the next level. } System.out.println( "There are " + count + " supervisors between " + emp.name + " and the boss." ); } }

As the while loop is executed, runner points in turn to the original employee, emp, then to emps supervisor, then to the supervisor of emps supervisor, and so on. The count variable is incremented each time runner visits a new employee. The loop ends when runner.supervisor is null, which indicates that runner has reached the boss. At that point, count has counted the number of steps between emp and the boss. In this example, the supervisor variable is quite natural and useful. In fact, data structures that are built by linking objects together are so useful that they are a major topic of study in computer science. Well be looking at a few typical examples. In this section and the next, well be looking at linked lists. A linked list consists of a chain of objects of the same type, linked together by pointers from one object to the next. This is much like the chain of supervisors between emp and the boss in the above example. Its also possible to have more complex situations, in which one object can contain links to several other objects. Well look at an example of this in Section 9.4.

l e

u r h

n t e l l h f l l t o u u e t n n g . c t o e t c j e b d j e o b k o n n t a i l x o e e t n b e e c n l l h n a l l t e c l u u r o s l g n n t e t f n u c s i e r n t e r e s j f o a b e e r e o s w r e t h l t n t t i , a s n c r a f i e n e t a e s i o t j e n a v a b a r s t o e c t o d o n c s c t o t h d e m a n c j c e e c h e t n b t e a b t h a j e o c t E n c b n v e i I , . o a o . l j e t e t c . b d p t n p s s e o e e s i t a y e m l p e g t c n c o r n y a u a s n e c t u e r t g e t o n h e e r m c t s n r e i a e n u n m o f h W s i r h o a t e l m w T c s s r l u n

9.2. LINKED DATA STRUCTURES

441

9.2.2

Linked Lists

For most of the examples in the rest of this section, linked lists will be constructed out of objects belonging to the class Node which is dened as follows:
class Node { String item; Node next; }

The term node is often used to refer to one of the objects in a linked data structure. Objects of type Node can be chained together as shown in the top part of the above picture. Each node holds a String and a pointer to the next node in the list (if any). The last node in such a list can always be identied by the fact that the instance variable next in the last node holds the value null instead of a pointer to another node. The purpose of the chain of nodes is to represent a list of strings. The rst string in the list is stored in the rst node, the second string is stored in the second node, and so on. The pointers and the node objects are used to build the structure, but the data that we are interested in representing is the list of strings. Of course, we could just as easily represent a list of integers or a list of JButtons or a list of any other type of data by changing the type of the item that is stored in each node. Although the Nodes in this example are very simple, we can use them to illustrate the common operations on linked lists. Typical operations include deleting nodes from the list, inserting new nodes into the list, and searching for a specied String among the items in the list. We will look at subroutines to perform all of these operations, among others. For a linked list to be used in a program, that program needs a variable that refers to the rst node in the list. It only needs a pointer to the rst node since all the other nodes in the list can be accessed by starting at the rst node and following links along the list from one node to the next. In my examples, I will always use a variable named head, of type Node, that points to the rst node in the linked list. When the list is empty, the value of head is null.

9.2.3

Basic Linked List Processing

It is very common to want to process all the items in a linked list in some way. The common pattern is to start at the head of the list, then move from each node to the next by by following the pointer in the node, stopping when the null that marks the end of the list is reached. If head is a variable of type Node that points to the rst node in the list, then the general form of the code is:
Node runner; // A pointer that will be used to traverse the list. runner = head; // Start with runner pointing to the head of the list. while ( runner != null ) { // Continue until null is encountered. process( runner.item ); // Do something with the item in the current node.

"

l m n , "

b e

a r

r e

a H

v . . e

a t s s

e o i l

b p r

t e u "

s h p

u e t n s

m n i a i j h

e " t e

e s

h o e

t n v

, r t

l e s

u s r

e d

s e a

u h e t "

e h

b d o t e e

o r l

t f t b "

t n a

s i i

i r

l o a p

a v t

r a e

o h h

F t t " l l i b " : d a e h

442

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION


runner = runner.next; } // Move on to the next node in the list.

Our only access to the list is through the variable head, so we start by getting a copy of the value in head with the assignment statement runner = head. We need a copy of head because we are going to change the value of runner. We cant change the value of head, or we would lose our only access to the list! The variable runner will point to each node of the list in turn. When runner points to one of the nodes in the list, runner.next is a pointer to the next node in the list, so the assignment statement runner = runner.next moves the pointer along the list from each node to the next. We know that weve reached the end of the list when runner becomes equal to null. Note that our list-processing code works even for an empty list, since for an empty list the value of head is null and the body of the while loop is not executed at all. As an example, we can print all the strings in a list of Strings by saying:
Node runner = head; while ( runner != null ) { System.out.println( runner.item ); runner = runner.next; }

The while loop can, by the way, be rewritten as a for loop. Remember that even though the loop control variable in a for loop is often numerical, that is not a requirement. Here is a for loop that is equivalent to the above while loop:
for ( Node runner = head; runner != null; runner = runner.next ) { System.out.println( runner.item ); }

Similarly, we can traverse a list of integers to add up all the numbers in the list. A linked list of integers can be constructed using the class
public class IntNode { int item; // One of the integers in the list. IntNode next; // Pointer to the next node in the list. }

If head is a variable of type IntNode that points to a linked list of integers, we can nd the sum of the integers in the list using:
int sum = 0; IntNode runner = head; while ( runner != null ) { sum = sum + runner.item; // Add current item to the sum. runner = runner.next; } System.out.println("The sum of the list items is " + sum);

It is also possible to use recursion to process a linked list. Recursion is rarely the natural way to process a list, since its so easy to use a loop to traverse the list. However, understanding how to apply recursion to lists can help with understanding the recursive processing of more complex data structures. A non-empty linked list can be thought of as consisting of two parts: the head of the list, which is just the rst node in the list, and the tail of the list, which consists of the remainder of the list after the head. Note that the tail is itself a linked list and that it is shorter than the original list (by one node). This is a natural setup for recursion, where the problem of processing a list can be divided into processing the head and recursively

9.2. LINKED DATA STRUCTURES

443

processing the tail. The base case occurs in the case of an empty list (or sometimes in the case of a list of length one). For example, here is a recursive algorithm for adding up the numbers in a linked list of integers:
if the list is empty then return 0 (since there are no numbers to be added up) otherwise let listsum = the number in the head node let tailsum be the sum of the numbers in the tail list (recursively) add tailsum to listsum return listsum

One remaining question is, how do we get the tail of a non-empty linked list? If head is a variable that points to the head node of the list, then head.next is a variable that points to the second node of the listand that node is in fact the rst node of the tail. So, we can view head.next as a pointer to the tail of the list. One special case is when the original list consists of a single node. In that case, the tail of the list is empty, and head.next is null. Since an empty list is represented by a null pointer, head.next represents the tail of the list even in this special case. This allows us to write a recursive list-summing function in Java as
/** * Compute the sum of all the integers in a linked list of integers. * @param head a pointer to the first node in the linked list */ public static int addItemsInList( IntNode head ) { if ( head == null ) { // Base case: The list is empty, so the sum is zero. return 0; } else { // Recursive case: The list is non-empty. Find the sum of // the tail list, and add that to the item in the head node. // (Note that this case could be written simply as // return head.item + addItemsInList( head.next );) int listsum = head.item; int tailsum = addItemsInList( head.next ); listsum = listsum + tailsum; return listsum; } }

I will nish by presenting a list-processing problem that is easy to solve with recursion, but quite tricky to solve without it. The problem is to print out all the strings in a linked list of strings in the reverse of the order in which they occur in the list. Note that when we do this, the item in the head of a list is printed out after all the items in the tail of the list. This leads to the following recursive routine. You should convince yourself that it works, and you should think about trying to do the same thing without using recursion:
public static void printReversed( Node head ) { if ( head == null ) { // Base case: The list is empty, and there is nothing to print. return; } else {

444

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION


// Recursive case: The list is non-empty. printReversed( head.next ); // Print strings in tail, in reverse order. System.out.println( head.item ); // Print string in head node. } }


In the rest of this section, well look at a few more advanced operations on a linked list of strings. The subroutines that we consider are instance methods in a class, StringList. An object of type StringList represents a linked list of nodes. The class has a private instance variable named head of type Node that points to the rst node in the list, or is null if the list is empty. Instance methods in class StringList access head as a global variable. The source code for StringList is in the le StringList.java, and it is used in the sample program ListDemo.java. Suppose we want to know whether a specied string, searchItem, occurs somewhere in a list of strings. We have to compare searchItem to each item in the list. This is an example of basic list traversal and processing. However, in this case, we can stop processing if we nd the item that we are looking for.
/** * Searches the list for a specified item. * @param searchItem the item that is to be searched for * @return true if searchItem is one of the items in the list or false if * searchItem does not occur in the list. */ public boolean find(String searchItem) { Node runner; runner = head; // A pointer for traversing the list. // Start by looking at the head of the list. // (head is an instance variable! )

while ( runner != null ) { // Go through the list looking at the string in each // node. If the string is the one we are looking for, // return true, since the string has been found in the list. if ( runner.item.equals(searchItem) ) return true; runner = runner.next; // Move on to the next node. } // At this point, we have looked at all the items in the list // without finding searchItem. Return false to indicate that // the item does not exist in the list. return false; } // end find()

It is possible that the list is empty, that is, that the value of head is null. We should be careful that this case is handled properly. In the above code, if head is null, then the body of the while loop is never executed at all, so no nodes are processed and the return value is false. This is exactly what we want when the list is empty, since the searchItem cant occur in an empty list.

9.2. LINKED DATA STRUCTURES

445

9.2.4

Inserting into a Linked List

The problem of inserting a new item into a linked list is more dicult, at least in the case where the item is inserted into the middle of the list. (In fact, its probably the most dicult operation on linked data structures that youll encounter in this chapter.) In the StringList class, the items in the nodes of the linked list are kept in increasing order. When a new item is inserted into the list, it must be inserted at the correct position according to this ordering. This means that, usually, we will have to insert the new item somewhere in the middle of the list, between two existing nodes. To do this, its convenient to have two variables of type Node, which refer to the existing nodes that will lie on either side of the new node. In the following illustration, these variables are previous and runner. Another variable, newNode, refers to the new node. In order to do the insertion, the link from previous to runner must be broken, and new links from previous to newNode and from newNode to runner must be added:

Once we have previous and runner pointing to the right nodes, the command previous.next = newNode; can be used to make previous.next point to the new node, instead of to the node indicated by runner. And the command newNode.next = runner will set newNode.next to point to the correct place. However, before we can use these commands, we need to set up runner and previous as shown in the illustration. The idea is to start at the rst node of the list, and then move along the list past all the items that are less than the new item. While doing this, we have to be aware of the danger of falling o the end of the list. That is, we cant continue if runner reaches the end of the list and becomes null. If insertItem is the item that is to be inserted, and if we assume that it does, in fact, belong somewhere in the middle of the list, then the following code would correctly position previous and runner:
Node runner, previous; previous = head; // Start at the beginning of the list. runner = head.next; while ( runner != null && runner.item.compareTo(insertItem) < 0 ) { previous = runner; // "previous = previous.next" would also work runner = runner.next; }

: l n

r d a d

e i

n g m

n n i

u e t

r r h t e s o n t : I n i e : d s o u N o w i e v n e r p

446

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION

(This uses the compareTo() instance method from the String class to test whether the item in the node is less than the item that is being inserted. See Subsection 2.3.2.) This is ne, except that the assumption that the new node is inserted into the middle of the list is not always valid. It might be that insertItem is less than the rst item of the list. In that case, the new node must be inserted at the head of the list. This can be done with the instructions
newNode.next = head; head = newNode; // Make newNode.next point to the old head. // Make newNode the new head of the list.

It is also possible that the list is empty. In that case, newNode will become the rst and only node in the list. This can be accomplished simply by setting head = newNode. The following insert() method from the StringList class covers all of these possibilities:
/** * Insert a specified item to the list, keeping the list in order. * @param insertItem the item that is to be inserted. */ public void insert(String insertItem) { Node newNode; // A Node to contain the new item. newNode = new Node(); newNode.item = insertItem; // (N.B. newNode.next is null.) if ( head == null ) { // The new item is the first (and only) one in the list. // Set head to point to it. head = newNode; } else if ( head.item.compareTo(insertItem) >= 0 ) { // The new item is less than the first item in the list, // so it has to be inserted at the head of the list. newNode.next = head; head = newNode; } else { // The new item belongs somewhere after the first item // in the list. Search for its proper position and insert it. Node runner; // A node for traversing the list. Node previous; // Always points to the node preceding runner. runner = head.next; // Start by looking at the SECOND position. previous = head; while ( runner != null && runner.item.compareTo(insertItem) < 0 ) { // Move previous and runner along the list until runner // falls off the end or hits a list element that is // greater than or equal to insertItem. When this // loop ends, runner indicates the position where // insertItem must be inserted. previous = runner; runner = runner.next; } newNode.next = runner; // Insert newNode after previous. previous.next = newNode; } } // end insert()

9.2. LINKED DATA STRUCTURES

447

If you were paying close attention to the above discussion, you might have noticed that there is one special case which is not mentioned. What happens if the new node has to be inserted at the end of the list? This will happen if all the items in the list are less than the new item. In fact, this case is already handled correctly by the subroutine, in the last part of the if statement. If insertItem is greater than all the items in the list, then the while loop will end when runner has traversed the entire list and become null. However, when that happens, previous will be left pointing to the last node in the list. Setting previous.next = newNode adds newNode onto the end of the list. Since runner is null, the command newNode.next = runner sets newNode.next to null, which is the correct value that is needed to mark the end of the list.

9.2.5

Deleting from a Linked List

The delete operation is similar to insert, although a little simpler. There are still special cases to consider. When the rst node in the list is to be deleted, then the value of head has to be changed to point to what was previously the second node in the list. Since head.next refers to the second node in the list, this can be done by setting head = head.next. (Once again, you should check that this works when head.next is null, that is, when there is no second node in the list. In that case, the list becomes empty.) If the node that is being deleted is in the middle of the list, then we can set up previous and runner with runner pointing to the node that is to be deleted and with previous pointing to the node that precedes that node in the list. Once that is done, the command previous.next = runner.next; will delete the node. The deleted node will be garbage collected. I encourage you to draw a picture for yourself to illustrate this operation. Here is the complete code for the delete() method:
/** * Delete a specified item from the list, if that item is present. * If multiple copies of the item are present in the list, only * the one that comes first in the list is deleted. * @param deleteItem the item to be deleted * @return true if the item was found and deleted, or false if the item * was not in the list. */ public boolean delete(String deleteItem) { if ( head == null ) { // The list is empty, so it certainly doesnt contain deleteString. return false; } else if ( head.item.equals(deleteItem) ) { // The string is the first item of the list. Remove it. head = head.next; return true; } else { // The string, if it occurs at all, is somewhere beyond the // first element of the list. Search the list. Node runner; // A node for traversing the list. Node previous; // Always points to the node preceding runner. runner = head.next; // Start by looking at the SECOND list node. previous = head;

448

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION


while ( runner != null && runner.item.compareTo(deleteItem) < 0 ) { // Move previous and runner along the list until runner // falls off the end or hits a list element that is // greater than or equal to deleteItem. When this // loop ends, runner indicates the position where // deleteItem must be, if it is in the list. previous = runner; runner = runner.next; } if ( runner != null && runner.item.equals(deleteItem) ) { // Runner points to the node that is to be deleted. // Remove it by changing the pointer in the previous node. previous.next = runner.next; return true; } else { // The item does not exist in the list. return false; } } } // end delete()

9.3

Stacks, Queues, and ADTs

A linked list is a particular type of data structure, made up of objects linked together by pointers. In the previous section, we used a linked list to store an ordered list of Strings, and we implemented insert, delete, and find operations on that list. However, we could easily have stored the list of Strings in an array or ArrayList, instead of in a linked list. We could still have implemented the same operations on the list. The implementations of these operations would have been dierent, but their interfaces and logical behavior would still be the same. The term abstract data type, or ADT , refers to a set of possible values and a set of operations on those values, without any specication of how the values are to be represented or how the operations are to be implemented. An ordered list of strings can be dened as an abstract data type. Any sequence of Strings that is arranged in increasing order is a possible value of this data type. The operations on the data type include inserting a new string, deleting a string, and nding a string in the list. There are often several dierent ways to implement the same abstract data type. For example, the ordered list of strings ADT can be implemented as a linked list or as an array. A program that only depends on the abstract denition of the ADT can use either implementation, interchangeably. In particular, the implementation of the ADT can be changed without aecting the program as a whole. This can make the program easier to debug and maintain, so ADTs are an important tool in software engineering. In this section, well look at two common abstract data types, stacks and queues. Both stacks and queues are often implemented as linked lists, but that is not the only possible implementation. You should think of the rest of this section partly as a discussion of stacks and queues and partly as a case study in ADTs.

9.3. STACKS, QUEUES, AND ADTS

449

9.3.1

Stacks

A stack consists of a sequence of items, which should be thought of as piled one on top of the other like a physical stack of boxes or cafeteria trays. Only the top item on the stack is accessible at any given time. It can be removed from the stack with an operation called pop. An item lower down on the stack can only be removed after all the items on top of it have been popped o the stack. A new item can be added to the top of the stack with an operation called push . We can make a stack of any type of items. If, for example, the items are values of type int, then the push and pop operations can be implemented as instance methods void push (int newItem) Add newItem to top of stack. int pop() Remove the top int from the stack and return it. It is an error to try to pop an item from an empty stack, so it is important to be able to tell whether a stack is empty. We need another stack operation to do the test, implemented as an instance method boolean isEmpty() Returns true if the stack is empty. This denes a stack of ints as an abstract data type. This ADT can be implemented in several ways, but however it is implemented, its behavior must correspond to the abstract mental image of a stack.

In the linked list implementation of a stack, the top of the stack is actually the node at the head of the list. It is easy to add and remove nodes at the front of a linked listmuch easier than inserting and deleting nodes in the middle of the list. Here is a class that implements the stack of ints ADT using a linked list. (It uses a static nested class to represent the nodes of the linked list. If the nesting bothers you, you could replace it with a separate Node class.)
public class StackOfInts { /** * An object of type Node holds one of the items in the linked list * that represents the stack. */

450

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION


private static class Node { int item; Node next; } private Node top; // Pointer to the Node that is at the top of // of the stack. If top == null, then the // stack is empty.

/** * Add N to the top of the stack. */ public void push( int N ) { Node newTop; // A Node to hold the new item. newTop = new Node(); newTop.item = N; // Store N in the new Node. newTop.next = top; // The new Node points to the old top. top = newTop; // The new item is now on top. } /** * Remove the top item from the stack, and return it. * Throws an IllegalStateException if the stack is empty when * this method is called. */ public int pop() { if ( top == null ) throw new IllegalStateException("Cant pop from an empty stack."); int topItem = top.item; // The item that is being popped. top = top.next; // The previous second item is now on top. return topItem; } /** * Returns true if the stack is empty. Returns false * if there are one or more items on the stack. */ public boolean isEmpty() { return (top == null); } } // end class StackOfInts

You should make sure that you understand how the push and pop operations operate on the linked list. Drawing some pictures might help. Note that the linked list is part of the private implementation of the StackOfInts class. A program that uses this class doesnt even need to know that a linked list is being used. Now, its pretty easy to implement a stack as an array instead of as a linked list. Since the number of items on the stack varies with time, a counter is needed to keep track of how many spaces in the array are actually in use. If this counter is called top, then the items on the stack are stored in positions 0, 1, . . . , top-1 in the array. The item in position 0 is on the bottom of the stack, and the item in position top-1 is on the top of the stack. Pushing an item onto the stack is easy: Put the item in position top and add 1 to the value of top. If we dont want to put a limit on the number of items that the stack can hold, we can use the dynamic array techniques from Subsection 7.3.2. Note that the typical picture of the array would show the

9.3. STACKS, QUEUES, AND ADTS

451

stack upside down, with the top of the stack at the bottom of the array. This doesnt matter. The array is just an implementation of the abstract idea of a stack, and as long as the stack operations work the way they are supposed to, we are OK. Here is a second implementation of the StackOfInts class, using a dynamic array:
public class StackOfInts { // (alternate version, using an array) // Holds the items on the stack.

private int[] items = new int[10]; private int top = 0;

// The number of items currently on the stack.

/** * Add N to the top of the stack. */ public void push( int N ) { if (top == items.length) { // The array is full, so make a new, larger array and // copy the current stack items into it. int[] newArray = new int[ 2*items.length ]; System.arraycopy(items, 0, newArray, 0, items.length); items = newArray; } items[top] = N; // Put N in next available spot. top++; // Number of items goes up by one. } /** * Remove the top item from the stack, and return it. * Throws an IllegalStateException if the stack is empty when * this method is called. */ public int pop() { if ( top == 0 ) throw new IllegalStateException("Cant pop from an empty stack."); int topItem = items[top - 1] // Top item in the stack. top--; // Number of items on the stack goes down by one. return topItem; } /** * Returns true if the stack is empty. Returns false * if there are one or more items on the stack. */ public boolean isEmpty() { return (top == 0); } } // end class StackOfInts

Once again, the implentation of the stack (as an array) is private to the class. The two versions of the StackOfInts class can be used interchangeably, since their public interfaces are identical.


Its interesting to look at the run time analysis of stack operations. (See Section 8.6). We can measure the size of the problem by the number of items that are on the stack. For the linked list implementation of a stack, the worst case run time both for the push and for the pop

452

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION

operation is (1). This just means that the run time is less than some constant, independent of the number of items on the stack. This is easy to see if you look at the code. The operations are implemented with a few simple assignment statements, and the number of items on the stack has no eect. For the array implementation, on the other hand, a special case occurs in the push operation when the array is full. In that case, a new array is created and all the stack items are copied into the new array. This takes an amount of time that is proportional to the number of items on the stack. So, although the run time for push is usually (1), the worst case run time is (n), where n is the number of items on the stack.

9.3.2

Queues

Queues are similar to stacks in that a queue consists of a sequence of items, and there are restrictions about how items can be added to and removed from the list. However, a queue has two ends, called the front and the back of the queue. Items are always added to the queue at the back and removed from the queue at the front. The operations of adding and removing items are called enqueue and dequeue. An item that is added to the back of the queue will remain on the queue until all the items in front of it have been removed. This should sound familiar. A queue is like a line or queue of customers waiting for service. Customers are serviced in the order in which they arrive on the queue.
e
k c

A queue can hold items of any type. For a queue of ints, the enqueue and dequeue operations can be implemented as instance methods in a QueueOfInts class. We also need an instance method for checking whether the queue is empty: void enqueue(int N) Add N to the back of the queue. int dequeue() Remove the item at the front and return it. boolean isEmpty() Return true if the queue is empty. A queue can be implemented as a linked list or as an array. An ecient array implementation is a little trickier than the array implementation of a stack, so I wont give it here. In the linked

u h e

e t
t

u
B n

q e
o

e t
f

t m m

f
m

o r
o

d n
f

n 7 t
3

e 8 t s s
v

e d 7
a

n 1 d r e
e )

o a u
) l

t o 2 n 7 e

a
d e

"

e e t d
e

c
e

a 2 r
u

l a
k e

p
c

e o u
q

k e
b

"

a u "
r

t q
e a

s
t r

n
e e

o 2 T t
u A f

i 1

t
e

a
A

r " u 1

e "
q

p e
r

o h
e

l
t

l f e
n

a . " h
e

, t e e

e
s

u t t o

e
m

"

u
e

q
I

a t m a e r b

n t n "

I i o r F

9.3. STACKS, QUEUES, AND ADTS

453

list implementation, the rst item of the list is at the front of the queue. Dequeueing an item from the front of the queue is just like popping an item o a stack. The back of the queue is at the end of the list. Enqueueing an item involves setting a pointer in the last node of the current list to point to a new node that contains the item. To do this, well need a command like tail.next = newNode;, where tail is a pointer to the last node in the list. If head is a pointer to the rst node of the list, it would always be possible to get a pointer to the last node of the list by saying:
Node tail; // This will point to the last node in the list. tail = head; // Start at the first node. while (tail.next != null) { tail = tail.next; // Move to next node. } // At this point, tail.next is null, so tail points to // the last node in the list.

However, it would be very inecient to do this over and over every time an item is enqueued. For the sake of eciency, well keep a pointer to the last node in an instance variable. This complicates the class somewhat; we have to be careful to update the value of this variable whenever a new node is added to the end of the list. Given all this, writing the QueueOfInts class is not all that dicult:
public class QueueOfInts { /** * An object of type Node holds one of the items * in the linked list that represents the queue. */ private static class Node { int item; Node next; } private Node head = null; private Node tail = null; // Points to first Node in the queue. // The queue is empty when head is null. // Points to last Node in the queue.

/** * Add N to the back of the queue. */ public void enqueue( int N ) { Node newTail = new Node(); // A Node to hold the new item. newTail.item = N; if (head == null) { // The queue was empty. The new Node becomes // the only node in the list. Since it is both // the first and last node, both head and tail // point to it. head = newTail; tail = newTail; } else { // The new node becomes the new tail of the list. // (The head of the list is unaffected.)

454

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION


tail.next = newTail; tail = newTail; } } /** * Remove and return the front item in the queue. * Throws an IllegalStateException if the queue is empty. */ public int dequeue() { if ( head == null) throw new IllegalStateException("Cant dequeue from an empty queue."); int firstItem = head.item; head = head.next; // The previous second item is now first. if (head == null) { // The queue has become empty. The Node that was // deleted was the tail as well as the head of the // list, so now there is no tail. (Actually, the // class would work fine without this step.) tail = null; } return firstItem; } /** * Return true if the queue is empty. */ boolean isEmpty() { return (head == null); } } // end class QueueOfInts

Queues are typically used in a computer (as in real life) when only one item can be processed at a time, but several items can be waiting for processing. For example: In a Java program that has multiple threads, the threads that want processing time on the CPU are kept in a queue. When a new thread is started, it is added to the back of the queue. A thread is removed from the front of the queue, given some processing time, and thenif it has not terminatedis sent to the back of the queue to wait for another turn. Events such as keystrokes and mouse clicks are stored in a queue called the event queue. A program removes events from the event queue and processes them. Its possible for several more events to occur while one event is being processed, but since the events are stored in a queue, they will always be processed in the order in which they occurred. A web server is a progam that receives requests from web browsers for pages. It is easy for new requests to arrive while the web server is still fullling a previous request. Requests that arrive while the web server is busy are placed into a queue to await processing. Using a queue ensures that requests will be processed in the order in which they were received. Queues are said to implement a FIFO policy: First In, First Out. Or, as it is more commonly expressed, rst come, rst served. Stacks, on the other hand implement a LIFO policy: Last In, First Out. The item that comes out of the stack is the last one that was put in. Just like queues, stacks can be used to hold items that are waiting for processing (although in applications where queues are typically used, a stack would be considered unfair).

9.3. STACKS, QUEUES, AND ADTS

455


To get a better handle on the dierence between stacks and queues, consider the sample program DepthBreadth.java. I suggest that you run the program or try the applet version that can be found in the on-line version of this section. The program shows a grid of squares. Initially, all the squares are white. When you click on a white square, the program will gradually mark all the squares in the grid, starting from the one where you click. To understand how the program does this, think of yourself in the place of the program. When the user clicks a square, you are handed an index card. The location of the squareits row and columnis written on the card. You put the card in a pile, which then contains just that one card. Then, you repeat the following: If the pile is empty, you are done. Otherwise, take an index card from the pile. The index card species a square. Look at each horizontal and vertical neighbor of that square. If the neighbor has not already been encountered, write its location on a new index card and put the card in the pile. While a square is in the pile, waiting to be processed, it is colored red; that is, red squares have been encountered but not yet processed. When a square is taken from the pile and processed, its color changes to gray. Once a square has been colored gray, its color wont change again. Eventually, all the squares have been processed, and the procedure ends. In the index card analogy, the pile of cards has been emptied. The program can use your choice of three methods: Stack, Queue, and Random. In each case, the same general procedure is used. The only dierence is how the pile of index cards is managed. For a stack, cards are added and removed at the top of the pile. For a queue, cards are added to the bottom of the pile and removed from the top. In the random case, the card to be processed is picked at random from among all the cards in the pile. The order of processing is very dierent in these three cases. You should experiment with the program to see how it all works. Try to understand how stacks and queues are being used. Try starting from one of the corner squares. While the process is going on, you can click on other white squares, and they will be added to the pile. When you do this with a stack, you should notice that the square you click is processed immediately, and all the red squares that were already waiting for processing have to wait. On the other hand, if you do this with a queue, the square that you click will wait its turn until all the squares that were already in the pile have been processed.


Queues seem very natural because they occur so often in real life, but there are times when stacks are appropriate and even essential. For example, consider what happens when a routine calls a subroutine. The rst routine is suspended while the subroutine is executed, and it will continue only when the subroutine returns. Now, suppose that the subroutine calls a second subroutine, and the second subroutine calls a third, and so on. Each subroutine is suspended while the subsequent subroutines are executed. The computer has to keep track of all the subroutines that are suspended. It does this with a stack. When a subroutine is called, an activation record is created for that subroutine. The activation record contains information relevant to the execution of the subroutine, such as its local variables and parameters. The activation record for the subroutine is placed on a stack. It will be removed from the stack and destroyed when the subroutine returns. If the subroutine calls another subroutine, the activation record of the second subroutine is pushed onto the stack, on top of the activation record of the rst subroutine. The stack can continue to grow as more subroutines are called, and it shrinks as those subroutines return.

456

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION

9.3.3

Postx Expressions

As another example, stacks can be used to evaluate postx expressions. An ordinary mathematical expression such as 2+(15-12)*17 is called an inx expression. In an inx expression, an operator comes in between its two operands, as in 2 + 2. In a postx expression, an operator comes after its two operands, as in 2 2 +. The inx expression 2+(15-12)*17 would be written in postx form as 2 15 12 - 17 * +. The - operator in this expression applies to the two operands that precede it, namely 15 and 12. The * operator applies to the two operands that precede it, namely 15 12 - and 17. And the + operator applies to 2 and 15 12 - 17 *. These are the same computations that are done in the original inx expression. Now, suppose that we want to process the expression 2 15 12 - 17 * +, from left to right and nd its value. The rst item we encounter is the 2, but what can we do with it? At this point, we dont know what operator, if any, will be applied to the 2 or what the other operand might be. We have to remember the 2 for later processing. We do this by pushing it onto a stack. Moving on to the next item, we see a 15, which is pushed onto the stack on top of the 2. Then the 12 is added to the stack. Now, we come to the operator, -. This operation applies to the two operands that preceded it in the expression. We have saved those two operands on the stack. So, to process the - operator, we pop two numbers from the stack, 12 and 15, and compute 15 - 12 to get the answer 3. This 3 must be remembered to be used in later processing, so we push it onto the stack, on top of the 2 that is still waiting there. The next item in the expression is a 17, which is processed by pushing it onto the stack, on top of the 3. To process the next item, *, we pop two numbers from the stack. The numbers are 17 and the 3 that represents the value of 15 12 -. These numbers are multiplied, and the result, 51 is pushed onto the stack. The next item in the expression is a + operator, which is processed by popping 51 and 2 from the stack, adding them, and pushing the result, 53, onto the stack. Finally, weve come to the end of the expression. The number on the stack is the value of the entire expression, so all we have to do is pop the answer from the stack, and we are done! The value of the expression is 53. Although its easier for people to work with inx expressions, postx expressions have some advantages. For one thing, postx expressions dont require parentheses or precedence rules. The order in which operators are applied is determined entirely by the order in which they occur in the expression. This allows the algorithm for evaluating postx expressions to be fairly straightforward:
Start with an empty stack for each item in the expression: if the item is a number: Push the number onto the stack else if the item is an operator: Pop the operands from the stack // Can generate an error Apply the operator to the operands Push the result onto the stack else There is an error in the expression Pop a number from the stack // Can generate an error if the stack is not empty: There is an error in the expression else: The last number that was popped is the value of the expression

9.3. STACKS, QUEUES, AND ADTS

457

Errors in an expression can be detected easily. For example, in the expression 2 3 + *, there are not enough operands for the * operation. This will be detected in the algorithm when an attempt is made to pop the second operand for * from the stack, since the stack will be empty. The opposite problem occurs in 2 3 4 +. There are not enough operators for all the numbers. This will be detected when the 2 is left still sitting in the stack at the end of the algorithm. This algorithm is demonstrated in the sample program PostxEval.java. This program lets you type in postx expressions made up of non-negative real numbers and the operators +, -, *, /, and ^. The ^ represents exponentiation. That is, 2 3 ^ is evaluated as 23 . The program prints out a message as it processes each item in the expression. The stack class that is used in the program is dened in the le StackOfDouble.java. The StackOfDouble class is identical to the rst StackOfInts class, given above, except that it has been modied to store values of type double instead of values of type int. The only interesting aspect of this program is the method that implements the postx evaluation algorithm. It is a direct implementation of the pseudocode algorithm given above:
/** * Read one line of input and process it as a postfix expression. * If the input is not a legal postfix expression, then an error * message is displayed. Otherwise, the value of the expression * is displayed. It is assumed that the first character on * the input line is a non-blank. */ private static void readAndEvaluate() { StackOfDouble stack; // For evaluating the expression.

stack = new StackOfDouble(); // Make a new, empty stack. TextIO.putln(); while (TextIO.peek() != \n) { if ( Character.isDigit(TextIO.peek()) ) { // The next item in input is a number. Read it and // save it on the stack. double num = TextIO.getDouble(); stack.push(num); TextIO.putln(" Pushed constant " + num); } else { // Since the next item is not a number, the only thing // it can legally be is an operator. Get the operator // and perform the operation. char op; // The operator, which must be +, -, *, /, or ^. double x,y; // The operands, from the stack, for the operation. double answer; // The result, to be pushed onto the stack. op = TextIO.getChar(); if (op != + && op != - && op != * && op != / && op != ^) { // The character is not one of the acceptable operations. TextIO.putln("\nIllegal operator found in input: " + op); return; } if (stack.isEmpty()) {

458

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION


TextIO.putln(" Stack is empty while trying to evaluate " + op); TextIO.putln("\nNot enough numbers in expression!"); return; } y = stack.pop(); if (stack.isEmpty()) { TextIO.putln(" Stack is empty while trying to evaluate " + op); TextIO.putln("\nNot enough numbers in expression!"); return; } x = stack.pop(); switch (op) { case +: answer = x + y; break; case -: answer = x - y; break; case *: answer = x * y; break; case /: answer = x / y; break; default: answer = Math.pow(x,y); // (op must be ^.) } stack.push(answer); TextIO.putln(" Evaluated " + op + " and pushed " + answer); } TextIO.skipBlanks(); } // end while

// If we get to this point, the input has been read successfully. // If the expression was legal, then the value of the expression is // on the stack, and it is the only thing on the stack. if (stack.isEmpty()) { // Impossible if the input is really non-empty. TextIO.putln("No expression provided."); return; } double value = stack.pop(); // Value of the expression. TextIO.putln(" Popped " + value + " at end of expression."); if (stack.isEmpty() == false) { TextIO.putln(" Stack is not empty."); TextIO.putln("\nNot enough operators for all the numbers!"); return; } TextIO.putln("\nValue = " + value); } // end readAndEvaluate()

9.4. BINARY TREES

459

Postx expressions are often used internally by computers. In fact, the Java virtual machine is a stack machine which uses the stack-based approach to expression evaluation that we have been discussing. The algorithm can easily be extended to handle variables, as well as constants. When a variable is encountered in the expression, the value of the variable is pushed onto the stack. It also works for operators with more or fewer than two operands. As many operands as are needed are popped from the stack and the result is pushed back on to the stack. For example, the unary minus operator, which is used in the expression -x, has a single operand. We will continue to look at expressions and expression evaluation in the next two sections.

9.4

Binary Trees

We have seen in the two previous sections how objects can be linked into lists. When an object contains two pointers to objects of the same type, structures can be created that are much more complicated than linked lists. In this section, well look at one of the most basic and useful structures of this type: binary trees. Each of the objects in a binary tree contains two pointers, typically called left and right. In addition to these pointers, of course, the nodes can contain other types of data. For example, a binary tree of integers could be made up of objects of the following type:
class TreeNode { int item; TreeNode left; TreeNode right; } // The data in this node. // Pointer to the left subtree. // Pointer to the right subtree.

The left and right pointers in a TreeNode can be null or can point to other objects of type TreeNode. A node that points to another node is said to be the parent of that node, and the node it points to is called a child . In the picture below, for example, node 3 is the parent of node 6, and nodes 4 and 5 are children of node 2. Not every linked structure made up of tree nodes is a binary tree. A binary tree must have the following properties: There is exactly one node in the tree which has no parent. This node is called the root of the tree. Every other node in the tree has exactly one parent. Finally, there can be no loops in a binary tree. That is, it is not possible to follow a chain of pointers starting at some node and arriving back at the same node.

460

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION

A node that has no children is called a leaf . A leaf node can be recognized by the fact that both the left and right pointers in the node are null. In the standard picture of a binary tree, the root node is shown at the top and the leaf nodes at the bottomwhich doesnt show much respect for the analogy to real trees. But at least you can see the branching, tree-like structure that gives a binary tree its name.

9.4.1

Tree Traversal

Consider any node in a binary tree. Look at that node together with all its descendents (that is, its children, the children of its children, and so on). This set of nodes forms a binary tree, which is called a subtree of the original tree. For example, in the picture, nodes 2, 4, and 5 form a subtree. This subtree is called the left subtree of the root. Similarly, nodes 3 and 6 make up the right subtree of the root. We can consider any non-empty binary tree to be made up of a root node, a left subtree, and a right subtree. Either or both of the subtrees can be empty. This is a recursive denition, matching the recursive denition of the TreeNode class. So it should not be a surprise that recursive subroutines are often used to process trees. Consider the problem of counting the nodes in a binary tree. (As an exercise, you might try to come up with a non-recursive algorithm to do the counting, but you shouldnt expect to nd one.) The heart of problem is keeping track of which nodes remain to be counted. Its not so easy to do this, and in fact its not even possible without an auxiliary data structure such as a stack or queue. With recursion, however, the algorithm is almost trivial. Either the tree is empty or it consists of a root and two subtrees. If the tree is empty, the number of nodes is zero. (This is the base case of the recursion.) Otherwise, use recursion to count the nodes in each subtree. Add the results from the subtrees together, and add one to count the root. This gives the total number of nodes in the tree. Written out in Java:
/** * Count the nodes in the binary tree to which root points, and * return the answer. If root is null, the answer is zero. */ static int countNodes( TreeNode root ) { if ( root == null )

d 2 N

o f a

N e

t l l

o L l l

o 4 u u

R n n

9.4. BINARY TREES


return 0; // The tree is empty. It contains no nodes. else { int count = 1; // Start by counting the root. count += countNodes(root.left); // Add the number of nodes // in the left subtree. count += countNodes(root.right); // Add the number of nodes // in the right subtree. return count; // Return the total. } } // end countNodes()

461

Or, consider the problem of printing the items in a binary tree. If the tree is empty, there is nothing to do. If the tree is non-empty, then it consists of a root and two subtrees. Print the item in the root and use recursion to print the items in the subtrees. Here is a subroutine that prints all the items on one line of output:
/** * Print all the items in the tree to which root points. * The item in the root is printed first, followed by the * items in the left subtree and then the items in the * right subtree. */ static void preorderPrint( TreeNode root ) { if ( root != null ) { // (Otherwise, theres nothing to print.) System.out.print( root.item + " " ); // Print the root item. preorderPrint( root.left ); // Print items in left subtree. preorderPrint( root.right ); // Print items in right subtree. } } // end preorderPrint()

This routine is called preorderPrint because it uses a preorder traversal of the tree. In a preorder traversal, the root node of the tree is processed rst, then the left subtree is traversed, then the right subtree. In a postorder traversal , the left subtree is traversed, then the right subtree, and then the root node is processed. And in an inorder traversal , the left subtree is traversed rst, then the root node is processed, then the right subtree is traversed. Printing subroutines that use postorder and inorder traversal dier from preorderPrint only in the placement of the statement that outputs the root item:
/** * Print all the items in the tree to which root points. * The item in the left subtree printed first, followed * by the items in the right subtree and then the item * in the root node. */ static void postorderPrint( TreeNode root ) { if ( root != null ) { // (Otherwise, theres nothing to print.) postorderPrint( root.left ); // Print items in left subtree. postorderPrint( root.right ); // Print items in right subtree. System.out.print( root.item + " " ); // Print the root item. } } // end postorderPrint() /** * Print all the items in the tree to which root points.

462

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION


* The item in the left subtree printed first, followed * by the item in the root node and then the items * in the right subtree. */ static void inorderPrint( TreeNode root ) { if ( root != null ) { // (Otherwise, theres nothing to print.) inorderPrint( root.left ); // Print items in left subtree. System.out.print( root.item + " " ); // Print the root item. inorderPrint( root.right ); // Print items in right subtree. } } // end inorderPrint()

Each of these subroutines can be applied to the binary tree shown in the illustration at the beginning of this section. The order in which the items are printed diers in each case:
preorderPrint outputs: postorderPrint outputs: inorderPrint outputs: 1 4 4 2 5 2 4 2 5 5 6 1 3 3 3 6 1 6

In preorderPrint, for example, the item at the root of the tree, 1, is output before anything else. But the preorder printing also applies to each of the subtrees of the root. The root item of the left subtree, 2, is printed before the other items in that subtree, 4 and 5. As for the right subtree of the root, 3 is output before 6. A preorder traversal applies at all levels in the tree. The other two traversal orders can be analyzed similarly.

9.4.2

Binary Sort Trees

One of the examples in Section 9.2 was a linked list of strings, in which the strings were kept in increasing order. While a linked list works well for a small number of strings, it becomes inecient for a large number of items. When inserting an item into the list, searching for that items position requires looking at, on average, half the items in the list. Finding an item in the list requires a similar amount of time. If the strings are stored in a sorted array instead of in a linked list, then searching becomes more ecient because binary search can be used. However, inserting a new item into the array is still inecient since it means moving, on average, half of the items in the array to make a space for the new item. A binary tree can be used to store an ordered list of strings, or other items, in a way that makes both searching and insertion ecient. A binary tree used in this way is called a binary sort tree. A binary sort tree is a binary tree with the following property: For every node in the tree, the item in that node is greater than every item in the left subtree of that node, and it is less than or equal to all the items in the right subtree of that node. Here for example is a binary sort tree containing items of type String. (In this picture, I havent bothered to draw all the pointer variables. Non-null pointers are shown as arrows.)

9.4. BINARY TREES

463

Binary sort trees have this useful property: An inorder traversal of the tree will process the items in increasing order. In fact, this is really just another way of expressing the denition. For example, if an inorder traversal is used to print the items in the tree shown above, then the items will be in alphabetical order. The denition of an inorder traversal guarantees that all the items in the left subtree of judy are printed before judy, and all the items in the right subtree of judy are printed after judy. But the binary sort tree property guarantees that the items in the left subtree of judy are precisely those that precede judy in alphabetical order, and all the items in the right subtree follow judy in alphabetical order. So, we know that judy is output in its proper alphabetical position. But the same argument applies to the subtrees. Bill will be output after alice and before fred and its descendents. Fred will be output after dave and before jane and joe. And so on. Suppose that we want to search for a given item in a binary search tree. Compare that item to the root item of the tree. If they are equal, were done. If the item we are looking for is less than the root item, then we need to search the left subtree of the rootthe right subtree can be eliminated because it only contains items that are greater than or equal to the root. Similarly, if the item we are looking for is greater than the item in the root, then we only need to look in the right subtree. In either case, the same procedure can then be applied to search the subtree. Inserting a new item is similar: Start by searching the tree for the position where the new item belongs. When that position is found, create a new node and attach it to the tree at that position. Searching and inserting are ecient operations on a binary search tree, provided that the tree is close to being balanced . A binary tree is balanced if for each node, the left subtree of that node contains approximately the same number of nodes as the right subtree. In a perfectly balanced tree, the two numbers dier by at most one. Not all binary trees are balanced, but if the tree is created by inserting items in a random order, there is a high probability that the tree is approximately balanced. (If the order of insertion is not random, however, its quite possible for the tree to be very unbalanced.) During a search of any binary sort tree, every comparison eliminates one of two subtrees from further consideration. If the tree is balanced, that means cutting the number of items still under consideration in half. This is exactly the same as the binary search algorithm, and the result, is a similarly ecient algorithm. In terms of asymptotic analysis (Section 8.6), searching, inserting, and deleting in a binary

o e

o c i

r l a

464

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION

search tree have average case run time (log(n)). The problem size, n, is the number of items in the tree, and the average is taken over all the dierent orders in which the items could have been inserted into the tree. As long the actual insertion order is random, the actual run time can be expected to be close to the average. However, the worst case run time for binary search tree operations is (n), which is much worse than (log(n)). The worst case occurs for certain particular insertion orders. For example, if the items are inserted into the tree in order of increasing size, then every item that is inserted moves always to the right as it moves down the tree. The result is a tree that looks more like a linked list, since it consists of a linear string of nodes strung together by their right child pointers. Operations on such a tree have the same performance as operations on a linked list. Now, there are data structures that are similar to simple binary sort trees, except that insertion and deletion of nodes are implemented in a way that will always keep the tree balanced, or almost balanced. For these data structures, searching, inserting, and deleting have both average case and worst case run times that are (log(n)). Here, however, we will look at only the simple versions of inserting and searching. The sample program SortTreeDemo.java is a demonstration of binary sort trees. The program includes subroutines that implement inorder traversal, searching, and insertion. Well look at the latter two subroutines below. The main() routine tests the subroutines by letting you type in strings to be inserted into the tree. In this program, nodes in the binary tree are represented using the following static nested class, including a simple constructor that makes creating nodes easier:
/** * An object of type TreeNode represents one node in a binary tree of strings. */ private static class TreeNode { String item; // The data in this node. TreeNode left; // Pointer to left subtree. TreeNode right; // Pointer to right subtree. TreeNode(String str) { // Constructor. Make a node containing str. item = str; } } // end class TreeNode

A static member variable of type TreeNode points to the binary sort tree that is used by the program:
private static TreeNode root; // Pointer to the root node in the tree. // When the tree is empty, root is null.

A recursive subroutine named treeContains is used to search for a given item in the tree. This routine implements the search algorithm for binary trees that was outlined above:
/** * Return true if item is one of the items in the binary * sort tree to which root points. Return false if not. */ static boolean treeContains( TreeNode root, String item ) { if ( root == null ) { // Tree is empty, so it certainly doesnt contain item. return false; } else if ( item.equals(root.item) ) {

9.4. BINARY TREES


// Yes, the item has been found in the root node. return true; } else if ( item.compareTo(root.item) < 0 ) { // If the item occurs, it must be in the left subtree. return treeContains( root.left, item ); } else { // If the item occurs, it must be in the right subtree. return treeContains( root.right, item ); } // end treeContains()

465

When this routine is called in the main() routine, the rst parameter is the static member variable root, which points to the root of the entire binary sort tree. Its worth noting that recursion is not really essential in this case. A simple, non-recursive algorithm for searching a binary sort tree follows the rule: Start at the root and move down the tree until you nd the item or reach a null pointer. Since the search follows a single path down the tree, it can be implemented as a while loop. Here is non-recursive version of the search routine:
private static boolean treeContainsNR( TreeNode root, String item ) { TreeNode runner; // For "running" down the tree. runner = root; // Start at the root node. while (true) { if (runner == null) { // Weve fallen off the tree without finding item. return false; } else if ( item.equals(node.item) ) { // Weve found the item. return true; } else if ( item.compareTo(node.item) < 0 ) { // If the item occurs, it must be in the left subtree, // So, advance the runner down one level to the left. runner = runner.left; } else { // If the item occurs, it must be in the right subtree. // So, advance the runner down one level to the right. runner = runner.right; } } // end while } // end treeContainsNR();

The subroutine for inserting a new item into the tree turns out to be more similar to the non-recursive search routine than to the recursive. The insertion routine has to handle the case where the tree is empty. In that case, the value of root must be changed to point to a node that contains the new item:
root = new TreeNode( newItem );

466

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION

But this means, eectively, that the root cant be passed as a parameter to the subroutine, because it is impossible for a subroutine to change the value stored in an actual parameter. (I should note that this is something that is possible in other languages.) Recursion uses parameters in an essential way. There are ways to work around the problem, but the easiest thing is just to use a non-recursive insertion routine that accesses the static member variable root directly. One dierence between inserting an item and searching for an item is that we have to be careful not to fall o the tree. That is, we have to stop searching just before runner becomes null. When we get to an empty spot in the tree, thats where we have to insert the new node:
/** * Add the item to the binary sort tree to which the global variable * "root" refers. (Note that root cant be passed as a parameter to * this routine because the value of root might change, and a change * in the value of a formal parameter does not change the actual parameter.) */ private static void treeInsert(String newItem) { if ( root == null ) { // The tree is empty. Set root to point to a new node containing // the new item. This becomes the only node in the tree. root = new TreeNode( newItem ); return; } TreeNode runner; // Runs down the tree to find a place for newItem. runner = root; // Start at the root. while (true) { if ( newItem.compareTo(runner.item) < 0 ) { // Since the new item is less than the item in runner, // it belongs in the left subtree of runner. If there // is an open space at runner.left, add a new node there. // Otherwise, advance runner down one level to the left. if ( runner.left == null ) { runner.left = new TreeNode( newItem ); return; // New item has been added to the tree. } else runner = runner.left; } else { // Since the new item is greater than or equal to the item in // runner, it belongs in the right subtree of runner. If there // is an open space at runner.right, add a new node there. // Otherwise, advance runner down one level to the right. if ( runner.right == null ) { runner.right = new TreeNode( newItem ); return; // New item has been added to the tree. } else runner = runner.right; } } // end while } // end treeInsert()

9.4. BINARY TREES

467

9.4.3

Expression Trees

Another application of trees is to store mathematical expressions such as 15*(x+y) or sqrt(42)+7 in a convenient form. Lets stick for the moment to expressions made up of numbers and the operators +, -, *, and /. Consider the expression 3*((7+1)/4)+(17-5). This expression is made up of two subexpressions, 3*((7+1)/4) and (17-5), combined with the operator +. When the expression is represented as a binary tree, the root node holds the operator +, while the subtrees of the root node represent the subexpressions 3*((7+1)/4) and (17-5). Every node in the tree holds either a number or an operator. A node that holds a number is a leaf node of the tree. A node that holds an operator has two subtrees representing the operands to which the operator applies. The tree is shown in the illustration below. I will refer to a tree of this type as an expression tree. Given an expression tree, its easy to nd the value of the expression that it represents. Each node in the tree has an associated value. If the node is a leaf node, then its value is simply the number that the node contains. If the node contains an operator, then the associated value is computed by rst nding the values of its child nodes and then applying the operator to those values. The process is shown by the upward-directed arrows in the illustration. The value computed for the root node is the value of the expression as a whole. There are other uses for expression trees. For example, a postorder traversal of the tree will output the postx form of the expression.
r 5 5 e w s n a 2 1 7 7 1 1 8 1 s t e o 7 5 g i . h n n 7 h s t i 3 d t e s n 3 e 7 n s w t e i o 1 r i u e o ( r o p s p h p p s + x m e e e ) w d r r r o o 4 p t e / a c h ) h x a t w s 1 h e e p t f b + s e u o 7 e n h w ( t ( e e e o a r r u h t c l * r T a a A 3 v

An expression tree contains two types of nodes: nodes that contain numbers and nodes that contain operators. Furthermore, we might want to add other types of nodes to make the trees more useful, such as nodes that contain variables. If we want to work with expression trees in Java, how can we deal with this variety of nodes? One waywhich will be frowned upon by object-oriented puristsis to include an instance variable in each node object to record which type of node it is:
enum NodeType { NUMBER, OPERATOR } // Possible kinds of node.

468
class ExpNode {

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION


// A node in an expression tree. // // // // // Which type of node is this? The value in a node of type NUMBER. The operator in a node of type OPERATOR. Pointers to subtrees, in a node of type OPERATOR.

NodeType kind; double number; char op; ExpNode left; ExpNode right;

ExpNode( double val ) { // Constructor for making a node of type NUMBER. kind = NodeType.NUMBER; number = val; } ExpNode( char op, ExpNode left, ExpNode right ) { // Constructor for making a node of type OPERATOR. kind = NodeType.OPERATOR; this.op = op; this.left = left; this.right = right; } } // end class ExpNode

Given this denition, the following recursive subroutine will nd the value of an expression tree:
static double getValue( ExpNode node ) { // Return the value of the expression represented by // the tree to which node refers. Node must be non-null. if ( node.kind == NodeType.NUMBER ) { // The value of a NUMBER node is the number it holds. return node.number; } else { // The kind must be OPERATOR. // Get the values of the operands and combine them // using the operator. double leftVal = getValue( node.left ); double rightVal = getValue( node.right ); switch ( node.op ) { case +: return leftVal + rightVal; case -: return leftVal - rightVal; case *: return leftVal * rightVal; case /: return leftVal / rightVal; default: return Double.NaN; // Bad operator. } } } // end getValue()

Although this approach works, a more object-oriented approach is to note that since there are two types of nodes, there should be two classes to represent them, ConstNode and BinOpNode. To represent the general idea of a node in an expression tree, we need another class, ExpNode. Both ConstNode and BinOpNode will be subclasses of ExpNode. Since any actual node will be either a ConstNode or a BinOpNode, ExpNode should be an abstract class. (See Subsection 5.5.5.) Since one of the things we want to do with nodes is nd their values, each class should have an instance method for nding the value:

9.4. BINARY TREES


abstract class ExpNode { // Represents a node of any type in an expression tree. abstract double value(); } // end class ExpNode class ConstNode extends ExpNode { // Represents a node that holds a number. double number; // The number in the node. // Return the value of this node.

469

ConstNode( double val ) { // Constructor. Create a node to hold val. number = val; } double value() { // The value is just the number that the node holds. return number; } } // end class ConstNode class BinOpNode extends ExpNode { // Represents a node that holds an operator. char op; ExpNode left; ExpNode right; // The operator. // The left operand. // The right operand.

BinOpNode( char op, ExpNode left, ExpNode right ) { // Constructor. Create a node to hold the given data. this.op = op; this.left = left; this.right = right; } double value() { // To get the value, compute the value of the left and // right operands, and combine them with the operator. double leftVal = left.value(); double rightVal = right.value(); switch ( op ) { case +: return leftVal + rightVal; case -: return leftVal - rightVal; case *: return leftVal * rightVal; case /: return leftVal / rightVal; default: return Double.NaN; // Bad operator. } } } // end class BinOpNode

Note that the left and right operands of a BinOpNode are of type ExpNode, not BinOpNode. This allows the operand to be either a ConstNode or another BinOpNodeor any other type of ExpNode that we might eventually create. Since every ExpNode has a value() method, we can

470

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION

call left.value() to compute the value of the left operand. If left is in fact a ConstNode, this will call the value() method in the ConstNode class. If it is in fact a BinOpNode, then left.value() will call the value() method in the BinOpNode class. Each node knows how to compute its own value. Although it might seem more complicated at rst, the object-oriented approach has some advantages. For one thing, it doesnt waste memory. In the original ExpNode class, only some of the instance variables in each node were actually used, and we needed an extra instance variable to keep track of the type of node. More important, though, is the fact that new types of nodes can be added more cleanly, since it can be done by creating a new subclass of ExpNode rather than by modifying an existing class. Well return to the topic of expression trees in the next section, where well see how to create an expression tree to represent a given expression.

9.5 I

A Simple Recursive Descent Parser

have always been fascinated by languageboth natural languages like English and the articial languages that are used by computers. There are many dicult questions about how languages can convey information, how they are structured, and how they can be processed. Natural and articial languages are similar enough that the study of programming languages, which are pretty well understood, can give some insight into the much more complex and dicult natural languages. And programming languages raise more than enough interesting issues to make them worth studying in their own right. How can it be, after all, that computers can be made to understand even the relatively simple languages that are used to write programs? Computers, after all, can only directly use instructions expressed in very simple machine language. Higher level languages must be translated into machine language. But the translation is done by a compiler, which is just a program. How could such a translation program be written?

9.5.1

Backus-Naur Form

Natural and articial languages are similar in that they have a structure known as grammar or syntax. Syntax can be expressed by a set of rules that describe what it means to be a legal sentence or program. For programming languages, syntax rules are often expressed in BNF (Backus-Naur Form), a system that was developed by computer scientists John Backus and Peter Naur in the late 1950s. Interestingly, an equivalent system was developed independently at about the same time by linguist Noam Chomsky to describe the grammar of natural language. BNF cannot express all possible syntax rules. For example, it cant express the fact that a variable must be dened before it is used. Furthermore, it says nothing about the meaning or semantics of the langauge. The problem of specifying the semantics of a languageeven of an articial programming langaugeis one that is still far from being completely solved. However, BNF does express the basic structure of the language, and it plays a central role in the design of translation programs. In English, terms such as noun, transitive verb, and prepositional phrase are syntactic categories that describe building blocks of sentences. Similarly, statement, number, and while loop are syntactic categories that describe building blocks of Java programs. In BNF, a syntactic category is written as a word enclosed between < and >. For example: <noun>, <verb-phrase>, or <while-loop>. A rule in BNF species the structure of an item

9.5. A SIMPLE RECURSIVE DESCENT PARSER

471

in a given syntactic category, in terms of other syntactic categories and/or basic symbols of the language. For example, one BNF rule for the English language might be
<sentence> ::= <noun-phrase> <verb-phrase>

The symbol ::= is read can be, so this rule says that a <sentence> can be a <noun-phrase> followed by a <verb-phrase>. (The term is can be rather than is because there might be other rules that specify other possible forms for a sentence.) This rule can be thought of as a recipe for a sentence: If you want to make a sentence, make a noun-phrase and follow it by a verb-phrase. Noun-phrase and verb-phrase must, in turn, be dened by other BNF rules. In BNF, a choice between alternatives is represented by the symbol |, which is read or. For example, the rule
<verb-phrase> ::= <intransitive-verb> | ( <transitive-verb> <noun-phrase> )

says that a <verb-phrase> can be an <intransitive-verb>, or a <transitive-verb> followed by a <noun-phrase>. Note also that parentheses can be used for grouping. To express the fact that an item is optional, it can be enclosed between [ and ]. An optional item that can be repeated one or more times is enclosed between [ and ].... And a symbol that is an actual part of the language that is being described is enclosed in quotes. For example,
<noun-phrase> ::= <common-noun> [ "that" <verb-phrase> ] | <common-noun> [ <prepositional-phrase> ]...

says that a <noun-phrase> can be a <common-noun>, optionally followed by the literal word that and a <verb-phrase>, or it can be a <common-noun> followed by zero or more <prepositional-phrase>s. Obviously, we can describe very complex structures in this way. The real power comes from the fact that BNF rules can be recursive. In fact, the two preceding rules, taken together, are recursive. A <noun-phrase> is dened partly in terms of <verb-phrase>, while <verb-phrase> is dened partly in terms of <noun-phrase>. For example, a <noun-phrase> might be the rat that ate the cheese, since ate the cheese is a <verb-phrase>. But then we can, recursively, make the more complex <noun-phrase> the cat that caught the rat that ate the cheese out of the <common-noun> the cat, the word that and the <verb-phrase> caught the rat that ate the cheese. Building from there, we can make the <noun-phrase> the dog that chased the cat that caught the rat that ate the cheese. The recursive structure of language is one of the most fundamental properties of language, and the ability of BNF to express this recursive structure is what makes it so useful. BNF can be used to describe the syntax of a programming language such as Java in a formal and precise way. For example, a <while-loop> can be dened as
<while-loop> ::= "while" "(" <condition> ")" <statement>

This says that a <while-loop> consists of the word while, followed by a left parenthesis, followed by a <condition>, followed by a right parenthesis, followed by a <statement>. Of course, it still remains to dene what is meant by a condition and by a statement. Since a statement can be, among other things, a while loop, we can already see the recursive structure of the Java language. The exact specication of an if statement, which is hard to express clearly in words, can be given as
<if-statement> ::= "if" "(" <condition> ")" <statement> [ "else" "if" "(" <condition> ")" <statement> ]... [ "else" <statement> ]

472

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION

This rule makes it clear that the else part is optional and that there can be, optionally, one or more else if parts.

9.5.2

Recursive Descent Parsing

In the rest of this section, I will show how a BNF grammar for a language can be used as a guide for constructing a parser. A parser is a program that determines the grammatical structure of a phrase in the language. This is the rst step to determining the meaning of the phrasewhich for a programming language means translating it into machine language. Although we will look at only a simple example, I hope it will be enough to convince you that compilers can in fact be written and understood by mortals and to give you some idea of how that can be done. The parsing method that we will use is called recursive descent parsing . It is not the only possible parsing method, or the most ecient, but it is the one most suited for writing compilers by hand (rather than with the help of so called parser generator programs). In a recursive descent parser, every rule of the BNF grammar is the model for a subroutine. Not every BNF grammar is suitable for recursive descent parsing. The grammar must satisfy a certain property. Essentially, while parsing a phrase, it must be possible to tell what syntactic category is coming up next just by looking at the next item in the input. Many grammars are designed with this property in mind. I should also mention that many variations of BNF are in use. The one that Ive described here is one that is well-suited for recursive descent parsing.


When we try to parse a phrase that contains a syntax error, we need some way to respond to the error. A convenient way of doing this is to throw an exception. Ill use an exception class called ParseError, dened as follows:
/** * An object of type ParseError represents a syntax error found in * the users input. */ private static class ParseError extends Exception { ParseError(String message) { super(message); } } // end nested class ParseError

Another general point is that our BNF rules dont say anything about spaces between items, but in reality we want to be able to insert spaces between items at will. To allow for this, Ill always call the routine TextIO.skipBlanks() before trying to look ahead to see whats coming up next in input. TextIO.skipBlanks() skips past any whitespace, such as spaces and tabs, in the input, and stops when the next character in the input is either a non-blank character or the end-of-line character. Lets start with a very simple example. A fully parenthesized expression can be specied in BNF by the rules
<expression> ::= <operator> ::= <number> | "(" <expression> <operator> <expression> ")"

"+" | "-" | "*" | "/"

9.5. A SIMPLE RECURSIVE DESCENT PARSER

473

where <number> refers to any non-negative real number. An example of a fully parenthesized expression is (((34-17)*8)+(2*7)). Since every operator corresponds to a pair of parentheses, there is no ambiguity about the order in which the operators are to be applied. Suppose we want a program that will read and evaluate such expressions. Well read the expressions from standard input, using TextIO. To apply recursive descent parsing, we need a subroutine for each rule in the grammar. Corresponding to the rule for <operator>, we get a subroutine that reads an operator. The operator can be a choice of any of four things. Any other input will be an error.
/** * If the next character in input is one of the legal operators, * read it and return it. Otherwise, throw a ParseError. */ static char getOperator() throws ParseError { TextIO.skipBlanks(); char op = TextIO.peek(); if ( op == + || op == - || op == * || op == / ) { TextIO.getAnyChar(); return op; } else if (op == \n) throw new ParseError("Missing operator at end of line."); else throw new ParseError("Missing operator. Found \"" + op + "\" instead of +, -, *, or /."); } // end getOperator()

Ive tried to give a reasonable error message, depending on whether the next character is an end-of-line or something else. I use TextIO.peek() to look ahead at the next character before I read it, and I call TextIO.skipBlanks() before testing TextIO.peek() in order to ignore any blanks that separate items. I will follow this same pattern in every case. When we come to the subroutine for <expression>, things are a little more interesting. The rule says that an expression can be either a number or an expression enclosed in parentheses. We can tell which it is by looking ahead at the next character. If the character is a digit, we have to read a number. If the character is a (, we have to read the (, followed by an expression, followed by an operator, followed by another expression, followed by a ). If the next character is anything else, there is an error. Note that we need recursion to read the nested expressions. The routine doesnt just read the expression. It also computes and returns its value. This requires semantical information that is not specied in the BNF rule.
/** * Read an expression from the current line of input and return its value. * @throws ParseError if the input contains a syntax error */ private static double expressionValue() throws ParseError { TextIO.skipBlanks(); if ( Character.isDigit(TextIO.peek()) ) { // The next item in input is a number, so the expression // must consist of just that number. Read and return // the number. return TextIO.getDouble(); } else if ( TextIO.peek() == ( ) {

474

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION


// The expression must be of the form // "(" <expression> <operator> <expression> ")" // Read all these items, perform the operation, and // return the result. TextIO.getAnyChar(); // Read the "(" double leftVal = expressionValue(); // Read and evaluate first operand. char op = getOperator(); // Read the operator. double rightVal = expressionValue(); // Read and evaluate second operand. TextIO.skipBlanks(); if ( TextIO.peek() != ) ) { // According to the rule, there must be a ")" here. // Since its missing, throw a ParseError. throw new ParseError("Missing right parenthesis."); } TextIO.getAnyChar(); // Read the ")" switch (op) { // Apply the operator and return the result. case +: return leftVal + rightVal; case -: return leftVal - rightVal; case *: return leftVal * rightVal; case /: return leftVal / rightVal; default: return 0; // Cant occur since op is one of the above. // (But Java syntax requires a return value.) } } else { // No other character can legally start an expression. throw new ParseError("Encountered unexpected character, \"" + TextIO.peek() + "\" in input."); } } // end expressionValue()

I hope that you can see how this routine corresponds to the BNF rule. Where the rule uses | to give a choice between alternatives, there is an if statement in the routine to determine which choice to take. Where the rule contains a sequence of items, ( <expression> <operator> <expression> ), there is a sequence of statements in the subroutine to read each item in turn. When expressionValue() is called to evaluate the expression (((34-17)*8)+(2*7)), it sees the ( at the beginning of the input, so the else part of the if statement is executed. The ( is read. Then the rst recursive call to expressionValue() reads and evaluates the subexpression ((34-17)*8), the call to getOperator() reads the + operator, and the second recursive call to expressionValue() reads and evaluates the second subexpression (2*7). Finally, the ) at the end of the expression is read. Of course, reading the rst subexpression, ((34-17)*8), involves further recursive calls to the expressionValue() routine, but its better not to think too deeply about that! Rely on the recursion to handle the details. Youll nd a complete program that uses these routines in the le SimpleParser1.java.


Fully parenthesized expressions arent very natural for people to use. But with ordinary expressions, we have to worry about the question of operator precedence, which tells us, for example, that the * in the expression 5+3*7 is applied before the +. The complex expression 3*6+8*(7+1)/4-24 should be seen as made up of three terms, 3*6, 8*(7+1)/4, and 24, combined with + and - operators. A term, on the other hand, can be made up of several factors combined with * and / operators. For example, 8*(7+1)/4 contains the

9.5. A SIMPLE RECURSIVE DESCENT PARSER

475

factors 8, (7+1) and 4. This example also shows that a factor can be either a number or an expression in parentheses. To complicate things a bit more, we allow for leading minus signs in expressions, as in -(3+4) or -7. (Since a <number> is a positive number, this is the only way we can get negative numbers. Its done this way to avoid 3 * -7, for example.) This structure can be expressed by the BNF rules
<expression> ::= [ "-" ] <term> [ ( "+" | "-" ) <term> ]... <term> ::= <factor> [ ( "*" | "/" ) <factor> ]... <factor> ::= <number> | "(" <expression> ")"

The rst rule uses the [ ]... notation, which says that the items that it encloses can occur zero, one, two, or more times. This means that an <expression> can begin, optionally, with a -. Then there must be a <term> which can optionally be followed by one of the operators + or - and another <term>, optionally followed by another operator and <term>, and so on. In a subroutine that reads and evaluates expressions, this repetition is handled by a while loop. An if statement is used at the beginning of the loop to test whether a leading minus sign is present:
/** * Read an expression from the current line of input and return its value. * @throws ParseError if the input contains a syntax error */ private static double expressionValue() throws ParseError { TextIO.skipBlanks(); boolean negative; // True if there is a leading minus sign. negative = false; if (TextIO.peek() == -) { TextIO.getAnyChar(); // Read the minus sign. negative = true; } double val; // Value of the expression. val = termValue(); if (negative) val = -val; TextIO.skipBlanks(); while ( TextIO.peek() == + || TextIO.peek() == - ) { // Read the next term and add it to or subtract it from // the value of previous terms in the expression. char op = TextIO.getAnyChar(); // Read the operator. double nextVal = termValue(); if (op == +) val += nextVal; else val -= nextVal; TextIO.skipBlanks(); } return val; } // end expressionValue()

The subroutine for <term> is very similar to this, and the subroutine for <factor> is similar to the example given above for fully parenthesized expressions. A complete program that reads and evaluates expressions based on the above BNF rules can be found in the le SimpleParser2.java.

476

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION

9.5.3

Building an Expression Tree

Now, so far, weve only evaluated expressions. What does that have to do with translating programs into machine language? Well, instead of actually evaluating the expression, it would be almost as easy to generate the machine language instructions that are needed to evaluate the expression. If we are working with a stack machine, these instructions would be stack operations such as push a number or apply a + operation. The program SimpleParser3.java can both evaluate the expression and print a list of stack machine operations for evaluating the expression. Its quite a jump from this program to a recursive descent parser that can read a program written in Java and generate the equivalent machine language codebut the conceptual leap is not huge. The SimpleParser3 program doesnt actually generate the stack operations directly as it parses an expression. Instead, it builds an expression tree, as discussed in the Section 9.4, to represent the expression. The expression tree is then used to nd the value and to generate the stack operations. The tree is made up of nodes belonging to classes ConstNode and BinOpNode that are similar to those given in the Section 9.4. Another class, UnaryMinusNode, has been introduced to represent the unary minus operation. Ive added a method, printStackCommands(), to each class. This method is responsible for printing out the stack operations that are necessary to evaluate an expression. Here for example is the new BinOpNode class from SimpleParser3.java:
private static class BinOpNode extends ExpNode { char op; // The operator. ExpNode left; // The expression for its left operand. ExpNode right; // The expression for its right operand. BinOpNode(char op, ExpNode left, ExpNode right) { // Construct a BinOpNode containing the specified data. assert op == + || op == - || op == * || op == /; assert left != null && right != null; this.op = op; this.left = left; this.right = right; } double value() { // The value is obtained by evaluating the left and right // operands and combining the values with the operator. double x = left.value(); double y = right.value(); switch (op) { case +: return x + y; case -: return x - y; case *: return x * y; case /: return x / y; default: return Double.NaN; // Bad operator! } }

9.5. A SIMPLE RECURSIVE DESCENT PARSER


void printStackCommands() { // To evaluate the expression on a stack machine, first do // whatever is necessary to evaluate the left operand, leaving // the answer on the stack. Then do the same thing for the // second operand. Then apply the operator (which means popping // the operands, applying the operator, and pushing the result). left.printStackCommands(); right.printStackCommands(); TextIO.putln(" Operator " + op);

477

} }

Its also interesting to look at the new parsing subroutines. Instead of computing a value, each subroutine builds an expression tree. For example, the subroutine corresponding to the rule for <expression> becomes
static ExpNode expressionTree() throws ParseError { // Read an expression from the current line of input and // return an expression tree representing the expression. TextIO.skipBlanks(); boolean negative; // True if there is a leading minus sign. negative = false; if (TextIO.peek() == -) { TextIO.getAnyChar(); negative = true; } ExpNode exp; // The expression tree for the expression. exp = termTree(); // Start with a tree for first term. if (negative) { // Build the tree that corresponds to applying a // unary minus operator to the term weve // just read. exp = new UnaryMinusNode(exp); } TextIO.skipBlanks(); while ( TextIO.peek() == + || TextIO.peek() == - ) { // Read the next term and combine it with the // previous terms into a bigger expression tree. char op = TextIO.getAnyChar(); ExpNode nextTerm = termTree(); // Create a tree that applies the binary operator // to the previous tree and the term we just read. exp = new BinOpNode(op, exp, nextTerm); TextIO.skipBlanks(); } return exp; } // end expressionTree()

In some real compilers, the parser creates a tree to represent the program that is being parsed. This tree is called a parse tree. Parse trees are somewhat dierent in form from expression trees, but the purpose is the same. Once you have the tree, there are a number of things you can do with it. For one thing, it can be used to generate machine language code. But

478

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION

there are also techniques for examining the tree and detecting certain types of programming errors, such as an attempt to reference a local variable before it has been assigned a value. (The Java compiler, of course, will reject the program if it contains such an error.) Its also possible to manipulate the tree to optimize the program. In optimization, the tree is transformed to make the program more ecient before the code is generated. And so we are back where we started in Chapter 1, looking at programming languages, compilers, and machine language. But looking at them, I hope, with a lot more understanding and a much wider perspective.

Exercises

479

Exercises for Chapter 9


1. In many textbooks, the rst examples of recursion are the mathematical functions factorial and bonacci. These functions are dened for non-negative integers using the following recursive formulas:
factorial(0) = factorial(N) = fibonacci(0) = fibonacci(1) = fibonacci(N) = 1 N*factorial(N-1) for N > 0

1 1 fibonacci(N-1) + fibonacci(N-2)

for N > 1

Write recursive functions to compute factorial(N) and fibonacci(N) for a given nonnegative integer N, and write a main() routine to test your functions. (In fact, factorial and bonacci are really not very good examples of recursion, since the most natural way to compute them is to use simple for loops. Furthermore, bonacci is a particularly bad example, since the natural recursive approach to computing this function is extremely inecient.) 2. Exercise 7.6 asked you to read a le, make an alphabetical list of all the words that occur in the le, and write the list to another le. In that exercise, you were asked to use an ArrayList<String> to store the words. Write a new version of the same program that stores the words in a binary sort tree instead of in an arraylist. You can use the binary sort tree routines from SortTreeDemo.java, which was discussed in Subsection 9.4.2. 3. Suppose that linked lists of integers are made from objects belonging to the class
class ListNode { int item; ListNode next; } // An item in the list. // Pointer to the next node in the list.

Write a subroutine that will make a copy of a list, with the order of the items of the list reversed. The subroutine should have a parameter of type ListNode, and it should return a value of type ListNode. The original list should not be modied. You should also write a main() routine to test your subroutine. 4. Subsection 9.4.1 explains how to use recursion to print out the items in a binary tree in various orders. That section also notes that a non-recursive subroutine can be used to print the items, provided that a stack or queue is used as an auxiliary data structure. Assuming that a queue is used, here is an algorithm for such a subroutine:
Add the root node to an empty queue while the queue is not empty: Get a node from the queue Print the item in the node if node.left is not null: add it to the queue if node.right is not null: add it to the queue

480

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION Write a subroutine that implements this algorithm, and write a program to test the subroutine. Note that you will need a queue of TreeNodes, so you will need to write a class to represent such queues. (Note that the order in which items are printed by this algorithm is dierent from all three of the orders considered in Subsection 9.4.1.)

5. In Subsection 9.4.2, I say that if the [binary sort] tree is created by inserting items in a random order, there is a high probability that the tree is approximately balanced. For this exercise, you will do an experiment to test whether that is true. The depth of a node in a binary tree is the length of the path from the root of the tree to that node. That is, the root has depth 0, its children have depth 1, its grandchildren have depth 2, and so on. In a balanced tree, all the leaves in the tree are about the same depth. For example, in a perfectly balanced tree with 1023 nodes, all the leaves are at depth 9. In an approximately balanced tree with 1023 nodes, the average depth of all the leaves should be not too much bigger than 9. On the other hand, even if the tree is approximately balanced, there might be a few leaves that have much larger depth than the average, so we might also want to look at the maximum depth among all the leaves in a tree. For this exercise, you should create a random binary sort tree with 1023 nodes. The items in the tree can be real numbers, and you can create the tree by generating 1023 random real numbers and inserting them into the tree, using the usual treeInsert() method for binary sort trees. Once you have the tree, you should compute and output the average depth of all the leaves in the tree and the maximum depth of all the leaves. To do this, you will need three recursive subroutines: one to count the leaves, one to nd the sum of the depths of all the leaves, and one to nd the maximum depth. The latter two subroutines should have an int-valued parameter, depth, that tells how deep in the tree youve gone. When you call this routine from the main program, the depth parameter is 0; when you call the routine recursively, the parameter increases by 1. 6. The parsing programs in Section 9.5 work with expressions made up of numbers and operators. We can make things a little more interesting by allowing the variable x to occur. This would allow expression such as 3*(x-1)*(x+1), for example. Make a new version of the sample program SimpleParser3.java that can work with such expressions. In your program, the main() routine cant simply print the value of the expression, since the value of the expression now depends on the value of x. Instead, it should print the value of the expression for x=0, x=1, x=2, and x=3. The original program will have to be modied in several other ways. Currently, the program uses classes ConstNode, BinOpNode, and UnaryMinusNode to represent nodes in an expression tree. Since expressions can now include x, you will need a new class, VariableNode, to represent an occurrence of x in the expression. In the original program, each of the node classes has an instance method, double value(), which returns the value of the node. But in your program, the value can depend on x, so you should replace this method with one of the form double value(double xValue), where the parameter xValue is the value of x. Finally, the parsing subroutines in your program will have to take into account the fact that expressions can contain x. There is just one small change in the BNF rules for the expressions: A <factor> is allowed to be the variable x:
<factor> ::= <number> | <x-variable> | "(" <expression> ")"

Exercises

481

where <x-variable> can be either a lower case or an upper case X. This change in the BNF requires a change in the factorTree() subroutine. 7. This exercise builds on the previous exercise, Exercise 9.6. To understand it, you should have some background in Calculus. The derivative of an expression that involves the variable x can be dened by a few recursive rules: The derivative of a constant is 0. The derivative of x is 1. If A is an expression, let dA be the derivative of A. Then the derivative of -A is -dA. If A and B are expressions, let dA be the derivative of A and let dB be the derivative of B. Then the derivative of A+B is dA+dB. The derivative of A-B is dA-dB. The derivative of A*B is A*dB + B*dA. The derivative of A/B is (B*dA - A*dB) / (B*B). For this exercise, you should modify your program from the previous exercise so that it can compute the derivative of an expression. You can do this by adding a derivativecomputing method to each of the node classes. First, add another abstract method to the ExpNode class:
abstract ExpNode derivative();

Then implement this method in each of the four subclasses of ExpNode. All the information that you need is in the rules given above. In your main program, instead of printing the stack operations for the original expression, you should print out the stack operations that dene the derivative. Note that the formula that you get for the derivative can be much more complicated than it needs to be. For example, the derivative of 3*x+1 will be computed as (3*1+0*x)+0. This is correct, even though its kind of ugly, and it would be nice for it to be simplied. However, simplifying expressions is not easy. As an alternative to printing out stack operations, you might want to print the derivative as a fully parenthesized expression. You can do this by adding a printInfix() routine to each node class. It would be nice to leave out unnecessary parentheses, but again, the problem of deciding which parentheses can be left out without altering the meaning of the expression is a fairly dicult one, which I dont advise you to attempt. (There is one curious thing that happens here: If you apply the rules, as given, to an expression tree, the result is no longer a tree, since the same subexpression can occur at multiple points in the derivative. For example, if you build a node to represent B*B by saying new BinOpNode(*,B,B), then the left and right children of the new node are actually the same node! This is not allowed in a tree. However, the dierence is harmless in this case since, like a tree, the structure that you get has no loops in it. Loops, on the other hand, would be a disaster in most of the recursive tree-processing subroutines that we have written, since it would lead to innite recursion.)

482

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION

Quiz on Chapter 9
1. Explain what is meant by a recursive subroutine. 2. Consider the following subroutine:
static void printStuff(int level) { if (level == 0) { System.out.print("*"); } else { System.out.print("["); printStuff(level - 1); System.out.print(","); printStuff(level - 1); System.out.println("]"); } }

Show the output that would be produced by the subroutine calls printStuff(0), printStuff(1), printStuff(2), and printStuff(3). 3. Suppose that a linked list is formed from objects that belong to the class
class ListNode { int item; ListNode next; } // An item in the list. // Pointer to next item in the list.

Write a subroutine that will count the number of zeros that occur in a given linked list of ints. The subroutine should have a parameter of type ListNode and should return a value of type int. 4. What are the three operations on a stack? 5. What is the basic dierence between a stack and a queue? 6. What is an activation record? What role does a stack of activation records play in a computer? 7. Suppose that a binary tree of integers is formed from objects belonging to the class
class TreeNode { int item; // One item in the tree. TreeNode left; // Pointer to the left subtree. TreeNode right; // Pointer to the right subtree. }

Write a recursive subroutine that will nd the sum of all the nodes in the tree. Your subroutine should have a parameter of type TreeNode, and it should return a value of type int. 8. What is a postorder traversal of a binary tree? 9. Suppose that a <multilist> is dened by the BNF rule

Quiz
<multilist> ::= <word> | "(" [ <multilist> ]... ")"

483

where a <word> can be any sequence of letters. Give ve dierent <multilist>s that can be generated by this rule. (This rule, by the way, is almost the entire syntax of the programming language LISP! LISP is known for its simple syntax and its elegant and powerful semantics.) 10. Explain what is meant by parsing a computer program.

484

CHAPTER 9. LINKED DATA STRUCTURES AND RECURSION

Chapter 10

Generic Programming and Collection Classes


How to avoid reinventing the wheel? Many data structures and algorithms, such as those from Chapter 9, have been studied, programmed, and re-programmed by generations of computer science students. This is a valuable learning experience. Unfortunately, they have also been programmed and re-programmed by generations of working computer professionals, taking up time that could be devoted to new, more creative work. A programmer who needs a list or a binary tree shouldnt have to re-code these data structures from scratch. They are well-understood and have been programmed thousands of times before. The problem is how to make pre-written, robust data structures available to programmers. In this chapter, well look at Javas attempt to address this problem. 10.1 Generic Programming

programming refers to writing code that will work for many types of data. We encountered the term in Section 7.3, where we looked at dynamic arrays of integers. The source code presented there for working with dynamic arrays of integers works only for data of type int. But the source code for dynamic arrays of double, String, JButton, or any other type would be almost identical, except for the substitution of one type name for another. It seems silly to write essentially the same code over and over. As we saw in Subsection 7.3.3, Java goes some distance towards solving this problem by providing the ArrayList class. An ArrayList is essentially a dynamic array of values of type Object. Since every class is a subclass of Object, objects of any type can be stored in an ArrayList. Java goes even further by providing parameterized types, which were introduced in Subsection 7.3.4. There we saw that the ArrayList type can be parameterized, as in ArrayList<String>, to limit the values that can be stored in the list to objects of a specied type. Parameterized types extend Javas basic philosophy of type-safe programming to generic programming. The ArrayList class is just one of several standard classes that are used for generic programming in Java. We will spend the next few sections looking at these classes and how they are used, and well see that there are also generic methods and generic interfaces (see Subsection 5.7.1). All the classes and interfaces discussed in these sections are dened in the package java.util, and you will need an import statement at the beginning of your program to get access to them. (Before you start putting import java.util.* at the beginning of every program, you should know that some things in java.util have names that are the same as 485

Generic

486

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

things in other packages. For example, both java.util.List and java.awt.List exist, so it is often better to import the individual classes that you need.) In the nal section of this chapter, we will see that it is possible to dene new generic classes, interfaces, and methods. Until then, we will stick to using the generics that are predened in Javas standard library. It is no easy task to design a library for generic programming. Javas solution has many nice features but is certainly not the only possible approach. It is almost certainly not the best, and has a few features that in my opinion can only be called bizarre, but in the context of the overall design of Java, it might be close to optimal. To get some perspective on generic programming in general, it might be useful to look very briey at generic programming in two other languages.

10.1.1

Generic Programming in Smalltalk

Smalltalk was one of the very rst object-oriented programming languages. It is still used today, although its use is not very common. It has not achieved anything like the popularity of Java or C++, but it is the source of many ideas used in these languages. In Smalltalk, essentially all programming is generic, because of two basic properties of the language. First of all, variables in Smalltalk are typeless. A data value has a type, such as integer or string, but variables do not have types. Any variable can hold data of any type. Parameters are also typeless, so a subroutine can be applied to parameter values of any type. Similarly, a data structure can hold data values of any type. For example, once youve dened a binary tree data structure in SmallTalk, you can use it for binary trees of integers or strings or dates or data of any other type. There is simply no need to write new code for each data type. Secondly, all data values are objects, and all operations on objects are dened by methods in a class. This is true even for types that are primitive in Java, such as integers. When the + operator is used to add two integers, the operation is performed by calling a method in the integer class. When you dene a new class, you can dene a + operator, and you will then be able to add objects belonging to that class by saying a + b just as if you were adding numbers. Now, suppose that you write a subroutine that uses the + operator to add up the items in a list. The subroutine can be applied to a list of integers, but it can also be applied, automatically, to any other data type for which + is dened. Similarly, a subroutine that uses the <" operator to sort a list can be applied to lists containing any type of data for which < is dened. There is no need to write a dierent sorting subroutine for each type of data. Put these two features together and you have a language where data structures and algorithms will work for any type of data for which they make sense, that is, for which the appropriate operations are dened. This is real generic programming. This might sound pretty good, and you might be asking yourself why all programming languages dont work this way. This type of freedom makes it easier to write programs, but unfortunately it makes it harder to write programs that are correct and robust (see Chapter 8). Once you have a data structure that can contain data of any type, it becomes hard to ensure that it only holds the type of data that you want it to hold. If you have a subroutine that can sort any type of data, its hard to ensure that it will only be applied to data for which the < operator is dened. More particularly, there is no way for a compiler to ensure these things. The problem will only show up at run time when an attempt is made to apply some operation to a data type for which it is not dened, and the program will crash.

10.1. GENERIC PROGRAMMING

487

10.1.2

Generic Programming in C++

Unlike Smalltalk, C++ is a very strongly typed language, even more so than Java. Every variable has a type, and can only hold data values of that type. This means that the kind of generic programming that is used in Smalltalk is impossible in C++. Furthermore, C++ does not have anything corresponding to Javas Object class. That is, there is no class that is a superclass of all other classes. This means that C++ cant use Javas style of generic programming with non-parameterized generic types either. Nevertheless, C++ has a powerful and exible system of generic programming. It is made possible by a language feature known as templates. In C++, instead of writing a dierent sorting subroutine for each type of data, you can write a single subroutine template. The template is not a subroutine; its more like a factory for making subroutines. We can look at an example, since the syntax of C++ is very similar to Javas:
template<class ItemType> void sort( ItemType A[], int count ) { // Sort items in the array, A, into increasing order. // The items in positions 0, 1, 2, ..., (count-1) are sorted. // The algorithm that is used here is selection sort. for (int i = count-1; i > 0; i--) { int position of max = 0; for (int j = 1; j <= count ; j++) if ( A[j] > A[position of max] ) position of max = j; ItemType temp = A[count]; A[count] = A[position of max]; A[position of max] = temp; } }

This piece of code denes a subroutine template. If you remove the rst line, template<class ItemType>, and substitute the word int for the word ItemType in the rest of the template, you get a subroutine for sorting arrays of ints. (Even though it says class ItemType, you can actually substitute any type for ItemType, including the primitive types.) If you substitute string for ItemType, you get a subroutine for sorting arrays of strings. This is pretty much what the compiler does with the template. If your program says sort(list,10) where list is an array of ints, the compiler uses the template to generate a subroutine for sorting arrays of ints. If you say sort(cards,10) where cards is an array of objects of type Card, then the compiler generates a subroutine for sorting arrays of Cards. At least, it tries to. The template uses the > operator to compare values. If this operator is dened for values of type Card, then the compiler will successfully use the template to generate a subroutine for sorting cards. If > is not dened for Cards, then the compiler will failbut this will happen at compile time, not, as in Smalltalk, at run time where it would make the program crash. In addition to subroutine templates, C++ also has templates for making classes. If you write a template for a binary tree class, you can use it to generate classes for binary trees of ints, binary trees of strings, binary trees of dates, and so onall from one template. The most recent version of C++ comes with a large number of pre-written templates called the Standard Template Library or STL. The STL is quite complex. Many people would say that its much too complex. But it is also one of the most interesting features of C++.

488

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

10.1.3

Generic Programming in Java

Javas generic programming features have gone through several stages of development. The original version of Java had just a few generic data structure classes, such as Vector, that could hold values of type Object. Java version 1.2 introduced a much larger group of generics that followed the same basic model. These generic classes and interfaces as a group are known as the Java Collection Framework . The ArrayList class is part of the Collection Framework. The original Collection Framework was closer in spirit to Smalltalk than it was to C++, since a data structure designed to hold Objects can be used with objects of any type. Unfortunately, as in Smalltalk, the result is a category of errors that show up only at run time, rather than at compile time. If a programmer assumes that all the items in a data structure are strings and tries to process those items as strings, a run-time error will occur if other types of data have inadvertently been added to the data structure. In Java, the error will most likely occur when the program retrieves an Object from the data structure and tries to type-cast it to type String. If the object is not actually of type String, the illegal type-cast will throw an error of type ClassCastException. Java 5.0 introduced parameterized types, such as ArrayList<String>. This made it possible to create generic data structures that can be type-checked at compile time rather than at run time. With these data structures, type-casting is not necessary, so ClassCastExceptions are avoided. The compiler will detect any attempt to add an object of the wrong type to the data structure; it will report a syntax error and will refuse to compile the program. In Java 5.0, all of the classes and interfaces in the Collection Framework, and even some classes that are not part of that framework, have been parameterized. Javas parameterized classes are similar to template classes in C++ (although the implementation is very dierent), and their introduction moves Javas generic programming model closer to C++ and farther from Smalltalk. In this chapter, I will use the parameterized types almost exclusively, but you should remember that their use is not mandatory. It is still legal to use a parameterized class as a non-parameterized type, such as a plain ArrayList. Note that there is a signicant dierence between parameterized classes in Java and template classes in C++. A template class in C++ is not really a class at allits a kind of factory for generating classes. Every time the template is used with a new type, a new compiled class is created. With a Java parameterized class, there is only one compiled class le. For example, there is only one compiled class le, ArrayList.class, for the parameterized class ArrayList. The parameterized types ArrayList<String> and ArrayList<Integer> both use the same compiled class le, as does the plain ArrayList type. The type parameterString or Integer just tells the compiler to limit the type of object that can be stored in the data structure. The type parameter has no eect at run time and is not even known at run time. The type information is said to be erased at run time. This type erasuer introduces a certain amount of weirdness. For example, you cant test if (list instanceof ArrayList<String>) because the instanceof operator is evaluated at run time, and at run time only the plain ArrayList exists. Even worse, you cant create an array that has base type ArrayList<String> by using the new operator, as in new ArrayList<String>[N]. This is because the new operator is evaluated at run time, and at run time there is no such thing as ArrayList<String>; only the non-parameterized type ArrayList exists at run time. Fortunately, most programmers dont have to deal with such problems, since they turn up only in fairly advanced programming. Most people who use the Java Collection Framework will not encounter them, and they will get the benets of type-safe generic programming with little diculty.

10.1. GENERIC PROGRAMMING

489

10.1.4

The Java Collection Framework

Javas generic data structures can be divided into two categories: collections and maps. A collection is more or less what it sound like: a collection of objects. A map associates objects in one set with objects in another set in the way that a dictionary associates denitions with words or a phone book associates phone numbers with names. A map is similar to what I called an association list in Subsection 7.4.2. In Java, collections and maps are represented by the parameterized interfaces Collection<T> and Map<T,S>. Here, T and S stand for any type except for the primitive types. Map<T,S> is the rst example we have seen where there are two type parameters, T and S; we will not deal further with this possibility until we look at maps more closely in Section 10.3. In this section and the next, we look at collections only. There are two types of collections: lists and sets. A list is a collection in which the objects are arranged in a linear sequence. A list has a rst item, a second item, and so on. For any item in the list, except the last, there is an item that directly follows it. The dening property of a set is that no object can occur more than once in a set; the elements of a set are not necessarily thought of as being in any particular order. The ideas of lists and sets are represented as parameterized interfaces List<T> and Set<T>. These are sub-interfaces of Collection<T>. That is, any object that implements the interface List<T> or Set<T> automatically implements Collection<T> as well. The interface Collection<T> species general operations that can be applied to any collection at all. List<T> and Set<T> add additional operations that are appropriate for lists and sets respectively. Of course, any actual object that is a collection, list, or set must belong to a concrete class that implements the corresponding interface. For example, the class ArrayList<T> implements the interface List<T> and therefore also implements Collection<T>. This means that all the methods that are dened in the list and collection interfaces can be used with, for example, an ArrayList<String> object. We will look at various classes that implement the list and set interfaces in the next section. But before we do that, well look briey at some of the general operations that are available for all collections.


The interface Collection<T> species methods for performing some basic operations on any collection of objects. Since collection is a very general concept, operations that can be applied to all collections are also very general. They are generic operations in the sense that they can be applied to various types of collections containing various types of objects. Suppose that coll is an object that implements the interface Collection<T> (for some specic non-primitive type T ). Then the following operations, which are specied in the interface Collection<T>, are dened for coll: coll.size() returns an int that gives the number of objects in the collection. coll.isEmpty() returns a boolean value which is true if the size of the collection is 0. coll.clear() removes all objects from the collection. coll.add(tobject) adds tobject to the collection. The parameter must be of type T ; if not, a syntax error occurs at compile time. This method returns a boolean value which tells you whether the operation actually modied the collection. For example, adding an object to a Set has no eect if that object was already in the set. coll.contains(object) returns a boolean value that is true if object is in the collection. Note that object is not required to be of type T, since it makes sense to check whether object is in the collection, no matter what type object has. (For testing

490

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES equality, null is considered to be equal to itself. The criterion for testing non-null objects for equality can dier from one kind of collection to another; see Subsection 10.1.6, below.) coll.remove(object) removes object from the collection, if it occurs in the collection, and returns a boolean value that tells you whether the object was found. Again, object is not required to be of type T. coll.containsAll(coll2) returns a boolean value that is true if every object in coll2 is also in the coll. The parameter can be any collection. coll.addAll(coll2) adds all the objects in coll2 to coll. The parameter, coll2, can be any collection of type Collection<T>. However, it can also be more general. For example, if T is a class and S is a sub-class of T, then coll2 can be of type Collection<S>. This makes sense because any object of type S is automatically of type T and so can legally be added to coll. coll.removeAll(coll2) removes every object from coll that also occurs in the collection coll2. coll2 can be any collection. coll.retainAll(coll2) removes every object from coll that does not occur in the collection coll2. It retains only the objects that do occur in coll2. coll2 can be any collection. coll.toArray() returns an array of type Object[ ] that contains all the items in the collection. The return value can be type-cast to another array type, if appropriate. Note that the return type is Object[ ], not T[ ]! However, you can type-cast the return value to a more specic type. For example, if you know that all the items in coll are of type String, then (String[])coll.toArray() gives you an array of Strings containing all the strings in the collection.

Since these methods are part of the Collection<T> interface, they must be dened for every object that implements that interface. There is a problem with this, however. For example, the size of some kinds of collection cannot be changed after they are created. Methods that add or remove objects dont make sense for these collections. While it is still legal to call the methods, an exception will be thrown when the call is evaluated at run time. The type of the exception is UnsupportedOperationException. Furthermore, since Collection<T> is only an interface, not a concrete class, the actual implementation of the method is left to the classes that implement the interface. This means that the semantics of the methods, as described above, are not guaranteed to be valid for all collection objects; they are valid, however, for classes in the Java Collection Framework. There is also the question of eciency. Even when an operation is dened for several types of collections, it might not be equally ecient in all cases. Even a method as simple as size() can vary greatly in eciency. For some collections, computing the size() might involve counting the items in the collection. The number of steps in this process is equal to the number of items. Other collections might have instance variables to keep track of the size, so evaluating size() just means returning the value of a variable. In this case, the computation takes only one step, no matter how many items there are. When working with collections, its good to have some idea of how ecient operations are and to choose a collection for which the operations that you need can be implemented most eciently. Well see specic examples of this in the next two sections.

10.1. GENERIC PROGRAMMING

491

10.1.5

Iterators and for-each Loops

The interface Collection<T> denes a few basic generic algorithms, but suppose you want to write your own generic algorithms. Suppose, for example, you want to do something as simple as printing out every item in a collection. To do this in a generic way, you need some way of going through an arbitrary collection, accessing each item in turn. We have seen how to do this for specic data structures: For an array, you can use a for loop to iterate through all the array indices. For a linked list, you can use a while loop in which you advance a pointer along the list. For a binary tree, you can use a recursive subroutine to do an inx traversal. Collections can be represented in any of these forms and many others besides. With such a variety of traversal mechanisms, how can we even hope to come up with a single generic method that will work for collections that are stored in wildly dierent forms? This problem is solved by iterators. An iterator is an object that can be used to traverse a collection. Dierent types of collections have iterators that are implemented in dierent ways, but all iterators are used in the same way. An algorithm that uses an iterator to traverse a collection is generic, because the same technique can be applied to any type of collection. Iterators can seem rather strange to someone who is encountering generic programming for the rst time, but you should understand that they solve a dicult problem in an elegant way. The interface Collection<T> denes a method that can be used to obtain an iterator for any collection. If coll is a collection, then coll.iterator() returns an iterator that can be used to traverse the collection. You should think of the iterator as a kind of generalized pointer that starts at the beginning of the collection and can move along the collection from one item to the next. Iterators are dened by a parameterized interface named Iterator<T>. If coll implements the interface Collection<T> for some specic type T, then coll.iterator() returns an iterator of type Iterator<T>, with the same type T as its type parameter. The interface Iterator<T> denes just three methods. If iter refers to an object that implements Iterator<T>, then we have: iter.next() returns the next item, and advances the iterator. The return value is of type T. This method lets you look at one of the items in the collection. Note that there is no way to look at an item without advancing the iterator past that item. If this method is called when no items remain, it will throw a NoSuchElementException. iter.hasNext() returns a boolean value telling you whether there are more items to be processed. In general, you should test this before calling iter.next(). iter.remove() if you call this after calling iter.next(), it will remove the item that you just saw from the collection. Note that this method has no parameter. It removes the item that was most recently returned by iter.next(). This might produce an UnsupportedOperationException, if the collection does not support removal of items. Using iterators, we can write code for printing all the items in any collection. Suppose, for example, that coll is of type Collection<String>. In that case, the value returned by coll.iterator() is of type Iterator<String>, and we can say:
Iterator<String> iter; iter = coll.iterator(); while ( iter.hasNext() ) { String item = iter.next(); System.out.println(item); } // Declare the iterater variable. // Get an iterator for the collection. // Get the next item.

492

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

The same general form will work for other types of processing. For example, the following code will remove all null values from any collection of type Collection<JButton> (as long as that collection supports removal of values):
Iterator<JButton> iter = coll.iterator(): while ( iter.hasNext() ) { JButton item = iter.next(); if (item == null) iter.remove(); }

(Note, by the way, that when Collection<T>, Iterator<T>, or any other parameterized type is used in actual code, they are always used with actual types such as String or JButton in place of the formal type parameter T. An iterator of type Iterator<String> is used to iterate through a collection of Strings; an iterator of type Iterator<JButton> is used to iterate through a collection of JButtons; and so on.) An iterator is often used to apply the same operation to all the elements in a collection. In many cases, its possible to avoid the use of iterators for this purpose by using a for-each loop. The for-each loop was discussed in Subsection 3.4.4 for use with enumerated types and in Subsection 7.2.2 for use with arrays. A for-each loop can also be used to iterate through any collection. For a collection coll of type Collection<T>, a for-each loop takes the form:
for ( T x : coll ) { // "for each object x, of type T, in coll" // process x }

Here, x is the loop control variable. Each object in coll will be assigned to x in turn, and the body of the loop will be executed for each object. Since objects in coll are of type T, x is declared to be of type T. For example, if namelist is of type Collection<String>, we can print out all the names in the collection with:
for ( String name : namelist ) { System.out.println( name ); }

This for-each loop could, of course, be written as a while loop using an iterator, but the for-each loop is much easier to follow.

10.1.6

Equality and Comparison

There are several methods in the collection interface that test objects for equality. For example, the methods coll.contains(object) and coll.remove(object) look for an item in the collection that is equal to object. However, equality is not such a simple matter. The obvious technique for testing equalityusing the == operatordoes not usually give a reasonable answer when applied to objects. The == operator tests whether two objects are identical in the sense that they share the same location in memory. Usually, however, we want to consider two objects to be equal if they represent the same value, which is a very dierent thing. Two values of type String should be considered equal if they contain the same sequence of characters. The question of whether those characters are stored in the same location in memory is irrelevant. Two values of type Date should be considered equal if they represent the same time. The Object class denes the boolean-valued method equals(Object) for testing whether one object is equal to another. This method is used by many, but not by all, collection classes for deciding whether two objects are to be considered the same. In the Object class,

10.1. GENERIC PROGRAMMING

493

obj1.equals(obj2) is dened to be the same as obj1 == obj2. However, for most sub-classes of Object, this denition is not reasonable, and it should be overridden. The String class, for example, overrides equals() so that for a String str, str.equals(obj) if obj is also a String and obj contains the same sequence of characters as str. If you write your own class, you might want to dene an equals() method in that class to get the correct behavior when objects are tested for equality. For example, a Card class that will work correctly when used in collections could be dened as:
public class Card { int suit; // Class to represent playing cards.

// Number from 0 to 3 that codes for the suit -// spades, diamonds, clubs or hearts. int value; // Number from 1 to 13 that represents the value. public boolean equals(Object obj) { try { Card other = (Card)obj; // Type-cast obj to a Card. if (suit == other.suit && value == other.value) { // The other card has the same suit and value as // this card, so they should be considered equal. return true; } else return false; } catch (Exception e) { // This will catch the NullPointerException that occurs if obj // is null and the ClassCastException that occurs if obj is // not of type Card. In these cases, obj is not equal to // this Card, so return false. return false; } } . . // other methods and constructors . }

Without the equals() method in this class, methods such as contains() and remove() in the interface Collection<Card> will not work as expected. A similar concern arises when items in a collection are sorted. Sorting refers to arranging a sequence of items in ascending order, according to some criterion. The problem is that there is no natural notion of ascending order for arbitrary objects. Before objects can be sorted, some method must be dened for comparing them. Objects that are meant to be compared should implement the interface java.lang.Comparable. In fact, Comparable is dened as a parameterized interface, Comparable<T>, which represents the ability to be compared to an object of type T. The interface Comparable<T> denes one method:
public int compareTo( T obj )

The value returned by obj1.compareTo(obj2) should be negative if and only if obj1 comes before obj2, when the objects are arranged in ascending order. It should be positive if and only if obj1 comes after obj2. A return value of zero means that the objects are considered

494

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

to be the same for the purposes of this comparison. This does not necessarily mean that the objects are equal in the sense that obj1.equals(obj2) is true. For example, if the objects are of type Address, representing mailing addresses, it might be useful to sort the objects by zip code. Two Addresses are considered the same for the purposes of the sort if they have the same zip codebut clearly that would not mean that they are the same address. The String class implements the interface Comparable<String> and denes compareTo in a reasonable way (and in this case, the return value of compareTo is zero if and only if the two strings that are being compared are equal). If you dene your own class and want to be able to sort objects belonging to that class, you should do the same. For example:
/** * Represents a full name consisting of a first name and a last name. */ public class FullName implements Comparable<FullName> { private String firstName, lastName; // Non-null first and last names.

public FullName(String first, String last) { // Constructor. if (first == null || last == null) throw new IllegalArgumentException("Names must be non-null."); firstName = first; lastName = last; } public boolean equals(Object obj) { try { FullName other = (FullName)obj; // Type-cast obj to type FullName return firstName.equals(other.firstName) && lastName.equals(other.lastName); } catch (Exception e) { return false; // if obj is null or is not of type FirstName } } public int compareTo( FullName other ) { if ( lastName.compareTo(other.lastName) < 0 ) { // If lastName comes before the last name of // the other object, then this FullName comes // before the other FullName. Return a negative // value to indicate this. return -1; } if ( lastName.compareTo(other.lastName) > 0 ) { // If lastName comes after the last name of // the other object, then this FullName comes // after the other FullName. Return a positive // value to indicate this. return 1; } else { // Last names are the same, so base the comparison on // the first names, using compareTo from class String. return firstName.compareTo(other.firstName); }

10.1. GENERIC PROGRAMMING


} . . // other methods . }

495

(I nd it a little odd that the class here is declared as class FullName implements Comparable<FullName>, with FullName repeated as a type parameter in the name of the interface. However, it does make sense. It means that we are going to compare objects that belong to the class FullName to other objects of the same type. Even though this is the only reasonable thing to do, that fact is not obvious to the Java compilerand the type parameter in Comparable<FullName> is there for the compiler.) There is another way to allow for comparison of objects in Java, and that is to provide a separate object that is capable of making the comparison. The object must implement the interface Comparator<T>, where T is the type of the objects that are to be compared. The interface Comparator<T> denes the method:
public int compare( T obj1, T obj2 )

This method compares two objects of type T and returns a value that is negative, or positive, or zero, depending on whether obj1 comes before obj2, or comes after obj2, or is considered to be the same as obj2 for the purposes of this comparison. Comparators are useful for comparing objects that do not implement the Comparable interface and for dening several dierent orderings on the same collection of objects. In the next two sections, well see how Comparable and Comparator are used in the context of collections and maps.

10.1.7

Generics and Wrapper Classes

As noted above, Javas generic programming does not apply to the primitive types, since generic data structures can only hold objects, while values of primitive type are not objects. However, the wrapper classes that were introduced in Subsection 5.3.2 make it possible to get around this restriction to a great extent. Recall that each primitive type has an associated wrapper class: class Integer for type int, class Boolean for type boolean, class Character for type char, and so on. An object of type Integer contains a value of type int. The object serves as a wrapper for the primitive type value, which allows it to be used in contexts where objects are required, such as in generic data structures. For example, a list of Integers can be stored in a variable of type ArrayList<Integer>, and interfaces such as Collection<Integer> and Set<Integer> are dened. Furthermore, class Integer denes equals(), compareTo(), and toString() methods that do what you would expect (that is, that compare and write out the corresponding primitive type values in the usual way). Similar remarks apply for all the wrapper classes. Recall also that Java does automatic conversions between a primitive type and the corresponding wrapper type. (These conversions, which are called autoboxing and unboxing, were also introduced in Subsection 5.3.2.) This means that once you have created a generic data structure to hold objects belonging to one of the wrapper classes, you can use the data structure pretty much as if it actually contained primitive type values. For example, if numbers is a variable of type Collection<Integer>, it is legal to call numbers.add(17) or numbers.remove(42). You cant literally add the primitive type value 17 to numbers, but Java will automatically convert the 17 to the corresponding wrapper object, new Integer(17), and the wrapper object

496

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

will be added to the collection. (The creation of the object does add some time and memory overhead to the operation, and you should keep that in mind in situations where eciency is important. An array of int is more ecient than an ArrayList<Integer>.)

10.2

Lists and Sets

In the previous section, we looked at the general properties of collection classes in Java.

In this section, we look at some specic collection classes and how to use them. These classes can be divided into two categories: lists and sets. A list consists of a sequence of items arranged in a linear order. A list has a denite order, but is not necessarily sorted into ascending order. A set is a collection that has no duplicate entries. The elements of a set might or might not be arranged into some denite order.

10.2.1

ArrayList and LinkedList

There are two obvious ways to represent a list: as a dynamic array and as a linked list. Weve encountered these already in Section 7.3 and Section 9.2. Both of these options are available in generic form as the collection classes java.util.ArrayList and java.util.LinkedList. These classes are part of the Java Collection Framework. Each implements the interface List<T>, and therefore the interface Collection<T>. An object of type ArrayList<T> represents an ordered sequence of objects of type T, stored in an array that will grow in size whenever necessary as new items are added. An object of type LinkedList<T> also represents an ordered sequence of objects of type T, but the objects are stored in nodes that are linked together with pointers. Both list classes support the basic list operations that are dened in the interface List<T>, and an abstract data type is dened by its operations, not by its representation. So why two classes? Why not a single List class with a single representation? The problem is that there is no single representation of lists for which all list operations are ecient. For some operations, linked lists are more ecient than arrays. For others, arrays are more ecient. In a particular application of lists, its likely that only a few operations will be used frequently. You want to choose the representation for which the frequently used operations will be as ecient as possible. Broadly speaking, the LinkedList class is more ecient in applications where items will often be added or removed at the beginning of the list or in the middle of the list. In an array, these operations require moving a large number of items up or down one position in the array, to make a space for a new item or to ll in the hole left by the removal of an item. In terms of asymptotic analysis (Section 8.6), adding an element at the beginning or in the middle of an array has run time (n), where n is the number of items in the array. In a linked list, nodes can be added or removed at any position by changing a few pointer values, an operation that has run time (1). That is, the operation takes only some constant amount of time, independent of how many items are in the list. On the other hand, the ArrayList class is more ecient when random access to items is required. Random access means accessing the k-th item in the list, for any integer k. Random access is used when you get or change the value stored at a specied position in the list. This is trivial for an array, with run time (1). But for a linked list it means starting at the beginning of the list and moving from node to node along the list for k steps, an operation that has run time (n).

10.2. LISTS AND SETS

497

Operations that can be done eciently for both types of lists include sorting and adding an item at the end of the list. All lists implement the methods from interface Collection<T> that were discussed in Subsection 10.1.4. These methods include size(), isEmpty(), add(T), remove(Object), and clear(). The add(T) method adds the object at the end of the list. The remove(Object) method involves rst nding the object, which is not very ecient for any list since it involves going through the items in the list from beginning to end until the object is found. The interface List<T> adds some methods for accessing list items according to their numerical positions in the list. Suppose that list is an object of type List<T>. Then we have the methods: list.get(index) returns the object of type T that is at position index in the list, where index is an integer. Items are numbered 0, 1, 2, . . . , list.size()-1. The parameter must be in this range, or an IndexOutOfBoundsException is thrown. list.set(index,obj) stores the object obj at position number index in the list, replacing the object that was there previously. The object obj must be of type T. This does not change the number of elements in the list or move any of the other elements. list.add(index,obj) inserts an object obj into the list at position number index, where obj must be of type T. The number of items in the list increases by one, and items that come after position index move up one position to make room for the new item. The value of index must be in the range 0 to list.size(), inclusive. If index is equal to list.size(), then obj is added at the end of the list. list.remove(index) removes the object at position number index, and returns that object as the return value of the method. Items after this position move up one space in the list to ll the hole, and the size of the list decreases by one. The value of index must be in the range 0 to list.size()-1 list.indexOf(obj) returns an int that gives the position of obj in the list, if it occurs. If it does not occur, the return value is -1. The object obj can be of any type, not just of type T. If obj occurs more than once in the list, the index of the rst occurrence is returned. These methods are dened both in class ArrayList<T> and in class LinkedList<T>, although some of themget and setare only ecient for ArrayLists. The class LinkedList<T> adds a few additional methods, which are not dened for an ArrayList. If linkedlist is an object of type LinkedList<T>, then we have linkedlist.getFirst() returns the object of type T that is the rst item in the list. The list is not modied. If the list is empty when the method is called, an exception of type NoSuchElementException is thrown (the same is true for the next three methods as well). linkedlist.getLast() returns the object of type T that is the last item in the list. The list is not modied. linkedlist.removeFirst() removes the rst item from the list, and returns that object of type T as its return value. linkedlist.removeLast() removes the last item from the list, and returns that object of type T as its return value. linkedlist.addFirst(obj) adds the obj, which must be of type T, to the beginning of the list.

498

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES linkedlist.addLast(obj) adds the object obj, which must be of type T, to the end of the list. (This is exactly the same as linkedlist.add(obj) and is apparently dened just to keep the naming consistent.)

These methods are apparently dened to make it easy to use a LinkedList as if it were a stack or a queue. (See Section 9.3.) For example, we can use a LinkedList as a queue by adding items onto one end of the list (using the addLast() method) and removing them from the other end (using the removeFirst() method). If list is an object of type List<T>, then the method list.iterator(), dened in the interface Collection<T>, returns an Iterator that can be used to traverse the list from beginning to end. However, for Lists, there is a special type of Iterator, called a ListIterator, which oers additional capabilities. ListIterator<T> is an interface that extends the interface Iterator<T>. The method list.listIterator() returns an object of type ListIterator<T>. A ListIterator has the usual Iterator methods, hasNext(), next(), and remove(), but it also has methods hasPrevious(), previous(), and add(obj) that make it possible to move backwards in the list and to add an item at the current position of the iterator. To understand how these work, its best to think of an iterator as pointing to a position between two list elements, or at the beginning or end of the list. In this diagram, the items in a list are represented by squares, and arrows indicate the possible positions of an iterator:

If iter is of type ListIterator<T>, then iter.next() moves the iterator one space to the right along the list and returns the item that the iterator passes as it moves. The method iter.previous() moves the iterator one space to the left along the list and returns the item that it passes. The method iter.remove() removes an item from the list; the item that is removed is the item that the iterator passed most recently in a call to either iter.next() or iter.previous(). There is also a method iter.add(obj) that adds the specied object to the list at the current position of the iterator (where obj must be of type T ). This can be between two existing items or at the beginning of the list or at the end of the list. (By the way, the lists that are used in class LinkedList<T> are doubly linked lists. That is, each node in the list contains two pointersone to the next node in the list and one to the previous node. This makes it possible to eciently implement both the next() and previous() methods of a ListIterator. Also, to make the addLast() and getLast() methods of a LinkedList ecient, the class LinkedList<T> includes an instance variable that points to the last node in the list.) As an example of using a ListIterator, suppose that we want to maintain a list of items that is always sorted into increasing order. When adding an item to the list, we can use a ListIterator to nd the position in the list where the item should be added. Once the position has been found, we use the same list iterator to place the item in that position. The idea is to start at the beginning of the list and to move the iterator forward past all the items that are smaller than the item that is being inserted. At that point, the iterators add() method can be used to insert the item. To be more denite, suppose that stringList is a variable of type List<String>. Assume that that the strings that are already in the list are stored in ascending order and that newItem is a string that we would like to insert into the list. The following code will place newItem in the list in its correct position, so that the modied list is still in ascending order:

10.2. LISTS AND SETS


ListIterator<String> iter = stringList.listIterator(); // // // // // Move the iterator so that it points to the position where newItem should be inserted into the list. If newItem is bigger than all the items in the list, then the while loop will end when iter.hasNext() becomes false, that is, when the iterator has reached the end of the list.

499

while (iter.hasNext()) { String item = iter.next(); if (newItem.compareTo(item) <= 0) { // newItem should come BEFORE item in the list. // Move the iterator back one space so that // it points to the correct insertion point, // and end the loop. iter.previous(); break; } } iter.add(newItem);

Here, stringList might be of type ArrayList<String> or of type LinkedList<String>. The algorithm that is used to insert newItem into the list will be about equally ecient for both types of lists, and it will even work for other classes that implement the interface List<String>. You would probably nd it easier to design an insertion algorithm that uses array-like indexing with the methods get(index) and add(index,obj). However, that algorithm would be horribly inecient for LinkedLists because random access is so inecient for linked lists. (By the way, the insertion algorithm works when the list is empty. It might be useful for you to think about why this is true.)

10.2.2

Sorting

Sorting a list is a fairly common operation, and there should really be a sorting method in the List interface. There is not, presumably because it only makes sense to sort lists of certain types of objects, but methods for sorting lists are available as static methods in the class java.util.Collections. This class contains a variety of static utility methods for working with collections. The methods are generic; that is, they will work for collections of objects of various types. Suppose that list is of type List<T>. The command
Collections.sort(list);

can be used to sort the list into ascending order. The items in the list should implement the interface Comparable<T> (see Subsection 10.1.6). The method Collections.sort() will work, for example, for lists of String and for lists of any of the wrapper classes such as Integer and Double. There is also a sorting method that takes a Comparator as its second argument:
Collections.sort(list,comparator);

In this method, the comparator will be used to compare the items in the list. As mentioned in the previous section, a Comparator is an object that denes a compare() method that can be used to compare two objects. Well see an example of using a Comparator in Section 10.4. The sorting method that is used by Collections.sort() is the so-called merge sort algorithm, which has both worst-case and average-case run times that are (n*log(n)) for

500

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

a list of size n. Although the average run time for MergeSort is a little slower than that of QuickSort, its worst-case performance is much better than QuickSorts. (QuickSort was covered in Subsection 9.1.3.) MergeSort also has a nice property called stability that we will encounter at the end of Subsection 10.4.3. The Collections class has at least two other useful methods for modifying lists. Collections.shuffle(list) will rearrange the elements of the list into a random order. Collections.reverse(list) will reverse the order of the elements, so that the last element is moved to the beginning of the list, the next-to-last element to the second position, and so on. Since an ecient sorting method is provided for Lists, there is no need to write one yourself. You might be wondering whether there is an equally convenient method for standard arrays. The answer is yes. Array-sorting methods are available as static methods in the class java.util.Arrays. The statement
Arrays.sort(A);

will sort an array, A, provided either that the base type of A is one of the primitive types (except boolean) or that A is an array of Objects that implement the Comparable interface. You can also sort part of an array. This is important since arrays are often only partially lled. The command:
Arrays.sort(A,fromIndex,toIndex);

sorts the elements A[fromIndex], A[fromIndex+1], . . . , A[toIndex-1] into ascending order. You can use Arrays.sort(A,0,N-1) to sort a partially lled array which has items in the rst N positions. Java does not support generic programming for primitive types. In order to implement the command Arrays.sort(A), the Arrays class contains eight methods: one method for arrays of Objects and one method for each of the primitive types byte, short, int, long, oat, double, and char.

10.2.3

TreeSet and HashSet

A set is a collection of objects in which no object occurs more than once. Sets implement all the methods in the interface Collection<T>, but do so in a way that ensures that no element occurs twice in the set. For example, if set is an object of type Set<T>, then set.add(obj) will have no eect on the set if obj is already an element of the set. Java has two classes that implement the interface Set<T>: java.util.TreeSet and java.util.HashSet. In addition to being a Set, a TreeSet has the property that the elements of the set are arranged into ascending sorted order. An Iterator for a TreeSet will always visit the elements of the set in ascending order. A TreeSet cannot hold arbitrary objects, since there must be a way to determine the sorted order of the objects it contains. Ordinarily, this means that the objects in a set of type TreeSet<T> should implement the interface Comparable<T> and that obj1.compareTo(obj2) should be dened in a reasonable way for any two objects obj1 and obj2 in the set. Alternatively, an object of type Comparator<T> can be provided as a parameter to the constructor when the TreeSet is created. In that case, the compareTo() method of the Comparator will be used to compare objects that are added to the set. A TreeSet does not use the equals() method to test whether two objects are the same. Instead, it uses the compareTo() method. This can be a problem. Recall from Subsection 10.1.6 that compareTo() can consider two objects to be the same for the purpose of the comparison

10.2. LISTS AND SETS

501

even though the objects are not equal. For a TreeSet, this means that only one of those objects can be in the set. For example, if the TreeSet contains mailing addresses and if the compareTo() method for addresses just compares their zip codes, then the set can contain only one address in each zip code. Clearly, this is not right! But that only means that you have to be aware of the semantics of TreeSets, and you need to make sure that compareTo() is dened in a reasonable way for objects that you put into a TreeSet. This will be true, by the way, for Strings, Integers, and many other built-in types, since the compareTo() method for these types considers two objects to be the same only if they are actually equal. In the implementation of a TreeSet, the elements are stored in something similar to a binary sort tree. (See Subsection 9.4.2.) However, the data structure that is used is balanced in the sense that all the leaves of the tree are at about the same distance from the root of the tree. This ensures that all the basic operationsinserting, deleting, and searchingare ecient, with worst-case run time (log(n)), where n is the number of items in the set. The fact that a TreeSet sorts its elements and removes duplicates makes it very useful in some applications. Exercise 7.6 asked you to write a program that would read a le and output an alphabetical list of all the words that occurred in the le, with duplicates removed. The words were to be stored in an ArrayList, so it was up to you to make sure that the list was sorted and contained no duplicates. The same task can be programmed much more easily using a TreeSet instead of a list. A TreeSet automatically eliminates duplicates, and an iterator for the set will automatically visit the items in the set in sorted order. An algorithm for the program, using a TreeSet, would be:
TreeSet<String> words = new TreeSet<String>(); while there is more data in the input file: Let word = the next word from the file Convert word to lower case words.add(word) // Adds the word only if not already present. Iterator<String> iter = words.iterator(); while (iter.hasNext()): Output iter.next() // Prints the words in sorted order.

If you would like to see a complete, working program, you can nd it in the le WordListWithTreeSet.java. As another example, suppose that coll is any Collection of Strings. (This would also work for any other type for which compareTo() is properly dened). We can use a TreeSet to sort the items of coll and remove the duplicates simply by saying:
TreeSet<String> set = new TreeSet(); set.addAll(coll);

The second statement adds all the elements of the collection to the set. Since its a Set, duplicates are ignored. Since its a TreeSet, the elements of the set are sorted. If you would like to have the data in some other type of data structure, its easy to copy the data from the set. For example, to place the answer in an ArrayList, you could say:
TreeSet<String> set = new TreeSet<String>(); set.addAll(coll); ArrayList<String> list = new ArrayList<String>(); list.addAll(set);

502

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

Now, in fact, every one of Javas collection classes has a constructor that takes a Collection as an argument. All the items in that Collection are added to the new collection when it is created. So, if coll is of type Collection<String>, then new TreeSet<String>(coll) creates a TreeSet that contains the same elements as coll, but with duplicates removed and in sorted order. This means that we can abbreviate the four lines in the above example to the single command:
ArrayList<String> list = new ArrayList<String>( new TreeSet<String>(coll) );

This makes a sorted list of the elements of coll with no duplicates. Although the repeated type parameter, <String>, makes it a bit ugly to look at, this is still a nice example of the power of generic programming. (It seems, by the way, there is no equally easy way to get a sorted list with duplicates. To do this, we would need something like a TreeSet that allows duplicates. The C++ programming language has such a thing and refers to it as a multiset. The Smalltalk language has something similar and calls it a bag . Java, for the time being at least, lacks this data type.)


A HashSet stores its elements in a hash table, a type of data structure that I will discuss in the next section. The operations of nding, adding, and removing elements are implemented very eciently in hash tables, even more so than for TreeSets. The elements of a HashSet are not stored in any particular order, and so do not need to implement the Comparable interface. The equals() method is used to determine whether two objects in a HashSet are to be considered the same. An Iterator for a HashSet will visit its elements in what seems to be a completely arbitrary order, and its possible for the order to change completely when a new element is added. Use a HashSet instead of a TreeSet when the elements it contains are not comparable, or when the order is not important, or when the small advantage in eciency is important


A note about the mathematics of sets: In mathematical set theory, the items in a set are called members or elements of that set. Important operations include adding an element to a set, removing an element from a set, and testing whether a given entity is an element of a set. Operations that can be performed on two sets include union, intersection, and set dierence. All these operations are dened in Java for objects of type Set, but with dierent names. Suppose that A and B are Sets. Then: A.add(x) adds the element x to the set A. A.remove(x) removes the element x from the set A. A.contains(x) tests whether x is an element of the set A. A.addAll(B) computes the union of A and B. A.retainAll(B) computes the intersection of A and B. A.removeAll(B) computes the set dierence, A - B. There are of course, dierences between mathematical sets and sets in Java. Most important, perhaps, sets in Java must be nite, while in mathematics, most of the fun in set theory comes from working with innity. In mathematics, a set can contain arbitrary elements, while in Java, a set of type Set<T> can only contain elements of type T. The operation A.addAll(B) acts by modifying the value of A, while in mathematics the operation A union B computes a new set, without changing the value of A or B. See Exercise 10.2 for an example of mathematical set operations in Java.

10.2. LISTS AND SETS

503

10.2.4

EnumSet

Enumerated types (or enums) were introduced in Subsection 2.3.3. Suppose that E is an enumerated type. Since E is a class, it is possible to create objects of type TreeSet<E> and HashSet<E>. However, because enums are so simple, trees and hash tables are not the most ecient implementation for sets of enumerated type values. Java provides the class java.util.EnumSet as an alternative way to create such sets. Sets of enumerated type values are created using static methods in the class EnumSet. For example, if e1, e2, and e3 are values belonging to the enumerated type E, then the method
EnumSet.of( e1, e2, e3 )

creates and returns a set of type EnumSet<E> that contains exactly the elements e1, e2, and e3. The set implements the interface Set<E>, so all the usual set and collection operations are available. The implementation of these operations is very ecient. The implementation uses what is called a bit vector. A bit is a quantity that has only two possible values, zero and one. A set of type EnumSet<E> is represented by a bit vector that contains one bit for each enum constant in the enumerated type E ; the bit corresponding to the enum constant e is 1 if e is a member of the set and is 0 if e is not a member of the set. The bit vectors for two sets of type EnumSet<E> can be very easily combined to represent such operations as the union and intersection of two sets. The bit vector representation is feasible for EnumSets, but not for other sets in Java, because an enumerated type contains only a small nite number of enum constants. (Java actually has a class named BitSet that uses bit vectors to represent nite sets of non-negative integers, but this class is not part of the Java Collection Framework and does not implement the Set interface.) The function EnumSet.of can be used with any positive number of parameters. All the parameters must be values of the same enumerated type. Null values are not allowed. An EnumSet cannot contain the value nullany attempt to add null to an EnumSet will result in a NullPointerException. There is also a function EnumSet.range(e1,e2) that returns an EnumSet consisting of the enum constants between e1 and e2, inclusive. The ordering of enum constants is the same as the order in which they are listed in the denition of the enum. In EnumSet.range(e1,e2), e1 and e2 must belong to the same enumerated type, and e1 must be less than or equal to e2. If E is an enum, then EnumSet.allOf(E.class) is a set that contains all values of type E. EnumSet.noneOf(E.class) is an empty set, a set of type EnumSet<E> that contains no elements at all. Note that in EnumSet.allOf(E.class) and EnumSet.noneOf(E.class), the odd-looking parameter represents the enumerated type class itself. If eset is a set of type EnumSet<E>, then EnumSet.complementOf(eset) is a set that contains all the enum constants of E that are not in eset. As an example, consider a program that keeps schedules of events. The program must keep track of repeating events that happen on specied days of the week. For example, an event might take place only on weekdays, or only on Wednesdays and Fridays. In other words, associated with the event is the set of days of the week on which it takes place. This information can be represented using the enumerated type
enum Day { SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY }

The days of the week on which an event takes place would then be a value of type EnumSet<Day>. An object of type RepeatingEvent would have an instance variable of type EnumSet<Day> to hold this information. An event that takes place on Wednesdays and Fridays would have the associated set

504

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES


EnumSet.of( Day.WEDNESDAY, Day.FRIDAY )

We could dene some common sets of Days as


EnumSet<Day> weekday = EnumSet.range( Day.MONDAY, Day.FRIDAY ); EnumSet<Day> weekend = EnumSet.complementOf( weekday ); EnumSet<Day> everyday = EnumSet.allOf( Day.class );

EnumSets are often used to specify sets of options that are to be applied during some type of processing. For example, a program that draws characters in fancy fonts might have various options that can be applied. Lets say that the options are bold, italic, underlined, strikethrough, and boxed. Note that we are assuming that options can be combined in arbitrary ways. For example, you can have italic, boxed, underlined characters. This just means that we need to keep track of a set of options. If the options are represented by the enumerated type
enum FontOption { BOLD, ITALIC, UNDERLINED, STRIKETHROUGH, BOXED }

then a set of options is represented by a value of type EnumSet<FontOption>. Suppose that options is a variable of this type that represents the set of options that are currently being applied by the program. Then we can do things like: options = EnumSet.noneOf( FontOption.class ) Turn o all options. options = EnumSet.of( FontOption.BOLD ) Use bold, with no other options. options.add( FontOption.BOLD ) Add bold to any options that are already on. options.remove( FontOption.UNDERLINED ) Turn underlining o (if its on). This is a nice, safe way to work with sets of options, and applications like this are one of the major reasons that enumerated types were introduced.

10.3

Maps

An array of N elements can be thought of as a way of associating some item with each of the integers 0, 1, . . . , N-1. If i is one of these integers, its possible to get the item associated with i, and its possible to put a new item in the i-th position. These get and put operations dene what it means to be an array. A map is a kind of generalized array. Like an array, a map is dened by get and put operations. But in a map, these operations are dened not for integers 0, 1, . . . , N-1, but for arbitrary objects of some specied type T. Associated to these objects of type T are objects of some possibly dierent type S. In fact, some programming languages use the term associative array instead of map and use the same notation for associative arrays as for regular arrays. In those languages, for example, you might see the notation A["fred"] used to indicate the item associated to the string fred in the associative array A. Java does not use array notation for maps, unfortunately, but the idea is the same: A map is like an array, but the indices for a map are objects, not integers. In a map, an object that serves as an index is called a key . The object that is associated with a key is called a value. Note that a key can have at most one associated value, but the same value can be associated to several dierent keys. A map can be considered to be a set of associations, where each association is a key/value pair.

10.3. MAPS

505

10.3.1

The Map Interface

In Java, maps are dened by the interface java.util.Map, which includes put and get methods as well as other general methods for working with maps. The map interface, Map<T,S>, is parameterized by two types. The rst type parameter, T, species the type of objects that are possible keys in the map; the second type parameter, S, species the type of objects that are possible values in the map. For example, a map of type Map<Date,JButton> would associate values of type JButton to keys of type Date. For a map of type Map<String,String>, both the keys and the values are of type String. Suppose that map is a variable of type Map<T,S> for some specic types T and S. Then the following are some of the methods that are dened for map: map.get(key) returns the object of type S that is associated by the map to the key. key can be any object; it does not have to be of type T. If the map does not associate any value with obj, then the return value is null. Note that its also possible for the return value to be null when the map explicitly associates the value null with the key. Referring to map.get(key) is similar to referring to A[key] for an array A. (But note that there is nothing like an IndexOutOfBoundsException for maps.) map.put(key,value) Associates the specied value with the specied key, where key must be of type T and value must be of type S. If the map already associated some other value with the key, then the new value replaces the old one. This is similar to the command A[key] = value for an array. map.putAll(map2) if map2 is another map of type Map<T,S>, this copies all the associations from map2 into map. map.remove(key) if map associates a value to the specied key, that association is removed from the map. key can be any object; it does not have to be of type T. map.containsKey(key) returns a boolean value that is true if the map associates some value to the specied key. key can be any object; it does not have to be of type T. map.containsValue(value) returns a boolean value that is true if the map associates the specied value to some key. value can be any object; it does not have to be of type S. map.size() returns an int that gives the number of associations in the map. map.isEmpty() returns a boolean value that is true if the map is empty, that is if it contains no associations. map.clear() removes all associations from the map, leaving it empty. The put and get methods are certainly the most commonly used of the methods in the Map interface. In many applications, these are the only methods that are needed, and in such cases a map is really no more dicult to use than a standard array. Java includes two classes that implement the interface Map<T,S>: TreeMap<T,S> and HashMap<T,S>. In a TreeMap, the key/value associations are stored in a sorted tree, in which they are sorted according to their keys. For this to work, it must be possible to compare the keys to one another. This means either that the keys must implement the interface Comparable<T>, or that a Comparator must be provided for comparing keys. (The Comparator can be provided as a parameter to the TreeMap constructor.) Note that in a TreeMap, as in a TreeSet, the compareTo() method is used to decide whether two keys are to be considered the same. This can have undesirable consequences if the compareTo() method does not agree with the usual notion of equality, and you should keep this in mind when using TreeMaps.

506

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

A HashMap does not store associations in any particular order, so the keys that can be used in a HashMap do not have to be comparable. However, the key class should have reasonable denitions for the equals() method and for a hashCode() method that is discussed later in this section; most of Javas standard classes dene these methods correctly. Most operations are a little faster on HashMaps than they are on TreeMaps. In general, you should use a HashMap unless you have some particular need for the ordering property of a TreeMap. In particular, if you are only using the put and get operations, you can safely use a HashMap. Lets consider an example where maps would be useful. In Subsection 7.4.2, I presented a simple PhoneDirectory class that associated phone numbers with names. That class dened operations addEntry(name,number) and getNumber(name), where both name and number are given as Strings. In fact, the phone directory is acting just like a map, with the addEntry method playing the role of the put operation and getNumber playing the role of get. In a real programming application, there would be no need to dene a new class; we could simply use a map of type Map<String,String>. A directory would be dened as
Map<String,String> directory = new Map<String,String>();

and then directory.put(name,number) would record a phone number in the directory and directory.get(name) would retrieve the phone number associated with a given name.

10.3.2

Views, SubSets, and SubMaps

A Map is not a Collection, and maps do not implement all the operations dened on collections. In particular, there are no iterators for maps. Sometimes, though, its useful to be able to iterate through all the associations in a map. Java makes this possible is a roundabout but clever way. If map is a variable of type Map<T,S>, then the method
map.keySet()

returns the set of all objects that occur as keys for associations in the map. The value returned by this method is an object that implements the interface Set<T>. The elements of this set are the maps keys. The obvious way to implement the keySet() method would be to create a new set object, add all the keys from the map, and return that set. But thats not how its done. The value returned by map.keySet() is not an independent object. It is what is called a view of the actual objects that are stored in the map. This view of the map implements the Set<T> interface, but it does it in such a way that the methods dened in the interface refer directly to keys in the map. For example, if you remove a key from the view, that keyalong with its associated valueis actually removed from the map. Its not legal to add an object to the view, since it doesnt make sense to add a key to a map without specifying the value that should be associated to the key. Since map.keySet() does not create a new set, its very ecient, even for very large maps. One of the things that you can do with a Set is get an Iterator for it and use the iterator to visit each of the elements of the set in turn. We can use an iterator for the key set of a map to traverse the map. For example, if map is of type Set<String,Double>, we could write:
Set<String> keys = map.keySet(); // The set of keys in the map. Iterator<String> keyIter = keys.iterator(); System.out.println("The map contains the following associations:"); while (keyIter.hasNext()) { String key = keyIter.next(); // Get the next key. Double value = map.get(key); // Get the value for that key. System.out.println( " (" + key + "," + value + ")" );

10.3. MAPS
}

507

Or we could do the same thing more easily, avoiding the explicit use of an iterator, with a for-each loop:
System.out.println("The map contains the following associations:"); for ( String key : map.keySet() ) { // "for each key in the maps key set" Double value = map.get(key); System.out.println( " (" + key + "," + value + ")" ); }

If the map is a TreeMap, then the key set of the map is a sorted set, and the iterator will visit the keys in ascending order. For a HashMap, the keys are visited in an arbitrary, unpredictable order. The Map interface denes two other views. If map is a variable of type Map<T,S>, then the method:
map.values()

returns an object of type Collection<S> that contains all the values from the associations that are stored in the map. The return value is a Collection rather than a Set because it can contain duplicate elements (since a map can associate the same value to any number of keys). The method:
map.entrySet()

returns a set that contains all the associations from the map. The elements in the set are objects of type Map.Entry<T,S>. Map.Entry<T,S> is dened as a static nested interface inside the interface Map<T,S>, so its full name contains a period. However, it can be used in the same way as any other type name. (The return type of the method map.entrySet() is written as Set<Map.Entry<T,S>>. The type parameter in this case is itself a parameterized type. Although this might look confusing, its just Javas way of saying that the elements of the set are of type Map.Entry<T,S>.) The information in the set returned by map.entrySet() is actually no dierent from the information in the map itself, but the set provides a dierent view of this information, with dierent operations. Each Map.Entry object contains one key/value pair, and denes methods getKey() and getValue() for retrieving the key and the value. There is also a method, setValue(value), for setting the value; calling this method for a Map.Entry object will modify the map itself, just as if the maps put method were called. As an example, we can use the entry set of a map to print all the key/value pairs in the map. This is more ecient than using the key set to print the same information, as I did in the above example, since we dont have to use the get() method to look up the value associated with each key. Suppose again that map is of type Map<String,Double>. Then we can write:
Set<Map.Entry<String,Double>> entries = map.entrySet(); Iterator<Map.Entry<String,Double>> entryIter = entries.iterator(); System.out.println("The map contains the following associations:"); while (entryIter.hasNext()) { Map.Entry<String,Double> entry = entryIter.next(); String key = entry.getKey(); // Get the key from the entry. Double value = entry.getValue(); // Get the value. System.out.println( " (" + key + "," + value + ")" ); }

or, using a for-each loop:

508

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES


System.out.println("The map contains the following associations:"); for ( Map.Entry<String,Double> entry : map.entrySet() ) System.out.println( " (" + entry.getKey() + "," + entry.getValue() + ")" );


Maps are not the only place in Javas generic programming framework where views are used. For example, the interface List<T> denes a sublist as a view of a part of a list. If list implements the interface List<T>, then the method:
list.subList( fromIndex, toIndex )

where fromIndex and toList are integers, returns a view of the part of the list consisting of the list elements in positions between fromIndex and toIndex (including fromIndex but excluding toIndex). This view lets you operate on the sublist using any of the operations dened for lists, but the sublist is not an independent list. Changes made to the sublist are actually made to the original list. Similarly, it is possible to obtain views that represent certain subsets of a sorted set. If set is of type TreeSet<T>, then set.subSet(fromElement,toElement) returns a Set<T> that contains all the elements of set that are between fromElement and toElement (including fromElement and excluding toElement). The parameters fromElement and toElement must be objects of type T. For example, if words is a set of type TreeSet<String> in which all the elements are strings of lower case letters, then words.subSet("m","n") contains all the elements of words that begin with the letter m. This subset is a view of part of the original set. That is, creating the subset does not involve copying elements. And changes made to the subset, such as adding or removing elements, are actually made to the original set. The view set.headSet(toElement) consists of all elements from the set which are strictly less than toElement, and set.tailSet(fromElement) is a view that contains all elements from the set that are greater than or equal to fromElement. The class TreeMap<T,S> denes three submap views. A submap is similar to a subset. A submap is a Map that contains a subset of the keys from the original Map, along with their associated values. If map is a variable of type TreeMap<T,S>, and if fromKey and toKey are of type T, then map.subMap(fromKey,toKey) returns a view that contains all key/value pairs from map whose keys are between fromKey and toKey (including fromKey and excluding toKey). There are also views map.headMap(toKey) and map.tailMap(fromKey) which are dened analogously to headSet and tailSet. Suppose, for example, that blackBook is a map of type TreeMap<String,String> in which the keys are names and the values are phone numbers. We can print out all the entries from blackBook where the name begins with M as follows:
Map<String,String> ems = blackBook.subMap("M","N"); // This submap contains entries for which the key is greater // than or equal to "M" and strictly less than "N". if (ems.isEmpty()) { System.out.println("No entries beginning with M."); } else { System.out.println("Entries beginning with M:"); for ( Map.Entry<String,String> entry : ems.entrySet() ) System.out.println( " " + entry.getKey() + ": " + entry.getValue() ); }

10.3. MAPS

509

Subsets and submaps are probably best thought of as generalized search operations that make it possible to nd all the items in a range of values, rather than just to nd a single value. Suppose, for example that a database of scheduled events is stored in a map of type TreeMap<Date,Event> in which the keys are the times of the events, and suppose you want a listing of all events that are scheduled for some time on July 4, 2007. Just make a submap containing all keys in the range from 12:00 AM, July 4, 2007 to 12:00 AM, July 5, 2007, and output all the entries from that submap. This type of search, which is known as a subrange query is quite common.

10.3.3

Hash Tables and Hash Codes

HashSets and HashMaps are implemented using a data structure known as a hash table. You dont need to understand hash tables to use HashSets or HashMaps, but any computer programmer should be familiar with hash tables and how they work. Hash tables are an elegant solution to the search problem. A hash table, like a HashMap, stores key/value pairs. Given a key, you have to search the table for the corresponding key/value pair. When a hash table is used to implement a set, the values are all null, and the only question is whether or not the key occurs in the set. You still have to search for the key to check whether it is there or not. In most search algorithms, in order to nd the item you are interested in, you have to look through a bunch of other items that dont interest you. To nd something in an unsorted list, you have to go though the items one-by-one until you come to the one you are looking for. In a binary sort tree, you have to start at the root and move down the tree until you nd the item you want. When you search for a key/value pair in a hash table, you can go directly to the location that contains the item you want. You dont have to look through any other items. (This is not quite true, but its close.) The location of the key/value pair is computed from the key: You just look at the key, and then you go directly to the location where it is stored. How can this work? If the keys were integers in the range 0 to 99, we could store the key/value pairs in an array, A, of 100 elements. The key/value pair with key K would be stored in A[K]. The key takes us directly to the location of the key/value pair. The problem is that there are usually far too many dierent possible keys for us to be able to use an array with one location for each possible key. For example, if the key can be any value of type int, then we would need an array with over four billion locationsquite a waste of space if we are only going to store, say, a few thousand items! If the key can be a string of any length, then the number of possible keys is innite, and using an array with one location for each possible key is simply impossible. Nevertheless, hash tables store their data in an array, and the array index where a key is stored is based on the key. The index is not equal to the key, but it is computed from the key. The array index for a key is called the hash code for that key. A function that computes a hash code, given a key, is called a hash function. To nd a key in a hash table, you just have to compute the hash code of the key and go directly to the array location given by that hash code. If the hash code is 17, look in array location number 17. Now, since there are fewer array locations than there are possible keys, its possible that we might try to store two or more keys in the same array location. This is called a collision. A collision is not an error. We cant reject a key just because another key happened to have the same hash code. A hash table must be able to handle collisions in some reasonable way. In the type of hash table that is used in Java, each array location actually holds a linked list of

510

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

key/value pairs (possibly an empty list). When two items have the same hash code, they are in the same linked list. The structure of the hash table looks something like this:
m m e t i m m m e e e t t t i i i m m m m m e e e e e e t t t t t t i i i i i i 1 0 1 0 2 3 4 5 6 7 8 9 1 1

In this diagram, there is one item with hash code 0, no items with hash code 1, two items with hash code 2, and so on. In a properly designed hash table, most of the linked lists are of length zero or one, and the average length of the lists is less than one. Although the hash code of a key doesnt necessarily take you directly to that key, there are probably no more than one or two other items that you have to look through before nding the key you want. For this to work properly, the number of items in the hash table should be somewhat less than the number of locations in the array. In Javas implementation, whenever the number of items exceeds 75% of the array size, the array is replaced by a larger one and all the items in the old array are inserted into the new one. (This is why adding one new item will sometimes cause the ordering of all the items in the hash table to change completely.) There is still the question of where hash codes come from. Every object in Java has a hash code. The Object class denes the method hashCode(), which returns a value of type int. When an object, obj, is stored in a hash table that has N locations, a hash code in the range 0 to N-1 is needed. This hash code is computed as Math.abs(obj.hashCode()) % N, the remainder when the absolute value of obj.hashCode() is divided by N. (The Math.abs is necessary because obj.hashCode() can be a negative integer, and we need a non-negative number to use as an array index.) For hashing to work properly, two objects that are equal according to the equals() method must have the same hash code. In the Object class, this condition is satised because both equals() and hashCode() are based on the address of the memory location where the object is stored. However, as noted in Subsection 10.1.6, many classes redene the equals() method. If a class redenes the equals() method, and if objects of that class will be used as keys in hash tables, then the class should also redene the hashCode() method. For example, in the String class, the equals() method is redened so that two objects of type String are considered to be equal if they contain the same sequence of characters. The hashCode() method is also redened in the String class, so that the hash code of a string is computed from the characters in that string rather than from its location in memory. For Javas standard classes, you can expect equals() and hashCode() to be correctly dened. However, you might need to dene

10.4. PROGRAMMING WITH THE COLLECTION FRAMEWORK these methods in classes that you write yourself.

511

10.4 In

Programming with the Collection Framework

this section, well look at some programming examples that use classes from the Java Collection Framework. The Collection Framework is easy to use, especially compared to the diculty of programming new data structures from scratch.

10.4.1

Symbol Tables

We begin with a straightforward but important application of maps. When a compiler reads the source code of a program, it encounters denitions of variables, subroutines, and classes. The names of these things can be used later in the program. The compiler has to remember the denition of each name, so that it can recognize the name and apply the denition when the name is encountered later in the program. This is a natural application for a Map. The name can be used as a key in the map. The value associated to the key is the denition of the name, encoded somehow as an object. A map that is used in this way is called a symbol table. In a compiler, the values in a symbol table can be quite complicated, since the compiler has to deal with names for various sorts of things, and it needs a dierent type of information for each dierent type of name. We will keep things simple by looking at a symbol table in another context. Suppose that we want a program that can evaluate expressions entered by the user, and suppose that the expressions can contain variables, in addition to operators, numbers, and parentheses. For this to make sense, we need some way of assigning values to variables. When a variable is used in an expression, we need to retrieve the variables value. A symbol table can be used to store the data that we need. The keys for the symbol table are variable names. The value associated with a key is the value of that variable, which is of type double. The symbol table will be an object of type Map<String,Double>. (Remember that primitive types such as double cant be used as type parameters; a wrapper class such as Double must be used instead. See Subsection 10.1.7.) To demonstrate the idea, well use a rather simple-minded program in which the user types commands such as:
let x = 3 + 12 print 2 + 2 print 10*x +17 let rate = 0.06 print 1000*(1+rate)

The program is an interpreter for a very simple language. The only two commands that the program understands are print and let. When a print command is executed, the computer evaluates the expression and displays the value. If the expression contains a variable, the computer has to look up the value of that variable in the symbol table. A let command is used to give a value to a variable. The computer has to store the value of the variable in the symbol table. (Note: The variables I am talking about here are not variables in the Java program. The Java program is executing a sort of program typed in by the user. I am talking about variables in the users program. The user gets to make up variable names, so there is no way for the Java program to know in advance what the variables will be.) In Subsection 9.5.2, we saw how to write a program, SimpleParser2.java, that can evaluate expressions that do not contain variables. Here, I will discuss another example program, Sim-

512

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

pleInterpreter.java, that is based on the older program. I will only talk about the parts that are relevant to the symbol table. The program uses a HashMap as the symbol table. A TreeMap could also be used, but since the program does not need to access the variables in alphabetical order, we dont need to have the keys stored in sorted order. The symbol table in the program is represented by a variable named symbolTable of type HashMap<String,Double>. At the beginning of the program, the symbol table object is created with the command:
symbolTable = new HashMap<String,Double>();

This creates a map that initially contains no key/value associations. To execute a let command, the program uses the symbol tables put() method to associate a value with the variable name. Suppose that the name of the variable is given by a String, varName, and the value of the variable is stored in a variable val of type double. The following command would then set the value associated with the variable in the symbol table:
symbolTable.put( varName, val );

In the program SimpleInterpreter.java, youll nd this in the method named doLetCommand(). The actual value that is stored in the symbol table is an object of type Double. We can use the double value val in the call to put because Java does an automatic conversion of type double to Double when necessary. The double value is wrapped in an object of type Double, so that, in eect, the above statement is equivalent to
symbolTable.put( varName, new Double(val) );

Just for fun, I decided to pre-dene two variables named pi and e whose values are the usual mathematical constants and e. In Java, the values of these constants are given by Math.PI and Math.E. To make these variables available to the user of the program, they are added to the symbol table with the commands:
symbolTable.put( "pi", Math.PI ); symbolTable.put( "e", Math.E );

When the program encounters a variable while evaluating an expression, the symbol tables get() method is used to retrieve its value. The function symbolTable.get(varName) returns a value of type Double. It is possible that the return value is null; this will happen if no value has ever been assigned to varName in the symbol table. Its important to check this possibility. It indicates that the user is trying to use a variable that the user has not dened. The program considers this to be an error, so the processing looks something like this:
Double val = symbolTable.get(varName); if (val == null) { ... // Throw an exception: Undefined variable. } // The value associated to varName is val.doubleValue()

You will nd this code, more or less, in a method named primaryValue() in SimpleInterpreter.java. As you can see from this example, Maps are very useful and are really quite easy to use.

10.4.2

Sets Inside a Map

The objects in a collection or map can be of any type. They can even be collections. Heres an example where its natural to store sets as the value objects in a map.

10.4. PROGRAMMING WITH THE COLLECTION FRAMEWORK

513

Consider the problem of making an index for a book. An index consists of a list of terms that appear in the book. Next to each term is a list of the pages on which that term appears. To represent an index in a program, we need a data structure that can hold a list of terms, along with a list of pages for each term. Adding new data should be easy and ecient. When its time to print the index, it should be easy to access the terms in alphabetical order. There are many ways this could be done, but Id like to use Javas generic data structures and let them do as much of the work as possible. We can think of an index as a Map that associates a list of page references to each term. The terms are keys, and the value associated with a given key is the list of page references for that term. A Map can be either a TreeMap or a HashMap, but only a TreeMap will make it easy to access the terms in sorted order. The value associated with a term is a list of page references. How can we represent such a value? If you think about it, you see that its not really a list in the sense of Javas generic classes. If you look in any index, youll see that a list of page references has no duplicates, so its really a set rather than a list. Furthermore, the page references for a given term are always printed in increasing order, so we want a sorted set. This means that we should use a TreeSet to represent each list of page references. The values that we really want to put in this set are of type int, but once again we have to deal with the fact that generic data structures can only hold objects, so we must use the wrapper class, Integer, for the objects in the set. To summarize, an index will be represented by a TreeMap. The keys for the map will be terms, which are of type String. The values in the map will be TreeSets that contain the Integers which give the page numbers of every page on which a term appears. The parameterized type that we should use for the sets is TreeSet<Integer>. For the TreeMap that represents the index as a whole, the key type is String and the value type is TreeSet<Integer>. This means that the index has type
TreeMap< String, TreeSet<Integer> >

This is just the usual TreeMap<T,S> with T=String and S=TreeSet<Integer>. A type name as complicated as this one can look intimidating (especially, I think, when used in a constructor with the new operator), but if you think about the data structure that we want to represent, it makes sense. Given a little time and practice, you can get used to types like this one. To make an index, we need to start with an empty TreeMap, look through the book, inserting every reference that we want to be in the index into the map, and then print out the data from the map. Lets leave aside the question of how we nd the references to put in the index, and just look at how the TreeMap is used. It can be created with the commands:
TreeMap<String,TreeSet<Integer>> index; // Declare the variable. index = new TreeMap<String,TreeSet<Integer>>(); // Create the map object.

Now, suppose that we nd a reference to some term (of type String ) on some pageNum (of type int). We need to insert this information into the index. To do this, we should look up the term in the index, using index.get(term). The return value is either null or is the set of page references that we have previously found for the term. If the return value is null, then this is the rst page reference for the term, so we should add the term to the index, with a new set that contains the page reference weve just found. If the return value is non-null, we already have a set of page references, and we should just add the new page reference to the set. Here is a subroutine that does this:
/** * Add a page reference to the index.

514

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES


*/ void addReference(String term, int pageNum) { TreeSet<Integer> references; // The set of page references that we // have so far for the term. references = index.get(term); if (references == null){ // This is the first reference that we have // found for the term. Make a new set containing // the page number and add it to the index, with // the term as the key. TreeSet<Integer> firstRef = new TreeSet<Integer>(); firstRef.add( pageNum ); // pageNum is "autoboxed" to give an Integer! index.put(term,firstRef); } else { // references is the set of page references // that we have found previously for the term. // Add the new page number to that set. This // set is already associated to term in the index. references.add( pageNum ); // pageNum is "autoboxed" to give an Integer! } }

The only other thing we need to do with the index is print it out. We want to iterate through the index and print out each term, together with the set of page references for that term. We could use an Iterator to iterate through the index, but its much easier to do it with a for-each loop. The loop will iterate through the entry set of the map (see Subsection 10.3.2). Each entry is a key/value pair from the map; the key is a term and the value is the associated set of page references. Inside the for-each loop, we will have to print out a set of Integers, which can also be done with a for-each loop. So, here we have an example of nested for-each loops. (You might try to do the same thing entirely with iterators; doing so should give you some appreciation for the for-each loop!). Here is a subroutine that will print the index:
/** * Print each entry in the index. */ void printIndex() { for ( Map.Entry<String,TreeSet<Integer>> entry : String term = entry.getKey(); TreeSet<Integer> pageSet = entry.getValue(); System.out.print( term + " " ); for ( int page : pageSet ) { System.out.print( page + " " ); } System.out.println(); } } index.entrySet() ) {

The hardest thing here is the name of the type Map.Entry<String,TreeSet<Integer>>! Remember that the entries in a map of type Map<T,S> have type Map.Entry<T,S>, so the type parameters in Map.Entry<String,TreeSet<Integer>> are simply copied from the declaration

10.4. PROGRAMMING WITH THE COLLECTION FRAMEWORK

515

of index. Another thing to note is that I used a loop control variable, page, of type int to iterate through the elements of pageSet, which is of type TreeSet<Integer>. You might have expected page to be of type Integer, not int, and in fact Integer would have worked just as well here. However, int does work, because of automatic type conversion: its legal to assign a value of type Integer to a variable of type int. (To be honest, I was sort of surprised that this worked when I rst tried it!) This is not a lot of code, considering the complexity of the operations. I have not written a complete indexing program, but Exercise 10.5 presents a problem that is almost identical to the indexing problem.


By the way, in this example, I would prefer to print each list of page references with the integers separated by commas. In the printIndex() method given above,they are separated by spaces. There is an extra space after the last page reference in the list, but it does no harm since its invisible in the printout. An extra comma at the end of the list would be annoying. The lists should be in a form such as 17,42,105 and not 17,42,105,. The problem is, how to leave that last comma out. Unfortunately, this is not so easy to do with a for-each loop. It might be fun to look at a few ways to solve this problem. One alternative is to use an iterator:
Iterator<Integer> iter = pageSet.iterator(); int firstPage = iter.next(); // In this program, we know the set has at least // one element. Note also that this statement // uses an auto-conversion from Integer to int. System.out.print(firstPage); while ( iter.hasNext() ) { int nextPage = iter.next(); System.out.print("," + nextPage); }

Another possibility is to use the fact that the TreeSet class denes a method first() that returns the rst item in the set, that is, the one that is smallest in terms of the ordering that is used to compare items in the set. (It also denes the method last().) We can solve our problem using this method and a for-each loop:
int firstPage = pageSet.first(); // Find out the first page number in the set. for ( int page : pageSet ) { if ( page != firstPage ) System.out.print(","); // Output comma only if this is not the first page. System.out.print(page); }

Finally, here is an elegant solution using a subset view of the tree. (See Subsection 10.3.2.) Actually, this solution might be a bit extreme:
int firstPage = pageSet.first(); // Get first item, which we know exists. System.out.print(firstPage); // Print first item, with no comma. for ( int page : pageSet.tailSet( firstPage+1 ) ) // Process remaining items. System.out.print( "," + page );

10.4.3

Using a Comparator

There is a potential problem with our solution to the indexing problem. If the terms in the index can contain both upper case and lower case letters, then the terms will not be in alphabetical

516

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

order! The ordering on String is not alphabetical. It is based on the Unicode codes of the characters in the string. The codes for all the upper case letters are less than the codes for the lower case letters. So, for example, terms beginning with Z come before terms beginning with a. If the terms are restricted to use lower case letters only (or upper case only), then the ordering would be alphabetical. But suppose that we allow both upper and lower case, and that we insist on alphabetical order. In that case, our index cant use the usual ordering for Strings. Fortunately, its possible to specify a dierent method to be used for comparing the keys of a map. This is a typical use for a Comparator. Recall that an object that implements the interface Comparator<T> denes a method for comparing two objects of type T :
public int compare( T obj1, T obj2 )

This method should return an integer that is positive, zero, or negative, depending on whether obj1 is less than, equal to, or greater than obj2. We need an object of type Comparator<String> that will compare two Strings based on alphabetical order. The easiest way to do this is to convert the Strings to lower case and use the default comparison on the lower case Strings. The following class denes such a comparator:
/** * Represents a Comparator that can be used for comparing two * strings based on alphabetical order. */ class AlphabeticalOrder implements Comparator<String> { public int compare(String str1, String str2) { String s1 = str1.toLowerCase(); // Convert to lower case. String s2 = str2.toLowerCase(); return s1.compareTo(s2); // Compare lower-case Strings. } }

To solve our indexing problem, we just need to tell our index to use an object of type AlphabeticalOrder for comparing keys. This is done by providing a Comparator object as a parameter to the constructor. We just have to create the index in our example with the command:
index = new TreeMap<String,TreeSet<Integer>>( new AlphabeticalOrder() );

This does work. However, Ive been concealing one technicality. Suppose, for example, that the indexing program calls addReference("aardvark",56) and that it later calls addReference("Aardvark",102). The words aardvark and Aardvark dier only in that one of them begins with an upper case letter; when converted to lower case, they are the same. When we insert them into the index, do they count as two dierent terms or as one term? The answer depends on the way that a TreeMap tests objects for equality. In fact, TreeMaps and TreeSets always use a Comparator object or a compareTo method to test for equality. They do not use the equals() method for this purpose. The Comparator that is used for the TreeMap in this example returns the value zero when it is used to compare aardvark and Aardvark, so the TreeMap considers them to be the same. Page references to aardvark and Aardvark are combined into a single list. This is probably the correct behavior in this example. If not, some other technique must be used to sort the terms into alphabetical order.

10.4. PROGRAMMING WITH THE COLLECTION FRAMEWORK

517

10.4.4

Word Counting

The nal example in this section also deals with storing information about words. The problem here is to make a list of all the words that occur in a le, along with the number of times that each word occurs. The le will be selected by the user. The output of the program will consist of two lists. Each list contains all the words from the le, along with the number of times that the word occurred. One list is sorted alphabetically, and the other is sorted according to the number of occurrences, with the most common words at the top and the least common at the bottom. The problem here is a generalization of Exercise 7.6, which asked you to make an alphabetical list of all the words in a le, without counting the number of occurrences. My word counting program can be found in the le WordCount.java. As the program reads an input le, it must keep track of how many times it encounters each word. We could simply throw all the words, with duplicates, into a list and count them later. But that would require a lot of storage and would not be very ecient. A better method is to keep a counter for each word. The rst time the word is encountered, the counter is initialized to 1. On subsequent encounters, the counter is incremented. To keep track of the data for one word, the program uses a simple class that holds a word and the counter for that word. The class is a static nested class:
/** * Represents the data we need about a word: the word and * the number of times it has been encountered. */ private static class WordData { String word; int count; WordData(String w) { // Constructor for creating a WordData object when // we encounter a new word. word = w; count = 1; // The initial value of count is 1. } } // end class WordData

The program has to store all the WordData objects in some sort of data structure. We want to be able to add new words eciently. Given a word, we need to check whether a WordData object already exists for that word, and if it does, we need to nd that object so that we can increment its counter. A Map can be used to implement these operations. Given a word, we want to look up a WordData object in the Map. This means that the word is the key, and the WordData object is the value. (It might seem strange that the key is also one of the instance variables in the value object, but in fact this is probably the most common situation: The value object contains all the information about some entity, and the key is one of those pieces of information; the partial information in the key is used to retrieve the full information in the value object.) After reading the le, we want to output the words in alphabetical order, so we should use a TreeMap rather than a HashMap. This program converts all words to lower case so that the default ordering on Strings will put the words in alphabetical order. The data is stored in a variable named words of type TreeMap<String,WordData>. The variable is declared and the map object is created with the statement:
TreeMap<String,WordData> words = new TreeMap<String,WordData>();

518

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

When the program reads a word from a le, it calls words.get(word) to nd out if that word is already in the map. If the return value is null, then this is the rst time the word has been encountered, so a new WordData object is created and inserted into the map with the command words.put(word, new WordData(word)). If words.get(word) is not null, then its value is the WordData object for this word, and the program only has to increment the counter in that object. The program uses a method readNextWord(), which was given in Exercise 7.6, to read one word from the le. This method returns null when the end of the le is encountered. Here is the complete code segment that reads the le and collects the data:
String word = readNextWord(); while (word != null) { word = word.toLowerCase(); // convert word to lower case WordData data = words.get(word); if (data == null) words.put( word, new WordData(word) ); else data.count++; word = readNextWord(); }

After reading the words and printing them out in alphabetical order, the program has to sort the words by frequency and print them again. To do the sorting using a generic algorithm, I dened a simple Comparator class for comparing two word objects according to their frequency counts. The class implements the interface Comparator<WordData>, since it will be used to compare two objects of type WordData:
/** * A comparator class for comparing objects of type WordData according to * their counts. This is used for sorting the list of words by frequency. */ private static class CountCompare implements Comparator<WordData> { public int compare(WordData data1, WordData data2) { return data2.count - data1.count; // The return value is positive if data2.count > data1.count. // I.E., data1 comes after data2 in the ordering if there // were more occurrences of data2.word than of data1.word. // The words are sorted according to decreasing counts. } } // end class CountCompare

Given this class, we can sort the WordData objects according to frequency by rst copying them into a list and then using the generic method Collections.sort(list,comparator). The WordData objects that we need are the values in the map, words. Recall that words.values() returns a Collection that contains all the values from the map. The constructor for the ArrayList class lets you specify a collection to be copied into the list when it is created. So, we can use the following commands to create a list of type ArrayList<WordData> containing the word data and then sort that list according to frequency:
ArrayList<WordData> wordsByFrequency = new ArrayList<WordData>( words.values() ); Collections.sort( wordsByFrequency, new CountCompare() );

You should notice that these two lines replace a lot of code! It requires some practice to think in terms of generic data structures and algorithms, but the payo is signicant in terms of saved time and eort.

10.5. WRITING GENERIC CLASSES AND METHODS

519

The only remaining problem is to print the data. We have to print the data from all the WordData objects twice, rst in alphabetical order and then sorted according to frequency count. The data is in alphabetical order in the TreeMap, or more precisely, in the values of the TreeMap. We can use a for-each loop to print the data in the collection words.values(), and the words will appear in alphabetical order. Another for-each loop can be used to print the data in the list wordsByFrequency, and the words will be printed in order of decreasing frequency. Here is the code that does it:
TextIO.putln("List of words in alphabetical order" + " (with counts in parentheses):\n"); for ( WordData data : words.values() ) TextIO.putln(" " + data.word + " (" + data.count + ")"); TextIO.putln("\n\nList of words by frequency of occurrence:\n"); for ( WordData data : wordsByFrequency ) TextIO.putln(" " + data.word + " (" + data.count + ")");

You can nd the complete word-counting program in the le WordCount.java. Note that for reading and writing les, it uses the le I/O capabilities of TextIO.java, which were discussed in Subsection 2.4.5. By the way, if you run the WordCount program on a reasonably large le and take a look at the output, it will illustrate something about the Collections.sort() method. The second list of words in the output is ordered by frequency, but if you look at a group of words that all have the same frequency, you will see that the words in that group are in alphabetical order. The method Collections.sort() was applied to sort the words by frequency, but before it was applied, the words were already in alphabetical order. When Collections.sort() rearranged the words, it did not change the ordering of words that have the same frequency, so they were still in alphabetical order within the group of words with that frequency. This is because the algorithm used by Collections.sort() is a stable sorting algorithm. A sorting algorithm is said to be stable if it satises the following condition: When the algorithm is used to sort a list according to some property of the items in the list, then the sort does not change the relative order of items that have the same value of that property. That is, if item B comes after item A in the list before the sort, and if both items have the same value for the property that is being used as the basis for sorting, then item B will still come after item A after the sorting has been done. Neither SelectionSort nor QuickSort are stable sorting algorithms. Insertion sort is stable, but is not very fast. The sorting algorithm used by Collections.sort() is both stable and fast. I hope that the programming examples in this section have convinced you of the usefulness of the Java Collection Framework!

10.5

Writing Generic Classes and Methods

So far in this chapter, you have learned about using the generic classes and methods that
are part of the Java Collection Framework. Now, its time to learn how to write new generic classes and methods from scratch. Generic programming produces highly general and reusable codeits very useful for people who write reusable software libraries to know how to do generic programming, since it enables them to write code that can be used in many dierent situations. Not every programmer needs to write reusable software libraries, but every programmer should know at least a little about how to do it. In fact, just to read the JavaDoc documentation for

520

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

Javas standard generic classes, you need to know some of the syntax that is introduced in this section. I will not cover every detail of generic programming in Java in this section, but the material presented here should be sucient to cover the most common cases.

10.5.1

Simple Generic Classes

Lets start with an example that illustrates the motivation for generic programming. In Subsection 10.2.1, I remarked that it would be easy to use a LinkedList to implement a queue. (Queues were introduced in Subsection 9.3.2.) To ensure that the only operations that are performed on the list are the queue operations enqueue, dequeue, and isEmpty, we can create a new class that contains the linked list as a private instance variable. To implement queues of strings, for example, we can dene the class:
class QueueOfStrings { private LinkedList<String> items = new LinkedList<String>(); public void enqueue(String item) { items.addLast(item); } public String dequeue() { return items.removeFirst(); } public boolean isEmpty() { return (items.size() == 0); } }

This is a ne and useful class. But, if this is how we write queue classes, and if we want queues of Integers or Doubles or JButtons or any other type, then we will have to write a dierent class for each type. The code for all of these classes will be almost identical, which seems like a lot of redundant programming. To avoid the redundancy, we can write a generic Queue class that can be used to dene queues of any type of object. The syntax for writing the generic class is straightforward: We replace the specic type String with a type parameter such as T, and we add the type parameter to the name of the class:
class Queue<T> { private LinkedList<T> items = new LinkedList<T>(); public void enqueue(T item) { items.addLast(item); } public T dequeue() { return items.removeFirst(); } public boolean isEmpty() { return (items.size() == 0); } }

Note that within the class, the type parameter T is used just like any regular type name. Its used to declare the return type for dequeue, as the type of the formal parameter item in enqueue, and even as the actual type parameter in LinkedList<T>. Given this class denition, we can use parameterized types such as Queue<String> and Queue<Integer> and Queue<JButton>.

10.5. WRITING GENERIC CLASSES AND METHODS

521

That is, the Queue class is used in exactly the same way as built-in generic classes like LinkedList and HashSet. Note that you dont have to use T as the name of the type parameter in the denition of the generic class. Type parameters are like formal parameters in subroutines. You can make up any name you like in the denition of the class. The name in the denition will be replaced by an actual type name when the class is used to declare variables or create objects. If you prefer to use a more meaningful name for the type parameter, you might dene the Queue class as:
class Queue<ItemType> { private LinkedList<ItemType> items = new LinkedList<ItemType>(); public void enqueue(ItemType item) { items.addLast(item); } public ItemType dequeue() { return items.removeFirst(); } public boolean isEmpty() { return (items.size() == 0); } }

Changing the name from T to ItemType has absolutely no eect on the meaning of the class denition or on the way that Queue is used. Generic interfaces can be dened in a similar way. Its also easy to dene generic classes and interfaces that have two or more type parameters, as is done with the standard interface Map<T,S>. A typical example is the denition of a Pair that contains two objects, possibly of dierent types. A simple version of such a class can be dened as:
class Pair<T,S> { public T first; public S second; public Pair( T a, S b ) { first = a; second = b; } }

// Constructor.

This class can be used to declare variables and create objects such as:
Pair<String,Color> colorName = new Pair<String,Color>("Red", Color.RED); Pair<Double,Double> coordinates = new Pair<Double,Double>(17.3,42.8);

Note that in the denition of the constructor in this class, the name Pair does not have type parameters. You might have expected Pair<T,S>. However, the name of the class is Pair, not Pair<T,S>, and within the denition of the class, T and S are used as if they are the names of specic, actual types. Note in any case that type parameters are never added to the names of methods or constructors, only to the names of classes and interfaces.

10.5.2

Simple Generic Methods

In addition to generic classes, Java also has generic methods. An example is the method Collections.sort(), which can sort collections of objects of any type. To see how to write generic methods, lets start with a non-generic method for counting the number of times that a given string occurs in an array of strings:

522

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES


/** * Returns the number of times that itemToCount occurs in list. Items in the * list are tested for equality using itemToCount.equals(), except in the * special case where itemToCount is null. */ public static int countOccurrences(String[] list, String itemToCount) { int count = 0; if (itemToCount == null) { for ( String listItem : list ) if (listItem == null) count++; } else { for ( String listItem : list ) if (itemToCount.equals(listItem)) count++; } return count; }

Once again, we have some code that works for type String, and we can imagine writing almost identical code to work with other types of objects. By writing a generic method, we get to write a single method denition that will work for objects of any type. We need to replace the specic type String in the denition of the method with the name of a type parameter, such as T. However, if thats the only change we make, the compiler will think that T is the name of an actual type, and it will mark it as an undeclared identier. We need some way of telling the compiler that T is a type parameter. Thats what the <T> does in the denition of the generic class class Queue<T> { .... For a generic method, the <T> goes just before the name of the return type of the method:
public static <T> int countOccurrences(T[] list, T itemToCount) { int count = 0; if (itemToCount == null) { for ( T listItem : list ) if (listItem == null) count++; } else { for ( T listItem : list ) if (itemToCount.equals(listItem)) count++; } return count; }

The <T> marks the method as being generic and species the name of the type parameter that will be used in the denition. Of course, the name of the type parameter doesnt have to be T; it can be anything. (The <T> looks a little strange in that position, I know, but it had to go somewhere and thats just where the designers of Java decided to put it.) Given the generic method denition, we can apply it to objects of any type. If wordList is a variable of type String[ ] and word is a variable of type String, then
int ct = countOccurrences( wordList, word );

10.5. WRITING GENERIC CLASSES AND METHODS

523

will count the number of times that word occurs in wordList. If palette is a variable of type Color[ ] and color is a variable of type Color, then
int ct = countOccurrences( palette, color );

will count the number of times that color occurs in palette. If numbers is a variable of type Integer[ ], then
int ct = countOccurrences( numbers, 17 );

will count the number of times that 17 occurs in numbers. This last example uses autoboxing; the 17 is automatically converted to a value of type Integer, as if we had said countOccurrences( numbers, new Integer(17) ). Note that, since generic programming in Java applies only to objects, we cannot use countOccurrences to count the number of occurrences of 17 in an array of type int[ ]. A generic method can have one or more type parameters, such as the T in countOccurrences. Note that when a generic method is used, as in the function call countOccurrences(wordlist, word), there is no explicit mention of the type that is substituted for the type parameter. The compiler deduces the type from the types of the actual parameters in the method call. Since wordlist is of type String[ ], the compiler can tell that in countOccurrences(wordlist, word), the type that replaces T is String. This contrasts with the use of generic classes, as in new Queue<String>(), where the type parameter is specied explicitly. The countOccurrences method operates on an array. We could also write a similar method to count occurrences of an object in any collection:
public static <T> int countOccurrences(Collection<T> collection, T itemToCount) { int count = 0; if (itemToCount == null) { for ( T item : collection ) if (item == null) count++; } else { for ( T item : collection ) if (itemToCount.equals(item)) count++; } return count; }

Since Collection<T> is itself a generic type, this method is very general. It can operate on an ArrayList of Integers, a TreeSet of Strings, a LinkedList of JButtons, . . . .

10.5.3

Type Wildcards

There is a limitation on the sort of generic classes and methods that we have looked at so far: The type parameter in our examples, usually named T, can be any type at all. This is OK in many cases, but it means that the only things that you can do with T are things that can be done with every type, and the only things that you can do with objects of type T are things that you can do with every object. With the techniques that we have covered so far, you cant, for example, write a generic method that compares objects with the compareTo() method, since that method is not dened for all objects. The compareTo() method is dened

524

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

in the Comparable interface. What we need is a way of specifying that a generic class or method only applies to objects of type Comparable and not to arbitrary objects. With that restriction, we should be free to use compareTo() in the denition of the generic class or method. There are two dierent but related syntaxes for putting restrictions on the types that are used in generic programming. One of these is bounded type parameters, which are used as formal type parameters in generic class and method denitions; a bounded type parameter would be used in place of the simple type parameter T in class GenericClass<T> ... or in public static <T> void genericMethod(.... The second syntax is wildcard types, which are used as type parameters in the declarations of variables and of formal method parameters; a wildcard type could be used in place of the type parameter String in the declaration statement List<String> list; or in the formal parameter list void max(Collection<String> c). We will look at wildcard types rst, and we will return to the topic of bounded types later in this section. Lets start with a simple example in which a wildcard type is useful. Suppose that Shape is a class that denes a method public void draw(), and suppose that Shape has subclasses such as Rect and Oval. Suppose that we want a method that can draw all the shapes in a collection of Shapes. We might try:
public static void drawAll(Collection<Shape> shapes) { for ( Shape s : shapes ) s.draw(); }

This method works ne if we apply it to a variable of type Collection<Shape>, or ArrayList<Shape>, or any other collection class with type parameter Shape. Suppose, however, that you have list of Rects stored in a variable named rectangles of type Collection<Rect>. Since Rects are Shapes, you might expect to be able to call drawAll(rectangles). Unfortunately, this will not work; a collection of Rects is not considered to be a collection of Shapes! The variable rectangles cannot be assigned to the formal parameter shapes. The solution is to replace the type parameter Shape in the declaration of shapes with the wildcard type ? extends Shape:
public static void drawAll(Collection<? extends Shape> shapes) { for ( Shape s : shapes ) s.draw(); }

The wildcard type ? extends Shape means roughly any type that is either equal to Shape or that is a subclass of Shape. When the parameter shapes is declared to be of type Collection<? extends Shape>, it becomes possible to call the drawAll method with an actual parameter of type Collection<Rect> since Rect is a subclass of Shape and therefore matches the wildcard ? extends Shape. We could also pass actual parameters to drawAll of type ArrayList<Rect> or Set<Oval> or List<Oval>. And we can still pass variables of type Collection<Shape> or ArrayList<Shape>, since the class Shape itself matches ? extends Shape. We have greatly increased the usefulness of the method by using the wildcard type. (Although it is not essential, you might be interested in knowing why Java does not allow a collection of Rects to be used as a collection of Shapes, even though every Rect is considered to be a Shape. Consider the rather silly but legal method that adds an oval to a list of shapes:
static void addOval(List<Shape> shapes, Oval oval) { shapes.add(oval); }

10.5. WRITING GENERIC CLASSES AND METHODS

525

Suppose that rectangles is of type List<Rect>. Its illegal to call addOval(rectangles,oval), because of the rule that a list of Rects is not a list of Shapes. If we dropped that rule, then addOval(rectangles,oval) would be legal, and it would add an Oval to a list of Rects. This would be bad: Since Oval is not a subclass of Rect, an Oval is not a Rect, and a list of Rects should never be able to contain an Oval. The method call addOval(rectangles,oval) does not make sense and should be illegal, so the rule that a collection of Rects is not a collection of Shapes is a good rule.) As another example, consider the method addAll() from the interface Collection<T>. In my description of this method in Subsection 10.1.4, I say that for a collection, coll, of type Collection<T>, coll.addAll(coll2) adds all the objects in coll2 to coll. The parameter, coll2, can be any collection of type Collection<T>. However, it can also be more general. For example, if T is a class and S is a sub-class of T, then coll2 can be of type Collection<S>. This makes sense because any object of type S is automatically of type T and so can legally be added to coll. If you think for a moment, youll see that what Im describing here, a little awkwardly, is a use of wildcard types: We dont want to require coll2 to be a collection of of objects of type T ; we want to allow collections of any subclass of T. To be more specic, lets look at how a similar addAll() method could be added to the generic Queue class that was dened earlier in this section:
class Queue<T> { private LinkedList<T> items = new LinkedList<T>(); public void enqueue(T item) { items.addLast(item); } public T dequeue() { return items.removeFirst(); } public boolean isEmpty() { return (items.size() == 0); } public void addAll(Collection<? extends T> collection) { // Add all the items from the collection to the end of the queue for ( T item : collection ) enqueue(item); } }

Here, T is a type parameter in the generic class denition. We are combining wildcard types with generic classes. Inside the generic class denition, T is used as if it is a specic, though unknown, type. The wildcard type ? extends T means some type that extends that specic type. When we create a queue of type Queue<Shape>, T refers to Shape, and the wildcard type ? extends T in the class denition means ? extends Shape, meaning that the addAll method of the queue can be applied to collections of Rects and Ovals as well as to collections of Shapes. The for-each loop in the denition of addAll iterates through the collection using a variable, item, of type T. Now, collection can be of type Collection<S>, where S is a subclass of T. Since item is of type T, not S, do we have a problem here? No, no problem. As long as S is a subclass of T, a value of type S can be assigned to a variable of type T. The restriction on the wildcard type makes everything work nicely.

526

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

The addAll method adds all the items from a collection to the queue. Suppose that we wanted to do the opposite: Add all the items that are currently on the queue to a given collection. An instance method dened as
public void addAllTo(Collection<T> collection)

would only work for collections whose base type is exactly the same as T. This is too restrictive. We need some sort of wildcard. However, ? extends T wont work. Suppose we try it:
public void addAllTo(Collection<? extends T> collection) { // Remove all items currently on the queue and add them to collection while ( ! isEmpty() ) { T item = dequeue(); // Remove an item from the queue. collection.add( item ); // Add it to the collection. ILLEGAL!! } }

The problem is that we cant add an item of type T to a collection that might only be able to hold items belonging to some subclass, S, of T. The containment is going in the wrong direction: An item of type T is not necessarily of type S. For example, if we have a queue of type Queue<Shape>, it doesnt make sense to add items from the queue to a collection of type Collection<Rect>, since not every Shape is a Rect. On the other hand, if we have a Queue<Rect>, it would make sense to add items from that queue to a Collection<Shape> or indeed to any collection Collection<S> where S is a superclass of Rect. To express this type of relationship, we need a new kind of type wildcard: ? super T. This wildcard means, roughly, either T itself or any class that is a superclass of T. For example, Collection<? super Rect> would match the types Collection<Shape>, ArrayList<Object>, and Set<Rect>. This is what we need for our addAllTo method. With this change, our complete generic queue class becomes:
class Queue<T> { private LinkedList<T> items = new LinkedList<T>(); public void enqueue(T item) { items.addLast(item); } public T dequeue() { return items.removeFirst(); } public boolean isEmpty() { return (items.size() == 0); } public void addAll(Collection<? extends T> collection) { // Add all the items from the collection to the end of the queue for ( T item : collection ) enqueue(item); } public void addAllTo(Collection<? super T> collection) { // Remove all items currently on the queue and add them to collection while ( ! isEmpty() ) { T item = dequeue(); // Remove an item from the queue. collection.add( item ); // Add it to the collection. } } }

10.5. WRITING GENERIC CLASSES AND METHODS

527

In a wildcard type such as ? extends T, T can be an interface instead of a class. Note that the term extends (not implements) is used in the wildcard type, even if T is an interface. For example, recall that Runnable is an interface that denes the method public void run(). Runnable objects are usually associated with threads (see Section 8.5). Here is a method that runs all the objects in a collection of Runnables in parallel, by creating a separate thread for each object:
public static runAllInParallel( Collection<? extends Runnable> runnables ) { for ( Runnable runnable : runnables ) { Thread runner; // A thread to run the method runnable.run() runner = new Thread( runnable ); // Create the thread. runner.start(); // Start the thread running. } }


Wildcard types are used only as type parameters in parameterized types, such as Collection<? extends Runnable>. The place where a wildcard type is most likely to occur, by far, is in a formal parameter list, where the wildcard type is used in the declaration of the type of a formal parameter. However, they can also be used in a few other places. For example, they can be used in the type specication in a variable declaration statement. One nal remark: The wildcard type <?> is equivalent to <? extends Object>. That is, it matches any possible type. For example, the removeAll() method in the generic interface Collections<T> is declared as
public boolean removeAll( Collection<?> c ) { ...

This just means that the removeAll method can be applied to any collection of any type of object.

10.5.4

Bounded Types

Wildcard types dont solve all of our problems. They allow us to generalize method denitions so that they can work with collections of objects of various types, rather than just a single type. However, they do not allow us to restrict the types that are allowed as type parameters in a generic class or method denition. Bounded types exist for this purpose. We start with a small, not very realistic example. Suppose that you would like to create groups of GUI components using a generic class named ComponentGroup. For example, the parameterized type ComponentGroup<JButton> would represent a group of JButtons, while ComponentGroup<JPanel> would represent a group of JPanels. The class will include methods that can be called to apply certain operations to all components in the group at once. For example, there will be a instance method of the form
public void repaintAll() { . . // Call the repaint() method of every component in the group. . }

The problem is that the repaint() method is dened in any JComponent object, but not for objects of arbitrary type. It wouldnt make sense to allow types such as ComponentGroup<String> or ComponentGroup<Integer>, since Strings and Integers dont have repaint() methods. We

528

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

need some way to restrict the type parameter T in ComponentGroup<T> so that only JComponent and subclasses of JComponent are allowed as actual type parameters. We can do this by using the bounded type T extends JComponent instead of a plain T in the denition of the class:
public class ComponentGroup<T extends JComponent> { private ArrayList<T> components; // For storing the components in this group. public void repaintAll() { for ( JComponent c : components ) if (c != null) c.repaint(); } public void setAllEnabled( boolean enable ) { for ( JComponent c : components ) if (c != null) c.setEnabled(enable); } } public void add( T c ) { // Add a value c, of type T, to the group. components.add(c); } . . // Additional methods and constructors. . }

The restriction extends JComponent on T makes it illegal to create the parameterized types ComponentGroup<String> and ComponentGroup<Integer>, since the actual type parameter that replaces T is required to be either JComponent itself or a subclass of JComponent. With this restriction, we knowand, more important, the compiler knowsthat the objects in the group are of type JComponent and the operations c.repaint() and c.setEnabled() are dened for any c in the group. In general, a bounded type parameter T extends SomeType means roughly a type, T, that is either equal to SomeType or is a subclass of SomeType, and the upshot is that any object of type T is also of type SomeType, and any operation that is dened for objects of type SomeType is dened for objects of type T. The type SomeType doesnt have to be the name of a class. It can be anything that represents an actual type. For example, it can be an interface or even a parameterized type. Bounded types and wildcard types are clearly related. They are, however, used in very dierent ways. A bounded type can be used only as a formal type parameter in the denition of a generic method, class, or interface. A wildcard type is used most often to declare the type of a formal parameter in a method and cannot be used as a formal type parameter. One other dierence, by the way, is that, in contrast to wildcard types, bounded type parameters can only use extends, never super. Bounded type parameters can be used when declaring generic methods. For example, as an alternative to the generic ComponentGroup class, one could write a free-standing generic static method that can repaint any collection of JComponents as follows:
public static <T extends JComponent> void repaintAll(Collection<T> comps) { for ( JComponent c : comps ) if (c != null) c.repaint();

10.5. WRITING GENERIC CLASSES AND METHODS


}

529

Using <T extends JComponent> as the formal type parameter means that the method can only be called for collections whose base type is JComponent or some subclass of JComponent. Thus, it is legal to call repaintAll(coll) where coll is of type List<JPanel> but not where coll is of type Set<String>. Note that we dont really need a generic type parameter in this case. We can write an equivalent method using a wildcard type:
public static void repaintAll(Collection<? extends JComponent> comps) { for ( JComponent c : comps ) if (c != null) c.repaint(); }

In this situation, the version that uses the wildcard type is to be preferred, since the implementation is simpler. However, there are some situations where a generic method with a bounded type parameter cannot be rewritten using a wildcard type. Note that a generic type parameter gives a name, such as T, to the unknown type, while a wildcard type does not give a name to the unknown type. The name makes it possible to refer to the unknown type in the body of the method that is being dened. If a generic method denition uses the generic type name more than once or uses it outside the formal parameter list of the method, then the generic type cannot be replaced with a wildcard type. Lets look at a generic method in which a bounded type parameter is essential. In Subsection 10.2.1, I presented a code segment for inserting a string into a sorted list of strings, in such a way that the modied list is still in sorted order. Here is the same code, but this time in the form of a method denition (and without the comments):
static void sortedInsert(List<String> sortedList, String newItem) { ListIterator<String> iter = sortedList.listIterator(); while (iter.hasNext()) { String item = iter.next(); if (newItem.compareTo(item) <= 0) { iter.previous(); break; } } iter.add(newItem); }

This method works ne for lists of strings, but it would be nice to have a generic method that can be applied to lists of other types of objects. The problem, of course, is that the code assumes that the compareTo() method is dened for objects in the list, so the method can only work for lists of objects that implement the Comparable interface. We cant simply use a wildcard type to enforce this restriction. Suppose we try to do it, by replacing List<String> with List<? extends Comparable>:
static void sortedInsert(List<? extends Comparable> sortedList, ???? newItem) { ListIterator<????> iter = stringList.listIterator(); ...

We immediately run into a problem, because we have no name for the unknown type represented by the wildcard. We need a name for that type because the type of newItem and of iter should be the same as the type of the items in the list. The problem is solved if we write a generic

530

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

method with a bounded type parameter, since then we have a name for the unknown type, and we can write a valid generic method:
static <T extends Comparable> void sortedInsert(List<T> sortedList, T newItem) { ListIterator<T> iter = sortedList.listIterator(); while (iter.hasNext()) { T item = iter.next(); if (newItem.compareTo(item) <= 0) { iter.previous(); break; } } iter.add(newItem); }

There is still one technicality to cover in this example. Comparable is itself a parameterized type, but I have used it here without a type parameter. This is legal but the compiler might give you a warning about using a raw type. In fact, the objects in the list should implement the parameterized interface Comparable<T>, since they are being compared to items of type T. This just means that instead of using Comparable as the type bound, we should use Comparable<T>:
static <T extends Comparable<T>> void sortedInsert(List<T> sortedList, ...


With this example, I will leave the topic of generic types and generic programming. In this chapter, I have occasionally used terms such as strange and weird to talk about generic programming in Java. I will confess that I have some aection for the more simple-minded generic programming style of Smalltalk. Nevertheless, I recognize the power and increased robustness of generics in Java. I hope that I have convinced you that using the Java Collection Framework is reasonably natural and straightforward, and that using it can save you a lot of time and eort compared to repeatedly recoding the same data structures and algorithms from scratch. Things become a little more technical when you start writing new generic classes and methods of your own, and the syntax is (as Ive said) a little strange. But with some practice, youll get used to the syntax and will nd that its not that dicult after all.

Exercises

531

Exercises for Chapter 10


1. Rewrite the PhoneDirectory class from Subsection 7.4.2 so that it uses a TreeMap to store directory entries, instead of an array. (Doing this was suggested in Subsection 10.3.1.) 2. In mathematics, several operations are dened on sets. The union of two sets A and B is a set that contains all the elements that are in A together with all the elements that are in B. The intersection of A and B is the set that contains elements that are in both A and B. The dierence of A and B is the set that contains all the elements of A except for those elements that are also in B. Suppose that A and B are variables of type set in Java. The mathematical operations on A and B can be computed using methods from the Set interface. In particular: A.addAll(B) computes the union of A and B; A.retainAll(B) computes the intersection of A and B; and A.removeAll(B) computes the dierence of A and B. (These operations change the contents of the set A, while the mathematical operations create a new set without changing A, but that dierence is not relevant to this exercise.) For this exercise, you should write a program that can be used as a set calculator for simple operations on sets of non-negative integers. (Negative integers are not allowed.) A set of such integers will be represented as a list of integers, separated by commas and, optionally, spaces and enclosed in square brackets. For example: [1,2,3] or [17, 42, 9, 53,108]. The characters +, *, and - will be used for the union, intersection, and dierence operations. The user of the program will type in lines of input containing two sets, separated by an operator. The program should perform the operation and print the resulting set. Here are some examples:
Input ------------------------[1, 2, 3] + [3, 5, 7] [10,9,8,7] * [2,4,6,8] [ 5, 10, 15, 20 ] - [ 0, 10, 20 ] Output ------------------[1, 2, 3, 5, 7] [8] [5, 15]

To represent sets of non-negative integers, use sets of type TreeSet<Integer>. Read the users input, create two TreeSets, and use the appropriate TreeSet method to perform the requested operation on the two sets. Your program should be able to read and process any number of lines of input. If a line contains a syntax error, your program should not crash. It should report the error and move on to the next line of input. (Note: To print out a Set, A, of Integers, you can just say System.out.println(A). Ive chosen the syntax for sets to be the same as that used by the system for outputting a set.) 3. The fact that Java has a HashMap class means that no Java programmer has to write an implementation of hash tables from scratchunless, of course, you are a computer science student. For this exercise, you should write a hash table in which both the keys and the values are of type String. (This is not an exercise in generic programming; do not try to write a generic class.) Write an implementation of hash tables from scratch. Dene the following methods: get(key), put(key,value), remove(key), containsKey(key), and size(). Remember that every object, obj, has a method obj.hashCode() that can be used for computing a hash code for the object, so at least you dont have to dene your own hash function. Do not use any of Javas built-in generic types; create your own linked lists

532

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES using nodes as covered in Subsection 9.2.2. However, you do not have to worry about increasing the size of the table when it becomes too full. You should also write a short program to test your solution.

4. A predicate is a boolean-valued function with one parameter. Some languages use predicates in generic programming. Java doesnt, but this exercise looks at how predicates might work in Java. In Java, we could implement predicate objects by dening a generic interface:
public interface Predicate<T> { public boolean test( T obj ); }

The idea is that an object that implements this interface knows how to test objects of type T in some way. Dene a class that contains the following generic static methods for working with predicate objects. The name of the class should be Predicates, in analogy with the standard class Collections that provides various static methods for working with collections.
public static <T> void remove(Collection<T> coll, Predicate<T> pred) // Remove every object, obj, from coll for which // pred.test(obj) is true. public static <T> void retain(Collection<T> coll, Predicate<T> pred) // Remove every object, obj, from coll for which // pred.test(obj) is false. (That is, retain the // objects for which the predicate is true.) public static <T> List<T> collect(Collection<T> coll, Predicate<T> pred) // Return a List that contains all the objects, obj, // from the collection, coll, such that pred.test(obj) // is true. public static <T> int find(ArrayList<T> list, Predicate<T> pred) // Return the index of the first item in list // for which the predicate is true, if any. // If there is no such item, return -1.

(In C++, methods similar to these are included as a standard part of the generic programming framework.) 5. An example in Subsection 10.4.2 concerns the problem of making an index for a book. A related problem is making a concordance for a document. A concordance lists every word that occurs in the document, and for each word it gives the line number of every line in the document where the word occurs. All the subroutines for creating an index that were presented in Subsection 10.4.2 can also be used to create a concordance. The only real dierence is that the integers in a concordance are line numbers rather than page numbers. Write a program that can create a concordance. The document should be read from an input le, and the concordance data should be written to an output le. You can use the indexing subroutines from Subsection 10.4.2, modied to write the data to TextIO instead of to System.out. (You will need to make these subroutines static.) The input and output les should be selected by the user when the program is run. The sample

Exercises

533

program WordCount.java, from Subsection 10.4.4, can be used as a model of how to use les. That program also has a useful subroutine that reads one word from input. As you read the le, you want to take each word that you encounter and add it to the concordance along with the current line number. Keeping track of the line numbers is one of the trickiest parts of the problem. In an input le, the end of each line in the le is marked by the newline character, \n. Every time you encounter this character, you have to add one to the line number. WordCount.java ignores ends of lines. Because you need to nd and count the end-of-line characters, your program cannot process the input le in exactly the same way as does WordCount.java. Also, you will need to detect the end of the le. The function TextIO.peek(), which is used to look ahead at the next character in the input, returns the value TextIO.EOF at end-of-le, after all the characters in the le have been read. Because it is so common, dont include the word the in your concordance. Also, do not include words that have length less than 3. 6. The sample program SimpleInterpreter.java from Subsection 10.4.1 can carry out commands of the form let variable = expression or print expression. That program can handle expressions that contain variables, numbers, operators, and parentheses. Extend the program so that it can also handle the standard mathematical functions sin, cos, tan, abs, sqrt, and log. For example, the program should be able to evaluate an expression such as sin(3*x-7)+log(sqrt(y)), assuming that the variables x and y have been given values. Note that the name of a function must be followed by an expression that is enclosed in parentheses. In the original program, a symbol table holds a value for each variable that has been dened. In your program, you should add another type of symbol to the table to represent standard functions. You can use the following nested enumerated type and class for this purpose:
private enum Functions { SIN, COS, TAN, ABS, SQRT, LOG } /** * An object of this class represents one of the standard functions. */ private static class StandardFunction { /** * Tells which function this is. */ Functions functionCode; /** * Constructor creates an object to represent one of * the standard functions * @param code which function is represented. */ StandardFunction(Functions code) { functionCode = code; } /** * Finds the value of this function for the specified * parameter value, x.

534

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES


*/ double evaluate(double x) { switch(functionCode) { case SIN: return Math.sin(x); case COS: return Math.cos(x); case TAN: return Math.tan(x); case ABS: return Math.abs(x); case SQRT: return Math.sqrt(x); default: return Math.log(x); } } } // end class StandardFunction

Add a symbol to the symbol table to represent each function. The key is the name of the function and the value is an object of type StandardFunction that represents the function. For example:
symbolTable.put("sin", new StandardFunction(StandardFunction.SIN));

In SimpleInterpreter.java, the symbol table is a map of type HashMap<String,Double>. Its not legal to use a StandardFunction as the value in such a map, so you will have to change the type of the map. The map has to hold two dierent types of objects. The easy way to make this possible is to create a map of type HashMap<String,Object>. (A better way is to create a general type to represent objects that can be values in the symbol table, and to dene two subclasses of that class, one to represent variables and one to represent standard functions, but for this exercise, you should do it the easy way.) In your parser, when you encounter a word, you have to be able to tell whether its a variable or a standard function. Look up the word in the symbol table. If the associated object is non-null and is of type Double, then the word is a variable. If it is of type StandardFunction, then the word is a function. Remember that you can test the type of an object using the instanceof operator. For example: if (obj instanceof Double)

Quiz

535

Quiz on Chapter 10
1. What is meant by generic programming and what is the alternative? 2. Why cant you make an object of type LinkedList<int>? What should you do instead? 3. What is an iterator and why are iterators necessary for generic programming? 4. Suppose that integers is a variable of type Collection<Integer>. Write a code segment that uses an iterator to compute the sum of all the integer values in the collection. Write a second code segment that does the same thing using a for-each loop. 5. Interfaces such as List, Set, and Map dene abstract data types. Explain what this means. 6. What is the fundamental property that distinguishes Sets from other types of Collections? 7. What is the essential dierence in functionality between a TreeMap and a HashMap? 8. Explain what is meant by a hash code. 9. Modify the following Date class so that it implements the interface Comparable<Date>. The ordering on objects of type Date should be the natural, chronological ordering.
class Date { int month; // Month number in range 1 to 12. int day; // Day number in range 1 to 31. int year; // Year number. Date(int m, int d, int y) { // Convenience constructor. month = m; day = d; year = y; } }

10. Suppose that syllabus is a variable of type TreeMap<Date,String>, where Date is the class from the preceding exercise. Write a code segment that will write out the value string for every key that is in the month of December, 2006. 11. Write a generic class Stack<T> that can be used to represent stacks of objects of type T. The class should include methods push(), pop(), and isEmpty(). Inside the class, use an ArrayList to hold the items on the stack. 12. Write a generic method, using a generic type parameter <T>, that replaces every occurrence in a ArrayList<T> of a specied item with a specied replacement item. The list and the two items are parameters to the method. Both items are of type T. Take into account the fact that the item that is being replaced might be null. For a non-null item, use equals() to do the comparison.

536

CHAPTER 10. GENERIC PROGRAMMING AND COLLECTION CLASSES

Chapter 11

Files and Networking


programs are only useful if they interact with the rest of the world in some way. This interaction is referred to as input/output, or I/O. Up until now, this book has concentrated on just one type of interaction: interaction with the user, through either a graphical user interface or a command-line interface. But the user is only one possible source of information and only one possible destination for information. We have already encountered one other type of input/output, since TextIO can read data from les and write data to les. However, Java has an input/output framework that provides much more power and exibility than does TextIO, and that covers other kinds of I/O in addition to les. Most importantly, it supports communication over network connections. In Java, input/output involving les and networks is based on streams, which are objects that support I/O commands that are similar to those that you have already used. In fact, standard output (System.out) and standard input (System.in) are examples of streams. Working with les and networks requires familiarity with exceptions, which were covered in Chapter 8. Many of the subroutines that are used can throw exceptions that require mandatory exception handling. This generally means calling the subroutine in a try..catch statement that can deal with the exception if one occurs.

Computer

11.1

Streams, Readers, and Writers

the ability to interact with the rest of the world, a program would be useless. The interaction of a program with the rest of the world is referred to as input/output or I/O. Historically, one of the hardest parts of programming language design has been coming up with good facilities for doing input and output. A computer can be connected to many dierent types of input and output devices. If a programming language had to deal with each type of device as a special case, the complexity would be overwhelming. One of the major achievements in the history of programming has been to come up with good abstractions for representing I/O devices. In Java, the main I/O abstractions are called streams. Other I/O abstractions, such as les and channels also exist, but in this section we will look only at streams. Every stream represents either a source of input or a destination to which output can be sent.

Without

11.1.1

Character and Byte Streams

When dealing with input/output, you have to keep in mind that there are two broad categories of data: machine-formatted data and human-readable data. Machine-formatted data is represented in binary form, the same way that data is represented inside the computer, that is, as 537

538

CHAPTER 11. FILES AND NETWORKING

strings of zeros and ones. Human-readable data is in the form of characters. When you read a number such as 3.141592654, you are reading a sequence of characters and interpreting them as a number. The same number would be represented in the computer as a bit-string that you would nd unrecognizable. To deal with the two broad categories of data representation, Java has two broad categories of streams: byte streams for machine-formatted data and character streams for humanreadable data. There are many predened classes that represent streams of each type. An object that outputs data to a byte stream belongs to one of the subclasses of the abstract class OutputStream. Objects that read data from a byte stream belong to subclasses of InputStream. If you write numbers to an OutputStream, you wont be able to read the resulting data yourself. But the data can be read back into the computer with an InputStream. The writing and reading of the data will be very ecient, since there is no translation involved: the bits that are used to represent the data inside the computer are simply copied to and from the streams. For reading and writing human-readable character data, the main classes are the abstract classes Reader and Writer. All character stream classes are subclasses of one of these. If a number is to be written to a Writer stream, the computer must translate it into a humanreadable sequence of characters that represents that number. Reading a number from a Reader stream into a numeric variable also involves a translation, from a character sequence into the appropriate bit string. (Even if the data you are working with consists of characters in the rst place, such as words from a text editor, there might still be some translation. Characters are stored in the computer as 16-bit Unicode values. For people who use Western alphabets, character data is generally stored in les in ASCII code, which uses only 8 bits per character. The Reader and Writer classes take care of this translation, and can also handle non-western alphabets in countries that use them.) Byte streams can be useful for direct machine-to-machine communication, and they can sometimes be useful for storing data in les, especially when large amounts of data need to be stored eciently, such as in large databases. However, binary data is fragile in the sense that its meaning is not self-evident. When faced with a long series of zeros and ones, you have to know what information it is meant to represent and how that information is encoded before you will be able to interpret it. Of course, the same is true to some extent for character data, which is itself coded into binary form. But the binary encoding of character data has been standardized and is well understood, and data expressed in character form can be made meaningful to human readers. The current trend seems to be towards increased use of character data, represented in a way that will make its meaning as self-evident as possible. Well look at how this is done in Section 11.6. I should note that the original version of Java did not have character streams, and that for ASCII-encoded character data, byte streams are largely interchangeable with character streams. In fact, the standard input and output streams, System.in and System.out, are byte streams rather than character streams. However, you should use Readers and Writers rather than InputStreams and OutputStreams when working with character data. The standard stream classes discussed in this section are dened in the package java.io, along with several supporting classes. You must import the classes from this package if you want to use them in your program. That means either importing individual classes or putting the directive import java.io.*; at the beginning of your source le. Streams are necessary for working with les and for doing communication over a network. They can be also used for communication between two concurrently running threads, and there are stream classes for

11.1. STREAMS, READERS, AND WRITERS

539

reading and writing data stored in the computers memory. The beauty of the stream abstraction is that it is as easy to write data to a le or to send data over a network as it is to print information on the screen.


The basic I/O classes Reader, Writer, InputStream, and OutputStream provide only very primitive I/O operations. For example, the InputStream class declares the instance method
public int read() throws IOException

for reading one byte of data, as a number in the range 0 to 255, from an input stream. If the end of the input stream is encountered, the read() method will return the value -1 instead. If some error occurs during the input attempt, an exception of type IOException is thrown. Since IOException is an exception class that requires mandatory exception-handling, this means that you cant use the read() method except inside a try statement or in a subroutine that is itself declared with a throws IOException clause. (Mandatory exception handling was covered in Subsection 8.3.4.) The InputStream class also denes methods for reading several bytes of data in one step into an array of bytes. However, InputStream provides no convenient methods for reading other types of data, such as int or double, from a stream. This is not a problem because youll never use an object of type InputStream itself. Instead, youll use subclasses of InputStream that add more convenient input methods to InputStreams rather primitive capabilities. Similarly, the OutputStream class denes a primitive output method for writing one byte of data to an output stream. The method is dened as:
public void write(int b) throws IOException

The parameter is of type int rather than byte, but the parameter value is type-cast to type byte before it is written; this eectively discards all but the eight low order bytes of b. Again, in practice, you will almost always use higher-level output operations dened in some subclass of OutputStream. The Reader and Writer classes provide identical low-level read and write methods. As in the byte stream classes, the parameter of the write(c) method in Writer and the return value of the read() method in Reader are of type int, but in these character-oriented classes, the I/O operations read and write characters rather than bytes. The return value of read() is -1 if the end of the input stream has been reached. Otherwise, the return value must be type-cast to type char to obtain the character that was read. In practice, you will use ordinarily use higher level I/O operations provided by sub-classes of Reader and Writer, as discussed below.

11.1.2

PrintWriter

One of the neat things about Javas I/O package is that it lets you add capabilities to a stream by wrapping it in another stream object that provides those capabilities. The wrapper object is also a stream, so you can read from or write to itbut you can do so using fancier operations than those available for basic streams. For example, PrintWriter is a subclass of Writer that provides convenient methods for outputting human-readable character representations of all of Javas basic data types. If you have an object belonging to the Writer class, or any of its subclasses, and you would like to use PrintWriter methods to output data to that Writer, all you have to do is wrap the Writer in a PrintWriter object. You do this by constructing a new PrintWriter object, using the Writer as input to the constructor. For example, if charSink is of type Writer, then you could say

540

CHAPTER 11. FILES AND NETWORKING


PrintWriter printableCharSink = new PrintWriter(charSink);

When you output data to printableCharSink, using the high-level output methods in PrintWriter, that data will go to exactly the same place as data written directly to charSink. Youve just provided a better interface to the same output stream. For example, this allows you to use PrintWriter methods to send data to a le or over a network connection. For the record, if out is a variable of type PrintWriter, then the following methods are dened: out.print(x) prints the value of x, represented in the form of a string of characters, to the output stream; x can be an expression of any type, including both primitive types and object types. An object is converted to string form using its toString() method. A null value is represented by the string null. out.println() outputs an end-of-line to the output stream. out.println(x) outputs the value of x, followed by an end-of-line; this is equivalent to out.print(x) followed by out.println(). out.printf(formatString, x1, x2, ...) does formated output of x1, x2, ... to the output stream. The rst parameter is a string that species the format of the output. There can be any number of additional parameters, of any type, but the types of the parameters must match the formatting directives in the format string. Formatted output for the standard output stream, System.out, was introduced in Subsection 2.4.4, and out.printf has the same functionality. Note that none of these methods will ever throw an IOException. Instead, the PrintWriter class includes the method
public boolean checkError()

which will return true if any error has been encountered while writing to the stream. The PrintWriter class catches any IOExceptions internally, and sets the value of an internal error ag if one occurs. The checkError() method can be used to check the error ag. This allows you to use PrintWriter methods without worrying about catching exceptions. On the other hand, to write a fully robust program, you should call checkError() to test for possible errors whenever you used a PrintWriter.

11.1.3

Data Streams

When you use a PrintWriter to output data to a stream, the data is converted into the sequence of characters that represents the data in human-readable form. Suppose you want to output the data in byte-oriented, machine-formatted form? The java.io package includes a bytestream class, DataOutputStream that can be used for writing data values to streams in internal, binary-number format. DataOutputStream bears the same relationship to OutputStream that PrintWriter bears to Writer. That is, whereas OutputStream only has methods for outputting bytes, DataOutputStream has methods writeDouble(double x) for outputting values of type double, writeInt(int x) for outputting values of type int, and so on. Furthermore, you can wrap any OutputStream in a DataOutputStream so that you can use the higher level output methods on it. For example, if byteSink is of type OutputStream, you could say
DataOutputStream dataSink = new DataOutputStream(byteSink);

11.1. STREAMS, READERS, AND WRITERS

541

to wrap byteSink in a DataOutputStream, dataSink. For input of machine-readable data, such as that created by writing to a DataOutputStream, java.io provides the class DataInputStream. You can wrap any InputStream in a DataInputStream object to provide it with the ability to read data of various types from the bytestream. The methods in the DataInputStream for reading binary data are called readDouble(), readInt(), and so on. Data written by a DataOutputStream is guaranteed to be in a format that can be read by a DataInputStream. This is true even if the data stream is created on one type of computer and read on another type of computer. The cross-platform compatibility of binary data is a major aspect of Javas platform independence. In some circumstances, you might need to read character data from an InputStream or write character data to an OutputStream. This is not a problem, since characters, like all data, are represented as binary numbers. However, for character data, it is convenient to use Reader and Writer instead of InputStream and OutputStream. To make this possible, you can wrap a byte stream in a character stream. If byteSource is a variable of type InputStream and byteSink is of type OutputStream, then the statements
Reader charSource = new InputStreamReader( byteSource ); Writer charSink = new OutputStreamWriter( byteSink );

create character streams that can be used to read character data from and write character data to the byte streams. In particular, the standard input stream System.in, which is of type InputStream for historical reasons, can be wrapped in a Reader to make it easier to read character data from standard input:
Reader charIn = new InputStreamReader( System.in );

As another application, the input and output streams that are associated with a network connection are byte streams rather than character streams, but the byte streams can be wrapped in character streams to make it easy to send and receive character data over the network. We will encounter network I/O in Section 11.4.

11.1.4

Reading Text

Still, the fact remains that much I/O is done in the form of human-readable characters. In view of this, it is surprising that Java does not provide a standard character input class that can read character data in a manner that is reasonably symmetrical with the character output capabilities of PrintWriter. There is one basic case that is easily handled by a standard class. The BueredReader class has a method
public String readLine() throws IOException

that reads one line of text from its input source. If the end of the stream has been reached, the return value is null. When a line of text is read, the end-of-line marker is read from the input stream, but it is not part of the string that is returned. Dierent input streams use dierent characters as end-of-line markers, but the readLine method can deal with all the common cases. (Traditionally, Unix computers, including Linux and Mac OS X, use a line feed character, \n, to mark an end of line; classic Macintosh used a carriage return character, \r; and Windows uses the two-character sequence \r\n. In general, modern computers can deal correctly with all of these possibilities.) Line-by-line processing is very common. Any Reader can be wrapped in a BueredReader to make it easy to read full lines of text. If reader is of type Reader, then a BueredReader wrapper can be created for reader with

542

CHAPTER 11. FILES AND NETWORKING


BufferedReader in = new BufferedReader( reader );

This can be combined with the InputStreamReader class that was mentioned above to read lines of text from an InputStream. For example, we can apply this to System.in:
BufferedReader in; // BufferedReader for reading from standard input. in = new BufferedReader( new InputStreamReader( System.in ) ); try { String line = in.readLine(); while ( line != null && line.length() > 0 ) { processOneLineOfInput( line ); line = in.readLine(); } } catch (IOException e) { }

This code segment reads and processes lines from standard input until either an empty line or an end-of-stream is encountered. (An end-of-stream is possible even for interactive input. For example, on at least some computers, typing a Control-D generates an end-of-stream on the standard input stream.) The try..catch statement is necessary because the readLine method can throw an exception of type IOException, which requires mandatory exception handling; an alternative to try..catch would be to declare that the method that contains the code throws IOException. Also, remember that BueredReader, InputStreamReader, and IOException must be imported from the package java.io.


Previously in this book, we have used the non-standard class TextIO for input both from users and from les. The advantage of TextIO is that it makes it fairly easy to read data values of any of the primitive types. Disadvantages include the fact that TextIO can only read from one le at a time, that it cant do I/O operations on network connections, and that it does not follow the same pattern as Javas built-in input/output classes. I have written a class named TextReader to x some of these disadvantages, while providing input capabilities similar to those of TextIO. Like TextIO, TextReader is a non-standard class, so you have to be careful to make it available to any program that uses it. The source code for the class can be found in the le TextReader.java Just as for many of Javas stream classes, an object of type TextReader can be used as a wrapper for an existing input stream, which becomes the source of the characters that will be read by the TextReader. (Unlike the standard classes, however, a TextReader is not itself a stream and cannot be wrapped inside other stream classes.) The constructors
public TextReader(Reader characterSource)

and
public TextReader(InputStream byteSource)

create objects that can be used to read human-readable data from the given Reader or InputStream using the convenient input methods of the TextReader class. In TextIO, the input methods were static members of the class. The input methods in the TextReader class are instance methods. The instance methods in a TextReader object read from the data source that was specied in the objects constructor. This makes it possible for several TextReader objects to exist at the same time, reading from dierent streams; as a result, TextReader can be used to read data from more than one le at the same time.

11.1. STREAMS, READERS, AND WRITERS

543

A TextReader object has essentially the same set of input methods as the TextIO class. One big dierence is how errors are handled. When a TextReader encounters an error in the input, it throws an exception of type IOException. This follows the standard pattern that is used by Javas standard input streams. IOExceptions require mandatory exception handling, so TextReader methods are generally called inside try..catch statements. If an IOException is thrown by the input stream that is wrapped inside a TextReader, that IOException is simply passed along. However, other types of errors can also occur. One such possible error is an attempt to read data from the input stream when there is no more data left in the stream. A TextReader throws an exception of type TextReader.EndOfStreamException when this happens. The exception class in this case is a nested class in the TextReader class; it is a subclass of IOException, so a try..catch statement that handles IOExceptions will also handle end-ofstream exceptions. However, having a class to represent end-of-stream errors makes it possible to detect such errors and provide special handling for them. Another type of error occurs when a TextReader tries to read a data value of a certain type, and the next item in the input stream is not of the correct type. In this case, the TextReader throws an exception of type TextReader.BadDataException, which is another subclass of IOException. For reference, here is a list of some of the more useful instance methods in the TextReader class. All of these methods can throw exceptions of type IOException: public char peek() looks ahead at the next character in the input stream, and returns that character. The character is not removed from the stream. If the next character is an end-of-line, the return value is \n. It is legal to call this method even if there is no more data left in the stream; in that case, the return value is the constant TextReader.EOF. (EOF stands for End-Of-File, a term that is more commonly used than End-OfStream, even though not all streams are les.) public boolean eoln() and public boolean eof() convenience methods for testing whether the next thing in the le is an end-of-line or an end-of-le. Note that these methods do not skip whitespace. If eof() is false, you know that there is still at least one character to be read, but there might not be any more non-blank characters in the stream. public void skipBlanks() and public void skipWhiteSpace() skip past whitespace characters in the input stream; skipWhiteSpace() skips all whitespace characters, including end-of-line while skipBlanks() only skips spaces and tabs. public String getln() reads characters up to the next end-of-line (or end-of-stream), and returns those characters in a string. The end-of-line marker is read but it not part of the returned string. This will throw an exception if there are no more characters in the stream. public char getAnyChar() reads and returns the next character from the stream. The character can be a whitespace character such as a blank or end-of-line. If this method is called after all the characters in the stream have been read, an exception is thrown. public int getlnInt(), public double getlnDouble(), public char getlnChar(), etc. skip any whitespace characters in the stream, including end-of-lines, then read a value of the specied type, which will be the return value of the method. Any remaining characters on the line are then discarded, including the end-of-line marker. There is a method for each primitive type. An exception occurs if its not possible to read a data value of the requested type. public int getInt(), public double getDouble(), public char getChar(), etc.

544

CHAPTER 11. FILES AND NETWORKING skip any whitespace characters in the stream, including end-of-lines, then read and return a value of the specied type. Extra characters on the line are not discarded and are still available to be read by subsequent input methods. There is a method for each primitive type. An exception occurs if its not possible to read a data value of the requested type.

11.1.5

The Scanner Class

Since its introduction, Java has been notable for its lack of built-in support for basic input, and for its reliance on fairly advanced techniques for the support that it does oer. (This is my opinion, at least.) The Scanner class was introduced in Java 5.0 to make it easier to read basic data types from a character input source. It does not (again, in my opinion) solve the problem completely, but it is a big improvement. The Scanner class is in the package java.util. Input routines are dened as instance methods in the Scanner class, so to use the class, you need to create a Scanner object. The constructor species the source of the characters that the Scanner will read. The scanner acts as a wrapper for the input source. The source can be a Reader, an InputStream, a String, or a File. (If a String is used as the input source, the Scanner will simply read the characters in the string from beginning to end, in the same way that it would process the same sequence of characters from a stream. The File class will be covered in the next section.) For example, you can use a Scanner to read from standard input by saying:
Scanner standardInputScanner = new Scanner( System.in );

and if charSource is of type Reader, you can create a Scanner for reading from charSource with:
Scanner scanner = new Scanner( charSource );

When processing input, a scanner usually works with tokens. A token is a meaningful string of characters that cannot, for the purposes at hand, be further broken down into smaller meaningful pieces. A token can, for example, be an individual word or a string of characters that represents a value of type double. In the case of a scanner, tokens must be separated by delimiters. By default, the delimiters are whitespace characters such as spaces and end-of-line markers. In normal processing, whitespace characters serve simply to separate tokens and are discarded by the scanner. A scanner has instance methods for reading tokens of various types. Suppose that scanner is an object of type Scanner. Then we have: scanner.next() reads the next token from the input source and returns it as a String. scanner.nextInt(), scanner.nextDouble(), and so on reads the next token from the input source and tries to convert it to a value of type int, double, and so on. There are methods for reading values of any of the primitive types. scanner.nextLine() reads an entire line from the input source, up to the next endof-line and returns the line as a value of type String. The end-of-line marker is read but is not part of the return value. Note that this method is not based on tokens. An entire line is read and returned, including any whitespace characters in the line. All of these methods can generate exceptions. If an attempt is made to read past the end of input, an exception of type NoSuchElementException is thrown. Methods such as scanner.getInt() will throw an exception of type InputMismatchException if the next token in the input does not represent a value of the requested type. The exceptions that can be generated do not require mandatory exception handling.

11.1. STREAMS, READERS, AND WRITERS

545

The Scanner class has very nice look-ahead capabilities. You can query a scanner to determine whether more tokens are available and whether the next token is of a given type. If scanner is of type Scanner : scanner.hasNext() returns a boolean value that is true if there is at least one more token in the input source. scanner.hasNextInt(), scanner.hasNextDouble(), and so on returns a boolean value that is true if there is at least one more token in the input source and that token represents a value of the requested type. scanner.hasNextLine() returns a boolean value that is true if there is at least one more line in the input source. Although the insistence on dening tokens only in terms of delimiters limits the usability of scanners to some extent, they are easy to use and are suitable for many applications.

11.1.6

Serialized Object I/O

The classes PrintWriter, TextReader, Scanner, DataInputStream, and DataOutputStream allow you to easily input and output all of Javas primitive data types. But what happens when you want to read and write objects? Traditionally, you would have to come up with some way of encoding your object as a sequence of data values belonging to the primitive types, which can then be output as bytes or characters. This is called serializing the object. On input, you have to read the serialized data and somehow reconstitute a copy of the original object. For complex objects, this can all be a major chore. However, you can get Java to do all the work for you by using the classes ObjectInputStream and ObjectOutputStream. These are subclasses of InputStream and Outputstream that can be used for writing and reading serialized objects. ObjectInputStream and ObjectOutputStream are wrapper classes that can be wrapped around arbitrary InputStreams and OutputStreams. This makes it possible to do object input and output on any byte stream. The methods for object I/O are readObject(), in ObjectInputStream, and writeObject(Object obj), in ObjectOutputStream. Both of these methods can throw IOExceptions. Note that readObject() returns a value of type Object, which generally has to be type-cast to a more useful type. ObjectOutputStream also has methods writeInt(), writeDouble(), and so on, for outputting primitive type values to the stream, and ObjectInputStream has corresponding methods for reading primitive type values. Object streams are byte streams. The objects are represented in binary, machine-readable form. This is good for eciency, but it does suer from the fragility that is often seen in binary data. They suer from the additional problem that the binary format of Java objects is very specic to Java, so the data in object streams is not easily available to programs written in other programming languages. For these reasons, object streams are appropriate mostly for short-term storage of objects and for transmitting objects over a network connection from one Java program to another. For long-term storage and for communication with non-Java programs, other approaches to object serialization are usually better. (See Subsection 11.6.2 for a character-based approach.) ObjectInputStream and ObjectOutputStream only work with objects that implement an interface named Serializable. Furthermore, all of the instance variables in the object must be serializable. However, there is little work involved in making an object serializable, since the Serializable interface does not declare any methods. It exists only as a marker for the compiler,

546

CHAPTER 11. FILES AND NETWORKING

to tell it that the object is meant to be writable and readable. You only need to add the words implements Serializable to your class denitions. Many of Javas standard classes are already declared to be serializable, including all the component classes and many other classes in Swing and in the AWT. One of the programming examples in Section 11.3 uses object IO.

11.2

Files

The data and programs in a computers main memory survive only as long as the power is
on. For more permanent storage, computers use les, which are collections of data stored on a hard disk, on a USB memory stick, on a CD-ROM, or on some other type of storage device. Files are organized into directories (sometimes called folders). A directory can hold other directories, as well as les. Both directories and les have names that are used to identify them. Programs can read data from existing les. They can create new les and can write data to les. In Java, such input and output can be done using streams. Human-readable character data is read from a le using an object belonging to the class FileReader, which is a subclass of Reader. Similarly, data is written to a le in human-readable format through an object of type FileWriter, a subclass of Writer. For les that store data in machine format, the appropriate I/O classes are FileInputStream and FileOutputStream. In this section, I will only discuss characteroriented le I/O using the FileReader and FileWriter classes. However, FileInputStream and FileOutputStream are used in an exactly parallel fashion. All these classes are dened in the java.io package. Its worth noting right at the start that applets which are downloaded over a network connection are not allowed to access les (unless you have made a very foolish change to your web browsers conguration). This is a security consideration. You can download and run an applet just by visiting a Web page with your browser. If downloaded applets had access to the les on your computer, it would be easy to write an applet that would destroy all the data on a computer that downloads it. To prevent such possibilities, there are a number of things that downloaded applets are not allowed to do. Accessing les is one of those forbidden things. Standalone programs written in Java, however, have the same access to your les as any other program. When you write a standalone Java application, you can use all the le operations described in this section.

11.2.1

Reading and Writing Files

The FileReader class has a constructor which takes the name of a le as a parameter and creates an input stream that can be used for reading from that le. This constructor will throw an exception of type FileNotFoundException if the le doesnt exist. It requires mandatory exception handling, so you have to call the constructor in a try..catch statement (or inside a routine that is declared to throw the exception). For example, suppose you have a le named data.txt, and you want your program to read data from that le. You could do the following to create an input stream for the le:
FileReader data; // (Declare the variable before the // try statement, or else the variable // is local to the try block and you wont // be able to use it later in the program.)

try { data = new FileReader("data.txt"); // create the stream

11.2. FILES
} catch (FileNotFoundException e) { ... // do something to handle the error---maybe, end the program }

547

The FileNotFoundException class is a subclass of IOException, so it would be acceptable to catch IOExceptions in the above try...catch statement. More generally, just about any error that can occur during input/output operations can be caught by a catch clause that handles IOException. Once you have successfully created a FileReader, you can start reading data from it. But since FileReaders have only the primitive input methods inherited from the basic Reader class, you will probably want to wrap your FileReader in a Scanner, in a TextReader, or in some other wrapper class. (The TextReader class is not a standard part of Java; it is described in Subsection 11.1.4. Scanner is discussed in Subsection 11.1.5.) To create a TextReader for reading from a le named data.dat, you could say:
TextReader data; try { data = new TextReader( new FileReader("data.dat") ); } catch (FileNotFoundException e) { ... // handle the exception }

Once you have a TextReader named data, you can read from it using such methods as data.getInt() and data.peek(), exactly as you would from any other TextReader. Working with output les is no more dicult than this. You simply create an object belonging to the class FileWriter. You will probably want to wrap this output stream in an object of type PrintWriter. For example, suppose you want to write data to a le named result.dat. Since the constructor for FileWriter can throw an exception of type IOException, you should use a try..catch statement:
PrintWriter result; try { result = new PrintWriter(new FileWriter("result.dat")); } catch (IOException e) { ... // handle the exception }

If no le named result.dat exists, a new le will be created. If the le already exists, then the current contents of the le will be erased and replaced with the data that your program writes to the le. This will be done without any warning. To avoid overwriting a le that already exists, you can check whether a le of the same name already exists before trying to create the stream, as discussed later in this section. An IOException might occur in the PrintWriter constructor if, for example, you are trying to create a le on a disk that is writeprotected, meaning that it cannot be modied. After you are nished using a le, its a good idea to close the le, to tell the operating system that you are nished using it. You can close a le by calling the close() method of the associated stream. Once a le has been closed, it is no longer possible to read data from it or write data to it, unless you open it again as a new stream. (Note that for most stream

548

CHAPTER 11. FILES AND NETWORKING

classes, the close() method can throw an IOException, which must be handled; however, both PrintWriter and TextReader override this method so that it cannot throw such exceptions.) If you forget to close a le, the le will ordinarily be closed automatically when the program terminates or when the le object is garbage collected, but in the case of an output le, some of the data that has been written to the le might be lost. This can occur because data that is written to a le can be buered ; that is, the data is not sent immediately to the le but is retained in main memory (in a buer) until a larger chunk of data is ready to be written. This is done for eciency. The close() method of an output stream will cause all the data in the buer to be sent to the le. Every output stream also has a flush() method that can be called to force any data in the buer to be written to the le without closing the le. As a complete example, here is a program that will read numbers from a le named data.dat, and will then write out the same numbers in reverse order to another le named result.dat. It is assumed that data.dat contains only one number on each line. Exceptionhandling is used to check for problems along the way. Although the application is not a particularly useful one, this program demonstrates the basics of working with les. (By the way, at the end of this program, youll nd our rst useful example of a finally clause in a try statement. When the computer executes a try statement, the commands in its finally clause are guaranteed to be executed, no matter what.)
import java.io.*; import java.util.ArrayList; /** * Reads numbers from a file named data.dat and writes them to a file * named result.dat in reverse order. The input file should contain * exactly one real number per line. */ public class ReverseFile { public static void main(String[] args) { TextReader data; PrintWriter result; // Character input stream for reading data. // Character output stream for writing data. // An ArrayList for holding the data.

ArrayList<Double> numbers;

numbers = new ArrayList<Double>(); try { // Create the input stream. data = new TextReader(new FileReader("data.dat")); } catch (FileNotFoundException e) { System.out.println("Cant find file data.dat!"); return; // End the program by returning from main(). } try { // Create the output stream. result = new PrintWriter(new FileWriter("result.dat")); } catch (IOException e) { System.out.println("Cant open file result.dat!"); System.out.println("Error: " + e); data.close(); // Close the input file. return; // End the program. }

11.2. FILES

549

try { // Read numbers from the input file, adding them to the ArrayList. while ( data.eof() == false ) { // Read until end-of-file. double inputNumber = data.getlnDouble(); numbers.add( inputNumber ); } // Output the numbers in reverse order. for (int i = numbers.size()-1; i >= 0; i--) result.println(numbers.get(i)); System.out.println("Done!"); } catch (IOException e) { // Some problem reading the data from the input file. System.out.println("Input Error: " + e.getMessage()); } finally { // Finish by closing the files, whatever else may have happened. data.close(); result.close(); } } // end of main()

} // end of class

11.2.2

Files and Directories

The subject of le names is actually more complicated than Ive let on so far. To fully specify a le, you have to give both the name of the le and the name of the directory where that le is located. A simple le name like data.dat or result.dat is taken to refer to a le in a directory that is called the current directory (also known as the default directory or working directory). The current directory is not a permanent thing. It can be changed by the user or by a program. Files not in the current directory must be referred to by a path name, which includes both the name of the le and information about the directory where it can be found. To complicate matters even further, there are two types of path names, absolute path names and relative path names. An absolute path name uniquely identies one le among all the les available to the computer. It contains full information about which directory the le is in and what the les name is. A relative path name tells the computer how to locate the le starting from the current directory. Unfortunately, the syntax for le names and path names varies somewhat from one type of computer to another. Here are some examples: data.dat on any computer, this would be a le named data.dat in the current directory. /home/eck/java/examples/data.dat This is an absolute path name in a UNIX operating system, including Linux and Mac OS X. It refers to a le named data.dat in a directory named examples, which is in turn in a directory named java, . . . .

550

CHAPTER 11. FILES AND NETWORKING C:\eck\java\examples\data.dat An absolute path name on a Windows computer. Hard Drive:java:examples:data.dat Assuming that Hard Drive is the name of a disk drive, this would be an absolute path name on a computer using a classic Macintosh operating system such as Mac OS 9. examples/data.dat a relative path name under UNIX. Examples is the name of a directory that is contained within the current directory, and data.data is a le in that directory. The corresponding relative path name for Windows would be examples\data.dat. ../examples/data.dat a relative path name in UNIX that means go to the directory that contains the current directory, then go into a directory named examples inside that directory, and look there for a le named data.data. In general, .. means go up one directory.

Its reasonably safe to say, though, that if you stick to using simple le names only, and if the les are stored in the same directory with the program that will use them, then you will be OK. Later in this section, well look at a convenient way of letting the user specify a le in a GUI program, which allows you to avoid the issue of path names altogether. It is possible for a Java program to nd out the absolute path names for two important directories, the current directory and the users home directory. The names of these directories are system properties, and they can be read using the function calls: System.getProperty("user.dir") returns the absolute path name of the current directory as a String. System.getProperty("user.home") returns the absolute path name of the users home directory as a String. To avoid some of the problems caused by dierences in path names between platforms, Java has the class java.io.File. An object belonging to this class represents a le. More precisely, an object of type File represents a le name rather than a le as such. The le to which the name refers might or might not exist. Directories are treated in the same way as les, so a File object can represent a directory just as easily as it can represent a le. A File object has a constructor, new File(String), that creates a File object from a path name. The name can be a simple name, a relative path, or an absolute path. For example, new File("data.dat") creates a File object that refers to a le named data.dat, in the current directory. Another constructor, new File(File,String), has two parameters. The rst is a File object that refers to the directory that contains the le. The second can be the name of the le or a relative path from the directory to the le. File objects contain several useful instance methods. Assuming that file is a variable of type File, here are some of the methods that are available: file.exists() This boolean-valued function returns true if the le named by the File object already exists. You can use this method if you want to avoid overwriting the contents of an existing le when you create a new FileWriter. file.isDirectory() This boolean-valued function returns true if the File object refers to a directory. It returns false if it refers to a regular le or if no le with the given name exists. file.delete() Deletes the le, if it exists. Returns a boolean value to indicate whether the le was successfully deleted.

11.2. FILES

551

file.list() If the File object refers to a directory, this function returns an array of type String[] containing the names of the les in that directory. Otherwise, it returns null. Here, for example, is a program that will list the names of all the les in a directory specied by the user. Just for fun, I have used a Scanner (Subsection 11.1.5) to read the users input:
import java.io.File; import java.util.Scanner; /** * This program lists the files in a directory specified by * the user. The user is asked to type in a directory name. * If the name entered by the user is not a directory, a * message is printed and the program ends. */ public class DirectoryList { public static void main(String[] args) { String directoryName; File directory; String[] files; Scanner scanner; // // // // Directory name entered by the user. File object referring to the directory. Array of file names in the directory. For reading a line of input from the user.

scanner = new Scanner(System.in); // scanner reads from standard input. System.out.print("Enter a directory name: "); directoryName = scanner.nextLine().trim(); directory = new File(directoryName); if (directory.isDirectory() == false) { if (directory.exists() == false) System.out.println("There is no such directory!"); else System.out.println("That file is not a directory."); } else { files = directory.list(); System.out.println("Files in directory \"" + directory + "\":"); for (int i = 0; i < files.length; i++) System.out.println(" " + files[i]); } } // end main() } // end class DirectoryList

All the classes that are used for reading data from les and writing data to les have constructors that take a File object as a parameter. For example, if file is a variable of type File, and you want to read character data from that le, you can create a FileReader to do so by saying new FileReader(file). If you want to use a TextReader to read from the le, you could say:

552
TextReader data;

CHAPTER 11. FILES AND NETWORKING

try { data = new TextReader( new FileReader(file) ); } catch (FileNotFoundException e) { ... // handle the exception }

11.2.3

File Dialog Boxes

In many programs, you want the user to be able to select the le that is going to be used for input or output. If your program lets the user type in the le name, you will just have to assume that the user understands how to work with les and directories. But in a graphical user interface, the user expects to be able to select les using a le dialog box , which is a window that a program can open when it wants the user to select a le for input or output. Swing includes a platform-independent technique for using le dialog boxes in the form of a class called JFileChooser. This class is part of the package javax.swing. We looked at using some basic dialog boxes in Subsection 6.8.2. File dialog boxes are similar to those, but are a little more complicated to use. A le dialog box shows the user a list of les and sub-directories in some directory, and makes it easy for the user to specify a le in that directory. The user can also navigate easily from one directory to another. The most common constructor for JFileChooser has no parameter and sets the starting directory in the dialog box to be the users home directory. There are also constructors that specify the starting directory explicitly:
new JFileChooser( File startDirectory ) new JFileChooser( String pathToStartDirectory )

Constructing a JFileChooser object does not make the dialog box appear on the screen. You have to call a method in the object to do that. There are two dierent methods that can be used because there are two types of le dialog: An open le dialog allows the user to specify an existing le to be opened for reading data into the program; a save le dialog lets the user specify a le, which might or might not already exist, to be opened for writing data from the program. File dialogs of these two types are opened using the showOpenDialog and showSaveDialog methods. These methods make the dialog box appear on the screen; the methods do not end until the user selects a le or cancels the dialog. A le dialog box always has a parent, another component which is associated with the dialog box. The parent is specied as a parameter to the showOpenDialog or showSaveDialog methods. The parent is a GUI component, and can often be specied as this in practice, since le dialogs are often used in instance methods of GUI component classes. (The parameter can also be null, in which case an invisible component is created to be used as the parent.) Both showOpenDialog and showSaveDialog have a return value, which will be one of the constants JFileChooser.CANCEL OPTION, JFileChooser.ERROR OPTION, or JFileChooser.APPROVE OPTION. If the return value is JFileChooser.APPROVE OPTION, then the user has selected a le. If the return value is something else, then the user did not select a le. The user might have clicked a Cancel button, for example. You should always check the return value, to make sure that the user has, in fact, selected a le. If that is the case, then you can nd out which le was selected by calling the JFileChooser s getSelectedFile() method, which returns an object of type File that represents the selected le.

11.2. FILES

553

Putting all this together, we can look at a typical subroutine that reads data from a le that is selected using a JFileChooser :
public void readFile() { if (fileDialog == null) // (fileDialog is an instance variable) fileDialog = new JFileChooser(); fileDialog.setDialogTitle("Select File for Reading"); fileDialog.setSelectedFile(null); // No file is initially selected. int option = fileDialog.showOpenDialog(this); // (Using "this" as a parameter to showOpenDialog() assumes that the // readFile() method is an instance method in a GUI component class.) if (option != JFileChooser.APPROVE OPTION) return; // User canceled or clicked the dialogs close box. File selectedFile = fileDialog.getSelectedFile(); TextReader in; // (or use some other wrapper class) try { FileReader stream = new FileReader(selectedFile); // (or a FileInputStream) in = new TextReader( stream ); } catch (Exception e) { JOptionPane.showMessageDialog(this, "Sorry, but an error occurred while trying to open the file:\n" + e); return; } try { . . // Read and process the data from the input stream, in. . in.close(); } catch (Exception e) { JOptionPane.showMessageDialog(this, "Sorry, but an error occurred while trying to read the data:\n" + e); } }

One ne point here is that the variable fileDialog is an instance variable of type JFileChooser. This allows the le dialog to continue to exist between calls to readFile(). The main eect of this is that the dialog box will keep the same selected directory from one call of readFile() to the next. When the dialog reappears, it will show the same directory that the user selected the previous time it appeared. This is probably what the user expects. Note that its common to do some conguration of a JFileChooser before calling showOpenDialog or showSaveDialog. For example, the instance method setDialogTitle(String) is used to specify a title to appear in the title bar of the window. And setSelectedFile(File) is used to set the le that is selected in the dialog box when it appears. This can be used to provide a default le choice for the user. In the readFile() method, above, fileDialog.setSelectedFile(null) species that no le is pre-selected when the dialog box appears. Writing data to a le is similar, but its a good idea to add a check to determine whether the output le that is selected by the user already exists. In that case, ask the user whether to replace the le. Here is a typical subroutine for writing to a user-selected le:

554

CHAPTER 11. FILES AND NETWORKING


public void writeFile() { if (fileDialog == null) fileDialog = new JFileChooser(); // (fileDialog is an instance variable) File selectedFile = new File("(default file name)"); fileDialog.setSelectedFile(selectedFile); // Specify a default file name. fileDialog.setDialogTitle("Select File for Writing"); int option = fileDialog.showSaveDialog(this); if (option != JFileChooser.APPROVE OPTION) return; // User canceled or clicked the dialogs close box. selectedFile = fileDialog.getSelectedFile(); if (selectedFile.exists()) { // Ask the user whether to replace the file. int response = JOptionPane.showConfirmDialog( this, "The file \"" + selectedFile.getName() + "\" already exists.\nDo you want to replace it?", "Confirm Save", JOptionPane.YES NO OPTION, JOptionPane.WARNING MESSAGE ); if (response != JOptionPane.YES OPTION) return; // User does not want to replace the file. } PrintWriter out; // (or use some other wrapper class) try { FileWriter stream = new FileWriter(selectedFile); // (or FileOutputStream) out = new PrintWriter( stream ); } catch (Exception e) { JOptionPane.showMessageDialog(this, "Sorry, but an error occurred while trying to open the file:\n" + e); return; } try { . . // Write data to the output stream, out. . out.close(); if (out.checkError()) // (need to check for errors in PrintWriter) throw new IOException("Error check failed."); } catch (Exception e) { JOptionPane.showMessageDialog(this, "Sorry, but an error occurred while trying to write the data:\n" + e); } }

The readFile() and writeFile() routines presented here can be used, with just a few changes, when you need to read or write a le in a GUI program. Well look at some more complete examples of using les and le dialogs in the next section.

11.3 In

Programming With Files

this section, we look at several programming examples that work with les, using the techniques that were introduced in Section 11.1 and Section 11.2.

11.3. PROGRAMMING WITH FILES

555

11.3.1

Copying a File

As a rst example, we look at a simple command-line program that can make a copy of a le. Copying a le is a pretty common operation, and every operating system already has a command for doing it. However, it is still instructive to look at a Java program that does the same thing. Many le operations are similar to copying a le, except that the data from the input le is processed in some way before it is written to the output le. All such operations can be done by programs with the same general form. Since the program should be able to copy any le, we cant assume that the data in the le is in human-readable form. So, we have to use InputStream and OutputStream to operate on the le rather than Reader and Writer. The program simply copies all the data from the InputStream to the OutputStream, one byte at a time. If source is the variable that refers to the InputStream, then the function source.read() can be used to read one byte. This function returns the value -1 when all the bytes in the input le have been read. Similarly, if copy refers to the OutputStream, then copy.write(b) writes one byte to the output le. So, the heart of the program is a simple while loop. As usual, the I/O operations can throw exceptions, so this must be done in a try..catch statement:
while(true) { int data = source.read(); if (data < 0) break; copy.write(data); }

The le-copy command in an operating system such as UNIX uses command line arguments to specify the names of the les. For example, the user might say copy original.dat backup.dat to copy an existing le, original.dat, to a le named backup.dat. Commandline arguments can also be used in Java programs. The command line arguments are stored in the array of strings, args, which is a parameter to the main() routine. The program can retrieve the command-line arguments from this array. (See Subsection 7.2.3.) For example, if the program is named CopyFile and if the user runs the program with the command java CopyFile work.dat oldwork.dat, then in the program, args[0] will be the string "work.dat" and args[1] will be the string "oldwork.dat". The value of args.length tells the program how many command-line arguments were specied by the user. My CopyFile program gets the names of the les from the command-line arguments. It prints an error message and exits if the le names are not specied. To add a little interest, there are two ways to use the program. The command line can simply specify the two le names. In that case, if the output le already exists, the program will print an error message and end. This is to make sure that the user wont accidently overwrite an important le. However, if the command line has three arguments, then the rst argument must be -f while the second and third arguments are le names. The -f is a command-line option, which is meant to modify the behavior of the program. The program interprets the -f to mean that its OK to overwrite an existing program. (The f stands for force, since it forces the le to be copied in spite of what would otherwise have been considered an error.) You can see in the source code how the command line arguments are interpreted by the program:
import java.io.*; /** * Makes a copy of a file. The original file and the name of the

556

CHAPTER 11. FILES AND NETWORKING


* copy must be given as command-line arguments. In addition, the * first command-line argument can be "-f"; if present, the program * will overwrite an existing file; if not, the program will report * an error and end if the output file already exists. The number * of bytes that are copied is reported. */ public class CopyFile { public static void main(String[] args) { String sourceName; // Name of the source file, // as specified on the command line. String copyName; // Name of the copy, // as specified on the command line. InputStream source; // Stream for reading from the source file. OutputStream copy; // Stream for writing the copy. boolean force; // This is set to true if the "-f" option // is specified on the command line. int byteCount; // Number of bytes copied from the source file. /* Get file names from the command line and check for the presence of the -f option. If the command line is not one of the two possible legal forms, print an error message and end this program. */ if (args.length == 3 && args[0].equalsIgnoreCase("-f")) { sourceName = args[1]; copyName = args[2]; force = true; } else if (args.length == 2) { sourceName = args[0]; copyName = args[1]; force = false; } else { System.out.println( "Usage: java CopyFile <source-file> <copy-name>"); System.out.println( " or java CopyFile -f <source-file> <copy-name>"); return; } /* Create the input stream. If an error occurs, end the program. */

try { source = new FileInputStream(sourceName); } catch (FileNotFoundException e) { System.out.println("Cant find file \"" + sourceName + "\"."); return; } /* If the output file already exists and the -f option was not specified, print an error message and end the program. */ File file = new File(copyName);

11.3. PROGRAMMING WITH FILES


if (file.exists() && force == false) { System.out.println( "Output file exists. Use the -f option to replace it."); return; } /* Create the output stream. If an error occurs, end the program. */

557

try { copy = new FileOutputStream(copyName); } catch (IOException e) { System.out.println("Cant open output file \"" + copyName + "\"."); return; } /* Copy one byte at a time from the input stream to the output stream, ending when the read() method returns -1 (which is the signal that the end of the stream has been reached). If any error occurs, print an error message. Also print a message if the file has been copied successfully. */ byteCount = 0; try { while (true) { int data = source.read(); if (data < 0) break; copy.write(data); byteCount++; } source.close(); copy.close(); System.out.println("Successfully copied " + byteCount + " bytes."); } catch (Exception e) { System.out.println("Error occurred while copying. " + byteCount + " bytes copied."); System.out.println("Error: " + e); } } // end main()

} // end class CopyFile

11.3.2

Persistent Data

Once a program ends, any data that was stored in variables and objects in the program is gone. In many cases, it would be useful to have some of that data stick around so that it will be available when the program is run again. The problem is, how to make the data persistent between runs of the program? The answer, of course, is to store the data in a le (or, for some applications, in a databasebut the data in a database is itself stored in les). Consider a phone book program that allows the user to keep track of a list of names and associated phone numbers. The program would make no sense at all if the user had to create

558

CHAPTER 11. FILES AND NETWORKING

the whole list from scratch each time the program is run. It would make more sense to think of the phone book as a persistent collection of data, and to think of the program as an interface to that collection of data. The program would allow the user to look up names in the phone book and to add new entries. Any changes that are made should be preserved after the program ends. The sample program PhoneDirectoryFileDemo.java is a very simple implementation of this idea. It is meant only as an example of le use; the phone book that it implements is a toy version that is not meant to be taken seriously. This program stores the phone book data in a le named .phone book demo in the users home directory. To nd the users home directory, it uses the System.getProperty() method that was mentioned in Subsection 11.2.2. When the program starts, it checks whether the le already exists. If it does, it should contain the users phone book, which was saved in a previous run of the program, so the data from the le is read and entered into a TreeMap named phoneBook that represents the phone book while the program is running. (See Subsection 10.3.1.) In order to store the phone book in a le, some decision must be made about how the data in the phone book will be represented. For this example, I chose a simple representation in which each line of the le contains one entry consisting of a name and the associated phone number. A percent sign (%) separates the name from the number. The following code at the beginning of the program will read the phone book data le, if it exists and has the correct format:
File userHomeDirectory = new File( System.getProperty("user.home") ); File dataFile = new File( userHomeDirectory, ".phone book data" ); if ( ! dataFile.exists() ) { System.out.println("No phone book data file found."); System.out.println("A new one will be created."); System.out.println("File name: " + dataFile.getAbsolutePath()); } else { System.out.println("Reading phone book data..."); try { Scanner scanner = new Scanner( dataFile ); while (scanner.hasNextLine()) { // Read one line from the file, containing one name/number pair. String phoneEntry = scanner.nextLine(); int separatorPosition = phoneEntry.indexOf(%); if (separatorPosition == -1) throw new IOException("File is not a phonebook data file."); name = phoneEntry.substring(0, separatorPosition); number = phoneEntry.substring(separatorPosition+1); phoneBook.put(name,number); } } catch (IOException e) { System.out.println("Error in phone book data file."); System.out.println("File name: " + dataFile.getAbsolutePath()); System.out.println("This program cannot continue."); System.exit(1); } }

11.3. PROGRAMMING WITH FILES

559

The program then lets the user do various things with the phone book, including making modications. Any changes that are made are made only to the TreeMap that holds the data. When the program ends, the phone book data is written to the le (if any changes have been made while the program was running), using the following code:
if (changed) { System.out.println("Saving phone directory changes to file " + dataFile.getAbsolutePath() + " ..."); PrintWriter out; try { out = new PrintWriter( new FileWriter(dataFile) ); } catch (IOException e) { System.out.println("ERROR: Cant open data file for output."); return; } for ( Map.Entry<String,String> entry : phoneBook.entrySet() ) out.println(entry.getKey() + "%" + entry.getValue() ); out.close(); if (out.checkError()) System.out.println("ERROR: Some error occurred while writing data file."); else System.out.println("Done."); }

The net eect of this is that all the data, including the changes, will be there the next time the program is run. Ive shown you all the le-handling code from the program. If you would like to see the rest of the program, including an example of using a Scanner to read integer-valued responses from the user, see the source code le, PhoneDirectoryFileDemo.java.

11.3.3

Files in GUI Programs

The previous examples in this section use a command-line interface, but graphical user interface programs can also manipulate les. Programs typically have an Open command that reads the data from a le and displays it in a window and a Save command that writes the data from the window into a le. We can illustrate this in Java with a simple text editor program, TrivialEdit.java. The window for this program uses a JTextArea component to display some text that the user can edit. It also has a menu bar, with a File menu that includes Open and Save commands. These commands are implemented using the techniques for reading and writing les that were covered in Section 11.2. When the user selects the Open command from the File menu in the TrivialEdit program, the program pops up a le dialog box where the user species the le. It is assumed that the le is a text le. A limit of 10000 characters is put on the size of the le, since a JTextArea is not meant for editing large amounts of text. The program reads the text contained in the specied le, and sets that text to be the content of the JTextArea. In this case, I decided to use a BueredReader to read the le line-by-line. The program also sets the title bar of the window to show the name of the le that was opened. All this is done in the following method, which is just a variation of the readFile() method presented in Section 11.2:
/** * Carry out the Open command by letting the user specify a file to be opened * and reading up to 10000 characters from that file. If the file is read

560

CHAPTER 11. FILES AND NETWORKING


* successfully and is not too long, then the text from the file replaces the * text in the JTextArea. */ public void doOpen() { if (fileDialog == null) fileDialog = new JFileChooser(); fileDialog.setDialogTitle("Select File to be Opened"); fileDialog.setSelectedFile(null); // No file is initially selected. int option = fileDialog.showOpenDialog(this); if (option != JFileChooser.APPROVE OPTION) return; // User canceled or clicked the dialogs close box. File selectedFile = fileDialog.getSelectedFile(); BufferedReader in; try { FileReader stream = new FileReader(selectedFile); in = new BufferedReader( stream ); } catch (Exception e) { JOptionPane.showMessageDialog(this, "Sorry, but an error occurred while trying to open the file:\n" + e); return; } try { String input = ""; while (true) { String lineFromFile = in.readLine(); if (lineFromFile == null) break; // End-of-file has been reached. input = input + lineFromFile + \n; if (input.length() > 10000) throw new IOException("Input file is too large for this program."); } in.close(); text.setText(input); editFile = selectedFile; setTitle("TrivialEdit: " + editFile.getName()); } catch (Exception e) { JOptionPane.showMessageDialog(this, "Sorry, but an error occurred while trying to read the data:\n" + e); } }

In this program, the instance variable editFile is used to keep track of the le that is currently being edited, if any, and the setTitle() method (from class JFrame) is used to set the title of the window to show the name of the le. Similarly, the response to the Save command is a minor variation on the writeFile() method from Section 11.2. I will not repeat it here. If you would like to see the entire program, you will nd the source code in the le TrivialEdit.java.

11.3. PROGRAMMING WITH FILES

561

11.3.4

Storing Objects in Files

Whenever data is stored in les, some denite format must be adopted for representing the data. As long as the output routine that writes the data and the input routine that reads the data use the same format, the les will be usable. However, as usual, correctness is not the end of the story. The representation that is used for data in les should also be robust. (See Section 8.1) To see what this means, we will look at several dierent ways of representing the same data. This example builds on the example SimplePaint2.java from Subsection 7.3.4. In that program, the user could use the mouse to draw simple sketches. Now, we will add le input/output capabilities to that program. This will allow the user to save a sketch to a le and later read the sketch back from the le into the program so that the user can continue to work on the sketch. The basic requirement is that all relevant data about the sketch must be saved in the le, so that the sketch can be exactly restored when the le is read by the program. The new version of the program can be found in the source code le SimplePaintWithFiles.java. A File menu has been added to the new version. It contains two sets of Save/Open commands, one for saving and reloading sketch data in text form and one for data in binary form. We will consider both possibilities here, in some detail. The data for a sketch consists of the background color of the picture and a list of the curves that were drawn by the user. A curve consists of a list of Points. (Point is a standard class in package java.awt; a Point pt has instance variables x and y of type int that represent the coordinates of a point on the xy-plane.) Each curve can be a dierent color. Furthermore, a curve can be symmetric, which means that in addition to the curve itself, the horizontal and vertical reections of the curve are also drawn.) The data for each curve is stored in an object of type CurveData, which is dened in the program as:
/** * An object of type CurveData represents the data required to redraw one * of the curves that have been sketched by the user. */ private static class CurveData implements Serializable { Color color; // The color of the curve. boolean symmetric; // Are horizontal and vertical reflections also drawn? ArrayList<Point> points; // The points on the curve. }

Note that this class has been declared to implement Serializable. This allows objects of type CurveData to be written in binary form to an ObjectOutputStream. See Subsection 11.1.6. Lets think about how the data for a sketch could be saved to an ObjectOuputStream. The sketch is displayed on the screen in an object of type SimplePaintPanel, which is a subclass of JPanel. All the data needed for the sketch is stored in instance variables of that object. One possibility would be to simply write the entire SimplePaintPanel component as a single object to the stream. The could be done in a method in the SimplePaintPanel class with the statement
outputStream.writeObject(this);

where outputStream is the ObjectOutputStream and this refers to the SimplePaintPanel itself. This statement saves the entire current state of the panel. To read the data back into the program, you would create an ObjectInputStream for reading the object from the le, and you would retrieve the object from the le with the statement
SimplePaintPanel newPanel = (SimplePaintPanel)in.readObject();

562

CHAPTER 11. FILES AND NETWORKING

where in is the ObjectInputStream. Note that the type-cast is necessary because the method in.readObject() returns a value of type Object. (To get the saved sketch to appear on the screen, the newPanel must replace the current content pane in the programs window; furthermore, the menu bar of the window must be replaced, because the menus are associated with a particular SimplePaintPanel object.) It might look tempting to be able to save data and restore it with a single command, but in this case, its not a good idea. The main problem with doing things this way is that the serialized form of objects that represent Swing components can change from one version of Java to the next. This means that data les that contain serialized components such as a SimplePaintPanel might become unusable in the future, and the data that they contain will be eectively lost. This is an important consideration for any serious application. Taking this into consideration, my program uses a dierent format when it creates a binary le. The data written to the le consists of (1) the background color of the sketch, (2) the number of curves in the sketch, and (3) all the CurveData objects that describe the individual curves. The method that saves the data is similar to the writeFile() method from Subsection 11.2.3. Here is the complete doSaveAsBinary() from SimplePaintWithFiles, with the changes from the generic readFile() method shown in italic:
/** * Save the users sketch to a file in binary form as serialized * objects, using an ObjectOutputStream. Files created by this method * can be read back into the program using the doOpenAsBinary() method. */ private void doSaveAsBinary() { if (fileDialog == null) fileDialog = new JFileChooser(); File selectedFile; //Initially selected file name in the dialog. if (editFile == null) selectedFile = new File("sketchData.binary"); else selectedFile = new File(editFile.getName()); fileDialog.setSelectedFile(selectedFile); fileDialog.setDialogTitle("Select File to be Saved"); int option = fileDialog.showSaveDialog(this); if (option != JFileChooser.APPROVE OPTION) return; // User canceled or clicked the dialogs close box. selectedFile = fileDialog.getSelectedFile(); if (selectedFile.exists()) { // Ask the user whether to replace the file. int response = JOptionPane.showConfirmDialog( this, "The file \"" + selectedFile.getName() + "\" already exists.\nDo you want to replace it?", "Confirm Save", JOptionPane.YES NO OPTION, JOptionPane.WARNING MESSAGE ); if (response != JOptionPane.YES OPTION) return; // User does not want to replace the file. } ObjectOutputStream out; try { FileOutputStream stream = new FileOutputStream(selectedFile); out = new ObjectOutputStream( stream ); }

11.3. PROGRAMMING WITH FILES

563

catch (Exception e) { JOptionPane.showMessageDialog(this, "Sorry, but an error occurred while trying to open the file:\n" + e); return; } try { out.writeObject(getBackground()); out.writeInt(curves.size()); for ( CurveData curve : curves ) out.writeObject(curve); out.close(); editFile = selectedFile; setTitle("SimplePaint: " + editFile.getName()); } catch (Exception e) { JOptionPane.showMessageDialog(this, "Sorry, but an error occurred while trying to write the text:\n" + e); } }

The heart of this method consists of the following lines, which do the actual writing of the data to the le:
out.writeObject(getBackground()); out.writeInt(curves.size()); for ( CurveData curve : curves ) out.writeObject(curve); // Writes the panels background color. // Writes the number of curves. // For each curve... // write the corresponding CurveData object.

The doOpenAsBinary() method, which is responsible for reading sketch data back into the program from an ObjectInputStream, has to read exactly the same data that was written, in the same order, and use that data to build the data structures that will represent the sketch while the program is running. Once the data structures have been successfully built, they replace the data structures that describe the previous contents of the panel. This is done as follows:
/* Read data from the file into local variables */ Color newBackgroundColor = (Color)in.readObject(); int curveCount = in.readInt(); ArrayList<CurveData> newCurves = new ArrayList<CurveData>(); for (int i = 0; i < curveCount; i++) newCurves.add( (CurveData)in.readObject() ); in.close(); /* Copy the data that was read into the instance variables that describe the sketch that is displayed by the program.*/ curves = newCurves; setBackground(newBackgroundColor); repaint();

This is only a little harder than saving the entire SimplePaintPanel component to the le in one step, and it is more robust since the serialized form of the objects that are saved to le is unlikely to change in the future. But it still suers from the general fragility of binary data.

564

CHAPTER 11. FILES AND NETWORKING

An alternative to using object streams is to save the data in human-readable, character form. The basic idea is the same: All the data necessary to reconstitute a sketch must be saved to the output le in some denite format. The method that reads the le must follow exactly the same format as it reads the data, and it must use the data to rebuild the data structures that represent the sketch while the program is running. When writing character data, we cant write out entire objects in one step. All the data has to be expressed, ultimately, in terms of simple data values such as strings and primitive type values. A color, for example, can be expressed in terms of three integers giving the red, green, and blue components of the color. The rst (not very good) idea that comes to mind might be to just dump all the necessary data, in some denite order, into the le. Suppose that out is a PrintWriter that is used to write to the le. We could then say:
Color bgColor = getBackground(); out.println( bgColor.getRed() ); out.println( bgColor.getGreen() ); out.println( bgColor.getBlue() ); out.println( curves.size() ); // Write the background color to the file.

// Write the number of curves.

for ( CurveData curve : curves ) { // For each curve, write... out.println( curve.color.getRed() ); // the color of the curve out.println( curve.color.getGreen() ); out.println( curve.color.getBlue() ); out.println( curve.symmetric ? 0 : 1 ); // the curves symmetry property out.println( curve.points.size() ); // the number of points on curve for ( Point pt : curve.points ) { // the coordinates of each point out.println( pt.x ); out.println( pt.y ); } }

This works in the sense that the le-reading method can read the data and rebuild the data structures. Suppose that the input method uses a Scanner named scanner to read the data le (see Subsection 11.1.5). Then it could say:
Color newBackgroundColor; // Read the background Color. int red = scanner.nextInt(); int green = scanner.nextInt(); int blue = scanner.nextInt(); newBackgroundColor = new Color(red,green,blue); ArrayList<CurveData> newCurves = new ArrayList<CurveData>(); int curveCount = scanner.nextInt(); // The number of curves to be read. for (int i = 0; i < curveCount; i++) { CurveData curve = new CurveData(); int r = scanner.nextInt(); // Read the curves color. int g = scanner.nextInt(); int b = scanner.nextInt(); curve.color = new Color(r,g,b); int symmetryCode = scanner.nextInt(); // Read the curves symmetry property. curve.symmetric = (symmetryCode == 1); curveData.points = new ArrayList<Point>(); int pointCount = scanner.nextInt(); // The number of points on this curve. for (int j = 0; j < pointCount; j++) {

11.3. PROGRAMMING WITH FILES

565

int x = scanner.nextInt(); // Read the coordinates of the point. int y = scanner.nextInt(); curveData.points.add(new Point(x,y)); } newCurves.add(curve); } curves = newCurves; setBackground(newBackgroundColor); // Install the new data structures.

Note how every piece of data that was written by the output method is read, in the same order, by the input method. While this does work, the data le is just a long string of numbers. It doesnt make much more sense to a human reader than a binary-format le would. Furthermore, it is still fragile in the sense that any small change made to the data representation in the program, such as adding a new property to curves, will render the data le useless (unless you happen to remember exactly which version of the program created the le). So, I decided to use a more complex, more meaningful data format for the text les created by my program. Instead of just writing numbers, I add words to say what the numbers mean. Here is a short but complete data le for the program; just by looking at it, you can probably tell what is going on:
SimplePaintWithFiles 1.0 background 110 110 180 startcurve color 255 255 255 symmetry true coords 10 10 coords 200 250 coords 300 10 endcurve startcurve color 0 255 255 symmetry false coords 10 400 coords 590 400 endcurve

The rst line of the le identies the program that created the data le; when the user selects a le to be opened, the program can check the rst word in the le as a simple test to make sure the the le is of the correct type. The rst line also contains a version number, 1.0. If the le format changes in a later version of the program, a higher version number would be used; if the program sees a version number of 1.2 in a le, but the program only understands version 1.0, the program can explain to the user that a newer version of the program is needed to read the data le. The second line of the le species the background color of the picture. The three integers specify the red, green, and blue components of the color. The word background at the beginning of the line makes the meaning clear. The remainder of the le consists of data for the curves that appear in the picture. The data for each curve is clearly marked with startcurve and endcurve. The data consists of the color and symmetry properties of the curve and the xy-coordinates of each point on the curve. Again, the meaning is clear. Files in this format can easily be created or edited by hand. In fact, the data le shown above was actually created in

566

CHAPTER 11. FILES AND NETWORKING

a text editor rather than by the program. Furthermore, its easy to extend the format to allow for additional options. Future versions of the program could add a thickness property to the curves to make it possible to have curves that are more than one pixel wide. Shapes such as rectangles and ovals could easily be added. Outputting data in this format is easy. Suppose that out is a PrintWriter that is being used to write the sketch data to a le. Then the output can be done with:
out.println("SimplePaintWithFiles 1.0"); // Version number. Color bgColor = getBackground(); out.println( "background " + bgColor.getRed() + " " + bgColor.getGreen() + " " + bgColor.getBlue() ); for ( CurveData curve : curves ) { out.println(); out.println("startcurve"); out.println(" color " + curve.color.getRed() + " " + curve.color.getGreen() + " " + curve.color.getBlue() ); out.println( " symmetry " + curve.symmetric ); for ( Point pt : curve.points ) out.println( " coords " + pt.x + " " + pt.y ); out.println("endcurve"); }

Reading the data is somewhat harder, since the input routine has to deal with all the extra words in the data. In my input routine, I decided to allow some variation in the order in which the data occurs in the le. For example, the background color can be specied at the end of the le, instead of at the beginning. It can even be left out altogether, in which case white will be used as the default background color. This is possible because each item of data is labeled with a word that describes its meaning; the labels can be used to drive the processing of the input. Here is the complete method from SimplePaintWithFiles.java that reads data les in text format. It uses a Scanner to read items from the le:
private void doOpenAsText() { if (fileDialog == null) fileDialog = new JFileChooser(); fileDialog.setDialogTitle("Select File to be Opened"); fileDialog.setSelectedFile(null); // No file is initially selected. int option = fileDialog.showOpenDialog(this); if (option != JFileChooser.APPROVE OPTION) return; // User canceled or clicked the dialogs close box. File selectedFile = fileDialog.getSelectedFile(); Scanner scanner; // For reading from the data file. try { Reader stream = new BufferedReader(new FileReader(selectedFile)); scanner = new Scanner( stream ); } catch (Exception e) { JOptionPane.showMessageDialog(this, "Sorry, but an error occurred while trying to open the file:\n" + e); return; } try { // Read the contents of the file. String programName = scanner.next();

11.3. PROGRAMMING WITH FILES

567

if ( ! programName.equals("SimplePaintWithFiles") ) throw new IOException("File is not a SimplePaintWithFiles data file."); double version = scanner.nextDouble(); if (version > 1.0) throw new IOException("File requires newer version of this program."); Color newBackgroundColor = Color.WHITE; ArrayList<CurveData> newCurves = new ArrayList<CurveData>(); while (scanner.hasNext()) { String itemName = scanner.next(); if (itemName.equalsIgnoreCase("background")) { int red = scanner.nextInt(); int green = scanner.nextInt(); int blue = scanner.nextInt(); newBackgroundColor = new Color(red,green,blue); } else if (itemName.equalsIgnoreCase("startcurve")) { CurveData curve = new CurveData(); curve.color = Color.BLACK; curve.symmetric = false; curve.points = new ArrayList<Point>(); itemName = scanner.next(); while ( ! itemName.equalsIgnoreCase("endcurve") ) { if (itemName.equalsIgnoreCase("color")) { int r = scanner.nextInt(); int g = scanner.nextInt(); int b = scanner.nextInt(); curve.color = new Color(r,g,b); } else if (itemName.equalsIgnoreCase("symmetry")) { curve.symmetric = scanner.nextBoolean(); } else if (itemName.equalsIgnoreCase("coords")) { int x = scanner.nextInt(); int y = scanner.nextInt(); curve.points.add( new Point(x,y) ); } else { throw new Exception("Unknown term in input."); } itemName = scanner.next(); } newCurves.add(curve); } else { throw new Exception("Unknown term in input."); } } scanner.close(); setBackground(newBackgroundColor); // Install the new picture data. curves = newCurves; repaint(); editFile = selectedFile; setTitle("SimplePaint: " + editFile.getName());

568

CHAPTER 11. FILES AND NETWORKING


} catch (Exception e) { JOptionPane.showMessageDialog(this, "Sorry, but an error occurred while trying to read the data:\n" + e); } }

The main reason for this long discussion of le formats has been to get you to think about the problem of representing complex data in a form suitable for storing the data in a le. The same problem arises when data must be transmitted over a network. There is no one correct solution to the problem, but some solutions are certainly better than others. In Section 11.6, we will look at one solution to the data representation problem that has become increasingly common.


In addition to being able to save sketch data in both text form and binary form, SimplePaintWithFiles can also save the picture itself as an image le that could be, for example, printed or put on a web page. This is a preview of image-handling techniques that will be covered in Chapter 12.

11.4

Networking

As far as a program is concerned, a network is just another possible source of input data,
and another place where data can be output. That does oversimplify things, because networks are not as easy to work with as les are. But in Java, you can do network communication using input streams and output streams, just as you can use such streams to communicate with the user or to work with les. Nevertheless, opening a network connection between two computers is a bit tricky, since there are two computers involved and they have to somehow agree to open a connection. And when each computer can send data to the other, synchronizing communication can be a problem. But the fundamentals are the same as for other forms of I/O. One of the standard Java packages is called java.net. This package includes several classes that can be used for networking. Two dierent styles of network I/O are supported. One of these, which is fairly high-level, is based on the World-Wide Web, and provides the sort of network communication capability that is used by a Web browser when it downloads pages for you to view. The main classes for this style of networking are java.net.URL and java.net.URLConnection. An object of type URL is an abstract representation of a Universal Resource Locator , which is an address for an HTML document or other resource on the Web. A URLConnection represents a network connection to such a resource. The second style of I/O, which is more general and much more important, views the network at a lower level. It is based on the idea of a socket. A socket is used by a program to establish a connection with another program on a network. Communication over a network involves two sockets, one on each of the computers involved in the communication. Java uses a class called java.net.Socket to represent sockets that are used for network communication. The term socket presumably comes from an image of physically plugging a wire into a computer to establish a connection to a network, but it is important to understand that a socket, as the term is used here, is simply an object belonging to the class Socket. In particular, a program can have several sockets at the same time, each connecting it to another program running on some other computer on the network. All these connections use the same physical network connection.

11.4. NETWORKING

569

This section gives a brief introduction to these basic networking classes, and shows how they relate to input and output streams.

11.4.1

URLs and URLConnections

The URL class is used to represent resources on the World-Wide Web. Every resource has an address, which identies it uniquely and contains enough information for a Web browser to nd the resource on the network and retrieve it. The address is called a url or universal resource locator. An object belonging to the URL class represents such an address. Once you have a URL object, you can use it to open a URLConnection to the resource at that address. A url is ordinarily specied as a string, such as http://math.hws.edu/eck/index.html. There are also relative urls. A relative url species the location of a resource relative to the location of another url, which is called the base or context for the relative url. For example, if the context is given by the url http://math.hws.edu/eck/, then the incomplete, relative url index.html would really refer to http://math.hws.edu/eck/index.html. An object of the class URL is not simply a string, but it can be constructed from a string representation of a url. A URL object can also be constructed from another URL object, representing a context, and a string that species a url relative to that context. These constructors have prototypes
public URL(String urlName) throws MalformedURLException

and
public URL(URL context, String relativeName) throws MalformedURLException

Note that these constructors will throw an exception of type MalformedURLException if the specied strings dont represent legal urls. The MalformedURLException class is a subclass of IOException, and it requires mandatory exception handling. That is, you must call the constructor inside a try..catch statement that handles the exception or in a subroutine that is declared to throw the exception. The second constructor is especially convenient when writing applets. In an applet, two methods are available that provide useful URL contexts. The method getDocumentBase(), dened in the Applet and JApplet classes, returns an object of type URL. This URL represents the location from which the HTML page that contains the applet was downloaded. This allows the applet to go back and retrieve other les that are stored in the same location as that document. For example,
URL url = new URL(getDocumentBase(), "data.txt");

constructs a URL that refers to a le named data.txt on the same computer and in the same directory as the source le for the web page on which the applet is running. Another method, getCodeBase(), returns a URL that gives the location of the applet class le (which is not necessarily the same as the location of the document). Once you have a valid URL object, you can call its openConnection() method to set up a connection. This method returns a URLConnection. The URLConnection object can, in turn, be used to create an InputStream for reading data from the resource represented by the URL. This is done by calling its getInputStream() method. For example:
URL url = new URL(urlAddressString); URLConnection connection = url.openConnection(); InputStream in = connection.getInputStream();

570

CHAPTER 11. FILES AND NETWORKING

The openConnection() and getInputStream() methods can both throw exceptions of type IOException. Once the InputStream has been created, you can read from it in the usual way, including wrapping it in another input stream type, such as TextReader, or using a Scanner. Reading from the stream can, of course, generate exceptions. One of the other useful instance methods in the URLConnection class is getContentType(), which returns a String that describes the type of information available from the URL. The return value can be null if the type of information is not yet known or if it is not possible to determine the type. The type might not be available until after the input stream has been created, so you should generally call getContentType() after getInputStream(). The string returned by getContentType() is in a format called a mime type. Mime types include text/plain, text/html, image/jpeg, image/gif, and many others. All mime types contain two parts: a general type, such as text or image, and a more specic type within that general category, such as html or gif. If you are only interested in text data, for example, you can check whether the string returned by getContentType() starts with text. (Mime types were rst introduced to describe the content of email messages. The name stands for Multipurpose Internet Mail Extensions. They are now used almost universally to specify the type of information in a le or other resource.) Lets look at a short example that uses all this to read the data from a URL. This subroutine opens a connection to a specied URL, checks that the type of data at the URL is text, and then copies the text onto the screen. Many of the operations in this subroutine can throw exceptions. They are handled by declaring that the subroutine throws IOException and leaving it up to the main program to decide what to do when an error occurs.
static void readTextFromURL( String urlString ) throws IOException { /* Open a connection to the URL, and get an input stream for reading data from the URL. */ URL url = new URL(urlString); URLConnection connection = url.openConnection(); InputStream urlData = connection.getInputStream(); /* Check that the content is some type of text. */ String contentType = connection.getContentType(); if (contentType == null || contentType.startsWith("text") == false) throw new IOException("URL does not seem to refer to a text file."); /* Copy lines of text from the input stream to the screen, until end-of-file is encountered (or an error occurs). */ BufferedReader in; // For reading from the connections input stream. in = new BufferedReader( new InputStreamReader(urlData) ); while (true) { String line = in.readLine(); if (line == null) break; System.out.println(line); } } // end readTextFromURL()

A complete program that uses this subroutine can be found in the le ReadURL.java. When using the program, note that you have to specify a complete url, including the http:// at

11.4. NETWORKING

571

the beginning. There is also an applet version of the program, which you can nd in the on-line version of this section.

11.4.2

TCP/IP and Client/Server

Communication over the Internet is based on a pair of protocols called the Transmission Control Protocol and the Internet Protocol , which are collectively referred to as TCP/IP . (In fact, there is a more basic communication protocol called UDP that can be used instead of TCP in certain applications. UDP is supported in Java, but for this discussion, Ill stick to the full TCP/IP, which provides reliable two-way communication between networked computers.) For two programs to communicate using TCP/IP, each program must create a socket, as discussed earlier in this section, and those sockets must be connected. Once such a connection is made, communication takes place using input streams and output streams. Each program has its own input stream and its own output stream. Data written by one program to its output stream is transmitted to the other computer. There, it enters the input stream of the program at the other end of the network connection. When that program reads data from its input stream, it is receiving the data that was transmitted to it over the network. The hard part, then, is making a network connection in the rst place. Two sockets are involved. To get things started, one program must create a socket that will wait passively until a connection request comes in from another socket. The waiting socket is said to be listening for a connection. On the other side of the connection-to-be, another program creates a socket that sends out a connection request to the listening socket. When the listening socket receives the connection request, it responds, and the connection is established. Once that is done, each program can obtain an input stream and an output stream for sending data over the connection. Communication takes place through these streams until one program or the other closes the connection. A program that creates a listening socket is sometimes said to be a server , and the socket is called a server socket. A program that connects to a server is called a client, and the socket that it uses to make a connection is called a client socket. The idea is that the server is out there somewhere on the network, waiting for a connection request from some client. The server can be thought of as oering some kind of service, and the client gets access to that service by connecting to the server. This is called the client/server model of network communication. In many actual applications, a server program can provide connections to several clients at the same time. When a client connects to a servers listening socket, that socket does not stop listening. Instead, it continues listening for additional client connections at the same time that the rst client is being serviced. To do this, it is necessary to use threads (Section 8.5). Well look at how it works in the next section. The URL class that was discussed at the beginning of this section uses a client socket behind the scenes to do any necessary network communication. On the other side of that connection is a server program that accepts a connection request from the URL object, reads a request from that object for some particular le on the server computer, and responds by transmitting the contents of that le over the network back to the URL object. After transmitting the data, the server closes the connection.


A client program has to have some way to specify which computer, among all those on the network, it wants to communicate with. Every computer on the Internet has an IP address which identies it uniquely among all the computers on the net. Many computers can also

572

CHAPTER 11. FILES AND NETWORKING

be referred to by domain names such as math.hws.edu or www.whitehouse.gov. (See Section 1.7.) Traditional (or IPv4 ) IP addresses are 32-bit integers. They are usually written in the so-called dotted decimal form, such as 69.9.161.200, where each of the four numbers in the address represents an 8-bit integer in the range 0 through 255. A new version of the Internet Protocol, IPv6 , is currently being introduced. IPv6 addresses are 128-bit integers and are usually written in hexadecimal form (with some colons and maybe some extra information thrown in). In actual use, IPv6 addresses are still fairly rare. A computer can have several IP addresses, and can have both IPv4 and IPv6 addresses. Usually, one of these is the loopback address, which can be used when a program wants to communicate with another program on the same computer. The loopback address has IPv4 address 127.0.0.1 and can also, in general, be referred to using the domain name localhost. In addition, there can be one or more IP addresses associated with physical network connections. Your computer probably has some utility for displaying your computers IP addresses. I have written a small Java program, ShowMyNetwork.java, that does the same thing. When I run ShowMyNetwork on my computer, the output is:
en1 : lo0 : /192.168.1.47 /fe80:0:0:0:211:24ff:fe9c:5271%5 /127.0.0.1 /fe80:0:0:0:0:0:0:1%1 /0:0:0:0:0:0:0:1%0

The rst thing on each line is a network interface name, which is really meaningful only to the computers operating system. The output also contains the IP addresses for that interface. In this example, lo0 refers to the loopback address, which has IPv4 address 127.0.0.1 as usual. The most important number here is 192.168.1.47, which is the IPv4 address that can be used for communication over the network. Now, a single computer might have several programs doing network communication at the same time, or one program communicating with several other computers. To allow for this possibility, a network connection is actually identied by a port number in combination with an IP address. A port number is just a 16-bit integer. A server does not simply listen for connectionsit listens for connections on a particular port. A potential client must know both the Internet address (or domain name) of the computer on which the server is running and the port number on which the server is listening. A Web server, for example, generally listens for connections on port 80; other standard Internet services also have standard port numbers. (The standard port numbers are all less than 1024, and are reserved for particular services. If you create your own server programs, you should use port numbers greater than 1024.)

11.4.3

Sockets

To implement TCP/IP connections, the java.net package provides two classes, ServerSocket and Socket. A ServerSocket represents a listening socket that waits for connection requests from clients. A Socket represents one endpoint of an actual network connection. A Socket can be a client socket that sends a connection request to a server. But a Socket can also be created by a server to handle a connection request from a client. This allows the server to create multiple sockets and handle multiple connections. A ServerSocket does not itself participate in connections; it just listens for connection requests and creates Sockets to handle the actual connections. When you construct a ServerSocket object, you have to specify the port number on which the server will listen. The specication for the constructor is
public ServerSocket(int port) throws IOException

11.4. NETWORKING

573

The port number must be in the range 0 through 65535, and should generally be greater than 1024. (A value of 0 tells the server socket to listen on any available port.) The constructor might throw a SecurityException if a smaller port number is specied. An IOException can occur if, for example, the specied port number is already in use. As soon as a ServerSocket is created, it starts listening for connection requests. The accept() method in the ServerSocket class accepts such a request, establishes a connection with the client, and returns a Socket that can be used for communication with the client. The accept() method has the form
public Socket accept() throws IOException

When you call the accept() method, it will not return until a connection request is received (or until some error occurs). The method is said to block while waiting for the connection. (While the method is blocked, the thread that called the method cant do anything else. However, other threads in the same program can proceed.) You can call accept() repeatedly to accept multiple connection requests. The ServerSocket will continue listening for connections until it is closed, using its close() method, or until some error occurs, or until the program is terminated in some way. Suppose that you want a server to listen on port 1728, and suppose that youve written a method provideService(Socket) to handle the communication with one client. Then the basic form of the server program would be:
try { ServerSocket server = new ServerSocket(1728); while (true) { Socket connection = server.accept(); provideService(connection); } } catch (IOException e) { System.out.println("Server shut down with error: " + e); }

On the client side, a client socket is created using a constructor in the Socket class. To connect to a server on a known computer and port, you would use the constructor
public Socket(String computer, int port) throws IOException

The rst parameter can be either an IP number or a domain name. This constructor will block until the connection is established or until an error occurs. Once you have a connected socket, no matter how it was created, you can use the Socket methods getInputStream() and getOutputStream() to obtain streams that can be used for communication over the connection. These methods return objects of type InputStream and OutputStream, respectively. Keeping all this in mind, here is the outline of a method for working with a client connection:
/** * Open a client connection to a specified server computer and * port number on the server, and then do communication through * the connection. */ void doClientConnection(String computerName, int serverPort) { Socket connection; InputStream in;

574

CHAPTER 11. FILES AND NETWORKING


OutputStream out; try { connection = new Socket(computerName,serverPort); in = connection.getInputStream(); out = connection.getOutputStream(); } catch (IOException e) { System.out.println( "Attempt to create connection failed with error: " + e); return; } . . // Use the streams, in and out, to communicate with server. . try { connection.close(); // (Alternatively, you might depend on the server // to close the connection.) } catch (IOException e) { } // end doClientConnection()

All this makes network communication sound easier than it really is. (And if you think it sounded hard, then its even harder.) If networks were completely reliable, things would be almost as easy as Ive described. The problem, though, is to write robust programs that can deal with network and human error. I wont go into detail here. However, what Ive covered here should give you the basic ideas of network programming, and it is enough to write some simple network applications. Lets look at a few working examples of client/server programming.

11.4.4

A Trivial Client/Server

The rst example consists of two programs. The source code les for the programs are DateClient.java and DateServer.java. One is a simple network client and the other is a matching server. The client makes a connection to the server, reads one line of text from the server, and displays that text on the screen. The text sent by the server consists of the current date and time on the computer where the server is running. In order to open a connection, the client must know the computer on which the server is running and the port on which it is listening. The server listens on port number 32007. The port number could be anything between 1025 and 65535, as long the server and the client use the same port. Port numbers between 1 and 1024 are reserved for standard services and should not be used for other servers. The name or IP number of the computer on which the server is running must be specied as a command-line argument For example, if the server is running on a computer named math.hws.edu, then you would typically run the client with the command java DateClient math.hws.edu. Here is the complete client program:
import java.net.*; import java.io.*; /** * This program opens a connection to a computer specified * as the first command-line argument. The connection is made to * the port specified by LISTENING PORT. The program reads one

11.4. NETWORKING
* line of text from the connection and then closes the * connection. It displays the text that it read on * standard output. This program is meant to be used with * the server program, DateServer, which sends the current * date and time on the computer where the server is running. */ public class DateClient { public static final int LISTENING PORT = 32007; public static void main(String[] args) { String hostName; // Name of the server computer to connect to. Socket connection; // A socket for communicating with server. BufferedReader incoming; // For reading data from the connection. /* Get computer name from command line. */ if (args.length > 0) hostName = args[0]; else { // No computer name was given. Print a message and exit. System.out.println("Usage: java DateClient <server host name>"); return; } /* Make the connection, then read and display a line of text. */

575

try { connection = new Socket( hostName, LISTENING PORT ); incoming = new BufferedReader( new InputStreamReader(connection.getInputStream()) ); String lineFromServer = incoming.readLine(); if (lineFromServer == null) { // A null from incoming.readLine() indicates that // end-of-stream was encountered. throw new IOException("Connection was opened, " + "but server did not send any data."); } System.out.println(); System.out.println(lineFromServer); System.out.println(); incoming.close(); } catch (Exception e) { System.out.println("Error: " + e); } } // end main()

} //end class DateClient

Note that all the communication with the server is done in a try..catch statement. This will catch the IOExceptions that can be generated when the connection is opened or closed and when data is read from the input stream. The connections input stream is wrapped in a BueredReader, which has a readLine() method that makes it easy to read one line of text. (See Subsection 11.1.4.)

576

CHAPTER 11. FILES AND NETWORKING

In order for this program to run without error, the server program must be running on the computer to which the client tries to connect. By the way, its possible to run the client and the server program on the same computer. For example, you can open two command windows, start the server in one window and then run the client in the other window. To make things like this easier, most computers will recognize the domain name localhost and the IP number 127.0.0.1 as referring to this computer. This means that the command java DateClient localhost will tell the DateClient program to connect to a server running on the same computer. If that command doesnt work, try java DateClient 127.0.0.1. The server program that corresponds to the DateClient client program is called DateServer. The DateServer program creates a ServerSocket to listen for connection requests on port 32007. After the listening socket is created, the server will enter an innite loop in which it accepts and processes connections. This will continue until the program is killed in some wayfor example by typing a CONTROL-C in the command window where the server is running. When a connection is received from a client, the server calls a subroutine to handle the connection. In the subroutine, any Exception that occurs is caught, so that it will not crash the server. Just because a connection to one client has failed for some reason, it does not mean that the server should be shut down; the error might have been the fault of the client. The connection-handling subroutine creates a PrintWriter for sending data over the connection. It writes the current date and time to this stream and then closes the connection. (The standard class java.util.Date is used to obtain the current time. An object of type Date represents a particular date and time. The default constructor, new Date(), creates an object that represents the time when the object is created.) The complete server program is as follows:
import java.net.*; import java.io.*; import java.util.Date; /** * This program is a server that takes connection requests on * the port specified by the constant LISTENING PORT. When a * connection is opened, the program sends the current time to * the connected socket. The program will continue to receive * and process connections until it is killed (by a CONTROL-C, * for example). Note that this server processes each connection * as it is received, rather than creating a separate thread * to process the connection. */ public class DateServer { public static final int LISTENING PORT = 32007; public static void main(String[] args) { ServerSocket listener; Socket connection; // Listens for incoming connections. // For communication with the connecting program.

/* Accept and process connections forever, or until some error occurs. (Note that errors that occur while communicating with a connected program are caught and handled in the sendDate() routine, so they will not crash the server.) */ try { listener = new ServerSocket(LISTENING PORT); System.out.println("Listening on port " + LISTENING PORT);

11.4. NETWORKING
while (true) { // Accept next connection request and handle it. connection = listener.accept(); sendDate(connection); } } catch (Exception e) { System.out.println("Sorry, the server has shut down."); System.out.println("Error: " + e); return; } } // end main()

577

/** * The parameter, client, is a socket that is already connected to another * program. Get an output stream for the connection, send the current time, * and close the connection. */ private static void sendDate(Socket client) { try { System.out.println("Connection from " + client.getInetAddress().toString() ); Date now = new Date(); // The current date and time. PrintWriter outgoing; // Stream for sending data. outgoing = new PrintWriter( client.getOutputStream() ); outgoing.println( now.toString() ); outgoing.flush(); // Make sure the data is actually sent! client.close(); } catch (Exception e){ System.out.println("Error: " + e); } } // end sendDate() } //end class DateServer

When you run DateServer in a command-line interface, it will sit and wait for connection requests and report them as they are received. To make the DateServer service permanently available on a computer, the program really should be run as a daemon. A daemon is a program that runs continually on a computer, independently of any user. The computer can be congured to start the daemon automatically as soon as the computer boots up. It then runs in the background, even while the computer is being used for other purposes. For example, a computer that makes pages available on the World Wide Web runs a daemon that listens for requests for pages and responds by transmitting the pages. Its just a souped-up analog of the DateServer program! However, the question of how to set up a program as a daemon is not one I want to go into here. For testing purposes, its easy enough to start the program by hand, and, in any case, my examples are not really robust enough or full-featured enough to be run as serious servers. (By the way, the word daemon is just an alternative spelling of demon and is usually pronounced the same way.) Note that after calling out.println() to send a line of data to the client, the server program calls out.flush(). The flush() method is available in every output stream class. Calling it

578

CHAPTER 11. FILES AND NETWORKING

ensures that data that has been written to the stream is actually sent to its destination. You should generally call this function every time you use an output stream to send data over a network connection. If you dont do so, its possible that the stream will collect data until it has a large batch of data to send. This is done for eciency, but it can impose unacceptable delays when the client is waiting for the transmission. It is even possible that some of the data might remain untransmitted when the socket is closed, so it is especially important to call flush() before closing the connection. This is one of those unfortunate cases where dierent implementations of Java can behave dierently. If you fail to ush your output streams, it is possible that your network application will work on some types of computers but not on others.

11.4.5

A Simple Network Chat

In the DateServer example, the server transmits information and the client reads it. Its also possible to have two-way communication between client and server. As a rst example, well look at a client and server that allow a user on each end of the connection to send messages to the other user. The program works in a command-line interface where the users type in their messages. In this example, the server waits for a connection from a single client and then closes down its listener so that no other clients can connect. After the client and server are connected, both ends of the connection work in much the same way. The user on the client end types a message, and it is transmitted to the server, which displays it to the user on that end. Then the user of the server types a message that is transmitted to the client. Then the client user types another message, and so on. This continues until one user or the other enters quit when prompted for a message. When that happens, the connection is closed and both programs terminate. The client program and the server program are very similar. The techniques for opening the connections dier, and the client is programmed to send the rst message while the server is programmed to receive the rst message. The client and server programs can be found in the les CLChatClient.java and CLChatServer.java. (The name CLChat stands for command-line chat.) Here is the source code for the server:
import java.net.*; import java.io.*; /** * This program is one end of a simple command-line interface chat program. * It acts as a server which waits for a connection from the CLChatClient * program. The port on which the server listens can be specified as a * command-line argument. If it is not, then the port specified by the * constant DEFAULT PORT is used. Note that if a port number of zero is * specified, then the server will listen on any available port. * This program only supports one connection. As soon as a connection is * opened, the listening socket is closed down. The two ends of the connection * each send a HANDSHAKE string to the other, so that both ends can verify * that the program on the other end is of the right type. Then the connected * programs alternate sending messages to each other. The client always sends * the first message. The user on either end can close the connection by * entering the string "quit" when prompted for a message. Note that the first * character of any string sent over the connection must be 0 or 1; this * character is interpreted as a command. */ public class CLChatServer { /**

11.4. NETWORKING
* Port to listen on, if none is specified on the command line. */ static final int DEFAULT PORT = 1728;

579

/** * Handshake string. Each end of the connection sends this string to the * other just after the connection is opened. This is done to confirm that * the program on the other side of the connection is a CLChat program. */ static final String HANDSHAKE = "CLChat"; /** * This character is prepended to every message that is sent. */ static final char MESSAGE = 0; /** * This character is sent to the connected program when the user quits. */ static final char CLOSE = 1; public static void main(String[] args) { int port; // The port on which the server listens. // Listens for a connection request. // For communication with the client. // // // // Stream for receiving data from client. Stream for sending data to client. A message to be sent to the client. A message received from the client.

ServerSocket listener; Socket connection;

BufferedReader incoming; PrintWriter outgoing; String messageOut; String messageIn;

BufferedReader userInput; // A wrapper for System.in, for reading // lines of input from the user. /* First, get the port number from the command line, or use the default port if none is specified. */ if (args.length == 0) port = DEFAULT PORT; else { try { port= Integer.parseInt(args[0]); if (port < 0 || port > 65535) throw new NumberFormatException(); } catch (NumberFormatException e) { System.out.println("Illegal port number, " + args[0]); return; } } /* Wait for a connection request. When it arrives, close down the listener. Create streams for communication and exchange the handshake. */ try {

580

CHAPTER 11. FILES AND NETWORKING


listener = new ServerSocket(port); System.out.println("Listening on port " + listener.getLocalPort()); connection = listener.accept(); listener.close(); incoming = new BufferedReader( new InputStreamReader(connection.getInputStream()) ); outgoing = new PrintWriter(connection.getOutputStream()); outgoing.println(HANDSHAKE); // Send handshake to client. outgoing.flush(); messageIn = incoming.readLine(); // Receive handshake from client. if (! HANDSHAKE.equals(messageIn) ) { throw new Exception("Connected program is not a CLChat!"); } System.out.println("Connected. Waiting for the first message."); } catch (Exception e) { System.out.println("An error occurred while opening connection."); System.out.println(e.toString()); return; } /* Exchange messages with the other end of the connection until one side or the other closes the connection. This server program waits for the first message from the client. After that, messages alternate strictly back and forth. */ try { userInput = new BufferedReader(new InputStreamReader(System.in)); System.out.println("NOTE: Enter quit to end the program.\n"); while (true) { System.out.println("WAITING..."); messageIn = incoming.readLine(); if (messageIn.length() > 0) { // The first character of the message is a command. If // the command is CLOSE, then the connection is closed. // Otherwise, remove the command character from the // message and proceed. if (messageIn.charAt(0) == CLOSE) { System.out.println("Connection closed at other end."); connection.close(); break; } messageIn = messageIn.substring(1); } System.out.println("RECEIVED: " + messageIn); System.out.print("SEND: "); messageOut = userInput.readLine(); if (messageOut.equalsIgnoreCase("quit")) { // User wants to quit. Inform the other side // of the connection, then close the connection. outgoing.println(CLOSE); outgoing.flush(); // Make sure the data is sent! connection.close(); System.out.println("Connection closed.");

11.5. NETWORK PROGRAMMING AND THREADS

581

break; } outgoing.println(MESSAGE + messageOut); outgoing.flush(); // Make sure the data is sent! if (outgoing.checkError()) { throw new IOException("Error occurred while transmitting message."); } } } catch (Exception e) { System.out.println("Sorry, an error has occurred. System.out.println("Error: " + e); System.exit(1); } } // end main()

Connection lost.");

} //end class CLChatServer

This program is a little more robust than DateServer. For one thing, it uses a handshake to make sure that a client who is trying to connect is really a CLChatClient program. A handshake is simply information sent between client and server as part of setting up the connection, before any actual data is sent. In this case, each side of the connection sends a string to the other side to identify itself. The handshake is part of the protocol that I made up for communication between CLChatClient and CLChatServer. A protocol is a detailed specication of what data and messages can be exchanged over a connection, how they must be represented, and what order they can be sent in. When you design a client/server application, the design of the protocol is an important consideration. Another aspect of the CLChat protocol is that after the handshake, every line of text that is sent over the connection begins with a character that acts as a command. If the character is 0, the rest of the line is a message from one user to the other. If the character is 1, the line indicates that a user has entered the quit command, and the connection is to be shut down. Remember that if you want to try out this program on a single computer, you can use two command-line windows. In one, give the command java CLChatServer to start the server. Then, in the other, use the command java CLChatClient localhost to connect to the server that is running on the same machine.

11.5 In

Network Programming and Threads

the previous section, we looked at several examples of network programming. Those examples showed how to create network connections and communicate through them, but they didnt deal with one of the fundamental characteristics of network programming, the fact that network communication is fundamentally asynchronous. From the point of view of a program on one end of a network connection, messages can arrive from the other side of the connection at any time; the arrival of a message is an event that is not under the control of the program that is receiving the message. Certainly, it is possible to design a network communication protocol that proceeds in a synchronous, step-by-step process from beginning to endbut whenever the process gets to a point in the protocol where it needs to read a message from the other side of the connection, it has to wait for that message to arrive. Essentially, the process has to wait

582

CHAPTER 11. FILES AND NETWORKING

for a message-arrival event to occur before it can proceed. While it is waiting for the message, we say that the process is blocked . Perhaps an event-oriented networking API would be a good approach to dealing with the asynchronous nature of network communication, but that is not the approach that is taken in Java (or, typically, in other languages). Instead, a serious network program in Java uses threads. Threads were introduced in Section 8.5. A thread is a separate computational process that can run in parallel with other threads. When a program uses threads to do network communication, it is possible that some threads will be blocked, waiting for incoming messages, but other threads will still be able to continue performing useful work.

11.5.1

A Threaded GUI Chat Program.

The command-line chat programs, CLChatClient.java and CLChatServer.java, from the previous section use a straight-through, step-by-step protocol for communication. After a user on one side of a connection enters a message, the user must wait for a reply from the other side of the connection. An asynchronous chat program would be much nicer. In such a program, a user could just keep typing lines and sending messages without waiting for any response. Messages that arriveasynchronouslyfrom the other side would be displayed as soon as they arrive. Its not easy to do this in a command-line interface, but its a natural application for a graphical user interface. The basic ideas for a GUI chat program is to create a thread whose job is to read messages that arrive from the other side of the connection. As soon as the message arrives, it is displayed to the user; then, the message-reading thread blocks until the next incoming message arrives. While it is blocked, however, other threads can continue to run. In particular, the GUI event-handling thread that responds to user actions keeps running; that thread can send outgoing messages as soon as the user generates them. The sample program GUIChat.java is an example of this. GUIChat is a two-way network chat program that allows two users to send messages to each other over the network. In this chat program, each user can send messages at any time, and incoming messages are displayed as soon as they are received. The GUIChat program can act as both the client end and the server end of a connection. When GUIChat is started, a window appears on the screen. This window has a Listen button that the user can click to create a server socket that will listen for an incoming connection request; this makes the program act as a server. It also has a Connect button that the the user can click to send a connection request; this makes the program act as a client. As usual, the server listens on a specied port number. The client needs to know the computer on which the server is running and the port on which the server is listening. There are input boxes in the GUIChat window where the user can enter this information. Once a connection has been established between two GUIChat windows, each user can send messages to the other. The window has an input box where the user types the message. Pressing return while typing in this box sends the message. This means that the sending of the message is handled by the usual event-handling thread, in response to an event generated by a user action. Messages are received by a separate thread that just sits around waiting for incoming messages. This thread blocks while waiting for a message to arrive; when a message does arrive, it displays that message to the user. The window contains a large transcript area that displays both incoming and outgoing messages, along with other information about the network connection. I urge you to compile the source code, GUIChat.java, and try the program. To make it easy to try it on a single computer, you can make a connection between one window and another window on the same computer, using localhost or 127.0.0.1 as the name of the computer.

11.5. NETWORK PROGRAMMING AND THREADS

583

(Once you have one GUIChat window open, you can open a second one by clicking the New button.) I also urge you to read the source code. I will discuss only parts of it here. The program uses a nested class, ConnectionHandler, to handle most network-related tasks. ConnectionHandler is a subclass of Thread. The ConnectionHandler thread is responsible for opening the network connection and then for reading incoming messages once the connection has been opened. (By putting the connection-opening code in a separate thread, we make sure that the GUI is not blocked while the connection is being opened. Like reading incoming messages, opening a connection is a blocking operation that can take some time to complete.) A ConnectionHandler is created when the user clicks the Listen or Connect button. The Listen button should make the thread act as a server, while Connect should make it act as a client. To distinguish these two cases, the ConnectionHandler class has two constructors:
/** * Listen for a connection on a specified port. The constructor * does not perform any network operations; it just sets some * instance variables and starts the thread. Note that the * thread will only listen for one connection, and then will * close its server socket. */ ConnectionHandler(int port) { state = ConnectionState.LISTENING; this.port = port; postMessage("\nLISTENING ON PORT " + port + "\n"); start(); } /** * Open a connection to specified computer and * does not perform any network operations; it * instance variables and starts the thread. */ ConnectionHandler(String remoteHost, int port) state = ConnectionState.CONNECTING; this.remoteHost = remoteHost; this.port = port; postMessage("\nCONNECTING TO " + remoteHost start(); } port. The constructor just sets some

+ " ON PORT " + port + "\n");

Here, state is an instance variable whose type is dened by an enumerated type


enum ConnectionState { LISTENING, CONNECTING, CONNECTED, CLOSED };

The values of this enum represent dierent possible states of the network connection. It is often useful to treat a network connection as a state machine (see Subsection 6.5.4), since the response to various events can depend on the state of the connection when the event occurs. Setting the state variable to LISTENING or CONNECTING tells the thread whether it should act as a server or as a client. Note that the postMessage() method posts a message to the transcript area of the window, where it will be visible to the user. Once the thread has been started, it executes the following run() method:
/** * The run() method that is executed by the thread. It opens a * connection as a client or as a server (depending on which

584

CHAPTER 11. FILES AND NETWORKING


* constructor was used). */ public void run() { try { if (state == ConnectionState.LISTENING) { // Open a connection as a server. listener = new ServerSocket(port); socket = listener.accept(); listener.close(); } else if (state == ConnectionState.CONNECTING) { // Open a connection as a client. socket = new Socket(remoteHost,port); } connectionOpened(); // Sets up to use the connection (including // creating a BufferedReader, in, for reading // incoming messages). while (state == ConnectionState.CONNECTED) { // Read one line of text from the other side of // the connection, and report it to the user. String input = in.readLine(); if (input == null) connectionClosedFromOtherSide(); else received(input); // Report message to user. } } catch (Exception e) { // An error occurred. Report it to the user, but not // if the connection has been closed (since the error // might be the expected error that is generated when // a socket is closed). if (state != ConnectionState.CLOSED) postMessage("\n\n ERROR: " + e); } finally { // Clean up before terminating the thread. cleanUp(); } }

This method calls several other methods to do some of its work, but you can see the general outline of how it works. After opening the connection as either a server or client, the run() method enters a while loop in which it receives and processes messages from the other side of the connection until the connection is closed. It is important to understand how the connection can be closed. The GUIChat window has a Disconnect button that the user can click to close the connection. The program responds to this event by closing the socket that represents the connection. It is likely that when this happens, the connection-handling thread is blocked in the in.readLine() method, waiting for an incoming message. When the socket is closed by another thread, this method will fail and will throw an exception; this exception causes the thread to terminate. (If the connection-handling thread happens to be between calls to in.readLine() when the socket is closed, the while loop will terminate because the connection state changes

11.5. NETWORK PROGRAMMING AND THREADS

585

from CONNECTED to CLOSED.) Note that closing the window will also close the connection in the same way. It is also possible for the user on the other side of the connection to close the connection. When that happens, the stream of incoming messages ends, and the in.readLine() on this side of the connection returns the value null, which indicates end-of-stream and acts as a signal that the connection has been closed by the remote user. For a nal look into the GUIChat code, consider the methods that send and receive messages. These methods are called from dierent threads. The send() method is called by the eventhandling thread in response to a user action. Its purpose is to transmit a message to the remote user. It uses a PrintWriter, out, that writes to the sockets output stream. Synchronization of this method prevents the connection state from changing in the middle of the send operation:
/** * Send a message to the other side of the connection, and post the * message to the transcript. This should only be called when the * connection state is ConnectionState.CONNECTED; if it is called at * other times, it is ignored. */ synchronized void send(String message) { if (state == ConnectionState.CONNECTED) { postMessage("SEND: " + message); out.println(message); out.flush(); if (out.checkError()) { postMessage("\nERROR OCCURRED WHILE TRYING TO SEND DATA."); close(); // Closes the connection. } } }

The received() method is called by the connection-handling thread after a message has been read from the remote user. Its only job is to display the message to the user, but again it is synchronized to avoid the race condition that could occur if the connection state were changed by another thread while this method is being executed:
/** * This is called by the run() method when a message is received from * the other side of the connection. The message is posted to the * transcript, but only if the connection state is CONNECTED. (This * is because a message might be received after the user has clicked * the "Disconnect" button; that message should not be seen by the * user.) */ synchronized private void received(String message) { if (state == ConnectionState.CONNECTED) postMessage("RECEIVE: " + message); }

11.5.2

A Multithreaded Server

There is still one big problem with the GUIChat program. In order to open a connection to another computer, a user must know that there is a GUIChat program listening on some

586

CHAPTER 11. FILES AND NETWORKING

particular port on some particular computer. Except in rather contrived situations, there is no way for a user to know that. It would be nice if it were possible to discover, somehow, whos out there on the Internet waiting for a connection. Unfortunately, this is not possible. And yet, applications such as AOL Instant Messenger seem to do just thatthey can show you a list of users who are available to receive messages. How can they do that? I dont know the details of instant messenger protocols, but it has to work something like this: When you start the client program, that program contacts a server program that runs constantly on some particular computer and on some particular port. Since the server is always available at the same computer and port, the information needed to contact it can be built into the client program or otherwise made available to the users of the program. The purpose of the server is to keep a list of available users. When your client program contacts the server, it gets a list of available users, along with whatever information is necessary to send messages to those users. At the same time, your client program registers you with the server, so that the server can tell other users that you are on-line. When you shut down the client program, you are removed from the servers list of users, and other users can be informed that you have gone o-line. Of course, in an application like AOL server, you only get to see a list of available users from your buddy list, a list of your friends who are also AOL users. To implement this, you need to have an account on the AOL server. The server needs to keep a database of information about all user accounts, including the buddy list for each user. This makes the server program rather complicated, and I wont consider that aspect of its functionality here. However, it is not very dicult to write a scaled-down application that uses the network in a similar way. I call my scaled-down version BuddyChat. It doesnt keep separate buddy lists for each user; it assumes that youre willing to be buddies with anyone who happens to connect to the server. In this application, the server keeps a list of connected users and makes that list available to each connected user. A user can connect to another user and chat with that user, using a window that is very similar to the chat window in GUIChat. BuddyChat is still just a toy, compared to serious network applications, but it does illustrate some core ideas. The BuddyChat application comes in several pieces. BuddyChatServer.java is the server program, which keeps the list of available users and makes that list available to clients. Ideally, the server program would run constantly (as a daemon) on a computer and port that are known to all the possible client users. For testing, of course, it can simply be stated like any other program. The client program is BuddyChat.java. This program is to be run by any user who wants to use the BuddyChat service. When a user starts the client program, it connects to the server, and it gets from the server a list of other users who are currently connected. The list is displayed to the user of the client program, who can send a request for a chat connection with any user on the list. The client can also receive incoming chat connection requests from other users. The window that is used for chatting is dened by BuddyChatWindow.java, which is not itself a program but just a subclass of JFrame that denes the chat window. (There is also a fourth piece, BuddyChatServerShutdown.java. This is a program that can be run to shut down the BuddyChatServer gracefully. I will not discuss it further here. See the source code for more information, if you are interested.) I urge you to compile the programs and try them out. For testing, you can try them on a single computer (although all the windows can get a little confusing). First, start BuddyChatServer. The server has no GUI interface, but it does print some information to standard output as it runs. Then start the BuddyChat client program. When BuddyChat starts up, it presents a window where you can enter the name and port number for the server and your

11.5. NETWORK PROGRAMMING AND THREADS

587

handle, which is just a name that will identify you in the servers list of users. The server info is already set up to connect to a server on the same machine. When you hit the Connect button, a new window will open with a list, currently empty, of other users connected to the server. Now, start another copy of the BuddyChat client program. When you click Connect, youll have two client list windows, one for each copy of the client program that youve started. (One of these windows will be exactly on top of the other, so youll have to move it to see the second window.) Each client window will display the other client in its list of users. You can run additional copies of the client program, if you want, and you might want to try connecting from another computer if one is available. At this point, there is a network connection in place between the server and each client. Whenever a client connects to or disconnects from the server, the server sends a notication of the event to each connected client, so that the client can modify its own list of connected users. The server also maintains a listening socket that listens for connection requests from new clients. In order to manage all this, the server is running several threads. One thread waits for connection requests on the listening socket. In addition to this, there are two threads for each connected clientone thread for sending messages to the client and one thread for reading messages sent by the client to the server. Back to trying out the program. Remember that the whole point was to provide each user with a list of potential chat partners. Click on a user in one of the client user lists, and then click the Connect to Selected Buddy button. When you do this, your BuddyChat program sends a connection request to the BuddyChat program that is being run by the selected user. Each BuddyChat program, one on each side of the connection, opens a chat window (of type BuddyChatWindow ). A network connection between these two windows is set up without any further action on the part of the two users, and the users can use the windows to send messages back and forth to each other. The BuddyChatServer program has nothing to do with opening, closing, or using the connection between its two clients (although a dierent design might have had the messages go through the server). In order to open the chat connection from one program to another, the second program must be listening for connection requests and the rst program must know the computer and port on which the rst user is listening. In the BuddyChat system, the BuddyChatServer knows this information and provides it to each BuddyChat client program. The users of the client programs never have to be aware of this information. How does the server know about the clients computers and port numbers? When a BuddyChat client program is run, in addition to opening a connection to the BuddyChatServer, the client also creates a listening socket to accept connection requests from other users. When the client registers with the server, it tells the server the port number of the clients listening socket. The server also knows the IP address of the computer on which the client is running, since it has a network connection to that computer. This means that the BuddyChatServer knows the IP address and listening socket port number of every BuddyChat client. A copy of this information is provided (along with the users handles) to each connected client program. The net result is that every BuddyChat client program has the information that it needs to contact all the other clients. The basic techniques used in the BuddyChat system are the same as those used in previous networking examples: server sockets, client sockets, input and output streams for sending messages over the network, and threads to handle the communication. The important dierence is how these basic building blocks are combined to build a more complex application. I have tried to explain the logic of that application here. I will not discuss the BuddyChat source code

588

CHAPTER 11. FILES AND NETWORKING

here, since it is locally similar to examples that we have already looked at, but I encourage you to study the source code if you are interested in network programming. BuddyChat seems to have a lot of functionality, yet I said it was still a toy program. What exactly makes it a toy? There are at least two big problems. First of all, it is not scalable. A network program is scalable if it will work well for a large number of simultaneous users. BuddyChat would have problems with a large number of users because it uses so many threads (two for each user). It takes a certain amount of processing for a computer to switch its attention from one thread to another. On a very busy server, the constant switching between threads would soon start to degrade the performance. One solution to this is to use a more advanced network API. Java has a class SelectableChannel that makes it possible for one thread to manage communication over a large number of network connections. This class is part of the package java.nio that provides a number of advanced I/O capabilities for working with les and networking. However, I will not cover those capabilities in this book. But the biggest problem is that BuddyChat oers absolutely no defense against denial of service attacks. In a denial of service, a malicious user attacks a network server in some way that prevents other users from accessing the service or severely degrades the performance of the service for those users. It would be simple to launch a denial of service attack on BuddyChat by making a huge number of connections to the server. The server would then spend most of its time servicing those bogus connections. The server could guard against this to some extent by putting a limit on the number of simultaneous connections that it will accept from a given IP address. It would also be helpful to add some security to the server by requiring users to know a password in order to connect. However, neither of these measures would fully solve the problem, and it is very dicult to nd a complete defense against denial of service attacks.

11.5.3

Distributed Computing

In Section 8.5, we saw how threads can be used to do parallel processing, where a number of processors work together to complete some task. In that section, it was assumed that all the processors were inside one multi-processor computer. But parallel processing can also be done using processors that are in dierent computers, as long as those computers are connected to a network over which they can communicate. This type of parallel processingin which a number of computers work together on a task and communicate over a networkis called distributed computing . In some sense, the whole Internet is an immense distributed computation, but here I am interested in how computers on a network can cooperate to solve some computational problem. There are several approaches to distributed computing that are supported in Java. RMI and CORBA are standards that enable a program running on one computer to call methods in objects that exist on other computers. This makes it possible to design an object-oriented program in which dierent parts of the program are executed on dierent computers. RMI (Remote Method Invocation) only supports communication between Java objects. CORBA (Common Object Request Broker Architecture) is a more general standard that allows objects written in various programming languages, including Java, to communicate with each other. As is commonly the case in networking, there is the problem of locating services (where in this case, a service means an object that is available to be called over the network). That is, how can one computer know which computer a service is located on and what port it is listening on? RMI and CORBA solve this problem using something like our little BuddyChatServer examplea server running at a known location keeps a list of services that are available on other computers. Computers that oer services register those services with the server; computers that

11.5. NETWORK PROGRAMMING AND THREADS

589

need services contact the server to nd out where they are located. RMI and CORBA are complex systems that are not very easy to use. I mention them here because they are part of Javas standard network API, but I will not discuss them further. Instead, we will look at a relatively simple demonstration of distributed computing that uses only basic networking. The problem that we will look at uses the simplest type of parallel programming, in which the problem can be broken down into tasks that can be performed independently, with no communication between the tasks. To apply distributed computing to this type of problem, we can use one master program that divides the problem into tasks and sends those tasks over the network to worker programs that do the actual work. The worker programs send their results back to the master program, which combines the results from all the tasks into a solution of the overall problem. In this context, the worker programs are often called slaves, and the program uses the so-called master/slave approach to distributed computing. The demonstration program is dened by three source code les: CLMandelbrotMaster.java denes the master program; CLMandelbrotWorker.java denes the worker programs; and CLMandelbrotTask.java denes the class, CLMandelbrotTask, that represents an individual task that is performed by the workers. To run the demonstration, you must start the CLMandelbrotWorker program on several computers (probably by running it on the command line). This program uses CLMandelbrotTask, so both class les, CLMandelbrotWorker.class and CLMandelbrotTask.class, must be present on the worker computers. You can then run CLMandelbrotMaster on the master computer. Note that this program also requires the class CLMandelbrotTask. You must specify the host name or IP address of each of the worker computers as command line arguments for CLMandelbrotMaster. A worker program listens for connection requests from the master program, and the master program must be told where to send those requests. For example, if the worker program is running on three computers with IP addresses 172.30.217.101, 172.30.217.102, and 172.30.217.103, then you can run CLMandelbrotMaster with the command
java CLMandelbrotMaster 172.30.217.101 172.30.217.102 172.30.217.103

The master will make a network connection to the worker at each IP address; these connections will be used for communication between the master program and the workers. It is possible to run several copies of CLMandelbrotWorker on the same computer, but they must listen for network connections on dierent ports. It is also possible to run CLMandelbrotWorker on the same computer as CLMandelbrotMaster. You might even see some speed-up when you do this, if your computer has several processors. See the comments in the program source code les for more information, but here are some commands that you can use to run the master program and two copies of the worker program on the same computer. Give these commands in separate command windows:
java java java CLMandelbrotWorker CLMandelbrotWorker 1501 CLMandelbrotMaster localhost localhost:1501 (Listens on default port) (Listens on port 1501)

Every time CLMandelbrotMaster is run, it solves exactly the same problem. (For this demonstration, the nature of the problem is not important, but the problem is to compute the data needed for a picture of a small piece of the famous Mandelbrot Set. If you are interested in seeing the picture that is produced, uncomment the call to the saveImage() method at the

590

CHAPTER 11. FILES AND NETWORKING

end of the main() routine in CLMandelbrotMaster.java. We will encounter the Mandelbrot Set again as an example in Chapter 12.) You can run CLMandelbrotMaster with dierent numbers of worker programs to see how the time required to solve the problem depends on the number of workers. (Note that the worker programs continue to run after the master program exists, so you can run the master program several times without having to restart the workers.) In addition, if you run CLMandelbrotMaster with no command line arguments, it will solve the entire problem on its own, so you can see how long it takes to do so without using distributed computing. In a trial that I ran, it took 40 seconds for CLMandelbrotMaster to solve the problem on its own. Using just one worker, it took 43 seconds. The extra time represents extra work involved in using the network; it takes time to set up a network connection and to send messages over the network. Using two workers (on dierent computers), the problem was solved in 22 seconds. In this case, each worker did about half of the work, and their computations were performed in parallel, so that the job was done in about half the time. With larger numbers of workers, the time continued to decrease, but only up to a point. The master program itself has a certain amount of work to do, no matter how many workers there are, and the total time to solve the problem can never be less than the time it takes for the master program to do its part. In this case, the minimum time seemed to be about ve seconds.


Lets take a look at how this distributed application is programmed. The master program divides the overall problem into a set of tasks. Each task is represented by an object of type CLMandelbrotTask. These tasks have to be communicated to the worker programs, and the worker programs must send back their results. Some protocol is needed for this communication. I decided to use character streams. The master encodes a task as a line of text, which is sent to a worker. The worker decodes the text (into an object of type CLMandelbrotTask) to nd out what task it is supposed to perform. It performs the assigned task. It encodes the results as another line of text, which it sends back to the master program. Finally, the master decodes the results and combines them with the results from other tasks. After all the tasks have been completed and their results have been combined, the problem has been solved. The problem is divided into a fairly large number of tasks. A worker receives not just one task, but a sequence of tasks. Each time it nishes a task and sends back the result, it is assigned a new task. After all tasks are complete, the worker receives a close command that tells it to close the connection. In CLMandelbrotWorker.java, all this is done in a method named handleConnection() that is called to handle a connection that has already been opened to the master program. It uses a method readTask() to decode a task that it receives from the master and a method writeResults() to encode the results of the task for transmission back to the master. It must also handle any errors that occur:
private static void handleConnection(Socket connection) { try { BufferedReader in = new BufferedReader( new InputStreamReader( connection.getInputStream()) ); PrintWriter out = new PrintWriter(connection.getOutputStream()); while (true) { String line = in.readLine(); // Message from the master. if (line == null) { // End-of-stream encountered -- should not happen. throw new Exception("Connection closed unexpectedly."); }

11.5. NETWORK PROGRAMMING AND THREADS


if (line.startsWith(CLOSE CONNECTION COMMAND)) { // Represents the normal termination of the connection. System.out.println("Received close command."); break; } else if (line.startsWith(TASK COMMAND)) { // Represents a CLMandelbrotTask that this worker is // supposed to perform. CLMandelbrotTask task = readTask(line); // Decode the message. task.compute(); // Peform the task. out.println(writeResults(task)); // Send back the results. out.flush(); } else { // No other messages are part of the protocol. throw new Exception("Illegal command received."); } } } catch (Exception e) { System.out.println("Client connection closed with error " + e); } finally { try { connection.close(); // Make sure the socket is closed. } catch (Exception e) { } } }

591

Note that this method is not executed in a separate thread. The worker has only one thing to do at a time and does not need to be multithreaded. You might wonder why so many tasks are used. Why not just divide the problem into one task for each worker? The reason is that using a larger number of tasks makes it possible to do load balancing . Not all tasks take the same amount of time to execute. This is true for many reasons. Some of the tasks might simply be more computationally complex than others. Some of the worker computers might be slower than others. Or some worker computers might be busy running other programs, so that they can only give part of their processing power to the worker program. If we assigned one task per worker, it is possible that a complex task running on a slow, busy computer would take much longer than the other tasks to complete. This would leave the other workers idle and delay the completion of the job while that worker completes its task. To complete the job as quickly as possible, we want to keep all the workers busy and have them all nish at about the same time. This is called load balancing. If we have a large number of tasks, the load will automatically be approximately balanced: A worker is not assigned a new task until it nishes the task that it is working on. A slow worker, or one that happens to receive more complex tasks, will complete fewer tasks than other workers, but all workers will be kept busy until close to the end of the job. On the other hand, individual tasks shouldnt be too small. Network communication takes some time. If it takes longer to transmit a task and its results than it does to perform the task, then using distributed computing will take more

592

CHAPTER 11. FILES AND NETWORKING

time than simply doing the whole job on one computer! A problem is a good candidate for distributed computing if it can be divided into a fairly large number of fairly large tasks. Turning to the master program, CLMandelbrotMaster.java, we encounter a more complex situation. The master program must communicate with several workers over several network connections. To accomplish this, the master program is multi-threaded, with one thread to manage communication with each worker. A pseudocode outline of the main() routine is quite simple:
create a list of all tasks that must be performed if there are no command line arguments { // The master program does all the tasks itself. Perform each task. } else { // The tasks will be performed by worker programs. for each command line argument: Get information about a worker from command line argument. Create and start a thread to communicate with the worker. Wait for all threads to terminate. } // All tasks are now complete (assuming no error occurred).

The list of tasks is stored in a variable, tasks, of type ArrayList<CLMandelbrotTask>. The communication threads take tasks from this list and send them to worker programs. The method getNextTask() gets one task from the list. If the list is empty, it returns null as a signal that all tasks have been assigned and the communication thread can terminate. Since tasks is a resource that is shared by several threads, access to it must be controlled; this is accomplished by writing getNextTask() as a synchronized method:
synchronized private static CLMandelbrotTask getNextTask() { if (tasks.size() == 0) return null; else return tasks.remove(0); }

(The reason for the synchronization is to avoid the race condition that could occur between the time that the value of tasks.size() is tested and the time that tasks.remove() is called. See Subsection 8.5.3 for information about parallel programming, race conditions, and synchronized.) The job of a thread is to send a sequence of tasks to a worker thread and to receive the results that the worker sends back. The thread is also responsible for opening the connection in the rst place. A pseudocode outline for the process executed by the thread might look like:
Create a socket connected to the worker program. Create input and output streams for communicating with the worker. while (true) { Let task = getNextTask(). If task == null break; // All tasks have been assigned. Encode the task into a message and transmit it to the worker. Read the response from the worker. Decode and process the response. }

11.5. NETWORK PROGRAMMING AND THREADS


Send a "close" command to the worker. Close the socket.

593

This would work OK. However, there are a few subtle points. First of all, the thread must be ready to deal with a network error. For example, a worker might shut down unexpectedly. But if that happens, the master program can continue, provided other workers are still available. (You can try this when you run the program: Stop one of the worker programs, with CONTROL-C, and observe that the master program still completes successfully.) A diculty arises if an error occurs while the thread is working on a task: If the problem as a whole is going to be completed, that task will have to be reassigned to another worker. I take care of this by putting the uncompleted task back into the task list. (Unfortunately, my program does not handle all possible errors. If a network connection hangs indenitely without actually generating an error, my program will also hang, waiting for a response from a worker that will never arrive. A more robust program would have some way of detecting the problem and reassigning the task.) Another defect in the procedure outlined above is that it leaves the worker program idle while the thread is processing the workers response. It would be nice to get a new task to the worker before processing the response from the previous task. This would keep the worker busy and allow two operations to proceed simultaneously instead of sequentially. (In this example, the time it takes to process a response is so short that keeping the worker waiting while it is done probably makes no signicant dierence. But as a general principle, its desirable to have as much parallelism as possible in the algorithm.) We can modify the procedure to take this into account:
try { Create a socket connected to the worker program. Create input and output streams for communicating with the worker. Let currentTask = getNextTask(). Encode currentTask into a message and send it to the worker. while (true) { Read the response from the worker. Let nextTask = getNextTask(). If nextTask != null { // Send nextTask to the worker before processing the // response to currentTask. Encode nextTask into a message and send it to the worker. } Decode and process the response to currentTask. currentTask = nextTask. if (currentTask == null) break; // All tasks have been assigned. } Send a "close" command to the worker. Close the socket. } catch (Exception e) { Put uncompleted task, if any, back into the task list. } finally { Close the connection. }

Finally, here is how this translates into Java. The pseudocode presented above becomes the run() method in the class that denes the communication threads used by the master program:

594

CHAPTER 11. FILES AND NETWORKING


/** * This class represents one worker thread. The job of a worker thread * is to send out tasks to a CLMandelbrotWorker program over a network * connection, and to get back the results computed by that program. */ private static class WorkerConnection extends Thread { int id; String host; int port; // Identifies this thread in output statements. // The host to which this thread will connect. // The port number to which this thread will connect.

/** * The constructor just sets the values of the instance * variables id, host, and port and starts the thread. */ WorkerConnection(int id, String host, int port) { this.id = id; this.host = host; this.port = port; start(); } /** * The run() method of the thread opens a connection to the host and * port specified in the constructor, then sends tasks to the * CLMandelbrotWorker program on the other side of that connection. * If the thread terminates normally, it outputs the number of tasks * that it processed. If it terminates with an error, it outputs * an error message. */ public void run() { int tasksCompleted = 0; // How many tasks has this thread handled. Socket socket; // The socket for the connection. try { socket = new Socket(host,port); // open the connection. } catch (Exception e) { System.out.println("Thread " + id + " could not open connection to " + host + ":" + port); System.out.println(" Error: " + e); return; } CLMandelbrotTask currentTask = null; CLMandelbrotTask nextTask = null; try { PrintWriter out = new PrintWriter(socket.getOutputStream()); BufferedReader in = new BufferedReader( new InputStreamReader(socket.getInputStream()) ); currentTask = getNextTask(); if (currentTask != null) { // Send first task to the worker program. String taskString = writeTask(currentTask);

11.5. NETWORK PROGRAMMING AND THREADS


out.println(taskString); out.flush();

595

} while (currentTask != null) { String resultString = in.readLine(); // Get results for currentTask. if (resultString == null) throw new IOException("Connection closed unexpectedly."); if (! resultString.startsWith(RESULT COMMAND)) throw new IOException("Illegal string received from worker."); nextTask = getNextTask(); // Get next task and send it to worker. if (nextTask != null) { // Send nextTask to worker before processing results for // currentTask, so that the worker can work on nextTask // while the currentTask results are processed. String taskString = writeTask(nextTask); out.println(taskString); out.flush(); } readResults(resultString, currentTask); finishTask(currentTask); // Process results from currentTask. tasksCompleted++; currentTask = nextTask; // We are finished with old currentTask. nextTask = null; } out.println(CLOSE CONNECTION COMMAND); // Send close command to worker. out.flush(); } catch (Exception e) { System.out.println("Thread " + id + " terminated because of an error"); System.out.println(" Error: " + e); e.printStackTrace(); // Put uncompleted task, if any, back into the task list. if (currentTask != null) reassignTask(currentTask); if (nextTask != null) reassignTask(nextTask); } finally { System.out.println("Thread " + id + " ending after completing " + tasksCompleted + " tasks"); try { socket.close(); } catch (Exception e) { } } } //end run() } // end nested class WorkerConnection

596

CHAPTER 11. FILES AND NETWORKING

11.6

A Brief Introduction to XML

When data is saved to a le or transmitted over a network, it must be represented in some way that will allow the same data to be rebuilt later, when the le is read or the transmission is received. We have seen that there are good reasons to prefer textual, character-based representations in many cases, but there are many ways to represent a given collection of data as text. In this section, well take a brief look at one type of character-based data representation that has become increasingly common. XML (eXtensible Markup Language) is a syntax for creating data representation languages. There are two aspects or levels of XML. On the rst level, XML species a strict but relatively simple syntax. Any sequence of characters that follows that syntax is a well-formed XML document. On the second level, XML provides a way of placing further restrictions on what can appear in a document. This is done by associating a DTD (Document Type Denition) with an XML document. A DTD is essentially a list of things that are allowed to appear in the XML document. A well-formed XML document that has an associated DTD and that follows the rules of the DTD is said to be a valid XML document. The idea is that XML is a general format for data representation, and a DTD species how to use XML to represent a particular kind of data. (There is also an alternative to DTDs, known as XML schemas, for dening valid XLM documents, but lets ignore them here.) There is nothing magical about XML. Its certainly not perfect. Its a very verbose language, and some people think its ugly. On the other hand its very exible; it can be used to represent almost any type of data. It was built from the start to support all languages and alphabets. Most important, it has become an accepted standard. There is support in just about any programming language for processing XML documents. There are standard DTDs for describing many dierent kinds of data. There are many ways to design a data representation language, but XML is the one that has happened to come into widespread use. In fact, it has found its way into almost every corner of information technology. For example: There are XML languages for representing mathematical expressions (MathML), musical notation (MusicXML), molecules and chemical reactions (CML), vector graphics (SVG), and many other kinds of information. XML is used by OpenOce and recent versions of Microsoft Oce in the document format for oce applications such as word processing, spreadsheets, and presentations. XML site syndication languages (RSS, ATOM) make it possible for web sites, newspapers, and blogs to make a list of recent headlines available in a standard format that can be used by other web sites and by web browsers; the same format is used to publish podcasts. And XML is a common format for the electronic exchange of business information. My purpose here is not to tell you everything there is to know about XML. I will just explain a few ways in which it can be used in your own programs. In particular, I will not say anything further about DTDs and valid XML. For many purposes, it is sucient to use well-formed XML documents with no associated DTDs.
11.6.1 Basic XML Syntax

An XML document looks a lot like an HTML document (see Subsection 6.2.3). HTML is not itself an XML language, since it does not follow all the strict XML syntax rules, but the basic ideas are similar. Here is a short, well-formed XML document:
<?xml version="1.0"?> <simplepaint version="1.0"> <background red=255 green=153 blue=51/>

11.6. A BRIEF INTRODUCTION TO XML


<curve> <color red=0 green=0 blue=255/> <symmetric>false</symmetric> <point x=83 y=96/> <point x=116 y=149/> <point x=159 y=215/> <point x=216 y=294/> <point x=264 y=359/> <point x=309 y=418/> <point x=371 y=499/> <point x=400 y=543/> </curve> <curve> <color red=255 green=255 blue=255/> <symmetric>true</symmetric> <point x=54 y=305/> <point x=79 y=289/> <point x=128 y=262/> <point x=190 y=236/> <point x=253 y=209/> <point x=341 y=158/> </curve> </simplepaint>

597

The rst line, which is optional, merely identies this as an XML document. This line can also specify other information, such as the character encoding that was used to encode the characters in the document into binary form. If this document had an associated DTD, it would be specied in a DOCTYPE directive on the next line of the le. Aside from the rst line, the document is made up of elements, attributes, and textual content. An element starts with a tag , such as <curve> and ends with a matching end-tag such as </curve>. Between the tag and end-tag is the content of the element, which can consist of text and nested elements. (In the example, the only textual content is the true or false in the <symmetric> elements.) If an element has no content, then the opening tag and end-tag can be combined into a single empty tag , such as <point x=83 y=96/>, which is an abbreviation for <point x=83 y=96></point>. A tag can include attributes such as the x and y in <point x=83 y=96/> or the version in <simplepaint version="1.0">. A document can also include a few other things, such as comments, that I will not discuss here. The basic structure should look familiar to someone familiar with HTML. The most striking dierence is that in XML, you get to choose the tags. Whereas HTML comes with a xed, nite set of tags, with XML you can make up meaningful tag names that are appropriate to your application and that describe the data that is being represented. (For an XML document that uses a DTD, its the author of the DTD who gets to choose the tag names.) Every well-formed XML document follows a strict syntax. Here are some of the most important syntax rules: Tag names and attribute names in XML are case sensitive. A name must begin with a letter and can contain letters, digits and certain other characters. Spaces and ends-of-line are signicant only in textual content. Every tag must either be an empty tag or have a matching end-tag. By matching here, I mean that elements must be properly nested; if a tag is inside some element, then the matching end-tag must also be inside that element. A document must have a root element, which contains all the other elements. The root element in the above example has tag name simplepaint. Every attribute must have a value, and that value must be enclosed in quotation marks; either single quotes or double

598

CHAPTER 11. FILES AND NETWORKING

quotes can be used for this. The special characters < and &, if they appear in attribute values or textual content, must be written as &lt; and &amp;. &lt; and &amp; are examples of entities. The entities &gt;, &quot;, and &apos; are also dened, representing >, double quote, and single quote. (Additional entities can be dened in a DTD.) While this description will not enable you to understand everything that you might encounter in XML documents, it should allow you to design well-formed XML documents to represent data structures used in Java programs.

11.6.2

XMLEncoder and XMLDecoder

We will look at two approaches to representing data from Java programs in XML format. One approach is to design a custom XML language for the specic data structures that you want to represent. We will consider this approach in the next subsection. First, well look at an easy way to store data in XML les and to read those les back into a program. The technique uses the classes XMLEncoder and XMLDecoder. These classes are dened in the package java.beans. An XMLEncoder can be used to write objects to an OutputStream in XML form. An XMLDecoder can be used to read the output of an XMLEncoder and reconstruct the objects that were written by it. XMLEncoder and XMLDecoder have much the same functionality as ObjectOutputStream and ObjectInputStream and are used in much the same way. In fact, you dont even have to know anything about XML to use them. However, you do need to know a little about Java beans. A Java bean is just an object that has certain characteristics. The class that denes a Java bean must be a public class. It must have a constructor that takes no parameters. It should have a get method and a set method for each of its important instance variables. (See Subsection 5.1.3). The last rule is a little vague. The idea is that is should be possible to inspect all aspects of the objects state by calling get methods, and it should be possible to set all aspects of the state by calling set methods. A bean is not required to implement any particular interface; it is recognized as a bean just by having the right characteristics. Usually, Java beans are passive data structures that are acted upon by other objects but dont do much themselves. XMLEncoder and XMLDecoder cant be used with arbitrary objects; they can only be used with beans. When an XMLEncoder writes an object, it uses the get methods of that object to nd out what information needs to be saved. When an XMLDecoder reconstructs an object, it creates the object using the constructor with no parameters and it uses set methods to restore the objects state to the values that were saved by the XMLEncoder. (Some standard java classes are processed using additional techniques. For example, a dierent constructor might be used, and other methods might be used to inspect and restore the state.) For an example, we return to the same SimplePaint example that was used in Subsection 11.3.4. Suppose that we want to use XMLEncoder and XMLDecoder to create and read les in that program. Part of the data for a SimplePaint sketch is stored in objects of type CurveData, dened as:
private static class CurveData { Color color; // The color of the curve. boolean symmetric; // Are reflections also drawn? ArrayList<Point> points; // The points on the curve. }

To use such objects with XMLEncoder and XMLDecoder, we have to modify this class so that

11.6. A BRIEF INTRODUCTION TO XML

599

it follows the Java bean pattern. The class has to be public, and we need get and set methods for each instance variable. This gives:
public static class CurveData { private Color color; // The color of the curve. private boolean symmetric; // Are reflections also drawn? private ArrayList<Point> points; // The points on the curve. public Color getColor() { return color; } public void setColor(Color color) { this.color = color; } public ArrayList<Point> getPoints() { return points; } public void setPoints(ArrayList<Point> points) { this.points = points; } public boolean isSymmetric() { return symmetric; } public void setSymmetric(boolean symmetric) { this.symmetric = symmetric; } }

I didnt really need to make the instance variables private, but bean properties are usually private and are accessed only through their get and set methods. At this point, we might dene another bean class, SketchData, to hold all the necessary data for representing the users picture. If we did that, we could write the data to a le with a single output statement. In my program, however, I decided to write the data in several pieces. An XMLEncoder can be constructed to write to any output stream. The output stream is specied in the encoders constructor. For example, to create an encoder for writing to a le:
XMLEncoder encoder; try { FileOutputStream stream = new FileOutputStream(selectedFile); encoder = new XMLEncoder( stream ); . .

Once an encoder has been created, its writeObject() method is used to write objects, coded into XML form, to the stream. In the SimplePaint program, I save the background color, the number of curves in the picture, and the data for each curve. The curve data are stored in a list of type ArrayList<CurveData> named curves. So, a complete representation of the users picture can be created with:
encoder.writeObject(getBackground()); encoder.writeObject(new Integer(curves.size())); for (CurveData c : curves) encoder.writeObject(c); encoder.close();

600

CHAPTER 11. FILES AND NETWORKING

When reading the data back into the program, an XMLDecoder is created to read from an input le stream. The objects are then read, using the decoders readObject() method, in the same order in which they were written. Since the return type of readObject() is Object, the returned values must be type-cast to their correct type:
Color bgColor = (Color)decoder.readObject(); Integer curveCt = (Integer)decoder.readObject(); ArrayList<CurveData> newCurves = new ArrayList<CurveData>(); for (int i = 0; i < curveCt; i++) { CurveData c = (CurveData)decoder.readObject(); newCurves.add(c); } decoder.close(); curves = newCurves; // Replace the programs data with data from the file. setBackground(bgColor); repaint();

You can look at the sample program SimplePaintWithXMLEncoder.java to see this code in the context of a complete program. Files are created by the method doSaveAsXML() and are read by doOpenAsXML(). The XML format used by XMLEncoder and XMLDecoder is more robust than the binary format used for object streams and is more appropriate for long-term storage of objects in les.

11.6.3

Working With the DOM

The output produced by an XMLEncoder tends to be long and not very easy for a human reader to understand. It would be nice to represent data in a more compact XML format that uses meaningful tag names to describe the data and makes more sense to human readers. Well look at yet another version of SimplePaint that does just that. See SimplePaintWithXML.java for the source code. The sample XML document shown earlier in this section was produced by this program. I designed the format of that document to represent all the data needed to reconstruct a picture in SimplePaint. The document encodes the background color of the picture and a list of curves. Each <curve> element contains the data from one object of type CurveData. It is easy enough to write data in a customized XML format, although we have to be very careful to follow all the syntax rules. Here is how I write the data for a SimplePaint picture to a PrintWriter, out:
out.println("<?xml version=\"1.0\"?>"); out.println("<simplepaint version=\"1.0\">"); Color bgColor = getBackground(); out.println(" <background red=" + bgColor.getRed() + " green=" + bgColor.getGreen() + " blue=" + bgColor.getBlue() + "/>"); for (CurveData c : curves) { out.println(" <curve>"); out.println(" <color red=" + c.color.getRed() + " green=" + c.color.getGreen() + " blue=" + c.color.getBlue() + "/>"); out.println(" <symmetric>" + c.symmetric + "</symmetric>"); for (Point pt : c.points) out.println(" <point x=" + pt.x + " y=" + pt.y + "/>"); out.println(" </curve>"); } out.println("</simplepaint>");

11.6. A BRIEF INTRODUCTION TO XML

601

Reading the data back into the program is another matter. To reconstruct the data structure represented by the XML Document, it is necessary to parse the document and extract the data from it. Fortunately, Java has a standard API for parsing and processing XML Documents. (Actually, it has two, but we will only look at one of them.) A well-formed XML document has a certain structure, consisting of elements containing attributes, nested elements, and textual content. Its possible to build a data structure in the computers memory that corresponds to the structure and content of the document. Of course, there are many ways to do this, but there is one common standard representation known as the Document Object Model , or DOM. The DOM species how to build data structures to represent XML documents, and it species some standard methods for accessing the data in that structure. The data structure is a kind of tree whose structure mirrors the structure of the document. The tree is constructed from nodes of various types. There are nodes to represent elements, attributes, and text. (The tree can also contain several other types of node, representing aspects of XML that we can ignore here.) Attributes and text can be processed without directly manipulating the corresponding nodes, so we will be concerned almost entirely with element nodes. The sample program XMLDemo.java lets you experiment with XML and the DOM. It has a text area where you can enter an XML document. Initially, the input area contains the sample XML document from this section. When you click a button named Parse XML Input, the program will attempt to read the XML from the input box and build a DOM representation of that document. If the input is not legal XML, an error message is displayed. If it is legal, the program will traverse the DOM representation and display a list of elements, attributes, and textual content that it encounteres. (The program uses a few techniques that I wont discuss here.) In Java, the DOM representation of an XML document le can be created with just two statements. If selectedFile is a variable of type File that represents the XML le, then
DocumentBuilder docReader = DocumentBuilderFactory.newInstance().newDocumentBuilder(); xmldoc = docReader.parse(selectedFile);

will open the le, read its contents, and build the DOM representation. The classes DocumentBuilder and DocumentBuilderFactory are both dened in the package javax.xml.parsers. The method docReader.parse() does the actual work. It will throw an exception if it cant read the le or if the le does not contain a legal XML document. If it succeeds, then the value returned by docReader.parse() is an object that represents the entire XML document. (This is a very complex task! It has been coded once and for all into a method that can be used very easily in any Java program. We see the benet of using a standardized syntax.) The structure of the DOM data structure is dened in the package org.w3c.dom, which contains several data types that represent an XML document as a whole and the individual nodes in a document. The org.w3c in the name refers to the World Wide Web Consortium, W3C, which is the standards organization for the Web. DOM, like XML, is a general standard, not just a Java standard. The data types that we need here are Document, Node, Element, and NodeList. (They are dened as interfaces rather than classes, but that fact is not relevant here.) We can use methods that are dened in these data types to access the data in the DOM representation of an XML document. An object of type Document represents an entire XML document. The return value of docReader.parse()xmldoc in the above exampleis of type Document. We will only need one method from this class: If xmldoc is of type Document, then

602
xmldoc.getDocumentElement()

CHAPTER 11. FILES AND NETWORKING

returns a value of type Element that represents the root element of the document. (Recall that this is the top-level element that contains all the other elements.) In the sample XML document from earlier in this section, the root element consists of the tag <simplepaint version="1.0">, the end-tag </simplepaint>, and everything in between. The elements that are nested inside the root element are represented by their own nodes, which are said to be children of the root node. An object of type Element contains several useful methods. If element is of type Element, then we have: element.getTagName() returns a String containing the name that is used in the elements tag. For example, the name of a <curve> element is the string curve. element.getAttribute(attrName) if attrName is the name of an attribute in the element, then this method returns the value of that attribute. For the element, <point x="83" y="42"/>, element.getAttribute("x") would return the string 83. Note that the return value is always a String, even if the attribute is supposed to represent a numerical value. If the element has no attribute with the specied name, then the return value is an empty string. element.getTextContent() returns a String containing all the textual content that is contained in the element. Note that this includes text that is contained inside other elements that are nested inside the element. element.getChildNodes() returns a value of type NodeList that contains all the Nodes that are children of the element. The list includes nodes representing other elements and textual content that are directly nested in the element (as well as some other types of node that I dont care about here). The getChildNodes() method makes it possible to traverse the entire DOM data structure by starting with the root element, looking at children of the root element, children of the children, and so on. (There is a similar method that returns the attributes of the element, but I wont be using it here.) element.getElementsByTagName(tagName) returns a NodeList that contains all the nodes representing all elements that are are nested inside element and which have the given tag name. Note that this includes elements that are nested to any level, not just elements that are directly contained inside element. The getElementsByTagName() method allows you to reach into the document and pull out specic data that you are interested in. An object of type NodeList represents a list of Nodes. It does not use the API dened for lists in the Java Collection Framework. Instead, a value, nodeList, of type NodeList has two methods: nodeList.getLength() returns the number of nodes in the list, and nodeList.item(i) returns the node at position i, where the positions are numbered 0, 1, . . . , nodeList.getLength() - 1. Note that the return value of nodeList.get() is of type Node, and it might have to be type-cast to a more specic node type before it is used. Knowing just this much, you can do the most common types of processing of DOM representations. Lets look at a few code fragments. Suppose that in the course of processing a document you come across an Element node that represents the element
<background red=255 green=153 blue=51/>

This element might be encountered either while traversing the document with getChildNodes() or in the result of a call to getElementsByTagName("background"). Our goal is to reconstruct the data structure represented by the document, and this element represents part of that data. In this case, the element represents a color, and the red, green, and blue components are given

11.6. A BRIEF INTRODUCTION TO XML

603

by the attributes of the element. If element is a variable that refers to the node, the color can be obtained by saying:
int r int g int b Color = Integer.parseInt( element.getAttribute("red") ); = Integer.parseInt( element.getAttribute("green") ); = Integer.parseInt( element.getAttribute("blue") ); bgColor = new Color(r,g,b);

Suppose now that element refers to the node that represents the element
<symmetric>true</symmetric>

In this case, the element represents the value of a boolean variable, and the value is encoded in the textual content of the element. We can recover the value from the element with:
String bool = element.getTextContent(); boolean symmetric; if (bool.equals("true")) symmetric = true; else symmetric = false;

Next, consider an example that uses a NodeList. Suppose we encounter an element that represents a list of Points:
<pointlist> <point x=17 y=42/> <point x=23 y=8/> <point x=109 y=342/> <point x=18 y=270/> </pointlist>

Suppose that element refers to the node that represents the <pointlist> element. Our goal is to build the list of type ArrayList<Point> that is represented by the element. We can do this by traversing the NodeList that contains the child nodes of element:
ArrayList<Point> points = new ArrayList<Point>(); NodeList children = element.getChildNodes(); for (int i = 0; i < children.getLength(); i++) { Node child = children.item(i); // One of the child nodes of element. if ( child instanceof Element ) { Element pointElement = (Element)child; // One of the <point> elements. int x = Integer.parseInt( pointElement.getAttribute("x") ); int y = Integer.parseInt( pointElement.getAttribute("y") ); Point pt = new Point(x,y); // Create the Point represented by pointElement. points.add(pt); // Add the point to the list of points. } }

All the nested <point> elements are children of the <pointlist> element. The if statement in this code fragment is necessary because an element can have other children in addition to its nested elements. In this example, we only want to process the children that are elements. All these techniques can be employed to write the le input method for the sample program SimplePaintWithXML.java. When building the data structure represented by an XML le, my approach is to start with a default data structure and then to modify and add to it as I traverse the DOM representation of the le. Its not a trivial process, but I hope that you can follow it:

604

CHAPTER 11. FILES AND NETWORKING


Color newBackground = Color.WHITE; ArrayList<CurveData> newCurves = new ArrayList<CurveData>(); Element rootElement = xmldoc.getDocumentElement(); if ( ! rootElement.getNodeName().equals("simplepaint") ) throw new Exception("File is not a SimplePaint file."); String version = rootElement.getAttribute("version"); try { double versionNumber = Double.parseDouble(version); if (versionNumber > 1.0) throw new Exception("File requires a newer version of SimplePaint."); } catch (NumberFormatException e) { } NodeList nodes = rootElement.getChildNodes(); for (int i = 0; i < nodes.getLength(); i++) { if (nodes.item(i) instanceof Element) { Element element = (Element)nodes.item(i); if (element.getTagName().equals("background")) { // Read background color. int r = Integer.parseInt(element.getAttribute("red")); int g = Integer.parseInt(element.getAttribute("green")); int b = Integer.parseInt(element.getAttribute("blue")); newBackground = new Color(r,g,b); } else if (element.getTagName().equals("curve")) { // Read data for a curve. CurveData curve = new CurveData(); curve.color = Color.BLACK; curve.points = new ArrayList<Point>(); newCurves.add(curve); // Add this curve to the new list of curves. NodeList curveNodes = element.getChildNodes(); for (int j = 0; j < curveNodes.getLength(); j++) { if (curveNodes.item(j) instanceof Element) { Element curveElement = (Element)curveNodes.item(j); if (curveElement.getTagName().equals("color")) { int r = Integer.parseInt(curveElement.getAttribute("red")); int g = Integer.parseInt(curveElement.getAttribute("green")); int b = Integer.parseInt(curveElement.getAttribute("blue")); curve.color = new Color(r,g,b); } else if (curveElement.getTagName().equals("point")) { int x = Integer.parseInt(curveElement.getAttribute("x")); int y = Integer.parseInt(curveElement.getAttribute("y")); curve.points.add(new Point(x,y)); } else if (curveElement.getTagName().equals("symmetric")) { String content = curveElement.getTextContent(); if (content.equals("true")) curve.symmetric = true; } } } }

11.6. A BRIEF INTRODUCTION TO XML


} } curves = newCurves; // Change picture in window to show the data from file. setBackground(newBackground); repaint();

605


XML has developed into an extremely important technology, and some applications of it are very complex. But there is a core of simple ideas that can be easily applied in Java. Knowing just the basics, you can make good use of XML in your own Java programs.

606

CHAPTER 11. FILES AND NETWORKING

Exercises for Chapter 11


1. The sample program DirectoryList.java, given as an example in Subsection 11.2.2, will print a list of les in a directory specied by the user. But some of the les in that directory might themselves be directories. And the subdirectories can themselves contain directories. And so on. Write a modied version of DirectoryList that will list all the les in a directory and all its subdirectories, to any level of nesting. You will need a recursive subroutine to do the listing. The subroutine should have a parameter of type File. You will need the constructor from the File class that has the form
public File( File dir, String fileName ) // Constructs the File object representing a file // named fileName in the directory specified by dir.

2. Write a program that will count the number of lines in each le that is specied on the command line. Assume that the les are text les. Note that multiple les can be specied, as in:
java LineCounts file1.txt file2.txt file3.txt

Write each le name, along with the number of lines in that le, to standard output. If an error occurs while trying to read from one of the les, you should print an error message for that le, but you should still process all the remaining les. Do not use TextIO to process the les; use a FileReader to access each le. 3. For this exercise, you will write a network server program. The program is a simple le server that makes a collection of les available for transmission to clients. When the server starts up, it needs to know the name of the directory that contains the collection of les. This information can be provided as a command-line argument. You can assume that the directory contains only regular les (that is, it does not contain any sub-directories). You can also assume that all the les are text les. When a client connects to the server, the server rst reads a one-line command from the client. The command can be the string index. In this case, the server responds by sending a list of names of all the les that are available on the server. Or the command can be of the form get filename, where filename is a le name. The server checks whether the requested le actually exists. If so, it rst sends the word ok as a message to the client. Then it sends the contents of the le and closes the connection. Otherwise, it sends the word error to the client and closes the connection. Ideally, your server should start a separate thread to handle each connection request. However, if you dont want to deal with threads you can just call a subroutine to handle the request. See the DirectoryList example in Subsection 11.2.2 for help with the problem of getting the list of les in the directory. 4. Write a client program for the server from Exercise 11.3. Design a user interface that will let the user do at least two things: (1) Get a list of les that are available on the server and display the list on standard output; and (2) Get a copy of a specied le from the server and save it to a local le (on the computer where the client is running). 5. The sample program PhoneDirectoryFileDemo.java, from Subsection 11.3.2, stores name/number pairs for a simple phone book in a text le in the users home directory.

Exercises

607

Modify that program so that is uses an XML format for the data. The only signicant changes that you will have to make are to the parts of the program that read and write the data le. Use the DOM to read the data, as discussed in Subsection 11.6.3. You can use the XML format illustrated in the following sample phone directory le:
<?xml version="1.0"?> <phone directory> <entry name=barney number=890-1203/> <entry name=fred number=555-9923/> </phone directory>

(This is just an easy exercise in simple XML processing; as before, the program in this exercise is not meant to be a useful phone directory program.) 6. The sample program Checkers.java from Subsection 7.5.3 lets two players play checkers. It would be nice if, in the middle of a game, the state of the game could be saved to a le. Later, the le could be read back into the le to restore the game and allow the players to continue. Add the ability to save and load les to the checkers program. Design a simple text-based format for the les. Here is a picture of my solution to this exercise, just after a le has been loaded into the program:

Note: The original checkers program could be run as either an applet or a stand-alone application. Since the new version uses les, however, it can only be run as an application. An applet running in a web browser is not allowed to access les. Its a little tricky to completely restore the state of a game. The program has a variable board of type CheckersData that stores the current contents of the board, and it has a variable currentPlayer of type int that indicates whether Red or Black is currently moving. This data must be stored in the le when a le is saved. When a le is read into the program, you should read the data into two local variables newBoard of type CheckersData and newCurrentPlayer of type int. Once you have successfully read all the data from the le, you can use the following code to set up the program state correctly. This code assumes that you have introduced two new variables saveButton and loadButton of type JButton to represent the Save Game and Load Game buttons:

608

CHAPTER 11. FILES AND NETWORKING


board = newBoard; // Set up game with data read from file. currentPlayer = newCurrentPlayer; legalMoves = board.getLegalMoves(currentPlayer); selectedRow = -1; gameInProgress = true; newGameButton.setEnabled(false); loadButton.setEnabled(false); saveButton.setEnabled(true); resignButton.setEnabled(true); if (currentPlayer == CheckersData.RED) message.setText("Game loaded -- its REDs move."); else message.setText("Game loaded -- its BLACKs move."); repaint();

(Note, by the way, that I used a TextReader to read the data from the le into my program. TextReader is a non-standard class introduced in Subsection 11.1.4 and dened in the le TextReader.java. How to read the data in a le depends, of course, on the format that you have chosen for the data.)

Quiz

609

Quiz on Chapter 11
1. In Java, input/output is done using streams. Streams are an abstraction. Explain what this means and why it is important. 2. Java has two types of streams: character streams and byte streams. Why? What is the dierence between the two types of streams? 3. What is a le? Why are les necessary? 4. What is the point of the following statement?
out = new PrintWriter( new FileWriter("data.dat") );

Why would you need a statement that involves two dierent stream classes, PrintWriter and FileWriter ? 5. The package java.io includes a class named URL. What does an object of type URL represent, and how is it used? 6. Explain what is meant by the client / server model of network communication. 7. What is a socket? 8. What is a ServerSocket and how is it used? 9. Network server programs are often multithreaded. Explain what this means and why it is true. 10. What is meant by an element in an XML document? 11. What is it about XML that makes it suitable for representing almost any type of data? 12. Write a complete program that will display the rst ten lines from a text le. The lines should be written to standard output, System.out. The le name is given as the commandline argument args[0]. You can assume that the le contains at least ten lines. Dont bother to make the program robust. Do not use TextIO to process the le; use a FileReader to access the le.

610

CHAPTER 11. FILES AND NETWORKING

Chapter 12

Advanced GUI Programming


Its possible to program a wide variety of GUI applications using only the techniques covered in Chapter 6. In many cases, the basic events, components, layouts, and graphics routines covered in that chapter suce. But the Swing graphical user interface library is far richer than what we have seen so far, and it can be used to build highly sophisticated applications. This chapter is a further introduction to Swing and other aspects of GUI programming. Although the title of the chapter is Advanced GUI Programming, it is still just an introduction. Full coverage of this topic would require at least another complete book.

12.1

Images and Resources

We have seen how to use the Graphics class to draw on a GUI component that is visible on
the computers screen. Often, however, it is useful to be able to create a drawing o-screen, in the computers memory. It is also important to be able to work with images that are stored in les. To a computer, an image is just a set of numbers. The numbers specify the color of each pixel in the image. The numbers that represent the image on the computers screen are stored in a part of memory called a frame buer . Many times each second, the computers video card reads the data in the frame buer and colors each pixel on the screen according to that data. Whenever the computer needs to make some change to the screen, it writes some new numbers to the frame buer, and the change appears on the screen a fraction of a second later, the next time the screen is redrawn by the video card. Since its just a set of numbers, the data for an image doesnt have to be stored in a frame buer. It can be stored elsewhere in the computers memory. It can be stored in a le on the computers hard disk. Just like any other data le, an image le can be downloaded over the Internet. Java includes standard classes and subroutines that can be used to copy image data from one part of memory to another and to get data from an image le and use it to display the image on the screen.

12.1.1

Images and BueredImages

The class java.awt.Image represents an image stored in the computers memory. There are two fundamentally dierent types of Image. One kind represents an image read from a source outside the program, such as from a le on the computers hard disk or over a network connection. The second type is an image created by the program. I refer to this second type as an o-screen canvas. An o-screen canvas is a region of the computers memory that can be used as a 611

612

CHAPTER 12. ADVANCED GUI PROGRAMMING

drawing surface. It is possible to draw to an oscreen image using the same Graphics class that is used for drawing on the screen. An Image of either type can be copied onto the screen (or onto an o-screen canvas) using methods that are dened in the Graphics class. This is most commonly done in the paintComponent() method of a JComponent. Suppose that g is the Graphics object that is provided as a parameter to the paintComponent() method, and that img is of type Image. Then the statement
g.drawImage(img, x, y, this);

will draw the image img in a rectangular area in the component. The integer-valued parameters x and y give the position of the upper-left corner of the rectangle in which the image is displayed, and the rectangle is just large enough to hold the image. The fourth parameter, this, is the special variable from Subsection 5.6.1 that refers to the JComponent itself. This parameter is there for technical reasons having to do with the funny way Java treats image les. For most applications, you dont need to understand this, but here is how it works: g.drawImage() does not actually draw the image in all cases. It is possible that the complete image is not available when this method is called; this can happen, for example, if the image has to be read from a le. In that case, g.drawImage() merely initiates the drawing of the image and returns immediately. Pieces of the image are drawn later, asynchronously, as they become available. The question is, how do they get drawn? Thats where the fourth parameter to the drawImage method comes in. The fourth parameter is something called an ImageObserver. When a piece of the image becomes available to be drawn, the system will inform the ImageObserver, and that piece of the image will appear on the screen. Any JComponent object can act as an ImageObserver. The drawImage method returns a boolean value to indicate whether the image has actually been drawn or not when the method returns. When drawing an image that you have created in the computers memory, or one that you are sure has already been completely loaded, you can set the ImageObserver parameter to null. There are a few useful variations of the drawImage() method. For example, it is possible to scale the image as it is drawn to a specied width and height. This is done with the command
g.drawImage(img, x, y, width, height, imageObserver);

The parameters width and height give the size of the rectangle in which the image is displayed. Another version makes it possible to draw just part of the image. In the command:
dest y1, dest x2, dest y2, g.drawImage(img, dest x1, source x1, source y1, source x2, source y2, imageObserver);

the integers source x1, source y1, source x2, and source y2 specify the top-left and bottomright corners of a rectangular region in the source image. The integers dest x1, dest y1, dest x2, and dest y2 specify the corners of a region in the destination graphics context. The specied rectangle in the image is drawn, with scaling if necessary, to the specied rectangle in the graphics context. For an example in which this is useful, consider a card game that needs to display 52 dierent cards. Dealing with 52 image les can be cumbersome and inecient, especially for downloading over the Internet. So, all the cards might be put into a single image:

12.1. IMAGES AND RESOURCES

613

(This image is from the Gnome desktop project, http://www.gnome.org, and is shown here much smaller than its actual size.) Now, only one Image object is needed. Drawing one card means drawing a rectangular region from the image. This technique is used in a variation of the sample program HighLowGUI.java from Subsection 6.7.6. In the original version, the cards are represented by textual descriptions such as King of Hearts. In the new version, HighLowWithImages.java, the cards are shown as images. An applet version of the program can be found in the on-line version of this section. In the program, the cards are drawn using the following method. The instance variable cardImages is a variable of type Image that represents the image that is shown above, containing 52 cards, plus two Jokers and a face-down card. Each card is 79 by 123 pixels. These numbers are used, together with the suit and value of the card, to compute the corners of the source rectangle for the drawImage() command:
/** * Draws a card in a 79x123 pixel rectangle with its * upper left corner at a specified point (x,y). Drawing the card * requires the image file "cards.png". * @param g The graphics context used for drawing the card. * @param card The card that is to be drawn. If the value is null, then a * face-down card is drawn. * @param x the x-coord of the upper left corner of the card * @param y the y-coord of the upper left corner of the card */ public void drawCard(Graphics g, Card card, int x, int y) { int cx; // x-coord of upper left corner of the card inside cardsImage int cy; // y-coord of upper left corner of the card inside cardsImage if (card == null) { cy = 4*123; // coords for a face-down card. cx = 2*79; } else {

614

CHAPTER 12. ADVANCED GUI PROGRAMMING


cx = (card.getValue()-1)*79; switch (card.getSuit()) { case Card.CLUBS: cy = 0; break; case Card.DIAMONDS: cy = 123; break; case Card.HEARTS: cy = 2*123; break; default: // spades cy = 3*123; break; } } g.drawImage(cardImages,x,y,x+79,y+123,cx,cy,cx+79,cy+123,this); }

I will tell you later in this section how the image le, cards.png, can be loaded into the program.


In addition to images loaded from les, it is possible to create images by drawing to an o-screen canvas. An o-screen canvas can be represented by an object belonging to the class BueredImage, which is dened in the package java.awt.image. BueredImage is a subclass of Image, so that once you have a BueredImage, you can copy it into a graphics context g using one of the g.drawImage() methods, just as you would do with any other image. A BueredImage can be created using the constructor
public BufferedImage(int width, int height, int imageType)

where width and height specify the width and height of the image in pixels, and imageType can be one of several constants that are dened in the BueredImage. The image type species how the color of each pixel is represented. The most likely value for imageType is BufferedImage.TYPE INT RGB, which species that the color of each pixel is a usual RGB color, with red, green and blue components in the range 0 to 255. The image type BufferedImage.TYPE INT ARGB represents an RGB image with transparency; see the next section for more information on this. The image type BufferedImage.TYPE BYTE GRAY can be used to create a grayscale image in which the only possible colors are shades of gray. To draw to a BueredImage, you need a graphics context that is set up to do its drawing on the image. If OSC is of type BueredImage, then the method
OSC.getGraphics()

returns an object of type Graphics that can be used for drawing on the image. There are several reasons why a programmer might want to draw to an o-screen canvas. One is to simply keep a copy of an image that is shown on the screen. Remember that a picture that is drawn on a component can be lost, for example when the component is covered by another window. This means that you have to be able to redraw the picture on demand, and that in turn means keeping enough information around to enable you to redraw the picture. One way to do this is to keep a copy of the picture in an o-screen canvas. Whenever the onscreen picture needs to be redrawn, you just have to copy the contents of the o-screen canvas onto the screen. Essentially, the o-screen canvas allows you to save a copy of the color of every

12.1. IMAGES AND RESOURCES

615

individual pixel in the picture. The sample program PaintWithOScreenCanvas.java is a little painting program that uses an o-screen canvas in this way. In this program, the user can draw curves, lines, and various shapes; a Tool menu allows the user to select the thing to be drawn. There is also an Erase tool and a Smudge tool that I will get to later. A BueredImage is used to store the users picture. When the user changes the picture, the changes are made to the image, and the changed image is then copied to the screen. No record is kept of the shapes that the user draws; the only record is the color of the individual pixels in the o-screen image. (You should contrast this with the program SimplePaint2.java in Subsection 7.3.4, where the users drawing is recorded as a list of objects that represent the shapes that user drew.) You should try the program (or the applet version in the on-line version of this section). Try drawing a Filled Rectangle on top of some other shapes. As you drag the mouse, the rectangle stretches from the starting point of the mouse drag to the current mouse location. As the mouse moves, the underlying picture seems to be unaectedparts of the picture can be covered up by the rectangle and later uncovered as the mouse moves, and they are still there. What this means is that the rectangle that is shown as you drag the mouse cant actually be part of the o-screen canvas, since drawing something into an image means changing the color of some pixels in the image. The previous colors of those pixels are not stored anywhere else and so are permanently lost. In fact, when you draw a line, rectangle, or oval in PaintWithOffScreenCanvas, the shape that is shown as you drag the mouse is not drawn to the o-screen canvas at all. Instead, the paintComponent() method draws the shape on top of the contents of the canvas. Only when you release the mouse does the shape become a permanent part of the o-screen canvas. This illustrates the point that when an o-screen canvas is used, not everything that is visible on the screen has to be drawn on the canvas. Some extra stu can be drawn on top of the contents of the canvas by the paintComponent() method. The other tools are handled dierently from the shape tools. For the curve, erase, and smudge tools, the changes are made to the canvas immediately, as the mouse is being dragged. Lets look at how an o-screen canvas is used in this program. The canvas is represented by an instance variable, OSC, of type BueredImage. The size of the canvas must be the same size as the panel on which the canvas is displayed. The size can be determined by calling the getWidth() and getHeight() instance methods of the panel. Furthermore, when the canvas is rst created, it should be lled with the background color, which is represented in the program by an instance variable named fillColor. All this is done by the method:
/** * This method creates the off-screen canvas and fills it with the current * fill color. */ private void createOSC() { OSC = new BufferedImage(getWidth(),getHeight(),BufferedImage.TYPE INT RGB); Graphics osg = OSC.getGraphics(); osg.setColor(fillColor); osg.fillRect(0,0,getWidth(),getHeight()); osg.dispose(); }

Note how it uses OSC.getGraphics() to obtain a graphics context for drawing to the image. Also note that the graphics context is disposed at the end of the method. It is good practice to dispose a graphics context when you are nished with it. There still remains the problem of where to call this method. The problem is that the width and height of the panel object are not set until some time after the panel object is constructed. If createOSC() is called in

616

CHAPTER 12. ADVANCED GUI PROGRAMMING

the constructor, getWidth() and getHeight() will return the value zero and we wont get an o-screen image of the correct size. The approach that I take in PaintWithOffScreenCanvas is to call createOSC() in the paintComponent() method, the rst time the paintComponent() method is called. At that time, the size of the panel has denitely been set, but the user has not yet had a chance to draw anything. With this in mind you are ready to understand the paintComponent() method:
public void paintComponent(Graphics g) { /* First create the off-screen canvas, if it does not already exist. */ if (OSC == null) createOSC(); /* Copy the off-screen canvas to the panel. Since we know that the image is already completely available, the fourth "ImageObserver" parameter to g.drawImage() can be null. Since the canvas completely fills the panel, there is no need to call super.paintComponent(g). */ g.drawImage(OSC,0,0,null); /* If the user is currently dragging the mouse to draw a line, oval, or rectangle, draw the shape on top of the image from the off-screen canvas, using the current drawing color. (This is not done if the user is drawing a curve or using the smudge tool or the erase tool.) */ if (dragging && SHAPE TOOLS.contains(currentTool)) { g.setColor(currentColor); putCurrentShape(g); } }

Here, dragging is a boolean instance variable that is set to true while the user is dragging the mouse, and currentTool tells which tool is currently in use. The possible tools are dened by an enum named Tool, and SHAPE TOOLS is a variable of type EnumSet<Tool> that contains the line, oval, rectangle, lled oval, and lled rectangle tools. (See Subsection 10.2.4.) You might notice that there is a problem if the size of the panel is ever changed, since the size of the o-screen canvas will not be changed to match. The PaintWithOffScreenCanvas program does not allow the user to resize the programs window, so this is not an issue in that program. If we want to allow resizing, however, a new o-screen canvas must be created whenever the size of the panel changes. One simple way to do this is to check the size of the canvas in the paintComponent() method and to create a new canvas if the size of the canvas does not match the size of the panel:
if (OSC == null || getWidth() != OSC.getWidth() || getHeight() != OSC.getHeight()) createOSC();

Of course, this will discard the picture that was contained in the old canvas unless some arrangement is made to copy the picture from the old canvas to the new one before the old canvas is discarded. The other point in the program where the o-screen canvas is used is during a mouse-drag operation, which is handled in the mousePressed(), mouseDragged(), and mouseReleased() methods. The strategy that is implemented was discussed above. Shapes are drawn to the o-screen canvas only at the end of the drag operation, in the mouseReleased() method.

12.1. IMAGES AND RESOURCES

617

However, as the user drags the mouse, the part of the image where the shape appears is redrawn each time the mouse is moved; the shape that appears on the screen is drawn on top of the canvas by the paintComponent() method. For the other tools, changes are made directly to the canvas, and the region that was changed is repainted so that the change will appear on the screen. (By the way, the program uses a version of the repaint() method that repaints just a part of a component. The command repaint(x,y,width,height) tells the system to repaint the rectangle with upper left corner (x,y) and with the specied width and height. This can be substantially faster than repainting the entire component.) See the source code, PaintWithOScreenCanvas.java, if you want to see how its all done.


One traditional use of o-screen canvasses is for double buering . In double-buering, the o-screen image is an exact copy of the image that appears on screen. In double buering, whenever the on-screen picture needs to be redrawn, the new picture is drawn step-by-step to an o-screen image. This can take some time. If all this drawing were done on screen, the user would see the image icker as it is drawn. Instead, the long drawing process takes place o-screen and the completed image is then copied very quickly onto the screen. The user doesnt see all the steps involved in redrawing. This technique can be used to implement smooth, icker-free animation. The term double buering comes from the term frame buer, which refers to the region in memory that holds the image on the screen. In fact, true double buering uses two frame buers. The video card can display either frame buer on the screen and can switch instantaneously from one frame buer to the other. One frame buer is used to draw a new image for the screen. Then the video card is told to switch from one frame buer to the other. No copying of memory is involved. Double-buering as it is implemented in Java does require copying, which takes some time and is not perfectly icker-free. In Javas older AWT graphical API, it was up to the programmer to do double buering by hand. In the Swing graphical API, double buering is applied automatically by the system, and the programmer doesnt have to worry about it. (It is possible to turn this automatic double buering o in Swing, but there is seldom a good reason to do so.) One nal historical note about o-screen canvasses: There is an alternative way to create them. The Component class denes the following instance method, which can be used in any GUI component object:
public Image createImage(int width, int height)

This method creates an Image with a specied width and height. You can use this image as an o-screen canvas in the same way that you would a BueredImage. In fact, you can expect that in a modern version of Java, the image that is returned by this method is in fact a BueredImage. The createImage() method was part of Java from the beginning, before the BueredImage class was introduced.

12.1.2

Working With Pixels

One good reason to use a BueredImage is that it allows easy access to the colors of individual pixels. If image is of type BueredImage, then we have the methods: image.getRGB(x,y) returns an int that encodes the color of the pixel at coordinates (x,y) in the image. The values of the integers x and y must lie within the image. That is,

618

CHAPTER 12. ADVANCED GUI PROGRAMMING it must be true that 0 <= x < image.getWidth() and 0 <= y < image.getHeight(); if not, then an exception is thrown. image.setRGB(x,y,rgb) sets the color of the pixel at coordinates (x,y) to the color encoded by rgb. Again, x and y must be in the valid range. The third parameter, rgb, is an integer that encodes the color.

These methods use integer codes for colors. If c is of type Color, the integer code for the color can be obtained by calling c.getRGB(). Conversely, if rgb is an integer that encodes a color, the corresponding Color object can be obtained with the constructor call new Color(rgb). This means that you can use
Color c = new Color( image.getRGB(x,y) )

to get the color of a pixel as a value of type Color. And if c is of type Color, you can set a pixel to that color with
image.setRGB( x, y, c.getRGB() );

The red, green, and blue components of a color are represented as 8-bit integers, in the range 0 to 255. When a color is encoded as a single int, the blue component is contained in the eight low-order bits of the int, the green component in the next lowest eight bits, and the red component in the next eight bits. (The eight high order bits store the alpha component of the color, which well encounter in the next section.) It is easy to translate between the two representations using the shift operators << and >> and the bitwise logical operators & and |. (I have not covered these operators previously in this book. Briey: If A and B are integers, then A << B is the integer obtained by shifting each bit of A B bit positions to the left; A >> B is the integer obtained by shifting each bit of A B bit positions to the right; A & B is the integer obtained by applying the logical and operation to each pair of bits in A and B; and A | B is obtained similarly, using the logical or operation. For example, using 8-bit binary numbers, we have: 01100101 & 10100001 is 00100001, while 01100101 | 10100001 is 11100101.) You dont necessarily need to understand these operators. Here are incantations that you can use to work with color codes:
/* Suppose that rgb is an int that encodes a color. To get separate red, green, and blue color components: *; int red = (rgb >> 16) & 0xFF; int green = (rgb >> 8) & 0xFF; int blue = rgb & 0xFF; /* Suppose that red, green, and blue are color components in the range 0 to 255. To combine them into a single int: */ int rgb = (red << 16) | (green << 8) | blue;


An example of using pixel colors in a BueredImage is provided by the smudge tool in the sample program PaintWithOScreenCanvas.java. The purpose of this tool is to smear the colors of an image, as if it were drawn in wet paint. For example, if you rub the middle of a black rectangle with the smudge tool, youll get something like this:

12.1. IMAGES AND RESOURCES

619

This is an eect that can only be achieved by manipulating the colors of individual pixels! Heres how it works: when the user presses the mouse using the smudge tool, the color components of a 7-by-7 block of pixels are copied from the o-screen canvas into arrays named smudgeRed, smudgeGreen and smudgeBlue. This is done in the mousePressed() routine with the following code:
int w = OSC.getWidth(); int h = OSC.getHeight(); int x = evt.getX(); int y = evt.getY(); for (int i = 0; i < 7; i++) for (int j = 0; j < 7; j++) { int r = y + j - 3; int c = x + i - 3; if (r < 0 || r >= h || c < 0 || c >= w) { // A -1 in the smudgeRed array indicates that the // corresponding pixel was outside the canvas. smudgeRed[i][j] = -1; } else { int color = OSC.getRGB(c,r); smudgeRed[i][j] = (color >> 16) & 0xFF; smudgeGreen[i][j] = (color >> 8) & 0xFF; smudgeBlue[i][j] = color & 0xFF; } }

The arrays are of type double[ ][ ] because I am going to do some computations with them that require real numbers. As the user moves the mouse, the colors in the array are blended with the colors in the image, just as if you were mixing wet paint by smudging it with your nger. That is, the colors at the new mouse position in the image are replaced with a weighted average of the current colors in the image and the colors in the arrays. This has the eect of moving some of the color from the previous mouse position to the new mouse position. At the same time, the colors in the arrays are replaced by a weighted average of the colors in the arrays and the colors from the image. This has the eect of moving some color from the image into the arrays. This is done using the following code for each pixel position, (c,r), in a 7-by-7 block around the new mouse location:
int curCol = OSC.getRGB(c,r); int curRed = (curCol >> 16) & 0xFF; int curGreen = (curCol >> 8) & 0xFF; int curBlue = curCol & 0xFF; int newRed = (int)(curRed*0.7 + smudgeRed[i][j]*0.3); int newGreen = (int)(curGreen*0.7 + smudgeGreen[i][j]*0.3); int newBlue = (int)(curBlue*0.7 + smudgeBlue[i][j]*0.3); int newCol = newRed << 16 | newGreen << 8 | newBlue; OSC.setRGB(c,r,newCol);

620

CHAPTER 12. ADVANCED GUI PROGRAMMING


smudgeRed[i][j] = curRed*0.3 + smudgeRed[i][j]*0.7; smudgeGreen[i][j] = curGreen*0.3 + smudgeGreen[i][j]*0.7; smudgeBlue[i][j] = curBlue*0.3 + smudgeBlue[i][j]*0.7;

12.1.3

Resources

Throughout this textbook, up until now, we have been thinking of a program as made up entirely of Java code. However, programs often use other types of data, including images, sounds, and text, as part of their basic structure. These data are referred to as resources. An example is the image le, cards.png, that was used in the HighLowWithImages.java program earlier in this section. This le is part of the program. The program needs it in order to run. The user of the program doesnt need to know that this le exists or where it is located; as far as the user is concerned, it is just part of the program. The program of course, does need some way of locating the resource le and loading its data. Resources are ordinarily stored in les that are in the same locations as the compiled class les for the program. Class les are located and loaded by something called a class loader , which is represented in Java by an object of type ClassLoader. A class loader has a list of locations where it will look for class les. This list is called the class path . It includes the location where Javas standard classes are stored. It generally includes the current directory. If the program is stored in a jar le, the jar le is included on the class path. In addition to class les, a ClassLoader is capable of locating resource les that are located on the class path or in subdirectories of locations that are on the class path. The rst step in using a resource is to obtain a ClassLoader and to use it to locate the resource le. In the HighLowWithImages program, this is done with:
ClassLoader cl = HighLowWithImages.class.getClassLoader(); URL imageURL = cl.getResource("cards.png");

The idea of the rst line is that in order to get a class loader, you have to ask a class that was loaded by the class loader. Here, HighLowWithImages.class is a name for the object that represents the actual class HighLowWithImages. In other programs, you would just substitute for HighLowWithImages the name of the class that contains the call to getClassLoader(). The second line uses the class loader to locate the resource le named cards.png. The return value of cl.getResource() is of type java.net.URL, and it represents the location of the resource rather than the resource itself. If the resource le cannot be found, then the return value is null. The class URL was discussed in Subsection 11.4.1. Often, resources are stored not directly on the class path but in a subdirectory. In that case, the parameter to getResource() must be a path name that includes the directory path to the resource. For example, suppose that the image le cards.png were stored in a directory named images inside a directory named resources, where resources is directly on the class path. Then the path to the le is resources/images/cards.png and the command for locating the resource would be
URL imageURL = cl.getResource("resources/images/cards.png");

Once you have a URL that represents the location of a resource le, you could use a URLConnection, as discussed in Subsection 11.4.1, to read the contents of that le. However, Java provides more convenient methods for loading several types of resources. For loading image resources, a convenient method is available in the class java.awt.Toolkit. It can be used as in the following line from HighLowWithImages, where cardImages is an instance variable of type Image and imageURL is the URL that represents the location of the image le:

12.1. IMAGES AND RESOURCES


cardImages = Toolkit.getDefaultToolkit().createImage(imageURL);

621

This still does not load the image completelythat will only be done later, for example when cardImages is used in a drawImage command.


The Applet and JApplet classes have an instance method that can be used to load an image from a given URL:
public Image getImage(URL imageURL)

When you are writing an applet, this method can be used as an alternative to the createImage() method in class Toolkit. More interesting is the fact that Applet and JApplet contain a static method that can be used to load sound resources:
public static AudioClip newAudioClip(URL soundURL)

Since this is a static method, it can be used in any program, simply by calling it as Applet.newAudioClip(soundURL) or JApplet.newAudioClip(soundURL). (This seems to be the only easy way to use sounds in a Java program; its not clear why this capability is only in the applet classes.) The return value is of type java.applet.AudioClip. Once you have an AudioClip, you can call its play() method to play the audio clip from the beginning. Here is a method that puts all this together to load and play the sound from an audio resource le:
private void playAudioResource(String audioResourceName) { ClassLoader cl = SoundAndCursorDemo.class.getClassLoader(); URL resourceURL = cl.getResource(audioResourceName); if (resourceURL != null) { AudioClip sound = JApplet.newAudioClip(resourceURL); sound.play(); } }

This method is from a sample program SoundAndCursorDemo that will be discussed in the next subsection. Of course, if you plan to reuse the sound often, it would be better to load the sound once into an instance variable of type AudioClip, which could then be used to play the sound any number of times, without the need to reload it each time. The AudioClip class supports audio les in the common WAV, AIFF, and AU formats.

12.1.4

Cursors and Icons

The position of the mouse is represented on the computers screen by a small image called a cursor . In Java, the cursor is represented by an object of type java.awt.Cursor. A Cursor has an associated image. It also has a hot spot, which is a Point that species the pixel within the image that corresponds to the exact point on the screen where the mouse is pointing. For example, for a typical arrow cursor, the hot spot is the tip of the arrow. For a crosshair cursor, the hot spot is the center of the crosshairs. The Cursor class denes several standard cursors, which are identied by constants such as Cursor.CROSSHAIR CURSOR and Cursor.DEFAULT CURSOR. You can get a standard cursor by calling the static method Cursor.getPredefinedCursor(code), where code is one of the constants that identify the standard cursors. It is also possible to create a custom cursor from an Image. The Image might be obtained as an image resource, as described in the previous

622

CHAPTER 12. ADVANCED GUI PROGRAMMING

subsection. It could even be a BueredImage that you create in your program. It should be small, maybe 16-by-16 or 24-by-24 pixels. (Some platforms might only be able to handle certain cursor sizes; see the documentation for Toolkit.getBestCursorSize() for more information.) A custom cursor can be created by calling the static method createCustomCursor() in the Toolkit class:
Cursor c = Toolkit.getDefaultToolkit().createCustomCursor(image,hotSpot,name);

where hotSpot is of type Point and name is a String that will act as a name for the cursor (and which serves no real purpose that I know of). Cursors are associated with GUI components. When the mouse moves over a component, the cursor changes to whatever Cursor is associated with that component. To associate a Cursor with a component, call the components instance method setCursor(cursor). For example, to set the cursor for a JPanel, panel, to be the standard wait cursor:
panel.setCursor( Cursor.getPredefinedCursor(Cursor.WAIT CURSOR) );

To reset the cursor to be the default cursor, you can use:


panel.setCursor( Curser.getDefaultCursor() );

To set the cursor to be an image that is dened in an image resource le named imageResource, you might use:
ClassLoader cl = SoundAndCursorDemo.class.getClassLoader(); URL resourceURL = cl.getResource(imageResource); if (resourceURL != null) { Toolkit toolkit = Toolkit.getDefaultToolkit(); Image image = toolkit.createImage(resourceURL); Point hotSpot = new Point(7,7); Cursor cursor = toolkit.createCustomCursor(image, hotSpot, "mycursor"); panel.setCursor(cursor); }

The sample program SoundAndCursorDemo.java shows how to use predened and custom cursors and how to play sounds from resource les. The program has several buttons that you can click. Some of the buttons change the cursor that is associated with the main panel of the program. Some of the buttons play sounds. When you play a sound, the cursor is reset to be the default cursor. You can nd an applet version of the program in the on-line version of this section. Another standard use of images in GUI interfaces is for icons. An icon is simply a small picture. As well see in Section 12.3, icons can be used on Javas buttons, menu items, and labels; in fact, for our purposes, an icon is simply an image that can be used in this way. An icon is represented by an object of type Icon, which is actually an interface rather than a class. The class ImageIcon, which implements the Icon interface, is used to create icons from Images. If image is a (rather small) Image, then the constructor call new ImageIcon(image) creates an ImageIcon whose picture is the specied image. Often, the image comes from a resource le. We will see examples of this later in this chapter

12.1.5

Image File I/O

The class javax.imageio.ImageIO makes it easy to save images from a program into les and to read images from les into a program. This would be useful in a program such as

12.1. IMAGES AND RESOURCES

623

PaintWithOffScreenCanvas, so that the users would be able to save their work and to open and edit existing images. (See Exercise 12.1.) There are many ways that the data for an image could be stored in a le. Many standard formats have been created for doing this. Java supports at least three standard image formats: PNG, JPEG, and GIF. (Individual implementations of Java might support more.) The JPEG format is lossy, which means that the picture that you get when you read a JPEG le is only an approximation of the picture that was saved. Some information in the picture has been lost. Allowing some information to be lost makes it possible to compress the image into a lot fewer bits than would otherwise be necessary. Usually, the approximation is quite good. It works best for photographic images and worst for simple line drawings. The PNG format, on the other hand is lossless, meaning that the picture in the le is an exact duplicate of the picture that was saved. A PNG le is compressed, but not in a way that loses information. The compression works best for images made up mostly of large blocks of uniform color; it works worst for photographic images. GIF is an older format that is limited to just 256 colors in an image; it has mostly been superseded by PNG. Suppose that image is a BueredImage. The image can be saved to a le simply by calling
ImageIO.write( image, format, file )

where format is a String that species the image format of the le and file is a File that species the le that is to be written. (See Subsection 11.2.2 for information about the File class.) The format string should ordinarily be either "PNG" or "JPEG", although other formats might be supported. ImageIO.write() is a static method in the ImageIO class. It returns a boolean value that is false if the image format is not supported. That is, if the specied image format is not supported, then the image is not saved, but no exception is thrown. This means that you should always check the return value! For example:
boolean hasFormat = ImageIO.write(OSC,format,selectedFile); if ( ! hasFormat ) throw new Exception(format + " format is not available.");

If the image format is recognized, it is still possible that an IOExcption might be thrown when the attempt is made to send the data to the le. Usually, the le to be used in ImageIO.write() will be selected by the user using a JFileChooser, as discussed in Subsection 11.2.3. For example, here is a typical method for saving an image:
/** * Attempts to save an image to a file selected by the user. * @param image the BufferedImage to be saved to the file * @param format the format of the image, probably either "PNG" or "JPEG" */ private void doSaveFile(BufferedImage image, String format) { if (fileDialog == null) fileDialog = new JFileChooser(); fileDialog.setSelectedFile(new File("image." + format.toLowerCase())); fileDialog.setDialogTitle("Select File to be Saved"); int option = fileDialog.showSaveDialog(null); if (option != JFileChooser.APPROVE OPTION) return; // User canceled or clicked the dialogs close box. File selectedFile = fileDialog.getSelectedFile();

624

CHAPTER 12. ADVANCED GUI PROGRAMMING


if (selectedFile.exists()) { // Ask the user whether to replace the file. int response = JOptionPane.showConfirmDialog( null, "The file \"" + selectedFile.getName() + "\" already exists.\nDo you want to replace it?", "Confirm Save", JOptionPane.YES NO OPTION, JOptionPane.WARNING MESSAGE ); if (response != JOptionPane.YES OPTION) return; // User does not want to replace the file. } try { boolean hasFormat = ImageIO.write(image,format,selectedFile); if ( ! hasFormat ) throw new Exception(format + " format is not available."); } catch (Exception e) { System.out.println("Sorry, an error occured while trying to save image."); e.printStackTrace(); } }


The ImageIO class also has a static read() method for reading an image from a le into a program. The method
ImageIO.read( inputFile )

takes a variable of type File as a parameter and returns a BueredImage. The return value is null if the le does not contain an image that is stored in a supported format. Again, no exception is thrown in this case, so you should always be careful to check the return value. It is also possible for an IOException to occur when the attempt is made to read the le. There is another version of the read() method that takes an InputStream instead of a le as its parameter, and a third version that takes a URL. Earlier in this section, we encountered another method for reading an image from a URL, the createImage() method from the Toolkit class. The dierence is that ImageIO.read() reads the image data completely and stores the result in a BueredImage. On the other hand, createImage() does not actually read the data; it really just stores the image location and the data wont be read until later, when the image is used. This has the advantage that the createImage() method itself can complete very quickly. ImageIO.read(), on the other hand, can take some time to execute.

12.2 The

Fancier Graphics

graphics commands provided by the Graphics class are sucient for many purposes. However, recent versions of Java provide a much larger and richer graphical toolbox in the form of the class java.awt.Graphics2D. I mentioned Graphics2D in Subsection 6.3.5 and promised to discuss it further in this chapter. Graphics2D is a subclass of Graphics, so all of the graphics commands that you already know can be used with a Graphics2D object. In fact, when you obtain a Graphics context for drawing on a Swing component or on a BueredImage, the graphics object is actually of type Graphics2D and can be type-cast to gain access to the advanced Graphics2D graphics commands.

12.2. FANCIER GRAPHICS

625

For example, if image is of type BueredImage, then you can get a Graphics2D for drawing on the image using:
Graphics2D g2 = (Graphics2D)image.getGraphics();

And, as mentioned in Subsection 6.3.5, to use Graphics2D commands in the paintComponent() method of a Swing component, you can write a paintComponent() method of the form:
public void paintComponent(Graphics g) { super.paintComponent(g); Graphics g2 = (Graphics2D)g; . . // Draw to the component using g2 (and g). . }

Note that when you do this, g and g2 are just two variables that refer to the same object, so they both draw to the same drawing surface; g2 just gives you access to methods that are dened in Graphics2D but not in Graphics. When properties of g2, such as drawing color, are changed, the changes also apply to g. By saying
Graphics2D g2 = (Graphics2D)g.create()

you can obtain a newly created graphics context. The object created by g.create() is a graphics context that draws to the same drawing surface as g and that initially has all the same properties as g. However, it is a separate object, so that changing properties in g2 has no eect on g. This can be useful if you want to keep an unmodied copy of the original graphics context around for some drawing operations.

12.2.1

Measuring Text

Although this section is mostly about Graphics2D, we start with a topic that has nothing to do with it. Often, when drawing a string, its important to know how big the image of the string will be. For example, you need this information if you want to center a string in a component. Or if you want to know how much space to leave between two lines of text, when you draw them one above the other. Or if the user is typing the string and you want to position a cursor at the end of the string. In Java, questions about the size of a string can be answered by an object belonging to the standard class java.awt.FontMetrics. There are several lengths associated with any given font. Some of them are shown in this illustration:

e t

c g n

s n i e

A d c s a e e L D

626

CHAPTER 12. ADVANCED GUI PROGRAMMING

The dashed lines in the illustration are the baselines of the two lines of text. The baseline of a string is the line on which the bases of the characters rest. The suggested distance between two baselines, for single-spaced text, is known as the lineheight of the font. The ascent is the distance that tall characters can rise above the baseline, and the descent is the distance that tails like the one on the letter g can descend below the baseline. The ascent and descent do not add up to the lineheight, because there should be some extra space between the tops of characters in one line and the tails of characters on the line above. The extra space is called leading . (The term comes from the time when lead blocks were used for printing. Characters were formed on blocks of lead that were lined up to make up the text of a page, covered with ink, and pressed onto paper to print the page. Extra, blank leading was used to separate the lines of characters.) All these quantities can be determined by calling instance methods in a FontMetrics object. There are also methods for determining the width of a character and the total width of a string of characters. Recall that a font in Java is represented by the class Font. A FontMetrics object is associated with a given font and is used to measure characters and strings in that font. If font is of type Font and g is a graphics context, you can get a FontMetrics object for the font by calling g.getFontMetrics(font). Then, if fm is the variable that refers to the FontMetrics object, then the ascent, descent, leading, and lineheight of the font can be obtained by calling fm.getAscent(), fm.getDescent(), fm.getLeading(), and fm.getHeight(). If ch is a character, then fm.charWidth(ch) is the width of the character when it is drawn in that font. If str is a string, then fm.stringWidth(str) is the width of the string when drawn in that font. For example, here is a paintComponent() method that shows the message Hello World in the exact center of the component:
public void paintComponent(Graphics g) { super.paintComponent(g); int int int int strWidth, strHeight; centerX, centerY; baseX, baseY; topOfString; // // // // Width and height of the string. Coordinates of the center of the component. Coordinates of the basepoint of the string. y-coordinate of the top of the string.

centerX = getWidth() / 2; centerY = getHeight() / 2; Font font = g.getFont(); // What font will g draw in? FontMetrics fm = g.getFontMetrics(font); strWidth = fm.stringWidth("Hello World"); strHeight = fm.getAscent(); // Note: There are no tails on // any of the chars in the string! baseX = centerX - (strWidth/2); // Move back from center by half the // width of the string. topOfString = centerY - (strHeight/2); // Move up from center by half // the height of the string. baseY = topOfString + fm.getAscent(); // Baseline is fm.getAscent() pixels // below the top of the string. g.drawString("Hello World", baseX, baseY); // Draw the string. }

12.2. FANCIER GRAPHICS

627

You can change the font that is used for drawing strings as described in Subsection 6.3.3. For the height of the string in this method, I use fm.getAscent(). If I were drawing Goodbye World instead of Hello World, I would have used fm.getAscent() + fm.getDescent(), where the descent is added to the height in order to take into account the tail on the y in Goodbye. The value of baseX is computed to be the amount of space between the left edge of the component and the start of the string. It is obtained by subtracting half the width of the string from the horizontal center of the component. This will center the string horizontally in the component. The next line computes the position of the top of the string in the same way. However, to draw the string, we need the y-coordinate of the baseline, not the y-coordinate of the top of the string. The baseline of the string is below the top of the string by an amount equal to the ascent of the font. There is an example of centering a two-line block of text in the sample program TransparencyDemo.java, which is discussed in the next subsection.

12.2.2

Transparency

A color is represented by red, blue, and green components. In Javas usual representation, each component is an eight-bit number in the range 0 to 255. The three color components can be packed into a 32-bit integer, but that only accounts for 24 bits in the integer. What about the other eight bits? They dont have to be wasted. They can be used as a fourth component of the color, the alpha component . The alpha component can be used in several ways, but it is most commonly associated with transparency . When you draw with a transparent color, its like laying down a sheet of colored glass. It doesnt completely obscure the part of the image that is colored over. Instead, the background image is blended with the transparent color that is used for drawingas if you were looking at the background through colored glass. This type of drawing is properly referred to as alpha blending , and it is not equivalent to true transparency; nevertheless, most people refer to it as transparency. The value of the alpha component determines how transparent that color is. Actually, the alpha component gives the opaqueness of the color. Opaqueness is the opposite of transparency. If something is fully opaque, you cant see through it at all; if something is almost fully opaque, then it is just a little transparent; and so on. When the alpha component of a color has the maximum possible value, the color is fully opaque. When you draw with a fully opaque color, that color simply replaces the color of the background over which you draw. This is the only type of color that we have used up until now. If the alpha component of a color is zero, then the color is perfectly transparent, and drawing with that color has no eect at all. Intermediate values of the alpha component give partially opaque colors that will blend with the background when they are used for drawing. The sample program TransparencyDemo.java can help you to understand transparency. When you run the program you will see a display area containing a triangle, an oval, a rectangle, and some text. Sliders at the bottom of the applet allow you to control the degree of transparency of each shape. When a slider is moved all the way to the right, the corresponding shape is fully opaque; all the way to the left, and the shape is fully transparent. An applet version of the program can be found in the on-line version of this section.


Colors with alpha components were introduced in Java along with Graphics2D, but they can be used with ordinary Graphics objects as well. To specify the alpha component of a color, you can create the Color object using one of the following constructors from the Color class:

628

CHAPTER 12. ADVANCED GUI PROGRAMMING


public Color(int red, int green, int blue, int alpha); public Color(float red, float green, float blue, float alpha);

In the rst constructor, all the parameters must be integers in the range 0 to 255. In the second, the parameters must be in the range 0.0 to 1.0. For example,
Color transparentRed = new Color( 255, 0, 0, 200 );

makes a slightly transparent red, while


Color tranparentCyan = new Color( 0.0F, 1.0F, 1.0F, 0.5F);

makes a blue-green color that is 50% opaque. (The advantage of the constructor that takes parameters of type oat is that it lets you think in terms of percentages.) When you create an ordinary RGB color, as in new Color(255,0,0), you just get a fully opaque color. Once you have a transparent color, you can use it in the same way as any other color. That is, it if want to use a Color c to draw in a graphics context g, you just say g.setColor(c), and subsequent drawing operations will use that color. As you can see, transparent colors are very easy to use.


A BueredImage with image type BufferedImage.TYPE INT ARGB can use transparency. The color of each pixel in the image can have its own alpha component, which tells how transparent that pixel will be when the image is drawn over some background. A pixel whose alpha component is zero is perfectly transparent, and has no eect at all when the image is drawn; in eect, its not part of the image at all. It is also possible for pixels to be partly transparent. When an image is saved to a le, information about transparency might be lost, depending on the le format. The PNG image format supports transparency; JPEG does not. (If you look at the images of playing cards that are used in the program HighLowWithImages in Subsection 12.1.1, you might notice that the tips of the corners of the cards are fully transparent. The card images are from a PNG le, cards.png.) If you want to experiment with transparency in BueredImages, I suggest that you start by making the entire canvas fully transparent, before you draw anything else on the canvas. Here is one way of doing this: The Graphics2D class has a method setBackground() that can be used to set a background color for the graphics context, and it has a clearRect() method that lls a rectangle with the current background color. To create a fully transparent image with width w and height h, you can use:
BufferedImage image = new BufferedImage(w, h, BufferedImage.TYPE INT ARGB); Graphics2D g2 = (Graphics2D)image.getGraphics(); g2.setBackground(new Color(0,0,0,0)); // (The R, G, and B values dont matter.) g2.clearRect(0, 0, w, h);

As an example, just for fun, here is a method that will set the cursor of a component to be a red square with a transparent interior:
private void useRedSquareCursor() { BufferedImage image = new BufferedImage(24,24,BufferedImage.TYPE INT ARGB); Graphics2D g2 = (Graphics2D)image.getGraphics(); g2.setBackground(new Color(0,0,0,0)); g2.clearRect(0, 0, 24, 24); g2.setColor(Color.RED); g2.drawRect(0,0,23,23); g2.drawRect(1,1,21,21);

12.2. FANCIER GRAPHICS


g2.dispose(); Point hotSpot = new Point(12,12); Toolkit tk = Toolkit.getDefaultToolkit(); Cursor cursor = tk.createCustomCursor(image,hotSpot,"square"); setCursor(cursor); }

629

12.2.3

Antialiasing

To draw a geometric gure such as a line or circle, you just have to color the pixels that are part of the gure, right? Actually, there is a problem with this. Pixels are little squares. Geometric gures, on the other hand, are made of geometric points that have no size at all. Think about drawing a circle, and think about a pixel on the boundary of that circle. The innitely thin geometric boundary of the circle cuts through the pixel. Part of the pixel lies inside the circle, part lies outside. So, when we are lling the circle with color, do we color that pixel or not? A possible solution is to color the pixel if the geometric circle covers 50% or more of the pixel. Following this procedure, however, leads to a visual defect known as aliasing . It is visible in images as a jaggedness or staircasing eect along the borders of shapes. Lines that are not horizontal or vertical also have a jagged, aliased appearance. (The term aliasing seems to refer to the fact that many dierent geometric points map to the same pixel. If you think of the real-number coordinates of a geometric point as a name for the pixel that contains that point, then each pixel has many dierent names or aliases.) Its not possible to build a circle out of squares, but there is a technique that can eliminate some of the jaggedness of aliased images. The technique is called antialiasing . Antialiasing is based on transparency. The idea is simple: If 50% of a pixel is covered by the geometric gure that you are trying to draw, then color that pixel with a color that is 50% transparent. If 25% of the pixel is covered, use a color that is 75% transparent (25% opaque). If the entire pixel is covered by the gure, of course, use a color that is 100% opaqueantialiasing only aects pixels along the boundary of the shape. In antialiasing, the color that you are drawing with is blended with the original color of the pixel, and the amount of blending depends on the fraction of the pixel that is covered by the geometric shape. (The fraction is dicult to compute exactly, so in practice, various methods are used to approximate it.) Of course, you still dont get a picture of the exact geometric shape, but antialiased images do tend to look better than jagged, aliased images. For an example, look at the image in the next subsection. Antialiasing is used to draw the panels in the second and third row of the image, but it is not used in the top row. You should note the jagged appearance of the lines and rectangles in the top row. (By the way, when antialiasing is applied to a line, the line is treated as a geometric rectangle whose width is equal to the size of one pixel.) Antialiasing is supported in Graphics2D. By default, antialiasing is turned o. If g2 is a graphics context of type Graphics2D, you can turn on antialiasing in g2 by saying:
g2.setRenderingHint(RenderingHints.KEY ANTIALIASING, RenderingHints.VALUE ANTIALIAS ON);

As you can see, this is only a hint that you would like to use antialiasing, and it is even possible that the hint will be ignored. However, it is likely that subsequent drawing operations in g2 will be antialiased. If you want to turn antialiasing o in g2, you can just say:

630

CHAPTER 12. ADVANCED GUI PROGRAMMING


g2.setRenderingHint(RenderingHints.KEY ANTIALIASING, RenderingHints.VALUE ANTIALIAS OFF);

12.2.4

Strokes and Paints

When using the Graphics class, any line that you draw will be a solid line that is one pixel thick. The Graphics2D class makes it possible to draw a much greater variety of lines. You can draw lines of any thickness, and you can draw lines that are dotted or dashed instead of solid. An object of type Stroke contains information about how lines should be drawn, including how thick the line should be and what pattern of dashes and dots, if any, should be used. Every Graphics2D has an associated Stroke object. The default Stroke draws a solid line of thickness one. To get lines with dierent properties, you just have to install a dierent stroke into the graphics context. Stroke is an interface, not a class. The class BasicStroke, which implements the Stroke interface, is the one that is actually used to create stroke objects. For example, to create a stroke that draws solid lines with thickness equal to 3, use:
BasicStroke line3 = new BasicStroke(3);

If g2 is of type Graphics2D, the stroke can be installed in g2 by calling its setStroke() command:
g2.setStroke(line3)

After calling this method, subsequent drawing operations will use lines that are three times as wide as the usual thickness. The thickness of a line can be given by a value of type oat, not just by an int. For example, to use lines of thickness 2.5 in the graphics context g2, you can say:
g2.setStroke( new BasicStroke(2.5F) );

(Fractional widths make more sense if antialiasing is turned on.) When you have a thick line, the question comes up, what to do at the ends of the line. If you draw a physical line with a large, round piece of chalk, the ends of the line will be rounded. When you draw a line on the computer screen, should the ends be rounded, or should the line simply be cut o at? With the BasicStroke class, the choice is up to you. Maybe its time to look at examples. This illustration shows fteen lines, drawn using dierent BasicStrokes. Lines in the middle row have rounded ends; lines in the other two rows are simply cut o at their endpoints. Lines of various thicknesses are shown, and the bottom row shows dashed lines. (And, as mentioned above, only the bottom two rows are antialiased.)

12.2. FANCIER GRAPHICS

631

This illustration shows the sample program StrokeDemo.java. (You can try an applet version of the program in the on-line version of this section.) In this program, you can click and drag in any of the small panels, and the lines in all the panels will be redrawn as you move the mouse. In addition, if you right-click and drag, then rectangles will be drawn instead of lines; this shows that strokes are used for drawing the outlines of shapes and not just for straight lines. If you look at the corners of the rectangles that are drawn by the program, youll see that there are several ways of drawing a corner where two wide line segments meet. All the options that you want for a BasicStroke have to be specied in the constructor. Once the stroke object is created, there is no way to change the options. There is one constructor that lets you specify all possible options:
public BasicStroke( float width, int capType, int joinType, float miterlimit, float[] dashPattern, float dashPhase )

I dont want to cover all the options in detail, but heres some basic info: width species the thickness of the line capType species how the ends of a line are capped. The possible values are BasicStroke.CAP SQUARE, BasicStroke.CAP ROUND and BasicStroke.CAP BUTT. These values are used, respectively, in the rst, second, and third rows of the above picture. The default is BasicStroke.CAP SQUARE. joinType species how two line segments are joined together at corners. Possible values are BasicStroke.JOIN MITER, BasicStroke.JOIN ROUND, and BasicStroke.JOIN BEVEL. Again, these are used in the three rows of panels in the sample program. The default is BasicStroke.JOIN MITER. miterLimit is used only if the value of joinType is JOIN MITER; just use the default value, 10.0F. dashPattern is used to specify dotted and dashed lines. The values in the array specify lengths in the dot/dash pattern. The numbers in the array represent the length of a solid piece, followed by the length of a transparent piece, followed by the length of a solid piece,

632

CHAPTER 12. ADVANCED GUI PROGRAMMING and so on. At the end of the array, the pattern wraps back to the beginning of the array. If you want a solid line, use a dierent constructor that has fewer parameters. dashPhase tells the computer where to start in the dashPattern array, for the rst segment of the line. Use 0 for this parameter in most cases.

For the third row in the above picture, the dashPattern is set to new float[] {5,5}. This means that the lines are drawn starting with a solid segment of length 5, followed by a transparent section of length 5, and then repeating the same pattern. A simple dotted line would have thickness 1 and dashPattern new float[] {1,1}. A pattern of short and long dashes could be made by using new float[] {10,4,4,4}. For more information, see the Java documentation, or try experimenting with the source code for the sample program.


So now we can draw fancier lines. But any drawing operation is still restricted to drawing with a single color. We can get around that restriction by using Paint. An object of type Paint is used to assign color to each pixel that is hit by a drawing operation. Paint is an interface, and the Color class implements the Paint interface. When a color is used for painting, it applies the same color to every pixel that is hit. However, there are other types of paint where the color that is applied to a pixel depends on the coordinates of that pixel. Standard Java includes two classes that dene paint with this property: GradientPaint and TexturePaint. In a gradient, the color that is applied to pixels changes gradually from one color to a second color as you move in a certain direction. In a texture, the pixel colors come from an image, which is repeated, if necessary, like a wallpaper pattern to cover the entire xy-plane. It will be helpful to look at some examples. This illustration shows a polygon lled with two dierent textures. The polygon on the left uses a GradientPaint while the one on the right uses a TexturePaint. Note that in this picture, the paint is used only for lling the polygon. The outline of the polygon is drawn in a plain black color. However, Paint objects can be used for drawing lines as well as for lling shapes. These pictures were made by the sample program PaintDemo.java. In that program, you can select among several dierent paints, and you can control certain properties of the paints. As usual, an applet version of the program is available on line.

Gradient paints are created using the constructor


public GradientPaint(float x1, float y1, Color c1, float x2, float y2, Color c2, boolean cyclic)

12.2. FANCIER GRAPHICS

633

This constructs a gradient that has color c1 at the point with coordinates (x1,y1) and color c2 at the point (x2,y2). As you move along the line between the two points, the color of the gradient changes from c1 to c2; along lines perpendicular to this line, the color is constant. The last parameter, cyclic, tells what happens if you move past the point (x2,y2) on the line from (x1,y1) to (x2,y2). If cyclic is false, the color stops changing and any point beyond (x2,y2) has color c2. If cyclic is true, then the colors continue to change in a cyclic pattern after you move past (x2,y2). (It works the same way if you move past the other endpoint, (x1,y1).) In most cases, you will set cyclic to true. Note that you can vary the points (x1,y1) and (x2,y2) to change the width and direction of the gradient. For example, to create a cyclic gradient that varies from black to light gray along the line from (0,0) to (100,100), use:
new GradientPaint( 0, 0, Color.BLACK, 100, 100, Color.LIGHT GRAY, true)

To construct a TexturePaint, you need a BueredImage that contains the image that will be used for the texture. You also specify a rectangle in which the image will be drawn. The image will be scaled, if necessary, to exactly ll the rectangle. Outside the specied rectangle, the image will be repeated horizontally and vertically to ll the plane. You can vary the size and position of the rectangle to change the scale of the texture and its positioning on the plane. Ordinarily, however the upper left corner of the rectangle is placed at (0,0), and the size of the rectangle is the same as the actual size of the image. The constructor for TexturePaint is dened as
public TexturePaint( BufferedImage textureImage, Rectangle2D anchorRect)

The Rectangle2D is part of the Graphics2D framework and will be discussed at the end of this section. Often, a call to the constructor takes the form:
new TexturePaint( image, new Rectangle2D.Double(0,0,image.getWidth(),image.getHeight() )

Once you have a Paint object, you can use the setPaint() method of a Graphics2D object to install the paint in a graphics context. For example, if g2 is of type Graphics2D, then the command
g2.setPaint( new GradientPaint(0,0,Color.BLUE,100,100,Color.GREEN,true) );

sets up g2 to use a gradient paint. Subsequent drawing operations with g2 will draw using a blue/green gradient.

12.2.5

Transforms

In the standard drawing coordinates on a component, the upper left corner of the component has coordinates (0,0). Coordinates are integers, and the coordinates (x,y) refer to the point that is x pixels over from the left edge of the component and y pixels down from the top. With Graphics2D, however, you are not restricted to using these coordinates. In fact, you can can set up a Graphics2D graphics context to use any system of coordinates that you like. You can use this capability to select the coordinate system that is most appropriate for the things that you want to draw. For example, if you are drawing architectural blueprints, you might use coordinates in which one unit represents an actual distance of one foot. Changes to a coordinate system are referred to as transforms. There are three basic types of transform. A translate transform changes the position of the origin, (0,0). A scale transform changes the scale, that is, the unit of distance. And a rotation transforms applies a

634

CHAPTER 12. ADVANCED GUI PROGRAMMING

rotation about some point. You can make more complex transforms by combining transforms of the three basic types. For example, you can apply a rotation, followed by a scale, followed by a translation, followed by another rotation. When you apply several transforms in a row, their eects are cumulative. It takes a fair amount of study to fully understand complex transforms. I will limit myself here to discussing a few of the most simple cases, just to give you an idea of what transforms can do Suppose that g2 is of type Graphics2D. Then g2.translate(x,y) moves the origin, (0,0), to the point (x,y). This means that if you use coordinates (0,0) after saying g2.translate(x,y), then you are referring to the point that used to be (x,y), before the translation was applied. All other coordinate pairs are moved by the same amount. For example saying
g.translate(x,y); g.drawLine( 0, 0, 100, 200 );

draws the same line as


g.drawLine( x, y, 100+x, 200+y );

In the second case, you are just doing the same translation by hand. A translation (like all transforms) aects all subsequent drawing operations. Instead of thinking in terms of coordinate systems, you might nd it clearer to think of what happens to the objects that are drawn. After you say g2.translate(x,y), any objects that you draw are displaced x units vertically and y units horizontally. Note that the parameters x and y can be real numbers. As an example, perhaps you would prefer to have (0,0) at the center of a component, instead of at its upper left corner. To do this, just use the following command in the paintComponent() method of the component:
g2.translate( getWidth()/2, getHeight()/2 );

To apply a scale transform to a Graphics2D g2, use g2.scale(s,s), where s is the real number that species the scaling factor. If s is greater than 1, everything is magnied by a factor of s, while if s is between 0 and 1, everything is shrunk by a factor of s. The center of scaling is (0,0). That is, the point (0,0) is unaected by the scaling, and other points more towards or away from (0,0) by a factor of s. Again, it can be clearer to think of the eect on objects that are drawn after a scale transform is applied. Those objects will be magnied or shrunk by a factor of s. Note that scaling aects everything, including thickness of lines and size of fonts. By the way, it is possible to use scale factors that are less than 0. It is even possible to use dierent scale factors in the horizontal and vertical direction with a command of the form g2.scale(sx,sy), although that will distort the shapes of objects. The third type of basic transform is rotation. The command g2.rotate(r) rotates all subsequently drawn objects through an angle of r about the point (0,0). You can rotate instead about the point (x,y) with the command g2.rotate(r,x,y). All the parameters can be real numbers. Angles are measured in radians, where one radian is equal to 180 degrees. To rotate through an angle of d degrees, use
g2.rotate( d * Math.PI / 180 );

Positive angles are clockwise rotations, while negative angles are counterclockwise (unless you have already applied a negative scale factor, which reverses the orientation). Rotation is not as common as translation or scaling, but there are a few things that you can do with it that cant be done any other way. For example, you can use it to draw an image on the slant. Rotation also makes it possible to draw text that is rotated so that its baseline

12.2. FANCIER GRAPHICS

635

is slanted or even vertical. To draw the string Hello World with its basepoint at (x,y) and rising at an angle of 30 degrees, use:
g2.rotate( -30 * Math.PI / 180, x, y ); g2.drawString( "Hello World", x, y );

To draw the message vertically, with the center of its baseline at the point (x,y), we can use FontMetrics to measure the string, and say:
FontMetrics fm = g2.getFontMetrics( g2.getFont() ); int baselineLength = fm.stringWidth("Hello World"); g2.rotate( -90 * Math.PI / 180, x, y); g2.drawString( "Hello World", x - baselineLength/2, y );


The drawing operations in the Graphics class use integer coordinates only. Graphics2D makes it possible to use real numbers as coordinates. This becomes particularly important once you start using transforms, since after you apply a scale, a square of size one might cover many pixels instead of just a single pixel. Unfortunately, the designers of Java couldnt decide whether to use numbers of type oat or double as coordinates, and their indecision makes things a little more complicated than they need to be. (My guess is that they really wanted to use oat, since values of type oat have enough accuracy for graphics and are probably used in the underlying graphical computations of the computer. However, in Java programming, its easier to use double than oat, so they wanted to make it possible to use double values too.) To use real number coordinates, you have to use classes dened in the package java.awt.geom. Among the classes in this package are classes that represent geometric shapes such as lines and rectangles. For example, the class Line2D represents a line whose endpoints are given as real number coordinates. The unfortunate thing is that Line2D is an abstract class, which means that you cant create objects of type Line2D directly. However, Line2D has two concrete subclasses that can be used to create objects. One subclass uses coordinates of type oat, and one uses coordinates of type double. The most peculiar part is that these subclasses are dened as static nested classes inside Line2D. Their names are Line2D.Float and Line2D.Double. This means that Line2D objects can be created, for example, with:
Line2D line1 = new Line2D.Float( 0.17F, 1.3F, -2.7F, 5.21F ); Line2D line2 = new Line2D.Double( 0, 0, 1, 0); Line2D line3 = new Line2D.Double( x1, y1, x2, y2 );

where x1, y1, x2, y2 are any numeric variables. In my own code, I generally use Line2D.Double rather than Line2D.Float. Other shape classes in java.awt.geom are similar. The class that represents rectangles is Rectangle2D. To create a rectangle object, you have to use either Rectangle2D.Float or Rectangle2D.Double. For example,
Rectangle2D rect = new Rectangle2D.Double( -0.5, -0.5, 1.0, 1.0 );

creates a rectangle with a corner at (-0.5,-0.5) and with width and height both equal to 1. Other classes include Point2D, which represents a single point; Ellipse2D, which represents an oval; and Arc2D, which represents an arc of a circle. If g2 is of type Graphics2D and shape is an object belonging to one of the 2D shape classes, then the command
g2.draw(shape);

636

CHAPTER 12. ADVANCED GUI PROGRAMMING

draws the shape. For a shape such as a rectangle or ellipse that has an interior, only the outline is drawn. To ll in the interior of such a shape, use
g2.fill(shape)

For example, to draw a line from (x1,y1) to (x2,y2), use


g2.draw( new Line2D.Double(x1,y1,x2,y2) );

and to draw a lled rectangle with a corner at (3.5,7), with width 5, and with height 3, use
g2.fill( new Rectangle2D.Double(3.5, 7, 5, 3) );

The package java.awt.geom also has a very nice class GeneralPath that can be used to draw polygons and curves dened by any number of points. See the Java documentation if you want to nd out how to use it. There is still a large part of the Graphics2D framework for you to explore.

12.3 For

Actions and Buttons

the past two sections, we have been looking at some of the more advanced aspects of the Java graphics API. But the heart of most graphical user interface programming is using GUI components. In this section and the next, well be looking at JComponents. Well cover several component classes that were not covered in Chapter 6, as well as some features of classes that were covered there. This section is mostly about buttons. Buttons are among the simplest of GUI components, and it seems like there shouldnt be all that much to say about them. However, buttons are not as simple as they seem. For one thing, there are many dierent types of buttons. The basic functionality of buttons in Java is dened by the class javax.swing.AbstractButton. Subclasses of this class represent push buttons, check boxes, and radio buttons. Menu items are also considered to be buttons. The AbstractButton class denes a surprisingly large API for controlling the appearance of buttons. This section will cover part of that API, but you should see the class documentation for full details. In this section, well also encounter a few classes that do not themselves dene buttons but that are related to the button API, starting with actions.

12.3.1

Action and AbstractAction

The JButton class represents push buttons. Up until now, we have created push buttons using the constructor
public JButton(String text);

which species text that will appear on the button. We then added an ActionListener to the button, to respond when the user presses it. Another way to create a JButton is using an Action. The Action interface represents the general idea of some action that can be performed, together with properties associated with that action, such as a name for the action, an icon that represents the action, and whether the action is currently enabled or disabled. Actions are usually dened using the class AbstractAction, an abstract class which includes a method,
public void actionPerformed(ActionEvent evt)

that must be dened in any concrete subclass. Often, this is done in an anonymous inner class. For example, if display is an object that has a clear() method, an Action object that represents the action clear the display might be dened as:

12.3. ACTIONS AND BUTTONS


Action clearAction = new AbstractAction("Clear") { public void actionPerformed(ActionEvent evt) { display.clear(); } };

637

The parameter, "Clear", in the constructor of the AbstractAction is the name of the action. Other properties can be set by calling the method setValue(key,value), which is part of the Action interface. For example,
clearAction.setValue(Action.SHORT DESCRIPTION, "Clear the Display");

sets the SHORT DESCRIPTION property of the action to have the value Clear the Display. The key parameter in the setValue() method is usually given as one of several constants dened in the Action interface. As another example, you can change the name of an action by using Action.NAME as the key in the setValue() method. Once you have an Action, you can use it in the constructor of a button. For example, using the action clearAction dened above, we can create the JButton
JButton clearButton = new JButton( clearAction );

The name of the action will be used as the text of the button, and some other properties of the button will be taken from properties of the action. For example, if the SHORT DESCRIPTION property of the action has a value, then that value is used as the tooltip text for the button. (The tooltip text appears when the user hovers the mouse over the button.) Furthermore, when you change a property of the action, the corresponding property of the button will also be changed. The Action interface denes a setEnabled() method that is used to enable and disable the action. The clearAction action can be enabled and disabled by calling clearAction.setEnabled(true) and clearAction.setEnabled(false). When you do this, any button that has been created from the action is also enabled or disabled at the same time. Now of course, the question is, why should you want to use Actions at all? One advantage is that using actions can help you to organize your code better. You can create separate objects that represent each of the actions that can be performed in your program. This represents a nice division of responsibility. Of course, you could do the same thing with individual ActionListener objects, but then you couldnt associate descriptions and other properties with the actions. More important is the fact that Actions can also be used in other places in the Java API. You can use an Action to create a JMenuItem in the same way as for a JButton:
JMenuItem clearCommand = new JMenuItem( clearAction );

A JMenuItem, in fact, is a kind of button and shares many of the same properties that a JButton can have. You can use the same Action to create both a button and a menu item (or even several of each if you want). Whenever you enable or disable the action or change its name, the button and the menu item will both be changed to match. If you change the NAME property of the action, the text of both the menu item and the button will be set to new name of the action. You can think of the button and the menu items as being two presentations of the Action, and you dont have to keep track of the button or menu item after you create them. You can do everything that you need to do by manipulating the Action object. It is also possible to associate an Action with any key, so that the action will be performed whenever the user presses that key. I wont explain how to do it here, but you can look up the documentation for the classes javax.swing.InputMap and javax.swing.ActionMap.

638

CHAPTER 12. ADVANCED GUI PROGRAMMING

By the way, if you want to add a menu item that is dened from an Action to a menu, you dont even need to create the JMenuItem yourself. You can add the action object directly to the menu, and the menu item will be created from the properties of the action. For example, if menu is a JMenu and clearAction is an Action, you can simply say menu.add(clearAction).

12.3.2

Icons on Buttons

In addition toor instead oftext, buttons can also show icons. Icons are represented by the Icon interface and are usually created as ImageIcons, as discussed in Subsection 12.1.4. For example, here is a picture of a button that displays an image of a large X as its icon:

The icon for a button can be set by calling the buttons setIcon() method, or by passing the icon object as a parameter to the constructor when the button is created. To create the button shown above, I created an ImageIcon from a BueredImage on which I drew the picture that I wanted, and I constructed the JButton using a constructor that takes both the text and the icon for the button as parameters. Heres the code segment that does it:
BufferedImage image = new BufferedImage(24,24,BufferedImage.TYPE INT RGB); Graphics2D g2 = (Graphics2D)image.getGraphics(); // Draw the image for the icon. g2.setColor(Color.LIGHT GRAY); g2.fillRect(0,0,24,24); g2.setStroke( new BasicStroke(3) ); // Use thick lines. g2.setColor(Color.BLACK); g2.drawLine(4,4,20,20); // Draw the "X". g2.drawLine(4,20,20,4); g2.dispose(); Icon clearIcon = new ImageIcon(image); // Create the icon.

JButton clearButton = new JButton("Clear the Display", clearIcon);

You can create a button with an icon but no text by using a constructor that takes just the icon as parameter. Another alternative is for the button to get its icon from an Action. When a button is constructed from an action, it takes its icon from the value of the action property Action.SMALL ICON. For example, suppose that we want to use an action named clearAction to create the button shown above. This could be done with:
clearAction.putValue( Action.SMALL ICON, clearIcon ); JButton clearButton = new JButton( clearAction );

The icon could also be associated with the action by passing it as a parameter to the constructor of an AbstractAction:
Action clearAction = new AbstractAction("Clear the Display", clearIcon) { public void actionPerformed(ActionEvent evt) { . . // Carry out the action. . } } JButton clearButton = new JButton( clearAction );

12.3. ACTIONS AND BUTTONS

639

The appearance of buttons can be tweaked in many ways. For example, you can change the size of the gap between the buttons text and its icon. You can associate additional icons with a button that are used when the button is in certain states, such as when it is pressed or when it is disabled. It is even possible to change the positioning of the text with respect to the icon. For example, to place the text centered below the icon on a button, you can say:
button.setHorizontalTextPosition(JButton.CENTER); button.setVerticalTextPosition(JButton.BOTTOM);

These methods and many others are dened in the class AbstractButton. This class is a superclass for JMenuItem, as well as for JButton and for the classes that dene check boxes and radio buttons. Note in particular that an icon can be shown in a menu by associating the icon with a menu item or with the action that is used to create the menu item. Finally, I will mention that it is possible to use icons on JLabels in much the same way that they can be used on JButtons.

12.3.3

Radio Buttons

The JCheckBox class was covered in Subsection 6.6.3, and the equivalent for use in menus, JCheckBoxMenuItem, in Subsection 6.8.1. A checkbox has two states, selected and not selected, and the user can change the state by clicking on the check box. The state of a checkbox can also be set programmatically by calling its setSelected() method, and the current value of the state can be checked using the isSelected() method. Closely related to checkboxes are radio buttons. Like a checkbox, a radio button can be either selected or not. However, radio buttons are expected occur in groups, and at most one radio button in a group can be selected at any given time. In Java, a radio button is represented by an object of type JRadioButton. When used in isolation, a JRadioButton acts just like a JCheckBox, and it has the same methods and events. Ordinarily, however, a JRadioButton is used in a group. A group of radio buttons is represented by an object belonging to the class ButtonGroup. A ButtonGroup is not a component and does not itself have a visible representation on the screen. A ButtonGroup works behind the scenes to organize a group of radio buttons, to ensure that at most one button in the group can be selected at any given time. To use a group of radio buttons, you must create a JRadioButton object for each button in the group, and you must create one object of type ButtonGroup to organize the individual buttons into a group. Each JRadioButton must be added individually to some container, so that it will appear on the screen. (A ButtonGroup plays no role in the placement of the buttons on the screen.) Each JRadioButton must also be added to the ButtonGroup, which has an add() method for this purpose. If you want one of the buttons to be selected initially, you can call setSelected(true) for that button. If you dont do this, then none of the buttons will be selected until the user clicks on one of them. As an example, here is how you could set up a set of radio buttons that can be used to select a color:
JRadioButton redRadio, blueRadio, greenRadio, blackRadio; // Variables to represent the radio buttons. // These should probably be instance variables, so // that they can be used throughout the program. ButtonGroup colorGroup = new ButtonGroup();

640

CHAPTER 12. ADVANCED GUI PROGRAMMING


redRadio = new JRadioButton("Red"); // Create a button. colorGroup.add(redRadio); // Add it to the group. blueRadio = new JRadioButton("Blue"); colorGroup.add(blueRadio); greenRadio = new JRadioButton("Green"); colorGroup.add(greenRadio); blackRadio = new JRadioButton("Black"); colorGroup.add(blackRadio); redRadio.setSelected(true); // Make an initial selection.

The individual buttons must still be added to a container if they are to appear on the screen. If you want to respond immediately when the user clicks on one of the radio buttons, you can register an ActionListener for each button. Just as for checkboxes, it is not always necessary to register listeners for radio buttons. In some cases, you can simply check the state of each button when you need to know it, using the buttons isSelected() method. All this is demonstrated in the sample program RadioButtonDemo.java. The program shows four radio buttons. When the user selects one of the radio buttons, the text and background color of a label is changed. Here is a picture of the program, with the Green radio button selected:

You can add the equivalent of a group of radio buttons to a menu by using the class JRadioButtonMenuItem. To use this class, create several objects of this type, and create a ButtonGroup to manage them. Add each JRadioButtonMenuItem to the ButtonGroup, and also add them to a JMenu. If you want one of the items to be selected initially, call its setSelected() method to set its selection state to true. You can add ActionListeners to each JRadioButtonMenuItem if you need to take some action when the user selects the menu item; if not, you can simply check the selected states of the buttons whenever you need to know them. As an example, suppose that menu is a JMenu. Then you can add a group of buttons to menu as follows:
JRadioButtonMenuItem selectRedItem, selectGreenItem, selectBlueItem; // These might be defined as instance variables ButtonGroup group = new ButtonGroup(); selectRedItem = new JRadioButtonMenuItem("Red"); group.add(selectRedItem); menu.add(selectRedItem);

12.3. ACTIONS AND BUTTONS


selectGreenItem = new JRadioButtonMenuItem("Green"); group.add(selectGreenItem); menu.add(selectGreenItem); selectBlueItem = new JRadioButtonMenuItem("Blue"); group.add(selectBlueItem); menu.add(selectBlueItem);

641


When its drawn on the screen, a JCheckBox includes a little box that is either checked or unchecked to show the state of the box. That box is actually a pair of Icons. One icon is shown when the check box is unselected; the other is shown when it is selected. You can change the appearance of the check box by substituting dierent icons for the standard ones. The icon that is shown when the check box is unselected is just the main icon for the JCheckBox. You can provide a dierent unselected icon in the constructor or you can change the icon using the setIcon() method of the JCheckBox object. To change the icon that is shown when the check box is selected, use the setSelectedIcon() method of the JCheckBox. All this applies equally to JRadioButton, JCheckBoxMenuItem, and JRadioButtonMenuItem. An example of this can be found in the sample program ToolBarDemo.java, which is discussed in the next subsection. That program creates a set of radio buttons that use custom icons. The buttons are created by the following method:
/** * Create a JRadioButton and add it to a specified button group. The button * is meant for selecting a drawing color in the display. The color is used to * create two custom icons, one for the unselected state of the button and one * for the selected state. These icons are used instead of the usual * radio button icons. * @param c the color of the button, and the color to be used for drawing. * (Note that c has to be "final" since it is used in the anonymous inner * class that defines the response to ActionEvents on the button.) * @param grp the ButtonGroup to which the radio button will be added. * @param selected if true, then the state of the button is set to selected. * @return the radio button that was just created; sorry, but the button is not as pretty as I would like! */ private JRadioButton makeColorRadioButton(final Color c, ButtonGroup grp, boolean selected) { /* Create an ImageIcon for the normal, unselected state of the button, using a BufferedImage that is drawn here from scratch. */ BufferedImage image = new BufferedImage(30,30,BufferedImage.TYPE INT RGB); Graphics g = image.getGraphics(); g.setColor(Color.LIGHT GRAY); g.fillRect(0,0,30,30); g.setColor(c); g.fill3DRect(1, 1, 24, 24, true); g.dispose(); Icon unselectedIcon = new ImageIcon(image); /* Create an ImageIcon for the selected state of the button. */ image = new BufferedImage(30,30,BufferedImage.TYPE INT RGB); g = image.getGraphics();

642

CHAPTER 12. ADVANCED GUI PROGRAMMING


g.setColor(Color.DARK GRAY); g.fillRect(0,0,30,30); g.setColor(c); g.fill3DRect(3, 3, 24, 24, false); g.dispose(); Icon selectedIcon = new ImageIcon(image); /* Create and configure the button. */ JRadioButton button = new JRadioButton(unselectedIcon); button.setSelectedIcon(selectedIcon); button.addActionListener( new ActionListener() { public void actionPerformed(ActionEvent e) { // The action for this button sets the current drawing color // in the display to c. display.setCurrentColor(c); } }); grp.add(button); if (selected) button.setSelected(true); return button; } // end makeColorRadioButton


It is possible to create radio buttons and check boxes from Actions. The button takes its name, main icon, tooltip text, and enabled/disabled state from the action. However, in Java 5.0, an action has no property corresponding to the selected/unselected state. This means that you cant check or set the selection state through the action. In Java 6.0, the action API is considerably improved, and among the changes is support for selection state.

12.3.4

Toolbars

It has become increasingly common for programs to have a row of small buttons along the top or side of the program window that oer access to some of the commonly used features of the program. The row of buttons is known as a tool bar . Typically, the buttons in a tool bar are presented as small icons, with no text. Tool bars can also contain other components, such as JTextFields and JLabels. In Swing, tool bars are represented by the class JToolBar. A JToolBar is a container that can hold other components. It is also itself a component, and so can be added to other containers. In general, the parent component of the tool bar should use a BorderLayout. The tool bar should occupy one of the edge positionsNORTH, SOUTH, EAST, or WESTin the BorderLayout. Furthermore, the other three edge positions should be empty. The reason for this is that it might be possible (depending on the platform and conguration) for the user to drag the tool bar from one edge position in the parent container to another. It might even be possible for the user to drag the toolbar o its parent entirely, so that it becomes a separate window. Here is a picture of a tool bar from the sample program ToolBarDemo.java.

12.3. ACTIONS AND BUTTONS

643

In this program, the user can draw colored curves in a large drawing area. The rst three buttons in the tool bar are a set of radio buttons that control the drawing color. The fourth button is a push button that the user can click to clear the drawing. Tool bars are easy to use. You just have to create the JToolBar object, add it to a container, and add some buttons and possibly other components to the tool bar. One ne point is adding space to a tool bar, such as the gap between the radio buttons and the push button in the sample program. You can leave a gap by adding a separator to the tool bar. For example:
toolbar.addSeparator(new Dimension(20,20));

This adds an invisible 20-by-20 pixel block to the tool bar. This will appear as a 20 pixel gap between components. Here is the constructor from the ToolBarDemo program. It shows how to create the tool bar and place it in a container. Note that class ToolBarDemo is a subclass of JPanel, and the tool bar and display are added to the panel object that is being constructed:
public ToolBarDemo() { setLayout(new BorderLayout(2,2)); setBackground(Color.GRAY); setBorder(BorderFactory.createLineBorder(Color.GRAY,2)); display = new Display(); add(display, BorderLayout.CENTER); JToolBar toolbar = new JToolBar(); add(toolbar, BorderLayout.NORTH); ButtonGroup group = new ButtonGroup(); toolbar.add( makeColorRadioButton(Color.RED,group,true) ); toolbar.add( makeColorRadioButton(Color.GREEN,group,false) ); toolbar.add( makeColorRadioButton(Color.BLUE,group,false) ); toolbar.addSeparator(new Dimension(20,20)); toolbar.add( makeClearButton() ); }

Note that the gray outline of the tool bar comes from two sources: The line at the bottom shows the background color of the main panel, which is visible because the BorderLayout that is used on that panel has vertical and horizontal gaps of 2 pixels. The other three sides are part of the border of the main panel. If you want a vertical tool bar that can be placed in the EAST or WEST position of a BorderLayout, you should specify the orientation in the tool bars constructor:
JToolBar toolbar = new JToolBar( JToolBar.VERTICAL );

The default orientation is JToolBar.HORIZONTAL. The orientation is adjusted automatically when the user drags the toolbar into a new position. If you want to prevent the user from dragging the toolbar, just say toolbar.setFloatable(false).

12.3.5

Keyboard Accelerators

In most programs, commonly used menu commands have keyboard equivalents. The user can type the keyboard equivalent instead of selecting the command from the menu, and the result will be exactly the same. Typically, for example, the Save command has keyboard

644

CHAPTER 12. ADVANCED GUI PROGRAMMING

equivalent CONTROL-S, and the Undo command corresponds to CONTROL-Z. (Under Mac OS, the keyboard equivalents for these commands would probably be META-C and META-Z, where META refers to holding down the apple key.) The keyboard equivalents for menu commands are referred to as accelerators. The class javax.swing.KeyStroke is used to represent key strokes that the user can type on the keyboard. A key stroke consists of pressing a key, possibly while holding down one or more of the modier keys control, shift, alt, and meta. The KeyStroke class has a static method, getKeyStroke(String), that makes it easy to create key stroke objects. For example,
KeyStroke.getKeyStroke( "ctrl S" )

returns a KeyStroke that represents the action of pressing the S key while holding down the control key. In addition to ctrl, you can use the modiers shift, alt, and meta in the string that describes the key stroke. You can even combine several modiers, so that
KeyStroke.getKeyStroke( "ctrl shift Z" )

represents the action of pressing the Z key while holding down both the control and the shift keys. When the key stroke involves pressing a character key, the character must appear in the string in upper case form. You can also have key strokes that correspond to non-character keys. The number keys can be referred to as 1, 2, etc., while certain special keys have names such as F1, ENTER, and LEFT (for the left arrow key). The class KeyEvent denes many constants such as VK ENTER, VK LEFT, and VK S. The names that are used for keys in the keystroke description are just these constants with the leading VK removed. There are at least two ways to associate a keyboard accelerator with a menu item. One is to use the setAccelerator() method of the menu item object:
JMenuItem saveCommand = new JMenuItem( "Save..." ); saveCommand.setAccelerator( KeyStroke.getKeyStroke("ctrl S") );

The other technique can be used if the menu item is created from an Action. The action property Action.ACCELERATOR KEY can be used to associate a KeyStroke with an Action. When a menu item is created from the action, the keyboard accelerator for the menu item is taken from the value of this property. For example, if redoAction is an Action representing a Redo action, then you might say:
redoAction.putValue( Action.ACCELERATOR KEY, KeyStroke.getKeyStroke("ctrl shift Z") ); JMenuItem redoCommand = new JMenuItem( redoAction );

or, alternatively, you could simply add the action to a JMenu, editMenu, with editMenu.add(redoAction). (Note, by the way, that accelerators apply only to menu items, not to push buttons. When you create a JButton from an action, the ACCELERATOR KEY property of the action is ignored.) Note that you can use accelerators for JCheckBoxMenuItems and JRadioButtonMenuItems, as well as for simple JMenuItems. For an example of using keyboard accelerators, see the solution to Exercise 12.2.


By the way, as noted above, in the Macintosh operating system, the meta (or apple) key is usually used for keyboard accelerators instead of the control key. If you would like to make your program more Mac-friendly, you can test whether your program is running under Mac OS and, if so, adapt your accelerators to the Mac OS style. The recommended way to detect Mac

12.3. ACTIONS AND BUTTONS

645

OS is to test the value of System.getProperty("mrj.version"). This function call happens to return a non-null value under Mac OS but returns null under other operating systems. For example, here is a simple utility routine for making Mac-friendly accelerators:
/** * Create a KeyStroke that uses the meta key on Mac OS and * the control key on other operating systems. * @param description a string that describes the keystroke, * without the "meta" or "ctrl"; for example, "S" or * "shift Z" or "alt F1" * @return a keystroke created from the description string * with either "ctrl " or "meta " prepended */ private static KeyStroke makeAccelerator(String description) { String commandKey; if ( System.getProperty("mrj.version") == null ) commandKey = "ctrl"; else commandKey = "meta"; return KeyStroke.getKeyStroke( commandKey + " " + description ); }

12.3.6

HTML on Buttons

As a nal stop in this brief tour of ways to spi up your buttons, Ill mention the fact that the text that is displayed on a button can be specied in HTML format. HTML is the markup language that is used to write web pages. A brief introduction to HTML can be found in Subsection 6.2.3. HTML allows you to apply color or italics or other styles to just part of the text on your buttons. It also makes it possible to have buttons that display multiple lines of text. (You can also use HTML on JLabels, which can be even more useful.) Heres a picture of a button with HTML text (along with a Java icon):

If the string of text that is applied to a button starts with <html>, then the string is interpreted as HTML. The string does not have to use strict HTML format; for example, you dont need a closing </html> at the end of the string. To get multi-line text, use <br> in the string to represent line breaks. If you would like the lines of text to be center justied, include the entire text (except for the <html>) between <center> and </center>. For example,
JButton button = new JButton( "<html><center>This button has<br>two lines of text</center>" );

creates a button that displays two centered lines of text. You can apply italics to part of the string by enclosing that part between <i> and </i>. Similarly, use <b>...</b> for bold text and <u>...</u> for underlined text. For green text, enclose the text between <font color=green> and </font >. You can, of course, use other colors in place of green. The Java button that is shown above was created using:

646

CHAPTER 12. ADVANCED GUI PROGRAMMING


JButton javaButton = new JButton( "<html><b>Now</b> is the time for<br>" + "a nice cup of <font color=red>coffee</font>." );

Other HTML features can also be used on buttons and labelsexperiment to see what you can get away with!

12.4 Since

Complex Components and MVC

even buttons turn out to be pretty complex, as seen in the previous section, you might guess that there is a lot more complexity lurking in the Swing API. While this is true, a lot of that complexity works to your benet as a programmer, since a lot of it is hidden in normal uses of Swing components. For example, you dont have to know about all the complex details of buttons in order to use them eectively in most programs. Swing denes several component classes that are much more complex than those we have looked at so far, but even the most complex components are not very dicult to use for many purposes. In this section, well look at components that support display and manipulation of lists, tables, and text documents. To use these complex components eectively, youll need to know something about the Model-View-Controller pattern that is used as a basis for the design of many Swing components. This pattern is discussed in the rst part of this section. This section is our last look at Swing components, but there are a number of component classes that have not even been touched on in this book. Some useful ones that you might want to look into include: JTabbedPane, JSplitPane, JTree, JSpinner, JPopupMenu, JProgressBar, JScrollBar, and JPasswordField. At the end of the section, well look briey at the idea of writing custom component classes something that you might consider when even the large variety of components that are already dened in Swing dont do quite what you want.

12.4.1

Model-View-Controller

One of the principles of object-oriented design is division of responsibilities. Ideally, an object should have a single, clearly dened role, with a limited realm of responsibility. One application of this principle to the design of graphical user interfaces is the MVC pattern. MVC stands for Model-View-Controller and refers to three dierent realms of responsibility in the design of a graphical user interface. When the MVC pattern is applied to a component, the model consists of the data that represents the current state of the component. The view is simply the visual presentation of the component on the screen. And the controller is the aspect of the component that carries out actions in response to events generated by the user. The idea is to assign responsibility for the model, the view, and the controller to dierent objects. The view is the easiest part of the MVC pattern to understand. It is often represented by the component object itself, and its responsibility is to draw the component on the screen. In doing this, of course, it has to consult the model, since the model represents the state of the component, and that state can determine what appears on the screen. To get at the model datawhich is stored in a separate object according to the MVC patternthe component object needs to keep a reference to the model object. Furthermore, when the model changes, the view might have to be redrawn to reect the changed state. The component needs some way of knowing when changes in the model occur. Typically, in Java, this is done with events and listeners. The model object is set up to generate events when its data changes. The view object

12.4. COMPLEX COMPONENTS AND MVC

647

registers itself as a listener for those events. When the model changes, an event is generated, the view is notied of that event, and the view responds by updating its appearance on the screen. When MVC is used for Swing components, the controller is generally not so well dened as the model and view, and its responsibilities are often split among several objects. The controller might include mouse and keyboard listeners that respond to user events on the view; Actions that respond to menu commands or buttons; and listeners for other high-level events, such as those from a slider, that aect the state of the component. Usually, the controller responds to events by making modications to the model, and the view is changed only indirectly, in response to the changes in the model. The MVC pattern is used in many places in the design of Swing. It is even used for buttons. The state of a Swing button is stored in an object of type ButtonModel. The model stores such information as whether the button is enabled, whether it is selected, and what ButtonGroup it is part of, if any. If button is of type JButton (or one of the other subclasses of AbstractButton), then its ButtonModel can be obtained by calling button.getModel(). In the case of buttons, you might never need to use the model or even know that it exists. But for the list and table components that we will look at next, knowledge of the model is essential.

12.4.2

Lists and ListModels

A JList is a component that represents a list of items that can be selected by the user. The sample program SillyStamper.java allows the user to select one icon from a JList of Icons. The user selects an icon from the list by clicking on it. The selected icon can be stamped onto a drawing area by clicking on the drawing area. (The icons in this program are from the KDE desktop project.) Here is a picture of the program with several icons already stamped onto the drawing area and with the light bulb icon selected:

Note that the scrollbar in this program is not part of the JList. To add a scrollbar to a list, the list must be placed into a JScrollPane. See Subsection 6.6.4, where the use of JScrollPane to hold a JTextArea was discussed. Scroll panes are used in the same way with lists and with other components. In this case, the JList, iconList, was added to a scroll pane and the scroll pane was added to a panel with the single command:
add( new JScrollPane(iconList), BorderLayout.EAST );

648

CHAPTER 12. ADVANCED GUI PROGRAMMING

One way to construct a JList is from an array that contains the objects that will appear in the list. The items can be of any type, but only icons and strings can actually appear in the list; an item that is not of type Icon or String is converted into a string by calling its toString() method. (Its possible to teach a JList to display other types of items; see the setCellRenderer() method in the JList class.) In the SillyStamper program, the images for the icons are read from resource les, the icons are placed into an array, and the array is used to construct the list. This is done by the following method:
private JList createIconList() { String[] iconNames = new String[] { "icon5.png", "icon7.png", "icon8.png", "icon9.png", "icon10.png", "icon11.png", "icon24.png", "icon25.png", "icon26.png", "icon31.png", "icon33.png", "icon34.png" }; // Array containing resource file names for the icon images. iconImages = new Image[iconNames.length]; ClassLoader classLoader = getClass().getClassLoader(); Toolkit toolkit = Toolkit.getDefaultToolkit(); try { // Get the icon images from the resource files. for (int i = 0; i < iconNames.length; i++) { URL imageURL = classLoader.getResource("stamper icons/" + iconNames[i]); if (imageURL == null) throw new Exception(); iconImages[i] = toolkit.createImage(imageURL); } } catch (Exception e) { iconImages = null; return null; } ImageIcon[] icons = new ImageIcon[iconImages.length]; for (int i = 0; i < iconImages.length; i++) // Create the icons. icons[i] = new ImageIcon(iconImages[i]); JList list = new JList(icons); // A list containing the image icons. list.setSelectionMode(ListSelectionModel.SINGLE SELECTION); list.setSelectedIndex(0); // First item in the list is currently selected. return list; }

By default, the user can select any number of items in a list. A single item is selected by clicking on it. Multiple items can be selected by shift-clicking and by either control-clicking or meta-clicking (depending on the platform). In the SillyStamper program, I wanted to restrict the selection so that only one item can be selected at a time. This restriction is imposed by calling
list.setSelectionMode(ListSelectionModel.SINGLE SELECTION);

With this selection mode, when the user selects an item, the previously selected item, if any, is deselected. Note that the selection can be changed by the program by calling list.setSelectedIndex(itemNum). Items are numbered starting from zero. To nd out the currently selected item in single selection mode, call list.getSelectedIdex(). This returns

12.4. COMPLEX COMPONENTS AND MVC

649

the item number of the selected item, or -1 if no item is currently selected. If multiple selections are allowed, you can call list.getSelectedIndices(), which returns an array of ints that contains the item numbers of all selected items. Now, the list that you see on the screen is only the view aspect of the list. The controller consists of the listener objects that respond when the user clicks an item in the list. For its model, a JList uses an object of type ListModel. This is the object that knows the actual list of items. Now, a model is dened not only by the data that it contains but by the set of operations that can be performed on the data. When a JList is constructed from an array of objects, the model that is used is very simple. The model can tell you how many items it contains and what those items are, but it cant do much else. In particular, there is no way to add items to the list or to delete items from the list. If you need that capability, you will have to use a dierent list model. The class DefaultListModel denes list models that support adding items to and removing items from the list. (Note that the list model that you get when you create a JList from an array is not of this type.) If dlmodel is of type DefaultListModel, the following methods, among others, are dened: dlmodel.getSize() returns the number of items dlmodel.getElementAt(index) returns the item at position index in the list. dlmodel.addElement(item) Adds item to the end of the list; item can be any Object. dlmodel.insertElementAt(item, index) inserts the specied item into the list at the specied index; items that come after that position in the list are moved down to make room for the new item. dlmodel.setElementAt(item, index) Replaces the item that is currently at position index in the list with item. dlmodel.remove(index) removes the item at position index in the list. dlmodel.removeAllElements() removes everything from the list, leaving it empty. To use a modiable JList, you should create a DefaultListModel, add any items to it that should be in the list initially, and pass it to the JList constructor. For example:
DefaultListModel listModel; JList flavorList; // Should probably be instance variables! // Create the model object. // Add items to the model.

listModel = new DefaultListModel(); listModel.addElement("Chocolate"); listModel.addElement("Vanilla"); listModel.addElement("Strawberry"); listModel.addElement("Rum Raisen"); flavorList = new JList(listModel);

// Create the list component.

By keeping a reference to the model around in an instance variable, you will be able to add and delete avors as the program is running by calling the appropriate methods in listModel. Keep in mind that changes that are made to the model will automatically be reected in the view. Behind the scenes, when a list model is modied, it generates an event of type ListDataEvent. The JList registers itself with its model as a listener for these events, and it responds to an event by redrawing itself to reect the changes in the model. The programmer doesnt have to take any extra action, beyond changing the model.

650

CHAPTER 12. ADVANCED GUI PROGRAMMING

By the way, the model for a JList actually has another part in addition to the ListModel : An object of type ListSelectionModel stores information about which items in the list are currently selected. When the model is complex, its not uncommon to use several model objects to store dierent aspects of the state.

12.4.3

Tables and TableModels

Like a JList, a JTable displays a collection of items to the user. However, tables are much more complicated than lists. Perhaps the most important dierence is that it is possible for the user to edit items in the table. Table items are arranged in a grid of rows and columns. Each grid position is called a cell of the table. Each column can have a header , which appears at the top of the column and contains a name for the column. It is easy to create a JTable from an array that contains the names of the columns and a two-dimensional array that contains the items that go into the cells of the table. As an example, the sample program StatesAndCapitalsTableDemo.java creates a table with two columns named State and Capital City. The rst column contains a list of the states of the United States and the second column contains the name of the capital city of each state. The table can be created as follows:
String[][] statesAndCapitals = new String[][] { { "Alabama", "Montgomery" }, { "Alaska", "Juneau" }, { "Arizona", "Phoenix" }, . . . { "Wisconsin", "Madison" }, { "Wyoming", "Cheyenne" } }; String[] columnHeads = new String[] { "State", "Capital City" }; JTable table = new JTable(statesAndCapitals, columnHeads);

Since a table does not come with its own scroll bars, it is almost always placed in a JScrollPane to make it possible to scroll the table. In the example program this is done with:
add( new JScrollPane(table), BorderLayout.CENTER );

The column headers of a JTable are not actually part of the table; they are in a separate component. But when you add the table to a JScrolPane, the column headers are automatically placed at the top of the pane. Using the default settings, the user can edit any cell in the table. (To select an item for editing, click it and start typing. The arrow keys can be used to move from one cell to another.) The user can change the order of the columns by dragging a column header to a new position. The user can also change the width of the columns by dragging the line that separates neighboring column headers. You can try all this in the sample program; there is an applet version in the on-line version of this section. Allowing the user to edit all entries in the table is not always appropriate; certainly its not appropriate in the states and capitals example. A JTable uses an object of type TableModel to store information about the contents of the table. The model object is also responsible for deciding whether or not the user should be able to edit any given cell in the table. TableModel includes the method

12.4. COMPLEX COMPONENTS AND MVC


public boolean isCellEditable(int rowNum, columnNum)

651

where rowNum and columnNum are the position of a cell in the grid of rows and columns that make up the table. When the controller wants to know whether a certain cell is editable, it calls this method in the table model. If the return value is true, the user is allowed to edit the cell. The default model that is used when the table is created, as above, from an array of objects allows editing of all cells. For this model, the return value of isCellEditable() is true in all cases. To make some cells non-editable, you have to provide a dierent model for the table. One way to do this is to create a subclass of DefaultTableModel and override the isCellEditable() method. (DefaultTableModel and some other classes that are discussed in this section are dened in the package javax.swing.table.) Here is how this might be done in the states and capitals program to make all cells non-editable:
TableModel model = new DefaultTableModel(statesAndCapitals,columnHeads) { public boolean isCellEditable(int row, int col) { return false; } }; JTable table = new JTable(model);

Here, an anonymous subclass of DefaultTableModel is created in which the isCellEditable() method returns false in all cases, and the model object that is created from that class is passed as a parameter to the JTable constructor. The DefaultTableModel class denes many methods that can be used to modify the table, including for example: setValueAt(item,rowNum,colNum) to change the item in a given cell; removeRow(rowNum) to delete a row; and addRow(itemArray) to add a new row at the end of the table that contains items from the array itemArray. Note that if the item in a given cell is null, then that cell will be empty. Remember, again, that when you modify the model, the view is automatically changed to reect the changes. In addition to the isCellEditable() method, the table model method that you are most likely to want to override is getColumnClass(), which is dened as
public Class<?> getColumnClass(columnNum)

The purpose of this method is to specify what kind of values are allowed in the specied column. The return value from this method is of type Class. (The <?> is there for technical reasons having to do with generic programming. See Section 10.5, but dont worry about understanding it here.) Although class objects have crept into this book in a few places in the discussion of ClassLoaders in Subsection 12.1.3 for examplethis is the rst time we have directly encountered the class named Class. An object of type Class represents a class. A Class object is usually obtained from the name of the class using expressions of the form Double.class or JTable.class. If you want a three-column table in which the column types are String, Double, and Boolean, you can use a table model in which getColumnClass is dened as:
public Class<?> getColumnClass(columnNum) { if (columnNum == 0) return String.class; else if (columnNum = 1) return Double.class; else return Boolean.class;

652
}

CHAPTER 12. ADVANCED GUI PROGRAMMING

The table will call this method and use the return value to decide how to display and edit items in the table. For example, if a column is specied to hold Boolean values, the cells in that column will be displayed and edited as check boxes. For numeric types, the table will not accept illegal input when the user types in the value. (It is possible to change the way that a table edits or displays items. See the methods setDefaultEditor() and setDefaultRenderer() in the JTable class.) As an alternative to using a subclass of DefaultTableModel, a custom table model can also be dened using a subclass of AbstractTableModel. Whereas DefaultTableModel provides a lot of predened functionality, AbstractTableModel provides very little. However, using AbstractTableModel gives you the freedom to represent the table data any way you want. The sample program ScatterPlotTableDemo.java uses a subclass of AbstractTableModel to dene the model for a JTable. In this program, the table has three columns. The rst column holds a row number and is not editable. The other columns hold values of type Double; these two columns represent the x- and y-coordinates of points in the plane. The points themselves are graphed in a scatter plot next to the table. Initially, the program lls in the rst six points with random values. Here is a picture of the program, with the x-coordinate in row 5 selected for editing:

Note, by the way, that in this program, the scatter plot can be considered to be a view of the table model, in the same way that the table itself is. The scatter plot registers itself as a listener with the model, so that it will receive notication whenever the model changes. When that happens, the scatter plot redraws itself to reect the new state of the model. It is an important property of the MVC pattern that several views can share the same model, oering alternative presentations of the same data. The views dont have to know about each other or communicate with each other except by sharing the model. Although I didnt do it in this program, it would even be possible to add a controller to the scatter plot view. This would let the user drag a point in the scatter plot to change its coordinates. Since the scatter plot and table share the same model, the values in the table would automatically change to match. Here is the denition of the class that denes the model in the scatter plot program. All the methods in this class must be dened in any subclass of AbstractTableModel except for setValueAt(), which only has to be dened if the table is modiable.
/**

12.4. COMPLEX COMPONENTS AND MVC


* This class defines the TableModel that is used for the JTable in this * program. The table has three columns. Column 0 simply holds the * row number of each row. Column 1 holds the x-coordinates of the * points for the scatter plot, and Column 2 holds the y-coordinates. * The table has 25 rows. No support is provided for adding more rows. */ private class CoordInputTableModel extends AbstractTableModel { private private // // Double[] xCoord = new Double[25]; // Data for Column 1. Double[] yCoord = new Double[25]; // Data for Column 2. Initially, all the values in the array are null, which means that all the cells are empty.

653

public int getColumnCount() { return 3; } public int getRowCount() { return xCoord.length; }

// Tells caller how many columns there are.

// Tells caller how many rows there are.

public Object getValueAt(int if (col == 0) return (row+1); else if (col == 1) return xCoord[row]; else return yCoord[row]; }

row, int col) {

// Get value from cell.

// Column 0 holds the row number. // Column 1 holds the x-coordinates. // column 2 holds the y-coordinates. // Get data type of column.

public Class<?> getColumnClass(int col) { if (col == 0) return Integer.class; else return Double.class; } public String getColumnName(int col) { if (col == 0) return "Num"; else if (col == 1) return "X"; else return "Y"; }

// Returns a name for column header.

public boolean isCellEditable(int row, int col) { // Can user edit cell? return col > 0; } public void setValueAt(Object obj, int row, int col) { // (This method is called by the system if the value of the cell // needs to be changed because the user has edited the cell. // It can also be called to change the value programmatically. // In this case, only columns 1 and 2 can be modified, and the data // type for obj must be Double. The method fireTableCellUpdated() // has to be called to send an event to registered listeners to

654

CHAPTER 12. ADVANCED GUI PROGRAMMING


// notify them of the modification to the table model.) if (col == 1) xCoord[row] = (Double)obj; else if (col == 2) yCoord[row] = (Double)obj; fireTableCellUpdated(row, col); } } // end nested class CoordInputTableModel

In addition to dening a custom table model, I customized the appearance of the table in several ways. Because this involves changes to the view, most of the changes are made by calling methods in the JTable object. For example, since the default height of the cells was too small for my taste, I called table.setRowHeight(25) to increase the height. To make lines appear between the rows and columns, I found that I had to call both table.setShowGrid(true) and table.setGridColor(Color.BLACK). Some of the customization has to be done to other objects. For example, to prevent the user from changing the order of the columns by dragging the column headers, I had to use
table.getTableHeader().setReorderingAllowed(false);

Tables are quite complex, and I have only discussed a part of the table API here. Nevertheless, I hope that you have learned enough to start using them and to learn more about them on your own.

12.4.4

Documents and Editors

As a nal example of complex components, we look briey at JTextComponent and its subclasses. A JTextComponent displays text that can, optionally, be edited by the user. Two subclasses, JTextField and JTextArea, were introduced in Subsection 6.6.4. But the real complexity comes in another subclass, JEditorPane, that supports display and editing of styled text, which allows features such as boldface and italic. A JEditorPane can even work with basic HTML documents. It is almost absurdly easy to write a simple web browser program using a JEditorPane. This is done in the sample program SimpleWebBrowser.java. In this program, the user enters the URL of a web page, and the program tries to load and display the web page at that location. A JEditorPane can handle pages with content type text/plain, text/html, and text/rtf. (The content type text/rtf represents styled or rich text format text. URLs and content types were covered in Subsection 11.4.1.) If editPane is of type JEditorPane and url is of type URL, then the statement editPane.setPage(url); is sucient to load the page and display it. Since this can generate an exception, the following method is used in SimpleWebBrowser.java to display a page:
private void loadURL(URL url) { try { editPane.setPage(url); } catch (Exception e) { editPane.setContentType("text/plain"); // Set pane to display plain text. editPane.setText( "Sorry, the requested document was not found\n" +"or cannot be displayed.\n\nError:" + e); } }

12.4. COMPLEX COMPONENTS AND MVC

655

An HTML document can display links to other pages. When the user clicks on a link, the web browser should go to the linked page. A JEditorPane does not do this automatically, but it does generate an event of type HyperLinkEvent when the user clicks a link (provided that the edit pane has been set to be non-editable by the user). A program can register a listener for such events and respond by loading the new page. There are a lot of web pages that a JEditorPane wont be able to display correctly, but it can be very useful in cases where you have control over the pages that will be displayed. A nice application is to distribute HTML-format help and information les with a program. The les can be stored as resource les in the jar le of the program, and a URL for a resource le can be obtained in the usual way, using the getResource() method of a ClassLoader. (See Subsection 12.1.3.) It turns out, by the way, that SimpleWebBrowser.java is a little too simple. A modied version, SimpleWebBrowserWithThread.java, improves on the original by using a thread to load a page and by checking the content type of a page before trying to load it. It actually does work as a simple web browser. The model for a JTextComponent is an object of type Document. If you want to be notied of changes in the model, you can add a listener to the model using
textComponent.getDocument().addDocumentListener(listener)

where textComponent is of type JTextComponent and listener is of type DocumentListener. The Document class also has methods that make it easy to read a document from a le and write a document to a le. I wont discuss all the things you can do with text components here. For one more peek at their capabilities, see the sample program SimpleRTFEdit.java, a very minimal editor for les that contain styled text of type text/rtf.

12.4.5

Custom Components

Javas standard component classes are often all you need to construct a user interface. At some point, however, you might need a component that Java doesnt provide. In that case, you can write your own component class, building on one of the components that Java does provide. Weve already done this, actually, every time weve written a subclass of the JPanel class to use as a drawing surface. A JPanel is a blank slate. By dening a subclass, you can make it show any picture you like, and you can program it to respond in any way to mouse and keyboard events. Sometimes, if you are lucky, you dont need such freedom, and you can build on one of Javas more sophisticated component classes. For example, suppose I have a need for a stopwatch component. When the user clicks on the stopwatch, I want it to start timing. When the user clicks again, I want it to display the elapsed time since the rst click. The textual display can be done with a JLabel, but we want a JLabel that can respond to mouse clicks. We can get this behavior by dening a StopWatchLabel component as a subclass of the JLabel class. A StopWatchLabel object will listen for mouse clicks on itself. The rst time the user clicks, it will change its display to Timing... and remember the time when the click occurred. When the user clicks again, it will check the time again, and it will compute and display the elapsed time. (Of course, I dont necessarily have to dene a subclass. I could use a regular label in my program, set up a listener to respond to mouse events on the label, and let the program do the work of keeping track of the time and changing the text displayed on the label. However, by writing a new class, I have something that can be reused in other projects. I also have all the code involved in the

656

CHAPTER 12. ADVANCED GUI PROGRAMMING

stopwatch function collected together neatly in one place. For more complicated components, both of these considerations are very important.) The StopWatchLabel class is not very hard to write. I need an instance variable to record the time when the user starts the stopwatch. Times in Java are measured in milliseconds and are stored in variables of type long (to allow for very large values). In the mousePressed() method, I need to know whether the timer is being started or stopped, so I need a boolean instance variable, running, to keep track of this aspect of the components state. There is one more item of interest: How do I know what time the mouse was clicked? The method System.currentTimeMillis() returns the current time. But there can be some delay between the time the user clicks the mouse and the time when the mousePressed() routine is called. To make my stopwatch as accurate as possible, I dont want to know the current time. I want to know the exact time when the mouse was pressed. When I wrote the StopWatchLabel class, this need sent me on a search in the Java documentation. I found that if evt is an object of type MouseEvent, then the function evt.getWhen() returns the time when the event occurred. I call this function in the mousePressed() routine to determine the exact time when the user clicked on the label. The complete StopWatch class is rather short:
import java.awt.event.*; import javax.swing.*; /** * A custom component that acts as a simple stop-watch. When the user clicks * on it, this component starts timing. When the user clicks again, * it displays the time between the two clicks. Clicking a third time * starts another timer, etc. While it is timing, the label just * displays the message "Timing....". */ public class StopWatchLabel extends JLabel implements MouseListener { private long startTime; private boolean running; // Start time of timer. // (Time is measured in milliseconds.) // True when the timer is running.

/** * Constructor sets initial text on the label to * "Click to start timer." and sets up a mouse listener * so the label can respond to clicks. */ public StopWatchLabel() { super(" Click to start timer. ", JLabel.CENTER); addMouseListener(this); } /** * Tells whether the timer is currently running. */ public boolean isRunning() { return running; } /** * React when the user presses the mouse by starting or stopping

12.4. COMPLEX COMPONENTS AND MVC


* the timer and changing the text that is shown on the label. */ public void mousePressed(MouseEvent evt) { if (running == false) { // Record the time and start the timer. running = true; startTime = evt.getWhen(); // Time when mouse was clicked. setText("Timing...."); } else { // Stop the timer. Compute the elapsed time since the // timer was started and display it. running = false; long endTime = evt.getWhen(); double seconds = (endTime - startTime) / 1000.0; setText("Time: " + seconds + " sec."); } } public public public public } void void void void mouseReleased(MouseEvent evt) { } mouseClicked(MouseEvent evt) { } mouseEntered(MouseEvent evt) { } mouseExited(MouseEvent evt) { }

657

Dont forget that since StopWatchLabel is a subclass of JLabel, you can do anything with a StopWatchLabel that you can do with a JLabel. You can add it to a container. You can set its font, foreground color, and background color. You can set the text that it displays (although this would interfere with its stopwatch function). You can even add a Border if you want. Lets look at one more example of dening a custom component. Suppose thatfor no good reason whatsoeverI want a component that acts like a JLabel except that it displays its text in mirror-reversed form. Since no standard component does anything like this, the MirrorText class is dened as a subclass of JPanel. It has a constructor that species the text to be displayed and a setText() method that changes the displayed text. The paintComponent() method draws the text mirror-reversed, in the center of the component. This uses techniques discussed in Subsection 12.1.1 and Subsection 12.2.1. Information from a FontMetrics object is used to center the text in the component. The reversal is achieved by using an o-screen canvas. The text is drawn to the o-screen canvas, in the usual way. Then the image is copied to the screen with the following command, where OSC is the variable that refers to the o-screen canvas, and width and height give the size of both the component and the o-screen canvas:
g.drawImage(OSC, width, 0, 0, height, 0, 0, width, height, this);

This is the version of drawImage() that species corners of destination and source rectangles. The corner (0,0) in OSC is matched to the corner (width,0) on the screen, while (width,height) is matched to (0,height). This reverses the image left-to-right. Here is the complete class:
import java.awt.*; import javax.swing.*; import java.awt.image.BufferedImage; /**

658

CHAPTER 12. ADVANCED GUI PROGRAMMING


* A component for displaying a mirror-reversed line of text. * The text will be centered in the available space. This component * is defined as a subclass of JPanel. It respects any background * color, foreground color, and font that are set for the JPanel. * The setText(String) method can be used to change the displayed * text. Changing the text will also call revalidate() on this * component. */ public class MirrorText extends JPanel { private String text; // The text displayed by this component. private BufferedImage OSC; // Holds an un-reversed picture of the text. /** * Construct a MirrorText component that will display the specified * text in mirror-reversed form. */ public MirrorText(String text) { if (text == null) text = ""; this.text = text; } /** * Change the text that is displayed on the label. * @param text the new text to display */ public void setText(String text) { if (text == null) text = ""; if ( ! text.equals(this.text) ) { this.text = text; // Change the instance variable. revalidate(); // Tell container to recompute its layout. repaint(); // Make sure component is redrawn. } } /** * Return the text that is displayed on this component. * The return value is non-null but can be an empty string. */ public String getText() { return text; } /** * The paintComponent method makes a new off-screen canvas, if necessary, * writes the text to the off-screen canvas, then copies the canvas onto * the screen in mirror-reversed form. */ public void paintComponent(Graphics g) { int width = getWidth(); int height = getHeight(); if (OSC == null || width != OSC.getWidth() || height != OSC.getHeight()) {

12.4. COMPLEX COMPONENTS AND MVC


OSC = new BufferedImage(width,height,BufferedImage.TYPE INT RGB); } Graphics OSG = OSC.getGraphics(); OSG.setColor(getBackground()); OSG.fillRect(0, 0, width, height); OSG.setColor(getForeground()); OSG.setFont(getFont()); FontMetrics fm = OSG.getFontMetrics(getFont()); int x = (width - fm.stringWidth(text)) / 2; int y = (height + fm.getAscent() - fm.getDescent()) / 2; OSG.drawString(text, x, y); OSG.dispose(); g.drawImage(OSC, width, 0, 0, height, 0, 0, width, height, null); } /** * Compute a preferred size that includes the size of the text, plus * a boundary of 5 pixels on each edge. */ public Dimension getPreferredSize() { FontMetrics fm = getFontMetrics(getFont()); return new Dimension(fm.stringWidth(text) + 10, fm.getAscent() + fm.getDescent() + 10); } } // end MirrorText

659

This class denes the method public Dimension getPreferredSize(). This method is called by a layout manager when it wants to know how big the component would like to be. Standard components come with a way of computing a preferred size. For a custom component based on a JPanel, its a good idea to provide a custom preferred size. Every component has a method setPrefferedSize() that can be used to set the preferred size of the component. For our MirrorText component, however, the preferred size depends on the font and the text of the component, and these can change from time to time. We need a way to compute a preferred size on demand, based on the current font and text. Thats what we do by dening a getPreferredSize() method. The system calls this method when it wants to know the preferred size of the component. In response, we can compute the preferred size based on the current font and text. The StopWatchLabel and MirrorText classes dene components. Components dont stand on their own. You have to add them to a panel or other container. The sample program CustomComponentTest.java demonstrates using a MirrorText and a StopWatchLabel component. In this program, the two custom components and a button are added to a panel that uses a FlowLayout as its layout manager, so the components are not arranged very neatly. If you click the button labeled Change Text in this Program, the text in all the components will be changed. You can also click on the stopwatch label to start and stop the stopwatch. When you do any of these things, you will notice that the components will be rearranged to take the new sizes into account. This is known as validating the container. This is done automatically when a standard component changes in some way that requires a change in preferred size or location. This may or may not be the behavior that you want. (Validation doesnt always cause as much disruption as it does in this program. For example, in a GridLayout, where all the components are displayed at the same size, it will have no eect at all. I chose a FlowLayout for this example to make the eect more obvious.) When the text is changed in a MirrorText

660

CHAPTER 12. ADVANCED GUI PROGRAMMING

component, there is no automatic validation of its container. A custom component such as MirrorText must call the revalidate() method to indicate that the container that contains the component should be validated. In the MirrorText class, revalidate() is called in the setText() method.

12.5 In

Finishing Touches

this final section, I will present a program that is more complex and more polished than those we have looked at previously. Most of the examples in this book have been toy programs that illustrated one or two points about programming techniques. Its time to put it all together into a full-scale program that uses many of the techniques that we have covered, and a few more besides. After discussing the program and its basic design, Ill use it as an excuse to talk briey about some of the features of Java that didnt t into the rest of this book. The program that we will look at is a Mandelbrot Viewer that lets the user explore the famous Mandelbrot set. I will begin by explaining what that means. If you have downloaded the web version of this book, note that the jar le MandelbrotViewer.jar is an executable jar le that you can use to run the program as a stand-alone application. The jar le is in the directory c12, which contains all the les for this chapter. The on-line version of this page has two applet versions of the program. One shows the program running on the web page. The other applet appears on the web page as a button; clicking the button opens the program in a separate window.

12.5.1

The Mandelbrot Set

The Mandelbrot set is a set of points in the xy-plane that is dened by a computational procedure. To use the program, all you really need to know is that the Mandelbrot set can be used to make some pretty pictures, but here are the mathematical details: Consider the point that has real-number coordinates (a,b) and apply the following computation:
Let x = a Let y = b Repeat: Let newX = x*x - y*y + a Let newY = 2*x*y + b Let x = newX Let y = newY

As the loop is repeated, the point (x,y) changes. The question is, does (x,y) grow without bound or is it trapped forever in a nite region of the plane? If (x,y) escapes to innity (that is, grows without bound), then the starting point (a,b) is not in the Mandelbrot set. If (x,y) is trapped in a nite region, then (a,b) is in the Mandelbrot set. Now, it is known that if x2 + y2 ever becomes strictly greater than 4, then (x,y) will escape to innity. If x2 + y2 ever becomes bigger than 4 in the above loop, we can end the loop and say that (a,b) is not in the Mandelbrot set. For a point (a,b) in the Mandelbrot set, this will never happen. When we do this on a computer, of course, we dont want to have a loop that runs forever, so we put a limit on the number of times that the loop is executed:
x = a; y = b;

12.5. FINISHING TOUCHES


count = 0; while ( x*x + y*y < 4.1 ) { count++; if (count > maxIterations) break; double newX = x*x - y*y + a; double newY = 2*x*y + b; x = newY; y = newY; }

661

After this loop ends, if count is less than or equal to maxIterations, we can say that (a,b) is not in the Mandelbrot set. If count is greater than maxIterations, then (a,b) might or might not be in the Mandelbrot set (but the larger maxIterations is, the more likely that (a,b) is actually in the set). To make a picture from this procedure, use a rectangular grid of pixels to represent some rectangle in the plane. Each pixel corresponds to some real number coordinates (a,b). (Use the coordinates of the center of the pixel.) Run the above loop for each pixel. If the count goes past maxIterations, color the pixel black; this is a point that is possibly in the Mandelbrot set. Otherwise, base the color of the pixel on the value of count after the loop ends, using dierent colors for dierent counts. In some sense, the higher the count, the closer the point is to the Mandelbrot set, so the colors give some information about points outside the set and about the shape of the set. However, its important to understand that the colors are arbitrary and that colored points are not in the set. Here is a picture that was produced by the Mandelbrot Viewer program using this computation. The black region is the Mandelbrot set:

When you use the program, you can zoom in on small regions of the plane. To do so, just drag the mouse on the picture. This will draw a rectangle around part of the picture. When you release the mouse, the part of the picture inside the rectangle will be zoomed to ll the entire display. If you simply click a point in the picture, you will zoom in on the point where you click by a magnication factor of two. (Shift-click or use the right mouse button to zoom out instead of zooming in.) The interesting points are along the boundary of the Mandelbrot

662

CHAPTER 12. ADVANCED GUI PROGRAMMING

set. In fact, the boundary is innitely complex. (Note that if you zoom in too far, you will exceed the capabilities of the double data type; nothing is done in the program to prevent this.) Use the MaxIterations menu to increase the maximum number of iterations in the loop. Remember that black pixels might or might not be in the set; when you increase MaxIterations, you might nd that a black region becomes lled with color. The Palette menu determines the set of colors that are used. Dierent palettes give very dierent visualizations of the set. The PaletteLength menu determines how many dierent colors are used. In the default setting, a dierent color is used for each possible value of count in the algorithm. Sometimes, you can get a much better picture by using a dierent number of colors. If the palette length is less than maxIterations, the palette is repeated to cover all the possible values of count; if the palette length is greater than maxIterations, only part of of the palette will be used. (If the picture is of an almost uniform color, try decreasing the palette length, since that makes the color vary more quickly as count changes. If you see what look like randomly colored dots instead of bands of color, try increasing the palette length.) If you run the Mandelbrot Viewer program as a stand-alone application, it will have a File menu that can be used to save the picture as a PNG image le. You can also save a param le which simply saves the settings that produced the current picture. A param le can be read back into the program using the Open command. The Mandelbrot set is named after Benoit Mandelbrot, who was the rst person to note the incredible complexity of the set. It is astonishing that such complexity and beauty can arise out of such a simple algorithm.

12.5.2

Design of the Program

Most classes in Java are dened in packages. While we have used standard packages such as javax.swing and java.io extensively, all of my programming examples have been in the default package, which means that they are not declared to belong to any named package. However, when doing more serious programming, it is good style to create a package to hold the classes for your program. Sun Microsystems recommends that package names should be based on an Internet domain name of the organization that produces the package. My oce computer has domain name eck.hws.edu, and no other computer in the world should have the same name. According to Sun, this allows me to use the package name edu.hws.eck, with the elements of the domain name in reverse order. I can also use sub-packages of this package, such as edu.hws.eck.mdb, which is the package name that I decided to use for my Mandelbrot Viewer application. No one elseor at least no one else who uses Suns naming conventionwill ever use the same package name, so this package name uniquely identies my program. I briey discussed using packages in Subsection 2.6.4. Heres what you need to know for the Mandelbrot Viewer program: The program is dened in ten Java source code les. They can be found in the directory edu/hws/eck/mdb inside the source directory of the web site. (That is, they are in a directory named mdb, which is inside a directory named eck, which is inside hws, which is inside edu. The directory structure must follow the package name in this way.) The same directory also contains a le named strings.properties that is used by the program and that will be discussed below. For an Integrated Development Environment such as Eclipse, you should just have to add the edu directory to your project. To compile the les on the command line, you must be working in the directory that contains the edu directory. Use the command
javac edu/hws/eck/mdb/*.java

12.5. FINISHING TOUCHES or, if you use Windows,


javac edu\hws\eck\mdb\*.java

663

to compile the source code. The main routine for the stand-alone application version of the program is dened by a class named Main. To run this class, use the command:
java edu.hws.eck.mdb.Main

This command must also be given in the directory that contains the edu directory.


The work of computing and displaying images of the Mandelbrot set is done in MandelbrotDisplay.java. The MandelbrotDisplay class is a subclass of JPanel. It uses an o-screen canvas to hold a copy of the image. (See Subsection 12.1.1.) The paintComponent() method copies this image onto the panel. Then, if the user is drawing a zoom box with the mouse, the zoom box is drawn on top of the image. In addition to the image, the class uses a two-dimensional array to store the iteration count for each pixel in the image. If the range of xy-values changes, or if the size of the window changes, all the counts must be recomputed. Since the computation can take quite a while, it would not be acceptable to block the user interface while the computation is being performed. The solution is to do the computation in a separate thread. (See Section 8.5.) When the computation begins, the image is lled with gray. Every so often, about twice a second, the data that has been computed by the computation thread is gathered and applied to the o-screen canvas, and the part of the canvas that has been modied is copied to the screen. The user can continue to use the menus and even the mouse while the image is being computed. Actually, this simplies things a bit, because there can be more than one computational thread. The program creates as many computational threads as there are available processors in the computer, and the task of computing the image is split among these threads. The algorithm that is used is essentially the same as the distributed algorithm from Subsection 11.5.3, except that in this case all the computational threads are on the same computer. The le MandelbrotPanel.java denes the main panel of the Mandelbrot Viewer window. MandelbrotPanel is another subclass of JPanel. A MandelbrotPanel is mostly lled with a MandelbrotDisplay. It also adds a JLabel beneath the display. The JLabel is used as a status bar that shows some information that might be interesting to the user. The MandelbrotPanel also denes the programs mouse listener. In addition to handling zooming, the mouse listener puts the x and y coordinates of the current mouse location in the status bar as the user moves or drags the mouse. Also, when the mouse exits the drawing area, the text in the status bar is set to read Idle. This is the rst time that we have seen an actual use for mouseMoved and mouseExited events. (See Subsection 6.4.2 and Subsection 6.4.4.) The menu bar for the program is dened in Menus.java. Commands in the File and Control menu are dened as Actions. (See Subsection 12.3.1.) Note that among the actions are le manipulation commands that use techniques from Subsection 11.2.3, Subsection 11.6.3, and Subsection 12.1.5. The MaxIterations, Palette, and PaletteLength menus each contain a group of JRadioButtonMenuItems. (See Subsection 12.3.3.) I have tried several approaches for handling such groups, and none of them have satised me completely. In this program, I have dened a nested class inside Menus to represent each group. For example, the PaletteManager class contains the menu items in the Palette menu as instance variables. It registers an action listener with each item, and it denes a few utility routines for operating on the menu. The classes for the three menus are very similar and should probably have been dened as subclasses of some more general class.

664

CHAPTER 12. ADVANCED GUI PROGRAMMING

One interesting point is that the contents of the menu bar are dierent, depending on whether the program is being run as an applet or as a stand-alone application. Since applets cannot access the le system, there is no File menu for an applet. Furthermore, accelerator keys are generally not functional in an applet that is running on a web page, so accelerator keys are only added to menu items if the program is being run in its own window. (See Subsection 12.3.5 for information on accelerators.) To accomplish this, the constructor in the Menus class has parameters that tell it whether the menu bar will be used in an applet and whether it will be used in a frame, and these parameters are consulted as the menu bar is being built. A third parameter to the constructor is the MandelbrotPanel that is being used in the program. Many of the menu commands operate on this panel or on the MandelbrotDisplay that it contains. In order to carry out these commands, the Menus object needs a reference to the MandelbrotPanel. As for the MandelbrotDisplay, the panel has a method getDisplay() that returns a reference to the display that it contains. So as long as the menu bar has a reference to the panel, it can obtain a reference to the display. In previous examples, everything was written as one large class le, so all the objects were directly available to all the code. When a program is made up of multiple interacting les, getting access to the necessary objects can be more of a problem. MandelbrotPanel, MandelbrotDisplay, and Menus are the main classes that make up the Mandelbrot Viewer program. MandelbrotFrame.java denes a simple subclass of JFrame that runs the program in its own window. MandelbrotApplet.java denes an applet that runs the program on a web page. (This applet version has an extra Examples menu that is discussed in the source code le.) There are a few other classes that I will discuss below. This brief discussion of the design of the Mandelbrot Viewer has shown that it uses a wide variety of techniques that were covered earlier in this book. In the rest of this section, well look at a few new features of Java that were used in the program.

12.5.3

Internationalization

Internationalization refers to writing a program that is easy to adapt for running in dierent parts of the world. Internationalization is often referred to as I18n, where 18 is the number of letters between the I and the nal n in Internationalization. The process of adapting the program to a particular location is called localization, and the locations are called locales. Locales dier in many ways, including the type of currency used and the format used for numbers and dates, but the most obvious dierence is language. Here, I will discuss how to write a program so that it can be easily translated into other languages. The key idea is that strings that will be presented to the user should not be coded into the program source code. If they were, then a translator would have to search through the entire source code, replacing every string with its translation. Then the program would have to be recompiled. In a properly internationalized program, all the strings are stored together in one or more les that are separate from the source code, where they can easily be found and translated. And since the source code doesnt have to be modied to do the translation, no recompilation is necessary. To implement this idea, the strings are stored in one or more properties les. A properties le is just a list of key/value pairs. For translation purposes, the values are strings that will be presented to the user; these are the strings that have to be translated. The keys are also strings, but they dont have to be translated because they will never be presented to the user. Since they wont have to be modied, the key strings can be used in the program source code. Each

12.5. FINISHING TOUCHES

665

key uniquely identies one of the value strings. The program can use the key string to look up the corresponding value string from the properties le. The program only needs to know the key string; the user will only see the value string. When the properties le is translated, the user of the program will see dierent value strings. The format of a properties le is very simple. The key/value pairs take the form
key.string=value string

There are no spaces in the key string or before the equals sign. The value string can contain spaces or any other characters. If the line ends with a backslash (\), the value string can be continued on the next line; in this case, spaces at the beginning of that line are ignored. One unfortunate detail is that a properties le can contain only plain ASCII characters. The ASCII character set only supports the English alphabet. Nevertheless, a value string can include arbitrary UNICODE characters. Non-ASCII characters just have to be specially encoded. Sun Microsystems provides a program, native2ascii, that can convert les that use non-ASCII characters into a form that is suitable for use as a properties le. Suppose that the program wants to present a string to the user (as the name of a menu command, for example). The properties le would contain a key/value pair such as
menu.saveimage=Save PNG Image...

where Save PNG Image. . . is the string that will appear in the menu. The program would use the key string, menu.saveimage, to look up the corresponding value string and would then use the value string as the text of the menu item. In Java, the look up process is supported by the ResourceBundle class, which knows how to retrieve and use properties les. Sometimes a string that is presented to the user contains substrings that are not known until the time when the program is running. A typical example is the name of a le. Suppose, for example, that the program wants to tell the user, Sorry, the le, lename, cannot be loaded, where lename is the name of a le that was selected by the user at run time. To handle cases like this, value strings in properties les can include placeholders that will be replaced by strings to be determined by the program at run time. The placeholders take the form {0}, {1}, {2}, . . . . For the le error example, the properties le might contain:
error.cantLoad=Sorry, the file, {0}, cannot be loaded

The program would fetch the value string for the key error.cantLoad. It would then substitute the actual le name for the placeholder, {0}. Note that when the string is translated, the word order might be completely dierent. By using a placeholder for the le name, you can be sure that the le name will be put in the correct grammatical position for the language that is being used. Placeholder substitution is not handled by the ResourceBundle class, but Java has another class, MessageFormat, that makes such substitutions easy. For the Mandelbrot Viewer program, the properties le is strings.properties. (Any properties le should have a name that ends in .properties.) Any string that you see when you run the program comes from this le. For handling value string lookup, I wrote I18n.java. The I18n class has a static method
public static tr( String key, Object... args )

that handles the whole process. Here, key is the key string that will be looked up in strings.properties. Additional parameters, if any, will be substituted for placeholders in the value string. (Recall that the formal parameter declaration Object... means that there can be any number of actual parameters after key; see Subsection 7.2.6.) Typical uses would include:

666

CHAPTER 12. ADVANCED GUI PROGRAMMING


String saveImageCommandText = I18n.tr( "menu.saveimage" ); String errMess = I18n.tr( "error.cantLoad" , selectedFile.getName() );

You will see function calls like this throughout the Mandelbrot Viewer source code. The I18n class is written in a general way so that it can be used in any program. As long as you provide a properties le as a resource, the only things you need to do are change the resource le name in I18n.java and put the class in your own package. It is actually possible to provide several alternative properties les in the same program. For example, you might include French and Japanese versions of the properties le along with an English version. If the English properties le is named string.properties, then the names for the French and Japanese versions should be strings fr.properties and strings ja.properties. Every language has a two-letter code, such as fr and ja, that is used in constructing properties le names for that language. The program asks for the properties le using the simple name string. If the program is being run on a Java system in which the preferred language is French, the program will try to load string fr.properties; if that fails, it will look for strings.properties. This means that the program will use the French properties les in a French locale; it will use the Japanese properties le in a Japanese locale; and in any other locale it will use the default properties le.

12.5.4

Events, Events, Events

We have worked extensively with mouse events, key events, and action events, but these are only a few of the event types that are used in Java. The Mandelbrot Viewer program makes use of several other types of events. It also serves as an example of the benets of event-oriented programming. Lets start from the following fact: The MandelbrotDisplay class knows nothing about any of the other classes that make up the program (with the single exception of one call to the internationalization method I18n.tr). Yet other classes are aware of things that are going on in the MandelbrotDisplay class. For example, when the size of the display is changed, the new size is reported in the status bar that is part of the MandelbrotPanel class. In the Menus class, certain menus are disabled when the display begins the computation of an image and are re-enabled when the computation completes. The display doesnt call methods in the MandelbrotPanel or Menus classes, so how do these classes get their information about what is going on in the display? The answer, of course, is events. The MandelbrotDisplay object emits events of various types when various things happen. The MandelbrotPanel and MandelbrotDisplay objects set up listeners that hear those events and respond to them. The point is that because events are used for communication, the MandelbrotDisplay class is not strongly coupled to the other classes. In fact, it can be used in other programs without any modication and without access to the other classes. The alternative to using events would be to have the display object call methods such as displaySizeChanged() or computationStarted() in the MandelbrotPanel and MandelbrotFrame objects to tell them what is going on in the display. This would be strong coupling: Any programmer who wanted to use MandelbrotDisplay would also have to use the other two classes or would have to modify the display class so that it no longer refers to the other classes. Of course, not everything can be done with events and not all strong coupling is bad: The MandelbrotPanel class refers directly to the MandelbrotDisplay class and cannot be used without itbut since the whole purpose of a MandelbrotPanel is to hold a MandelbrotDisplay, the coupling is not a problem.

12.5. FINISHING TOUCHES

667

The Mandelbrot Viewer program responds to mouse events on the display. These events are generated by the display object, but the display class itself doesnt care about mouse events and doesnt do anything in response to them. Mouse events are handled by a listener in the MandelbrotPanel, which responds to them by zooming the display and by showing mouse coordinates in the status bar. The status bar also shows the new size of the display whenever that size is changed. To handle this, events of type ComponentEvent are used. When the size of a component is changed, a ComponentEvent is generated. In the Mandelbrot Viewer program, a ComponentListener in the MandelbrotPanel class listens for size-change events in the display. When one occurs, the listener responds by showing the new size in the status bar; the display knows nothing about the status bar that shows the displays size. Component events are also used internally in the MandelbrotDisplay class in an interesting way. When the user dynamically changes the size of the display, its size can change several times each second. Normally, a change of display size would trigger the creation of a new oscreen canvas and the start of a new asynchronous computation of the image. However, doing this is a big deal, not something I want to do several times in a second. If you try resizing the programs window, youll notice that the image doesnt change size dynamically as the window size changes. The same image and o-screen canvas are used as long as the size is changing. Only about one-third of a second after the size has stopped changing will a new, resized image be produced. Here is how this works: The display sets up a ComponentEvent to listen for resize events on itself. When a resize occurs, the listener starts a Timer that has a delay of 1/3 second. (See Subsection 6.5.1.) While this timer is running, the paintComponent() method does not resize the image; instead, it reuses the image that already exists. If the timer res 1/3 second later, the image will be resized at that time. However, if another resize event occurs while the rst timer is running, then the rst timer will be stopped before it has a chance to re, and a new timer will be started with a delay of 1/3 second. The result is that the image does not get resized until 1/3 second after the size of the window stops changing. The Mandelbrot Viewer program also uses events of type WindowEvent, which are generated by a window when it opens or closes (among other things). One example is in the le LauncherApplet.java. This le denes an applet that appears as a button on the web page. The button is labeled Launch Mandelbrot Viewer. When the user clicks the button, a MandelbrotFrame is opened on the screen, and the text on the button changes to Close Mandelbrot Viewer. When the frame closes, the button changes back to Launch Mandelbrot Viewer, and the button can be used to open another window. The frame can be closed by clicking the button, but it can also be closed using a Close command in the frames menu bar or by clicking the close box in the frames title bar. The question is, how does the buttons text get changed when the frame is closed by one of the latter two methods? One possibility would be to have the frame call a method in the applet to tell the applet that it is closing, but that would tightly couple the frame class to the applet class. In fact, its done with WindowEvents. A WindowListener in the applet listens for close events from the frame. In response to a close event, the text of the button is changed. Again, this can happen even though the frame class knows nothing about the applet class. Window events are also used by Main.java to trigger an action that has to be taken when the program is ending; this will be discussed below. Perhaps the most interesting use of events in the Mandelbrot Viewer program is to enable and disable menu commands based on the status of the display. For this, events of type PropertyChangeEvent are used. This event class is part of the bean framework that was discussed briey in Subsection 11.6.2, and class PropertyChangeEvent and related classes are dened in

668

CHAPTER 12. ADVANCED GUI PROGRAMMING

the package java.beans. The idea is that bean objects are dened by their properties (which are just aspects of the state of the bean). When a bean property changes, the bean can emit a PropertyChangeEvent to notify other objects of the change. Properties for which property change events are emitted are known technically as bound properties. A bound property has a name that identies that particular property among all the properties of the bean. When a property change event is generated, the event object includes the name of the property that has changed, the previous value of the property, and the new value of the property. The MandelbrotDisplay class has a bound property whose name is given by the constant MandelbrotDisplay.STATUS PROPERTY. A display emits a property change event when its status changes. The possible values of the status property are given by other constants, such as MandelbrotDisplay.STATUS READY. The READY status indicates that the display is not currently running a computation and is ready to do another one. There are several menu commands that should be enabled only when the status of the display is READY. To implement this, the Menus class denes a PropertyChangeListener to listen for property change events from the display. When this listener hears an event, it responds by enabling or disabling menu commands according to the new value of the status property. All of Javas GUI components are beans and are capable of emitting property change events. In any subclass of Component, this can be done simply by calling the method
public void firePropertyChange(String propertyName, Object oldValue, Object newValue)

For example, the MandelbrotDisplay class uses the following method for setting its current status:
private void setStatus(String status) { if (status == this.status) { // Note: Event should be fired only if status actually changes. return; } String oldStatus = this.status; this.status = status; firePropertyChange(STATUS PROPERTY, oldStatus, status); }

When writing bean classes from scratch, you have to add support for property change events, if you need them. To make this easier, the java.beans package provides the PropertyChangeSupport class.

12.5.5

Custom Dialogs

Java has several standard dialog boxes that are dened in the classes JOptionPane, JColorChooser, and JFileChooser. These were introduced in Subsection 6.8.2 and Subsection 11.2.3. Dialogs of all these types are used in the Mandelbrot Viewer program. However, sometimes other types of dialog are needed. In such cases, you can build a custom dialog box. Dialog boxes are dened by subclasses of the class JDialog. Like frames, dialog boxes are separate windows on the screen, and the JDialog class is very similar to the JFrame class. The big dierence is that a dialog box has a parent, which is a frame or another dialog box that owns the dialog box. If the parent of a dialog box closes, the dialog box closes automatically. Furthermore, the dialog box will probably oat on top of its parent, even when its parent is the active window.

12.5. FINISHING TOUCHES

669

Dialog boxes can be either modal or modeless. When a modal dialog is put up on the screen, the rest of the application is blocked until the dialog box is dismissed. This is the most common case, and all the standard dialog boxes are modal. Modeless dialog boxes are more like independent windows, since they can stay on the screen while the user interacts with other windows. There are no modeless dialogs in the Mandelbrot Viewer program. The Mandelbrot Viewer program uses two custom dialog boxes. They are used to implement the Set Image Size and Set Limits commands and are dened by the les SetImageSizeDialog.java and SetLimitsDialog.java. The set image size dialog lets the user enter a new width and height for the Mandelbrot image. The set limits dialog lets the user input the minimum and maximum values for x and y that are shown in the image. The two dialog classes are very similar. In both classes, several JTextFields are used for user input. Two buttons named OK and Cancel are added to the window, and listeners are set up for these buttons. If the user clicks OK, the listener checks whether the inputs in the text elds are legal; if not, an error message is displayed to the user and the dialog stays on the screen. If the input is legal when the user clicks OK, the dialog is disposed. The dialog is also disposed if the user clicks Cancel or clicks the dialog boxs close box. The net eect is that the dialog box stays on the screen until the user either cancels the dialog or enters legal values for the inputs and clicks OK. The user can nd out which of these occurred by calling a method named getInput() in the dialog object after showing the dialog. This method returns null if the dialog was canceled; otherwise it returns the user input. To make my custom dialog boxes easy to use, I added a static showDialog() method to each dialog class. When this function is called, it shows the dialog, waits for it to be dismissed, and then returns the value of the getInput() method. This makes it possible to use my custom dialog boxes in much the same way as Javas standard dialog boxes are used. Custom dialog boxes are not dicult to create and to use, if you already know about frames. I will not discuss them further here, but you can look at the source code le SetImageSizeDialog.java as a model.

12.5.6

Preferences

Most serious programs allow the user to set preferences. A preference is really just a piece of the programs state that is saved between runs of the program. In order to make preferences persistent from one run of the program to the next, the preferences could simply be saved to a le in the users home directory. However, there would then be the problem of locating the le. There would be the problem of naming the le in a way that avoids conicts with le names used by other programs. And there would be the problem of cluttering up the users home directory with les that the user shouldnt even have to know about. To deal with these problems, Java has a standard means of handling preferences. It is dened by the package java.util.prefs. In general, the only thing that you need from this package is Preferences. In the Mandelbrot Viewer program, the le Main.java has an example of using Preferences. Main.java runs the stand-alone application version of the program, and its use of preferences applies only when the program is run in that way. In most programs, the user sets preferences in a custom dialog box. However, the Mandelbrot program doesnt have any preferences that are appropriate for that type of treatment. Instead, as an example, I automatically save a few aspects of the programs state as preferences. Every time the program starts up, it reads the preferences, if any are available. Every time the program terminates, it saves the preferences. (Saving the preferences poses an interesting

670

CHAPTER 12. ADVANCED GUI PROGRAMMING

problem because the program ends when the MandelbrotFrame window closes, not when the main() routine ends. In fact, the main() routine ends as soon as the window appears on the screen. So, it wont work to save the preferences at the end of the main program. The solution is to use events: A listener listens for WindowEvents from the frame. When a window-closed event is received, indicating that the program is ending, the listener saves the preferences.) Preferences for Java programs are stored in some platform-dependent form in some platformdependent location. As a Java programmer, you dont have to worry about it; the Java preferences system knows where to store the data. There is still the problem of identifying the preferences for one program among all the possible Java programs that might be running on a computer. Java solves this problem in the same way that it solves the package naming problem. In fact, by convention, the preferences for a program are identied by the package name of the program, with a slight change in notation. For example, the Mandelbrot Viewer program is dened in the package edu.hws.eck.mdb, and its preferences are identied by the string /edu/hws/eck/mdb. (The periods have been changed to /, and an extra / has been added at the beginning.) The preferences for a program are stored in something called a node. The user preferences node for a given program identier can be accessed as follows:
Preferences root = Preferences.userRoot(); Preferences node = root.node(pathName);

where pathname is the string, such as /edu/hws/eck/mdb, that identies the node. The node itself consists of a simple list of key/value pairs, where both the key and the value are strings. You can store any strings you want in preferences nodesthey are really just a way of storing some persistent data between program runs. In general, though, the key string identies some particular preference item, and the associated value string is the value of that preference. A Preferences object, node, contains methods node.get(key) for retrieving the value string associated with a given key and node.put(key,value) for setting the value string for a given key. In Main.java, I use preferences to store the shape and position of the programs window. This makes the size and shape of the window persistent between runs of the program; when you run the program, the window will be right where you left it the last time you ran it. I also store the name of the directory that is currently selected in the le dialog box that is used by the program for the Save and Open commands. This is particularly satisfying, since the default behavior for a le dialog box is to start in the users home directory, which is hardly ever the place where the user wants to keep a programs les. With the preferences feature, I can switch to the right directory the rst time I use the program, and from then on Ill automatically be back in that directory when I use the program. You can look at the source code in Main.java for the details.


And thats it. . . . Theres a lot more that I could say about Java and about programming in general, but this book is only An Introduction to Programming with Java, and its time for our journey to end. I hope that it has been a pleasant journey for you, and I hope that I have helped you establish a foundation that you can use as a basis for further exploration.

Exercises

671

Exercises for Chapter 12


1. The sample program PaintWithOScreenCanvas.java from Section 12.1 is a simple paint program. Make two improvements to this program: First, add a File menu that lets the user open an image le and save the current image in either PNG or JPEG format. Second, add a basic one-level Undo command that lets the user undo the most recent operation that was applied to the image. (Do not try to make a multilevel Undo, which would allow the user to undo several operations.) When you read a le into the program, you should copy the image that you read into the programs o-screen canvas. Since the canvas in the program has a xed size, you should scale the image that you read so that it exactly lls the canvas. 2. For this exercise, you should continue to work on the program from the previous exercise. Add a StrokeWidth menu that allows the user to draw lines of varying thicknesses. Make it possible to use dierent colors for the interior of a lled shape and for the outline of that shape. To do this, change the Color menu to StrokeColor and add a FillColor menu. (My solution adds two new tools, Stroked Filled Rectangle and Stroked Filled Oval, to represent lled shapes that are outlined with the current stroke.) Add support for lling shapes with transparent color. A simple approach to this is to use a JCheckboxMenuItem to select either fully opaque or 50% opaque ll. (Dont try to apply transparency to stokesits dicult to make transparency work correctly for the Curve tool, and in any case, shape outlines look better if they are opaque.) Finally, make the menus more user friendly by adding some keyboard accelerators to some commands and by using JRadioButtonMenuItems where appropriate, such as in the color and tool menus. This exercise takes quite a bit of work to get it all right, so you should tackle the problem in pieces. 3. The StopWatchLabel component from Subsection 12.4.5 displays the text Timing. . . when the stop watch is running. It would be nice if it displayed the elapsed time since the stop watch was started. For that, you need to create a Timer. (See Subsection 6.5.1.) Add a Timer to the original source code, StopWatchLabel.java, to drive the display of the elapsed time in seconds. Create the timer in the mousePressed() routine when the stop watch is started. Stop the timer in the mousePressed() routine when the stop watch is stopped. The elapsed time wont be very accurate anyway, so just show the integral number of seconds. You only need to set the text a few times per second. For my Timer method, I use a delay of 200 milliseconds for the timer. 4. The custom component example MirrorText.java, from Subsection 12.4.5, uses an oscreen canvas to show mirror-reversed text in a JPanel. An alternative approach would be to draw the text after applying a transform to the graphics context that is used for drawing. (See Subsection 12.2.5.) With this approach, the custom component can be dened as a subclass of JLabel in which the paintComponent() method is overridden. Write a version of the MirrorText component that takes this approach. The solution is very short, but tricky. Note that the scale transform g2.scale(-1,1) does a left-right reection through the left edge of the component. 5. The sample program PhoneDirectoryFileDemo.java from Subsection 11.3.2 keeps data for a phone directory in a le in the users home directory. Exercise 11.5 asked you to

672

CHAPTER 12. ADVANCED GUI PROGRAMMING revise that program to use an XML format for the data. Both programs have a simple command-line user interface. For this exercise, you should provide a GUI interface for the phone directory data. You can base your program either on the original sample program or on the modied version from the exercise. Use a JTable to hold the data. The user should be able to edit all the entries in the table. Also, the user should be able to add and delete rows. Include either buttons or menu commands that can be used to perform these actions. The delete command should delete the selected row, if any. New rows should be added at the end of the table. For this program, you can use a standard DefaultTableModel. Your program should load data from the le when it starts and save data to the le when it ends, just as the two previous programs do. For a GUI program, you cant simply save the data at the end of the main() routine, since main() terminates as soon as the window shows up on the screen. You want to save the data when the user closes the window and ends the program. There are several approaches. One is to use a WindowListener to detect the event that occurs when the window closes. Another is to use a Quit command to end the program; when the user quits, you can save the data and close the window (by calling its dispose() method), and end the program. If you use the Quit command approach, you dont want the user to be able to end the program simply by closing the window. To accomplish this, you should call
frame.setDefaultCloseOperation(JFrame.DO NOTHING ON CLOSE);

where frame refers to the JFrame that you have created for the programs user interface. When using a WindowListener, you want the close box on the window to close the window, not end the program. For this, you need
frame.setDefaultCloseOperation(JFrame.DISPOSE ON CLOSE);

When the listener is notied of a window closed event, it can save the data and end the program. Most of the JTable and DefaultTableModel methods that you need for this exercise are discussed in Subsection 12.4.3, but there are a few more that you need to know about. To determine which row is selected in a JTable, call table.getSelectedRow(). This method returns the row number of the selected row, or returns -1 if no row is selected. To specify which cell is currently being edited, you can use:
table.setRowSelectionInterval(rowNum, rowNum); // Selects row number rowNum. table.editCellAt( rowNum, colNum ); // Edit cell at position (rowNum,colNum). phoneTable.getEditorComponent().requestFocus(); // Put input cursor in cell.

One particularly troublesome point is that the data that is in the cell that is currently being edited is not in the table model. The value in the edit cell is not put into the table model until after the editing is nished. This means that even though the user sees the data in the cell, its not really part of the table data yet. If you lose that data, the user would be justied in complaining. To make sure that you get the right data when you save the data at the end of the program, you have to turn o editing before retrieving the data from the model. This can be done with the following method:
private void stopEditing() { if (table.getCellEditor() != null) table.getCellEditor().stopCellEditing(); }

This method must also be called before modifying the table by adding or deleting rows; if such modications are made while editing is in progress, the eect can be very strange.

Quiz

673

Quiz on Chapter 12
1. Describe the object that is created by the following statement, and give an example of how it might be used in a program:
BufferedImage OSC = new BufferedImage(32,32,BufferedImage.TYPE INT RGB);

2. Many programs depend on resource les. What is meant by a resource in this sense? Give an example. 3. What is the FontMetrics class used for? 4. If a Color, c, is created as c = new Color(0,0,255,125), what is eect of drawing with this color? 5. What is antialiasing? 6. How is the ButtonGroup class used? 7. What does the acronym MVC stand for, and how does it apply to the JTable class? 8. Describe the picture that is produced by the following paintComponent() method:
public void paintComponent(Graphics g) { super.paintComponent(g); Graphics2D g2 = (Graphics2D)g; g2.translate( getWidth()/2, getHeight()/2 ); g2.rotate( 30 * Math.PI / 180 ); g2.fillRect(0,0,100,100); }

9. What is meant by Internationalization of a program? 10. Suppose that the class that you are writing has an instance method doOpen() (with no parameters) that is meant to be used to open a le selected by the user. Write a code segment that creates an Action that represents the action of opening a le. Then show how to create a button and a menu item from that action.

You might also like