A tree is an abstract data type that stores elements hierarchically. With the
exception of the top element, each element in a tree has a parent element and zero
or more children elements. A tree is usually visualized by placing elements inside
ovals or rectangles, and by drawing the connections between parents and children
with straight lines. (See Figure 7.2.) We typically call the top element the root of
the tree, but it is drawn as the highest element, with the other elements being
connected below (just the opposite of a botanical tree).
Figure 7.2: A tree with 17 nodes representing the
organization of a fictitious corporation. The root stores
Electronics R'Us. The children of the root store R&D,
Sales, Purchasing, and Manufacturing. The internal
nodes store Sales, International, Overseas, Electronics
R'Us, and Manufacturing
92 trang |
Chia sẻ: tlsuongmuoi | Lượt xem: 2827 | Lượt tải: 0
Bạn đang xem trước 20 trang tài liệu Tree Definitions and Properties, để xem tài liệu hoàn chỉnh bạn click vào nút DOWNLOAD ở trên
methods of the array list ADT on such a representation. Is it better to store the
entries in L by increasing indices or not?
C-6.17
There is a simple, but inefficient, algorithm, called bubble-sort, for sorting a
sequence S of n comparable elements. This algorithm scans the sequence n−1
times, where, in each scan, the algorithm compares the current element with the
next one and swaps them if they are out of order. Give a pseudo-code
description of bubble-sort that is as efficient as possible assuming S is
implemented with a doubly linked list. What is the running time of this
algorithm?
C-6.18
Answer Exercise C-6.17 assuming S is implemented with an array list.
C-6.19
A useful operation in databases is the natural join. If we view a database as a
list of ordered pairs of objects, then the natural join of databases A and B is the
list of all ordered triples (x,y,z) such that the pair (x,y) is in A and the pair (y,z) is
in B. Describe and analyze an efficient algorithm for computing the natural join
of a list A of n pairs and a list B of m pairs.
C-6.20
When Bob wants to send Alice a message M on the Internet, he breaks M into n
data packets, numbers the packets consecutively, and injects them into the
network. When the packets arrive at Alice's computer, they may be out of order,
so Alice must assemble the sequence of n packets in order before she can be
sure she has the entire message. Describe an efficient scheme for Alice to do
this. What is the running time of this algorithm?
C-6.21
Given a list L of n positive integers, each represented with k = logn + 1 bits,
describe an O(n)-time method for finding a k-bit integer not in L.
C-6.22
Argue why any solution to the previous problem must run in Ω(n) time.
C-6.23
Given a list L of n arbitrary integers, design an O(n)-time method for finding an
integer that cannot be formed as the sum of two integers in L.
370
C-6.24
Isabel has an interesting way of summing up the values in an array A of n
integers, where n is a power of two. She creates an array B of half the size of A
and sets B[i] = A[2i] +A[2i+ 1], for i = 0,1,..., (n/2) − 1. If B has size 1, then she
outputs B[0]. Otherwise, she replaces A with B, and repeats the process. What is
the running time of her algorithm?
Projects
P-6.1
Implement the array list ADT by means of an extendable array used in a circular
fashion, so that insertions and deletions at the beginning and end of the array list
run in constant time.
P-6.2
Implement the array list ADT using a doubly linked list. Show experimentally
that this implementation is worse than the array-based approach.
P-6.3
Write a simple text editor, which stores and displays a string of characters using
the list ADT, together with a cursor object that highlights a position in this
string. Your editor should support the following operations:
•
left: Move cursor left one character (do nothing if at text end).
•
right: Move cursor right one character (do nothing if at text end).
•
cut: Delete the character right of the cursor (do nothing at text end).
•
paste c: Insert the character c just after the cursor.
P-6.4
Implement a phased favorites list. A phase consists of N accesses in the list, for
a given parameter N. During a phase, the list should maintain itself so that
elements are ordered by decreasing access counts during that phase. At the end
371
of a phase, it should clear all the access counts and start the next phase.
Experimentally, determine what are the best values of N for various list sizes.
P-6.5
Write a complete adapter class that implements the sequence ADT using a
java.util.ArrayList object.
P-6.6
Implement the favorites list application using an array list instead of a list.
Compare it experimentally to the list-based implementation.
Chapter Notes
The concept of viewing data structures as collections (and other principles of object-
oriented design) can be found in object-oriented design books by Booch [14], Budd
[17], Golberg and Robson [40], and Liskov and Guttag [69]. Lists and iterators are
pervasive concepts in the Java Collections Framework. Our node list ADT is derived
from the "position" abstraction introduced by Aho, Hopcroft, and Ullman [5], and the
list ADT of Wood [100]. Implementations of lists via arrays and linked lists are
discussed by Knuth [62].
372
Chapter 7 Trees
Contents
7.1
General Trees ......................
266
7.1.1
Tree Definitions and Properties............
267
7.1.2
The Tree Abstract Data Type ............
373
270
7.1.3
Implementing a Tree .................
271
7.2
Tree Traversal Algorithms................
273
7.2.1
Depth and Height...................
273
7.2.2
Preorder Traversal...................
276
7.2.3
Postorder Traversal..................
279
7.3
Binary Trees.......................
282
7.3.1
The Binary Tree ADT.................
284
7.3.2
A Binary Tree Interface in Java............
284
374
7.3.3
Properties of Binary Trees ..............
285
7.3.4
A Linked Structure for Binary Trees.........
287
7.3.5
An Array-List Representation of a Binary Tree....
296
7.3.6
Traversals of Binary Trees...............
298
7.3.7
The Template Method Pattern............
305
7.4
Exercises.........................
309
java.datastructures.net
7.1 General Trees
Productivity experts say that breakthroughs come by thinking "nonlinearly." In this
chapter, we discuss one of the most important nonlinear data structures in
computing—trees. Tree structures are indeed a breakthrough in data organization, for
they allow us to implement a host of algorithms much faster than when using linear
data structures, such as list. Trees also provide a natural organization for data, and
consequently have become ubiquitous structures in file systems, graphical user
interfaces, databases, Web sites, and other computer systems.
375
It is not always clear what productivity experts mean by "nonlinear" thinking, but
when we say that trees are "nonlinear," we are referring to an organizational
relationship that is richer than the simple "before" and "after" relationships between
objects in sequences. The relationships in a tree are hierarchical, with some objects
being "above" and some "below" others. Actually, the main terminology for tree data
structures comes from family trees, with the terms "parent," "child," "ancestor," and
"descendent" being the most common words used to describe relationships. We show
an example of a family tree in Figure 7.1.
Figure 7.1: A family tree showing some
descendents of Abraham, as recorded in Genesis,
chapters 25–36.
7.1.1 Tree Definitions and Properties
A tree is an abstract data type that stores elements hierarchically. With the
exception of the top element, each element in a tree has a parent element and zero
or more children elements. A tree is usually visualized by placing elements inside
ovals or rectangles, and by drawing the connections between parents and children
with straight lines. (See Figure 7.2.) We typically call the top element the root of
the tree, but it is drawn as the highest element, with the other elements being
connected below (just the opposite of a botanical tree).
Figure 7.2: A tree with 17 nodes representing the
organization of a fictitious corporation. The root stores
376
Electronics R'Us. The children of the root store R&D,
Sales, Purchasing, and Manufacturing. The internal
nodes store Sales, International, Overseas, Electronics
R'Us, and Manufacturing.
Formal Tree Definition
Formally, we define a tree T as a set of nodes storing elements such that the nodes
have a parent-child relationship, that satisfies the following properties:
• If T is nonempty, it has a special node, called the root of T, that has no
parent.
• Each node v of T different from the root has a unique parent node w;
every node with parent w is a child of w.
Note that according to our definition, a tree can be empty, meaning that it doesn't
have any nodes. This convention also allows us to define a tree recursively, such
that a tree T is either empty or consists of a node r, called the root of T, and a
(possibly empty) set of trees whose roots are the children of r.
Other Node Relationships
377
Two nodes that are children of the same parent are siblings. A node v is external
if v has no children. A node v is internal if it has one or more children. External
nodes are also known as leaves.
Example 7.1: In most operating systems, files are organized hierarchically
into nested directories (also called folders), which are presented to the user in the
form of a tree. (See Figure 7.3.) More specifically, the internal nodes of the tree
are associated with directories and the external nodes are associated with regular
files. In the UNIX and Linux operating systems, the root of the tree is
appropriately called the "root directory," and is represented by the symbol "/."
Figure 7.3: Tree representing a portion of a file
system.
A node u is an ancestor of a node v if u = v or u is an ancestor of the parent of v.
Conversely, we say that a node v is a descendent of a node u if u is an ancestor of
v. For example, in Figure 7.3, cs252/ is an ancestor of papers/, and pr3 is a
descendent of cs016/. The subtree of T rooted at a node v is the tree consisting
of all the descendents of v in T (including v itself). In Figure 7.3, the subtree
rooted at cs016/ consists of the nodes cs016/, grades, homeworks/,
programs/, hw1, hw2, hw3, pr1, pr2, and pr3.
378
Edges and Paths in Trees
An edge of tree T is a pair of nodes (u, v) such that u is the parent of v, or vice
versa. A path of T is a sequence of nodes such that any two consecutive nodes in
the sequence form an edge. For example, the tree in Figure 7.3 contains the path
(cs252/, projects/, demos/, market).
Example 7.2: The inheritance relation between classes in a Java program
forms a tree. The root, java. lang. Object, is an ancestor of all other
classes. Each class, C, is a descendent of this root and is the root of a subtree of
the classes that extend C. Thus, there is a path from C to the root,
java.lang.Object, in this inheritance tree.
Ordered Trees
A tree is ordered if there is a linear ordering defined for the children of each node;
that is, we can identify the children of a node as being the first, second, third, and
so on. Such an ordering is usually visualized by arranging siblings left to right,
according to their ordering. Ordered trees typically indicate the linear order
among siblings by listing them in the correct order.
Example 7.3: The components of a structured document, such as a book, are
hierarchically organized as a tree whose internal nodes are parts, chapters, and
sections, and whose external nodes are paragraphs, tables, figures, and so on.
(See Figure 7.4.) The root of the tree corresponds to the book itself. We could, in
fact, consider expanding the tree further to show paragraphs consisting of
sentences, sentences consisting of words, and words consisting of characters.
Such a tree is an example of an ordered tree, because there is a well-defined
ordering among the children of each node.
Figure 7.4: An ordered tree associated with a book.
379
7.1.2 The Tree Abstract Data Type
The tree ADT stores elements at positions, which, as with positions in a list, are
defined relative to neighboring positions. The positions in a tree are its nodes, and
neighboring positions satisfy the parent-child relationships that define a valid tree.
Therefore, we use the terms "position" and "node" interchangeably for trees. As
with a list position, a position object for a tree supports the method:
element(): return the object stored at this position.
The real power of node positions in a tree, however, comes from the accessor
methods of the tree ADT that return and accept positions, such as the following:
root():
return the tree's root; an error occurs if the tree is empty.
parent (v):
return the parent of v; an error occurs if v is the root.
children(v):
return an iterable collection containing the children of node v.
If a tree T is ordered, then the iterable collection, children(v), stores the children of
v in order. If v is an external node, then children(v) is empty.
In addition to the above fundamental accessor methods, we also include the
following query methods:
isInternal(v):
Test whether node v is internal.
isExternal(v):
Test whether node v is external.
isRoot(v):
Test whether node v is the root.
These methods make programming with trees easier and more readable, since we
can use them in the conditionals of if statements and while loops, rather than
using a nonintuitive conditional.
380
There are also a number of generic methods a tree should probably support that are
not necessarily related to its tree structure, including the following:
size():
return the number of nodes in the tree.
isEmpty():
Test whether the tree has any nodes or not.
iterator():
return an iterator of all the elements stored at nodes of the tree.
positions():
return an iterable collection of all the nodes of the tree.
replace(v,e):
Replace with e and return the element stored at node v.
Any method that takes a position as an argument should generate an error condition
if that position is invalid. We do not define any specialized update methods for trees
here. Instead, we prefer to describe different tree update methods in conjunction
with specific applications of trees in subsequent chapters. In fact, we can imagine
several kinds of tree update operations beyond those given in this book.
7.1.3 Implementing a Tree
The Java interface shown in Code Fragment 7.1 represents the tree ADT. Error
conditions are handled as follows: Each method that can take a position as an
argument, may throw an InvalidPositionException, to indicate that the
position is invalid. Method parent throws a BoundaryViolationException
if it is called on the root. Method root throws an EmptyTreeException if it is
called on an empty tree.
Code Fragment 7.1: Java interface Tree
representing the tree ADT. Additional update methods
may be added, depending on the application. We do
not include such methods in the interface, however.
381
A Linked Structure for General Trees
A natural way to realize a tree T is to use a linked structure, where we represent
each node v of T by a position object (see Figure 7.5a) with the following fields:
A reference to the element stored at v, a link to the parent of v, and a some kind of
collection (for example, a list or array) to store links to the children of v. If v is the
root of T, then the parent field of v is null. Also, we store a reference to the root of
T and the number of nodes of T in internal variables. This structure is
schematically illustrated in Figure 7.5b.
382
Figure 7.5: The linked structure for a general tree:
(a) the position object associated with a node; (b) the
portion of the data structure associated with a node
and its children.
Table 7.1 summarizes the performance of the implementation of a general tree
using a linked structure. The analysis is left as an exercise (C-7.25), but we note
that, by using a collection to store the children of each node v, we can implement
children(v) simply by returning a reference to this collection.
Table 7.1: Running times of the methods of an n-
node general tree implemented with a linked
structure. We let cv denote the number of children of a
node v. The space usage is O(n).
Operation
Time
size, isEmpty
O(1)
iterator, positions
O(n)
383
replace
O(1)
root, parent
O(1)
children(v)
O(cv)
isInternal, isExternal, isRoot
O(1)
7.2 Tree Traversal Algorithms
In this section, we present algorithms for performing traversal computations on a tree
by accessing it through the tree ADT methods.
7.2.1 Depth and Height
Let v be a node of a tree T. The depth of v is the number of ancestors of v,
excluding v itself. For example, in the tree of Figure 7.2, the node storing
International has depth 2. Note that this definition implies that the depth of the root
of T is 0.
The depth of a node v can also be recursively defined as follows:
• If v is the root, then the depth of v is 0
• Otherwise, the depth of v is one plus the depth of the parent of v.
Based on this definition, we present a simple, recursive algorithm, depth, in Code
Fragment 7.2, for computing the depth of a node v in T. This method calls itself
recursively on the parent of v, and adds 1 to the value returned. A simple Java
implementation of this algorithm is shown in Code Fragment 7.3.
Code Fragment 7.2: Algorithm for computing the
depth of a node v in a tree T.
384
Code Fragment 7.3: Method depth written in Java.
The running time of algorithm depth(T, v) is O(dv), where dv denotes the depth of
the node v in the tree T, because the algorithm performs a constant-time recursive
step for each ancestor of v. Thus, algorithm depth (T, v) runs in O(n) worst-case
time, where n is the total number of nodes of T, since a node of T may have depth n
− 1 in the worst case. Although such a running time is a function of the input size, it
is more accurate to characterize the running time in terms of the parameter dv, since
this parameter can be much smaller than n.
Height
The height of a node v in a tree T is also defined recursively:
• If v is an external node, then the height of v is 0
• Otherwise, the height of v is one plus the maximum height of a child of v.
The height of a nonempty tree T is the height of the root of T. For example, the
tree of Figure 7.2 has height 4. In addition, height can also be viewed as follows.
Proposition 7.4: The height of a nonempty tree T is equal to the maximum
depth of an external node of T.
We leave the justification of this fact to an exercise (R-7.6). We present here an
algorithm, height1, shown in Code Fragment 7.4 and implemented in Java in
Code Fragment 7.5, for computing the height of a nonempty tree T based on the
proposition above and the algorithm depth from Code Fragment 7.2.
385
Code Fragment 7.4: Algorithm height1 for
computing the height of a nonempty tree T. Note that
this algorithm calls algorithm depth (Code Fragment
7.2).
Code Fragment 7.5: Method height1 written in
Java. Note the use of the max method of class
java.lang. Math.
Unfortunately, algorithm height1 is not very efficient. Since height1 calls
algorithm depth (v) on each external node v of T, the running time of height1 is
given by O(n + Σv(1 + dv)), where n is the number of nodes of T, dv is the depth
of node v, and E is the set of external nodes of T. In the worst case, the sumΣv(1
d
+
v) is proportional to n2. (See Exercise C-7.6.) Thus, algorithm height1 runs in
O(n2) time.
Algorithm height2, shown in Code Fragment 7.6 and implemented in Java in
Code Fragment 7.7, computes the height of tree T in a more efficient manner by
using the recursive definition of height.
386
Code Fragment 7.6: Algorithm height2 for
computing the height of the subtree of tree T rooted
at a node v.
Code Fragment 7.7: Method height2 written in
Java.
Algorithm height2 is more efficient than height1 (from Code Fragment 7.4).
The algorithm is recursive, and, if it is initially called on the root of T, it will
eventually be called on each node of T. Thus, we can determine the running time
of this method by summing, over all the nodes, the amount of time spent at each
node (on the nonrecursive part). Processing each node in children(v) takes O(cv)
time, where cv denotes the number of children of node v. Also, the while loop
has cv iterations and each iteration of the loop takes O(1) time plus the time for
the recursive call on a child of v. Thus, algorithm height2 spends O(1 + cv)
time at each node v, and its running time is O(Σv(1 + cv)). In order to complete the
analysis, we make use of the following property.
Proposition 7.5: Let T be a tree with n nodes, and let cv denote the
number of children of a node v of T. Then, summing over the vertices in T, Σvcv=
n − 1.
387
Justification: Each node of T, with the exception of the root, is a child of
another node, and thus contributes one unit to the above sum.
By Proposition 7.5, the running time of algorithm height2, when called on the
root of T, is O(n), where n is the number of nodes of T.
7.2.2 Preorder Traversal
A traversal of a tree T is a systematic way of accessing, or "visiting," all the nodes
of T. In this section, we present a basic traversal scheme for trees, called preorder
traversal. In the next section, we will study another basic traversal scheme, called
postorder traversal.
In a preorder traversal of a tree T, the root of T is visited first and then the subtrees
rooted at its children are traversed recursively. If the tree is ordered, then the
subtrees are traversed according to the order of the children. The specific action
associated with the "visit" of a node v depends on the application of this traversal,
and could involve anything from incrementing a counter to performing some
complex computation for v. The pseudo-code for the preorder traversal of the
subtree rooted at a node v is shown in Code Fragment 7.8. We initially call this
algorithm with preorder(T,T.root()).
Code Fragment 7.8: Algorithm preorder for
performing the preorder traversal of the subtree of a
tree T rooted at a node v.
The preorder traversal algorithm is useful for producing a linear ordering of the
nodes of a tree where parents must always come before their children in the
ordering. Such orderings have several different applications. We explore a simple
instance of such an application in the next example.
Figure 7.6: Preorder traversal of an ordered tree,
where the children of each node are ordered from left
to right.
388
Example 7.6: The preorder traversal of the tree associated with a document, as
in Example 7.3, examines an entire document sequentially, from beginning to end.
If the external nodes are removed before the traversal, then the traversal examines
the table of contents of the document. (See Figure 7.6.)
The preorder traversal is also an efficient way to access all the nodes of a tree. To
justify this, let us consider the running time of the preorder traversal of a tree T with
n nodes under the assumption that visiting a node takes O(1) time. The analysis of
the preorder traversal algorithm is actually similar to that of algorithm height2
(Code Fragment 7.7), given in Section 7.2.1. At each node v, the nonrecursive part
of the preorder traversal algorithm requires time O(1 + cv), where cv is the number
of children of v. Thus, by Proposition 7.5, the overall running time of the preorder
traversal of T is O(n).
Algorithm toStringPreorder(T, v), implemented in Java in Code
Fragment 7.9, performs a preorder printing of the subtree of a node v of T, that is, it
performs the preorder traversal of the subtree rooted at v and prints the element
stored at a node when the node is visited. Recall that, for an ordered tree T, method
T.children(v) returns an iterable collection that accesses the children of v in
order.
Code Fragment 7.9: Method toStringPreorder(T, v)
that performs a preorder printing of the elements in the
subtree of node v of T.
389
There is an interesting application of the preorder traversal algorithm that produces
a string representation of an entire tree. Let us assume again that for each element e
stored in tree T, calling e.toString() returns a string associated with e. The
parenthetic string representation P(T) of tree T is recursively defined as follows. If
T consists of a single node v, then
P(T) = v.element().toString().
Otherwise,
P(T) = v.element().toString() + " (" + P(T1) + "," + ··· + ", "+ P(Tk) +")",
where v is the root of T and T1, T2,..., Tk are the subtrees rooted at the children of v,
which are given in order if T is an ordered tree.
Note that the above definition of P(T) is recursive. Also, we are using "+" here to
denote string concatenation. The parenthetic representation of the tree of Figure 7.2
is shown in Figure 7.7.
Figure 7.7: Parenthetic representation of the tree of
Figure 7.2. Indentation, line breaks and spaces have
been added for clarity.
Note that, technically speaking, there are some computations that occur between
and after the recursive calls at a node's children in the above algorithm. We still
consider this algorithm to be a preorder traversal, however, since the primary action
of printing a node's contents occurs prior to the recursive calls.
The Java method parentheticRepresentation, shown in Code Fragment
7.10, is a variation of method toStringPreorder (Code Fragment 7.9). It
implements the definition given above to output a parenthetic string representation
of a tree T. As with the method toStringPreorder, the method
parentheticRepresentation makes use of the toString method that
is defined for every Java object. In fact, we can view this method as a kind of
toString() method for tree objects.
390
Code Fragment 7.10: Algorithm
parentheticRepresentation. Note the use of the +
operator to concatenate two strings.
We explore a modification to Code Fragment 7.10 in Exercise R-7.9, to display a
tree in a fashion more closely matching that given in Figure 7.7.
7.2.3 Postorder Traversal
Another important tree traversal algorithm is the postorder traversal. This
algorithm can be viewed as the opposite of the preorder traversal, because it
recursively traverses the subtrees rooted at the children of the root first, and then
visits the root. It is similar to the preorder traversal, however, in that we use it to
solve a particular problem by specializing an action associated with the "visit" of a
node v. Still, as with the preorder traversal, if the tree is ordered, we make recursive
calls for the children of a node v according to their specified order. Pseudo-code for
the postorder traversal is given in Code Fragment 7.11.
Code Fragment 7.11: Algorithm postorder for
performing the postorder traversal of the subtree of a
tree T rooted at a node v.
391
The name of the postorder traversal comes from the fact that this traversal method
will visit a node v after it has visited all the other nodes in the subtree rooted at v.
(See Figure 7.8.)
Figure 7.8: Postorder traversal of the ordered tree
of Figure 7.6.
The analysis of the running time of a postorder traversal is analogous to that of a
preorder traversal. (See Section 7.2.2.) The total time spent in the nonrecursive
portions of the algorithm is proportional to the time spent visiting the children of
each node in the tree. Thus, a postorder traversal of a tree T with n nodes takes O(n)
time, assuming that visiting each node takes O(1) time. That is, the postorder
traversal runs in linear time.
As an example of postorder traversal, we show a Java method
toStringPostorder in Code Fragment 7.12, which performs a postorder
traversal of a tree T. This method prints the element stored at a node when it is
visited.
Code Fragment 7.12: Method
toStringPostorder(T, v) that performs a
postorder printing of the elements in the subtree of
node v of T. The method implicitly calls toString on
elements, when they are involved in a string
concatenation operation.
392
The postorder traversal method is useful for solving problems where we wish to
compute some property for each node v in a tree, but computing that property for v
requires that we have already computed that same property for v's children. Such an
application is illustrated in the following example.
Example 7.7: Consider a file system tree T, where external nodes represent
files and internal nodes represent directories (Example 7.1). Suppose we want to
compute the disk space used by a directory, which is recursively given by the sum
of:
• The size of the directory itself
• The sizes of the files in the directory
• The space used by the children directories.
(See Figure 7.9.) This computation can be done with apostorder traversal of tree T.
After the subtrees of an internal node v have been traversed, we compute the space
used by v by adding the sizes of the directory v itself and of the files contained in v
to the space used by each internal child of v, which was computed by the recursive
postorder traversals of the children of v.
A Recursive Java Method for Computing Disk Space
Motivated by Example 7.7, Java method diskSpace, shown in Code Fragment 7.13,
performs a postorder traversal of a file-system tree T, printing the name and disk
space used by the directory associated with each internal node of T. When called on
the root of tree T, diskSpace runs in time O(n), where n is the number of nodes of T,
provided the auxiliary methods name and size take O(1) time.
Figure 7.9: The tree of Figure 7.3 representing a file
system, showing the name and size of the associated
file/directory inside each node, and the disk space used
by the associated directory above each internal node.
393
Code Fragment 7.13: Method diskSpace prints
the name and disk space used by the directory
associated with each internal node of a file-system tree.
This method calls the auxiliary methods name and size,
which should be defined to return the name and size of
the file/directory associated with a node.
Other Kinds of Traversals
394
Although the preorder and postorder traversals are common ways of visiting the
nodes of a tree, we can also imagine other traversals. For example, we could
traverse a tree so that we visit all the nodes at depth d before we visit the nodes at
depth d + 1. Consecutively numbering the nodes of a tree T as we visit them in
this traversal is called the level numbering of the nodes of T (see Section 7.3.5).
7.3 Binary Trees
A binary tree is an ordered tree with the following properties:
1. Every node has at most two children.
2. Each child node is labeled as being either a left child or a right child.
3. A left child precedes a right child in the ordering of children of a node.
The subtree rooted at a left or right child of an internal node v is called a left subtree
or right subtree, respectively, of v. A binary tree is proper if each node has either
zero or two children. Some people also refer to such trees as being full binary trees.
Thus, in a proper binary tree, every internal node has exactly two children. A binary
tree that is not proper is improper.
Example 7.8: An important class of binary trees arises in contexts where we wish
to represent a number of different outcomes that can result from answering a series of
yes-or-no questions. Each internal node is associated with a question. Starting at the
root, we go to the left or right child of the current node, depending on whether the
answer to the question is "Yes" or "No." With each decision, we follow an edge from
a parent to a child, eventually tracing a path in the tree from the root to an external
node. Such binary trees are known as decision trees, because each external node v in
such a tree represents a decision of what to do if the questions associated with v's
ancestors are answered in a way that leads to v. A decision tree is a proper binary
tree. Figure 7.10 illustrates a decision tree that provides recommendations to a
prospective investor.
Figure 7.10: A decision tree providing investment
advice.
395
Example 7.9: An arithmetic expression can be represented by a binary tree
whose external nodes are associated with variables or constants, and whose internal
nodes are associated with one of the operators +, −, ×, and /. (See Figure 7.11.)
Each node in such a tree has a value associated with it.
• If a node is external, then its value is that of its variable or constant.
• If a node is internal, then its value is defined by applying its operation to the
values of its children.
An arithmetic expression tree is a proper binary tree, since each operator +, −, ×, and /
takes exactly two operands. Of course, if we were to allow for unary operators, like
negation (−), as in "−x," then we could have an improper binary tree.
Figure 7.11: A binary tree representing an arithmetic
expression. This tree represents the expression ((((3 + 1)
× 3)/((9 −5) +2)) − ((3 × (7 −4)) + 6)). The value
associated with the internal node labeled "/" is 2.
396
A Recursive Binary Tree Definition
Incidentally, we can also define a binary tree in a recursive way such that a binary
tree is either empty or consists of:
• A node r, called the root of T and storing an element
• A binary tree, called the left subtree of T
• A binary tree, called the right subtree of T.
We discuss some of the specialized topics for binary trees below.
7.3.1 The Binary Tree ADT
As an abstract data type, a binary tree is a specialization of a tree that supports three
additional accessor methods:
left(v):
Return the left child of v; an error condition occurs if v has no left child.
right(v):
Return the right child of v; an error condition occurs if v has no right child.
hasLeft(v):
Test whether v has a left child.
397
hasRight(v):
Test whether v has a right child.
Just as in Section 7.1.2 for the tree ADT, we do not define specialized update
methods for binary trees here. Instead, we will consider some possible update
methods when we describe specific implementations and applications of binary
trees.
7.3.2 A Binary Tree Interface in Java
We model a binary tree as an abstract data type that extends the tree ADT and adds
the three specialized methods for a binary tree. In Code Fragment 7.14, we show the
simple Java interface we can define using this approach. By the way, since binary
trees are ordered trees, the iterable collection returned by method children(v)
(inherited from the Tree interface) stores the left child of v before the right child of
v.
Code Fragment 7.14: Java interface Binary Tree
for the binary tree ADT. Interface Binary Tree
extends interface Tree (Code Fragment 7.1).
7.3.3 Properties of Binary Trees
Binary trees have several interesting properties dealing with relationships between
their heights and number of nodes. We denote the set of all nodes of a tree T at the
398
same depth d as the level dof T. In a binary tree, level 0 has at most one node (the
root), level 1 has at most two nodes (the children of the root), level 2 has at most
four nodes, and so on. (See Figure 7.12.) In general, level d has at most 2d nodes.
Figure 7.12: Maximum number of nodes in the levels
of a binary tree.
We can see that the maximum number of nodes on the levels of a binary tree grows
exponentially as we go down the tree. From this simple observation, we can derive
the following properties relating the height of a binary T with its number of nodes.
A detailed justification of these properties is left as an exercise (R-7.15).
Proposition 7.10: Let T be a nonempty binary tree, and let n, nE, nI and h
denote the number of nodes, number of external nodes, number of internal nodes,
and height of T, respectively. Then T has the following properties:
1. h+1 ≤ n ≤ 2 h+1 −1
2. 1≤nE≤2h
3. h≤nI≤2h−1
4. log(n+1)−1 ≤h≤n−1.
Also, if T is proper, then T has the following properties:
399
1. 2h+1 ≤ n≤2h+1−1
2. h+1≤nE≤2h
3. h≤nI≤2h−1
4. log(n + 1) − 1 ≤ h ≤ (n − 1)/2.
Relating Internal Nodes to External Nodes in a Proper
Binary Tree
In addition to the binary tree properties above, we also have the following
relationship between the number of internal nodes and external nodes in a proper
binary tree.
Proposition 7.11: In a nonempty proper binary tree T, with nE external
nodes and nI internal nodes, we have ne = nI + 1.
Justification: We justify this proposition by removing nodes from T and
dividing them up into two "piles", an internal-node pile and an external-node pile,
until T becomes empty. The piles are initially empty. At the end, the external-
node pile will have one more node than the internal-node pile. We consider two
cases:
Case 1: If T has only one node v, we remove v and place it on the external-node
pile. Thus, the external-node pile has one node and the internal-node pile is
empty.
Case 2: Otherwise (T has more than one node), we remove from T an (arbitrary)
external node w and its parent v, which is an internal node. We place w on the
external-node pile and v on the internal-node pile. If v has a parent u, then we
reconnect u with the former sibling z of w, as shown in Figure 7.13. This
operation, removes one internal node and one external node, and leaves the tree
being a proper binary tree.
Repeating this operation, we eventually are left with a final tree consisting of a
single node. Note that the same number of external and internal nodes have been
removed and placed on their respective piles by the sequence of operations
leading to this final tree. Now, we remove the node of the final tree and we place
it on the external-node pile. Thus, the the external-node pile has one more node
than the internal-node pile.
400
Figure 7.13: Operation that removes an external
node and its parent node, used in the justification of
Proposition 7.11.
Note that the above relationship does not hold, in general, for improper binary
trees and nonbinary trees, although there are other interesting relationships that
can hold, as we explore in an exercise (C-7.7).
7.3.4 A Linked Structure for Binary
Trees
As with a general tree, a natural way to realize a binary tree T is to use a linked
structure, where we represent each node v of T by a position object (see Figure
7.14a) with fields providing references to the element stored at v and to the position
objects associated with the children and parent of v. If v is the root of T, then the
parent field of v is null. If v has no left child, then the left field of v is null. If v has
no right child, then the right field of v is null. Also, we store the number of nodes of
T in a variable, called size. We show the linked structure representation of a binary
tree in Figure 7.14b.
Figure 7.14: A node (a) and a linked structure (b) for
representing a binary tree.
401
Java Implementation of a Binary Tree Node
402
We use a Java interface BTPosition (not shown) to represent a node of a
binary tree. This interfaces extends Position, thus inheriting method element,
and has additional methods for setting the element stored at the node
(setElement) and for setting and returning the left child (setLeft and
getLeft), right child (setRight and getRight), and parent (setParent
and getParent) of the node. Class BTNode (Code Fragment 7.15) implements
interface BTPosition by an object with fields element, left, right, and parent,
which, for a node v, reference the element at v, the left child of v, the right child of
v, and the parent of v, respectively.
Code Fragment 7.15: Auxiliary class BTNode for
implementing binary tree nodes.
403
Java Implementation of the Linked Binary Tree Structure
In Code Fragments 7.16–7.18, we show portions of class Linked Binary
Tree that implements the Binary Tree interface (Code Fragment 7.14)
using a linked data structure. This class stores the size of the tree and a reference
to the BTNode object associated with the root of the tree in internal variables. In
404
addition to the Binary Tree interface methods, LinkedBinaryTree has
various other methods, including accessor method sibling(v), which returns
the sibling of a node v, and the following update methods:
addRoot(e):
Create and return a new node r storing element e and make r the root of
the tree; an error occurs if the tree is not empty.
insertLeft(v,e):
Create and return a new node w storing element e, add w as the the left
child of v and return w; an error occurs if v already has a left child.
insertRight(v,e):
Create and return a new node z storing element e, add z as the the right
child of v and return z; an error occurs if v already has a right child.
remove(v):
Remove node v, replace it with its child, if any, and return the element
stored at v; an error occurs if v has two children.
attach(v,T1,T2):
Attach T1 and T2, respectively, as the left and right subtrees of the
external node v; an error condition occurs if v is not external.
Class LinkedBinaryTree has a constructor with no arguments that returns an
empty binary tree. Starting from this empty tree, we can build any binary tree by
creating the first node with method addRoot and repeatedly applying the
insertLeft and insertRight methods and/or the attach method. Likewise,
we can dismantle any binary tree T using the remove operation, ultimately
reducing such a tree T to an empty binary tree.
When a position v is passed as an argument to one of the methods of class
LinkedBinaryTree, its validity is checked by calling an auxiliary helper
method, checkPosition(v). A list of the nodes visited in a preorder traversal
of the tree is constructed by recursive method preorderPositions. Error
conditions are indicated by throwing exceptions Invalid Position
Exception, BoundaryViolation Exception,
EmptyTreeException, and NonEmptyTreeException.
Code Fragment 7.16: Portions of the Linked
Binary Tree class, which implements the Binary
Tree interface. (Continues in Code Fragment 7.17.)
405
406
Code Fragment 7.17: Portions of the Linked
Binary Tree class, which implements the Binary
Tree interface. (Continues in Code Fragment 7.18.)
407
408
Code Fragment 7.18: Portions of the Linked
Binary Tree class, which implements the Binary
Tree interface. (Continues in Code Fragment 7.19.)
409
Code Fragment 7.19: Portions of the
LinkedBinaryTree class, which implements the
410
Binary Tree interface. (Continues in Code Fragment
7.20.)
Code Fragment 7.20: Portions of the Linked
Binary Tree class, which implements the Binary
Tree interface. (Continued from Code Fragment 7.19.)
411
Performance of the Linked Binary Tree Implementation
Let us now analyze the running times of the methods of class Linked Binary
Tree, which uses a linked structure representation:
412
• Methods size() and isEmpty() use an instance variable storing the
number of nodes of T, and each take O(1) time.
• The accessor methods root, left, right, sibling and parent take O(1) time.
• Method replace(v,e) takes O(1) time.
• Methods iterator() and positions() are implemented by performing a pre-
order traversal of the tree (using the auxiliary method preorderPositions). The
nodes visited by the traversal are stored in a position list implemented by class
NodePositionList (Section 6.2.4) and the output iterator is generated with
method iterator() of class NodePositionList. Methods
iterator() and positions() take O(n) time and methods hasNext() and
next() of the returned iterators run in O(1) time.
• Method children uses a similar approach to construct the returned iterable
collection, but it runs in O(1) time, since there are at most two children for any
node in a binary tree.
• The update methods insertLeft, insertRight, attach, and
remove all run in O(1) time, as they involve constant-time manipulation of a
constant number of nodes.
Considering the space required by this data structure for a tree with n nodes, note
that there is an object of class BTNode (Code Fragment 7.15) for every node of
tree T. Thus, the overall space requirement is O(n). Table 7.2 summarizes the
performance of the linked structure implementation of a binary tree.
Table 7.2: Running times for the methods of an n-
node binary tree implemented with a linked structure.
Methods hasNext() and next() of the iterators returned
by iterator(), positions().iterator(), and
children(v).iterator() run in O(1) time. The
space usage is O(n).
Operation
Time
size, isEmpty
O(1)
iterator, positions
413
O(n)
replace
O(1)
root, parent, children, left, right, sibling
O(1)
hasLeft, hasRight, isInternal, isExternal,
isRoot
O(1)
insertLeft, insertRight, attach, remove
O(1)
7.3.5 An Array-List Representation of a
Binary Tree
An alternative representation of a binary tree T is based on a way of numbering the
nodes of T. For every node v of T, let p(v) be the integer defined as follows.
• If v is the root of T, then p(v) = 1.
• If v is the left child of node u, then p(v) = 2p(u).
• If v is the right child of node u, then p(v) = 2p(u) + 1.
The numbering function p is known as a level numbering of the nodes in a binary
tree T, for it numbers the nodes on each level of T in increasing order from left to
right, although it may skip some numbers. (See Figure 7.15.)
Figure 7.15: Binary tree level numbering: (a) general
scheme; (b) an example.
414
The level numbering function p suggests a representation of a binary tree T by
means of an array list S such that node v of T is the element of S at index p(v). As
mentioned in the previous chapter, we realize the array list S by means of an
extendable array. (See Section 6.1.4.) Such an implementation is simple and
efficient, for we can use it to easily perform the methods root, parent,
left, right, hasLeft, hasRight, isInternal, isExternal,
and isRoot by using simple arithmetic operations on the numbers p(v) associated
with each node v involved in the operation. We leave the details of this
implementation as an exercise (R-7.26).
We show an example array-list representation of a binary tree in Figure 7.16.
Figure 7.16: Representation of a binary tree T by
means of an array list S.
415
Let n be the number of nodes of T, and let pM be the maximum value of p(v) over
all the nodes of T. The array list S has size N = pM + 1 since the element of S at
index 0 is not associated with any node of T. Also, S will have, in general, a number
of empty elements that do not refer to existing nodes of T. In fact, in the worst case,
N = 2n, the justification of which is left as an exercise (R-7.23). In Section 8.3, we
will see a class of binary trees, called "heaps" for which N = n + 1. Thus, in spite of
the worst-case space usage, there are applications for which the array-list
representation of a binary tree is space efficient. Still, for general binary trees, the
exponential worst-case space requirement of this representation is prohibitive.
Table 7.3 summarizes running times of the methods of a binary tree implemented
with an array list. We do not include any tree update methods here.
Table 7.3: Running times for a binary tree T
implemented with an array list S. We denote the
number of nodes of T with n, and N denotes the size of
S. The space usage is O(N), which is O(2n) in the worst
case.
Operation
416
Time
size, isEmpty
O(1)
iterator, positions
O(n)
replace
O(1)
root, parent, children, left, right
O(1)
hasLeft, hasRight, isInternal, isExternal, isRoot
O(1)
7.3.6 Traversals of Binary Trees
As with general trees, binary tree computations often involve traversals.
Building an Expression Tree
Consider the problem of constructing an expression tree from a fully
parenthesized arithmetic expression of size n. (Recall Example 7.9 and Code
Fragment 7.24.) In Code Fragment 7.21, we give algorithm buildExpression
for building such an expression tree, assuming all arithmetic operations are binary
and variables are not parenthesized. Thus, every parenthesized subexpression
contains an operator in the middle. The algorithm uses a stack S while scanning
the input expression E looking for variables, operators, and right parentheses.
• When we see a variable or operator x, we create a single-node binary tree
T, whose root stores x and we push T on the stack.
• When we see a right parenthesis, ")", we pop the top three trees from the
stack S, which represent a subexpression (E1 o E2). We then attach the trees for
E1 and E2 to the one for o, and push the resulting tree back on S.
We repeat this until the expression E has been processed, at which time the top
element on the stack is the expression tree for E. The total running time is O(n).
417
Code Fragment 7.21: Algorithm
buildExpression.
Preorder Traversal of a Binary Tree
Since any binary tree can also be viewed as a general tree, the preorder traversal
for general trees (Code Fragment 7.8) can be applied to any binary tree. We can
simplify the algorithm in the case of a binary tree traversal, however, as we show
in Code Fragment 7.22.
Code Fragment 7.22: Algorithm binaryPreorder
for performing the preorder traversal of the subtree of
a binary tree T rooted at a node v.
418
As is the case for general trees, there are many applications of the preorder
traversal for binary trees.
Postorder Traversal of a Binary Tree
Analogously, the postorder traversal for general trees (Code Fragment 7.11) can
be specialized for binary trees, as shown in Code Fragment 7.23.
Code Fragment 7.
Các file đính kèm theo tài liệu này:
- Tree Definitions and Properties.pdf