Matching Parentheses and HTML Tags
In this subsection, we explore two related applications of stacks, the first of which
is for matching parentheses and grouping symbols in arithmetic expressions.
Arithmetic expressions can contain various pairs of grouping symbols, such as
ã Parentheses: "(" and ")"
ã Braces: "{" and "}"
ã Brackets: "[" and "]"
ã Floor function symbols: " " and ""
ã Ceiling function symbols: "" and ","
and each opening symbol must match with its corresponding closing symbol. For
example, a left bracket, "[," must match with a corresponding right bracket, "]," as
in the following expression:
[(5 + x) − (y + z)]
92 trang |
Chia sẻ: tlsuongmuoi | Lượt xem: 5135 | Lượt tải: 0
Bạn đang xem trước 20 trang tài liệu Data structures algorithms in java, để xem tài liệu hoàn chỉnh bạn click vào nút DOWNLOAD ở trên
and then fill the array back up again by popping the elements off of the stack. In
Code Fragment 5.8, we give a Java implementation of this algorithm. Incidentally,
this method also illustrates how we can use generic types in a simple application
that uses a generic stack. In particular, when the elements are popped off the stack
in this example, they are automatically returned as elements of the E type; hence,
they can be immediately returned to the input array. We show an example use of
this method in Code Fragment 5.9.
Code Fragment 5.8: A generic method that
reverses the elements in an array of type E objects,
using a stack declared using the Stack interface.
Code Fragment 5.9: A test of the reverse
method using two arrays.
278
5.1.5 Matching Parentheses and HTML Tags
In this subsection, we explore two related applications of stacks, the first of which
is for matching parentheses and grouping symbols in arithmetic expressions.
Arithmetic expressions can contain various pairs of grouping symbols, such as
• Parentheses: "(" and ")"
• Braces: "{" and "}"
• Brackets: "[" and "]"
• Floor function symbols: " " and ""
• Ceiling function symbols: "" and ","
and each opening symbol must match with its corresponding closing symbol. For
example, a left bracket, "[," must match with a corresponding right bracket, "]," as
in the following expression:
[(5 + x) − (y + z)].
279
The following examples further illustrate this concept:
• Correct: ( )(( )){([( )])}
• Correct: ((( )(( )){([( )])}))
• Incorrect: )(( )){([( )])}
• Incorrect: ({[])}
• Incorrect: (.
We leave the precise definition of matching of grouping symbols to Exercise R-5.5.
An Algorithm for Parentheses Matching
An important problem in processing arithmetic expressions is to make sure their
grouping symbols match up correctly. We can use a stack S to perform the
matching of grouping symbols in an arithmetic expression with a single left-to-
right scan. The algorithm tests that left and right symbols match up and also that
the left and right symbols are both of the same type.
Suppose we are given a sequence X = x0x1x2…xn−1, where each xi is a token that
can be a grouping symbol, a variable name, an arithmetic operator, or a number.
The basic idea behind checking that the grouping symbols in S match correctly, is
to process the tokens in X in order. Each time we encounter an opening symbol,
we push that symbol onto S, and each time we encounter a closing symbol, we
pop the top symbol from the stack S (assuming S is not empty) and we check that
these two symbols are of the same type. If the stack is empty after we have
processed the whole sequence, then the symbols in X match. Assuming that the
push and pop operations are implemented to run in constant time, this algorithm
runs in O(n), that is linear, time. We give a pseudo-code description of this
algorithm in
Code Fragment 5.10.
Code Fragment 5.10: Algorithm for matching
grouping symbols in an arithmetic expression.
280
Matching Tags in an HTML Document
Another application in which matching is important is in the validation of HTML
documents. HTML is the standard format for hyperlinked documents on the
Internet. In an HTML document, portions of text are delimited by HTML tags. A
simple opening HTML tag has the form "" and the corresponding closing
tag has the form "." Commonly used HTML tags include
• body: document body
• h1: section header
• center: center justify
• p: paragraph
• ol: numbered (ordered) list
• li: list item.
Ideally, an HTML document should have matching tags, although most browsers
tolerate a certain number of mismatching tags.
We show a sample HTML document and a possible rendering in Figure 5.3.
281
Figure 5.3: Illustrating HTML tags. (a) An HTML
document; (b) its rendering.
Fortunately, more or less the same algorithm as in Code Fragment 5.10 can be
used to match the tags in an HTML document. In Code Fragments 5.11 and 5.12,
we give a Java program for matching tags in an HTML document read from
standard input. For simplicity, we assume that all tags are the simple opening or
closing tags defined above and that no tags are formed incorrectly.
Code Fragment 5.11: A complete Java program for
testing if an HTML document has fully matching tags.
(Continues in Code Fragment 5.12.)
282
Code Fragment 5.12: Java program for testing for
matching tags in an HTML document. (Continued from
5.11.) Method isHTMLMatched uses a stack to store
the names of the opening tags seen so far, similar to
how the stack was used in Code Fragment 5.10.
Method parseHTML uses a Scanner s to extract the
tags from the HTML document, using the pattern
"]*>," which denotes a string that starts with '<',
followed by zero or more characters that are not '>',
followed by a '>'.
283
5.2 Queues
284
Another fundamental data structure is the queue. It is a close "cousin" of the stack, as
a queue is a collection of objects that are inserted and removed according to the first-
in first-out (FIFO) principle. That is, elements can be inserted at any time, but only
the element that has been in the queue the longest can be removed at any time.
We usually say that elements enter a queue at the rear and are removed from the
front. The metaphor for this terminology is a line of people waiting to get on an
amusement park ride. People waiting for such a ride enter at the rear of the line and
get on the ride from the front of the line.
5.2.1 The Queue Abstract Data Type
Formally, the queue abstract data type defines a collection that keeps objects in a
sequence, where element access and deletion are restricted to the first element in the
sequence, which is called the front of the queue, and element insertion is restricted
to the end of the sequence, which is called the rear of the queue. This restriction
enforces the rule that items are inserted and deleted in a queue according to the
first-in first-out (FIFO) principle.
The queue abstract data type (ADT) supports the following two fundamental
methods:
enqueue(e): Insert element e at the rear of the queue.
dequeue(): Remove and return from the queue the object at the front;
an error occurs if the queue is empty.
Additionally, similar to the case with the Stack ADT, the queue ADT includes the
following supporting methods:
size(): Return the number of objects in the queue.
isEmpty(): Return a Boolean value that indicates whether the queue is
empty.
front(): Return, but do not remove, the front object in the queue; an
error occurs if the queue is empty.
Example 5.4: The following table shows a series of queue operations and their
effects on an initially empty queue Q of integer objects. For simplicity, we use
integers instead of integer objects as arguments of the operations.
Operation
Output
front ← Q ← rear
285
enqueue(5)
-
(5)
enqueue(3)
-
(5, 3)
dequeue( )
5
(3)
enqueue(7)
-
(3, 7)
dequeue( )
3
(7)
front( )
7
(7)
dequeue( )
7
( )
dequeue( )
"error"
( )
isEmpty( )
286
true
( )
enqueue(9)
-
(9)
enqueue(7)
-
(9, 7)
size()
2
(9, 7)
enqueue(3)
-
(9, 7, 3)
enqueue(5)
-
(9, 7, 3, 5)
dequeue( )
9
(7, 3, 5)
Example Applications
There are several possible applications for queues. Stores, theaters, reservation
centers, and other similar services typically process customer requests according
to the FIFO principle. A queue would therefore be a logical choice for a data
structure to handle transaction processing for such applications. For example, it
would be a natural choice for handling calls to the reservation center of an airline
or to the box office of a theater.
287
A Queue Interface in Java
A Java interface for the queue ADT is given in Code Fragment 5.13. This generic
interface specifies that objects of arbitrary object types can be inserted into the
queue. Thus, we don't have to use explicit casting when removing elements.
Note that the size and isEmpty methods have the same meaning as their
counterparts in the Stack ADT. These two methods, as well as the front method,
are known as accessor methods, for they return a value and do not change the
contents of the data structure.
Code Fragment 5.13: Interface Queue documented
with comments in Javadoc style.
288
5.2.2 A Simple Array-Based Queue Implementation
We present a simple realization of a queue by means of an array, Q, of fixed
capacity, storing its elements. Since the main rule with the queue ADT is that we
insert and delete objects according to the FIFO principle, we must decide how we
are going to keep track of the front and rear of the queue.
289
One possibility is to adapt the approach we used for the stack implementation,
letting Q[0] be the front of the queue and then letting the queue grow from there.
This is not an efficient solution, however, for it requires that we move all the
elements forward one array cell each time we perform a dequeue operation. Such an
implementation would therefore take O(n) time to perform the dequeue method,
where n is the current number of objects in the queue. If we want to achieve
constant time for each queue method, we need a different approach.
Using an Array in a Circular Way
To avoid moving objects once they are placed in Q, we define two variables f and
r, which have the following meanings:
• f is an index to the cell of Q storing the first element of the queue (which
is the next candidate to be removed by a dequeue operation), unless the queue is
empty (in which case f = r).
• r is an index to the next available array cell in Q.
Initially, we assign f = r = 0, which indicates that the queue is empty. Now, when
we remove an element from the front of the queue, we increment f to index the
next cell. Likewise, when we add an element, we store it in cell Q[r] and
increment r to index the next available cell in Q. This scheme allows us to
implement methods front, enqueue, and dequeue in constant time, that is,
O(1) time. However, there is still a problem with this approach.
Consider, for example, what happens if we repeatedly enqueue and dequeue a
single element N different times. We would have f = r = N. If we were then to try
to insert the element just one more time, we would get an array-out-of-bounds
error (since the N valid locations in Q are from Q[0] to Q[N − 1]), even though
there is plenty of room in the queue in this case. To avoid this problem and be
able to utilize all of the array Q, we let the f and r indices "wrap around" the end
of Q. That is, we now view Q as a "circular array" that goes from Q[0] to Q[N −
1] and then immediately back to Q[0] again. (See Figure 5.4.)
Figure 5.4: Using array Q in a circular fashion: (a)
the "normal" configuration with f ≤ r; (b) the "wrapped
around" configuration with r < f. The cells storing
queue elements are highlighted.
290
Using the Modulo Operator to Implement a Circular
Array
Implementing this circular view of Q is actually pretty easy. Each time we
increment f or r, we compute this increment as "(f + 1) mod N" or "(r + 1) mod
N," respectively.
Recall that operator "mod" is the modulo operator, which is computed by taking
the remainder after an integral division. For example, 14 divided by 4 is 3 with
remainder 2, so 14 mod 4 = 2. Specifically, given integers x and y such that x ≥ 0
and y > 0, we have x mod y = x − x/yy. That is, if r = x mod y, then there is a
nonnegative integer q, such that x = qy + r. Java uses "%" to denote the modulo
operator. By using the modulo operator, we can view Q as a circular array and
implement each queue method in a constant amount of time (that is, O(1) time).
We describe how to use this approach to implement a queue in Code Fragment
5.14.
Code Fragment 5.14: Implementation of a queue
using a circular array. The implementation uses the
modulo operator to "wrap" indices around the end of
the array and it also includes two instance variables, f
and r, which index the front of the queue and first
empty cell after the rear of the queue respectively.
291
The implementation above contains an important detail, which might be missed at
first. Consider the situation that occurs if we enqueue N objects into Q without
dequeuing any of them. We would have f = r, which is the same condition that
occurs when the queue is empty. Hence, we would not be able to tell the
difference between a full queue and an empty one in this case. Fortunately, this is
not a big problem, and a number of ways for dealing with it exist.
The solution we describe here is to insist that Q can never hold more than N − 1
objects. This simple rule for handling a full queue takes care of the final problem
with our implementation, and leads to the pseudo-coded descriptions of the queue
methods given in Code Fragment 5.14. Note our introduction of an
implementation-specific exception, called FullQueueException, to signal
that no more elements can be inserted in the queue. Also note the way we
compute the size of the queue by means of the expression (N − f + r) mod N,
292
which gives the correct result both in the "normal" configuration (when f ≤ r) and
in the "wrapped around" configuration (when r < f). The Java implementation of a
queue by means of an array is similar to that of a stack, and is left as an exercise
(P-5.4).
Table 5.2 shows the running times of methods in a realization of a queue by an
array. As with our array-based stack implementation, each of the queue methods
in the array realization executes a constant number of statements involving
arithmetic operations, comparisons, and assignments. Thus, each method in this
implementation runs in O(1) time.
Table 5.2: Performance of a queue realized by an
array. The space usage is O(N), where N is the size of
the array, determined at the time the queue is created.
Note that the space usage is independent from the
number n < N of elements that are actually in the
queue.
Method
Time
size
O(1)
isEmpty
O(1)
front
O(1)
enqueue
O(1)
dequeue
O(1)
As with the array-based stack implementation, the only real disadvantage of the
array-based queue implementation is that we artificially set the capacity of the
queue to be some fixed value. In a real application, we may actually need more or
293
less queue capacity than this, but if we have a good capacity estimate, then the
array-based implementation is quite efficient.
5.2.3 Implementing a Queue with a Generic Linked List
We can efficiently implement the queue ADT using a generic singly linked list. For
efficiency reasons, we choose the front of the queue to be at the head of the list, and
the rear of the queue to be at the tail of the list. In this way, we remove from the
head and insert at the tail. (Why would it be bad to insert at the head and remove at
the tail?) Note that we need to maintain references to both the head and tail nodes of
the list. Rather than go into every detail of this implementation, we simply give a
Java implementation for the fundamental queue methods in Code Fragment 5.15.
Code Fragment 5.15: Methods enqueue and
dequeue in the implementation of the queue ADT by
means of a singly linked list, using nodes from class
Node of Code Fragment 5.6.
294
Each of the methods of the singly linked list implementation of the queue ADT runs
in O(1) time. We also avoid the need to specify a maximum size for the queue, as
was done in the array-based queue implementation, but this benefit comes at the
expense of increasing the amount of space used per element. Still, the methods in
the singly linked list queue implementation are more complicated than we might
like, for we must take extra care in how we deal with special cases where the queue
is empty before an enqueue or where the queue becomes empty after a dequeue.
5.2.4 Round Robin Schedulers
A popular use of the queue data structure is to implement a round robin scheduler,
where we iterate through a collection of elements in a circular fashion and "service"
each element by performing a given action on it. Such a schedule is used, for
example, to fairly allocate a resource that must be shared by a collection of clients.
295
For instance, we can use a round robin scheduler to allocate a slice of CPU time to
various applications running concurrently on a computer.
We can implement a round robin scheduler using a queue, Q, by repeatedly
performing the following steps (see Figure 5.5):
1. e ← Q.dequeue()
2. Service element e
3. Q.enqueue(e)
Figure 5.5: The three iterative steps for using a
queue to implement a round robin scheduler.
The Josephus Problem
In the children's game "hot potato," a group of n children sit in a circle passing an
object, called the "potato," around the circle. The potato begins with a starting
child in the circle, and the children continue passing the potato until a leader rings
a bell, at which point the child holding the potato must leave the game after
handing the potato to the next child in the circle. After the selected child leaves,
the other children close up the circle. This process is then continued until there is
only one child remaining, who is declared the winner. If the leader always uses
the strategy of ringing the bell after the potato has been passed k times, for some
fixed value k, then determining the winner for a given list of children is known as
the Josephus problem.
Solving the Josephus Problem Using a Queue
We can solve the Josephus problem for a collection of n elements using a queue,
by associating the potato with the element at the front of the queue and storing
elements in the queue according to their order around the circle. Thus, passing the
296
potato is equivalent to dequeuing an element and immediately enqueuing it again.
After this process has been performed k times, we remove the front element by
dequeuing it from the queue and discarding it. We show a complete Java program
for solving the Josephus problem using this approach in Code Fragment 5.16,
which describes a solution that runs in O(nk) time. (We can solve this problem
faster using techniques beyond the scope of this book.)
Code Fragment 5.16: A complete Java program for
solving the Josephus problem using a queue. Class
NodeQueue is shown in Code Fragment 5.15.
297
5.3 Double-Ended Queues
Consider now a queue-like data structure that supports insertion and deletion at both
the front and the rear of the queue. Such an extension of a queue is called a double-
ended queue, or deque, which is usually pronounced "deck" to avoid confusion with
the dequeue method of the regular queue ADT, which is pronounced like the
abbreviation "D.Q."
5.3.1 The Deque Abstract Data Type
The deque abstract data type is richer than both the stack and the queue ADTs. The
fundamental methods of the deque ADT are as follows:
addFirst(e): Insert a new element e at the beginning of the deque.
addLast(e): Insert a new element e at the end of the deque.
removeFirst(): Remove and return the first element of the deque; an
error occurs if the deque is empty.
removeLast(): Remove and return the last element of the deque; an
error occurs if the deque is empty.
Additionally, the deque ADT may also include the following support methods:
getFirst(): Return the first element of the deque; an error occurs if
the deque is empty.
getLast(): Return the last element of the deque; an error occurs if the
deque is empty.
size(): Return the number of elements of the deque.
isEmpty(): Determine if the deque is empty.
Example 5.5: The following table shows a series of operations and their effects
on an initially empty deque D of integer objects. For simplicity, we use integers
instead of integer objects as arguments of the operations.
Operation
Output
D
addFirst(3)
298
-
(3)
addFirst(5)
-
(5,3)
removeFirst()
5
(3)
addLast(7)
-
(3,7)
removeFirst()
3
(7)
removeLast()
7
()
removeFirst()
"error"
()
isEmpty()
true
()
5.3.2 Implementing a Deque
299
Since the deque requires insertion and removal at both ends of a list, using a singly
linked list to implement a deque would be inefficient. We can use a doubly linked
list, however, to implement a deque efficiently. As discussed in Section 3.3,
inserting or removing elements at either end of a doubly linked list is
straightforward to do in O(1) time, if we use sentinel nodes for the header and
trailer.
For an insertion of a new element e, we can have access to the node p before the
place e should go and the node q after the place e should go. To insert a new
element between the two nodes p and q (either or both of which could be sentinels),
we create a new node t, have t's prev and next links respectively refer to p and q,
and then have p's next link refer to t, and have q's prev link refer to t.
Likewise, to remove an element stored at a node t, we can access the nodes p and q
on either side of t (and these nodes must exist, since we are using sentinels). To
remove node t between nodes p and q, we simply have p and q point to each other
instead of t. We need not change any of the fields in t, for now t can be reclaimed
by the garbage collector, since no one is pointing to t.
Table 5.3 shows the running times of methods for a deque implemented with a
doubly linked list. Note that every method runs in O(1) time.
Table 5.3: Performance of a deque realized by a
doubly linked list.
Method
Time
size, isEmpty
O(1)
getFirst, getLast
O(1)
add First, addLast
O(1)
removeFirst, removeLast
O(1)
300
Thus, a doubly linked list can be used to implement each method of the deque ADT
in constant time. We leave the details of implementing the deque ADT efficiently in
Java as an exercise (P-5.7).
Incidentally, all of the methods of the deque ADT, as described above, are included
in the java.util.LinkedList class. So, if we need to use a deque and
would rather not implement one from scratch, we can simply use the built-in
java.util.LinkedList class.
In any case, we show a Deque interface in Code Fragment 5.17 and an
implementation of this interface in Code Fragment 5.18.
Code Fragment 5.17: Interface Deque documented
with comments in Javadoc style (Section 1.9.3). Note
also the use of the generic parameterized type, E, which
implies that a deque can contain elements of any
specified class.
301
302
Code Fragment 5.18: Class NodeDeque
implementing the Deque interface, except that we have
not shown the class DLNode, which is a generic doubly
linked list node, nor have we shown methods getLast,
addLast, and removeFirst.
303
304
5.4 Exercises
For source code and help with exercises, please visit
java.datastructures.net.
Reinforcement
R-5.1
Suppose an initially empty stack S has performed a total of 25 push operations,
12 top operations, and 10 pop operations, 3 of which generated
StackEmptyExceptions, which were caught and ignored. What is the current
size of S?
R-5.2
If we implemented the stack S from the previous problem with an array, as
described in this chapter, then what is the current value of the top instance
variable?
R-5.3
Describe the output of the following series of stack operations: push(5),
push(3), pop(), push(2), push(8), pop(), pop(),
push(9), push(1), pop(), push(7), push(6), pop(),
pop(), push(4), pop(), pop().
R-5.4
Give a recursive method for removing all the elements in a stack.
R-5.5
Give a precise and complete definition of the concept of matching for grouping
symbols in an arithmetic expression.
R-5.6
Describe the output for the following sequence of queue operations:
enqueue(5), enqueue(3), dequeue(), enqueue(2),
enqueue(8), dequeue(), dequeue(), enqueue(9),
enqueue(1), dequeue(), enqueue(7), enqueue(6),
dequeue(), dequeue(), enqueue(4), dequeue(),
dequeue().
R-5.7
305
Suppose an initially-empty queue Q has performed a total of 32 enqueue
operations, 10 front operations, and 15 dequeue operations, 5 of which
generated QueueEmptyExceptions, which were caught and ignored. What is the
current size of Q?
R-5.8
If the queue of the previous problem was implemented with an array of capacity
N = 30, as described in the chapter, and it never generated a
FullQueueException, what would be the current values of f and r?
R-5.9
Describe the output for the following sequence of deque ADT operations:
addFirst(3), addLast(8), addLast(9), addFirst(5),
removeFirst(), remove-Last(), first(), addLast(7),
removeFirst(), last(), removeLast().
R-5.10
Suppose you have a deque D containing the numbers (1,2,3,4,5,6,7,8), in this
order. Suppose further that you have an initially empty queue Q. Give a pseudo-
code description of a method that uses only D and Q (and no other variables or
objects) and results in D storing the elements (1,2,3,5,4,6,7,8), in this order.
R-5.11
Repeat the previous problem using the deque D and an initially empty stack S.
Creativity
C-5.1
Suppose you have a stack S containing n elements and a queue Q that is initially
empty. Describe how you can use Q to scan S to see if it contains a certain
element x, with the additional constraint that your algorithm must return the
elements back to S in their original order. You may not use an array or linked
list—only S and Q and a constant number of reference variables.
C-5.2
Give a pseudo-code description for an array-based implementation of the
double-ended queue ADT. What is the running time for each operation?
C-5.3
306
Suppose Alice has picked three distinct integers and placed them into a stack S
in random order. Write a short, straightline piece of pseudo-code (with no loops
or recursion) that uses only one comparison and only one variable x, yet
guarantees with probability 2/3 that at the end of this code the variable x will
store the largest of Alice's three integers. Argue why your method is correct.
C-5.4
Describe how to implement the stack ADT using two queues. What is the
running time of the push() and pop() methods in this case?
C-5.5
Show how to use a stack S and a queue Q to generate all possible subsets of an
n-element set T nonrecursively.
C-5.6
Suppose we have an n × n two-dimensional array A that we want to use to store
integers, but we don't want to spend the O(n2) work to initialize it to all 0's (the
way Java does), because we know in advance that we are only going to use up
to n of these cells in our algorithm, which itself runs in O(n) time (not counting
the time to initialize A). Show how to use an array-based stack S storing (i, j, k)
integer triples to allow us to use the array A without initializing it and still
implement our algorithm in O(n) time, even though the initial values in the cells
of A might be total garbage.
C-5.7
Describe a nonrecursive algorithm for enumerating all permutations of the
numbers {1,2,…,n}.
C-5.8
Postfix notation is an unambiguous way of writing an arithmetic expression
without parentheses. It is defined so that if "(exp1 )op(exp2)" is a normal fully
parenthesized expression whose operation is op, then the postfix version of this
is "pexp1 pexp2 op", where pexp1 is the postfix version of exp1 and pexp2 is the
postfix version of exp
an
2. The postfix version of a single number or variable is
just that number or variable. So, for example, the postfix version of "((5 + 2) *
(8 − 3))/4" is "5 2 + 8 3 − * 4 /". Describe a nonrecursive way of evaluating
expression in postfix notation.
C-5.9
Suppose you have two nonempty stacks S and T and a deque D. Describe how
to use D so that S stores all the elements of T below all of its original elements,
with both sets of elements still in their original order.
307
C-5.10
Alice has three array-based stacks, A, B, and C, such that A has capacity 100, B
has capacity 5, and C has capacity 3. Initially, A is full, and B and C are empty.
Unfortunately, the person who programmed the class for these stacks made the
push and pop methods private. The only method Alice can use is a static
method, transfer(S,T), which transfers (by itera-tively applying the private pop
and push methods) elements from stack S to stack T until either S becomes
empty or T becomes full. So, for example, starting from our initial configuration
and performing transfer(A, C) results in A now holding 97 elements and C
holding 3. Describe a sequence of transfer operations that starts from the
initial configuration and results in B holding 4 elements at the end.
C-5.11
Alice has two queues, S and T, which can store integers. Bob gives Alice 50 odd
integers and 50 even integers and insists that she stores all 100 integers in S and
T. They then play a game where Bob picks S or T at random and then applies
the round-robin scheduler, described in the chapter, to the chosen queue a
random number of times. If the number left out of the queue at the end of this
game is odd, Bob wins. Otherwise, Alice wins. How can Alice allocate integers
to queues to optimize her chances of winning? What is her chance of winning?
C-5.12
Suppose Bob has four cows that he wants to take across a bridge, but only one
yoke, which can hold up to two cows, side by side, tied to the yoke. The yoke is
too heavy for him to carry across the bridge, but he can tie (and untie) cows to it
in no time at all. Of his four cows, Mazie can cross the bridge in 2 minutes,
Daisy can cross it in 4 minutes, Crazy can cross it in 10 minutes, and Lazy can
cross it in 20 minutes. Of course, when two cows are tied to the yoke, they must
go at the speed of the slower cow. Describe how Bob can get all his cows across
the bridge in 34 minutes.
Projects
P-5.1
Implement the stack ADT with a doubly linked list.
P-5.2
Implement the stack ADT using the Java ArrayList class (without using the
built-in Java Stack class).
P-5.3
308
Implement a program that can input an expression in postfix notation (see
Exercise C-5.8) and output its value.
P-5.4
Implement the queue ADT using an array.
P-5.5
Implement the entire queue ADT using a singly linked list.
P-5.6
Design an ADT for a two-color, double-stack ADT that consists of two stacks—
one "red" and one "blue"—and has as its operations color-coded versions of the
regular stack ADT operations. For example, this ADT should allow for both a
red push operation and a blue push operation. Give an efficient implementation
of this ADT using a single array whose capacity is set at some value N that is
assumed to always be larger than the sizes of the red and blue stacks combined.
P-5.7
Implement the deque ADT with a doubly linked list.
P-5.8
Implement the deque ADT with an array used in a circular fashion.
P-5.9
Implement the Stack and Queue interfaces with a unique class that extends
class NodeDeque (Code Fragment 5.18).
P-5.10
When a share of common stock of some company is sold, the capital gain (or,
sometimes, loss) is the difference between the share's selling price and the price
originally paid to buy it. This rule is easy to understand for a single share, but if
we sell multiple shares of stock bought over a long period of time, then we must
identify the shares actually being sold. A standard accounting principle for
identifying which shares of a stock were sold in such a case is to use a FIFO
protocol—the shares sold are the ones that have been held the longest (indeed,
this is the default method built into several personal finance software packages).
For example, suppose we buy 100 shares at $20 each on day 1, 20 shares at $24
on day 2, 200 shares at $36 on day 3, and then sell 150 shares on day 4 at $30
each. Then applying the FIFO protocol means that of the 150 shares sold, 100
were bought on day 1, 20 were bought on day 2, and 30 were bought on day 3.
The capital gain in this case would therefore be 100 · 10 + 20 · 6 + 30 · (−6), or
309
$940. Write a program that takes as input a sequence of transactions of the form
"buy x; share(s) at $y each" or "sell x share(s) at $y
each," assuming that the transactions occur on consecutive days and the
values x and y are integers. Given this input sequence, the output should be the
total capital gain (or loss) for the entire sequence, using the FIFO protocol to
identify shares.
Chapter Notes
We were introduced to the approach of defining data structures first in terms of their
ADTs and then in terms of concrete implementations by the classic books by Aho,
Hopcroft, and Ullman [4, 5], which incidentally is where we first saw aproblem
similar to Exercise C-5.6. Exercises C-5.10, C-5.11, and C-5.12 are similar to
interview questions said to be from a well-known software company. For further
study of abstract data types, see Liskov and Guttag [69], Cardelli and Wegner [20], or
Demurjian [28].
Chapter 6 Lists and Iterators
Contents
6.1
Array Lists ......................
222
6.1.1
The Array List Abstract Data Type..........
222
6.1.2
The Adapter Pattern.................
310
223
6.1.3
A Simple Array-Based Implementation........
224
6.1.4
A Simple Interface and the java.util.ArrayList
Class.
226
6.1.5
Implementing an Array List Using Extendable Arrays
227
6.2
Node Lists ......................
231
6.2.1
Node-Based Operations................
231
6.2.2
Positions........................
232
6.2.3
The Node List Abstract Data Type..........
232
6.2.4
Doubly Linked List Implementation..........
236
311
6.3
Iterators ......................
242
6.3.1
The Iterator and Iterable Abstract Data Types....
242
6.3.2
The Java For-Each Loop...............
244
6.3.3
Implementing Iterators................
245
6.3.4
List Iterators in Java.................
247
6.4
List ADTs and the Collections Framework
......................
249
6.4.1
The Java Collections Framework...........
249
6.4.2
The java.util.LinkedList Class.............
250
6.4.3
312
Sequences.......................
251
6.5
Case Study: The Move-to-Front Heuristic
......................
253
6.5.1
Using a Sorted List and a Nested Class.......
253
6.5.2
Using a List with the Move-to-Front Heuristic....
256
6.5.3
Possible Uses of a Favorites List...........
257
6.6
Exercises ......................
260
java.datastructures.net
6.1 Array Lists
Suppose we have a collection S of n elements stored in a certain linear order, so that
we can refer to the elements in S as first, second, third, and so on. Such a collection is
generically referred to as a list or sequence. We can uniquely refer to each element e
in S using an integer in the range [0,n − 1] that is equal to the number of elements of S
that precede e in S. The index of an element e in S is the number of elements that are
before e in S. Hence, the first element in S has index 0 and the last element has index
n − 1. Also, if an element of S has index i, its previous element (if it exists) has index
i − 1, and its next element (if it exists) has index i + 1. This concept of index is related
313
to that of the rank of an element in a list, which is usually defined to be one more
than its index; so the first element is at rank 1, the second is at rank 2, and so on.
A sequence that supports access to its elements by their indices is called an array list
(or vector, using an older term). Since our index definition is more consistent with the
way arrays are indexed in Java and other programming languages (such as C and
C++), we will be referring to the place where an element is stored in an array list as
its "index," not its "rank" (although we may use r to denote this index, if the letter "i"
is being used as a for-loop counter).
This index concept is a simple yet powerful notion, since it can be used to specify
where to insert a new element into a list or where to remove an old element.
6.1.1 The Array List Abstract Data Type
As an ADT, an array list S has the following methods (besides the standard
size() and isEmpty() methods):
get(i): Return the element of S with index i; an error condition
occurs if i size() − 1.
set(i, e): Replace with e and return the element at index i; an error
condition occurs if i size() − 1.
add(i, e): Insert a new element e into S to have index i; an error
condition occurs if i size().
remove(i): Remove from S the element at index i; an error condition
occurs if i size() − 1.
We do not insist that an array should be used to implement an array list, so that the
element at index 0 is stored at index 0 in the array, although that is one (very
natural) possibility. The index definition offers us a way to refer to the "place"
where an element is stored in a sequence without having to worry about the exact
implementation of that sequence. The index of an element may change whenever
the sequence is updated, however, as we illustrate in the following example.
Example 6.1: We show below some operations on an initially empty array list
S.
Operation
Output
S
add(0,7)
314
-
(7)
add(0,4)
-
(4,7)
get(1)
7
(4,7)
add(2,2)
-
(4,7,2)
get(3)
"error"
(4,7,2)
remove(1)
7
(4,2)
add(1,5)
-
(4,5,2)
add(1,3)
-
(4,3,5,2)
add(4,9)
-
315
(4,3,5,2,9)
get(2)
5
(4,3,5,2,9)
set(3,8)
2
(4,3,5,8,9)
6.1.2 The Adapter Pattern
Classes are often written to provide similar functionality to other classes. The
adapter design pattern applies to any context where we want to modify an existing
class so that its methods match those of a related, but different, class or interface.
One general way for applying the adapter pattern is to define the new class in such a
way that it contains an instance of the old class as a hidden field, and implement
each method of the new class using methods of this hidden instance variable. The
result of applying the adapter pattern is that a new class that performs almost the
same functions as a previous class, but in a more convenient way, has been created.
With respect to our discussion of the array list ADT, we note that this ADT is
sufficient to define an adapter class for the deque ADT, as shown in Table 6.1. (See
also Exercise C-6.8.)
Table 6.1: Realization of a deque by means of an
array list.
Deque Method
Realization with Array-List Methods
size(), isEmpty()
size(), isEmpty()
getFirst()
get(0)
getLast()
316
get(size() −1)
addFirst(e)
add(0,e)
addLast(e)
add(size(),e)
removeFirst()
remove(0)
removeLast()
remove(size() − 1)
6.1.3 A Simple Array-Based Implementation
An obvious choice for implementing the array list ADT is to use an array A, where
A[i] stores (a reference to) the element with index i. We choose the size N of array
A sufficiently large, and we maintain the number of elements in an instance
variable, n < N.
The details of this implementation of the array list ADT are simple. To implement
the get(i) operation, for example, we just return A[i]. Implementations of
methods add(i, e) and remove(i) are given in Code Fragment 6.1. An
important (and time-consuming) part of this implementation involves the shifting of
elements up or down to keep the occupied cells in the array contiguous. These
shifting operations are required to maintain our rule of always storing an element
whose list index is i at index i in the array A. (See Figure 6.1 and also Exercise R-
6.12.)
Code Fragment 6.1: Methods add(i, e) and
remove(i) in the array implementation of the array
list ADT. We denote, with n, the instance variable
storing the number of elements in the array list.
317
Figure 6.1: Array-based implementation of an array
list S that is storing n elements: (a) shifting up for an
insertion at index i(b); shifting down for a removal at
index i
The Performance of a Simple Array-Based
Implementation
Table 6.2 shows the worst-case running times of the methods of an array list with
n elements realized by means of an array. Methods isEmpty, size, get and
set clearly run in O(1) time, but the insertion and removal methods can take
much longer than this. In particular, add(i, e) runs in time O(n). Indeed, the
worst case for this operation occurs when i = 0, since all the existing n elements
318
have to be shifted forward. A similar argument applies to method remove(i),
which runs in O(n) time, because we have to shift backward n − 1 elements in the
worst case (i = 0). In fact, assuming that each possible index is equally likely to be
passed as an argument to these operations, their average running time is O(n), for
we will have to shift n/2 elements on average.
Table 6.2: Performance of an array list with n
elements realized by an array. The space usage is O(N),
where N is the size of the array.
Method
Time
size()
O(1)
isEmpty()
O(1)
get(i)
O(1)
set(i, e)
O(1)
add(i, e)
O(n)
remove(i)
O(n)
Looking more closely at add(i, e) and remove(i), we note that they each run
in time O(n − i + 1), for only those elements at index i and higher have to be
shifted up or down. Thus, inserting or removing an item at the end of an array list,
using the methods add(n, e) and remove(n − 1), respectively take O(1)
time each. Moreover, this observation has an interesting consequence for the
adaptation of the array list ADT to the deque ADT given in Section 6.1.1. If the
array list ADT in this case is implemented by means of an array as described
above, then methods addLast and removeLast of the deque each run in O(1)
319
time. However, methods addFirst and removeFirst of the deque each run
in O(n) time.
Actually, with a little effort, we can produce an array-based implementation of the
array list ADT that achieves O(1) time for insertions and removals at index 0, as
well as insertions and removals at the end of the array list. Achieving this requires
that we give up on our rule that an element at index i is stored in the array at index
i, however, as we would have to use a circular array approach like the one we
used in Section 5.2 to implement a queue. We leave the details of this
implementation for an exercise (C-6.9).
6.1.4 A Simple Interface and the java. util. ArrayList Class
To prepare for constructing a Java implementation of the array list ADT, we show,
in Code Fragment 6.2, a Java interface, IndexList, that captures the main
methods from the array list ADT. In this case, we use a
IndexOutOfBoundsException to signal an invalid index argument.
Code Fragment 6.2: The IndexList interface for
the array list ADT.
The java.util.ArrayList Class
Java provides a class, java.util.ArrayList, that implements all the
methods that we give above for our array list ADT. That is, it includes all of the
methods included in Code Fragment 6.2 for the IndexList interface.
320
Moreover, the java.util.ArrayList class has features in addition to those
of our simplified array list ADT. For example, the class
java.util.ArrayList also includes a method, clear(), which removes
all the elements from the array list, and a method, toArray(), which returns an
array containing all the elements of the array list in the same order. In addition,
the class java.util.ArrayList also has methods for searching the list,
including a method indexOf(e), which returns the index of the first occurrence
of an element equal to e in the array list, and a method lastIndexOf(e),
which returns the index of the last occurrence of an element equal to e in the array
list. Both of these methods return the (invalid) index value − 1 if an element equal
to e is not found.
6.1.5 Implementing an Array List Using Extendable
Arrays
In addition to implementing the methods of the IndexList interface (and some
other useful methods), the class java.util.ArrayList provides an an
interesting feature that overcomes a weakness in the simple array implementation.
Specifically, a major weakness of the simple array implementation for the array list
ADT given in Section 6.1.3, is that it requires advance specification of a fixed
capacity, N, for the total number of elements that may be stored in the array list. If
the actual number of elements, n, of the array list is much smaller than N, then this
implementation will waste space. Worse, if n increases past N, then this
implementation will crash.
Instead, the java.util.ArrayList uses an interesting extendable-array
technique so that we never have to worry about array overflows when using this
class.
As with the java.util.ArrayList class, let us provide a means to grow the
array A that stores the elements of an array list S. Of course, in Java (and other
programming languages), we cannot actually grow the array A; its capacity is fixed
at some number N, as we have already observed. Instead, when an overflow occurs,
that is, when n = N and we make a call to the method add, we perform the
following additional steps:
1. Allocate a new array B of capacity 2N
2. Let B[i ]← A[i], for i = 0,... , N − 1
3. Let A ← B, that is, we use B as the array supporting S
4. Insert the new element in A.
321
This array replacement strategy is known as an extendable array, for it can be
viewed as extending the end of the underlying array to make room for more
elements. (See Figure 6.2.) Intuitively, this strategy is much like that of the hermit
crab, which moves into a larger shell when it outgrows its previous one.
Figure 6.2: An illustration of the three steps for
"growing" an extendable array: (a) create new array B;
(b) copy elements from A to B; (c) reassign reference A
to the new array. Not shown is the future garbage
collection of the old array.
Implementing the IndexList Interface with an
Extendable Array
We give portions of a Java implementation of the array list ADT using an
extendable array in Code Fragment 6.3. This class only provides means for the
array to grow. Exercise C-6.2 explores an implementation that can also shrink.
Code Fragment 6.3: Portions of class
ArrayIndexList realizing the array list ADT by
means of an extendable array. Method
checkIndex(r, n) (not shown) checks whether an
index r is in the range [0, n − 1].
322
An Amortized Analysis of Extendable Arrays
This array replacement strategy might at first seem slow, for performing a single
array replacement required by some element insertion can take O(n) time. Still,
323
notice that after we perform an array replacement, our new array allows us to add
n new elements to the array list before the array must be replaced again. This
simple fact allows us to show that performing a series of operations on an initially
empty array list is actually quite efficient. As a shorthand notation, let us refer to
the insertion of an element to be the last element in an array list as a push
operation. (See Figure 6.3.)
Figure 6.3: Running times of a series of push
operations on a java.util.ArrayList of initial
size 1.
Using an algorithmic design pattern called amortization, we can show that
performing a sequence of such push operations on an array list implemented with
an extendable array is actually quite efficient. To perform an amortized analysis,
we use an accounting technique where we view the computer as a coin-operated
appliance that requires the payment of one cyber-dollar for a constant amount of
computing time. When an operation is executed, we should have enough cyber-
dollars available in our current "bank account" to pay for that operation's running
time. Thus, the total amount of cyber-dollars spent for any computation will be
proportional to the total time spent on that computation. The beauty of using this
analysis method is that we can overcharge some operations in order to save up
cyber-dollars to pay for others.
324
Proposition 6.2: Let S be an array list implemented by means of an
extendable array with initial length one. The total time to perform a series of n
push operations in S, starting from S being empty is O(n).
Justification: Let us assume that one cyber-dollar is enough to pay for the
execution of each push operation in S, excluding the time spent for growing the
array. Also, let us assume that growing the array from size k to size 2k requires k
cyber-dollars for the time spent copying the elements. We shall charge each push
operation three cyber-dollars. Thus, we overcharge each push operation that does
not cause an overflow by two cyber-dollars. Think of the two cyber-dollars
profited in an insertion that does not grow the array as being "stored" at the
element inserted. An overflow occurs when the array list S has 2i elements, for
some integer i ≥ 0, and the size of the array used by the array list representing S is
2i. Thus, doubling the size of the array will require 2i cyber-dollars. Fortunately,
these cyber-dollars can be found at the elements stored in cells 2i−1 through 2i − 1.
(See Figure 6.4.) Note that t
Các file đính kèm theo tài liệu này:
- Data structures algorithms in java.pdf