Data structures algorithms in java

Matching Parentheses and HTML Tags In this subsection, we explore two related applications of stacks, the first of which is for matching parentheses and grouping symbols in arithmetic expressions. Arithmetic expressions can contain various pairs of grouping symbols, such as ã Parentheses: "(" and ")" ã Braces: "{" and "}" ã Brackets: "[" and "]" ã Floor function symbols: " " and "" ã Ceiling function symbols: "" and "," and each opening symbol must match with its corresponding closing symbol. For example, a left bracket, "[," must match with a corresponding right bracket, "]," as in the following expression: [(5 + x) − (y + z)]

92 trang | Chia sẻ: tlsuongmuoi | Lượt xem: 4985 | Lượt tải: 0

Bạn đang xem trước 20 trang tài liệu Data structures algorithms in java, để xem tài liệu hoàn chỉnh bạn click vào nút DOWNLOAD ở trên

and then fill the array back up again by popping the elements off of the stack. In Code Fragment 5.8, we give a Java implementation of this algorithm. Incidentally, this method also illustrates how we can use generic types in a simple application that uses a generic stack. In particular, when the elements are popped off the stack in this example, they are automatically returned as elements of the E type; hence, they can be immediately returned to the input array. We show an example use of this method in Code Fragment 5.9. Code Fragment 5.8: A generic method that reverses the elements in an array of type E objects, using a stack declared using the Stack interface. Code Fragment 5.9: A test of the reverse method using two arrays. 278 5.1.5 Matching Parentheses and HTML Tags In this subsection, we explore two related applications of stacks, the first of which is for matching parentheses and grouping symbols in arithmetic expressions. Arithmetic expressions can contain various pairs of grouping symbols, such as • Parentheses: "(" and ")" • Braces: "{" and "}" • Brackets: "[" and "]" • Floor function symbols: " " and "" • Ceiling function symbols: "" and "," and each opening symbol must match with its corresponding closing symbol. For example, a left bracket, "[," must match with a corresponding right bracket, "]," as in the following expression: [(5 + x) − (y + z)]. 279 The following examples further illustrate this concept: • Correct: ( )(( )){([( )])} • Correct: ((( )(( )){([( )])})) • Incorrect: )(( )){([( )])} • Incorrect: ({[])} • Incorrect: (. We leave the precise definition of matching of grouping symbols to Exercise R-5.5. An Algorithm for Parentheses Matching An important problem in processing arithmetic expressions is to make sure their grouping symbols match up correctly. We can use a stack S to perform the matching of grouping symbols in an arithmetic expression with a single left-to- right scan. The algorithm tests that left and right symbols match up and also that the left and right symbols are both of the same type. Suppose we are given a sequence X = x0x1x2…xn−1, where each xi is a token that can be a grouping symbol, a variable name, an arithmetic operator, or a number. The basic idea behind checking that the grouping symbols in S match correctly, is to process the tokens in X in order. Each time we encounter an opening symbol, we push that symbol onto S, and each time we encounter a closing symbol, we pop the top symbol from the stack S (assuming S is not empty) and we check that these two symbols are of the same type. If the stack is empty after we have processed the whole sequence, then the symbols in X match. Assuming that the push and pop operations are implemented to run in constant time, this algorithm runs in O(n), that is linear, time. We give a pseudo-code description of this algorithm in Code Fragment 5.10. Code Fragment 5.10: Algorithm for matching grouping symbols in an arithmetic expression. 280 Matching Tags in an HTML Document Another application in which matching is important is in the validation of HTML documents. HTML is the standard format for hyperlinked documents on the Internet. In an HTML document, portions of text are delimited by HTML tags. A simple opening HTML tag has the form "" and the corresponding closing tag has the form "." Commonly used HTML tags include • body: document body • h1: section header • center: center justify • p: paragraph • ol: numbered (ordered) list • li: list item. Ideally, an HTML document should have matching tags, although most browsers tolerate a certain number of mismatching tags. We show a sample HTML document and a possible rendering in Figure 5.3. 281 Figure 5.3: Illustrating HTML tags. (a) An HTML document; (b) its rendering. Fortunately, more or less the same algorithm as in Code Fragment 5.10 can be used to match the tags in an HTML document. In Code Fragments 5.11 and 5.12, we give a Java program for matching tags in an HTML document read from standard input. For simplicity, we assume that all tags are the simple opening or closing tags defined above and that no tags are formed incorrectly. Code Fragment 5.11: A complete Java program for testing if an HTML document has fully matching tags. (Continues in Code Fragment 5.12.) 282 Code Fragment 5.12: Java program for testing for matching tags in an HTML document. (Continued from 5.11.) Method isHTMLMatched uses a stack to store the names of the opening tags seen so far, similar to how the stack was used in Code Fragment 5.10. Method parseHTML uses a Scanner s to extract the tags from the HTML document, using the pattern "]*>," which denotes a string that starts with '<', followed by zero or more characters that are not '>', followed by a '>'. 283 5.2 Queues 284 Another fundamental data structure is the queue. It is a close "cousin" of the stack, as a queue is a collection of objects that are inserted and removed according to the first- in first-out (FIFO) principle. That is, elements can be inserted at any time, but only the element that has been in the queue the longest can be removed at any time. We usually say that elements enter a queue at the rear and are removed from the front. The metaphor for this terminology is a line of people waiting to get on an amusement park ride. People waiting for such a ride enter at the rear of the line and get on the ride from the front of the line. 5.2.1 The Queue Abstract Data Type Formally, the queue abstract data type defines a collection that keeps objects in a sequence, where element access and deletion are restricted to the first element in the sequence, which is called the front of the queue, and element insertion is restricted to the end of the sequence, which is called the rear of the queue. This restriction enforces the rule that items are inserted and deleted in a queue according to the first-in first-out (FIFO) principle. The queue abstract data type (ADT) supports the following two fundamental methods: enqueue(e): Insert element e at the rear of the queue. dequeue(): Remove and return from the queue the object at the front; an error occurs if the queue is empty. Additionally, similar to the case with the Stack ADT, the queue ADT includes the following supporting methods: size(): Return the number of objects in the queue. isEmpty(): Return a Boolean value that indicates whether the queue is empty. front(): Return, but do not remove, the front object in the queue; an error occurs if the queue is empty. Example 5.4: The following table shows a series of queue operations and their effects on an initially empty queue Q of integer objects. For simplicity, we use integers instead of integer objects as arguments of the operations. Operation Output front ← Q ← rear 285 enqueue(5) - (5) enqueue(3) - (5, 3) dequeue( ) 5 (3) enqueue(7) - (3, 7) dequeue( ) 3 (7) front( ) 7 (7) dequeue( ) 7 ( ) dequeue( ) "error" ( ) isEmpty( ) 286 true ( ) enqueue(9) - (9) enqueue(7) - (9, 7) size() 2 (9, 7) enqueue(3) - (9, 7, 3) enqueue(5) - (9, 7, 3, 5) dequeue( ) 9 (7, 3, 5) Example Applications There are several possible applications for queues. Stores, theaters, reservation centers, and other similar services typically process customer requests according to the FIFO principle. A queue would therefore be a logical choice for a data structure to handle transaction processing for such applications. For example, it would be a natural choice for handling calls to the reservation center of an airline or to the box office of a theater. 287 A Queue Interface in Java A Java interface for the queue ADT is given in Code Fragment 5.13. This generic interface specifies that objects of arbitrary object types can be inserted into the queue. Thus, we don't have to use explicit casting when removing elements. Note that the size and isEmpty methods have the same meaning as their counterparts in the Stack ADT. These two methods, as well as the front method, are known as accessor methods, for they return a value and do not change the contents of the data structure. Code Fragment 5.13: Interface Queue documented with comments in Javadoc style. 288 5.2.2 A Simple Array-Based Queue Implementation We present a simple realization of a queue by means of an array, Q, of fixed capacity, storing its elements. Since the main rule with the queue ADT is that we insert and delete objects according to the FIFO principle, we must decide how we are going to keep track of the front and rear of the queue. 289 One possibility is to adapt the approach we used for the stack implementation, letting Q[0] be the front of the queue and then letting the queue grow from there. This is not an efficient solution, however, for it requires that we move all the elements forward one array cell each time we perform a dequeue operation. Such an implementation would therefore take O(n) time to perform the dequeue method, where n is the current number of objects in the queue. If we want to achieve constant time for each queue method, we need a different approach. Using an Array in a Circular Way To avoid moving objects once they are placed in Q, we define two variables f and r, which have the following meanings: • f is an index to the cell of Q storing the first element of the queue (which is the next candidate to be removed by a dequeue operation), unless the queue is empty (in which case f = r). • r is an index to the next available array cell in Q. Initially, we assign f = r = 0, which indicates that the queue is empty. Now, when we remove an element from the front of the queue, we increment f to index the next cell. Likewise, when we add an element, we store it in cell Q[r] and increment r to index the next available cell in Q. This scheme allows us to implement methods front, enqueue, and dequeue in constant time, that is, O(1) time. However, there is still a problem with this approach. Consider, for example, what happens if we repeatedly enqueue and dequeue a single element N different times. We would have f = r = N. If we were then to try to insert the element just one more time, we would get an array-out-of-bounds error (since the N valid locations in Q are from Q[0] to Q[N − 1]), even though there is plenty of room in the queue in this case. To avoid this problem and be able to utilize all of the array Q, we let the f and r indices "wrap around" the end of Q. That is, we now view Q as a "circular array" that goes from Q[0] to Q[N − 1] and then immediately back to Q[0] again. (See Figure 5.4.) Figure 5.4: Using array Q in a circular fashion: (a) the "normal" configuration with f ≤ r; (b) the "wrapped around" configuration with r < f. The cells storing queue elements are highlighted. 290 Using the Modulo Operator to Implement a Circular Array Implementing this circular view of Q is actually pretty easy. Each time we increment f or r, we compute this increment as "(f + 1) mod N" or "(r + 1) mod N," respectively. Recall that operator "mod" is the modulo operator, which is computed by taking the remainder after an integral division. For example, 14 divided by 4 is 3 with remainder 2, so 14 mod 4 = 2. Specifically, given integers x and y such that x ≥ 0 and y > 0, we have x mod y = x − x/yy. That is, if r = x mod y, then there is a nonnegative integer q, such that x = qy + r. Java uses "%" to denote the modulo operator. By using the modulo operator, we can view Q as a circular array and implement each queue method in a constant amount of time (that is, O(1) time). We describe how to use this approach to implement a queue in Code Fragment 5.14. Code Fragment 5.14: Implementation of a queue using a circular array. The implementation uses the modulo operator to "wrap" indices around the end of the array and it also includes two instance variables, f and r, which index the front of the queue and first empty cell after the rear of the queue respectively. 291 The implementation above contains an important detail, which might be missed at first. Consider the situation that occurs if we enqueue N objects into Q without dequeuing any of them. We would have f = r, which is the same condition that occurs when the queue is empty. Hence, we would not be able to tell the difference between a full queue and an empty one in this case. Fortunately, this is not a big problem, and a number of ways for dealing with it exist. The solution we describe here is to insist that Q can never hold more than N − 1 objects. This simple rule for handling a full queue takes care of the final problem with our implementation, and leads to the pseudo-coded descriptions of the queue methods given in Code Fragment 5.14. Note our introduction of an implementation-specific exception, called FullQueueException, to signal that no more elements can be inserted in the queue. Also note the way we compute the size of the queue by means of the expression (N − f + r) mod N, 292 which gives the correct result both in the "normal" configuration (when f ≤ r) and in the "wrapped around" configuration (when r < f). The Java implementation of a queue by means of an array is similar to that of a stack, and is left as an exercise (P-5.4). Table 5.2 shows the running times of methods in a realization of a queue by an array. As with our array-based stack implementation, each of the queue methods in the array realization executes a constant number of statements involving arithmetic operations, comparisons, and assignments. Thus, each method in this implementation runs in O(1) time. Table 5.2: Performance of a queue realized by an array. The space usage is O(N), where N is the size of the array, determined at the time the queue is created. Note that the space usage is independent from the number n < N of elements that are actually in the queue. Method Time size O(1) isEmpty O(1) front O(1) enqueue O(1) dequeue O(1) As with the array-based stack implementation, the only real disadvantage of the array-based queue implementation is that we artificially set the capacity of the queue to be some fixed value. In a real application, we may actually need more or 293 less queue capacity than this, but if we have a good capacity estimate, then the array-based implementation is quite efficient. 5.2.3 Implementing a Queue with a Generic Linked List We can efficiently implement the queue ADT using a generic singly linked list. For efficiency reasons, we choose the front of the queue to be at the head of the list, and the rear of the queue to be at the tail of the list. In this way, we remove from the head and insert at the tail. (Why would it be bad to insert at the head and remove at the tail?) Note that we need to maintain references to both the head and tail nodes of the list. Rather than go into every detail of this implementation, we simply give a Java implementation for the fundamental queue methods in Code Fragment 5.15. Code Fragment 5.15: Methods enqueue and dequeue in the implementation of the queue ADT by means of a singly linked list, using nodes from class Node of Code Fragment 5.6. 294 Each of the methods of the singly linked list implementation of the queue ADT runs in O(1) time. We also avoid the need to specify a maximum size for the queue, as was done in the array-based queue implementation, but this benefit comes at the expense of increasing the amount of space used per element. Still, the methods in the singly linked list queue implementation are more complicated than we might like, for we must take extra care in how we deal with special cases where the queue is empty before an enqueue or where the queue becomes empty after a dequeue. 5.2.4 Round Robin Schedulers A popular use of the queue data structure is to implement a round robin scheduler, where we iterate through a collection of elements in a circular fashion and "service" each element by performing a given action on it. Such a schedule is used, for example, to fairly allocate a resource that must be shared by a collection of clients. 295 For instance, we can use a round robin scheduler to allocate a slice of CPU time to various applications running concurrently on a computer. We can implement a round robin scheduler using a queue, Q, by repeatedly performing the following steps (see Figure 5.5): 1. e ← Q.dequeue() 2. Service element e 3. Q.enqueue(e) Figure 5.5: The three iterative steps for using a queue to implement a round robin scheduler. The Josephus Problem In the children's game "hot potato," a group of n children sit in a circle passing an object, called the "potato," around the circle. The potato begins with a starting child in the circle, and the children continue passing the potato until a leader rings a bell, at which point the child holding the potato must leave the game after handing the potato to the next child in the circle. After the selected child leaves, the other children close up the circle. This process is then continued until there is only one child remaining, who is declared the winner. If the leader always uses the strategy of ringing the bell after the potato has been passed k times, for some fixed value k, then determining the winner for a given list of children is known as the Josephus problem. Solving the Josephus Problem Using a Queue We can solve the Josephus problem for a collection of n elements using a queue, by associating the potato with the element at the front of the queue and storing elements in the queue according to their order around the circle. Thus, passing the 296 potato is equivalent to dequeuing an element and immediately enqueuing it again. After this process has been performed k times, we remove the front element by dequeuing it from the queue and discarding it. We show a complete Java program for solving the Josephus problem using this approach in Code Fragment 5.16, which describes a solution that runs in O(nk) time. (We can solve this problem faster using techniques beyond the scope of this book.) Code Fragment 5.16: A complete Java program for solving the Josephus problem using a queue. Class NodeQueue is shown in Code Fragment 5.15. 297 5.3 Double-Ended Queues Consider now a queue-like data structure that supports insertion and deletion at both the front and the rear of the queue. Such an extension of a queue is called a double- ended queue, or deque, which is usually pronounced "deck" to avoid confusion with the dequeue method of the regular queue ADT, which is pronounced like the abbreviation "D.Q." 5.3.1 The Deque Abstract Data Type The deque abstract data type is richer than both the stack and the queue ADTs. The fundamental methods of the deque ADT are as follows: addFirst(e): Insert a new element e at the beginning of the deque. addLast(e): Insert a new element e at the end of the deque. removeFirst(): Remove and return the first element of the deque; an error occurs if the deque is empty. removeLast(): Remove and return the last element of the deque; an error occurs if the deque is empty. Additionally, the deque ADT may also include the following support methods: getFirst(): Return the first element of the deque; an error occurs if the deque is empty. getLast(): Return the last element of the deque; an error occurs if the deque is empty. size(): Return the number of elements of the deque. isEmpty(): Determine if the deque is empty. Example 5.5: The following table shows a series of operations and their effects on an initially empty deque D of integer objects. For simplicity, we use integers instead of integer objects as arguments of the operations. Operation Output D addFirst(3) 298 - (3) addFirst(5) - (5,3) removeFirst() 5 (3) addLast(7) - (3,7) removeFirst() 3 (7) removeLast() 7 () removeFirst() "error" () isEmpty() true () 5.3.2 Implementing a Deque 299 Since the deque requires insertion and removal at both ends of a list, using a singly linked list to implement a deque would be inefficient. We can use a doubly linked list, however, to implement a deque efficiently. As discussed in Section 3.3, inserting or removing elements at either end of a doubly linked list is straightforward to do in O(1) time, if we use sentinel nodes for the header and trailer. For an insertion of a new element e, we can have access to the node p before the place e should go and the node q after the place e should go. To insert a new element between the two nodes p and q (either or both of which could be sentinels), we create a new node t, have t's prev and next links respectively refer to p and q, and then have p's next link refer to t, and have q's prev link refer to t. Likewise, to remove an element stored at a node t, we can access the nodes p and q on either side of t (and these nodes must exist, since we are using sentinels). To remove node t between nodes p and q, we simply have p and q point to each other instead of t. We need not change any of the fields in t, for now t can be reclaimed by the garbage collector, since no one is pointing to t. Table 5.3 shows the running times of methods for a deque implemented with a doubly linked list. Note that every method runs in O(1) time. Table 5.3: Performance of a deque realized by a doubly linked list. Method Time size, isEmpty O(1) getFirst, getLast O(1) add First, addLast O(1) removeFirst, removeLast O(1) 300 Thus, a doubly linked list can be used to implement each method of the deque ADT in constant time. We leave the details of implementing the deque ADT efficiently in Java as an exercise (P-5.7). Incidentally, all of the methods of the deque ADT, as described above, are included in the java.util.LinkedList class. So, if we need to use a deque and would rather not implement one from scratch, we can simply use the built-in java.util.LinkedList class. In any case, we show a Deque interface in Code Fragment 5.17 and an implementation of this interface in Code Fragment 5.18. Code Fragment 5.17: Interface Deque documented with comments in Javadoc style (Section 1.9.3). Note also the use of the generic parameterized type, E, which implies that a deque can contain elements of any specified class. 301 302 Code Fragment 5.18: Class NodeDeque implementing the Deque interface, except that we have not shown the class DLNode, which is a generic doubly linked list node, nor have we shown methods getLast, addLast, and removeFirst. 303 304 5.4 Exercises For source code and help with exercises, please visit java.datastructures.net. Reinforcement R-5.1 Suppose an initially empty stack S has performed a total of 25 push operations, 12 top operations, and 10 pop operations, 3 of which generated StackEmptyExceptions, which were caught and ignored. What is the current size of S? R-5.2 If we implemented the stack S from the previous problem with an array, as described in this chapter, then what is the current value of the top instance variable? R-5.3 Describe the output of the following series of stack operations: push(5), push(3), pop(), push(2), push(8), pop(), pop(), push(9), push(1), pop(), push(7), push(6), pop(), pop(), push(4), pop(), pop(). R-5.4 Give a recursive method for removing all the elements in a stack. R-5.5 Give a precise and complete definition of the concept of matching for grouping symbols in an arithmetic expression. R-5.6 Describe the output for the following sequence of queue operations: enqueue(5), enqueue(3), dequeue(), enqueue(2), enqueue(8), dequeue(), dequeue(), enqueue(9), enqueue(1), dequeue(), enqueue(7), enqueue(6), dequeue(), dequeue(), enqueue(4), dequeue(), dequeue(). R-5.7 305 Suppose an initially-empty queue Q has performed a total of 32 enqueue operations, 10 front operations, and 15 dequeue operations, 5 of which generated QueueEmptyExceptions, which were caught and ignored. What is the current size of Q? R-5.8 If the queue of the previous problem was implemented with an array of capacity N = 30, as described in the chapter, and it never generated a FullQueueException, what would be the current values of f and r? R-5.9 Describe the output for the following sequence of deque ADT operations: addFirst(3), addLast(8), addLast(9), addFirst(5), removeFirst(), remove-Last(), first(), addLast(7), removeFirst(), last(), removeLast(). R-5.10 Suppose you have a deque D containing the numbers (1,2,3,4,5,6,7,8), in this order. Suppose further that you have an initially empty queue Q. Give a pseudo- code description of a method that uses only D and Q (and no other variables or objects) and results in D storing the elements (1,2,3,5,4,6,7,8), in this order. R-5.11 Repeat the previous problem using the deque D and an initially empty stack S. Creativity C-5.1 Suppose you have a stack S containing n elements and a queue Q that is initially empty. Describe how you can use Q to scan S to see if it contains a certain element x, with the additional constraint that your algorithm must return the elements back to S in their original order. You may not use an array or linked list—only S and Q and a constant number of reference variables. C-5.2 Give a pseudo-code description for an array-based implementation of the double-ended queue ADT. What is the running time for each operation? C-5.3 306 Suppose Alice has picked three distinct integers and placed them into a stack S in random order. Write a short, straightline piece of pseudo-code (with no loops or recursion) that uses only one comparison and only one variable x, yet guarantees with probability 2/3 that at the end of this code the variable x will store the largest of Alice's three integers. Argue why your method is correct. C-5.4 Describe how to implement the stack ADT using two queues. What is the running time of the push() and pop() methods in this case? C-5.5 Show how to use a stack S and a queue Q to generate all possible subsets of an n-element set T nonrecursively. C-5.6 Suppose we have an n × n two-dimensional array A that we want to use to store integers, but we don't want to spend the O(n2) work to initialize it to all 0's (the way Java does), because we know in advance that we are only going to use up to n of these cells in our algorithm, which itself runs in O(n) time (not counting the time to initialize A). Show how to use an array-based stack S storing (i, j, k) integer triples to allow us to use the array A without initializing it and still implement our algorithm in O(n) time, even though the initial values in the cells of A might be total garbage. C-5.7 Describe a nonrecursive algorithm for enumerating all permutations of the numbers {1,2,…,n}. C-5.8 Postfix notation is an unambiguous way of writing an arithmetic expression without parentheses. It is defined so that if "(exp1 )op(exp2)" is a normal fully parenthesized expression whose operation is op, then the postfix version of this is "pexp1 pexp2 op", where pexp1 is the postfix version of exp1 and pexp2 is the postfix version of exp an 2. The postfix version of a single number or variable is just that number or variable. So, for example, the postfix version of "((5 + 2) * (8 − 3))/4" is "5 2 + 8 3 − * 4 /". Describe a nonrecursive way of evaluating expression in postfix notation. C-5.9 Suppose you have two nonempty stacks S and T and a deque D. Describe how to use D so that S stores all the elements of T below all of its original elements, with both sets of elements still in their original order. 307 C-5.10 Alice has three array-based stacks, A, B, and C, such that A has capacity 100, B has capacity 5, and C has capacity 3. Initially, A is full, and B and C are empty. Unfortunately, the person who programmed the class for these stacks made the push and pop methods private. The only method Alice can use is a static method, transfer(S,T), which transfers (by itera-tively applying the private pop and push methods) elements from stack S to stack T until either S becomes empty or T becomes full. So, for example, starting from our initial configuration and performing transfer(A, C) results in A now holding 97 elements and C holding 3. Describe a sequence of transfer operations that starts from the initial configuration and results in B holding 4 elements at the end. C-5.11 Alice has two queues, S and T, which can store integers. Bob gives Alice 50 odd integers and 50 even integers and insists that she stores all 100 integers in S and T. They then play a game where Bob picks S or T at random and then applies the round-robin scheduler, described in the chapter, to the chosen queue a random number of times. If the number left out of the queue at the end of this game is odd, Bob wins. Otherwise, Alice wins. How can Alice allocate integers to queues to optimize her chances of winning? What is her chance of winning? C-5.12 Suppose Bob has four cows that he wants to take across a bridge, but only one yoke, which can hold up to two cows, side by side, tied to the yoke. The yoke is too heavy for him to carry across the bridge, but he can tie (and untie) cows to it in no time at all. Of his four cows, Mazie can cross the bridge in 2 minutes, Daisy can cross it in 4 minutes, Crazy can cross it in 10 minutes, and Lazy can cross it in 20 minutes. Of course, when two cows are tied to the yoke, they must go at the speed of the slower cow. Describe how Bob can get all his cows across the bridge in 34 minutes. Projects P-5.1 Implement the stack ADT with a doubly linked list. P-5.2 Implement the stack ADT using the Java ArrayList class (without using the built-in Java Stack class). P-5.3 308 Implement a program that can input an expression in postfix notation (see Exercise C-5.8) and output its value. P-5.4 Implement the queue ADT using an array. P-5.5 Implement the entire queue ADT using a singly linked list. P-5.6 Design an ADT for a two-color, double-stack ADT that consists of two stacks— one "red" and one "blue"—and has as its operations color-coded versions of the regular stack ADT operations. For example, this ADT should allow for both a red push operation and a blue push operation. Give an efficient implementation of this ADT using a single array whose capacity is set at some value N that is assumed to always be larger than the sizes of the red and blue stacks combined. P-5.7 Implement the deque ADT with a doubly linked list. P-5.8 Implement the deque ADT with an array used in a circular fashion. P-5.9 Implement the Stack and Queue interfaces with a unique class that extends class NodeDeque (Code Fragment 5.18). P-5.10 When a share of common stock of some company is sold, the capital gain (or, sometimes, loss) is the difference between the share's selling price and the price originally paid to buy it. This rule is easy to understand for a single share, but if we sell multiple shares of stock bought over a long period of time, then we must identify the shares actually being sold. A standard accounting principle for identifying which shares of a stock were sold in such a case is to use a FIFO protocol—the shares sold are the ones that have been held the longest (indeed, this is the default method built into several personal finance software packages). For example, suppose we buy 100 shares at $20 each on day 1, 20 shares at $24 on day 2, 200 shares at $36 on day 3, and then sell 150 shares on day 4 at $30 each. Then applying the FIFO protocol means that of the 150 shares sold, 100 were bought on day 1, 20 were bought on day 2, and 30 were bought on day 3. The capital gain in this case would therefore be 100 · 10 + 20 · 6 + 30 · (−6), or 309 $940. Write a program that takes as input a sequence of transactions of the form "buy x; share(s) at $y each" or "sell x share(s) at $y each," assuming that the transactions occur on consecutive days and the values x and y are integers. Given this input sequence, the output should be the total capital gain (or loss) for the entire sequence, using the FIFO protocol to identify shares. Chapter Notes We were introduced to the approach of defining data structures first in terms of their ADTs and then in terms of concrete implementations by the classic books by Aho, Hopcroft, and Ullman [4, 5], which incidentally is where we first saw aproblem similar to Exercise C-5.6. Exercises C-5.10, C-5.11, and C-5.12 are similar to interview questions said to be from a well-known software company. For further study of abstract data types, see Liskov and Guttag [69], Cardelli and Wegner [20], or Demurjian [28]. Chapter 6 Lists and Iterators Contents 6.1 Array Lists ...................... 222 6.1.1 The Array List Abstract Data Type.......... 222 6.1.2 The Adapter Pattern................. 310 223 6.1.3 A Simple Array-Based Implementation........ 224 6.1.4 A Simple Interface and the java.util.ArrayList Class. 226 6.1.5 Implementing an Array List Using Extendable Arrays 227 6.2 Node Lists ...................... 231 6.2.1 Node-Based Operations................ 231 6.2.2 Positions........................ 232 6.2.3 The Node List Abstract Data Type.......... 232 6.2.4 Doubly Linked List Implementation.......... 236 311 6.3 Iterators ...................... 242 6.3.1 The Iterator and Iterable Abstract Data Types.... 242 6.3.2 The Java For-Each Loop............... 244 6.3.3 Implementing Iterators................ 245 6.3.4 List Iterators in Java................. 247 6.4 List ADTs and the Collections Framework ...................... 249 6.4.1 The Java Collections Framework........... 249 6.4.2 The java.util.LinkedList Class............. 250 6.4.3 312 Sequences....................... 251 6.5 Case Study: The Move-to-Front Heuristic ...................... 253 6.5.1 Using a Sorted List and a Nested Class....... 253 6.5.2 Using a List with the Move-to-Front Heuristic.... 256 6.5.3 Possible Uses of a Favorites List........... 257 6.6 Exercises ...................... 260 java.datastructures.net 6.1 Array Lists Suppose we have a collection S of n elements stored in a certain linear order, so that we can refer to the elements in S as first, second, third, and so on. Such a collection is generically referred to as a list or sequence. We can uniquely refer to each element e in S using an integer in the range [0,n − 1] that is equal to the number of elements of S that precede e in S. The index of an element e in S is the number of elements that are before e in S. Hence, the first element in S has index 0 and the last element has index n − 1. Also, if an element of S has index i, its previous element (if it exists) has index i − 1, and its next element (if it exists) has index i + 1. This concept of index is related 313 to that of the rank of an element in a list, which is usually defined to be one more than its index; so the first element is at rank 1, the second is at rank 2, and so on. A sequence that supports access to its elements by their indices is called an array list (or vector, using an older term). Since our index definition is more consistent with the way arrays are indexed in Java and other programming languages (such as C and C++), we will be referring to the place where an element is stored in an array list as its "index," not its "rank" (although we may use r to denote this index, if the letter "i" is being used as a for-loop counter). This index concept is a simple yet powerful notion, since it can be used to specify where to insert a new element into a list or where to remove an old element. 6.1.1 The Array List Abstract Data Type As an ADT, an array list S has the following methods (besides the standard size() and isEmpty() methods): get(i): Return the element of S with index i; an error condition occurs if i size() − 1. set(i, e): Replace with e and return the element at index i; an error condition occurs if i size() − 1. add(i, e): Insert a new element e into S to have index i; an error condition occurs if i size(). remove(i): Remove from S the element at index i; an error condition occurs if i size() − 1. We do not insist that an array should be used to implement an array list, so that the element at index 0 is stored at index 0 in the array, although that is one (very natural) possibility. The index definition offers us a way to refer to the "place" where an element is stored in a sequence without having to worry about the exact implementation of that sequence. The index of an element may change whenever the sequence is updated, however, as we illustrate in the following example. Example 6.1: We show below some operations on an initially empty array list S. Operation Output S add(0,7) 314 - (7) add(0,4) - (4,7) get(1) 7 (4,7) add(2,2) - (4,7,2) get(3) "error" (4,7,2) remove(1) 7 (4,2) add(1,5) - (4,5,2) add(1,3) - (4,3,5,2) add(4,9) - 315 (4,3,5,2,9) get(2) 5 (4,3,5,2,9) set(3,8) 2 (4,3,5,8,9) 6.1.2 The Adapter Pattern Classes are often written to provide similar functionality to other classes. The adapter design pattern applies to any context where we want to modify an existing class so that its methods match those of a related, but different, class or interface. One general way for applying the adapter pattern is to define the new class in such a way that it contains an instance of the old class as a hidden field, and implement each method of the new class using methods of this hidden instance variable. The result of applying the adapter pattern is that a new class that performs almost the same functions as a previous class, but in a more convenient way, has been created. With respect to our discussion of the array list ADT, we note that this ADT is sufficient to define an adapter class for the deque ADT, as shown in Table 6.1. (See also Exercise C-6.8.) Table 6.1: Realization of a deque by means of an array list. Deque Method Realization with Array-List Methods size(), isEmpty() size(), isEmpty() getFirst() get(0) getLast() 316 get(size() −1) addFirst(e) add(0,e) addLast(e) add(size(),e) removeFirst() remove(0) removeLast() remove(size() − 1) 6.1.3 A Simple Array-Based Implementation An obvious choice for implementing the array list ADT is to use an array A, where A[i] stores (a reference to) the element with index i. We choose the size N of array A sufficiently large, and we maintain the number of elements in an instance variable, n < N. The details of this implementation of the array list ADT are simple. To implement the get(i) operation, for example, we just return A[i]. Implementations of methods add(i, e) and remove(i) are given in Code Fragment 6.1. An important (and time-consuming) part of this implementation involves the shifting of elements up or down to keep the occupied cells in the array contiguous. These shifting operations are required to maintain our rule of always storing an element whose list index is i at index i in the array A. (See Figure 6.1 and also Exercise R- 6.12.) Code Fragment 6.1: Methods add(i, e) and remove(i) in the array implementation of the array list ADT. We denote, with n, the instance variable storing the number of elements in the array list. 317 Figure 6.1: Array-based implementation of an array list S that is storing n elements: (a) shifting up for an insertion at index i(b); shifting down for a removal at index i The Performance of a Simple Array-Based Implementation Table 6.2 shows the worst-case running times of the methods of an array list with n elements realized by means of an array. Methods isEmpty, size, get and set clearly run in O(1) time, but the insertion and removal methods can take much longer than this. In particular, add(i, e) runs in time O(n). Indeed, the worst case for this operation occurs when i = 0, since all the existing n elements 318 have to be shifted forward. A similar argument applies to method remove(i), which runs in O(n) time, because we have to shift backward n − 1 elements in the worst case (i = 0). In fact, assuming that each possible index is equally likely to be passed as an argument to these operations, their average running time is O(n), for we will have to shift n/2 elements on average. Table 6.2: Performance of an array list with n elements realized by an array. The space usage is O(N), where N is the size of the array. Method Time size() O(1) isEmpty() O(1) get(i) O(1) set(i, e) O(1) add(i, e) O(n) remove(i) O(n) Looking more closely at add(i, e) and remove(i), we note that they each run in time O(n − i + 1), for only those elements at index i and higher have to be shifted up or down. Thus, inserting or removing an item at the end of an array list, using the methods add(n, e) and remove(n − 1), respectively take O(1) time each. Moreover, this observation has an interesting consequence for the adaptation of the array list ADT to the deque ADT given in Section 6.1.1. If the array list ADT in this case is implemented by means of an array as described above, then methods addLast and removeLast of the deque each run in O(1) 319 time. However, methods addFirst and removeFirst of the deque each run in O(n) time. Actually, with a little effort, we can produce an array-based implementation of the array list ADT that achieves O(1) time for insertions and removals at index 0, as well as insertions and removals at the end of the array list. Achieving this requires that we give up on our rule that an element at index i is stored in the array at index i, however, as we would have to use a circular array approach like the one we used in Section 5.2 to implement a queue. We leave the details of this implementation for an exercise (C-6.9). 6.1.4 A Simple Interface and the java. util. ArrayList Class To prepare for constructing a Java implementation of the array list ADT, we show, in Code Fragment 6.2, a Java interface, IndexList, that captures the main methods from the array list ADT. In this case, we use a IndexOutOfBoundsException to signal an invalid index argument. Code Fragment 6.2: The IndexList interface for the array list ADT. The java.util.ArrayList Class Java provides a class, java.util.ArrayList, that implements all the methods that we give above for our array list ADT. That is, it includes all of the methods included in Code Fragment 6.2 for the IndexList interface. 320 Moreover, the java.util.ArrayList class has features in addition to those of our simplified array list ADT. For example, the class java.util.ArrayList also includes a method, clear(), which removes all the elements from the array list, and a method, toArray(), which returns an array containing all the elements of the array list in the same order. In addition, the class java.util.ArrayList also has methods for searching the list, including a method indexOf(e), which returns the index of the first occurrence of an element equal to e in the array list, and a method lastIndexOf(e), which returns the index of the last occurrence of an element equal to e in the array list. Both of these methods return the (invalid) index value − 1 if an element equal to e is not found. 6.1.5 Implementing an Array List Using Extendable Arrays In addition to implementing the methods of the IndexList interface (and some other useful methods), the class java.util.ArrayList provides an an interesting feature that overcomes a weakness in the simple array implementation. Specifically, a major weakness of the simple array implementation for the array list ADT given in Section 6.1.3, is that it requires advance specification of a fixed capacity, N, for the total number of elements that may be stored in the array list. If the actual number of elements, n, of the array list is much smaller than N, then this implementation will waste space. Worse, if n increases past N, then this implementation will crash. Instead, the java.util.ArrayList uses an interesting extendable-array technique so that we never have to worry about array overflows when using this class. As with the java.util.ArrayList class, let us provide a means to grow the array A that stores the elements of an array list S. Of course, in Java (and other programming languages), we cannot actually grow the array A; its capacity is fixed at some number N, as we have already observed. Instead, when an overflow occurs, that is, when n = N and we make a call to the method add, we perform the following additional steps: 1. Allocate a new array B of capacity 2N 2. Let B[i ]← A[i], for i = 0,... , N − 1 3. Let A ← B, that is, we use B as the array supporting S 4. Insert the new element in A. 321 This array replacement strategy is known as an extendable array, for it can be viewed as extending the end of the underlying array to make room for more elements. (See Figure 6.2.) Intuitively, this strategy is much like that of the hermit crab, which moves into a larger shell when it outgrows its previous one. Figure 6.2: An illustration of the three steps for "growing" an extendable array: (a) create new array B; (b) copy elements from A to B; (c) reassign reference A to the new array. Not shown is the future garbage collection of the old array. Implementing the IndexList Interface with an Extendable Array We give portions of a Java implementation of the array list ADT using an extendable array in Code Fragment 6.3. This class only provides means for the array to grow. Exercise C-6.2 explores an implementation that can also shrink. Code Fragment 6.3: Portions of class ArrayIndexList realizing the array list ADT by means of an extendable array. Method checkIndex(r, n) (not shown) checks whether an index r is in the range [0, n − 1]. 322 An Amortized Analysis of Extendable Arrays This array replacement strategy might at first seem slow, for performing a single array replacement required by some element insertion can take O(n) time. Still, 323 notice that after we perform an array replacement, our new array allows us to add n new elements to the array list before the array must be replaced again. This simple fact allows us to show that performing a series of operations on an initially empty array list is actually quite efficient. As a shorthand notation, let us refer to the insertion of an element to be the last element in an array list as a push operation. (See Figure 6.3.) Figure 6.3: Running times of a series of push operations on a java.util.ArrayList of initial size 1. Using an algorithmic design pattern called amortization, we can show that performing a sequence of such push operations on an array list implemented with an extendable array is actually quite efficient. To perform an amortized analysis, we use an accounting technique where we view the computer as a coin-operated appliance that requires the payment of one cyber-dollar for a constant amount of computing time. When an operation is executed, we should have enough cyber- dollars available in our current "bank account" to pay for that operation's running time. Thus, the total amount of cyber-dollars spent for any computation will be proportional to the total time spent on that computation. The beauty of using this analysis method is that we can overcharge some operations in order to save up cyber-dollars to pay for others. 324 Proposition 6.2: Let S be an array list implemented by means of an extendable array with initial length one. The total time to perform a series of n push operations in S, starting from S being empty is O(n). Justification: Let us assume that one cyber-dollar is enough to pay for the execution of each push operation in S, excluding the time spent for growing the array. Also, let us assume that growing the array from size k to size 2k requires k cyber-dollars for the time spent copying the elements. We shall charge each push operation three cyber-dollars. Thus, we overcharge each push operation that does not cause an overflow by two cyber-dollars. Think of the two cyber-dollars profited in an insertion that does not grow the array as being "stored" at the element inserted. An overflow occurs when the array list S has 2i elements, for some integer i ≥ 0, and the size of the array used by the array list representing S is 2i. Thus, doubling the size of the array will require 2i cyber-dollars. Fortunately, these cyber-dollars can be found at the elements stored in cells 2i−1 through 2i − 1. (See Figure 6.4.) Note that t

Các file đính kèm theo tài liệu này:

Data structures algorithms in java.pdf