Algorithms Part 1

4 - 1 - Stacks

Welcome back. Today we're going to talk about algorithms and data structures for implementing some fundamental data types called bags, queues and stacks. You maybe somewhat familiar with these, but today we're going to take a careful and close look at them. The idea is that in many applications, we have collections of objects that we want to maintain. And the operations are very simple. We want to add something to the collection, maybe remove something from the collection and iterate through the objects in the collection performing some operation on them, and of course, test if it's empty. Now for most of these, the intent is very clear. The key is when it comes to removing an item, which item do we remove? The two fundamental classic data structures for this, the stack and the queue differ in the way in which the item to be removed is chosen. For the stack, we take out the item that was most recently added for, the terminology that we used is push to insert an item and pop to remove the item most recently added. That's also called the LIFO discipline last-in-first-out. For queue, we examine the item least recently added and those operations to distinguish them we call inqueue to insert an item and dequeue to remove an item and that's also called the FIFO discipline, first in, first out. So now we're going to take a look today on how to implement these things. Our subtext today is all about modular programming. And that's going to be a discipline that we're going to follow up carefully throughout this course. The idea is to completely separate the interface and the implementation. So, when we have these types of data structures and data types that are precisely defined like stacks and queues and so forth, what we want to do is completely separate the details of the implementation from the client. The client has, can have many different implementations from which to choose but the client code should only perform the basic operations. The implementation on the other hand, can't know the details of the client needs, all it's supposed to do is implement those operations. In that way, many clients can reuse the same implementation. So this allows us to create modular reusable libraries of algorithms and data structures that we can use to build more complicated algorithms and data structures. It also allows us to focus on performance when appropriate. Again, this is a modular programming style that's enabled by object oriented programming languages such as Java and we'll be very disciplined in our use of this style. Alright. So to begin, we'll talk about the stacks. [cough] Stacks are familiar, many of you probably implemented stacks in an introductory programming course but we'll do a thorough introduction to implementations right now. As a warm up, let's suppose that we have string, a collection of strings. They might be short, they might be long and what we want to have is the ability to save away a collection of strings and remove and return the most recently added string periodically, and also test if it's empty. So that's our API. We have a constructor to create an empty stack, we have for insert and we have a method called push that takes a string as argument. And for remove, we have a method pop that returns to the string most recently added. And we have these empty test which returns a Boolean. Also in some applications, we would include the size as well. So again, as always, we'll first read a client and then look at implementations and our client, simple client is to take some strings on standard input and some pop commands which are indicated with hyphens. And so, it'll this client reads strings from standard input. If the string is equal to the hyphened character, it'll pop the string at the top of the stack and print it. Otherwise, if it's a string that's not equal to the hyphen character, it'll just push it on to the stack. So in the example down below here if we have this file called tobe.txt then what we'll, what the client will do is push to be or not to all in the stack then when it comes to this hyphen it'll pop the most recently inserted item which is two in this case then it'll put b in the top of the stack and then pop the top item on the stack which is now b and then pop the item most recently added, b is gone, two is gone so the next is not and so forth. So, this is a simple test client that we can use to test our implementations. So now, let's look at the code for implementing a stack. Now, the first implementation that we'll look at, uses link list. If you're not familiar with the link list you'll need to review that in section 1.3, 1.3 at the book or in our Introduction To Programming Java book. Even if you are familiar with link list, it's worth taking a look at this code because it's the style of coding that we'll use throughout the coarse for much more complicated data structures. So the idea is to keep a link list where which is consists of nodes that have strings in them and references to the next item in the link list and to, to implement a stack when we do a, a push operation, we insert a new node at the beginning of the link list and we do a pop operation where we move the first node from the beginning of the link list, that's the most recently added items. So, let's look at what that code looks like. We use to implement link list in all linked data structures through out the course. We use what's called an inner class in Java and that's just a way to describe that we're going to be manipulating node objects that consist, each consist of a string and a reference to another node. So, the pop operation for link list is very easy to implement. [cough] first, we, we're going to need to return the first item on the list so we save that away. Take first that item and save that in the variable item. A h, then, to get rid on the first node, we just advance our pointer to the first item on the list to point two of the next item and then that first node is ready to be reclaimed by the garbage collector. And then, the last thing we need to do is just return the item that we saved away. Okay, so that's the pop operation. What about the push operation? [cough] push operation, we want to add a new node at the beginning of the link list. So, first thing we do is save a way the pointer to the beginning of the list. That's a little first thing first. Then we'll create a new node, that's going to be the new node that we put at the beginning of the list, that's first equals new node. And then we said it's instance variables. It's items is the string that we want to put at the beginning of the list, in this case, not. And it's next is the old first item of the list which is now the second item of the list. So, after this operation, we are first pointing to the beginning of the list and we have the items on the list in decreasing order of when they were put on to the stack. So that also is a four liner to implement the stack push operation. So this is a complete link list implementation of all the code to implement a link list for a stack of strings in Java. It's, it's a class the constructor doesn't have to do anything, there's no constructor. We have this inner class that we use to build the items in the link list and we make them an inner class so we can directly refer to those instance variables. And then the only instance variable of a stack is a reference to the first node on, on the list and that starts out being null. Then it's empty is just testing whether the first node on the list is null and then push is the four lines of code that I gave on the previous slide and pop is the three lines of code that I gave on the slide before that. Tha t's a complete implementation for the link list that'll work with as a fine push down stack implementation for any client. So now we can analyze the performance of that so that we can provide clients with information and how well the algorithm data structure will perform. In this case, it's easy to see that every operation takes constant time in the worst case. There is only a few instructions for each one of the operations, there's no loops. So that's obviously a very desirable characteristic. Then how about space units, usage? Well, that depends very much on the implementation in machines so this a typical Java implementation that we'll do the analysis for and contest this out for different types of environments easily in this representative. So, in Java, an inter class there's for every object there is sixteen bytes of over head. There are some extra over head, eight bytes because that's an inter class and then there is two references that we built in our, in, in our class node. One to string and another one to a node and those are each eight bytes. So, we have 40 bytes per stack node, if we have a stack of size n, we have about 40 n bytes. That's a little extra first but that's about an overhead for the whole stack but when n is large, 40n is a very close estimate to the amount of space needed. This does not include the space for the strings themselves which are owned by the client. But with that, we can properly asses the research usage of this implementation for different client programs. Now it's constant time but there's faster implementations of stack and since stack is used inner loop of some algorithms it's important to think about even faster implementations. And another, natural way to implement a stack is to use an array to store the items on a stack so let's take a look at that. This alternative of choosing between length structures and arrays is fundamental and it's go ing to come up again and again when we consider more complicated data structures in algorithms. So, we want to be sure to analyze it in the simple case for stacks to set the stage for more complicated applications later on. Alright, so the use in array we just keep the n items on the stack in the array and the array location within the n is the place the top of the stack where the next item is going to go. So, to push we just add a new item at s(n) into pop we remove the item that's at s(n) - one and decrement n. Now there is a fundamental defect in using an array and that is that you have to declare the size of array ahead of time and so the stack has a certain capacity. And if there is more items on the stack than the capacity we have to deal with that problem and that's a fundamental problem that we have to deal with in array implementations in all sorts of algorithms and data structures. So again, considering it for the simple case we'll pay off later on. Alright, so here's the full implementation of stack for using an array to represent the stack. So now we have an instance variable which is an array of strings and or variable n which is both the size of the stack and the index of the next position, next open position on the stack. This one has a constructor and the constructor creates the array. Now, we are cheating in this implementation to keep it simple and we'll take care of this cheat in a little while by requiring the client to provide the capacity of a stack. In a few applications this might be fine but in many, many applications that's two owners are requirement and client really can't know how big the stack is. Client might have a lot of stacks that need to be maintained simultaneously and then maybe they reached their maximum capacities at different times and various other things. So, we need to remove this cheat and will. But the code is nearly trivial. If we have the capacity to check if it's empty we check if n is zero. To push an item we use n to index into the array put the item there and then increment n, that's the short cut in many programming languages nowadays for use the index and then increment it. And to pop we decrement the index and then use it to return the item in the array. So each of the operations is a one liner and this is a fine implementation for some clients. That's array implementation of stack but it breaks the API by requiring the client to provide the capacity. So what are we going to do about that? Well, there are a couple of things that we didn't consider. We didn't put in a code to throw an exception if the client pops from an empty stack. Probably should do that and for overflow, what happens when the client does too much well, we're going to talk about an approach called resizing that will allow us to avoid overflow for clients. There's another issue about whether clients can insert null items into the data structure. In this case, we do allow the client to insert null items but we do have to worry about in Java about a problem called loitering and that is the idea that we have references to an object in our array implementation and the stack array and we are not really using it. So, when we decrement that value in, there's still a pointer to the thing that we took off the stack in that array even though we know we're not using it. The Java system doesn't know that. So to avoid that and really allow most efficient use of memory it's best to set that. [cough] removed item entry to null so there's no reference to the old item left there and then the garbage collector can reclaim the memory since there's no outstanding references. So that's a, a detailed but an important one that we have to take care of and or implementations to make sure that we're getting most efficien t use of memory.