CS225 Graph Search Project Unordered Data Structures — Week 3

This project covers content from both Weeks 3 and 4, dealing with graph representations and graph search problems. You might expect to work on it over the course of two weeks, but there is less code for you to write than there is provided code for you to study. Hints about how to do the exercises are given throughout this document as well as in the provided code. We’ll use an “adjacency list” graph data structure, similar to what was described in lecture, but rather than using pointers, we’ll use value copies of simple objects stored in STL containers. Our GridGraph class uses a combination of std::unordered_map, std::unordered_set, and std::pair to achieve this in a convenient way. Make sure you’ve finished the Week 1 project on std::unordered_map so that you’re ready. In this document we’ll discuss the background for some of the concepts in the project; then you will try implementing some of the graph internals yourself and running a breadth-first search (BFS) on a GridGraph. Lastly, we’ll illustrate how “graph modeling” can be used to solve an applied problem.

2 Getting Started

2.1 C++ Environment Setup

If you are new to C++, we recommend checking out the first MOOC in this course sequence, which has more information on getting your C++ programming environment set up. In particular, we show how to set up AWS Cloud9 to do your coding entirely through your browser, but it’s also possible to do this assignment on your own computer in Windows (with WSL Bash), in macOS (with some additional tools installed for the terminal), or in Linux natively. Cloud9 itself runs a version of Linux.

In the Week 3 Programming Assignment item on Coursera, where you found this PDF, there is a download link for the starter files: graph_search_starter_files.zip. If you are using a web IDE like Cloud9, you should still download the zip file locally to your computer first, then upload it to your Cloud9 workspace. With the default Cloud9 workspace configuration, make sure you’ve updated the compiler first, by typing this command in the terminal:

sudo yum install -y gcc72 gcc72-c++

(If you forget to do that, you may see a compiler error message that C++14 is not supported.) Now, in the Cloud9 interface, click File > Upload Local Files, and select your local copy of the starter zip file to upload it to your workspace. After that, you’ll see that the zip file appears in the environment file listing on the left of the screen.

Fig. 1: Updating the compiler in the Cloud9 terminal as described above.

Next, you’ll want to extract the contents of the zip file to a new directory. Your terminal on Cloud9 already has commands for this: zip and unzip. First, type ls in the terminal and make sure you see the zip file in the current directory listing. If not, then as described in the earlier readings, you should use the cd command to change directory until you see the file in the same directory. You can use the file listing on the left side of the window to help you see where you are.
The pwd command (“print working directory”) also reports where your terminal is currently browsing in the file system. Once in the same directory as the zip file, enter the following command to extract the contents of the file, which contains a subdirectory named graph_search. (Be careful: if you already have a subdirectory called graph_search, this may overwrite files in that directory!)

unzip graph_search_starter_files.zip

After you extract the code files, you should be able to see the filenames in the file list pane on the left, and double-clicking a filename will load that file in the code editor.

2.2 Provided Files

Below we describe which files you are meant to edit and which ones you should read through for examples and hints.

2.2.1 Files that will be graded

This time there’s only one file you must edit, and the autograder on Coursera will discard changes that you make to any other files. (You can add unit tests to the other files mentioned in the next section, but the autograder will discard those changes.)

• GraphSearchExercises.cpp: There are three exercises to complete, which are described in the code file (as well as in this document you are reading now).

2.2.2 Files you should study

Among the other code files provided, some are meant for you to read through, even though you don’t need to edit them. They contain examples of syntax, hints, and comments about how things work.

• main.cpp: This defines the main program, which you can compile just by typing make at the terminal, as described later in this document. The program is set up by default to run some tests on your code and show some example output. You should read the provided code and try compiling and running the main program. Be sure to look through the terminal output for any warnings or errors. If you want, you can edit this file to add more tests and experiments of your own.
• GraphSearchCommon.h: This header contains some common includes needed by the other source files.
• IntPair2.h: This uses the STL to define the IntPair class for a pair of ints, as well as IntPairPair for a pair of pairs. It is similar to the version found in the Week 1 project, but a few things have been added.
• GridGraph.h, GridGraph.cpp: These define the GridGraph class that we use for a simple graph of 2D grid points connected by horizontal and vertical edges. The class uses IntPair as the vertex type representing a 2D point. STL containers are used to create the “adjacency list” structure, which is not actually a list in this case.
• PuzzleState.h, PuzzleState.cpp: The PuzzleState class allows us to represent a state of the “8 puzzle,” a sliding tile puzzle we will try to solve with BFS. PuzzleState does not rely on GridGraph.
• tests/week3_tests.cpp: This configures the test program you can compile with make test at the terminal, as described later in this document. This program runs some unit tests that have been defined with the Catch library. You don’t need to learn the Catch library syntax, but this file shows more information about what exactly the test program will check in your code, and the autograder on Coursera will run the same correctness tests. If you are interested in Catch, the documentation is here: https://github.com/catchorg/Catch2

2.2.3 Other files

There are some additional files in the provided package, although you don’t need to look at them. They’re part of the mechanical inner workings of the assignment. For example, there are files containing Makefile definitions for the GNU Make program to use when it launches the compiler, and there is the source code for the Catch unit testing library. You are welcome to look at this code, but we won’t discuss how it works.

2.3 Compiling and Testing Your Code

To compile the code, you must use the terminal to enter the same directory where the Makefile is stored; it’s in the same directory where main.cpp is located.
As explained in the readings, use cd to change to the appropriate directory, use ls to view the file list in that directory (just to make sure that you’re in the right place), and then type make to automatically start the compilation. If you need to clear out the files you’ve previously built, type make clean. If you encounter any warnings or errors during compilation, study the messages in the terminal carefully to figure out what might be wrong with your code.

If your compilation is successful, you’ll get an executable file simply called main in the same directory as main.cpp. You can try running it by typing ./main while you’re in that directory. As the compilation message suggests, you should also run the unit tests included in the test program. To compile the test program, just type make test at the same prompt. If this compilation is also successful, you’ll see a file named test appear in the directory. Then, you can run it by typing ./test.

As you work on your code, recompilation is faster if you don’t do make clean, but cleaning the previously-generated files can help to resolve some compilation errors, so here are some one-liner commands that will clean, recompile, and run:

• To recompile and run the main program: make clean && make && ./main
• To recompile and run the unit testing suite: make clean && make test && ./test

For this assignment, you can run the same test suite that the autograder will use on Coursera. In addition, you can choose to run only the specific unit tests for a given exercise with a command-line argument. For example, to run only the Exercise 1 unit tests, you can do this: ./test [ex1]

The ./main program doesn’t support this command-line argument, but you can toggle various tests on or off in the source code itself. Look in the definition of the informalTests function in main.cpp for details.
2.4 Submitting Your Work

When you’re ready to submit your work for final grading on Coursera, you’ll need to bundle it into a zip file in a specific way. To do that, type this command from your main subdirectory, where the Makefile is located: make zip

This will automatically package the files to be graded into a new zip file, graph_search_submission.zip, which will appear in the same directory. You can download the created zip file to your own computer from Cloud9 by right-clicking on it in your Cloud9 file list and choosing “Download.” Once you’ve got the zip file on your local computer, you can log into Coursera and submit it for this assignment. (If you need a reminder about how to use Cloud9 or how to submit work on Coursera, the first MOOC in this course sequence on Coursera has more detailed information about these topics in the lessons and homework assignments.)

3 Background Information

Here we discuss some of the topics, concepts, and syntax that the project touches upon. Much of this information is supplementary rather than essential, but please read it to get the best understanding of what’s going on in the project code.

3.1 Syntax for std::unordered_set

In this project, we use std::unordered_set in several places to represent collections of items having some shared property. A set contains value copies of objects, and it contains no duplicates; inserting the same item twice has no effect. In C++, the unordered set type uses hashing internally, so the key type must support equality comparison with the == operator as well as a std::hash specialization. (We discussed that in the Week 1 project about std::unordered_map.) Unlike the unordered map type, the keys of an unordered set have no associated values. We only care whether a given key has been inserted or not. This is a convenient alternative to “labeling” vertices with some status, for example.
If an algorithm tells you to label vertices as “visited,” you could instead make a set for all the visited vertices, and insert copies of vertices into the visited set when you want to record that they have been visited. For example, we could make a set to contain unique integers like this:

std::unordered_set<int> favoriteNumbers;

Then we can insert some items with insert():

favoriteNumbers.insert(7);
favoriteNumbers.insert(42);
// Inserting again doesn’t change anything:
favoriteNumbers.insert(42);

At this point, the set contains 7 and 42. (It only has a single copy of 42.) You can check the number of items with size() or empty(). We can use count() to query whether an item is in the set or not, and we can remove items from the set with erase():

std::cout << favoriteNumbers.count(7) << std::endl; // output: 1
favoriteNumbers.erase(7);
std::cout << favoriteNumbers.count(7) << std::endl; // output: 0

3.2 Other C++ set implementations

It’s worth mentioning that the hashed containers the STL offers may not be optimal in some cases. Perhaps you wish to maintain ordered data, or perhaps there isn’t a fast enough hash function for your needs. There is also an ordered type, std::set, which is based on a binary tree implementation instead of hashing, and consequently uses the < and == operators to arrange items instead of std::hash and ==. It has different efficiency properties, but it may be appropriate when you need to maintain data in order. It is also possible to use any of the other ordered STL containers, such as std::vector, as a general-purpose ordered set (as long as you maintain them in sorted order). There are also some utility functions in <algorithm> that let you compute set union and intersection efficiently.[1] In your real-world projects, you would want to profile the performance of several options and see what works best.
For other “set” data structures, like the disjoint-sets (or “union-find”) structure or geometric partitioning structures, which have specific use cases, there are no STL classes, but you can find robust open-source libraries for C++, or implement your own.

3.3 Initialization with type inference

As you examine the provided code, you may see objects being initialized in various ways. Remember that in C++11 and newer, there are compiler features that can infer (detect) the correct types automatically, which can help you write neater code. This was mentioned in the Unordered Map project, but now it’s much more useful. Our IntPair type, defined in IntPair2.h, is based on std::pair<int,int>. The lines below all construct the same IntPair. The last one uses an “initializer list” in braces, which can be used wherever the type can be understood correctly by the compiler. (This is not the same thing as the “member initializer” lists that can follow a colon : in a class constructor.)

IntPair point1 = IntPair(1,2);
IntPair point2 = std::make_pair(1,2);
auto point3 = IntPair(1,2);
IntPair point4 = {1,2};

However, if you’re not explicit enough, the compilation may fail. The compiler needs to be able to tell whether you mean IntPair, std::vector, or something else, based on the structure of the variable and how you use it. The following may or may not work, depending on the context of your code:

auto point5 = {1,2}; // Probably too vague

3.4 Implementing adjacency lists

For this project, we’ll use the “adjacency lists” model for a graph data structure. We can rely on STL containers to get a somewhat simpler implementation than the one that is explained thoroughly in the lectures.

[1] https://en.cppreference.com/w/cpp/algorithm/set_union
That is, instead of maintaining a linked list of pointers for each vertex, and a separate, shared lookup table of explicit edge records, we will use this scheme:

• Each vertex object has a unique key. We create a map where each vertex has a separate entry based on its own key.
• Each vertex is mapped to a set of other keys, one key for each adjacent vertex (connected by an edge, implicitly).

We can do this because we guarantee that there are no duplicate vertices, so keys are unique, and because the keys themselves are small, it’s efficient enough to store value copies of them instead of pointers to a shared storage structure elsewhere in memory. Our class GridGraph uses this internal representation. However, we’ll see in the last exercise that in some cases an even more implicit representation can be achieved, when we do not need to maintain records of all existing vertices and their adjacencies, but rather can calculate that information on demand.

3.5 The GridGraph class

Our GridGraph class is defined in GridGraph.h and GridGraph.cpp. A GridGraph represents an undirected graph on a 2D grid, where vertices are “points” with integer coordinates, each point represented by an IntPair. IntPair is defined in IntPair2.h, where it is implemented as a std::pair<int, int>. (We add a few things like std::hash support.) The first item in the pair is the row index, and the second item is the column index. We want our GridGraph class to only allow edges between grid points that are one unit apart, so we’ll only have horizontal or vertical edges that are exactly one unit in length. For example, we would allow an edge between (0,0) and (1,0), but not between (0,0) and (1,1). We’ll also allow “isolated points,” that is, individual points that have no incident edges. Note that you can display a plot of a GridGraph in the terminal.
We overload the << operator so that you can output a GridGraph directly to std::cout, and this delegates to the plot member function by default. If the verbose diagrams in the terminal cause problems with screen readers or for any other reason, you can disable the plotting output by changing the options listed in informalTests in main.cpp.

(0,0) ---- (0,1) ---- (0,2)
             |
           (1,1)
             |
           (2,1)
             |
(3,0) ---- (3,1) ---- (3,2)

Fig. 2: Example terminal output of an “I”-shaped GridGraph plot.

In the GridGraph class namespace, we define a type for the “adjacency set” belonging to a single vertex:

using NeighborSet = std::unordered_set<IntPair>;

Therefore this type can be referred to globally as GridGraph::NeighborSet. Each GridGraph stores all of its internal data in its member variable adjacencyMap, which is declared like this:

std::unordered_map<IntPair, GridGraph::NeighborSet> adjacencyMap;

That is, adjacencyMap associates one IntPair, the key point, with a set of several other IntPairs, which are all points adjacent to the key point, connected to it implicitly by edges. (We say two points are adjacent only if they are connected by an edge. In Fig. 3, although points (7,2) and (7,3) are one space apart, they are not connected by an edge, and therefore not “adjacent.”)

(7,0) ---- (7,1) ---- (7,2)      (7,3)

Fig. 3: Some connected points and an isolated point. (7,0) is connected to (7,1), and (7,1) is connected to (7,2). (7,3) is an isolated point with no edge connections.

Because GridGraph represents undirected graphs, if any point X includes a point Y as an adjacency, then Y must also include X as an adjacency. Consider the points shown in Fig. 3. If we look up the point (7,1) in adjacencyMap, we should get a std::unordered_set containing the two adjacent points (7,0) and (7,2). If we look up (7,0), we should get a set containing only (7,1). If we look up (7,3), we should get an empty set.
The fact that there is a key in adjacencyMap for a given point is enough to declare that the point exists in the graph. Therefore, if we want to insert an isolated point with no edge connections, we insert the key for that point with an empty value, that is, an empty set of neighbors. If you recall how the unordered map type works, just referring to a non-existent key with [] causes it to be created with a default value. This is actually all that GridGraph::insertPoint does to insert an isolated point, as you can see in the class definition. GridGraph::insertEdge will also initialize the endpoints of the given edge if necessary, so points do not need to be inserted individually before inserting an edge.

Let’s avoid a point of confusion. Although our graph data structure doesn’t store edges explicitly, sometimes we may want to refer to edges separately as objects. For example, we may want to enumerate all the unique edges in a graph and list them separately. The function GridGraph::printDetails does this. It’s good to remember that just as a single GridGraph point is a pair of two integers, an edge can be represented as a pair of two points. For this reason, in IntPair2.h we’ve also defined IntPairPair as a pair of IntPairs. There are some comments in IntPair2.h warning about unique edge representations. For example, for an edge between (1,7) and (1,8), the explicit edge representations ((1,7),(1,8)) and ((1,8),(1,7)) must mean the same edge, but our IntPairPair type won’t automatically consider these to be equal, nor hash them the same way. You need to take care about that if you are trying to enumerate edges explicitly. (It would be possible to make a class type similar to IntPairPair that ensures that undirected edges are correctly compared automatically, but that’s unnecessary to complete this assignment.)
3.6 Graph search algorithms overview

In this project you’ll be dealing with breadth-first search, but there are some other commonly discussed algorithms, and they all have various strengths. We mention them here for your reference. Note that these can all be used for both directed and undirected graphs, with some modifications.

• breadth-first search (BFS): This can be used to find the shortest paths from one vertex to other vertices in the graph, as long as all of the edges have the same length (or “weight,” or “cost,” which are basically synonyms). Since BFS doesn’t consider different edge lengths, it finds the shortest path only in terms of the number of steps. This works because as the algorithm expands the frontier of its search, it always visits all of the vertices that are reachable by a length-1 path before it visits any of the vertices that are reachable by a length-2 path (but not a length-1 path), and so on. If you are searching for a single goal vertex, then as soon as BFS discovers it, you know you’ve found a path of shortest length to that vertex. (There may be more than one shortest path between two points, if multiple paths have the same length and that length is the shortest. In that case, we’re just finding one of those shortest paths.)
• Dijkstra’s algorithm: This one is also discussed in this course. It finds the shortest paths from one vertex to all others, where the edge weights can all be different, as long as they have non-negative values.
• depth-first search (DFS): This has various creative uses, but note that it won’t necessarily find shortest paths. An “iterative deepening” variation on this algorithm can be used for that purpose efficiently, though, and it has the benefit of using less memory than BFS.
There are also some other commonly discussed algorithms that we haven’t talked about:

• A*: Pronounced “A-star,” this algorithm is somewhat similar to Dijkstra’s, but it uses an estimation function (called a heuristic) to guess which nodes to try exploring next, with the intention of finding a single goal node and stopping. The heuristic must be carefully designed to ensure correctness. Traditionally, this was the algorithm used for computing shortest paths in video games and the like, but many newer variations on the algorithm exist.
• Bellman-Ford: This algorithm is similar in principle to BFS, but it performs iterations that re-check edges multiple times to see if shorter path information can be updated. Because of this, it is somewhat slow, but it’s also compatible with negative edge lengths, unless there is a negative cycle somewhere in the graph. In that case there may exist no such thing as a shortest path, but this algorithm can at least be used to detect that.

3.7 Graph modeling

Many applications of graph theory and graph algorithms deal with things that are not obviously graphs shaped like road maps. Sometimes you’ll have to think creatively to realize what the “vertices” and “edges” are for a given application, and what the weights of the edges should be. One typical application is where a graph represents a sequence of choices or events: each vertex in the graph represents a state, that is, the current conditions at any given moment. The graph edges then represent possible transitions from one state to another, and the edge weights signify the relative cost of choosing one option instead of another. In the simplest case, where edges all have the same weight, we can use BFS to compute the shortest path from some initial state to some desired goal state.
Many single-player puzzles with no hidden or random game elements, such as a Rubik’s cube or a sliding tile puzzle, can theoretically be solved this way. In the last exercise we’ll look at a simple example. However, for many problems, even a small puzzle, the number of potential states (and thus graph vertices and edges) is incredibly large, so an exhaustive graph search can take a very long time and consume lots of memory. That’s why in practice an algorithm like A* (or something more advanced) can perform much better, assuming there is some way to estimate a cost for a given choice. In our implementation of BFS, we’ll at least discuss a few minor improvements that can reduce the memory and preprocessing requirements.

4 Exercises

Please look at the starter code in GraphSearchExercises.cpp. You’ll need to write all of your graded code in this file, because the autograder will ignore changes you make to any others.

4.1 Exercise 1: Adjacency list utilities

In this exercise, you’ll finish implementing the GridGraph member functions countEdges and removePoint. This should help you understand the Week 3 materials and become more comfortable with GridGraph for the next exercise. To code these, it’s okay to use any of the GridGraph class member functions, or anything else useful from the provided code.

• countEdges: This function should return the number of unique edges in the GridGraph. Remember that GridGraph doesn’t explicitly store edge objects; instead, it stores point adjacency information. Since the graph edges are undirected, for any two vertices A and B, the edges (A, B) and (B, A) are the same edge. This means that if we add up the sizes of the GridGraph::NeighborSet sets mapped in adjacencyMap, we’ll be double-counting the number of edges. We can still use that to get the answer; alternatively, the implementation of GridGraph::printDetails shows another method, constructing a set of the unique edges only.
• removePoint: This function takes a point (an IntPair) as input, and it should remove all references to that point from the data structure. That means it should remove the input point’s key from the adjacency map, but first it should remove the edges connected to this point by removing any references to the input point from the adjacency sets of the other points that were originally adjacent to it. You shouldn’t change any other data in the graph, such as adding anything new.

4.2 Exercise 2: graphBFS

You need to finish implementing the graphBFS function, which performs a breadth-first search within a GridGraph. Note that the provided code is almost complete, and you only need to implement the few parts marked with “TODO.” The provided code is heavily documented with comments, and there are some hints about what exactly you need to do. Here are some significant details about this implementation of BFS; in particular, some aspects that differ from how the algorithm is presented in pseudocode in the lectures:

• This implementation of BFS is geared toward finding the shortest path from a single start point to a single goal point. So, it only explores within a single connected component of the graph (if there are several), and it may report that the goal is unreachable. As soon as the goal point is found, our function stops searching, unlike the pseudocode given in lecture, which describes exploring and labeling every node and every edge in every connected component.
• This implementation uses sets of points to implicitly “label” points with some status such as “visited,” instead of assigning a label property to a point itself. This lets us associate more than one kind of status with any given point at the same time, by putting the point in several sets.
It’s convenient to do this with STL containers like std::unordered_set, since this way we don’t have to add some kind of additional status member variables to classes like IntPair just to record metadata about the search process. It also means we don’t have to initialize a status for every vertex from the beginning. In some graph search problems, such as solving a puzzle with graph modeling, the number of vertices is extremely large, so we’d rather not have to initialize them all, or even loop over them all just once!

• We use a std::unordered_map, pred, to record the predecessor vertex of any newly-discovered vertex during the search. (You may also see this concept referred to as “parent” or “previous” instead of “predecessor.”) This implicitly records what the “discovery” edges are, as described in lecture. We can use that to rebuild the path from goal to start (in reverse) when we’re done searching. Then it’s trivial to arrange the path in the reverse order and return the path from start to goal. The terms “visit,” “discover,” and “explore” are used somewhat interchangeably when discussing graph search. To restate what we’re doing here: a node X becomes the predecessor of a node Y if we are currently viewing the adjacencies of node X when we visit Y for the first time, adding Y to the exploration queue. It’s as though X has discovered Y, so we call this a “discovery edge,” and we know that a shortest path from the start to Y uses this edge.
• We do not assign status directly to edges, and we aren’t explicitly storing the edges in a separate structure, only implicitly in the adjacencyMap that the GridGraph class has. However, the vertex predecessor information we record during the search essentially reveals the “discovery” edges.
• We use yet another map, dist, to record the shortest-path distance from the start to any given node that has been discovered so far.
This is similar to something that Dijkstra’s algorithm does, although with BFS, when the edge lengths are all equal, we know that as soon as we discover a node we have found the shortest path to it. We can still use this to detect if we’ve taken more steps than expected and quit, since in some cases we know a definite upper bound on how many steps should be required before we give up. (For example, you could set the maximum allowed distance to be the number of edges you counted in the graph structure. When there are no negative edge weights, a path can never be made shorter by repeating an edge, so in the worst case, the shortest path would use every graph edge exactly once. But you may be able to calculate an even more helpful upper bound for some types of problems.)

• Redundantly, we have also created a “dequeued” set of vertices that we use to help you check for mistakes that could cause an infinite loop. This isn’t normally part of the BFS algorithm itself. In a way, it mirrors what the “visited” set is already supposed to accomplish.

Some of these features are mainly intended to make our implementation of BFS more efficient and less prone to infinite-looping mistakes the programmer might make, which could be especially helpful in the next exercise, where the graph is much larger. In practice, you would probably want to use a more advanced graph search algorithm for a heavy graph search operation with a single start and single goal. (For example, you might use an iterative version of the A* algorithm mentioned earlier in this document. Some other courses at the University of Illinois discuss these algorithms.) However, the simple principles behind how BFS works also inform many of those other algorithms.

Please test your work with both the ./main and ./test programs as described earlier in this document.
When you run ./main, you can see a graphical printout of your shortest path results if you have enabled the Exercise 2 test option listed in informalTests in main.cpp.

4.3 Exercise 3: puzzleBFS

This time, we will use BFS to solve a graph modeling problem. This is where we model a realistic problem in terms of an imaginary graph, and then use graph search concepts to solve the modeled problem. This is mostly a conceptual exercise; if you’ve done the previous exercise, you’ll realize there is very little work to do for this one.

The “8 puzzle” is a sliding tile puzzle played on a 3×3 square grid with eight tiles, labeled 1 through 8, where the ninth space on the grid is blank. Whenever a tile is adjacent to the blank space, we can slide it into the blank space, and this counts as one puzzle move.

 -------
| 1 2 3 |
| 4 5 6 |
| 7 8   |
 -------

Fig. 4: Example terminal output of a PuzzleState object in the solved configuration, [1, 2, 3, 4, 5, 6, 7, 8, 9]

When the puzzle is solved, it looks like Fig. 4 above. Reading the numbers row by row, and using 9 for the blank space, we can write the state compactly as an array of numbers. So this state is: [1, 2, 3, 4, 5, 6, 7, 8, 9]. We have a class PuzzleState, defined in PuzzleState.h, that implements such a puzzle state as a fixed-length array, std::array<int, 9>. You can think of this like a std::vector that always has nine items.

When the puzzle begins in a scrambled state, we want to slide the tiles until the grid looks like the solution state again. We can use graph search to find the solution steps. At first, one might think that we would make edges between grid squares or something like that, but that’s not the right approach. Instead, the current state of the entire puzzle grid is a single vertex in our imaginary graph model. This state vertex is connected by implicit edges to other state vertices that can be reached by performing a single puzzle move. For example, beginning from the puzzle state shown in Fig.
5, there are two adjacent states, which are the result states of the two potential moves. 11 Graph Search Project Unordered Data Structures — Week 3 ——- | 1 3 | | 4 2 5 | | 7 8 6 | ——- Fig. 5: A PuzzleState in the configuration [9, 1, 3, 4, 2, 5, 7, 8, 6] First, we could slide the 1 to the left. (This is the same idea as swapping the blank space to the right. Our PuzzleState member functions refer to the moves as if the blank space itself is moving through the grid.) Then this is the neighboring state we would reach (Fig. 6): ——- | 1 3 | | 4 2 5 | | 7 8 6 | ——- Fig. 6: A PuzzleState in the configuration [1, 9, 3, 4, 2, 5, 7, 8, 6] Or, instead, we could slide the 4 up. (That’s the same as swapping the blank space down.) Then we’d reach this state (Fig. 7): ——- | 4 1 3 | | 2 5 | | 7 8 6 | ——- Fig. 7: A PuzzleState in the configuration [4, 1, 3, 9, 2, 5, 7, 8, 6] There are helper functions in the PuzzleState class that can query what these neighboring states are for any given state. Since we calculate these adjacent states implicitly, we don’t need a separate graph data structure with a map of adjacencies. We also don’t need to keep a list of all the possible vertices that exist, although we can “label” vertices as we go by inserting them into sets. It’s good that we don’t need to label them all from the start, because the number of potential puzzle state configurations is huge! If we ignore the sliding rules and just rearrange tiles however we want, we could potentially make 362,880 different permutations, but not all of those arrangements are even reachable from the solved state by valid puzzle moves. So simply listing all the vertices in our graph model would be very inefficient, but even if we wanted to make a reduced list of valid puzzle states, that would require figuring out which of all the vertices are reachable, too. Therefore we won’t make a list of all the vertices either way. 
We'll just rely on the start state and goal states we know about, and we'll follow the implicit adjacency information based on the rules of the puzzle. (There is actually a way to figure out whether a given configuration is solvable, based on the number of ordering flips.2 But apart from that, if you recall the "landmark path" problem from the Week 4 lectures, we could initiate BFS from the solved state with no particular goal in mind and exhaustively explore every reachable puzzle configuration. In the process of doing that, we'd find the shortest path solution for every valid puzzle, but this would take a lot of time and memory.)

2 http://mathworld.wolfram.com/15Puzzle.html

Each puzzle move slides a single tile by a single space, and since we want to find the solution that takes the shortest sequence of moves, we can suppose that all potential moves are equally costly; in other words, all the implicit edges in our graph model have the same weight. Then we can implement puzzleBFS to perform BFS for the 8 puzzle very similarly to graphBFS. We can also use the dist record to see if we've taken too many steps and should give up. In the case of the 8 puzzle, there is a known maximum number of steps that it should take to solve any solvable configuration (reportedly 31 steps,3 but we've hard-coded 35 as the limit). Therefore, if we take more steps than that, we know the puzzle can't be solved or we have some bug, and we can quit immediately. There are other kinds of problems you can model as graphs where such upper bound information is not known. However, the concept of the "best-yet-found" distance record is still useful in many cases, and other graph search algorithms use it to detect errors or to optimize where to search next. If we could think of a way to correctly assign a score to potential moves, it might be possible to use one of the other graph search algorithms with much greater efficiency.
But we'll set that aside for now.

3 http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.40.9889

v20190516b — Eric Huber