CS 361: Algorithms and Data Structures
Homework 5
Due date: 12/4/15 by 11:59pm



This homework assignment is entirely a programming assignment. I recommend that you work with a partner (although this is not required). This is a substantial assignment. Please do not wait until the last minute to start.

The data we will be working with is Netflix data. Click here for a zipped directory containing all the necessary files for this assignment.

In this assignment, you will implement the following Java classes:

  • NetflixAnalyzer - A Java class with a main method that allows the user to explore the Netflix data.
  • Graph - A Java class that represents a directed graph using an adjacency list representation.
  • GraphAlgorithms - A Java class that implements two graph algorithms: Dijkstra's algorithm and the Floyd-Warshall algorithm.
This assignment requires heavy use of the Java Collections classes: List, ArrayList, Map, and HashMap. For an example of how to iterate over the elements in a HashMap, see the toString() method in the Reviewer class.
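
As a reminder of the general pattern, here is a minimal sketch of iterating over a HashMap with entrySet(). The class name MapIterationDemo and the movie-id-to-rating map are made up for illustration; this is not the code from the Reviewer class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; the map of movie id -> star rating is an assumption.
class MapIterationDemo {
    // Builds one line of text per (key, value) pair in the map.
    static String describe(Map<Integer, Integer> ratings) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Integer, Integer> entry : ratings.entrySet()) {
            sb.append("movie ").append(entry.getKey())
              .append(" -> ").append(entry.getValue()).append(" stars\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<Integer, Integer> ratings = new HashMap<>();
        ratings.put(7, 4);
        ratings.put(12, 2);
        // A HashMap does not promise any particular iteration order.
        System.out.print(describe(ratings));
    }
}
```

Keep in mind that a HashMap makes no guarantee about the order in which entries are visited, so do not write code that depends on it.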

Netflix Analyzer

The NetflixAnalyzer class should have a main method that allows the user to explore the Netflix data. When the main method is run, here is what the user should see:



This message asks the user how they want to build a graph from the Netflix data. The nodes of the graph should always be people (i.e. reviewers) but there are lots of ways to define the edges of the graph. You should come up with at least two (2) different options for defining what it means for u and v to be adjacent in the graph. You are free to use the options I came up with above. Note: I recommend coming up with options that produce an unweighted graph.

Here is what should be printed after choosing an option:



If the user chooses option 1, you should print out the following information about the graph: the number of nodes, the number of edges, the maximum degree (i.e. the largest number of outgoing edges of any node), the diameter of the graph (i.e. the longest shortest path), and the average length of the shortest paths in the graph. These last two (diameter and average) require you to compute the shortest path between all pairs of nodes in the graph using the Floyd-Warshall algorithm.
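
One possible way to get the diameter and the average from an all-pairs distance matrix is sketched below. It rests on two assumptions that are mine, not part of the assignment: unreachable pairs are marked with a sentinel value INF, and those pairs (along with the distance from a node to itself) are skipped. The class name PathStats is also hypothetical.

```java
// Hypothetical helper class; INF and the "skip unreachable pairs" rule are assumptions.
class PathStats {
    // Sentinel for "no path"; halved so that adding two of them cannot overflow an int.
    static final int INF = Integer.MAX_VALUE / 2;

    // Longest finite shortest-path length between two distinct nodes.
    static int diameter(int[][] dist) {
        int max = 0;
        for (int i = 0; i < dist.length; i++) {
            for (int j = 0; j < dist[i].length; j++) {
                if (i != j && dist[i][j] < INF) {
                    max = Math.max(max, dist[i][j]);
                }
            }
        }
        return max;
    }

    // Mean of the finite shortest-path lengths between distinct nodes (0.0 if there are none).
    static double averageLength(int[][] dist) {
        long total = 0;
        int count = 0;
        for (int i = 0; i < dist.length; i++) {
            for (int j = 0; j < dist[i].length; j++) {
                if (i != j && dist[i][j] < INF) {
                    total += dist[i][j];
                    count++;
                }
            }
        }
        return count == 0 ? 0.0 : (double) total / count;
    }
}
```

If you decide to treat unreachable pairs differently, say so in your output so the numbers are easy to interpret.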

If the user chooses option 2, you should ask the user for a starting node and an ending node. Then use Dijkstra's algorithm to find the shortest path between the two nodes. You should print out the shortest path for the user.

Continue printing out the menu and letting the user choose options until they choose option 3 which should cause your program to quit. Click here to see a full session.

You should add other (probably static) methods to your NetflixAnalyzer class in addition to the main method. For example, a method that uses a NetflixProcessor to read in the data, a method that constructs the graph according to the user's choice, a method to print out different messages to the console, etc. In general, it's better to break your code up into small methods instead of having a single giant main method with dozens of lines of code.
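
To illustrate the idea of small methods, here is a rough sketch of a menu loop that hands each choice off to its own helper. Everything in it is a placeholder: the class name MenuLoopSketch, the helper names, and the menu wording are assumptions, and the helpers only print a message instead of doing real work.

```java
import java.util.Scanner;

// Hypothetical sketch; names and menu text are placeholders, not the required output.
class MenuLoopSketch {
    // Keeps showing the menu until the user enters 3.
    // Returns how many times option 1 or option 2 was carried out.
    static int runMenu(Scanner in) {
        int actions = 0;
        while (true) {
            printMenu();
            if (!in.hasNextInt()) {
                break; // stop if the input ends or the next token is not a number
            }
            int choice = in.nextInt();
            if (choice == 3) {
                break;
            } else if (choice == 1) {
                printStatistics();
                actions++;
            } else if (choice == 2) {
                printShortestPath();
                actions++;
            } else {
                System.out.println("Please enter 1, 2, or 3.");
            }
        }
        return actions;
    }

    static void printMenu() {
        System.out.println("1) graph statistics  2) shortest path  3) quit");
    }

    static void printStatistics() {
        System.out.println("(statistics would be printed here)");
    }

    static void printShortestPath() {
        System.out.println("(shortest path would be printed here)");
    }

    public static void main(String[] args) {
        runMenu(new Scanner(System.in));
    }
}
```

Because runMenu takes a Scanner as a parameter, you can test it by passing in a Scanner built from a String instead of typing at the keyboard.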

The Graph Class

The Graph class should use an adjacency list representation for storing a directed graph. Assume that the nodes are represented using integers. As such, you can use a List<List<Integer>> or a Map<Integer, List<Integer>> for the adjacency list. At the very least, your Graph class should have the following methods:

  • addNode(u) - Add a node to the graph
  • addEdge(u, v) - Add a directed edge from u to v to the graph
  • getAdjacency(u) - Return the adjacency list for the node u
The inputs to these methods (u and v) can be integers since each Netflix reviewer is represented using an integer id. You'll probably find it useful to add other methods to your Graph class as needed.
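
For reference, the Map<Integer, List<Integer>> option might look roughly like the sketch below. This is only one way to do it. The class is named SketchGraph to make clear it is an illustration, and numNodes() is an extra method I added as an example of "other methods". Having addEdge create missing nodes automatically is a design choice, not a requirement.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of an adjacency-list graph keyed by integer node ids.
class SketchGraph {
    private final Map<Integer, List<Integer>> adjacency = new HashMap<>();

    // Adds node u with an empty adjacency list (does nothing if u already exists).
    void addNode(int u) {
        adjacency.putIfAbsent(u, new ArrayList<>());
    }

    // Adds a directed edge u -> v, creating either endpoint if it is missing.
    void addEdge(int u, int v) {
        addNode(u);
        addNode(v);
        adjacency.get(u).add(v);
    }

    // Returns the nodes that u points to (an empty list if u is not in the graph).
    List<Integer> getAdjacency(int u) {
        return adjacency.getOrDefault(u, new ArrayList<>());
    }

    // Number of nodes currently in the graph.
    int numNodes() {
        return adjacency.size();
    }
}
```

Note that the edge is stored only in u's list, which is what makes the graph directed.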

Graph Algorithms

Finally, you should have a class called GraphAlgorithms that has methods for both Dijkstra's algorithm and the Floyd-Warshall algorithm. Even though the graph is unweighted, I still want you to implement Dijkstra's algorithm. Just assume each edge has a weight of 1.

Dijkstra's algorithm should return the set of parent nodes because you'll need to print out the actual path for the user. The Floyd-Warshall algorithm, however, can simply return the path costs. Here is my recommendation for the definition of both methods:

  • public static int[][] floydWarshall(Graph graph)
  • public static int[] dijkstrasAlgorithm(Graph graph, int source)

The floydWarshall method takes in a graph and returns a two-dimensional array of integers. The entry in spot [i][j] should be the length of the shortest path from node i to node j. Dijkstra's algorithm takes in a graph and a source and returns an integer array. This array is the "prev" data structure we talked about in class. The i-th element in the array is the parent of node i on the shortest path from the source.
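
Once you have the "prev" array, the path can be recovered by walking backwards from the ending node to the source. The sketch below shows one way to do that. It assumes that the source and any unreachable node have a parent of -1, which is a convention I picked for the example; the class name PathReconstruction and the method pathTo are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; the -1 "no parent" convention is an assumption.
class PathReconstruction {
    static final int NO_PARENT = -1;

    // Returns the nodes on the path source -> ... -> target,
    // or an empty list if target cannot be reached from source.
    static List<Integer> pathTo(int[] prev, int source, int target) {
        List<Integer> path = new ArrayList<>();
        int node = target;
        while (node != NO_PARENT) {
            path.add(node);
            if (node == source) {
                // The nodes were collected from target back to source, so flip them.
                Collections.reverse(path);
                return path;
            }
            node = prev[node];
        }
        // Ran out of parents without ever reaching the source.
        return new ArrayList<>();
    }
}
```

For example, with prev = {-1, 0, 1, -1} and source 0, pathTo(prev, 0, 2) gives the path 0, 1, 2, while pathTo(prev, 0, 3) gives an empty list because node 3 has no parent.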

Final Tips

The zipped directory for this assignment should contain the following files:
  • movie_reviews.txt - A plain text file that contains reviews of movies by 1495 Netflix reviewers.
  • movie_titles.txt - A plain text file that contains the year and title for 17,770 movies.
  • movie_short.txt - A plain text file that contains "fake" reviews of movies by only 5 users. Use this to help you debug your code.
  • Reviewer.java - A Java class that represents a single Netflix reviewer
  • Movie.java - A Java class that represents a single movie
  • NetflixProcessor.java - A Java class that has methods for reading and parsing the movie_reviews.txt and movie_titles.txt files
  • PriorityQueue.java - My implementation of a minimum priority queue. You can use your own priority queue (although you'll have to modify it to be a minimum priority queue instead of a maximum priority queue) or you're welcome to use mine.

You should test your code as you write it!! A good idea is to test each class using a main method. You can use movie_short.txt as a small graph to test your code. Or, for your GraphAlgorithms class, draw a small graph by hand on paper and check that your Dijkstra's and Floyd-Warshall methods return the correct answers.

To enable assertions (which are extremely useful for catching bugs), see the previous posting on Piazza.



Submission Instructions

Your Java code should be submitted in a zipped directory. The directory should contain all necessary Java files. You and your partner only need to submit one directory. Please make sure that you put both of your names in the class comments so I know who you worked with. Your code should compile with no errors.

Your assignment will be graded based on functionality. That is, I will run your code on some small test graphs that I have and check that your code returns the correct answers for both option 1 (printing out information about the graph) and option 2 (shortest paths).




Last modified: Fri Jan 24 10:58:47 PST 2014