Data Warehousing Lab sheets
  • CITS3401/5504 Lab Sheets
  • Week 1 - Introduction (Self-Reading)
  • Week 2 - Software Setup
  • Week 3 - Extract, transform, and load (ETL)
    • Task Demonstration Video
  • Week 4 - Designing a New Data Warehouse
  • Week 5 - Mini End-to-End Data Warehouse Project
  • Week 6 - Association Rules Mining
  • Week 8 - Neo4j
  • Week 9 - Import data from a relational database into Neo4j
  • Week 10 - Cypher Queries
    • Week 10 Sample Solution
  • Week 11 - Advanced Cypher and APOC
    • Week 11 Sample Solution
  • Week 12 - Graph Data Science
    • Week 12 Sample Solution
Powered by GitBook
On this page
  • A. Search Algorithms
  • B. Neo4j
  • Task 2
  • Task 4
  1. Week 12 - Graph Data Science

Week 12 Sample Solution

PreviousWeek 12 - Graph Data Science

Last updated 1 month ago

A. Search Algorithms

Scenario: Siri is currently in Arad, and she wants to drive to Bucharest.

  1. What is the solution for BFS?

Arad->Sibiu->Fagaras->Bucharest

  1. What is the solution for A*? The heuristic function is the straight-line distance to Bucharest.

Arad->Sibiu->Rimnicu Vilcea->Pitesti->Bucharest

B. Neo4j

Task 2

Write a LOAD CSV statement that imports the roads into your database. Remember, the Countries have already been created, so after LOAD CSV you'll need to run a MATCH statement to find the countries listed in each row of roads.csv, and only CREATE a road between those two countries. Don't forget the distance property, which should also be a float.

// Load data from CSV
LOAD CSV WITH HEADERS FROM "file:///roads.csv" AS row
MATCH (country1:Country), (country2:Country)
WHERE country1.name = row.country_1 AND country2.name = row.country_2
CREATE (country1)-[r:ROAD {distance: toFloat(row.distance)}]->(country2)

Task 4

Task 4.1

Change the query to be a depth-first search (DFS) and run it. Are the results the same, or different - and why?

// DFS
MATCH (source: Country {name: "Spain"})
CALL gds.dfs.stream("Countries", {
    sourceNode: source
    })
YIELD path
RETURN path

The results between BFS and DFS will differ in the order they visit nodes.

Task 4.2

Try running the A* source-target shortest path algorithm to find the shortest path between Spain and India.

#Clean the environment
CALL gds.graph.drop('Countries')

#Create a new project
CALL gds.graph.project('Transport','Country',{ROAD:{orientation: 'UNDIRECTED'}},
{relationshipProperties:'distance',
nodeProperties:['latitude','longitude']})

#Running the A* algorithm
MATCH (source: Country {name: 'Spain'}), (target: Country {name: 'India'})
CALL gds.shortestPath.astar.stream('Transport', {
    sourceNode: source,
    targetNode: target,
    latitudeProperty: 'latitude',
    longitudeProperty: 'longitude',
    relationshipWeightProperty: 'distance'
})
YIELD index, sourceNode, targetNode, totalCost, nodeIds, costs, path
RETURN
    index,
    gds.util.asNode(sourceNode).name AS sourceNodeName,
    gds.util.asNode(targetNode).name AS targetNodeName,
    totalCost,
    [nodeId IN nodeIds | gds.util.asNode(nodeId).name] AS nodeNames,
    costs,
    nodes(path) AS path
ORDER BY index

Task 4.3

Create relationships of the type MINST between the Canada and every country reachable from Canada. Write a Cypher query to display the minimum spanning tree. Hint: the relationship type is MINST, not ROAD!

MATCH (c:Country{name:'Canada'})
CALL gds.beta.spanningTree.write('Transport', {
  sourceNode: id(c),
  relationshipWeightProperty: 'distance',
  writeRelationshipType: 'MINST',
  writeProperty: 'writeCost'
})
YIELD preProcessingMillis, computeMillis, writeMillis, effectiveNodeCount
RETURN preProcessingMillis,computeMillis, writeMillis, effectiveNodeCount;
MATCH (c1:Country)-[m:MINST]->(c2:Country) RETURN c1, m, c2