Data Warehousing Lab sheets
  • CITS3401/5504 Lab Sheets
  • Week 1 - Introduction (Self-Reading)
  • Week 2 - Software Setup
  • Week 3 - Extract, transform, and load (ETL)
    • Task Demonstration Video
  • Week 4 - Designing a New Data Warehouse
  • Week 5 - Mini End-to-End Data Warehouse Project
  • Week 6 - Association Rules Mining
  • Week 8 - Neo4j
  • Week 9 - Import data from a relational database into Neo4j
  • Week 10 - Cypher Queries
    • Week 10 Sample Solution
  • Week 11 - Advanced Cypher and APOC
    • Week 11 Sample Solution
  • Week 12 - Graph Data Science
    • Week 12 Sample Solution
Powered by GitBook
On this page
  • 1. READ
  • 2. UPDATE
  • 3. DELETE

Week 10 - Cypher Queries

PreviousWeek 9 - Import data from a relational database into Neo4jNextWeek 10 Sample Solution

Last updated 1 month ago

This week's lab focuses on writing and understanding Cypher queries. Let's use the database that we created last week (week 9 lab sheet) as the example.

1. READ

Write queries that find each of the following:

  1. All nodes in the database.

MATCH (n) RETURN n
  1. All products in the database.

MATCH (n:Product) RETURN n
  1. Up to 5 products in the database.

MATCH (n:Product) RETURN n LIMIT 5
  1. All products whose prices are greater than 20. We will talk about toFloat() in

MATCH (n:Product)
WHERE toFloat(n.purchasePrice) > 20
RETURN n
  1. All sales which contain a product with the brand "Mega".

MATCH (s:Sale)-[:CONTAINS]->(p:Product {brand: "Mega"})
RETURN s

Alternatively, If you don't need to reference the "Product" node itself, you can simplify the query by removing "p".

MATCH (s:Sale)-[:CONTAINS]->(:Product {brand: "Mega"})
RETURN s
  1. All shops that have sales of a product with the brand "Mega".

MATCH (s:Shop)<-[:SOLD_IN]-(:Sale)-[:CONTAINS]->(:Product{brand:"Mega"})
RETURN s
  1. Same query as 6, but return the path.

MATCH p=(s:Shop)<-[:SOLD_IN]-(:Sale)-[:CONTAINS]->(:Product{brand:"Mega"})
RETURN p
  1. In this example dataset, our relationships don't have properties. However, what if we need to query based on a property of a relationship? Let's assume there's a "year" property on the "SOLD_IN" relationship. Here's how you can modify the query to include this relationship property:

MATCH (p:Product {brand: "Mega"})-[:CONTAINS]-(s:Sale)-[:SOLD_IN]-(sh:Shop)
WHERE s.date STARTS WITH '2021'
RETURN DISTINCT sh

If there's a "year" property on the "SOLD_IN" relationship, the query should be:

MATCH (p:Product {brand: "Mega"})-[:CONTAINS]->(s:Sale)-[r:SOLD_IN {year: 2021}]->(sh:Shop)
RETURN DISTINCT sh

The output of the above query should be (no changes, no records) as there is no "year" property on the "SOLD_IN" relationship.

Your turn: What does Query 8 mean?

  1. Find the path with a length of 4 from the product with ID "TAB_1" to the product with ID "TSH_9":

MATCH p=(:Product {productID: "TAB_1"})-[*4]-(:Product {productID: "TSH_9"})
RETURN p

NOTE: All the variables used in the queries, such as "s", "n", and "p", are self-defined. You can replace these variable names with any valid variable name of your choice.

2. UPDATE

Sometimes, records such as product price might need updating. Instead of uploading a new CSV file, we can write queries to update the information directly in the database.

  1. Updates a product's price to a different price.

Check the current price of the product with ID: "BED_3" and its type:

MATCH (p:Product {productID: "BED_3"})
RETURN p.purchasePrice

Alternatively, you can return the node and hover over the property to see its type:

You can see that the purchasePrice is stored as a string. This is because when we created nodes from a CSV file (see Week9), we did not specify the price as a float, so Neo4j automatically treated every variable as a string.

Two methods to change the type:

For example:

LOAD CSV WITH HEADERS FROM 'file:///products.csv' AS row
CREATE (d:Product {
    productID: row.`Product`,
    category: row.Category,
    subcategory: row.`Sub category`,
    productSize: row.Size,
    purchasePrice: toFloat(row.`Purchase price`),
    color: row.Color,
    brand:row.Brand
})

In this example, toFloat() is used to convert the purchasePrice property from a string to a float during node creation.

Method 2: Updating the type aftering creating:

MATCH (p:Product)
SET p.purchasePrice = toFloat(p.purchasePrice)

Your turn: compare Figure 1 with Figure 3. What's the difference between their results?

Now, let's update the price:

MATCH (p:Product {productID: "BED_3"})
SET p.purchasePrice = 125.0

Your turn:

  • Verify whether the price of product "BED_3" has been correctly updated to 125.0.

  • Choose a shop and update its size to a different value.

  • Update the "unitPrice" property of all Sale nodes to float.

  • Without changing the type of "quantity" for sale nodes, update the sale whose "SaleID" is "S000004969" to have a quantity of "5.0".

3. DELETE

  1. Deletes a specific shop in the database

MATCH (n:Shop {shopID: "shop_14"})
DETACH DELETE n
  1. Deletes a relationship between a specific Shop and a specific Sale

MATCH p=(n:Sale{saleID: "S000001480"})-[r:SOLD_IN]->(s:Shop{shopID:"shop_0"})
DELETE r

Your turn:

  1. Delete all nodes and relationships in the database.

For more information on path queries:

Method 1: Changing the type when creating the database from a CSV file, you can use conversion functions such as toInteger(), toFloat(), toString(), toBoolean(). You can learn more about these functions in the Neo4j documentation:

Remove the "subcategory" property from all product nodes.

What is the difference between "DELETE" and "DETACH DELETE":

Section 2. UPDATE.
https://neo4j.com/developer-blog/the-power-of-the-path-1/
Scalar functions
https://neo4j.com/docs/cypher-manual/current/clauses/remove/#:~:text=The%20REMOVE%20clause%20is%20used,label%20on%20it%2C%20nothing%20happens.
https://neo4j.com/docs/cypher-manual/current/clauses/delete/
Figure 1: checking property types from RETURN clause
Figure 2: checking property types from nodes
Figure 3: checking property types from RETURN clause (after updated)