Week 11 - Advanced Cypher and APOC

This week's lab focuses on practicing advanced cipher features and using the APOC procedures and functions introduced in the lecture. APOC is not mandatory for Project 2 due to the lack of support for APOC in cloud Neo4j, and queries without APOC can still effectively address Project 2 questions. However, if you choose to incorporate APOC into your Project 2, you're more than welcome to do so.

In this lab, the solutions are deliberately hidden inside the expandable blocks. You are strongly encouraged to attempt the queries yourself first before expanding to check the sample answers.

A. Preparation

downloading the dataset: The dataset is the people-city dataset we used in the lecture demo, about relationships between people and cities people live in.

installing APOC: create a new database, and install the APOC plugin before starting the database.

Note: After installing APOC, it is necessary to restart your database.

importing CSVs: place the CSV files in the import directory of the database. If you do not know how to do it, refer to Week 9 - Import data from a relational database into Neo4j for instructions.
To check if APOC is functioning correctly, execute the following code:

CALL apoc.import.csv([{fileName: 'file:/people.csv', labels: ['Person']}], [], {})

If the code runs without errors, proceed to the next step. If you encounter the error mentions "apoc.conf" settings, there are TWO solutions.

Solution 1 (Note: This solution only works for a database with version 4.xx.xx)

a. go to settings:

b. use "Ctrl+F" to search for "apoc," then add "apoc.import.file.enabled=true" in the indicated location. Click "Apply" and then restart your database.

c. Execute the above code again; you should now be able to use APOC.

Solution 2 (Note: This solution works for a database with version 4.xx.xx AND 5.xx.xx)

a. Download the apoc.conf file.

29B

apoc.conf

b. Open the conf folder.

c. Move the downloaded apoc.conf file to the conf folder.

d. Restart the database.

B. Populating the Graph Database

Let's clear your database first:

MATCH (n) DETACH DELETE n

Make use of APOC dynamic label and dynamic relationship loading to import the three csv files and build a graph database.

Step 1: Import people and set a static label Person as well as dynamic label according to Occupation (people.csv)

//Step 1: Load people into the database
//with a dynamic label based on their occupation

    LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row
    CALL apoc.create.node(["Person", row.occupation], row)
    YIELD node
    RETURN node

Step 2: Dynamically load relationship between people using APOC (relationships.csv)

//Step 2: Load relationships between people into the database

    LOAD CSV WITH HEADERS FROM 'file:///relationships.csv' AS row
    MATCH (p1:Person {name: row.person_1}), (p2:Person {name: row.person_2})
    CALL apoc.create.relationship(p1, row.relationship, {}, p2)
    YIELD rel
    RETURN rel

Step 3: Create cities as nodes (promoting a column as nodes) and establish relationships with people (cities.csv)

//Step 3: Create cities and relationships between people and cities

    LOAD CSV WITH HEADERS FROM 'file:///cities.csv' AS row
    MATCH (p:Person {name: row.person})
    MERGE (c:City {name: row.city})
    CREATE (p)-[:LIVES_IN]->(c)

C. Advanced Cypher and APOC

C.1. Aggregation and Data Profiling in Cypher

List the number of followers

// List number of followers
MATCH  (p1:Person)-[:FOLLOWS]->(p2:Person)
RETURN  p2.name, COUNT(p1) as NumOfFollowers
ORDER BY NumOfFollowers DESC

List the name of followers

// List name of followers
MATCH  (p1:Person)-[:FOLLOWS]->(p2:Person)
RETURN  p2.name, COLLECT(p1.name)

FOAF - Find Friend of A Friend for a person

// FOAF
MATCH (d {name: "Etta"})--(otherPerson)-->()
WITH otherPerson, count(*) AS foaf
WHERE foaf > 1
RETURN otherPerson.name

C.2. Path Expansion

Path Expansion in Cypher - note the syntax that starts with a * in pattern matching

//PATH Expansion
MATCH (me)-[:KNOWS*1..2]-(remote_friend)
WHERE me.name = "Kellsie"
RETURN me, remote_friend.name

Path Expander in APOC

//PATH Expansion in APOC
MATCH (me:Person {name: "Kellsie"})
CALL apoc.path.expand(me, "KNOWS", null, 1,2)
YIELD path
RETURN path, count(path) as hops
ORDER BY hops;

Find the Mechanic that "Kellsie" KNOWS or who KNOWS "Kellsie" (no specified direction) who are one or two hops away

//PATH Expansion in APOC Mechanic
MATCH (me:Person {name: "Kellsie"})
CALL apoc.path.expand(me, "KNOWS", "+Mechanic", 1,2)
YIELD path
RETURN path, count(path) as hops
ORDER BY hops;

Find people "Kellsie" FOLLOWS and KNOWS as well as those who KNOWS "Kellsie" that are 1 or 2 hops away.

//PATH Expansion in APOC More Relations
MATCH (me:Person {name: "Kellsie"})
CALL apoc.path.expand(me, "KNOWS|FOLLOWS>", null, 1,2)
YIELD path
RETURN path, count(path) as hops
ORDER BY hops;

For more information about APOC path finding, please refer to this APOC documentation:

Expand paths - APOC Extended DocumentationNeo4j Graph Data Platform

C.3. List and Pattern Comprehension

Use pattern comprehension to find the average, minimum and maximum number of relationships of all nodes.

// What kind of nodes exist
// Find all nodes, reporting on relationship counts per node.
MATCH (n) 
WITH n, [(n)-[r]-() | r] as all_rels
RETURN
DISTINCT labels(n),
avg(size(all_rels)) as Avg_RelationshipCount,
min(size(all_rels)) as Min_RelationshipCount,
max(size(all_rels)) as Max_RelationshipCount

The queries below allow you test out integrating pattern comprehension with other clauses (e.g. UNWIND) and functions (e.g. collect()). What is UNWIND:

UNWIND - Cypher ManualNeo4j Graph Data Platform

Find a Cook named "Jodi", use Pattern Comprehension to find the occupations of all persons related to him. Store it in a list and attach to entity "Jodi" as a property, make sure the list contain only `DISTINCT` occupations.

First let's find the list of occupations of all persons related to Jodi

//Pattern Comprehension and Storing for persons
MATCH (jobi:Cook {name: 'Jobi'})
WITH jobi, [(jobi)-->(b:Person) | b.occupation] AS jobi_circle_occupation
SET jobi.circle_occupations=jobi_circle_occupation
RETURN jobi.circle_occupations

However, the occupation list is not unique, let's try to use UNWIND clause and collect() function together to get the unique list of occupations and reset the property.

MATCH (jobi:Cook {name: 'Jobi'})
UNWIND jobi.circle_occupations AS occupations
WITH collect(DISTINCT occupations) AS distinct_occupations, jobi
SET jobi.circle_occupations = distinct_occupations
RETURN jobi

Modify the above query to add the set of occupations of one's relationship circles.

//Pattern Comprehension and Storing for ALL persons
MATCH (n:Person)
WITH n, [(n)-->(b:Person) | b.occupation] AS my_circle_occupations
UNWIND my_circle_occupations AS occupations
WITH collect(DISTINCT occupations) AS distinct_occupations, n
SET n.circle_occupations = distinct_occupations
RETURN n LIMIT 10

YOUR TURN: Combine the two above into one block of queries to reduce redundant code.

YOUR TURN: Modify the third code chunk and try to list the people with the number of occupation types related to them in desending order. e.g. (if the circle_occupation of Alice is ["Mechanic", "Actor"], the result should be Alice 2)

C.4. Virtual Relationships and Virtual Nodes (Graph Projections)

Virtual Nodes and Relationships do not exist in the graph, they are only returned to the UI/user for representing a graph projection. They can be visualised or processed. All virtual nodes and relationships have negative id’s.

For example, the following code creates a virtual node for all people living in Perth and created a virtual relationship between this virtual node with the city node Perth. Note how we can add the population count as a property when creating the virtual relationship.

MATCH (p:Person)-[:LIVES_IN]->(c:City{name:"Perth"})
WITH collect(p.name) AS residents, c, count(p) as count
CALL apoc.create.vNode(residents,{name:residents}) yield node as a
CALL apoc.create.vRelationship(a,"LIVES_IN",{count:count},c) yield rel
RETURN a, c, rel

YOUR TURN, adapt the above code to create such virtual nodes for all cities.

For more information on Virtual Node and Virtual Relationships, please read the APOC documentation:

Virtual Nodes/Rels - APOC Extended DocumentationNeo4j Graph Data Platform

PreviousWeek 10 Sample Solution NextWeek 11 Sample Solution

Last updated 3 months ago