Dylan Storey

Recovering academic, hacker, tinkerer, scientist.

The Panama Papers, or: How I learned to stop worrying and install a graph database

As part of a data challenge I needed to import the Panama Papers into a graph database. Prebuilt images were provided for Mac OS X and Windows, but none for Linux. Here’s what I learned as a result.

Setting up Neo4j

Per http://debian.neo4j.org/:

wget -O - https://debian.neo4j.org/neotechnology.gpg.key | sudo apt-key add -
echo 'deb http://debian.neo4j.org/repo stable/' >/tmp/neo4j.list
sudo mv /tmp/neo4j.list /etc/apt/sources.list.d
sudo apt-get update
sudo apt-get install neo4j

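Next, you’ll need to edit the neo4j configuration file (/etc/neo4j/neo4j.conf).

Find the line with dbms.directories.import=import and comment it out, like so. While that setting is active, LOAD CSV can only read files from neo4j’s import directory; commenting it out lets you load data from elsewhere on disk.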

#dbms.directories.import=import
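
If you would rather script that edit than make it by hand, here is a minimal Python sketch. The helper name comment_out_setting is my own and not part of neo4j, and it only transforms text: reading the file and writing the result back to /etc/neo4j/neo4j.conf (as root) is left to you. It also assumes the setting is written without spaces around the equals sign, as in the line above.

```python
def comment_out_setting(conf_text, key):
    """Return conf_text with every active `key=...` line prefixed by '#'."""
    result = []
    for line in conf_text.splitlines(keepends=True):
        # Only touch lines where the setting is active; lines that are
        # already commented start with '#' and are left alone.
        if line.lstrip().startswith(key + "="):
            result.append("#" + line)
        else:
            result.append(line)
    return "".join(result)


# Hypothetical two-line excerpt standing in for the real file.
sample = "# Paths of directories in the installation.\ndbms.directories.import=import\n"
print(comment_out_setting(sample, "dbms.directories.import"), end="")
```

Running it on a copy of the file first and diffing the result is a cheap way to make sure nothing else changed.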

For neo4j to start without warnings, you’ll need to raise the number of open files it is allowed.

Open /etc/security/limits.conf and add the following lines:

root   soft    nofile  40000
root   hard    nofile  40000

Open up /etc/pam.d/common-session and /etc/pam.d/common-session-noninteractive and add the following line to each file:

session required pam_limits.so
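
To see whether the new limit has taken effect, you can ask Python what the current process is allowed. This is only a sketch: the function names are mine, the 40000 threshold simply mirrors the value used above, and it reports the limits of whichever user runs it. Since the limits.conf entries above are for root, run it as root in a fresh login session.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def meets_minimum(limit, minimum=40000):
    """True if limit is unlimited or at least `minimum`."""
    return limit == resource.RLIM_INFINITY or limit >= minimum


soft, hard = nofile_limits()
print("soft:", soft, "hard:", hard, "ok:", meets_minimum(soft))
```

If the soft limit is still at the old value, the session most likely predates the change, so log out and back in before checking again.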

Next, (re)start the neo4j server with:

sudo neo4j start

or

sudo neo4j restart
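
A quick way to confirm the server actually came up is to poke its HTTP port from Python. The sketch below assumes neo4j is listening on its default HTTP address, http://localhost:7474/, and the function name neo4j_is_up is just something I made up for illustration. It only tells you that something answered HTTP there, not that the database is healthy.

```python
import urllib.error
import urllib.request


def neo4j_is_up(url="http://localhost:7474/", timeout=2.0):
    """Return True if anything answers HTTP at url, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx (e.g. auth required).
        return True
    except OSError:
        # Connection refused, timeout, DNS failure (URLError is an OSError).
        return False


print("neo4j reachable:", neo4j_is_up())
```

The server can take a few seconds to start listening, so a False right after a restart is worth retrying before you go digging through logs.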

Setting up py2neo so we can interact through Python

I’m assuming you don’t have py2neo installed yet. If you do, be aware that py2neo v2 and v3 have different syntaxes and implement features in different ways. This guide is written purely around 2.0.8:

sudo pip install py2neo==2.0.8
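
Because the v2 and v3 syntaxes differ, it is worth checking which version you actually ended up with before going further. Here is a small sketch that does so using only the standard library. It assumes a Python new enough to ship importlib.metadata (3.8 or later), and the helper names are my own.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of dist_name, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def is_v2(version):
    """True if a version string such as '2.0.8' has major version 2."""
    return version.split(".")[0] == "2"


found = installed_version("py2neo")
if found is None:
    print("py2neo is not installed")
else:
    print("py2neo", found, "(v2 syntax)" if is_v2(found) else "(not v2)")
```

On an older interpreter, running pip show py2neo from the shell gives you the same information.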

Now let’s have some fun
