E agora DBA?: NoSQL - Graph databases (Neo4j), or the graph model

For the Portuguese version,

/* Ladies and Gentlemen DBAs!!!

In my first post in English (sorry in advance), I would like to show you one of the most interesting database models: the graph database (or property graph data model).

Fig.1 – Modeling my family

Instead of using tables, columns and PK/FK relationships, a graph database stores data as vertices (nodes) connected by edges.

A vertex is an object with properties. Example: in a book store, the vertices could be books, authors and publishers. Books could be described with properties like Title, Genre, Price, Total Pages, etc. Authors have properties like Name, Birthday, Biography, etc.
The edges connect the vertices and carry labels like Writed_by, Published_by, etc. Edges can also have properties.
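To make the model concrete, here is a minimal sketch in plain Python (not Neo4j code). The helper names add_vertex, add_edge and neighbours, and all the book store data, are made up for illustration; only the labels Writed_by and Published_by come from the text above.

```python
# Minimal in-memory property graph: vertices and edges are plain dicts.
# This is an illustrative model only, not the Neo4j or Gremlin API.
vertices = {}   # vertex id -> dict of properties
edges = []      # list of edge dicts


def add_vertex(vid, **props):
    """Store a vertex with any set of properties."""
    vertices[vid] = dict(props)
    return vid


def add_edge(out_v, label, in_v, **props):
    """Connect two vertices with a labelled edge that may carry properties."""
    edge = {"out": out_v, "label": label, "in": in_v, "props": dict(props)}
    edges.append(edge)
    return edge


# Hypothetical book store data (titles, names and values are invented).
book = add_vertex("b1", Title="Graphs 101", Genre="Tech", Price=30.0)
author = add_vertex("a1", Name="Jane Doe")
publisher = add_vertex("p1", Name="Example Press")

add_edge(book, "Writed_by", author)
add_edge(book, "Published_by", publisher, Year=2012)  # edge with its own property


def neighbours(vid, label):
    """Follow outgoing edges with the given label and return the target vertices."""
    return [vertices[e["in"]] for e in edges if e["out"] == vid and e["label"] == label]


print(neighbours(book, "Writed_by"))
```

Notice that the relationship is stored as data (an edge) instead of being rebuilt through a join at query time.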

Tools

The tool that we are going to use in our "hands on" is here. Neo4j is a

We will also use

Note:

Creating a graph

The script will open a Java window

The http://localhost:7474/webadmin/

To start, click on

Inserting vertices and properties, and listing them

Let's create a basic social network

You can see below the code to add some users (a little tribute to some friends of mine)

Just after the ASCII art, the webadmin instantiates the variable g, which represents the graph.

After that, we use the addVertex function to add our first vertex

The vertex properties are passed as a parameter of the function. It is a JSON-like map (a Groovy map literal) that basically has this structure:
[PropertyName:PropertyValue, ...]

As you can see, one difference between a relational database and a NoSQL database like this one is that it is schemaless, so the properties can vary from vertex to vertex.
To list the vertices, use g.V, where V is the collection of vertices inside the graph g. To list a specific property, type the property name after V. Example: g.V.Name or g.V.any_other_property.
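The schemaless idea can be sketched in plain Python, where each vertex is just a dictionary. This is not the Gremlin API: add_vertex and values are hypothetical helpers, the property values are invented, and unlike the real console this sketch simply skips vertices that lack the requested property.

```python
# Each vertex is a dict, so no two vertices need to share the same keys.
graph_vertices = []


def add_vertex(props):
    """Append a vertex built from a property map and give it a sequential id."""
    vertex = dict(props, id=len(graph_vertices) + 1)
    graph_vertices.append(vertex)
    return vertex


# Hypothetical users: note that the property sets differ.
add_vertex({"Name": "Felipe", "Position": "DBA"})
add_vertex({"Name": "Gustavo"})
add_vertex({"Name": "Bruno Salim", "City": "Sao Paulo"})


def values(prop):
    """Rough analogue of listing one property over all vertices."""
    return [v[prop] for v in graph_vertices if prop in v]


print(values("Name"))
print(values("Position"))
```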

We notice that Bruno Salim's position ("profissão" in Portuguese) was written wrong. Let's update it (the vertex ID of Bruno is 28).

Very simple. Adding new properties is just as simple. For example, let's add a position to the Gustavo vertex.

Or add a completely new property (this is schemaless!)

The map shows all the properties of the vertex (a JSON structure).
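The same three operations (fixing a value, adding a property to one vertex, adding a brand new property) look like this in a plain Python sketch. Only the ID 28 for Bruno comes from the post; the ID 29, the property values and the Twitter property are assumptions for illustration.

```python
# Vertices keyed by id; 28 is Bruno's id in the post, 29 is a made-up id.
vertices = {
    28: {"Name": "Bruno Salim", "Position": "Develper"},  # misspelled on purpose
    29: {"Name": "Gustavo"},
}

# Update an existing property.
vertices[28]["Position"] = "Developer"

# Add a property that other vertices already use.
vertices[29]["Position"] = "DBA"

# Add a completely new property; no schema change is needed.
vertices[29]["Twitter"] = "@example"

# Printing the dict plays the role of showing the whole property map.
print(vertices[29])
```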

Connecting vertices

Now, we are going to create connections between the vertices through edges of type KNOWS

First of all, we assign the vertices to friendly named variables, then we add the edges between them.
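As a rough sketch of the same two moves in plain Python: give the vertices friendly names, then connect them. The add_edge helper and the numeric IDs are hypothetical, and who knows whom here is an assumption, not the data from the original console session.

```python
# Hypothetical vertex ids and names.
vertices = {1: {"Name": "Felipe"}, 2: {"Name": "Spigariol"}, 3: {"Name": "Gustavo"}}
edges = []  # each edge is a tuple: (outgoing vertex, label, incoming vertex)


def add_edge(out_v, in_v, label):
    """Create a directed, labelled edge between two existing vertices."""
    if out_v not in vertices or in_v not in vertices:
        raise KeyError("both vertices must exist before connecting them")
    edge = (out_v, label, in_v)
    edges.append(edge)
    return edge


# Friendly named variables instead of raw ids.
felipe, spigariol, gustavo = 1, 2, 3

add_edge(felipe, spigariol, "KNOWS")
add_edge(spigariol, gustavo, "KNOWS")

for out_v, label, in_v in edges:
    print(vertices[out_v]["Name"], label, vertices[in_v]["Name"])
```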

Finally, let's answer a few typical social network questions and see the power of a graph db.

Gremlin queries are divided into steps separated by dots.

Object.step1.step2.stepN.property

A step uses the result of the previous step and can transform it, filter it or apply some side effect to the results.
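The idea of chained steps can be imitated with Python generators, where each function consumes the output of the previous one. The function names transform, keep and side_effect and the sample ages are invented for this sketch; they are not Gremlin step names.

```python
def transform(items, fn):
    """Transform step: map every incoming item to a new value."""
    return (fn(item) for item in items)


def keep(items, predicate):
    """Filter step: let through only the items that satisfy the predicate."""
    return (item for item in items if predicate(item))


def side_effect(items, fn):
    """Side-effect step: run fn on each item and pass the item on unchanged."""
    for item in items:
        fn(item)
        yield item


# Hypothetical data.
people = [
    {"Name": "Felipe", "Age": 30},
    {"Name": "Gustavo", "Age": 25},
    {"Name": "Bruno Salim", "Age": 41},
]

seen = []
step1 = keep(people, lambda p: p["Age"] >= 30)            # filter
step2 = side_effect(step1, lambda p: seen.append(p["Name"]))  # side effect
step3 = transform(step2, lambda p: p["Name"].upper())     # transform

result = list(step3)
print(result)
print(seen)
```

Nothing runs until the last line pulls items through the chain, which is a reasonable mental model for a pipeline of steps.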

Who do I know?

The

Who are the friends of my friends?

One way to avoid repeating the out step many times is to use the loop step. It repeats the previous step for as long as the rule in curly brackets holds (in our case, we limited the loop to two iterations by checking the it.loops property).
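Both questions can be sketched in plain Python with an adjacency map. The friendships below are assumptions (the original data was in the lost screenshots), and out and loop are stand-in functions that only imitate the idea of the Gremlin steps with the same names.

```python
# Hypothetical KNOWS relationships: person -> people they know.
knows = {
    "Felipe": ["Spigariol", "Bruno"],
    "Spigariol": ["Gustavo"],
    "Bruno": ["Gustavo"],
    "Gustavo": [],
}


def out(people):
    """One hop: everyone known by anyone in the incoming list."""
    return [friend for person in people for friend in knows.get(person, [])]


def loop(start, times):
    """Apply the out step a fixed number of times, starting from one person."""
    current = [start]
    for _ in range(times):
        current = out(current)
    return current


print(out(["Felipe"]))    # who do I know?
print(loop("Felipe", 2))  # who are the friends of my friends?
```

In this sketch a person reached through two different friends appears twice, once per route.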

Who can introduce Felipe to Gustavo?

The power of a graph can be shown by solving this question. In a relational database, the equivalent query needs one self-join per hop, so it becomes heavy and unsuitable for applications with many users. In a graph db, the traversal only follows the edges around the starting vertex, so its cost depends mostly on the part of the graph it visits and much less on the total number of users.

The loop will iterate until it finds the Gustavo vertex. The query lists two possible paths.

By replacing each "out" step with a pair of "outE" (outbound edges) and "inV" (to get the vertices), we can see how they relate.
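A breadth-first search in plain Python gives the same kind of answer: every route from Felipe to Gustavo, including the edge labels along the way. The edge list is an assumption (in particular the path through Bruno is invented), and paths and max_hops are hypothetical names for this sketch.

```python
# Hypothetical edges: (outgoing vertex, label, incoming vertex).
edges = [
    ("Felipe", "KNOWS", "Spigariol"),
    ("Felipe", "KNOWS", "Bruno"),
    ("Spigariol", "KNOWS", "Gustavo"),
    ("Bruno", "KNOWS", "Gustavo"),
]


def paths(start, target, max_hops=4):
    """Return every path from start to target as [vertex, label, vertex, ...]."""
    found = []
    frontier = [[start]]
    for _ in range(max_hops):
        next_frontier = []
        for path in frontier:
            for out_v, label, in_v in edges:
                # Follow only edges leaving the last vertex; never revisit a vertex.
                if out_v != path[-1] or in_v in path[0::2]:
                    continue
                new_path = path + [label, in_v]
                if in_v == target:
                    found.append(new_path)
                else:
                    next_frontier.append(new_path)
        frontier = next_frontier
    return found


for path in paths("Felipe", "Gustavo"):
    print(" -> ".join(path))
```

Each hop only looks at edges leaving the vertices reached so far, which is the intuition behind the performance argument above.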

Let's delete the connection between Spigariol and Gustavo and try again.

In the first line, we use 'g.E' to list all edges, then 'inV' to get their incoming vertices and 'has' to keep only the vertices whose Name property equals "Gustavo". Finally, we use back(2) to go back two steps and show those results (returning to E).
In the next command we remove the edge. Now we have only one possible path to Gustavo.
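The find-then-remove sequence can be mimicked in plain Python: select the edges that point to Gustavo, drop the one coming from Spigariol, and count the paths before and after. The edge data and the helpers edges_into and count_paths are assumptions for illustration.

```python
# Hypothetical edges: (outgoing vertex, label, incoming vertex).
edges = [
    ("Felipe", "KNOWS", "Spigariol"),
    ("Felipe", "KNOWS", "Bruno"),
    ("Spigariol", "KNOWS", "Gustavo"),
    ("Bruno", "KNOWS", "Gustavo"),
]


def edges_into(name):
    """All edges whose incoming vertex is the given person (returns a new list)."""
    return [edge for edge in edges if edge[2] == name]


def count_paths(start, target, seen=()):
    """Count the simple paths from start to target by following outgoing edges."""
    if start == target:
        return 1
    return sum(
        count_paths(in_v, target, seen + (start,))
        for out_v, _, in_v in edges
        if out_v == start and in_v not in seen
    )


before = count_paths("Felipe", "Gustavo")

# Remove only the Spigariol -> Gustavo connection.
for edge in edges_into("Gustavo"):
    if edge[0] == "Spigariol":
        edges.remove(edge)

after = count_paths("Felipe", "Gustavo")
print(before, after)
```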

Conclusion

That was a short tour of graph databases: the tools used to access them and a "hands on" to create, erase and query the stored vertices. This post is not a full tutorial. For more complete information, please see this great video by Andreas Kollegger, in which he explains the graph database concept and presents a full demo of Neo4j. My favorite part of the video is when he says that a relational database is great for calculating the average salary of the webcast attendees, but a graph would be better for identifying who would buy him a beer.

Thanks for reading. If you have any questions, feel free to send me an email or follow me on Twitter. I'm always posting related content about the marvelous world of data!

Best Regards!

Felipe Antunes
