2023-06-02
perl -S rdfpuml.pl file.ttl                      # makes file.puml
java -jar plantuml.jar -charset UTF-8 file.puml  # makes file.png

Converts an RDF Turtle file to a readable diagram using PlantUML.
The best way to understand RDF data schemas (ontologies, application profiles, RDF shapes) is with a diagram. Many RDF visualization tools exist, but they have shortcomings.
rdfpuml makes true UML diagrams directly from Turtle examples,
with a small amount of tweaking that can be done with puml: formatting triples
(this follows the approach from
Circles and arrows diagrams using stylesheet rules, Dan Connolly, W3C 2005).
Benefits:
Diagram readability is a prime concern. rdfpuml implements the following features for maximum readability.
rdfpuml prepends prefixes.ttl if it finds such a file,
so when you make a set of examples, you can keep all your prefixes in one file.
It also predefines the following prefixes:
puml   => 'http://plantuml.com/ontology#'
rdf    => 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
rdfs   => 'http://www.w3.org/2000/01/rdf-schema#'
skos   => 'http://www.w3.org/2004/02/skos/core#'
crm    => 'http://www.cidoc-crm.org/cidoc-crm/'
crmx   => 'http://purl.org/NET/cidoc-crm/ext#'
frbroo => 'http://example.com/frbroo/'
crmdig => 'http://www.ics.forth.gr/isl/CRMdig/'
crmsci => 'http://www.ics.forth.gr/isl/crmsci/'
leak   => 'http://data.ontotext.com/resource/leak/'

puml is used for PlantUML formatting triples, see below.
rdfs and skos are used to display node labels.
The rest are used for reification (see below).
Multiple property instances between nodes are collected in one arrow and shown as several labels. Inverse arrows work fine.
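As a rough illustration of the collection step described above, relations can be grouped by subject/object pair so that each pair yields one arrow carrying all of its property labels. This is only a sketch in Python, not rdfpuml's actual Perl code; the function name `collect_arrows` and the sample triples are hypothetical, and treating an inverse relation as its own arrow is an assumption.

```python
from collections import defaultdict

def collect_arrows(triples):
    """Group (subject, property, object) triples by node pair.

    Returns a dict mapping (subject, object) to the list of property
    names in input order, so each pair can be drawn as a single arrow.
    """
    arrows = defaultdict(list)
    for subj, prop, obj in triples:
        arrows[(subj, obj)].append(prop)
    return dict(arrows)

# Hypothetical sample data: two properties from :a to :b, plus one inverse relation.
triples = [
    (":a", ":knows", ":b"),
    (":a", ":worksWith", ":b"),
    (":b", ":reportsTo", ":a"),
]
for (subj, obj), props in collect_arrows(triples).items():
    # One arrow per node pair, with all labels joined.
    print(f"{subj} --> {obj} : {', '.join(props)}")
```

In this sketch the inverse relation (:b to :a) simply becomes a second arrow keyed by the reversed pair.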

In RDF, reification is an approach where a statement s p o is represented as a class instance with 3 “addressing” properties, and additional properties are then added to elaborate the statement (e.g. probability, effective date range, who assigned the statement).
This approach has been used in RDF for many years using the RDF Reification vocabulary.
It was developed for CIDOC CRM in the paper Types and Annotations for CIDOC CRM Properties
and is used for British Museum data (bmo:EX_Association, bmo:PX_property),
see Reified Association.
rdfpuml recognizes a number of reification “situations” and renders them as a UML Association, for example

RDF Reification looks like this:
[] a rdf:Statement; rdf:subject s; rdf:predicate p; rdf:object o; <statement metadata>

rdf:Statement is the reification class,
rdf:subject, rdf:predicate, rdf:object are the addressing properties,
and it can be applied over any shortcut property p.
The Property Reification Vocabulary (PRV) can be used to describe other reification situations, using terms like in the table below. rdfpuml recognizes the following situations, and in the future you should be able to provide your own PRV descriptions.
REIFICATION CLASS             SUBJECT PROP                   SHORTCUT PROP                    OBJECT PROP                 SHORTCUT
_____________________________ ______________________________ ________________________________ ___________________________ __________________________________________
rdf:Statement                 rdf:subject                    rdf:predicate                    rdf:object                  <any>
crm:E13_Attribute_Assignment  crm:P140_assigned_attribute_to crmx:property                    crm:P141_assigned           <any CRM prop>
crm:E14_Condition_Assessment  crm:P34_concerned              crmx:property                    crm:P35_has_identified      crm:P44_has_condition
crm:E15_Identifier_Assignment crm:P140_assigned_attribute_to crmx:property                    crm:P37_assigned            crm:P1_is_identified_by, crm:P102_has_title
crm:E15_Identifier_Assignment crm:P140_assigned_attribute_to crmx:property                    crm:P38_deassigned          crm:P1_is_identified_by, crm:P102_has_title
crm:E16_Measurement           crm:P39_measured               crmx:property                    crm:P40_observed_dimension  crm:P43_has_dimension
crm:E17_Type_Assignment       crm:P41_classified             crmx:property                    crm:P42_assigned            crm:P2_has_type or subprop
frbroo:F52_Name_Use_Activity  frbroo:R63_named               crmx:property                    frbroo:R64_used_name        crm:P1_is_identified_by, crm:P102_has_title
crmsci:S4_Observation         crmsci:O8_observed             crmsci:O9_observed_property_type crmsci:O16_observed_value
leak:Edge                     leak:hasSource                 <none>                           leak:hasTarget

For CIDOC CRM we need a new extension crmx:property to point to the property being reified (the shortcut), similar to how rdf:predicate is used.
Even for a specific CRM reification class like E17_Type_Assignment,
the shortcut property is not fixed to crm:P2_has_type:
we may need to reify a sub-property thereof, e.g. crm:P72_has_language.
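To make the role of the addressing properties concrete, here is a small Python sketch that recovers the shortcut statement from a reification node, using a few rows of the table above. It is an illustration only, not how rdfpuml is implemented; the dict-based node representation, the function name `shortcut_of` and the sample URLs are assumptions.

```python
# Addressing properties per reification class: (subject prop, shortcut prop, object prop).
# A subset of the table above; None means the class has no shortcut property.
REIFICATIONS = {
    "rdf:Statement": ("rdf:subject", "rdf:predicate", "rdf:object"),
    "crm:E17_Type_Assignment": ("crm:P41_classified", "crmx:property", "crm:P42_assigned"),
    "leak:Edge": ("leak:hasSource", None, "leak:hasTarget"),
}

def shortcut_of(node):
    """Return the (subject, property, object) shortcut of a reification node.

    `node` is a hypothetical dict of property -> value, with its class
    stored under "rdf:type". Returns None for unrecognized classes.
    """
    addressing = REIFICATIONS.get(node.get("rdf:type"))
    if addressing is None:
        return None
    subj_prop, shortcut_prop, obj_prop = addressing
    prop = node.get(shortcut_prop) if shortcut_prop else None
    return node.get(subj_prop), prop, node.get(obj_prop)

# Hypothetical node: a type assignment reifying the sub-property crm:P72_has_language.
node = {
    "rdf:type": "crm:E17_Type_Assignment",
    "crm:P41_classified": "<object/1>",
    "crmx:property": "crm:P72_has_language",
    "crm:P42_assigned": "<language/en>",
}
print(shortcut_of(node))
```

The crmx:property value is what lets the same reification class stand for crm:P2_has_type or any of its sub-properties.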
Visuals: the shortcut is shown as a normal relation. The reification node is attached to the relation using a dashed line. It is automatically positioned below or to the right of the relation, depending on the relation’s direction. The 3 “addressing” properties are shown inside the reification class, and there are little characters in front of them to point to the subject (“←” or “↑”), property (“..” or “:”) and object (“→” or “↓”).
Limitation: you can show as reified a maximum of 2 relations between the same nodes, and even that is ugly.

If you don’t want to show a relation as reified (either because it’s the third one between the same nodes or for other reasons),
use the puml:NoReify class to tell rdfpuml not to reify it, e.g.
cona_term:1000000718-contrib-10000016 a puml:NoReify.
In order to save space, rdfpuml inlines various resources in the subject node, rather than as a separate node. All literals and types are inlined automatically. In addition, you can inline other nodes by using the following.
puml:Inline
Shows a single node inline. This is used quite often for lookup values, e.g.
  <cona/event/competition> a puml:Inline.
  cona_contrib:10000000 a puml:Inline; rdfs:label "Getty Vocabulary Program".
puml:InlineProperty
Declares a property to be inlined, i.e. all its objects are shown inlined, e.g.
  fn:annotationSetFrame a puml:InlineProperty.
  fn:annotationSetLU    a puml:InlineProperty.

E.g. this is used in this complex diagram showing FrameNet nodes.
## Labels
Nodes can have labels: puml:label, rdfs:label, skos:prefLabel.
puml:label is used to give a label without any predicate (attribute) name. It’s printed in red (but only if single-line). It’s used with completely different meanings in rdf2rml (to specify a SQL table or join) and in rdf2sparql (to specify a row filter): I know this is a hack.

rdfpuml allows you to customize arrows by using properties like puml:DIR-HEAD-LINE-COLOR-LEN.
You can combine the different parts freely (each is optional) and even write them in a different order.
DIR
Arrow direction: left, right, up or down (default)
HEAD
Arrowhead (end): none (use for symmetric properties), tri (hollow triangle), star (filled dot), o (empty dot).
WARNING: o has a bug: it sets the arrow direction to the opposite of what was specified.
TODO: allow customizing the arrowtail (beginning).
LINE
Line style: dotted, dashed, bold, hidden.
dotted, dashed are exclusive of each other; bold can be used alone or with them.
hidden can be used to adjust the layout by constraining node positions, and is exclusive of the other line attributes.
This example shows using the parameters DIR-HEAD-LINE-COLOR.
We emit the same relations in the puml: namespace (to customize the arrow)
and in the empty namespace (to show an arrow label).
   <x> puml:none-right  <y1>. <x> :none-right  <y1>. 
   <x> puml:dashed      <y2>. <x> :dashed      <y2>. 
   <x> puml:dotted-bold <y3>. <x> :dotted-bold <y3>. 
   <x> puml:up-black    <y4>. <x> :up-black    <y4>. 
   <x> puml:tri-up      <y5>. <x> :tri-up      <y5>. 
   <x> puml:left-blue   <y6>. <x> :left-blue   <y6>.
- COLOR
Line color: name (e.g. `red`) or hex-code (e.g. `FF0000`)
To see the full list of color names supported by PlantUML, use this command and search for `;color`
    java -jar plantuml.jar -language
For example, 4 of the arrows on this diagram are colored (1 green, 3 red):
- LEN
Length: a number of 1 or 2 digits. This applies only to vertical arrows (`up,down`).
You can use this to adjust the layout and in some cases avoid the need for parasitic `hidden` arrows.
See [this example](http://www.plantuml.com/plantuml/uml/JOvB3iCW30NtFeKlm2B_NPMhq80A4YGH6DMvVLCtZM0tCvOUSoQTgCG0pXkBDkvqe2PA_bd8vjf6IsupbrfyMiBPbw1pHcw06rHcUw_gWST9BQgogo-qmDsLX3lW_XS5U-3Xpc86u54E_c84dgeJSPDC9FzofEft3zH9VZ7RrPGOFW00)
and its remake in rdfpuml below.
     <x1> puml:down     <y1>.
     <x2> puml:up-2     <y2>.
     <x3> puml:down-3   <y3>.
     <x4> puml:up-4     <y4>.
     <x5> puml:down-5   <y5>.
     <x6> puml:up-6     <y6>.
     <x7> puml:down-7   <y7>.
     <x2> puml:right-9  <y3>.
     <x4> puml:right-7  <y2>.
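The DIR-HEAD-LINE-COLOR-LEN convention described above can be sketched as a small parser. This is a minimal Python illustration under stated assumptions, not rdfpuml's own code: the function name `parse_arrow` and the result layout are hypothetical, and treating any unrecognized part as a color is a simplification.

```python
import re

DIRS = {"left", "right", "up", "down"}
HEADS = {"none", "tri", "star", "o"}
LINES = {"dotted", "dashed", "bold", "hidden"}

def parse_arrow(local_name):
    """Split a local name such as 'up-tri-dashed-red-3' into arrow attributes.

    Parts are optional and may come in any order; 'down' is the default direction.
    """
    spec = {"dir": "down", "head": None, "line": set(), "color": None, "len": None}
    for part in local_name.split("-"):
        if part in DIRS:
            spec["dir"] = part
        elif part in HEADS:
            spec["head"] = part
        elif part in LINES:
            spec["line"].add(part)
        elif re.fullmatch(r"\d{1,2}", part):
            spec["len"] = int(part)  # LEN: a number of 1 or 2 digits
        elif part:
            spec["color"] = part  # assumption: anything else is a color name or hex code
    line = spec["line"]
    if {"dotted", "dashed"} <= line:
        raise ValueError("dotted and dashed are mutually exclusive")
    if "hidden" in line and len(line) > 1:
        raise ValueError("hidden cannot be combined with other line styles")
    return spec

print(parse_arrow("up-tri-dashed-red-3"))
```

A 6-digit hex code such as `000000` is not mistaken for a length here, because only 1 or 2 digit numbers are read as LEN.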
To customize the arrow of one relation connecting two nodes, use:
<node1> <prop> <node2>
<node1> puml:DIR-HEAD-LINE-COLOR <node2>

The arrow will show the label prop but use the style specified with puml:DIR-HEAD-LINE-COLOR.
To customize the arrow for all relations with the same property, use:
<prop> puml:arrow puml:DIR-HEAD-LINE-COLOR.

E.g. on the diagram “Inlines”, the following declaration was used to point all nif:oliaLink arrows upward:
nif:oliaLink puml:arrow puml:up.

Stereotype is UML lingo for “role”. In PlantUML these include a «guillemetted italic label» and colored circle.
puml:stereotype "(LETTER,COLOR)LABEL"

where LETTER is a single uppercase letter,
COLOR is a color name or hex-code (see COLOR above),
LABEL is a label, and all the parts are optional.
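The "(LETTER,COLOR)LABEL" format, with every part optional, can be illustrated with a regular expression. This is a hedged Python sketch rather than rdfpuml's actual parsing; the names `STEREOTYPE` and `parse_stereotype` are hypothetical.

```python
import re

STEREOTYPE = re.compile(
    r"""^\s*
        (?:\(\s*(?P<letter>[A-Z])?\s*          # optional "(" and LETTER
           (?:,\s*(?P<color>[^)\s]+))?\s*\))?  # optional ",COLOR" and ")"
        \s*(?P<label>.*?)\s*$                  # optional LABEL
    """,
    re.VERBOSE | re.DOTALL,
)

def parse_stereotype(text):
    """Return (letter, color, label); each is None when absent."""
    m = STEREOTYPE.match(text)
    return m.group("letter"), m.group("color"), m.group("label") or None

for value in ["(F)Frame", "(G,green) Concept", "Frame"]:
    print(value, "=>", parse_stereotype(value))
```

Keeping the three parts separate makes it easy to check for a missing color before emitting a stereotype.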
You can set stereotype on an individual node or a whole class, e.g. (referring to the previous diagram):
<#char=32,34_annoSet> puml:stereotype "(F)Frame"
fn:AnnotationSet      puml:stereotype "(F)Frame"

Here is an example that also sets stereotype labels:
gvp:GuideTerm      puml:stereotype "(G,green) Concept".
gvp:Concept        puml:stereotype "(C,lightblue) ThesaurusArray, OrderedCollection".
iso:ThesaurusArray puml:stereotype "(A,red) ThesaurusArray, OrderedCollection".
Here is a bigger example that also shows how arrow directions are handled. It’s a diagram for the Duraspace Portland Common Data Model for digital library metadata (Fedora, Islandora, etc): a remake of one of the Reference Diagrams. (A proposal to make PCDM diagrams with rdfpuml is tracked as duraspace/pcdm#46)

NOTE: Always specify color, else the stereotype text will include the parenthesized letter, eg (C).
This is bug plantuml#1854
Although the use of blank nodes is not recommended in semantic modeling, they are supported by rdfpuml. No special triples are needed for this.
E.g. below is a diagram of EXAMPLE 41: Complete Example from the Web Annotation Data Model.
As you can see, 10 nodes on the right side are blank nodes (have no URL).
The tiny one in the middle has no attributes whatsoever, only the outgoing rdf:first and rdf:rest links.
It should have had the type rdf:List; this is an omission in the example.

If you want to visualize not only instances (A-Box) but also class statements and expressions (T-Box), see test/complex-types with its own README.
Here is an example closely mirroring the style of the Industrial Ontology Foundry (IOF):

You can pass options and pragmas to PlantUML using the puml:options property (attached to an empty blank node, []).
The default options are:
[] puml:options """
  hide empty members
  hide circle
  skinparam classAttributeIconSize 0
""".

Option descriptions:
- hide empty members removes the 2 blank compartments that appear for nodes with no attributes.
- hide circle hides the parasitic circled letter (C). Only nodes with puml:stereotype are shown with a circled letter.
- skinparam classAttributeIconSize 0 removes UML attribute visibility flags (public, private, protected).
You can use left to right direction to fit diagrams with large nodes (see two examples below).
Don’t forget to add the hide options, else you’ll get unwanted compartments and circles in nodes:
[] puml:options """
  hide empty members
  hide circle
  skinparam classAttributeIconSize 0
  left to right direction
""".

You can also try your luck with smetana, which uses an internal Java implementation of GraphViz instead of an external C program.
One way to invoke smetana is by adding a pragma to puml:options.
See test/saref4city for another way, and a trial.
!pragma layout smetana

First example (test/saref4city):

Second example (test/permid):

You can pass additional options to PlantUML to select a skin, set colors, etc by using a global config file (eg plantuml.cfg). For example:
perl -S rdfpuml.pl       file.ttl
plantuml -Iplantuml.cfg  file.puml

By default, PlantUML uses a drawing canvas of 4096 pixels.
This causes really big diagrams (eg ones that need left to right direction as described above) to be cut off.
To avoid this, add the following command-line option:
-DPLANTUML_LIMIT_SIZE=8192

For example in a batch file:
java -jar ".../plantuml.jar" -charset UTF-8 -DPLANTUML_LIMIT_SIZE=8192 "$@"

The input Turtle can include Unicode chars (accented chars, Cyrillic, etc).
See test/unicode for some examples.
It used to be that you needed to invoke the script with option -C, but that is no longer necessary:
perl -C -S rdfpuml.pl file.ttl

I include two simple files to invoke the script: bin/rdfpuml.bat (CMD batch) and bin/rdfpuml (shell script).
- For some reason, the shell script doesn’t work ok with Unicode in Cygwin Bash (so use the BAT file).
- If you have problems with it on Linux, please post an issue.
Required Perl modules:
- RDF::Trine, RDF::Query, FindBin. Install them with cpan, cpanm (works best on Strawberry) or cpanp.
- RDF::Prefixes::Curie. This is my own module located in ../lib, and rdfpuml needs FindBin to locate it.

Until rdfpuml is published as a proper Perl package, use the following procedure:
- Add rdfpuml/bin to your path.
- Use bin/rdfpuml.bat or bin/rdfpuml to run it (but see “Unicode” above).
- Make a script to invoke PlantUML (call it plantuml or puml for short):
  #!/bin/sh
  java -jar c:/prog/plantuml/plantuml.jar -charset UTF-8 "$@"

  @echo off
  java -jar c:\prog\plantuml\plantuml.jar -charset UTF-8 %*

  (scoop install plantuml already makes such batch files called plantuml)
- See test/*/Makefile for examples of how to set up make.

I now use the pp script (part of PAR-Packer) to make a self-contained binary for Windows: rdfpuml.exe.
- It includes Perl, all required modules and the script.
- On first run it’s slower since it unzips the distribution (over 4k files).
- It caches the unzipped distribution to a folder, so on subsequent runs it’s much faster.
Big thanks to @rschupp who helped me fix this issue: https://github.com/rschupp/PAR-Packer/issues/88 !