Data Science and the Digital Thread | Part 3

The data model Syndeia uses for the Digital Thread is a graph: a collection of vertices and edges, each of which can have a name, a type and properties (edges can also have a direction). Our model uses all of these features. Referring back to Figure 1 in Part 1, the reader can see why a graph is a good fit for our picture of the Digital Thread. A second advantage is that graph database technology is highly scalable. Driven by social networks like Facebook and LinkedIn, graph databases have been engineered to handle millions of nodes and connections. For a properly formed query, the response time depends mainly on the portion of the graph the query traverses, not on the total size of the dataset.

Figure 1  Graph characteristics
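
To make the graph vocabulary above concrete, here is a minimal Python sketch of a property graph. It is not Syndeia's implementation; the class names, the example vertices and the "satisfy" edge type are all hypothetical, and vertex names are assumed to be unique. Edges are stored per source vertex, so following the connections of one vertex touches only that vertex's own edges, which is the intuition behind the scalability point.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    name: str
    type: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    name: str
    type: str
    source: str  # name of the vertex the edge leaves
    target: str  # name of the vertex the edge points to
    properties: dict = field(default_factory=dict)


class Graph:
    """Tiny in-memory property graph (illustrative; names assumed unique)."""

    def __init__(self):
        self.vertices = {}  # vertex name -> Vertex
        self.outgoing = {}  # vertex name -> list of Edges leaving it

    def add_vertex(self, name, type_, **properties):
        self.vertices[name] = Vertex(name, type_, properties)
        self.outgoing.setdefault(name, [])

    def add_edge(self, name, type_, source, target, **properties):
        if source not in self.vertices or target not in self.vertices:
            raise KeyError("both endpoints must be added before the edge")
        self.outgoing[source].append(Edge(name, type_, source, target, properties))

    def neighbors(self, name, edge_type=None):
        """Follow outgoing edges from one vertex, optionally filtered by type."""
        return [e.target for e in self.outgoing[name]
                if edge_type is None or e.type == edge_type]


# Hypothetical example data
g = Graph()
g.add_vertex("REQ-1", "Requirement", status="approved")
g.add_vertex("Block A", "Block")
g.add_vertex("Block B", "Block")
g.add_edge("r1", "satisfy", "Block A", "REQ-1")

print(g.neighbors("Block A"))  # ['REQ-1']
```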

In our data model, repositories, containers and artifacts are all vertices and relations are edges, as illustrated in Figure 2. In addition, there are two special types of edges. The first is the hierarchical “ownedBy” relationship: artifacts and relations are owned by a container (for example, issues are owned by a project in JIRA), and containers are owned by a repository.

Figure 2 One view of the Common Data Model in Syndeia
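
The “ownedBy” hierarchy can be sketched as a simple child-to-owner lookup. This is only an illustration under assumed names: the identifiers below (such as `issue:DT-42`) are hypothetical, and a real graph database would store these as edges rather than as a dictionary.

```python
# Hypothetical identifiers: each element maps to the element that owns it.
owned_by = {
    "issue:DT-42": "project:DT",
    "issue:DT-43": "project:DT",
    "project:DT": "repository:jira",
}


def ownership_chain(element, owned_by):
    """Return the element followed by its successive owners."""
    chain = [element]
    while chain[-1] in owned_by:
        owner = owned_by[chain[-1]]
        if owner in chain:  # guard against malformed, cyclic data
            raise ValueError(f"ownership cycle at {owner}")
        chain.append(owner)
    return chain


def owned_elements(owner, owned_by):
    """Return everything directly owned by the given element."""
    return sorted(child for child, parent in owned_by.items() if parent == owner)


print(ownership_chain("issue:DT-42", owned_by))
# ['issue:DT-42', 'project:DT', 'repository:jira']
print(owned_elements("project:DT", owned_by))
# ['issue:DT-42', 'issue:DT-43']
```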

The second is the “hasType” relationship, for example, the edge between Repository and Repository Type in Figure 3. Each element has a type. A Teamwork Cloud (TWC) artifact could be a model, a branch, a revision, a block, and so forth, all of them artifact types specific to a TWC repository.

Figure 3 A second view of the Common Data Model in Syndeia
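
Because types are themselves vertices, "what type is this element?" and "which elements have this type?" are the same “hasType” edge followed in opposite directions. The sketch below illustrates this with edges held as plain triples; the element and type names are hypothetical and are not taken from an actual TWC repository.

```python
# Hypothetical edges as (source, label, target) triples.
edges = [
    ("twc-server", "hasType", "Repository Type: TWC"),
    ("Drone Model", "hasType", "Artifact Type: model"),
    ("main", "hasType", "Artifact Type: branch"),
    ("Battery", "hasType", "Artifact Type: block"),
    ("Motor", "hasType", "Artifact Type: block"),
]


def type_of(element, edges):
    """Follow the element's hasType edge forward; None if it has no type."""
    for source, label, target in edges:
        if source == element and label == "hasType":
            return target
    return None


def elements_of_type(type_name, edges):
    """Follow hasType edges backward from a type vertex."""
    return [source for source, label, target in edges
            if label == "hasType" and target == type_name]


print(type_of("main", edges))                           # Artifact Type: branch
print(elements_of_type("Artifact Type: block", edges))  # ['Battery', 'Motor']
```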

Graphically, our Digital Thread looks something like Figure 4. Each of the larger circles represents a repository. The smaller circles within are containers, which contain artifacts and intra-model relations. Syndeia creates a set of inter-model relations between artifacts in different repositories, and these relations are collected in a Syndeia project container within the Syndeia repository. Note that Syndeia doesn’t try to store the artifact data itself, only the connections between artifacts, with enough information to identify and find the artifacts at each end.

Figure 4 How Repositories, Containers, Artifacts and Relations model the Digital Thread in Syndeia
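
The idea of storing only connections, not artifact data, can be sketched as follows. Each end of a relation is a lightweight reference holding just enough to locate the artifact in its home repository. The field names, the "reference" relation type and the example identifiers are assumptions made for illustration, not Syndeia's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactRef:
    """Locator for an artifact; the artifact's content stays in its own tool."""
    repository: str
    container: str
    key: str


@dataclass(frozen=True)
class Relation:
    type: str
    source: ArtifactRef
    target: ArtifactRef


def is_inter_model(relation):
    """A relation is inter-model when its ends live in different repositories."""
    return relation.source.repository != relation.target.repository


def relations_touching(ref, relations):
    """Return every relation that has the given artifact at either end."""
    return [r for r in relations if ref in (r.source, r.target)]


# Hypothetical artifacts in three repositories
block = ArtifactRef("twc", "Drone Model", "Battery")
issue = ArtifactRef("jira", "DT", "DT-42")
requirement = ArtifactRef("jama", "Drone Requirements", "REQ-7")

# The contents of a hypothetical Syndeia project container
syndeia_project = [
    Relation("reference", block, issue),
    Relation("reference", requirement, block),
]

print(all(is_inter_model(r) for r in syndeia_project))  # True
print(len(relations_touching(block, syndeia_project)))  # 2
```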

In practice, mapping each repository structure to the common data model is not simple. Each specialized tool has its own set of standard and custom types. The mapping is different for Jama than for JIRA, as suggested in Figure 5, where each tool uses different terminology for Repositories, Containers and Artifacts. Some tools diverge quite strongly from the common model (for example, relations may be treated as attributes), and the mapping is correspondingly complex.

Figure 5 Mapping JIRA and Jama data models to the Syndeia common model
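
As a rough illustration of what such a mapping involves, the sketch below pairs a per-tool terminology table with a helper that turns link-like attributes into explicit relations. The JIRA entries follow the project and issue example given earlier; the Jama entries, the record layout and the `blocks` field are assumptions for illustration and are not taken from Figure 5 or from either tool's API.

```python
# Native term -> common-model term, per tool (illustrative only).
TERM_MAP = {
    "JIRA": {"project": "Container", "issue": "Artifact"},
    "Jama": {"project": "Container", "item": "Artifact"},  # assumed terms
}


def to_common(tool, native_term):
    """Translate a tool-specific term into the common data model's term."""
    try:
        return TERM_MAP[tool][native_term.lower()]
    except KeyError:
        raise ValueError(f"no mapping for {tool!r} term {native_term!r}") from None


def relations_from_attributes(record, link_fields):
    """Turn link-like attributes of a native record into (source, type, target) edges."""
    edges = []
    for field_name in link_fields:
        for target in record.get(field_name, []):
            edges.append((record["id"], field_name, target))
    return edges


# Hypothetical native record in which a relation is stored as an attribute
record = {"id": "DT-42", "summary": "Battery overheats", "blocks": ["DT-50", "DT-51"]}

print(to_common("JIRA", "Issue"))  # Artifact
print(relations_from_attributes(record, ["blocks"]))
# [('DT-42', 'blocks', 'DT-50'), ('DT-42', 'blocks', 'DT-51')]
```

Keeping the terminology table separate from the attribute-to-relation step means a new tool can be added by supplying its own table and list of link fields, though real mappings are likely to need tool-specific logic beyond this.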

In Part 4 (forthcoming), it’s finally time to do some Data Science. We will apply these concepts in generating Gremlin graph queries to analyze a Digital Thread involving seven separate repositories.

Dirk Zwemer

Dr. Dirk Zwemer (dirk.zwemer@intercax.com) is President of Intercax LLC (Atlanta, GA), a supplier of MBE engineering software platforms like Syndeia and ParaMagic. He is an active teacher and consultant in the field and holds Level 4 Model Builder-Advanced certification as an OMG System Modeling Professional.