Toward Optimistic Version Control in Architecture

Diffing, Patching, and Three-Way Merging for openNURBS 3D Models

Chapter 4
Implementation

4.1 The Abstract Model

The abstract model defines the fundamental concepts that underlie the 3dmdiff, 3dmpatch, and 3dmdiff3 programs in terms of a suite of classes and interfaces. In object-oriented programming, an interface describes a set of behaviors (in the form of methods) without specifying how those behaviors should be carried out. Classes that implement an interface are free to choose the most appropriate means of carrying out the actions it specifies. This ability of different classes to respond to the same method call, or message, with their own set of instructions is known as polymorphism and is one of the pillars of object-oriented programming (Armstrong 2006).

Polymorphism can also be achieved through abstract classes, which provide implementations for some, but not all, of the methods they specify. Abstract classes are generally used to define common routines that can be included in multiple other classes through inheritance. However, unlike a normal parent class, an abstract class forces its subclasses to provide implementations of any methods for which it does not.

The abstract model described here includes both interfaces and abstract classes. Because the Python programming language does not support interfaces, abstract classes are used in place of interfaces in the code for the 3dmdiff, 3dmpatch, and 3dmdiff3 programs.

Figure 4.1. A UML diagram depicting the relationships between the Stringable, Value, and Delta interfaces and their concrete implementations in the abstract model.

4.1.1 Stringables

In computer science, a string is a sequence of textual characters. Because 3dmdiff outputs its deltas in a plain text format, and because 3dmpatch must be able to read that plain text data, the ability to create, manipulate, and parse strings is a crucial part of their operation.

In the abstract model, a Stringable object is one that can be converted to and from a string. The Stringable interface is comprised of two methods:

toString
string
produces a textual representation of the object it is called on. In the Python code, this method is named __str__, which allows it to be automatically called when a string representation of the object is required.
fromString
input
string
object, string
is a class method, which means that it is invoked on a class rather than on any particular instance of that class. It accepts a string and attempts to parse a value from the beginning of that string. If successful, it returns an instance of the class it was called on as well as the unparsed remainder of the input string.

4.1.2 Values

A Value holds a piece of information retrieved from or that can be assigned to a component property and wraps it in a consistent interface that provides methods for common operations such as checking equality and computing deltas. This consistent interface allows other objects to operate on Value instances without needing to make special allowances for the varying types of their enclosed values.

A Value is a Stringable object, which means that possesses toString and fromString methods along with the following:

equals
other
Value
boolean
returns true if the Value passed to it is equal to the one it was called on.
diff
other
Value
Delta
returns an object that describes how the Value it was called on can be changed into the one that was passed to it. The returned object is an instance of the Delta class, which is described in Section 4.1.3.
deltaType
class
is a class method that identifies which child class of Delta is returned by its class's diff method. It is used when parsing openNURBS deltas.

The abstract model defines a number of implementations of the Value interface. A JSONEncodeableValue is one that uses Python's built-in json package to convert values to and from their textual representations. BooleanValue, FloatValue, IntegerValue, and StringValue are subclasses of JSONEncodeableValue that are used for booleans, floating-point numbers, integers, and strings, respectively.

A RegexParseableValue is one that uses a regular expression to parse the textual representation of its value. UUIDValue is a subclass of RegexParseableValue meant for storing UUIDs.

Finally, the EnumeratedValue class serves as a base for value types that have only a limited number of acceptable values. It stores those acceptable values in a lookup table along with their textual representations for encoding and decoding.

4.1.3 Deltas

A Delta represents a change to the value of a component property. Like a Value, a Delta encapsulates the details of that change so that other objects can work with its instances without needing specific knowledge of the type of value that was changed or how the change takes place. A Delta is a Stringable object, which means that possesses toString and fromString methods along with the following:

apply
value
Value
session
Session
Value
produces a new Value by executing the change described in the object it was invoked on with the value that was passed to it.
reverse
Delta
returns a new Delta that performs the opposite change from the one it was called on. This method is used during reverse patch operations.

The abstract model defines one implementation of the Delta interface, named Substitution, which simply replaces an older value with a newer one.

Figure 4.2. A UML diagram depicting part of the abstract model.

4.1.4 Accesssors

An Accessor specifies the means by which values are retrieved from and assigned to component properties. It encapsulates knowledge about how to navigate specific parts of the rhino3dm API in a way that can be leveraged by the rest of the abstract model. The Accessor interface exposes two methods:

get
component
object
any
returns the value of the given component's property.
set
component
object
value
any
void
assigns the provided value to the given component's property.

The abstract model defines two main implementations of the Accessor interface. A FunctionalAccessor is constructed from two functions, one of which retrieves values from a component property and the other of which assigns values to a component property. A PathAccessor leverages Python's introspective capabilities, namely the getattr and setattr functions, to get and set properties based on their names in the rhino3dm API.

4.1.5 Properties

A Property represents a property of a component. It associates an Accessor instance and an implementation of Value with a textual label that is unique within the scope of the property's component. Each Property instance has the following attributes and methods:

name
string
the property's textual label.
type
class
the subclass of Value that is produced by the property's getValue method and expected by its setValue method.
getValue
component
object
model
object
Value
retrieves the value of the given component's property.
setValue
component
object
value
Value
model
object
void
assigns the provided value to the given component's property.

Properties are hashable so that they can be used as keys in a Python dictionary, such as in the PropertyMap classes described in Section 4.1.6. Instances of Property are also considered equal to a string containing their name in order to ease lookups.

4.1.6 Property Maps

A PropertyMap is a mapping (or dictionary, in Python terms) that correlates one or more Property instances to an associated Value or Delta. A PropertyMap has the following methods:

apply
component
Component
model
Model
session
Session
void
applies the values or deltas listed in the PropertyMap to the given component.
merge
other
PropertyMap
session
Session
PropertyMap
returns a new PropertyMap that combines the properties listed in other with those listed in the one it was called on. The merge will not succeed if the two PropertyMaps have different values or deltas assigned to the same property.
readline
line
string
componentType
ComponentType
void
parses a property and value or delta from the given string and adds them to the PropertyMap.
reverse
PropertyMap
returns a new PropertyMap that has the opposite meaning of the one it was called on.
write
output
Stream
void
writes a textual representation of the PropertyMap to the given output stream.

The abstract model defines two concrete implementations of PropertyMap. A PropertyValueMap maps properties to Value instances and is used to describe components that have been added to or deleted from a model. A PropertyDeltaMap maps properties to Delta instances and is used to describe components that have been modified, as well as certain modifications to the model itself.

4.1.7 Tables

A Table specifies how to retrieve components from, add them to, and delete them from a model. Its interface specifies the following methods:

allComponents
model
Model
Iterable<Component>
retrieves the complete list of components in the table.
getComponent
model
Model
id
uuid
Component
retrieves the component with the given ID.
addComponent
component
Component
void
adds a component to the table.
deleteComponent
component
Component
void
removes a component from the table.
intersect
older
Model
newer
Model
Intersection
determines which components have been removed from the given older model, which ones have been added to the given newer model, and which ones appear in both.

4.1.8 Types

A Type enumerates the properties supported by a kind of object.

properties
Iterable<Property>
the properties supported by objects of this type.
getProperty
name
string
Property
returns the property with the given name. Property names are case-insensitive, so Color, color, COLOR, and CoLOr all resolve to the same property.

The abstract model defines two concrete classes that inherit from Type. A ComponentType describes a type of component by associating a table and a list of properties with a unique name. It exposes the following properties and methods in addition to the ones described above:

name
string
the name of the type.
create
object
creates a new component of this type.

The set of component types supported by a model format are gathered in a ComponentTypeRegistry which exposes the following methods:

findByName
name
string
Property
returns the component type with the given name. Like properties, component type names are case-insensitive.
findByClass
cls
class
Property
returns the component type that corresponds to the given class.
fromInstance
instance
object
Property
returns the component type that corresponds to the class of the given object.

A ModelType also inherits from Type, and lists the tables, component types, and properties associated with a model format. It exposes the following properties in addition to those belonging to its parent class:

componentTypes
ComponentTypeRegistry
the set of component types supported by the model format.
componentTypes
Iterable<Table>
the set of tables that store components in this model format.

4.1.9 Component Deltas

A ComponentDelta describes changes to a single model component, and corresponds to a single hunk in the output of a diff operation. It provides the following properties and methods:

id
uuid
the UUID of the component.
type
ComponentType
the type of the component.
properties
PropertyMap
the non-default properties of the component if was added or deleted, or if the component was modified, its altered properties.
apply
model
Model
session
Session
void
applies the changes described in the ComponentDelta to the given model.
fromHeader
header
string
types
ComponentTypeRegistry
ComponentDelta
creates an empty ComponentDelta from a header line. The class of the returned object depends on the first character of the header.
readline
line
string
ComponentDelta
parses a property and value or delta from the given string and adds them to the object's PropertyMap.
reverse
ComponentDelta
returns a new ComponentDelta that has the opposite meaning of the one it was called on.
merge
other
ComponentDelta
session
Session
ComponentDelta
returns a new ComponentDelta that includes both the changes listed in other as well as those listed in the one it was called on. The merge will not succeed if the two ComponentDeltas have different values or deltas assigned to the same property.
write
output
Stream
ComponentDelta
returns a new ComponentDelta that has the opposite meaning of the one it was called on.

The abstract model defines three concrete implementations of ComponentDelta. The ComponentAddition and ComponentDeletion classes represent components that have been added to and deleted from the model, respectively. Each contains a reference to a PropertyValueMap that describes how the component that was added or deleted differs from the default state of a component of its type. The ComponentModification class represents a component that has been modified between older and newer versions of a model. It contains a reference to a PropertyDeltaMap that describes how the component was changed.

4.1.10 Model Deltas

A ModelDelta describes the result of diffing two models. It contains a collection of ComponentDelta instances as well as a list of properties that have changed on the model itself.

components
Iterable<ComponentDelta>
the list of model components that were added, removed, or modified.
hasDifferences
boolean
true when the ModelDelta describes at least one change.
properties
PropertyDeltaMap
the properties of the model that were changed.
apply
model
Model
session
Session
void
applies the changes described in the ModelDelta to the given model.
compare
models
tuple<Model, Model>
session
Session
void
looks for and records differences between the given models.
merge
other
Model
void
returns a new ModelDelta that includes both the changes listed in other as well as those listed in the one it was called on. The merge will not succeed if the two ModelDeltas have different deltas assigned to the same property or if any of the component deltas in the models are incompatible.
read
input
Stream
void
reads a delta from the given input stream.
reverse
ModelDelta
returns a new ModelDelta that has the opposite meaning of the one it was called on.
write
output
Stream
void
writes a textual representation of the ModelDelta to the given output stream.

4.1.11 Sessions

A Session encapsulates procedures for communicating abnormal or unexpected situations to the user in a way that does not require the abstract model to be tied to a specific user interface paradigm. Its interface specifies the following methods:

warn
message
string
void
displays a warning message.
fatal
message
string
void
span>
displays an error message and exits the program.
setContext
componentType
string
componentID
string
property
string
void
sets contextual information about the component and property currently being processed that can be displayed in conjunction with an error or warning message.

4.2 The Adapter Layer

Between the abstract model and the 3dmdiff, 3dmpatch, and 3dmdiff3 programs lies an adapter layer that connects the fundamental concepts and algorithms laid out in the abstract model with the openNURBS component and geometry classes provided by the rhino3dm library. Much of this layer consists of relatively straightforward definitions of component types, value types, and properties in terms of the classes and interfaces described in Section 4.1. For example, the Color property of a geometric object is defined by the expression

Property("Color", value_types.Color, "Attributes.ObjectColor")

where Property is the class described in section \ref{class:property}, "Color" is the name of the property that will appear in deltas, values_types.Color is an implementation of the Value interface (Section \ref{class:value}) that stores the red, green, blue, and alpha values that comprise a color. The string "Attributes.ObjectColor" is converted by the Property constructor into an instance of PathAccessor (Section \ref{class:accessor}) that links the Property instance to the ObjectColor property of a component's Attributes object.

The adapter layer defines a number of Value implementations in addition to Color, including Point3d for storing points in three-dimensional space, Vector3d for storing three-dimensional vectors, and Interval for storing ranges of numbers. It also defines an implementation of Delta called Transformation to store and apply the 4×4 matrices described in Section 3.4.

The adapter layer also attends to a number of non-trivial challenges involved in marrying the abstract model to the rhino3dm library. One such challenge involves the dual identification scheme present in openNURBS models in which components are identified both by a UUID and by an index. A component's UUID is stable, but its index may change if other components of the same type are added to or removed from the model. The 3dmdiff and 3dmdiff3 programs therefore rely on component UUIDs alone in order to prevent changing indexes from increasing the complexity deltas more than necessary. However, certain properties that reference other components, such as the linetype of a layer or the material of a geometric object, do so by storing the referenced component's index. The adapter layer works around these situations by providing a special implementation of Accessor, called an IndexReferenceAccessor, that translates between indexes and UUIDs. The Linetype property of a layer is therefore defined by the expression

Property("Linetype", UUIDValue, IndexReferenceAccessor("LinetypeIndex", tables.LINETYPE_TABLE))

where "LinetypeIndex" is the name of the attribute on the rhino3dm Layer object that contains the index of the layer's linetype and tables.LINETYPE_TABLE is the table in which that index should be looked up to convert it to a UUID.

Another difficulty the adapter layer contends with involves differences in how Python and C++ (the language in which openNURBS is written) handle the movement of data into and out of functions. In C++, data can be passed by value, which means it is copied from one scope (such as the code that calls a function) to another (the code inside the function). It can also be passed by pointer or by reference, in which case the location of the data in memory is shared so that changes to that data that occur in one scope are automatically propagated to the other. Python, on the other hand, uses a simpler system in which all values are passed by reference. The Line property of the LineCurve class is one of a few places in the rhino3dm API where these two systems collide, as seen in the following example:

from rhino3d import LineCurve, Point3d

		
p0 = Point3d(3, 0, 0)
p1 = Point3d(0, 4, 0)
curve = LineCurve(p0, p1)
print(curve.Line.From) # outputs "3.0,0.0,0.0"

		
curve.Line.From = Point3d(10, 10, 0)
print(curve.Line.From) # still outputs "3.0,0.0,0.0"

Under Python's standard pass-by-reference semantics, one would expect the assignment on line 8 to have changed the start point of the LineCurve so that print statement on line 9 would output 10.0,10.0,0.0. However, the underlying C++ function that serves as the getter for LineCurve.Line returns its value by reference. Line 8 therefore assigns a new start point to a copy of the line stored within the LineCurve component rather than the original line. After line 8 has finished executing, the copy is destroyed along with its new start point.

To change the start point of the line contained within the LineCurve component, one must instead store a copy of that line in a variable, change the start point of the copy, and then assign the copy back to the LineCurve's Line attribute as shown below.

line = curve.Line
line.From = Point3d(10, 10, 0)
curve.Line = line

The adapter layer provides an implementation of Accessor, called ValueObjectAccessor, which performs this action automatically. The StartPoint property of a LineCurve is therefore defined using the expression

Property("StartPoint", value_types.Point3d, ValueObjectAccessor("Geometry.Line", "From"))

where the string "Geometry.Line" specifies the object that must be copied and reassigned and "From" indicates the attribute of that property that contains the property's value.

4.3 Command Line Interface

One of the major contributions of this thesis is the development of three programs, named 3dmdiff, 3dmpatch, and 3dmdiff3, that implement diffing, patching, and three-way merging for openNURBS models. The command-line interfaces of these programs are modeled after those provided by the GNU Diffutils package, which furnishes industry-standard open-source tools for comparing and merging plain text files. It is hoped that the similarity of 3dmdiff, 3dmpatch, and 3dmdiff3 to the widely-used diff, patch, and diff3 programs will foster a sense of familiarity among prospective users and allow the openNURBS-specific tools described in this thesis to more easily integrate into existing workflows.

Like their plain text counterparts, 3dmdiff and 3dmpatch make extensive use of the standard streams, a set of three communications channels that provide a way for computer programs to exchange information with their environments. Standard input, also known as stdin, is used to pass information into a program, and in a command line environment, is typically connected to the keyboard. Standard output, abbreviated as stdout, is used to convey the results of a program's execution to the outside world. Data that is written to stdout is typically printed in the command line console. Finally, the standard error (stderr) stream is used for error messages and debugging information that is not part of a program's normal output.

The standard streams can be redirected so that they connect to a file or another program instead of the keyboard or screen. One form of redirection, called a pipe, has already been demonstrated in Section 3.3 with the command ls | wc -l In that example, the | character plugs the standard output stream of ls (which contains a listing of the files and folders in the current directory) into the standard input stream of wc (which counts the number of lines in a file) to count the number of files and folders in the current directory.

The standard output stream can be redirected to a file using the > operator. Rather than being piped directly into wc as in the previous example, the output of ls could have been written to a file called file_list.txt using the command ls > file_list.txt Similarly, the < operator connects a file to a program's standard input stream. The command wc -l < file_list.txt allows wc to read the contents file_list.txt as if they had been typed by the user. Several other forms of stream redirection exist, but the basic patterns shown here are sufficient to understand the operation of the 3dmdiff and 3dmpatch programs.

4.3.1 3dmdiff

The 3dmdiff program finds differences between two openNURBS models. and prints an account of those differences to standard output using the delta format described in Section 3.4. Like GNU diff, it accepts two file system paths as arguments. However, both arguments to 3dmdiff must reference normal files as the program does not support diffing of directory structures. It is invoked as 3dmdiff [options] fromfile tofile where fromfile is a path to the openNURBS model that will serve as the origin of the resulting delta and tofile is a path to the openNURBS model that will serve as its destination. The following options are supported:

-q, --brief
Causes 3dmdiff to report only whether the models differ, and not what those differences are
-s, --report-identical-files
Causes 3dmdiff to print a brief statement when the two models given to it are identical. Normally, it would output nothing in that case.
--label=LABEL
Instructs 3dmdiff to use LABEL in place of fromfile in the delta's preamble. This option may be repeated to make a similar substitution for tofile.
--git
Tells 3dmdiff to expect a different set of arguments for easier integration with Git. See Section 4.4 for more information.
-v, --version
Instructs 3dmdiff to print its version information and then exit.
-h, --help
Instructs 3dmdiff to print a description of its usage and then exit.

Users of 3dmdiff will frequently want to capture the outputted delta in a file to use later with 3dmpatch. This can be accomplished by redirecting standard output to a file as in the command 3dmdiff version01.3dm version02.3dm > my_delta.txt which finds the differences between version01.3dm and version02.3dm and saves the result in a file called my_delta.txt.

4.3.2 3dmpatch

The 3dmpatch program applies a delta to an openNURBS model. Typically, 3dmpatch reads the delta from standard input as follows:

3dmpatch [options] < patchfile

When invoked in this manner, 3dmpatch attempts to apply the changes in the delta to the first file listed in the delta's preamble. Alternatively, the model to which the delta should be applied can be specified as an argument:

3dmpatch [options] originalfile < patchfile

The delta may also be specified as an argument instead of being read from standard input:

3dmpatch [options] originalfile patchfile

All three of these invocation patterns are consistent with the way GNU patch operates. Additionally, the following options are supported:

-o PATH, --output=PATH
Instructs 3dmpatch to write the result of the patch operation to PATH. If this option is not specified, the file is patched in place.
-R, --reverse
Causes 3dmpatch to apply the delta in reverse.
-v, --version
Instructs 3dmpatch to print its version information and then exit.
-h, --help
Instructs 3dmpatch to print a description of its usage and then exit.

4.3.3 3dmdiff3

The 3dmdiff3 program performs a three-way diff between two openNURBS models that share a common ancestor and optionally applies the resulting delta to the common ancestor to produce a merged model. Unlike GNU diff3, 3dmdiff3 produces deltas that are meant to be applied to the common ancestor rather than the first argument. It is invoked as 3dmdiff3 [options] myfile oldfile yourfile where myfile and yourfile are the paths to the openNURBS models to be merged and oldfile is the path to their common ancestor. The following options are supported:

-m, --merge
Instructs 3dmdiff3 to produce a merged model. Without this option, 3dmdiff3 outputs a diff
-o PATH, --output=PATH
Instructs 3dmdiff3 to write the result of a merge operation to PATH. If this option is not specified, oldfile is patched in place.
-v, --version
Instructs 3dmdiff3 to print its version information and then exit.
-h, --help
Instructs 3dmdiff3 to print a description of its usage and then exit.

4.4 Git Integration

Integrating the 3dmdiff and 3dmdiff3 programs into a Git repository involves defining custom diff and merge drivers and then associating those drivers with the .3dm file extension (Gitattributes Documentation, n.d). Git does not presently support integration with third-party patch programs such as 3dmpatch.

The custom diff and merge drivers may be defined in one of several configuration files depending on their desired scope. If the drivers are needed for only one repository, they may be defined in that repository's .git/config file. If they are required in all of a user's repositories, they may be defined in a file called .gitconfig in that user's home folder. Finally, if the drivers are required in all repositories belonging to all users of a computer, they may be defined in \$(prefix)/etc/gitconfig, where \$(prefix) is the directory into which Git was installed (Git-Config Documentation, n.d).

Git configuration files follow a syntax similar to that of INI files, and custom diff and merge drivers are defined as sections within that syntactical framework. The section for a diff driver begins with a header that contains the word diff followed by an arbitrary quoted name. It contains a single key, command, that specifies the program to be used for the diff operation. This program receives seven command-line arguments:

  1. The path of the file being diffed within the repository.
  2. A path from which the contents of the older file can be read.
  3. The 40-hexdigit hash of the older file.
  4. The octal representation of the older file's mode.
  5. A path from which the contents of the newer file can be read.
  6. The 40-hexdigit hash of the newer file.
  7. The octal representation of the newer file's mode.

The --git option to 3dmdiff tells the program to expect those seven arguments in place of the two arguments described in Section 4.3.1. A diff driver for openNURBS models may therefore be defined as:

[diff "opennurbs-diff-driver"]
command = 3dmdiff --git

The section for a merge driver begins with a header that contains the word merge followed by an arbitrary quoted name. It contains up two keys: name defines a human-readable label for the merge driver and driver specifies the command to be used for the merge operation. A third key, recursive, is also supported but is not required for merging openNURBS models.

Git constructs the merge command by replacing special tokens in the text of the driver setting with information relevant to the merge operation. The token %A is replaced with a path from which the merge program can read the current branch's version of the file being merged, and %B is replaced with a path from which it can read the other branch's version. The %O token is replaced with a path to the common ancestor and %P is replaced with the path to which the merged file should be saved. Therefore, a merge driver for openNURBS models may be defined as:

[merge "opennurbs-merge-driver"]
name = Merge driver for openNURBS models
driver = 3dmdiff -m -o %P %A %O %B

The final step of integrating 3dmdiff and 3dmdiff3 with Git is to associate the diff and merge drivers defined above with the .3dm file extension by adding a line similar to the one below to one of Git's attribute files. Note that the names opennurbs-diff-driver and opennurbs-merge-driver correspond to the names assigned to the diff and merge drivers in their respective headers.

*.3dm diff=opennurbs-diff-driver merge=opennurbs-merge-driver

Like Git's configuration files, there are several attributes files that may be used depending on the desired scope of the directives shown above. If the openNURBS diff and merge drivers are needed for only one repository, the line may be added to a file named .gitattributes in the root of that repository or in its .git/info/attributes file. If they are required in all repositories on a machine, the line may added to \$(prefix)/etc/gitattributes, where \$(prefix) is the directory into which Git was installed.