Machine learning and building models


For years, the construction industry has been plagued by obfuscated proprietary file formats that hindered collaboration and innovation. A recent paradigm shift in the way buildings are designed and documented, called Building Information Modelling (BIM), together with an open standard to describe such models, the Industry Foundation Classes (IFC), constitutes a radical move towards openness. IFC is an exchange format closely related to STEP: it re-uses the STEP serialization format and most of its geometry definitions. This makes OpenCASCADE a solid foundation on which to implement IFC support, as it comes with strong support for the modelling paradigms of the STEP world.

But it doesn’t end there. Because these novel building models follow more systematic modelling approaches and carry richer semantics, computational tools to assess and construct them have become essential. The Python language and its “batteries-included” environment of modules turn out to be a great ecosystem for such developments. In this ecosystem, pythonOCC is a central pillar: it makes it possible to rapidly prototype interactive 3D applications and provides rich tools for assessing and constructing geometry, all with a comfortable syntax and without constant recompilation, while still delegating performance-critical functions to native C++ modules.

Differences with STEP

While the majority of geometry definitions are derived from STEP, some changes have been introduced to connect better to the construction sector. One addition is the set of parametric profile definitions. Another important aspect is the frequent use of opening elements to model the relationships between (typically) walls and the elements embedded within them, such as windows and doors. In addition, there are semantic constructs to describe object types and families, and semantic data related to performance and use is stored in property sets. Some of these semantic constructs have implications for geometry without being explicitly stored as geometry, for example layer sets in compound wall structures. In general, the geometrical representation of the building elements is just one of the aspects conveyed in these models.
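
To make the idea of a parametric profile concrete, the sketch below derives the cross-section area of an I-shaped profile from its parameters. The field names mirror attributes of IfcIShapeProfileDef (OverallWidth, OverallDepth, WebThickness, FlangeThickness, FilletRadius). The `IShapeProfile` class and its `area` method are hypothetical helpers for illustration, not part of IfcOpenShell or pythonOCC, and the sample dimensions are assumed values chosen to resemble a common rolled steel section.

```python
import math
from dataclasses import dataclass


@dataclass
class IShapeProfile:
    """Hypothetical stand-in for the parameters of an IfcIShapeProfileDef."""
    overall_width: float        # flange width
    overall_depth: float        # total height of the section
    web_thickness: float
    flange_thickness: float
    fillet_radius: float = 0.0  # 0 means sharp inner corners

    def area(self) -> float:
        """Cross-section area: two flanges, the web and four corner fillets."""
        flanges = 2.0 * self.overall_width * self.flange_thickness
        web = (self.overall_depth - 2.0 * self.flange_thickness) * self.web_thickness
        # Each fillet adds a square of side r minus a quarter circle of radius r
        fillets = (4.0 - math.pi) * self.fillet_radius ** 2
        return flanges + web + fillets


# Illustrative dimensions in millimetres (assumed, not taken from a model)
profile = IShapeProfile(100.0, 200.0, 5.6, 8.5, fillet_radius=12.0)
print("area: %.1f mm2" % profile.area())
```

A handful of numbers fully determines the shape, which is why such profiles are compact to exchange; the outline only needs to be expanded when the geometry is actually evaluated.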

A parametric I-beam as defined in the IFC schema. Source: buildingSMART.

The definition of openings inside a wall element. Source: buildingSMART.


IfcOpenShell is a free and open source software library that enables users to work with this rich file format. Its features include the ability to read and write IFC-SPF files, full support for IFC representation items beyond what many commercial applications support today, a Python interface, and efficient spatial querying. Since pythonOCC and IfcOpenShell share the same technological foundations, they interoperate well. More information: IfcOpenShell website, IfcOpenShell on GitHub, IfcOpenShell with pythonOCC tutorials.
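
For readers unfamiliar with IFC-SPF: like STEP Part 21 files, it stores one entity instance per record in the form `#id=TYPE(arguments);`. The sketch below splits such a record into its parts using only the Python standard library. The sample record, the `parse_record` helper and the regular expression are illustrative assumptions. Real files contain nested lists, escaped strings and multi-line records, which is exactly the work a library such as IfcOpenShell takes care of.

```python
import re

# Matches a single-line record of the form  #<id>=<TYPE>(<arguments>);
RECORD = re.compile(r"^#(\d+)\s*=\s*([A-Z0-9_]+)\s*\((.*)\)\s*;\s*$")


def parse_record(line):
    """Return (instance id, entity type, raw argument string) for one record."""
    match = RECORD.match(line.strip())
    if match is None:
        raise ValueError("not a simple single-line SPF record: %r" % line)
    instance_id, entity_type, arguments = match.groups()
    return int(instance_id), entity_type, arguments


# Made-up record for illustration; '$' marks an unset attribute and
# '#5' is a reference to another instance in the same file.
sample = "#38=IFCWALL('0aBcDeFgHiJkLmNoPqRsTu',#5,'Wall-01',$,$,#35,#37,$);"
instance_id, entity_type, arguments = parse_record(sample)
print(instance_id, entity_type)
```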

The IfcOpenShell pythonOCC viewer loading the Duplex model. The geometric interpretation of IfcLayerSets has been enabled.


IFC and BIM are actively researched fields. Topics cover the entire spectrum from managerial studies to advanced computational geometry. Typical recurrent technical topics include:

  • the technical analysis of extensions to the schema to incorporate additional modelling domains into the standard
  • the assessment of the conformance of models to schemas and subschemas
  • the ability to check whether models conform to national norms for safety and comfort or institutional design guides
  • the automatic generation and conversion of such models to other formats.

Machine learning

In this post I outline a recent research project in which we looked into machine learning for the classification and validation of IFC models (Krijnen & Tamke, 2015). A copy of the full paper can be found here: http://link.springer.com/chapter/10.1007%2F978-3-319-24208-8_33

It is common practice to use automation for finding design errors in building models. Proprietary applications exist that hardcode such design rules into fixed program code. These rules are explicitly stated and match the legislation and norms that a building has to comply with. However:

  • Formalizing these requirements into computationally decidable logic can be a daunting task
  • Moreover, these definitions typically omit rules that relate to common sense (e.g. there should be enough room for a door to be opened)

Machine learning can complement these imperative design validation procedures: it enables automated processes to learn common constellations of building elements and to flag situations that deviate from what is typically constructed. Such situations are not necessarily wrong, but they indicate where it is worth taking another look.

In this particular case we predominantly look at geometrical descriptors of individual elements, so that elements of the same type that differ from the norm can be found. The code example below shows how easily something similar can be accomplished with modules such as IfcOpenShell and pythonOCC. It leads to the discovery of a set of elements that should not have been classified as walls but are in fact beams. Such misclassification errors might seem trivial, but they can cause havoc, because rules apply to specific classes of elements and each discipline operates on the elements that match its expertise. In this case, the structural engineer might overlook these elements because they have been misclassified as architectural walls.

import operator

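# Module layout of older pythonOCC releases; recent pythonocc-core
# releases expose these modules as OCC.Core.GProp and OCC.Core.BRepGProp
import OCC.GProp
import OCC.BRepGProp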

import ifcopenshell
import ifcopenshell.geom

import numpy

# RGBA colors for the visualisation of elements
RED, GRAY = (1,0,0,1), (0.6, 0.6, 0.6, 0.1)

# Model freely available at:
# http://www.nibs.org/?page=bsa_commonbimfiles
ifc_file = ifcopenshell.open("Duplex_A_20110907_optimized.ifc")

# Settings to specify usage of pyOCC
settings = ifcopenshell.geom.settings()
settings.set(settings.USE_PYTHON_OPENCASCADE, True)

# Some helper functions to map to the list of walls
def create_shape(elem):
    return ifcopenshell.geom.create_shape(settings, elem)

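def calc_volume(s):
    # With volume properties, Mass() returns the volume of the solid
    props = OCC.GProp.GProp_GProps()
    OCC.BRepGProp.brepgprop_VolumeProperties(s.geometry, props)
    return props.Mass()

def calc_area(s):
    # With surface properties, Mass() returns the total surface area
    props = OCC.GProp.GProp_GProps()
    OCC.BRepGProp.brepgprop_SurfaceProperties(s.geometry, props)
    return props.Mass()

def normalize(li):
    # Absolute deviation from the mean, in units of standard deviation.
    # The input is copied into an array so that iterators work as well.
    values = numpy.asarray(list(li), dtype=float)
    mean, std = values.mean(), values.std()
    return [abs(v - mean) / std for v in values]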

# Obtain a list of walls from the model
walls = ifc_file.by_type("IfcWall")
# Create geometry for these walls
shapes = list(map(create_shape, walls))
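# Calculate their volumes
volumes = list(map(calc_volume, shapes))
# Calculate their surface areas
areas = list(map(calc_area, shapes))
# Compose a feature from the two measures: the area-to-volume ratio.
# operator.truediv is used because operator.div does not exist in Python 3.
feature = normalize(list(map(operator.truediv, areas, volumes)))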

# Initialize the viewer
pyocc_viewer = ifcopenshell.geom.utils.initialize_display()

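# Loop over the pairs of feature values and corresponding
# geometry, sorted by feature value only (shapes cannot be compared)
for d, s in sorted(zip(feature, shapes), key=operator.itemgetter(0)):
    # Highlight elements more than one standard deviation from the mean
    c = RED if d > 1. else GRAY
    ifcopenshell.geom.utils.display_shape(s, clr=c)

# Fit the model into view
pyocc_viewer.FitAll()

# Allow for user interaction
ifcopenshell.geom.utils.main_loop()
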
Visual result in the pythonOCC viewer. Misclassified elements have been identified without prior knowledge or imperative conditions.

Note that the example above has been simplified in order to condense it. Please consult the full paper for more information on the original research. In addition to this unsupervised machine learning application, a supervised approach is presented to classify building models according to their use (e.g. residential or commercial).
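
To see why the ratio of surface area to volume can separate beams from walls, consider the sketch below, which replaces the real geometry with plain boxes. All dimensions are made-up values, and the helper names `area_to_volume` and `deviation_scores` are hypothetical. The scoring mirrors the `normalize` function in the listing above, and the threshold of 1 is the one used there.

```python
import statistics


def area_to_volume(length, thickness, height):
    """Surface area divided by volume for a rectangular box."""
    area = 2.0 * (length * thickness + length * height + thickness * height)
    volume = length * thickness * height
    return area / volume


def deviation_scores(values):
    """Absolute distance from the mean in units of population standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [abs(v - mean) / std for v in values]


# Assumed box dimensions in metres: (length, thickness, height).
# Element E is labelled as a wall but has the proportions of a beam.
elements = {
    "A": (5.0, 0.2, 3.0),
    "B": (4.0, 0.2, 3.0),
    "C": (6.0, 0.2, 3.0),
    "D": (3.5, 0.2, 2.8),
    "E": (5.0, 0.2, 0.3),
}

ratios = [area_to_volume(*dims) for dims in elements.values()]
scores = deviation_scores(ratios)
# Keep the names of elements more than one standard deviation from the mean
flagged = [name for name, score in zip(elements, scores) if score > 1.0]
print(flagged)
```

A slender element has far more surface per unit of volume than a slab-like wall, so it ends up far from the mean. With this few samples a single outlier also inflates the standard deviation, so the threshold is a judgment call rather than a fixed rule.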

(Krijnen & Tamke, 2015) Krijnen, T. & Tamke, M. (2015). Assessing implicit knowledge in BIM models with machine learning. In Modelling Behaviour (pp. 397-406). Springer International Publishing.