# Extracting Faces from Graphs

*Philip Rideout, Arcol CTO.*

*NOTE: All of the images on this page are interactive. Hover over faces to light them up.*

Two-dimensional sketches are fundamental to CAD. They are the starting point for a variety of 3D modeling operations such as extrusions, lofts, and sweeps. At Arcol, we knew from day one that our sketch objects should go beyond the status quo with flexibility and versatility.

Many CAD applications only permit sketches that are closed loops or open-ended sequences. Arcol allows users to create *networks* of straight line segments and arcs. In a network, more than two lines can arise from a single point. This is quite useful when drawing floor plans. Figma was probably the first design tool to support networks. Since Arcol supports networks too, we’re in part fulfilling our mission of bringing Figma-like power to architects.

Networks pose a number of engineering challenges. One interesting puzzle is how to robustly identify all faces. This is especially crucial for floor plans, because we need to know the area of each room to provide metrics. Users can also assign a unique color to each face, so each face must be triangulated separately for proper rendering.

## What is a face?

First, let's be careful not to conflate sketch faces with brep faces. Faces in a brep (boundary representation) are parametric representations of three-dimensional surfaces.

In the context of an Arcol sketch, we decided that a face is a 2D region with the following characteristics:

- Face boundaries are counter-clockwise sequences of edges. We chose the counter-clockwise direction somewhat arbitrarily, but wished to have consistent winding to simplify downstream operations.
- Faces can have children faces. Each child defines a hole in its parent.
- Every point on the plane is enclosed by only one face.
- The boundary of a face cannot cross itself.

Consider the unusual floor plan in the left panel above. If you show a human this image and ask them to label all the "rooms", they might produce something like the image on the right. However it’s not obvious how to do this programmatically. Note that the hourglass shape (C and D) has no sketch point where the lines cross. Also, the fact that the hourglass is disconnected from the rest of the floor plan makes this interesting. In graph theory parlance, the hourglass is a *connected component*, and the rest of the floor plan is a separate connected component.

It’s important to distinguish source data (the network of arcs and lines) from derived data (the hierarchy of faces). The left panel depicts the data that users manipulate, but the right panel shows what we need for CAD operations and rendering. This implies a transformation process. More on this process shortly, but first some terminology.

## Some terminology

Let's borrow some terms from graph theory before diving into the algorithms.

- A
**walk**is a sequence of vertices such that each consecutive pair has a corresponding edge in the graph. - A
**cycle**is a walk of vertices where the first vertex matches the last vertex. - A
**simple cycle**is a cycle in which no vertex is repeated, except the first and last vertex. - A
**planar straight line graph**(PSLG) is a depiction of a graph where all edges are straight lines, and no two lines cross each other. - An
**interior cycle**is a cycle in a PSLG that winds counter-clockwise. - An
**exterior cycle**is...you guessed it, a clockwise cycle.

Please note that some authors use slightly different definitions.

## The sketch pipeline

The above figure depicts the first two processing stages, which we call *prepare* and *subdivide*. Here's a summary of the entire pipeline:

**prepare**. Add points to all intersections.**subdivide**. Minimally break down arcs to produce a PSLG.**extract**. Produce a list of cycles.**simplify**. Transform all cycles into simple cycles.**treeify**. Arrange cycles into a hierarchy.**restore**. Bring back arcs.

### Prepare

This stage insert points everywhere that two edges cross and splits up the relevant edges.

To find all crossings between straight lines, one approach would be the Bentley–Ottmann algorithm. For circular arcs, a more brute-force *O(n²)* technique can be used, since arcs tend to be less common than straight line segments.

Note that users could place a sketch point in the middle of a line that it isn't connected to. This is a special case where we don't need to insert an additional point, but we still need to split an edge.

### Subdivide

This step temporarily replaces each arc with an approximation composed of straight-line segments. This is necessary because the extract procedure that we'll cover next leverages the winding order of edges that arise from each point, which isn't well-defined for arcs.

Care should be taken with the level of subdivision. There’s no reason to subdivide arcs more than necessary, as this slows down the subsequent processing. But, if we don’t subdivide them sufficiently, new intersections can arise. This is evident in the preceding figure; if we had subdivided the arc into only two pieces, then those pieces would have collided with the hourglass shape, which would have ultimately generated an incorrect face list.

### Extract

We now come to the juicy process of consuming a PSLG and producing a list of cycles. We tried two approaches: one from David Eberly (Constructing a Cycle Basis for a Planar Graph) and one from Jiang and Bunke (An optimal algorithm for extracting the regions of a plane graph). We settled on the latter, as it is efficient and simple to understand.

Try hovering near vertices in the following vignette and note the arcs that appear. Each arc represents a "wedge", which is a crucial building block in the face extraction algorithm.

Producing a list of wedges is fairly simple, assuming that we already have an adjacency map. Iterate though the vertices; at each one, radially sort the adjacent edges, then produce a wedge for each consecutive pair in the edge list. Note that the edge list should be treated like a circular buffer, since a wedge is formed between the first and last entries.

An abbreviated wedge list for the above example is shown below. Each wedge has the form A-B-C where B is the apex of the wedge, and C is a clockwise rotation from A.

3 0 1 1 0 3 4 1 2 0 1 4 2 1 0 etc...

Why is this list useful, and why are we using a 3-tuple notation?

Note that each wedge belongs to a face, and that each wedge is used exactly once. For example, take a look at the face formed by vertices 2-6-5-4-1 and list its constituent wedges in counter-clockwise order:

1 2 6 2 6 5 6 5 4 5 4 1 4 1 2

In the above list, each 3-tuple is a "left shift" of the 3-tuple that immediately precedes it. This suggests an algorithm that frequently searches the list for wedges with specific values for A and B.

Therefore, we should sort the original wedge list using A as the primary key, and B as the secondary key:

0 1 4 0 3 4 1 0 3 1 2 6 1 4 3 etc...

This allows us to subsequently use *O(log n)* bisection to implement a `findWedge(a, b)`

function. That's step 6 in the following algorithm.

1. Create a list of wedges that have the form A-B-C, and call it L. 2. Sort L first by A, then by B. 3. If L has no remaining wedges, quit. 4. Create an empty list of wedges, and call it R. 5. Pick a wedge A-B-C and move it from L to the end of R. 6. Find a wedge B-C-x using bisection, and move it from L to the end of R. 7. Repeat step 6 until no wedge can be found. (i.e. look for any left-shift of the wedge that was just pushed) 8. Push R onto the list of produced regions. 9. Go to step 3.

With respect to the total number of edges in the sketch, the entire procedure is *O(n log n)*. Not bad!

However what we just produced is not quite the list of faces that we ultimately need. It's just a flat list of interior cycles and exterior cycles, some of which might be non-simple. This brings us to the next step in the pipeline.

### Simplify

Consider this:

In this example, the extract algorithm produces the following cycle for the outer triangle:

0 1 2 0 4 3 0

The problem with this is that point 0 appears multiple times along the walk. Therefore the walk is not a simple cycle. This is easy to deal with. We just need to split the walk at one of the instances of the repeated vertex. This leaves us with:

0 1 2 0 0 4 3 0

This has an important implication. We now have a new exterior cycle. We should not throw it away, as it is required to produce the correct hierarchy. We also need to mark the 0 vertex as a "tainted" vertex — you’ll see why in the next step.

### Treeify

After the **extract** and **simplify** steps in our processing pipeline, we come to **treeify**.

In the above example, we wish to obtain the following hierarchy (excluding exterior cycles for clarity):

- B and C are children of A.
- D is a child of C.

Recall that the list of cycles produced by the extract step consist of counter-clockwise interior cycles and clockwise exterior cycles. These can be distinguished by applying the shoelace theorem to compute a signed area. Positive area indicates counter-clockwise winding, while negative area indicates clockwise winding.

If you set aside tainted vertices, parent-child relationships can only arise when a connected component is wholly contained within another connected component. Moreover, each exterior cycle corresponds to the outer boundary of a connected component. This suggests the following two-step process for arranging cycles into a hierarchy:

**Make each interior cycle into a child of an exterior cycle**. To do this, first make a pass over all cycles and populate an adjacency map from each vertex to the interior cycles it belongs to. Next, make a pass over exterior cycles. For each exterior cycle, pick any non-tainted vertex. Perform a depth-first traversal over the adjacency map, stopping when encountering tainted vertices. Along the way, accrue a set of interior cycles; these become the children.**Re-parent exterior cycles into the hierarchy using point-in-polygon tests**. Make a pass over all exterior cycles by sweeping from left to right. For each one, pick any non-tainted vertex in its boundary and perform a standard point-in-polygon test over the existing hierarchy. Find the deepest cycle that contains the point in question, and re-parent the exterior cycle under that cycle.

Typically the number of exterior cycles is very small, so step 2 is fast. If this were not the case, we would consider maintaining a minimal set of "active" root cycles that need to be checked against during the sweep. In practice, the vast majority of sketches have only one exterior cycle, so this optimization doesn’t seem worth pursing.

At this point we can remove exterior cycles from the hierarchy. They are not required for extrusions or triangulation so there’s not much reason to keep them around.

### Restore

In the final step, we replace certain sequences of line segments in the PSLG with the arc primitives that they were subdivided from. This is easy enough to do by referring to a map that was created during the **subdivide** step.

## Conclusion

That's it! The resulting list of interior cycles exactly corresponds to the face list that we're after.

I hope you enjoyed this post. We're pleased with the outcome of the engineering investment we put into sketches, and we hope that our users will be too.

Many more challenges like this lie ahead in Arcol's future. Personally, I can't wait.