---
title: "Correctness properties: edge handling, connectivity, and boundary cases"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Correctness properties: edge handling, connectivity, and boundary cases}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(thinr)
```

## Scope

The published thinning algorithms — Zhang–Suen, Guo–Hall, OPTA/SPTA,
Hilditch, Holt, Lee, K3M — are defined pixel-locally: each pass classifies
a foreground pixel as deletable from its 8-neighbourhood alone. That local
definition leaves three things unstated that any *implementation* must
nonetheless decide, because real images exercise all three:

1. what happens to pixels on the image border, where the 8-neighbourhood
   spills outside the matrix;
2. whether the deletion *order* within a pass preserves connectivity for
   thin (two-pixel-wide) structures;
3. what a distance transform reports when there is no background pixel to
   measure to.

None of these is a new algorithm. They are the boundary conditions the
papers assume away, and getting them wrong produces skeletons that are
locally plausible but globally wrong. This vignette states the guarantees
`thinr` makes, why they follow from the methods rather than departing from
them, and how each is tested. They hold **uniformly across all seven
thinning methods** (and, for §3, across all three distance metrics), so
they are documented once here rather than per method.

This is a note on implementation correctness, not a new method. Where an
earlier version of `thinr` deviated from a published algorithm, the
guarantees below describe the *restoration* of the paper's behaviour, not
an extension of it.

## 1. Edge handling: shapes that touch the frame

Every kernel reads an 8-neighbourhood, so a border pixel has neighbours
that lie outside the matrix. The simplest implementation — iterate over
interior pixels `2..(n-1)` only — never even considers a border pixel for
deletion. A shape that touches the frame then keeps a two- or
three-pixel-thick ridge along that edge, silently violating the
one-pixel-wide contract that is the entire point of thinning.

The papers do not intend the frame to armour the object: they are written
for a shape drawn on an unbounded background. `thinr` reproduces that by
surrounding the image with a one-pixel background margin before thinning
and cropping it back afterward, so a shape is skeletonised **identically
whether or not it touches the border**.

```{r edge}
# A 3-pixel-thick horizontal bar flush against the top edge.
bar_top <- matrix(0L, nrow = 5, ncol = 9)
bar_top[1:3, ] <- 1L

# The same bar one row in from the edge.
bar_interior <- matrix(0L, nrow = 5, ncol = 9)
bar_interior[2:4, ] <- 1L

# Both thin to a single-pixel line, and to the same number of pixels.
sum(thin(bar_top))
sum(thin(bar_interior))
```

Because thinning only removes pixels, an image supplied with a one-pixel
background margin must come back with that margin still clear — a property
that is cheap to assert for every method:

```{r edge-test}
methods <- c("zhang_suen", "guo_hall", "lee", "k3m",
             "hilditch", "opta", "holt")
vapply(methods, function(m) {
  out <- thin(bar_top, method = m)
  # No foreground survives more than one pixel thick on the top edge.
  max(colSums(out[1:2, , drop = FALSE])) <= 1L
}, logical(1))
```

## 2. Connectivity: thin strokes must not fragment

A thinning pass may delete many pixels, and it must not delete so many
that a connected object falls apart. The classic failure is a
two-pixel-wide stroke: if both edges of the stroke are judged deletable
against the *same* pre-pass snapshot and removed together, the middle
vanishes and the stroke collapses into disconnected fragments.

The parallel algorithms avoid this with sub-iterations (Zhang–Suen and
Guo–Hall alternate two passes with complementary conditions), and the
sequential algorithms avoid it by making each deletion visible before the
opposing side is tested. OPTA/SPTA is the sequential case: Naccache &
Shinghal's formulation performs two raster scans per cycle precisely so
that removing one side of a stroke is seen before the other side is
considered. An implementation that instead evaluates all four contour
directions against one frozen snapshot and deletes in a single batch is
faster to write but *is not the published algorithm* — and it fragments
exactly the two-pixel strokes the method was designed to preserve.

`thinr`'s guarantee is the one the methods provide: **an object that
starts as a single 8-connected component thins to a single 8-connected
component.** A two-pixel-wide bar thins to a connected one-pixel line.

```{r connectivity}
# 8-connected component count.
n_components <- function(m) {
  m <- m != 0
  lab <- matrix(0L, nrow(m), ncol(m))
  k <- 0L
  for (i in seq_len(nrow(m))) for (j in seq_len(ncol(m))) {
    if (m[i, j] && lab[i, j] == 0L) {
      k <- k + 1L
      stack <- list(c(i, j))
      lab[i, j] <- k
      while (length(stack)) {
        p <- stack[[length(stack)]]
        stack[[length(stack)]] <- NULL
        for (di in -1:1) for (dj in -1:1) {
          r <- p[1] + di
          cc <- p[2] + dj
          if (r >= 1 && r <= nrow(m) && cc >= 1 && cc <= ncol(m) &&
                m[r, cc] && lab[r, cc] == 0L) {
            lab[r, cc] <- k
            stack[[length(stack) + 1]] <- c(r, cc)
          }
        }
      }
    }
  }
  k
}

# A 2-pixel-thick, 9-pixel-long bar: one component in, one component out.
bar2 <- matrix(0L, nrow = 6, ncol = 13)
bar2[3:4, 3:11] <- 1L
vapply(methods, function(m) n_components(thin(bar2, method = m)),
       integer(1))
```

Every method returns `1`: the stroke stays connected. (Before this
guarantee was enforced, OPTA returned the four corner pixels of the bar —
four disconnected components — a good example of a locally-valid,
globally-broken skeleton.)

## 3. Distance transform with no background pixel

`distance_transform()` reports, for each foreground pixel, the distance to
the nearest background pixel. When the image is *entirely* foreground there
is no such pixel, and the only consistent answer is that the distance is
unbounded. All three metrics return `Inf`:

```{r dt-inf}
solid <- matrix(1L, nrow = 4, ncol = 4)
vapply(c("euclidean", "manhattan", "chessboard"),
       function(mt) unique(as.vector(distance_transform(solid, metric = mt))),
       numeric(1))
```

A finite sentinel here (for example "one more than the image diagonal")
would be indistinguishable from a genuine large distance and would poison
any downstream comparison or medial-axis threshold. `Inf` is both the
mathematically correct value and what `EBImage::distmap()` returns, so the
two are interchangeable on this boundary case. Note also that only pixels
count as background: the region outside the matrix is **not** an implicit
background border, so a fully-foreground image does not measure distance to
the frame.

## 4. Invalid input fails loudly

Thinning and distance transforms are defined on `{0, 1}` images. An `NA`
is neither foreground nor background, and silently coercing it (to `0`,
say) would change the skeleton or the distance field without warning.
`thinr` rejects `NA` input with an error at the coercion boundary rather
than guessing:

```{r na, error = TRUE}
m <- matrix(1L, nrow = 3, ncol = 3)
m[2, 2] <- NA
thin(m)
```

This matches `EBImage`'s own contract for `distmap()`
(`'x' shouldn't contain any NAs`).

## Summary

| Property | Guarantee | Applies to |
|---|---|---|
| Edge handling | shapes touching the frame thin as if interior | all 7 thinning methods |
| Connectivity | one component in → one component out; 2px strokes stay connected | all 7 thinning methods |
| Empty background | fully-foreground image → `Inf` everywhere | all 3 distance metrics |
| Invalid input | `NA` rejected with an error, never coerced | `thin()`, `distance_transform()`, `medial_axis()` |

These are correctness boundary conditions, uniform across the toolkit,
consistent with the published methods, and each pinned by a test in the
package suite.

## References

Naccache, N. J., & Shinghal, R. (1984). SPTA: A proposed algorithm for
thinning binary patterns. *IEEE Transactions on Systems, Man, and
Cybernetics*, SMC-14(3), 409–418.

Zhang, T. Y., & Suen, C. Y. (1984). A fast parallel algorithm for thinning
digital patterns. *Communications of the ACM*, 27(3), 236–239.

Felzenszwalb, P. F., & Huttenlocher, D. P. (2012). Distance transforms of
sampled functions. *Theory of Computing*, 8(19), 415–428.