# Ruminations on Rust Geodesy

## Rumination 000: Overall architecture and philosophy

Thomas Knudsen <knudsen.thomas@gmail.com>

2021-07-31. Last [revision](#document-history) 2021-08-26

---

### Prologue

#### What is Rust Geodesy?

Rust Geodesy, RG, is a geodetic software system, not entirely unlike [PROJ](https://proj.org), but with much more limited transformation functionality: While PROJ is mature, well supported, well tested, and production ready, RG is none of these. This is partially due to RG being a newborn baby, partially due to its aiming at a (much) different set of use cases.

So when I liberally insert comparisons with PROJ in the following, it is for elucidation, not for mocking - neither of PROJ, nor of RG: I have spent much pleasant and instructive time with PROJ, both as a PROJ core developer and as a PROJ user (more about that in an upcoming *Rumination on RG*). But I have also spent much pleasant time learning Rust and developing RG, so I feel deeply connected to both PROJ and RG.

PROJ and RG do, however, belong in two different niches of the geodetic software ecosystem: Where PROJ is the production work horse, with the broad community of end users and developers, RG aims at a much more narrow community of geodesists, for geodetic development work - e.g. for development of transformations that may eventually end up in PROJ. As stated in the [README](/README.md)-file, RG aims to:

1. Support experiments for evolution of geodetic standards.
2. Support development of geodetic transformations.
3. Hence, provide easy access to a number of basic geodetic operations, not limited to coordinate operations.
4. Support experiments with data flow and alternative abstractions. Mostly as a tool for aims (1, 2, and 3)

#### Why Rust Geodesy?

The motivation for these aims, i.e. the **why** of the project, is the **wish to amend explicitly identified shortcomings** in the existing landscape of geodetic software and standards.

#### How will it emerge?

The development work driven by this motivation is supported by a few basic design principles, the **how** of the project:

- An architectural scaffolding of four dimensional data flow paths, enabling the construction of complex operations from simpler elements
- A design philosophy of keeping things flexible by not overspecifying
- A geodetic focus on transformations, i.e. relations *between* systems, rather than definition *of* systems

or in fewer words: *Don't overdo it*.

### Getting beefy

But talking architecture and design philosophy out of thin air is at best counterproductive, so let's start with a brief example, demonstrating the RG idiom for converting geographical coordinates to UTM zone 32 coordinates.

```rust
fn main() {
    // [0] Use a brief name for some much used functionality
    use geodesy::CoordinateTuple as Coord;

    // [1] Build some context
    let mut ctx = geodesy::Context::new();

    // [2] Obtain a handle to the utm-operator
    let utm32 = ctx.operation("utm: {zone: 32}").unwrap();

    // [3] Coordinates of some Scandinavian capitals
    let copenhagen = Coord::geo(55., 12., 0., 0.);
    let stockholm  = Coord::geo(59., 18., 0., 0.);

    // [4] We put the coordinates into an array
    let mut data = [copenhagen, stockholm];

    // [5] Then do the forward conversion, i.e. geo -> utm
    ctx.fwd(utm32, &mut data);
    println!("{:?}", data);

    // [6] And go back, i.e. utm -> geo
    ctx.inv(utm32, &mut data);
    Coord::geo_all(&mut data);
    println!("{:?}", data);
}
```

(See also [Idiomatic Rust](#note-idiomatic-rust) in the Notes section)

At comment `[0]`, we start by renaming the library functionality for coordinate handling, from `geodesy::CoordinateTuple` to `Coord`. Since coordinates are at the heart of what we're doing, it should have a brief and clear name. Then why give it such a long name by design, you may wonder - well, `CoordinateTuple` is the ISO-19111 standard designation of what we colloquially would call *the coordinates*.

---

```rust
// [1] Build some context
let mut ctx = geodesy::Context::new();
```

At comment `[1]` we instantiate a `Context`, which should not come as a surprise if you have been using [PROJ](https://proj.org) recently. The `Context` provides the interface to the messy world external to RG (files, threads, communication), and in general centralizes all the *mutable state* of the system.

Also, the `Context` is the sole interface between the `RG` transformation functionality and the application program: You may instantiate a transformation object, but the `Context` handles it for you. While you need a separate `Context` for each thread of your program, the `Context` itself is designed to eventually do its work in parallel, using several threads.

---

```rust
// [2] Obtain a handle to the utm-operator
let utm32 = ctx.operation("utm: {zone: 32}").unwrap();
```

At comment `[2]`, we use the `operation` method of the `Context` to instantiate an `Operator` (closely corresponding to the `PJ` object in PROJ). The parametrisation of the operator, i.e. the text `utm: {zone: 32}` is expressed in [YAML](https://en.wikipedia.org/wiki/YAML) using parameter naming conventions closely corresponding to those used in PROJ, where the same operator would be described as `proj=utm zone=32`
(see also [ellps implied](#note-ellps-implied) in the Notes section).

So essentially, PROJ and RG use identical operator parametrisations, but RG, being 40 years younger than PROJ, is able to leverage YAML, an already 20-year-old generic, JSON compatible, data representation format. PROJ, on the other hand, was born 20 years prior to YAML, and had to implement its own domain specific format.

Note, however, that contrary to PROJ, when we instantiate an operator in RG, we do not actually get an `Operator` object back, but just a handle to an `Operator`, living its entire life embedded inside the `Context`.
And while the `Context` is mutable, the `Operator`, once created, is *immutable*.

This makes `Operator`s thread-sharable, so the `Context` will eventually (although still not implemented), be able to automatically parallelize large transformation jobs, eliminating some of the need for separate thread handling at the application program level.

Note, by the way, that the method for instantiating an `Operator` is called `Context::opera`**`tion`**`(...)`, not `Context::opera`**`tor`**`(...)`: Conceptually, an **operation** is an *instantiation of an operator*, i.e. an operator with parameters fixed, and ready for work. An **operator** on the other hand, is formally a datatype, i.e. just a description of a memory layout of the parameters.

Hence, the `operation(...)` method returns a handle to an **operation**, which can be used to **operate** on a set of **operands**. It's op...s all the way down!
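
The handle idea can be illustrated with a minimal, standard-library-only sketch. The names `MiniContext`, `MiniOperator` and `OpHandle` are hypothetical and not part of RG; the sketch only assumes what the text states, namely that the context owns its operators and hands out handles to them.

```rust
// Hypothetical sketch: a context that owns its operators and hands out handles.
// None of these names are part of the actual RG API.

/// A copyable handle: just an index into the context's operator table.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct OpHandle(usize);

/// A stand-in operator: scales the first two coordinate elements.
pub struct MiniOperator {
    scale: f64, // never modified after the operator has been created
}

#[derive(Default)]
pub struct MiniContext {
    operators: Vec<MiniOperator>,
}

impl MiniContext {
    /// Instantiate an operator inside the context, returning only a handle.
    pub fn operation(&mut self, scale: f64) -> Option<OpHandle> {
        if !scale.is_finite() || scale == 0.0 {
            return None; // refuse a scale that could not be inverted
        }
        self.operators.push(MiniOperator { scale });
        Some(OpHandle(self.operators.len() - 1))
    }

    /// Apply the operator behind `handle` to all operands, in place.
    pub fn fwd(&self, handle: OpHandle, operands: &mut [[f64; 4]]) -> bool {
        let op = match self.operators.get(handle.0) {
            Some(op) => op,
            None => return false, // unknown handle
        };
        for c in operands.iter_mut() {
            c[0] *= op.scale;
            c[1] *= op.scale;
        }
        true
    }
}

fn main() {
    let mut ctx = MiniContext::default();
    let double = ctx.operation(2.0).unwrap();
    let mut data = [[1.0, 2.0, 0.0, 0.0]];
    ctx.fwd(double, &mut data);
    println!("{:?}", data);
}
```

Since `fwd` only needs shared (`&self`) access to the operator table, operators that are never mutated after creation can in principle be read from several threads at once.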

---

```rust
// [3] Coordinates of some Scandinavian capitals
let copenhagen = Coord::geo(55., 12., 0., 0.);
let stockholm  = Coord::geo(59., 18., 0., 0.);

// [4] We put the coordinates into an array
let mut data = [copenhagen, stockholm];
```

At comments `[3]` and `[4]` we produce the input data we want to transform. Internally, RG represents angles in radians, and follows the traditional GIS coordinate order of *longitude before latitude*. Externally, however, you may pick-and-choose.

In this case, we choose human readable angles in degrees, and the traditional coordinate order used in geodesy and navigation: *latitude before longitude*. The `Coord::geo(...)` function translates that into the internal representation. It has siblings `Coord::gis(...)` and `Coord::raw(...)` which handle GIS coordinate order and raw numbers, respectively. The latter is useful for projected coordinates, Cartesian coordinates, and for coordinates with angles in radians. We may also simply give a `CoordinateTuple` as a naked array of four double precision floating point numbers:

```rust
let somewhere = Coord([1., 42., 3., 4.]);
```

The `CoordinateTuple` data type does not enforce any special interpretation of what kind of coordinate it stores: That is entirely up to the `Operator` to interpret. A `CoordinateTuple` simply consists of 4 numbers with no other implied interpretation than their relative order, given by the names *first, second, third, and fourth*, respectively.
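
The conversions described above can be sketched without the library. `Tuple4` and `to_geo` are stand-in names, and the bodies are an assumption based only on the stated internal convention (radians, longitude first), not the actual RG implementation.

```rust
// Illustrative stand-in, not the actual RG implementation.
// Internal convention assumed from the text: radians, longitude before latitude.

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Tuple4(pub [f64; 4]);

impl Tuple4 {
    /// Geodesy/navigation order: latitude before longitude, angles in degrees.
    pub fn geo(lat: f64, lon: f64, third: f64, fourth: f64) -> Self {
        Tuple4([lon.to_radians(), lat.to_radians(), third, fourth])
    }

    /// GIS order: longitude before latitude, angles in degrees.
    pub fn gis(lon: f64, lat: f64, third: f64, fourth: f64) -> Self {
        Tuple4([lon.to_radians(), lat.to_radians(), third, fourth])
    }

    /// Raw numbers, stored exactly as given.
    pub fn raw(first: f64, second: f64, third: f64, fourth: f64) -> Self {
        Tuple4([first, second, third, fourth])
    }

    /// Back to "latitude before longitude, angles in degrees".
    pub fn to_geo(self) -> [f64; 4] {
        [self.0[1].to_degrees(), self.0[0].to_degrees(), self.0[2], self.0[3]]
    }
}

fn main() {
    let copenhagen = Tuple4::geo(55., 12., 0., 0.);
    // Same point, given in GIS order
    assert_eq!(copenhagen, Tuple4::gis(12., 55., 0., 0.));
    println!("{:?}", copenhagen.to_geo());
}
```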

RG `Operator`s take *arrays of `CoordinateTuple`s* as input, rather than individual elements, so at comment `[4]` we collect the elements into an array.

---

```rust
// [5] Then do the forward conversion, i.e. geo -> utm
ctx.fwd(utm32, &mut data);
println!("{:?}", data);
```

At comment `[5]`, we do the actual forward conversion (hence `ctx.fwd(...)`) to UTM coordinates. Behind the scenes, `ctx.fwd(...)` splits up the input array into chunks of 1000 elements, for parallel processing in a number of threads (that is: At time of writing, the chunking, but not the thread-parallelism, is implemented).

As the action goes on *in place*, we allow `fwd(...)` to mutate the input data, by using the `&mut`-operator in the method call.
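
The chunked, in-place processing can be sketched with the standard library's `chunks_mut`. Only the chunk size of 1000 is taken from the text; the function name `apply_chunked` and the toy operation are assumptions made for illustration.

```rust
// Sketch of chunked in-place processing. The chunk size is from the text;
// the function name and the toy operation are made up for this example.

const CHUNK_SIZE: usize = 1000;

/// Apply `op` to each chunk of at most CHUNK_SIZE operands, in place.
/// Returns the number of chunks processed.
pub fn apply_chunked<F>(operands: &mut [[f64; 4]], op: F) -> usize
where
    F: Fn(&mut [[f64; 4]]),
{
    let mut chunks = 0;
    // chunks_mut yields disjoint mutable slices, so each chunk
    // could later be handed to a separate thread
    for chunk in operands.chunks_mut(CHUNK_SIZE) {
        op(chunk);
        chunks += 1;
    }
    chunks
}

fn main() {
    let mut data = vec![[1.0, 2.0, 0.0, 0.0]; 2500];
    let chunks = apply_chunked(&mut data, |chunk| {
        for c in chunk.iter_mut() {
            c[0] += 500_000.0; // toy "false easting"
        }
    });
    println!("{} chunks, first element: {:?}", chunks, data[0]);
}
```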

The printout will show the projected data in (easting, northing)-coordinate order:

```rust
CoordinateTuple([ 691875.6321403517, 6098907.825001632, 0.0, 0.0])
CoordinateTuple([1016066.6135867655, 6574904.395327058, 0.0, 0.0])
```

---

```rust
// [6] And go back, i.e. utm -> geo
ctx.inv(utm32, &mut data);
Coord::geo_all(&mut data);
println!("{:?}", data);
```

At comment `[6]`, we roundtrip back to geographical coordinates. Prior to print out, we let `Coord::geo_all(...)` convert from the internal coordinate representation, to the geodetic convention of "latitude before longitude, and angles in degrees".

### Redefining the world

Being intended for authoring of geodetic functionality, customization is a very important aspect of the RG design. Hence, RG allows temporary overshadowing of built in functionality by registering user defined macros and operators. This is treated in detail in examples [02 (macros)](/examples/02-user_defined_macros.rs) and [03 (operators)](/examples/03-user_defined_operators.rs). Here, let's just take a minimal look at the workflow, which can be described briefly as *define, register, instantiate, and use:*

First a macro:

```rust
// Define a macro, using hat notation (^) for the macro parameters
let macro_text = "pipeline: {
        steps: [
            cart: {ellps: ^left},
            helmert: {x: ^x, y: ^y, z: ^z},
            cart: {inv: true, ellps: ^right}
        ]
    }";

// Register the macro, under the name "geohelmert"
ctx.register_macro("geohelmert", macro_text);

// Instantiate the geohelmert macro with replacement values
// for the parameters left, right, x, y, z
let ed50_wgs84 = ctx.operation("geohelmert: {
    left: intl,
    right: GRS80,
    x: -87, y: -96, z: -120
}").unwrap();

// ... and use:
ctx.fwd(ed50_wgs84, &mut data);
```
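
To make the hat notation concrete, here is a deliberately simplified sketch of parameter substitution as plain text replacement. This is an assumption about the general idea only: `expand_macro` is a hypothetical helper, and RG's actual macro expansion may work differently.

```rust
use std::collections::HashMap;

/// Hypothetical helper: replace each `^name` in `text` with the value
/// registered for `name` in `args`. Unknown names are left untouched.
pub fn expand_macro(text: &str, args: &HashMap<&str, &str>) -> String {
    let mut out = String::with_capacity(text.len());
    let mut chars = text.chars().peekable();
    while let Some(ch) = chars.next() {
        if ch != '^' {
            out.push(ch);
            continue;
        }
        // Collect the parameter name following the hat
        let mut name = String::new();
        while let Some(&next) = chars.peek() {
            if next.is_alphanumeric() || next == '_' {
                name.push(next);
                chars.next();
            } else {
                break;
            }
        }
        match args.get(name.as_str()) {
            Some(value) => out.push_str(value),
            None => {
                out.push('^');
                out.push_str(&name);
            }
        }
    }
    out
}

fn main() {
    let mut args = HashMap::new();
    args.insert("left", "intl");
    args.insert("right", "GRS80");
    args.insert("x", "-87");
    let text = "cart: {ellps: ^left}, helmert: {x: ^x}, cart: {inv: true, ellps: ^right}";
    println!("{}", expand_macro(text, &args));
}
```

Reading the whole parameter name before looking it up avoids clashes between names sharing a prefix, e.g. `^x` and `^xy`.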

Then a user defined operator:

```rust
use geodesy::operator_construction::*;

// See examples/03-user-defined-operators.rs for implementation details
pub struct MyNewOperator {
    args: OperatorArgs,
    foo: f64,
    ...
}

// Register
ctx.register_operator("my_new_operator", MyNewOperator::operator);

// Instantiate
let my_new_operator_with_foo_as_42 = ctx.operation(
    "my_new_operator: {foo: 42}"
).unwrap();

// ... and use:
ctx.fwd(my_new_operator_with_foo_as_42, &mut data);
```

Essentially, once they are registered, macros and user defined operators work exactly like the built-ins. Also, they overshadow the built-in names, so testing alternative implementations of built-in operators is as easy as registering a new operator with the same name as a built-in.
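
One way to obtain this overshadowing behaviour is to consult the user defined table before the built-in one. The sketch below assumes exactly that; `Registry` and `ToyOperator` are hypothetical names, and the "operators" are reduced to plain functions on a single number.

```rust
use std::collections::HashMap;

// Hypothetical sketch of name lookup with overshadowing.
// Operators are reduced to plain functions for brevity.
type ToyOperator = fn(f64) -> f64;

fn builtin_double(x: f64) -> f64 {
    2.0 * x
}

pub struct Registry {
    builtins: HashMap<&'static str, ToyOperator>,
    user_defined: HashMap<String, ToyOperator>,
}

impl Registry {
    pub fn new() -> Self {
        let mut builtins: HashMap<&'static str, ToyOperator> = HashMap::new();
        builtins.insert("double", builtin_double);
        Registry { builtins, user_defined: HashMap::new() }
    }

    /// Register a user defined operator, possibly hiding a built-in.
    pub fn register(&mut self, name: &str, op: ToyOperator) {
        self.user_defined.insert(name.to_string(), op);
    }

    /// User defined names are searched first, so they win over built-ins.
    pub fn lookup(&self, name: &str) -> Option<ToyOperator> {
        self.user_defined
            .get(name)
            .or_else(|| self.builtins.get(name))
            .copied()
    }
}

fn experimental_double(x: f64) -> f64 {
    x + x + 1.0 // deliberately different, to show which one is picked
}

fn main() {
    let mut registry = Registry::new();
    println!("{}", registry.lookup("double").unwrap()(2.0));
    registry.register("double", experimental_double);
    println!("{}", registry.lookup("double").unwrap()(2.0));
}
```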

### Going ellipsoidal

Much functionality related to geometrical geodesy can be associated with the ellipsoid model in use, and hence, in a software context, be modelled as methods on the ellipsoid object.

In RG, ellipsoids are represented by the `Ellipsoid` data type:

```rust
pub struct Ellipsoid {
    a: f64,
    ay: f64,
    f: f64,
}
```

In most cases, the ellipsoid in use will be rotationally symmetrical, but RG anticipates the use of triaxial ellipsoids. As can be seen, the `Ellipsoid` data type is highly restricted, containing only the bare essentials for defining the ellipsoidal size and shape. All other items are implemented as methods:

```rust
let GRS80 = geodesy::Ellipsoid::named("GRS80");

let E = GRS80.linear_eccentricity();
let b = GRS80.semiminor_axis();
let c = GRS80.polar_radius_of_curvature();
let n = GRS80.third_flattening();
let es = GRS80.eccentricity_squared();
```

The functionality also includes ancillary latitudes, and computation of geodesics on the ellipsoid - see [example 01](../examples/01-geometrical-geodesy.rs) for details.
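
For a rotationally symmetrical ellipsoid, the derived quantities listed above follow from the semimajor axis `a` and the flattening `f` by standard closed formulas. The sketch below uses a stand-in type, `MiniEllipsoid`, with method names borrowed from the listing; it is not the RG implementation, and the GRS80 constants are the commonly published defining values.

```rust
// Stand-in type for a rotationally symmetrical ellipsoid; not the RG `Ellipsoid`.
pub struct MiniEllipsoid {
    a: f64, // semimajor axis (m)
    f: f64, // flattening
}

impl MiniEllipsoid {
    /// GRS80: a = 6378137 m, 1/f = 298.257222101
    pub fn grs80() -> Self {
        MiniEllipsoid { a: 6_378_137.0, f: 1.0 / 298.257_222_101 }
    }

    /// b = a (1 - f)
    pub fn semiminor_axis(&self) -> f64 {
        self.a * (1.0 - self.f)
    }

    /// e^2 = f (2 - f)
    pub fn eccentricity_squared(&self) -> f64 {
        self.f * (2.0 - self.f)
    }

    /// n = f / (2 - f) = (a - b) / (a + b)
    pub fn third_flattening(&self) -> f64 {
        self.f / (2.0 - self.f)
    }

    /// E = sqrt(a^2 - b^2) = a e
    pub fn linear_eccentricity(&self) -> f64 {
        self.a * self.eccentricity_squared().sqrt()
    }

    /// c = a^2 / b
    pub fn polar_radius_of_curvature(&self) -> f64 {
        self.a * self.a / self.semiminor_axis()
    }
}

fn main() {
    let grs80 = MiniEllipsoid::grs80();
    println!("b  = {:.4}", grs80.semiminor_axis());
    println!("es = {:.12}", grs80.eccentricity_squared());
    println!("n  = {:.12}", grs80.third_flattening());
    println!("E  = {:.4}", grs80.linear_eccentricity());
    println!("c  = {:.4}", grs80.polar_radius_of_curvature());
}
```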

### Recent additions

#### GYS: The Geodetic YAML Shorthand

As YAML is somewhat verbose, GYS, the "Geodetic YAML Shorthand" was introduced with RG version 0.6.0. GYS can be discerned from YAML by not containing any curly braces, using pipe symbols (`|`) to indicate pipeline steps, and in general leaving out syntactical elements which are superfluous given that we know the context is RG.

Internally, GYS is transformed to YAML by a simple mechanical rule set, so YAML is still the cornerstone of the RG descriptor system. The two pipelines shown below demonstrate the essentials of speaking GYS:

##### **A pipeline in YAML**

```yaml
ed50_etrs89: {
    steps: [
        cart: {ellps: intl},
        helmert: {x: -87, y: -96, z: -120},
        cart: {inv: true, ellps: GRS80}
    ]
}
```

##### **The same pipeline in Geodetic YAML Shorthand (GYS)**

```js
cart ellps: intl | helmert x:-87 y:-96 z:-120 | cart inv ellps:GRS80
```

A description is considered GYS, and internally translated to YAML, if *at least one* of these conditions is met:

1. It begins and/or ends with a `|` character, i.e. an empty step.
2. It contains a space-delimited `|` character, i.e. the sequence `_|_` (where `_` denotes a space).
3. It is wrapped in square brackets: `[cart ellps: intl]`.
4. The first token does not end with `:`
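
The four conditions above translate almost directly into code. `is_gys` is a hypothetical function name, and trimming surrounding whitespace and treating an empty description as "not GYS" are assumptions of this sketch rather than documented RG behaviour.

```rust
/// Hypothetical sketch of the GYS detection rules listed above.
pub fn is_gys(definition: &str) -> bool {
    let d = definition.trim();

    // 1. Begins and/or ends with an empty step
    if d.starts_with('|') || d.ends_with('|') {
        return true;
    }
    // 2. Contains a space-delimited pipe symbol
    if d.contains(" | ") {
        return true;
    }
    // 3. Wrapped in square brackets
    if d.starts_with('[') && d.ends_with(']') {
        return true;
    }
    // 4. The first token does not end with a colon
    match d.split_whitespace().next() {
        Some(first) => !first.ends_with(':'),
        None => false, // empty description: assumed not to be GYS
    }
}

fn main() {
    let gys = "cart ellps: intl | helmert x:-87 y:-96 z:-120 | cart inv ellps:GRS80";
    let yaml = "utm: {zone: 32}";
    println!("{} -> {}", gys, is_gys(gys));
    println!("{} -> {}", yaml, is_gys(yaml));
}
```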

### Coming attractions

RG is in early-stage development, so a number of additions are planned.

#### Geometric geodesy

In `[Knudsen et al, 2019]` we identified a small number of operations collectively considered the "bare minimum requirements for a geodetic transformation system":

1. Geodetic-to-Cartesian coordinate conversion, and its inverse.
2. Helmert transformations of various kinds (2D, 3D, 4D or, equivalently: 4 parameter, 3/7 parameter and 14/15 parameter).
3. The Molodensky transformation.
4. Horizontal grid shift (“NADCON-transformation”).
5. Vertical grid shift (ellipsoidal-to-orthometric height transformation).

Of these, only the first three are fully implemented in RG, while the grid shift operations are in various stages of completion. These are **need to do** elements for near future work.

Also, a number of additional projections are in the pipeline: First and foremost the Mercator projection (used in nautical charts), and the Lambert conformal conic projection (used in aeronautical charts).
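
As a reminder of what the first item of the list involves, here is a sketch of the textbook geodetic-to-Cartesian formula, followed by a pure translation (the simplest, 3 parameter, Helmert case). The function names are made up for this example, and the code is not taken from RG. The translation values are those of the ED50 examples elsewhere in this text, and the constants of the International 1924 ellipsoid are the commonly published ones.

```rust
/// Geodetic (longitude, latitude in radians, ellipsoidal height in m)
/// to geocentric Cartesian coordinates, for an ellipsoid given by a and f.
pub fn geodetic_to_cartesian(a: f64, f: f64, lon: f64, lat: f64, h: f64) -> [f64; 3] {
    let es = f * (2.0 - f); // eccentricity squared
    // Radius of curvature in the prime vertical
    let n = a / (1.0 - es * lat.sin().powi(2)).sqrt();
    [
        (n + h) * lat.cos() * lon.cos(),
        (n + h) * lat.cos() * lon.sin(),
        (n * (1.0 - es) + h) * lat.sin(),
    ]
}

/// A 3 parameter Helmert transformation: a pure translation.
pub fn translate(p: [f64; 3], t: [f64; 3]) -> [f64; 3] {
    [p[0] + t[0], p[1] + t[1], p[2] + t[2]]
}

fn main() {
    // International 1924 ellipsoid: a = 6378388 m, 1/f = 297
    let (a, f) = (6_378_388.0, 1.0 / 297.0);
    let xyz = geodetic_to_cartesian(a, f, 12f64.to_radians(), 55f64.to_radians(), 0.0);
    let shifted = translate(xyz, [-87.0, -96.0, -120.0]);
    println!("{:?} -> {:?}", xyz, shifted);
}
```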

#### Physical geodesy

Plans for invading the domain of physical geodesy are limited, although the `Ellipsoid` data type will probably very soon be extended with entries for the *International Gravity Formula, 1930* and the *GRS80 gravity formula*.

#### Coordinate descriptors

Combining the generic `CoordinateTuple`s with `CoordinateDescriptor`s will make it possible to sanity check pipelines, and automate coordinate order and unit conversions.

#### Logging

The Rust ecosystem includes excellent logging facilities, just waiting to be implemented in RG.

### Discussion

From the detailed walkthrough of the example above, we can summarize "the philosophy of RG" as:

- **Be flexible:** User defined macros and operators are first class citizens in the RG ecosystem - they are treated exactly as the built-ins, and hence, can be used as vehicles for implementation of new built-in functionality.
- **Don't overspecify:** For example, the `CoordinateTuple` object is just an ordered set of four numbers, with no specific interpretation implied. It works as a shuttle, ferrying the operand between the steps of a pipeline of `Operator`s: the meaning of the operand is entirely up to the `Operator`.
- **Transformations are important. Systems not so much:** RG does not anywhere refer explicitly to input or output system names. Although it can be used to construct transformations between specific reference frames (as in the "ED50 to WGS84" case, in the *user defined macro* example), it doesn't really attribute any meaning to these internally.
- **Coordinates and data flow pathways are four dimensional:** From end to end, data runs through RG along 4D pathways. Since all geodata capture today is either directly or indirectly based on GNSS, the coordinates are inherently four dimensional. And while much application software ignores this fact, embracing it is the only way to obtain even decimeter accuracy over time scales of just a few years. Contemporary coordinate handling software should never ignore this.
- **Draw inspiration from good role models, but not zealously:** PROJ and the ISO-19100 series of geospatial standards are important models for the design of RG, but on the other hand, RG is also built to address and investigate some perceived shortcomings in the role models.

... and, although only sparsely touched upon above:

- **Operator pipelines are awesome:** Perhaps not a surprising stance, since I invented the concept and implemented it in PROJ five years ago, through the [Plumbing for Pipelines](https://github.com/OSGeo/PROJ/pull/453) pull request.

While operator pipelines superficially look like the ISO-19100 series concept of *concatenated operations*, they are more general and, as we pointed out in `[Knudsen et al, 2019]`, also very powerful as a system of bricks and mortar for the construction of new conceptual buildings. Use more pipelines!
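
The essence of a pipeline can be captured in a few lines: going forward, the steps are applied in order; going backward, the inverse steps are applied in reverse order. The sketch below uses hypothetical types (`Step`, `Pipeline`) and toy steps, and is not the RG or PROJ implementation.

```rust
// Hypothetical sketch of pipeline semantics, with toy steps.
type StepFn = fn(&mut [f64; 4]);

pub struct Step {
    pub fwd: StepFn,
    pub inv: StepFn,
}

pub struct Pipeline {
    pub steps: Vec<Step>,
}

impl Pipeline {
    /// Forward: apply the forward function of each step, first to last.
    pub fn fwd(&self, operands: &mut [[f64; 4]]) {
        for step in &self.steps {
            for c in operands.iter_mut() {
                (step.fwd)(c);
            }
        }
    }

    /// Inverse: apply the inverse function of each step, last to first.
    pub fn inv(&self, operands: &mut [[f64; 4]]) {
        for step in self.steps.iter().rev() {
            for c in operands.iter_mut() {
                (step.inv)(c);
            }
        }
    }
}

fn shift_fwd(c: &mut [f64; 4]) { c[0] += 10.0; }
fn shift_inv(c: &mut [f64; 4]) { c[0] -= 10.0; }
fn scale_fwd(c: &mut [f64; 4]) { c[0] *= 2.0; }
fn scale_inv(c: &mut [f64; 4]) { c[0] /= 2.0; }

/// A two step toy pipeline: shift, then scale.
pub fn demo_pipeline() -> Pipeline {
    Pipeline {
        steps: vec![
            Step { fwd: shift_fwd, inv: shift_inv },
            Step { fwd: scale_fwd, inv: scale_inv },
        ],
    }
}

fn main() {
    let pipeline = demo_pipeline();
    let mut data = [[1.0, 0.0, 0.0, 0.0]];
    pipeline.fwd(&mut data);
    println!("forward: {:?}", data);
    pipeline.inv(&mut data);
    println!("inverse: {:?}", data);
}
```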

### Conclusion

Rust Geodesy is a new, still functionally limited, system for experimentation with, and authoring of, new geodetic transformations, concepts, algorithms and standards. Go get it while it's hot!

### References

**Reference:** `[Knudsen et al, 2019]`

Thomas Knudsen, Kristian Evers, Geir Arne Hjelle, Guðmundur Valsson, Martin Lidberg and Pasi Häkli: *The Bricks and Mortar for Contemporary Reimplementation of Legacy Nordic Transformations*. Geophysica (2019), 54(1), 107–116.

### Notes

#### **Note:** ellps implied

In both cases, the use of the GRS80 ellipsoid is implied, but may be expressly stated as `utm: {zone: 32, ellps: GRS80}` and `proj=utm zone=32 ellps=GRS80`, respectively.

#### **Note:** Idiomatic Rust

In production, we would check the return of `ctx.operation(...)`, rather than just `unwrap()`ping:

```rust
if let Some(utm32) = ctx.operation("utm: {zone: 32}") {
    let copenhagen = Coord::geo(55., 12., 0., 0.);
    let stockholm = Coord::geo(59., 18., 0., 0.);
    ...
}
```

In C, using PROJ, the demo program would resemble this (untested) snippet:

```C
#include <proj.h>

int main() {
    PJ_CONTEXT *C = proj_context_create();
    PJ *P = proj_create(C, "proj=utm zone=32");

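    /* A PJ created from a bare projection expects angles in radians */
    PJ_COORD copenhagen = proj_coord(proj_torad(12), proj_torad(55), 0, 0);
    PJ_COORD stockholm = proj_coord(proj_torad(18), proj_torad(59), 0, 0);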

    /* Forward */
    copenhagen = proj_trans(P, PJ_FWD, copenhagen);
    stockholm = proj_trans(P, PJ_FWD, stockholm);

    /* ... and back */
    copenhagen = proj_trans(P, PJ_INV, copenhagen);
    stockholm = proj_trans(P, PJ_INV, stockholm);

    proj_destroy(P);
    proj_context_destroy(C);
}
```

### Document History

Major revisions and additions:

- 2021-08-08: Added a section briefly describing GYS
- 2021-08-26: Extended prologue
