thoughts on the usage of types


While working on a game prototype, I ran into an interesting example of using types in programming. Sometimes when I'm feeling lazy or confused about how I want to solve a problem, I define types that have a 1-to-1 correspondence to the nouns in the problem space. This is the same impulse found in the object-oriented way of programming, though you can create abstract data types for the elements of a problem in any language that supports them. This is my default way of hacking on a problem from the bottom up when I don't know much about how to solve it, and I imagine many people follow a similar methodology. Defining lots of nouns from your problem space is not guaranteed to help you in your programming, but it can be a jumping-off point.

I have found the proliferation of nouns to be unhelpful in the long run. Or rather, I think that there is a cost to the introduction of a new noun during development of a program that needs to be offset (and then some) by the benefits that abstraction brings to the code-base. This is well-trodden ground.

"It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures." (Alan Perlis, "Epigrams on Programming")

So, we should be careful about when we define types. Counter to the epigram, the example I want to share is explicitly about creating more data structures than functions. My game prototype uses three different coordinate systems: a normalized coordinate system from (-1,-1) to (1,1) meant to be used by assets, a world coordinate system that is for defining positions and distances in the simulation, and a screen coordinate system in pixel units that defines how graphics are displayed. This time around, I decided to create types for values in these coordinate systems:

struct v2 { float x,y; };
struct npos { struct v2 v; }; /* the normalized coordinate system */
struct wpos { struct v2 v; }; /* the world coordinate system */
struct spos { struct v2 v; }; /* the screen coordinate system */

The initial concern I had was that the very same incompatibility between the types that would help catch errors would prove to be inconvenient and inelegant to work with. That turned out not to be the case, though. I believe this was helped by two separate features of my programming style as of late: my short naming conventions and my relaxed view of visibility. I do not hide the fact that there is a v2 member in npos, wpos, and spos; it can be accessed and operated on directly as an escape hatch when needed. If the same usage occurs multiple times, it can be abstracted out into a function, which makes it more convenient and removes the conversions between coordinate-specific types and generic 2-vector operations. Short names make the conversion functions easy to use:

struct wpos ntow(struct npos n, struct wpos p, float s);
struct spos wtos(struct wpos w);
struct wpos stow(struct spos s);
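To make the signatures above concrete, here is a minimal sketch of how such conversions might be implemented. The bodies are assumptions, not the prototype's actual code: ntow is taken to mean "scale the normalized coordinate by s and translate it to world position p", and because wtos and stow take a single argument, the sketch assumes they read some shared view state (the hypothetical view_origin and pixels_per_unit). It also ignores any y-axis flip between world and screen space.

```c
/* Sketch only: the camera model and the two globals are assumptions. */
struct v2   { float x, y; };
struct npos { struct v2 v; }; /* the normalized coordinate system */
struct wpos { struct v2 v; }; /* the world coordinate system */
struct spos { struct v2 v; }; /* the screen coordinate system */

/* Hypothetical view state: the world point shown at the screen origin,
   and how many pixels one world unit covers. */
static struct wpos view_origin = { { 0.0f, 0.0f } };
static float pixels_per_unit = 32.0f;

/* Normalized -> world: scale by s, then translate to position p. */
struct wpos ntow(struct npos n, struct wpos p, float s)
{
    struct wpos w;
    w.v.x = p.v.x + n.v.x * s;
    w.v.y = p.v.y + n.v.y * s;
    return w;
}

/* World -> screen: offset by the view origin, then scale to pixels. */
struct spos wtos(struct wpos w)
{
    struct spos s;
    s.v.x = (w.v.x - view_origin.v.x) * pixels_per_unit;
    s.v.y = (w.v.y - view_origin.v.y) * pixels_per_unit;
    return s;
}

/* Screen -> world: the inverse of wtos. */
struct wpos stow(struct spos s)
{
    struct wpos w;
    w.v.x = s.v.x / pixels_per_unit + view_origin.v.x;
    w.v.y = s.v.y / pixels_per_unit + view_origin.v.y;
    return w;
}
```

With these definitions a call chain such as wtos(ntow(n, p, s)) reads left to right as "screen from world from normalized", and every hop between coordinate systems is visible at the call site.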

And, if we find we are accessing the vector member for a specific operation, we can abstract that out. It is very common to want to nudge the world positions around by some offset, for example:

struct wpos nudge(struct wpos p, float dx, float dy);

But that same operation does not occur nearly as often in the normalized or screen coordinate systems. It's not like I knew that at the start of programming the game prototype; I discovered that during the process of programming and created the abstractions as needed.
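A plausible body for nudge, assuming it simply adds the offset to the wrapped vector and returns the result by value:

```c
struct v2   { float x, y; };
struct wpos { struct v2 v; }; /* the world coordinate system */

/* Return p moved by (dx, dy) in world units. The v2 member is only
   touched here, so callers keep working in wpos terms. */
struct wpos nudge(struct wpos p, float dx, float dy)
{
    p.v.x += dx; /* p is a by-value copy, so the caller's value is unchanged */
    p.v.y += dy;
    return p;
}
```

Since the function exists only for wpos, passing an npos or spos to it is a compile error rather than a silent unit mix-up.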

There can be many knock-on effects that stem from programming style. As a program increases in size, one loses the ability to use small identifiers because of name collisions. People might instinctively think of small identifiers as amateurish, but they are the main reason why increasing type safety in this scenario was palatable! Similarly, it might feel amateurish to define nudge in the world coordinate system but not the other coordinate systems. If we went ahead and did all of that out of some obligation for completeness, we would just be increasing the code size and making the code-base less ergonomic for only speculative benefits.

Expressing the coordinate systems explicitly via types has caught my mistakes a few times already. Not only that, but there is significantly less cognitive load when working with positions from different coordinate systems. For programs that work with multiple coordinate systems, I'll use this approach from now on.
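As an illustration of the kind of mistake this catches, consider a helper that only makes sense in screen space. The function on_screen below is hypothetical and not part of the prototype; the point is what the compiler does with the calls described in the trailing comment.

```c
struct v2   { float x, y; };
struct wpos { struct v2 v; }; /* the world coordinate system */
struct spos { struct v2 v; }; /* the screen coordinate system */

/* Hypothetical helper: is a pixel position inside a w-by-h window? */
int on_screen(struct spos s, float w, float h)
{
    return s.v.x >= 0.0f && s.v.x < w &&
           s.v.y >= 0.0f && s.v.y < h;
}

/* Given
       struct wpos player;
       struct spos cursor;
   a C compiler accepts
       on_screen(cursor, 1280.0f, 720.0f)
   but must diagnose both
       on_screen(player, 1280.0f, 720.0f)
       cursor = player;
   because struct wpos and struct spos are distinct, incompatible
   types even though their members are identical. */
```

Note that a plain typedef of struct v2 would not give this protection, since a typedef only introduces an alias for the same type. Wrapping the vector in separately tagged structs is what makes the types incompatible.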

I don't want the takeaway from this example to be that we should put more concepts into the type system. Rather, I want to advocate that we be selective about what we express using the type system. Currently, I've landed on the position that modeling the problem domain with the type system in the hope that it will somehow help catch errors or increase expressiveness is a mistaken practice. Instead, I had great luck here using the type system to make explicit the conversions that have historically been a source of errors for me.