[TYPES] a naive question on datatype declarations

Sat Jan 27 22:37:14 EST 2024

Hey Gershom!
So I think there are a few answers to your question depending on what the goals involved are. The simplest answer I've encountered is that a data declaration translates into a set of inference rules just like the normal inference rules of your logic. So for example, you might have something like functions as a primitive concept of your type theory, with an inference rule like

```
G !- A type    G !- B type-------------------------------- -> FormationG !- A -> B type

G, x:A !- M : B---------------------- -> IntroG !- \x.M : A -> B
G !- M : A -> B      G !- N : A------------------------------------- -> ElimG !- M N : B```

Those might be your core type theory rules, and then for every type declaration, you would add more rules. For lists, `data List a = Nil | Cons a (List a)` might become

```
G !- A type-------------------- List Formation
G !- List A type
------------------- List Intro 1
G !- Nil : List A
G !- M : A     G !- Ms : List A------------------------------------- List Intro 2G !- Cons M Ms : List A
G !- Ms : List A    G !- N : R    G, x : A, xs : List A !- C : R--------------------------------------------------------------------------- List ElimG !- case Ms of { Nil -> N ; Cons x xs -> C } : R```

So in this setting, data declarations are desugared into a set of inference rules. Easy peasy! Now, you might modify this with a semi-generic core component that makes this even easy, so that instead of having entire inference rules get added, it's just some data in some "signature", or whatever. But that's the general gist of how some systems work.
This is fine, but it means that you're kind of leaving out a bunch of things. I mean, it's generally nice to know that your language has a nice model theory. This doesn't guarantee that at all. It might be that you can have completely bogus programs! And even if your language happens to be fine, you don't have any useful theorems about it, etc. You have to construct all of those proofs yourself, for every program you write. It's very tedious.
So what many people do is try to formulate a nice, constrained little core for for what data declarations translate into, so that all the things that you can write down are well understood in a generic way, and all the hard to understand things are inexpressible. So for instance, it's well understood that fixed points of sums-of-products always exist and have guaranteed termination, etc. So then, what you can do is define in your core a small little language of "codes" that are "names" for types. One version, that uses something equivalent to sums-of-products, might be like this:

```---------------------
G !- UNIT code
G !- A code    G !- B code----------------------------------G !- PROD A B code

G !- A code    G !- B code----------------------------------G !- SUM A B code
G !- A type
---------------------------
G !- CONST A code
--------------------G !- REC code
```

These codes aren't types, they're just sort of.. syntactic representations of functors that have fixed points. So then you turn them into types like so:

```G !- C code------------------ Fix Formation
G !- Fix C type```
And then you also have some rules for how to make elements of the type `Fix C` for each possible C. There are some nuances to this so I'm not going to give an example b/c I won't do it justice, but I'll provide some links that are useful.
You'd also want to then have some syntactic rules that make it so that data declarations only parse when they're translatable into a code, and also some generic meta-theory proofs about the denotations of such functions in, say, the category of sets or infinity CPOs or whatever, since thats the kind of thing you'd want to have for denotational reasoning that you would have previously had to do by hand.
There are other approaches that people have too, for instance having a built in concept of W type, and then translating data declarations to W types. There are lot of options here, depending on what your goals are!

Some things to read:

Dagand and McBride - Elaborating Inductive Definitions https://urldefense.com/v3/__https://arxiv.org/abs/1210.6390__;!!IBzWLUs!ULbyK8fp-TW0QIzRKOO6PAud4Y-QUkY0mkJBLotd8zu8NEfox-9Rb17WlgJeg-KJQJxX6d8vIcUUiU7QdzEO0xsmeZw4bBbf$   lots of great references in this one

Nordstrom, Petersson, and Smith - Programming in Martin-Lof Type Theory https://urldefense.com/v3/__https://www.cse.chalmers.se/research/group/logic/book/book.pdf__;!!IBzWLUs!ULbyK8fp-TW0QIzRKOO6PAud4Y-QUkY0mkJBLotd8zu8NEfox-9Rb17WlgJeg-KJQJxX6d8vIcUUiU7QdzEO0xsmeR4IPKiQ$   a big introduction to one particular approach

McBride - Ornamental Algebras, Algebraic Ornaments https://urldefense.com/v3/__https://personal.cis.strath.ac.uk/conor.mcbride/pub/OAAO/LitOrn.pdf__;!!IBzWLUs!ULbyK8fp-TW0QIzRKOO6PAud4Y-QUkY0mkJBLotd8zu8NEfox-9Rb17WlgJeg-KJQJxX6d8vIcUUiU7QdzEO0xsmea6DSvej$   a paper about a lot of things, but the discussion of "descriptions" (Desc and IDesc) is really nice and relevant

- beka valentine