
Yeah - in theory they are the same thing. You can have a dependent type that's an array with a length of 5 or you can create a type called ArrayLengthFive, and in both cases the type checker ensures that you don't accidentally use an unbounded array.

But the difference is that with a dependent type, you get more guarantees (e.g. my ArrayLengthFive type could actually allow for arrays of length 20!). And the type checker forces you to prove the bounds (there could be a bug in my code when I create an ArrayLengthFive). And the dependent type allows other parts of the system to use the type info, whereas ArrayLengthFive is just an opaque type.

I do think there is room for both ways of working. In the string -> Email example, it's probably enough to parse your string and just call it an email. You don't need to try to encode all the rules about an email into the type itself. But there are certainly other cases where it is useful to encode the data constraints into the type so that you can do more operations downstream.
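To make the distinction concrete, here is a minimal Rust sketch (names are illustrative): a const-generic array length is part of the type and checked by the compiler at every use site, while a newtype like ArrayLengthFive only promises whatever its constructor happened to enforce.

```rust
// The length is part of the type, so the compiler enforces it
// everywhere -- there is no way to pass a 20-element array here.
fn first_of_five(xs: &[i32; 5]) -> i32 {
    xs[0]
}

// The newtype only guarantees what its constructor checked; a bug
// in `new` would silently weaken the invariant.
struct ArrayLengthFive(Vec<i32>);

impl ArrayLengthFive {
    fn new(v: Vec<i32>) -> Option<Self> {
        (v.len() == 5).then_some(ArrayLengthFive(v))
    }
}

fn main() {
    println!("{}", first_of_five(&[1, 2, 3, 4, 5]));
    println!("{}", ArrayLengthFive::new(vec![1, 2, 3]).is_none());
}
```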



Parsing a string into an email, of course, is already fraught with peril. Is it valid HTML? Or did you mean the email address? Is it a well formed email, or a validated address that you have received correspondence with? How long ago was it validated? Fun spin on the, "It's an older code, sir, but it checks out."

I've seen attempts at solving each of those issues using types. I am not even positive they aren't solvable.


Validation is an event, with its own discrete type separate from the address itself. This is no different from a physical address.

    123 Somewhere Lane
    Somewhereville, NY 12345
is a correctly formatted address but is almost certainly not one that physically exists.

Validation that it exists isn't solvable in the type system because, as I mentioned, it is an event. It is only true for the moment it was verified, and that may change at any point in the future, including immediately after verification.
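One way to model that in types (a hypothetical sketch, not a settled pattern): make the validation event its own value carrying a timestamp, so every caller has to decide how stale a check it is willing to accept.

```rust
use std::time::{Duration, SystemTime};

// The type records that a check happened at some instant; it cannot
// promise the address still resolves right now.
struct ValidatedEmail {
    address: String,
    validated_at: SystemTime,
}

impl ValidatedEmail {
    // The caller supplies its own staleness policy.
    fn is_fresh(&self, max_age: Duration) -> bool {
        self.validated_at
            .elapsed()
            .map(|age| age <= max_age)
            .unwrap_or(false)
    }
}

fn main() {
    let v = ValidatedEmail {
        address: "user@example.com".into(),
        validated_at: SystemTime::now(),
    };
    println!("{} {}", v.address, v.is_fresh(Duration::from_secs(3600)));
}
```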


I think I get your point. I would add reconciliation to validation, in that sometimes you cannot validate something without doing it, and are instead storing the result of a possibly out-of-band reconciliation.

I'm curious, though, how this argument does not apply to many of the other properties people try to encode into types? It is one thing if you are only building envelopes and payloads, and I agree that gets you a long way. But start building operations on top of the data, and things get a lot more tedious.

It would help to see examples that were not toy problems. Every example I have seen of this so far has not exactly left a good impression.


There are things that can embed a bit of temporal logic into types, such as a linear type that can statically guarantee a file handle is closed exactly once, with a compile error otherwise.
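Rust's ownership model gives an affine approximation of this (at most once, rather than exactly once): a function that takes the handle by value consumes it, so the compiler rejects any use after the close. A small sketch:

```rust
use std::fs::File;
use std::io::Write;

// Taking `File` by value moves it in; the handle is dropped here and
// the caller can no longer touch it afterwards.
fn close(f: File) {
    drop(f);
}

fn main() {
    let path = std::env::temp_dir().join("linear_demo.txt");
    let mut f = File::create(&path).expect("create temp file");
    writeln!(f, "hello").expect("write");
    close(f);
    // close(f); // error[E0382]: use of moved value: `f`
    std::fs::remove_file(&path).expect("cleanup");
    println!("closed once");
}
```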

For the most part, what I think people really want are branded types. F# has a nice example with units of measure, where a float can be branded Fahrenheit or Celsius or Kelvin, and functions can take one of those types as parameters.

You then have some function that can parse user input into a float and brand it one of those types. The compiler doesn't make any guarantees about the user input, only that the resulting branded value cannot be used where a different one is expected. Other good examples are degrees and radians, or imperial and metric, etc.
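The same branding can be sketched with plain newtypes in Rust (F#'s units of measure are richer, tracking units through arithmetic, but the mixing-prevention part looks roughly like this):

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Celsius(f64);

#[derive(Debug, Clone, Copy, PartialEq)]
struct Fahrenheit(f64);

// Only an explicit conversion crosses the brand boundary.
fn to_fahrenheit(c: Celsius) -> Fahrenheit {
    Fahrenheit(c.0 * 9.0 / 5.0 + 32.0)
}

fn main() {
    let boiling = Celsius(100.0);
    // Passing a raw f64 or a Fahrenheit here would not compile.
    println!("{:?}", to_fahrenheit(boiling));
}
```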

Depending on what you are doing, knowing at a type level that some number can never be negative (weight in pounds) can save you a lot of hassle. If one part of the system doesn't brand a value, and it feeds into another part of the system that is branded, you're stuck with extraneous runtime checks all over the place or a lot of manual unit tests. Instead, the compiler can point out exactly, and in every instance, where you assumed you had a validated value but instead did not.
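A hedged sketch of the "never negative" idea (hypothetical names): check once at the boundary with a smart constructor, and every downstream function taking the branded type can skip the runtime check.

```rust
#[derive(Debug, Clone, Copy)]
struct Pounds(f64);

impl Pounds {
    // The only way to obtain a `Pounds`, so downstream code can
    // assume the invariant without re-checking it.
    fn new(value: f64) -> Option<Pounds> {
        (value.is_finite() && value >= 0.0).then_some(Pounds(value))
    }
}

fn shipping_cost(weight: Pounds) -> f64 {
    // No negative-weight branch needed here.
    2.5 * weight.0
}

fn main() {
    match Pounds::new(4.0) {
        Some(w) => println!("{}", shipping_cost(w)),
        None => println!("invalid weight"),
    }
}
```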


Some of these are why I would add reconciliation to the idea. Linear types, specifically: you can encode that you only send a message once, but without doing reconciliation with the destination, you cannot be sure that it was received.

I'm torn, as the examples from small programs make a ton of sense and I largely like them. I just cannot escape that every attempt I've seen at doing these in large programs has been fairly painful. I cannot claim they are more error prone than the ones that skip out on it; I can say they are far more scarce, such that the main error condition appears to be stalling out on delivery.


> In the string -> Email example, it's probably enough to parse your string and just call it an email. You don't need to try to encode all the rules about an email into the type itself.

There is also the in-between Rust approach. Start with the user input as a byte array. Pass it to a validation function, which returns it encapsulated within a new type.

    #[derive(Clone, Hash, Ord, Eq...)]
    struct Email(Box<str>);

    // validate UTF-8 + email syntax
    fn validate(raw: &[u8]) -> Result<Email, EmailValidationError>;
What is annoying is the boilerplate required. You usually want your new type to behave like a (read-only) version of the original type, and you want some error type with various levels of detail.

You can macro your way out of most of the boilerplate, with the downside that it quickly becomes a domain specific language.
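One common way to cut some of that boilerplate by hand, before reaching for a macro, is a Deref impl so the newtype reads like a `&str` (a sketch; whether Deref on newtypes is good taste is itself debated):

```rust
use std::ops::Deref;

struct Email(Box<str>);

// Expose the newtype as a read-only view of its inner str.
impl Deref for Email {
    type Target = str;
    fn deref(&self) -> &str {
        &self.0
    }
}

fn main() {
    let email = Email("user@example.com".into());
    // &str methods work through auto-deref; no write access is exposed.
    println!("{} {}", email.len(), email.contains('@'));
}
```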


AKA the "parse, don't validate" approach [1].

1: https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...


The salient excerpt is: "A parser is just a function that consumes less-structured input and produces more-structured output", as opposed to a validation function that merely raises an error.

    - validate(raw) -> void
    - parse(raw) -> Email | Error
Of course the type signature is what seals the deal at the end of the day. But I am going to follow this naming convention from now on.
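In Rust terms (illustrative types, not from the article), the contrast is that parse keeps the evidence of the check in its return type, while validate throws it away:

```rust
struct Email(String);

// Checks and discards: a success teaches the type system nothing.
fn validate(raw: &str) -> Result<(), String> {
    if raw.contains('@') {
        Ok(())
    } else {
        Err("missing '@'".to_string())
    }
}

// Checks and keeps: a success hands back a value carrying the proof.
fn parse(raw: &str) -> Result<Email, String> {
    validate(raw).map(|()| Email(raw.to_string()))
}

fn main() {
    println!("{}", parse("user@example.com").is_ok());
    println!("{}", parse("not-an-email").is_err());
}
```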



