Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

In the spirit of "Parse, Don't Validate", rather than encode "validation" information as a boolean to be checked at runtime, you can define `Email { raw: String }` and hide the constructor behind a "factory function" that accepts any string but returns `Option<Email>` or `Result<Email,ParseError>`.

If you need a stronger guarantee than just a "string that passes simple email regex", create another "newtype" that parses the `Email` type further into `ValidatedEmail { raw: String, validationTime: DateTime }`.

While it does add some "boilerplate-y" code no matter what kind of syntactical sugar is available in the language of your choice, this approach utilizes the type system to enforce the "pass only non-malformed & working email" rule when `ValidatedEmail` type pops up without constantly remembering to check `email.isValidated`.

This approach's benefit varies depending on programming languages and what you are trying to do. Some languages offer 0-runtime cost, like Haskell's `newtype` or Rust's `repr(transparent)`, others carry non-negligible runtime overhead. Even then, it depends on whether the overhead is acceptable or not in exchange for "correctness".



I would still usually prefer email as just a string and validation as a separate property, and they both belong to some other object. Unless you really only want to know if XYZ email exists, it's usually something more like "has it been validated that ABC user can receive email at XYZ address".

Is the user account validated? Send an email to their email string. Is it not validated? Then why are we even at a point in the code where we're considering emailing the user, except to validate the email.

You can use similar logic to what you described, but instead with something like User and ValidatedUser. I just don't think there's much benefit to doing it with specifically the email field and turning email into an object. Because in those examples you can have a User whose email property is a ParseError and you still end up having to check "is the email property result for this user type Email or type ParseError?" and it's very similar to just checking a validation bool except it's hiding what's actually going on.


> I would still usually prefer email as just a string and validation as a separate property, and they both belong to some other object. Unless you really only want to know if XYZ email exists, it's usually something more like "has it been validated that ABC user can receive email at XYZ address".

> Is the user account validated? Send an email to their email string. Is it not validated? Then why are we even at a point in the code where we're considering emailing the user, except to validate the email.

You are looking at this single type in isolation. The benefit of an email type over using a string to hold the email is not validating the actual string as an email address, it's forcing the compiler to issue an error if you ever pass a string to a function expecting an email.

Consider function `foo`, which takes an email and a username parameter.

This compiles just fine but is a logic error:

    void foo (char *email, char *username);
    ...
    char *my_email = parse_input ();
    char *my_user = parse_input ();
    foo (my_user, my_email);
Using a separate type for email means that this refuses to compile:

    void foo (email_t *email, char *username);
    ...
    email_t *my_email = parse_input ();
    char *my_user = parse_input ();
    foo (my_user, my_email); // Compiler error
I hope you can see the value in having the compiler enforce correctness. I have a blog post on this, with this exact example.


  > Because in those examples you can have a User whose email property is a ParseError and you still end up having to check "is the email property result for this user type Email or type ParseError?"
In languages with a strong type system, `User` should hold `email: Option<ValidatedEmail>`. This will reject erroneous attempts `user.email = Email::parse(raw_string);` at compile time, as `Result<Email,ParseError>` is not compatible / assignable to `Option<ValidatedEmail>`.

It's kind of a "oh I forgot to check `email.isValidated`" reminder, except now being presented as an incompatible type assignment and at compile-time. Borrowing Rust's syntax, the type error can be solved with

  user.email = Email::parse(raw_string)
      .ok()
      .and_then(|wellformed_email| {
          email_service.validate_by_send_email(wellformed_email)
      });
Which more or less gets translated as "Check email well-formedness of this raw string. If it's well-formed, try to send a test email. In case of any failure during parsing or test email, leave the `user.email` field to be empty (represented with `Option::None`)".

  > and it's very similar to just checking a validation bool except it's hiding what's actually going on.
Arguably, it's the other way around. Looking back at `email: Option<ValidatedEmail>`, it's visible at compile-time `User` demands "checking validation bool", violate this and you will get a compile-time error.

On the other hand, the usual approach of assigning raw string directly doesn't say anything at all about its contract, hiding the contract of `user.email` must be a well-formed, contactable email. Not only it's possible to assign arbitrary malformed "email" string, remembering to check `email.isValidated` is also programmer due diligence, forget once and now there's a bug.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: