
I don't understand why the data is unstructured, and thus SQL is not an option. Isn't the data coming in from calls to the company's own API?


It has to do with semantics and the way you view your data. SQL is not a Turing-complete language (without abusing it!) and represents a subset of relational algebra. For us, relations might seem to resemble those of an RDBMS at first sight, but we work with them in a different way.

Also, the input data format is not tightly coupled to the LDB system. One can have different processing scripts (Lisp) that organize the data in a different way. Or you can write a C module that extends the underlying DB engine to work with your implementation or even your database infrastructure (whether it includes SQL, NoSQL, etc.).


I'm not sure if I understand your answer. To quote from the article again:

During our tests, we saw that SQL databases weren’t a good fit due to the fact that our data were unstructured and we needed a lot of complex “joins” (and many indexes).

I guess that you say it's unstructured because the most important parts of bug reports are stack traces and debug messages? My naive assumption is that stack traces can be stored in a relational way, and it seems that you do:

For us, relations might seem to resemble those of an RDBMS at first sight, but we work with them in a different way.

From 'unstructured', 'complex "joins"' and 'work in a different way' I infer that you do a lot of string processing on the fly, and the real problem was about precomputing everything vs. staying flexible?
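To make the naive assumption above concrete, here is a rough sketch of what "stack traces stored in a relational way" could look like. It is only an illustration using Python's built-in sqlite3 module; the table names (report, frame), the column names, and the helper functions are all hypothetical and say nothing about how LDB or the article's system actually stores its data.

```python
import sqlite3


def build_db():
    """Create an in-memory database with one row per report and one row per stack frame."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE report (
            id      INTEGER PRIMARY KEY,
            message TEXT NOT NULL
        );
        CREATE TABLE frame (
            report_id INTEGER NOT NULL REFERENCES report(id),
            depth     INTEGER NOT NULL,   -- 0 is the innermost (crashing) frame
            func_name TEXT NOT NULL,
            PRIMARY KEY (report_id, depth)
        );
        """
    )
    return conn


def add_report(conn, message, frames):
    """Store a report; `frames` lists function names, innermost first."""
    cur = conn.execute("INSERT INTO report (message) VALUES (?)", (message,))
    report_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO frame (report_id, depth, func_name) VALUES (?, ?, ?)",
        [(report_id, depth, name) for depth, name in enumerate(frames)],
    )
    return report_id


def reports_crashing_in(conn, func_name):
    """Return ids of reports whose innermost frame is `func_name`."""
    rows = conn.execute(
        "SELECT report_id FROM frame "
        "WHERE depth = 0 AND func_name = ? ORDER BY report_id",
        (func_name,),
    )
    return [row[0] for row in rows]
```

With a layout like this, "which reports crashed in the same function" is an indexed lookup rather than string processing at query time. The cost is that the frame structure has to be decided and parsed up front, which is the precompute-versus-flexibility trade-off I am asking about.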



