
I don't think the HDDs are being used for any intensive loads. They have too much latency for most of that. It's probably just archival storage for their scraped content and generated slop.



For "cold" archival storage you would want to use tape, which is far cheaper per TB at scale.

I don't mean that type of archive, but rather "just in case" data like "last month's scrape of this website" after we scraped it 5 more times this month or higher resolution versions of book scans. You might want to still be able to dump it out quickly if you need it. Money is no object for these companies and the cost of HDDs is more than low enough for the flexibility they provide.

If demand for hard drives is this high, then it sounds like there wouldn't be nearly enough tape around either.

This is why I am buying a couple of LTO-6 tapes. So far I've been able to buy 4 for approx. 120 EUR, 2.5 TB each. They have been around 30 EUR each for the past few years, and still roughly are (leaning towards 35 EUR though). I bought a second-hand drive for about 500 EUR, and an HBA for it.
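To put those numbers in per-TB terms, here is a rough sketch of the arithmetic. It only uses the figures from the comment (4 tapes, 120 EUR, 2.5 TB each, 500 EUR drive); the HBA price isn't stated, so it is left out, and the function name `cost_per_tb` is just made up for illustration.

```python
def cost_per_tb(tape_count, tapes_total_eur, tb_per_tape, drive_eur=0.0):
    """EUR per TB of native tape capacity, optionally including the drive."""
    total_tb = tape_count * tb_per_tape
    return (tapes_total_eur + drive_eur) / total_tb


# Figures quoted in the comment; HBA cost is unknown and not included.
media_only = cost_per_tb(4, 120, 2.5)
with_drive = cost_per_tb(4, 120, 2.5, drive_eur=500)

print(f"media only: {media_only:.2f} EUR/TB")
print(f"with drive: {with_drive:.2f} EUR/TB")
```

The drive dominates at this scale, so the per-TB figure only gets attractive as more tapes are added.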

Tapes are great for true cold storage (will easily last many decades!) but they will wear out significantly with more intense use: you only get a couple hundred passes total over their full data capacity, either read or write. In practice, you still need plenty of big hard disks to act as nearline storage for practical use, and the tape only rarely does storage and retrieval in bulk. This is also why you see mechanical tape libraries with tens or hundreds of tapes for a single read/write unit: you don't really need more than that.
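As a back-of-the-envelope illustration of the wear point, the sketch below treats "a couple hundred passes" as a budget of 200 full-capacity passes. That number is an assumption picked for the example, not a vendor spec, and the calculation ignores every other aging mechanism.

```python
ASSUMED_PASS_BUDGET = 200  # assumption: "a couple hundred" full passes


def years_until_worn(pass_budget, full_passes_per_year):
    """Years until the pass budget is used up at a steady rate."""
    if full_passes_per_year <= 0:
        raise ValueError("full_passes_per_year must be positive")
    return pass_budget / full_passes_per_year


# Cold storage: one full write plus one full verify read per year.
cold_years = years_until_worn(ASSUMED_PASS_BUDGET, 2)

# Nearline-style use: one full read of the tape every day.
nearline_years = years_until_worn(ASSUMED_PASS_BUDGET, 365)

print(f"cold: {cold_years:.0f} years, nearline: {nearline_years:.2f} years")
```

Under these assumptions wear is a non-issue for cold storage, but a tape read end to end daily would burn through the budget in well under a year.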

Yes, I will use them as cold storage, nothing else. Right now, I have the following scheme:

1x a live server (Proxmox, NAS, firewall, and various other capabilities). 2x enterprise NVMe in RAID-1, with NAS storage on RAID-1 HDDs. 12th gen Intel, so relatively power friendly apart from the enterprise HDDs. 10 Gbit local, 1 Gbit fiber remote.

1x a live backup server in the same city. Syncs every night. I should power it down the rest of the time, but I don't as of now, since it also gets a livestream of my doorbell camera (I don't use cloud for it). Has 1 Gbit fiber and RAID-1. Runs the OS off RAID-1 (cannot add NVMe, older Synology; well, maybe USB3 would work, but I'd rather not).

1x a backup server in the same location as my home server, which auto-starts and syncs every week. This is my old main server, a Xeon, so not very power friendly. Also RAID-1 NVMe. 10 Gbit local.

1x a remote cloud (Norway), to have another copy of the most important data. Doesn't contain everything. Costs me only 50 EUR/year.

So that is a lot of copies of the same data, and quite frankly I don't need this many. For the HDDs I want to get rid of RAID-1 and use either RAID-0 or JBOD, doubling the available capacity (I'm at the max as it is) while still having great data redundancy through the other copies. And I want to store my tapes off-site, although they wouldn't hold the latest and greatest backups. I still have to look up how to do FDE with LTFS though, but I'll figure it out.
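For what it's worth, the "doubling" follows directly from how the layouts use two disks. A minimal sketch, where the 8 TB disk size is a placeholder (the actual sizes aren't mentioned) and `usable_tb` is a hypothetical helper:

```python
def usable_tb(disk_sizes_tb, layout):
    """Usable capacity in TB for a simple array of disks."""
    if not disk_sizes_tb:
        raise ValueError("need at least one disk")
    if layout == "raid1":
        # Every disk holds the same data; the smallest disk is the limit.
        return min(disk_sizes_tb)
    if layout == "raid0":
        # Striping uses an equal share of each disk.
        return min(disk_sizes_tb) * len(disk_sizes_tb)
    if layout == "jbod":
        # Disks are simply concatenated.
        return sum(disk_sizes_tb)
    raise ValueError(f"unknown layout: {layout}")


disks = [8, 8]  # placeholder sizes, not from the post
for layout in ("raid1", "raid0", "jbod"):
    print(layout, usable_tb(disks, layout), "TB")
```

The trade-off is that with RAID-0 a single disk failure loses the whole array, so the redundancy then lives entirely in the other servers, the cloud copy and the tapes.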

It also seems like a good moment to sell some old hardware, given the current prices, but I am not sure if I will. Just something to ponder later. You'd think I'd like to sell off the Xeon with its 30W CPU, but I quite like the machine (HP Microserver Gen10 Plus). I'd rather sell the Synology, which is still a decent machine, but I use voodoo to run recent software on it, and ZFS (with Homebrew / Nixpkgs). Though neither is useful for ML.



