Hacker News with Generative AI: Data Storage

Huawei developing SSD-tape hybrid amid US tech restrictions (blocksandfiles.com)
Huawei’s in-house development of Magneto-Electric Disk (MED) archive storage technology combines an SSD with a Huawei-developed tape drive to provide warm (nearline) and cold data storage.
Transactional Object Storage? (mbrt.dev)
I was frustrated by the gap between stateless and stateful applications in the cloud. While I could easily spin up a stateless application as a “serverless” function in any major cloud provider and pretty much forget about it, persisting data between requests was a game of pick two among three: cheap, strongly consistent, portable.
Upspin: A framework for naming everyone's everything (upspin.io)
Upspin is an attempt to address problems like these, and many more.
Backblaze Drive Stats for Q3 2024 (backblaze.com)
As of the end of Q3 2024, Backblaze was monitoring 292,647 hard disk drives (HDDs) and solid state drives (SSDs) in our cloud storage servers located in our data centers around the world.
Floppy Disk Storage (history) (ibm.com)
The once-ubiquitous data storage device gave rise to the modern software industry
Show HN: OpenDAL, one API to access all the storages (S3, Azblob, HDFS, etc.) (github.com/apache)
Apache OpenDAL™: Access Data Freely
LocalStorage vs. IndexedDB vs. Cookies vs. OPFS vs. WASM-SQLite (rxdb.info)
So you are building that web application and you want to store data inside of your user's browser. Maybe you just need to store some small flags or you even need a fully fledged database.
Consider adding warnings against using ZFS native encryption (github.com/openzfs)
Among experienced ZFS users and developers, it seems to be conventional wisdom that ZFS native encryption is not suitable for production usage, particularly when combined with snapshotting and zfs send/recv. There is a long-standing data corruption issue with many firsthand user reports:
19th-century photography technique employed in novel data storage method (ieee.org)
DNA stores data in bits after epigenetic upgrade (nature.com)
Icechunk: An open-source, cloud-native transactional tensor storage engine (earthmover.io)
Icechunk is a brand new open-source transactional storage engine for tensor / ND-array data designed for use on cloud object storage.
Show HN: Vortex – a high-performance columnar file format (github.com/spiraldb)
Vortex is a toolkit for working with compressed Apache Arrow arrays in-memory, on-disk, and over-the-wire.
Windows 11 24H2 hoards 8.63 GB of junk you can't delete (theregister.com)
Windows 11 24H2 users are finding there is undeletable data that remains on their devices after installing the recently released feature update.
Improving Parquet Dedupe on Hugging Face Hub (huggingface.co)
The Xet team at Hugging Face is working on improving the efficiency of the Hub's storage architecture to make it easier and quicker for users to store and update data and models.
IEEE Roadmap Outlines Development of Mass Digital Storage Technology (ieee.org)
SlateDB – An embedded database built on object storage (slatedb.io)
Unlike traditional LSM-tree storage engines, SlateDB writes data to object storage to provide bottomless storage capacity, high durability, and easy replication.
Build a serverless ACID database with this one neat trick (atomic PutIfAbsent) (eatonphil.com)
Delta Lake is an open protocol for serverless ACID databases. Due to its simplicity, scalability, and the number of open-source implementations, it's quickly becoming the DuckDB of serverless transactional databases. Iceberg is a contender too, and is similar in many ways. But since Delta Lake is simpler (simple != better) that's where we'll focus in this post.
How Discord stores trillions of messages (2023) (discord.com)
In 2017, we wrote a blog post on how we store billions of messages.
A Guide to Imaging Obscure Floppy Disk Formats (zenodo.org)
Memory institutions are grappling with the challenges posed by digital carriers in their collections. While solutions for more recent carriers like hard drives, optical discs, and flash storage are readily available, the landscape becomes trickier when dealing with older formats such as floppy disks.
Tinymind – Write and sync your blog and memo data with GitHub (github.com/mazzzystar)
Turn your GitHub into a blog & memo data storage place with one-click Sign in. No server needed - every input automatically syncs to your GitHub repository.
No Data Lasts Forever (lilysthings.org)
No matter what you do, no data will last forever. Your hard drive will fail. Your backup drives will fail. Tech companies will go under and sell off their assets. Optical discs will rot. Books will decompose. Even if none of these things happen, a natural or manmade disaster could come by and destroy it all anyway.
Mkfs.fat on Linux vs. OS/2 2.1 (uninformativ.de)
I often run VMs with old operating systems in them, like OS/2 2.1. Exchanging data with these VMs is a bit tedious, because there's usually no networking. I can basically only use floppy images or HDD images. For the longest time, I used a 256 MB disk image that I once created manually under MS-DOS. This usually worked, but being confined to a fixed size is not that great.
Human genome stored on 'everlasting' memory crystal (phys.org)
University of Southampton scientists have stored the full human genome on a 5D memory crystal—a revolutionary data storage format that can survive for billions of years.
5D 'Eternity Crystal' Stores 360 TB of Data for Billions of Years (newatlas.com)
Scientists have stored the entire human genome on a five-dimensional crystal that’s capable of digitally storing up to 360 terabytes of information and is built to survive for billions of years.
Valkey 8 sets a new bar for open-source in-memory NoSQL data storage (beehiiv.com)
Vienna, Austria: Valkey, the Redis fork, is kicking rump and taking names. At Open Source Summit Europe, the Linux Foundation announced the release of Valkey 8.0, a giant step forward to the open-source in-memory NoSQL data store. This release focuses on enhancing performance, reliability, and observability, marking a major milestone for the project initially forked from Redis due to licensing changes.
Human genome stored on 'everlasting' memory crystal (southampton.ac.uk)
M-DISC: The storage medium that lasts 1000 years (wikipedia.org)
M-DISC (Millennial Disc) is a write-once optical disc technology introduced in 2009 by Millenniata, Inc.[1] and available as DVD and Blu-ray discs.[2]