U: A Data Store Organized by Content Key
Rather than storing data files in a hierarchical directory structure in which both directories and files are given string names, U stores each file under its content key, a hash of the file’s contents. The keys are generated by either SHA1 or SHA3 (Keccak-256). Storage by content key has several advantages. For one, it is trivial to determine whether a file is corrupt: you recalculate the hash and compare it with the key. In a distributed storage system, files are requested by key, so every machine participating in a retrieval can check the file’s integrity as it passes through, dropping the file and re-requesting it if the hash doesn’t match the content key.
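The integrity check described above can be sketched with the Go standard library alone. This is a minimal illustration, not U's own code: the helper names sha1Hex and matchesKey are hypothetical, and the sketch assumes keys are hex-encoded SHA1 digests.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"strings"
)

// sha1Hex returns the lowercase hex SHA-1 digest of data.
// Hypothetical helper for illustration; not part of U's API.
func sha1Hex(data []byte) string {
	sum := sha1.Sum(data)
	return hex.EncodeToString(sum[:])
}

// matchesKey reports whether data hashes to key. The comparison ignores
// the case of the hex digits (an assumption about how keys are written).
func matchesKey(data []byte, key string) bool {
	return strings.EqualFold(sha1Hex(data), key)
}

func main() {
	data := []byte("some file content")
	key := sha1Hex(data)

	// An intact copy matches its key.
	fmt.Println(matchesKey(data, key))

	// A copy damaged in transit does not, so it would be dropped
	// and requested again.
	damaged := append([]byte{}, data...)
	damaged[0] ^= 0xff
	fmt.Println(matchesKey(damaged, key))
}
```

A node relaying a file would run the same comparison on the bytes it received before passing them on.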
U is optimized for storing very large numbers of files. The first byte of the content key (its first two hex digits) determines which top-level directory the file goes in; the second byte determines its lower-level directory. So if a file’s content hash is abcdef…1234, then it will be stored in ab/cd/ef…1234. There are 256 top-level directories and 256 subdirectories below each of these, so 256 x 256 = 65,536 lower-level directories.
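The mapping from key to path can be sketched as follows. The functions splitKey and pathForKey are hypothetical names used only for illustration, and the layout simply follows the description above; the library's own GetPathForKey may differ in details such as validation.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
)

// splitKey breaks a hex key into the top-level directory name (first byte,
// two hex digits), the lower-level directory name (second byte), and the
// remaining digits used as the file name. It checks only the length, not
// that the characters are valid hex.
func splitKey(key string) (top, sub, rest string, err error) {
	if len(key) < 5 {
		return "", "", "", errors.New("key too short to split")
	}
	return key[:2], key[2:4], key[4:], nil
}

// pathForKey joins the pieces under root using the platform's separator.
func pathForKey(root, key string) (string, error) {
	top, sub, rest, err := splitKey(key)
	if err != nil {
		return "", err
	}
	return filepath.Join(root, top, sub, rest), nil
}

func main() {
	p, err := pathForKey("uDir", "abcdef0123456789")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p)
}
```

Because each directory level is named by one byte, neither level can ever hold more than 256 entries, which keeps directory lookups cheap as the store grows.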
// Determine the SHA1 or SHA3 content hash of an arbitrary file
func FileSHA1(path string) (hash string, err error)
func FileSHA3(path string) (hash string, err error)
// Create a u256x256 directory structure
func New(path string) *U256x256
// Attributes for files in U, a u256x256 directory tree
func (u *U256x256) Exists(key string) bool
func (u *U256x256) FileLen(key string) (length int64, err error)
func (u *U256x256) GetPathForKey(key string) string
// Copy a data file and add the copy to U using an SHA1 key. If the
// key doesn't match, the operation fails.
func (u *U256x256) CopyAndPut1(path, key string) (
written int64, hash string, err error)
// Retrieve a file by its SHA1 key.
func (u *U256x256) GetData1(key string) (
data []byte, err error)
// Insert a data file into U; the original is lost.
func (u *U256x256) Put1(inFile, key string) (
length int64, hash string, err error)
// Write a buffer into U, storing it by its SHA1 key.
func (u *U256x256) PutData1(data []byte, key string) (
length int64, hash string, err error)
// Similar functions using the SHA3 (Keccak-256) hash function.
func (u *U256x256) CopyAndPut3(path, key string) (
written int64, hash string, err error)
func (u *U256x256) GetData3(key string) (
data []byte, err error)
func (u *U256x256) Put3(inFile, key string) (
length int64, hash string, err error)
func (u *U256x256) PutData3(data []byte, key string) (
length int64, hash string, err error)
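To make the semantics in the comments above concrete, here is a rough, self-contained sketch of a put that fails when the key doesn't match, followed by a get. It does not use the U package: the store type and its putData, getData, and pathFor methods are hypothetical stand-ins, and the file modes and error messages are assumptions.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// store is a hypothetical stand-in for U256x256; it records only the root.
type store struct{ root string }

// pathFor returns root/ab/cd/ef... for a hex key of at least five characters.
func (s *store) pathFor(key string) (string, error) {
	if len(key) < 5 {
		return "", errors.New("key too short")
	}
	return filepath.Join(s.root, key[:2], key[2:4], key[4:]), nil
}

// putData writes data under key, refusing the write if the SHA-1 of data
// is not equal to key.
func (s *store) putData(data []byte, key string) (int64, string, error) {
	sum := sha1.Sum(data)
	hash := hex.EncodeToString(sum[:])
	if hash != key {
		return 0, hash, fmt.Errorf("key %s does not match content hash %s", key, hash)
	}
	p, err := s.pathFor(key)
	if err != nil {
		return 0, hash, err
	}
	if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
		return 0, hash, err
	}
	if err := os.WriteFile(p, data, 0o644); err != nil {
		return 0, hash, err
	}
	return int64(len(data)), hash, nil
}

// getData reads back the bytes stored under key.
func (s *store) getData(key string) ([]byte, error) {
	p, err := s.pathFor(key)
	if err != nil {
		return nil, err
	}
	return os.ReadFile(p)
}

// demo stores a buffer in a temporary directory, reads it back, and checks
// that a put with the wrong key is rejected.
func demo() error {
	dir, err := os.MkdirTemp("", "u-sketch")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	s := &store{root: dir}
	data := []byte("hello, U")
	sum := sha1.Sum(data)
	key := hex.EncodeToString(sum[:])

	if _, _, err := s.putData(data, key); err != nil {
		return err
	}
	got, err := s.getData(key)
	if err != nil {
		return err
	}
	if string(got) != string(data) {
		return errors.New("round trip returned different bytes")
	}
	if _, _, err := s.putData([]byte("other"), key); err == nil {
		return errors.New("mismatched key was accepted")
	}
	return nil
}

func main() {
	if err := demo(); err != nil {
		fmt.Println("demo failed:", err)
		os.Exit(1)
	}
	fmt.Println("ok")
}
```

Checking the hash before writing means a corrupt buffer never reaches the directory tree, so anything found in the store can be assumed to have matched its key when it was written.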