Storage In System Design

[TOC]

Storage System

[Figure: storage_system]

Object Store

[Figure: object_store_use_case]

Example 1: Dropbox Multimedia Search

Indexing Pipeline for Metadata

[Figure: dropbox_index_pipeline]

Data moves through the indexing pipeline in this order:

  • Raw files are stored in a blob store
  • Riviera extracts features and metadata from these files
  • Information flows through third-party connectors
  • Kafka message brokers transport the data
  • Transformers process and structure the information
  • Finally, everything populates the search index
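
The stages above can be sketched as a small in-memory pipeline. This is only an illustration under stated assumptions: `extract_metadata` stands in for Riviera, a `deque` stands in for a Kafka topic, and the field names (`exif`, `camera`, `taken_at`) are hypothetical rather than Dropbox's actual schema.

```python
from collections import deque


def extract_metadata(blob: dict) -> dict:
    """Stand-in for Riviera: pull raw metadata out of a stored file."""
    return {"file_id": blob["file_id"], "raw_meta": blob.get("exif", {})}


def transform(message: dict) -> dict:
    """Stand-in for a transformer: flatten raw metadata into an indexable document."""
    meta = message["raw_meta"]
    return {
        "file_id": message["file_id"],
        "camera": meta.get("camera", "unknown"),
        "taken_at": meta.get("taken_at"),
    }


def run_pipeline(blob_store: list) -> dict:
    """Push every blob through extract -> queue -> transform -> index."""
    queue = deque()      # hypothetical stand-in for a Kafka topic
    search_index = {}    # hypothetical stand-in for the search index

    # Connector side: publish extracted metadata as messages.
    for blob in blob_store:
        queue.append(extract_metadata(blob))

    # Consumer side: transform each message and write it to the index.
    while queue:
        doc = transform(queue.popleft())
        search_index[doc["file_id"]] = doc
    return search_index


blob_store = [
    {"file_id": "f1", "exif": {"camera": "X100", "taken_at": "2024-05-01"}},
    {"file_id": "f2"},  # a file with no extractable metadata
]
index = run_pipeline(blob_store)
```

Putting a queue between extraction and transformation decouples the two stages, so a slow transformer delays indexing without blocking metadata extraction.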

Geolocation-Aware Retrieval System

[Figure: dropbox_retrieval_system]

During indexing, when a file's metadata contains GPS coordinates, Dropbox converts those coordinates into a hierarchical chain of location IDs. For example, a photo taken in San Francisco generates a chain linking San Francisco to California to the United States. This hierarchy matters because it lets a search match at any geographic level: the same photo can be found by a query for San Francisco, California, or the United States.
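
A minimal sketch of the location chain idea, assuming the GPS coordinates have already been resolved to a leaf location ID (reverse geocoding is out of scope here). The `PARENT` table, the ID strings, and the function names are hypothetical and only illustrate the technique.

```python
# Hypothetical parent table: each location ID maps to its enclosing region.
PARENT = {
    "san_francisco": "california",
    "california": "united_states",
    "united_states": None,
}


def location_chain(leaf_id: str) -> list:
    """Walk from the most specific location up to the root."""
    chain = []
    current = leaf_id
    while current is not None:
        chain.append(current)
        current = PARENT.get(current)
    return chain


def index_photo(index: dict, file_id: str, leaf_id: str) -> None:
    """Record the file under every location ID in its chain."""
    for location_id in location_chain(leaf_id):
        index.setdefault(location_id, set()).add(file_id)


def search_by_location(index: dict, location_id: str) -> set:
    """Return every file whose chain contains the queried location."""
    return index.get(location_id, set())


geo_index = {}
index_photo(geo_index, "photo1", "san_francisco")
california_hits = search_by_location(geo_index, "california")
```

Storing the whole chain at index time makes a query at any level a single lookup, at the cost of a few extra index entries per file.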

Just-In-Time Preview Generation

[Figure: dropbox_preview]

When a search returns results, the system generates preview URLs that the frontend can fetch. These URLs point to a preview service built on top of Riviera that generates thumbnails and previews in multiple resolutions on the fly. To avoid repeatedly generating the same preview, the system caches them for 30 days, striking a balance between storage costs and performance.
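
The generate-on-demand and cache behavior can be sketched as a small time-to-live cache. This is an assumption-laden illustration, not Dropbox's implementation: `PreviewCache`, the `render` callback, and the injectable `clock` are hypothetical names, and a real deployment would use a shared cache rather than a per-process dictionary.

```python
import time

THIRTY_DAYS_SECONDS = 30 * 24 * 60 * 60


class PreviewCache:
    """Generate previews on first request and reuse them until they expire."""

    def __init__(self, render, ttl_seconds=THIRTY_DAYS_SECONDS, clock=time.time):
        self._render = render    # hypothetical callback: (file_id, resolution) -> bytes
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested without waiting
        self._entries = {}       # (file_id, resolution) -> (expires_at, preview)

    def get(self, file_id: str, resolution: str) -> bytes:
        key = (file_id, resolution)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # cache hit: skip regeneration
        preview = self._render(file_id, resolution)
        self._entries[key] = (now + self._ttl, preview)
        return preview
```

Keying on both file ID and resolution keeps each rendered size cached independently, which matches the idea of serving previews in multiple resolutions.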

Reference

[1] Block, Object, and File Storage in System Design

[2] System Design CheatSheet for Interview

[3] Modern Storage Systems

[4] Dropbox Multimedia Search: Making File Search More Useful