Amazon S3: The Service Behind Everything
Pick any modern AWS app and trace its data. Sooner or later, you hit Amazon Simple Storage Service (S3). Customer uploads land in S3. Server logs go to S3. Lambda deployment packages live in S3. CloudFront often serves content from S3. Data lakes are built on S3. Backups end up in S3. The service is so foundational that it shows up as its own line item on almost every AWS bill.
This chapter covers what S3 actually is, how it differs from a file system, and the vocabulary the rest of this course depends on.
Objects, Not Files
S3 stores objects in buckets. The difference from a file system matters:
- An object has a key (its name), a value (the bytes), metadata (key-value pairs), and a version ID if versioning is on;
- There are no real directories. The slashes in photos/2026/cat.jpg are just part of the key. S3 fakes folders in the Console for human comfort;
- Objects are immutable. To "edit" one, you upload a new version with the same key;
- Maximum object size is 5 TB. Maximum single PUT is 5 GB — beyond that you must use multipart upload, covered in chapter 5.
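To make the "no real directories" point concrete, here is a minimal sketch in plain Python (no AWS calls) of how a folder view can be derived from a flat set of keys by splitting on a delimiter. The function name `list_folder` and the sample keys other than photos/2026/cat.jpg are made up for illustration; this only mimics the idea and is not how S3 is implemented.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Mimic a delimiter-based listing over a flat set of keys.

    Returns (objects, common_prefixes): the keys sitting directly under
    `prefix`, and the "subfolder" prefixes inferred from the next delimiter.
    """
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter looks like a folder.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# Conceptually, a bucket is a flat mapping from key to bytes.
bucket = {
    "photos/2026/cat.jpg": b"...",
    "photos/2026/dog.jpg": b"...",
    "photos/2025/bird.jpg": b"...",
    "readme.txt": b"...",
}

top_objects, top_folders = list_folder(bucket)
year_objects, year_folders = list_folder(bucket, prefix="photos/2026/")
```

At the top level the sketch reports one object (readme.txt) and one apparent folder (photos/), even though the mapping itself contains nothing but four full keys.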
Buckets Are Globally Named
Bucket names live in a single global namespace. If acme-uploads is taken anywhere on AWS, you cannot use it either. Buckets are also region-scoped — when you create one, you pick a region, and the data stays there unless you replicate it.
Names follow DNS rules: lowercase, 3 to 63 characters, no underscores, no consecutive periods. The classic rookie mistake is picking a name already taken — go specific like acme-prod-uploads-2026.
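The naming rules above can be checked locally before you ever call AWS. The sketch below covers the rules listed here plus one more from the AWS naming rules (a name must begin and end with a letter or digit). The helper name `is_plausible_bucket_name` is hypothetical, the check is not exhaustive, and it cannot tell you whether a name is already taken.

```python
import re

# Lowercase letters, digits, hyphens, and periods only; 3 to 63 characters;
# first and last character must be a letter or digit.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_plausible_bucket_name(name: str) -> bool:
    """Return True if `name` passes the basic syntax rules for bucket names."""
    if not _BUCKET_NAME.fullmatch(name):
        return False
    if ".." in name:  # no consecutive periods
        return False
    return True


candidates = ["acme-prod-uploads-2026", "Acme-Uploads", "my_bucket", "ab", "a..b"]
results = {name: is_plausible_bucket_name(name) for name in candidates}
```

Only the first candidate passes. The others fail on uppercase letters, an underscore, length, and consecutive periods respectively.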
Durability and Availability
S3 advertises 11 nines (99.999999999%) of durability — meaning if you store 10 million objects, you should expect to lose one every 10,000 years on average. AWS hits this number by replicating each object across multiple devices in at least three availability zones within the region.
Availability is a separate number — typically 99.99% for the standard storage class. Durability is "will the data survive?". Availability is "can I read it right now?".
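The two figures are easier to compare as back-of-the-envelope arithmetic. This sketch assumes the durability number is read as an average annual loss rate per object, and treats the 99.99% availability figure as a fraction of the minutes in a year. Both are simplifications for illustration, not guarantees.

```python
# 11 nines of durability corresponds to an annual loss rate of about 1e-11 per object.
ANNUAL_LOSS_RATE = 1e-11   # 1 - 0.99999999999
AVAILABILITY = 0.9999      # figure quoted above for the standard storage class

objects_stored = 10_000_000

# Expected number of objects lost per year, and the average wait for one loss.
expected_losses_per_year = objects_stored * ANNUAL_LOSS_RATE
years_per_lost_object = 1 / expected_losses_per_year

# Expected time per year during which reads may fail.
minutes_per_year = 365 * 24 * 60
expected_unavailable_minutes = (1 - AVAILABILITY) * minutes_per_year

print(f"about one lost object every {years_per_lost_object:,.0f} years")
print(f"about {expected_unavailable_minutes:.1f} minutes of unavailability per year")
```

The first number reproduces the 10,000-year figure. The second comes out at a little under an hour per year, which is why the two properties are worth keeping apart.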
Strong Read-After-Write Consistency
Diana built a Lambda that wrote a file to S3 and immediately read it back to process it. For years, this pattern could misbehave: S3 was eventually consistent for overwrites, deletes, and listings, so a read right after a write might return stale data or miss a key that had just been written. As of December 2020, S3 provides strong read-after-write consistency for all operations: PUT, GET, LIST, and DELETE. After a successful PUT, the next GET returns the new object. Always.
This single change quietly eliminated an enormous category of bug.
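Diana's pattern can be sketched as a small function. It takes an S3 client as a parameter, so it works with a boto3 client (`put_object` and `get_object` are real boto3 calls) or with an in-memory stand-in for local testing. The function name `write_then_read` and the key in the usage comment are hypothetical.

```python
def write_then_read(s3, bucket: str, key: str, body: bytes) -> bytes:
    """PUT an object, then GET it straight back.

    With strong read-after-write consistency, the GET that follows a
    successful PUT returns the bytes that were just written.
    """
    s3.put_object(Bucket=bucket, Key=key, Body=body)
    response = s3.get_object(Bucket=bucket, Key=key)
    return response["Body"].read()


# Usage with a real client (needs boto3 installed and AWS credentials configured):
#
#   import boto3
#   data = write_then_read(
#       boto3.client("s3"), "acme-prod-uploads-2026", "jobs/input.json", b"{}"
#   )
```

Before the consistency change, code like this often carried retry loops or delays around the read. Under the current model, the plain write-then-read is enough.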
The S3 API in 30 Seconds
Six operations cover 95% of S3 usage:
- PutObject — upload an object;
- GetObject — download an object;
- ListObjectsV2 — list keys in a bucket, paginated, 1,000 per call;
- DeleteObject — delete one;
- CopyObject — server-side copy from one key or bucket to another;
- HeadObject — get metadata without downloading the bytes.
That's it. Everything else (presigned URLs, multipart uploads, lifecycle rules) is built on top.
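Of the six operations, ListObjectsV2 is the one that needs a loop, because each call returns at most 1,000 keys. Here is a minimal sketch of that loop against boto3's `list_objects_v2`, using the `IsTruncated` and `NextContinuationToken` response fields. The helper name `iter_keys` is hypothetical, and the client is passed in so the function can be tried against a stand-in.

```python
from typing import Iterator


def iter_keys(s3, bucket: str, prefix: str = "") -> Iterator[str]:
    """Yield every key under `prefix`, following continuation tokens."""
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3.list_objects_v2(**kwargs)
        # "Contents" is absent when a page has no matching keys.
        for obj in page.get("Contents", []):
            yield obj["Key"]
        if not page.get("IsTruncated"):
            break
        # Ask for the next page, starting where this one stopped.
        kwargs["ContinuationToken"] = page["NextContinuationToken"]


# Usage with a real client (needs boto3 installed and AWS credentials configured):
#
#   import boto3
#   for key in iter_keys(boto3.client("s3"), "acme-prod-uploads-2026", "photos/"):
#       print(key)
```

In practice, boto3 also ships a built-in paginator (`client.get_paginator("list_objects_v2")`) that does the same job. The explicit loop is shown only to make the mechanics visible.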
What S3 Is Not
A few mental traps to avoid:
- S3 is not a file system. Tools like s3fs can mount a bucket, but the result is slow and breaks expectations. Use S3 as an object store, not a disk;
- S3 is not a database. You cannot query inside objects efficiently. That's what Amazon Athena or Redshift Spectrum do over S3 data;
- S3 is not cheap for high-frequency small requests. PUT and GET are billed per request, so millions of tiny objects can cost more than expected.
The next four chapters dig into the parts of S3 most developers learn the hard way: storage classes, security, presigned URLs, and multipart uploads.