
S3

Links: 101 AWS SAA Index


Introduction

  • S3 is infinitely scalable storage.
  • S3 allows users to store files (objects) in buckets, which act like top-level directories.
  • Each bucket must have a globally unique name across all accounts in AWS.
  • S3 is a global service, but buckets are defined at the region level. It has a global console, but you need to select a region for your bucket.
  • We keep objects in buckets, and each object has a key.
    • The key is the full path to the object/file. Examples:
      • s3://my-bucket/my_file.txt
      • s3://my-bucket/my_folder1/another_folder/my_file.txt
    • The key is composed of two parts: prefix + object name.
      • attachments/Pasted image 20220423090054.jpg
You don't pay for data transferred into S3; you only pay for data transferred out of S3.
  • There is no real concept of directories within buckets. The UI makes it look like there are directories, but keys are just long names that contain slashes ("/").
  • Object values are the content of the body. The maximum object size is 5 TB.
  • If you are uploading a file larger than 5 GB, you must use multi-part upload (see the sketch after this list).
  • We can assign metadata (a list of key/value pairs), tags, and versions (if versioning is enabled).
  • There are two ways of viewing an object in S3:
    • Object URL: if the bucket is not public, you will get a permission-denied error.
    • Pre-signed URLs: the object URL with authentication tokens appended. Anyone holding the URL can view the object until it expires.
  • When a PUT request is successful, you get an HTTP 200 status code and the MD5 checksum (ETag) of the object.
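
A minimal boto3 sketch of these basics: uploading with a slash-containing key, a large upload via `upload_file` (which switches to multi-part automatically), and listing by prefix. The bucket and file names are hypothetical.

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "my-bucket"  # hypothetical bucket name

# "Folders" are just key prefixes: this key contains slashes, but there
# is no real directory hierarchy behind it.
s3.put_object(Bucket=BUCKET, Key="my_folder1/another_folder/my_file.txt",
              Body=b"hello")

# upload_file transparently uses multi-part upload for large files,
# which is mandatory for anything over 5 GB.
s3.upload_file("big_backup.bin", BUCKET, "backups/big_backup.bin")

# Listing by prefix is what makes keys look like directories in the console.
resp = s3.list_objects_v2(Bucket=BUCKET, Prefix="my_folder1/")
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])
```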

Versioning

  • To version our files in S3, versioning must be enabled at the bucket level.
    • The versioning state applies to all (never some) of the objects in that bucket.
What is meant by versioning

This means that if you re-upload a file with the same name, S3 will create a new version instead of overwriting the file.

  • It is a best practice to enable versioning on buckets.
    • It protects against unintended deletes, since we can restore a version.
    • We can easily roll back to a previous version.
  • We can enable versioning on a bucket even after it is created. Any file that existed before versioning was enabled will have a version ID of "null".
  • Suspending versioning doesn’t delete the previous versions.
  • We can only see all the versions if we use the list versions toggle.
  • Version IDs are not simple sequential numbers but opaque strings.
  • If you delete an object in a versioned bucket, a delete marker is added to the object.
    • The UI won't show it, but if you toggle list versions you can see the delete marker.
    • To undelete an object, just delete its delete marker.
    • If you delete a specific version of an object, that delete is permanent.
    • s3:ObjectRemoved:DeleteMarkerCreated is triggered when a delete marker is created for a versioned object.
    • s3:ObjectRemoved:Delete is triggered when you delete a specific version of the object.
  • A simple GET request does not retrieve delete marker objects.
    • The only way to list delete markers (and other versions of an object) is by using the versions subresource in a GET Bucket versions request (see the sketch below).
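
A boto3 sketch of the delete-marker behaviour described above, using hypothetical bucket/key names: a plain DELETE only adds a delete marker, `list_object_versions` (the versions subresource) reveals it, and deleting the marker by its version ID "undeletes" the object.

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "my-bucket"   # hypothetical versioned bucket
KEY = "my_file.txt"

# A plain DELETE on a versioned bucket only adds a delete marker.
s3.delete_object(Bucket=BUCKET, Key=KEY)

# The versions subresource returns both real versions and delete markers,
# which a simple GET/LIST hides.
resp = s3.list_object_versions(Bucket=BUCKET, Prefix=KEY)
for version in resp.get("Versions", []):
    print("version:", version["VersionId"])
for marker in resp.get("DeleteMarkers", []):
    print("delete marker:", marker["VersionId"])

# "Undelete" the object by permanently deleting its delete marker.
marker_id = resp["DeleteMarkers"][0]["VersionId"]
s3.delete_object(Bucket=BUCKET, Key=KEY, VersionId=marker_id)
```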

Security

  • For securing an S3 bucket we have three mechanisms:

    • IAM policies: user-based control. You can control which API calls are allowed for a specific user.
    • Bucket policies: these are also known as resource-based rules.
    • ACLs (Access Control Lists): they can only grant permissions to accounts, not to specific users.
    • attachments/Pasted image 20220423095746.jpg
  • We have bucket settings for blocking public access to prevent any data leaks (see the sketch after this list).

    • This should always be enabled unless we are hosting a public website from S3.
    • We can also block public access to all the buckets in the account.
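
A minimal boto3 sketch of turning on all four block-public-access settings for a single bucket (name hypothetical); the account-wide equivalent lives in the S3 Control API.

```python
import boto3

s3 = boto3.client("s3")

# Block all four categories of public access for one bucket.
s3.put_public_access_block(
    Bucket="my-bucket",  # hypothetical
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)
```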

Object Ownership

  • By default, an S3 object is owned by the account that uploaded it. That's why granting the destination account the permissions to perform the cross-account copy ensures that the destination account owns the copied objects.
  • To get full access to the object, the object owner must explicitly grant the bucket owner access. You can create a bucket policy that requires external users to grant bucket-owner-full-control via an ACL (see the sketch after this section).
  • Object ownership is important for managing permissions using a bucket policy.
    • For a bucket policy to apply to an object in the bucket, the object must be owned by the account that owns the bucket.
An IT company has built a solution wherein a Redshift cluster writes data to an Amazon S3 bucket belonging to a different AWS account. However, it is found that the files created in the S3 bucket using the UNLOAD command from the Redshift cluster are not even accessible to the S3 bucket owner. Why?

By default, an S3 object is owned by the AWS account that uploaded it. So the S3 bucket owner will not implicitly have access to the objects written by the Redshift cluster.

A development team had enabled and configured CloudTrail for all the Amazon S3 buckets used in a project. The project manager owns all the S3 buckets used in the project. However, the manager noticed that he DID NOT receive any object-level API access logs when the data was read by another AWS account.

The bucket owner also needs to be the object owner to receive object-level access logs.
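
A sketch of the bucket policy mentioned above, forcing external writers to grant bucket-owner-full-control. The bucket name and external account ID are hypothetical.

```python
import json
import boto3

s3 = boto3.client("s3")

# Deny any PutObject from the external account that does not hand full
# control of the new object to the bucket owner.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "RequireBucketOwnerFullControl",
        "Effect": "Deny",
        "Principal": {"AWS": "arn:aws:iam::111122223333:root"},  # hypothetical account
        "Action": "s3:PutObject",
        "Resource": "arn:aws:s3:::my-bucket/*",
        "Condition": {
            "StringNotEquals": {"s3:x-amz-acl": "bucket-owner-full-control"}
        },
    }],
}

s3.put_bucket_policy(Bucket="my-bucket", Policy=json.dumps(policy))
```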

Bucket Policies

  • Bucket policies are resource-based, bucket-wide rules that we can set from the S3 console; they allow cross-account access.
Whenever you hear the term cross-account access, always go for bucket policies instead of ACLs.
When can an IAM principal access an S3 object?
  • If the user's IAM permissions allow it OR the resource policy (bucket policy) ALLOWS it,
  • AND there is no explicit DENY.

So if you have an IAM policy that allows S3 access but the bucket policy explicitly denies you, then you won't be able to access the object.

  • Bucket policies, like normal IAM policies, are simple JSON documents.

  • Some common use cases of S3 bucket policies are:

    • Grant public access to the bucket
    • Force objects to be encrypted at upload (see the sketch after this list)
      • attachments/Pasted image 20230304080701.jpg
    • Cross account access
  • Optional conditions on:

    • Public IP
      • attachments/Pasted image 20230304080801.jpg
    • CloudFront OAI
    • MFA
      • attachments/Pasted image 20230304080846.jpg
  • Other examples:

    • Grant user access to list and download all objects in an S3 bucket
      • attachments/Pasted image 20230304080949.jpg
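
As a concrete instance of the "force objects to be encrypted at upload" use case, here is a sketch (hypothetical bucket name) that denies any PutObject not declaring SSE-S3 encryption.

```python
import json
import boto3

s3 = boto3.client("s3")

# Deny uploads that do not request SSE-S3 (AES256) server-side encryption.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyUnencryptedUploads",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:PutObject",
        "Resource": "arn:aws:s3:::my-bucket/*",  # hypothetical bucket
        "Condition": {
            "StringNotEquals": {"s3:x-amz-server-side-encryption": "AES256"}
        },
    }],
}

s3.put_bucket_policy(Bucket="my-bucket", Policy=json.dumps(policy))
```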
Lambda and S3 different accounts

If the IAM role that you create for the Lambda function is in the same AWS account as the bucket, then you don't need to grant Amazon S3 permissions on both the IAM role and the bucket policy. Instead, you can grant the permissions on the IAM role and then verify that the bucket policy doesn't explicitly deny access to the Lambda function role. If the IAM role and the bucket are in different accounts, then you need to grant Amazon S3 permissions on both the IAM role and the bucket policy.

MFA Delete

  • We can enable MFA for deleting versioned objects. Once enabled, you will also need MFA to suspend bucket versioning.
  • Only the root account (not even an admin user) can enable/disable MFA Delete.
  • We need to use the CLI, an SDK, or the S3 REST API to enable MFA Delete.
  • To use MFA Delete, versioning must be enabled on the bucket.
    • If you are unable to enable MFA Delete from the CLI, this might be one of the reasons.
  • Once MFA Delete is enabled, you cannot use the console to delete versioned files. We have to use the CLI, an SDK, or the S3 REST API (see the sketch below).
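
A sketch of enabling MFA Delete through the SDK. This call must be made with the root account's credentials; the MFA device ARN, code, and bucket name below are hypothetical.

```python
import boto3

# Assumed to be running with root credentials; MFA Delete also
# requires versioning to be enabled on the bucket.
s3 = boto3.client("s3")

s3.put_bucket_versioning(
    Bucket="my-bucket",
    # Format: "<mfa-device-arn> <current-code>", both hypothetical here.
    MFA="arn:aws:iam::111122223333:mfa/root-account-mfa-device 123456",
    VersioningConfiguration={
        "Status": "Enabled",
        "MFADelete": "Enabled",
    },
)
```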

Presigned URLs

  • URLs that are valid only for a limited time. These URLs are signed with AWS credentials.
  • We can generate pre-signed URLs using the console, SDK, or CLI.
  • For downloads we can use the CLI, but for uploads we must use the SDK (see the sketch after this list).
  • By default a pre-signed URL has an expiration of 3600 seconds. We can change this using the --expires-in [time in seconds] argument.
  • You can perform secure uploads using pre-signed URLs.
  • Users given a pre-signed URL inherit the permissions of the person who generated the URL for GET/PUT (download/upload).
  • Use case: Allow only logged-in users to download a premium video on your S3 bucket.
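
A boto3 sketch of both directions (hypothetical names): a pre-signed GET for the premium-video use case, and a pre-signed PUT for secure uploads, which is why uploads need the SDK.

```python
import boto3

s3 = boto3.client("s3")

# Pre-signed download URL, valid for the default 3600 seconds; it carries
# the permissions of the credentials that signed it.
download_url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-bucket", "Key": "premium/video.mp4"},  # hypothetical
    ExpiresIn=3600,
)

# Pre-signed upload URL: the client PUTs the file body to this URL.
upload_url = s3.generate_presigned_url(
    "put_object",
    Params={"Bucket": "my-bucket", "Key": "uploads/user-file.bin"},
    ExpiresIn=300,
)
print(download_url)
print(upload_url)
```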

Access Logs

  • For audit purposes you may want to log all the access to S3 buckets.
  • Any request made to the S3 bucket, from any account, authorised or denied, will be logged into another S3 bucket.
  • We can analyse the logs using Amazon Athena.
Do not set your logging bucket to be the monitored bucket, otherwise there will be a logging loop and the bucket will grow exponentially.
  • For logging and audit we can store S3 access logs in another bucket (see the sketch at the end of this section).
    • The target logging bucket must be in the same region.
  • If we only want to log API calls, we can use CloudTrail.
When to use S3 access logs over CloudTrail

CloudTrail only captures a subset of API calls for Amazon S3 as events, including calls from the Amazon S3 console and code calls to the Amazon S3 APIs, whereas Amazon S3 server access logs provide detailed records for the requests that are made to an S3 bucket, like the requester, bucket name, request time, request action, referrer, turnaround time, and error code information. Access logs provide more visibility into object-level operations on the bucket.
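
A minimal boto3 sketch of enabling server access logging into a separate bucket (both names hypothetical, and never the monitored bucket itself).

```python
import boto3

s3 = boto3.client("s3")

# Send access logs for my-app-bucket into a separate same-region bucket.
s3.put_bucket_logging(
    Bucket="my-app-bucket",
    BucketLoggingStatus={
        "LoggingEnabled": {
            "TargetBucket": "my-logging-bucket",
            "TargetPrefix": "access-logs/my-app-bucket/",
        }
    },
)
```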


Last updated: 2023-03-24