1. What is AWS S3?
Amazon Simple Storage Service (S3) is a highly scalable, durable, and secure object storage service provided by AWS. It is designed to store and retrieve any amount of data from anywhere on the web. S3 is commonly used for hosting static websites, storing backups, archiving data, and serving as a data lake for analytics. It offers 11 nines (99.999999999%) of durability, making it extremely reliable for long-term storage.
2. Explain the S3 storage classes.
AWS S3 offers multiple storage classes optimized for different use cases:
- Standard: High durability and availability for frequently accessed data.
- Intelligent-Tiering: Automatically moves objects between frequent and infrequent tiers based on access patterns.
- Standard-IA (Infrequent Access): Lower cost for data accessed less often but still requires rapid retrieval.
- One Zone-IA: Similar to Standard-IA but stored in a single Availability Zone (lower cost, less redundancy).
- Glacier: Low-cost archival storage with retrieval times ranging from minutes to hours.
- Glacier Deep Archive: Cheapest option for long-term archival with retrieval times up to 12 hours.
3. How is data organized in S3?
Data in S3 is stored in buckets, which act like top-level containers. Each bucket contains objects, which are the actual files or data. Objects consist of:
- Key (unique identifier within the bucket)
- Value (the data)
- Metadata (information about the object)
4. What is a bucket policy?
A bucket policy is a JSON-based access control document attached to an S3 bucket. It defines permissions for users, roles, or services, specifying which actions (e.g., GetObject, PutObject) are allowed or denied. Bucket policies are essential for implementing fine-grained access control.
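As a sketch, a minimal bucket policy like the one described above can be built and serialized in Python before being attached to the bucket (for example via boto3's put_bucket_policy). The bucket name and statement ID below are placeholders:

```python
import json

# Hypothetical policy granting public read (s3:GetObject) on every
# object in "example-bucket" (a placeholder bucket name).
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
        }
    ],
}

# Bucket policies are submitted to S3 as a JSON string.
policy_json = json.dumps(bucket_policy)
print(policy_json)
```

Note that the Resource ARN ends in `/*`, so the statement applies to objects in the bucket, not to the bucket itself.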
5. Explain CORS in S3.
Cross-Origin Resource Sharing (CORS) allows web applications hosted on one domain to access resources from another domain. In S3, CORS is configured at the bucket level to enable browsers to make cross-origin requests to S3 objects, which is critical for web apps using S3-hosted assets.
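A CORS configuration is a list of rules; a minimal sketch allowing read-only browser requests from a single origin might look like the dictionary below (the origin is a placeholder, and this shape matches what boto3's put_bucket_cors expects):

```python
# Hypothetical CORS rule: allow GET/HEAD requests from one web origin,
# any request header, and let browsers cache the preflight for 3000 s.
cors_configuration = {
    "CORSRules": [
        {
            "AllowedOrigins": ["https://www.example.com"],
            "AllowedMethods": ["GET", "HEAD"],
            "AllowedHeaders": ["*"],
            "MaxAgeSeconds": 3000,
        }
    ]
}
print(cors_configuration["CORSRules"][0]["AllowedOrigins"])
```

Keeping AllowedOrigins to the specific domains that need access, rather than `*`, limits which sites can embed the bucket's assets.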
6. How can you secure data in S3?
Security in S3 involves multiple layers:
- Access Control: Use IAM policies, bucket policies, and ACLs.
- Encryption:
  - In-transit: Use SSL/TLS (HTTPS).
  - At-rest: Use Server-Side Encryption (SSE-S3, SSE-KMS, SSE-C) or Client-Side Encryption.
- Block Public Access: Enable S3’s block public access settings.
- Monitoring: Use AWS CloudTrail and S3 Access Logs for auditing.
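Server-side encryption with KMS (SSE-KMS) is requested per object at upload time. As a sketch, these are the arguments that would be passed to boto3's put_object; the bucket, key, and KMS key alias are placeholders, and no request is actually made here:

```python
# Hypothetical put_object arguments requesting SSE-KMS at rest.
# "alias/example-key" stands in for a real KMS key ID or alias.
put_object_args = {
    "Bucket": "example-bucket",
    "Key": "reports/2024/q1.csv",
    "Body": b"col_a,col_b\n1,2\n",
    "ServerSideEncryption": "aws:kms",   # SSE-S3 would be "AES256" instead
    "SSEKMSKeyId": "alias/example-key",
}
print(put_object_args["ServerSideEncryption"])
```

With SSE-KMS, callers also need kms:Decrypt permission on the key to read the object back, which adds a second, auditable layer of access control on top of the bucket policy.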
7. What is versioning in S3?
Versioning allows you to keep multiple versions of an object in a bucket. It helps protect against accidental deletions or overwrites. When enabled, every update creates a new version, and previous versions can be restored.
8. Explain the difference between S3 and EBS.
- S3: Object storage, ideal for static files, backups, and large-scale data storage.
- EBS (Elastic Block Store): Block storage for EC2 instances, suitable for databases and applications requiring low-latency disk access.
9. How do you enable versioning for an S3 bucket?
Enable versioning via:
- AWS Console: Navigate to the bucket → Properties → Enable Versioning.
- AWS CLI:
  aws s3api put-bucket-versioning --bucket <bucket-name> --versioning-configuration Status=Enabled
10. What is the significance of an S3 object URL?
Each object in S3 has a unique URL (HTTP/HTTPS) that allows direct access. Example: https://bucket-name.s3.amazonaws.com/object-key
Access depends on permissions and whether the object is public or private.
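The virtual-hosted-style URL above is built from the bucket name, region endpoint, and object key. As a small sketch (the regional endpoint form shown is common, but the exact hostname depends on region and endpoint style):

```python
from urllib.parse import quote

def object_url(bucket: str, key: str, region: str = "us-east-1") -> str:
    """Build a virtual-hosted-style URL for an S3 object.

    Object keys may contain characters that need percent-encoding,
    so the key is quoted; "/" is kept as a path separator.
    """
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"

print(object_url("bucket-name", "object-key"))
# → https://bucket-name.s3.us-east-1.amazonaws.com/object-key
```

For private objects, such a URL only works when the request is signed (e.g., a presigned URL); an unsigned request returns 403.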
11. Explain S3 Object Lifecycle Policies.
Lifecycle policies automate object management by transitioning objects to cheaper storage classes or deleting them after a certain period. Example: Move objects to Glacier after 90 days.
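The 90-day example above can be expressed as a lifecycle rule. The sketch below uses the shape accepted by boto3's put_bucket_lifecycle_configuration; the rule ID and prefix are placeholders:

```python
# Hypothetical lifecycle rule: objects under "logs/" move to Glacier
# after 90 days and are deleted after one year.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            "Expiration": {"Days": 365},
        }
    ]
}
print(lifecycle_configuration["Rules"][0]["Transitions"])
```

The Filter scopes the rule to a key prefix, so different parts of the same bucket can follow different lifecycles.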
12. What is S3 Transfer Acceleration?
It speeds up uploads/downloads by routing traffic through Amazon CloudFront edge locations, reducing latency for global users.
13. What is Multipart Upload in S3?
Multipart Upload splits large files into smaller parts and uploads them in parallel, improving speed and reliability. Recommended for files >100 MB.
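The part-size arithmetic behind Multipart Upload can be sketched as below, using S3's documented limits (minimum 5 MiB per part except the last, at most 10,000 parts per upload); the default part size here is just an illustrative choice:

```python
MIN_PART_SIZE = 5 * 1024 * 1024   # S3 minimum part size (5 MiB), last part exempt
MAX_PARTS = 10_000                # S3 maximum number of parts per upload

def plan_parts(object_size: int, part_size: int = 100 * 1024 * 1024) -> int:
    """Return how many parts a multipart upload would use for this object."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part size below S3 minimum")
    parts = -(-object_size // part_size)  # ceiling division
    if parts > MAX_PARTS:
        raise ValueError("too many parts: increase part size")
    return parts

# A 5 GiB object with 100 MiB parts needs 52 parts.
print(plan_parts(5 * 1024**3))  # → 52
```

Because each part is uploaded (and retried) independently, a failed part does not restart the whole transfer.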
14. How do you secure data in transit to S3?
Use HTTPS (SSL/TLS) for all communications with S3 to encrypt data in transit.
15. What is the maximum size for an S3 object?
5 TB per object. A single PUT is limited to 5 GB; Multipart Upload is required above that and recommended for files over 100 MB.
16. Explain Cross-Region Replication in S3.
CRR automatically replicates objects from one bucket to another in a different AWS region for disaster recovery and compliance.
17. What is the difference between S3 and EFS?
- S3: Object storage for static data.
- EFS (Elastic File System): Shared file storage for EC2, supports NFS protocol.
18. What is a use case for S3 Select?
Retrieve only required data from large objects (CSV, JSON, Parquet) using SQL-like queries, reducing transfer costs and improving performance.
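As a sketch, this is the shape of an S3 Select request as boto3's select_object_content would take it; the bucket, key, and column names are placeholders, and no call is made here:

```python
# Hypothetical S3 Select request: pull two columns from a gzipped CSV
# where the amount exceeds 100, instead of downloading the whole object.
select_request = {
    "Bucket": "example-bucket",
    "Key": "data/events.csv.gz",
    "ExpressionType": "SQL",
    "Expression": (
        "SELECT s.user_id, s.amount FROM s3object s WHERE CAST(s.amount AS INT) > 100"
    ),
    "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}, "CompressionType": "GZIP"},
    "OutputSerialization": {"CSV": {}},
}
print(select_request["Expression"])
```

FileHeaderInfo "USE" lets the query reference CSV columns by their header names rather than by position.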
19. What is the concept of S3 Access Points?
Access Points provide unique hostnames with custom permissions for different applications accessing the same bucket.
20. What is the S3 event notification feature?
Triggers notifications (SNS, SQS, Lambda) for events like object creation, deletion, or restore.
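A notification configuration ties event types to a destination. The sketch below routes object-creation events to a Lambda function, in the shape boto3's put_bucket_notification_configuration expects; the ARN, prefix, and suffix are placeholders:

```python
# Hypothetical rule: invoke a Lambda function whenever a .jpg object
# is created under the images/ prefix. The ARN is a placeholder.
notification_configuration = {
    "LambdaFunctionConfigurations": [
        {
            "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:process-image",
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {
                "Key": {
                    "FilterRules": [
                        {"Name": "prefix", "Value": "images/"},
                        {"Name": "suffix", "Value": ".jpg"},
                    ]
                }
            },
        }
    ]
}
print(notification_configuration["LambdaFunctionConfigurations"][0]["Events"])
```

Prefix/suffix filters keep the function from firing on unrelated uploads elsewhere in the bucket.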
21. How do you monitor S3 bucket metrics?
Use Amazon CloudWatch for metrics like request count, storage size, and replication status.
22. What is the difference between S3 and Glacier?
S3 = immediate access; Glacier = archival storage with retrieval times from minutes to hours.
23. How do you optimize costs in S3?
Use Intelligent-Tiering, Lifecycle Policies, and delete unused objects. Analyze access patterns with Storage Class Analysis.
24. How does S3 work with CloudFront?
S3 acts as an origin for CloudFront, enabling global content delivery with caching and low latency.
25. What is the S3 Storage Class Analysis feature?
Analyzes access patterns to recommend transitions to cost-effective storage classes.
26. How do you enable logging for an S3 bucket?
Specify a target bucket for logs in bucket properties. Logs include request details for auditing.
27. What is S3 Select + Glacier?
Allows querying data stored in Glacier without full retrieval, saving time and cost.
28. How do you set up CORS in S3?
Add CORS configuration in bucket properties with allowed origins, headers, and methods.
29. What is the use of S3 Batch Operations?
Perform bulk actions (copy, tag, delete) on millions of objects using a manifest file.
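The manifest for a Batch Operations job can be a plain CSV of bucket,key rows. A minimal sketch of generating one (bucket and key names are placeholders):

```python
import csv
import io

# Hypothetical object keys to include in an S3 Batch Operations job.
objects = ["photos/a.jpg", "photos/b.jpg", "photos/c.jpg"]

buf = io.StringIO()
writer = csv.writer(buf)
for key in objects:
    writer.writerow(["example-bucket", key])  # one "bucket,key" row per object

manifest_csv = buf.getvalue()
print(manifest_csv)
```

The manifest itself is uploaded to S3, and the job definition points at its location (and ETag) when the job is created.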
30. How do you enable server access logging for an S3 bucket?
Specify target bucket and prefix in bucket properties to store access logs.
Scenario-Based Detailed Answers
- Optimizing performance: Use Transfer Acceleration, Multipart Upload, and parallel threads.
- Securing sensitive data: SSL/TLS in transit; SSE-KMS at rest; strict IAM policies.
- Cost optimization: Intelligent-Tiering plus Lifecycle Policies for infrequently accessed data.
- Multi-region architecture: CRR with versioning and Transfer Acceleration.
- Large dataset migration: AWS Snowball/DataSync, Multipart Upload, integrity checks.
- Handling cost spikes: CloudWatch alarms, Intelligent-Tiering, Lifecycle Policies.
- Improving query performance: S3 Select for partial data retrieval.
- Troubleshooting permissions: Check the bucket policy, ACLs, and IAM roles for conflicts.
- Cross-account access: Bucket policy granting the external account's ARN the allowed actions.