Difference between revisions of "Overview of your Storage on DeepSense"

From DeepSense Docs
(Update cloud storage)
 
== Overview ==
 
  
DeepSense is a platform for AI/ML for oceans research data.  All users will have a home directory, and access to various filesystems.  Each filesystem has a default [[#Quota and Backup Policies|quota]], with more space available upon request.  While some of the filesystems are backed up, it is the responsibility of the user to maintain their original data on their own site.
+
DeepSense is a platform for AI/ML for oceans research data. You can store the project data for your DeepSense projects. DeepSense is not meant for long-term data storage. Data will only be stored so long as your project is ongoing. Once a project is completed, it is expected that the users will remove their data in a timely fashion.
  
DeepSense is not meant for long-term data storage.  Data will only be stored so long as your project is ongoing.  Once a project is completed, it is expected that the users will remove their data in a timely fashion.
+
DeepSense is not intended to be used for data sharing. While each user in your project/group will have access to shared space, it won't be accessible by any other users.
  
DeepSense is not intended to be used for data sharing.  While each user in your project/group will have access to shared space, it won't be accessible by any other users.  We do not host databases for sharing data, or for web access.
 
  
The filesystems provided are a resource shared by all users. It is expected that users make sensible use of the space, and follow the guidelines outlined here. It is possible the quotas and policies will change in the future, but we will strive to provide plenty of notice.
+
== Amazon EBS - Elastic Block Store ==
 +
Amazon Elastic Block Store (Amazon EBS) provides persistent block storage volumes in the AWS Cloud for use with Amazon EC2 instances. To protect you from component failure and to provide high availability and durability, each Amazon EBS volume is automatically replicated within its Availability Zone. Amazon EBS volumes provide the reliable and low-latency performance required to run your workloads. You can scale your usage up or down in minutes with Amazon EBS, all while paying a low price for only what you provision.
  
== Filesystems ==
+
== Amazon S3 - Simple Storage Service ==  
 +
Amazon S3 is an object storage service that provides industry-leading scalability, data availability, security, and performance. Customers of all sizes and industries can use it to store and protect any amount of data for a variety of use cases, including websites, mobile apps, backup and restore, archiving, enterprise applications, IoT devices, and big data analytics. Amazon S3 offers simple management features that allow you to organise your data and configure fine-grained access controls to meet your specific business, organisational, and compliance needs. Amazon S3 is designed for 99.999999999% (11 9s) durability and stores data for millions of applications for businesses all over the world.
  
There are several different filespaces available to users.  They are shared filesystems that can be accessed from any of the nodes.  They each have separate purposes.  See also [[Transferring Data]].
+
== Amazon S3 Glacier ==
 
+
Amazon S3 Glacier (S3 Glacier) is a safe and long-lasting service for low-cost data archiving and backup. With S3 Glacier, you can store your data for months, years, or even decades at a low cost.
=== Home directory ===
 
 
 
Each user has a home directory in /dshome/''subdirectory''/.  The subdirectory (visitor, research, faculty, grad, etc.) will depend on the type of LDAP account you have.  This is primarily designed for your personal use, and only you have permission to access it.  It is ideal for managing scripts, source code and test data sets. It is not meant for large data storage.
 
 
 
=== Data ===
 
 
 
Each user/project will have access to a directory in the data filesystem. 
 
 
 
* If you are a member of a project, the directory will be /data/''projectname'' (or groupname).  Everyone in your project group will have access to this directory.
 
* If you are an individual student, your directory is /data/''username''.  Only you will have access to this directory.
 
 
 
The data filesystem will house the bulk of your data, and it has a larger default quota. This is also the primary location for transferring large amounts of data.  It is accessible via [[Getting_started#2._Transfer_data|samba]].
 
 
 
=== Scratch ===
 
 
 
Each user/project will have access to a directory in the scratch filesystem.
 
 
 
* If you are a member of a project, the directory will be /scratch/''projectname'' (or groupname).  Everyone in your project group will have access to this directory.
 
* If you are an individual student, your directory is /scratch/''username''.  Only you will have access to this directory.
 
 
 
The scratch filesystem is intended to support data used during job execution.  It has a larger default quota, and can support a larger number of files.  '''Note''': this is temporary space, and is not backed up.  Data which has not been accessed in 60 days may be purged, though we will contact you prior to doing so.  Data needed for longer storage should be stored elsewhere. 
 
 
 
Delete files that you no longer need as soon as you are done with them, rather than leaving large amounts of data sitting untouched.
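The 60-day access rule above can be checked ahead of time. The following is a minimal sketch, not the tool the DeepSense administrators use: the function name <code>find_stale_files</code> and its parameters are hypothetical, and it assumes the filesystem records access times reliably (mount options can affect this), so treat its output as approximate.

```python
import os
import time


def find_stale_files(root, max_age_days=60, now=None):
    """Return sorted paths under root not accessed within max_age_days.

    Hypothetical helper for illustration; the actual purge criteria
    applied on the scratch filesystem may differ.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                atime = os.lstat(path).st_atime  # last access time
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if atime < cutoff:
                stale.append(path)
    return sorted(stale)
```

For example, <code>find_stale_files("/scratch/projectname")</code> would list candidates in a project's scratch directory, with ''projectname'' replaced by your own project or user name.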
 
 
 
== Quota and Backup Policies ==
 
 
 
Users have a limited amount of space on each of the above filesystems.  For more information, please see [[Quota Information and Management]].
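To get a rough idea of how much space a directory occupies before comparing it with a quota, a sketch along the following lines may help. The function name <code>directory_usage_bytes</code> is hypothetical, and it sums apparent file sizes only; the quota accounting on DeepSense may count space differently, so the linked quota page remains the authority.

```python
import os


def directory_usage_bytes(root):
    """Sum the apparent sizes, in bytes, of all files under root.

    Hypothetical helper for illustration; it does not query the
    filesystem's own quota accounting.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                # lstat so that symbolic links are not followed
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return total
```

Dividing the result by <code>1024 ** 3</code> gives the figure in GiB.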
 
 
 
Our backup server backs up user files every day.  For more information, please see [[Backup Policies]].
 

Revision as of 23:33, 10 March 2023

Overview

DeepSense is a platform for AI/ML for oceans research data. You can store the project data for your DeepSense projects. DeepSense is not meant for long-term data storage. Data will only be stored so long as your project is ongoing. Once a project is completed, it is expected that the users will remove their data in a timely fashion.

DeepSense is not intended to be used for data sharing. While each user in your project/group will have access to shared space, it won't be accessible by any other users.


Amazon EBS - Elastic Block Store

Amazon Elastic Block Store (Amazon EBS) provides persistent block storage volumes in the AWS Cloud for use with Amazon EC2 instances. To protect you from component failure and to provide high availability and durability, each Amazon EBS volume is automatically replicated within its Availability Zone. Amazon EBS volumes provide the reliable and low-latency performance required to run your workloads. You can scale your usage up or down in minutes with Amazon EBS, all while paying a low price for only what you provision.

Amazon S3 - Simple Storage Service

Amazon S3 is an object storage service that provides industry-leading scalability, data availability, security, and performance. Customers of all sizes and industries can use it to store and protect any amount of data for a variety of use cases, including websites, mobile apps, backup and restore, archiving, enterprise applications, IoT devices, and big data analytics. Amazon S3 offers simple management features that allow you to organise your data and configure fine-grained access controls to meet your specific business, organisational, and compliance needs. Amazon S3 is designed for 99.999999999% (11 9s) durability and stores data for millions of applications for businesses all over the world.
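The 11 9s figure is easier to interpret with a small calculation. The sketch below assumes the figure is an annual, per-object durability and that losses are independent, which is how AWS describes it; the function names are hypothetical and this is an illustration, not a guarantee about any particular bucket.

```python
from fractions import Fraction

# 99.999999999% expressed exactly (assumed to be annual, per object)
ANNUAL_DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_annual_losses(num_objects):
    """Expected number of objects lost per year under the assumption above."""
    return num_objects * (1 - ANNUAL_DURABILITY)


def years_per_expected_loss(num_objects):
    """Average number of years between single-object losses."""
    return 1 / expected_annual_losses(num_objects)
```

Under these assumptions, storing 10,000,000 objects gives <code>years_per_expected_loss(10_000_000) == 10000</code>, that is, on average one lost object every 10,000 years.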

Amazon S3 Glacier

Amazon S3 Glacier (S3 Glacier) is a safe and long-lasting service for low-cost data archiving and backup. With S3 Glacier, you can store your data for months, years, or even decades at a low cost.