
Microbiome Platform Data Policy

Data Policy


IDI-GEMS will store your sequences and analysis (if applicable) for up to 3 months, with the storage period commencing once all relevant analyses have been completed and all summary reports have been provided. Ideally you would transfer all of your relevant data before the 3-month period expires. After this 3-month period, you will receive a notification informing you that all data associated with a specific quote/project will be purged from our system. The notification will also request confirmation that you have received and transferred your data. If we do not receive a response within 2 weeks of this notification, we will proceed with the data removal.
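
As a rough illustration of the timeline above, the sketch below estimates when the 3-month storage period ends and the earliest date data could be purged if the notification goes unanswered. It assumes GNU date (the -d option), and the completion date and the helper name date_plus are made up for the example. Actual dates depend on when IDI-GEMS sends the notification.

```shell
#!/usr/bin/env bash
# Sketch only: estimate key dates in the retention timeline.
# Assumes GNU date (-d); the completion date below is a made-up example.

# Print the date a given interval after a YYYY-MM-DD date (computed in UTC).
date_plus() {
    date -u -d "$1 +$2" +%F
}

reports_delivered="2024-01-15"   # hypothetical: analyses done and reports provided
storage_ends=$(date_plus "$reports_delivered" "3 months")
earliest_purge=$(date_plus "$storage_ends" "14 days")

echo "3-month storage period ends:         $storage_ends"
echo "Earliest purge if no reply is given: $earliest_purge"
```

The purge date printed here is a lower bound, since the 2-week response window starts only once the notification has actually been sent.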

Data Access/Transfer

  We store and process data at the Ohio Supercomputer Center (OSC). You have several methods available to access and transfer your sequences and results. The options are as follows:
1)    Via OSC (If you have an existing project account or if you are able to create one)
    a.    Within OSC: Transfer data to your directory using the UNIX “cp” command
    b.    Within OSC: Transfer data to your directory using OSC OnDemand (GUI)
    c.    Outside OSC: Transfer data to your server using the UNIX “sftp” command
2)    Globus transfer: Transfer data from OSC to your server
3)    Hard drive transfer: Transfer via physical hard drive (under 2 TB limit)

Using the OSC system directly:


Many clients work with our services directly on OSC using their own established OSC project ID. If you do not have a project ID and would like to use this approach, you can gain access to OSC by following the instructions at this link: https://www.osc.edu/supercomputing/portals/client_portal/projects_budgets_and_charge_accounts 
Note that you must have a user account with PI status to create a project. 

A new user account can be created by following the steps here: https://www.osc.edu/supercomputing/portals/client_portal/self_signup_for_accounts
PI status can be applied to this account by following the steps here: https://www.osc.edu/supercomputing/portals/client_portal/manage_profile_information

If you are unfamiliar with working in a terminal and/or on a remote server, please spend some time thoroughly exploring the two links below, which describe what a server is and the essentials of navigating OSC. Effective use of bioinformatic data requires some basic understanding of UNIX operating systems. These tutorials are critical to your success in engaging with your data.
https://www.osc.edu/documentation/tutorials/unix_basics 
https://www.osc.edu/content/new_user_training

Option 1, direct copy on OSC:


Many clients will be given access to their data location via an access-control list (ACL, https://en.wikipedia.org/wiki/Access-control_list), which grants direct read, write, and execute access to their data location within the CoMS directory structure.

 This is by far the simplest and most straightforward way to transfer data. In this case, transferring data to your own location is simply the command below, replacing the <> and everything in between with your relevant paths.


1)    $ cp -r </fs/ess/PAS2095/your/data/location> </data/location/of/your/choice> 
Note that large data transfers can take a substantial amount of time, so it is advised to run them inside a terminal multiplexer such as “tmux” or “screen”: https://www.redhat.com/en/blog/introduction-tmux-linux https://linuxize.com/post/how-to-use-linux-screen/ 


To use tmux, for example:
1)    $ tmux new -s some_name # create a new tmux session
2)    $ cp -r </fs/ess/PAS2095/your/data/location> </data/location/of/your/choice> 
3)    Detach with the keystroke: press and hold Control, press and release B, release Control, then press D. You will see the phrase “detached”. Your transfer is now running in the background.
4)    $ tmux ls # list tmux sessions
5)    $ tmux attach -t some_name # reattach to your tmux session
6)    $ tmux kill-session -t some_name # end the tmux session
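
After a large copy finishes, it can be worth confirming that the destination matches the source before the original data are purged. The following is a minimal sketch, not part of any OSC or IDI-GEMS tooling: the function name copy_and_verify is hypothetical, and it assumes the destination path does not exist yet, so that cp -r creates it as a direct copy of the source.

```shell
#!/usr/bin/env bash
# Sketch only: copy a directory with cp -r, then check the copy against the source.
# Both paths are placeholders supplied by the caller.

copy_and_verify() {
    local src="$1" dest="$2"
    # If dest already existed, cp -r would nest the source inside it
    # and the comparison below would be meaningless, so stop early.
    if [ -e "$dest" ]; then
        echo "Destination already exists: $dest" >&2
        return 1
    fi
    cp -r "$src" "$dest" || return 1
    # diff -r walks both trees; -q only reports whether any file differs.
    if diff -rq "$src" "$dest" > /dev/null; then
        echo "OK: copy matches source"
    else
        echo "MISMATCH: copy differs from source" >&2
        return 1
    fi
}
```

For example, copy_and_verify /fs/ess/PAS2095/your/data/location /data/location/of/your/choice could be run inside a tmux session in place of the plain cp command. Note that the comparison rereads every file, so it adds noticeably to the total run time for large datasets.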

Option 2, copy data via OSC OnDemand:


1)    Log in to OSC OnDemand at https://www.osc.edu/resources/online_portals/ondemand using your OSC credentials.
2)    Click the “Access OSC” drop-down menu, then “System Gateway”.
3)    Click the “Files” drop-down menu, which lists several of your commonly accessed project directories. We will start in “Home Directory” for clarity.
4)    In the grey bar towards the top of the page, there is a box labeled “Change Directory”. Click this and enter the full path to your data, which has been provided by the CoMS team, something like /fs/ess/PAS2095/CoMS/etc/path/your/data. This will bring you to a complete listing of your data. 
5)    From here you have several options.
The “Copy/Move” tab allows you to transfer data as follows.

5a) Select the data you wish to transfer. You will see a box on the left side that says, “Copy or move the files from below … to the current directory” DO NOT MOVE OR COPY YET.

5b) Use the “Change directory” tab again to navigate to where you would like the data to be transferred to.

5c) Click “Copy”. The data will now be transferred. Note that this may take some time.

6)    You can also “Download” the data from here to your local computer, or transfer it with Globus. Globus can also be accessed from outside OSC; that process is described in the Globus section below.
 

Option 3, using sftp:


This approach will be most useful to users outside of OSC, who have been given access to an OSC project directory but need to move data to another separate remote server. The client will need access to both the source location and target destination via ssh https://www.ssh.com/academy/ssh/sftp-ssh-file-transfer-protocol .
1)    The best way to do this is to log into one of the remote servers via ssh, as below, and navigate to the location where you would like the data to be transferred. On OSC, for example, this is done from a terminal with:
$ ssh

2)    Then, while logged into the first server, log into the second server with sftp, as below:
$ sftp user@server.etc.etc.edu

3)    For a full list of the available sftp commands, do:

sftp> help

This returns:
Available commands:
bye                                Quit sftp
cd path                            Change remote directory to 'path'
chgrp [-h] grp path                Change group of file 'path' to 'grp'
chmod [-h] mode path               Change permissions of file 'path' to 'mode'
chown [-h] own path                Change owner of file 'path' to 'own'
copy oldpath newpath               Copy remote file
cp oldpath newpath                 Copy remote file
df [-hi] [path]                    Display statistics for current directory or
                                  filesystem containing 'path'
exit                               Quit sftp
get [-afpR] remote [local]         Download file
help                               Display this help text
lcd path                           Change local directory to 'path'
lls [ls-options [path]]            Display local directory listing
lmkdir path                        Create local directory
ln [-s] oldpath newpath            Link remote file (-s for symlink)
lpwd                               Print local working directory
ls [-1afhlnrSt] [path]             Display remote directory listing
lumask umask                       Set local umask to 'umask'
mkdir path                         Create remote directory
progress                           Toggle display of progress meter
put [-afpR] local [remote]         Upload file
pwd                                Display remote working directory
quit                               Quit sftp
reget [-fpR] remote [local]        Resume download file
rename oldpath newpath             Rename remote file
reput [-fpR] local [remote]        Resume upload file
rm path                            Delete remote file
rmdir path                         Remove remote directory
symlink oldpath newpath            Symlink remote file
version                            Show SFTP version
!command                           Execute 'command' in local shell
!                                  Escape to local shell
?                                  Synonym for help

4)    Most important here are “ls”, “lls”, “cd”, and “lcd”. “cd” changes directories on the target server of your current sftp session, while “lcd” changes directories on the original server that you logged into with ssh and from which you started sftp. Likewise, “ls” lists the directory contents at the current sftp location, while “lls” lists the contents at the ssh location.
5)    Next, we can “get” files and directories from the sftp server as below.
On the sftp server identify files or directories that you would like to transfer, and simply do:

sftp> get test_file 
or
sftp> get -r test_directory

This will transfer the test_file or test_directory to the current location on the ssh server.
6)    This process is exactly reversed with the “put” command:

sftp> put test_file 
or
sftp> put -r test_directory

This will take files or directories from the ssh server and transfer them to the sftp server.
7)    To exit the sftp server and drop back to the ssh server, do

sftp> bye

More information about sftp can be found here https://www.digitalocean.com/community/tutorials/how-to-use-sftp-to-securely-transfer-files-with-a-remote-server
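
The interactive steps above can also be scripted. The sketch below assumes OpenSSH's sftp, which can read commands from a file given with the -b option. The function name write_sftp_batch and the file name transfer.batch are hypothetical, and user@server.etc.etc.edu is the same placeholder used above. Batch mode cannot prompt for a password, so this assumes key-based login is already set up for the target server.

```shell
#!/usr/bin/env bash
# Sketch only: write an sftp batch file that downloads one remote directory.
# Names here are illustrative and not part of any OSC tooling.
# Paths containing spaces would need quoting inside the batch file.

write_sftp_batch() {
    local remote_dir="$1" local_dir="$2" batch_file="$3"
    {
        echo "lcd $local_dir"       # where files land on the server running sftp
        echo "get -r $remote_dir"   # recursive download, as in step 5 above
        echo "bye"                  # close the sftp session
    } > "$batch_file"
}

# To use it (not executed here, since it needs a reachable server):
#   write_sftp_batch test_directory /path/to/local/target transfer.batch
#   sftp -b transfer.batch user@server.etc.etc.edu
```

By default, sftp in batch mode stops at the first command that fails, which makes a scripted transfer easier to check than a long interactive session.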


Using Globus:


Several academic and non-academic institutions provide their members access to Globus for data transfers between servers and/or personal computers. However, some organizations bar users from data transfers via Globus for security reasons. If you are outside of the OSU system, please check with your administrator before using this approach, as your institution may have specific installation/operation instructions.
1)    Install Globus Connect Personal from https://www.globus.org/globus-connect-personal or your organization's specific Globus portal. OSC's instructions are here: https://www.osc.edu/resources/getting_started/howto/howto_use_globus_overview
2)    After installation and login according to your specific requirements, find the Globus Connect icon in your toolbar and click on “Web: Transfer files”.

3)    In the top left of the screen, adjust the Panels to a two panel display.

4)    In the “Collection” section of the data locations panel search for Globus endpoints that you would like to use and have access to. These will be set by your specific organization, see links above.

5)    In the “Path” field, refine the path to your data location and to the location you would like to transfer data to. In our example we are transferring data to either (1) Dean’s local desktop or (2) a made-up location within OSC’s structure.
     
6)    With your source and target locations set (note that the locations in our example are not real), you can simply drag and drop files, directories, etc. as needed.

7)    To monitor the data transfer process/other activities, check the “ACTIVITY” tab on the far-left side of the screen

Using a physical hard drive:


Please contact IDI-GEMS for this option. Generally, if the data transfer is less than 2 TB, we can provide a physical hard drive with the relevant data included. Details for this method should be arranged on a case-by-case basis.
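
Before requesting a hard drive, you may want to check how large your dataset actually is. The sketch below is only an illustration: the function name fits_on_drive is hypothetical, and it assumes that 2 TB means 2 x 10^12 bytes, which should be confirmed with IDI-GEMS.

```shell
#!/usr/bin/env bash
# Sketch only: check whether a directory is under the 2 TB hard drive limit.
# Assumes 2 TB = 2 * 10^12 bytes; confirm the exact limit with IDI-GEMS.

LIMIT_KB=$(( 2 * 1000 * 1000 * 1000 * 1000 / 1024 ))   # limit in 1024-byte units

fits_on_drive() {
    local dir="$1" used_kb
    # du -sk prints the total disk usage of the directory in 1024-byte units.
    used_kb=$(du -sk "$dir" | awk '{print $1}')
    echo "Disk usage of $dir: ${used_kb} KB (limit ${LIMIT_KB} KB)"
    [ "$used_kb" -lt "$LIMIT_KB" ]
}
```

For example, fits_on_drive /fs/ess/PAS2095/your/data/location returns success when the data are under the limit. On large project directories du can take several minutes to finish.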