Processing and storage

Research data is highly valuable and must be stored properly to avoid loss or damage. Processing and maintaining research data can be difficult, especially if you have large data sets or material that requires legal protection. When we collect data, our primary focus is on the collection itself and on the content, which can lead us to overlook how the data is stored, organized, and documented.

The researcher must ensure that research data is continuously stored and processed in a secure manner in accordance with UiT's Management System for Information Security and Privacy. - Principles and guidelines for the management of research data at UiT

Project managers and supervisors are in charge of guaranteeing information security in research projects on a daily basis.

Good naming, organization, and documentation of research data are essential in order to find, understand, and use the data correctly. This is especially important when working in groups or sharing datasets. Quality-assured research data archives have stringent requirements for data format and documentation. It is equally important to keep your own data in order so that you can efficiently find and understand it in the future.

There may also be unforeseen costs associated with data processing and storage. As a result, it is important to plan your data processing ahead of time.

Structuring and documentation

About structuring and organising data, metadata and description of data

UiT's Principles and guidelines for the management of research data at UiT require all employees and students to document their data in accordance with best practices and with future reuse in mind:

Research data must be documented with metadata, method descriptions, and permanent identifiers that allow other researchers to find and use the data. Metadata ensure compliance to international standards / de facto standards when applicable and describe the data content with a focus for future use.

All work using research data must be meticulously documented, with generous amounts of metadata and a descriptive ReadMe file. It is good practice to begin the documentation early and continue to add information throughout the project. Documentation procedures should be defined during the planning phase. If you put off structuring and documentation, there is a risk that critical information will be lost or incorrectly recorded. Planning this work carefully can save a lot of time and avoid unnecessary duplication.

Metadata is structured, standardized information about your data. Requirements for metadata are growing because metadata is essential for making research data FAIR (findable, accessible, interoperable, and reusable). Machine-readable metadata forms allow for indexing and searching, and provide contextual information critical for understanding and reusing data across technological platforms, institutions, and borders. How FAIR a dataset is depends on the quality and scope of its metadata, so it is important to document the data using properly completed metadata forms.

Many different standards for metadata documentation have been developed, both generic and subject-specific. Follow the scientific conventions set for your field, and use standardized terms, taxonomies/ontologies, and vocabulary whenever possible. This increases the reusability of your data.

Many data archives, organizations, and journals set metadata requirements. Check this early on so you know what metadata to collect for your project.

Examples of metadata standards include Dublin Core, Darwin Core (biology), and the Data Documentation Initiative. The Research Data Alliance, FAIRsharing.org, and the Digital Curation Centre all provide overviews of the various standards.
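To make the idea of a standardized, machine-readable metadata record more concrete, here is a minimal Python sketch built on the 15 elements of the Dublin Core element set. The helper names `make_record` and `missing_elements` and all field values are hypothetical and chosen only for illustration; an actual archive will prescribe its own form and required fields.

```python
import json

# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def make_record(**fields):
    """Return a metadata record, rejecting keys that are not Dublin Core elements."""
    unknown = sorted(set(fields) - set(DUBLIN_CORE_ELEMENTS))
    if unknown:
        raise ValueError(f"Not Dublin Core elements: {unknown}")
    return dict(fields)


def missing_elements(record):
    """List the Dublin Core elements that have not been filled in yet."""
    return [name for name in DUBLIN_CORE_ELEMENTS if not record.get(name)]


# Placeholder values, for illustration only.
record = make_record(
    title="Example dataset",
    creator="Example Researcher",
    date="2021-06-01",
    language="en",
)
print(json.dumps(record, indent=2))   # machine-readable form of the record
print(missing_elements(record))       # elements still to be documented
```

Checking for missing elements early in the project is one way to find out which information still has to be collected while it is easy to obtain.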

Tools for simplifying documentation have been created for some metadata standards. However, in most cases, it will be more practical to collect the information in a ReadMe file that is saved alongside the data (see below). This will also be a good alternative if no metadata standard exists for your field of research.

ReadMe files are plain text files traditionally used to describe software packages. A ReadMe file that accompanies a dataset serves a similar purpose: it is a guide to understanding the data. The ReadMe file should ensure that the data can be understood by you and by others when the dataset is shared and published.

It is recommended that you create the ReadMe file early on and place it in the dataset's main directory. Update the file every time you work on the data.

The ReadMe file should explain how the dataset was created, how complete it is, and under what conditions it can be reused. Much of the information in a ReadMe file will overlap with generic metadata information, but the ReadMe file must additionally include a detailed method description, an overview of the files, and an explanation of the files' contents. Make your descriptions as specific and as clear as possible. Define phrases and acronyms, and use well-known technical terms. This is necessary in order to make the dataset FAIR and reusable. The text in the ReadMe file can be reused in article publishing, which is an added benefit of keeping a good method description.

A ReadMe file must have at least the following:

General background information (title, DOI, contact info, date, place, ownership, funder).
Descriptions of methods (protocols, instruments, software).
File overview.
File-specific information, including a description of variables and units.
Reference and conditions for reuse.
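The minimum contents listed above can be turned into an empty skeleton at the start of a project and filled in as the work progresses. The following Python sketch shows one possible way to do this. The section headings follow the list above, while the file name `ReadMe.txt` and the function names are assumptions made for this example and are not a UiT or DataverseNO template.

```python
from pathlib import Path

# Section headings and fields taken from the minimum contents listed above.
README_SECTIONS = {
    "GENERAL INFORMATION": [
        "Title", "DOI", "Contact", "Date", "Place", "Ownership", "Funder",
    ],
    "METHODS": ["Protocols", "Instruments", "Software"],
    "FILE OVERVIEW": [],
    "FILE-SPECIFIC INFORMATION": ["Variables", "Units"],
    "REFERENCE AND CONDITIONS FOR REUSE": [],
}


def readme_skeleton(sections=README_SECTIONS):
    """Build the text of an empty ReadMe with one block per section."""
    lines = []
    for heading, fields in sections.items():
        lines.append(heading)
        lines.append("-" * len(heading))
        for field in fields:
            lines.append(f"{field}:")
        lines.append("")
    return "\n".join(lines)


def write_readme(dataset_dir, filename="ReadMe.txt"):
    """Write the skeleton to the dataset's main directory without overwriting."""
    path = Path(dataset_dir) / filename
    if path.exists():
        raise FileExistsError(f"{path} already exists; edit it instead")
    path.write_text(readme_skeleton(), encoding="utf-8")
    return path
```

The function refuses to overwrite an existing file so that documentation already written is never lost by running the script a second time.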

Templates and examples of ReadMe files can be found in the user guide for DataverseNO.

Examples of other relevant documentation that should accompany the dataset:

Descriptions, instructions, and protocols for the phases of collection, processing, and analysis.
Configuration files and log files from calibration, processing, and analysis.
Dictionaries and code form.
Variable lists.
Information letter and consent form.
Form for notifying NSD and ethical approvals.
Questionnaire and interview guide.
Permits and licenses from rights holders, if any.

File and folder organization and naming
It is important that you and your colleagues agree early on how the research data should be organized, and that everyone involved follows the agreement. Make a plan for how the data will be organized in files and folders, and for how these will be named. Clear and simple file and folder names are essential.

Tips for organizing your files:

A hierarchical folder structure can help you keep track of and structure your data.
Organize the folders into relevant categories.
All folders should have a consistent naming structure. Make the folder name match the contents of the folders.
Make the file names reflect the folder structure. This will make it easier to keep track of the data when you archive it later.

Use wording that is meaningful in the project. It should be possible to understand the contents of a file without having to open it.
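As an illustration of the tips above, the following Python sketch creates a hierarchical folder structure from a nested description. The folder names in `LAYOUT` are hypothetical; choose categories and wording that are meaningful in your own project.

```python
from pathlib import Path

# Hypothetical layout: each key is a folder, each value its subfolders.
LAYOUT = {
    "01_raw": {},
    "02_processed": {},
    "03_analysis": {"scripts": {}, "results": {}},
    "04_documentation": {},
}


def create_tree(root, layout):
    """Create the folders described by `layout` under `root`; return their paths."""
    created = []
    for name, children in layout.items():
        folder = Path(root) / name
        folder.mkdir(parents=True, exist_ok=True)  # safe to run repeatedly
        created.append(folder)
        created.extend(create_tree(folder, children))
    return created
```

Keeping the layout in one place makes it easy to reuse the same structure across projects and to document it in the ReadMe file.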

Some general guidelines for naming files and folders:

Use consistent file names.
Use descriptive but short file names (
Avoid spaces. Instead, you can use underscores (e.g. first_study), hyphens (e.g. first-study), or camel style (FirstStudy).
Avoid special characters such as \ / ? : * " > < | # % { } ^ [ ] ` ~ and the letters æÆ øØ åÅ äÄ öÖ.
Use international date format: YYYY-MM-DD (e.g. 2021-06-01).
Use more digits if the files are numbered (e.g. 001 instead of 1). Then you avoid clutter when sorting.
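Some of the guidelines above can be checked automatically. The Python sketch below reports spaces, special characters, and dates that are written with the year last. The function name `check_name` is hypothetical, and the set of allowed characters is an assumption that is stricter than the list above: it accepts only unaccented letters, digits, underscores, hyphens, and dots.

```python
import re

# Assumption: only these characters are accepted (stricter than the list above).
ALLOWED = re.compile(r"^[A-Za-z0-9._-]+$")
# Dates written with the year last, e.g. 01.06.2021 or 1-6-2021.
YEAR_LAST_DATE = re.compile(r"\d{1,2}[.-]\d{1,2}[.-]\d{4}")


def check_name(name):
    """Return a list of guideline violations found in a file or folder name."""
    problems = []
    if " " in name:
        problems.append("contains spaces")
    # Spaces are reported separately, so ignore them in the character check.
    if not ALLOWED.match(name.replace(" ", "")):
        problems.append("contains special characters")
    for match in YEAR_LAST_DATE.finditer(name):
        problems.append(f"date {match.group()!r} is not YYYY-MM-DD")
    return problems


for candidate in ("first_study_2021-06-01_001.csv", "first study.csv"):
    print(candidate, check_name(candidate))
```

An empty list means that none of the checked guidelines were violated. Whether a name is descriptive and consistent still has to be judged by a person.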

Some elements that can be included in filenames are for example:

Date/time interval/location.
Name of study/project.
Version number.
File content.
Name/initials of the researcher.
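The elements above can be combined in a fixed order so that every file in a project follows the same pattern. The Python sketch below is one possible convention, not a prescribed one: the order of the elements, the underscore separator, the two-digit version number, and the function name `build_filename` are all assumptions made for this example.

```python
from datetime import date


def build_filename(project, content, day, version, initials, ext):
    """Join the name elements with underscores in a fixed order."""
    parts = [
        day.isoformat(),      # YYYY-MM-DD, so names sort chronologically
        project,
        content,
        f"v{version:02d}",    # zero-padded version number
        initials,
    ]
    return "_".join(parts) + "." + ext.lstrip(".")


# Placeholder values, for illustration only.
print(build_filename("first-study", "interviews", date(2021, 6, 1), 2, "AB", "csv"))
# prints: 2021-06-01_first-study_interviews_v02_AB.csv
```

Placing the date first makes the files sort by date. If sorting by project or content is more useful, put that element first instead.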

Avoid:

Non-descriptive, generic folder names such as "Current".
Personal names as folder names within a project. Folder names should reflect the content.
Overlapping categories or multiple similar folders located in different locations.
Multiple copies of the same file in different folders. If necessary, you can create shortcuts to a file.
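Copies of the same file in different folders can be hard to spot by eye, especially when the copies have been renamed. The following Python sketch finds files with identical content by comparing SHA-256 checksums. The function names are hypothetical, and the approach reads every file in full, so it may be slow on very large datasets.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 checksum of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return groups of files under `root` whose contents are identical."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Each returned group lists the paths of files that share the same content, so you can decide which copy to keep.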

File and folder names often control how the files are sorted. Thus, the desired sorting can be decisive for the choice of name syntax.
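The effect of the name syntax on sorting is easy to demonstrate. In this short Python sketch, with hypothetical file names, numbered files sort as text: without leading zeros, file 10 ends up before file 2.

```python
unpadded = ["sample_1.csv", "sample_10.csv", "sample_2.csv"]
padded = [f"sample_{n:03d}.csv" for n in (1, 10, 2)]

print(sorted(unpadded))
# ['sample_1.csv', 'sample_10.csv', 'sample_2.csv']
print(sorted(padded))
# ['sample_001.csv', 'sample_002.csv', 'sample_010.csv']
```

The same principle explains why dates in the YYYY-MM-DD format sort chronologically, while dates written with the day first do not.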

Remember to provide organizational and name syntax documentation in a ReadMe file (see above) at the top level of the folder hierarchy.

A webinar on the topic of organizing and documenting research data is held every semester for those interested in learning more. A PowerPoint presentation with more information is also available on the course page.

If you need advice and guidance with metadata and documentation, please contact the research support team at researchdata@hjelp.uit.no.

Updated: 14.12.2023, updated by: Noortje Haugstvedt

Secure storage, collection and processing

About secure storage and routines for different types of data

According to the Information Management and Privacy management statement, research data must be accessible to those who need it (availability), safe against unintended and unlawful change (integrity), and inaccessible to unauthorized individuals (confidentiality). Integrity and availability depend on storing data on a reliable, backed-up system. As a UiT employee or student, you have access to the cloud service Office 365, which includes SharePoint and OneDrive. Data stored on these services is backed up automatically. SharePoint is recommended for storing research material, whereas OneDrive is intended for personal storage only, because content on OneDrive is automatically deleted when a user leaves the institution.

To ensure confidentiality, all data at UiT must be security-classified. The control system categorizes them as open/green, internal/yellow, confidential/red, or strictly confidential/black. The classification serves as the foundation for determining the level of security (IT-technical, organizational, and physical) that the information will be subjected to. In practice, this means that you must classify your data in order to ensure the appropriate level of security. This will then define where and how the data should be handled, as well as whether or not it should be protected. In Office 365, the classification is made visible as labels using Azure Information Protection (AIP).
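The four classes form an ordered scale, which can be modelled in code when a project keeps an inventory of its files. The Python sketch below is only an illustration. The rule that a collection of files is handled at the level of its most sensitive part is an assumption made for this example, not a statement of UiT policy, and the names are hypothetical.

```python
from enum import IntEnum


class Classification(IntEnum):
    """The four security classes, ordered from least to most restrictive."""
    OPEN = 1                   # green
    INTERNAL = 2               # yellow
    CONFIDENTIAL = 3           # red
    STRICTLY_CONFIDENTIAL = 4  # black


def collection_classification(file_classes):
    """Assumption: a collection takes the class of its most sensitive file."""
    return max(file_classes)


files = [Classification.OPEN, Classification.CONFIDENTIAL, Classification.INTERNAL]
print(collection_classification(files).name)
# prints: CONFIDENTIAL
```

Which services are approved for each class is decided by the IT department and is deliberately left out of the sketch.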

The IT department uses risk assessments to decide which categories of data the various services and systems are approved for. More information about approved services for different kinds of data can be found in "Which services can you use for which content?"

Private equipment and licenses should not be used for processing research data, because UiT's data is then processed privately. Since many cloud services are located in countries subject to other regulations, it is unclear whether adequate security measures are in place. UiT loses control of this data and is unable to comply with legal requirements such as the Privacy Regulation.

When saving data on a laptop or external storage medium, one must consider the possibility of loss and take appropriate precautions.

Services for storage and processing of research data
SharePoint can be used as a collaboration tool with other researchers, for example to collaborate on articles or book manuscripts. File Sender, provided by UNINETT, may be an alternative for sharing larger files.

Those that need to document experimental work could use the RSpace Enterprise electronic lab notebook.

If you have specific requirements for calculation and storage resources, you may learn more about what the IT department has to offer at "Lagring og publisering av forskningsdata" (Norwegian only).

Projects working with sensitive data should consider signing an agreement with the Sensitive Data Services (TSD). They provide secure storage and processing solutions that cover the full workflow, from data collection via a web form to processing and analysis.

Every semester, webinars on "Storage of research data" and "Data cleaning and tidy spreadsheets" are held for those interested in learning more about the processing and storage of research data. Those who work in the lab may find it useful to take a course in the usage of the electronic lab notebook RSpace. You will find more information on the Training and Teaching page.

Updated: 14.12.2023, updated by: Noortje Haugstvedt

Electronic Lab Notebook (ELN)

ELN RSpace is available at UiT

UiT has entered into an agreement with a provider of an electronic lab notebook, RSpace Enterprise. The service meets the security requirements at UiT and is considered a legally valid alternative to the traditional lab book. This is an important step toward ensuring the integrity and availability of laboratory data produced at UiT.

RSpace Enterprise is a versatile and generic electronic lab notebook that can be tailored to a wide range of experimental work and data management needs. The system enables new working methods that streamline workflow, stimulate reuse, and improve transparency and reproducibility.

Easy to collaborate and share documents within projects.
Security: Audit trail, digital signing of documents, and protection against deletion of material.
Timesaving: Advanced search functions, and documents can be linked to one another.
Availability: All files are stored together in the cloud.
Integration with several electronic services, including OneDrive, ChemAxon Marvin, eCAT, Protocols.io, SnapGene, and Dataverse (DataverseNO).
Full mobility of work. Documents can be exported in a variety of file formats. Data produced by visiting researchers and students remain available, and a copy can be given to them.

Since RSpace is a service provided by UiT, it is possible to get technical support and follow-up when needed.

How to get started with RSpace
RSpace is browser-based and works on mobile, tablet, and PC/Mac devices. UiT has its own RSpace server. To create a user account, follow these steps:
1. Log in to http://uit-rspace.researchspace.com/ using your FEIDE profile.
2. Register the license by sending an email to researchdata@hjelp.uit.no with your name, position, and affiliation.

Before using RSpace
One of the most significant benefits of using an electronic lab notebook is the ability to collaborate with colleagues and/or students. This is done by establishing a digital lab group. A principal investigator (PI) is needed for each lab group. A PI can run the lab group and have access to all the group's files. Before starting with RSpace, it is a good idea to have a plan for how the lab group(s) will be organized, who will be the PI, and whether the lab will be open or closed. Click here to download a support document that describes RSpace roles and organization or contact UiT's support team. Send an email to researchdata@hjelp.uit.no if you want to become a PI and create a new lab group.

Training and support
If you wish to learn more about electronic lab notebooks, a webinar is held every semester. We also provide interactive workshops on RSpace ELN. A PowerPoint presentation with additional information is also available on the course page.

Resources
More information about the functions in RSpace can be found here: UiT's introduction videos to RSpace (Norwegian), tutorial PDF (Norwegian), roles and organization, RSpace Inventory (PDF), or on RSpace's help pages (English) and YouTube channel (English).

Updated: 14.12.2023, updated by: Noortje Haugstvedt

Updated: 03.01.2022, updated by: Tanja Larssen
