The Gluster Blog

WORM (Write Once Read Multiple), Retention and Compliance

Gluster
2016-07-05

This feature provides a WORM-based compliance and archiving solution in GlusterFS. It mainly focuses on the following:

  • Compliance: Laws and regulations governing how intellectual property and confidential information are accessed and stored.
  • WORM/Retention: Storing data in a tamper-proof and secure way, together with data accessibility policies.
  • Archive: Storing data effectively and efficiently, and providing a disaster-recovery solution.

 

WORM Retention empowers GlusterFS users to safeguard their data in a tamper-proof manner. It further enables users to maintain and track the state of a file as it transitions over time (writable, read-only, and undeletable), thereby nullifying any effort to change the contents, location, or properties of a static file in a brick.

 

Existing implementation:
The existing feature is implemented at the volume level (a volume is a collection of multiple storage units, often referred to as bricks). This means that once the volume option is enabled, the files in that volume can only be in the read-only state.

New files can be created in that volume, but once a file is closed it becomes read-only. Users cannot even delete files that are no longer required or of no use. This was rigid and inconvenient, and gave GlusterFS users no options or controls. To avoid this, the more flexible file-level WORM/Retention feature was implemented.

 

Feature details:
This feature enhances the existing WORM translator in Gluster and introduces file-level WORM/Retention semantics with autocommit (automatic WORM/Retention transition). With file-level WORM/Retention, each file gets its own WORM/Retention properties, and the autocommit feature gives users options and controls to choose the settings that best fit their requirements. This makes the Gluster compliance/archival solution more relevant to the archival market.

The life cycle of a WORM/Retained file is shown in the figure. A normal file becomes WORM-Retained, either manually or by auto-commit, and stays in that state until the retention period ends. After the retention period, the file transitions to the WORM state. A file in the WORM state can transition back to the WORM-Retained state if necessary, using the manual transition procedure. The WORM-Held (Legal-Hold) state is not implemented in 3.8; it will be a future enhancement to the feature. A WORM file can be deleted, which was not possible with the previous implementation.

Worm Data Validation

In 3.8, file-level WORM/Retention ships as an experimental feature.

  1. We will have file-level WORM/Retention semantics, i.e.
    1. Each file will have its own WORM/Retention properties
      1. Retention Period
      2. WORM/Retention state
    2. Only 2 modes of WORM/Retention will be supported
      1. Relaxed : Retention period of the file can be increased or decreased (not below the modification time)
      2. Enterprise : Retention period of the file can only be increased and not decreased
    3. Volume Level Retention profile:
      1. Default Retention Period : Time until which a file should be undeletable
      2. Autocommit Period : Time period at/after which the namespace scan has to take place (automatic/lazy auto-commit) to do the state transition
      3. WORM/Retention Mode : Relaxed/Enterprise
    4. POSIX commands for WORM/Retention Operations
      1. “touch -a” or “touch -t” command to increase or decrease the retention period
      2. “chmod -w” or equivalent command to make a file read-only on demand
    5. WORM/Retention Transition :
      1. Manual, using a POSIX command
      2. Automatic transition : Dormant files will be converted into WORM files based on the auto-commit period. In 3.8 this is a lazy, IO-triggered mechanism that uses timeouts for untouched files: the next IO will cause the transition.

 

How to test:
Enabling the feature:

Turn off the features.read-only and features.worm volume options if they are active. Turn on the features.worm-file-level option to enable the file-level WORM feature. Set the features.retention-mode option, which governs how the retention period of a WORM-Retained file can be changed later. Set the features.default-retention-period and features.auto-commit-period options as required. Time periods are specified in seconds.
The FOPs (file operations) perform the state transition and the related actions only on files that were created while these options were set, and only while the volume options remain in the same state.

Img2

Manual transition:

This is done by using the POSIX command chmod:
chmod -w <filename>

chmod 0444 <filename>

chmod u=r,o=r,g=r <filename>
or any other equivalent command which removes the write bits for all three types of users (owner, group, and other). The code which checks for this is shown below:
    if (stbuf->ia_prot.owner.write == 0 &&
        stbuf->ia_prot.group.write == 0 &&
        stbuf->ia_prot.other.write == 0)
            ret = _gf_true;
If the condition is satisfied, the file transitions from the Normal/WORM state to the WORM-Retained state. The access time of the file then points to the time until which the file will be retained. During this time the file is immutable and undeletable.

Img3

In this figure, the access time of the file was previously 17:09:21. After the state transition, the access time points to the time until which the file will be undeletable. In this case it is 17:10:31, i.e., the sum of the time at which the state transition happened and the default-retention-period.
Autocommit:

The current version implements a lazy autocommit state transition. It is performed when the next IO (link, unlink, rename, or truncate) is triggered. The FOP checks whether the auto-commit-period of a dormant file has expired, i.e., whether the difference between the current time and the start_time (creation time) of the file is greater than the auto-commit-period. If this condition is satisfied and the file has not been accessed within the auto-commit-period, the file transitions to the WORM-Retained state, and its access time points to the time until which the file will be retained.
In the figure below, the “rm -f file2” (unlink) command triggers the state transition, since the timeout has occurred for file2. It displays a Read-only file system error and the FOP is blocked. After the transition, the access time points to the retention time of the file.
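
The timeout condition behind lazy autocommit can be written down as a small predicate. The sketch below is an assumption-laden illustration, not the translator's implementation: auto_commit_due is a hypothetical name, times are plain seconds, and the additional "file was not accessed in the meantime" condition is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical predicate: has the auto-commit-period expired for a file
 * whose start_time (creation time) is given? Follows the rule in the
 * text: (current time - start_time) > auto-commit-period. */
static bool auto_commit_due(time_t now, time_t start_time,
                            time_t auto_commit_period)
{
    assert(auto_commit_period >= 0); /* assumption: never negative */
    return (now - start_time) > auto_commit_period;
}
```

With a 30-second auto-commit-period, a file created at second 100 is not due at second 130 but is due at second 131, so the next link, unlink, rename, or truncate after that point would trigger the transition.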

Img4

Updating the retention time:

For a WORM-Retained file, we can change the retention time. The access time points to the previously set retention time of the file. We can change this by using the “touch -a” or “touch -t” commands. Whether the new time is accepted depends on the retention mode that is set on the file. If the retention mode is “relax”, the command succeeds if the time we have specified is not less than the modification time of the file. If the mode is “enterprise”, we can only increase the time that is set.


In the figure below, the first “touch -t” fails with a Read-only file system error, since the retention mode is “relax” and we are trying to set the access time of the file to a value less than its modification time. The second attempt succeeds and sets the access time (retention time) to a value higher than the modification time.
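
The two retention modes boil down to two different comparisons. The following is a minimal sketch of that rule, assuming times in seconds; the enum, the function name retention_update_allowed, and the choice to accept an unchanged value in enterprise mode are our own assumptions for illustration and are not taken from the translator source.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

enum retention_mode { RETENTION_RELAX, RETENTION_ENTERPRISE };

/* Hypothetical check for a "touch -a"/"touch -t" request on a
 * WORM-Retained file.
 *   relax:      the new time must not be less than the modification time
 *   enterprise: the new time must not be less than the current retention
 *               time (accepting an equal value is an assumption) */
static bool retention_update_allowed(enum retention_mode mode,
                                     time_t current_retention,
                                     time_t requested_retention,
                                     time_t mtime)
{
    assert(mode == RETENTION_RELAX || mode == RETENTION_ENTERPRISE);
    if (mode == RETENTION_RELAX)
        return requested_retention >= mtime;
    return requested_retention >= current_retention;
}
```

In relax mode a request below the modification time is rejected, which matches the failing “touch -t” in the figure, while in enterprise mode any request below the current retention time is rejected.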

Img5

Performing IO on WORM/Retained files:

The link, unlink, rename, write, and truncate FOPs will fail on a WORM-Retained file. When a link, unlink, rename, or truncate FOP is performed and the file’s retention period is over, the file transitions to the WORM state. The access time of the file is then set to the difference between the access time and the default-retention-period of that file. If the file’s retention time was not updated after it moved to the WORM-Retained state, the access time will point to the actual access time of the file before the state transition. If it was updated afterwards, the access time may not point to the actual access time of the file before the state transition.
If the FOP performed after the timeout is unlink, the file transitions to the WORM state and the FOP then succeeds, so the file will no longer be available. If you want to keep the file for some more time, increase its retention period before it gets into the WORM state.
In the figure below, “file3” is manually moved to the WORM-Retained state. The unlink, rename, and write FOPs are blocked since it is in the WORM-Retained state. Truncate and link FOPs will also fail for “file3”.
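
The expiry handling for a WORM-Retained file can be sketched as follows. This is a simplified model under stated assumptions, not the translator's code: struct worm_file and expire_retention are hypothetical names, times are in seconds, and "retention period is over" is taken to mean that the current time is strictly later than the stored access time.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical, simplified view of a file under retention. */
struct worm_file {
    bool   retained; /* true: WORM-Retained, false: WORM */
    time_t atime;    /* while retained, holds the retention expiry time */
};

/* Meant to be called from link, unlink, rename, or truncate.
 * If the retention period is over, move the file to the WORM state and
 * set atime to (atime - default-retention-period), as described above.
 * Returns true when the transition happened. */
static bool expire_retention(struct worm_file *f, time_t now,
                             time_t default_retention_period)
{
    assert(f != NULL);
    if (!f->retained || now <= f->atime)
        return false;
    f->retained = false;
    f->atime -= default_retention_period;
    return true;
}
```

If the file was retained at second 1000 with a 70-second period, its access time is 1070 while retained; a FOP at second 1100 moves it to the WORM state and the access time goes back to 1000.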

Img6

Performing IO on WORM files:

For a WORM file, the link, rename, truncate, and write FOPs will fail. The unlink FOP will pass and delete the file, since the retention time of the file has expired. The user can either keep the file or delete it, since the file’s timeout has occurred.
If the user wants to keep the file, they can move it to the WORM-Retained state again by using the POSIX command chmod, as in the manual transition. This will again put the file under the retention policies.
The figure below shows the state transition of file3 from the WORM-Retained state to the WORM state. After the transition, the access time of the file points to the access time before the WORM-Retention transition. The unlink FOP succeeds since the file is no longer retained.
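
Putting the last two sections together, which FOPs pass in which state can be summarised in one small function. This is a sketch of the behaviour described in this post only; the enum names and fop_allowed are hypothetical, and treating every FOP on a normal file as allowed is our assumption.

```c
#include <assert.h>
#include <stdbool.h>

enum worm_state { STATE_NORMAL, STATE_WORM_RETAINED, STATE_WORM };
enum fop { FOP_LINK, FOP_UNLINK, FOP_RENAME, FOP_WRITE, FOP_TRUNCATE };

/* Hypothetical summary of the rules in the text:
 *   Normal:        all five FOPs pass (assumption)
 *   WORM-Retained: link, unlink, rename, write, truncate all fail
 *   WORM:          only unlink passes */
static bool fop_allowed(enum worm_state state, enum fop op)
{
    switch (state) {
    case STATE_NORMAL:
        return true;
    case STATE_WORM_RETAINED:
        return false;
    case STATE_WORM:
        return op == FOP_UNLINK;
    }
    assert(0 && "unknown worm_state");
    return false;
}
```

So fop_allowed(STATE_WORM, FOP_UNLINK) is true, while the same unlink on a WORM-Retained file is rejected.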

Img7

User improvements:

    1. Users can still use the older volume-level WORM feature.
    2. Users can experiment with the file-level WORM feature.

 

Limitations (Plans for next releases):

    1. No data validation of read-only data, i.e., integration with bitrot is not done.
    2. Internal operations like tiering, rebalancing, and self-healing will fail on WORMed files.
    3. Since Gluster is a userland filesystem, there is no control over ctime. We need to implement this.
    4. WORM/Retention based tiering.

 

Owners:

  • Joseph Fernandes <josephaug26@gmail.com>
  • Karthik Subrahmanya <ksubrahm@redhat.com>

 

Reference:

http://www.gluster.org/community/documentation/index.php/Features/gluster_compliance_archive
