5. FILE MANAGEMENT

File Management in Operating Systems: Overview

File Management is one of the most important functions of an operating system (OS). It involves the creation, storage, retrieval, and management of files and directories in a computer system. The OS provides a file system that offers a structured way to store and organize data on storage devices such as hard drives, SSDs, and external media. The OS also enforces policies for file access, protection, and security.


1. Objectives of File Management

The main goals of file management are:

  • Efficient Data Storage: Store and retrieve files in an organized manner that optimizes the use of storage devices.

  • Data Sharing: Facilitate data sharing among different users and processes by allowing controlled access to files and directories.

  • File Access: Provide mechanisms for reading, writing, and modifying files in a flexible manner (e.g., sequential, random access).

  • Data Integrity: Ensure that files are not corrupted, lost, or improperly modified due to system failures or unauthorized access.

  • File Protection and Security: Protect files from unauthorized access, modification, and deletion, ensuring data privacy and security.

  • Uniform Naming: Allow files to be identified and accessed in a consistent way, regardless of their storage location.


2. File System

The file system is the component of the OS that organizes and manages files and directories on storage devices. It defines how data is stored, retrieved, and organized.

Key Components of a File System:

  • Files: A file is a named collection of related data (e.g., documents, images, applications). The name must be unique within its directory, so the full path identifies the file.

  • Directories: Directories (or folders) are used to organize files into a hierarchical structure, making it easier to manage large numbers of files.

  • Metadata: Information about the file, such as its size, timestamps, access permissions, and owner. The file system stores it in its own structures (e.g., inodes or directory entries), separately from the file's contents.

  • File Types: Different file types can exist, such as regular files (text or binary data), directories, symbolic links, and special files (e.g., devices).

  • Mounting: The process by which the OS makes a file system available by attaching it to a directory (the mount point) in the existing directory tree.
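
As a small illustration of the metadata and file type items above, the following Python sketch reads a few metadata fields through the standard os.stat call. The helper name describe_file and the temporary file are assumptions made for this example only; real file systems keep many more fields than the ones shown.

```python
import os
import stat
import tempfile


def describe_file(path):
    """Return a few metadata fields the file system keeps for `path`."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "is_directory": stat.S_ISDIR(info.st_mode),
        "is_regular_file": stat.S_ISREG(info.st_mode),
        "permission_bits": oct(stat.S_IMODE(info.st_mode)),
        "modified_time": info.st_mtime,
    }


if __name__ == "__main__":
    # Use a throwaway directory so the example does not depend on existing paths.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "notes.txt")
        with open(path, "w") as f:
            f.write("hello")
        print(describe_file(path))  # metadata of a regular file
        print(describe_file(tmp))   # metadata of a directory
```

Note that none of these fields come from the file's contents; they are all read from the file system's own bookkeeping.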


3. File Access Methods

File access methods determine how data within a file can be read, written, and modified. The common access methods are:

  • Sequential Access: Data in a file is read or written in order, from beginning to end, one record after another. This is typical for log files and for devices such as tape drives, which cannot jump efficiently to an arbitrary position.

  • Direct (Random) Access: Data is accessed directly by specifying its position or block number, without reading what precedes it. This makes reads and writes at arbitrary positions fast, which suits databases and disk-based storage.

  • Indexed Access: An index is built for the file, mapping each record's key to the record's location in the file. A lookup searches the index first and then accesses the record directly, which speeds up searches and retrieval.
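
The three access methods above can be compared on a single file. The sketch below is a minimal illustration, assuming fixed-size records of 16 bytes; the record size, the function names, and the sample names are all assumptions made for the example.

```python
import os
import tempfile

RECORD_SIZE = 16  # assumed fixed record length in bytes


def write_records(path, names):
    """Write each name as a fixed-size, space-padded record."""
    with open(path, "wb") as f:
        for name in names:
            f.write(name.encode("ascii").ljust(RECORD_SIZE))


def read_sequential(path):
    """Sequential access: read records in order from the start of the file."""
    records = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORD_SIZE)
            if not chunk:
                break
            records.append(chunk.decode("ascii").rstrip())
    return records


def read_direct(path, record_number):
    """Direct access: seek straight to record `record_number` (0-based)."""
    with open(path, "rb") as f:
        f.seek(record_number * RECORD_SIZE)
        return f.read(RECORD_SIZE).decode("ascii").rstrip()


def build_index(path):
    """Indexed access: map each record's key to its byte offset in the file."""
    index = {}
    with open(path, "rb") as f:
        offset = 0
        while True:
            chunk = f.read(RECORD_SIZE)
            if not chunk:
                break
            index[chunk.decode("ascii").rstrip()] = offset
            offset += RECORD_SIZE
    return index


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "records.dat")
        write_records(path, ["alice", "bob", "carol"])
        print(read_sequential(path))     # every record, in file order
        print(read_direct(path, 2))      # only the third record
        print(build_index(path)["bob"])  # byte offset of the "bob" record
```

Direct access works here only because every record has the same length, so the offset can be computed; with variable-length records an index is what makes a direct jump possible.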


4. Directory Implementation

Directories are essential for organizing files. They store information such as file names, types, locations, and metadata.

Directory Structures:

  • Single-Level Directory: All files are stored in one directory. Simple, but every file name must be unique across all users, and locating a file becomes harder as the number of files grows.

  • Two-Level Directory: Separate directories for each user, allowing for file name reuse by different users.

  • Tree-Structured Directory: A hierarchical directory system that organizes files and directories into a tree-like structure, allowing for more complex organization and management.

  • Acyclic-Graph Directory: Allows sharing of files and directories by multiple users through links, creating a more complex but flexible directory structure.

  • General Graph Directory: Similar to an acyclic graph but allows cycles. Traversal must guard against looping forever, and deleting entries is harder because a cycle can keep unreachable files referenced.
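
To make the tree-structured case concrete, here is a minimal in-memory sketch of a directory tree with path resolution. The Directory class and the resolve function are hypothetical names invented for this example; an actual file system stores directories on disk and handles links, permissions, and relative paths, all of which are left out here.

```python
class Directory:
    """A directory maps names to files (bytes) or to subdirectories."""

    def __init__(self):
        self.entries = {}

    def mkdir(self, name):
        if name in self.entries:
            raise FileExistsError(name)
        child = Directory()
        self.entries[name] = child
        return child

    def create_file(self, name, data=b""):
        if name in self.entries:
            raise FileExistsError(name)
        self.entries[name] = data


def resolve(root, path):
    """Walk an absolute path such as '/home/alice/notes.txt' from the root."""
    node = root
    for part in path.strip("/").split("/"):
        if part == "":
            continue  # the path was just "/"
        if not isinstance(node, Directory) or part not in node.entries:
            raise FileNotFoundError(path)
        node = node.entries[part]
    return node


if __name__ == "__main__":
    root = Directory()
    home = root.mkdir("home")
    alice = home.mkdir("alice")
    bob = home.mkdir("bob")
    # The same file name can be reused because each user has a separate directory.
    alice.create_file("notes.txt", b"alice's notes")
    bob.create_file("notes.txt", b"bob's notes")
    print(resolve(root, "/home/alice/notes.txt"))
    print(resolve(root, "/home/bob/notes.txt"))
```

Names only need to be unique within one directory, which is the property that the two-level and tree structures add over a single-level directory.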


5. File Allocation Techniques

File allocation is the method used by the file system to assign space on a storage device for files. The main techniques are:

  1. Contiguous Allocation:

    • Files are stored in contiguous blocks on the disk.
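    • Advantages: Fast sequential and direct access, because the file occupies consecutive blocks and any block's address can be computed from the starting block.
    • Disadvantages: Causes external fragmentation (free space broken into pieces too small to use) and makes it hard to grow a file once the neighboring blocks are taken.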
  2. Linked Allocation:

    • Files are stored as linked blocks scattered across the disk.
    • Each block contains a pointer to the next block, forming a linked list.
    • Advantages: Easy to grow files without needing contiguous space.
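    • Disadvantages: Direct access is slow, because reaching a given block means following the pointers from the start of the file. Each block also loses some space to its pointer, and a damaged pointer can make the rest of the file unreachable.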
  3. Indexed Allocation:

    • An index block contains pointers to all the blocks of a file.
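    • The index block is stored on disk with the file and is usually cached in memory while the file is open; the file’s data blocks can be non-contiguous.
    • Advantages: Avoids external fragmentation, lets files grow easily, and supports direct access to any block.
    • Disadvantages: Requires extra storage for the index, which is wasteful for small files, and very large files need multiple or multi-level index blocks.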
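
The difference between the three techniques shows up most clearly when translating a logical block number (the n-th block of a file) into a physical block on disk. The following sketch is a simplified model under stated assumptions: the disk is reduced to plain Python data structures, and the function names and block numbers are invented for the example.

```python
def contiguous_lookup(start_block, length, logical_block):
    """Contiguous: the physical block is the start block plus the offset."""
    if not 0 <= logical_block < length:
        raise IndexError("logical block out of range")
    return start_block + logical_block


def linked_lookup(next_pointer, start_block, logical_block):
    """Linked: follow `logical_block` pointers, starting at the first block.

    `next_pointer` maps a physical block to the next one, or to None at the end.
    """
    if logical_block < 0:
        raise IndexError("logical block out of range")
    block = start_block
    for _ in range(logical_block):
        block = next_pointer[block]
        if block is None:
            raise IndexError("logical block out of range")
    return block


def indexed_lookup(index_block, logical_block):
    """Indexed: a single lookup in the file's index block."""
    if not 0 <= logical_block < len(index_block):
        raise IndexError("logical block out of range")
    return index_block[logical_block]


if __name__ == "__main__":
    # A 4-block file stored contiguously starting at block 10.
    print(contiguous_lookup(start_block=10, length=4, logical_block=2))

    # The same file stored as a chain: 9 -> 16 -> 1 -> 25.
    chain = {9: 16, 16: 1, 1: 25, 25: None}
    print(linked_lookup(chain, start_block=9, logical_block=2))

    # The same scattered blocks, listed in an index block.
    print(indexed_lookup([9, 16, 1, 25], logical_block=2))
```

In this model the contiguous and indexed lookups take constant time, while the linked lookup takes time proportional to the logical block number, which is why linked allocation suits sequential access better than direct access.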

6. File Protection and Security

File protection and security mechanisms ensure that only authorized users and processes can access files.

Techniques for File Protection:

  • Access Control Lists (ACLs): A file or directory carries a list of users and groups together with the permissions granted to each (read, write, execute).

  • File Permissions: Most systems use file permission models, where the owner, group, and others have distinct access rights (e.g., in Unix/Linux systems, these rights are represented as rwx for read, write, and execute).

  • Encryption: Encrypting files ensures that anyone without the decryption key cannot read the data, even if they obtain a copy of the file or of the storage device.

  • User Authentication: Verifies the identity of a user, typically through user accounts and passwords, before any access is granted. The permission checks above then decide what that authenticated user may do.

  • Backup and Recovery: Regular backups of files provide protection against data loss due to hardware failures, accidental deletion, or corruption.
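
The owner/group/others permission model described above can be sketched in a few lines. This is a simplified illustration, not how any particular kernel implements the check: it ignores the superuser, ACLs, and special bits such as setuid, and the function names, user names, and group names are assumptions made for the example.

```python
import stat


def permission_string(mode):
    """Render the nine permission bits as three rwx triples."""
    flags = [
        (stat.S_IRUSR, "r"), (stat.S_IWUSR, "w"), (stat.S_IXUSR, "x"),
        (stat.S_IRGRP, "r"), (stat.S_IWGRP, "w"), (stat.S_IXGRP, "x"),
        (stat.S_IROTH, "r"), (stat.S_IWOTH, "w"), (stat.S_IXOTH, "x"),
    ]
    return "".join(ch if mode & bit else "-" for bit, ch in flags)


def is_allowed(mode, file_owner, file_group, user, user_groups, wanted):
    """Check 'r', 'w', or 'x' against the one permission class that applies.

    Exactly one class is used: owner if the user owns the file, otherwise
    group if the user belongs to the file's group, otherwise others.
    """
    bit = {"r": 4, "w": 2, "x": 1}[wanted]
    if user == file_owner:
        triple = (mode >> 6) & 0o7
    elif file_group in user_groups:
        triple = (mode >> 3) & 0o7
    else:
        triple = mode & 0o7
    return bool(triple & bit)


if __name__ == "__main__":
    mode = 0o640  # owner: read and write, group: read, others: nothing
    print(permission_string(mode))
    print(is_allowed(mode, "alice", "staff", "alice", {"staff"}, "w"))
    print(is_allowed(mode, "alice", "staff", "bob", {"staff"}, "w"))
    print(is_allowed(mode, "alice", "staff", "carol", {"guests"}, "r"))
```

Only one class is consulted per request, so an owner whose own triple lacks a permission is denied even if the group or others triple would grant it.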

Security Policies:

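  • Discretionary Access Control (DAC): The file owner decides who can access the file and may grant or revoke permissions at will.

  • Mandatory Access Control (MAC): Access rules are set centrally by system policy, typically based on security levels or labels, and cannot be overridden by file owners.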

  • Role-Based Access Control (RBAC): Users are assigned roles with predefined permissions, allowing for more scalable access control.
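
As a rough sketch of the role-based idea, permissions can be attached to roles and users mapped to roles, so that a check never mentions an individual user's rights directly. The role names, user names, and actions below are hypothetical and exist only for this example.

```python
# Hypothetical roles and the actions each one grants.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "editor": {"read", "write"},
    "admin": {"read", "write", "delete"},
}

# Hypothetical assignment of users to roles.
USER_ROLES = {
    "alice": {"admin"},
    "bob": {"editor"},
    "carol": {"viewer"},
}


def can(user, action):
    """A user may perform an action if any of their roles grants it."""
    return any(
        action in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


if __name__ == "__main__":
    print(can("alice", "delete"))
    print(can("bob", "delete"))
    print(can("carol", "read"))
    print(can("dave", "read"))  # unknown users have no roles
```

Changing what editors may do means editing one entry in the role table rather than updating every user, which is where the scalability of this model comes from.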