NTFS File System
NTFS is the standard filesystem for Windows, developed by Microsoft. NT means "New Technology", and FS is of course "File System".
NTFS Structure and MFT Structure
On a volume formatted as NTFS, a special area called the MFT (Master File Table) is allocated. The MFT holds the management information for the whole NTFS volume and an entry, with its attributes, for every file and folder on the volume. It corresponds to the inode area in Linux.
Normally, 12.5% of the volume is reserved for the MFT area.
The MFT consists of a group of metafiles whose names start with "$" and one record for each file or folder (fixed length of 1 KByte). It has the following structure.
Structure of a file / folder
As mentioned above, entries of each file / each folder on the NTFS file system are stored as 1 KByte long records in the MFT.
One file corresponds to one record. As in Linux XFS and similar filesystems, directories (folders) are also treated as one type of file.
The attributes attached to a file are stored in its record. Attribute names are prefixed with "$", but they should not be confused with the metafiles.
Type | Attribute | Description |
0x10 | $STANDARD_INFORMATION | Timestamps, flags (file type), USN, etc. |
0x20 | $ATTRIBUTE_LIST | Attributes that do not fit in the MFT record, and their locations. |
0x30 | $FILE_NAME | File name (or folder name). |
0x40 | $OBJECT_ID | File identifier. (Used by Office applications.) |
0x80 | $DATA | Contents written in the file. |
0x50 | $SECURITY_DESCRIPTOR | Owner and permission information. |
0xE0 | $EA | Extended attributes. (Described later.) |
0x90 | $INDEX_ROOT | Indicates the root position of the B+Tree. (Attribute used in folders.) |
0xA0 | $INDEX_ALLOCATION | List of the allocated index nodes of the B+Tree. (Attribute used in folders.) |
0xB0 | $BITMAP | Indicates which index entries of the B+Tree are allocated or not. (Attribute used in folders.) |
0xC0 | $REPARSE_POINT | Specifies the link destination of a symbolic link, etc. |
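The table above can be sketched as a small lookup, which is handy when reading attribute headers in a hex dump. This is only an illustration: the dictionary holds just the types listed in the table, and `attribute_name` is a hypothetical helper name, not part of any NTFS library.

```python
# Attribute type codes taken from the table above (not a complete NTFS list).
ATTRIBUTE_TYPES = {
    0x10: "$STANDARD_INFORMATION",
    0x20: "$ATTRIBUTE_LIST",
    0x30: "$FILE_NAME",
    0x40: "$OBJECT_ID",
    0x50: "$SECURITY_DESCRIPTOR",
    0x80: "$DATA",
    0x90: "$INDEX_ROOT",
    0xA0: "$INDEX_ALLOCATION",
    0xB0: "$BITMAP",
    0xC0: "$REPARSE_POINT",
    0xE0: "$EA",
}

# Attributes the table marks as "used in folders".
FOLDER_ATTRIBUTES = {0x90, 0xA0, 0xB0}


def attribute_name(type_code: int) -> str:
    """Return the attribute name for a type code, or a marker if not listed."""
    return ATTRIBUTE_TYPES.get(type_code, f"not listed (0x{type_code:02X})")


for code in sorted(ATTRIBUTE_TYPES):
    kind = "folder" if code in FOLDER_ATTRIBUTES else "file/folder"
    print(f"0x{code:02X} {attribute_name(code)} ({kind})")
```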
In Linux XFS etc. "Attribute (meta information)" and "Data" are distinguished, but in Windows NTFS "Data" ($DATA) is also one of the attributes.
For example, when "test" is written to a file (test.txt) with Notepad, its data area ($DATA attribute) contains the 4 characters "test" and is 4 bytes in size. The file's properties also show a size of 4 bytes. (On Linux, editors typically append a line feed automatically, so the file becomes 5 bytes.)
A general file includes, for example, the following four attributes.
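The byte counts mentioned here can be checked with a few lines of Python. This is a minimal sketch that only measures the encoded strings; it assumes ASCII content and does not touch any real file.

```python
# Content as saved by Notepad: just the four characters.
notepad_text = "test"
# Content as typically saved by a Linux editor: a trailing line feed is added.
linux_text = "test\n"

notepad_size = len(notepad_text.encode("ascii"))
linux_size = len(linux_text.encode("ascii"))

print(notepad_size, linux_size)  # prints "4 5"
```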
- $STANDARD_INFORMATION
- $FILE_NAME
- $OBJECT_ID
- $DATA (In case of folder, $INDEX_ROOT, $INDEX_ALLOCATION, $BITMAP)
All attributes (including $DATA) are stored in the MFT area as long as their total size does not exceed 1 KByte. As a result, "size on disk" is 0 bytes, as shown in the above figure of test.txt.
When the attributes grow too large to be stored in the MFT, a "Mapping Pair" (also called a runlist) is used to place the data in the "File System Data" area (the area counted toward the size on disk).
The disk is accessed in units of the cluster size (also called the "Allocation Unit" or "Block Size"; 4 KB on typical recent PCs), so as you keep adding text to the data, "size on disk" becomes 4 KB. For example, if you increase the data area of test.txt to 260 bytes, it looks like the following.
An attribute that does not fit in the MFT like this is called non-resident. (Conversely, an attribute stored inside the MFT is called resident.)
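The relationship between data size and "size on disk" can be sketched as follows. This is a simplified model under stated assumptions: the caller says whether $DATA is resident (the real threshold depends on how much room the other attributes leave in the 1 KByte record), the cluster size defaults to 4 KB, and `size_on_disk` is a hypothetical helper name.

```python
def size_on_disk(data_size: int, resident: bool, cluster_size: int = 4096) -> int:
    """Approximate the "size on disk" value for a file's $DATA attribute."""
    if resident:
        # Resident data lives inside the MFT record, so no clusters are used.
        return 0
    # Non-resident data occupies whole clusters: round up to a cluster multiple.
    clusters = -(-data_size // cluster_size)  # ceiling division
    return clusters * cluster_size


print(size_on_disk(4, resident=True))      # prints 0
print(size_on_disk(704, resident=False))   # prints 4096
print(size_on_disk(4097, resident=False))  # prints 8192
```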
Processing differs depending on whether the attribute that does not fit in the MFT is $DATA or another attribute.
If $DATA does not fit in the MFT, a "Mapping Pair" is stored in the $DATA attribute in the MFT, as described above. It indicates which cluster ranges of the "File System Data" area hold the $DATA contents.
Specifically, a pair {starting cluster X1, cluster count Y1} is stored, indicating that $DATA occupies Y1 consecutive clusters starting at logical cluster X1 (that is, clusters X1 through X1 + Y1 - 1). The data may be split into several such runs (fragmentation) as shown in the figure below, and in bad cases performance degrades unless defragmentation is performed.
Let's look at this with a concrete tool. Install and launch the freeware Active@ Disk Editor, select "test.txt" as shown below, and click "Inspect File Record".
First, in the 4-byte state, it is displayed as follows.
The non-resident flag in the attribute header of $DATA is 0, and the character string "test" is stored in ASCII as the attribute value of $DATA.
Next, when the file is increased to 704 bytes, it becomes as follows.
The non-resident flag in the attribute header of $DATA is 1, and a "Mapping Pair" is stored as the attribute value of $DATA. Its size byte is 0x41. This is confusing, but the 4 is the size of the starting cluster field (4 bytes) and the 1 is the size of the cluster count field (1 byte). Because the start position is a large value, 20,182,697, 4 bytes are needed to hold it. Because the cluster count is 1, only 1 byte is used.
On the other hand, if the attributes other than $DATA hold so much information that the 1 KByte record overflows, a 1 KB Auxiliary Record (also called a Child Record) is created in the MFT.
The attributes that overflow are stored in this Auxiliary Record, and a $ATTRIBUTE_LIST attribute is added to the original record (Base Record) to record the location of the Auxiliary Record and which attributes were moved into it.
Alternate Data Stream : ADS
The $DATA attribute supports multiple streams and can hold information such as hidden data. It is also used to attach security information to .exe files and the like, but in some cases it is used as a hiding place for a virus.
For example, to give C:\test\test.txt an alternate data stream named Strm1 containing the data "ads1", type the following command in PowerShell.
PS C:\test> Set-Content test.txt -Value ads1 -Stream Strm1
The stream was set. To view this data in PowerShell as well, type:
PS C:\test> Get-Content test.txt -Stream Strm1
ads1
This confirms that the stream contains the data "ads1".
Next, let's look at this state with the dir /r command in the command prompt.
As shown above, alternate data streams are represented in the format "filename:stream name:$DATA" (here, test.txt:Strm1:$DATA).
The main data stream of an ordinary file, which is not an alternate stream, is unnamed, so it is written as "filename::$DATA".
In the example shown above, test.txt corresponds to this, with "::$DATA" omitted.
To put it a bit more strictly, multiple streams are supported for all attributes, not just $DATA, and the general format is "filename:stream name:attribute type".
Similarly, the name is left unnamed in most cases, but named ones are often used for metafiles.
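The naming convention can be sketched as a small parser. It assumes the "filename:stream name:attribute type" notation that dir /r prints for streams (for example test.txt:Strm1:$DATA) and bare file names without a drive letter; `parse_stream_spec` is a hypothetical helper, not a Windows API.

```python
def parse_stream_spec(spec: str):
    """Split 'file[:stream[:type]]' into (file, stream, type).

    An omitted stream name means the unnamed main stream, and an omitted
    type is taken to be $DATA.
    """
    parts = spec.split(":")
    if len(parts) > 3:
        raise ValueError(f"too many ':' separators in {spec!r}")
    file_name = parts[0]
    stream = parts[1] if len(parts) > 1 else ""
    attr_type = parts[2] if len(parts) > 2 else "$DATA"
    return file_name, stream, attr_type


print(parse_stream_spec("test.txt:Strm1:$DATA"))  # ('test.txt', 'Strm1', '$DATA')
print(parse_stream_spec("test.txt::$DATA"))       # ('test.txt', '', '$DATA')
print(parse_stream_spec("test.txt"))              # ('test.txt', '', '$DATA')
```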
Extended Attributes
NTFS has an area called extended attributes. It is similar to alternate data streams and can be used like hidden data. However, it is not yet widely used.
I could not find a mechanism to set them from within Windows, but I confirmed that they can be set from Linux via CIFS.
Step 1. Install attr on CentOS 7
[root@localhost ~]# yum -y install attr
Step 2. Mount the Windows share (172.16.2.3) from CentOS 7 via CIFS
[root@localhost ~]# mount -t cifs //172.16.2.3/home/test -o 'username=test,password=P@$$w0rd' /mnt/test
Step 3. Write a value to an extended attribute of test.txt under the mount point on CentOS 7
[root@localhost test]# setfattr -n user.test -v test-value test.txt
[root@localhost test]# getfattr -d test.txt
# file: test.txt
user.TEST="test-value"
Linux extended attributes have four namespaces {user, trusted, security, system}, and the user namespace appears to correspond to Windows extended attributes. If you use the command "EaQuery64.exe" from the site below, you can see extended attribute information on Windows.
Execution commands and results are as follows.
C:\home\test>EaQuery64.exe /Target:C:\home\test\test.txt /Mode:0 /Verbose:2 /Identifier:*
EaQuery v1.0.0.1
TargetFile: C:\home\test\test.txt
NextEntryOffset: 0
Flags: 0x00
EaNameLength: 4
EaName: TEST
EaValueLength: 10
EaValue:
0000 74 65 73 74 2d 76 61 6c 75 test-valu
C:\home\test>
For some reason the last letter of the value is missing. Since EaValueLength is 10, this seems to be a bug in EaQuery64.exe.
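The mismatch can be illustrated with a short sketch. It assumes the fields printed by the tool follow the layout of the Windows FILE_FULL_EA_INFORMATION structure (4-byte NextEntryOffset, 1-byte Flags, 1-byte EaNameLength, 2-byte EaValueLength, then the NUL-terminated name followed by the value). The entry below is built by hand from the values in the output above, not read from a real volume, and `parse_ea_entry` is a hypothetical helper name.

```python
import struct


def parse_ea_entry(buf: bytes) -> dict:
    """Parse one extended-attribute entry laid out as described above."""
    next_offset, flags, name_len, value_len = struct.unpack_from("<IBBH", buf, 0)
    name_start = 8                             # fixed header is 4 + 1 + 1 + 2 bytes
    value_start = name_start + name_len + 1    # skip the NUL after the name
    return {
        "NextEntryOffset": next_offset,
        "Flags": flags,
        "EaName": buf[name_start:name_start + name_len].decode("ascii"),
        "EaValue": buf[value_start:value_start + value_len],
    }


# Hand-built entry matching the output: name "TEST", 10-byte value "test-value".
entry = struct.pack("<IBBH", 0, 0x00, 4, 10) + b"TEST\x00" + b"test-value"
parsed = parse_ea_entry(entry)

# Bytes shown in the tool's hex dump line.
dumped = bytes.fromhex("74 65 73 74 2d 76 61 6c 75")

print(parsed["EaName"], len(parsed["EaValue"]))  # prints "TEST 10"
print(dumped, len(dumped))                       # prints "b'test-valu' 9"
```

The dump holds only 9 of the 10 bytes announced by EaValueLength, which is consistent with the value being stored correctly and only the display being truncated.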
Let's change the permissions of test.txt. Specifically, for the user "test" who mounts via CIFS, remove "Read extended attributes" and "Write extended attributes" in Advanced permissions.
After that, the extended attributes can no longer be read or written from Linux.
[root@localhost test]# getfattr -d test.txt
getfattr: test.txt: Permission denied
[root@localhost test]# setfattr -n user.test2 -v test-value test.txt
setfattr: test.txt: Permission denied
This is one piece of evidence that the user namespace of Linux extended attributes is tied to Windows extended attributes.
If you try to write to extended attributes in a namespace other than user {trusted, security, system} from Linux via CIFS, the error "Operation not supported" is returned and the command fails.
[root@localhost test]# setfattr -n trusted.test2 -v test2-value test.txt
setfattr: test.txt: Operation not supported
[root@localhost test]# setfattr -n security.test2 -v test2-value test.txt
setfattr: test.txt: Operation not supported
[root@localhost test]# setfattr -n system.test2 -v test2-value test.txt
setfattr: test.txt: Operation not supported