This is a document I wrote a while back for work that I thought I would release in hopes that some people out there would find it useful.
Preferably, you have a backup of the filesystem that you can use. If not, the filesystem you are about to try to recover a file on must meet these requirements:
- No new files have been created on the filesystem.
- No files have been extended.
- The filesystem is able to be unmounted.
- It is a JFS filesystem, not JFS2.
If so, then please, drink a few more beers and continue, but before you do…
BACKUP THE CURRENT FILESYSTEM!
Also, note that if you are dealing with a directory that has been deleted and would like to recover both the directory and the files under that directory, you should try Recovering a Deleted Directory (a document I have yet to post). It follows many of the same steps, but has some very important differences. Do not try to use this procedure to recover deleted directories and the files that were contained within them. You will mess up.
Before we begin, I need to note a few things. I take no responsibility if this screws up your system. Use this at your own risk. The example presented here is an actual record of me recovering a deleted file; these are not made-up numbers. Also, this only works on jfs filesystems, not jfs2. The jfs2 fsdb is much different and I haven’t had a chance to play with it to determine the proper way of doing this.
Now that I’ve said that, we can begin. We’ll use an example directory with some example files. Our directory is called /test and our filesystem is testlv, otherwise known as /dev/testlv. In our example, our Junior System Admin, Myron, has accidentally deleted a perl script called testfile.pl and needs to recover it.
Note: If you are performing this operation on a filesystem while in maintenance mode, do NOT use option 1 when asked how to mount the filesystems. ALWAYS use option 2, which starts a shell before mounting the filesystems. Otherwise, the system will force an fsck -y on the filesystem and delete your files.
Step 1.
First, run this command:
ls -id /test
Output:
[test:/]# ls -id /test
2 /test/
This informs us that the inode for the directory /test is 2. Record this for future use.
Step 2.
Unmount /test
umount /test
Output: None
We must unmount the directory. We don’t want anyone to try and use it while we are attempting to restore the file.
Step 3
Now we’ll start up the filesystem debugger.
fsdb /dev/testlv
Output:
[test:/]# fsdb /dev/testlv
File System: /dev/testlv
File System Size: 193200128 (512 byte blocks)
Disk Map Size: 1660 (4K blocks)
Inode Map Size: 831 (4K blocks)
Fragment Size: 4096 (bytes)
Allocation Group Size: 16384 (fragments)
Inodes per Allocation Group: 8192
Total Inodes: 12075008
Total Fragments: 24150016
This starts the filesystem debugger on our testlv filesystem.
Step 4
Now we look at our inode number.
2i
Output:
2i
i#: 2 md: d-g-rwxr-xr-x ln: 4 uid: 3 gid: 3
szh: 0 szl: 512 (actual size: 512)
a0: 0x25d a1: 0x00 a2: 0x00 a3: 0x00
a4: 0x00 a5: 0x00 a6: 0x00 a7: 0x00
at: Mon Jan 10 11:19:17 2005
mt: Mon Jan 10 11:11:26 2005
ct: Mon Jan 10 11:11:26 2005
The number in front of the ‘i’ in the command is the inode number we recorded in step #1. The command displays the inode information for the directory. The field a0 contains the block number of the directory. The following steps assume only field a0 is used. If a value appears in a1 or later fields, it may be necessary to repeat steps #5 and #6 for each block until the file to be recovered is found.
Step 5
Move to the block
a0b
Output:
a0b
0x000025d000 : 0x00000000 (0)
This moves to the block pointed to by field ‘a0’ of this inode.
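In the example output, the address fsdb prints after a0b (0x000025d000) equals the a0 block number (0x25d) multiplied by the 4096-byte fragment size reported in step 3. The small Python sketch below only reproduces that arithmetic as a sanity check. The function name is made up for illustration, and it assumes the block number counts whole fragments, which is what the numbers in this example suggest.

```python
def block_to_address(block_number, fragment_size=4096):
    """Return the byte address of a block, assuming blocks are whole fragments."""
    return block_number * fragment_size


# a0 from step 4 and the fragment size from step 3
address = block_to_address(0x25d)
print(f"0x{address:010x}")  # 0x000025d000
```

If your filesystem reports a different fragment size in step 3, pass it as the second argument.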
Step 6
Now we need to print out some data.
p256c
Output:
p256c
0x000025d000: \0 \0 \0 \? \0 \? \0 \? . \0 \0 \0 \0 \0 \0 \?
0x000025d010: \0 \? \0 \? . . \0 \0 \0 \0 \0 \? \0 \? \0 \n
0x000025d020: l o s t + f o u n d \0 \0 \0 \0 \0 \?
0x000025d030: \0 $ \0 \? m e m _ r e p o r t _ 2
0x000025d040: 0 0 4 1 1 0 1 . d m p . g z \0 \0
0x000025d050: \0 \0 \0 \? \0 \s \0 \? o r a s c r a t
0x000025d060: c h . c p i o . g z \0 \0 \0 \0 \0 \?
0x000025d070: \0 ( \0 \s u s e r _ a c t i v i t
0x000025d080: y _ 2 0 0 4 1 1 0 1 . d m p . g
0x000025d090: z \0 \0 \0 \0 \0 \0 \? \0 , \0 ! u s e r
0x000025d0a0: _ a c t i v i t y _ d e t _ 2 0
0x000025d0b0: 0 4 1 1 0 1 . d m p . g z \0 \0 \0
0x000025d0c0: \0 \? ` \0 \? @ \0 \? E C R 1 X \0 \0 \0
0x000025d0d0: \0 \0 \0 \? \? 0 \0 \? t e s t f i l e
0x000025d0e0: . p l \0 \? \0 \a t e s t d i r \0
0x000025d0f0: j d u c k o . t x t \0 \0 \0 \0 \0 \?
The command p256c stands for ‘print 256 bytes in character mode’. You could type ‘p128c’ and it would print 128 bytes in character mode, and so on. The leftmost column is the address of the first character in that row. The important thing in this output is to find which line the file to be recovered is on. Our file (testfile.pl) is located on line 0x000025d0d0. Next, we have to find the address of the first character of our filename. To do this, start at 0 for the first character of the row and count in hexadecimal until you reach the first character of the filename, then add that count to the row address. In our example, the ‘t’ of testfile.pl is at address 0x000025d0d8. Record this address.
If you cannot find your filename here, issue the command again. It will print the next 256 bytes in character mode. Do this until you find your filename.
Here’s a layout to help you in figuring out how we got the address:
Address:       0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0x000025d0d0: \0 \0 \0 \? \?  0 \0 \?  t  e  s  t  f  i  l  e
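If counting in hexadecimal by hand feels error-prone, the same count can be sketched in Python. This is only an illustration of the counting described above, not part of fsdb: the function name is hypothetical, and it assumes you paste in one row of character output exactly as fsdb printed it, with the characters separated by spaces.

```python
def name_address(row_address, row_chars, name_start):
    """Return the address of the first character of name_start in one row.

    row_chars is the list of characters fsdb printed for that row.
    Returns None if the row does not contain name_start.
    """
    wanted = list(name_start)
    for column in range(len(row_chars) - len(wanted) + 1):
        if row_chars[column:column + len(wanted)] == wanted:
            return row_address + column
    return None


# The row at 0x000025d0d0 from the p256c output
row = r"\0 \0 \0 \? \? 0 \0 \? t e s t f i l e".split()
print(hex(name_address(0x25d0d0, row, "testfile")))  # 0x25d0d8
```

Only the part of the filename that sits on this row is searched for, since the rest of testfile.pl continues on the next row.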
Step 7
Reset our position.
a0b
Output:
a0b
0x000025d000 : 0x00000000 (0)
This resets our position back to the beginning of the a0 block. This is necessary whenever you want to reprint out the byte data. Remember, however, that if you had to use the ‘p’ command many times to find your filename, you will probably have to use it many times each time you reset back to the beginning.
Step 8
Print our data in decimal
p256e
Output:
p256e
0x000025d000: 0 2 12 1 11776 0 0 2
0x000025d010: 12 2 11822 0 0 16 20 10
0x000025d020: 27759 29556 11110 28533 28260 0 0 17
0x000025d030: 36 26 28005 27999 29285 28783 29300 24370
0x000025d040: 12336 13361 12592 12590 25709 28718 26490 0
0x000025d050: 0 18 28 18 28530 24947 25458 24948
0x000025d060: 25448 11875 28777 28462 26490 0 0 19
0x000025d070: 40 29 30067 25970 24417 25460 26998 26996
0x000025d080: 31071 12848 12340 12593 12337 11876 28016 11879
0x000025d090: 31232 0 0 20 44 33 30067 25970
0x000025d0a0: 24417 25460 26998 26996 31071 25701 29791 12848
0x000025d0b0: 12340 12593 12337 11876 28016 11879 31232 0
0x000025d0c0: 18 24576 320 5 17731 21041 22528 0
0x000025d0d0: 0 21 304 11 29797 29556 26217 27749
0x000025d0e0: 11888 27648 288 7 29797 29556 25705 29184
0x000025d0f0: 27236 30051 27503 11892 30836 0 0 23
0x000025d100: 260 16 27233 28005 29549 24947 29537 29281
0x000025d110: 11892 30836 0 0 0 0 0 0
0x000025d120: 0 0 0 0 0 0 0 0
0x000025d130: 0 0 0 0 0 0 0 0
0x000025d140: 0 0 0 0 0 0 0 0
The command ‘p256e’ stands for ‘print 256 bytes in decimal word format’. This output can be helpful and confusing at the same time. First, find the beginning address that our file name is on. In our example, this was 0x000025d0d0. The line in decimal format reads:
0x000025d0d0: 0 21 304 11 29797 29556 26217 27749
For each file, assume the following:
{ADDRESS}:    x    x    x    x    x    x    x    x
              |    |    |    |    |--- filename ---|
inode # ------+----+    |    |
                        |    +-- filename length
record length ----------+
Note that the inode # may begin on any part of the line. The reason we print the data in decimal format is to help us determine where in the line the inode number is. There are several ways to help you do this, here are some:
- Count the number of characters in your filename, then try to find that number in our address line (e.g. there are 11 characters in the filename ‘testfile.pl’). You can see on our line there is a matching number 11.
- Recount to the address 0x000025d0d8, assuming each column covers two addresses. The first column is 0 and 1. The second column is 2 and 3, then 4 and 5, etc. When you reach the column that matches your address, go back one column. The number in this column should match up with your filename length. Unless, of course, your filename is over 255 characters.
Once you are sure you have the correct column for your filename length, count back three more columns. This should put you at the first column of the inode number. We’ll use our example decimal line to explain this more:
0x000025d0d0: 0 21 304 11 29797 29556 26217 27749
Like we mentioned before, testfile.pl is 11 characters. We find a matching number 11 in the 4th column. That means that the column with ‘304’ is our record length field and the 0 and 21 columns make up our inode. Now that we know which columns our inode is in (columns 1 and 2), we must translate this number into our real inode number.
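To double-check that you picked the right columns, you can decode the words back into fields and see whether the filename comes out. The Python sketch below assumes the layout described above (two words of inode number, one word of record length, one word of filename length, then the name) and assumes each decimal word is 16 bits with the high byte first, which is what the example dump shows. The function name is hypothetical.

```python
def decode_entry(words):
    """Split one directory entry, given as 16-bit decimal words, into fields."""
    inode = words[0] * 65536 + words[1]   # two words make up the inode number
    record_length = words[2]
    name_length = words[3]
    # Turn the remaining words back into bytes, high byte first
    raw = b"".join(word.to_bytes(2, "big") for word in words[4:])
    name = raw[:name_length].decode("ascii")
    return inode, record_length, name_length, name


# The row at 0x000025d0d0 plus the first two words of the next row
words = [0, 21, 304, 11, 29797, 29556, 26217, 27749, 11888, 27648]
print(decode_entry(words))  # (21, 304, 11, 'testfile.pl')
```

If the decoded name is not the file you are looking for, you are probably one or more columns off.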
Step 9
Calculate our inode.
Thanks to Arthur Dent for this update
In some cases, our inode number may be a lower number and may not need any special treatment, however, for larger inodes, the information will span columns.
Here’s an example. The directory ECR1X is on an address above ours. Its inode number, like ours, is in columns 1 and 2. However, if you compare the decimal lines, you can immediately see the difference.
ECR1X Decimal: 0x000025d0c0: 18 24576
testfile.pl Decimal: 0x000025d0d0: 0 21
To calculate the inode if it spans 2 columns, use the following formula:
firstcolumn * 65536 + secondcolumn = inode
In our case, we are not using column 1, so all we need is column 2 from the previous step. It was 21, so now we know the inode number of the missing file. We’re close to recovery!
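Here is the formula as a short Python sketch, applied to both entries compared above. The function name is made up for illustration.

```python
def inode_from_columns(first_column, second_column):
    """Combine the two decimal columns into one inode number."""
    return first_column * 65536 + second_column


print(inode_from_columns(0, 21))            # testfile.pl: 21
print(inode_from_columns(18, 24576))        # ECR1X: 1204224
print(hex(inode_from_columns(18, 24576)))   # 0x126000
```

The hex form is handy if you want to compare the result against a hex dump of the same address.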
Step 10
We go to our new inode number
21i
Output:
21i
i#: 21 md: f---rw-r--r-- ln: 0 uid: 0 gid: 3
szh: 0 szl: 45 (actual size: 45)
a0: 0xeff a1: 0x00 a2: 0x00 a3: 0x00
a4: 0x00 a5: 0x00 a6: 0x00 a7: 0x00
at: Mon Jan 10 14:16:40 2005
mt: Mon Jan 10 14:16:48 2005
ct: Mon Jan 10 14:16:53 2005
From this output, you can see that we have a file.
Step 11
21i.ln=1
Output:
21i.ln=1
0x0000020a88 : 0x00000001 (1)
This sets the link count of the file back to 1. You can verify this by reissuing the command from step #10 and noticing that the ‘ln’ field has incremented.
21i
i#: 21 md: f---rw-r--r-- ln: 1 uid: 0 gid: 3
szh: 0 szl: 45 (actual size: 45)
a0: 0xeff a1: 0x00 a2: 0x00 a3: 0x00
a4: 0x00 a5: 0x00 a6: 0x00 a7: 0x00
at: Mon Jan 10 14:16:40 2005
mt: Mon Jan 10 14:16:48 2005
ct: Mon Jan 10 14:16:53 2005
We have now told the filesystem that the link count for inode 21 should be 1. This means that there should be a filename pointing at this inode. This basically reverses what the OS does when deleting a file: it doesn’t actually erase the file data. Instead, it unlinks the filename from its inode number, effectively preventing you from seeing the data.
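If the idea of a link count is new to you, this Python sketch shows it on an ordinary scratch file rather than on the damaged filesystem. It is only an illustration of how the count follows the number of filenames pointing at an inode. It assumes a temporary directory on a filesystem that supports hard links, and the function and file names are made up.

```python
import os
import tempfile


def link_counts():
    """Create a file, add a second name, remove it, and report the link count each time."""
    counts = []
    with tempfile.TemporaryDirectory() as workdir:
        original = os.path.join(workdir, "testfile.pl")
        second_name = os.path.join(workdir, "second_name.pl")
        with open(original, "w") as handle:
            handle.write("#!/usr/bin/perl\n")

        counts.append(os.stat(original).st_nlink)  # one filename points at the inode
        os.link(original, second_name)
        counts.append(os.stat(original).st_nlink)  # two filenames point at it
        os.unlink(second_name)
        counts.append(os.stat(original).st_nlink)  # back to one filename
    return counts


print(link_counts())
```

Removing the last filename is what drops the count to 0, which is the state inode 21 was in before we edited it.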
Step 12
Quit.
q
Output:
q
[test:/]#
This quits out of the fsdb.
Step 13
Fsck our volume
fsck /dev/testlv
Output:
[test:/]# fsck /dev/testlv
** Checking /dev/rtestlv (/test)
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
Unreferenced file I=21 owner=root mode=100644 size=45 mtime=Jan 10 14:16 2005 ; RECONNECT? y
** Phase 5 - Check Inode Map
Bad Inode Map; SALVAGE? y
** Phase 5b - Salvage Inode Map
** Phase 6 - Check Block Map
Bad Block Map; SALVAGE? y
** Phase 6b - Salvage Block Map
18 files 21893872 blocks 171306256 free
***** Filesystem was modified *****
This does a filesystem check on /dev/testlv. As you can see, it finds an inode claiming it is linked to, but no file that links to it. We answer ‘y’ to tell it to reconnect the inode to a filename, effectively giving us our file back!
Step 14
Remount our directory.
mount /test
Output: None
We must remount our filesystem to get back at our file.
Step 15
Go into lost and found. It’s where all lost little kiddies go. Duh.
cd /test/lost+found
Output: None
Our file is now located in lost+found. If you do an ‘ls’ in this directory, you will see something like the following:
[test:/test/lost+found]# ls -l
total 8
-rw-r--r-- 1 root sys 45 Jan 10 14:16 21
And if we cat the file 21, we get the following:
[test:/test/lost+found]# cat 21
#!/usr/bin/perl
print "this is a test\n";
Ta-da! It’s Myron’s missing perl script!
As a final aside, I will say that there may be different and much better ways of recovering files on AIX. However, this is the way I constructed from notes I found on various mailing lists and a few days of fooling around with it. So if you see some mistakes in this document or have some suggestions for better ways of doing this, please let me know! I will happily update this document with better information as it is provided.
I hope this helps some of you who have to deal with certain people who accidentally delete files on your systems. Nothing beats a good backup but when you don’t have one of those, this can always be used as a fallback.
March 31st, 2008 at 1:38 am
Nice! When the first column for the inode number is not 0, how do you get the complete inode number?
April 8th, 2008 at 8:23 pm
[...] will be posting an update to the JFS undelete document in a few days with the solutions to the issues of large inodes and some other comments I got. [...]
July 29th, 2008 at 1:36 pm
To calculate the inode number from the decimal screen, instead of having to go to the hex screen, multiply the first column by 65536, then add the second column. Here’s an example below.
Decimal:
0x000025d0c0: 18 24576
Hex:
0x000025d0c0: 0012 6000
18 * 65536 + 24576 = 1204224
0x00126000 = 1204224
As you can see, the results are identical.
Have you made any progress in doing this same procedure using the JFS2 filesystem? The fsdb commands are different with a JFS2 filesystem, and I haven’t yet figured out how to find the inode or block number of my deleted file. Any help would be appreciated.
Thanks!
July 29th, 2008 at 1:46 pm
Hey Arthur,
Thanks for the information, I’ll modify the document with your notes.
I haven’t had any luck yet with the jfs2 when there are a large amount of files. I was studying it for a bit but got caught up in other things. Hopefully I can have something in a few weeks.
August 27th, 2008 at 6:34 am
Very useful info. Any chance of getting the procedure for recovering entire directories posted?