AIX - JFS Recovering a deleted file ( undelete )

This is a document I wrote a while back for work that I thought I would release in hopes that some people out there would find it useful.

Preferably, you have a backup of the filesystem that you can use. If not, the filesystem you are about to try to recover a file on must meet these requirements:

  • No new files have been created on the filesystem.
  • No files have been extended.
  • The filesystem is able to be unmounted.
  • It is a JFS filesystem, not JFS2

If so, then please, drink a few more beers and continue, but before you do…

BACKUP THE CURRENT FILESYSTEM!

Also, note that if you are dealing with a directory that has been deleted and would like to recover both the directory and the files under that directory, you should try Recovering a Deleted Directory ( a document I have yet to post ). It follows many of the same steps, but has some very important differences. Do not try to use this procedure to recover deleted directories and the files that were contained within them. You will mess up.

Before we begin, I need to note a few things. I take no responsibility if this screws up your system. Use this at your own risk. Also, the example presented here is an actual record of me recovering a deleted file; these are not made-up numbers. Finally, this only works on jfs filesystems, not jfs2. The jfs2 fsdb is much different and I haven’t had a chance to play with it to determine the proper way of doing this.

Now that I’ve said that, we can begin. We’ll use an example directory with some example files. Our directory is called /test and our filesystem is testlv, otherwise known as /dev/testlv. In our example, our Junior System Admin, Myron, has accidentally deleted a perl script called testfile.pl and needs to recover it.

Note: If you are performing this operation on a filesystem while in maintenance mode, do NOT use option 1 when asked how to mount the filesystems. ALWAYS use option 2, which specifies to start a shell before mounting the filesystems. Otherwise, the system will force an fsck -y on the filesystem and delete your files.

Step 1.

First, run this command:

ls -id /test

Output:

[test:/]# ls -id /test
    2 /test/

This informs us that the inode for the directory /test is 2. Record this for future use.

Step 2.

Unmount /test

umount /test

Output: None

We must unmount the directory. We don’t want anyone to try and use it while we are attempting to restore the file.

Step 3

Now we’ll start up the filesystem debugger.

fsdb /dev/testlv

Output:

[test:/]# fsdb /dev/testlv

File System:                           /dev/testlv

File System Size:                         193200128  (512 byte blocks)
Disk Map Size:                                 1660  (4K blocks)
Inode Map Size:                                 831  (4K blocks)
Fragment Size:                                 4096  (bytes)
Allocation Group Size:                        16384  (fragments)
Inodes per Allocation Group:                   8192
Total Inodes:                              12075008
Total Fragments:                           24150016

This starts the filesystem debugger on our testlv filesystem.

Step 4

Now we look at our inode number.

2i

Output:

2i
i#:      2  md: d-g-rwxr-xr-x  ln:    4  uid:    3  gid:    3
szh:        0  szl:      512  (actual size:      512)
a0: 0x25d       a1: 0x00        a2: 0x00        a3: 0x00
a4: 0x00        a5: 0x00        a6: 0x00        a7: 0x00
at: Mon Jan 10 11:19:17 2005
mt: Mon Jan 10 11:11:26 2005
ct: Mon Jan 10 11:11:26 2005

The number in front of the ‘i’ ( 2 in our case ) is the inode number we recorded in step #1. This will display the inode information for the directory. The field a0 contains the block number of the directory. The following steps assume only field a0 is used. If a value appears in a1, etc, it may be necessary to repeat steps #5 and #6 for each block until the file to be recovered is found.

Step 5

Move to the block

a0b

Output:

a0b
0x000025d000  :  0x00000000 (0)

This moves to the block pointed to by field “a0” of this inode.
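If you are wondering how a0 ( 0x25d ) relates to the address fsdb prints ( 0x000025d000 ), here is a minimal sketch in Python. It assumes the address is simply the block number multiplied by the fragment size from step 3 ( 4096 bytes ), which is consistent with the output above but is my reading of it, not something taken from the fsdb documentation. The function names are made up for illustration.

```python
# Sketch: turn a block number from an inode's a0 field into a byte address.
# Assumption: address = block number * fragment size (4096 in our example).
FRAGMENT_SIZE = 4096  # "Fragment Size" reported by fsdb in step 3


def block_to_address(block_number: int, fragment_size: int = FRAGMENT_SIZE) -> int:
    """Return the byte offset of the first byte of the given block."""
    return block_number * fragment_size


def format_address(address: int) -> str:
    """Format an address as 0x followed by 10 hex digits, like the fsdb output."""
    return f"0x{address:010x}"


if __name__ == "__main__":
    # a0 of the /test directory inode in our example
    print(format_address(block_to_address(0x25D)))
```

This is only a sanity check on the numbers; fsdb does the conversion for you when you type a0b.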

Step 6

Now we need to print out some data.

p256c

Output:

p256c

0x000025d000:   \0 \0 \0 \? \0 \? \0 \? .  \0 \0 \0 \0 \0 \0 \?
0x000025d010:   \0 \? \0 \? .  .  \0 \0 \0 \0 \0 \? \0 \? \0 \n
0x000025d020:   l  o  s  t  +  f  o  u  n  d  \0 \0 \0 \0 \0 \?
0x000025d030:   \0 $  \0 \? m  e  m  _  r  e  p  o  r  t  _  2
0x000025d040:   0  0  4  1  1  0  1  .  d  m  p  .  g  z  \0 \0
0x000025d050:   \0 \0 \0 \? \0 \s \0 \? o  r  a  s  c  r  a  t
0x000025d060:   c  h  .  c  p  i  o  .  g  z  \0 \0 \0 \0 \0 \?
0x000025d070:   \0 (  \0 \s u  s  e  r  _  a  c  t  i  v  i  t
0x000025d080:   y  _  2  0  0  4  1  1  0  1  .  d  m  p  .  g
0x000025d090:   z  \0 \0 \0 \0 \0 \0 \? \0 ,  \0 !  u  s  e  r
0x000025d0a0:   _  a  c  t  i  v  i  t  y  _  d  e  t  _  2  0
0x000025d0b0:   0  4  1  1  0  1  .  d  m  p  .  g  z  \0 \0 \0
0x000025d0c0:   \0 \? `  \0 \? @  \0 \? E  C  R  1  X  \0 \0 \0
0x000025d0d0:   \0 \0 \0 \? \? 0  \0 \? t  e  s  t  f  i  l  e
0x000025d0e0:   .  p  l  \0 \?    \0 \a t  e  s  t  d  i  r  \0
0x000025d0f0:   j  d  u  c  k  o  .  t  x  t  \0 \0 \0 \0 \0 \?

The command p256c stands for ‘print 256 bytes in character mode’. You could type ‘p128c’ and it would print 128 bytes in character mode and so on. The leftmost column is the address of the first character in that row. The important thing in this output is to find which line the file to be recovered is on. Our file ( testfile.pl ) is located on line 0x000025d0d0. Next, we have to find the address of the first character of our filename. To do this, start at the row address and count across in hexadecimal, one per character, until you reach the first character of the filename. In our example, the ‘t’ of testfile.pl is at address 0x000025d0d8. Record this address.

If you cannot find your filename here, issue the command again. It will print the next 256 bytes in character mode. Do this until you find your filename.

Here’s a layout to help you in figuring out how we got the address:

Address:        0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0x000025d0d0:   \0 \0 \0 \? \? 0  \0 \? t  e  s  t  f  i  l  e
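The counting above can also be sketched in Python. This is just an illustration of the arithmetic ( row address plus column number ), with the row pasted in as text. The helper names are hypothetical, and splitting on whitespace assumes the row contains no literal space characters, which is true for our row but not for every row in the dump.

```python
# Sketch: find the address of the first character of a filename in a p256c row.
ROW_ADDRESS = 0x25D0D0
ROW_TEXT = r"\0 \0 \0 \? \? 0  \0 \? t  e  s  t  f  i  l  e"


def first_char_column(row_tokens, name_prefix):
    """Return the 0-based column where name_prefix starts in the row."""
    wanted = list(name_prefix)
    for column in range(len(row_tokens) - len(wanted) + 1):
        if row_tokens[column:column + len(wanted)] == wanted:
            return column
    raise ValueError("filename not found in this row")


def char_address(row_address, column):
    """Each column of a p256c row is one byte, so just add the column number."""
    if not 0 <= column <= 15:
        raise ValueError("a p256c row has columns 0 through F only")
    return row_address + column


if __name__ == "__main__":
    tokens = ROW_TEXT.split()
    column = first_char_column(tokens, "testfile")
    print(f"column {column:X}, address 0x{char_address(ROW_ADDRESS, column):010x}")
```

Only the part of the name that fits on the row is searched for here ( ‘testfile’ ), since the rest of testfile.pl wraps onto the next line.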

Step 7

Reset our position.

a0b

Output:

a0b
0x000025d000  :  0x00000000 (0)

This resets our position back to the beginning of the a0 block. This is necessary whenever you want to reprint out the byte data. Remember, however, that if you had to use the ‘p’ command many times to find your filename, you will probably have to use it many times each time you reset back to the beginning.

Step 8

Print our data in decimal

p256e

Output:

p256e

0x000025d000:         0       2      12       1   11776       0       0       2
0x000025d010:        12       2   11822       0       0      16      20      10
0x000025d020:     27759   29556   11110   28533   28260       0       0      17
0x000025d030:        36      26   28005   27999   29285   28783   29300   24370
0x000025d040:     12336   13361   12592   12590   25709   28718   26490       0
0x000025d050:         0      18      28      18   28530   24947   25458   24948
0x000025d060:     25448   11875   28777   28462   26490       0       0      19
0x000025d070:        40      29   30067   25970   24417   25460   26998   26996
0x000025d080:     31071   12848   12340   12593   12337   11876   28016   11879
0x000025d090:     31232       0       0      20      44      33   30067   25970
0x000025d0a0:     24417   25460   26998   26996   31071   25701   29791   12848
0x000025d0b0:     12340   12593   12337   11876   28016   11879   31232       0
0x000025d0c0:        18   24576     320       5   17731   21041   22528       0
0x000025d0d0:         0      21     304      11   29797   29556   26217   27749
0x000025d0e0:     11888   27648     288       7   29797   29556   25705   29184
0x000025d0f0:     27236   30051   27503   11892   30836       0       0      23
0x000025d100:       260      16   27233   28005   29549   24947   29537   29281
0x000025d110:     11892   30836       0       0       0       0       0       0
0x000025d120:         0       0       0       0       0       0       0       0
0x000025d130:         0       0       0       0       0       0       0       0
0x000025d140:         0       0       0       0       0       0       0       0

The command ‘p256e’ stands for ‘print 256 bytes in decimal word format’. Each column is one 2-byte word shown in decimal. This output can be helpful and confusing at the same time. First, find the beginning address that our file name is on. In our example, this was 0x000025d0d0. The line in decimal format reads:

0x000025d0d0:         0      21     304      11   29797   29556   26217   27749

For each file, assume the following:

   {ADDRESS}:  x    x    x    x    x    x    x    x    x
               |    |    |    |    |---- filename -----|
     inode # --+----+    |    |
                         |    +-- filename length
         record LENGTH --+

Note that the inode # may begin on any part of the line. The reason we print the data in decimal format is to help us determine where in the line the inode number is. There are several ways to help you do this, here are some:

  • Count the number of characters in your filename, then try and find that number in our address line. ( eg: There are 11 characters in the filename ‘testfile.pl’. ) You can see on our line there is a matching number 11.
  • Recount to the address 0x000025d0d8, assuming each column covers two addresses. The first column is 0 and 1. The second column is 2 and 3, then 4 and 5, etc. When you reach the column that matches your address, go back one column. The number in this column should match up with your filename length. Unless, of course, your filename is over 255 characters.
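The second method boils down to a little arithmetic, sketched below in Python. It assumes each p256e row covers 16 bytes in eight 2-byte columns, as in our output, and that the filename starts on an even address. The function names are made up for this example. A result of zero or less means that field sits on the previous row.

```python
# Sketch: map the address of the filename's first character to p256e columns.
# Assumption: 16 bytes per row, shown as eight 2-byte columns numbered 1..8.
def word_column(address: int) -> int:
    """Return the 1-based p256e column that holds the byte at this address."""
    return (address % 16) // 2 + 1


def entry_columns(name_address: int) -> dict:
    """Columns of the fields that sit just before the filename."""
    name_column = word_column(name_address)
    return {
        "inode_first": name_column - 4,
        "inode_second": name_column - 3,
        "record_length": name_column - 2,
        "name_length": name_column - 1,
        "name_start": name_column,
    }


if __name__ == "__main__":
    for field, column in entry_columns(0x25D0D8).items():
        print(f"{field}: column {column}")
```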

Once you are sure you have the correct column for your filename length, you are going to count back three more columns. This should put you at the first column of the inode number. We’ll use our example decimal line to explain this more:

0x000025d0d0:         0      21     304      11   29797   29556   26217   27749

Like we mentioned before, testfile.pl is 11 characters. We find a matching number 11 in the 4th column. That means that the column with ‘304’ is our record length field and the 0 and 21 columns make up our inode. Now that we know which columns our inode is in ( columns 1 and 2 ), we must translate this number into our real inode number.
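If you want to double check that you picked the right columns, you can turn the decimal words after the length column back into characters and see whether they spell your filename. Here is a rough Python sketch of that check, using the words from our two rows. It assumes the words are big-endian ( high byte first ), which matches our dump since 29797 is 0x7465, or ‘te’. The function name is hypothetical.

```python
import struct

# Words from rows 0x000025d0d0 and 0x000025d0e0 of the p256e output,
# starting at the first inode column of the testfile.pl entry.
ENTRY_WORDS = [0, 21, 304, 11, 29797, 29556, 26217, 27749, 11888, 27648]


def decode_entry(words):
    """Split a run of 2-byte words into inode columns, lengths, and name.

    Assumed layout, per the diagram above: two inode columns, one record
    length column, one filename length column, then the filename itself.
    """
    inode_columns = (words[0], words[1])
    record_length = words[2]
    name_length = words[3]
    # Pack the remaining words high byte first to get the raw name bytes.
    raw_name = struct.pack(f">{len(words) - 4}H", *words[4:])
    name = raw_name[:name_length].decode("ascii")
    return inode_columns, record_length, name_length, name


if __name__ == "__main__":
    print(decode_entry(ENTRY_WORDS))
```

If the decoded name does not match the file you are after, you are most likely one column off.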

Step 9

Calculate our inode.

Thanks to Arthur Dent for this update
In some cases, our inode number may be a lower number and may not need any special treatment, however, for larger inodes, the information will span columns.

Here’s an example. The directory ECR1X is on an address above ours. Its inode number, like ours, is in columns 1 and 2. However, if you compare the decimal lines, you can immediately see the difference.

ECR1X Decimal:
0x000025d0c0:      18  24576
testfile.pl Decimal:
0x000025d0d0:       0     21

To calculate the inode if it spans 2 columns, use the following formula:

firstcolumn * 65536 + secondcolumn = inode

In our case, we are not using column 1, so all we need is column 2 from the previous step. It was 21, so now we know the inode number of the missing file. We’re close to recovery!
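The formula is easy to check in Python. This is only a sketch of the arithmetic above, with a made-up function name. It treats the two columns as the high and low 16 bits of the inode number.

```python
# Sketch: combine the two decimal columns from p256e into one inode number.
def inode_from_columns(first_column: int, second_column: int) -> int:
    """Return first_column * 65536 + second_column."""
    for value in (first_column, second_column):
        if not 0 <= value <= 0xFFFF:
            raise ValueError("each column must fit in 16 bits (0..65535)")
    return first_column * 65536 + second_column


if __name__ == "__main__":
    print("testfile.pl:", inode_from_columns(0, 21))
    print("ECR1X:", inode_from_columns(18, 24576))
```

For testfile.pl the first column is 0, so the result is just the second column. For ECR1X both columns contribute.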

Step 10

We go to our new inode number

21i

Output:

21i
i#:     21  md: f---rw-r--r--  ln:    0  uid:    0  gid:    3
szh:        0  szl:       45  (actual size:       45)
a0: 0xeff       a1: 0x00        a2: 0x00        a3: 0x00
a4: 0x00        a5: 0x00        a6: 0x00        a7: 0x00
at: Mon Jan 10 14:16:40 2005
mt: Mon Jan 10 14:16:48 2005
ct: Mon Jan 10 14:16:53 2005

From this output, you can see that we have a file.

Step 11

21i.ln=1

Output:

21i.ln=1
0x0000020a88  :  0x00000001 (1)

This sets the link count of the file back to 1. You can verify this by reissuing the command from step #10 and noticing that the ‘ln’ field has incremented.

21i
i#:     21  md: f---rw-r--r--  ln:    1  uid:    0  gid:    3
szh:        0  szl:       45  (actual size:       45)
a0: 0xeff       a1: 0x00        a2: 0x00        a3: 0x00
a4: 0x00        a5: 0x00        a6: 0x00        a7: 0x00
at: Mon Jan 10 14:16:40 2005
mt: Mon Jan 10 14:16:48 2005
ct: Mon Jan 10 14:16:53 2005

We have now told the filesystem that the link count for inode 21 should be 1. This means that there should be a filename pointing at this inode. This basically reverses what the OS actually does when deleting files. The OS doesn’t actually erase the file data; instead, it unlinks the filename from its inode number, effectively preventing you from seeing the data.
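You can watch the same unlink behaviour on any POSIX-style system with a few lines of Python. This is not AIX or JFS specific and it does not recover anything. It is only a sketch, on a scratch temporary file, showing that removing the name drops the link count while the data stays readable through a file descriptor that is still open. The function name is made up.

```python
import os
import tempfile


def unlink_demo(data: bytes):
    """Create a scratch file, remove its name, and read it back anyway."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, data)
        links_before = os.fstat(fd).st_nlink
        os.unlink(path)  # removes the filename, not the data
        links_after = os.fstat(fd).st_nlink
        os.lseek(fd, 0, os.SEEK_SET)
        surviving = os.read(fd, len(data))
    finally:
        os.close(fd)  # the data is released once the last reference is gone
    return links_before, links_after, surviving


if __name__ == "__main__":
    before, after, surviving = unlink_demo(b'print "this is a test\\n";\n')
    print("links before:", before, "links after:", after)
    print("data still readable:", surviving)
```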

Step 12

Quit.

q

Output:

q
[test:/]#

This quits out of the fsdb.

Step 13

Fsck our volume

fsck /dev/testlv

Output:

[test:/]# fsck /dev/testlv

** Checking /dev/rtestlv (/test)
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
Unreferenced file  I=21  owner=root mode=100644
size=45 mtime=Jan 10 14:16 2005 ; RECONNECT? y
** Phase 5 - Check Inode Map
Bad Inode Map; SALVAGE? y
** Phase 5b - Salvage Inode Map
** Phase 6 - Check Block Map
Bad Block Map; SALVAGE? y
** Phase 6b - Salvage Block Map
18 files 21893872 blocks 171306256 free
***** Filesystem was modified *****

This does a filesystem check on /dev/testlv. As you can see, it finds an inode claiming it is linked to, but no file that links to it. We answer ‘y’ to tell it to reconnect the inode to a filename, effectively giving us our file back!

Step 14

Remount our directory.

mount /test

Output: None

We must remount our filesystem to get back at our file.

Step 15

Go into lost and found. It’s where all lost little kiddies go. Duh.

cd /test/lost+found

Output: None

Our file is now located in lost+found. If you do an ‘ls’ in this directory, you will see something like the following:

[test:/test/lost+found]# ls -l
total 8
-rw-r--r--   1 root     sys              45 Jan 10 14:16 21

And if we cat the file 21, we get the following:

[test:/test/lost+found]# cat 21
#!/usr/bin/perl

print "this is a test\n";

Ta-da! It’s Myron’s missing perl script!

As a final aside, I will say that there may be different and much better ways of recovering files on AIX, however, this is the way I constructed from notes I found on various mailing lists and a few days of fooling around with it. So if you see some mistakes in this document or have some suggestions for better ways of doing this, please, let me know! I will happily update this document with better information as it is provided.

I hope this helps some of you who have to deal with certain people who accidentally delete files on your systems. Nothing beats a good backup but when you don’t have one of those, this can always be used as a fallback.

5 Responses to “AIX - JFS Recovering a deleted file ( undelete )”

  1. sys Says:

    Nice! When the first column for the inode number is not 0, how do you get the complete inode number?

  2. the second phase » Blog Archive » JFS undelete update Says:

    [...] will be posting an update to the JFS undelete document in a few days with the solutions to the issues of large inodes and some other comments I got. [...]

  3. Arthur Dent Says:

    To calculate the inode number from the decimal screen, instead of having to go to the hex screen, multiply the first column by 65536, then add the second column. Here’s an example below.

    Decimal:
    0x000025d0c0: 18 24576
    Hex:
    0x000025d0c0: 0012 6000

    18 * 65536 + 24576 = 1204224

    0x00126000 = 1204224

    As you can see, the results are identical.

    Have you made any progress in doing this same procedure using the JFS2 filesystem? The fsdb commands are different with a JFS2 filesystem, and I haven’t yet figured out how to find the inode or block number of my deleted file. Any help would be appreciated.

    Thanks!

  4. steve Says:

    Hey Arthur,

    Thanks for the information, I’ll modify the document with your notes.

    I haven’t had any luck yet with the jfs2 when there are a large amount of files. I was studying it for a bit but got caught up in other things. Hopefully I can have something in a few weeks.

  5. Dean Gabriel Says:

    Very useful info. Any chance of getting the procedure for recovering entire directories posted?
