Re: TCT / tctutils for HP-UX 11.00 + some insight into unrm'ing large files

From: Knut Eckstein (knutat_private)
Date: Sun Aug 04 2002 - 13:36:24 PDT


    Hello,
    
    Wietse Venema wrote:
    > You need to be aware that the file system allocates one indirect
    > data block every 16 Mbytes or so (or whatever the number of block
    > addresses in an indirect block).  This could cause additional
    > "jumps" in the file allocation.
    
    Yes, correct, I did not mention this in my notes page, although
    the effect can be seen in the sample checkbig messages given there,
    because these blocks are also part of the unrm output.
    
    The "Zero block found ..." lines each denote a zeroed 1K block.
    Together, the 8 consecutive lines represent a former 8K indirect block.
    Altogether, this file needed 130 indirect/double indirect blocks.
    When verifying the operation of unrm last week, I double-checked
    that the correct number of former indirect blocks can be found in the unrm
    output. But I observed ca. 1000 "jumps" more than expected, so this
    explains only a part of the "additional jumps".
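    
    The figure of 130 can be cross-checked with a little arithmetic. What
    follows is only a sketch under assumptions not stated in this thread:
    4-byte block addresses, 12 direct block pointers in the inode (the
    4.4BSD layout described above), and that the 2113536 argument passed
    to bigfile is a size in KB. The function name is made up for illustration.

```python
def indirect_block_count(file_bytes, block_size=8192, addr_size=4, n_direct=12):
    """Count the single- and double-indirect blocks an FFS-style inode needs.

    Assumed layout (not confirmed for HP-UX HFS in this thread):
    n_direct direct pointers, then one single-indirect block, then one
    double-indirect block pointing at further indirect blocks.
    """
    addrs_per_block = block_size // addr_size          # 2048 for 8K / 4 bytes
    data_blocks = -(-file_bytes // block_size)         # integer ceiling
    remaining = max(0, data_blocks - n_direct)
    total = 0

    # Single-indirect block: covers the next addrs_per_block data blocks.
    if remaining > 0:
        total += 1
        remaining -= min(remaining, addrs_per_block)

    # Double-indirect block plus one indirect block per addrs_per_block
    # data blocks that are still unaccounted for.
    if remaining > 0:
        covered = min(remaining, addrs_per_block ** 2)
        total += 1 + -(-covered // addrs_per_block)
        remaining -= covered

    if remaining > 0:
        raise ValueError("file would need a triple-indirect block (not handled)")
    return total


# Data addressed by one indirect block: 2048 * 8K = 16 MB.
bytes_per_indirect = (8192 // 4) * 8192

# Assumption: the bigfile argument 2113536 is in KB.
print(indirect_block_count(2113536 * 1024))   # prints 130
```

    Under these assumptions the result is 1 single-indirect block, 1
    double-indirect block, and 128 indirect blocks below it, i.e. 130 in total.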
    
    > Regarding the small-file limit: when I did some measurements with
    > smaller files on 4.4BSD I found that the file system behaved as
    > expected: 12 direct data blocks, one indirect block with data block
    > addresses, a contiguous run of data blocks, one doubly-indirect
    > block, one indirect block, another contiguous run of data blocks,
    > and so on.
    
    How big was your cylinder group (cg) size in relation to the file you tested
    with? In my case, with a cg size of 10 MB, the 16 MB of data blocks that
    "follow" one indirect block wouldn't even fit in one cg. Maybe my cg size
    is way too small; I just used the default chosen by newfs.
    Looking at your observations, it seems to me that "everything took place"
    within one cg. So if your cg-change margin was at the default of 25%, that
    would mean that your file size was less than 25% of the cg size. As McKusick
    et al. point out in "The Design and Implementation of the 4.4BSD Operating
    System", when moving from one cg to another because the current cg has
    reached its 25% fill rate, FFS chooses a cg that has more free disk blocks
    than the average cg. This is where statistics come into play, so one cannot
    truly predict in which cg allocation will continue. That's why I was
    expecting a "jump" every 2.5 MB (25% of my 10 MB cg).
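    
    To put a number on that expectation, here is a rough sketch. It assumes
    the 25% margin works exactly as described above, that the 2113536
    argument passed to bigfile is a size in KB, and it ignores indirect
    blocks and any other files on the partition. The function name is
    invented for this example.

```python
def expected_cg_switches(file_bytes, cg_bytes, switch_fraction=0.25):
    """Estimate how often allocation moves on to another cylinder group.

    Returns (bytes written per cg before a switch, number of switches).
    Simplified model: a switch happens each time the file has consumed
    switch_fraction of one cylinder group.
    """
    bytes_per_cg = cg_bytes * switch_fraction
    return bytes_per_cg, file_bytes / bytes_per_cg


MB = 1024 * 1024
per_cg, switches = expected_cg_switches(2113536 * 1024, 10 * MB)
print(per_cg / MB)   # 2.5 (MB written before each "jump")
print(switches)      # roughly 826 "jumps" for the whole file
```

    So with a 10 MB cg, this simple model predicts on the order of 800
    cg-related "jumps" for the test file.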
    
    > Knut Eckstein:
    > 
    >>Like with 10.20, I tested unrm on 11.00 on a file > 2GB. While waiting
    >>almost three hours during the unrm run, I thought I should analyze the
    >>
    > 
    > That is incredibly slow. unrm runs at the same speed as dd.
    
    The machine is a 1995 99 MHz PA-RISC, and one 9 GB disk is connected to
    the fast narrow SCSI bus. Both the unrm "source" and "target" partitions
    reside on the same physical drive, so I assume there is a lot of
    head movement involved... Also, the high degree of fragmentation that
    I observed in the "source" file might slow things down a bit.
    
    # time bigfile 2113536 > /big/bigfile
    
    real    12:21.4
    user     9:29.1
    sys      2:11.0
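    
    For comparison, the timings can be turned into rough throughput figures.
    This is only a back-of-the-envelope sketch: it assumes the bigfile
    argument is a size in KB and takes "almost three hours" as an upper
    bound of 3 hours for the unrm run. The helper names are made up.

```python
def to_seconds(mmss):
    """Convert a time(1)-style 'MM:SS.s' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + float(seconds)


def throughput_mb_per_s(size_kb, seconds):
    """Average throughput in MB/s (1 MB = 1024 KB)."""
    return size_kb / 1024 / seconds


SIZE_KB = 2113536                      # assumption: bigfile argument is in KB

write_secs = to_seconds("12:21.4")     # "real" time of the bigfile run
cpu_secs = to_seconds("9:29.1") + to_seconds("2:11.0")   # user + sys

write_rate = throughput_mb_per_s(SIZE_KB, write_secs)
unrm_rate = throughput_mb_per_s(SIZE_KB, 3 * 3600)       # 3 h upper bound

print(round(write_rate, 2))            # about 2.78 MB/s for creating the file
print(round(unrm_rate, 2))             # about 0.19 MB/s for the unrm run
print(round(cpu_secs / write_secs, 2)) # share of the write run spent on CPU
```

    Under these assumptions, creating the file was mostly CPU-bound, while
    the unrm run averaged more than ten times less throughput.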
    
    
    
    
    



    This archive was generated by hypermail 2b30 : Thu Aug 08 2002 - 03:14:59 PDT