Re: [PATCH] BSD Secure Levels LSM

From: Chris Wright (chrisw@private)
Date: Mon Nov 24 2003 - 18:06:32 PST

  • Next message: Michael A. Halcrow: "Re: [PATCH] BSD Secure Levels LSM"

    * Michael A. Halcrow (mahalcro@private) wrote:
    > Please take some time to look this over and provide feedback.  It is
    > meant to be nothing more than a lightweight security measure that an
    > administrator who is already familiar with BSD Secure Levels can just
    > pop into the kernel with reasonable expectations that he will get
    > equivalent functionality.
    
    Thanks for posting the module.  Some comments below...
    
    > +/**
    > + * Is this redundant with CAP_SYS_PTRACE?  --serge 
    
    Doesn't seem like it, since with CAP_SYS_PTRACE you don't know the tracee,
    (or child).
    
    > +static int seclvl_ptrace( struct task_struct* parent, 
    > +			  struct task_struct* child ) 
    > +{ 
    > +	if( seclvl >= 0 ) {
    > +		if( child->pid == 1 ) {
    > +			printk( KERN_WARNING "Attempt to ptrace the init "
    > +				"process dissallowed in secure level %d\n", 
    > +				seclvl );
    > +			return -EPERM;
    > +		}
    > +	}
    > +	return 0; 
    > +}
    > +
    > +/**
    > + * Capability checks for seclvl.  The majority of the policy
    > + * enforcement for seclvl takes place here.
    > + */
    > +static int seclvl_capable( struct task_struct* tsk, int cap )
    > +{
    > +	/* init can do anything it wants */
    > +	if( tsk->pid == 1 ) {
    > +		return 0;
    > +	}
    > +
    > +	switch( seclvl ) {
    > +	case 2:
    > +		/* fall through */
    > +	case 1:
    > +		if( cap == CAP_LINUX_IMMUTABLE ) {
    > +			printk( KERN_WARNING "Attempt to modify a file with "
    > +				"S_IMMUTABLE and/or S_APPEND file attributes "
    > +				"in secure level %d denied\n", seclvl );
    > +			return -EPERM;
    > +		} else if( cap==CAP_SYS_RAWIO ) { // Somewhat broad...
    > +			printk( KERN_WARNING "Attempt to perform raw I/O "
    > +				"while in secure level %d denied\n", seclvl );
    > +			return -EPERM;
    > +		} else if( cap == CAP_NET_ADMIN ) {
    > +			printk( KERN_WARNING "Attempt to perform network "
    > +				"administrative task while in secure level "
    > +				"%d denied\n", seclvl );
    > +			return -EPERM;
    > +		} else if( cap == CAP_SETUID ) {
    > +			printk( KERN_WARNING "Attempt to setuid while in "
    > +				"secure level %d denied\n", seclvl );
    > +			return -EPERM;      
    > +		}
    > +		break;
    > +	case 0:
    > +		/* fall through */
    > +	case -1:
    > +		/* fall through */
    > +	default: break;
    > +	}
    
    style nitpick:
    
    	case 0:
    	case -1:
    	default:
    		break;
    	}
    
    is preferred.
    
    > +	/* from dummy.c */
    > +	if( cap_is_fs_cap( cap ) ? tsk->fsuid == 0 : tsk->euid == 0 )
    > +		return 0; /* capability granted */
    > +	printk( KERN_WARNING "Capability denied\n" );
    > +	return -EPERM; /* capability denied */
    > +}
    > +
    > +/**
    > + * Module loading policy enforcement in seclvl >= 1
    > + */
    
    
    > +static int seclvl_module_load( struct module* mod )
    > +{
    > +	if( seclvl >= 1 ) {
    > +		printk( KERN_WARNING "Attempt to perform module load in "
    > +			"secure level %d denied: current->pid = [%d], "
    > +			"current->group_leader->pid = [%d]\n", seclvl,
    > +			current->pid, current->group_leader->pid );
    > +		return -EPERM;
    > +	}
    > +	return 0;
    > +}
    > +
    > +/**
    > + * Module deletion policy enforcement in seclvl >= 1
    > + */
    > +static int seclvl_module_delete( struct module* mod )
    > +{
    > +	if( seclvl >= 1 ) {
    > +		printk( KERN_WARNING "Attempt to unload a module in secure "
    > +			"level %d denied: current->pid = [%d], "
    > +			"current->group_leader->pid = [%d]\n", seclvl, 
    > +			current->pid, current->group_leader->pid );
    > +		return -EPERM;
    > +	}
    > +	return 0;
    > +}
    
    AFAICT, these don't actually justify the new hooks.  The CAP_SYS_MODULE
    capable() test could be added above to the seclvl_capable() switch, and
    it would achieve the seclvl protection from module load/unload, right?
    
    > +/**
    > + * Disallow reversing the clock in seclvl > 1
    > + */
    > +static int seclvl_settime( struct timespec* tv )
    > +{
    > +	struct timespec now;
    > +	if( seclvl > 1 ) {
    > +		now = current_kernel_time();
    > +		if( tv->tv_sec < now.tv_sec ||
    > +		    ( tv->tv_sec == now.tv_sec &&
    > +		      tv->tv_nsec < now.tv_nsec ) ) {
    > +			printk( KERN_WARNING "Attempt to decrement time in "
    > +				"secure level %d denied: current->pid = [%d],"
    > +				"current->group_leader->pid = [%d]\n", seclvl, 
    > +				current->pid, current->group_leader->pid );
    > +			return -EPERM;
    > +		} /* if attempt to decrement time */
    > +	} /* if seclvl > 1 */
    > +	return 0;
    
    When we first went through this, there were multiple ways to get to
    setting time, with no refactoring to a common helper (an issue in the
    kernel).  Since then, much of this code has changed, and I believe we
    may be in a better shape to catch all spots that set time.  For example,
    a quick look shows that stime(2) is not caught by the current BK LSM
    tree, and I don't think your patch is any different.
    
    > +/**
    > + * This function gives the administrator the option of breaking out of
    > + * Secure Levels by executing a file of his choice.  It is assumed
    > + * that, if an administrator chooses to use this feature, he is
    > + * employing additional security measures to prevent an attacker from
    > + * using this to circumvent the security policies of the system.
    > + */
    > +static int seclvl_bprm_set_security( struct linux_binprm* bprm )
    > +{
    > +	if( !magicdentry )
    > +		return 0;
    > +	if( bprm->file->f_dentry->d_inode == magicdentry->d_inode ) {
    > +		printk( KERN_WARNING "%s called: reset seclvl=0\n",
    > +			bprm->filename );
    > +		seclvl = 0;
    > +	}
    > +	return 0;
    > +}
    > +
    > +int is_mounted( struct inode* inode )
    > +{
    > +	struct list_head* super_blocks;
    > +	struct list_head* p;
    > +	super_blocks = &current->fs->rootmnt->mnt_sb->s_list;
    > +	list_for_each( p, super_blocks ) {
    > +		struct super_block* s = sb_entry( p );
    > +		if( s && s->s_bdev && ( s->s_bdev->bd_inode == inode ) )
    > +			return 1;
    > +	}
    > +	return 0;
    > +}
    
    This is racy.  How about something that uses the exported kernel api's?
    
    Like:
    
    struct super_block *sb;
    if (S_ISBLK(inode->i_mode) {
    	struct block_device *bdev = inode->i_bdev;
    	sb = get_super(bdev);
    	if (sb) {
    		drop_super(sb);
    		return 1;
    	}
    }
    return 0;
    
    > +/**
    > + * Security for writes to block devices is regulated by this seclvl
    > + * function.
    > + */
    > +static int seclvl_inode_permission( struct inode* inode, int mask,
    > +				   struct nameidata* nd)
    > +{
    > +	/*
    > +	 * For now deny all writes to block devices in seclvl 1 & 2.
    > +	 * We really want to only do that in 2.  In seclvl 1, we only
    > +	 * want to deny write to *mounted* block devices.
    > +	 */
    > +	if( current->pid == 1 )
    > +		return 0;
    > +	if( S_ISBLK( inode->i_mode ) && ( mask & MAY_WRITE ) ) {
    > +		switch( seclvl ) {
    > +		case 2:
    > +			printk( KERN_WARNING "Write to block device denied "
    > +				"in secure level [%d]\n", seclvl );
    > +			return -EPERM;
    > +		case 1:
    > +			if( is_mounted( inode ) ) {
    > +				printk( KERN_WARNING "Write to mounted block "
    > +					"device denied in secure level [%d]\n",
    > +					seclvl );
    > +				return -EPERM;
    > +			}
    > +			if( MAJOR( inode->i_rdev ) == 1 ) {
    > +				if( MINOR( inode->i_rdev ) == 1 ||
    > +				    MINOR( inode->i_rdev ) == 2 ) {
    
    I'm not sure I'd excpect this kind of magic number to persist.
    There are people who push for dynamic and random major minor numbers.
    Is the CAP_SYS_RAWIO check not sufficient?  Uhh, wait, does this check
    actually work?  These are char devices, not block devices.
    
    > +						"Attempt to write /dev/mem "
    > +						"or /dev/kmem in secure "
    > +						"level [%d]\n", seclvl );
    > +					return -EPERM;
    > +				} /* if device is /dev/mem or /dev/kmem */
    > +			} /* if memory device */
    > +		} /* switch seclvl */
    > +	} /* If attempting to write to block device */
    > +	return 0;
    > +}
    > +
    > +/**
    > + * The SUID and SGID bits cannot be set in seclvl >= 1
    > + */
    > +static int seclvl_inode_setattr( struct dentry* dentry, struct iattr* iattr )
    > +{
    > +	if( seclvl > 0 ) {
    > +		if( iattr->ia_valid & ATTR_MODE )
    > +			if( iattr->ia_mode & S_ISUID || 
    > +			    iattr->ia_mode & S_ISGID ) {
    
    This latter check could break the case where you set a file for
    mandatory locking.  Not sure if that's intentional.
    
    > +				printk( KERN_WARNING "Attempt to modify SUID "
    > +					"or SGID bit denied in seclvl [%d]\n",
    > +					seclvl );
    > +				return -EPERM;
    > +			}
    > +	}
    > +	return 0;
    > +}
    > +
    > +/**
    > + * Cannot unmount in secure level 2
    > + */
    > +static int seclvl_umount( struct vfsmount* mnt, int flags )
    > +{
    > +	if( current->pid == 1 ) {
    > +		return 0;
    > +	}
    > +	if( seclvl == 2 ) {
    > +		printk( KERN_WARNING "Attempt to unmount in secure level "
    > +			"%d\n", seclvl );
    > +		return -EPERM;
    > +	}
    > +	return 0;
    > +}
    > +
    > +/* I know dummy has this, but I'm not convinced we want this. */
    > +static void seclvl_task_reparent_to_init( struct task_struct* p )
    > +{ 
    > +	p->euid = p->fsuid = 0;
    > +	return; 
    > +}
    
    It's used in conjuction with capable() checks.  Which you still capture
    (ahead of the uid check), so I guess it could be ok.  It's the type of
    side effect that makes for interesting stacking ;-)
    
    > +int seclvl_proc_register( void )
    > +{
    > +	struct proc_dir_entry* entry;
    > +	entry = create_proc_entry( "seclvl", S_IFREG | S_IRUGO, &proc_root );
    
    This is really /proc abuse.  We are trying to move away from /proc.
    Perhaps we need to revisit sysfs infrastructure.
    
    > +	if( magicpath ) {
    > +		struct nameidata nd;
    > +		ret = path_lookup( magicpath, LOOKUP_FOLLOW, &nd );
    > +		if( ret ) {
    > +			/* 
    > +			 * Better not to load, or to potentially mess up
    > +			 * filesystem by loading?
    > +			 * Maybe in these days of jfs's, we should load?
    > +			 */
    > +			printk( KERN_ERR "Seclvl: Bad filename [%s] specified"
    > +				" for seclvl escape.\n"
    > +				"Seclvl NOT LOADED (%d).\n",
    > +				magicpath, ret);
    > +			goto out;
    > +		}
    > +		magicdentry = dget( nd.dentry );
    
    I'm not sure how safe this is, since it provides an unauthenticated
    backdoor, no?  At least just the dentry isn't subject to full all the various
    vfsmnt issues.  However, couldn't one simply bind a new file over this
    one, then if you can't umount, this thing is effectively gone...
    
    thanks,
    -chris
    -- 
    Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net
    



    This archive was generated by hypermail 2b30 : Mon Nov 24 2003 - 18:07:40 PST