Hi,

I'd still like to see bsdjail/vserver/zone functionality in Linux. It seems to me the following pieces are needed:

    filesystem namespaces (mostly there, probably want shared subtrees)
    read-only bind mounts (not there yet)
    task separation (i.e. ptrace, etc.: can be done by selinux)
    task-hiding ability (see attached patches)
    network jails (see below)
    hostname/domainname per jail? (is this necessary?)
    resource management - can be done by selinux, ckrm, etc.
    filesystem controls - can be done by selinux, using a simple policy (attached), provided jails get their own (loopback is fine) filesystem; else read-only bind mounts would also help
    more?

We would also want some intuitive script(s) to use all of the above.

Attached are the old task_lookup patch which was used by the bsdjail lsm, a patch for selinux to utilize this hook, and a sample jail policy and .fc, which presumably would eventually be changed to a jail_domain() policy macro. Does this seem at all useful by itself, or should it wait until it is actually needed for a complete Linux jails implementation?

(Note that access_vectors.diff patches /etc/selinux/targeted/src/policy/flask/access_vectors, jail2.fc can go in /etc/selinux/targeted/src/policy/file_contexts/misc/, and jail2.te can go into /etc/selinux/targeted/src/policy/domains/misc/)

It seems to me the greatest challenge is network jails. I don't think this can be done right with selinux. I believe you can restrict a domain's access to remote addresses by IP, but not to local addresses during bind. Am I wrong in assuming jails would be useless without this? (I suppose they could at least be useful for sandboxes of some sort.) Does anyone have ideas on a good way to implement these?

Some time ago I sent out an RFC for network namespaces, which allowed a process to essentially give up its access to a network device. The patch only allowed a process to give up access to real network devices, not ip aliases (i.e. eth0:0). That makes it much less useful for admins who want to provide multiple jails.

The linux-vserver team is working on virtual networking which (IIUC) creates a virtual network device which is then associated with a virtual address, a real network device, and a jail. This appears to be a way to make the simple version of network namespaces I describe in the paragraph above more useful, since we would not need to deal with ip aliases.

Is there any interest in seeing the virtual network devices and network namespaces pushed upstream? Read-only bind mounts? The attached task-lookup patches?

thanks,
-serge
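
To make the bind-time restriction on local addresses more concrete, here is a minimal userspace sketch of the check a network jail would need. It is only an illustration, written in Python for brevity; a real implementation would sit in the kernel's bind path. The Jail class, the check_bind() function, and the rule of one local address per jail are all assumptions made for this example and are not taken from the attached patches.

```python
import ipaddress


class Jail:
    """Hypothetical jail record: a name plus the one local address it may use."""

    def __init__(self, name, address):
        self.name = name
        self.address = ipaddress.ip_address(address)


def check_bind(jail, requested):
    """Return the local address a jailed task may bind to.

    Assumed policy for this sketch:
      - a wildcard bind (0.0.0.0 or ::) is narrowed to the jail's own address
      - a bind to the jail's own address is allowed unchanged
      - anything else is refused with PermissionError
    """
    addr = ipaddress.ip_address(requested)

    # Refuse cross-family requests (e.g. an IPv6 bind in an IPv4-only jail).
    if addr.version != jail.address.version:
        raise PermissionError(f"{jail.name}: address family not allowed")

    # Wildcard bind: confine it to the jail's address instead of all interfaces.
    if addr.is_unspecified:
        return jail.address

    if addr == jail.address:
        return addr

    raise PermissionError(f"{jail.name}: may not bind to {addr}")


if __name__ == "__main__":
    web = Jail("web", "10.0.0.2")
    print(check_bind(web, "0.0.0.0"))
    print(check_bind(web, "10.0.0.2"))
    try:
        check_bind(web, "10.0.0.3")
    except PermissionError as err:
        print("denied:", err)
```

The point of narrowing the wildcard case rather than refusing it is that unmodified daemons which bind to all interfaces would keep working inside a jail, while still being unable to listen on another jail's address.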