Projects :: DAPd
Have you ever been in charge of a distributed server park, colocated at more than one IP supplier, and wanted to monitor all these machines from a single box, checking their system load, logged-in users and network throughput, and seeing at a glance which server is doing which tasks?
With a growing server park running any of NetBSD, OpenBSD, FreeBSD or even Linux, do you lose track of where your bandwidth goes? Are you tired of crashing snmpd's on your servers, and of IP colocation admins unwilling to run MRTG on your switch ports?
Are you an admin at an ISP, with a server you frequently log on to in order to read mail, do daily chores, et cetera? Do you know that your fellow admins have servers of their own too, and do you sometimes want to figure out where your admin friends are?
Have you ever run rwhod, only to find that its stability and portability are poor, that it has to at least start as root because it uses a privileged port, and that most of the time it also keeps its superuser privileges while listening to the hostile Internet for incoming (possibly forged) packets? Are you tired of seeing other people's boxes in your stats, just because they are in the same subnetwork as you?
Then the dap daemon is a software package for you!
What is DAPd?
It is a daemon that runs entirely as a non-privileged user (e.g. your own user account), sending regular updates of the server status to a specified set of peers in a cluster. You can specify any number of information elements, such as uptime, network device counters (packets/octets), logged-in users (via utmp), load averages and memory/swap usage. When the program starts, it binds a user-defined UDP socket on the local machine and listens for incoming updates from its peers, or it can optionally become mute, which keeps it safe from possible attacks from the Internet. All incoming packets are verified and then stored in a local state directory, for use by the client programs. On the client side, a number of programs read the state directory and extract the information needed. One utility reports uptime and load; another extends the 'who' functionality over the entire cluster, counts the number of logged-in users and locates a specific user by least-idle TTY. More are to come.
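The verify-store-read flow described above can be sketched roughly as follows. This is a minimal illustration in Python, not DAPd's actual code or wire format: the JSON payload, the field names in REQUIRED_FIELDS, the one-file-per-host layout and the function names are all assumptions made for the example.

```python
import json
import os
import tempfile

# Hypothetical update schema; DAPd's real packet format is not shown here.
REQUIRED_FIELDS = {"host", "cluster", "boot_time", "load"}


def verify_update(raw: bytes) -> dict:
    """Parse a received payload and reject anything malformed."""
    update = json.loads(raw.decode("utf-8"))
    if not isinstance(update, dict) or not REQUIRED_FIELDS.issubset(update):
        raise ValueError("malformed update")
    host = update["host"]
    # Refuse host names that could escape the state directory.
    if not isinstance(host, str) or not host or "/" in host or host.startswith("."):
        raise ValueError("bad host name")
    return update


def store_update(state_dir: str, update: dict) -> None:
    """Keep one file per host, replaced atomically so readers never see half a file."""
    fd, tmp_path = tempfile.mkstemp(dir=state_dir)
    with os.fdopen(fd, "w") as fh:
        json.dump(update, fh)
    os.replace(tmp_path, os.path.join(state_dir, update["host"] + ".json"))


def uptime_report(state_dir: str, now: float) -> list:
    """Client side: one (host, uptime_seconds, one_minute_load) row per stored host."""
    rows = []
    for name in sorted(os.listdir(state_dir)):
        if not name.endswith(".json"):
            continue  # skip temporary files still being written
        with open(os.path.join(state_dir, name)) as fh:
            update = json.load(fh)
        rows.append((update["host"], int(now - update["boot_time"]), update["load"][0]))
    return rows
```

Because the client only reads files from the state directory, it needs no network access and no special privileges, which matches the split between daemon and client programs described above.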
Why run DAPd?
If you have a set of servers for which you want to see information like logged-in users, processor load, memory usage and network usage, and you want to see all these statistics on every box (or just on one NOC box), you will want a sampling program on each server that sends its status around. If you work at an ISP, DAPd can keep you in closer contact with your fellow admins across the network, or even across the Internet. If you want to find me, and the machine I am logged on to with the least idle time, you can use the 'rwho' utility to locate me using the information my server has sent around the network about logged-in users. If you think Netsaint or Nagios are too elaborate for a simple task such as monitoring load or memory usage (and NRPE and check_by_ssh do not appeal to you), you can have dapd monitor these things for you and ring a bell if critical limits are exceeded. If you want runtime statistics about load, memory, CPU, disk usage and more, nicely graphed and presented in a web interface, you can use dapd as a basis for measuring things and pushing them into a round robin database.
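Locating a user by least idle time, as the 'rwho' utility is described as doing, boils down to a minimum search over the sessions reported by all peers. The sketch below assumes a hypothetical Session record and hypothetical function names; it does not reflect how DAPd actually stores utmp data.

```python
from typing import Iterable, NamedTuple, Optional


class Session(NamedTuple):
    """One logged-in session as reported by some peer (hypothetical layout)."""
    host: str
    user: str
    tty: str
    idle_seconds: int


def locate(user: str, sessions: Iterable[Session]) -> Optional[Session]:
    """Return the user's least-idle session in the cluster, or None if not logged in."""
    candidates = [s for s in sessions if s.user == user]
    return min(candidates, key=lambda s: s.idle_seconds, default=None)


def count_users(sessions: Iterable[Session]) -> int:
    """Count distinct logged-in users across the whole cluster."""
    return len({s.user for s in sessions})
```

For example, given sessions for the same user on two hosts, locate() returns the one with the smaller idle_seconds, which is most likely the terminal the person is actually sitting at.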
Features in DAPd
The current features include:
- Starts and runs as a non-privileged user (e.g. non-root)
- Opens a UDP socket to communicate to its peers
- Can optionally not open a listening socket for incoming packets
- Gathers system information without calling other binaries (all built in)
- Can run in foreground (DJB daemontools-compatible)
- Can run as background daemon (BSD/Linux)
- Can log to file, stdout or syslog
- Stable. Uses clean code
- Portable. Runs on (at least) Net/Free/OpenBSD and Linux
- IPv6 enabled. It's what I stand for
- Highly configurable through a C-style configuration parser
- RRDtool graphing add-on, creating round robin databases of your server park on the fly
- MD5-HMAC 'shared secret' challenge to make peers authenticate themselves
- ACL control on which clients may send updates, and which may request RRDs
- An incredibly stupid name
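A shared-secret challenge of the kind listed above generally works like this: the verifier sends a random challenge, and the peer proves it knows the secret by returning an HMAC over that challenge. The sketch below is a generic illustration using Python's standard hmac module with MD5; the function names and the 16-byte challenge size are assumptions, and DAPd's real handshake may differ.

```python
import hashlib
import hmac
import os


def make_challenge() -> bytes:
    """Verifier side: a fresh random challenge, so old answers cannot be replayed."""
    return os.urandom(16)


def answer_challenge(secret: bytes, challenge: bytes) -> bytes:
    """Peer side: prove knowledge of the shared secret without sending it."""
    return hmac.new(secret, challenge, hashlib.md5).digest()


def check_answer(secret: bytes, challenge: bytes, answer: bytes) -> bool:
    """Verifier side: recompute the HMAC and compare in constant time."""
    expected = hmac.new(secret, challenge, hashlib.md5).digest()
    return hmac.compare_digest(expected, answer)
```

The secret itself never crosses the network; only the challenge and the 16-byte digest do, so a forged packet from someone without the secret fails the check.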
The information elements it can gather include:
- identity and cluster
- boot time
- uname information (system type, architecture, revision)
- utmp (logged in users)
- network IO (pps/bps) by interface
- RAM use: used, free, shared, cached, buffers
- SWAP use: used, free
- disk IO (read/write ops/bytes) by device
- disk usage (KB/inodes in use/free/reserved) by partition
- open file descriptors, processes
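Several of these elements can be collected through library and system interfaces alone, without spawning other binaries. The sketch below shows the idea in Python using only the standard library; the gather() and parse_meminfo() names and the shape of the returned dictionary are assumptions for illustration and say nothing about how DAPd itself samples the system.

```python
import os
import platform
import shutil
import socket


def parse_meminfo(text: str) -> dict:
    """Parse Linux /proc/meminfo-style text into {field: number}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def gather() -> dict:
    """Collect a few status elements without calling any external program."""
    uname = platform.uname()
    disk = shutil.disk_usage("/")
    sample = {
        "identity": socket.gethostname(),
        "uname": {
            "system": uname.system,
            "machine": uname.machine,
            "release": uname.release,
        },
        "disk_root_kb": {"used": disk.used // 1024, "free": disk.free // 1024},
    }
    try:
        sample["load"] = list(os.getloadavg())
    except (AttributeError, OSError):
        sample["load"] = None  # load averages are not available on every platform
    try:
        with open("/proc/meminfo") as fh:  # Linux only; the BSDs need another source
            sample["memory_kb"] = parse_meminfo(fh.read())
    except OSError:
        sample["memory_kb"] = None
    return sample
```

Elements that have no portable interface, such as memory statistics, are where per-platform code becomes necessary; the sketch simply reports None when a source is missing.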