MP451: Sam Rose
Spring 2012, 1 credit
My involvement with COSI this school year consists of experimenting with, developing, and implementing SMART and RAID monitoring tools for use with Nagios. I am also involved in lab maintenance, including work on several servers and lab builds.
Much of this work was performed in the fall semester, when I was not enrolled in MP451 for credit. That work, together with additional work this semester, counts toward credit this semester with the permission of Prof. Matthews.
Nagios Disk Monitoring
The goal of this project is to develop a system which can monitor the health (SMART data) of hard drives in various networked servers, as well as the status of md RAID arrays. This system will allow Nagios to send alerts if hard drives report imminent failure or RAID arrays drop a drive.
Much of this framework was already available as open-source Nagios plugins. The plugins needed modification so that they could run on machines other than the Nagios server itself, which allows any machine running SNMP to be monitored.
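As a rough illustration of the remote-check idea, the sketch below shows how text returned from a remote command could be mapped back to a Nagios status on the monitoring server. It is not the modified plugin itself: the function name `status_from_output` and the assumption that the remote output carries a status word (OK, WARNING, CRITICAL, UNKNOWN) on its first line are hypothetical. The exit codes 0 through 3 are the standard Nagios plugin return values.

```python
import re

# Standard Nagios plugin exit codes.
STATUS_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def status_from_output(text):
    """Map remote command output to (exit_code, message).

    Assumption (hypothetical): the first line of the output contains
    one of the Nagios status words as a whole word.
    """
    lines = text.strip().splitlines()
    if not lines:
        # No output at all is treated as UNKNOWN rather than OK.
        return 3, "UNKNOWN - no output received from remote host"
    first_line = lines[0].strip()
    match = re.search(r"\b(OK|WARNING|CRITICAL|UNKNOWN)\b", first_line)
    if match is None:
        return 3, "UNKNOWN - unrecognized output: " + first_line
    return STATUS_CODES[match.group(1)], first_line


if __name__ == "__main__":
    code, message = status_from_output("RAID CRITICAL - md0 degraded\n")
    print(code, message)
```

Treating missing or unrecognized output as UNKNOWN, instead of OK, keeps a broken SNMP link from silently hiding a failed drive.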
Tasks completed already:
- Studied Nagios configuration and its interaction with its plugins and SNMP
- Modified an open-source Nagios plugin to allow generic SNMP "exec" commands to be run on a remote system and return their output to the originating system
- Modified open-source Nagios RAID and SMART monitoring plugins to produce outputs parsable over SNMP
- Tested plugins on several servers and a special-purpose machine with several known bad hard drives
- Wrote a documentation page explaining how to install these plugins and configure both the remote server and Nagios for monitoring
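The RAID check described above can be sketched as follows. This is a minimal sketch, not the plugin used in the lab: it assumes the remote side supplies the text of /proc/mdstat, and the function name `degraded_arrays` and the sample data are hypothetical. It relies on the Linux md convention that a member map such as [UU] marks every device as up, while an underscore, as in [U_], marks a missing device. Only array names of the form mdN are handled.

```python
import re

# Hypothetical sample in the usual /proc/mdstat layout.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

md1 : active raid1 sda2[0]
      20860288 blocks [2/1] [U_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return names of md arrays whose member map shows a missing device."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            # Start of a new array entry, e.g. "md0 : active raid1 ...".
            current = header.group(1)
            continue
        # Member map such as [UU] or [U_]; "_" marks a missing device.
        member_map = re.search(r"\[([U_]+)\]", line)
        if current and member_map and "_" in member_map.group(1):
            degraded.append(current)
            current = None
    return degraded


if __name__ == "__main__":
    bad = degraded_arrays(SAMPLE_MDSTAT)
    if bad:
        print("CRITICAL - degraded arrays: " + ", ".join(bad))
    else:
        print("OK - all arrays healthy")
```

A plugin built on this would exit with status 2 (CRITICAL) when the list is non-empty, so that Nagios raises an alert when an array drops a drive.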
Tasks remaining:
- Work with COSI system administrators to implement this monitoring on various servers
- Find an appropriate place to store these plugins so they are available for use
- Present on Nagios and SNMP after a COSI meeting
COSI Lab Maintenance
- Analyzed failure of remote backup server PLBackup1 and advised on a replacement
- Installed new operating systems on machine VR-KIT for Prof. Searleman, and worked with OIT to acquire licensed software
- Maintain server xen3, which I repurposed from a VM server to a research/compute machine because the lab already had enough VM hosting capacity
- Repaired system COSI-09 for Vido after a video card anomaly
- In queue: install new operating systems on machine VR-Alienware for Prof. Searleman
ITL Lab Maintenance
- Developed new Windows 7 64-bit build to replace aging build
- Worked with OIT and CS professors to install all necessary software
- Worked with Christian Mesh to pave the build to the rest of the lab
- Maintain the lab during the semester to accommodate software and change requests