Difference between revisions of "Management"

From CSLabsWiki
(Server Side (management itself))
 
(44 intermediate revisions by 2 users not shown)
Line 4: Line 4:
 
|ip_addr = 128.153.145.62
 
|ip_addr = 128.153.145.62
 
|contact_person = [[User:Jared|Jared Dunbar]]
 
|contact_person = [[User:Jared|Jared Dunbar]]
|last_update = ''February 2016''
+
|last_update = ''July 2017''
|services = server status indicator
+
|services = server status collection server
|development_status = 50%
+
|development_status = 30%
|category = VM
+
|category = Machines
 
|handoff = no
 
|handoff = no
 
}}
 
}}
   
{{Machine
 
| screenshot =
 
| maintainer = [[User:Jared|Jared Dunbar]]
 
| hostname = management
 
| operating_system = Armbian (Debian) Jessie (kernel 4.4.1-sunxi)
 
| interface1 = {{Network Interface | name = eth0 | mac = 02:8e:08:41:65:6a | ip = 128.153.145.62 }}
 
| cpuspecs = Hard Float Dual Core Allwinner A20 armv7l, Mali 400 MP2
 
| ramspecs = 1GB DDR3 ECC
 
| hddspecs = 8GB SD Class 4 (4MB/s)
 
}}
 
   
  +
'''Management''' (stat3) is a VM used for monitoring the status of hosts in the server room, i.e. checking the CPU, RAM, and hard drive stats, among other configurable things.
   
'''Management''' is an SBC (single-board computer) used for monitoring the status of VMs on other machines and the status of the hardware in the server room, i.e. checking the CPU, RAM, and hard drive stats, among other configurable things.
+
Each configured computer in the server room periodically sends data, which is shown on an uptime web page that makes it easy to check system uptime and service uptime, among other things.
   
  +
Also, you can view COSI network stats at <del>http://management.cosi.clarkson.edu/cacti with the csguest user (and default password)</del> in raw data at http://stat.cosi.clarkson.edu/data since Cacti broke (again)
Each computer in the server room that will be assigned to this list will have a startup executable written in BASH scripts and C/C++ executables that will send data periodically to '''Management''' which will be shown in an uptime page on a webpage that can easily be used to determine system uptime and service uptime among other things.
 
   
  +
=Installing Management Clients=
Currently installed on the machine are the following:
 
   
  +
Required Software: Git, g++, make
<pre>
 
htop openssh-client vim openjdk-7-jdk p7zip-full g++ sudo git upower apcupsd
 
</pre>
 
   
  +
On Debian:
=Client Side (runs on a server)=
 
==Requirements==
 
   
  +
<pre style="background-color:#ffcccc">
<pre>
 
  +
apt update && apt install make g++ git
g++ top awk tail bash sed
 
 
</pre>
 
</pre>
   
  +
==Clone with Git==
The source code for the client executable is available online at https://github.com/jrddunbr/management-client
 
   
  +
First, set git to allow all certificates, and get the files using Git.
The bash scripts are made wherever necessary (it's expandable and each server can theoretically have as many keys as it wants, each data parameter is stored as a key) and here are some functional examples:
 
   
  +
<pre style="background-color:#ffcccc">
==CPU:==
 
  +
git config --global http.sslVerify false
<pre>
 
  +
git clone https://gitlab.cosi.clarkson.edu/jared/manage2client.git
#!/bin/bash
 
DATA=$(
 
top -bn2 | \
 
grep "Cpu(s)" | \
 
sed -n '1!p' | \
 
sed "s/.*, *\([0-9.]*\)%* id.*/\1/" | \
 
awk '{print 100 - $1}')
 
echo $DATA
 
/manage/management-client 128.153.145.62 80 cpu $DATA
 
 
</pre>
 
</pre>
   
  +
Re-secure the system by only accepting repos with certificates.
==Used-Ram:==
 
<pre>
 
#!/bin/bash
 
read _1 MEMTOTAL _2 <<< "$(head -n 1 /proc/meminfo)"
 
read _1 MEMAVAIL _2 <<< "$(tail -n +3 /proc/meminfo | head -n 1)"
 
DATA="$(( (MEMTOTAL - MEMAVAIL) / 1024 ))MB"
 
echo $DATA
 
/manage/management-client 128.153.145.62 80 used-ram $DATA
 
</pre>
 
   
  +
<pre style="background-color:#ffcccc">
==Total-Ram:==
 
  +
git config --global http.sslVerify true
<pre>
 
#!/bin/bash
 
read _1 MEMTOTAL _2 <<< "$(head -n 1 /proc/meminfo)"
 
DATA="$(( MEMTOTAL / 1024 ))MB"
 
echo $DATA
 
/manage/management-client 128.153.145.62 80 total-ram $DATA
 
 
</pre>
 
</pre>
   
==Uptime:==
+
==Prepare files==
<pre>
 
#!/bin/bash
 
DATA=$(uptime -p | sed -e 's/ /_/g')
 
echo $DATA
 
/manage/management-client 128.153.145.62 80 uptime "$DATA"
 
</pre>
 
 
==Virsh:==
 
   
  +
Move the folder to the root.
Only install this one if you have a VM server running. This MUST be run as the root user.
 
   
  +
<pre style="background-color:#ffcccc">
<pre>
 
  +
mv manage2client /manage
while
 
read ID NAME STAT;
 
do echo "NAME=$NAME, STAT=$STAT";
 
STAT=$(echo $STAT | sed -e 's/ /_/g')
 
/manage/management-client 128.153.145.62 80 $NAME $STAT
 
done <<< "$(virsh list --all | tail -n +3)"
 
 
</pre>
 
</pre>
   
  +
Move the systemd service to the systemd services folder
==Compiling Management==
 
   
  +
<pre style="background-color:#ffcccc">
These scripts expect management-client.cpp to be compiled as
 
  +
sudo mv /manage/manage.service /etc/systemd/system/manage.service
<pre>
 
g++ management-client.cpp -o management-client --std=c++11
 
 
</pre>
 
</pre>
and to be in the /manage folder (for simplicity, I tend to put them all in the same folder).
 
   
  +
==Configure system==
===management.cpp===
 
<pre>
 
/*
 
* File: management-client.cpp
 
* Author: jared
 
*
 
* Created on January 21, 2016, 3:50 PM
 
*/
 
   
  +
If the hard drive you want to track is not /dev/sda1, select a different mount point to track in totaldisk.sh and useddisk.sh
#include <cstdlib>
 
#include <iostream>
 
#include <fstream>
 
#include <sys/socket.h>
 
#include <arpa/inet.h>
 
#include <netdb.h>
 
   
  +
If you want to have virsh, edit run.sh, and un-comment the line with virsh.sh
using namespace std;
 
   
  +
If you want to poll faster, change sleep from 30 to 5. Any faster, and the Linux scheduler will fall behind on busy boxes.
class tcp_module {
 
private:
 
string address;
 
int port;
 
struct sockaddr_in server;
 
int sock;
 
public:
 
tcp_module(string, int);
 
void sendTcp(string);
 
};
 
   
  +
==Compile Management for your platform==
tcp_module::tcp_module(string addr, int portNumber) {
 
address = addr;
 
port = portNumber;
 
   
  +
<pre style="background-color:#ffcccc">
sock = socket(AF_INET, SOCK_STREAM, 0);
 
  +
make
if (sock == -1) {
 
cerr << "Error creating the socket\nExiting\n";
 
exit(1);
 
}
 
 
if (inet_addr(address.c_str()) == -1) {
 
struct hostent *host;
 
if ((host = gethostbyname(address.c_str()))) {
 
cerr << "Error resolving host " << address << "\nPlease use an IP if you are not, exiting\n";
 
exit(1);
 
}
 
}
 
 
server.sin_addr.s_addr = inet_addr(address.c_str());
 
server.sin_family = AF_INET;
 
server.sin_port = htons(port);
 
 
if (connect(sock, ((struct sockaddr*) &server), sizeof (server)) < 0) {
 
cerr << "Error connecting to host. Exiting.\n";
 
exit(1);
 
}
 
}
 
 
void tcp_module::sendTcp(string data) {
 
if (send(sock, data.c_str(), data.length(), 0) < 0) {
 
cerr << "Error sending data. Exiting\n";
 
exit(1);
 
}
 
}
 
 
bool isConnected = false;
 
 
void sendKey(tcp_module connection, string key, string data) {
 
string send = "GET /" + key + "/" + data + " HTTP/1.1\r\n\r\n";
 
connection.sendTcp(send);
 
}
 
 
int main(int argc, char** argv) {
 
 
if(argc != 5) {
 
cerr << "Error, not enough arguments!\n";
 
cout << "\n Usage:\n\n./[executable] host port key value\n";
 
for(int i = 0; i < argc; i++) {
 
cout << argv[i] << " ";
 
}
 
cout << "\n";
 
exit(1);
 
}
 
 
string key = argv[3];
 
string data = argv[4];
 
 
string host = argv[1];
 
int port = atoi(argv[2]);
 
tcp_module tcp(host, port);
 
sendKey(tcp, key, data);
 
 
return 0;
 
}
 
 
</pre>
 
</pre>
   
  +
==Enable Systemd Services==
==Startup==
 
 
I also have one script that runs all of the client scripts. The Bash script that runs all the other bash scripts looks a lot like this:
 
 
===Bash Start Script===
 
   
  +
<pre style="background-color:#ffcccc">
/manage/run.sh
 
  +
sudo systemctl enable manage
<pre>
 
  +
sudo systemctl start manage
#!/bin/bash
 
cd /manage
 
while true
 
do
 
/manage/cpu.sh &
 
/manage/used-ram.sh &
 
/manage/total-ram.sh &
 
/manage/uptime.sh &
 
# /manage/virsh.sh & # this is for if you have a virsh virtual machine system running, to monitor VM stats.
 
sleep 20
 
done
 
 
</pre>
 
</pre>
   
  +
==Whitelist==
   
  +
Email dunbarj@clarkson.edu to get the server added to the whitelist
==Extensibility==
 
   
  +
=Installing Management Server=
It is easy to write more bash scripts that report other data. The compiled client expects its arguments as ./management-client (IP) (PORT) (KEY_NAME) (VALUE); each call creates or updates that key on the server with the given value. When the server receives this as a REST call, it accepts the request because the sender is in the 145 subnet and then stores the value in the program's data structures.
 
   
  +
Start with an Arch VM
One thing to note is that it takes only 5 arguments: the executable, the IP, port, key, and value. Each one has to have no spaces in it. If you want your key or value to have a space, place an underscore and it will be replaced with a space. The error it tells you is basic, but informative. You must use an IP for the ip field, it will not accept hostnames.
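As a rough illustration of the argument rules above, the sketch below encodes spaces as underscores and builds the request line that the client source in this revision sends (<code>GET /key/value HTTP/1.1</code>). The function names <code>encode_field</code> and <code>build_request</code> are made up for this example and are not part of management-client.

```bash
#!/bin/bash
# Hypothetical helpers, not part of management-client.

# Replace spaces with underscores, since keys and values may not contain spaces.
encode_field() {
    printf '%s' "${1// /_}"
}

# Print the request line the client sends for one key/value pair.
build_request() {
    printf 'GET /%s/%s HTTP/1.1\r\n\r\n' "$(encode_field "$1")" "$(encode_field "$2")"
}
```

For example, calling build_request with the key uptime and the value "up 3 days" yields a request for /uptime/up_3_days.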
 
   
  +
==Set Hostname==
=Server Side (management itself)=
 
==Requirements==
 
   
  +
Edit
The server side of the software is available at https://github.com/jrddunbr/management-server and is still a work in progress.
 
   
  +
<pre style="background-color:#ccccff">
It requires the following to be installed:
 
  +
/etc/hostname
 
<pre>
 
openjdk-7-jdk wget upower apcupsd
 
 
</pre>
 
</pre>
   
  +
Clear the contents and enter this on the first line, and save
==Setup==
 
   
  +
<pre style="background-color:#ccffcc">
You place the compiled .jar file in a handy place along with a few files (most found in the Github repo as examples):
 
  +
management
 
==Configuration==
 
 
<pre>
 
index.html # a template HTML file that is used to list all of the servers, uptimes, and other data.
 
server.html # a template HTML file that is used to list one server and all of the associated key and value pairs that it has.
 
master.yml # a file that defines master keys, which are server side keys that define server characteristics locally, used to enable servers, specify if they are urgent to server uptime
 
servers/<servername>.txt # file that configures maintainers
 
 
</pre>
 
</pre>
   
  +
==Set Network==
Maintainers Listing (in servers folder)
 
<pre>
 
f:<firstname>
 
l:<lastname>
 
e:<email>
 
i:<irc>
 
c:<cell or other phone number>
 
</pre>
 
 
You might need to create a servers folder if it crashes.
 
 
Inside the servers folder, there are configurable per-server configs.
 
 
Make sure that you check that your YAML files are parsed properly or I guarantee that the Java code will crash. There are a few good online checkers out there.
 
 
==Startup==
 
 
I made the startup script for the management server much the same as the client one.
 
   
  +
Copy example ethernet-static to netctl folder
The sh file is as follows:
 
   
  +
<pre style="background-color:#ffcccc">
<pre>
 
  +
cp /etc/netctl/examples/ethernet-static /etc/netctl/ethernet
cd /manage
 
date >> runtime.log
 
java -jar management-server.jar >> runtime.txt
 
 
</pre>
 
</pre>
   
  +
Edit
==Downsides (pending improvements)==
 
   
  +
<pre style="background-color:#ccccff">
One downside to the whole system is that it depends on TALOS's HTTPS server to be running when this starts because it fetches the domain files. It can use a fallback mechanism where it copies the file to the hard drive as a backup, and you could technically put the file there for it to read. A new configuration key needs to be added to the master list before this will work however.. coming soon! (there's a github fork called sans-talos)
 
  +
/etc/netctl/ethernet
 
=Hardware Implementation:=
 
 
Fetch Armbian Jessie for the pcduino 3. It's OK that it's not the nano lite version even though currently we are using a pcduino 3 nano lite.
 
 
Flash that to the SD card, log in as root, set the root password, and then run the reboot command. Wait for it to restart, and then reboot again.
 
 
At this point, the system has set up the SSH server, expanded / to the full size of the SD card (up to 32GB).
 
 
Now, install the following packages:
 
 
<pre>htop openssh-client vim openjdk-7-jdk p7zip-full g++ sudo git upower apcupsd</pre>
 
 
And now edit some files (make them contain this following contents):
 
 
vim /etc/hostname
 
 
<pre>management</pre>
 
 
vim /etc/network/interfaces
 
 
<pre>
 
# Wired adapter #1
 
auto eth0
 
iface eth0 inet static
 
address 128.153.145.62
 
netmask 255.255.254.0
 
gateway 128.153.145.1
 
 
# Local loopback
 
auto lo
 
iface lo inet loopback
 
 
</pre>
 
</pre>
   
  +
Clear the contents and set it to this:
and edit the sshd config for the default cosi ssh port:
 
   
  +
<pre style="background-color:#ccffcc">
vim /etc/ssh/sshd_config
 
  +
Description='A basic static ethernet connection'
 
  +
Interface=ens3 # Make sure this is the interface or you won't have a network
set the line that says Port
 
  +
Connection=ethernet
 
  +
IP=static
After you have done that, reboot.
 
  +
Address=('128.153.145.62/24')
 
  +
Gateway='128.153.145.1'
 
  +
DNS=('128.153.145.3')
You are now to follow the default instructions for setting up the software itself.
 
 
=Start Scripts:=
 
 
==System V Start Script==
 
 
/etc/init.d/manage (the name of the control script will be manage - don't use any extension and make sure it is executable)
 
<pre>
 
### BEGIN INIT INFO
 
# Provides: manage # this needs to match the name of the file
 
# Required-Start: $remote_fs $syslog $network $all
 
# Required-Stop: $remote_fs $syslog
 
# Default-Start: 2 3 4 5
 
# Default-Stop: 0 1 6
 
### END INIT INFO
 
 
/usr/bin/java -jar /manage/management-server.jar > runtime.log & # make your executable run in here.
 
 
</pre>
 
</pre>
   
  +
=Objectives=
and make that run at startup with:
 
 
<pre>
 
update-rc.d manage defaults
 
</pre>
 
 
For more on LSB start scripts, visit https://wiki.debian.org/LSBInitScripts
 
 
==Systemd Start Script==
 
/etc/systemd/system/manage.service: (the service will be called manage, due to the filename. Always append .service to the service file)
 
<pre>
 
[Unit]
 
# use a title for the application that will help someone realize what is going on
Description=manage
 
 
[Service]
 
# point this to the executable you intend to run.
ExecStart=/bin/bash /manage/run.sh
 
#ExecStop= # point this to a stop script if you have one, if you don't it will just kill the process when you tell it to stop
 
 
[Install]
 
WantedBy=multi-user.target
 
</pre>
 
 
===Systemd Helpful Tips===
 
 
systemctl enable <name> for enabling units
 
 
systemctl disable <name> for disabling
 
 
systemctl start <name> for starting
 
   
  +
*Create a monitoring system that can monitor all of the servers, battery backups, network, and also some temperature sensors placed throughout the server room at strategic locations
systemctl stop <name> for stopping
 
  +
*Notify computers of when to power down in a power outage
  +
*Create APIs that can be used to interface with the management platform
   
  +
==Plans==
systemctl status <name> for the executable status.
 
   
  +
* Update ALL instances of Management to stat3 once the client is completed (and deprecate the old versions - we still have versions of Management 1.0 and manage2client out there)
=Plans:=
 
  +
* Create new server with authentication (both real encryption and perhaps OpenComputers compatible for Minecraft servers :P)
  +
* Configurable low power options
  +
* Email Notifications
  +
* Shutdowns
  +
* Sensors Interface - configurability is a must
  +
* Better web interface? Cookies? Logins? LDAP? PAM?
  +
* Ability for custom messages, custom dashboards?
   
  +
[[Category:Web Service]]
Additional planned features (in this general order) are:
 
*manage battery backups and tell servers when exactly to power down in the event of an outage
 
*select server subnet(s)
 
*add specific server IP's not in the above subnets
 
*add some more master key configurations for fallback mechanisms
 
*make it independent of any server (favicon excluded) so that it will operate even when Talos is down. So long as there's a gateway and network (and UPS power), this thing had better be running
 
*database system to store the data collected
 
*graph display of events?
 

Latest revision as of 12:50, 29 July 2017

Management
Cosi-management.png
IP Address(es): 128.153.145.62
Contact Person: Jared Dunbar
Last Update: July 2017
Services: server status collection server


Management (stat3) is a VM used for monitoring the status of hosts in the server room, i.e. checking the CPU, RAM, and hard drive stats, among other configurable things.

Each configured computer in the server room periodically sends data, which is shown on an uptime web page that makes it easy to check system uptime and service uptime, among other things.

COSI network stats are currently available only as raw data at http://stat.cosi.clarkson.edu/data, since Cacti broke (again). The previous location was http://management.cosi.clarkson.edu/cacti with the csguest user (and default password).

Installing Management Clients

Required Software: Git, g++, make

On Debian:

apt update && apt install make g++ git

Clone with Git

First, tell Git to skip certificate verification, then fetch the files with Git.

git config --global http.sslVerify false
git clone https://gitlab.cosi.clarkson.edu/jared/manage2client.git

Re-secure the system by turning certificate verification back on, so Git only accepts repositories with valid certificates.

git config --global http.sslVerify true
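A possible alternative, sketched here as an assumption rather than the documented procedure: Git's -c option applies a setting to a single command, so verification can be skipped for the clone alone without touching the global configuration. The function name clone_insecure_once is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: clone with certificate verification disabled for this one command only.
# "git -c name=value" overrides a config value just for the invocation it is passed to.
clone_insecure_once() {
    local url="$1" dest="$2"
    git -c http.sslVerify=false clone "$url" "$dest"
}
```

It would be called with the repository URL above and a destination folder such as manage2client. The global http.sslVerify setting is never changed, so there is nothing to switch back afterwards.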

Prepare files

Move the folder to the root.

mv manage2client /manage

Move the systemd service to the systemd services folder

sudo mv /manage/manage.service /etc/systemd/system/manage.service
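For orientation, an earlier revision of this page documented a unit file along the lines below. This is a sketch assuming run.sh lives in /manage; the manage.service shipped in the repository may differ. The function name write_unit is illustrative, and it writes to whatever path it is given so the result can be inspected before anything is installed.

```bash
#!/bin/bash
# Hypothetical helper: write a minimal unit file to the given path.
# systemd only treats whole lines starting with # as comments, so none are placed after values.
write_unit() {
    cat > "$1" <<'EOF'
[Unit]
Description=manage

[Service]
ExecStart=/bin/bash /manage/run.sh

[Install]
WantedBy=multi-user.target
EOF
}
```

Writing to a scratch path first (for example a file under /tmp) makes it easy to compare against the unit from the repository.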

Configure system

If the hard drive you want to track is not /dev/sda1, select a different mount point to track in totaldisk.sh and useddisk.sh
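The contents of totaldisk.sh and useddisk.sh are not shown on this page. As a sketch of the general idea only, the functions below read the total and used size of a mount point with df; the function names and the MB suffix (borrowed from the earlier RAM scripts) are assumptions.

```bash
#!/bin/bash
# Hypothetical stand-ins for the real scripts, which may work differently.
# df -P keeps each filesystem on one line; -m reports sizes in 1 MiB blocks.

# Print the total size of the filesystem mounted at $1 in the form "<number>MB".
total_disk_mb() {
    df -Pm "$1" | awk 'NR==2 {print $2 "MB"}'
}

# Print the used space of the filesystem mounted at $1 in the same form.
used_disk_mb() {
    df -Pm "$1" | awk 'NR==2 {print $3 "MB"}'
}
```

Passing / reports on the root filesystem; passing another mount point such as /home switches the tracked disk.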

If you want to have virsh, edit run.sh, and un-comment the line with virsh.sh

If you want to poll faster, change sleep from 30 to 5. Any faster, and the Linux scheduler will fall behind on busy boxes.
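The current run.sh is not reproduced on this page. A loop of roughly this shape, modeled on the earlier revision's run.sh, shows where the sleep value fits. POLL_INTERVAL, MAX_ROUNDS, and poll_loop are hypothetical names, and MAX_ROUNDS exists only so the loop can be tried out without running forever.

```bash
#!/bin/bash
# Hypothetical sketch of a polling loop; the real run.sh may differ.
POLL_INTERVAL="${POLL_INTERVAL:-30}"  # seconds between rounds
MAX_ROUNDS="${MAX_ROUNDS:-0}"         # 0 means loop forever

# Start every collector command given as an argument in the background, then sleep.
poll_loop() {
    local round=0 collector
    while [ "$MAX_ROUNDS" -eq 0 ] || [ "$round" -lt "$MAX_ROUNDS" ]; do
        for collector in "$@"; do
            "$collector" &
        done
        # Unlike the earlier run.sh, wait for this round's collectors before sleeping.
        wait
        sleep "$POLL_INTERVAL"
        round=$((round + 1))
    done
}
```

With POLL_INTERVAL set to 5, each round starts about five seconds after the previous one finishes, which matches the lower bound suggested above.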

Compile Management for your platform

make

Enable Systemd Services

sudo systemctl enable manage
sudo systemctl start manage

Whitelist

Email dunbarj@clarkson.edu to get the server added to the whitelist

Installing Management Server

Start with an Arch VM

Set Hostname

Edit

/etc/hostname

Clear the contents and enter this on the first line, and save

management

Set Network

Copy example ethernet-static to netctl folder

cp /etc/netctl/examples/ethernet-static /etc/netctl/ethernet

Edit

/etc/netctl/ethernet

Clear the contents and set it to this:

Description='A basic static ethernet connection'
Interface=ens3 # Make sure this is the interface or you won't have a network
Connection=ethernet
IP=static
Address=('128.153.145.62/24')
Gateway='128.153.145.1'
DNS=('128.153.145.3')
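
Since a wrong Interface value leaves the machine without a network, it can be worth checking the profile before activating it. The sketch below relies on netctl profiles being shell-style variable assignments and on Linux listing interfaces under /sys/class/net; the function name check_profile_interface is made up for this example.

```bash
#!/bin/bash
# Hypothetical helper: verify that the Interface named in a netctl profile exists.
check_profile_interface() {
    local profile="$1" iface
    # Source the profile in a subshell so its variables do not leak into this shell.
    iface="$( . "$profile" >/dev/null 2>&1; printf '%s' "$Interface" )"
    if [ -n "$iface" ] && [ -e "/sys/class/net/$iface" ]; then
        echo "ok: $iface exists"
    else
        echo "missing interface: ${iface:-<unset>}" >&2
        return 1
    fi
}
```

Run it against /etc/netctl/ethernet. Once the check passes, netctl start ethernet brings the profile up and netctl enable ethernet makes it start at boot.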

Objectives

  • Create a monitoring system that can monitor all of the servers, battery backups, network, and also some temperature sensors placed throughout the server room at strategic locations
  • Notify computers of when to power down in a power outage
  • Create APIs that can be used to interface with the management platform

Plans

  • Update ALL instances of Management to stat3 once the client is completed (and deprecate the old versions - we still have versions of Management 1.0 and manage2client out there)
  • Create new server with authentication (both real encryption and perhaps OpenComputers compatible for Minecraft servers :P)
  • Configurable low power options
  • Email Notifications
  • Shutdowns
  • Sensors Interface - configurability is a must
  • Better web interface? Cookies? Logins? LDAP? PAM?
  • Ability for custom messages, custom dashboards?