Updated for systemd user units, fleshed out INSTALL.md

.gitignore
@@ -1,10 +1,12 @@
*
!psi-alerts.sh
!psi-alerts@.service
!psi-monitor.service
!psi-monitor.sh
!psi-by-example
!.gitignore
!CONFIGURE.md
!INSTALL.md
!README.md
!psi-alerts-user.service
!psi-alerts.sh
!psi-alerts@.service
!psi-by-example
!psi-monitor-user.service
!psi-monitor.service
!psi-monitor.sh

CONFIGURE.md
@@ -1,19 +1,26 @@
# CONFIGURE
Included in this project are a number of systemd units:
# CONFIGURE Included in this project are a number of systemd units:
    * psi-monitor.service
        * uses psi-monitor executable (in /usr/bin/)
    * psi-alerts@.service (system template service)
        * uses psi-alerts.sh script
    * psi-alerts@.service (systemd template service)
        * uses psi-alerts.sh script in */usr/local/bin/*
    * psi-alerts-user.service (systemd user service)
        * also uses psi-alerts.sh script in *~/bin/* (or wherever you want to
          put it)

The `psi-alerts.sh` is essentially a daemon (a systemd simple service), and for
now the systemd template needs to be instantiated with the username that will
execute `psi-alerts.sh`.  Also, a systemd unit override should be created, like
so:  
execute `psi-alerts.sh` (if using the systemd template).  Also, a systemd unit
override should be created, like so:  

```
sudo cp psi-alerts@.service /etc/systemd/system/
sudo systemctl edit psi-alerts@<user>.service
```

--OR--
```
cp psi-alerts-user.service ~/.config/systemd/user/psi-alerts.service
systemctl --user edit psi-alerts.service
```
This will open an editor, and in later versions of systemd the comment code will be included, clearly showing where the override should be entered:

```
@@ -32,17 +39,21 @@ Environment=SSH_HOST="localhost"
Environment=SSH_PORT=5999
Environment=SSH_ID_PATH="~user/.ssh/psi-alerts"
Environment=CLEAR_THRESHOLD="5.0"
# Clear ExecStart for user unit
ExecStart=
# User unit
ExecStart=/path/to/psi-alerts.sh --user

### Edits below this comment will be discarded

### /etc/systemd/system/psi-alerts@.service
# [Unit]
# Description=Pressure Stall Information (PSI) alerts
# PartOf=multi-user.target
# PartOf=multi-user.target  # system template
# PartOf=default.target     # user service
# After=psi-monitor.service
#
# [Service]
# User=%i
# 
# User=%i # User unit will not have User=%i
# Type=simple
# ExecStart=psi-alerts.sh
#
@@ -85,5 +96,5 @@ All of these are required except where noted, there are no default options
  (SMS and email will still work, as they don't use SSH)
* **CLEAR_THRESHOLD**:  The percentage that the `some` avg300 value
  should be below before considering the alert cleared.  This will depend
  highly on the workload running on 
  highly on the workload running on the system.
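
To make the `CLEAR_THRESHOLD` comparison concrete, here is a minimal sketch that extracts the avg300 field from a `some` line in the format the kernel exposes under `/proc/pressure/` and checks it against the threshold. The helper names `psi_avg300` and `below_clear_threshold` and the sample line are assumptions made for this illustration; they are not taken from `psi-alerts.sh`, which may do this differently.

```shell
# Hypothetical helper: print the avg300 value from one PSI line, e.g.
# "some avg10=0.00 avg60=1.25 avg300=3.21 total=123456"
psi_avg300() {
    printf '%s\n' "$1" | awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^avg300=/) { sub(/^avg300=/, "", $i); print $i }
    }'
}

# Hypothetical helper: exit status 0 when value ($1) is below threshold ($2).
# awk does the comparison because the shell has no floating-point arithmetic.
below_clear_threshold() {
    awk -v v="$1" -v t="$2" 'BEGIN { exit !(v + 0 < t + 0) }'
}

# Sample line (made up); on a live system this could come from
# /proc/pressure/cpu, /proc/pressure/memory or /proc/pressure/io.
sample='some avg10=0.00 avg60=1.25 avg300=3.21 total=123456'
value=$(psi_avg300 "$sample")

if below_clear_threshold "$value" "${CLEAR_THRESHOLD:-5.0}"; then
    echo "cleared"
else
    echo "still alerting"
fi
```

A lower threshold keeps an alert open longer after a pressure spike, because avg300 is a five-minute average and decays slowly.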

INSTALL.md
@@ -4,4 +4,60 @@ First, clone this repository with the `--recurse-submodules` flag:
$ git clone --recurse-submodules https://git.eldon.me/trey/psi-alerts.git
```

`--recurse-submodules` is only necessary if you wish to use the modified
psi-by-example program for `psi-monitor`.  I found this too noisy to be of use:
it alerts too quickly, so I wrote my own with relaxed timing.

If you want to use the psi-by-example/psi-monitor code, you'll need to compile
it: 

```
gcc -o psi-monitor psi-monitor.c
``` 

## Using the systemd template unit
1.  Copy the `psi-alerts.sh` and `psi-monitor.sh` scripts to */usr/local/bin*:

    ```
    sudo cp psi-alerts.sh /usr/local/bin
    sudo cp psi-monitor.sh /usr/local/bin/psi-monitor 
    ### OR ###
    sudo cp psi-by-example/psi-monitor /usr/local/bin
    ```

2.  Copy the systemd units to */etc/systemd/system*:

    ```
    sudo cp psi-alerts@.service psi-monitor.service /etc/systemd/system/
    ```


## Using the systemd user units
1.  Copy the `psi-alerts.sh` and `psi-monitor.sh` scripts to *~/bin* (or
    wherever you want them):
    
    ```
    cp -a psi-alerts.sh psi-monitor.sh ~/bin/
    ```

2.  Copy the systemd user units to *~/.config/systemd/user/*

    ```
    cp psi-alerts-user.service ~/.config/systemd/user/psi-alerts.service
    cp psi-monitor-user.service ~/.config/systemd/user/psi-monitor.service
    ```
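
On a fresh account the *~/.config/systemd/user/* directory may not exist yet, in which case the `cp` commands above fail. The sketch below wraps the user-unit copy in a small function that creates the directory first. The function name `install_user_units` and its arguments are assumptions for illustration only; nothing like it ships in this repository.

```shell
# Hypothetical helper: copy the two user units into a systemd user directory.
#   $1  directory holding psi-alerts-user.service and psi-monitor-user.service
#   $2  destination (defaults to ~/.config/systemd/user)
install_user_units() {
    src_dir=$1
    dest_dir=${2:-"$HOME/.config/systemd/user"}

    # Create the destination if it is missing; stop on the first failure.
    mkdir -p "$dest_dir" || return 1
    cp "$src_dir/psi-alerts-user.service" "$dest_dir/psi-alerts.service" || return 1
    cp "$src_dir/psi-monitor-user.service" "$dest_dir/psi-monitor.service" || return 1
}

# Example use from the repository root, followed by a reload so the user
# manager notices the new unit files:
#   install_user_units . && systemctl --user daemon-reload
```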

# CONFIGURE
See *CONFIGURE.md* in this repository

# ENABLE and START
## System template instance
```
sudo systemctl enable --now psi-monitor.service psi-alerts@<user>.service
```

## User instance
```
systemctl --user enable --now psi-monitor.service psi-alerts.service
```

psi-alerts-user.service (Normal file)
@@ -0,0 +1,12 @@
[Unit]
Description=Pressure Stall Information (PSI) alerts
PartOf=default.target
After=psi-monitor.service

[Service]
Type=simple
ExecStart=psi-alerts.sh

[Install]
WantedBy=default.target

psi-alerts.sh
@@ -56,6 +56,14 @@ notification_cmd="${NOTIFICATION_CMD}"
notification_hist_cmd="${NOTIFICATION_HIST_CMD}"
notification_opts="${NOTIFICATION_OPTS}"
id_idx="${NOTIFICATION_IDX}"
user=false

if [[ -n "${1}" ]]; then
    if  [[ "${1}" == "-u" ]] || \
        [[ "${1}" == "--user" ]]; then
        user=true
    fi 
fi

get_ssh_agent () {
    for dir in /tmp/ssh-*; do
@@ -132,7 +140,7 @@ send_notice () {
                print "Connection to notification daemon failed!" >&2
                false
        else
            echo ${notification_id}
            print ${notification_id}
            true
        fi
    elif [[ -n "${ssh_id_path}" ]]; then
@@ -141,11 +149,11 @@ send_notice () {
                print "Connection to notification daemon failed!" >&2
                false
        else
            echo ${notification_id}
            print ${notification_id}
            true
        fi
    else
        echo "No SSH notifications configured.  Returning." >&2
        print "No SSH notifications configured.  Returning." >&2
        false
    fi
    #set +x
@@ -154,7 +162,7 @@ send_notice () {
send () {
    #set -x
    if [[ "${#@}" -lt 2 ]] || [[ "${#@}" -gt 3 ]]; then
        echo "Wrong number of arguments to send()!" >&2
        print "Wrong number of arguments to send()!" >&2
        return 1
    fi

@@ -236,7 +244,7 @@ exec_notices () {
                send "${psi_type}" "${current_alarms}" "${email_to}"
                ;;
            *)
                echo "Something went wrong!" >&2
                print "Something went wrong!" >&2
                false
                ;;
        esac
@@ -257,7 +265,7 @@ check_dunst_id_is_visible () {
        "${notification_hist_cmd} | jq '.data[0][].id.data'"); then
        if ! ids=$(ssh -qi "${ssh_id_path}" -p ${ssh_port} -l "${ssh_user}" \
            "${ssh_host}" "${notification_hist_cmd} | jq '.data[0][].id.data'"); then
                echo "Connection to dunst failed!" >&2
                print "Connection to dunst failed!" >&2
                return 2
        fi
    fi
@@ -278,10 +286,14 @@ local last_line=""

#set -x
while true; do
    local line=$(journalctl -u ${svc} -n1)
    local now=$(date +%s)
    local last_timestamp=$(date -d "$(awk '{print $1" "$2" "$3}' <<< "${line}")" +%s)
    local time_diff=$(( now - last_timestamp ))
    if ${user}; then
        line=$(journalctl --user -u ${svc} -n1)
    else
        line=$(journalctl -u ${svc} -n1)
    fi
    now=$(date +%s)
    last_timestamp=$(date -d "$(awk '{print $1" "$2" "$3}' <<< "${line}")" +%s)
    time_diff=$(( now - last_timestamp ))
    if [[ "${last_line}" == "${line}" ]]; then
        # last line hasn't changed, check to see if we can clear alarms
        if (( time_diff >= 3 )); then

psi-monitor-user.service (Normal file)
@@ -0,0 +1,10 @@
[Unit]
Description=Pressure Stall Information (PSI) Monitor
PartOf=default.target

[Service]
Type=simple
ExecStart=/home/trey/bin/psi-monitor.sh 80

[Install]
WantedBy=default.target

psi-monitor.sh (Normal file → Executable file)