Perform powerful automated checks on repository files | Using GitHub Actions ✔️

What are GitHub Actions?

GitHub Actions are a powerful feature of the GitHub platform, providing methods to automate workflows and allow better monitoring and management of repositories.

From a repository maintainer perspective, GitHub Actions are an essential tool that should be used where possible to ease the process of keeping repositories organised. The ability to automate workflows is especially important in the case of managing open source repositories where contributions are accepted from the public.

The responsibility of a repository maintainer is to ensure any files submitted to the repository meet the contribution guidelines. This task can involve several processes such as reviewing the code submitted, performing tests and ensuring documentation and release notes are updated where necessary. 

To simplify this process, maintainers should make use of automated checks, a feature provided by GitHub Actions.

When a repository is set up with a workflow, checks are triggered when an event occurs, such as a pull request being submitted. Each workflow runs one or more jobs, the steps within a job run sequentially, and the outcome of each job is reported as a pass or a fail.

Read the GitHub docs for a more in-depth look at how GitHub Actions work.

Need a way to automate checks on files pushed to your repository? As an open source maintainer, the answer should always be yes.

In this post, we’ll take a look at and break down the implementation of the GitHub Action workflow Check Changed Files Helper. This handy workflow provides a starting template for open source maintainers to get creative and perform customised checks on individual or multiple files changed in the repository.

View the GitHub repository here.

Purpose of GitHub Action Workflow

One of the tedious aspects of repository maintenance is having to perform manual checks of each file submitted as a push or pull request. 

Is the contribution useful for the repository? Is the code formatted properly? Will the contribution break the codebase? These are amongst the questions maintainers will rightfully ask when reviewing individual contributions.

Check Changed Files Helper is a GitHub Action that aims to simplify the process of performing checks on files submitted to a repository through the power of automation.

The purpose is to have a workflow that, when triggered, identifies the files submitted as part of the pull request and automatically performs any checks that can be automated, reducing the need for manual checks.

For example, let’s take a repository with a codebase consisting primarily of Python source files. Using Check Changed Files Helper, the maintainer can choose to identify any Python files submitted in a pull request to the main branch and perform a check on that file by running it with the Python command to ensure the code is runnable.

Imagine a repository receiving many pull requests daily. The maintainers can now rely on this workflow to check the integrity of every submitted source file before considering merging the file.

A simple execution check is used in the Check Changed Files Helper repository as a sample. However, maintainers can extend this to perform any other checks that are relevant for the repository.

Structure of the Workflow and how it works

This high-level overview diagram represents the structure of the workflow:

Workflow diagram of Check Changed Files Helper

Workflow Setup

The workflow is implemented in the file: auto-check-changed-files.yml. 

View entire file on the GitHub repo.

Let’s break this down.

name: auto-check-changed-files
 
on:
 push:
   branches: [ main ]
 pull_request:
   branches: [ main ]

name – The name of the workflow, which identifies it on the repository's Actions page.

on – Sets the events that trigger a workflow run. In this case there are two: a push to the main branch, or a pull request targeting the main branch.

jobs:
 auto-check-changed-files:
   runs-on: ubuntu-latest

jobs – Defines the jobs that make up the workflow. A workflow can have multiple jobs. In this workflow, there is a single job, “auto-check-changed-files”, which contains several steps.

Jobs are executed on GitHub-hosted runners, which are essentially virtual machines. In this case, we have chosen ubuntu-latest, which tracks the Ubuntu image that GitHub currently designates as the latest.

   steps:
   # checkout repo
   - uses: actions/checkout@v2

steps identifies the sequence of tasks to be run as part of the job. A job can contain several steps; in this case, we keep every task under a single job.

The first step is uses: actions/checkout@v2. The uses keyword selects an action to run as part of a step. Actions are reusable units of code that can come from your own repository or a public one; the “actions/” prefix here refers to the GitHub-maintained actions organisation. Many more actions are published on GitHub Marketplace and can serve as starting points or templates.

The action actions/checkout@v2 checks out the repository onto the runner so that later steps can work with its files, a standard first step in most workflows. @v2 specifies the version of the action.


   
   # setup python
   - name: Set up Python 3.8
     uses: actions/setup-python@v2
     with:
       python-version: 3.8
  
   # install requirements.txt if it exists
   - name: Install Python dependencies
     run: |
       python -m pip install --upgrade pip
       if [ -f requirements.txt ]; then pip install -r requirements.txt; fi


The next step is named Set up Python 3.8 and calls the setup-python action to set up Python in the runner environment. The with keyword passes input parameters to an action; here it tells setup-python to install version 3.8 of Python.

The following step, Install Python dependencies, uses the run keyword, which executes shell commands in the runner environment. In this case we upgrade pip, then install any dependencies listed in the requirements.txt file if it exists.

This gives the runner the environment needed to work with Python files.


   
   # setup node
   - name: Setup Node.js environment
     uses: actions/setup-node@v2.4.1
  
   # install dependencies if package.json exists
   - name: Install js dependencies
     run: if [ -f package.json ]; then npm install; fi

Similar to the previous steps, these steps first use an action to set up a Node.js environment, then install the dependencies listed in package.json if it exists.

This gives the runner the environment needed to work with JavaScript source files.

   
   # use action to get files that changed
   - uses: lots0logs/gh-action-get-changed-files@2.1.4
     with:
      token: ${{ secrets.GITHUB_TOKEN }}

This step uses the Get Changed Files action. When a workflow is triggered, this action writes JSON files to the runner's filesystem listing the files that were changed in the repository. A separate JSON file is created for each kind of change, as follows:

  • all: Added, deleted, renamed and modified files. -> ${HOME}/files.json
  • added: Added files -> ${HOME}/files_added.json
  • deleted: Deleted files -> ${HOME}/files_deleted.json
  • renamed: Renamed files -> ${HOME}/files_renamed.json
  • modified: Modified files -> ${HOME}/files_modified.json

The ability to identify exactly which files are changed in the repository is central to this workflow as we can focus our checks only on these files.
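Because files.json also lists deleted files, a check that runs every listed path may meet names that no longer exist in the checkout. A minimal sketch of one way to filter these out, assuming a hypothetical helper named existing_files that is not part of the original script:

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): read one path per line
# on stdin and print only those that still exist as regular files.
existing_files() {
    local path
    while IFS= read -r path; do
        if [ -f "$path" ]; then
            printf '%s\n' "$path"
        fi
    done
}
```

The helper could be placed between the command that lists the changed files and the loop that checks them, so that deleted paths are dropped before any check runs.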

   
   # bash script to get list of changed files and check each file
   # set permission of script and execute
   - name: Checking files
     run: |
       echo "job: running script to check changed files:"
       chmod +x ./.github/workflows/auto-check-changed-files.sh
       ./.github/workflows/auto-check-changed-files.sh
       echo "job: Done"


This is the final step of the workflow. The run keyword runs a short sequence of shell commands: echo announces the start of the check on the console, chmod +x gives the bash script auto-check-changed-files.sh execute permission, and the script is then executed.

Power of Bash

The final step in the workflow file executes the bash file auto-check-changed-files.sh, also located in the “.github/workflows” directory.

View entire file on the GitHub repo.

The purpose of having a separate bash file that can be executed by the workflow is to be able to perform more powerful and customised functions that may not be as easily expressed in the workflow's YAML syntax.

A breakdown of the bash file is as follows.

#!/bin/bash
 
echo "run-sh: Inside auto-check-changed-files.sh"
 
# print content of json holding all files to be checked
a=$(cat /home/runner/files.json)
echo "Content of files.json: $a"
 
# get number of files to check
num_files=$(jq '. | length' /home/runner/files.json)
echo "number of files to check = $num_files"

The bash file starts by printing several things to the console. First, a message is echoed to indicate we are now inside the bash script. Next, we echo the content of files.json, which was created by the Get Changed Files action and contains all files changed during the push or pull request, and therefore all files we want to check.

Finally, we use the jq command-line JSON processor to count how many entries are in files.json and echo the number of files to be checked.

# loop over each file to be checked
for file in $(jq  '.[]' /home/runner/files.json | cut -d '"' -f 2); do
   echo "checking file: $file"

This is the start of a loop that iterates over each of the changed files listed in files.json. jq prints each filename wrapped in double quotation marks, and cut -d '"' -f 2 strips them by keeping only the text between the first and second quotation mark.

We also echo the filename being checked at each iteration of the loop.
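A for loop over command output splits on whitespace, so a filename containing a space would be treated as several files. A minimal sketch of an alternative, assuming jq's -r (raw output) option and two hypothetical helper names, list_changed_files and for_each_changed_file, that do not appear in the original script:

```shell
#!/bin/bash
# Hypothetical helper: print each entry of a JSON array of filenames,
# one per line. -r (raw output) makes jq print strings without quotes.
list_changed_files() {
    jq -r '.[]' "$1"
}

# Hypothetical helper: call a command or function once per changed file.
# Reading line by line keeps filenames that contain spaces intact.
for_each_changed_file() {
    local json="$1" action="$2" file
    list_changed_files "$json" | while IFS= read -r file; do
        # stdin is redirected so the action cannot consume the file list
        "$action" "$file" < /dev/null
    done
}
```

For example, for_each_changed_file /home/runner/files.json echo would print one changed filename per line.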

   # .c files
   if [[ $file == *".c"* ]]; then
       echo "This is a .c file, executing"
       # compile and run. treat warnings as errors
       gcc "$file" -Werror && ./a.out
 
   # .py files
   elif [[ $file == *".py"* ]]; then
       echo "This is a .py file, executing"
       # run python file
       python3 "$file"
 
   # .js files
   elif [[ $file == *".js"* ]]; then
       echo "This is a .js file, executing"
       # run js file
       node "$file"
 
   # .sh files
   # but beware: don't execute this script itself if modified, otherwise we'll enter an endless loop!
   elif [[ "$file" == *".sh"* ]]; then
       if [[ "$file" != *"auto-check-changed-files.sh"* ]]; then
           echo "This is a .sh file, executing"
           chmod +x "$file" # give executable permission
           sh "$file"
       else
           echo "$file was modified. No need to execute"
       fi


Within the loop we detect the file types we want to check from the filename. For example, “.c” files are detected by the statement if [[ $file == *".c"* ]]; then.

The same technique is used to detect .py, .js and .sh files.
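A pattern such as *".c"* matches the text anywhere in the path, so a name like site.css would also be treated as a .c file, and app.json as a .js file. A minimal sketch of matching on the suffix only, assuming a hypothetical helper named file_kind that is not part of the original script:

```shell
#!/bin/bash
# Hypothetical helper: map a path to the kind of check it should receive,
# based only on how the filename ends.
file_kind() {
    case "$1" in
        *.c)  echo "c" ;;
        *.py) echo "python" ;;
        *.js) echo "javascript" ;;
        *.sh) echo "shell" ;;
        *)    echo "none" ;;
    esac
}
```

The same effect could be had inside the existing if statements by dropping the trailing wildcard, for example *".c" instead of *".c"*.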

Within the branch for each file type, we execute the file in the appropriate way. For .c files, this involves compiling the file using gcc and executing the resulting binary a.out. .py files are run using the Python interpreter, .js files are run using node, and .sh files are executed using the shell command sh.

Note that an extra step is performed on .sh files to ensure we don’t run the check on auto-check-changed-files.sh itself if it happens to be one of the changed files captured by the Get Changed Files action in files.json.
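The guard in the script is a substring match, so it would also skip any other file whose path merely contains that name. A minimal sketch of a stricter comparison on the filename alone, assuming a hypothetical helper named is_check_script:

```shell
#!/bin/bash
# Hypothetical helper: succeed only when the final component of the path
# is exactly the checking script's own filename.
is_check_script() {
    [ "$(basename -- "$1")" = "auto-check-changed-files.sh" ]
}
```

This compares the filename only, so a file with the same name in another directory would still be skipped.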

   # file types with no action
   else
       echo "No action needed on this file type!"
   fi
 
done

Finally, if the changed file's type is not recognised, or it is not a file we want to check, we echo a message stating that no action is needed.
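The script as shown keeps going when a check fails, and a shell script exits with the status of the last command it ran, so a failure on an earlier file may not be reflected in the result of the job. A minimal sketch of one way to record failures, assuming a hypothetical run_check helper and failures counter that are not in the original script:

```shell
#!/bin/bash
# Hypothetical counter and helper, not part of the original script.
failures=0

# Run a check command against one file and record whether it failed.
# Usage: run_check <file> <command> [args...]
run_check() {
    local file="$1"
    shift
    if "$@" "$file"; then
        echo "PASS: $file"
    else
        echo "FAIL: $file"
        failures=$((failures + 1))
    fi
}
```

Inside the loop, a call such as run_check "$file" python3 would replace the direct python3 invocation. Ending the script with exit 1 whenever failures is greater than zero would then mark the workflow step as failed.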

Putting it all together

Together, the two files auto-check-changed-files.yml and auto-check-changed-files.sh make up the GitHub Action, giving us a workflow that performs a basic sanity check on the contents of source files submitted to a repository.

Of course, the format provided may not suit every repository but it can be used as a template to extend the nature of the check performed on any files changed as part of a push or pull request.

The purpose of Check Changed Files Helper is to provide a functional workflow that can be customised to meet the needs of the repository.

Once again, review the diagram of the workflow structure to get the overall picture:

Workflow diagram of Check Changed Files Helper

Conclusion

Check Changed Files Helper is a GitHub Action designed to simplify the work of repository maintainers by providing a template to detect any changed files and perform any required checks according to the file type in an automated manner. 

The Check Changed Files Helper GitHub repository is available as open source under the MIT license. So fork, copy and alter the contents as needed for your own use.

Get creative and modify the workflow YAML and bash script to suit the needs of your repository!

Suggestions for improvements are also welcome: open an Issue on the repository.
