How to integrate continuous API fuzzing into the CI/CD?

Sponsored Post

API security is a growing concern for businesses that offer or consume APIs. APIs, or application programming interfaces, allow different software systems to communicate and exchange data. They allow businesses to build integrations and connect with partners, customers, and other stakeholders.

However, as more sensitive data is being shared through APIs, it is essential to ensure that these interfaces are secure and protected from unauthorized access or manipulation.

In this blog post, we’ll discuss how continuous fuzzing can be a powerful tool to secure APIs and how developers can adopt a “secure by default” approach by integrating continuous fuzzing into SDLC processes.

Fuzzing can be applied to any function, but in this blog post we will look at how to fuzz REST API payloads using Go’s built-in fuzzing support.

What is Fuzzing?

Fuzzing, also known as fuzz testing, is a type of software testing that involves feeding invalid, unexpected, or random data to a program and observing how it responds. The goal of fuzz testing is to identify vulnerabilities in a program that attackers could potentially exploit.

Fuzzing is a powerful method for finding security vulnerabilities because it can simulate attacks that hackers might use to exploit a program. By sending a large number of different inputs to the program, fuzz testing can uncover vulnerabilities that other testing methods might not detect.

There are many ways to fuzz your code in a staging environment, but we want fuzzing to become part of our Software Development Life Cycle. To achieve this, we’ll use the Go programming language. Go 1.18 introduced fuzzing as part of the standard testing package, so we can write fuzz functions alongside our unit tests and “fuzz” our code without any external tool or library.

Please note that you don’t need to have unit tests for fuzzing. Nevertheless, having unit tests will help us as a base for our fuzz test.

Let’s start with a simple fuzzing example, shown in the following snippet. First, we feed the fuzzer a seed corpus (sample inputs). The fuzzing engine then calls the target function (Reverse) with inputs mutated from the seed data, and if the function fails at some point, the failure is reported.

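package main

import (
        "errors"
        "unicode/utf8"
)

// Reverse returns s with its runes in reverse order.
// It rejects input that is not valid UTF-8.
func Reverse(s string) (string, error) {
        if !utf8.ValidString(s) {
                return s, errors.New("input is not valid UTF-8")
        }
        r := []rune(s)
        for i, j := 0, len(r)-1; i < len(r)/2; i, j = i+1, j-1 {
                r[i], r[j] = r[j], r[i]
        }
        return string(r), nil
}

The fuzz test lives in a separate _test.go file in the same package:

package main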

import (
        "testing"
        "unicode/utf8"
)

func FuzzReverse(f *testing.F) {
        testcases := []string{"Hello, world", " ", "!12345"}
        for _, tc := range testcases {
                f.Add(tc) // Use f.Add to provide a seed corpus
        }
        f.Fuzz(func(t *testing.T, orig string) {
                rev, err1 := Reverse(orig)
                if err1 != nil {
                        return
                }
                doubleRev, err2 := Reverse(rev)
                if err2 != nil {
                        return
                }
                if orig != doubleRev {
                        t.Errorf("Before: %q, after: %q", orig, doubleRev)
                }
                if utf8.ValidString(orig) && !utf8.ValidString(rev) {
                        t.Errorf("Reverse produced invalid UTF-8 string %q", rev)
                }
        })
}

We need to keep in mind that fuzzing is an expensive operation: by default, it runs until it finds a failure or is interrupted. That’s the downside, so we should decide how frequently, and for how long, each function is fuzzed according to its criticality. Many critical applications (like Google Chrome) are fuzzed constantly. In this example, we’ll fuzz frequently (in each build) but for a very short time.

Luckily, the Go toolchain supports this through the fuzztime flag.

└> go test -v -fuzz . --fuzztime=30s
=== FUZZ  FuzzReverse
fuzz: elapsed: 0s, gathering baseline coverage: 0/47 completed
fuzz: elapsed: 0s, gathering baseline coverage: 47/47 completed, now fuzzing with 12 workers
fuzz: elapsed: 3s, execs: 697351 (232449/sec), new interesting: 0 (total: 47)
fuzz: elapsed: 6s, execs: 1448115 (250179/sec), new interesting: 0 (total: 47)
fuzz: elapsed: 9s, execs: 2151568 (234515/sec), new interesting: 0 (total: 47)
fuzz: elapsed: 12s, execs: 2837852 (228799/sec), new interesting: 0 (total: 47)
fuzz: elapsed: 15s, execs: 3516539 (226187/sec), new interesting: 1 (total: 48)
fuzz: elapsed: 18s, execs: 4197205 (226882/sec), new interesting: 1 (total: 48)
fuzz: elapsed: 21s, execs: 4859241 (220710/sec), new interesting: 1 (total: 48)
fuzz: elapsed: 24s, execs: 5493189 (211323/sec), new interesting: 1 (total: 48)
fuzz: elapsed: 27s, execs: 6156103 (220938/sec), new interesting: 2 (total: 49)
fuzz: elapsed: 30s, execs: 6827045 (223682/sec), new interesting: 3 (total: 50)
fuzz: elapsed: 30s, execs: 6827045 (0/sec), new interesting: 3 (total: 50)
--- PASS: FuzzReverse (30.09s)
PASS
ok      github.com/ckalpakoglu/fuzzing  30.094s

As the example above shows, fuzzing enables developers to test for the unexpected. It does not replace the need for other types of tests but rather complements them. It is a great way to increase test coverage and identify test cases.

From a security perspective, fuzzing continuously is essential for several reasons.

  • Identifying input validation vulnerabilities: APIs often rely on input validation to ensure that only valid data is accepted. Fuzz testing can help identify input validation vulnerabilities by sending many input values to the API and observing how it responds.
  • Testing for robustness: Fuzz testing can help developers determine whether their API is robust enough to handle various inputs, including invalid, unexpected, or malicious data. This can help ensure that the API is secure and can withstand attacks.
  • Uncovering hidden/logic vulnerabilities: Fuzz testing can help identify hidden/logic vulnerabilities in a program that might not be immediately apparent. By sending a large number of different inputs to the program, fuzz testing can uncover vulnerabilities that other testing methods might not detect.
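
To make the first point concrete, here is a minimal sketch of a fuzz test for an input validator. The validID function is a hypothetical stand-in written for this illustration (it is not the project's own util function), and the rule it enforces, 1 to 10 ASCII digits, is an assumption. Instead of comparing against fixed expected values, the fuzz test checks a property: any ID the validator accepts must also parse as an unsigned integer.

```go
package main

import (
	"strconv"
	"testing"
)

// validID is a hypothetical validator used only for this sketch:
// it accepts IDs made of 1 to 10 ASCII digits.
func validID(id string) bool {
	if len(id) == 0 || len(id) > 10 {
		return false
	}
	for _, r := range id {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// FuzzValidID checks a property instead of fixed expected values:
// every ID that the validator accepts must parse as an unsigned integer.
func FuzzValidID(f *testing.F) {
	// Seeds mirror the kinds of IDs used in the unit tests: valid, too long, non-numeric.
	for _, seed := range []string{"1111", "1212121212121212121212121111", "s1111", ""} {
		f.Add(seed)
	}
	f.Fuzz(func(t *testing.T, id string) {
		if !validID(id) {
			return // rejected input is not interesting for this property
		}
		if _, err := strconv.ParseUint(id, 10, 64); err != nil {
			t.Errorf("validID accepted %q, but it does not parse: %v", id, err)
		}
	})
}
```

If the validator ever accepted something like "12a" or a string with invalid UTF-8, the parse step would fail and the fuzzer would report the offending input.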

Let’s use this approach to fuzz our REST endpoints and add fuzz tests into the DevOps pipeline to run it on every build.

The example API can be found at https://github.com/kondukto-io/simple-fuzzing. The project layout is simple and self-explanatory:


├── cmd
│   └── server.go
├── handlers
│   ├── db.go
│   ├── handlers.go
│   ├── user.go
│   └── user_test.go
├── main.go
└── util
    └── util.go

For the sake of this blog post, we will focus on the /handlers directory, but first, let’s take a look at “server.go”.

The code is pretty straightforward, and to keep it even simpler, we have two handlers: CreateUser and GetUserByID.

package cmd

import (
	"database/sql"

	"github.com/labstack/echo/v4"
	"github.com/labstack/echo/v4/middleware"
	_ "github.com/mattn/go-sqlite3"

	"github.com/kondukto-io/simple-fuzzing/handlers"
)

const (
	port = ":8888"
)

func Execute() error {
	// setup the database
	db, err := sql.Open("sqlite3", "file::memory:?cache=shared")
	if err != nil {
		panic(err)
	}
	defer db.Close()

	e := echo.New()
	// middlewares
	e.Use(middleware.Logger())

	// run the db migration. This should run once
	err = handlers.MigrateDB(db)
	if err != nil {
		panic(err)
	}

	// Initialize the handlers
	h := handlers.NewHandler(db)

	// Routes
	e.POST("/create", h.CreateUser)
	e.GET("/user/:id", h.GetUserByID)

	return e.Start(port)
}

The idea is to write a fuzz test for each endpoint, and to do that, we need to look at the handler function.

import (
        "net/http"

        "github.com/labstack/echo/v4"

        "github.com/kondukto-io/simple-fuzzing/util"
)

type User struct {
        ID    string `json:"id"`
        Name  string `json:"name"`
        Email string `json:"email"`
}

func (h *Handler) CreateUser(c echo.Context) error {
        u := new(User)
        if err := c.Bind(u); err != nil {
                // in production, you should not return the raw error message
                return &echo.HTTPError{Code: http.StatusBadRequest, Message: err.Error()}
        }

        stmt, err := h.db.Prepare("INSERT INTO users(id, name, email) values (?, ?, ?)")
        if err != nil {
                // in production, you should not return the raw error message
                return &echo.HTTPError{Code: http.StatusBadRequest, Message: err.Error()}
        }

        defer stmt.Close()

        _, err = stmt.Exec(u.ID, u.Name, u.Email)
        if err != nil {
                // in production, you should not return the raw error message
                return &echo.HTTPError{Code: http.StatusBadRequest, Message: err.Error()}
        }
        return c.JSON(http.StatusOK, u)
}


//...<snipped>... 

The handler does the following:

  1. There is a User model (struct) with three fields.
  2. The handler expects a JSON request and maps the data with the User model.
  3. The given User input is inserted into the “USERS” table in the database.
  4. If an error occurs, the function returns an HTTP “400 - BadRequest” status code with the error message.

As you can see, there is no data validation in the handler, but the INSERT operation uses a parameterized query. The fuzz test for this handler, in user_test.go, looks like this:

package handlers

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"regexp"
	"testing"
	"unicode/utf8"

	"github.com/DATA-DOG/go-sqlmock"
	"github.com/labstack/echo/v4"

	"github.com/kondukto-io/simple-fuzzing/util"
)

var (
	// we use test cases for the unit tests
	// and for fuzz test as a seed corpus
	tests = []struct {
		name    string
		args    User
		wantErr bool
	}{
		{
			name: "success",
			args: User{
				ID:    "1111",
				Name:  "kondukto",
				Email: "helo@kondukto.io",
			},
			wantErr: false,
		},
		{
			name: "fail",
			args: User{
				ID:    "1212121212121212121212121111",
				Name:  "kondukto",
				Email: "helo@kondukto.io",
			},
			wantErr: true,
		},
		{
			name: "fail",
			args: User{
				ID:    "s1111", // not a valid ID
				Name:  "kondukto",
				Email: "helo@kondukto.io",
			},
			wantErr: true,
		},
	}
)

func FuzzCreateUser(f *testing.F) {
	// setup the db
	db, mock, err := sqlmock.New()
	if err != nil {
		f.Fatalf("an error '%s' was not expected when opening a mock db conn", err)
	}
	defer db.Close()

	for _, tt := range tests {
		f.Add(tt.args.ID, tt.args.Name, tt.args.Email)
	}

	f.Fuzz(func(t *testing.T, id, name, email string) {
		if !util.VaildID(id) || !utf8.ValidString(name) || !utf8.ValidString(email) {
			return
		}

		mock.ExpectPrepare(regexp.QuoteMeta("INTO users(id, name, email) values (?, ?, ?)"))

		h := NewHandler(db)
		input := User{
			ID:    id,
			Name:  name,
			Email: email,
		}

		t.Log(input)

		body, err := json.Marshal(input)
		if err != nil {
			t.Fatalf("error %v", err)
		}

		e := echo.New()
		req := httptest.NewRequest(http.MethodPost, "/", bytes.NewReader(body))
		req.Header.Set(echo.HeaderContentType, echo.MIMEApplicationJSON)
		rec := httptest.NewRecorder()
		c := e.NewContext(req, rec)
		c.SetPath("/create")

		mock.ExpectExec(regexp.QuoteMeta("INSERT INTO users(id, name, email) values (?, ?, ?)")).
			WithArgs(input.ID, input.Name, input.Email).WillReturnResult(sqlmock.NewResult(1, 1))

		// testing the function
		if err := h.CreateUser(c); err != nil {
			t.Errorf("CreateUser() err = %v", err)
		}

		// ensure all expectations have been met
		if err = mock.ExpectationsWereMet(); err != nil {
			fmt.Printf("unmet expectation error: %s", err)
		}
	})
}

Ideally, we prefer to derive our fuzz tests from unit tests to maintain the structure as is. It is easier, and adding more test cases will increase the fuzzer’s seed corpus.

Finally, we run the fuzz test and wait for a crash. When an input causes a failure, Go’s fuzzing engine writes it to the testdata/fuzz directory of the package, under a subdirectory named after the fuzz test. Inputs stored there are run again on every subsequent go test invocation, so a crash that has been found once also serves as a regression test.

As we discussed previously, fuzzing is a never-ending process. That’s why fuzzing only the "critical" endpoints can be a good option.

└> go test -v -fuzz=FuzzCreateUser --fuzztime=10s .
=== RUN   TestCreateUser
=== RUN   TestCreateUser/success
=== RUN   TestCreateUser/fail
=== RUN   TestCreateUser/fail#01
--- PASS: TestCreateUser (0.00s)
    --- PASS: TestCreateUser/success (0.00s)
    --- PASS: TestCreateUser/fail (0.00s)
    --- PASS: TestCreateUser/fail#01 (0.00s)
=== RUN   TestGetUserByID
=== RUN   TestGetUserByID/success
=== PAUSE TestGetUserByID/success
=== RUN   TestGetUserByID/fail
=== PAUSE TestGetUserByID/fail
=== RUN   TestGetUserByID/fail#01
=== PAUSE TestGetUserByID/fail#01
=== CONT  TestGetUserByID/success
=== CONT  TestGetUserByID/fail#01
=== CONT  TestGetUserByID/fail
--- PASS: TestGetUserByID (0.00s)
    --- PASS: TestGetUserByID/fail#01 (0.00s)
    --- PASS: TestGetUserByID/fail (0.00s)
    --- PASS: TestGetUserByID/success (0.00s)
=== RUN   FuzzGetUserByID
=== RUN   FuzzGetUserByID/seed#0
    user_test.go:179:    ==== value is: 1111
=== RUN   FuzzGetUserByID/seed#1
=== RUN   FuzzGetUserByID/seed#2
--- PASS: FuzzGetUserByID (0.00s)
    --- PASS: FuzzGetUserByID/seed#0 (0.00s)
    --- PASS: FuzzGetUserByID/seed#1 (0.00s)
    --- PASS: FuzzGetUserByID/seed#2 (0.00s)
=== FUZZ  FuzzCreateUser
fuzz: elapsed: 0s, gathering baseline coverage: 0/168 completed
fuzz: elapsed: 0s, gathering baseline coverage: 168/168 completed, now fuzzing with 12 workers
fuzz: elapsed: 3s, execs: 66723 (22240/sec), new interesting: 6 (total: 174)
fuzz: elapsed: 6s, execs: 112971 (15416/sec), new interesting: 7 (total: 175)
fuzz: elapsed: 9s, execs: 134377 (7133/sec), new interesting: 7 (total: 175)
fuzz: elapsed: 11s, execs: 141507 (3532/sec), new interesting: 7 (total: 175)
--- PASS: FuzzCreateUser (11.03s)
PASS
ok      github.com/kondukto-io/simple-fuzzing/handlers  11.036s

Finally, we can add these fuzz tests to our CI/CD pipeline to continuously fuzz our endpoints on each build or each PR.

Note that go test currently fuzzes only one target per invocation: the pattern passed to the fuzz flag must match exactly one fuzz test. We therefore fuzz each handler in a separate step.

name: My workflow

# Controls when the action will run. 
on:
  push:
    branches: [ master ]

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Check out the repo
        uses: actions/checkout@v3

      - uses: actions/setup-go@v3
        with:
          go-version: '1.19'

      - name: Build
        run: go build -v ./...

      - name: Test
        run: go test -v ./...

      - name: Fuzz Create User handler
        run: go test -v -fuzz=FuzzCreateUser --fuzztime=20s ./handlers

      - name: Fuzz GetUserByID handler
        run: go test -v -fuzz=FuzzGetUserByID --fuzztime=20s ./handlers
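
As the number of handlers grows, maintaining one pipeline step per fuzz target becomes tedious. One possible alternative, sketched below, is a small Go helper that asks go test to list the fuzz targets of a package and then fuzzes them one at a time. The function names parseFuzzTargets and fuzzAll are hypothetical and not part of the example project; the sketch assumes that the go binary is on the PATH.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// parseFuzzTargets extracts fuzz test names from the output of
// `go test -list`, skipping the trailing "ok <package> <time>" line.
func parseFuzzTargets(listOutput string) []string {
	var targets []string
	for _, line := range strings.Split(listOutput, "\n") {
		name := strings.TrimSpace(line)
		if strings.HasPrefix(name, "Fuzz") {
			targets = append(targets, name)
		}
	}
	return targets
}

// fuzzAll fuzzes every fuzz target in pkg for fuzztime each, one target
// per `go test` invocation, and stops at the first failure.
func fuzzAll(pkg, fuzztime string) error {
	out, err := exec.Command("go", "test", "-list", "^Fuzz", pkg).Output()
	if err != nil {
		return fmt.Errorf("listing fuzz targets: %w", err)
	}
	for _, target := range parseFuzzTargets(string(out)) {
		// -run=^$ skips the unit tests; the anchored -fuzz pattern
		// makes sure exactly one fuzz test is selected.
		cmd := exec.Command("go", "test", "-run=^$",
			"-fuzz=^"+target+"$", "-fuzztime="+fuzztime, pkg)
		cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
		if err := cmd.Run(); err != nil {
			return fmt.Errorf("fuzzing %s: %w", target, err)
		}
	}
	return nil
}
```

Called from a main function as fuzzAll("./handlers", "20s"), this would replace the two explicit fuzz steps with a single one. The trade-off is that the pipeline log no longer shows one named step per handler.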

Conclusion

Testing is crucial for increasing the quality of the software we develop, and fuzzing is an effective and proven method for finding bugs in software.

From a security engineering perspective, fuzz testing can be an effective way to achieve a “secure by default” approach in development.

In this blog post, we wanted to show you an alternative approach to improve the “security culture” among developers and how continuous fuzzing in the pipeline can be used as a security measure in (API) development.

Next time we will introduce some vulnerabilities in the API and hope to find them with fuzz testing.

Feel free to reach out to us if you have any questions about how to implement a DevSecOps pipeline from scratch.
