How To Solve The Knapsack Problem With Linear Programming

Sometime in 2023, I came across the Knapsack Problem, a resource allocation situation in which you need to determine and place the optimal combination of items in a knapsack.

Here, you are given a set of items, with each item having a weight (cost) and a price (gain). To make things interesting, the knapsack has a weight limit (constraint) indicating that only items on or below this limit can be placed in the knapsack. The task is to find out these items in such a way that we get the most (optimal) price

My first step was to research and identify solutions that have been defined previously. Authors like Shayan Hashemi in his article Neural Knapsack make use of a simple neural network-based model to identify what items should be picked. Shuo Wang explored several methods including dynamic programming to solve the problem.

These solutions are innovative and could be emulated, but I was not content as I assumed that a plausible solution should be borne from an optimization technique as opposed to simply predicting an output.

Further reading led me to Linear Programming, a mathematical optimization technique, which was what I needed.

What is Linear Programming?

Linear Programming is an optimization technique where the goal is to minimize or maximize an objective function subject to certain constraint(s). The objective function takes the form of a linear equation comprising decision variables and coefficients. The decision variables are unknown and they need to be discovered. The constraint(s) could either be inequalities or equalities depending on the nature of the problem.

A typical form of Linear Programming can be seen below:

$$\begin{align*} Maximize:z=2X_1+3X_2\\ Subject\hspace{0.3cm}to: X_1 + X_2 <= 10\\ \\ Variables:\hspace{0.3cm}X_1 >= 0\\ X_2 >= 0 \end{align*}$$

In this program, the objective function (z) to be maximized is. This equation is maximized when the optimal objective value has been discovered while ensuring the constraints are met.

There are primarily two ways of expressing a Linear Program. The first is the standard form which is described in the following way:

  1. The objective function is to be maximized;

  2. The decision variables are positive in expression;

  3. The constraints are expressed as inequalities;

  4. The inequalities are expressed as lesser-than-or-equal-to.

The second is the slack form in which:

  1. The objective function is also to be maximized;

  2. The decision variables are non-negative (positive);

  3. The constraints are expressed as equalities rather than inequalities. This means that the constraint above will take this form instead:

$$X1+X2+s=10$$

Slack variable (s) is added because the previous constraint is expressed as an inequality.

If you are new to Linear Optimization and you like to learn more, here are some articles that can get you started.

How Does Linear Optimization Solve The Knapsack Problem

Following our initial description of the problem let us assume we have the following items

$$items = [item_1, item_2, item_3, item_4, item_5]$$

Each item has the following weights (in Kilogrammes) and prices (in dollars):

$$\begin{align*} weights=[46,40,42,38,10]\\ prices=[12,19,19,15,8]\hspace{0.7cm} \end{align*}$$

We also know that the knapsack has a weight limit of 40 kg.

How can we use Linear Programming to determine which items should be placed in the knapsack?

The Objective Function

Since the objective function conveys the problem we are trying to solve. It will take this form:

$$\begin{align*} Maximize:\hspace{0.3cm}z=P_1*item_1 + P_2*item_2+P_3*item_3+P_4*item_4+P_5*item_5 \end{align*}$$

More compactly, our objective function can be represented as:

$$\begin{align*} Maximize:\hspace{0.3cm}z = P^Titems \end{align*}$$

Some things to note:

  • We wish to arrive at the optimal price (gain) in this problem so this is a maximization problem.

The Constraints

We know the knapsack has a limit of 40kg, this is our constraint. We can express this constraint as the following equation

$$\begin{align*} Subject\hspace{0.3cm}to:\hspace{0.3cm}W^Titem<=40\hspace{0.1cm}kg \\ \\ Variables:\hspace{0.3cm}0<=item_n<=1 \end{align*}$$

One interesting thing to note is that our decision variable (item) will take binary values. 0 indicates the item is not chosen, while 1 indicates the item is chosen to be placed in the knapsack.

Solving in Python

To solve this problem in Python, we will be using the PuLP. You can install it with this line of code pip install pulp

# importing the tools we need
import numpy as np
from pulp import * 

# creating the items
items = range(5)
weights = [46, 40, 42, 38, 10]
prices = [12, 19, 19, 15, 8]
constraint = 40

With the libraries imported and elements of the problem spelt out, we can then go ahead to instantiate a knapsack model with PuLP after which, we create our decision variables from the items.

# instantiating model
knapsack_model = LpProblem(name = 'knapsack-problem', sense = LpMaximize)

# creating the decision variables from the items

#####################################################################################################################
# Here we specify the prefix for the decision variables (e.g. item_1)                                               #
# The decision variables will take the value of 0 or 1, hence we specify Binary as the Category of variables        #                                                             #
# When we print the output of this line of computation, we get the following:                                       #                                                         #
#####################################################################################################################
decision_variables = LpVariable.dicts('item', items, cat = 'Binary')

#OUTPUT: {0: item_0, 1: item_1, 2: item_2, 3: item_3, 4: item_4}

We can now add the objective function and constraints to the model

obj_fn = lpSum(prices[i] * decision_variables[i] for i in items)

# OUTPUT: 12*item_0 + 19*item_1 + 19*item_2 + 15*item_3 + 8*item_4 + 0

# add to the model
knapsack_model += obj_fn

A preview of the model:

knapsack-problem:
MAXIMIZE
12*item_0 + 19*item_1 + 19*item_2 + 15*item_3 + 8*item_4 + 0
VARIABLES
0 <= item_0 <= 1 Integer
0 <= item_1 <= 1 Integer
0 <= item_2 <= 1 Integer
0 <= item_3 <= 1 Integer
0 <= item_4 <= 1 Integer

Now, we add the constraint and solve the problem using the solve() method

constraint = lpSum(W[i] * selected_items[i] for i in items) <= C
# OUTPUT: 46*item_0 + 40*item_1 + 42*item_2 + 38*item_3 + 10*item_4 + -40 <= 0

knapsack_model += constraint

knapsack_model.solve()

Results

LpStatus[knapsack_model]
# OUTPUT: 'Optimal'

knapsack_model.objective.value()
# OUTPUT: 19.0

def fetch_decision_variables(model):
    return [
        int(variable.varValue) for variable in model.variables()
    ]

fetch_decision_variables(knapsack_model)
# OUTPUT: [0, 1, 0, 0, 0]

Results from the model indicate the optimal solution is to pick only item 2. Not only does this particular item's weight fall below the constraint, but at that weight, it has the highest price achievable.

For just a single instance of the problem, it is easy to validate. But for one hundred, or even a thousand instances, it may be time-consuming to look through each instance at a time to validate.

One way out is to run the instances through this The brute force algorithm built by Shayan Hashemi thereafter, measure the performance of the model on the same instances.

Conclusion

In this article we went through an example of a resource allocation problem, the knapsack problem as described above. We also implemented a simple solution using linear programming, where we decided on the item(s) that could fit into the knapsack considering the weight constraint the knapsack has. As a plus, our model picked the optimal item for us.

You can take this a step further to apply this solution to any other resource allocation problem, it could even be the knapsack problem.

Let me know what you learn from this in the comments.