Binary Classification using the Gluon API

Posted on Sat 16 December 2017 in DL


In this short tutorial, I would like to introduce a relatively new high-level Python deep learning API: Gluon.

The Gluon API specification is an effort to improve speed, flexibility, and accessibility of deep learning technology for all developers, regardless of their deep learning framework of choice. The Gluon API offers a flexible interface that simplifies the process of prototyping, building, and training deep learning models without sacrificing training speed. It offers four distinct advantages:

  • Simple, Easy-to-Understand Code: Gluon offers a full set of plug-and-play neural network building blocks, including predefined layers, optimizers, and initializers.
  • Flexible, Imperative Structure: Gluon does not require the neural network model to be rigidly defined, but rather brings the training algorithm and model closer together to provide flexibility in the development process.
  • Dynamic Graphs: Gluon enables developers to define neural network models that are dynamic, meaning they can be built on the fly, with any structure, and using any of Python’s native control flow.
  • High Performance: Gluon provides all of the above benefits without impacting the training speed that the underlying engine provides.

Breast Cancer Classification

First, we need to load the required Python libraries.

import mxnet as mx
from mxnet import gluon, autograd, ndarray
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

As the title of this tutorial says, the goal is to predict whether a breast tumor is malignant or benign, based on the famous Breast Cancer Wisconsin (Diagnostic) Dataset. You can load it with scikit-learn using the code below.

from sklearn.datasets import load_breast_cancer
data = load_breast_cancer()
df = pd.DataFrame(data.data, columns=data.feature_names)  # feature matrix
y = data.target  # binary class labels (0 or 1)

Before training an artificial neural network, it is important to normalize the data. Otherwise, features on larger scales can dominate the gradient updates, and training may converge slowly or not at all. For normalization I'm using only pandas:

df_norm = (df - df.mean()) / (df.max() - df.min())
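
To see what this formula does, here is a minimal sketch on a made-up two-column table (the values and the names `toy` and `toy_norm` are illustrative, not taken from the dataset). It uses NumPy instead of pandas, but the arithmetic is the same: after mean normalization, every column is centered on zero and spans a range of exactly one, whatever its original scale.

```python
import numpy as np

# Toy feature matrix: two columns on very different scales (made-up values).
toy = np.array([[1.0, 100.0],
                [2.0, 300.0],
                [3.0, 500.0]])

# Mean normalization, column by column: (x - mean) / (max - min).
toy_norm = (toy - toy.mean(axis=0)) / (toy.max(axis=0) - toy.min(axis=0))

print(toy_norm)
```

Both columns end up with the same values even though the second one started out a hundred times larger.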

Time to split our dataset into train and test datasets.

X_train, X_test, y_train, y_test = train_test_split(df_norm, y, test_size=0.2, random_state=1111)
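
Conceptually, a random split just shuffles the row indices once and cuts them in two. The sketch below shows the idea in plain NumPy; `simple_split` is a hypothetical helper written for illustration, not scikit-learn's actual implementation, and it will not reproduce the same rows as `train_test_split` for a given seed.

```python
import numpy as np

def simple_split(features, labels, test_size=0.2, seed=1111):
    """Shuffle row indices once, then cut them into train and test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_test = int(np.ceil(len(features) * test_size))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return features[train_idx], features[test_idx], labels[train_idx], labels[test_idx]

# Made-up data: row i holds [2i, 2i + 1] and has label i % 2.
toy_X = np.arange(20).reshape(10, 2)
toy_y = np.arange(10) % 2

Xa, Xb, ya, yb = simple_split(toy_X, toy_y)
print(Xa.shape, Xb.shape)
```

The important property is that features and labels are indexed with the same shuffled indices, so each row keeps its own label.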

Next, I prefer to set hyperparameters of the model separately, in order to tune them fast.

LEARNING_R = 0.001
EPOCHS = 150
BATCH_SIZE = 32  # assumed value: it is used below, but its definition is missing from the post

Now, we have to prepare the data:

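# Pair each feature row with its label
train_dataset = gluon.data.ArrayDataset(X_train.values, y_train)
test_dataset = gluon.data.ArrayDataset(X_test.values, y_test)

# Serve the datasets in mini-batches; only the training data is shuffled
train_data = gluon.data.DataLoader(train_dataset,
                                   batch_size=BATCH_SIZE, shuffle=True)

test_data = gluon.data.DataLoader(test_dataset,
                                  batch_size=BATCH_SIZE, shuffle=False)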
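
If you have not used a data loader before, the following minimal sketch shows roughly what one does: it walks through the data in chunks of `batch_size` rows, optionally in shuffled order. `iterate_batches` is a hypothetical stand-in written in plain NumPy, not Gluon's implementation, and the data is made up.

```python
import numpy as np

def iterate_batches(features, labels, batch_size, shuffle, seed=0):
    """Yield (features, labels) chunks of at most batch_size rows."""
    order = np.arange(len(features))
    if shuffle:
        np.random.default_rng(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        yield features[idx], labels[idx]

toy_X = np.arange(20, dtype="float32").reshape(10, 2)
toy_y = np.arange(10, dtype="float32")

batch_shapes = [xb.shape for xb, _ in
                iterate_batches(toy_X, toy_y, batch_size=4, shuffle=True)]
print(batch_shapes)
```

With 10 rows and a batch size of 4, the last batch in this sketch holds only the 2 remaining rows.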

Finally, we can start to build our first deep learning model using the Gluon API. The model consists of four layers, where the third one is a Batch Normalization layer.

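net = gluon.nn.Sequential()

# Define the model architecture
with net.name_scope():
    net.add(gluon.nn.Dense(64, activation="relu"))
    net.add(gluon.nn.Dense(32, activation="relu"))
    net.add(gluon.nn.BatchNorm())
    net.add(gluon.nn.Dense(1, activation="sigmoid"))

# Initialize the parameters of the model
# (Xavier is one reasonable choice; the initializer is not shown in the post)
net.collect_params().initialize(mx.init.Xavier(), ctx=mx.cpu())

# Add the binary loss function. The last layer already applies a sigmoid,
# so tell the loss not to apply it a second time.
binary_cross_entropy = gluon.loss.SigmoidBinaryCrossEntropyLoss(from_sigmoid=True)

trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': LEARNING_R})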
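
For intuition about what a sigmoid binary cross-entropy loss computes, here is a minimal NumPy sketch of its two equivalent forms: one that takes probabilities (a sigmoid has already been applied) and one that takes raw scores. The helper names and the numbers are made up for illustration; this is not Gluon's code. Which form applies depends on whether the network's last layer already applies a sigmoid.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_from_probs(p, y, eps=1e-12):
    """Binary cross-entropy when p is already a probability in (0, 1)."""
    return -(y * np.log(p + eps) + (1.0 - y) * np.log(1.0 - p + eps))

def bce_from_logits(z, y):
    """Numerically stable form for raw scores z (no sigmoid applied yet)."""
    return np.maximum(z, 0.0) - z * y + np.log1p(np.exp(-np.abs(z)))

logits = np.array([-2.0, 0.0, 3.0])   # made-up raw scores
labels = np.array([0.0, 1.0, 1.0])    # made-up true labels

loss_a = bce_from_probs(sigmoid(logits), labels)
loss_b = bce_from_logits(logits, labels)
print(loss_a, loss_b)
```

Both forms give the same per-sample loss. Feeding probabilities into the raw-score form, or the other way round, silently computes something else, which is why the two should not be mixed up.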

Training time!

for e in range(EPOCHS):
    for i, (data, label) in enumerate(train_data):
        data = data.as_in_context(mx.cpu()).astype('float32')
        label = label.as_in_context(mx.cpu()).astype('float32')
        with autograd.record():  # Start recording the derivatives
            output = net(data)  # the forward pass
            loss = binary_cross_entropy(output, label)
        loss.backward()  # compute the gradients
        trainer.step(data.shape[0])  # update the parameters, normalized by batch size
        # Keep the mean loss of the latest batch for reporting
        curr_loss = ndarray.mean(loss).asscalar()
    # Provide stats on the improvement of the model every 20 epochs
    if e % 20 == 0:
        print("Epoch {}. Current Loss: {}.".format(e, curr_loss))
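
A training loop of this kind boils down to three steps per iteration: a forward pass, a backward pass that computes the gradients, and a parameter update. The sketch below spells those steps out by hand for a single-layer logistic model in NumPy, on made-up data and with an arbitrarily chosen learning rate. It illustrates the mechanics only and is not the Gluon model from this tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))            # made-up features
true_w = np.array([1.5, -2.0, 0.5])     # made-up "ground truth" weights
y = (X @ true_w > 0).astype(float)      # made-up binary labels

w = np.zeros(3)
lr = 0.5
losses = []
for epoch in range(50):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))  # forward pass
    loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    grad = X.T @ (p - y) / len(y)       # backward pass: gradient of the loss w.r.t. w
    w -= lr * grad                      # parameter update (plain gradient descent)
    losses.append(loss)

print(losses[0], losses[-1])
```

In Gluon, `autograd` takes care of the gradient computation and the `Trainer` applies the update, so you never write the gradient formula yourself.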

In order to predict, you have to iterate through the test dataset.

y_pred = np.array([])
for data, _ in test_data:  # labels are not needed for prediction
    data = data.as_in_context(mx.cpu()).astype('float32')
    output = net(data)
    y_pred = np.append(y_pred, output.asnumpy())

y_pred_labels = np.where(y_pred > 0.48, 1, 0)
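
The thresholding step turns each predicted probability into a hard 0/1 label, and accuracy is then simply the fraction of labels that match the truth. Here is a minimal sketch on made-up numbers (the arrays `probs` and `truth` are illustrative only):

```python
import numpy as np

probs = np.array([0.91, 0.12, 0.55, 0.47, 0.49])  # made-up model outputs
truth = np.array([1, 0, 1, 0, 0])                  # made-up true labels

# Same rule as above: anything above 0.48 counts as class 1.
pred_labels = np.where(probs > 0.48, 1, 0)

# Accuracy is the share of predictions that equal the true label.
accuracy = np.mean(pred_labels == truth)
print(pred_labels, accuracy)
```

The last sample (0.49) lands just above the threshold and is misclassified, which shows how sensitive borderline cases are to the cutoff you choose.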

Let's check the final accuracy score on unseen data.

print(accuracy_score(y_test, y_pred_labels))
>> 0.98245614035087714

Quite an impressive result. However, you can try to beat my score :)

The full code can be found here

More information regarding the Gluon API can be found in the reference.