---
title: Wide & Deep Learning (W&D)
keywords: fastai
sidebar: home_sidebar
nb_path: "nbs/models/tf/widedeep.ipynb"
---
class Linear [source]

Linear(*args, **kwargs) :: Layer
This is the class from which all layers inherit.

A layer is a callable object that takes as input one or more tensors and that outputs one or more tensors. It involves computation, defined in the `call()` method, and a state (weight variables), defined either in the constructor `__init__()` or in the `build()` method.

Users will just instantiate a layer and then treat it as a callable.
Args:

- `trainable`: Boolean, whether the layer's variables should be trainable.
- `name`: String name of the layer.
- `dtype`: The dtype of the layer's computations and weights. Can also be a `tf.keras.mixed_precision.Policy`, which allows the computation and weight dtype to differ. A default of `None` means to use `tf.keras.mixed_precision.global_policy()`, which is a float32 policy unless set to a different value.
- `dynamic`: Set this to `True` if your layer should only be run eagerly and should not be used to generate a static computation graph. This would be the case for a Tree-RNN or a recursive network, for example, or generally for any layer that manipulates tensors using Python control flow. If `False`, we assume that the layer can safely be used to generate a static computation graph.
Attributes:

- `name`: The name of the layer (string).
- `dtype`: The dtype of the layer's weights.
- `variable_dtype`: Alias of `dtype`.
- `compute_dtype`: The dtype of the layer's computations. Layers automatically cast inputs to this dtype, which causes the computations and output to also be in this dtype. When mixed precision is used with a `tf.keras.mixed_precision.Policy`, this will be different from `variable_dtype`.
- `dtype_policy`: The layer's dtype policy. See the `tf.keras.mixed_precision.Policy` documentation for details.
- `trainable_weights`: List of variables to be included in backprop.
- `non_trainable_weights`: List of variables that should not be included in backprop.
- `weights`: The concatenation of the lists `trainable_weights` and `non_trainable_weights` (in this order).
- `trainable`: Whether the layer should be trained (boolean), i.e. whether its potentially-trainable weights should be returned as part of `layer.trainable_weights`.
- `input_spec`: Optional (list of) `InputSpec` object(s) specifying the constraints on inputs that can be accepted by the layer.
We recommend that descendants of `Layer` implement the following methods:

- `__init__()`: Defines custom layer attributes, and creates layer state variables that do not depend on input shapes, using `add_weight()`.
- `build(self, input_shape)`: This method can be used to create weights that depend on the shape(s) of the input(s), using `add_weight()`. `__call__()` will automatically build the layer (if it has not been built yet) by calling `build()`.
- `call(self, inputs, *args, **kwargs)`: Called in `__call__` after making sure `build()` has been called. `call()` performs the logic of applying the layer to the input tensors (which should be passed in as arguments). Two reserved keyword arguments you can optionally use in `call()` are `training` (boolean, whether the call is in inference mode or training mode) and `mask` (boolean tensor encoding masked timesteps in the input, used in RNN layers); see the layer/model subclassing guide for more details. A typical signature for this method is `call(self, inputs)`, and the user can optionally add `training` and `mask` if the layer needs them. `*args` and `**kwargs` are only useful for future extension when more input parameters are planned to be added.
- `get_config(self)`: Returns a dictionary containing the configuration used to initialize this layer. If the keys differ from the arguments in `__init__`, then override `from_config(self)` as well. This method is used when saving the layer or a model that contains this layer (a small sketch appears after the examples below).

Examples:
Here's a basic example: a layer with two variables, `w` and `b`, that returns `y = w . x + b`. It shows how to implement `build()` and `call()`. Variables set as attributes of a layer are tracked as weights of the layers (in `layer.weights`).
```python
class SimpleDense(Layer):

  def __init__(self, units=32):
      super(SimpleDense, self).__init__()
      self.units = units

  def build(self, input_shape):  # Create the state of the layer (weights)
    w_init = tf.random_normal_initializer()
    self.w = tf.Variable(
        initial_value=w_init(shape=(input_shape[-1], self.units),
                             dtype='float32'),
        trainable=True)
    b_init = tf.zeros_initializer()
    self.b = tf.Variable(
        initial_value=b_init(shape=(self.units,), dtype='float32'),
        trainable=True)

  def call(self, inputs):  # Defines the computation from inputs to outputs
      return tf.matmul(inputs, self.w) + self.b

# Instantiates the layer.
linear_layer = SimpleDense(4)

# This will also call `build(input_shape)` and create the weights.
y = linear_layer(tf.ones((2, 2)))
assert len(linear_layer.weights) == 2

# These weights are trainable, so they're listed in `trainable_weights`:
assert len(linear_layer.trainable_weights) == 2
```
Note that the method `add_weight()` offers a shortcut to create weights:
```python
class SimpleDense(Layer):

  def __init__(self, units=32):
      super(SimpleDense, self).__init__()
      self.units = units

  def build(self, input_shape):
      self.w = self.add_weight(shape=(input_shape[-1], self.units),
                               initializer='random_normal',
                               trainable=True)
      self.b = self.add_weight(shape=(self.units,),
                               initializer='random_normal',
                               trainable=True)

  def call(self, inputs):
      return tf.matmul(inputs, self.w) + self.b
```
Besides trainable weights, updated via backpropagation during training, layers can also have non-trainable weights. These weights are meant to be updated manually during `call()`. Here's an example layer that computes the running sum of its inputs:
```python
class ComputeSum(Layer):

  def __init__(self, input_dim):
      super(ComputeSum, self).__init__()
      # Create a non-trainable weight.
      self.total = tf.Variable(initial_value=tf.zeros((input_dim,)),
                               trainable=False)

  def call(self, inputs):
      self.total.assign_add(tf.reduce_sum(inputs, axis=0))
      return self.total

my_sum = ComputeSum(2)
x = tf.ones((2, 2))
y = my_sum(x)
print(y.numpy())  # [2. 2.]
y = my_sum(x)
print(y.numpy())  # [4. 4.]

assert my_sum.weights == [my_sum.total]
assert my_sum.non_trainable_weights == [my_sum.total]
assert my_sum.trainable_weights == []
```
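The methods list above also mentions `get_config()`/`from_config()`, which none of the examples demonstrate. Here is a minimal hedged sketch, reusing the `SimpleDense` layer and assuming `units` is its only extra constructor argument:

```python
import tensorflow as tf
from tensorflow.keras.layers import Layer


class SimpleDense(Layer):

  def __init__(self, units=32, **kwargs):
      super(SimpleDense, self).__init__(**kwargs)
      self.units = units

  def build(self, input_shape):
      self.w = self.add_weight(shape=(input_shape[-1], self.units),
                               initializer='random_normal',
                               trainable=True)
      self.b = self.add_weight(shape=(self.units,),
                               initializer='zeros',
                               trainable=True)

  def call(self, inputs):
      return tf.matmul(inputs, self.w) + self.b

  def get_config(self):
      # Start from the base config (name, dtype, ...) and add our own argument.
      config = super(SimpleDense, self).get_config()
      config.update({'units': self.units})
      return config


# Round trip: the config is enough to rebuild an equivalent (unbuilt) layer.
layer = SimpleDense(16)
clone = SimpleDense.from_config(layer.get_config())
assert clone.units == 16
```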
For more information about creating layers, see the guide "Making new Layers and Models via subclassing".
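The docstring above is only the inherited `Layer` API and does not describe what this library's `Linear` layer actually computes. As a hedged sketch, assuming the wide component follows the common pattern of looking up one scalar weight per sparse feature id and summing the results into a single logit (consistent with the 200-parameter `linear` layer in the model summary further below, but not confirmed by the source), it could look like the following; `LinearSketch` and `feature_length` are hypothetical names:

```python
import tensorflow as tf
from tensorflow.keras.layers import Layer


class LinearSketch(Layer):
    """Hypothetical wide component: one scalar weight per sparse feature id."""

    def __init__(self, feature_length, **kwargs):
        super(LinearSketch, self).__init__(**kwargs)
        self.feature_length = feature_length  # total number of distinct feature ids

    def build(self, input_shape):
        # One trainable scalar weight per feature id.
        self.w = self.add_weight(name='w',
                                 shape=(self.feature_length, 1),
                                 initializer='zeros',
                                 trainable=True)

    def call(self, inputs):
        # inputs: integer ids with shape (batch, num_fields).
        # Look up each id's weight and sum over the fields -> (batch, 1) wide logit.
        return tf.reduce_sum(tf.nn.embedding_lookup(self.w, inputs), axis=1)


# Usage: two fields with ids in [0, 100) produce a single wide logit per example.
wide = LinearSketch(feature_length=100)
logit = wide(tf.constant([[3, 42], [7, 99]]))
assert logit.shape == (2, 1)
```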
class DNN [source]

DNN(*args, **kwargs) :: Layer
The inherited `Layer` docstring and examples for `DNN` are identical to those reproduced for `Linear` above.
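As with `Linear`, the inherited docstring does not say what `DNN` computes. A hedged sketch, assuming it is a plain multilayer perceptron sized by `hidden_units` with optional dropout (consistent with the `hidden_units=[8, 4, 2]`, `dnn_dropout=0.5`, and 182-parameter `dnn` layer in the test further below, though the activation and argument names here are assumptions):

```python
import tensorflow as tf
from tensorflow.keras.layers import Layer, Dense, Dropout


class DNNSketch(Layer):
    """Hypothetical deep component: an MLP sized by `hidden_units`."""

    def __init__(self, hidden_units, activation='relu', dropout=0.0, **kwargs):
        super(DNNSketch, self).__init__(**kwargs)
        self.hidden_layers = [Dense(units, activation=activation)
                              for units in hidden_units]
        self.dropout = Dropout(dropout)

    def call(self, inputs, training=False):
        x = inputs
        for layer in self.hidden_layers:
            x = layer(x)
        # Dropout is only applied while training.
        return self.dropout(x, training=training)


# Usage: a (batch, 16) concatenation of embeddings -> a (batch, 2) deep representation.
deep = DNNSketch(hidden_units=[8, 4, 2], dropout=0.5)
out = deep(tf.ones((3, 16)), training=False)
assert out.shape == (3, 2)
```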
class WideDeep [source]

WideDeep(*args, **kwargs) :: Model
`Model` groups layers into an object with training and inference features.

Args:

- `inputs`: The input(s) of the model: a `keras.Input` object or a list of `keras.Input` objects.
- `outputs`: The output(s) of the model. See the Functional API example below.
- `name`: String, the name of the model.

There are two ways to instantiate a `Model`:
1 - With the "Functional API", where you start from `Input`, you chain layer calls to specify the model's forward pass, and finally you create your model from inputs and outputs:
```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(3,))
x = tf.keras.layers.Dense(4, activation=tf.nn.relu)(inputs)
outputs = tf.keras.layers.Dense(5, activation=tf.nn.softmax)(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
```
Note: Only dicts, lists, and tuples of input tensors are supported. Nested inputs are not supported (e.g. lists of lists or dicts of dicts).
A new Functional API model can also be created by using the intermediate tensors. This enables you to quickly extract sub-components of the model.
Example:
```python
inputs = keras.Input(shape=(None, None, 3))
processed = keras.layers.RandomCrop(width=32, height=32)(inputs)
conv = keras.layers.Conv2D(filters=2, kernel_size=3)(processed)
pooling = keras.layers.GlobalAveragePooling2D()(conv)
feature = keras.layers.Dense(10)(pooling)

full_model = keras.Model(inputs, feature)
backbone = keras.Model(processed, conv)
activations = keras.Model(conv, feature)
```
Note that the `backbone` and `activations` models are not created with `keras.Input` objects, but with the tensors that originate from `keras.Input` objects. Under the hood, the layers and weights are shared across these models, so that the user can train `full_model` and use `backbone` or `activations` for feature extraction.

The inputs and outputs of the model can be nested structures of tensors as well, and the created models are standard Functional API models that support all the existing APIs.
2 - By subclassing the `Model` class: in that case, you should define your layers in `__init__()` and you should implement the model's forward pass in `call()`.
```python
import tensorflow as tf

class MyModel(tf.keras.Model):

  def __init__(self):
    super().__init__()
    self.dense1 = tf.keras.layers.Dense(4, activation=tf.nn.relu)
    self.dense2 = tf.keras.layers.Dense(5, activation=tf.nn.softmax)

  def call(self, inputs):
    x = self.dense1(inputs)
    return self.dense2(x)

model = MyModel()
```
If you subclass `Model`, you can optionally have a `training` argument (boolean) in `call()`, which you can use to specify a different behavior in training and inference:
```python
import tensorflow as tf

class MyModel(tf.keras.Model):

  def __init__(self):
    super().__init__()
    self.dense1 = tf.keras.layers.Dense(4, activation=tf.nn.relu)
    self.dense2 = tf.keras.layers.Dense(5, activation=tf.nn.softmax)
    self.dropout = tf.keras.layers.Dropout(0.5)

  def call(self, inputs, training=False):
    x = self.dense1(inputs)
    if training:
      x = self.dropout(x, training=training)
    return self.dense2(x)

model = MyModel()
```
Once the model is created, you can configure the model with losses and metrics with `model.compile()`, train the model with `model.fit()`, or use the model to do prediction with `model.predict()`.
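As a hedged usage sketch (the optimizer, loss, and random data below are illustrative assumptions, not taken from the source), compiling, fitting, and predicting with a subclassed model like the one above could look like this:

```python
import numpy as np
import tensorflow as tf

class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = tf.keras.layers.Dense(4, activation=tf.nn.relu)
        self.dense2 = tf.keras.layers.Dense(5, activation=tf.nn.softmax)

    def call(self, inputs):
        return self.dense2(self.dense1(inputs))

model = MyModel()
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Illustrative random data: 4-dimensional inputs, labels from 5 classes.
x = np.random.random((32, 4)).astype('float32')
y = np.random.randint(0, 5, size=(32,))

model.fit(x, y, epochs=1, batch_size=8, verbose=0)
preds = model.predict(x)  # class probabilities, shape (32, 5)
```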
```python
def test_model():
    user_features = {'feat': 'user_id', 'feat_num': 100, 'embed_dim': 8}
    seq_features = {'feat': 'item_id', 'feat_num': 100, 'embed_dim': 8}
    features = [user_features, seq_features]
    model = WideDeep(features, hidden_units=[8, 4, 2], dnn_dropout=0.5)
    model.summary()

test_model()
```
Model: "model" __________________________________________________________________________________________________ Layer (type) Output Shape Param # Connected to ================================================================================================== input_1 (InputLayer) [(None, 2)] 0 [] tf.__operators__.getitem (Slic (None,) 0 ['input_1[0][0]'] ingOpLambda) tf.__operators__.getitem_1 (Sl (None,) 0 ['input_1[0][0]'] icingOpLambda) embedding (Embedding) (None, 8) 800 ['tf.__operators__.getitem[0][0]' ] embedding_1 (Embedding) (None, 8) 800 ['tf.__operators__.getitem_1[0][0 ]'] tf.concat (TFOpLambda) (None, 16) 0 ['embedding[0][0]', 'embedding_1[0][0]'] tf.__operators__.add (TFOpLamb (None, 2) 0 ['input_1[0][0]'] da) dnn (DNN) (None, 2) 182 ['tf.concat[0][0]'] linear (Linear) (None, 1) 200 ['tf.__operators__.add[0][0]'] dense_3 (Dense) (None, 1) 3 ['dnn[0][0]'] tf.math.multiply (TFOpLambda) (None, 1) 0 ['linear[0][0]'] tf.math.multiply_1 (TFOpLambda (None, 1) 0 ['dense_3[0][0]'] ) tf.__operators__.add_1 (TFOpLa (None, 1) 0 ['tf.math.multiply[0][0]', mbda) 'tf.math.multiply_1[0][0]'] tf.math.sigmoid (TFOpLambda) (None, 1) 0 ['tf.__operators__.add_1[0][0]'] ================================================================================================== Total params: 1,985 Trainable params: 1,985 Non-trainable params: 0 __________________________________________________________________________________________________