Core functionalities¶
Computation Graph¶
The ComputationGraph is the workhorse of DyNet. From the DyNet technical report:
[The] computation graph represents symbolic computation, and the results of the computation are evaluated lazily: the computation is only performed once the user explicitly asks for it (at which point a “forward” computation is triggered). Expressions that evaluate to scalars (i.e. loss values) can also be used to trigger a “backward” computation, computing the gradients of the computation with respect to the parameters.
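As a minimal sketch of this lazy-evaluation model in the C++ API (the exact initialization boilerplate may vary across DyNet versions):

#include <dynet/dynet.h>
#include <dynet/expr.h>

using namespace dynet;

int main(int argc, char** argv) {
  initialize(argc, argv);
  ParameterCollection m;
  Parameter pW = m.add_parameters({1, 3});

  ComputationGraph cg;
  Expression W = parameter(cg, pW);                // parameter node
  Expression x = input(cg, {3}, {1.f, 2.f, 3.f});  // input node
  Expression loss = squared_norm(W * x);           // purely symbolic so far
  float l = as_scalar(cg.forward(loss));           // "forward" computation triggered here
  cg.backward(loss);                               // gradients w.r.t. the parameters
  return 0;
}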
-
int dynet::get_number_of_active_graphs()¶ Gets the number of active graphs.
This is 0 or 1; you can't create more than one graph at a time.
- Return
- Number of active graphs
-
unsigned dynet::get_current_graph_id()¶ Get the id of the current active graph.
This can help check whether a graph is stale.
- Return
- Id of the current graph
-
struct dynet::ComputationGraph¶ - #include <dynet.h>
Computation graph where nodes represent forward and backward intermediate values, and edges represent functions of multiple values.
To represent the fact that a function may have multiple arguments, edges have a single head and 0, 1, 2, or more tails. (Constants, inputs, and parameters are represented as functions of 0 parameters.) Example: given the function z = f(x, y), z, x, and y are nodes, and there is an edge representing f which points to the z node (i.e., its head); x and y are the tails of the edge. You shouldn't need to use most methods of ComputationGraph other than backward, since most of them are available directly from the Expression class.
Public Functions
-
ComputationGraph()¶ Default constructor.
-
VariableIndex add_input(real s, Device *device)¶ Add scalar input.
The computational network will pull inputs in from the user's data structures and make them available to the computation.
- Return
- The index of the created variable
- Parameters
- s: Real number
- device: The device on which to place the input value
-
VariableIndex add_input(const real *ps, Device *device)¶ Add scalar input by pointer.
The computational network will pull inputs in from the user's data structures and make them available to the computation.
- Return
- The index of the created variable
- Parameters
- ps: Pointer to a real number
- device: The device on which to place the input value
-
VariableIndex add_input(const Dim &d, const std::vector<float> &data, Device *device)¶ Add multidimensional input.
The computational network will pull inputs in from the user's data structures and make them available to the computation.
- Return
- The index of the created variable
- Parameters
- d: Desired shape of the input
- data: Input data (as a 1-dimensional array)
- device: The device on which to place the input value
-
VariableIndex add_input(const Dim &d, const std::vector<float> *pdata, Device *device)¶ Add multidimensional input by pointer.
The computational network will pull inputs in from the user's data structures and make them available to the computation.
- Return
- The index of the created variable
- Parameters
- d: Desired shape of the input
- pdata: Pointer to the input data (as a 1-dimensional array)
- device: The device on which to place the input value
-
VariableIndex add_input(const Dim &d, const std::vector<unsigned int> &ids, const std::vector<float> &data, Device *device, float defdata = 0.f)¶ Add sparse input.
The computational network will pull inputs in from the user's data structures and make them available to the computation. Represents specified (not learned) inputs to the network in sparse array format, with an optional default value.
- Return
- The index of the created variable
- Parameters
- d: Desired shape of the input
- ids: The indices of the data points to update
- data: The data points corresponding to each index
- device: The device on which to place the input value
- defdata: The default value for the unspecified data points
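Sketched usage of these overloads (most user code goes through the dynet::input(...) expression wrappers, which call these methods internally):

dynet::ComputationGraph cg;
dynet::Device* dev = dynet::default_device;

// Scalar and dense vector inputs.
dynet::VariableIndex s = cg.add_input(4.2f, dev);
dynet::VariableIndex v = cg.add_input(dynet::Dim({3}), std::vector<float>{1.f, 2.f, 3.f}, dev);

// Sparse input: a length-100 vector that is 0.5 everywhere except at
// indices 3 and 17, which hold 1.0 and -1.0.
dynet::VariableIndex sp = cg.add_input(dynet::Dim({100}), {3u, 17u}, {1.f, -1.f}, dev, 0.5f);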
-
VariableIndex add_parameters(Parameter p)¶ Add a parameter to the computation graph.
- Return
- The index of the created variable
- Parameters
- p: Parameter to be added
-
VariableIndex add_parameters(LookupParameter p)¶ Add a full matrix of lookup parameters to the computation graph.
- Return
- The index of the created variable
- Parameters
- p: LookupParameter to be added
-
VariableIndex add_const_parameters(Parameter p)¶ Add a parameter to the computation graph (but don't update it)
- Return
- The index of the created variable
- Parameters
- p: Parameter to be added
-
VariableIndex add_const_parameters(LookupParameter p)¶ Add a full matrix of lookup parameters to the computation graph (but don't update them)
- Return
- The index of the created variable
- Parameters
- p: LookupParameter to be added
-
VariableIndex add_lookup(LookupParameter p, const unsigned *pindex)¶ Add a lookup parameter to the computation graph.
Use pindex to point to a caller-owned memory location where the index lives.
- Return
- The index of the created variable
- Parameters
- p: Lookup parameter from which to pick
- pindex: Pointer to the index to lookup
-
VariableIndex add_lookup(LookupParameter p, unsigned index)¶ Add a lookup parameter to the computation graph.
- Return
- The index of the created variable
- Parameters
- p: Lookup parameter from which to pick
- index: Index to lookup
-
VariableIndex add_lookup(LookupParameter p, const std::vector<unsigned> *pindices)¶ Add lookup parameters to the computation graph.
Use pindices to point to a caller-owned memory location where the indices live.
- Return
- The index of the created variable
- Parameters
- p: Lookup parameter from which to pick
- pindices: Pointer to the indices to lookup
-
VariableIndex add_lookup(LookupParameter p, const std::vector<unsigned> &indices)¶ Add lookup parameters to the computation graph.
- Return
- The index of the created variable
- Parameters
- p: Lookup parameter from which to pick
- indices: Indices to lookup
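A short sketch of the lookup methods above (the dynet::lookup(...) expression wrappers are the usual entry point):

dynet::ParameterCollection m;
dynet::LookupParameter emb = m.add_lookup_parameters(1000, dynet::Dim({64}));

dynet::ComputationGraph cg;
dynet::VariableIndex one = cg.add_lookup(emb, 42u);   // a single embedding
std::vector<unsigned> idx = {1u, 2u, 3u};
dynet::VariableIndex many = cg.add_lookup(emb, idx);  // batched lookup of several rows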
-
VariableIndex add_const_lookup(LookupParameter p, const unsigned *pindex)¶ Add a lookup parameter to the computation graph.
Just like add_lookup, but don't optimize the lookup parameters.
- Return
- The index of the created variable
- Parameters
- p: Lookup parameter from which to pick
- pindex: Pointer to the index to lookup
-
VariableIndex add_const_lookup(LookupParameter p, unsigned index)¶ Add a lookup parameter to the computation graph.
Just like add_lookup, but don't optimize the lookup parameters.
- Return
- The index of the created variable
- Parameters
- p: Lookup parameter from which to pick
- index: Index to lookup
-
VariableIndex add_const_lookup(LookupParameter p, const std::vector<unsigned> *pindices)¶ Add lookup parameters to the computation graph.
Just like add_lookup, but don't optimize the lookup parameters.
- Return
- The index of the created variable
- Parameters
- p: Lookup parameter from which to pick
- pindices: Pointer to the indices to lookup
-
VariableIndex add_const_lookup(LookupParameter p, const std::vector<unsigned> &indices)¶ Add lookup parameters to the computation graph.
Just like add_lookup, but don't optimize the lookup parameters.
- Return
- The index of the created variable
- Parameters
- p: Lookup parameter from which to pick
- indices: Indices to lookup
- template <class Function>
-
VariableIndex add_function(const std::initializer_list<VariableIndex> &arguments)¶ Add a function to the computation graph.
This is what is called when creating an expression.
- Return
- The index of the output variable
- Parameters
- arguments: List of the argument indices
- Template Parameters
- Function: Function to be applied
- template <class Function, typename... Args>
-
VariableIndex add_function(const std::initializer_list<VariableIndex> &arguments, Args&&... side_information)¶ Add a function to the computation graph (with side information)
This is what is called when creating an expression.
- Return
- The index of the output variable
- Parameters
- arguments: List of the argument indices
- side_information: Side information that is needed to compute the function
- Template Parameters
- Function: Function to be applied
-
void clear()¶ Reset the ComputationGraph to a newly created state.
-
void checkpoint()¶ Set a checkpoint.
-
void revert()¶ Revert to the last checkpoint.
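A sketch of how checkpointing can be used to grow the graph speculatively (e.g. to score a search candidate) and then roll back:

dynet::ComputationGraph cg;
std::vector<float> vals(8, 0.5f);
dynet::Expression h = dynet::input(cg, {8}, vals);
cg.checkpoint();                                  // remember the current graph state
dynet::Expression cand = dynet::tanh(h);          // candidate-specific nodes
float score = dynet::as_scalar(cg.forward(dynet::sum_elems(cand)));
cg.revert();                                      // discard everything added since checkpoint()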
-
Dim & get_dimension(VariableIndex index) const¶ Get dimension of a node.
- Return
- Dimension
- Parameters
- index: Variable index of the node
-
const Tensor & forward(const Expression &last)¶ Run complete forward pass from first node to given one, ignoring all precomputed values.
- Return
- Value of the last Expression after execution
- Parameters
- last: Expression up to which the forward pass must be computed
-
const Tensor & forward(VariableIndex i)¶ Run complete forward pass from first node to given one, ignoring all precomputed values.
- Return
- Value of the end Node after execution
- Parameters
- i: Variable index of the node up to which the forward pass must be computed
-
const Tensor & incremental_forward(const Expression &last)¶ Run forward pass from the last computed node to given one.
Useful if you want to add nodes and evaluate just the new parts.
- Return
- Value of the last Expression after execution
- Parameters
- last: Expression up to which the forward pass must be computed
-
const Tensor & incremental_forward(VariableIndex i)¶ Run forward pass from the last computed node to given one.
Useful if you want to add nodes and evaluate just the new parts.
- Return
- Value of the end Node after execution
- Parameters
- i: Variable index of the node up to which the forward pass must be computed
-
const Tensor & get_value(VariableIndex i)¶ Get forward value for node at index i.
Performs forward evaluation if not available (may compute more than strictly what is needed).
- Return
- Requested value
- Parameters
- i: Index of the variable from which you want the value
-
const Tensor & get_value(const Expression &e)¶ Get forward value for the given expression.
Performs forward evaluation if not available (may compute more than strictly what is needed).
- Return
- Requested value
- Parameters
- e: Expression from which you want the value
-
const Tensor & get_gradient(VariableIndex i)¶ Get gradient for node at index i.
Performs backward pass if not available (may compute more than strictly what is needed).
- Return
- Requested gradient
- Parameters
- i: Index of the variable from which you want the gradient
-
const Tensor & get_gradient(const Expression &e)¶ Get the gradient for the given expression.
Performs backward pass if not available (may compute more than strictly what is needed).
- Return
- Requested gradient
- Parameters
- e: Expression from which you want the gradient
-
void invalidate()¶ Clears forward caches (for get_value etc.).
-
void backward(const Expression &last, bool full = false)¶ Computes backward gradients from the front-most evaluated node.
The parameter full specifies whether the gradients should be computed for all nodes (true) or only non-constant nodes (false).
By default, a node is constant unless
- it is a parameter node, or
- it depends on a non-constant node.
Thus, functions of constants and inputs are considered constants.
Turn full on if you want to retrieve gradients w.r.t. inputs, for instance. By default this is turned off, so that the backward pass ignores nodes which have no influence on gradients w.r.t. parameters, for efficiency.
- Parameters
- last: Expression from which to compute the gradient
- full: Whether to compute all gradients (including with respect to constant nodes)
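For instance, retrieving the gradient of a loss w.r.t. an input requires full=true, since input nodes are constants by default (a sketch):

dynet::ComputationGraph cg;
dynet::Expression x = dynet::input(cg, {3}, {1.f, 2.f, 3.f});
dynet::Expression loss = dynet::squared_norm(x);
cg.forward(loss);
cg.backward(loss, /*full=*/true);                 // also populate input gradients
std::vector<float> dx = dynet::as_vector(cg.get_gradient(x));  // dx[i] == 2 * x[i]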
-
void backward(VariableIndex i, bool full = false)¶ Computes backward gradients from node i (assuming it has already been evaluated).
The parameter full specifies whether the gradients should be computed for all nodes (true) or only non-constant nodes (false).
By default, a node is constant unless
- it is a parameter node, or
- it depends on a non-constant node.
Thus, functions of constants and inputs are considered constants.
Turn full on if you want to retrieve gradients w.r.t. inputs, for instance. By default this is turned off, so that the backward pass ignores nodes which have no influence on gradients w.r.t. parameters, for efficiency.
- Parameters
- i: Index of the node from which to compute the gradient
- full: Whether to compute all gradients (including with respect to constant nodes)
-
void print_graphviz() const¶ Used for debugging.
-
unsigned get_id() const¶ Get the unique graph ID.
This ID is incremented by 1 each time a computation graph is created.
- Return
- Graph id
Nodes¶
Nodes are constituents of the computation graph. The end user doesn’t interact with Nodes but with Expressions.
However, implementing new operations requires creating a new subclass of the Node class, described below.
-
struct dynet::Node¶ - #include <dynet.h>
Represents an SSA variable.
Contains information on the computation node: arguments, output value, and gradient of the output with respect to the function. This class must be inherited from to implement any new operation; see nodes.cc for examples. An operation on expressions can then be created from the new Node; see expr.h/expr.cc for examples.
Subclassed by dynet::Abs, dynet::AddVectorToAllColumns, dynet::AffineTransform, dynet::Average, dynet::AverageColumns, dynet::BinaryLogLoss, dynet::BlockDropout, dynet::Concatenate, dynet::ConcatenateToBatch, dynet::Constant, dynet::ConstantMinusX, dynet::ConstantPlusX, dynet::ConstParameterNode, dynet::ConstScalarMultiply, dynet::Conv2D, dynet::Cube, dynet::CwiseMultiply, dynet::CwiseQuotient, dynet::CwiseSum, dynet::DotProduct, dynet::Dropout, dynet::DropoutBatch, dynet::DropoutDim, dynet::Erf, dynet::Exp, dynet::ExponentialLinearUnit, dynet::Filter1DNarrow, dynet::FlipGradient, dynet::FoldRows, dynet::GaussianNoise, dynet::Hinge, dynet::HingeDim, dynet::HuberDistance, dynet::Identity, dynet::InnerProduct3D_1D, dynet::InnerProduct3D_1D_1D, dynet::InputNode, dynet::KMaxPooling, dynet::KMHNGram, dynet::L1Distance, dynet::L2Norm, dynet::Log, dynet::LogDet, dynet::LogGamma, dynet::LogisticSigmoid, dynet::LogSoftmax, dynet::LogSumExp, dynet::MatrixInverse, dynet::MatrixMultiply, dynet::Max, dynet::MaxDimension, dynet::MaxPooling1D, dynet::MaxPooling2D, dynet::Min, dynet::MinDimension, dynet::MomentBatches, dynet::MomentDimension, dynet::MomentElements, dynet::Negate, dynet::NoBackprop, dynet::PairwiseRankLoss, dynet::ParameterNodeBase, dynet::PickBatchElements, dynet::PickElement, dynet::PickNegLogSoftmax, dynet::PickRange, dynet::PoissonRegressionLoss, dynet::Pow, dynet::RandomBernoulli, dynet::RandomGumbel, dynet::RandomNormal, dynet::RandomUniform, dynet::Rectify, dynet::Reshape, dynet::RestrictedLogSoftmax, dynet::ScalarInputNode, dynet::SelectCols, dynet::SelectRows, dynet::Softmax, dynet::SoftSign, dynet::SparseInputNode, dynet::Sparsemax, dynet::SparsemaxLoss, dynet::Sqrt, dynet::Square, dynet::SquaredEuclideanDistance, dynet::SquaredNorm, dynet::StdBatches, dynet::StdDimension, dynet::StdElements, dynet::Sum, dynet::SumBatches, dynet::SumDimension, dynet::SumElements, dynet::Tanh, dynet::ToDevice, dynet::TraceOfProduct, dynet::Transpose, dynet::VanillaLSTMC, dynet::VanillaLSTMGates, dynet::VanillaLSTMH, dynet::WeightNormalization
Public Functions
-
virtual Dim dim_forward(const std::vector<Dim> &xs) const = 0¶ Compute dimensions of result for given dimensions of inputs.
Also checks to make sure inputs are compatible with each other.
- Return
- Dimension of the output
- Parameters
- xs: Vector containing the dimensions of the inputs
-
virtual std::string as_string(const std::vector<std::string> &args) const = 0¶ Returns important information for debugging.
See nodes-conv.cc for examples.
- Return
- String description of the node
- Parameters
- args: String descriptions of the arguments
-
size_t aux_storage_size() const¶ Size of the auxiliary storage.
In general, this will return an empty size, but if a component needs to store extra information in the forward pass for use in the backward pass, it can request the memory here (n.b. you could put it on the Node object, but in general, edges should not allocate tensor memory since memory is managed centrally for the entire computation graph).
- Return
- Size
-
virtual void forward_impl(const std::vector<const Tensor *> &xs, Tensor &fx) const = 0¶ Forward computation.
This function contains the logic for the forward pass. Some implementation remarks from nodes.cc:
- fx can be understood as a pointer to the (preallocated) location for the result of forward to be stored
- fx is not initialized, so after calling forward fx must point to the correct answer
- fx can be repointed to an input, if forward(x) evaluates to x (e.g., in reshaping)
- scalar results of forward are placed in fx.v[0]
- DyNet manages its own memory, not Eigen, and it is configured with the EIGEN_NO_MALLOC option. If you get an error about Eigen attempting to allocate memory, it is (probably) because of an implicit creation of a temporary variable. To tell Eigen this is not necessary, the noalias() method is available. If you really do need a temporary variable, its capacity must be requested via Node::aux_storage_size
Note on debugging problems with differentiable components:
- fx is uninitialized when forward is called - are you relying on it being 0?
- Parameters
- xs: Pointers to the inputs
- fx: Pointer to the (preallocated) location for the result of forward to be stored
-
virtual void backward_impl(const std::vector<const Tensor *> &xs, const Tensor &fx, const Tensor &dEdf, unsigned i, Tensor &dEdxi) const = 0¶ Accumulates the derivative of E with respect to the ith argument to f, that is, xs[i].
This function contains the logic for the backward pass. Some implementation remarks from nodes.cc:
- dEdxi MUST ACCUMULATE a result, since multiple calls to forward may depend on the same x_i. Even, e.g., Identity must be implemented as dEdx1 += dEdf. THIS IS EXTREMELY IMPORTANT
- scalar results of forward are placed in fx.v[0]
- DyNet manages its own memory, not Eigen, and it is configured with the EIGEN_NO_MALLOC option. If you get an error about Eigen attempting to allocate memory, it is (probably) because of an implicit creation of a temporary variable. To tell Eigen this is not necessary, the noalias() method is available. If you really do need a temporary variable, its capacity must be requested via Node::aux_storage_size
Note on debugging problems with differentiable components:
- dEdxi must accumulate (see the first point above!)
- Parameters
- xs: Pointers to inputs
- fx: Output
- dEdf: Gradient of the objective w.r.t the output of the node
- i: Index of the input w.r.t which we take the derivative
- dEdxi: Gradient of the objective w.r.t the input of the node
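As a schematic illustration of these two methods, a hypothetical componentwise-square node might look as follows. This is a CPU-only sketch against the interface documented above (real nodes in nodes.cc additionally go through DyNet's device-dispatch machinery, and DyNet already ships its own Square node):

#include <dynet/dynet.h>
#include <stdexcept>
#include <string>
#include <vector>

struct MySquare : public dynet::Node {
  explicit MySquare(const std::initializer_list<dynet::VariableIndex>& a)
      : dynet::Node(a) {}

  dynet::Dim dim_forward(const std::vector<dynet::Dim>& xs) const override {
    if (xs.size() != 1) throw std::invalid_argument("MySquare takes one input");
    return xs[0];  // output has the same shape as the input
  }

  std::string as_string(const std::vector<std::string>& args) const override {
    return "my_square(" + args[0] + ")";
  }

  void forward_impl(const std::vector<const dynet::Tensor*>& xs,
                    dynet::Tensor& fx) const override {
    fx.tvec() = xs[0]->tvec().square();  // y = x^2, elementwise
  }

  void backward_impl(const std::vector<const dynet::Tensor*>& xs,
                     const dynet::Tensor& fx, const dynet::Tensor& dEdf,
                     unsigned i, dynet::Tensor& dEdxi) const override {
    // MUST accumulate: dE/dx += dE/dy * 2x
    dEdxi.tvec() += dEdf.tvec() * xs[0]->tvec() * 2.f;
  }
};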
-
virtual bool supports_multibatch() const¶ Whether this node supports computing multiple batches in one call.
If true, forward and backward will be called once with a multi-batch tensor. If false, forward and backward will be called multiple times for each item.
- Return
- Support for multibatch
-
virtual bool supports_multidevice() const¶ Whether this node supports processing inputs/outputs on multiple devices.
DyNet will throw an error if you try to process inputs and outputs on different devices unless this is activated.
- Return
- Support for multi-device
-
void forward(const std::vector<const Tensor *> &xs, Tensor &fx) const¶ Perform the forward pass in one or multiple calls.
- Parameters
- xs: Pointers to the inputs
- fx: Pointer to the (preallocated) location for the result of forward to be stored
-
void backward(const std::vector<const Tensor *> &xs, const Tensor &fx, const Tensor &dEdf, unsigned i, Tensor &dEdxi) const¶ Perform the backward pass in one or multiple calls.
- Parameters
- xs: Pointers to inputs
- fx: Output
- dEdf: Gradient of the objective w.r.t the output of the node
- i: Index of the input w.r.t which we take the derivative
- dEdxi: Gradient of the objective w.r.t the input of the node
-
virtual int autobatch_sig(const ComputationGraph &cg, SigMap &sm) const¶ Signature for automatic batching.
This will be equal only for nodes that can be combined. Returns 0 for unbatchable functions.
-
virtual std::vector<int> autobatch_concat(const ComputationGraph &cg) const¶ Which inputs can be batched.
This will be true for inputs that should be concatenated when autobatching, and false for inputs that should be shared among all batches.
-
virtual Node * autobatch_pseudo_node(const ComputationGraph &cg, const std::vector<VariableIndex> &batch_ids) const¶ Create a pseudo-node for autobatching.
This will combine together multiple nodes into one big node for the automatic batching functionality. When a node representing one component of the mini-batch can be used as-is, it is OK to just return the null pointer; otherwise we should make the appropriate changes and return a new node.
-
virtual void autobatch_reshape(const ComputationGraph &cg, const std::vector<VariableIndex> &batch_ids, const std::vector<int> &concat, std::vector<const Tensor *> &xs, Tensor &fx) const¶ Reshape the tensors for autobatching.
Takes in info and reshapes the dimensions of xs (for which “concat” is true) and fx. By default does no reshaping, which is OK for componentwise operations.
-
void autobatch_reshape_concatonly(const ComputationGraph &cg, const std::vector<VariableIndex> &batch_ids, const std::vector<int> &concat, std::vector<const Tensor *> &xs, Tensor &fx) const¶ Reshape the tensors for autobatching.
Takes in info and reshapes the dimensions of xs (for which “concat” is true) and fx by concatenating their batches.
-
unsigned arity() const¶ Number of arguments to the function.
- Return
- Arity of the function
Public Members
-
std::vector<VariableIndex> args¶ Dependency structure
-
void * aux_mem¶ This will usually be null, but if your node needs to store intermediate values between forward and backward, you can store them here. Request the number of bytes you need from aux_storage_size(). Note: this memory will be on the CPU or GPU, depending on your computation backend.
Parameters and Model¶
Parameters are things that are optimized. In contrast to a system like Torch, where computational modules may have their own parameters, in DyNet parameters are just parameters.
To deal with sparse updates, there are two parameter classes:
- Parameters represents a vector, matrix, (eventually higher order tensors) of parameters. These are densely updated.
- LookupParameters represents a table of vectors that are used to embed a set of discrete objects. These are sparsely updated.
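A short sketch of the two classes in use:

dynet::ParameterCollection m;
dynet::Parameter W       = m.add_parameters({8, 4});            // dense matrix, updated as a whole
dynet::LookupParameter E = m.add_lookup_parameters(1000, {8});  // table of 1000 embeddings

dynet::ComputationGraph cg;
dynet::Expression We = dynet::parameter(cg, W);   // dense use
dynet::Expression e5 = dynet::lookup(cg, E, 5u);  // row 5 only: sparsely updated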
-
struct dynet::ParameterStorageBase¶ - #include <model.h>
This is the base class for ParameterStorage and LookupParameterStorage, the objects handling the actual parameters.
You can access the storage from any Parameter (resp. LookupParameter) class; use it only to do low-level manipulations.
Subclassed by dynet::LookupParameterStorage, dynet::ParameterStorage
Public Functions
-
virtual void scale_parameters(float a) = 0¶ Scale the parameters.
- Parameters
- a: scale factor
-
virtual void scale_gradient(float a) = 0¶ Scale the gradient.
- Parameters
- a: scale factor
-
virtual void zero() = 0¶ Set the parameters to 0.
-
virtual void squared_l2norm(float *sqnorm) const = 0¶ Get the parameter squared l2 norm.
- Parameters
- sqnorm: Pointer to the float holding the result
-
virtual void g_squared_l2norm(float *sqnorm) const = 0¶ Get the squared l2 norm of the gradient w.r.t. these parameters.
- Parameters
- sqnorm: Pointer to the float holding the result
-
virtual bool is_updated() const = 0¶ Check whether the parameters are updated.
-
virtual bool has_grad() const = 0¶ Check whether the gradient is zero or not (true if gradient is non-zero)
-
virtual size_t size() const = 0¶ Get the size (number of scalar parameters)
- Return
- Number of scalar parameters
-
struct dynet::ParameterStorage¶ - #include <model.h>
Storage class for Parameters.
Inherits from dynet::ParameterStorageBase
Subclassed by dynet::ParameterStorageCreator
Public Functions
-
void copy(const ParameterStorage &val)¶ Copy from another ParameterStorage.
- Parameters
- val: ParameterStorage to copy from
-
void accumulate_grad(const Tensor &g)¶ Add a tensor to the gradient.
After this method gets called, grads <- grads + g.
- Parameters
- g: Tensor to add
-
void clear()¶ Clear the gradient (set it to 0)
-
void clip(float left, float right)¶ Clip the values to the range [left, right].
Public Members
-
std::string name¶ Name of this parameter
-
bool updated¶ Whether this is updated
-
bool nonzero_grad¶ Whether the gradient is nonzero
-
ParameterCollection * owner¶ Pointer to the collection that “owns” this parameter
-
struct dynet::LookupParameterStorage¶ - #include <model.h>
Storage class for LookupParameters.
Inherits from dynet::ParameterStorageBase
Subclassed by dynet::LookupParameterStorageCreator
Public Functions
-
void initialize(unsigned index, const std::vector<float> &val)¶ Initialize one particular lookup.
- Parameters
- index: Index of the lookup to initialize
- val: Values
-
void copy(const LookupParameterStorage &val)¶ Copy from another LookupParameterStorage.
- Parameters
- val: Other LookupParameterStorage to copy from
-
void accumulate_grad(const Tensor &g)¶ Add a Tensor to the gradient of the whole lookup matrix.
After this, grads <- grads + g.
- Parameters
- g: Tensor to add
-
void accumulate_grad(unsigned index, const Tensor &g)¶ Add a Tensor to the gradient of one of the lookups.
After this, grads[index] <- grads[index] + g.
- Parameters
- index: Index of the lookup to update
- g: Tensor to add
-
void accumulate_grads(unsigned n, const unsigned *ids_host, const unsigned *ids_dev, float *g)¶ Add tensors to multiple lookups.
After this method gets called, grads[ids_host[i]] <- grads[ids_host[i]] + g[i*dim.size():(i+1)*dim.size()]
- Parameters
- n: Size of ids_host
- ids_host: Indices of the gradients to update
- ids_dev: [To be documented] (only for GPU)
- g: Values
Public Members
-
std::string name¶ Name of this parameter
-
std::unordered_set<unsigned> non_zero_grads¶ Gradients are sparse, so track which components are nonzero
-
bool updated¶ Whether this lookup parameter should be updated
-
bool nonzero_grad¶ Whether the gradient is nonzero
-
ParameterCollection * owner¶ Pointer to the collection that “owns” this parameter
-
struct dynet::Parameter¶ - #include <model.h>
Object representing a trainable parameter.
This object acts as a high-level component linking the actual parameter values (ParameterStorage) and the ParameterCollection. As long as you don't want to do low-level hacks at the ParameterStorage level, this is what you will use.
Public Functions
-
Parameter()¶ Default constructor.
-
Parameter(std::shared_ptr<ParameterStorage> p)¶ Constructor.
This is called by the model; you shouldn't need to use it.
- Parameters
- p: Shared pointer to the parameter storage
-
ParameterStorage & get_storage() const¶ Get underlying ParameterStorage object.
- Return
- ParameterStorage holding the parameter values
-
string get_fullname() const¶ Get the full name of the ParameterStorage object.
-
void zero()¶ Zero the parameters.
-
float current_weight_decay() const¶ Get the current weight decay for the parameters.
-
void set_updated(bool b)¶ Set the parameter as updated.
- Parameters
- b: Update status
-
void scale(float s)¶ Scales the parameter (multiplies by s).
- Parameters
- s: scale
-
void scale_gradient(float s)¶ Scales the gradient (multiplies by s).
- Parameters
- s: scale
-
bool is_updated()¶ Check the update status.
- Return
- Update status
-
void clip_inplace(float left, float right)¶ Clip the values of the parameter to the range [left, right] (in place)
-
void set_value(const std::vector<float> &val)¶ Set the values of the parameter.
Public Members
-
std::shared_ptr<ParameterStorage> p¶ Pointer to the storage for this Parameter
-
struct dynet::LookupParameter¶ - #include <model.h>
Object representing a trainable lookup parameter.
Public Functions
-
LookupParameterStorage & get_storage() const¶ Get underlying LookupParameterStorage object.
- Return
- LookupParameterStorage holding the parameter values
-
void initialize(unsigned index, const std::vector<float> &val) const¶ Initialize one particular column.
- Parameters
- index: Index of the column to be initialized
- val: Values
-
void zero()¶ Zero the parameters.
-
string get_fullname() const¶ Get the full name of the ParameterStorage object.
-
float current_weight_decay() const¶ Get the current weight decay for the parameters.
-
void scale(float s)¶ Scales the parameter (multiplies by s).
- Parameters
- s: scale
-
void scale_gradient(float s)¶ Scales the gradient (multiplies by s).
- Parameters
- s: scale
-
void set_updated(bool b)¶ Set the parameter as updated.
- Parameters
- b: Update status
-
bool is_updated()¶ Check the update status.
- Return
- Update status
Public Members
-
std::shared_ptr<LookupParameterStorage> p¶ Pointer to the storage for this Parameter
-
class dynet::ParameterCollection¶ - #include <model.h>
This is a collection of parameters.
If you need a matrix of parameters or a lookup table, ask an instance of this class. This class knows how to serialize itself. Parameters know how to track their gradients, but any extra information (like velocity) will live here.
Subclassed by dynet::Model
Public Functions
-
ParameterCollection()¶ Constructor.
-
float gradient_l2_norm() const¶ Returns the L2 norm of your gradient.
Use this to look for gradient vanishing/exploding.
- Return
- L2 norm of the gradient
-
void reset_gradient()¶ Sets all gradients to zero.
-
Parameter add_parameters(const Dim &d, float scale = 0.0f, const std::string &name = "", Device *device = dynet::default_device)¶ Add parameters to model and returns Parameter object.
Creates a ParameterStorage object holding a tensor of dimension d and returns a Parameter object (to be used as input in the computation graph). The coefficients are sampled according to the scale parameter.
- Return
- Parameter object to be used in the computation graph
- Parameters
- d: Shape of the parameter
- scale: If scale is non-zero, initializes according to \(\mathcal U([-\mathrm{scale},+\mathrm{scale}])\); otherwise uses Glorot initialization
- name: Name of the parameter
- device: Device placement for the parameter
-
Parameter add_parameters(const Dim &d, Device *device)¶ Add parameters to model and returns Parameter object.
Creates a ParameterStorage object holding a tensor of dimension d and returns a Parameter object (to be used as input in the computation graph).
- Return
- Parameter object to be used in the computation graph
- Parameters
- d: Shape of the parameter
- device: Device placement for the parameter
-
Parameter add_parameters(const Dim &d, const std::string &name, Device *device = dynet::default_device)¶ Add parameters to model and returns Parameter object.
Creates a ParameterStorage object holding a tensor of dimension d and returns a Parameter object (to be used as input in the computation graph).
- Return
- Parameter object to be used in the computation graph
- Parameters
- d: Shape of the parameter
- name: Name of the parameter
- device: Device placement for the parameter
-
Parameter add_parameters(const Dim &d, const ParameterInit &init, const std::string &name = "", Device *device = dynet::default_device)¶ Add parameters with custom initializer.
- Return
- Parameter object to be used in the computation graph
- Parameters
- d: Shape of the parameter
- init: Custom initializer
- name: Name of the parameter
- device: Device placement for the parameter
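The overloads above in use, sketched with initializers documented later in this section:

dynet::ParameterCollection m;
dynet::Parameter a = m.add_parameters({100, 50});        // scale = 0: Glorot initialization
dynet::Parameter b = m.add_parameters({100, 50}, 0.5f);  // uniform in [-0.5, 0.5]
dynet::Parameter c = m.add_parameters({100, 50}, dynet::ParameterInitNormal(0.f, 1.f), "W");  // custom init, named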
-
std::vector<std::shared_ptr<ParameterStorageBase>> get_parameter_storages_base() const¶ Get the parameter storage bases in the current model.
- Return
- List of pointers to ParameterStorageBase objects
-
std::shared_ptr<ParameterStorage> get_parameter_storage(const std::string &pname)¶ Get parameter in current model.
It is not recommended to use this.
- Return
- Pointer to the ParameterStorage object
-
std::vector<std::shared_ptr<ParameterStorage>> get_parameter_storages() const¶ Get parameters in current model.
- Return
- List of pointers to ParameterStorage objects
-
LookupParameter add_lookup_parameters(unsigned n, const Dim &d, const std::string &name = "", Device *device = dynet::default_device)¶ Add lookup parameter to model.
Same as add_parameters. Initializes with Glorot.
- Return
- LookupParameter object to be used in the computation graph
- Parameters
- n: Number of lookup indices
- d: Dimension of each embedding
- name: Name of the parameter
- device: Device placement for the parameter
-
LookupParameter add_lookup_parameters(unsigned n, const Dim &d, const ParameterInit &init, const std::string &name = "", Device *device = dynet::default_device)¶ Add lookup parameter with custom initializer.
- Return
- LookupParameter object to be used in the computation graph
- Parameters
- n: Number of lookup indices
- d: Dimension of each embedding
- init: Custom initializer
- name: Name of the parameter
- device: Device placement for the parameter
-
std::shared_ptr<LookupParameterStorage> get_lookup_parameter_storage(const std::string &lookup_pname)¶ Get lookup parameter in current model.
It is not recommended to use this.
- Return
- Pointer to the LookupParameterStorage object
-
std::vector<std::shared_ptr<LookupParameterStorage>> get_lookup_parameter_storages() const¶ Get lookup parameters in current model.
- Return
- List of pointers to LookupParameterStorage objects
-
void project_weights(float radius = 1.0f)¶ Project weights so their L2 norm = radius.
NOTE (Paul): I am not sure this is doing anything currently. The argument doesn't seem to be used anywhere... If you need this, raise an issue on github.
- Parameters
- radius: Target norm
-
void set_weight_decay_lambda(float lambda)¶ Set the weight decay coefficient.
- Parameters
- lambda: Weight decay coefficient
-
const std::vector<std::shared_ptr<ParameterStorage>> & parameters_list() const¶ Returns the list of shared pointers to ParameterStorages.
You shouldn't need to use this.
- Return
- List of shared pointers to ParameterStorages
-
const std::vector<std::shared_ptr<LookupParameterStorage>> & lookup_parameters_list() const¶ Returns the list of pointers to LookupParameterStorages.
You shouldn't need to use this.
- Return
- List of pointers to LookupParameterStorages
-
size_t parameter_count() const¶ Returns the total number of tunable parameters (i.e. scalars) contained within this model.
That is to say, a 2x2 matrix counts as four parameters.
- Return
- Number of parameters
-
size_t updated_parameter_count() const¶ Returns total number of (scalar) parameters updated.
- Return
- Number of updated parameters
-
void set_updated_param(const Parameter *p, bool status)¶ Set the update status of a parameter.
- Parameters
- p: Parameter to modify
- status: Update status
-
void set_updated_lookup_param(const LookupParameter *p, bool status)¶ Set the update status of a lookup parameter.
- Parameters
- p: LookupParameter to modify
- status: Update status
-
bool is_updated_param(const Parameter *p)¶ Get the update status of a parameter.
- Return
- Update status
- Parameters
- p: Parameter to check
-
bool is_updated_lookup_param(const LookupParameter *p)¶ Get the update status of a lookup parameter.
- Return
- Update status
- Parameters
- p: LookupParameter to check
-
ParameterCollection add_subcollection(const std::string &name = "")¶ Add a sub-collection.
This will allow you to add a ParameterCollection that is a (possibly named) subset of the original collection. This is useful if you want to save/load/update only part of the parameters in the model.
- Return
- The subcollection
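For example (a sketch; the exact namespace string depends on DyNet's naming scheme):

dynet::ParameterCollection m;
dynet::ParameterCollection enc = m.add_subcollection("encoder");
dynet::Parameter W = enc.add_parameters({64, 64});
// W.get_fullname() now includes the sub-collection's namespace.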
-
size_t size()¶ Get size.
Get the number of parameters in the ParameterCollection.
-
std::string get_fullname() const¶ Get the namespace of the current ParameterCollection object (ends with a slash).
-
L2WeightDecay & get_weight_decay()¶ Get the weight decay object.
-
struct dynet::ParameterInit¶ - #include <param-init.h>
Initializers for parameters.
Allows for custom parameter initialization
Subclassed by dynet::ParameterInitConst, dynet::ParameterInitFromFile, dynet::ParameterInitFromVector, dynet::ParameterInitGlorot, dynet::ParameterInitIdentity, dynet::ParameterInitNormal, dynet::ParameterInitSaxe, dynet::ParameterInitUniform
Public Functions
-
ParameterInit()¶ Default constructor.
-
virtual void initialize_params(Tensor &values) const = 0¶ Function called upon initialization.
Whenever you inherit this struct to implement your own custom initializer, this is the function you want to overload to implement your logic.
- Parameters
- values: The tensor to be initialized. You should modify it in-place. See dynet/model.cc for some examples
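A sketch of a custom initializer (this duplicates what dynet::ParameterInitConst already provides; it is shown only to illustrate overloading initialize_params):

#include <dynet/param-init.h>
#include <dynet/tensor.h>

struct ParameterInitAllOnes : public dynet::ParameterInit {
  void initialize_params(dynet::Tensor& values) const override {
    dynet::TensorTools::constant(values, 1.f);  // fill the tensor in-place
  }
};

// Usage: m.add_parameters({10, 10}, ParameterInitAllOnes());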
-
struct dynet::ParameterInitNormal¶ - #include <param-init.h>
Initialize parameters with samples from a normal distribution.
Inherits from dynet::ParameterInit
Public Functions
-
ParameterInitNormal(float m = 0.0f, float v = 1.0f)¶ Constructor.
- Parameters
- m: Mean of the gaussian distribution
- v: Variance of the gaussian distribution (reminder: the variance is the square of the standard deviation)
-
struct dynet::ParameterInitUniform¶ - #include <param-init.h>
Initialize parameters with samples from a uniform distribution.
Inherits from dynet::ParameterInit
Public Functions
-
ParameterInitUniform(float scale)¶ Constructor for uniform distribution centered on 0.
Samples parameters from \(\mathcal U([-\mathrm{scale},+\mathrm{scale}])\)
- Parameters
- scale: Scale of the distribution
-
ParameterInitUniform(float l, float r)¶ Constructor for uniform distribution in a specific interval.
- Parameters
- l: Lower bound of the interval
- r: Upper bound of the interval
-
struct dynet::ParameterInitConst¶ - #include <param-init.h>
Initialize parameters with a constant value.
Inherits from dynet::ParameterInit
Public Functions
-
ParameterInitConst(float c)¶ Constructor.
- Parameters
- c: Constant value
-
struct dynet::ParameterInitIdentity¶ - #include <param-init.h>
Initialize as the identity.
This will raise an exception if used on non-square matrices.
Inherits from dynet::ParameterInit
Public Functions
-
ParameterInitIdentity()¶ Constructor.
-
struct dynet::ParameterInitGlorot¶ - #include <param-init.h>
Initialize with the methods described in Glorot, 2010
In order to preserve the variance of the forward and backward flow across layers, the parameters \(\theta\) are initialized such that \(\mathrm{Var}(\theta)=\frac 2 {n_1+n_2}\) where \(n_1,n_2\) are the input and output dim. Important note: the underlying distribution is uniform (not gaussian)
Note: This is also known as Xavier initialization
Inherits from dynet::ParameterInit
Public Functions
-
ParameterInitGlorot(bool is_lookup = false, float gain = 1.f)¶ Constructor.
- Parameters
- is_lookup: Boolean value identifying the parameter as a LookupParameter
- gain: Scaling parameter. In order for the Glorot initialization to be correct, you should set this equal to \(\frac 1 {f'(0)}\) where \(f\) is your activation function
-
struct dynet::ParameterInitSaxe¶ - #include <param-init.h>
Initializes according to Saxe et al., 2014
Initializes as a random orthogonal matrix (unimplemented for GPU)
Inherits from dynet::ParameterInit
Public Functions
-
ParameterInitSaxe(float gain = 1.0)¶ Constructor.
-
struct dynet::ParameterInitFromFile¶ - #include <param-init.h>
Initializes from a file.
Useful for reusing weights, etc.
Inherits from dynet::ParameterInit
Public Functions
-
ParameterInitFromFile(std::string f)¶ Constructor.
- Parameters
- f: File name (format should just be a list of values)
-
struct dynet::ParameterInitFromVector¶ - #include <param-init.h>
Initializes from a std::vector of floats.
Inherits from dynet::ParameterInit
Public Functions
-
ParameterInitFromVector(std::vector<float> v)¶ Constructor.
- Parameters
- v: Vector of values to be used
Tensor¶
Tensor objects provide a bridge between C++ data structures and Eigen Tensors for multidimensional data.
Concretely, as an end user you will obtain a tensor object after calling .value() on an expression. You can then use the functions described below to convert these tensors to floats, arrays of floats, to save and load the values, etc.
Conversely, when implementing low-level nodes (e.g. for new operations), you will need to retrieve Eigen tensors from DyNet tensors in order to perform efficient computation.
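For the end-user side, a sketch of the typical conversions:

dynet::ComputationGraph cg;
dynet::Expression v = dynet::input(cg, {3}, {1.f, 2.f, 3.f});
dynet::Expression s = dynet::sum_elems(v);
cg.forward(s);

float total           = dynet::as_scalar(s.value());  // order-0 tensor to float
std::vector<float> xs = dynet::as_vector(v.value());  // flattened values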
-
std::ostream & dynet::operator<<(std::ostream &os, const Tensor &t)¶ You can use cout<<tensor; for debugging or saving.
- Parameters
- os: Output stream
- t: Tensor
-
real dynet::as_scalar(const Tensor &t)¶ Get a scalar value from an order 0 tensor.
Throws a runtime_error exception if the tensor has more than one element.
TODO: Change for custom invalid dimension exception maybe?
- Return
- Scalar value
- Parameters
- t: Input tensor
-
std::vector<real> dynet::as_vector(const Tensor &v)¶ Get the array of values in the tensor.
For higher order tensors this returns the flattened value.
- Return
- Values
- Parameters
- v: Input tensor
-
std::vector<Eigen::DenseIndex> dynet::as_vector(const IndexTensor &v)¶ Get the array of indices in an index tensor.
For higher order tensors this returns the flattened value.
- Return
- Index values
- Parameters
- v: Input index tensor
-
real dynet::rand01()¶ This is a helper function to sample uniformly in \([0,1]\).
- Return
- \(x\sim\mathcal U([0,1])\)
-
int dynet::rand0n(int n)¶ This is a helper function to sample uniformly in \(\{0,\dots,n-1\}\).
- Return
- \(x\sim\mathcal U(\{0,\dots,n-1\})\)
- Parameters
- n: Upper bound (excluded)
-
real dynet::rand_normal()¶ This is a helper function to sample from a standard gaussian distribution.
- Return
- \(x\sim\mathcal N(0,1)\)
-
struct dynet::Tensor¶ - #include <tensor.h>
Represents a tensor of any order.
This provides a bridge between classic C++ types and Eigen tensors.
Public Functions
-
Tensor()¶ Create an empty tensor.
-
Tensor(const Dim &d, float *v, Device *dev, DeviceMempool mem)¶ Creates a tensor.
- Parameters
- d: Shape of the tensor
- v: Pointer to the values
- dev: Device
- mem: Memory pool
-
Eigen::Map<Eigen::MatrixXf> operator*()¶ Get the data as an Eigen matrix.
- Return
- Eigen matrix
-
Eigen::Map<Eigen::VectorXf> vec()¶ Get the data as an Eigen vector.
This returns the full tensor contents even if it has many dimensions.
- Return
- Flattened tensor
-
Eigen::TensorMap<Eigen::Tensor<float, 1>> tvec()¶ Get the data as an order 1 Eigen tensor.
This returns the full tensor contents as a one-dimensional Eigen tensor, which can be used for on-device processing where dimensions aren't important.
- Return
- Eigen order 1 tensor
-
Eigen::TensorMap<Eigen::Tensor<float, 2>> tbvec()¶ Get the data as an order 2 tensor including batch size.
This returns the full tensor contents as a two-dimensional Eigen tensor where the first dimension is a flattened representation of each batch and the second dimension is the batches.
- Return
- batch size x elements per batch matrix
- template <int Order>
-
Eigen::TensorMap<Eigen::Tensor<float, Order + 1>> tb()¶ Get view as an Eigen Tensor where the final dimension is the various batches.
-
float * batch_ptr(unsigned bid)¶ Get the pointer for a particular batch.
Automatically broadcasting if the size is zero.
- Return
- Pointer to the memory where the batch values are located
- Parameters
- bid: Batch id requested
-
Eigen::Map<Eigen::MatrixXf> batch_matrix(unsigned bid)¶ Get the matrix for a particular batch.
Automatically broadcasting if the size is zero.
- Return
- Matrix at batch id bid (of shape d.rows() x d.cols())
- Parameters
- bid: Batch id requested
-
Eigen::Map<Eigen::MatrixXf> rowcol_matrix()¶ Get the data as a matrix, where each “row” is the concatenation of rows and columns, and each “column” is batches.
- Return
- Matrix of shape d.rows() * d.cols() x d.batch_elems()
Eigen::Map<Eigen::MatrixXf>
colbatch_matrix
()¶ Get the data as a matrix, where each “row” is the concatenation of rows, and each “column” is the concatenation of columns and batches.
- Return
- matrix of shape
d.rows() * d.cols()
xd.batch_elems()
-
bool is_valid() const¶ Check for NaNs and infinite values.
This is very slow: use sparingly (it's linear in the number of elements). This raises a std::runtime_error exception if the Tensor is on GPU, because it's not implemented yet.
- Return
- Whether the tensor contains any invalid value
-
Tensor batch_elem(unsigned b) const¶ Get a Tensor object representing a single batch.
If this tensor only has a single batch, then broadcast. Otherwise, check to make sure that the requested batch is smaller than the number of batches.
TODO: This is a bit wasteful, as it re-calculates bs.batch_size() every time.
- Return
- Sub-tensor at batch b
- Parameters
- b: Batch id
-
struct dynet::IndexTensor¶ - #include <tensor.h>
Represents a tensor of indices.
This holds indices to locations within a dimension or tensor.
Public Functions
-
IndexTensor()¶ Create an empty tensor.
-
IndexTensor(const Dim &d, Eigen::DenseIndex *v, Device *dev, DeviceMempool mem)¶ Creates a tensor.
- Parameters
- d: Shape of the tensor
- v: Pointer to the values
- dev: Device
- mem: Memory pool
- template <int Order>
-
Eigen::TensorMap<Eigen::Tensor<Eigen::DenseIndex, Order>> t()¶ Get view as a Tensor.
-
struct dynet::TensorTools¶ - #include <tensor.h>
Provides tools for creating, accessing, copying and modifying tensors (in-place)
Public Static Functions
-
void clip(Tensor &d, float left, float right)¶ Clip the values in the tensor to a fixed range.
- Parameters
- d: Tensor to modify
- left: Target minimum value
- right: Target maximum value
-
void constant(Tensor &d, float c)¶ Fills the tensor with a constant value.
- Parameters
- d: Tensor to modify
- c: Target value
-
void identity(Tensor &val)¶ Set the (order 2) tensor as the identity matrix.
This throws a runtime_error exception if the tensor isn't a square matrix.
- Parameters
- val: Input tensor
-
void randomize_bernoulli(Tensor &val, real p, real scale = 1.0f)¶ Fill the tensor with bernoulli random variables and scale them by scale.
- Parameters
- val: Input tensor
- p: Parameter of the bernoulli distribution
- scale: Scale of the random variables
-
void randomize_normal(Tensor &val, real mean = 0.0f, real stddev = 1.0f)¶ Fill the tensor with gaussian random variables.
- Parameters
- val: Input tensor
- mean: Mean
- stddev: Standard deviation
-
void randomize_uniform(Tensor &val, real left = 0.0f, real right = 1.0f)¶ Fill the tensor with uniform random variables.
- Parameters
- val: Input tensor
- left: Left bound of the interval
- right: Right bound of the interval
-
void randomize_orthonormal(Tensor &val, real scale = 1.0f)¶ Takes a square matrix tensor and sets it as a random orthonormal matrix.
More specifically, this samples a random matrix with RandomizeUniform, then performs SVD and returns the left orthonormal matrix in the decomposition, scaled by scale.
- Parameters
- val: Input tensor
- scale: Value to which the resulting orthonormal matrix will be scaled
-
float access_element(const Tensor &v, int index)¶ Access element of the tensor by index in the values array.
AccessElement and SetElement are very, very slow (potentially) - use appropriately.
- Return
- v.v[index]
- Parameters
- v: Tensor
- index: Index in the memory
-
float access_element(const Tensor &v, const Dim &index)¶ Access element of the tensor by indices in the various dimensions.
This only works for matrix-shaped tensors (+ batch dimension). AccessElement and SetElement are very, very slow (potentially) - use appropriately.
- Return
- (*v)(index[0], index[1])
- Parameters
- v: Tensor
- index: Indices in the tensor
-
void set_element(const Tensor &v, int index, float value)¶ Set element of the tensor by index in the values array.
AccessElement and SetElement are very, very slow (potentially) - use appropriately.
- Parameters
- v: Tensor
- index: Index in the memory
- value: Desired value
-
void copy_element(const Tensor &l, int lindex, Tensor &r, int rindex)¶ Copy element from one tensor to another (by index in the values array)
- Parameters
- l: Source tensor
- lindex: Source index
- r: Target tensor
- rindex: Target index
-
void set_elements(const Tensor &v, const std::vector<float> &vec)¶ Set the elements of a tensor with an array of values.
(This uses memcpy, so be careful)
- Parameters
- v: Input Tensor
- vec: Values
-
void copy_elements(Tensor &v, const Tensor &v_src)¶ Copy one tensor into another.
- Parameters
- v: Target tensor
- v_src: Source tensor
-
void accumulate(Tensor &v, const Tensor &v_src)¶ Accumulate the values of one tensor into another.
- Parameters
- v: Target tensor
- v_src: Source tensor
-
void logsumexp(const Tensor &x, Tensor &m, Tensor &z)¶ Calculate the logsumexp function over all columns of the tensor.
- Parameters
- x: The input tensor
- m: A tensor of scratch memory to hold the maximum values of each column
- z: The output tensor
-
IndexTensor argmax(const Tensor &v, unsigned dim = 0, unsigned num = 1)¶ Calculate the index of the maximum value.
- Return
- A newly allocated LongTensor consisting of argmax IDs. The length of the dimension “dim” will be “num”, consisting of the appropriate IDs.
- Parameters
- v: A tensor where each row represents a probability distribution
- dim: Which dimension to take the argmax over
- num: The number of kmax values
-
IndexTensor categorical_sample_log_prob(const Tensor &v, unsigned dim = 0, unsigned num = 1)¶ Calculate samples from a log probability.
- Return
- A newly allocated LongTensor consisting of sampled IDs. The length of the dimension “dim” will be “num”, consisting of the appropriate IDs.
- Parameters
- v: A tensor where each row represents a log probability distribution
- dim: Which dimension to take the sample over
- num: The number of samples for each row
Dimensions¶
The Dim class holds information on the shape of a tensor. As explained in Unorthodox Design, in DyNet dimensions are represented as the standard dimension + the batch dimension, which makes batched computation transparent.
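For example, a 3x4 matrix replicated over a mini-batch of 16:

dynet::Dim d({3, 4}, 16);
unsigned per_batch = d.batch_size();   // 12: product of the dimensions within a batch
unsigned batches   = d.batch_elems();  // 16: the batch dimension
unsigned total     = d.size();         // 192: batch_size() * batch_elems()
unsigned r = d.rows(), c = d.cols();   // 3, 4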
-
DYNET_MAX_TENSOR_DIM¶ Maximum number of dimensions supported by DyNet: 7
-
struct dynet::Dim¶ - #include <dim.h>
The Dim struct stores information about the dimensionality of expressions.
Batch dimension is treated separately from standard dimension.
Public Functions
-
Dim()¶ Default constructor.
-
Dim(std::initializer_list<unsigned int> x)¶ Initialize from a list of dimensions.
The batch dimension is 1 in this case (non-batched expression).
- Parameters
- x: List of dimensions
-
Dim(std::initializer_list<unsigned int> x, unsigned int b)¶ Initialize from a list of dimensions and a batch size.
- Parameters
- x: List of dimensions
- b: Batch size
-
Dim(const std::vector<long> &x)¶ Initialize from a vector of dimensions.
The batch dimension is 1 in this case (non-batched expression).
- Parameters
- x: Vector of dimensions
-
Dim(const std::vector<long> &x, unsigned int b)¶ Initialize from a vector of dimensions and a batch size.
- Parameters
- x: Vector of dimensions
- b: Batch size
-
unsigned int size() const¶ Total size including batches.
- Return
- Batch size * size of a batch
-
unsigned int batch_size() const¶ Size of a batch (product of all dimensions)
- Return
- Size of a batch
-
unsigned int sum_dims() const¶ Sum of all dimensions within a batch.
- Return
- Sum of the dimensions within a batch
-
Dim truncate() const¶ Remove trailing dimensions of 1.
Iterates over the dimensions of the Dim, dropping trailing dimensions of size 1.
- Return
- Truncated dimension
-
void resize(unsigned int i)¶ Change the number of dimensions.
- Parameters
- i: New number of dimensions
-
unsigned int ndims() const¶ Get number of dimensions.
- Return
- Number of dimensions
-
unsigned int rows() const¶ Size of the first dimension.
- Return
- Size of the first dimension
-
unsigned int num_nonone_dims() const¶ Number of non-one dimensions.
- Return
- Number of non-one dimensions
-
unsigned int cols() const¶ Size of the second dimension (or 1 if only one dimension)
- Return
- Size of the second dimension (or 1 if only one dimension)
-
unsigned int batch_elems() const¶ Batch dimension.
- Return
- Batch dimension
-
void set(unsigned int i, unsigned int s)¶ Set specific dimension.
Set the value of a specific dimension to an arbitrary value.
- Parameters
- i: Dimension index
- s: Dimension size
-
unsigned int operator[](unsigned int i) const¶ Access a specific dimension as you would access an array element.
- Return
- Size of dimension i
- Parameters
- i: Dimension index
-
unsigned int size(unsigned int i) const¶ Size of dimension i.
- Return
- Size of dimension i
- Parameters
- i: Dimension index
-
void delete_dim(unsigned int i)¶ Remove one of the dimensions.
- Parameters
- i: Index of the dimension to be removed
-
void delete_dims(std::vector<unsigned int> dims, bool reduce_batch)¶ Remove multiple dimensions.
- Parameters
- dims: Dimensions to be removed
- reduce_batch: Whether to reduce the batch dimension
-
void insert_dim(unsigned int i, unsigned int n)¶ Insert a dimension.
- Parameters
- i: The index before which to insert the new dimension
- n: The size of the new dimension
-
Dim transpose() const¶ Transpose a vector or a matrix.
This raises an invalid_argument exception on tensors with more than 2 dimensions.
- Return
- The transposed Dim structure
-
void print_profile(std::ostream &out) const¶ Print the unbatched profile as a string.