Coding Tips and Style¶
Coding Practices¶
Testing:
Before committing any code, tests should be run to make sure that the new code didn’t break anything.
This can be done by using the make test
command.
It is also highly recommended that you add unit tests for any new functionality.
Unit tests are implemented in the tests
directory.
When making a bug fix, you can add a test that broke before the fix but passes afterwards.
That being said, tests are not an absolute requirement, so if you have a contribution but aren’t sure how to do tests, please don’t let this stop you from contributing.
Coding Style Conventions¶
DyNet (the main version in C++) has certain coding style standards:
Overall Philosophy: DyNet is designed to minimize the computational overhead when creating networks. Try to avoid doing slow things like creating objects or copying memory in places that will be called frequently during computation graph construction.
Function Names: Function names are written in “snake_case”.
const: Always use const if the input to a function is constant.
Pointer vs. Reference: When writing functions, use the following guidelines (quoted from here):
- Only pass a value by pointer if the value 0/NULL is a valid input in the current context.
- If a function argument is an out-value, then pass it by reference.
- Choose “pass by value” over “pass by const reference” only if the value is a POD (Plain Old Datastructure) or small enough (memory-wise) or in other ways cheap enough (time-wise) to copy.
Error handling: The C++ core of DyNet provides a mechanism for error handling that
should be used in all code. It consists of 3 macros as follows (included in globals.h
):
DYNET_INVALID_ARG(msg)
: This is used to throw an error that is triggered when a user passes an invalid argument to one of the functions.DYNET_RUNTIME_ERR(msg)
: This is used to throw an error that could be triggered by a user, but is not the result of an invalid argument. For example, it could be used when something is not implemented yet, or when the program dies due to lack of memory, etc.DYNET_ASSERT(expr,msg)
: This is to be used to check things that should only happen due to a programming error within DyNet itself, and should never be triggered by a user.expr
is a condition, andmsg
is a message explaining the exception, withostream
-style formatting.
Coding Tips/How To¶
Adding New Operations¶
One of the most common things that one will want to do to modify DyNet is to add a new operation to calculate a new function. You can find more information on how to do so at the end of the tutorial slides here (note that some file names are old).
Taking a look at the existing operations in the nodes-XXX.h
and nodes-XXX.cc
files
will be the best guide in creating new operations. Here are some fine-grained tips for
those that want to dive into the process.
fx
is a pointer to the (preallocated) location for the result of forward to be storedfx
is not initialized, so after calling forwardfx
must contain the correct answer- dEdxi MUST ACCUMULATE a result since multiple calls to forward may depend on
the same
x_i
. Even, e.g., Identity must be implemented asdEdx1 += dEdf
. - scalars results of forward are placed in
fx.v[0]
- DyNet manages its own memory, not Eigen, and it is configured with the EIGEN_NO_MALLOC option. If you get an error about Eigen attempting to allocate memory, it is (probably) because of an implicit creation of a temporary variable. If you really do need a temporary variable, its capacity must be requested by Node::aux_storage_size
And here are some notes on debugging problems with new operations
- fx is uninitialized when forward is called- are you relying on it being 0?
- dEdxi must accumulate (see point 3 above!)
Decreasing Compile Time¶
DyNet has a GPU_NUMFILES
option that allows you to specify the number of separate
files that are compiled when compiling for GPU. This is set to 4 on Linux/Mac and 1 on
Windows (because MSVC doesn’t support parallel compilation for GPU code). If you’re
developing new operations for DyNet, it might be a good idea to set GPU_NUMFILES
to zero, which will result in all nodes-XXX.cc
files being compiled separately.
In this case, if you change a single file, it will only recompile that file instead
of recompiling all of the code in all of the nodes-XXX.cc
files.