
H3 Features

H3 is an indexing system for representing geospatial data. For more details, refer to https://eng.uber.com/h3.

Preprocessing

Ludwig will parse the H3 64-bit encoded format automatically.

preprocessing:
    missing_value_strategy: fill_with_const
    fill_value: 576495936675512319

Parameters:

  • missing_value_strategy (default: fill_with_const) : What strategy to follow when there's a missing value in an h3 column. Options: fill_with_const, fill_with_mode, bfill, ffill, drop_row. See Missing Value Strategy for details.
  • fill_value (default: 576495936675512319): The value to replace missing values with in case the missing_value_strategy is fill_with_const.

Preprocessing parameters can also be defined once and applied to all H3 input features using the Type-Global Preprocessing section.
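
For example, a minimal sketch of such a type-global block (assuming the standard defaults layout, with one entry per feature type) might look like:

defaults:
    h3:
        preprocessing:
            missing_value_strategy: fill_with_const
            fill_value: 576495936675512319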

Input Features

Input H3 features are transformed into int-valued tensors of size N x 19, where N is the size of the dataset and the 19 dimensions represent the 4 H3 resolution parameters (mode, edge, resolution, base cell) and 15 cell coordinate values.

The encoder parameters specified at the feature level are:

  • tied (default: null): Name of another input feature to tie the weights of the encoder with. It needs to be the name of a feature of the same type and with the same encoder parameters.

Example H3 feature entry in the input features list:

name: h3_feature_name
type: h3
tied: null
encoder: 
    type: embed
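
As a hedged sketch of how such an entry might sit inside a complete configuration (the output feature named label is an illustrative placeholder, not part of the H3 feature itself):

input_features:
    - name: h3_feature_name
      type: h3
      encoder:
          type: embed
output_features:
    - name: label
      type: binary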

The available encoder parameters are:

  • type (default: embed): The possible values are embed, weighted_sum, and rnn.

Encoder type and encoder parameters can also be defined once and applied to all H3 input features using the Type-Global Encoder section.
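
For example, a minimal sketch that selects the embed encoder for every H3 input feature (assuming the standard defaults layout):

defaults:
    h3:
        encoder:
            type: embed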

Encoders

Embed Encoder

This encoder encodes each component of the H3 representation (mode, edge, resolution, base cell and children cells) with embeddings. Children cells with value 0 will be masked out. After the embedding, all embeddings are summed and optionally passed through a stack of fully connected layers.

encoder:
    type: embed
    dropout: 0.0
    embedding_size: 10
    output_size: 10
    activation: relu
    norm: null
    use_bias: true
    bias_initializer: zeros
    weights_initializer: xavier_uniform
    embeddings_on_cpu: false
    reduce_output: sum
    norm_params: null
    num_fc_layers: 0
    fc_layers: null

Parameters:

  • dropout (default: 0.0) : Dropout probability for the embedding.
  • embedding_size (default: 10) : The maximum embedding size adopted.
  • output_size (default: 10) : If an output_size is not already specified in fc_layers this is the default output_size that will be used for each layer. It indicates the size of the output of a fully connected layer.
  • activation (default: relu): The default activation function that will be used for each layer. Options: elu, leakyRelu, logSigmoid, relu, sigmoid, tanh, softmax, null.
  • norm (default: null): The default norm that will be used for each layer. Options: batch, layer, null. See Normalization for details.
  • use_bias (default: true): Whether the layer uses a bias vector. Options: true, false.
  • bias_initializer (default: zeros): Initializer to use for the bias vector. Options: uniform, normal, constant, ones, zeros, eye, dirac, xavier_uniform, xavier_normal, kaiming_uniform, kaiming_normal, orthogonal, sparse, identity.
  • weights_initializer (default: xavier_uniform): Initializer to use for the weights matrix. Options: uniform, normal, constant, ones, zeros, eye, dirac, xavier_uniform, xavier_normal, kaiming_uniform, kaiming_normal, orthogonal, sparse, identity.
  • embeddings_on_cpu (default: false): Whether to force the placement of the embedding matrix in regular memory and have the CPU resolve them. Options: true, false.
  • reduce_output (default: sum): How to reduce the output tensor along the sequence length dimension if the rank of the tensor is greater than 2. Options: last, sum, mean, avg, max, concat, attention, none, None, null.
  • norm_params (default: null): Parameters used if norm is either batch or layer.
  • num_fc_layers (default: 0): The number of stacked fully connected layers.
  • fc_layers (default: null): List of dictionaries containing the parameters for each fully connected layer, as shown in the sketch below.
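
As an illustrative sketch (the layer sizes and activations below are arbitrary choices, not defaults), each dictionary in fc_layers can override parameters such as output_size and activation for that layer, with omitted parameters falling back to the encoder-level values:

encoder:
    type: embed
    fc_layers:
        - output_size: 64
          activation: relu
        - output_size: 32
          activation: tanh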

Weighted Sum Embed Encoder

This encoder encodes each component of the H3 representation (mode, edge, resolution, base cell and children cells) with embeddings. Children cells with value 0 will be masked out. After the embedding, all embeddings are combined through a weighted sum (with learned weights) and optionally passed through a stack of fully connected layers.

encoder:
    type: weighted_sum
    dropout: 0.0
    embedding_size: 10
    output_size: 10
    activation: relu
    norm: null
    use_bias: true
    bias_initializer: zeros
    weights_initializer: xavier_uniform
    embeddings_on_cpu: false
    should_softmax: false
    norm_params: null
    num_fc_layers: 0
    fc_layers: null

Parameters:

  • dropout (default: 0.0) : Dropout probability for the embedding.
  • embedding_size (default: 10) : The maximum embedding size adopted.
  • output_size (default: 10) : If an output_size is not already specified in fc_layers this is the default output_size that will be used for each layer. It indicates the size of the output of a fully connected layer.
  • activation (default: relu): The default activation function that will be used for each layer. Options: elu, leakyRelu, logSigmoid, relu, sigmoid, tanh, softmax, null.
  • norm (default: null): The default norm that will be used for each layer. Options: batch, layer, null. See Normalization for details.
  • use_bias (default: true): Whether the layer uses a bias vector. Options: true, false.
  • bias_initializer (default: zeros): Initializer to use for the bias vector. Options: uniform, normal, constant, ones, zeros, eye, dirac, xavier_uniform, xavier_normal, kaiming_uniform, kaiming_normal, orthogonal, sparse, identity.
  • weights_initializer (default: xavier_uniform): Initializer to use for the weights matrix. Options: uniform, normal, constant, ones, zeros, eye, dirac, xavier_uniform, xavier_normal, kaiming_uniform, kaiming_normal, orthogonal, sparse, identity.
  • embeddings_on_cpu (default: false): Whether to force the placement of the embedding matrix in regular memory and have the CPU resolve them. Options: true, false.
  • should_softmax (default: false): Determines if the weights of the weighted sum should be passed through a softmax layer before being used. Options: true, false.
  • norm_params (default: null): Parameters used if norm is either batch or layer.
  • num_fc_layers (default: 0): The number of stacked fully connected layers.
  • fc_layers (default: null): List of dictionaries containing the parameters for each fully connected layer.

RNN Encoder

This encoder encodes each component of the H3 representation (mode, edge, resolution, base cell and children cells) with embeddings. Children cells with value 0 will be masked out. After the embedding, all embeddings are passed through an RNN encoder.

The intuition behind this is that, starting from the base cell, the sequence of children cells can be seen as a sequence encoding the path in the tree of all H3 hexes.

encoder:
    type: rnn
    dropout: 0.0
    cell_type: rnn
    num_layers: 1
    embedding_size: 10
    recurrent_dropout: 0.0
    hidden_size: 10
    bias_initializer: zeros
    activation: tanh
    recurrent_activation: sigmoid
    unit_forget_bias: true
    weights_initializer: xavier_uniform
    recurrent_initializer: orthogonal
    reduce_output: last
    embeddings_on_cpu: false
    use_bias: true
    bidirectional: false

Parameters:

  • dropout (default: 0.0) : The dropout rate.
  • cell_type (default: rnn) : The type of recurrent cell to use. Available values are: rnn, lstm, lstm_block, ln, lstm_cudnn, gru, gru_block, gru_cudnn. For reference about the differences between the cells, please refer to PyTorch's documentation. We suggest using the block variants on CPU and the cudnn variants on GPU because of their increased speed. Options: rnn, lstm, lstm_block, ln, lstm_cudnn, gru, gru_block, gru_cudnn.
  • num_layers (default: 1) : The number of stacked recurrent layers.
  • embedding_size (default: 10) : The maximum embedding size adopted.
  • recurrent_dropout (default: 0.0): The dropout rate for the recurrent state.
  • hidden_size (default: 10): The size of the hidden state of the recurrent cell. It is usually the same as the embedding_size, but if the two values differ, a projection layer will be added before the first recurrent layer.
  • bias_initializer (default: zeros): Initializer to use for the bias vector. Options: uniform, normal, constant, ones, zeros, eye, dirac, xavier_uniform, xavier_normal, kaiming_uniform, kaiming_normal, orthogonal, sparse, identity.
  • activation (default: tanh): The activation function to use. Options: elu, leakyRelu, logSigmoid, relu, sigmoid, tanh, softmax, null.
  • recurrent_activation (default: sigmoid): The activation function to use in the recurrent step. Options: elu, leakyRelu, logSigmoid, relu, sigmoid, tanh, softmax, null.
  • unit_forget_bias (default: true): If true, add 1 to the bias of the forget gate at initialization. Options: true, false.
  • weights_initializer (default: xavier_uniform): Initializer to use for the weights matrix. Options: uniform, normal, constant, ones, zeros, eye, dirac, xavier_uniform, xavier_normal, kaiming_uniform, kaiming_normal, orthogonal, sparse, identity.
  • recurrent_initializer (default: orthogonal): The initializer for recurrent matrix weights. Options: uniform, normal, constant, ones, zeros, eye, dirac, xavier_uniform, xavier_normal, kaiming_uniform, kaiming_normal, orthogonal, sparse, identity.
  • reduce_output (default: last): How to reduce the output tensor along the sequence length dimension if the rank of the tensor is greater than 2. Options: last, sum, mean, avg, max, concat, attention, none, None, null.
  • embeddings_on_cpu (default: false): Whether to force the placement of the embedding matrix in regular memory and have the CPU resolve them. Options: true, false.
  • use_bias (default: true): Whether to use a bias vector. Options: true, false.
  • bidirectional (default: false): If true, two recurrent networks will perform encoding in the forward and backward direction and their outputs will be concatenated. Options: true, false.
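
As a usage sketch, an H3 input feature using the rnn encoder with an LSTM cell and bidirectional encoding (the feature name pickup_cell is an illustrative placeholder) might be configured as:

input_features:
    - name: pickup_cell
      type: h3
      encoder:
          type: rnn
          cell_type: lstm
          num_layers: 2
          bidirectional: true
          reduce_output: last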

Output Features

There is currently no support for H3 as an output feature. Consider using the TEXT type.
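
If a downstream task does require predicting H3 cells, one hedged workaround is to model the string form of the cell as a text output feature (the feature name h3_target below is an illustrative placeholder):

output_features:
    - name: h3_target
      type: text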