Construct a deep learning entity-matching model based on field similarity encodings. This function wraps the constructor of the DLMatchingModel Python class from the neer-match package. The underlying model is built with the tensorflow framework.

Usage

DLMatchingModel(
  similarity_map,
  initial_feature_width_scales = 10L,
  feature_depths = 2L,
  initial_record_width_scale = 10L,
  record_depth = 4L,
  ...
)

Arguments

similarity_map

A SimilarityMap object.

initial_feature_width_scales

An integer or an integer vector of initial feature width scales for each field-pair network. The scale is multiplied by the number of similarities used in the field-pair network to determine the number of units of the first dense layer. If the input is a scalar, the same value is used for all field-pair networks.

feature_depths

An integer or an integer vector of feature depths for each field-pair network. The depth is the number of hidden dense layers used in the field-pair network. If the input is a scalar, the same value is used for all field-pair networks.
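The sizing rule for the field-pair networks can be checked with plain arithmetic. The sketch below is only an illustration of the rule as described in the two arguments above, not the package's internal code. The variable names are hypothetical, and the similarity counts are taken from the similarity map in the Examples section, where each of the two field pairs uses two similarity measures.

```r
# Number of similarities per field pair (assumed from the Examples section):
# `a~c` and `b` each list two similarity measures.
n_similarities <- c(`a~c` = 2L, `b` = 2L)

# Scalar inputs (here the defaults) are reused for every field-pair network.
initial_feature_width_scales <- rep_len(10L, length(n_similarities))
feature_depths <- rep_len(2L, length(n_similarities))

# Units of the first dense layer of each field-pair network:
# width scale times the number of similarities of that field pair.
first_layer_units <- initial_feature_width_scales * n_similarities
first_layer_units
```

With the defaults, each field-pair network in this example starts with 20 units in its first dense layer and has 2 hidden dense layers.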

initial_record_width_scale

An integer representing the initial record width scale. The scale is multiplied by the number of field-pair networks to determine the number of units of the first dense layer of the record-pair network.

record_depth

An integer representing the record depth. The depth is the number of hidden dense layers used in the record-pair network.
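The same arithmetic applies to the record-pair network. The sketch below assumes two field-pair networks, as in the Examples section, and uses hypothetical variable names. It illustrates the documented rule only and makes no claim about how the package computes the widths of later layers.

```r
# Number of field-pair networks (assumed: one per entry of the similarity map
# in the Examples section, i.e. `a~c` and `b`).
n_field_pairs <- 2L

# Default values from the Usage section.
initial_record_width_scale <- 10L
record_depth <- 4L

# Units of the first dense layer of the record-pair network:
# width scale times the number of field-pair networks.
record_first_layer_units <- initial_record_width_scale * n_field_pairs
record_first_layer_units
```

With the defaults, the record-pair network in this example starts with 20 units and has 4 hidden dense layers.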

...

Additional arguments passed to the Python constructor, which forwards them to the tf.keras.Model constructor.

Examples

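# Assumes the R package is attached under the name neermatch.
library(neermatch)

# Map each field pair to the similarity measures used to compare it.
smap <- SimilarityMap(
  instructions = list(
    `a~c` = list("discrete", "gaussian"),
    `b` = list("discrete", "levenshtein")
  )
)
# Build the matching model with the default widths and depths.
model <- DLMatchingModel(smap)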