Construct a neural-symbolic entity-matching model based on field similarity encodings. The class wraps the constructor of the NSMatchingModel Python class from the neer-match package. The underlying model is built using the tensorflow and ltn frameworks.

The model uses the field-pair and record-pair networks of DLMatchingModel as predicates. The fuzzy logic component of the model aims to find weights for the predicates such that, for every matching example, at least one of the field-pair predicates is (fuzzily) satisfied and, for every non-matching example, not all of the field-pair predicates are (fuzzily) satisfied.
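
The two conditions on the field-pair predicates can be written compactly as logical axioms. The notation below is an illustrative assumption and is not taken from the package: P_1, ..., P_k stand for the field-pair predicates, M for the set of matching record pairs, and N for the set of non-matching record pairs. How the quantifiers and connectives are evaluated in fuzzy terms depends on the operators configured in ltn and is not specified here.

```latex
\[
\forall (l, r) \in \mathcal{M}:\; \bigvee_{i=1}^{k} P_i(l, r),
\qquad
\forall (l, r) \in \mathcal{N}:\; \neg \bigwedge_{i=1}^{k} P_i(l, r)
\]
```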

In addition to purely neural-symbolic models, i.e., models trained using only the satisfiability loss, the class can also be used to construct hybrid models. Hybrid models are trained using a weighted average of the satisfiability and the binary cross-entropy losses.
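
As a sketch of the hybrid objective, the weighted average can be written as follows. The symbols are assumptions made for illustration: alpha denotes the weight on the satisfiability loss, and the two loss terms are named after the description above. How the weight is supplied to the model is not covered in this section.

```latex
\[
\mathcal{L}_{\text{hybrid}}
  = \alpha \, \mathcal{L}_{\text{sat}}
  + (1 - \alpha) \, \mathcal{L}_{\text{bce}},
\qquad \alpha \in [0, 1]
\]
```

Under this reading, alpha = 1 recovers the purely neural-symbolic case, in which only the satisfiability loss is used.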

Usage

NSMatchingModel(
  similarity_map,
  initial_feature_width_scales = 10L,
  feature_depths = 2L,
  initial_record_width_scale = 10L,
  record_depth = 4L,
  ...
)

Arguments

similarity_map

A SimilarityMap object.

initial_feature_width_scales

An integer or an integer vector of initial feature width scales for each field-pair network. The scale is multiplied by the number of similarities used in the field-pair network to determine the number of units of the first dense layer. If the input is a scalar, the same value is used for all field-pair networks.
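
The width rule can be checked by hand with base R. The sketch below only reproduces the arithmetic described above for the similarity map used in the Examples section; it does not call the package, and the variable names are illustrative assumptions.

```r
# Similarity instructions as in the Examples section: one entry per field pair.
instructions <- list(
  `z~w` = list("jaro", "levenshtein", "discrete"),
  `b ~ c` = list("jaro_winkler", "hamming")
)

# Number of similarities used by each field-pair network (3 and 2).
n_similarities <- lengths(instructions)

# A scalar scale is reused for every field-pair network.
initial_feature_width_scales <- rep_len(10L, length(instructions))

# Units of the first dense layer of each field-pair network:
# 30 for `z~w` and 20 for `b ~ c`.
first_layer_units <- initial_feature_width_scales * n_similarities
print(first_layer_units)
```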

feature_depths

An integer or an integer vector of feature depths for each field-pair network. The depth is the number of hidden dense layers used in the field-pair network. If the input is a scalar, the same value is used for all field-pair networks.

initial_record_width_scale

An integer representing the initial record width scale. The scale is multiplied by the number of field-pair networks to determine the number of units of the first dense layer of the record-pair network.
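
The same kind of hand calculation applies to the record-pair network. This is a base R sketch of the arithmetic only, using the similarity map from the Examples section; the variable names are illustrative assumptions and the package is not called.

```r
# Similarity instructions as in the Examples section: one entry per field pair.
instructions <- list(
  `z~w` = list("jaro", "levenshtein", "discrete"),
  `b ~ c` = list("jaro_winkler", "hamming")
)

# One field-pair network per entry of the similarity map.
n_field_pair_networks <- length(instructions)

# Units of the first dense layer of the record-pair network: 10 * 2 = 20.
initial_record_width_scale <- 10L
record_first_layer_units <- initial_record_width_scale * n_field_pair_networks
print(record_first_layer_units)
```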

record_depth

An integer representing the record depth. The depth is the number of hidden dense layers used in the record-pair network.

...

Additional arguments passed to the Python constructor. These arguments are passed down to the tf.keras.Model constructor.

Examples

smap <- SimilarityMap(
  instructions = list(
    `z~w` = list("jaro", "levenshtein", "discrete"),
    `b ~ c` = list("jaro_winkler", "hamming")
  )
)
model <- NSMatchingModel(smap)