Construct a neural-symbolic learning entity-matching model based on field
similarity encodings. The class wraps the constructor of the
NSMatchingModel Python class from the neer-match package.
The class is built using the tensorflow and
ltn frameworks.
The model uses the field-pair and record-pair networks of the
DLMatchingModel model as predicates. The fuzzy logic
component of the model aims to find weights for the predicates such
that for every matching example, at least one of the field-pair
predicates is (fuzzily) satisfied, and for every non-matching example,
not all of the field-pair predicates are (fuzzily) satisfied.
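The two fuzzy conditions above can be illustrated with a minimal base R sketch. It does not use the package. The truth values are made-up numbers, and the operators (probabilistic sum for "at least one", negated product for "not all") are assumptions chosen for illustration; the operators actually used by the ltn-based implementation are not stated here.

```r
# Hypothetical fuzzy truth values in [0, 1] returned by three field-pair
# predicates for one matching and one non-matching record pair.
match_truth <- c(0.9, 0.2, 0.6)
non_match_truth <- c(0.8, 0.1, 0.3)

# Fuzzy "at least one predicate is satisfied": probabilistic sum
# (assumed operator, for illustration only).
fuzzy_any <- function(x) 1 - prod(1 - x)

# Fuzzy "not all predicates are satisfied": negation of the product t-norm
# (assumed operator, for illustration only).
fuzzy_not_all <- function(x) 1 - prod(x)

sat_match <- fuzzy_any(match_truth)
sat_non_match <- fuzzy_not_all(non_match_truth)
```

Under these assumptions, both conditions evaluate close to one for the example values, which is the direction that training pushes the predicates toward.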
In addition to purely neural-symbolic models, i.e., models trained using only the satisfiability loss, the class can also be used to construct hybrid models. Hybrid models are trained using a weighted average of the satisfiability and the binary cross-entropy losses.
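As a rough sketch of the weighted average described above, the following base R function combines two loss values. The function name `hybrid_loss`, the argument name `weight`, and the choice of which loss the weight applies to are hypothetical and serve only to illustrate the idea; they are not part of the package interface.

```r
# Hypothetical helper: weighted average of a satisfiability loss and a
# binary cross-entropy loss. `weight` is assumed to lie in [0, 1] and to
# apply to the satisfiability loss.
hybrid_loss <- function(sat_loss, bce_loss, weight) {
  stopifnot(weight >= 0, weight <= 1)
  weight * sat_loss + (1 - weight) * bce_loss
}

# weight = 1 corresponds to a purely neural-symbolic model in this sketch.
pure_ns <- hybrid_loss(sat_loss = 0.2, bce_loss = 0.6, weight = 1)

# An intermediate weight gives a hybrid model.
hybrid <- hybrid_loss(sat_loss = 0.2, bce_loss = 0.6, weight = 0.5)
```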
Usage
NSMatchingModel(
similarity_map,
initial_feature_width_scales = 10L,
feature_depths = 2L,
initial_record_width_scale = 10L,
record_depth = 4L,
...
)
Arguments
- similarity_map
A SimilarityMap object.
- initial_feature_width_scales
An integer or an integer vector of initial feature width scales for each field-pair network. The scale is multiplied by the number of similarities used in the field-pair network to determine the number of units of the first dense layer. If the input is a scalar, the same value is used for all field-pair networks.
- feature_depths
An integer or an integer vector of feature depths for each field-pair network. The depth is the number of hidden dense layers used in the field-pair network. If the input is a scalar, the same value is used for all field-pair networks.
- initial_record_width_scale
An integer representing the initial record width scale. The scale is multiplied by the number of field-pair networks to determine the number of units of the first dense layer of the record-pair network.
- record_depth
An integer representing the record depth. The depth is the number of hidden dense layers used in the record-pair network.
- ...
Additional arguments passed to the Python constructor. These arguments are passed down to the tf.keras.Model constructor.
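The width rules described in the arguments can be worked through in base R without constructing a model. The sketch below assumes the default scales and the similarity instructions used in the Examples section; the variable names are illustrative and do not belong to the package. It covers only the first dense layer of each network, since that is all the argument descriptions specify.

```r
# Similarity instructions as in the Examples section: two field pairs,
# with three and two similarities respectively.
instructions <- list(
  `z~w` = list("jaro", "levenshtein", "discrete"),
  `b ~ c` = list("jaro_winkler", "hamming")
)

# Default scales from the Usage section.
initial_feature_width_scales <- 10L
initial_record_width_scale <- 10L

# Number of similarities used by each field-pair network.
n_similarities <- lengths(instructions)

# A scalar scale is recycled so that every field-pair network uses it.
feature_scales <- rep_len(initial_feature_width_scales, length(instructions))

# Units of the first dense layer of each field-pair network.
first_feature_units <- feature_scales * n_similarities

# Units of the first dense layer of the record-pair network.
first_record_units <- initial_record_width_scale * length(instructions)
```

With these inputs, the first dense layers of the two field-pair networks have 30 and 20 units, and the first dense layer of the record-pair network has 20 units.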
Examples
smap <- SimilarityMap(
instructions = list(
`z~w` = list("jaro", "levenshtein", "discrete"),
`b ~ c` = list("jaro_winkler", "hamming")
)
)
model <- NSMatchingModel(smap)
