Evaluates the model. Calculates and returns the loss and various accuracy metrics for a fitted model and the passed data.
The method emulates the behavior of the tf.keras.evaluate method. It automatically constructs a data generator from the left and right datasets that iterates over all elements of their Cartesian product, that is, over every pair of one left and one right record. The generator's labels are derived from the matches data frame. The method then uses the generator to evaluate the model.
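The pairing and labeling described above can be illustrated with a minimal base R sketch. It is not the package's generator: the toy data frames, the 1-based indices in matches, the pair ordering, and the batch size of 5 are all assumptions made for illustration.

```r
# Hypothetical stand-ins for the left, right, and matches data frames.
left <- data.frame(title = c("Doom", "Quake", "Myst"))
right <- data.frame(title = c("DOOM", "Myst", "Halo", "Quake II"))
# Assumption: matches holds 1-based row indices into left and right.
matches <- data.frame(left = c(1L, 3L), right = c(1L, 2L))

# All (left, right) index pairs, i.e. the Cartesian product of the row indices.
pairs <- expand.grid(
  left = seq_len(nrow(left)),
  right = seq_len(nrow(right))
)

# A pair is labeled 1 if it appears in matches and 0 otherwise.
pair_key <- paste(pairs$left, pairs$right)
match_key <- paste(matches$left, matches$right)
pairs$label <- as.integer(pair_key %in% match_key)

# Split the labeled pairs into batches of at most batch_size rows.
batch_size <- 5L
batches <- split(pairs, ceiling(seq_len(nrow(pairs)) / batch_size))
```

With 3 left and 4 right records the sketch yields 12 labeled pairs, of which only the 2 listed in matches carry the label 1. This imbalance is typical for record matching and is one reason to look at precision and recall rather than accuracy alone.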
DLMatchingModel
The method passes the constructed generator and any additional call arguments directly to tf.keras.evaluate.
NSMatchingModel
The method iterates over the batches of the constructed generator and calculates the numbers of true positives, true negatives, false positives, and false negatives. In addition, it calculates accuracy, precision, recall, and F1 score. The method returns a named list with the calculated metrics.
RefutationModel
The method iterates over the batches of the constructed generator and calculates the numbers of true positives, true negatives, false positives, and false negatives. In addition, it calculates accuracy, precision, recall, and F1 score. The method returns a named list with the calculated metrics.
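The metrics named above for NSMatchingModel and RefutationModel follow from the four confusion-matrix counts. The sketch below shows the standard definitions in base R. The helper confusion_metrics, its 0.5 decision threshold, and the names of the returned list elements are hypothetical and are not taken from the package.

```r
# Hypothetical helper; not part of the package API.
# Assumption: predictions are probabilities thresholded at 0.5.
confusion_metrics <- function(labels, predictions, threshold = 0.5) {
  predicted <- predictions >= threshold
  actual <- labels == 1

  # Confusion-matrix counts.
  tp <- sum(predicted & actual)
  tn <- sum(!predicted & !actual)
  fp <- sum(predicted & !actual)
  fn <- sum(!predicted & actual)

  # Derived metrics; NA when a denominator is zero.
  precision <- if (tp + fp > 0) tp / (tp + fp) else NA_real_
  recall <- if (tp + fn > 0) tp / (tp + fn) else NA_real_
  f1 <- if (is.na(precision) || is.na(recall) || precision + recall == 0) {
    NA_real_
  } else {
    2 * precision * recall / (precision + recall)
  }

  list(
    TP = tp, TN = tn, FP = fp, FN = fn,
    accuracy = (tp + tn) / length(labels),
    precision = precision,
    recall = recall,
    f1 = f1
  )
}

metrics <- confusion_metrics(
  labels = c(1, 0, 1, 0, 0, 1),
  predictions = c(0.9, 0.2, 0.4, 0.7, 0.1, 0.8)
)
```

In a batched evaluation, the four counts can be summed across batches first and the ratios computed once at the end, which avoids averaging per-batch ratios of unequal weight.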
Usage
evaluate(object, left, right, matches, ...)
# S4 method for class 'neer_match.matching_model.DLMatchingModel'
evaluate(object, left, right, matches, ...)
# S4 method for class 'neer_match.matching_model.NSMatchingModel'
evaluate(
  object,
  left,
  right,
  matches,
  batch_size = 32L,
  satisfiability_weight = 1
)
# S4 method for class 'neer_match.reasoning.RefutationModel'
evaluate(
  object,
  left,
  right,
  matches,
  batch_size = 32L,
  satisfiability_weight = 1
)
Arguments
- object
A matching model object.
- left
A data frame with the left records.
- right
A data frame with the right records.
- matches
A data frame with the indices of the matching record pairs.
- ...
Additional arguments passed to tf.keras.evaluate.
- batch_size
The batch size (integer).
- satisfiability_weight
A numeric value in the range \([0, 1]\) representing the weight of the satisfiability loss in the total loss of a hybrid model.
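One plausible reading of satisfiability_weight is a convex combination of two loss terms. The sketch below illustrates that reading only. The helper combine_losses, its argument names, and the exact form of the combination are assumptions and are not confirmed by this page.

```r
# Hypothetical helper; the package may combine the two terms differently.
combine_losses <- function(data_loss, satisfiability_loss,
                           satisfiability_weight = 1) {
  # The weight must lie in [0, 1], as documented for the argument.
  stopifnot(satisfiability_weight >= 0, satisfiability_weight <= 1)
  (1 - satisfiability_weight) * data_loss +
    satisfiability_weight * satisfiability_loss
}

# Under this reading, a weight of 0 keeps only the data loss, a weight of 1
# (the documented default) keeps only the satisfiability loss.
only_data <- combine_losses(0.6, 0.2, satisfiability_weight = 0)
only_sat <- combine_losses(0.6, 0.2, satisfiability_weight = 1)
blended <- combine_losses(0.6, 0.2, satisfiability_weight = 0.5)
```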
Examples
smap <- SimilarityMap(
  instructions = list(
    `reviews` = list("discrete", "gaussian"),
    `developer ~ dev` = list("jaro_winkler", "damerau_levenshtein")
  )
)
model <- DLMatchingModel(smap)
compile(model, loss = tensorflow::tf$keras$losses$BinaryCrossentropy())
matching_data <- fuzzy_games_example_data()
eval_set <- c(1:3)
train_left <- matching_data$left[-eval_set, ]
train_right <- matching_data$right[-eval_set, ]
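# Keep only pairs whose left and right records both lie outside the
# evaluation set, then shift the indices to account for the removed rows.
train_matches <- matching_data$matches[
  !(matching_data$matches$left %in% eval_set) &
    !(matching_data$matches$right %in% eval_set),
] - 3L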
fit(
  model,
  train_left, train_right, train_matches,
  epochs = 1L,
  verbose = 0L
)
#> <keras.src.callbacks.history.History object at 0x7fab8821c1c0>
eval_left <- matching_data$left[eval_set, ]
eval_right <- matching_data$right[eval_set, ]
eval_matches <- matching_data$matches[
  (matching_data$matches$left %in% eval_set) &
    (matching_data$matches$right %in% eval_set),
]
print(evaluate(model, eval_left, eval_right, eval_matches))
#> [1] 0.6390041
