Record Evaluations

To record evaluations of your model, follow these steps:

Step 1: Iterate over your validation records.

Step 2: Generate a model prediction for each record.

Step 3: Create a SingleTagInferenceRecord or a MultiTagInferenceRecord, depending on your evaluation data.

Step 4: Add the record to the EvaluationRecorder.

How does it work?

In the sample code below, a validation loop iterates over each record to obtain the model's prediction and its score. In each iteration, a unique identifier is generated for the record, and a SingleTagInferenceRecord or MultiTagInferenceRecord is created to hold the predicted label, the actual label, and the associated score.

These records are then added to the evaluation recorder. Once all records have been processed, the recording is marked as finished, which starts the generation of the evaluation report. No additional records can be added once the recording is finished.
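
To make the lifecycle above concrete without depending on the SDK, here is a minimal, self-contained sketch. StandInRecord and StandInRecorder are hypothetical stand-ins written for this illustration; they are not the MarkovML classes, and the real EvaluationRecorder may reject a late add in a different way. The sketch only shows the ordering: add records while the recording is open, call finish once, and expect later adds to be refused.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StandInRecord:
    """Hypothetical stand-in for a single-tag record (not the SDK class)."""
    urid: str
    inferred: str
    actual: str
    score: float


@dataclass
class StandInRecorder:
    """Hypothetical stand-in that mimics only the add/finish ordering."""
    records: List[StandInRecord] = field(default_factory=list)
    finished: bool = False

    def add_record(self, record: StandInRecord) -> None:
        # Assumption: adding after finish raises an error; the SDK may differ.
        if self.finished:
            raise RuntimeError("recording is finished; no more records accepted")
        self.records.append(record)

    def finish(self) -> None:
        # Mark the recording as closed to further records.
        self.finished = True


def record_all(recorder, rows):
    """Add one record per (urid, inferred, actual, score) row, then finish."""
    for urid, inferred, actual, score in rows:
        recorder.add_record(StandInRecord(urid, inferred, actual, score))
    recorder.finish()
    return recorder


# Made-up rows, used only to exercise the stand-in classes.
demo = record_all(
    StandInRecorder(),
    [("r1", "spam", "spam", 0.93), ("r2", "ham", "spam", 0.61)],
)
```

Calling demo.add_record(...) after this point raises RuntimeError in the stand-in, which mirrors the rule that a finished recording accepts no further records.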

Sample Code for Single-Tag Evaluation Records

from markov.api.schemas.model_recording import SingleTagInferenceRecord, RecordMetaType
# Your validation loop that goes over each record to get a score from the model

while ...:  # replace ... with your own loop condition
    # This is the prediction from your model and the corresponding best score
    predicted_label, score = model.predict(input_value)
    # Generate a unique identifier for this record
    urid = evaluation_recorder.gen_urid(input_value)
    record = SingleTagInferenceRecord(
        urid=urid,
        inferred=predicted_label,
        actual="YOUR_ACTUAL_LABEL",
        score=score,
    )
    evaluation_recorder.add_record(record)
# call finish to mark the recording as finished. 
# This starts the generation of the evaluation report for this recording.
# Additional records can't be added once the recording is marked finished. 
evaluation_recorder.finish()

Below is the sample code for adding multi-tag evaluation records. A multi-tag evaluation record has multiple labels assigned to a single record.

Sample Code for Multi-Tag Evaluation Records

from markov.api.schemas.model_recording import MultiTagInferenceRecord, RecordMetaType
# Your validation loop that goes over each record to get a score from the model
while ...:  # replace ... with your own loop condition
    # These are the predictions from your model and their corresponding scores
    predicted_labels, scores = model.predict(input_value)
    # Generate a unique identifier for this record
    urid = evaluation_recorder.gen_urid(input_value)
    # Create one MultiTagInferenceRecord for this input and add one evaluation per label
    record = MultiTagInferenceRecord(urid=urid)
    for predicted_label, score in zip(predicted_labels, scores):
        record.add_evaluation(actual="YOUR_ACTUAL_LABEL", predicted=predicted_label, score=score)
    evaluation_recorder.add_record(record)
# call finish to mark the recording as finished. 
# This starts the generation of the evaluation report for this recording.
evaluation_recorder.finish()
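
As a rough illustration of what "multiple labels assigned to a single record" means, the sketch below groups several (actual, predicted, score) evaluations under one record identifier. StandInMultiTagRecord and build_record are hypothetical names written for this example, not the SDK's MultiTagInferenceRecord; only the add_evaluation keyword names are borrowed from the sample above. Pairing actual and predicted labels by position is also an assumption made for the demonstration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class StandInMultiTagRecord:
    """Hypothetical stand-in: one record holding several tag evaluations."""
    urid: str
    evaluations: List[Tuple[str, str, float]] = field(default_factory=list)

    def add_evaluation(self, actual: str, predicted: str, score: float) -> None:
        # Each call appends one (actual, predicted, score) triple to this record.
        self.evaluations.append((actual, predicted, score))


def build_record(urid, actual_labels, predicted_labels, scores):
    """Collect every tag evaluation for one input into a single record."""
    record = StandInMultiTagRecord(urid=urid)
    # Assumption: the three sequences are aligned by position.
    for actual, predicted, score in zip(actual_labels, predicted_labels, scores):
        record.add_evaluation(actual=actual, predicted=predicted, score=score)
    return record


# Made-up labels and scores, used only to exercise the stand-in class.
multi = build_record(
    "doc-1",
    actual_labels=["sports", "politics"],
    predicted_labels=["sports", "finance"],
    scores=[0.88, 0.47],
)
```

Here a single identifier, "doc-1", carries two evaluations, so a report can compare each predicted tag with its actual tag while still treating the input as one record.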