Applications - Performance

  • 14 Nov 2024
  • 5 Minutes to read

Three application endpoints return performance statistics, such as precision, recall, and F1 score, for all previously released models, each calculated against a different data set.

All performance endpoints accept the same query arguments and return data in the same format.

Performance Request Body

| Argument | Example | Definition |
|---|---|---|
| start (unix timestamp) | 1494885870 | (optional) Earliest model release timestamp for which to return performance results. Defaults to 0. |
| end (unix timestamp) | 1494886421 | (optional) Latest model release timestamp for which to return performance results. Defaults to the current time. |
| limit (integer) | 10 | (optional) Maximum number of records to return. Defaults to returning all available records. |
| reverse (boolean) | True | (optional) True: return records in descending time order. False: return records in ascending time order. Defaults to False. |
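Since all four arguments are optional, a small helper can assemble only the query parameters you actually set. This is a sketch, not part of the API; the `performance_params` name is our own, and the commented request uses a placeholder token.

```python
def performance_params(start=None, end=None, limit=None, reverse=None):
    """Build the query-parameter dict, omitting unset (default) arguments."""
    params = {"start": start, "end": end, "limit": limit, "reverse": reverse}
    return {k: v for k, v in params.items() if v is not None}

# Example use with the requests library (requires a valid access token):
# requests.get(
#     "https://api.cogniac.io/1/applications/di71rG94/performance/releaseValidation",
#     headers={"Authorization": "Bearer <access_token>"},
#     params=performance_params(limit=10, reverse=True),
# )
```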

Application Performance Response

| Field | Example | Definition |
|---|---|---|
| app_id (string) | "di71rG94" | Unique ID of the application. |
| data_count (integer) | 347 | Total number of validation data items tested against the model. |
| loss (float) | 0.09 | The error rate of the model. |
| model_performance (object) | See "Model Performance Object" | A model performance object computed over all subjects. |
| model_performance_per_subject (object) | See "Model Performance Object" | A model performance object for each output subject, keyed by subject_uid (see the example response below). |
| output_subjects (list) | ["cat", "dog"] | The list of the application's output subject_uids. |
| release_metrics (string) | "best_F1" | The statistic used to release improved models. |
| release_time (unix timestamp) | 1234567890 | Original model release timestamp. |
| type (string) | "classification" | Application type. |
| updated_at (unix timestamp) | 1234567890 | The time the model metrics were last updated. |

Model Performance Object

| Field | Definition |
|---|---|
| FN (integer) | False Negatives: the total count of subject-media associations in the validation set incorrectly missed, i.e. the model did not detect a subject present in the media. |
| FP (integer) | False Positives: the total count of subject-media associations in the validation set incorrectly detected, i.e. the model detected a subject not present in the media. |
| TN (integer) | True Negatives: the total count of subject-media associations correctly negatively identified, i.e. the model correctly determined the subject was not present in the media. |
| TP (integer) | True Positives: the total count of subject-media associations correctly identified in the validation set, i.e. the model correctly identified the subject in the media. |
| accuracy (float) | A measure of overall accuracy of the model. Measured as (TP + TN) / (TP + TN + FP + FN). |
| precision (float) | The fraction of positive subject-media assertions made by the model that are correct. Measured as TP / (TP + FP). |
| recall (float) | The fraction of positive subject-media associations detected by the model. Measured as TP / (TP + FN). |
| F1 (float) | An overall measure of the model's accuracy; the harmonic mean of precision and recall. Measured as 2 × (precision × recall) / (precision + recall). |
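The four derived statistics follow directly from the confusion counts. As a sanity check, this sketch (our own helper, not part of the API) reproduces the numbers in the first record of the example response later in this article:

```python
from math import isclose

def performance_metrics(TP, FP, TN, FN):
    """Compute the derived statistics from the four confusion counts."""
    precision = TP / (TP + FP)
    recall = TP / (TP + FN)
    accuracy = (TP + TN) / (TP + TN + FP + FN)
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "F1": f1}

# Counts from the first record of the example response: TP=41, FP=13, TN=41, FN=13
m = performance_metrics(TP=41, FP=13, TN=41, FN=13)
assert isclose(m["F1"], 0.7592592592592593)
```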

Release Validation Performance

Release Validation performance reports model performance scores tested against the validation data set at the time each model was released.

GET /1/applications/{application_id}/performance/releaseValidation
Host: https://api.cogniac.io

Current Validation Performance

Current Validation performance reports model performance scores tested against the current validation data set.

GET /1/applications/{application_id}/performance/currentValidation
Host: https://api.cogniac.io

New Random Performance

New Random performance reports model performance against a data set that each model had not encountered in either a test or validation set at the time of model release.

GET /1/applications/{application_id}/performance/newRandom
Host: https://api.cogniac.io
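The three endpoints differ only in the final path segment. A small sketch (the helper name is ours; the base URL and paths come from this article) can build the URL for any of them:

```python
BASE = "https://api.cogniac.io/1"
ENDPOINTS = ("releaseValidation", "currentValidation", "newRandom")

def performance_url(app_id, kind):
    """Return the performance endpoint URL for the given application and data set."""
    if kind not in ENDPOINTS:
        raise ValueError(f"unknown performance endpoint: {kind}")
    return f"{BASE}/applications/{app_id}/performance/{kind}"
```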

Example: Retrieve Application Performance Against Current Validation Set

curl -X GET https://api.cogniac.io/1/applications/di71rG94/performance/currentValidation?limit=10 \
-H "Authorization: Bearer abcdefg.hijklmnop.qrstuvwxyz" \
-H "Content-Type: application/json" \
| json_pp
import requests
import json

my_headers = {"Authorization": "Bearer abcdefg.hijklmnop.qrstuvwxyz"}
response = requests.get("https://api.cogniac.io/1/applications/di71rG94/performance/currentValidation?limit=10",
                        headers=my_headers)

if response.status_code == 200:
    formatted_json = json.dumps(response.json(), indent=2)
    print(formatted_json)
else:
    print(f"Error: {response.status_code}")
    print(response.content)
{
    "data": [
        {
          "app_id": "di71rG94",
          "data_count": 54,
          "loss": 0.2407407407407407,
          "model_performance": {
            "FP": 13,
            "F1": 0.7592592592592593,
            "recall": 0.7592592592592593,
            "precision": 0.7592592592592593,
            "TP": 41,
            "TN": 41,
            "FN": 13,
            "accuracy": 0.7592592592592593
          },
          "model_performance_per_subject": {
            "cat": {
              "FP": 7,
              "F1": 0.8354430379746836,
              "recall": 0.8461538461538461,
              "precision": 0.825,
              "TP": 33,
              "TN": 8,
              "FN": 6,
              "accuracy": 0.7592592592592593
            },
            "dog": {
              "FP": 6,
              "F1": 0.5517241379310344,
              "recall": 0.5333333333333333,
              "precision": 0.5714285714285714,
              "TP": 8,
              "TN": 33,
              "FN": 7,
              "accuracy": 0.7592592592592593
            }
          },
          "output_subjects": [
            "cat",
            "dog"
          ],
          "release_metrics": "best_F1",
          "release_time": 1484251629.765,
          "type": "classification",
          "updated_at": 1484251629.765
        },
        {
          "app_id": "di71rG94",
          "data_count": 104,
          "loss": 0.25,
          "model_performance": {
            "FP": 26,
            "F1": 0.75,
            "recall": 0.75,
            "precision": 0.75,
            "TP": 78,
            "TN": 78,
            "FN": 26,
            "accuracy": 0.75
          },
          "model_performance_per_subject": {
            "cat": {
              "FP": 10,
              "F1": 0.8194444444444444,
              "recall": 0.7866666666666666,
              "precision": 0.855072463768116,
              "TP": 59,
              "TN": 19,
              "FN": 16,
              "accuracy": 0.75
            },
            "dog": {
              "FP": 16,
              "F1": 0.5937499999999999,
              "recall": 0.6551724137931034,
              "precision": 0.5428571428571428,
              "TP": 19,
              "TN": 59,
              "FN": 10,
              "accuracy": 0.75
            }
          },
          "output_subjects": [
            "cat",
            "dog"
          ],
          "release_metrics": "best_F1",
          "release_time": 1484278580.835871,
          "type": "classification",
          "updated_at": 1484278580.835871
        }, ...
    ]
}
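To compare subjects within one record of the `data` array, the per-subject F1 scores can be extracted with a few lines. This is a sketch using the field names from the example response above; `f1_by_subject` is our own helper, and `record` is an abridged copy of the first example record.

```python
# Abridged first record from the example response above.
record = {
    "app_id": "di71rG94",
    "model_performance": {"F1": 0.7592592592592593},
    "model_performance_per_subject": {
        "cat": {"F1": 0.8354430379746836},
        "dog": {"F1": 0.5517241379310344},
    },
}

def f1_by_subject(record):
    """Map each output subject to its F1 score."""
    return {subject: perf["F1"]
            for subject, perf in record["model_performance_per_subject"].items()}

scores = f1_by_subject(record)
```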

