This document refines the evaluation methods for the competition introduced in the technical annex of the call. Specifically, it covers the procedures and parameters of the tests and the criteria "Accuracy", "Installation Complexity", and "Availability". Recall that each localization system will be evaluated in two phases:

Phase 1. In this phase each team must locate a person inside an Area of Interest (AoI). In a typical AAL scenario, an AoI could be a specific room (bathroom, bedroom), the area in front of the kitchen, etc. The AoIs will be disclosed to the competitors before the competition, as early as possible.

Phase 2. In this phase a person moving inside the Living Lab must be located and tracked (only 2D localization and tracking is planned here). During this phase only the person to be localized will be inside the Living Lab. Each localization system should produce localization data at a rate of one new item of data every half a second (this rate will also be used to evaluate Availability). The path followed by the person will be the same for each test, and it will not be disclosed to the competitors before the benchmarks are applied.

1. Organization and Test Procedures

A few notes about the organization of the competition:
Installation of localization systems at the living lab:
In all the tests the competing systems will localize the movements of an actor (a member of the organization trained to move along pre-defined paths). The tests in the two phases are organized as follows:

Phase 1: Each system is requested to identify 5 Areas of Interest (AoI). The actor will move along random paths and will stop in each AoI for 30 seconds. For details about the AoIs please refer to Annex 1.

Phase 2: Each system is requested to track the actor along three different paths (the paths are the same for all competitors). The evaluation criteria Accuracy and Availability will be computed over the three paths aggregated. Each path will last up to two minutes. To help the competitors, a sample path (which will not be used during the competition) is given in Annex 1.

2. Refinement of Criteria

Accuracy: Each localization system will produce a stream of tuples, one sample every half a second.

Phase 1) The user will stop (after a random walk) for 30 seconds in each Area of Interest (AoI). Accuracy in this case will be measured as the fraction T of time in which the localization system provides the correct information about:
The score is given by: Accuracy score = 10*T
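The Phase 1 rule above can be sketched as a small helper; the function name and the example timings are hypothetical, not part of the competition specification:

```python
def phase1_accuracy_score(correct_time_s, total_time_s):
    """Phase 1: score = 10 * T, where T is the fraction of the test time
    during which the system reported the correct AoI information."""
    t = correct_time_s / total_time_s
    return 10.0 * t

# Hypothetical example: correct AoI reported for 25 s out of a 30 s stop.
score = phase1_accuracy_score(25.0, 30.0)  # about 8.33
```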
Phase 2) The stream produced by the competing systems will be compared against a logfile of the expected position of the user. Specifically, we will evaluate the individual error of each measure (the Euclidean distance between the measured and the expected points), and we will estimate the 75th percentile P of the errors. In order to produce the score, P will be scaled to the range [0, 10] according to the following formula:

Accuracy score = 10 if P <= 0.5 m
Accuracy score = 4*(0.5-P) + 10 if 0.5 m < P <= 2 m
Accuracy score = 2*(4-P) if 2 m < P <= 4 m
Accuracy score = 0 if P > 4 m
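A minimal sketch of the Phase 2 mapping from the 75th-percentile error P to a score. The piecewise thresholds (0.5 m, 2 m, 4 m) follow the formula above; the function names and the nearest-rank percentile convention are assumptions for illustration:

```python
import math

def phase2_accuracy_score(p):
    """Map the 75th-percentile error P (metres) to a score in [0, 10].
    The mapping is continuous: 10 at 0.5 m, 4 at 2 m, 0 at 4 m."""
    if p <= 0.5:
        return 10.0
    if p <= 2.0:
        return 4.0 * (0.5 - p) + 10.0   # linear from 10 down to 4
    if p <= 4.0:
        return 2.0 * (4.0 - p)          # linear from 4 down to 0
    return 0.0

def percentile_75(errors):
    """75th percentile of the per-sample errors (nearest-rank convention,
    a hypothetical choice; the official convention is not specified here)."""
    ranked = sorted(errors)
    k = math.ceil(0.75 * len(ranked)) - 1
    return ranked[k]
```

Note that the three linear pieces meet at the boundaries (score 10 at P = 0.5 m, 4 at P = 2 m, 0 at P = 4 m), so a small change in measured error never causes a jump in the score.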
The final score on Accuracy will be the average of the scores obtained in Phase 1 and Phase 2.

Installation Complexity: This measures the time T necessary to install the localization system. The time T is measured in minutes from the time at which the competitors enter the living lab to the time when they declare the installation complete (no further operations/configurations of the system will be admitted after that time), and it will be multiplied by the number of people N working on the installation. The parameter T*N will be translated into a score (ranging from 0 to 10) according to the following formula:

Installation Complexity score = 10 if T*N <= 10
Installation Complexity score = 10 * (60 - T*N) / 50 if 10 < T*N <= 60
Installation Complexity score = 0 if T*N > 60

Availability: Availability A is measured as the ratio between the samples actually produced by the localization system and the expected samples. In both phases, each localization system is expected to provide one sample every half a second, hence the number of expected samples is given by the duration of the test in seconds multiplied by 2. The value of availability A will be translated into a score (ranging from 0 to 10) according to the following formula:

Availability score = 10 * A
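The Installation Complexity and Availability rules can be sketched as follows; the function names are hypothetical, and clamping A at 1 (in case a system delivers more samples than expected) is an assumption not stated in the rules:

```python
def installation_complexity_score(minutes, people):
    """Score from T*N: 10 below 10 person-minutes, linearly down to 0 at 60."""
    tn = minutes * people
    if tn <= 10:
        return 10.0
    if tn <= 60:
        return 10.0 * (60.0 - tn) / 50.0
    return 0.0

def availability_score(samples_received, test_duration_s):
    """Score = 10 * A, with A the ratio of received to expected samples.
    Expected samples = duration * 2 (one sample every half a second)."""
    expected = test_duration_s * 2.0
    a = min(samples_received / expected, 1.0)  # assumed clamp at 1
    return 10.0 * a

# Hypothetical example: 2 people installing for 20 minutes -> T*N = 40.
ic = installation_complexity_score(20, 2)   # 10 * (60-40)/50 = 4.0
```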