Evaluating the capabilities and generality in artificial intelligence

Hernandez-Orallo Jose

doi:10.11517/jsaisigtwo.2021.AGI-019_04

Abstract

The evaluation of artificial intelligence in all its varieties is one of the greatest scientific challenges of our time. This is all the more so because we are dealing with cognition, which takes us beyond the evaluation of performance towards the evaluation of behaviour. In this talk, I will introduce a series of endeavours concerning the present and future measurement of AI, such as the evaluation of capabilities rather than task performance, the evaluation of general-purpose systems rather than specialised ones, the evaluation of AI extenders rather than externalised systems, and the evaluation of the transformative effect on skills in the workplace. To this end I will argue for some key elements: the identification of the dimensions of difficulty to determine capability and generality profiles, the proper study of instance variation to ensure robustness in evaluation, the consideration of operating conditions on top of instance distributions, and the need for more ambitious meta-analyses of experimental data about AI measurement.


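Notice from the publisher

All articles of the Type 2 Special Interest Groups (第二種研究会) are accessible without authentication. As a rule, the copyright of each article belongs to its authors.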
