Actions: tatsu-lab/alpaca_eval

Showing runs from all workflows
891 workflow runs

[NOTEBOOK] add length-corrected GLM
test format leaderboard #99: Pull request #237 synchronize by YannDubs
February 19, 2024 12:02 2m 14s yann/add_arena_openai
[NOTEBOOK] add length-corrected GLM
test format leaderboard #98: Pull request #237 opened by YannDubs
February 19, 2024 12:01 1m 58s yann/add_arena_openai
[NOTEBOOK] add length-corrected GLM
alpaca_eval unit tests #462: Pull request #237 opened by YannDubs
February 19, 2024 12:01 3m 56s yann/add_arena_openai
update ELO for llama-2-13b-chat-hf (#235)
alpaca_eval unit tests #461: Commit bcc09d8 pushed by YannDubs
February 14, 2024 04:05 4m 11s main
pages build and deployment
pages-build-deployment #366: by YannDubs
February 14, 2024 04:04 1m 31s main
update ELO for llama-2-13b-chat-hf
alpaca_eval unit tests #460: Pull request #235 opened by gblazex
February 13, 2024 18:54 3m 55s gblazex:patch-2
pages build and deployment
pages-build-deployment #365: by github-pages bot
February 12, 2024 21:44 1m 29s main
[DATA] add results from the Arena openai models (#234)
Format leaderboard #83: Commit c0ce3f9 pushed by YannDubs
February 12, 2024 21:42 1m 51s main
[DATA] add results from the Arena openai models (#234)
alpaca_eval unit tests #459: Commit c0ce3f9 pushed by YannDubs
February 12, 2024 21:42 3m 31s main
pages build and deployment
pages-build-deployment #364: by YannDubs
February 12, 2024 21:42 1m 25s main
[DATA] add results from the Arena openai models
test format leaderboard #97: Pull request #234 opened by YannDubs
February 12, 2024 21:32 1m 51s yann/add_arena_openai
[DATA] add results from the Arena openai models
alpaca_eval unit tests #458: Pull request #234 opened by YannDubs
February 12, 2024 21:32 4m 44s yann/add_arena_openai
Update ELO scores to Feb 2
alpaca_eval unit tests #457: Pull request #233 opened by gblazex
February 12, 2024 14:52 3m 35s gblazex:main
[ENH] avoid infinite loop
alpaca_eval unit tests #456: Commit 5df7581 pushed by YannDubs
February 11, 2024 06:57 3m 46s main
pages build and deployment
pages-build-deployment #363: by YannDubs
February 11, 2024 06:57 1m 20s main
[DOC] add annotation interpretation (#232)
alpaca_eval unit tests #455: Commit 7468834 pushed by YannDubs
February 11, 2024 06:53 3m 58s main
pages build and deployment
pages-build-deployment #362: by YannDubs
February 11, 2024 06:53 1m 24s main
[DOC] add annotation interpretation
alpaca_eval unit tests #454: Pull request #232 opened by YannDubs
February 11, 2024 06:52 3m 45s yann/add_annotation_interpretation
Format leaderboard
Format leaderboard #82: Manually run by YannDubs
February 11, 2024 06:33 2m 20s main
pages build and deployment
pages-build-deployment #361: by github-pages bot
February 11, 2024 06:06 1m 20s main
[DEV] Analyzing length-controlled metrics. (#231)
Format leaderboard #81: Commit 34e01fa pushed by YannDubs
February 11, 2024 06:04 1m 50s main
[DEV] Analyzing length-controlled metrics. (#231)
alpaca_eval unit tests #453: Commit 34e01fa pushed by YannDubs
February 11, 2024 06:04 3m 53s main
pages build and deployment
pages-build-deployment #360: by YannDubs
February 11, 2024 06:04 1m 28s main
[TEST] ensure that the baseline is always correct
Format leaderboard #80: Commit 45ec7c0 pushed by YannDubs
February 11, 2024 06:03 1m 36s main
[TEST] ensure that the baseline is always correct
alpaca_eval unit tests #452: Commit 45ec7c0 pushed by YannDubs
February 11, 2024 06:03 3m 41s main