Commit graph

56 commits

Author SHA1 Message Date
Isadora White
93db8b664c fixing construction tasks 2025-03-22 17:13:14 -05:00
Isadora White
57af4f13cc adding back logging 2025-03-22 15:17:08 -05:00
MaxRobinsonTheGreat
6dc5c6401a Merge branch 'main' into cleanup 2025-03-20 16:45:20 -05:00
Isadora White
e3d61ceead changing task path 2025-03-20 15:36:45 -05:00
Isadora White
7bf97660eb added exp name to prompter.js and splitting around think tokens 2025-03-19 23:52:08 -05:00
Isadora White
6f8027b86f add additional logging of config settings 2025-03-19 23:34:22 -05:00
Isadora White
bb50486e81 add new options and better logging 2025-03-19 23:29:11 -05:00
Isadora White
94faf8f82a Merge branch 'main' of https://github.com/icwhite/mindcraft 2025-03-17 13:11:21 -07:00
Isadora White
15ef88aa61 silly deletion issue 2025-03-17 13:11:14 -07:00
Isadora White
dd5dd72381 Merge pull request #7 from icwhite/add-ollama-evaluation: Add ollama evaluation 2025-03-17 12:38:32 -07:00
hlillemark
5a230e707e Edit eval script to work with ollama model 2025-03-17 11:53:31 -07:00
Isadora White
fe4e75612d one more try catch block around starting the server 2025-03-16 22:10:32 -07:00
Isadora White
2e4974ddd8 dynamic logging 2025-03-16 21:56:08 -07:00
Isadora White
192d92bc06 fix restart and the issue where the server sometimes doesn't get deleted properly 2025-03-16 20:31:30 -07:00
Isadora White
19bedf0593 making eval script more robust to server randomly crashing 2025-03-16 18:50:34 -07:00
Isadora White
1ccba3a4b5 new train, test, dev tasks and new analysis files 2025-03-16 17:55:05 -07:00
Isadora White
125aa73d6c adding blocked actions 2025-03-14 18:51:41 -07:00
Isadora White
d399f8e214 blocked actions 2025-03-14 18:49:33 -07:00
Isadora White
8c78398056 fix single agent issue 2025-03-14 15:19:26 -07:00
Isadora White
dc3322b518 more /op commands 2025-03-14 00:14:37 -07:00
Isadora White
d8e933a25d small fix to loading with cheats 2025-03-13 21:31:16 -07:00
Isadora White
406ebe6072 add url option to evaluation script 2025-03-09 23:14:26 -07:00
Isadora White
5103cd82eb evaluation script small update 2025-03-09 22:51:00 -07:00
Isadora White
5b95b2f816 confirming correct profile path 2025-03-09 13:16:57 -07:00
Isadora White
2f80b65d42 fixing small bugs related to single agent support 2025-03-09 13:01:54 -07:00
Isadora White
3576c6fb07 Merge branch 'main' into main_save_03-07-2025 2025-03-09 12:02:58 -07:00
Isadora White
e9e8cf3c88 single agent task support in evaluation script 2025-03-09 12:02:48 -07:00
Isadora White
b75d941d97 smol changes 2025-03-08 19:24:13 -08:00
Isadora White
7492582ed9 changed eval script slightly 2025-03-07 17:29:43 -08:00
Mehul Maheshwari
dda1ba200f better readability for incoming devs. revised blueprint generation for ladder fix. cheats enabled for jill_0 and andy_0. 2025-03-06 16:16:02 -08:00
Isadora White
9aefb4676a updating port number for vllm 2025-03-05 15:53:43 -08:00
Isadora White
7e7f893cf3 fix evaluation script bug 2025-03-05 10:10:24 -08:00
Isadora White
6e0d7e1eaf change evaluation_script to allow for new world names 2025-03-04 22:08:04 -08:00
Isadora White
34145168dc refactoring changes 2025-03-04 12:09:23 -08:00
Isadora White
e3ec9d34b4 fixing merge related small bugs 2025-03-04 11:54:09 -08:00
mmaheshwari2
2036288e13 tested eval script, changed model to 4o-mini 2025-03-04 10:28:11 -08:00
Isadora White
44fc1b4618 evaluation script for vllm 2025-03-03 06:10:52 +00:00
Isadora White
37b1fc0bed increase timeout length for adding bot to world for the first time 2025-03-01 21:37:20 -08:00
Isadora White
af79c78fbb fixing evaluation script to actually add bots as op and add new models 2025-03-01 19:21:35 -08:00
Isadora White
ae39028d3b longer sleeps, early breaking for scenarios where there is only one agent 2025-02-28 18:31:19 -08:00
Isadora White
39cec7cf82 changed the checking if complete cycle to be more frequent and updated the collab_profile 2025-02-28 16:59:56 -08:00
Isadora White
7cafc254d1 making the default to load in the collaborative profiles 2025-02-23 21:11:08 -08:00
Isadora White
2da97b5607 adding a mechanism to add environment variables to the keys.json automatically 2025-02-23 18:55:13 -08:00
Isadora White
8a75d8a78e changing the give to player command to account for an edge case where the players are too close together and moving away takes time 2025-02-22 17:53:42 -08:00
Isadora White
719b72da9e set up to use s3 logging instead of wandb 2025-02-21 17:02:21 -08:00
Isadora White
d4565aa68c small fixes, the items were being given twice to the agents on initialization and accounting for blocked_actions not being in the task file 2025-02-20 21:45:29 -08:00
Isadora White
7a19f34e22 Merge branch 'main' into evaluation_parallelization 2025-02-18 18:34:46 -08:00
Isadora White
aad19d616c fixed evaluation script to allow for parallel worlds again 2025-02-18 16:39:31 -08:00
Isadora White
fb5d95debe fixed the issue with garbling commands by instead putting the commands in a bash script and running them that way 2025-02-17 17:25:12 -08:00
Isadora White
bc15700196 fixed wandb logging 2025-02-15 12:02:44 -08:00