sync : llama.cpp (training, refactoring) #548
Conversation
@@ -1771,6 +1771,7 @@ extern "C" {
        GGML_OPT_NO_CONTEXT,
        GGML_OPT_INVALID_WOLFE,
        GGML_OPT_FAIL,
+       GGML_OPT_CANCEL,
@xaedes I've added the GGML_OPT_CANCEL return code and simplified the cancellation logic during optimization.
That makes a lot of sense.
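For context, here is a minimal sketch of how a caller might react to the new return code. `ggml_opt`, `ggml_opt_default_params`, and the result enum come from ggml.h of this era; the toy objective and the handling around the call are illustrative, not part of this PR:

```c
#include <stdio.h>
#include "ggml.h"

// Sketch (not part of the PR): minimize f = x^2 and handle the new
// GGML_OPT_CANCEL return code alongside the existing ones.
int main(void) {
    struct ggml_init_params ip = { 128*1024*1024, NULL, false };
    struct ggml_context * ctx = ggml_init(ip);

    struct ggml_tensor * x = ggml_new_f32(ctx, 2.0f);
    ggml_set_param(ctx, x);                       // optimize x
    struct ggml_tensor * f = ggml_mul(ctx, x, x); // f = x^2

    enum ggml_opt_result res =
        ggml_opt(ctx, ggml_opt_default_params(GGML_OPT_ADAM), f);

    if (res == GGML_OPT_CANCEL) {
        // a callback requested early termination; x holds the last state
        printf("cancelled at x = %f\n", ggml_get_f32_1d(x, 0));
    } else if (res != GGML_OPT_OK) {
        fprintf(stderr, "opt failed: %d\n", (int) res);
    }

    ggml_free(ctx);
    return 0;
}
```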
@@ -20220,8 +20205,8 @@ static enum ggml_opt_result ggml_opt_lbfgs(
        ggml_vec_cpy_f32(nx, gp, g);

        ls = linesearch_backtracking(&params, nx, x, &fx, g, d, step, xp, f, gb, &cplan, np, ps, &cancel, callback, callback_data);
Here, instead of passing &cancel, we should check whether the return code matches GGML_OPT_CANCEL.
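A hypothetical shape of that refactor, not the actual patch: the `bool * cancel` out-parameter is dropped from the (assumed) `linesearch_backtracking` signature, and cancellation is reported through the return value instead:

```c
// Sketch: let the line search report cancellation via its return code
// rather than through a bool* out-parameter (signature is assumed).
ls = linesearch_backtracking(&params, nx, x, &fx, g, d, step,
                             xp, f, gb, &cplan, np, ps,
                             callback, callback_data);
if (ls == GGML_OPT_CANCEL) {
    return GGML_OPT_CANCEL; // propagate out of ggml_opt_lbfgs
}
```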
-#define GGML_GRAPH_HASHTABLE_SIZE 8273
+// #define GGML_GRAPH_HASHTABLE_SIZE 8273
+// #define GGML_GRAPH_HASHTABLE_SIZE 16411
+#define GGML_GRAPH_HASHTABLE_SIZE 32771
There is a chance that this and the increase to GGML_MAX_NODES will break the examples that allocate the graph on the stack.
That's true. I think we will migrate things as we go, or alternatively, migrate everything after #547 is merged.
Yeah, I used heap allocation because of this.
We could rewrite the examples to use heap-allocated graphs with ggml_new_graph (sketched below).
Maybe it would be more convenient to have a way to dynamically grow the graph beyond some stack-allocatable initial capacity. But then all code that iterates over or adds/deletes nodes would need to be changed, e.g. by replacing direct accesses with calls to new API functions for graph nodes. That sounds too bloated.
Or we could change the build process to add libraries with custom compile definitions for the sizes and let finetune and train-text-from-scratch link against those.
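A rough sketch of the ggml_new_graph route, assuming the heap-allocation helpers available in ggml around this sync (ggml_new_graph, ggml_build_forward_expand, ggml_graph_compute_with_ctx); the memory size and toy tensors are illustrative:

```c
#include "ggml.h"

// Sketch: build and run a graph allocated inside the ggml context (heap)
// instead of declaring a struct ggml_cgraph on the stack, which breaks
// once GGML_GRAPH_HASHTABLE_SIZE / GGML_MAX_NODES grow the struct.
static void example(void) {
    struct ggml_init_params ip = {
        /*.mem_size   =*/ 16*1024*1024, // scratch for tensors + graph
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(ip);

    struct ggml_tensor * a = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 8);
    struct ggml_tensor * b = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 8);
    struct ggml_tensor * c = ggml_add(ctx, a, b);

    // heap-allocated graph: the large ggml_cgraph lives in ctx
    struct ggml_cgraph * gf = ggml_new_graph(ctx);
    ggml_build_forward_expand(gf, c);
    ggml_graph_compute_with_ctx(ctx, gf, /*n_threads=*/1);

    ggml_free(ctx);
}
```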
The gpt-2 example has already been updated to allocate on the heap. We just need to apply the same treatment to the rest of the examples.
ggml-ci