
sync : llama.cpp (training, refactoring) #548

Merged 5 commits into master on Oct 4, 2023
Conversation

ggerganov (Member) commented Oct 3, 2023

No description provided.

ggerganov changed the title from "ggml : sync llama.cpp (training, refactoring)" to "sync : llama.cpp (training, refactoring)" on Oct 3, 2023
@@ -1771,6 +1771,7 @@ extern "C" {
         GGML_OPT_NO_CONTEXT,
         GGML_OPT_INVALID_WOLFE,
         GGML_OPT_FAIL,
+        GGML_OPT_CANCEL,
ggerganov (Member Author)
@xaedes I've added the GGML_OPT_CANCEL return code and simplified the cancellation logic during optimization.
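As an aside, a minimal sketch of how a caller might branch on the new result code, assuming the ggml.h from this sync (the check_opt_result helper is hypothetical, not part of the ggml API):

#include <stdio.h>
#include "ggml.h"

// hypothetical helper: report the outcome of ggml_opt() / ggml_opt_resume()
static void check_opt_result(enum ggml_opt_result res) {
    switch (res) {
        case GGML_OPT_OK:
            printf("optimization converged\n");
            break;
        case GGML_OPT_CANCEL: // new in this sync: the user callback requested a stop
            printf("optimization cancelled\n");
            break;
        default:
            printf("optimizer error: %d\n", (int) res);
            break;
    }
}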

Reply:

That makes a lot of sense.

@@ -20220,8 +20205,8 @@ static enum ggml_opt_result ggml_opt_lbfgs(
        ggml_vec_cpy_f32(nx, gp, g);

        ls = linesearch_backtracking(&params, nx, x, &fx, g, d, step, xp, f, gb, &cplan, np, ps, &cancel, callback, callback_data);
ggerganov (Member Author)

Here, instead of passing &cancel, we should check whether the return code matches GGML_OPT_CANCEL.
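A self-contained sketch of the suggested pattern, with hypothetical stand-ins (opt_result, linesearch) rather than the real ggml functions: the inner routine reports cancellation through its return value, so the caller no longer threads a bool out-parameter through the call.

#include <stdio.h>

enum opt_result { OPT_OK = 0, OPT_CANCEL = 1 };

// stand-in for linesearch_backtracking(): pretend the user callback
// requested a stop partway through the line search
static enum opt_result linesearch(void) {
    return OPT_CANCEL;
}

int main(void) {
    enum opt_result ls = linesearch();
    if (ls == OPT_CANCEL) {
        // propagate the cancellation upward, the way ggml_opt_lbfgs
        // would return GGML_OPT_CANCEL to its caller
        printf("optimization cancelled\n");
        return 1;
    }
    printf("line search ok\n");
    return 0;
}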

ggerganov requested a review from slaren on October 4, 2023 08:15
-#define GGML_GRAPH_HASHTABLE_SIZE 8273
+// #define GGML_GRAPH_HASHTABLE_SIZE 8273
+// #define GGML_GRAPH_HASHTABLE_SIZE 16411
+#define GGML_GRAPH_HASHTABLE_SIZE 32771
Member

There is a chance that this and the increase to GGML_MAX_NODES will break the examples that allocate the graph on the stack.

ggerganov (Member Author)
That's true. I think we will migrate things as we go, or alternatively migrate everything after #547 is merged.

Reply:

Yeah, I used heap allocation because of this.

We could rewrite the examples to use heap-allocated graphs with ggml_new_graph (see the sketch below).

Maybe it would be more convenient to have a way to dynamically grow the graph beyond some stack-allocatable initial capacity. But then all code that iterates over nodes or adds/deletes them would need to be changed, e.g. by replacing direct accesses with calls to new API functions for graph nodes. That sounds too bloated.

Or we could change the build process to add libraries with custom compile definitions for the sizes and let finetune and train-text-from-scratch link against those.
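A minimal sketch of that rewrite, assuming the ggml API of this period (build_and_run and the fixed thread count are illustrative, not from the PR):

#include "ggml.h"

void build_and_run(struct ggml_context * ctx, struct ggml_tensor * result) {
    // before: struct ggml_cgraph gf = ggml_build_forward(result);
    // that object lives on the stack, so larger GGML_GRAPH_HASHTABLE_SIZE
    // and GGML_MAX_NODES values can overflow it
    struct ggml_cgraph * gf = ggml_new_graph(ctx); // allocated in the context's memory pool
    ggml_build_forward_expand(gf, result);
    ggml_graph_compute_with_ctx(ctx, gf, /*n_threads=*/4);
}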

ggerganov (Member Author)

The gpt-2 example has already been updated to allocate on the heap. We just need to apply the same treatment to the rest of the examples.

ggerganov merged commit ef33685 into master on Oct 4, 2023
3 participants