-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[debug client] Limit key import concurrency to 4 #591
Conversation
memory: "64Mi" | ||
cpu: "250m" | ||
limits: | ||
memory: "512Mi" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes the client
process die quick instead of crashing control plane when memory is full.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you check if this works with docker-compose client?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you check if this works with docker-compose client?
Yep, it worked for me!
@okdas wdyt of trying something like this? It's a bit more efficient since it's limited to the number of CPUs. This implementation did save a bunch of memory on my setup with 12 cores.
//go:build debug
package debug
import (
"fmt"
"os"
"runtime"
"sync"
"github.com/pokt-network/pocket/app/client/keybase"
"github.com/pokt-network/pocket/build"
"github.com/pokt-network/pocket/logger"
cryptoPocket "github.com/pokt-network/pocket/shared/crypto"
"gopkg.in/yaml.v2"
)
const (
// NOTE: This is the number of validators in the private-keys.yaml manifest file
numValidators = 999
debugKeybaseSuffix = "/.pocket/keys"
)
var (
// TODO: Allow users to override this value via `datadir` flag or env var or config file
debugKeybasePath string
)
// Initialise the debug keybase with the 999 validator keys from the private-keys manifest file
func init() {
homeDir, err := os.UserHomeDir()
if err != nil {
logger.Global.Fatal().Err(err).Msg("Cannot find user home directory")
}
debugKeybasePath = homeDir + debugKeybaseSuffix
// Initialise the debug keybase with the 999 validators
if err := initializeDebugKeybase(); err != nil {
logger.Global.Fatal().Err(err).Msg("Cannot initialise the keybase with the validator keys")
}
}
func initializeDebugKeybase() error {
var (
validatorKeysPairMap map[string]string
err error
)
validatorKeysPairMap, err = parseValidatorPrivateKeysFromEmbeddedYaml()
if err != nil {
return err
}
// Create/Open the keybase at `$HOME/.pocket/keys`
kb, err := keybase.NewBadgerKeybase(debugKeybasePath)
if err != nil {
return err
}
db, err := kb.GetBadgerDB()
if err != nil {
return err
}
// Add the keys if the keybase contains less than 999
curAddr, _, err := kb.GetAll()
if err != nil {
return err
}
// Add validator addresses if not present
if len(curAddr) < numValidators {
logger.Global.Debug().Msgf(fmt.Sprintf("Debug keybase initializing... Adding %d validator keys to the keybase", numValidators-len(curAddr)))
// Use writebatch to speed up bulk insert
wb := db.NewWriteBatch()
// Create a channel to receive errors from goroutines
errCh := make(chan error, numValidators)
// Create a WaitGroup to wait for all goroutines to finish
var wg sync.WaitGroup
// Create a worker pool with a limited number of workers
numWorkers := runtime.NumCPU()
wg.Add(numWorkers)
// Create a channel to send tasks to the workers
taskCh := make(chan string, numValidators)
// Start the worker pool
for i := 0; i < numWorkers; i++ {
go func() {
defer wg.Done()
for privHexString := range taskCh {
// Import the keys into the keybase with no passphrase or hint as these are for debug purposes
keyPair, err := cryptoPocket.CreateNewKeyFromString(privHexString, "", "")
if err != nil {
errCh <- err
return
}
// Use key address as key in DB
addrKey := keyPair.GetAddressBytes()
// Encode KeyPair into []byte for value
keypairBz, err := keyPair.Marshal()
if err != nil {
errCh <- err
return
}
if err := wb.Set(addrKey, keypairBz); err != nil {
errCh <- err
return
}
}
}()
}
// Send tasks to the workers
for _, privHexString := range validatorKeysPairMap {
taskCh <- privHexString
}
// Close the task channel to signal the workers to exit
close(taskCh)
// Wait for all workers to finish
wg.Wait()
// Check if any goroutines returned an error
select {
case err := <-errCh:
return err
default:
}
if err := wb.Flush(); err != nil {
return err
}
}
// Close DB connection
if err := kb.Stop(); err != nil {
return err
}
return nil
}
// parseValidatorPrivateKeysFromEmbeddedYaml fetches the validator private keys from the embedded build/localnet/manifests/private-keys.yaml manifest file.
func parseValidatorPrivateKeysFromEmbeddedYaml() (map[string]string, error) {
// Parse the YAML file and load into the config struct
var config struct {
ApiVersion string `yaml:"apiVersion"`
Kind string `yaml:"kind"`
MetaData map[string]string `yaml:"metadata"`
Type string `yaml:"type"`
StringData map[string]string `yaml:"stringData"`
}
if err := yaml.Unmarshal(build.PrivateKeysFile, &config); err != nil {
return nil, err
}
validatorKeysMap := make(map[string]string)
for id, privHexString := range config.StringData {
validatorKeysMap[id] = privHexString
}
return validatorKeysMap, nil
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The approach makes sense to make, but I'm still trying to catch up and understand on how we got here. @0xBigBoss I think it was introduced by #578, but the issue should've been there beforehand, right?
Since this is just a client debug key feature, I'm approving, but would like a 👍 from @deblasis or @0xBigBoss before merging.
memory: "64Mi" | ||
cpu: "250m" | ||
limits: | ||
memory: "512Mi" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you check if this works with docker-compose client?
Co-authored-by: Daniel Olshansky <[email protected]>
Hey @0xBigBoss, that is a great suggestion. I did try to switch to To be frank, I don't know if we should have debug client consume more than 0.5Gi RAM. It is possible we might need to adjust complexity of |
@okdas That makes sense on the memory limits. What’s the benefit of re-importing every time a new pod is created? It’s really slow. Could we mount a persistent volume to |
Why not, that is a great idea. Added a PVC for that pod. Thanks! |
## Description <!-- REMOVE this comment block after following the instructions 1. Add a summary of the change including: motivation, reasons, context, dependencies, etc... 2. If applicable, specify the key files that should be looked at. --> `make localnet_client_debug` currently fails on machines with low~ish amount of RAM. Key import loads 999 validator keys quickly, but cryptographic calculations are so intense on memory it only imports 250 keys before dying OOM on my 16Gi virtual machine. ## Type of change Please mark the relevant option(s): - [ ] New feature, functionality or library - [x] Bug fix - [ ] Code health or cleanup - [ ] Major breaking change - [ ] Documentation - [ ] Other <!-- add details here if it a different type of change --> ## List of changes <!-- REMOVE this comment block after following the instructions List out all the changes made --> - Wrapped key import part into concurrency limiter (https://github.com/korovkin/limiter) ## Testing - [x] `make develop_test` - [x] `make localnet_client_debug` - [x] `make client_start && make client_connect` <!-- REMOVE this comment block after following the instructions If you added additional tests or infrastructure, describe it here. Bonus points for images and videos or gifs. --> ## Required Checklist - [x] I have performed a self-review of my own code - [x] I have commented my code, particularly in hard-to-understand areas - [x] I have tested my changes using the available tooling - [x] I have updated the corresponding CHANGELOG ### If Applicable Checklist - [ ] I have updated the corresponding README(s); local and/or global - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] I have added, or updated, [mermaid.js](https://mermaid-js.github.io) diagrams in the corresponding README(s) - [ ] I have added, or updated, documentation and [mermaid.js](https://mermaid-js.github.io) diagrams in `shared/docs/*` if I updated `shared/*`README(s) --------- Co-authored-by: Daniel Olshansky <[email protected]>
Description
make localnet_client_debug
currently fails on machines with low~ish amount of RAM. Key import loads 999 validator keys quickly, but cryptographic calculations are so intense on memory it only imports 250 keys before dying OOM on my 16Gi virtual machine.Type of change
Please mark the relevant option(s):
List of changes
Testing
make develop_test
make localnet_client_debug
make client_start && make client_connect
Required Checklist
If Applicable Checklist
shared/docs/*
if I updatedshared/*
README(s)