This repository was archived by the owner on Jul 18, 2024. It is now read-only.

Update DockerfilePytorch and fix CICD files #7

Merged (9 commits) on Aug 30, 2022

@zigzagcai zigzagcai commented Aug 29, 2022

  1. Add the Spark library to the PyTorch environment
  2. Add recsys support (lightgbm, xgboost, transformers)
  3. Change the base image from oneapi-aikit to ubuntu
  4. Fix the CICD scripts

@Jian-Zhang Jian-Zhang requested a review from xuechendi August 29, 2022 05:57
@xuechendi (Contributor)

@zigzagcai, I successfully built my Docker image using this new Dockerfile with one additional argument, as below:
docker build -t e2eaiok-pytorch -f DockerfilePytorch . --build-arg http_proxy --build-arg https_proxy

I think we should add this info to the README to guide users who also need a proxy when building the image. Below is my BKM (best-known method):

  1. Configure ~/.docker/config.json with the proxy info:
{
        ...
        "proxies": {
                "default": {
                        "httpProxy": "http://${proxy_host}:${proxy_port}",
                        "httpsProxy": "http://${proxy_host}:${proxy_port}",
                        "noProxy": "localhost,::1,127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"
                }
        }
}
  2. When calling docker build behind a proxy, add the arguments below:
--build-arg http_proxy --build-arg https_proxy
# Example:
docker build -t e2eaiok-pytorch -f DockerfilePytorch . --build-arg http_proxy --build-arg https_proxy
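
For anyone who would rather script the config.json step than edit the file by hand, here is a minimal sketch in Python. It is an illustration, not something from this PR: the function names, the example proxy URL, and the idea of merging rather than overwriting are all assumptions. It only fills in the "proxies.default" block shown above and leaves other keys (such as "auths") untouched. Note that passing --build-arg http_proxy with no value forwards whatever http_proxy is set to in the calling shell, so that variable still needs to be exported before running docker build.

```python
import json
from pathlib import Path

# noProxy list copied from the comment above; adjust for your network.
DEFAULT_NO_PROXY = "localhost,::1,127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"


def merge_proxy_settings(config, proxy_url, no_proxy=DEFAULT_NO_PROXY):
    """Return a copy of a Docker client config with proxies.default filled in.

    Hypothetical helper: existing top-level keys and other proxy profiles
    are preserved; only the "default" profile is replaced.
    """
    merged = dict(config)  # shallow copy so the caller's dict is not modified
    proxies = dict(merged.get("proxies", {}))
    proxies["default"] = {
        "httpProxy": proxy_url,
        "httpsProxy": proxy_url,
        "noProxy": no_proxy,
    }
    merged["proxies"] = proxies
    return merged


def update_config_file(path, proxy_url):
    """Read a config.json (if present), merge the proxy block, and write it back."""
    path = Path(path)
    config = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merge_proxy_settings(config, proxy_url), indent=2) + "\n")


if __name__ == "__main__":
    # Print the merged result for a sample config instead of touching ~/.docker.
    sample = {"auths": {}}
    print(json.dumps(merge_proxy_settings(sample, "http://proxy.example.com:3128"), indent=2))
```

To apply it for real, one would call update_config_file(Path.home() / ".docker" / "config.json", "http://<proxy_host>:<proxy_port>") with the actual proxy address; backing up the existing file first is advisable.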

ENV JAVA_HOME /usr/lib/jvm/java-8-openjdk-amd64
ENV SPARK_HOME /home/spark-3.2.1-bin-hadoop3.2
ENV PYTHONPATH $SPARK_HOME/python/:$PYTHONPATH
ENV PYTHONPATH $SPARK_HOME/python/lib/py4j-0.10.9.3-src.zip:$PYTHONPATH
Contributor

I noticed you installed only Spark and not Hadoop, which means we can only use the local filesystem instead of HDFS as the backend. Personally I think that is OK for now, but it may become an issue if we need to do distributed data processing.

Contributor Author

Okay, Thanks.
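
As a side note on the ENV lines in the hunk reviewed in this thread: a quick way to confirm inside the built container that both PySpark entries ended up on PYTHONPATH is a small check like the sketch below. The helper names are hypothetical, and the default SPARK_HOME and py4j version are simply copied from the hunk; if the Dockerfile changes, they would need to change too. Empty entries are ignored because prepending to an unset $PYTHONPATH leaves a trailing colon.

```python
import os

# py4j archive name copied from the ENV line in the hunk (an assumption if versions change).
PY4J_ZIP = "py4j-0.10.9.3-src.zip"


def expected_entries(spark_home):
    """Paths the Dockerfile is expected to put on PYTHONPATH for PySpark."""
    root = spark_home.rstrip("/")
    return [f"{root}/python", f"{root}/python/lib/{PY4J_ZIP}"]


def missing_entries(spark_home, pythonpath):
    """Return the expected entries that are absent from a PYTHONPATH string."""
    # Drop empty segments and normalize trailing slashes before comparing.
    present = {p.rstrip("/") for p in pythonpath.split(":") if p}
    return [e for e in expected_entries(spark_home) if e not in present]


if __name__ == "__main__":
    spark_home = os.environ.get("SPARK_HOME", "/home/spark-3.2.1-bin-hadoop3.2")
    missing = missing_entries(spark_home, os.environ.get("PYTHONPATH", ""))
    print("missing PYTHONPATH entries:", missing or "none")
```

This checks only the environment variables, not that Spark itself runs; importing pyspark and starting a local session would be the stronger test.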

@zigzagcai zigzagcai changed the title from "Update DockerfilePytorch" to "Update DockerfilePytorch and refine CICD files" Aug 30, 2022
@zigzagcai zigzagcai changed the title from "Update DockerfilePytorch and refine CICD files" to "Update DockerfilePytorch and fix CICD files" Aug 30, 2022

zigzagcai commented Aug 30, 2022

I have tested the refined PyTorch Dockerfile locally, and it works fine with DLRM

@zigzagcai (Contributor Author)

Thanks. README.md has been updated.

@zigzagcai (Contributor Author)

@Jian-Zhang @xuechendi Please help to merge this PR, so that @csdingbin can trigger DLRM CICD tests with the updated Dockerfile.
Thanks!

@xuechendi xuechendi merged commit 9eedb65 into intel:main Aug 30, 2022
@zigzagcai zigzagcai deleted the update-pytorch-docker branch August 31, 2022 01:13
xuechendi added a commit to xuechendi/e2eAIOK that referenced this pull request Nov 1, 2022
Add tests for recdp and add built mvn to avoid extra building process
xuechendi pushed a commit that referenced this pull request Oct 11, 2023
…model merge (#375)

* change save path

* simplify test code

* rename path

* add

* add

* add

* add

* add

* bug fix

* add

* add

* add

* add

* add

* restore

* add

* add

* add

* add

* Dtuner models (#7)

* bug fix

* add

* add

* update readme

* delete

* update test scripts

* update

* support direct eval after merging model

* update

* update

* ssf load previous config

* automatic fill deltaargs

* update config

* copy code to merged dir

* add test

* update readme for merge model

* bug fix

* check the code file existence

* update readme

* update model name list

* refine test scripts

* bug fix

* bug fix

* allow tokenizer to be None

* bug fix

* bug fix

* fix import

* update path

* move merge testing scripts