[FEATURE]: Support for HTTP Proxy Configuration in Air-gapped Databricks Workspaces #1323

Closed
1 task done
Tracked by #1528
kfarhane28 opened this issue Apr 8, 2024 · 3 comments · Fixed by #2210
Labels
feat/installer install/upgrade the app

Comments


kfarhane28 commented Apr 8, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Problem statement

After UCX installation, running the assessment workflow in an air-gapped Databricks workspace is not possible.

Some tasks in the assessment workflow require downloading libraries from the internet, for example the "setup_tacl" task.
Our workspace cannot access the internet without going through an HTTP proxy, so these tasks are failing.

Proposed Solution

  • There needs to be a way to pass the proxy configuration as a parameter to the workflow tasks that install libraries from the internet.

  • One possible solution is to use init scripts. Example of an init script applied to the main and tacl clusters:

#!/bin/bash
# Route outbound HTTP(S) traffic through the corporate proxy; both upper- and
# lower-case variants are set because tools differ in which one they read.
export HTTPS_PROXY=http://myproxy:8080
export HTTP_PROXY=http://myproxy:8080
export https_proxy=http://myproxy:8080
export http_proxy=http://myproxy:8080

# Pre-install the libraries required by the workflow tasks, now that pip can reach PyPI via the proxy.
pip install databricks-labs-blueprint
pip install databricks-sdk==0.24.0
pip install pyyaml
pip install databricks-labs-lsql==0.3.0
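
A minimal sketch of a connectivity check, assuming the same placeholder proxy address, that could be run on the cluster (for example from a %sh notebook cell) to confirm the proxy takes effect before the workflow tasks run:

# hypothetical check; myproxy:8080 is a placeholder for the real proxy address
export HTTPS_PROXY=http://myproxy:8080
curl -sSf https://pypi.org/simple/ > /dev/null && echo "PyPI reachable through the proxy"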

Could you please provide a new version that supports execution of the workflows in air-gapped workspaces?

Additional Context

Related to:

@kfarhane28 kfarhane28 added enhancement New feature or request needs-triage labels Apr 8, 2024
@github-project-automation github-project-automation bot moved this to Triage in UCX Apr 8, 2024

nfx commented Apr 9, 2024

@kfarhane28 Does that proxy env variable exist on the machine that runs the install command? Is the proxy the same in both cases?

@nfx nfx added feat/installer install/upgrade the app and removed needs-triage labels Apr 9, 2024
@kfarhane28
Contributor Author

@nfx In fact, the HTTP proxy used in the cloud is different from the proxy used on the machine running the install. The install is done from my Windows laptop on-premises.

@nfx nfx removed the enhancement New feature or request label Apr 22, 2024
@nfx nfx moved this from Triage to Month Backlog in UCX May 2, 2024

JCZuurmond commented Jul 17, 2024

I researched this issue; here is a summary:

Where is the HTTP proxy relevant?

| Situation | Comment |
| --- | --- |
| When installing ucx dependencies at Databricks runtime, for example at the start of the assessment workflow | #573 resolves this by uploading the dependencies (as wheels) to Databricks when installing ucx. |
| When installing ucx using `databricks labs install ucx` | The proxy environment variables can be set in the shell running the installation command, similar to the init script in the top comment. T.b.d. whether the proxy environment variables are sufficient or whether support for passing the proxy settings to the installation is required. |
| When installing a non-ucx dependency at Databricks runtime, for example with a `%pip install ...` in a notebook | Relevant when resolving dependencies for linting. Unclear whether this matters for air-gapped Databricks workspaces: either the pip command does not work at all (and is therefore unlikely to appear), or it works with additional pip install flags, in which case those flags should be passed on to the pip install invoked by the linter. |
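
For the second row above, a minimal sketch of what setting the proxy in the installing shell could look like, assuming a bash shell and a placeholder proxy address:

# placeholder proxy; adjust to whatever is reachable from the installing machine
export HTTPS_PROXY=http://myproxy:8080
export HTTP_PROXY=http://myproxy:8080
databricks labs install ucx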

@kfarhane28 : Could you verify if you agree with the above table? Specifically, did #573 resolve the issue of installing ucx's dependencies at Databricks runtime and could you install ucx from your machine using the proxy environment variables?

Wheel house

A wheel house is the collection of wheels needed to install ucx and its dependencies. In its simplest form, these wheels are kept inside the ucx GitHub repository: wheelhouse/....whl.
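
For illustration, a minimal sketch of populating such a wheel house with plain pip, assuming the published package name databricks-labs-ucx and an arbitrary directory name:

# download ucx and all of its dependencies as wheels into the local "wheel house"
pip download databricks-labs-ucx -d ./wheelhouse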

Approaches for creating a wheel house:

  • Wheelhouse
    • Approach: Keep ucx's dependencies as (binary) wheels in the GitHub repository.
    • Pros:
      • ucx's dependencies are kept inside its GitHub repository and are therefore available at installation time.
    • Considerations:
  • pip-tools
    • Approach: Use pip-tools to pin ucx's dependencies, then install and store those in the ucx repository as our own "wheel house" (see the sketch after this list).
    • Pros:
    • Considerations:
      • We need to make sure cross-environment installations work. At the least, installing the wheels committed to the ucx "wheel house" inside this repository should work on the Databricks runtime used by ucx.
      • Use the pip-sync command to install the libraries from the ucx "wheel house" inside this repository. In theory this should be possible by using --pip-args to pass --target ./wheelhouse to the underlying pip install calls.
  • hatch
    • Approach: Install ucx and its dependencies using hatch, then pip freeze to lock the dependencies. Use the pip freeze output to install the locked dependencies into a "wheel house" inside the ucx repo.
    • Pros:
    • Considerations:
      • "there is no support for re-creating an environment given a set of dependencies in a reproducible manner"; hatch does not support lock files. This could be circumvented by installing the dependencies using hatch and then running pip freeze, but that offers weaker reproducibility guarantees.
      • There is no functionality for upgrading (specific) dependencies; we can regenerate the pip freeze output and use git to track the diffs.
      • We need to make sure cross-environment installations work. At the least, installing the wheels committed to the ucx "wheel house" inside this repository should work on the Databricks runtime used by ucx.

Independent of the approach for creating the wheel house, all of them imply keeping binaries in ucx's GitHub repository.
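
As referenced in the pip-tools item above, a minimal sketch of that workflow; the file and directory names are assumptions:

# pin ucx's dependencies from the project metadata
pip-compile pyproject.toml -o requirements.txt
# download the pinned dependencies as wheels into the repository's "wheel house"
pip download -r requirements.txt -d ./wheelhouse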

Suggestions

  1. Single way of installing ucx's dependencies at Databricks runtime:
    a. Always upload the wheels to Databricks when installing ucx.
    b. Always install the dependencies at Databricks runtime by referencing a wheel. To be sure, we add the --no-index flag to the pip install command so that PyPI is not used.
  2. Create a "wheel house" inside the ucx repository using pip-tools and update the ucx install script to use these wheels for step 1.
  3. As an alternative to 2, support passing pip install flags during ucx installation, similar to pip-tools' --pip-args, so that users can pass the --proxy flag to the pip install (illustrated below).
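
For illustration, a minimal sketch of the pip invocations suggestions 1b and 3 would lead to; the package name, paths, and proxy address are placeholders:

# suggestion 1b: install only from the uploaded wheels, never from PyPI
pip install --no-index --find-links ./wheelhouse databricks-labs-ucx
# suggestion 3: alternatively, let pip reach PyPI through an explicit proxy
pip install --proxy http://myproxy:8080 databricks-labs-ucx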

@nfx nfx closed this as completed in #2210 Jul 22, 2024
@nfx nfx closed this as completed in 08eecdb Jul 22, 2024
@github-project-automation github-project-automation bot moved this from Month Backlog to Archive in UCX Jul 22, 2024
JCZuurmond added a commit that referenced this issue Jul 23, 2024
## Changes
Improve error messages in case of connection errors

### Linked issues
Partially resolves #1323

### Functionality

- [x] modified existing command: `databricks labs ucx (un)install`


### Tests

- [x] manually tested
- [x] added integration tests
nfx added a commit that referenced this issue Jul 26, 2024
* Fixed codec error in md ([#2234](#2234)). In this release, we have addressed a codec error in the `md` file that caused issues on Windows machines due to the presence of curly quotes. This has been resolved by replacing curly quotes with straight quotes. The affected code pertains to the `.setJobGroup` pattern in the `SparkContext` where `spark.addTag()` is used to attach a tag, and `getTags()` and `interruptTag(tag)` are used to act upon the presence or absence of a tag. These APIs are specific to Spark Connect (Shared Compute Mode) and will not work in `Assigned` access mode. Additionally, the release includes updates to the README.md file, providing solutions for various issues related to UCX installation and configuration. These changes aim to improve the user experience and ensure a smooth installation process for software engineers adopting the project. This release also enhances compatibility and reliability of the code for users across various operating systems. The changes were co-authored by Cor and address issue [#2234](#2234).
* Group manager optimisation: during group enumeration only request the attributes that are needed ([#2240](#2240)). In this optimization update to the `groups.py` file, the `_list_workspace_groups` function has been modified to reduce the number of attributes requested during group enumeration to the minimum set necessary. This improvement is achieved by removing the `members` attribute from the list of requested attributes when it is requested during enumeration. For each group returned by `self._ws.groups.list`, the function now checks if the group is out of scope and, if not, retrieves the group with all its attributes using the `_get_group` function. Additionally, the new `scan_attributes` variable limits the attributes requested during the initial enumeration to "id", "displayName", and "meta". This optimization reduces the risk of timeouts caused by large attributes and improves the performance of group enumeration, particularly in cases where members are requested during enumeration due to API issues.
* Group migration: additional logging ([#2239](#2239)). In this release, we have implemented logging improvements for group migration within the group manager. These enhancements include the addition of new informational and debug logs aimed at helping to understand potential issues during group migration. The affected functionality includes the existing workflow `group-migration`. New logging statements have been added to numerous methods, such as `rename_groups`, `_rename_group`, `_wait_for_rename`, `_wait_for_renamed_groups`, `reflect_account_groups_on_workspace`, `delete_original_workspace_groups`, and `validate_group_membership`, as well as data retrieval methods including `_workspace_groups_in_workspace`, `_account_groups_in_workspace`, and `_account_groups_in_account`. These changes will provide increased visibility into the group migration process, including starting to rename/reflect groups, checking for renamed groups, and validating group membership.
* Group migration: improve robustness while deleting workspace groups ([#2247](#2247)). This pull request introduces changes to the group manager aimed at enhancing the reliability of deleting workspace groups, addressing an issue where deletion was being skipped for groups that had recently been renamed due to eventual consistency concerns. The changes involve double-checking the deletion of groups by ensuring they can no longer be directly retrieved from the API and are no longer present in the list of groups during enumeration. Additionally, logging has been improved, and the renaming of groups will be updated in a subsequent pull request. The `remove-workspace-local-backup-groups` workflow and related tests have been modified, and new classes indicating incomplete deletion or rename operations have been implemented. These changes improve the robustness of deleting workspace groups, reducing the likelihood of issues arising post-deletion and enhancing overall system consistency.
* Improve error messages in case of connection errors ([#2210](#2210)). In this release, we've made significant improvements to error messages for connection errors in the `databricks labs ucx (un)install` command, addressing part of issue [#1323](#1323). The changes include the addition of a new import, `RequestsConnectionError` from the `requests` package, and updates to the error handling in the `run` method to provide clearer and more informative messages during connection problems. A new `except` block has been added to handle `TimeoutError` exceptions caused by `RequestsConnectionError`, logging a warning message with information on troubleshooting network connectivity issues. The `configure` method has also been updated with a docstring noting that connection errors are not handled within it. To ensure the improvements work as expected, we've added new manual and integration tests, including a test for a simulated workspace with no internet connection, and a new function to configure such a workspace. The test checks for the presence of a specific warning message in the log output. The changes also include new type annotations and imports. The target audience for this update includes software engineers adopting the project, who will benefit from clearer error messages and guidance when troubleshooting connection problems.
* Increase timeout for sequence of slow preliminary jobs ([#2222](#2222)). In this enhancement, the timeout duration for a series of slow preliminary jobs has been increased from 4 minutes to 6 minutes, addressing issue [#2219](#2219). The modification is implemented in the `test_running_real_remove_backup_groups_job` function in the `tests/integration/install/test_installation.py` file, where the `get_group` function's `retried` decorator timeout is updated from 4 minutes to 6 minutes. This change improves the system's handling of slow preliminary jobs by allowing more time for the API to delete a group and minimizing errors resulting from insufficient deletion time. The overall functionality and tests of the system remain unaffected.
* Init `RuntimeContext` from debug notebook to simplify interactive debugging flows ([#2253](#2253)). In this release, we have implemented a change to simplify interactive debugging flows in UCX workflows. We have introduced a new feature that initializes the `RuntimeContext` object from a debug notebook. The `RuntimeContext` is a subclass of `GlobalContext` that manages all object dependencies. Previously, all UCX workflows used a `RuntimeContext` instance for any object lookup, which could be complex during debugging. This change pre-initializes the `RuntimeContext` object correctly, making it easier to perform interactive debugging. Additionally, we have replaced the use of `Installation.load_local` and `WorkspaceClient` with the newly initialized `RuntimeContext` object. This reduces the complexity of object lookup and simplifies the code for debugging purposes. Overall, this change will make it easier to debug UCX workflows by pre-initializing the `RuntimeContext` object with the necessary configurations.
* Lint child dependencies recursively ([#2226](#2226)). In this release, we've implemented significant changes to our linting process for enhanced context awareness, particularly in the context of parent-child file relationships. The `DependencyGraph` class in the `graph.py` module has been updated with new methods, including `parent`, `root_dependencies`, `root_paths`, and `root_relative_names`, and an improved `_relative_names` method. These changes allow for more accurate linting of child dependencies. The `lint` function in the `files.py` module has also been modified to accept new parameters and utilize a recursive linting approach for child dependencies. The `databricks labs ucx lint-local-code` command has been updated to include a `paths` parameter and lint child dependencies recursively, improving the linting process by considering parent-child relationships and resulting in better contextual code analysis. The release contains integration tests to ensure the functionality of these changes, addressing issues [#2155](#2155) and [#2156](#2156).
* Removed deprecated `install.sh` script ([#2217](#2217)). In this release, we have removed the deprecated `install.sh` script from the codebase, which was previously used to install and set up the environment for the project. This script would check for the presence of Python binaries, identify the latest version, create a virtual environment, and install project dependencies. Going forward, developers will need to utilize an alternative method for installing and setting up the project environment, as the use of this script is now obsolete. We recommend consulting the updated documentation for guidance on the new installation process.
* Tentatively fix failure when running assessment without a hive_metastore ([#2252](#2252)). In this update, we have enhanced the error handling of the `LocalCheckoutContext` class in the `workspace_cli.py` file. Specifically, we have addressed the issue where a fatal failure occurred when running an assessment without a Hive metastore ([#2252](#2252)) by implementing a more graceful error handling mechanism. Now, when the metastore fails to load during the initialization of a `LinterContext` object, a warning message is logged instead, and the `MigrationIndex` is initialized with an empty list. This change is linked to the resolution of issue [#2221](#2221). Additionally, we have imported the `MigrationIndex` class from the `hive_metastore.migration_status` module and added a logger to the module. However, please note that functional tests for this specific modification have not been conducted.
* Total Storage Credentials count widget for Assessment Dashboard ([#2201](#2201)). In this commit, a new widget has been added to the Assessment Dashboard that displays the current total number of storage credentials created in the workspace, up to a limit of 200. This change includes a new SQL query to retrieve the count of storage credentials from the `inventory.external_locations` table and modifies the display of the widget with customized settings. Additionally, a new warning mechanism has been implemented to prevent migration from exceeding the UC storage credentials limit of 200. A new method, `get_roles_to_migrate`, has been added to `access.py` to retrieve the roles that need to be migrated. If the number of roles exceeds 200, a `RuntimeWarning` is raised. User documentation and manual testing have been updated to reflect these changes, but no unit or integration tests have been added yet. This feature is part of the implementation of issue [#1600](#1600) and is co-authored by Serge Smertin.
* Updated dashboard install using latest `lsql` release ([#2246](#2246)). In this release, the install function for the UCX dashboard has been updated in the `databricks/labs/ucx/install.py` file to use the latest `lsql` release. The `databricks labs install ucx` command has been modified to accommodate the updated `lsql` version and now includes new methods for upgrading dashboards from Redash to Lakeview, as well as creating and deleting dashboards in Lakeview, which also feature functionality to publish dashboards. The changes have been manually tested and verified on a staging environment. The query formatting in the dashboard has been improved, and the `--width` parameter is no longer necessary in certain instances. This update streamlines the dashboard installation process, enhances its functionality, and ensures its compatibility with the latest `lsql` release.
* Updated sqlglot requirement from <25.7,>=25.5.0 to >=25.5.0,<25.8 ([#2248](#2248)). In this update, we have adjusted the version requirements for the SQL transpiler library, sqlglot, in our pyproject.toml file. The requirement has been updated from ">=25.5.0, <25.7" to ">=25.5.0, <25.8", allowing us to utilize the latest features and bug fixes available in sqlglot version 25.7.0 while still maintaining our previous version constraint. The changelog from sqlglot's repository has been included in this commit, detailing the new features and improvements introduced in version 25.7.0. A list of commits made since the previous version is also provided. The diff of this commit shows that the change only affects the version constraint for sqlglot and does not impact any other parts of the codebase. This update ensures that we are using the most recent stable version of sqlglot while maintaining backward compatibility.

Dependency updates:

 * Updated sqlglot requirement from <25.7,>=25.5.0 to >=25.5.0,<25.8 ([#2248](#2248)).
@nfx nfx mentioned this issue Jul 26, 2024
nfx added a commit that referenced this issue Jul 26, 2024