-
Notifications
You must be signed in to change notification settings - Fork 446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change startup script name generation #1020
Conversation
Hi @atrosinenko, Thank you for your contribution! We really value the time you've taken to put this together. Before we proceed with reviewing this pull request, please sign the Lightbend Contributors License Agreement: |
(qualifiedClassName, lowerCased.replace('.', '-'), lowerCased.split("\\.").last) | ||
} | ||
|
||
names.groupBy(_._3).toSeq.flatMap { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the preferred way to rephrase this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue that _._3
is hard to read as this parameter has no name.
A better way would be
names.groupBy {
case (_, _, foo) => foo
}.toSeq.flatMap {
...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I didn't thought about it. It really looks more readable.
Strange, at least one of Travis failures looks like some misconfiguration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your well crafted PR ( refactoring logic and unit tests ).
I think the logic is way to complex. You already have an amazingly simple idea in your code 😄
Instead of applying heuristics to hope for the best, we can actually be certain what name to use.
- Create a tuple of (
simpe name
,fully qualified name
) - Group by the
simple name
->Map[simple name, List[fully qualified name]
- Partition `(fully qualified names length == 1, fully qualified names length > 1)
- Extend
simple name
and start from step 1 with thefully qualified names length > 1
partition
(qualifiedClassName, lowerCased.replace('.', '-'), lowerCased.split("\\.").last) | ||
} | ||
|
||
names.groupBy(_._3).toSeq.flatMap { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue that _._3
is hard to read as this parameter has no name.
A better way would be
names.groupBy {
case (_, _, foo) => foo
}.toSeq.flatMap {
...
}
If you propose to extend the
So it may be not so simple. On the other hand, when these heuristics fail, it does not render resulting zip completely unusable (though, it may became unusable in fully automatic setups). So there may be another option to drop the most nested numbering logic completely (not very good, does the other means lower the chance of failure to some negligible level?). Or apply some iterative algorithm. Another problem, if we decide to iteratively extend the
If we just extend it from right to left, then we probably end up with meaningless |
This is true if we change the class name. In your example the camel cased name
I'm willing to choose this trade off. I consider this rather an edge case as this could be fixed on an application level by renaming classes or moving it to another package. I know naming is one of the harder problems in software engineering, but users of your scripts will be happy to have meaningful script names instead of weird package names ;) I first thought of adding a "resolve conflict strategy" as sbt-assembly does it, but I think it's an overkill as we need to document and maintain it. Long story short: A recursive algorithm extending only the conflicting classes until all conflicts are resolved seems the most stable and simplest solution :) |
So should the lower casing algorithm be completely dropped? Previously, class names were splitted into words separated with |
Either this or we replace the dots with underscored. This would keep backwards compatibility and wouldn't generate toooo ugly script names. |
Hi @atrosinenko, Thank you for your contribution! We really value the time you've taken to put this together. Before we proceed with reviewing this pull request, please sign the Lightbend Contributors License Agreement: |
Please excuse me for overcomplicating everything once again, but here is another try: now heuristics are even worse :( but they should give more meaningful results (see tests). Based on your review I have to either try to simplify it to some informally provable form or drop it in favor of simpler prefix-based strategy. |
Thanks for your time and dedication @atrosinenko :) Still I think this in an edge case and we should rather add the ability to configure script names then a complicated algorithm that can't be explain in two sentences. Maintaining and explaining this will cost more time I fear ;) What do you thing of this implementation? private case class MainClass(scriptName: String, fullyQualifiedClassName: String) {
val parts: Array[String] = toLowerCase(fullyQualifiedClassName).split("\\.")
def asTuple: (String, String) = (fullyQualifiedClassName, scriptName)
}
def getMainClasses(discoveredMainClasses: Seq[String]): Seq[(String, String)] = {
val mainClasses = discoveredMainClasses.map { qualifiedClassName =>
val lowerCased = toLowerCase(qualifiedClassName)
val parts = lowerCased.split("\\.")
MainClass(parts.last, qualifiedClassName)
}
disambiguateNames(mainClasses, 1).map(_.asTuple)
}
private def disambiguateNames(mainClasses: Seq[MainClass], expansionIndex: Int): Seq[MainClass] =
if (mainClasses.isEmpty) {
// end of recursion
mainClasses
} else {
// find duplicates
val (duplicates, uniques) = mainClasses
.groupBy(_.scriptName)
.partition {
case (_, classes) => classes.length > 1
}
// expand names for duplicated script names
val expandedScriptNames = for {
(_, classes) <- duplicates
extendedClasses <- classes.map(expandClassName(expansionIndex))
} yield extendedClasses
uniques.toSeq.flatMap(_._2) ++ disambiguateNames(expandedScriptNames.toSeq, expansionIndex + 1)
}
private def expandClassName(expansionIndex: Int)(mainClass: MainClass): MainClass = {
val scriptName = mainClass.parts.takeRight(expansionIndex + 1).mkString("_")
mainClass.copy(scriptName = scriptName)
}
|
Please excuse me for the delay and thank you for code advice. I have rewritten disambiguation logic once again, now heavily based on your proposal. Now it looks much simpler than from the second try. I decided to expand from the left after stripping common prefix instead of expanding from the right -- looks like it can give more meaningful distinction by features instead of some technical innermost packages (but I don't insist it is the most correct approach). I had to manually rebase previous commits as well, so it may be worth to minimally glance on them. |
Oops, fixed the corner case of single main class (the common prefix included the class name itself) -- that didn't caught by the tests on Linux. |
Thanks @atrosinenko I'll try to take a look this week. I'll be away for 4 weeks on vacation then and will merge it after that :) |
I'll be on vacation un 20.10. Thanks for your understanding if this can't be merged until then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all your patience and passion you put into this. Once again I'm requesting more simplification 😉
The main reasons is that the behaviour is non-obvious for the user. As a way forward I would suggest that we simply keep the disambiguateNames
method and remove all the prettifying stuff. The next step would be a pull request (if you have time and motivation) and add a setting like
executableScriptNames: SettingKey[Map[String, String]]
which provides a way to customize the generated script names.
*/ | ||
def toLowerCase(qualifiedClassName: String): String = { | ||
// suppose list is not very huge, so no need in tail recursion | ||
def split(chars: List[Char]): List[Char] = chars match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think this is too much for an edge case. Prettifying should be done by the user.
We could do that in a second pull request where we provide a setting that let's the user
configure executable script names for each main class.
|
||
private[this] def disambiguateNames( | ||
mainClasses: Seq[MainClass], | ||
expansionIndex: Int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that parameter name 👍
val parts = lowerCased.split("\\.") | ||
MainClass(qualifiedClassName, parts) | ||
} | ||
val commonPrefixLength = mainClasses.map(_.parts.init.toList).reduce(commonPrefix).size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would leave out every prettifying here. As mentioned below the way forward would be
a user facing setting where a mapping between mainClass and executable script name
can be configured.
(fullyQualifiedClassName, scriptName(expansionIndex)) | ||
} | ||
|
||
private[this] def disambiguateNames( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well done implementation. This the bare minimum for these feature to work
This commit tries to lower the chance of generation of multiple scripts with equal names in one archive. Fixes #1016. Additionally, an attempt was made to improve the word splitting for the script names, so, for example "UITest" becomes "ui-test" and not "u-i-test".
Please excuse me for huge delay. I have stripped some of complicated logic. For now, I have left lowercasing logic as-is , but changed other beautifying to simply the following:
I cannot prove right now that lower-casing of fully qualified class names is reversible (that is, cannot introduce name clashes), but names that differ just in letter case are already clashed on Windows. Supposing all lower-cased qualified names are distinct, this trivial disambiguation seems to not introduce clashes:
|
No need to apologise 😄 . I have to thank you for keeping up with all my change requests ❤️ The changes are looking maintainable (great docs, thank you 👍 ) and reasonable. |
I will try to make requested fixes soon. I spotted that name lower-casing logic can introduce clashes, at least in some corner cases like |
But since this logic just inserts some |
True, and you are right that these are too many coincidences ( or rather obscure class namings ). |
Factor out common code from `BashStartScriptPlugin` and `BatStartScriptPlugin`
I have added a note to the docs about name disambiguation and added a warning on script name conflicts (although, I did not test the warning itself at all, only added a test on duplication detection). When adding a warning, I refactored |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice refactoring! I like the shape this becomes. The documentation is also pretty neat.
During the refactoring you added an interesting type. I didn't quite get it, but if necessary this is good to go.
Thanks for you dedication ❤️
IO.write(script, scriptContent) | ||
// TODO - Better control over this! | ||
if (makeScriptsExecutable) | ||
script.setExecutable(true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You you simply do
script.setExecutable(makeScriptsExecutable)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, assuming script.setExecutable(false)
is no-op and works well on Windows :) Meanwhile, does building zips with executable bits set on Windows works at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course, I don't hesitate to simplify this line, if it is possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. No idea what happens on windows. We can leave this as is and should put a comment in there that setExecutable
may behave differently on different platforms.
|
||
IO.write(file, scriptContent) | ||
if (makeScriptsExecutable) | ||
file.setExecutable(true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You you simply do
script.setExecutable(makeScriptsExecutable)
def withScriptName(scriptName: String): SpecializedScriptConfig | ||
} | ||
|
||
protected[this] type SpecializedScriptConfig <: ScriptConfig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of this type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some template methods (or how they call it?) can depend on fields from the specialized type. Particularly, it was the way I allow Replacements.appDefines()
to access configLocation
and extraDefines
from specialized config.
mainClasses: Seq[String], | ||
config: SpecializedScriptConfig): Seq[(String, String)] | ||
|
||
protected[this] trait ScriptConfig { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like binding the script config type to the actual instance.
@@ -223,6 +223,48 @@ For two main classes ``com.example.FooMain`` and ``com.example.BarMain`` ``sbt s | |||
|
|||
Now you can package your application as usual, but with multiple start scripts. | |||
|
|||
A note on script names |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well done 👍
Forgot to say, there were some hardcoded |
All tests seem to pass, so the Then this is good to merge ❤️ |
I have added an explanation for the |
Nice nice nice 😎 I retriggered the one build that timed-out. If this is green, then we can merge this. |
Thanks for all your time and assistance! :) |
This PR tries to lower the chance of generation of multiple scripts
with equal names in one archive. Fixes #1016.
Additionally, an attempt was made to improve the word splitting
for the script names, so, for example
UITest
becomesui-test
and not
u-i-test
.I had to edit
ForwarderScripts.apply(...)
as well but I'm not sure I did it the right way.