-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Data Validation in XML reader #3592
Comments
In order to avoid the performance problem that you mention with ranges, the Xlsx Reader calls |
At any rate, you have created an unusually good example. One of your data validations exposes a nasty problem. Your "any of the texts in E1:E5", which Xlsx would store as |
Thanks for the hint to shrinkRangeToFit. This does the trick for my purposes, though in general it could be a problem, but for all formats. To explore this bug I added a simple formula to my test sheet: If I open it with PhpSpreadsheet and save as Xlsx the following file is produced: PhpSpreadsheet tries to convert the formula but leaves the validation rule unchanged - is there any syntactical difference or could the same code be used? But in either way the code seems to be flawed (see the example) as PhpSpreadsheet ruins the formula. If helpful and if I did not make any mistake during conversion, I could open a separate bug report for this. |
No need for a separate bug report. I am aware of the additional problem and am working on it. |
And ... the DataValidator isValid function is incomplete - it currently validates only one type. That will need to be fixed as part of the PR I'm working on. |
Fix PHPOffice#3592. Fix PHPOffice#3594. The original issue asked that Data Validation be added for Xml spreadsheets. However, doing so exposed some other Data Validation problems, which are corrected along with the Xml portion. The main other problem is that the code for `Cell\DataValidator::isValid` covered almost no possibilities. It is expanded to include all possibilities, except for Custom Types, which I will continue to think about. Many tests are added. An obscure problem is that Xlsx Reader does not quite work properly when the Data Validation cells are beyond the high-used column or row. We use the high-used values because it is completely impractical to set all the cells in a range when an entire column or row is selected. This change causes the leftmost top cell in a range to be explicitly allocated, affecting the high-used values. It is not 100% effective, but it will be much more difficult to get into the problem situation, which I will continue to think about. Although our expectation is that we will not see any Xml Spreadsheets using non-standard namespacing, much of the Xml Reader code is changed to use proper namespace techniques. A few areas which just had nothing to do with this change continue to use less robust techniques; I plan to address those soon after implementing this change. One of the Data Validation types is for time of day. Testing revealed that `Shared\Date::convertIsoDate` was insufficiently robust when determining whether its input consisted of a time of day without date. It is now more robust.
Fix #3592. Fix #3594. The original issue asked that Data Validation be added for Xml spreadsheets. However, doing so exposed some other Data Validation problems, which are corrected along with the Xml portion. The main other problem is that the code for `Cell\DataValidator::isValid` covered almost no possibilities. It is expanded to include all possibilities, except for Custom Types, which I will continue to think about. Many tests are added. An obscure problem is that Xlsx Reader does not quite work properly when the Data Validation cells are beyond the high-used column or row. We use the high-used values because it is completely impractical to set all the cells in a range when an entire column or row is selected. This change causes the leftmost top cell in a range to be explicitly allocated, affecting the high-used values. It is not 100% effective, but it will be much more difficult to get into the problem situation, which I will continue to think about. Although our expectation is that we will not see any Xml Spreadsheets using non-standard namespacing, much of the Xml Reader code is changed to use proper namespace techniques. A few areas which just had nothing to do with this change continue to use less robust techniques; I plan to address those soon after implementing this change. One of the Data Validation types is for time of day. Testing revealed that `Shared\Date::convertIsoDate` was insufficiently robust when determining whether its input consisted of a time of day without date. It is now more robust.
This is:
What is the expected behavior?
Respect datavalidation in the XML reader
What is the current behavior?
Validation is not yet parsed.
What features do you think are causing the issue
Does an issue affect all spreadsheet file formats? If not, which formats are affected?
Only XML format is affected
Which versions of PhpSpreadsheet and PHP are affected?
All versions are affected
I attach a simple XML file which should cover the most relevant cases - if you need anything more just ask.
As I am still stuck with an ancient version I implemented this myself (quite rude, but it still might be used as a starting point - actually it seems easier than I expected. $children_x are the Excel-namespaced children of the worksheet).
Care must be taken with the "hide"-Options: XML uses "show" as a default and can only hide; in PhpSpreadsheet the defaults are set the other way round.
One big issue seems to be performance if ranges are used. Column A in the sample file uses a simple, single validation rule, column B uses one with exception of B1 - the XML reflects how Excel thinks of this. I have not found a way to replicate this into PhpSpreadsheet, but using Coordinate::extractAllCellReferencesInRange() to split ranges into single cells is really expensive.
validation.txt
validate.txt
The text was updated successfully, but these errors were encountered: