Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when reading FILTER function #3245

Closed
1 of 8 tasks
kory- opened this issue Dec 14, 2022 · 6 comments
Closed
1 of 8 tasks

Error when reading FILTER function #3245

kory- opened this issue Dec 14, 2022 · 6 comments
Assignees

Comments

@kory-
Copy link

kory- commented Dec 14, 2022

This is:

- [x] a bug report
- [ ] a feature request
- [ ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

What is the expected behavior?

Can be read and output without error
Maybe same problem #390

What is the current behavior?

=FILTER(A:A, A:A<>"")

When reading an Excel file that uses the FILTER function, I get a "Formula Error: Unexpected ," error.

What are the steps to reproduce?

Please provide a Minimal, Complete, and Verifiable example of code that exhibits the issue without relying on an external Excel file or a web server:

<?php

require __DIR__ . '/vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\Reader\Xlsx as Reader;
use PhpOffice\PhpSpreadsheet\Writer\Xlsx as Writer;

// Create new Spreadsheet object
$spreadsheet = new \PhpOffice\PhpSpreadsheet\Spreadsheet();

$reader = new Reader;
$file_name = 'template.xlsx';
$spreadsheet = $reader->load($file_name);

$writer = new Writer($spreadsheet);
$outputPath = 'output.xlsx';
$writer->save( $outputPath );

If this is an issue with reading a specific spreadsheet file, then it may be appropriate to provide a sample file that demonstrates the problem; but please keep it as small as possible, and sanitize any confidential information before uploading.

What features do you think are causing the issue

  • Reader
  • Writer
  • Styles
  • Data Validations
  • Formula Calculations
  • Charts
  • AutoFilter
  • Form Elements

Does an issue affect all spreadsheet file formats? If not, which formats are affected?

XLSX

Which versions of PhpSpreadsheet and PHP are affected?

phpspreadsheet: 1.25.2 php: 7.4.6
template.xlsx

@MarkBaker MarkBaker self-assigned this Dec 14, 2022
@MarkBaker
Copy link
Member

The problem isn't in loading/reading the file, it's in the calculation engine (and possibly requires an update to the Xlsx Writer as well).

As described in the MS documentation:

The newer functions in MS Excel, introduced since Excel 2010 are prefixed with _xlfn., so a function like SWITCH() (introduced in Excel 2016) is maintained internally in Excel as _xlfn.SWITCH(), and maintained in PhpSpreadsheet in the same way. When you see this function in Excel itself, it strips the prefix from the displayed function name if that version of Excel supports the function, but the prefix is visible in older versions of Excel, and the calculation cannot be executed in those older versions.
When we do a calculation in PhpSpreadsheet, we don't care if the prefix is present or not, the SWITCH() function will still be executed if it is SWITCH() or _xlfn.SWITCH().

If you add a new formula containing that function in your own code, you would probably write it as =SWITCH(...) rather than as =_xlfn.SWITCH(...) (we can't expect you to know about the internals of function names, or the list of functions that require the prefix), so the calculation engine will work with either.

We do have to ensure that the full function name (including the prefix) is used when we save an Xlsx file though, to ensure that MS Excel behaviour is maintained with that file.
So when we write the formula when you save a spreadsheet, we add the prefix if it isn't present; so if you create a formula with the function SWITCH(), we write it to the saved file as _xlfn.SWITCH(), so Excel behaviour will be maintained. We have a lookup table listing all the affected functions in /Writer/Xlsx/Xlfn.php

All of that works as expected.


However, MS have now decided to add a second prefix for two of the new functions that already existed and already required the _xlfn. prefix, but they haven't yet documented this. I don't know the purpose of the new _xlws. prefix, but it seems that it is used in addition to the existing _xlfn. prefix, so these two functions, SORT() and FILTER() may now appear internally in the Excel file as _xlfn.FILTER() or as _xlfn._xlws.FILTER(). Because PhpSpreadsheet is aware of the _xlfn prefix, it will work without issue in the former case; but because it isn't aware of the additional new _xlws. prefix; files that contain formula using SORT() or FILTER() with that additional new prefix aren't being handled.

Your file is created using the newest Excel, so it's using the double _xlfn._xlws.FILTER() rather than the simpler _xlfn.FILTER().


Fixing the calculation engine to handle the dual prefix is fairly straightforward, and I've already made that change locally; but I'll also need to modify the writer to add that second prefix for SORT() and FILTER() if it isn't present to match the latest Excel behaviour, and that's a bit more complicated to implement. It will take me a bit longer before I can submit a PR for the change; but I should have a fully tested PR to handle it later in the week. I'll delay the planned 1.26.0 release until I can include it.

@kory-
Copy link
Author

kory- commented Dec 14, 2022

I am relieved to hear that you are already aware of this and that it is being corrected.
I will wait for the one that includes it. Thank you.

@MarkBaker
Copy link
Member

I am relieved to hear that you are already aware of this and that it is being corrected.

Thanks to the Python XlsxWriter library

@kory-
Copy link
Author

kory- commented Dec 19, 2022

Sorry to add, but if the FILTER function is specified, the written

=@FILTER(A:A, A:A<>"")

and the unintended result of the @mark being added is also a known issue?

@MarkBaker
Copy link
Member

=@filter(A:A, A:A<>"")

PhpSpreadsheet doesn't yet fully support dynamic array functions, and cannot without a significant bc break; and that includes support for the new Spill Range Operator (#) and the Implicit Intersection Operator (@). At present, the calculation engine itself only works with fixed range functions; and while the implementations for dynamic range functions have been implemented, the fixed range limitation means that they only populate the first cell in the dynamic range. While this is (in part) the purpose of the @ operator, I can't provide support for it in version 1.x of PhpSpreadsheet, because full implementation of dynamic array formulae requires some major changes to the library that create a bc break.

I've been pushing to get the v2.0 release ready to support this, and the work is close to complete, and that includes supporting the Implicit Intersection Operator (@). But it is not yet ready for release; and recent updates to v1.x have created a merge conflict nightmare that will take some time to resolve, especially as I'm also trying to provide the functionality to support Structured References for tables (which also use the @ operator), and handling for the new Excel Lambda functions.

@kory-
Copy link
Author

kory- commented Dec 20, 2022

Thanks for the reply.
I see the problem is deeper rooted than I thought, I'll wait for v2.

MarkBaker added a commit that referenced this issue Dec 21, 2022
### Added

- Extended flag options for the Reader `load()` and Writer `save()` methods
- Apply Row/Column limits (1048576 and XFD) in ReferenceHelper [PR #3213](#3213)
- Allow the creation of In-Memory Drawings from a string of binary image data, or from a stream. [PR #3157](#3157)
- Xlsx Reader support for Pivot Tables [PR #2829](#2829)
- Permit Date/Time Entered on Spreadsheet to be calculated as Float [Issue #1416](#1416) [PR #3121](#3121)

### Changed

- Nothing

### Deprecated

- Direct update of Calculation::suppressFormulaErrors is replaced with setter.
- Font public static variable defaultColumnWidths replaced with constant DEFAULT_COLUMN_WIDTHS.
- ExcelError public static variable errorCodes replaced with constant ERROR_CODES.
- NumberFormat constant FORMAT_DATE_YYYYMMDD2 replaced with existing identical FORMAT_DATE_YYYYMMDD.

### Removed

- Nothing

### Fixed

- Fixed handling for `_xlws` prefixed functions from Office365 [Issue #3245](#3245) [PR #3247](#3247)
- Conditionals formatting rules applied to rows/columns are removed [Issue #3184](#3184) [PR #3213](#3213)
- Treat strings containing currency or accounting values as floats in Calculation Engine operations [Issue #3165](#3165) [PR #3189](#3189)
- Treat strings containing percentage values as floats in Calculation Engine operations [Issue #3155](#3155) [PR #3156](#3156) and [PR #3164](#3164)
- Xlsx Reader Accept Palette of Fewer than 64 Colors [Issue #3093](#3093) [PR #3096](#3096)
- Use Locale-Independent Float Conversion for Xlsx Writer Custom Property [Issue #3095](#3095) [PR #3099](#3099)
- Allow setting AutoFilter range on a single cell or row [Issue #3102](#3102) [PR #3111](#3111)
- Xlsx Reader External Data Validations Flag Missing [Issue #2677](#2677) [PR #3078](#3078)
- Reduces extra memory usage on `__destruct()` calls [PR #3092](#3092)
- Additional properties for Trendlines [Issue #3011](#3011) [PR #3028](#3028)
- Calculation suppressFormulaErrors fix [Issue #1531](#1531) [PR #3092](#3092)
- Permit Date/Time Entered on Spreadsheet to be Calculated as Float [Issue #1416](#1416) [PR #3121](#3121)
- Incorrect Handling of Data Validation Formula Containing Ampersand [Issue #3145](#3145) [PR #3146](#3146)
- Xlsx Namespace Handling of Drawings, RowAndColumnAttributes, MergeCells [Issue #3138](#3138) [PR #3136](#3137)
- Generation3 Copy With Image in Footer [Issue #3126](#3126) [PR #3140](#3140)
- MATCH Function Problems with Int/Float Compare and Wildcards [Issue #3141](#3141) [PR #3142](#3142)
- Fix ODS Read Filter on number-columns-repeated cell [Issue #3148](#3148) [PR #3149](#3149)
- Problems Formatting Very Small and Very Large Numbers [Issue #3128](#3128) [PR #3152](#3152)
- XlsxWrite preserve line styles for y-axis, not just x-axis [PR #3163](#3163)
- Xlsx Namespace Handling of Drawings, RowAndColumnAttributes, MergeCells [Issue #3138](#3138) [PR #3137](#3137)
- More Detail for Cyclic Error Messages [Issue #3169](#3169) [PR #3170](#3170)
- Improved Documentation for Deprecations - many PRs [Issue #3162](#3162)
@kory- kory- closed this as completed Dec 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants