Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect Handling of Sheet Names Resembling Cell Addresses in Formulas #3907

Closed
SrwLucas opened this issue Feb 19, 2024 · 1 comment · Fixed by #3915
Closed

Incorrect Handling of Sheet Names Resembling Cell Addresses in Formulas #3907

SrwLucas opened this issue Feb 19, 2024 · 1 comment · Fixed by #3915

Comments

@SrwLucas
Copy link

This is:

- [x] a bug report
- [ ] a feature request
- [ ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

What is the expected behavior?

The settings sheet name should remain unchanged in the formula after inserting a new column in the data sheet, even when the sheet name resembles a cell address and the formula targets a cell range within that sheet.

What is the current behavior?

After inserting a new column in the data sheet, the settings sheet name is offset, causing an error in the formula.

What are the steps to reproduce?

  1. Create a spreadsheet with two sheets, for example, a data sheet and a settings sheet.
  2. In the data sheet, set up a formula that references data in the settings sheet.
  3. Ensure that the settings sheet name resembles a cell address, such as "F1 (Settings)", and that the formula targets a cell range within that sheet, for example, "!A1:B1".
  4. Insert a new column before the column containing the formula.
  5. Observe the offset of the settings sheet name in the formula.
<?php

require __DIR__ . '/vendor/autoload.php';

// Create new Spreadsheet object
$spreadsheet = new \PhpOffice\PhpSpreadsheet\Spreadsheet();

// add code that show the issue here...
const DATA_TITLE = "DATA";
const SETTINGS_TITLE = "F1 (SETTINGS)";

const FORMULA_CELL = "A1";
const DATA_CELL = "A2";
const SETTINGS_DATA_CELL_ONE = "A1";
const SETTINGS_DATA_CELL_TWO = "B1";

$dataSheet = $spreadsheet->getSheet(0)->setTitle(DATA_TITLE);
$settingsSheet = $spreadsheet->createSheet(1)->setTitle(SETTINGS_TITLE);

$formula = sprintf("=SUM(%s,'%s'!%s:%s)", DATA_CELL, SETTINGS_TITLE, SETTINGS_DATA_CELL_ONE, SETTINGS_DATA_CELL_TWO);
$settingsValues = [[10], [20]];
$dataValues = [[$formula, 1]];

$settingsSheet->fromArray($settingsValues);
$dataSheet->fromArray($dataValues);

// log result: =SUM(A2,'F1 (SETTINGS)'!A1:B1)
error_log("\n\n ========== CELL VALUE BEFORE ======> " . print_r($dataSheet->getCell(FORMULA_CELL)->getValue(), true));
$dataSheet->insertNewColumnBefore("A");
// log result: =SUM(B2,'G1 (SETTINGS)'!A1:B1)
error_log("\n\n ========== CELL VALUE AFTER ======> " . print_r($dataSheet->getCell("B1")->getValue(), true));

### What features do you think are causing the issue

- [ ] Reader
- [ ] Writer
- [ ] Styles
- [ ] Data Validations
- [x] Formula Calculations
- [ ] Charts
- [ ] AutoFilter
- [ ] Form Elements

The issue seems to stem from the regular expressions used in the updateFormulaReferences method of the ReferenceHelper.php file, which incorrectly identify "F1" as a cell address without considering that this information is actually part of the sheet name.

### Which versions of PhpSpreadsheet and PHP are affected?
PhpSpreadsheet : 1.24 -> 2.0.0 (Didn't test older versions)
PHP: 8.0 -> 8.2 (Didn't test older versions)
@oleibman
Copy link
Collaborator

Confirmed. You have certainly opened a can of worms here. In addition to the problem you've documented, there are at least two others. A defined name with a name like Sheet1A1 would also be inappropriately changed in a formula by the existing logic, and sheet names in formulas are compared to the current sheet name case-sensitively and should be compared case-insensitively. Fixing these problems seems like it will have a high probability of regression problems. It will need to be handled carefully, and will take a while.

oleibman added a commit to oleibman/PhpSpreadsheet that referenced this issue Feb 22, 2024
Fix PHPOffice#3907. After row/column insertion/deletion, PhpSpreadsheet updates formulas which include cells which have moved. However, it can mis-identify cell addresses within the formula. Examples:
- `=SUM(A2,'F1 (SETTINGS)'!A1:B1)` It identifes F1 as a cell address.
- `=SUM(A2,'x F1 (SETTINGS)'!A1:B1)` It identifes F1 as a cell address. (This looks the same as the above, but, for technical reasons, it's different.)
- `=SUM(A2,definedname1A1)` It identifes A1 as a cell address.
- Sheet names in formulas are compared case-sensitively, and should be compared insensitively. This can make a difference if the formula includes its own sheet name, e.g. on sheet `Data`, formula `=SUM(DATA!A1:A2)` might have to change, but it will not do so with the existing logic.

The defined name part is fairly straightforward. The regular expressions that identify a cell address just have to be a bit more robust. It was doing a negative look-behind for an alphabetic character or dollar sign; underscore, period, and digits, all of which can be part of a defined name, need to be added to that list.

The other situations need a bit of a kludge, but not one so bad that I'm ashamed of it. The formulas will be altered before analysis so that sheet names are replaced with Unicode FFFD (sheetname does not match current sheet), FFFC (sheetname, enclosed in apostrophes, matches current sheet), and FFFB (sheetname, not enclosed in apostrophes, matches current sheet). This prevents the existing regular expressions from finding a cell address within a sheet name, and makes it easy to restore the original, with or without apostrophes, when the sheet name matches the current sheet and the cell(s) which it qualifies have to be changed.

Tests are added for all the situations mentioned above. No existing tests required changes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

2 participants