RichText can not get the correct value, when one part is plain text and the other part is rich text. #442

Closed
expbenson opened this issue Mar 28, 2018 · 6 comments

Comments

@expbenson

This is:

- [x] a bug report
- [ ] a feature request
- [ ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

What is the expected behavior?

When I call RichText()->getPlainText(), I should get the complete text, including both the plain-text and the rich-text parts.

What is the current behavior?

In an xlsx file, for a cell value like "plain text, rich text", RichText()->getPlainText() returns only the rich-text part ("rich text"). The first part is missing.

What are the steps to reproduce?

In the source code at https://github.com/PHPOffice/PhpSpreadsheet/blob/develop/src/PhpSpreadsheet/Reader/Xlsx.php, line 2070:

foreach ($is->r as $run) {
    if (isset($run->rPr)) {
        // ...
    }
}

The else branch is missing.

It should be:

if (isset($run->rPr)) {
    // ...
} else {
    $value->createText(StringHelper::controlCharacterOOXML2PHP((string) $run->t));
}

In Xlsx, a cell can contain one part that is plain text and another part that is rich text.
In the code above, a $run element may therefore contain only a "t" child element, with no "rPr".
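
To illustrate why the missing else branch loses text, here is a minimal sketch that uses only SimpleXML, not PhpSpreadsheet itself. The XML fragment and the collectText() helper are hypothetical stand-ins for a rich-text string item and for the reader loop quoted above; they are assumptions for illustration, not the library's actual code.

```php
<?php
// Hypothetical stand-in for a rich-text string item: one plain run
// (no <rPr>) followed by one bold run (with <rPr>).
$xml = <<<XML
<si>
  <r><t xml:space="preserve">plain text, </t></r>
  <r><rPr><b/></rPr><t>rich text</t></r>
</si>
XML;

$is = simplexml_load_string($xml);

/**
 * Concatenates the text of all runs. When $handlePlainRuns is false,
 * runs without <rPr> are skipped, which mimics the behavior reported
 * in this issue. When true, the else branch picks them up.
 */
function collectText(SimpleXMLElement $is, bool $handlePlainRuns): string
{
    $text = '';
    foreach ($is->r as $run) {
        if (isset($run->rPr)) {
            // Styled run: the text lives in the <t> child.
            $text .= (string) $run->t;
        } elseif ($handlePlainRuns) {
            // Plain run: only a <t> child, no <rPr>.
            $text .= (string) $run->t;
        }
    }

    return $text;
}

echo collectText($is, false), PHP_EOL; // rich text
echo collectText($is, true), PHP_EOL;  // plain text, rich text
```

With the plain runs skipped, only the styled part survives, which matches the output described in this report.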

Which versions of PhpSpreadsheet and PHP are affected?

PhpSpreadsheet 1.2.0, PHP 7.1.6.

@BlueM

BlueM commented Mar 29, 2018

I can confirm that the problem exists in 1.2.0 and was gone after downgrading to 1.1.0.

@sztanpet

I have tested the proposed solution and can confirm it works fine.

@expbenson
Author

The code below reproduces the bug. (Prepare an empty xlsx file first.)

$spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load('empty.xlsx');
$worksheet = $spreadsheet->getActiveSheet();
$richText = new \PhpOffice\PhpSpreadsheet\RichText\RichText();
$richText->createText('plain text');
$payable = $richText->createTextRun('rich text');
$payable->getFont()->setBold(true);
$payable->getFont()->setColor( new \PhpOffice\PhpSpreadsheet\Style\Color( \PhpOffice\PhpSpreadsheet\Style\Color::COLOR_DARKGREEN ) );
$worksheet->getCell('A1')->setValue($richText);

$output = "richtext.xlsx";
$writer = \PhpOffice\PhpSpreadsheet\IOFactory::createWriter($spreadsheet, "Xlsx");
$writer->save($output);

$spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load($output);
$worksheet = $spreadsheet->getActiveSheet();
$cell = $worksheet->getCell('A1');
/** @var \PhpOffice\PhpSpreadsheet\RichText\RichText $value */
$value = $cell->getValue();
echo $value->getPlainText();
// output: 'rich text'

@dancostinel

dancostinel commented Mar 29, 2018

Hi. I'm facing the same problem. I have an .xlsx file in which some rows contain plain text, part of which is styled (underline, bold, or italic) and part of which is not.

  • "phpoffice/phpspreadsheet": "^1.2",
  • PHP 7.1.14

Here's my case.
This is the xlsx file content that I want to read and save to the database exactly as is: https://imgur.com/a/9Ptee
The code I used:

$path = $this->get('kernel')->getRootDir() . '/../intrebari.xlsx';
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader('Xlsx');
$reader->setReadDataOnly(true);
$spreadsheet = $reader->load($path);
$worksheet = $spreadsheet->getActiveSheet();
$highestRow = $worksheet->getHighestRow(); // e.g. 10
$highestColumn = $worksheet->getHighestColumn(); // e.g 'F'
$highestColumnIndex = \PhpOffice\PhpSpreadsheet\Cell\Coordinate::columnIndexFromString($highestColumn); // e.g. 6 for 'F' (the index is 1-based)

echo '<table>' . "\n";
for ($row = 1; $row <= $highestRow; ++$row) {
    echo '<tr>' . PHP_EOL;
    for ($col = 1; $col <= $highestColumnIndex; ++$col) {
        $cellValue = $worksheet->getCellByColumnAndRow($col, $row)->getValue();
        if ($cellValue instanceof \PhpOffice\PhpSpreadsheet\RichText\RichText) {
            foreach ($cellValue->getRichTextElements() as $elem) {
                echo '<td>'. $elem->getFont()->getBold() . '</td>';
            }
        } else {
            echo '<td>'. $cellValue . '</td>';
        }
    } // end of the column loop (this closing brace was missing)
    echo '</tr>' . PHP_EOL;
}
echo '</table>' . PHP_EOL;

Using this line:

echo '<td>'. $elem->getFont()->getBold() . '</td>';

I expect to detect, for example, the bold i in the word miile-i, but I don't. The same goes for şi, which is both italic and bold.

One more issue: $highestColumn = $worksheet->getHighestColumn(); reports that columns D and E have content (so the loop iterates over them too), but they don't, as you can clearly see in the picture.

Any tips for fixing these issues? Thanks!

LE: Downgrading to v1.1.0, as BlueM suggested, didn't help.

@gintsmurans
Contributor

Yup, same issue here. v1.2.0

@sztanpet

sztanpet commented Apr 8, 2018

The current develop branch (@ 5f03659) still has the issue.

@expbenson expbenson reopened this Apr 8, 2018
Dfred pushed a commit to Dfred/PhpSpreadsheet that referenced this issue Nov 20, 2018