[Bug] get wrong stream position while get "FileHeader" stream in ".hwp" file #1

Closed
sboh1214 opened this issue Sep 14, 2020 · 4 comments · Fixed by #5
Labels: bug (Something isn't working)

Comments

@sboh1214
Collaborator

Hello, I'm using OLEKit with .hwp files, a format used in Korea.
The file has a "FileHeader" stream with a fixed size of 256 bytes.
However, OLEKit throws an uncaught error, Thread 1: EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0),
when I try to get the stream's data with readDataToEnd().

let positiveURL = URL(fileURLWithPath: #file)
        .deletingLastPathComponent()
        .appendingPathComponent("SampleHwp")
        .appendingPathComponent("blank.hwp")
let ole = try OLEFile(positiveURL.path)
let fileHeaderStream = ole.root.children.first(where: { $0.name == "FileHeader"})!
let reader = try ole.stream(fileHeaderStream)
let data = reader.readDataToEnd()

You can also find this code snippet in my repo sboh1214/HwpKit

In a hex editor, the "FileHeader" stream should read as ASCII (at around position 16112):

48 57 50 20 44 6F 63 75 6D 65 6E 74 20 46 69 6C 65 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 04 03 00 05
HWP Document File ................ 4 3 0 5 (version codes of hwp file) --> Omitted below
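For reference, the bytes in the dump above can be decoded by hand. The following is a minimal Swift sketch that uses only Foundation and does not touch OLEKit. It assumes the signature occupies the first 32 bytes as NUL-padded ASCII, and that the four bytes after it are a little-endian version number; that second point is an assumption about the format, not something stated in this thread.

```swift
import Foundation

// First 36 bytes of the "FileHeader" stream, copied from the hex dump above.
let headerBytes: [UInt8] = [
    0x48, 0x57, 0x50, 0x20, 0x44, 0x6F, 0x63, 0x75,
    0x6D, 0x65, 0x6E, 0x74, 0x20, 0x46, 0x69, 0x6C,
    0x65, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x04, 0x03, 0x00, 0x05,
]

// Signature: 32-byte field, ASCII text followed by NUL padding.
let signatureField = headerBytes[0..<32]
let signatureBytes = signatureField.prefix { $0 != 0 }
let signature = String(bytes: signatureBytes, encoding: .ascii)

// Version: 4 bytes after the signature. Reading them most significant
// byte first assumes little-endian storage (an assumption, see above).
let versionBytes = headerBytes[32..<36]
let version = versionBytes.reversed().map { String($0) }.joined(separator: ".")

print(signature ?? "not ASCII", version)
```

If a reader returns bytes that do not decode this way, the bytes are most likely coming from the wrong position in the file rather than from a decoding problem.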

Overall, thanks for your library!

@MaxDesiatov
Collaborator

Thanks for checking out this library! If I understand correctly, in the HwpFileHeader initializer in HwpKit you're trying to read the signature, but there isn't enough data left in the corresponding DataReader. In the screenshot, byteOffset is already 255, but you're trying to read 32 more bytes, and the total length of the data is 256 bytes.

[Screenshot 2020-09-14 at 14 03 46]

Thanks for pointing me to this issue. DataReader should check the length by itself, so I'll add preconditions to DataReader to make this error obvious.
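As an illustration of that kind of length check, here is a minimal sketch of a cursor-based reader that validates the requested length before slicing. BoundedReader and ReadError are hypothetical names made up for this example; this is not OLEKit's DataReader, and it throws an error rather than using precondition so that the failure can be caught by the caller.

```swift
import Foundation

// Hypothetical stand-in for a cursor-based reader; not OLEKit's actual DataReader.
struct BoundedReader {
    enum ReadError: Error, Equatable {
        case outOfBounds(requested: Int, available: Int)
    }

    let data: Data
    private(set) var byteOffset = 0

    init(_ data: Data) { self.data = data }

    // Number of bytes between the cursor and the end of the data.
    var remaining: Int { data.count - byteOffset }

    // Reads exactly `length` bytes or throws; never reads past the end.
    mutating func readData(ofLength length: Int) throws -> Data {
        guard length >= 0, length <= remaining else {
            throw ReadError.outOfBounds(requested: length, available: remaining)
        }
        let start = data.startIndex + byteOffset
        let chunk = data.subdata(in: start..<start + length)
        byteOffset += length
        return chunk
    }
}

// A 256-byte stream, fully consumed by the first read.
var boundedReader = BoundedReader(Data(count: 256))
let firstRead = try boundedReader.readData(ofLength: 256)

// The second read fails with a descriptive error instead of crashing.
var secondReadError: Error?
do {
    _ = try boundedReader.readData(ofLength: 32)
} catch {
    secondReadError = error
}
print(firstRead.count, String(describing: secondReadError))
```

Whether to throw or to trap with a precondition is a design choice: a precondition treats an over-read as a programmer error, while a thrown error lets callers handle truncated files gracefully.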

@MaxDesiatov
Collaborator

MaxDesiatov commented Sep 14, 2020

Also, the error is specifically in this snippet in HwpFileHeader:

        data = dataReader.readDataToEnd()
        signature = String(data: dataReader.readData(ofLength: 32), encoding: .utf16LittleEndian) ?? "Error"

You've used readDataToEnd() once, which set the byteOffset value (the reading "cursor") to the end of the stream. There's nothing left to read, so you can't use readData again on this stream unless you use seek to rewind the reading position to somewhere data is still available. Or maybe you need to create a different stream from another entry altogether to get the data you're looking for. I don't know the details of your format well enough to say for sure.
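The cursor behaviour described here can be sketched with a small self-contained reader. CursorReader below is a hypothetical type written only for this illustration; its method names mirror the ones mentioned in this thread, but the signatures and the clamping behaviour are assumptions, not OLEKit's API.

```swift
import Foundation

// Hypothetical cursor-based reader, written only to illustrate the behaviour
// described above; it is not OLEKit's DataReader.
struct CursorReader {
    let data: Data
    private(set) var byteOffset = 0

    init(_ data: Data) { self.data = data }

    // Returns up to `length` bytes from the cursor and advances it.
    mutating func readData(ofLength length: Int) -> Data {
        let start = data.startIndex + byteOffset
        let end = min(start + max(length, 0), data.endIndex)
        byteOffset = end - data.startIndex
        return data.subdata(in: start..<end)
    }

    // Reads everything between the cursor and the end of the data.
    mutating func readDataToEnd() -> Data {
        readData(ofLength: data.count - byteOffset)
    }

    // Moves the cursor, clamped to the valid range.
    mutating func seek(toOffset offset: Int) {
        byteOffset = min(max(offset, 0), data.count)
    }
}

// A 256-byte stream that starts with the ASCII signature.
var bytes = Array("HWP Document File".utf8)
bytes += [UInt8](repeating: 0, count: 256 - bytes.count)
var reader = CursorReader(Data(bytes))

let whole = reader.readDataToEnd()            // cursor is now at 256
let afterEnd = reader.readData(ofLength: 32)  // empty: nothing left to read

reader.seek(toOffset: 0)                      // rewind to the start
let field = reader.readData(ofLength: 32)
let signature = String(bytes: field.prefix { $0 != 0 }, encoding: .ascii)
print(whole.count, afterEnd.count, signature ?? "nil")
```

An alternative that avoids seeking is to read the whole stream once and slice the signature out of the resulting Data, for example with prefix(32).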

Does that resolve your issue?

@MaxDesiatov MaxDesiatov added the question Further information is requested label Sep 14, 2020
@sboh1214
Collaborator Author

sboh1214 commented Sep 15, 2020

Thank you for your kind answer and for investigating my error!
However, I still have one more question, related to what I wrote above.

I opened this file with ole-py, another OLE/COM tool written in Python.
You can see the code in this repo:
https://github.com/sboh1214/ole-py/blob/master/test.py

The output shows that the "FileHeader" stream of the .hwp file starts with the signature "HWP Document File", encoded in ASCII.

('FileHeader', 'DocInfo', '\x05HwpSummaryInformation', 'PrvImage', 'PrvText', 'Scripts/JScriptVersion', 'Scripts/DefaultJScript', 'DocOptions/_LinkDoc', 'BinData/BIN0001.png', 'BodyText/Section0')
========================================
HWP Document File

However, when I use OLEKit, I get the result below.
https://github.com/sboh1214/HwpKit/blob/646014bb13910badc6f4dbc6cc88589d249495dd/Tests/HwpKitTests/HwpKitTests.swift#L35

Optional("´¥\0­\u{14}zøÄf�"\u{12}ÍA�tgu�9¹\u{19}�>�\u{1E}\u{1D}W\u{1A}\rñ£")

Can you help me figure out what is wrong? Sorry for using this infamous file format :)

@MaxDesiatov MaxDesiatov added bug Something isn't working and removed question Further information is requested labels Sep 15, 2020
@MaxDesiatov
Collaborator

Thanks for the update, this looks like a bug to me, I'm investigating now...

@MaxDesiatov MaxDesiatov assigned MaxDesiatov and unassigned sboh1214 Sep 15, 2020
@sboh1214 sboh1214 changed the title from [Error] readDataToEnd() Thread 1: EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0) to [Bug] get wrong stream position while get "FileHeader" stream in ".hwp" file on Sep 16, 2020
MaxDesiatov added a commit that referenced this issue Sep 16, 2020
The sector data truncation logic was checking the constant value passed to the `oleStream` function, instead of the incremented iteration index.

Resolves #1.
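
To illustrate the class of bug the commit message describes, here is a minimal sketch of reading a stream by following a sector chain and trimming the result to the declared stream size. The readChain function, its parameters, and the endOfChain marker are all hypothetical and are not taken from OLEKit's source. The point is only that every per-iteration decision has to use the advancing index (`current`), not the constant starting value passed in (`firstSector`).

```swift
import Foundation

// Hypothetical marker for the last sector in a chain.
let endOfChain = -1

// Hypothetical helper: `fat[i]` holds the index of the sector that follows
// sector `i`. The result is trimmed so it never exceeds `streamSize`.
func readChain(sectors: [Data], fat: [Int], firstSector: Int, streamSize: Int) -> Data {
    var result = Data()
    var current = firstSector           // iteration index, advanced every loop
    while current != endOfChain, result.count < streamSize {
        let sector = sectors[current]   // index with `current`, not `firstSector`
        let needed = streamSize - result.count
        result.append(sector.prefix(needed))  // truncates only the final sector
        current = fat[current]
    }
    return result
}

// Three 4-byte sectors; the stream occupies sectors 1 and 2 and is 6 bytes long.
let sectors = [Data([0, 0, 0, 0]), Data("HWP ".utf8), Data("Doc!".utf8)]
let fat = [endOfChain, 2, endOfChain]
let stream = readChain(sectors: sectors, fat: fat, firstSector: 1, streamSize: 6)
print(String(decoding: stream, as: UTF8.self))
```

Using `firstSector` inside the loop would re-read or mis-truncate the first sector on every iteration, which is consistent with the kind of garbage output reported earlier in this thread, although the actual fix in the referenced commit may differ in detail.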