major new feature: support array compression and decompression
fangq committed Apr 28, 2019
1 parent 9c01046 commit 3322f6f
Showing 12 changed files with 410 additions and 40 deletions.
7 changes: 7 additions & 0 deletions AUTHORS.txt
@@ -22,6 +22,13 @@ The script loadjson.m was built upon previous works by
- Joel Feenstra: http://www.mathworks.com/matlabcentral/fileexchange/20565
date: 2008/07/03

The data compression/decompression utilities ({zlib,gzip,base64}{encode,decode}.m)
were copied from

- "Byte encoding utilities" by Kota Yamaguchi
https://www.mathworks.com/matlabcentral/fileexchange/39526-byte-encoding-utilities
date: 2013/01/04


This toolbox contains patches submitted by the following contributors:

30 changes: 30 additions & 0 deletions LICENSE_BSD.txt
@@ -23,3 +23,33 @@ ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
The views and conclusions contained in the software and documentation are those of the
authors and should not be interpreted as representing official policies, either expressed
or implied, of the copyright holders.




For the included compression/decompression utilities (base64encode.m, base64decode.m,
gzipencode.m, gzipdecode.m, zlibencode.m, zlibdecode.m), the author Kota Yamaguchi
requires the following copyright declaration:

Copyright (c) 2012, Kota Yamaguchi
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
24 changes: 24 additions & 0 deletions base64decode.m
@@ -0,0 +1,24 @@
function output = base64decode(input)
%BASE64DECODE Decode Base64 string to a byte array.
%
% output = base64decode(input)
%
% The function takes a Base64 string INPUT and returns a uint8 array
% OUTPUT. JAVA must be running to use this function. The result is always
% given as a 1-by-N array, and doesn't retrieve the original dimensions.
%
% See also base64encode
%
% Copyright (c) 2012, Kota Yamaguchi
% URL: https://www.mathworks.com/matlabcentral/fileexchange/39526-byte-encoding-utilities
% License : BSD, see LICENSE_*.txt
%

error(nargchk(1, 1, nargin));
error(javachk('jvm'));
if ischar(input), input = uint8(input); end

output = typecast(org.apache.commons.codec.binary.Base64.decodeBase64(input), 'uint8')';

end

23 changes: 23 additions & 0 deletions base64encode.m
@@ -0,0 +1,23 @@
function output = base64encode(input)
%BASE64ENCODE Encode a byte array using Base64 codec.
%
% output = base64encode(input)
%
% The function takes a char, int8, or uint8 array INPUT and returns Base64
% encoded string OUTPUT. JAVA must be running to use this function. Note
% that encoding doesn't preserve input dimensions.
%
% See also base64decode
%
% Copyright (c) 2012, Kota Yamaguchi
% URL: https://www.mathworks.com/matlabcentral/fileexchange/39526-byte-encoding-utilities
% License : BSD, see LICENSE_*.txt
%

error(nargchk(1, 1, nargin));
error(javachk('jvm'));
if ischar(input), input = uint8(input); end

output = char(org.apache.commons.codec.binary.Base64.encodeBase64Chunked(input))';

end
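
A minimal round-trip sketch for the two Base64 helpers above. It assumes a running JVM with Apache Commons Codec on the Java class path, which both helpers require. Because `base64encode` calls `encodeBase64Chunked`, the encoded text is broken into 76-character lines terminated by CR/LF; the decoder ignores those characters.

```matlab
% Round trip through base64encode/base64decode (sketch; needs a JVM).
original = uint8('hello world');       % 1-by-11 uint8 row vector
encoded  = base64encode(original);     % char row vector, ends with CR/LF
decoded  = base64decode(encoded);      % always a 1-by-N uint8 row vector

% The bytes survive the round trip; the original shape would not,
% so callers must reshape the result themselves.
assert(isequal(decoded, original));
```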
34 changes: 34 additions & 0 deletions gzipdecode.m
@@ -0,0 +1,34 @@
function output = gzipdecode(input)
%GZIPDECODE Decompress input bytes using GZIP.
%
% output = gzipdecode(input)
%
% The function takes a compressed byte array INPUT and returns inflated
% bytes OUTPUT. The INPUT is a result of GZIPENCODE function. The OUTPUT
% is always a 1-by-N uint8 array. JAVA must be enabled to use the function.
%
% See also gzipencode typecast
%
% Copyright (c) 2012, Kota Yamaguchi
% URL: https://www.mathworks.com/matlabcentral/fileexchange/39526-byte-encoding-utilities
% License : BSD, see LICENSE_*.txt
%

error(nargchk(1, 1, nargin));
error(javachk('jvm'));
if ischar(input)
    warning('gzipdecode:inputTypeMismatch', ...
            'Input is char, but treated as uint8.');
    input = uint8(input);
end
if ~isa(input, 'int8') && ~isa(input, 'uint8')
    error('Input must be either int8 or uint8.');
end

gzip = java.util.zip.GZIPInputStream(java.io.ByteArrayInputStream(input));
buffer = java.io.ByteArrayOutputStream();
org.apache.commons.io.IOUtils.copy(gzip, buffer);
gzip.close();
output = typecast(buffer.toByteArray(), 'uint8')';

end
32 changes: 32 additions & 0 deletions gzipencode.m
@@ -0,0 +1,32 @@
function output = gzipencode(input)
%GZIPENCODE Compress input bytes with GZIP.
%
% output = gzipencode(input)
%
% The function takes a char, int8, or uint8 array INPUT and returns
% compressed bytes OUTPUT as a uint8 array. Note that the compression
% doesn't preserve input dimensions. JAVA must be enabled to use the
% function.
%
% See also gzipdecode typecast
%
% Copyright (c) 2012, Kota Yamaguchi
% URL: https://www.mathworks.com/matlabcentral/fileexchange/39526-byte-encoding-utilities
% License : BSD, see LICENSE_*.txt
%

error(nargchk(1, 1, nargin));
error(javachk('jvm'));
if ischar(input), input = uint8(input); end
if ~isa(input, 'int8') && ~isa(input, 'uint8')
    error('Input must be either char, int8 or uint8.');
end

buffer = java.io.ByteArrayOutputStream();
gzip = java.util.zip.GZIPOutputStream(buffer);
gzip.write(input, 0, numel(input));
gzip.close();
output = typecast(buffer.toByteArray(), 'uint8')';

end
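
Since both gzip helpers work on flat byte arrays, numeric data has to be cast to bytes before compression and cast back and reshaped afterwards. The sketch below illustrates that pattern with `typecast`; it assumes a running JVM and that `org.apache.commons.io.IOUtils` is available on the Java class path, as `gzipdecode` requires. The variable names are illustrative only.

```matlab
% Compress and restore a double matrix with gzipencode/gzipdecode (sketch).
data   = [1.5 2.5; 3.5 4.5];                        % 2-by-2 double
bytes  = typecast(data(:), 'uint8');                % 32-by-1 uint8 (8 bytes per double)
packed = gzipencode(bytes);                         % 1-by-N compressed uint8

unpacked = gzipdecode(packed);                      % 1-by-32 uint8
restored = reshape(typecast(unpacked, 'double'), size(data));

% Class and dimensions are not stored in the compressed stream, so they
% must be recorded separately (here they are simply taken from 'data').
assert(isequal(restored, data));
```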

6 changes: 5 additions & 1 deletion loadubjson.m
@@ -31,6 +31,10 @@
% the "name" tag is treated as a string. To load
% these UBJSON data, you need to manually set this
% flag to 1.
% opt.Compression ['zlib'|'gzip']: array compression method; currently
% only 'gzip' and 'zlib' are supported.
% opt.CompressArraySize [0|int]: only compress arrays with a total
% element count larger than this number.
%
% output:
% dat: a cell array, where {...} blocks are converted into cell arrays,
@@ -133,7 +137,7 @@
parse_char('}');
end
if(isstruct(object))
object=struct2jdata(object);
object=struct2jdata(object,'Base64',0);
end

%%-------------------------------------------------------------------------
97 changes: 79 additions & 18 deletions savejson.m
@@ -67,7 +67,10 @@
% back to the string form
% opt.SaveBinary [0|1]: 1 - save the JSON file in binary mode; 0 - text mode.
% opt.Compact [0|1]: 1 - output compact JSON format (remove all newlines and tabs)
%
% opt.Compression ['zlib'|'gzip']: array compression method; currently
% only 'gzip' and 'zlib' are supported.
% opt.CompressArraySize [0|int]: only compress arrays with a total
% element count larger than this number.
% opt can be replaced by a list of ('param',value) pairs. The param
% string is equivalent to a field in opt and is case sensitive.
% output:
@@ -105,6 +108,21 @@
opt=varargin2struct(varargin{:});
end
opt.IsOctave=exist('OCTAVE_VERSION','builtin');

dozip=jsonopt('Compression','',opt);
if(~opt.IsOctave && ~isempty(dozip))
    if(~(strcmpi(dozip,'gzip') || strcmpi(dozip,'zlib')))
        error('compression method "%s" is not supported',dozip);
    end
    try
        error(javachk('jvm'));
        matlab.net.base64decode('test');
    catch
        error('java-based compression is not supported');
    end
    opt.Compression=dozip;
end

if(isfield(opt,'norowbracket'))
warning('Option ''NoRowBracket'' is deprecated, please use ''SingletArray'' and set its value to not(NoRowBracket)');
if(~isfield(opt,'singletarray'))
@@ -370,8 +388,11 @@
nl=ws.newline;
sep=ws.sep;

dozip=jsonopt('Compression','',varargin{:});
zipsize=jsonopt('CompressArraySize',0,varargin{:});

if(length(size(item))>2 || issparse(item) || ~isreal(item) || ...
(isempty(item) && any(size(item))) ||jsonopt('ArrayToStruct',0,varargin{:}))
(isempty(item) && any(size(item))) ||jsonopt('ArrayToStruct',0,varargin{:}) || (~isempty(dozip) && numel(item)>zipsize))
if(isempty(name))
txt=sprintf('%s{%s%s"_ArrayType_": "%s",%s%s"_ArraySize_": %s,%s',...
padding1,nl,padding0,class(item),nl,padding0,regexprep(mat2str(size(item)),'\s+',','),nl);
@@ -411,27 +432,67 @@
txt=sprintf(dataformat,txt,padding0,'"_ArrayIsComplex_": ','1', sep);
end
txt=sprintf(dataformat,txt,padding0,'"_ArrayIsSparse_": ','1', sep);
if(size(item,1)==1)
% Row vector, store only column indices.
txt=sprintf(dataformat,txt,padding0,'"_ArrayData_": ',...
matdata2json([iy(:),data'],level+2,varargin{:}), nl);
elseif(size(item,2)==1)
% Column vector, store only row indices.
txt=sprintf(dataformat,txt,padding0,'"_ArrayData_": ',...
matdata2json([ix,data],level+2,varargin{:}), nl);
if(~isempty(dozip) && numel(data*2)>zipsize)
if(size(item,1)==1)
% Row vector, store only column indices.
fulldata=[iy(:),data'];
elseif(size(item,2)==1)
% Column vector, store only row indices.
fulldata=[ix,data];
else
% General case, store row and column indices.
fulldata=[ix,iy,data];
end
txt=sprintf(dataformat,txt,padding0,'"_ArrayCompressionSize_": ',regexprep(mat2str(size(fulldata)),'\s+',','), sep);
txt=sprintf(dataformat,txt,padding0,'"_ArrayCompressionMethod_": "',dozip, ['"' sep]);
if(strcmpi(dozip,'gzip'))
txt=sprintf(dataformat,txt,padding0,'"_ArrayCompressedData_": "',base64encode(gzipencode(typecast(fulldata(:),'uint8'))),['"' nl]);
elseif(strcmpi(dozip,'zlib'))
txt=sprintf(dataformat,txt,padding0,'"_ArrayCompressedData_": "',base64encode(zlibencode(typecast(fulldata(:),'uint8'))),['"' nl]);
else
error('compression method not supported');
end
else
% General case, store row and column indices.
txt=sprintf(dataformat,txt,padding0,'"_ArrayData_": ',...
matdata2json([ix,iy,data],level+2,varargin{:}), nl);
if(size(item,1)==1)
% Row vector, store only column indices.
txt=sprintf(dataformat,txt,padding0,'"_ArrayData_": ',...
matdata2json([iy(:),data'],level+2,varargin{:}), nl);
elseif(size(item,2)==1)
% Column vector, store only row indices.
txt=sprintf(dataformat,txt,padding0,'"_ArrayData_": ',...
matdata2json([ix,data],level+2,varargin{:}), nl);
else
% General case, store row and column indices.
txt=sprintf(dataformat,txt,padding0,'"_ArrayData_": ',...
matdata2json([ix,iy,data],level+2,varargin{:}), nl);
end
end
else
if(isreal(item))
txt=sprintf(dataformat,txt,padding0,'"_ArrayData_": ',...
matdata2json(item(:)',level+2,varargin{:}), nl);
if(~isempty(dozip) && numel(item)>zipsize)
if(isreal(item))
fulldata=item(:)';
else
txt=sprintf(dataformat,txt,padding0,'"_ArrayIsComplex_": ','1', sep);
fulldata=[real(item(:)) imag(item(:))];
end
txt=sprintf(dataformat,txt,padding0,'"_ArrayCompressionSize_": ',regexprep(mat2str(size(fulldata)),'\s+',','), sep);
txt=sprintf(dataformat,txt,padding0,'"_ArrayCompressionMethod_": "',dozip, ['"' sep]);
if(strcmpi(dozip,'gzip'))
txt=sprintf(dataformat,txt,padding0,'"_ArrayCompressedData_": "',base64encode(gzipencode(typecast(fulldata(:),'uint8'))),['"' nl]);
elseif(strcmpi(dozip,'zlib'))
txt=sprintf(dataformat,txt,padding0,'"_ArrayCompressedData_": "',base64encode(zlibencode(typecast(fulldata(:),'uint8'))),['"' nl]);
else
error('compression method not supported');
end
else
txt=sprintf(dataformat,txt,padding0,'"_ArrayIsComplex_": ','1', sep);
txt=sprintf(dataformat,txt,padding0,'"_ArrayData_": ',...
if(isreal(item))
txt=sprintf(dataformat,txt,padding0,'"_ArrayData_": ',...
matdata2json(item(:)',level+2,varargin{:}), nl);
else
txt=sprintf(dataformat,txt,padding0,'"_ArrayIsComplex_": ','1', sep);
txt=sprintf(dataformat,txt,padding0,'"_ArrayData_": ',...
matdata2json([real(item(:)) imag(item(:))],level+2,varargin{:}), nl);
end
end
end
txt=sprintf('%s%s%s',txt,padding1,'}');
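
The following sketch shows how the new options might be used and how the compressed payload in the dense, real-valued branch above can be reversed by hand. It assumes the complete toolbox from this commit is on the MATLAB path, that a JVM is running, and that `matlab.net.base64decode` exists (the capability check added to `savejson` calls it). The variable names and the threshold of 10 are illustrative, not taken from the commit.

```matlab
% Save a dense matrix with gzip compression (sketch under the assumptions above).
a    = magic(8);                                   % 8-by-8 double, 64 elements
json = savejson('a', a, 'Compression', 'gzip', 'CompressArraySize', 10);
% With 64 > 10 elements, the diff above emits "_ArrayCompressionSize_",
% "_ArrayCompressionMethod_" and "_ArrayCompressedData_" for this array.

% Mirror of the payload construction used in the dense, real-valued branch.
fulldata = a(:)';                                  % 1-by-64 row vector
payload  = base64encode(gzipencode(typecast(fulldata(:), 'uint8')));

% Inverse steps: Base64 decode, inflate, cast back to the stored class,
% then restore the dimensions recorded in "_ArraySize_".
bytes    = gzipdecode(base64decode(payload));      % 1-by-512 uint8
restored = reshape(typecast(bytes, class(a)), size(a));
assert(isequal(restored, a));
```

The class name and dimensions travel in `_ArrayType_` and `_ArraySize_` rather than in the compressed stream, which is why the inverse needs both `typecast` and `reshape`.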