[Bug]: Get an error saving the height profiles when saving grains but not tracing (disordered_tracing set to false) #1081

Open
7 tasks done
derollins opened this issue Jan 28, 2025 · 2 comments
Labels
bug Something isn't working

Comments

@derollins
Collaborator

derollins commented Jan 28, 2025

Checklist

  • Find the offending file in the output. If processing halts, re-run analysis with topostats --core 1 process.
  • Describe the bug.
  • Include the configuration file.
  • Copy of the log-file from running with topostats --log-level debug <command>.
  • The exact command that failed. This is what you typed at the command line, including any options.
  • TopoStats version, this is reported by topostats --version
  • Operating System and Python Version

Describe the bug

Hello, I'm back! With another error!

When grainstats is set to true while disordered_tracing is set to false, I get this error:

[Tue, 28 Jan 2025 16:58:07] [INFO    ] [topostats] Saving all height profiles to output2b/height_profiles.json
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\ggjh246\AppData\Local\anaconda3\envs\topostats_git2\Scripts\topostats.exe\__main__.py", line 7, in <module>
  File "C:\Users\ggjh246\Code\TopoStats\topostats\entry_point.py", line 1214, in entry_point
    args.func(args)
  File "C:\Users\ggjh246\Code\TopoStats\topostats\run_modules.py", line 361, in process
    results.set_index(["image", "threshold", "grain_number"], inplace=True)
  File "C:\Users\ggjh246\AppData\Local\anaconda3\envs\topostats_git2\Lib\site-packages\pandas\core\frame.py", line 6122, in set_index
    raise KeyError(f"None of {missing} are in the columns")
KeyError: "None of ['grain_number'] are in the columns"
Exception ignored in: <_io.FileIO name='C:\\Users\\ggjh246\\test_data\\save-2025.01.22-14.42.45.705.jpk' mode='rb' closefd=True>
ResourceWarning: unclosed file <_io.BufferedReader name='C:\\Users\\ggjh246\\test_data\\save-2025.01.22-14.42.45.705.jpk'>
Exception ignored in: <_io.FileIO name='C:\\Users\\ggjh246\\test_data\\save-2025.01.22-14.43.02.844.jpk' mode='rb' closefd=True>
ResourceWarning: unclosed file <_io.BufferedReader name='C:\\Users\\ggjh246\\test_data\\save-2025.01.22-14.43.02.844.jpk'>

This prevents running the grain analysis without tracing, which slows TopoStats down when tracing is not desired and generates unwanted data.

Copy of the log-file from running with topostats --log-level debug <command>

topostatserrror.txt

[Tue, 28 Jan 2025 17:11:53] [INFO    ] [topostats] [save-2025.01.22-14.43.07.129] : Calculation of nodestats disabled, returning empty dataframe.
[Tue, 28 Jan 2025 17:11:53] [INFO    ] [topostats] [save-2025.01.22-14.43.07.129] : Calculation of Curvature Stats disabled, returning None.
[Tue, 28 Jan 2025 17:11:53] [INFO    ] [topostats] [save-2025.01.22-14.43.07.129] : *** Image Statistics ***
[Tue, 28 Jan 2025 17:11:53] [INFO    ] [topostats] [save-2025.01.22-14.43.07.129] : Saving image to .topostats file
Processing images from C:\Users\ggjh246\test_data, results are under output2b: 100%|█████| 8/8 [02:21<00:00, 16.76s/it][Tue, 28 Jan 2025 17:11:53] [INFO    ] [topostats] [save-2025.01.22-14.43.07.129] Processing completed.
Processing images from C:\Users\ggjh246\test_data, results are under output2b: 100%|█████| 8/8 [02:21<00:00, 17.70s/it]
[Tue, 28 Jan 2025 17:11:53] [INFO    ] [topostats] Saving image stats to : output2b/image_stats.csv.
[Tue, 28 Jan 2025 17:11:53] [INFO    ] [topostats] Saving all height profiles to output2b/height_profiles.json
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\ggjh246\AppData\Local\anaconda3\envs\topostats_git2\Scripts\topostats.exe\__main__.py", line 7, in <module>
  File "C:\Users\ggjh246\Code\TopoStats\topostats\entry_point.py", line 1214, in entry_point
    args.func(args)
  File "C:\Users\ggjh246\Code\TopoStats\topostats\run_modules.py", line 361, in process
    results.set_index(["image", "threshold", "grain_number"], inplace=True)
  File "C:\Users\ggjh246\AppData\Local\anaconda3\envs\topostats_git2\Lib\site-packages\pandas\core\frame.py", line 6122, in set_index
    raise KeyError(f"None of {missing} are in the columns")
KeyError: "None of ['grain_number'] are in the columns"
Exception ignored in: <_io.FileIO name='C:\\Users\\ggjh246\\test_data\\save-2025.01.22-14.42.45.705.jpk' mode='rb' closefd=True>
ResourceWarning: unclosed file <_io.BufferedReader name='C:\\Users\\ggjh246\\test_data\\save-2025.01.22-14.42.45.705.jpk'>
Exception ignored in: <_io.FileIO name='C:\\Users\\ggjh246\\test_data\\save-2025.01.22-14.43.02.844.jpk' mode='rb' closefd=True>
ResourceWarning: unclosed file <_io.BufferedReader name='C:\\Users\\ggjh246\\test_data\\save-2025.01.22-14.43.02.844.jpk'>

The rest of the log is in the attached file.

Include the configuration file

# Config file generated 2025-01-28 14:44:34
# # For more information on configuration and how to use it:
# https://afm-spm.github.io/TopoStats/main/configuration.html
base_dir: ./ # Directory in which to search for data files
output_dir: ./output2b # Directory to output results to
log_level: info # Verbosity of output. Options: warning, error, info, debug
cores: 1 # Number of CPU cores to utilise for processing multiple files simultaneously.
file_ext: .jpk # File extension of the data files.
loading:
  channel: height_trace # Channel to pull data from in the data files.
  extract: raw # Array to extract when loading .topostats files.
filter:
  run: true # Options : true, false
  row_alignment_quantile: 0.2 # lower values may improve flattening of larger features
  threshold_method: std_dev # Options : otsu, std_dev, absolute
  otsu_threshold_multiplier: 1.0
  threshold_std_dev:
    below: 1.0 # Threshold for data below the image background
    above: 1.0 # Threshold for data above the image background
  threshold_absolute:
    below: -1.0 # Threshold for data below the image background
    above: 1.0 # Threshold for data above the image background
  gaussian_size: 1.0121397464510862 # Gaussian blur intensity in px
  gaussian_mode: nearest # Mode for Gaussian blurring. Options : nearest, reflect, constant, mirror, wrap
  # Scar removal parameters. Be careful with editing these as making the algorithm too sensitive may
  # result in ruining legitimate data.
  remove_scars:
    run: false
    removal_iterations: 2 # Number of times to run scar removal.
    threshold_low: 0.250 # lower values make scar removal more sensitive
    threshold_high: 0.666 # lower values make scar removal more sensitive
    max_scar_width: 4 # Maximum thickness of scars in pixels.
    min_scar_length: 16 # Minimum length of scars in pixels.
grains:
  run: true # Options : true, false
  # Thresholding by height
  threshold_method: std_dev # Options : std_dev, otsu, absolute, unet
  otsu_threshold_multiplier: 1.0
  threshold_std_dev:
    below: 0.2 # Threshold for grains below the image background
    above: 1.0 # Threshold for grains above the image background
  threshold_absolute:
    below: -1.0 # Threshold for grains below the image background
    above: 1.0 # Threshold for grains above the image background
  direction: both # Options: above, below, both (defines whether to look for grains above or below thresholds or both)
  # Thresholding by area
  smallest_grain_size_nm2: 50 # Size in nm^2 of tiny grains/blobs (noise) to remove, must be > 0.0
  absolute_area_threshold:
    above: [300, 3000] # above surface [Low, High] in nm^2 (also takes null)
    below: [null, null] # below surface [Low, High] in nm^2 (also takes null)
  remove_edge_intersecting_grains: true # Whether or not to remove grains that touch the image border
  unet_config:
    model_path: null # Path to a trained U-Net model
    grain_crop_padding: 2 # Padding to apply to the grain crop bounding box
    upper_norm_bound: 5.0 # Upper bound for normalisation of input data. This should be slightly higher than the maximum desired / expected height of grains.
    lower_norm_bound: -1.0 # Lower bound for normalisation of input data. This should be slightly lower than the minimum desired / expected height of the background.
  vetting:
    class_conversion_size_thresholds: null # Class conversion size thresholds, list of tuples of 3 integers and 2 integers, ie list[tuple[tuple[int, int, int], tuple[int, int]]] eg [[[1, 2, 3], [5, 10]]] for each region of class 1 to convert to 2 if smaller than 5 nm^2 and to class 3 if larger than 10 nm^2.
    class_region_number_thresholds: null # Class region number thresholds, list of lists, ie [[class, low, high],] eg [[1, 2, 4], [2, 1, 1]] for class 1 to have 2-4 regions and class 2 to have 1 region. Can use None to not set an upper/lower bound.
    class_size_thresholds: null # Class size thresholds (nm^2), list of tuples of 3 integers, ie [[class, low, high],] eg [[1, 100, 1000], [2, 1000, None]] for class 1 to have 100-1000 nm^2 and class 2 to have 1000-any nm^2. Can use None to not set an upper/lower bound.
    nearby_conversion_classes_to_convert: null # Class conversion for nearby regions, list of tuples of two-integer tuples, eg [[[1, 2], [3, 4]]] to convert class 1 to 2 and 3 to 4 for small touching regions
    class_touching_threshold: 5 # Number of dilation steps to use for detecting touching regions
    keep_largest_labelled_regions_classes: null # Classes to keep the only largest regions for, list of integers eg [1, 2] to keep only the largest regions of class 1 and 2
    class_connection_point_thresholds: null # Class connection point thresholds, [[[class_1, class_2], [min, max]]] eg [[[1, 2], [1, 1]]] for class 1 to have 1 connection point with class 2
grainstats:
  run: true # Options : true, false
  edge_detection_method: binary_erosion # Options: canny, binary erosion. Do not change this unless you are sure of what this will do.
  cropped_size: -1 # Length (in nm) of square cropped images (can take -1 for grain-sized box)
  extract_height_profile: true # Extract height profiles along maximum feret of molecules
disordered_tracing:
  run: false # Options : true, false
  min_skeleton_size: 10 # Minimum number of pixels in a skeleton for it to be retained.
  pad_width: 1 # Pixels to pad grains by when tracing
  mask_smoothing_params:
    gaussian_sigma: 2 # Gaussian smoothing parameter 'sigma' in pixels.
    dilation_iterations: 2 # Number of dilation iterations to use for grain smoothing.
    holearea_min_max: [0, null] # Range (min, max) of a hole area in nm to refill in the smoothed masks.
  skeletonisation_params:
    method: topostats # Options : zhang | lee | thin | topostats
    height_bias: 0.6 # Percentage of lowest pixels to remove each skeletonisation iteration. 1 equates to zhang.
  pruning_params:
    method: topostats # Method to clean branches of the skeleton. Options : topostats
    max_length: 10.0 # Maximum length in nm to remove a branch containing an endpoint.
    height_threshold: # The height to remove branches below.
    method_values: mid # The method to obtain a branch's height for pruning. Options : min | median | mid.
    method_outlier: mean_abs # The method to prune branches based on height. Options : abs | mean_abs | iqr.
nodestats:
  run: false # Options : true, false
  node_joining_length: 7.0 # The distance in nanometres over which to join nearby crossing points.
  node_extend_dist: 14.0 # The distance in nanometres over which to join nearby odd-branched nodes.
  branch_pairing_length: 20.0 # The length in nanometres from the crossing point to pair and trace, obtaining FWHM's.
  pair_odd_branches: false # Whether to try and pair odd-branched nodes. Options: true and false.
  pad_width: 1 # Pixels to pad grains by when tracing (should be the same as disordered_tracing).
ordered_tracing:
  run: false
  ordering_method: nodestats # The method of ordering the disordered traces.
  pad_width: 1 # Pixels to pad grains by when tracing (should be the same as disordered_tracing).
splining:
  run: false # Options : true, false
  method: "rolling_window" # Options : "spline", "rolling_window"
  rolling_window_size: 20.0e-9 # size in nm of the rolling window.
  spline_step_size: 7.0e-9 # The sampling rate of the spline in metres.
  spline_linear_smoothing: 5.0 # The amount of smoothing to apply to linear features.
  spline_circular_smoothing: 5.0 # The amount of smoothing to apply to circular features.
  spline_degree: 3 # The polynomial degree of the spline.
curvature:
  run: false # Options : true, false
  colourmap_normalisation_bounds: [-0.5, 0.5] # Radians per nm to normalise the colourmap to.
plotting:
  run: true # Options : true, false
  style: topostats.mplstyle # Options : topostats.mplstyle or path to a matplotlibrc params file
  savefig_format: null # Options : null, png, svg or pdf. tif is also available although no metadata will be saved. (defaults to png) See https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.savefig.html
  savefig_dpi: 100 # Options : null (defaults to the value in topostats/plotting_dictionary.yaml), see https://afm-spm.github.io/TopoStats/main/configuration.html#further-customisation and https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.savefig.html
  pixel_interpolation: null # Options : https://matplotlib.org/stable/gallery/images_contours_and_fields/interpolation_methods.html
  image_set: all # Options : all, core
  zrange: [-4, 4] # low and high height range for core images (can take [null, null]). low <= high
  colorbar: true # Options : true, false
  axes: true # Options : true, false (due to off being a bool when parsed)
  num_ticks: [null, null] # Number of ticks to have along the x and y axes. Options : null (auto) or integer > 1
  cmap: null # Colormap/colourmap to use (default is 'nanoscope' which is used if null, other options are 'afmhot', 'viridis' etc.)
  mask_cmap: blue_purple_green # Options : blu, jet_r and any in matplotlib
  histogram_log_axis: false # Options : true, false
summary_stats:
  run: false # Whether to make summary plots for output data
  config: null

To Reproduce

topostats -c ./test_config2.yaml process

TopoStats Version

Git main branch

Python Version

3.11

Operating System

Windows

Python Packages

absl-py==2.1.0
AFMReader==0.0.3
art==6.4
asttokens==3.0.0
astunparse==1.6.3
biopython==1.85
certifi==2024.12.14
charset-normalizer==3.4.1
cheap_repr==0.5.2
colorama==0.4.6
contourpy==1.3.1
cycler==0.12.1
decorator==5.1.1
docstring_parser==0.16
et_xmlfile==2.0.0
executing==2.2.0
flatbuffers==25.1.24
fonttools==4.55.6
gast==0.6.0
google-pasta==0.2.0
grpcio==1.70.0
h5py==3.12.1
idna==3.10
igor2==0.5.9
imageio==2.37.0
ipython==8.31.0
jedi==0.19.2
joblib==1.4.2
keras==3.8.0
kiwisolver==1.4.8
lazy_loader==0.4
libclang==18.1.1
llvmlite==0.44.0
loguru==0.7.3
magicgui==0.10.0
Markdown==3.7
markdown-it-py==3.0.0
MarkupSafe==3.0.2
matplotlib==3.9.4
matplotlib-inline==0.1.7
mdurl==0.1.2
ml-dtypes==0.4.1
namex==0.0.8
networkx==3.4.2
numba==0.61.0
numpy==1.26.4
numpyencoder==0.3.0
openpyxl==3.1.5
opt_einsum==3.4.0
optree==0.14.0
packaging==24.2
pandas==2.2.3
parso==0.8.4
pillow==11.1.0
prompt_toolkit==3.0.50
protobuf==5.29.3
psutil==5.9.8
psygnal==0.11.1
pure_eval==0.2.3
pyconify==0.2
Pygments==2.19.1
pyparsing==3.2.1
pyspm==0.6.2
python-dateutil==2.9.0.post0
pytz==2024.2
PyYAML==6.0.2
QtPy==2.4.2
requests==2.32.3
rich==13.9.4
ruamel.yaml==0.18.10
ruamel.yaml.clib==0.2.12
schema==0.7.7
scikit-image==0.25.1
scikit-learn==1.6.1
scipy==1.15.1
seaborn==0.13.2
six==1.17.0
skan==0.12.2
snoop==0.6.0
stack-data==0.6.3
superqt==0.7.1
tensorboard==2.18.0
tensorboard-data-server==0.7.2
tensorflow==2.18.0
tensorflow-io-gcs-filesystem==0.31.0
tensorflow_intel==2.18.0
termcolor==2.5.0
threadpoolctl==3.5.0
tifffile==2025.1.10
toolz==1.0.0
topoly==1.0.4
-e git+https://github.com/AFM-SPM/TopoStats.git@c769ca3cfd32bf37ab3a158cf032b80fa5bcf51a#egg=topostats
tqdm==4.67.1
traitlets==5.14.3
typing_extensions==4.12.2
tzdata==2025.1
urllib3==2.3.0
wcwidth==0.2.13
Werkzeug==3.1.3
win32_setctime==1.2.0
wrapt==1.17.2

@derollins derollins added the bug Something isn't working label Jan 28, 2025
@MaxGamill-Sheffield
Collaborator

Hey @derollins!

I believe this is caused by the way some of the results files are indexed: by grain number, rather than by a plain index with grain number as a column.
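
As a rough illustration of that indexing problem, here is a minimal pandas sketch. The table contents and the helper name `ensure_columns` are hypothetical and not taken from TopoStats; only the `set_index` keys mirror the line shown in the traceback. It assumes the failure arises because `grain_number` already sits in the index rather than in the columns.

```python
import pandas as pd

# Hypothetical results table in which grain_number is already the index
# rather than an ordinary column (illustrative values only).
results = pd.DataFrame(
    {
        "image": ["img_a", "img_a"],
        "threshold": ["above", "above"],
        "grain_number": [0, 1],
        "area": [10.0, 12.5],
    }
).set_index("grain_number")

keys = ["image", "threshold", "grain_number"]

# The same call as in the traceback fails because grain_number is not a column.
try:
    results.set_index(keys)
    raised = False
    message = ""
except KeyError as err:
    raised = True
    message = str(err)


def ensure_columns(df: pd.DataFrame, keys: list[str]) -> pd.DataFrame:
    """Move any of ``keys`` found in the index back into columns (hypothetical helper)."""
    index_names = [name for name in df.index.names if name in keys]
    return df.reset_index(level=index_names) if index_names else df


# After moving grain_number back into the columns, set_index succeeds.
fixed = ensure_columns(results, keys).set_index(keys)
```

Resetting only the index levels that are named in `keys` leaves any other index untouched, so the same guard could be applied whether or not tracing has run. Whether the upcoming refactor takes this approach is not stated in this thread.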

@SylviaWhittle has found this issue and fixed it in her upcoming refactor, so keep an eye out for that! In the meantime, can I recommend you try running disordered tracing as well (using Zhang skeletonisation)? This shouldn't add too much time onto processing but should jiggle around the grainstats columns.

Please lmk if that's worked for you :)
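
For reference, the suggested workaround comes down to two values in the configuration posted above: `disordered_tracing.run` and `disordered_tracing.skeletonisation_params.method`. The sketch below applies them to a plain dictionary standing in for the loaded config. The helper `apply_overrides` is hypothetical and not part of TopoStats, and reading or writing the YAML file is left out; in practice, editing the two values by hand in the YAML file does the same thing.

```python
import copy


def apply_overrides(config: dict, overrides: dict) -> dict:
    """Return a copy of ``config`` with nested ``overrides`` applied (hypothetical helper)."""
    merged = copy.deepcopy(config)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Recurse so sibling keys such as height_bias are preserved.
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = value
    return merged


# Trimmed stand-in for the relevant part of the config in this issue.
config = {
    "grainstats": {"run": True},
    "disordered_tracing": {
        "run": False,
        "skeletonisation_params": {"method": "topostats", "height_bias": 0.6},
    },
}

# The workaround: enable disordered tracing and switch to Zhang skeletonisation.
workaround = {
    "disordered_tracing": {
        "run": True,
        "skeletonisation_params": {"method": "zhang"},
    }
}

patched = apply_overrides(config, workaround)
```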

@derollins
Collaborator Author

Thanks, that works. However, while all the grains are numbered when I plot the images, there are no grain numbers in all_statistics.csv for grains that weren't successfully traced. Is this from the same issue?
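
One way to pin down which rows are affected is sketched below with pandas. The in-memory CSV is an invented stand-in for all_statistics.csv: the column names other than `image` and `grain_number`, and all the values, are assumptions for illustration and do not come from real output.

```python
import io

import pandas as pd

# Invented stand-in for all_statistics.csv (columns and values are illustrative only).
csv_text = """image,grain_number,area,trace_length
img_a,0,10.0,55.2
img_a,,12.5,
img_b,0,9.1,40.7
"""

stats = pd.read_csv(io.StringIO(csv_text))

# Rows where the grain number is missing.
missing = stats[stats["grain_number"].isna()]

# Number of affected rows per image.
counts = missing.groupby("image").size()
```

Replacing `io.StringIO(csv_text)` with the path to the real file would list the affected rows, which could then be compared against the grains that failed tracing.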
