Crawling
Contains functions for gathering metadata from individual DICOM files or entire directories.
- dicom_csv.crawler.get_file_meta(path: Union[Path, str], force: bool = True, read_pixel_array: bool = False, unpack_volumetric: bool = False, extract_private: bool = False) → Iterable[dict]
Get a dict containing the metadata from the DICOM file located at path.
- Parameters
path (PathLike) – full path to the file.
force (bool) – the force parameter of pydicom.filereader.dcmread; default is True.
read_pixel_array (bool) – if True, the crawler adds information about the DICOM pixel_array, which significantly increases crawling time; default is False.
Notes
- The following keys are added:
- NoError: whether the file was read without raising an exception.
- HasPixelArray: (if NoError is True) whether the file contains a pixel array.
- PixelArrayShape: (if HasPixelArray is True) the shape of the pixel array.
- For some formats the following packages might be required:
>>> conda install -c glueviz gdcm # Python 3.5 and 3.6
>>> conda install -c conda-forge gdcm # Python 3.7
- dicom_csv.crawler.join_tree(top: Union[Path, str], ignore_extensions: Sequence[str] = (), relative: bool = True, verbose: int = 0, read_pixel_array: bool = False, force: bool = True, unpack_volumetric: bool = True, extract_private: bool = False, total: bool = False) → DataFrame
Returns a dataframe containing metadata for each file in all the subfolders of top.
- Parameters
top (PathLike) – path to the folder to crawl.
ignore_extensions (Sequence) – list of extensions to skip during crawling.
relative (bool) – whether the PathToFolder attribute should be relative to top; default is True.
verbose (int) – the verbosity level:
- 0 - no progressbar
- 1 - progressbar with iterations count
- 2 - progressbar with filenames
total (bool) – whether to show the total number of files in the progressbar. This adds a bit of overhead, because each file will be visited a second time (without being opened).
References
See the Working with DICOM files tutorial for more details.
Notes
- The following columns are added:
- NoError: whether the file was read without raising an exception.
- HasPixelArray: (if NoError is True) whether the file contains a pixel array (added if read_pixel_array is True).
- PixelArrayShape: (if HasPixelArray is True) the shape of the pixel array (added if read_pixel_array is True).
- PathToFolder
- FileName
- For some formats the following packages might be required:
>>> conda install -c glueviz gdcm # Python 3.5 and 3.6
>>> conda install -c conda-forge gdcm # Python 3.7
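To make the PathToFolder and FileName columns and the ignore_extensions and relative parameters more concrete, here is a minimal standard-library sketch of a directory walk that produces such records. It is not the library's implementation: list_files is a hypothetical helper, no DICOM file is opened, and it assumes extensions are given with a leading dot (the library may expect a different form).

```python
import os
import tempfile
from pathlib import Path


def list_files(top, ignore_extensions=(), relative=True):
    """Yield one record per file under `top` (hypothetical helper, not part of dicom_csv)."""
    top = Path(top)
    for root, _, files in os.walk(top):
        folder = Path(root)
        for name in sorted(files):
            # Assumption: extensions are compared with the leading dot, e.g. ".txt".
            if Path(name).suffix in ignore_extensions:
                continue
            # With relative=True the folder is reported relative to the crawled root.
            path_to_folder = folder.relative_to(top) if relative else folder
            yield {"PathToFolder": str(path_to_folder), "FileName": name}


# Small demonstration on a temporary folder with one file to keep and one to skip.
with tempfile.TemporaryDirectory() as tmp:
    series_dir = Path(tmp) / "patient" / "series"
    series_dir.mkdir(parents=True)
    (series_dir / "slice1.dcm").touch()
    (series_dir / "notes.txt").touch()
    records = list(list_files(tmp, ignore_extensions=(".txt",)))
```

In this sketch records holds a single entry for slice1.dcm, with PathToFolder given relative to the temporary root.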
Aggregation
Tools for grouping DICOM metadata into images.
- dicom_csv.aggregation.aggregate_images(metadata: DataFrame, by: Union[str, Sequence[str]], process_series: Optional[Callable] = None) → DataFrame
Groups DICOM metadata into images (series).
- Parameters
metadata – a dataframe with metadata returned by join_tree.
by – a list of column names by which the grouping will be performed. Default columns are: PatientID, SeriesInstanceUID, StudyInstanceUID, PathToFolder, PixelArrayShape, SequenceName.
process_series – a function that processes an aggregated series before it is joined into a single entry.
References
See the Working with DICOM files tutorial for more details.
Notes
- The following columns are added:
- SlicesCount: the number of files/slices in the image.
- FileNames: a list of slash (“/”) separated file names.
- InstanceNumbers: (if InstanceNumber is in columns) a list of comma separated InstanceNumber values.
- The following columns are removed:
FileName (replaced by FileNames), InstanceNumber (replaced by InstanceNumbers), any other columns that differ from file to file.
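The effect of the grouping on the added columns can be illustrated with plain Python dictionaries. This is only a sketch of the column semantics described above, not the library's code: aggregate_rows is a hypothetical helper, and for brevity it keeps only the grouping columns plus the three added ones.

```python
from collections import defaultdict


def aggregate_rows(rows, by):
    """Group per-file records into per-series records (hypothetical helper)."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[key] for key in by)].append(row)

    images = []
    for key, members in groups.items():
        image = dict(zip(by, key))
        image["SlicesCount"] = len(members)
        # File names are joined with "/" and instance numbers with ",".
        image["FileNames"] = "/".join(m["FileName"] for m in members)
        if all("InstanceNumber" in m for m in members):
            image["InstanceNumbers"] = ",".join(str(m["InstanceNumber"]) for m in members)
        images.append(image)
    return images


rows = [
    {"SeriesInstanceUID": "1.2.3", "FileName": "a.dcm", "InstanceNumber": 1},
    {"SeriesInstanceUID": "1.2.3", "FileName": "b.dcm", "InstanceNumber": 2},
    {"SeriesInstanceUID": "4.5.6", "FileName": "c.dcm", "InstanceNumber": 1},
]
images = aggregate_rows(rows, by=["SeriesInstanceUID"])
```

Here the two files of series 1.2.3 collapse into one entry with SlicesCount 2, FileNames "a.dcm/b.dcm" and InstanceNumbers "1,2".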
Loading
Spatial operations
- dicom_csv.spatial.get_orientation_matrix(series: Sequence[Dataset]) → ndarray
Returns a 3 x 3 orthogonal transition matrix from the image-based basis to the patient-based basis. Rows are coordinates of image-based basis vectors in the patient-based basis. Columns are coordinates of patient-based basis vectors in the image-based basis.
See https://dicom.innolitics.com/ciods/rt-dose/image-plane/00200037 for details.
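As an illustration of how such a matrix relates to the ImageOrientationPatient attribute, the sketch below builds one from the six direction cosines of a single slice. It is an assumption-laden sketch rather than the library's implementation: orientation_matrix is a hypothetical helper, and it assumes the first two rows are the two in-plane direction vectors in the order they appear in the attribute, with the third row being their cross product (the slice normal).

```python
def orientation_matrix(image_orientation_patient):
    """Build a 3 x 3 matrix from the six direction cosines (hypothetical helper)."""
    rx, ry, rz, cx, cy, cz = (float(v) for v in image_orientation_patient)
    # Third basis vector: cross product of the two in-plane direction vectors.
    normal = (
        ry * cz - rz * cy,
        rz * cx - rx * cz,
        rx * cy - ry * cx,
    )
    # Rows are the image-based basis vectors expressed in patient coordinates.
    return [[rx, ry, rz], [cx, cy, cz], list(normal)]


axial = orientation_matrix([1, 0, 0, 0, 1, 0])
```

For this axial orientation the result is the identity matrix, since the image axes coincide with the patient axes.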
- class dicom_csv.spatial.Plane(value)
Bases: Enum
An enumeration.
- Sagittal = 0
- Coronal = 1
- Axial = 2
- dicom_csv.spatial.order_series(series: Sequence[Dataset], decreasing: bool = True) → Sequence[Dataset]
- dicom_csv.spatial.get_slice_locations(series: Sequence[Dataset]) → ndarray
Computes slice locations from ImagePositionPatient. NOTE: the order of slice locations can be either increasing or decreasing for ordered series (see order_series).
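One common way to derive a slice location from ImagePositionPatient is to project the position vector onto the slice normal. The sketch below shows that projection; whether the library computes locations exactly this way is an assumption, and slice_locations is a hypothetical helper.

```python
def slice_locations(positions, normal):
    """Project each ImagePositionPatient vector onto the slice normal (hypothetical helper)."""
    return [sum(p * n for p, n in zip(position, normal)) for position in positions]


# Three axial slices, listed from the highest to the lowest position.
positions = [(0.0, 0.0, 5.0), (0.0, 0.0, 2.5), (0.0, 0.0, 0.0)]
locations = slice_locations(positions, normal=(0.0, 0.0, 1.0))
```

The locations follow the order of the input slices, so they come out decreasing here; reversing the normal or the slice order flips the direction, which is why the order is not guaranteed.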
- dicom_csv.spatial.locations_to_spacing(locations: Sequence[float], max_delta: float = 0.1, errors: bool = True) → float
- dicom_csv.spatial.get_slice_spacing(series: Sequence[Dataset], max_delta: float = 0.1, errors: bool = True) → float
Returns the constant distance between slices of a series. If the series doesn’t have constant spacing, raises ValueError if errors is True and returns np.nan otherwise.
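The behaviour described above can be sketched in plain Python. This is not the library's implementation: constant_spacing is a hypothetical helper, it interprets max_delta as the largest allowed difference between the widest and the narrowest gap (an assumption), and it uses math.nan in place of np.nan.

```python
import math


def constant_spacing(locations, max_delta=0.1, errors=True):
    """Return the common gap between sorted locations (hypothetical helper)."""
    ordered = sorted(locations)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    # Fewer than two slices, or gaps that differ too much: no constant spacing.
    if not gaps or max(gaps) - min(gaps) > max_delta:
        if errors:
            raise ValueError("The series does not have constant spacing")
        return math.nan
    return sum(gaps) / len(gaps)


regular = constant_spacing([0.0, 2.5, 5.0])
irregular = constant_spacing([0.0, 1.0, 5.0], errors=False)
```

In this sketch regular is 2.5, while irregular is NaN because the gaps of 1.0 and 4.0 differ by more than max_delta.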
- dicom_csv.spatial.get_pixel_spacing(series: Sequence[Dataset]) → Tuple[float, float]
Returns pixel spacing (two numbers) in mm.
- dicom_csv.spatial.get_voxel_spacing(series: Sequence[Dataset]) → Tuple[float, float, float]
Returns voxel spacing: pixel spacing and distance between slices’ centers.
- dicom_csv.spatial.get_image_position_patient(series: Sequence[Dataset]) → ndarray
Returns ImagePositionPatient stacked into an array.
- dicom_csv.spatial.drop_duplicated_slices(series: Sequence[Dataset], tolerance_hu=1) → Sequence[Dataset]
- dicom_csv.spatial.get_slice_orientation(*args, **kwds)
get_slice_orientation is deprecated!
- dicom_csv.spatial.get_slices_orientation(series: Sequence[Dataset]) → SlicesOrientation
get_slices_orientation is deprecated!
- class dicom_csv.spatial.SlicesOrientation(transpose: bool, flip_axes: tuple)
Bases: tuple
Defines how slices should be transformed in order to be canonically oriented: first transpose slices if transpose == True, then flip slices along flip_axes (they already account for transposition).
- property transpose
Alias for field number 0
- property flip_axes
Alias for field number 1
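The order of the two operations (transpose first, then flip along axes that refer to the already transposed slice) can be shown on a tiny 2 x 2 slice. The sketch below uses a stand-in named tuple with the same two fields and nested lists instead of arrays; Orientation and apply_orientation are hypothetical names, not part of dicom_csv.

```python
from collections import namedtuple

# Stand-in with the same two fields as the documented tuple.
Orientation = namedtuple("Orientation", ["transpose", "flip_axes"])


def apply_orientation(slice_2d, orientation):
    """Transpose, then flip a slice given as a list of rows (hypothetical helper)."""
    result = [list(row) for row in slice_2d]
    if orientation.transpose:
        result = [list(column) for column in zip(*result)]
    # Flip axes refer to the slice after transposition.
    if 0 in orientation.flip_axes:
        result = result[::-1]
    if 1 in orientation.flip_axes:
        result = [row[::-1] for row in result]
    return result


oriented = apply_orientation([[1, 2], [3, 4]], Orientation(transpose=True, flip_axes=(0,)))
```

The slice is first transposed to [[1, 3], [2, 4]] and then flipped along axis 0, giving [[2, 4], [1, 3]].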
- dicom_csv.spatial.orientation_matrix_to_slices_orientation(*args, **kwds)
orientation_matrix_to_slices_orientation is deprecated!
- dicom_csv.spatial.get_axes_permutation(*args, **kwds)
get_axes_permutation is deprecated!
- dicom_csv.spatial.get_flipped_axes(*args, **kwargs)
get_flipped_axes is deprecated!
- dicom_csv.spatial.get_image_plane(series: Sequence[Dataset]) → Plane
get_image_plane is deprecated!
Console scripts
This library contains a console script around join_tree, which is available from the command line after installation:
dicom-csv folder/with/dicoms path/to/metadata.csv
# pass --help for more details:
dicom-csv --help