04 All for nothing

Extract RoI from CT image

written by hyojung chang

In order to predict diseased regions with Faster R-CNN, we need the bounding boxes and labels of the lung disease shown in the CT images. Below is a description of how to derive these and how to compose the annotation file.
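The exact annotation format depends on the Faster R-CNN implementation you train with; as a rough sketch of the target (the filename and the column order here are only an assumed example), a common convention is a CSV file with one row per bounding box:

    image_path,x0,y0,x1,y1,label
    patient0001_slice080.png,161,434,182,460,nodule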


Extract RoI from CT image

  • If you have an original image and a mask image

        A contour is a line connecting points with the same value in an image; you can think of it as the border of a region of a single color. This means that, when we use grayscale images, we can find the mask area just as we would look for a white object on a black background.

        In other words, we convert the mask image to grayscale (turning the mask area gray or white) and find the mask's contour as if looking for a white object on a black background.

        Requirement : Python, OpenCV


        1) Convert the mask image to grayscale (turning the mask area gray or white) and find the mask's contour as if looking for a white object on a black background

    import cv2

    # Read the mask, convert it to grayscale, and binarize it so the mask area is white.
    mask = cv2.imread(filename)
    imgray = cv2.cvtColor(mask, cv2.COLOR_BGR2GRAY)
    ret, thr = cv2.threshold(imgray, 70, 255, cv2.THRESH_BINARY)

    # Find the outer contours of the white (mask) regions.
    contours, _ = cv2.findContours(thr, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

        2) If you want to check the contours or draw them on the images

    original = cv2.imread(filename)

    for i in range(len(contours)):
        c0 = contours[i]

        # Draw the i-th contour in red on the mask and on the original image.
        cv2.drawContours(mask, [c0], -1, (0, 0, 255), 1)
        cv2.imshow('contour on the mask', mask)

        cv2.drawContours(original, [c0], -1, (0, 0, 255), 1)
        cv2.imshow('contour on the original', original)
        cv2.waitKey(0)

        You will see a result like the images below.


        3) (Optional) If you want to find the x, y, w, h of the mask and draw a rectangle on the image

    for i in range(len(contours)):
        c0 = contours[i]
        # Bounding rectangle of the contour: top-left corner (x, y), width w, height h.
        x, y, w, h = cv2.boundingRect(c0)

        cv2.rectangle(original, (x, y), (x + w, y + h), (0, 0, 255), 1)
        cv2.imshow('draw rectangle on the original', original)
        cv2.waitKey(0)

        For example, you can get [161, 434, 21, 26] as x, y, w, h and see a result like the image below.




  • If you have a 3D image (*.mhd, *.raw) and annotation information

        *This part is specific to the data provided by LUNA 16.

        About the data provided by LUNA 16 :

The unit of measurement in CT scans is the Hounsfield Unit (HU), which is a measure of radiodensity. CT scanners are carefully calibrated to accurately measure this. From Wikipedia<Link>.

By default however, the returned values are not in this unit. Let's fix this.

Some scanners have cylindrical scanning bounds, but the output image is square. The pixels that fall outside of these bounds get the fixed value -2000. The first step is setting these values to 0, which currently corresponds to air. Next, let's go back to HU units, by multiplying with the rescale slope and adding the intercept (which are conveniently stored in the metadata of the scans!).

...omitted below

Our plot function takes a threshold argument which we can use to plot certain structures, such as all tissue or only the bones. 400 is a good threshold for showing the bones only (see Hounsfield unit table above). Let's do this!

 Reference : Kaggle<Link>

        The important thing is that the data provided by LUNA 16 are 3D images, and that HU values can be used as thresholds to visualize certain structures, such as all tissue or only the bones.
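        As a quick illustration of using HU values as thresholds (a minimal sketch, assuming the scan has already been loaded into a NumPy array in HU, as in step 2 below), you can show only the bone-density voxels of one slice like this:

    import matplotlib.pyplot as plt

    # Sketch: keep only voxels above ~400 HU (roughly bone) in one axial slice.
    # `ct_scan` is assumed to be the array returned by load_itk_image() in step 2.
    bone_only = ct_scan[100, :, :] > 400
    plt.imshow(bone_only, cmap='gray')
    plt.show()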

        We need to get a flattened 2D image and RoI coordinates from these data.


        1) About the file extensions

        One image consists of a .mhd file and a .raw file: the .mhd file is a header that describes the image, and the .raw file contains the actual pixel data.

        Example of .mhd file :

ObjectType = Image
NDims = 3
BinaryData = True
BinaryDataByteOrderMSB = False
CompressedData = False
TransformMatrix = 1 0 0 0 1 0 0 0 1
Offset = -195 -195 -378
CenterOfRotation = 0 0 0
AnatomicalOrientation = RAI
ElementSpacing = 0.7617189884185791 0.7617189884185791 2.5
DimSize = 512 512 141
ElementType = MET_SHORT
ElementDataFile = 1.3.6.1.4.1.14519.5.2.1.6279.6001.173106154739244262091404659845.raw

        2) Read the image and convert 3D coordinates to a flattened 2D slice

        Requirement : Python, SimpleITK

        Origin [default is zero] : coordinates of the pixel/voxel with index (0,0,0) in physical units (i.e. mm).

        Spacing [default is one] : Distance between adjacent pixels/voxels in each dimension given in physical units.

import SimpleITK as sitk

def load_itk_image(filename):
    # Read the .mhd/.raw pair and return the voxel array plus its physical metadata.
    itkimage = sitk.ReadImage(filename)
    ct_scan = sitk.GetArrayFromImage(itkimage)   # numpy array in (z, y, x) order
    origin = itkimage.GetOrigin()                # (x, y, z) in mm
    spacing = itkimage.GetSpacing()              # (x, y, z) in mm
    return ct_scan, origin, spacing
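        A minimal usage sketch (the filename is just an example taken from the .mhd header above). Note that SimpleITK returns the array with axes in (z, y, x) order, while GetOrigin() and GetSpacing() are given in (x, y, z) order; this is why the later snippets index ct_scan[z] but use spacing[0] for x.

    # Usage sketch: load one scan and inspect its shape and metadata.
    ct_scan, origin, spacing = load_itk_image(
        '1.3.6.1.4.1.14519.5.2.1.6279.6001.173106154739244262091404659845.mhd')
    print(ct_scan.shape)    # (141, 512, 512) -> (z, y, x), matching DimSize = 512 512 141
    print(origin, spacing)  # both in (x, y, z) order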

        If you want to check the image, add the code below.

    import matplotlib.pyplot as plt

    plt.figure()
    plt.imshow(ct_scan[100, :, :])   # show one axial slice

        LUNA 16 provides annotations.csv, which contains the locations of the lung nodules labeled by physicians. The locations are expressed in world coordinates, so we need to convert them to voxel coordinates before using them. We will use the following concept.

import numpy as np

def worldToVoxelCoord(worldCoord, origin, spacing):
    # Convert a point from world coordinates (mm) to voxel indices.
    stretchedVoxelCoord = np.absolute(worldCoord - origin)
    voxelCoord = stretchedVoxelCoord / spacing
    return voxelCoord
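        The nodules array used in the snippets below is read from annotations.csv, and Origin and SP refer to the origin and spacing returned by load_itk_image. As a minimal sketch (the helper read_nodules is our own; the column names seriesuid, coordX, coordY, coordZ, diameter_mm are those of the LUNA 16 file):

import csv
import numpy as np

def read_nodules(csv_path, seriesuid):
    # Collect the world-coordinate nodule rows belonging to one scan.
    rows = []
    with open(csv_path) as f:
        for row in csv.DictReader(f):
            if row['seriesuid'] == seriesuid:
                rows.append([float(row['coordX']), float(row['coordY']),
                             float(row['coordZ']), float(row['diameter_mm'])])
    return np.array(rows)

# The seriesuid is the .mhd filename without its extension.
seriesuid = '1.3.6.1.4.1.14519.5.2.1.6279.6001.173106154739244262091404659845'
nodules = read_nodules('annotations.csv', seriesuid)
Origin, SP = origin, spacing   # from load_itk_image above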

        By using the location of the nodule in annotations.csv and the concept above, a flattened 2D image can be obtained.

    # Convert the nodule's world coordinates to voxel indices, then take the axial slice.
    x = int((nodules[idx, 0] - Origin[0]) / SP[0])
    y = int((nodules[idx, 1] - Origin[1]) / SP[1])
    z = int((nodules[idx, 2] - Origin[2]) / SP[2])
    data = ct_scan[z]

        3) Mark the RoI (= nodule) position on a flattened 2D image

        Requirement : Matplotlib

        radius : radius of the nodule, in voxels

        pad : thickness of the marker bars (here 2*radius)

    # Convert the nodule's world coordinates to voxel indices and take the axial slice.
    x = int((nodules[idx, 0] - Origin[0]) / SP[0])
    y = int((nodules[idx, 1] - Origin[1]) / SP[1])
    z = int((nodules[idx, 2] - Origin[2]) / SP[2])
    data = ct_scan[z]
    radius = int(nodules[idx, 3] / SP[0] / 2)   # nodule radius in voxels (diameter_mm / spacing / 2)
    pad = 2 * radius                            # thickness of the marker bars

    # Draw bright bars (value 3000) just outside the nodule's bounding box.
    data[max(0, y - radius):min(data.shape[0], y + radius),
         max(0, x - radius - pad):max(0, x - radius)] = 3000                          # left vertical bar
    data[max(0, y - radius):min(data.shape[0], y + radius),
         min(data.shape[1], x + radius):min(data.shape[1], x + radius + pad)] = 3000  # right vertical bar
    data[max(0, y - radius - pad):max(0, y - radius),
         max(0, x - radius):min(data.shape[1], x + radius)] = 3000                    # top horizontal bar
    data[min(data.shape[0], y + radius):min(data.shape[0], y + radius + pad),
         max(0, x - radius):min(data.shape[1], x + radius)] = 3000                    # bottom horizontal bar

    plt.imshow(data, cmap='gray')
    plt.show()

         You will see a result like the image below.

   

         4) (Optional) If you want to get the x0, y0, x1, y1 of the RoI
        Just paste the code below :D

    # Prints x0, y0, x1, y1 of the RoI, clipped to the image bounds.
    print(max(0, x - radius), max(0, y - radius), min(data.shape[1], x + radius), min(data.shape[0], y + radius))
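        If you also want to write these coordinates into an annotation file for Faster R-CNN, here is a sketch only: annotations_frcnn.csv is an example name, and the column order image_path, x0, y0, x1, y1, label is the assumed convention from the beginning of this post, so adjust it to whatever your training code expects.

    import csv

    # Append one bounding-box row for this nodule to the (assumed) annotation file.
    x0, y0 = max(0, x - radius), max(0, y - radius)
    x1, y1 = min(data.shape[1], x + radius), min(data.shape[0], y + radius)
    with open('annotations_frcnn.csv', 'a', newline='') as f:
        csv.writer(f).writerow([filename + '.png', x0, y0, x1, y1, 'nodule'])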

        5) (Optional) If you want to get only the RoI area

    # Crop only the RoI (nodule) region out of the slice.
    noduleOnly = data[max(0, y - radius):min(data.shape[0], y + radius),
                      max(0, x - radius):min(data.shape[1], x + radius)].copy()

        6) (Optional) If you want to save a flattened 2D image

        Just paste the code below :D

    from matplotlib import image   # matplotlib.image provides imsave
    image.imsave(filename + '.png', data, cmap='gray')

Reference : CSDN<Link>



Thanks for your interest in we_lung_u.
If you want to take a look at our whole code, visit our GitHub through the link below.