
Brain haemorrhage diagnosis using a LeNet-based deep learning model

Introduction

Brain haemorrhage is a type of stroke. It is caused by an artery in the brain bursting and causing localized bleeding in the surrounding tissues. This bleeding kills brain cells.

The Greek root for blood is hemo. Haemorrhage means "blood bursting forth." Brain haemorrhages are also called cerebral haemorrhages, intracranial haemorrhages, or intracerebral haemorrhages.

Cerebral haemorrhage accounts for about 13% of all strokes in the United States. It is the second leading cause of stroke. (The leading cause of stroke is a blood clot, or thrombus, in an artery in the brain, which blocks blood flow and cuts off the oxygen and nutrients the brain requires.)

Importing Dataset

I imported a data set of haemorrhage and non-haemorrhage brain images from Kaggle and placed each class in its own variable.

infected=FileNames["*.png","E:\\COURSES\\Wolfram\\BrainTumorImagesDataset\\training_set\\hemmorhage_data"];
uninfected=FileNames["*.png","E:\\COURSES\\Wolfram\\BrainTumorImagesDataset\\training_set\\non_hemmorhage_data"];

Constructing File Objects for Images

I wanted to match each brain image with a value of either True for haemorrhage or False for no haemorrhage. To make the import more efficient, I created separate file objects for each of the image variables. Each variable contained 70 images, one set for the haemorrhage scans and one for the non-haemorrhage scans.

infectedIMG = File /@ infected;
uninfectedIMG = File /@ uninfected;

Then I created lists of 70 True and False values to be matched with their respective images. I stored these lists in variables, then made one variable that connects the lists of True and False values and another that connects the infected and uninfected file objects.

Length[infectedIMG]
70
infectedvalues=Table[True,Length[infected]];
Length[uninfected]
70
uninfectedvalues=Table[False,Length[uninfected]];

Finally, using the AssociationThread function, I associated the images with their values and divided the data into two groups, 75% for training and 25% for validation.

data=RandomSample[AssociationThread[infectedIMG->infectedvalues]];
traininglength=Length[data]*.75
52.5
trainingdata=data[[1;;52]];
validationdata=data[[53;;]];
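As written, the AssociationThread call above threads only infectedIMG, so data holds the 70 positive examples (hence the 52.5). The following is a minimal sketch of how both classes could be combined before the 75%/25% split, assuming that is what the description intends. The file names are hypothetical stand-ins for the real lists, and TakeDrop with an integer cut is one possible way to split, not necessarily the one used in the post.

```wolfram
(* Hypothetical stand-ins for the two lists of File objects *)
infectedIMG = File /@ {"h1.png", "h2.png", "h3.png", "h4.png"};
uninfectedIMG = File /@ {"n1.png", "n2.png", "n3.png", "n4.png"};

(* Label every file, keeping both classes in one list of rules *)
labelled = Join[
   Thread[infectedIMG -> True],
   Thread[uninfectedIMG -> False]];

(* Shuffle, then split 75% / 25% at an integer index *)
shuffled = RandomSample[labelled];
cut = Floor[3 Length[shuffled]/4];
{trainingdata, validationdata} = TakeDrop[shuffled, cut];

Length /@ {trainingdata, validationdata}
```

A list of rules of this form can be passed to NetTrain directly, so the Normal[...] conversion would not be needed in this variant.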

Creating the Neural Network

I then started to work on the neural network, which is based on the LeNet architecture used for MNIST image classification. The network's goal is to classify images as infected or uninfected, using True and False to describe whether or not the patient suffers from brain haemorrhage. I built a NetChain with multiple layers. One notable layer is the ResizeLayer, which changes the dimensions of each image to 135 by 135, because the neural network is sensitive to the size of its input images. Further layers include the convolution, ramp, and pooling layers, which together extract features, discard the less useful ones, and downsample the result so that each image can be classified.

dims={135,135}
{135,135}
lenet=NetChain[{
   ResizeLayer[dims], (* Resizes every image to 135 by 135 *)
   ConvolutionLayer[20,5],Ramp, (* Takes out the not useful features *)
   PoolingLayer[2,2], (* Downsamples *)
   ConvolutionLayer[50,5],Ramp, (* Takes out the not useful features *)
   PoolingLayer[2,2], (* Downsamples *)
   FlattenLayer[],500,Ramp, (* Makes features into a feature vector *)
   2,SoftmaxLayer[]}, (* Two outputs, turned into probabilities *)
  "Output"->NetDecoder[{"Class",{True,False}}], (* Tensor into True or False *)
  "Input"->NetEncoder["Image"] (* Turns image into numbers *)
]
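To see how the 135 by 135 input shrinks on its way through the chain, the sizes can be worked out by hand. This is a rough sketch that assumes unpadded convolutions (output side n - k + 1) and floor-based pooling (output side Floor[(n - k)/s] + 1); convSize and poolSize are hypothetical helper names, not built-in functions. The summary that NetChain displays for the actual network is the authoritative source for these dimensions.

```wolfram
(* Assumed size rules: unpadded convolution and floor-based pooling *)
convSize[n_, k_] := n - k + 1;
poolSize[n_, k_, s_] := Floor[(n - k)/s] + 1;

side = 135;                  (* after ResizeLayer[{135, 135}] *)
side = convSize[side, 5];    (* after ConvolutionLayer[20, 5] *)
side = poolSize[side, 2, 2]; (* after PoolingLayer[2, 2] *)
side = convSize[side, 5];    (* after ConvolutionLayer[50, 5] *)
side = poolSize[side, 2, 2]; (* after PoolingLayer[2, 2] *)

(* 50 channels times the remaining spatial size feeds FlattenLayer *)
flattened = 50*side*side
```

Under these assumptions the side length goes 135, 131, 65, 61, 30, so the flattened vector entering the 500-unit layer would have 45000 elements.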

Training the Neural Networks with NetTrain

I trained the neural network for 10 training rounds.

results=NetTrain[lenet,Normal[trainingdata],All,
  ValidationSet->Normal[validationdata],MaxTrainingRounds->10,
  TargetDevice->"CPU"]
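Because the third argument is All, NetTrain returns a results object rather than just the trained network. The sketch below shows how such an object could be queried, using a tiny toy network and made-up two-element inputs (toyNet, toyData, and toyResults are hypothetical names) so that it runs without the image files. The property names "TrainedNet" and "RoundLossList" are the ones I believe recent versions provide; older versions may differ.

```wolfram
(* Toy classifier with the same True/False decoder as lenet *)
toyNet = NetChain[{LinearLayer[2], SoftmaxLayer[]},
   "Input" -> 2,
   "Output" -> NetDecoder[{"Class", {True, False}}]];

(* Made-up examples: large second coordinate means True *)
toyData = {{0., 1.} -> True, {1., 0.} -> False,
   {0.1, 0.9} -> True, {0.9, 0.2} -> False};

toyResults = NetTrain[toyNet, toyData, All,
   MaxTrainingRounds -> 10, TargetDevice -> "CPU"];

trained = toyResults["TrainedNet"];   (* the trained network itself *)
losses = toyResults["RoundLossList"]; (* one mean loss per training round *)

trained[{0.2, 0.8}]                   (* classify a new input *)
```

Looking at the per-round loss is a quick way to spot a curve that collapses to zero suspiciously fast.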

NetTrain Result

Training the Neural Network with Augmented Layers

Next I added an ImageAugmentationLayer, which randomly crops the images during training so that the network sees slightly different versions of the data, with the aim of improving my neural network.

augment =  ImageAugmentationLayer[{135, 135},   "Input" -> NetEncoder[{"Image", {139, 139}}],   "Output" -> NetDecoder["Image"]]

I resized the images to 139 by 139 so that the augmentation layer could take random 135 by 135 crops, trimming 4 pixels in each dimension.
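A minimal sketch of what the cropping does, using a random RGB image as a hypothetical stand-in for a scan. As far as I know, the random crops are only applied when the layer is evaluated with NetEvaluationMode -> "Train"; without it the layer takes a fixed centered crop.

```wolfram
augment = ImageAugmentationLayer[{135, 135},
   "Input" -> NetEncoder[{"Image", {139, 139}}],
   "Output" -> NetDecoder["Image"]];

(* Stand-in for a scan: random noise, 139 by 139, three channels *)
img = RandomImage[1, {139, 139}, ColorSpace -> "RGB"];

(* Three crops taken in training mode, each at a random offset *)
crops = Table[augment[img, NetEvaluationMode -> "Train"], 3];
ImageDimensions /@ crops

(* Offsets 0..4 in each direction give this many distinct crops *)
(139 - 135 + 1)^2
```

With only 25 possible crop positions, this augmentation adds a small amount of variety; it does not replace a larger data set.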

dims2 = {139, 139}
lenet2 = NetChain[{ResizeLayer[dims2], 
   ImageAugmentationLayer[{135, 135}], ConvolutionLayer[20, 5], Ramp, 
   PoolingLayer[2, 2], ConvolutionLayer[50, 5], Ramp, 
   PoolingLayer[2, 2], FlattenLayer[], 500, Ramp, 2, SoftmaxLayer[]}, 
  "Output" -> NetDecoder[{"Class", {True, False}}], 
  "Input" -> NetEncoder["Image"]]

I trained this network on the same data, with only 7 training rounds, on the CPU.

results2 = 
  NetTrain[lenet2, Normal[trainingdata], All, 
    ValidationSet -> Normal[validationdata], MaxTrainingRounds -> 7]


Creating a Testing Set for Data

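The original code for this step is not reproduced here. As a rough sketch of one way to hold out a testing set that is separate from both the training and the validation data, the shuffled examples could be split three ways. The file names and the 60/20/20 proportions are assumptions made for illustration only.

```wolfram
(* Hypothetical stand-in for the shuffled list of file -> label rules *)
examples = RandomSample[Join[
    Thread[Table[File["h" <> ToString[i] <> ".png"], {i, 10}] -> True],
    Thread[Table[File["n" <> ToString[i] <> ".png"], {i, 10}] -> False]]];

n = Length[examples];

(* First 60% for training, next 20% for validation, the rest for testing *)
{train, rest} = TakeDrop[examples, Floor[3 n/5]];
{validation, test} = TakeDrop[rest, Floor[n/5]];

Length /@ {train, validation, test}
```

Keeping the testing examples out of NetTrain entirely (neither as training data nor as ValidationSet) means the reported accuracy reflects images the network has never seen.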

Data Visualization

Lastly, I made a ConfusionMatrixPlot using the ClassifierMeasurements function, which compares the neural network's predicted class against the actual class.
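A minimal sketch of that measurement step, using a small untrained toy network and made-up inputs (toyNet and toyTest are hypothetical names) so that it runs on its own. With the real project, the trained network and the held-out testing examples would take their place.

```wolfram
(* Untrained toy classifier with random weights and a True/False decoder *)
toyNet = NetInitialize[
   NetChain[{LinearLayer[2], SoftmaxLayer[]},
    "Input" -> 2,
    "Output" -> NetDecoder[{"Class", {True, False}}]]];

(* Made-up testing examples covering both classes *)
toyTest = {{0., 1.} -> True, {1., 0.} -> False,
   {0.2, 0.7} -> True, {0.8, 0.1} -> False};

cm = ClassifierMeasurements[toyNet, toyTest];

cm["Accuracy"]            (* fraction of correct predictions *)
cm["ConfusionMatrix"]     (* counts of actual versus predicted class *)
cm["ConfusionMatrixPlot"] (* the same counts as a plot *)
```

Including both True and False examples in the testing set matters here: the off-diagonal entries of the confusion matrix are what reveal false positives and false negatives.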

Conclusion

I built a neural network that diagnosed brain haemorrhage with an accuracy of about 99.000000078%. Furthermore, as displayed in the ConfusionMatrixPlot, there were 18 examples where the network's prediction matched the actual result for true and 18 examples where the prediction and the actual result matched for true.

Future Improvements

To improve this project further, I could add more augmented data to train the neural net. Moreover, I could use images from several different datasets to reduce overfitting and improve how well the model generalizes. Lastly, I could write a function that pinpoints the haemorrhage by finding the edges of the affected area and detecting the affected region through image intensity distribution and colour detection.
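As a very rough illustration of the edge-finding idea, the sketch below runs on a synthetic image, a bright disk on a dark background, as a hypothetical stand-in for a bright region in a scan. The threshold of 0.5 is an arbitrary assumption, and real CT slices would need far more careful preprocessing.

```wolfram
(* Synthetic stand-in for a scan: a bright disk on a dark background *)
scan = Image[N[DiskMatrix[20, 101]]];

(* Separate the bright region from the background *)
region = Binarize[scan, 0.5];

(* Outline of the bright region *)
edges = EdgeDetect[region];

(* Pixel area of each connected bright component *)
ComponentMeasurements[region, "Area"]
```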

POSTED BY: Aman Dewangan
6 Replies

Thanks a lot to the @Wolfram Moderation Team for awarding me this badge and featuring me on the Contributor Board.

POSTED BY: Aman Dewangan

You have earned the Featured Contributor Badge. Your exceptional post has been selected for our editorial column Staff Picks http://wolfr.am/StaffPicks and your profile is now distinguished by a Featured Contributor Badge and is displayed on the Featured Contributor Board. Thank you!

POSTED BY: EDITORIAL BOARD

Respected Sir,

I used this particular database: (https://www.kaggle.com/arya7m/tumour-brain)

And yes, I later realised that there are a number of faults in the model. I would definitely like to know about them, and I really look forward to working on them.

POSTED BY: Aman Dewangan
Posted 3 years ago

The issue is because I used a small dataset. I couldn't find a larger database anywhere, so I trained on the few images I had, and that is why the loss dropped immediately to zero. With larger datasets it will surely have errors.

POSTED BY: Updating Name
Posted 3 years ago

Yours seems more realistic. For one thing, the loss in Dewangan's drops almost immediately to zero, whereas yours follows a normal curve. I don't see how Dewangan could have gotten that under normal conditions.

POSTED BY: louis sarwal

Hi Aman,

This is a nice piece of work. I have a few comments:

  1. I wasn't sure exactly which dataset you used, as there are several on Kaggle. So I used this one: https://tinyurl.com/dn8jn6jx.

This dataset contains 100 normal head CT slices and 100 slices from patients with hemorrhage. Each slice comes from a different patient. There is no distinction between kinds of hemorrhage seen in the scans. The images were taken from an internet search and are of differing size and resolution. The main idea of using this dataset is to develop ways to predict imaging findings even in a context of limited data of varying quality.

  2. I couldn't exactly replicate the way you handled the data. In my MMA version it requires you to specify the Input and Output for each item in the training dataset. Perhaps you are using an earlier version of MMA.

  3. My main criticism is that you chose to focus exclusively on the results for the positive cases, where you achieved 100% accuracy. But this leaves unaddressed the question of false positives.

Also, I believe you are using the same set of data for both validation and testing, which may explain why you were able to achieve 100% accuracy!

In my version I looked at results for both positive and negative scans and found an overall accuracy rate of 90%, which is exactly in line with the results reported on Kaggle.

Let me know if you would be interested in collaborating if you decide you want to do further work.

Jonathan

POSTED BY: Jonathan Kinlay