Breast Cancer Detection Web App

 

Among the many applications of machine learning, one is of particular interest to me: disease detection. It has the potential to help a large number of people around the world, and the advances in machine learning and computer vision over the past few years have transformed medicine, finance, biotechnology and more. Disease detection methods built on machine learning and computer vision already have a number of applications in the medical sector, and their use is expected to keep growing as we develop better methods and models. The value of machine learning in healthcare is its ability to process huge datasets beyond the scope of human capability, and then reliably convert analysis of that data into clinical insights that aid physicians in planning and providing care, ultimately leading to better outcomes, lower costs of care, and increased patient satisfaction.

Many leading tech companies and universities are researching the use of AI in the medical sector. For example, Google has developed a machine learning algorithm to help identify cancerous tumors on mammograms, and Stanford is using a deep learning algorithm to identify skin cancer. Such pioneering research motivates machine learning and computer vision enthusiasts like me to keep studying these practices and the methods used to develop them.

I am here with an example of a disease detection app which predicts whether you have breast cancer based upon a number of features such as radius, age, texture, perimeter, area, smoothness, compactness, concavity, concave points, symmetry and fractal dimension. I will explain more about these parameters shortly. This web app is based on machine learning and uses the Random Forest classification algorithm. The app is written mostly in Python.
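The training step behind an app like this can be sketched in a few lines of scikit-learn. This is a minimal sketch, not the app's actual code: it assumes the copy of the Wisconsin data that ships with scikit-learn (`load_breast_cancer`, which has 30 numeric features and no age column), an 80/20 train/test split and 100 trees. None of these are confirmed choices of the app.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: use the Wisconsin Diagnostic data bundled with scikit-learn.
data = load_breast_cancer()
X = data.data

# scikit-learn encodes 0 = malignant and 1 = benign.
# Flip the labels so that 0 = benign and 1 = malignant.
y = 1 - data.target

# Hold out 20% of the rows for testing, keeping the class ratio the same.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Assumption: 100 trees and otherwise default hyperparameters.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Accuracy on the held-out rows only.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.3f}")
```

The accuracy printed here depends on the split and the random seed, so treat it as a rough indication rather than a fixed number.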

We have deployed the app using Streamlit, an open source framework that allows data science teams to build and deploy web apps fairly easily. It's one of the best hosting services I've used and it's great for quick and easy deployment of web apps. The web app uses interactive graphs to display the outcome and to compare the input parameters given by the user: the graphs plot the patient's values against those of other patients, both cancerous and non-cancerous. It also reports the accuracy of the result, which is around 90-95%.

A value of 0 on the graphs represents a benign (non-cancerous) tumor and a value of 1 represents a malignant (cancerous) tumor. Building this web app was a learning experience for us and has improved our knowledge of machine learning significantly. We hope to deploy more apps in the future and share them with you. Feel free to build on this project, and don't hesitate to send us any suggestions. The link for the Breast Cancer Detection web app is as follows: https://share.streamlit.io/braxtonova/cancer/main/app.py

About the dataset: The dataset used is the Wisconsin Breast Cancer dataset created by researchers at the University of Wisconsin. It consists of the following parameters: radius (mean of distances from center to points on the perimeter), texture (standard deviation of gray-scale values), perimeter, area, smoothness (local variation in radius lengths), compactness (perimeter^2 / area - 1.0), concavity (severity of concave portions of the contour), concave points (number of concave portions of the contour), symmetry and fractal dimension ("coastline approximation" - 1). For those of you who are not familiar with the statistical terms, my article about Exploratory Data Analysis can be a good starting point.
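To make the radius and compactness definitions above concrete, here is a small sketch that computes them for a closed contour given as a list of (x, y) points. The function name `contour_features` and the test shape are my own illustration. The dataset's authors extracted these features from real cell images with their own pipeline, which this does not reproduce.

```python
import math


def contour_features(points):
    """Compute simple shape features for a closed contour of (x, y) points."""
    n = len(points)
    # Center of the contour, taken here as the mean of its points.
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n

    # Radius: mean of distances from the center to points on the perimeter.
    radius = sum(math.hypot(x - cx, y - cy) for x, y in points) / n

    perimeter = 0.0
    twice_area = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the contour
        perimeter += math.hypot(x1 - x0, y1 - y0)
        twice_area += x0 * y1 - x1 * y0  # shoelace formula
    area = abs(twice_area) / 2

    # Compactness as defined for the dataset: perimeter^2 / area - 1.0
    compactness = perimeter ** 2 / area - 1.0
    return {
        "radius": radius,
        "perimeter": perimeter,
        "area": area,
        "compactness": compactness,
    }


# A 360-sided polygon approximating a circle of radius 10.
circle = [
    (10 * math.cos(2 * math.pi * k / 360), 10 * math.sin(2 * math.pi * k / 360))
    for k in range(360)
]
features = contour_features(circle)
print(features)
```

A circle has the smallest possible compactness, 4π - 1 (about 11.57). The more irregular the outline, the larger the value, which is why the measure is useful for describing cell nuclei.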

I will give a brief idea of contours and the coastline paradox (one of my favorite mathematical paradoxes) in this article. In layman's terms, a contour is an outline representing or bounding the shape or form of something. In calculus and linear algebra, however, we define it as a line joining points on a diagram at which some property has the same value. A contour line (also isoline, isopleth, or isarithm) of a function of two variables is a curve along which the function has a constant value, so that the curve joins points of equal value. It is a plane section of the three-dimensional graph of the function f(x, y) parallel to the (x, y)-plane. Contour lines on a map are curved, straight or a mixture of both, and describe the intersection of a real or hypothetical surface with one or more horizontal planes. I'd also like to mention contour integration, which is a method of evaluating certain integrals along paths in the complex plane. Contour integration is part of complex analysis and is closely tied to the residue theorem, the Cauchy integral formula, etc.

I could talk about these all day, but let's move on to the coastline paradox. The coastline paradox revolves around the seemingly simple observation that the coastline of a landmass does not have a well-defined length. This results from the fractal curve-like properties of coastlines, i.e., the fact that a coastline typically has a fractal dimension (which in fact makes the notion of length inapplicable). The first recorded observation of this phenomenon was by Lewis Fry Richardson, and it was expanded upon by Benoit Mandelbrot. The measured length of the coastline depends on the method used to measure it and the degree of cartographic generalization.
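The dependence of length on the measuring scale is easy to see numerically. The sketch below uses the Koch curve as a stand-in for a coastline (my own choice for illustration, not something taken from the dataset): each refinement shrinks the "ruler" by a factor of 3 and the measured length grows by a factor of 4/3. It then estimates the fractal dimension D from Richardson's relation, length ∝ ruler^(1 - D).

```python
import math


def koch_step(points):
    """Replace every segment with the four segments of the Koch construction."""
    cos60, sin60 = 0.5, math.sqrt(3) / 2
    new_points = [points[0]]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = (x1 - x0) / 3, (y1 - y0) / 3
        a = (x0 + dx, y0 + dy)          # end of the first third
        b = (x0 + 2 * dx, y0 + 2 * dy)  # start of the last third
        # Peak: the middle third rotated by 60 degrees about point a.
        peak = (a[0] + dx * cos60 - dy * sin60, a[1] + dx * sin60 + dy * cos60)
        new_points.extend([a, peak, b, (x1, y1)])
    return new_points


def polyline_length(points):
    """Total length of the straight segments joining consecutive points."""
    return sum(
        math.hypot(q[0] - p[0], q[1] - p[1]) for p, q in zip(points, points[1:])
    )


curve = [(0.0, 0.0), (1.0, 0.0)]
measurements = []  # (ruler length, measured curve length) pairs
for level in range(6):
    ruler = (1 / 3) ** level
    length = polyline_length(curve)
    measurements.append((ruler, length))
    print(f"ruler = {ruler:.5f}  measured length = {length:.5f}")
    curve = koch_step(curve)

# Slope of log(length) against log(ruler) equals 1 - D.
(r0, l0), (r1, l1) = measurements[1], measurements[-1]
slope = (math.log(l1) - math.log(l0)) / (math.log(r1) - math.log(r0))
dimension = 1 - slope
print(f"estimated fractal dimension = {dimension:.4f}")
```

The measured length never settles down as the ruler shrinks, which is the paradox in miniature. The estimated dimension lies between 1 (a smooth line) and 2 (a filled plane). The dataset's "fractal dimension" feature is this kind of coastline estimate, with 1 subtracted.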

Disclaimer: This is just a learning project based on one particular dataset so please do not depend on it to actually know if you have breast cancer or not. It might still be a false positive or false negative. A doctor is still the best fit for the determination of such diseases.

Breast Cancer Awareness Month, also referred to in the United States as National Breast Cancer Awareness Month, is an annual international health campaign organized by major breast cancer charities every October to increase awareness of the disease and to raise funds for research into its cause, prevention, diagnosis, treatment and cure. National Breast Cancer Awareness Month was founded in 1985 as a partnership between the American Cancer Society and the pharmaceutical division of Imperial Chemical Industries (now part of AstraZeneca). Its aim was to promote mammography as the most effective weapon in the fight against breast cancer. Let's support this initiative and promote awareness of this disease among the masses.

Note: Some of you have mentioned that the prediction is always a malignant tumor. That might be the case because the dataset contains relatively few benign data points. However, if you vary the values of texture and radius, you should see the prediction come out as benign in certain cases.

