myeHEALTH
00.
PROJECT DETAILS
Role
Backend Development
Research & Data Analysis
Tools/Skills
Python
Jupyter Notebook
Timeline
PharmaHacks
Mar. 25th - 27th, 2022
24 hours
Team
Me
Leo
Gabe
Emily
01.
CONTEXT
Multiple myeloma (MM) is a cancer caused by the uncontrolled proliferation of plasma cells in the bone marrow. Plasma cells are specialized white blood cells that produce antibodies, which help destroy pathogens (foreign organisms that cause disease) circulating in the space outside cells and help prevent recurrent infection. Excessive accumulation of plasma cells leads to the secretion of nonfunctional proteins called “M (monoclonal) proteins” and to the crowding out of healthy cells in the bone marrow [1,2,3].

In 2021, multiple myeloma:
  • represented 1.9% of all new cancer cases in Canadian males and 1.4% in Canadian females
  • had an incidence rate of 11 per 100 000 for males and 8 per 100 000 for females
  • was newly diagnosed in 3 800 individuals [4]
02.
THE PROBLEM
Due to its rarity and vague symptoms (back pain, bone pain, fatigue, frequent infections, etc.), multiple myeloma is difficult to diagnose. Through secondary research, we gained a better grasp of the average time it takes for a symptomatic individual to meet with a specialist for diagnosis (indicated in boxes below) [5].
To summarize, it could take up to 11 months (almost 1 year!) for an individual to receive a diagnosis of multiple myeloma.

Delayed diagnosis can lead to critical complications, including renal failure, bone fractures, spinal cord compression, and a lower chance of survival [5]. Hence, more effort is needed to develop tools that could expedite the detection of myeloma in patients.
03.
SOLUTION OVERVIEW
Our goal was to develop a tool to predict the presence of myeloma in patients using a machine learning program trained with pre-existing clinical data from myeloma patients.
04.
TAKING A CLOSER LOOK INTO THE DATA SET
Using the medical examination results of 200 MM patients reported in the Multiple Myeloma Dataset, our team trained the model on key health attributes considered in MM diagnosis, including levels of hemoglobin, calcium, albumin, and beta-2-microglobulin, as well as evidence of osteolytic lesions.
Tackling our greatest challenge:
Biased Dataset + Insufficient Amount of Data
Since the first training dataset we assembled contained data only on individuals diagnosed with MM, our model reached the wrong conclusion every time we tested it with data from an individual without MM. The model could not assign a sample to a class that did not exist in its training data. As expected, the first version of our model incorrectly classified every such test input as stage 1, 2 or 3 multiple myeloma.
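The limitation described above can be reproduced in a few lines. This is a minimal sketch, not our hackathon code: the random features, the 60-row size, and the choice of a decision tree are assumptions made purely for illustration. It shows that a classifier trained only on stage labels 1, 2 and 3 has no "healthy" class to predict.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical training set: 60 rows, 4 numeric features,
# labelled only with MM stages (no "healthy" label exists).
X_train = rng.normal(size=(60, 4))
y_train = np.repeat([1, 2, 3], 20)

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# The classifier only knows the labels it saw during training.
known_classes = model.classes_.tolist()

# Any new input, including one from a healthy person,
# is forced into one of the known stages.
X_new = rng.normal(size=(100, 4))
predicted = set(model.predict(X_new).tolist())

print("Classes known to the model:", known_classes)
print("Classes predicted on new data:", sorted(predicted))
```

Whatever the inputs look like, the predictions stay inside the set of stages seen during training, which is why a "no MM" class had to be added to the data itself.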
To resolve the biased classification, we researched the range of values for each medical measurement (hemoglobin level, calcium level, etc.) typically seen in healthy individuals and created synthetic data confined to those ranges. We followed a similar approach to improve the accuracy of the model, generating additional data that emulated the values identified in individuals at various stages of multiple myeloma.
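One way to generate range-bounded synthetic rows like these is to sample each attribute uniformly between a lower and an upper bound. The sketch below assumes that approach. The numeric ranges, the column names, the helper `make_synthetic_healthy`, and the label `0` for "no MM" are all hypothetical placeholders, not the clinical values or code we used.

```python
import numpy as np
import pandas as pd

# Illustrative placeholder ranges (low, high).
# These are NOT clinical reference values.
HEALTHY_RANGES = {
    "hemoglobin": (12.0, 17.0),
    "calcium": (8.5, 10.5),
    "albumin": (3.5, 5.0),
    "beta2_microglobulin": (1.0, 2.5),
}

def make_synthetic_healthy(n_rows, ranges=HEALTHY_RANGES, seed=0):
    """Sample each attribute uniformly within its (low, high) range."""
    rng = np.random.default_rng(seed)
    data = {
        name: rng.uniform(low, high, size=n_rows)
        for name, (low, high) in ranges.items()
    }
    df = pd.DataFrame(data)
    df["osteolytic_lesions"] = 0  # assumed absent in healthy rows
    df["stage"] = 0               # hypothetical label: 0 = no MM
    return df

healthy = make_synthetic_healthy(200)
print(healthy.describe().loc[["min", "max"]])
```

The same helper could be called with stage-specific ranges and a different label to produce extra rows for each MM stage. Uniform sampling ignores correlations between attributes, so it is only a rough approximation of real patient data.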
05.
BUILDING THE MACHINE LEARNING MODEL
After refining the dataset, I used Python with the NumPy, Pandas and Scikit-learn libraries to build the machine learning model in a Jupyter Notebook.
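A typical NumPy, Pandas and Scikit-learn workflow for this kind of task is sketched below. It is an illustration under stated assumptions, not our exact notebook: the toy data is randomly generated, the column names are hypothetical, and the random forest classifier is an assumed choice of algorithm.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Toy stand-in for the refined dataset (made-up numbers):
# 4 classes (0 = no MM, 1-3 = MM stage), 100 rows each.
frames = []
for label in range(4):
    frames.append(pd.DataFrame({
        "hemoglobin": rng.normal(14.0 - 1.5 * label, 0.8, 100),
        "calcium": rng.normal(9.5 + 0.5 * label, 0.4, 100),
        "albumin": rng.normal(4.2 - 0.3 * label, 0.3, 100),
        "beta2_microglobulin": rng.normal(2.0 + 1.5 * label, 0.6, 100),
        "stage": label,
    }))
df = pd.concat(frames, ignore_index=True)

X = df.drop(columns="stage")
y = df["stage"]

# Hold out 20% of the rows, keeping class proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Assumed model choice; any scikit-learn classifier fits this pattern.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Accuracy = share of held-out rows whose class is predicted correctly.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Held-out accuracy: {accuracy:.2f}")
```

Evaluating on rows the model never saw during training gives a more honest accuracy estimate than scoring on the training set.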
Watch our project demo!
06.
PITCHING
With an accuracy rate of 90%, myeHealth was recognized as the 1st Place Winner at PharmaHacks 2022!
07.
NEXT STEPS
Conduct further research and collect feedback from users (ages 18 - 65) regarding the app's accessibility and usability.

Create alternative screen layouts compatible with wearable devices (e.g. Apple watches) - to be implemented when connection to external devices is established within the app's internal system.

Continue developing the merchant platform - implement compatibility with payment machines.

Add an onboarding/sign-in feature to allow users to securely access their accounts from any device.
[1] C. Koshiaris, “Methods for reducing delays in the diagnosis of multiple myeloma,” International Journal of Hematologic Oncology, vol. 8, no. 1, 2019.
[2] H. C. Allen and P. Sharma, “Histology, Plasma Cells,” StatPearls, 2022.
[3] Healthline, “What Does It Mean If You Have M Proteins in Your Blood?” Healthline, July 24, 2017. [Online]. Available: https://www.healthline.com/health/urine-protein-test#reasons-for-testing [Accessed March 26, 2022].
[4] Myeloma Canada, “Canadian Statistics for Multiple Myeloma,” Myeloma Canada, 2021. [Online]. Available: https://www.myelomacanada.ca/en/about-multiple-myeloma/what-is-myeloma-10/statistics [Accessed March 26, 2022].
[5] C. Koshiaris, J. Oke, L. Abel, B. D. Nicholson, K. Ramasamy, and A. Van den Bruel, “Quantifying intervals to diagnosis in myeloma: a systematic review and meta-analysis,” BMJ Open, vol. 8, no. 6, 2018.